Lecture Notes in Networks and Systems Volume 576
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose ([email protected]).
More information about this series at https://link.springer.com/bookseries/15179
Jezreel Mejia · Mirna Muñoz · Álvaro Rocha · Víctor Hernández-Nava
Editors
New Perspectives in Software Engineering Proceedings of the 11th International Conference on Software Process Improvement (CIMPS 2022)
Editors

Jezreel Mejia
Centro de Investigación en Matemáticas, A.C., Unidad Zacatecas
Zacatecas, Mexico

Mirna Muñoz
Centro de Investigación en Matemáticas, A.C., Unidad Zacatecas
Zacatecas, Mexico

Álvaro Rocha
ISEG, Universidade de Lisboa
Lisbon, Portugal

Víctor Hernández-Nava
Universidad Hipócrates
Acapulco de Juárez, Guerrero, Mexico
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-3-031-20321-3 ISBN 978-3-031-20322-0 (eBook) https://doi.org/10.1007/978-3-031-20322-0 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Introduction
This book contains a selection of papers accepted for presentation and discussion at the 2022 International Conference on Software Process Improvement (CIMPS'22). This conference had the support of CIMAT A.C. (Mathematics Research Center/Centro de Investigación en Matemáticas); Hippocrates University, Acapulco, Guerrero, México; CAMSA Corporation; CANIETI Guerrero, México; Revista RISTI; and RECIBE. It took place at Hippocrates University, in Acapulco, Guerrero, México, from October 19 to 21, 2022.

The International Conference on Software Process Improvement (CIMPS) is a global forum for researchers and practitioners who present and discuss the most recent innovations, trends, results, experiences and concerns in the several perspectives of software engineering, with a clear relationship to, but not limited to, software processes, security in information and communication technology, and the big data field. One of its main aims is to strengthen the drive toward a holistic symbiosis among academy, society, industry, government and business community, promoting the creation of networks by disseminating the results of recent research in order to align their needs.

CIMPS'22 built on the successes of CIMPS'12, CIMPS'13 and CIMPS'14, which took place in Zacatecas, Zac; CIMPS'15, which took place in Mazatlán, Sinaloa; CIMPS'16, which took place in Aguascalientes, Aguascalientes, México; CIMPS'17, which took place again in Zacatecas, Zac, México; CIMPS'18, which took place in Guadalajara, Jalisco, México; CIMPS'19, which took place in León, Guanajuato, México; CIMPS'20, which took place in Mazatlán, Sinaloa, México, as a virtual venue; and the last edition, CIMPS'21, which took place in Torreón, Coahuila, México, also as a virtual venue.

The Program Committee of CIMPS'22 was composed of a multidisciplinary group of experts and those who are intimately concerned with software engineering and information systems and technologies. They had the responsibility of evaluating, in a 'blind review' process, the papers received for each of the main themes proposed for the conference: organizational models, standards and methodologies, knowledge management, software systems, applications and tools, information and communication technologies and processes in non-software
domains (mining, automotive, aerospace, business, health care, manufacturing, etc.) with a demonstrated relationship to software engineering challenges. CIMPS’22 received contributions from several countries around the world. The articles accepted for presentation and discussion at the conference are published by Springer (this book), and extended versions of the best selected papers will be published in relevant journals, including SCI/SSCI and Scopus indexed journals. We acknowledge all those who contributed to the staging of CIMPS’22 (authors, committees and sponsors); their involvement and support are very much appreciated. October 2022
Jezreel Mejia Mirna Muñoz Álvaro Rocha Víctor Hernández-Nava
Organization
Conference General Chairs

Jezreel Mejía, Mathematics Research Center, Research Unit Zacatecas, Mexico
Mirna Muñoz, Mathematics Research Center, Research Unit Zacatecas, Mexico
The general chairs and co-chair are researchers in computer science at the Research Center in Mathematics, Zacatecas, México. Their research field is software engineering, which focuses on process improvement, multi-model environment, project management, acquisition and outsourcing process, solicitation and supplier agreement development, agile methodologies, metrics, validation and verification and information technology security. They have published several technical papers on acquisition process improvement, project management, TSPi, CMMI and multi-model environment. They have been members of the team that has translated CMMI-DEV v1.2 and v1.3 to Spanish.

General Support

CIMPS general support represents centers, organizations or networks. These members collaborate with different European, Latin America and North America Organizations. The following people have been members of the CIMPS conference since its foundation for the last 10 years.

Gonzalo Cuevas Agustín, Politechnical University of Madrid, Spain
Jose A. Calvo-Manzano Villalón, Politechnical University of Madrid, Spain
Tomas San Feliu Gilabert, Politechnical University of Madrid, Spain
Álvaro Rocha, Universidade de Lisboa, Portugal
Yadira Quiñonez, Autonomous University of Sinaloa, Mexico
Gabriel A. García Mireles, University of Sonora, Mexico
Adriana Peña Pérez-Negrón, University of Guadalajara, Mexico
Iván García Pacheco, Technological University of the Mixteca, Mexico
Local Committee CIMPS established a Local Committee from the Mathematics Research Center, Research Unit Zacatecas, MX, the University Hippocrates, of Acapulco, Guerrero, México, CAMSA Corporation, Cybersecurity Systems and Intelligence Enterprise and CANIETI Guerrero, México. The list below comprises the Local Committee members. CIMAT UNIT ZACATECAS: Grecia Yeraldi González Castillo (Support), Mexico Nieves González Olvera (Support), Mexico Nohemí Arely Ruiz Hernández (Support), Mexico Isaac Rodríguez Maldonado (Support), Mexico Luis Angel Arroyo Morales (Support), Mexico Isaul Ibarra Belmonte (Support), Mexico Antonio Tablada Dominguez (Support), Mexico Luis Roberto Villa Salas (Support), Mexico Jair Nájera Salaices (Support), Mexico Ivan Ibrahim Fernandez Morales (Support), Mexico Victor Miguel Terrón Macias (Support), Mexico Ernesto Alejandro Orozco Jiménez (Support), Mexico Daniela Acevedo Dueñas (Support), Mexico Carmen Lizarraga (External Support), Autonomous University of Sinaloa, Mexico Raquel Aguayo (External Support), Autonomous University of Sinaloa, Mexico Hipócrates University (HU): Marisol Manzanarez Nava (Rector HU), Mexico Juan Ramon Nieto Quezada (Academic Vice Chancellor HU), Mexico Víctor Hernández Nava (Operations Vice Chancellor HU), Mexico Rogelio Enrique Palomino Castellanos (Director of the Faculty of Exact Sciences and Engineering HU), Mexico Luz Verónica Radilla Tapia (Director of Planning and Linking Relations HU), Mexico Irma Baldovinos Leyva (Director of Research and Entrepreneurship HU), Mexico Dámariz Arce Rodríguez (Coordinator of Social Communication HU), Mexico Dashiell J. López Obregón (Coordinator of Systems and Technological Development HU), Mexico Maria Janinne Irra Olvera (Executive Translator HU), Mexico CAMSA Corporation: Jair de Jesús Cambrón Navarrete (President), Mexico
Cybersecurity Systems and Intelligence Enterprise: Hugo Montoya Diaz (CEO and CTO), Mexico CANIETI Guerrero, México: Luis Felipe Monroy Álvarez (President of the Office), Mexico
Scientific Program Committee CIMPS established an international committee of selected well-known experts in software engineering who are willing to be mentioned in the program and to review a set of papers each year. The list below comprises the Scientific Program Committee members. Adriana Peña Pérez-Negrón Alejandro Rodríguez González Alejandra García Hernández Álvaro Rocha Ángel M. García Pedrero Antoni Lluis Mesquida Calafat Antonio de Amescua Seco Baltasar García Pérez Benjamín Ojeda Magaña Carlos Abraham Carballo Monsivais Carla Pacheco Carlos Alberto Fernández y Fernández Cesar Guerra-García Claudio Meneses Villegas Edgar Alan Calvillo Moreno Edgar Oswaldo Díaz Eduardo de la Cruz Gamez Eleazar Aguirre Anaya Fernando Moreira Francisco Jesús Rey Losada Gabriel A. García Mireles Giner Alor Hernández Gloria Monica Martinez Gloria P. Gasca Hurtado Gonzalo Cuevas Agustín Gonzalo Luzardo
University of Guadalajara, Mexico Politechnical University of Madrid, Spain Autonomous University of Zacatecas, Mexico Universidade de Lisboa, Portugal Politechnical University of Madrid, Spain University of Islas Baleares, Spain University Carlos III of Madrid, Spain Schofield, University of Vigo, Spain University of Guadalajara, Mexico CIMAT Unit Zacatecas, Mexico Technological University of Mixteca, Oaxaca, Mexico Technological University of Mixteca, Oaxaca, Mexico Autonomous University of San Luis, Mexico Catholic University of North, Chile Technological University of Aguascalientes, Mexico INEGI, Mexico Technological Institute of Acapulco, Mexico National Politechnical Institute, Mexico University of Portucalense, Portugal University of Vigo, Spain University of Sonora, Mexico Technological Institute of Orizaba, Mexico Technological University of Torreón, Mexico University of Medellin, Colombia Politechnical University of Madrid, Spain Higher Polytechnic School of Litoral, Ecuador
Gustavo Illescas Héctor Cardona Reyes Himer Ávila George Hugo Arnoldo Mitre Hernández Hugo O. Alejandrez-Sánchez Iván García Pacheco Jezreel Mejía Miranda Jorge Luis García Alcaraz José Alberto Benítez Andrades Jose A. Calvo-Manzano Villalón José Antonio Cervantes Álvarez José Antonio Orizaga Trejo José-Eder Guzmán-Mendoza Jose-Francisco Gazga-Portillo José Guadalupe Arceo Olague José Luis David Bonilla José Luis Hernández Hernández José Luis Sánchez Cervantes Juan Francisco Peraza Garzón Juan Manuel Toloza Lisbeth Rodriguez Mazahua Lizbeth A. Hernández González Lohana Lema Moreta Luis Omar Colombo Mendoza Luz Sussy Bayona Ore Magdalena Arcilla Cobián Manuel Pérez Cota María de León Sigg María del Pilar Salas Zárate Mario Andrés Paredes Valverde Mario Hernández Hernández
National University of Central Buenos Aires Province, Argentina CIMAT Unit Zacatecas, Mexico University of Guadalajara, Mexico CIMAT Unit Zacatecas, Mexico National Center for Research and Technological Development, CENIDET, Mexico Technological University of Mixteca, Oaxaca, Mexico CIMAT Unit Zacatecas, Mexico Autonomous University of Juárez City, Mexico University of Lion, Spain Politechnical University of Madrid, Spain University of Guadalajara, Mexico University of Guadalajara CUCEA, Mexico Polytechnic University of Aguascalientes, Mexico Technological Institute of Acapulco, Mexico Autonomous University of Zacatecas, Mexico University of Guadalajara, Mexico Institute of Chilpancigo, Mexico Technological Institute of Orizaba, Mexico Autonomous University of Sinaloa, Mexico National University of Central Buenos Aires Province, Argentina Technological University of Orizaba, Mexico University of Veracruz, Mexico University of the Holy Spirit, Ecuador Technological Institute of Teziutlán, Mexico Autonomous University of Peru National Distance Education University, Spain University of Vigo, Spain Autonomous University of Zacatecas, Mexico Technological Institute of Teziutlan, Mexico University of Murcia, Spain Technological Institute of Chilpancingo, Mexico
Mary Luz Sánchez-Gordón Miguel Ángel De la Torre Gómora Miguel Hidalgo-Reyes Mirna Muñoz Mata Miriam Martínez Arroyo Omar S. Gómez Patricia María Henríquez Coronel Perla Velasco-Elizondo Ramiro Goncalves Raúl Aguilar Vera Ricardo Colomo Palacios Roberto Solis Robles Santiago Matalonga Sergio Galván Cruz Sodel Vázquez Reyes Sonia López Ruiz Stewart Santos Arce Sulema Torres Ramos Tomas San Feliu Gilabert Ulises Juárez Martínez Uziel Trujillo Colón Uziel Jaramillo Ávila Vianca Vega Víctor Saquicela Viviana Y. Rosales Morales Yadira Quiñonez Yasmin Hernández Yilmaz Murat
Østfold University College, Norway University of Guadalajara CUCEI, Mexico Superior Technological Institute of Xalapa, Mexico CIMAT Unit Zacatecas, Mexico Technological Institute of Acapulco, Mexico Higher Polytechnic School of Chimborazo, Ecuador University Eloy, Alfaro de Manabi, Ecuador Autonomous University of Zacatecas, Mexico University Tras-os Montes, Portugal Autonomous University of Yucatán, Mexico Østfold University College, Norway Autonomous University of Zacatecas, Mexico University of the West, Scotland Autonomous University of Aguascalientes, Mexico Autonomous University of Zacatecas, Mexico University of Guadalajara, Mexico University of Guadalajara, Mexico University of Guadalajara, Mexico Politechnical University of Madrid, Spain Technological Institute of Orizaba, Mexico Technological Institute of Acapulco, Mexico CIMAT Unit Zacatecas, Mexico Catholic University of North Chile, Chile University of Cuenca, Ecuador University of Veracruz, Mexico Autonomous University of Sinaloa, Mexico CENIDET, Mexico Çankaya University, Turkey
Contents
Organizational Models, Standards and Methodologies

AI-Oriented Software Engineering (AIOSE): Challenges, Opportunities, and New Directions . . . 3
Md Jobair Hossain Faruk, Hasan Pournaghshband, and Hossain Shahriar

Give Me the Right Sensor and I Can Learn Anything – Even Beat the Burnout . . . 20
Andreea Ionica and Monica Leba

A Profile of Practices for Reporting Systematic Reviews: A Conference Case . . . 34
Gabriel Alberto García-Mireles

A Look Through the SN Compiler: Reverse Engineering Results . . . 50
Pedro de Jesús González-Palafox, Ulises Juárez-Martinéz, Oscar Pulido-Prieto, Lisbeth Rodríguez-Mazahua, and Mara Antonieta Abud-Figueroa

A Software Development Model for Analytical Semantic Similarity Assessment on Spanish and English . . . 63
Omar Zatarain, Efren Plascencia JR, Walter Abraham Bernal Diaz, Silvia Ramos Cabral, Miguel De la Torre, Rodolfo Omar Dominguez Garcia, Juan Carlos Gonzalez Castolo, and Miriam A. Carlos Mancilla

Effects of Pilot, Navigator, and Solo Programming Roles on Motivation: An Experimental Study . . . 84
Marcel Valový

Video Game Development Process for Soft Skills Analysis . . . 99
Adriana Peña Pérez Negrón, David Bonilla Carranza, and Mirna Muñoz

29110+ST: Integrated Security Practices. Case Study . . . 113
Perla Maciel-Gallegos, Jezreel Mejía, and Yadira Quiñonez

Comprehension of Computer Programs Through Reverse Engineering Approaches and Techniques: A Systematic Mapping Study . . . 126
Yazmin Alejandra Luna-Herrera, Juan Carlos Pérez-Arriaga, Jorge Octavio Ocharán-Hernández, and Ángel J. Sanchéz-García

Data Science Based Methodology: Design Process of a Correlation Model Between EEG Signals and Brain Regions Mapping in Anxiety . . . 141
Julia Elizabeth Calderón-Reyes, Humberto Muñoz-Bautista, Francisco Javier Alvarez-Rodriguez, María Lorena Barba-Gonzalez, and Héctor Cardona-Reyes

Can Undergraduates Get the Experience Required by the Software Industry During Their University? . . . 152
Mirna Muñoz

Knowledge Management

Data Mining Prospective Associated with the Purchase of Life Insurance Through Predictive Models . . . 165
José Quintana Cruz and Freddy Tapia

A Review of Graph Databases . . . 180
Jaime I. Lopez-Veyna, Ivan Castillo-Zuñiga, and Mariana Ortiz-Garcia

Implementation of Sentiment Analysis in Chatbots in Spanish to Detect Signs of Mental Health Problems . . . 196
Eduardo Aguilar Yáñez, Sodel Vazquez Reyes, Juan F. Rivera Gómez, Perla Velasco Elizondo, Alejandro Mauricio Gonzalez, and Alejandra García Hernández

Measurement of Physical Activity Energy Expenditure Using Inertial Sensors . . . 215
Juan Antonio Miguel-Ruiz, Javier Ortiz-Hernandez, Yasmín Hernández, Hugo Estrada-Esquivel, and Alicia Martinez-Rebollar

Software Systems, Applications and Tools

A New Proposal for Virtual Academic Advisories Using ChatBots . . . 233
Carmen Lizarraga, Raquel Aguayo, Yadira Quiñonez, Víctor Reyes, and Jezreel Mejia

A Telemonitoring System for Patients Undergoing Peritoneal Dialysis Treatment: Implementation in the IONIC Multiplatform Framework . . . 243
Juan Manuel Sánchez Juárez, Eduardo López Domínguez, Yesenia Hernández Velázquez, Saúl Domínguez Isidro, María Auxilio Medina Nieto, and Jorge De la Calleja

System for Monitoring and Control of in Vitro Ruminal Fermentation Kinetics . . . 258
Luis Manuel Villasana-Reyna, Juan Carlos Elizondo-Leal, Daniel Lopez-Aguirre, Jose Hugo Barron-Zambrano, Alan Diaz-Manriquez, Vicente Paul Saldivar-Alonso, Yadira Quiñonez, and Jose Ramon Martinez-Angulo

Delays by Multiplication for Embedded Systems: Method to Design Delays by Software for Long Times, by Means of Mathematical Models and Methods, to Obtain the Algorithm with Exact Times . . . 272
Miguel Morán, Alicia García, Alfredo Cedano, and Patricia Ventura

Monitoring System for Dry Matter Intake in Ruminants . . . 286
Jesus Sierra Martinez, Juan Carlos Elizondo Leal, Daniel Lopez Aguirre, Yadira Quiñonez, Jose Hugo Barron Zambrano, Alan Diaz Manriquez, Vicente Paul Saldivar Alonso, and Jose Ramon Martinez Angulo

Speaker Identification in Noisy Environments for Forensic Purposes . . . 299
Armando Rodarte-Rodríguez, Aldonso Becerra-Sánchez, José I. De La Rosa-Vargas, Nivia I. Escalante-García, José E. Olvera-González, Emmanuel de J. Velásquez-Martínez, and Gustavo Zepeda-Valles

Author Index . . . 313
Organizational Models, Standards and Methodologies
AI-Oriented Software Engineering (AIOSE): Challenges, Opportunities, and New Directions Md Jobair Hossain Faruk1(B) , Hasan Pournaghshband2 , and Hossain Shahriar3 1 Department of Computer Science, Kennesaw State University, Marietta, USA
[email protected]
2 Department of Information Technology, Kennesaw State University, Marietta, USA
[email protected]
3 Department of Software Engineering, Kennesaw State University, Marietta, USA
[email protected]
Abstract. Artificial intelligence for software engineering is an emerging research domain that has been receiving more and more attention recently. Software Engineering (SE) is a well-established technological paradigm, providing clear guidance and direction on every aspect of software development and maintenance. Artificial Intelligence (AI), in contrast to that conventional approach, is an emerging and forward-looking concept aimed at designing intelligent machines that can accomplish tasks usually done by humans. In parallel with these technological advances, SE has yet to adopt state-of-the-art principles for AI-based methodologies. We therefore see a need to steer SE toward AI-based intelligent machines by devising novel techniques and approaches for AI-Oriented Software Engineering (AIOSE). In this study, we thoroughly explore the principles of SE within the AI boundary to support effective development activities throughout the AIOSE development process. We believe that such a compendium study will help link AI and SE and overcome existing limitations, to the benefit of both the SE and AI communities.

Keywords: Software engineering · Artificial Intelligence · AI · AIOSE · AI-oriented application · Intelligent system

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
J. Mejia et al. (Eds.): CIMPS 2022, LNNS 576, pp. 3–19, 2023. https://doi.org/10.1007/978-3-031-20322-0_1
1 Introduction

Software engineering (SE) refers to a systematic, disciplined, quantifiable approach to the planning, design, implementation, quality control, and maintenance of software systems [1, 2]. SE is now at the focal point of innovation, not only from a technological point of view but also in its societal and economic aspects, and the concepts, methods, techniques, and tools available are designed to ensure high quality and productivity for conventional software applications. In this rapidly emerging technological landscape, an intersection between the principles of software engineering (SE) and futuristic technologies, AI and Blockchain
Technology for instance, is necessary to exploit each domain for the benefit of the tech community. Artificial intelligence (AI) is a paradigm experiencing an upsurge in research, intelligent application implementation and deployment. Providing solutions to traditional software engineering problems by injecting AI paradigms is an emerging idea and can be significantly useful [3]. AI has the potential to automate much of the tedious and error-prone collaborative software development tasks and to assist humans by improving their productivity and allowing them to focus on the creative aspects of software development [4]. Various AI domains, including machine learning, neural networks, deep learning, and natural language processing, could be applied to improve fundamental software engineering processes and address major challenges. Adopting artificial intelligence (AI) opens a new era with the potential to reshape the current software engineering paradigm and adequately assure software quality. Our work intends to investigate the importance of novel state-of-the-art software engineering practices in the field of artificial intelligence and unveil the current issues and new directions for AI-Oriented Software Engineering (AIOSE). In this paper, the primary contributions are as follows:

• We provide an overview of AI-oriented software engineering (AIOSE) and identify the most relevant studies for state-of-the-art AIOSE practice.
• We address each software engineering phase, including requirements, design, development, testing, release, and maintenance, according to AI-based techniques and approaches.
• We discuss the challenges and limitations and provide future research directions for AIOSE.

This paper is organized as follows: Sect. 1 provides an introduction with descriptions of software engineering (SE), artificial intelligence (AI) and the key contributions. Section 2 presents the adopted research methods and related work. Section 3 defines AI-Oriented Software Engineering (AIOSE) and addresses its challenges, followed by a description of software engineering research domains for AI in Sect. 4, while Sect. 5 discusses the challenges and limitations and provides future research directions for AIOSE. Finally, we conclude the paper in Sect. 6.
2 Research Design and Methodology

To identify the relevant existing studies, a thorough assessment of the literature was conducted [5]. We outline the objective and present the selected research questions in this section. Additionally, we present an overview of the research methodology, the inclusion and exclusion criteria, and the procedures for selecting the most suitable articles, all of which help the study progress toward a thorough assessment of AI-Oriented Software Engineering (AIOSE).
2.1 Research Goal

The goal of this study is to present the current state of the art in software engineering methodology and process for AI-Oriented Software Engineering (AIOSE). After carefully evaluating our purpose for this paper, we formulated the following research questions to be addressed in this study:

RQ1: What is artificial intelligence (AI) and how can it be integrated with software engineering to embrace cutting-edge technologies?
RQ2: What is the possibility for quantifying, formulating, and assessing the intersection between phases of software engineering and AI disciplines?
RQ3: Where does the software engineering community stand now within the boundary of state-of-the-art AIOSE practice?
RQ4: Does the software development life cycle (SDLC) comprise sufficient intelligence for the development of intelligent systems?
RQ5: What are the most important trends and necessary improvements that shall be carried out in AIOSE?

2.2 Primary Studies Selection

A "Search Process" was utilized to find studies that addressed the topic of our study, as shown in Fig. 1 [6]. The following keywords were included in the potential search strings:

• "Software Engineering"
• "AI in Software Engineering"
• "AI-Oriented Software Engineering (AIOSE)"
• "Artificial Intelligence (AI) for Software Engineering"
• "Artificial Intelligence and Software Engineering"
• "Intelligent System"
• "Artificial Intelligence for Software Engineering" and
• "Software Engineering for Artificial Intelligence"

Throughout the study, we utilized these keywords and a variety of scientific resources. On July 28, 2022, the search was carried out incorporating both the title and the chosen keywords. We considered all studies that had been published as of the aforementioned date. The scientific databases reviewed to find these publications included the following:

• "IEEE Xplore Digital Library"
• "Google Scholar"
• "ACM Digital Library"
• "arXiv e-Print Archive"
• "ScienceDirect" and
• "SpringerLink"
2.3 Search Procedure and Results

After applying the search strings, we initially adopted a filtration strategy during the search process. To identify recent papers, we limited our search to comprise only studies published between 2016 and 2022. To further focus our search on relevant topics, additional filters were added to each database: ScienceDirect prompted us to pick Computer Science as the topic area and research papers as the article type, while IEEE Xplore includes conferences and journals. During the first search, 354 research publications in total were identified. We conducted a preliminary search and then used inclusion and exclusion criteria.
Fig. 1. Process of systematic literature review
2.4 Inclusion and Exclusion Criteria

Research papers with features unrelated to our literature review, as well as duplicates that showed up during the initial search, were removed using an exclusion and inclusion process based on (i) duplicate publications, (ii) full-text availability, (iii) peer review, and (iv) papers that are not related to the paper topic. Later, we considered a variety of thorough screening procedures, including the empirical method and paper findings, developing features of artificial intelligence and software engineering, and suggested solutions. 22 publications were chosen for our study after a thorough screening procedure that took into account the publication title, abstract, experimental results, and conclusions. The specifics of the inclusion and exclusion process are shown in Table 2.

Table 1. Illustrates search criteria of our study

Scientific databases    Initial keyword search    Total inclusion
Google Scholar          45                        07
ScienceDirect           16                        02
IEEE Xplore             16                        02
Springer Link           05                        01
ACM                     03                        01
Other Sources           20                        10
Total                   94                        22
Table 2. Depicts the inclusion and exclusion criteria for the primary studies

Inclusion: The research must be concerned with software engineering and artificial intelligence. Exclusion: Studies covering subjects other than the inclusion topics.
Inclusion: Studies are not repeated across the scholarly databases. Exclusion: Similar studies duplicated in a variety of scientific databases.
Inclusion: Information on software engineering and artificial intelligence is included in the papers. Exclusion: Documents that do not adhere to the intended domain.
Inclusion: Studies that are available in English. Exclusion: Studies in languages other than English.
Inclusion: Articles that have undergone peer review and been published in a journal or conference proceedings. Exclusion: Articles published without peer review.
2.5 Quality Evaluation We adopted a quality evaluation in our primary studies, recommended by Kitchenham and Charters [7], to ensure that the chosen studies were pertinent to the research questions. Following the authors, we randomly chose five publications for the quality evaluation procedure [8]. Stage 1: Artificial Intelligence and Software Engineering: The research will concentrate on AI and software engineering, where specific problems should be well-described to meet RQ2.
Stage 2: Context: We concentrated on contextual factors, such as the research aims and outcomes, in order to determine a correct interpretation of the study.
Stage 3: Corresponding Framework: The papers must include pertinent frameworks that cross artificial intelligence and software engineering and respond to our RQ1 and RQ3.
Stage 4: Approach Context: Presented frameworks must theoretically or practically relate to artificial intelligence in software engineering and address RQ4.
Stage 5: Study Findings: The articles must have presented adequate evaluation findings or research results. In response to RQ3, the investigations are also anticipated to provide future study directions.
Stage 6: Future Research: The publications must have outlined RQ5-related future research directions.

2.6 Publication Over Time

The idea of artificial intelligence for software engineering is an emerging research domain that has been receiving more and more attention recently. However, because AIOSE concepts are still in their infancy, we identified publications on our topic published after 2016 and depict the total number of papers published. According to the criteria outlined in our study design, Fig. 2 shows the history of published studies on AIOSE.
[Figure: bar chart "Publication Year and Source Types", showing the number of publications per year (2017–2022), grouped by source type (Journal, Conference, Article).]
Fig. 2. Number of publications and venues identified in search process
2.7 Significant Keywords Counts

Finding the right keywords is important for a systematic study to yield results that are appropriate for the study's objective. In this study, we list many terms that feature often in the primary studies we reviewed. The keywords and the proportion of various terms in all the primary studies are shown in Table 3.

Table 3. Relevant keywords count in the selected studies

Common keywords                 Total count
Artificial intelligence         3592
Software engineering            2048
Intelligent machine             979
Intelligent system              957
Machine learning                771
Deep learning                   828
Techniques                      745
Software development process    628
AISO                            109
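Counts such as those in Table 3 can, in principle, be produced by a straightforward frequency count over the full text of the selected studies. The sketch below shows one possible way using only the Python standard library; the keyword list and sample texts are placeholders, not the authors' scripts or data.

```python
# Minimal keyword-frequency sketch (file contents and keyword list are placeholders).
import re
from collections import Counter

KEYWORDS = ["artificial intelligence", "software engineering", "machine learning",
            "deep learning", "intelligent system"]

def count_keywords(texts):
    """Count how often each keyword phrase occurs across a collection of texts."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for phrase in KEYWORDS:
            counts[phrase] += len(re.findall(re.escape(phrase), lowered))
    return counts

sample_texts = ["Artificial intelligence supports software engineering ...",
                "Machine learning and deep learning techniques ..."]
for phrase, n in count_keywords(sample_texts).most_common():
    print(f"{phrase}: {n}")
```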
2.8 Prior Research

Professor Derek Partridge from the University of Exeter illustrates the relationship of AI to software engineering and provides a framework for the interactions of artificial intelligence (AI) and software engineering (SE) [9]. The researcher describes several classes of interaction between AI and SE, including software support environments, the usage of conventional software technology in AI systems, and the integration of AI tools and techniques into conventional software, all of which may have an impact on the software development process. For the software development environment, the author reflects on minimizing the complexity of software implementation for software developers, while in conventional software, AI tools and techniques point to future AI-based applications that shall perform in a robust and reliable way. The use of traditional software approaches in AI systems can be fascinating to AI-based application developers, although SE and AI development practices have significant differences that need to be addressed and overcome. Researcher Mark Harman from University College London works on a topic entitled "The Role of Artificial Intelligence in Software Engineering", where the author explores some of the relationships between software engineering and artificial intelligence [9]. According to the author, the development and maintenance of integrated, intelligent, sophisticated, interactive systems on a large scale have superseded small, localized, insulated, specialized, well-defined construction in technology, and the engineering community
needs more attention to development and deployment techniques in order to be provided with well-suited solutions. Researcher Lubna Mahmoud Abu Zohair from The British University in Dubai focuses on the question of whether AI can replace software engineers in the near future [10]. The author conducted a qualitative study directed towards software engineers and artificial intelligence professionals. According to the findings, software engineers shall be the primary actors shaping the future of AI-based systems. Another group of researchers studies the domain of artificial intelligence in software engineering, where the authors review the most commonplace methods of AI applied to SE [3]. The researchers investigated various studies between the years 1975 and 2017; 46 important AI-driven techniques were discovered, including 19 for design, 15 for development, 68 for testing, and 15 for release and maintenance. According to the authors, developing AI-based systems may have different issues; however, given the two fundamentally different premises and goals of each field, SE and AI, and all the conundrums to which this overlap might lead, it is not straightforward to claim the success or the necessity of implementing AI methodologies. Other than that, Masuda et al. [11] conducted a study on the assessment and advancement of ML application software quality. Washizaki et al. [12] carried out a multivocal analysis to find ML system architecture and design trends. Both a case study and a thorough literature analysis about software architecture for machine learning were carried out by Serban and Visser [13]. A review of deep neural network (DNN) verification and validation methods for the automobile sector was carried out by Borg et al. [14]. Aliza Tariq et al. [15] conducted a survey on software measurement by adopting artificial intelligence, where the researchers explored the software and automation requirements in the healthcare industry. Pornsiri Muenchaisri [16] studies how to apply artificial intelligence and machine learning to issues in software engineering research.
3 AI-Oriented Software Engineering: Challenges

We define AI-Oriented Software Engineering (AIOSE) as a domain that can bring together the principles of software engineering and intelligent systems by intersecting both fields. Considering the distinctive marks of AI, software engineering practitioners could benefit from the application of AIOSE practices. Our effort is to identify the most relevant AIOSE challenges and issues.

a. Role of SE and AI Professionals: The role of SE and AI professionals in the AIOSE domain is crucial, and more researchers, developers and other roles will be needed to improve the AIOSE field. Practice of AIOSE by experts in different fields, including computer science and information technology, shall provide an edge to this emerging area. The primary challenge that experts from other fields who try to accomplish AIOSE will face is to intersect both fields efficiently in real-world scenarios. In order to be successful in AIOSE, professionals need to understand both the principles of software engineering and the key concepts of artificial intelligence. In academia, AI and SE have two different curricula, and even computer
science students have limited knowledge of the software engineering field. As a result, graduates and new professionals should focus on both AI and SE in order to contribute to this emerging field.

b. Software Security, Quality and Reliability: Software security, quality, and reliability are important concerns for every software product and span the whole software development lifecycle (SDLC). According to Anthony Iannino and John D. Musa [17], software reliability concerns the dynamic operation rather than the design of a computer program and is a customer-oriented view of software quality, expressed as the probability of error-free software operation for a specified period in a specified environment. The AIOSE domain must address both the security and the reliability challenges, and both software engineering and artificial intelligence can contribute to accomplishing this. For instance, various AI methods and techniques, including machine learning (Naive Bayes, neural networks, convolutional neural networks) and deep learning, can be adopted to address the security challenges of an AI-based intelligent software product (a toy sketch of this idea follows at the end of this section). AI also has the capability to help ensure that intelligent software applications are of high quality.

c. Relevant Software Tools: In the software development life cycle (SDLC), relevant software tools play a crucial role and pave the way for requirements engineers, software architects, developers, and testers to implement a software product effectively and cost-efficiently. Without suitable tools and techniques, completing each development phase in the SDLC shall be challenging. Considering the modeling of the software product, specialized graphic models for the representation of AIOSE need attention in order to improve existing models. Due to the unique nature of intelligent systems, improving or creating new models (UML diagrams such as use case, class, activity, and sequence diagrams, for instance) is a demand for effective and uninterrupted AIOSE software development.
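As a toy illustration of the point made under challenge (b), the sketch below trains a Naive Bayes classifier on token features of a few hand-labeled code snippets to flag potentially insecure code. It assumes scikit-learn is available, the snippets and labels are invented for demonstration, and it is a sketch of the idea rather than a production vulnerability detector.

```python
# Toy sketch of ML-based vulnerability flagging (assumes scikit-learn; data is illustrative).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

snippets = [
    "query = 'SELECT * FROM users WHERE name = ' + user_input",       # SQL built by concatenation
    "cursor.execute('SELECT * FROM users WHERE name = %s', (user_input,))",
    "os.system('ping ' + host)",                                       # command built from input
    "subprocess.run(['ping', host], check=True)",
]
labels = ["vulnerable", "safe", "vulnerable", "safe"]

# Bag-of-tokens features feeding a Naive Bayes classifier.
model = make_pipeline(CountVectorizer(token_pattern=r"[A-Za-z_]+|\S"), MultinomialNB())
model.fit(snippets, labels)

print(model.predict(["cmd = 'rm -rf ' + path; os.system(cmd)"]))
```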
4 AIOSE: Software Engineering Research Domains

The disciplines of Artificial Intelligence (AI) and Software Engineering (SE) have developed separately: research in SE aims to support software engineering practitioners in developing better and more efficient software products, while AI focuses on perceiving, reasoning, and acting from a computational perspective. In recent years, various research directions have been taken to build new research areas by intersecting both the AI and SE disciplines. Distributed AI (DAI) or agent-oriented software engineering (AOSE), and Knowledge-Based Systems (KBS) are a few examples of such new domains. The primary purpose of intersecting AI and SE and carrying out extensive research on intelligent software systems is to provide a clear picture of how to develop such applications. The relevant sub-fields, methods, and techniques of AI are shown in Fig. 3. The scientific strand of software engineering and artificial intelligence orients towards providing helpful guidance for interdisciplinary research.
Fig. 3. Illustrates AI fields, methods, and techniques [18]
4.1 AIOSE: Artificial Intelligence for Software Engineering

Applying AI technologies to address software engineering problems is not a new trend but rather a decades-long effort to develop intelligent or automated tools and frameworks that provide an effective software development experience for the software engineering community [19], adopting AI ideas in general and applying intelligence in software engineering solutions at various levels [20]. The primary concern that both the AI and SE research communities should address is how to synergically integrate computing intelligence and human intelligence. The adoption of AI concepts and techniques at every level of the development process increases the efficiency of the whole SDLC process flow, as displayed in Fig. 4. Automation is a global trend in which the engineering community puts effort into automating everything in technology [21]. The AI domain can significantly pave the way for the automation of software engineering tools, techniques and approaches, which shall benefit all the stakeholders involved, both organizationally and in software project teams. Generating code from a UML diagram is possible because of intelligent systems [22] (an illustrative sketch follows after Fig. 5). Artificial intelligence can also help software engineering to automate programming, allowing software developers simply to say what is wanted and have a program produced completely automatically [23].

4.2 AIOSE: Software Engineering for Artificial Intelligence

Due to AI advancements, AI-based technologies, which refer to novel software systems comprised of distinctive features enabled by at least one AI component, are becoming more prevalent in society [3]. Developing, operating, and maintaining AI-based systems by applying the principles of conventional Software Engineering (SE) alone is unable to cover the entire process, so software engineering practitioners need to understand the state of the art of SE for AI. Figure 5 displays the relationship between software engineering and artificial intelligence.
Fig. 4. Utilizing AI techniques across the software development life cycle
Fig. 5. Illustrates Interaction between AI and SE
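Section 4.1 notes that intelligent tooling can generate code from design models such as UML class diagrams [22]. The deliberately simple, hand-written sketch below only illustrates the underlying idea of mapping a model to code: it turns an assumed dictionary description of a class (our own invented format, not a real UML interchange format) into a Python class skeleton. Real model-driven or AI-based generators are, of course, far more capable.

```python
# Minimal model-to-code sketch (the dictionary "model" format is an assumption for illustration).
def generate_class(model):
    """Emit a Python class skeleton from a tiny class-diagram-like description."""
    lines = [f"class {model['name']}:"]
    attrs = model.get("attributes", [])
    params = ", ".join(attrs)
    lines.append(f"    def __init__(self, {params}):" if attrs else "    def __init__(self):")
    for attr in attrs:
        lines.append(f"        self.{attr} = {attr}")
    if not attrs:
        lines.append("        pass")
    for op in model.get("operations", []):
        lines.append(f"    def {op}(self):")
        lines.append("        raise NotImplementedError")
    return "\n".join(lines)

uml_like_model = {"name": "Invoice", "attributes": ["number", "amount"],
                  "operations": ["total", "send"]}
print(generate_class(uml_like_model))
```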
4.3 AIOSE: Requirements Engineering

Requirements Engineering (RE) is considered one of the most important phases of the software development life cycle (SDLC), and traditional RE for conventional software development is well established today. Traditional RE is, however, unable to address the challenges coming from emerging technologies in the SDLC process; as a result, the improvement of RE needs significant attention, focusing on state-of-the-art concepts for AI-oriented software development. Applying artificial intelligence for faster, better, and clearer requirements is necessary for advancing Requirements Engineering and will ensure the
automation of processes, improvement of quality and reduction of risk by transforming data into insight and models [24]. This is true because good RE is the foundation of many successful software product launches; on the other hand, poor RE is the foundation of many project delays and failures that result from miscommunication, unnecessary dispute, and inaccurate interpretation, leading to time and cost complexity. AI approaches, including deep learning and natural language processing (NLP), can help in understanding the semantic concepts of a language and the processes involved in the context of requirements quality [25]. AI can also be adopted to furnish RE from various aspects, including completeness, consistency, and accuracy, to prevent avoidable confusion and delays, improve consistency, and reliably articulate the objectives of stakeholders. Furthermore, because of the numerous problems of requirements and systems engineering for AI-based applications, AI-oriented systems demand special attention from the requirements engineering perspective. In order to ensure system behavior and attributes including safety, robustness, and quality, as well as to establish process support in an organization, four major problem areas for requirements engineering of AI-based applications need to be solved: specifying data attributes and requirements, evaluating performance definition and monitoring, and human factors [26].

4.4 AIOSE: Software Design, Development and Testing

Karpathy [27] defines software 2.0 as an improved version of software 1.0, in which humans write the code. In software 2.0, however, artificial intelligence-based machines will develop the code based on a simple input, which can be a set of problems or designs. Different domains have already adopted intelligence-based programming, including routines for self-driving cars, voice synthesis, speech recognition, visual recognition, and gaming [28]. One popular AI approach, neural networks, has been utilized to assist software coding, which indicates the capability of AI in automated software implementation. For software development, various AI techniques can be utilized, including artificial neural networks and deep learning, to process design or automate debugging and improvement procedures, provide automatic techniques for converting problem statements into code, achieve better implementation times, reduce costs, and improve teamwork [28]. Figure 6 displays the areas where AI can support the software engineer's tasks, particularly software implementation by generating automated code, and automating software testing by creating test cases, identifying bugs, and automating deployment [29]. On the other hand, machine learning can be utilized for checking and testing the test scripts, using big data for probabilistic error prediction, abbreviating the test process and improving its cost-efficiency, integrating existing programs, and improving the efficiency of automated debugging and compiling [30].

4.5 AIOSE: Software Process Improvement (SPI)

The early stages of AI development, the dearth of research in this area, the presence of unsolved technical problems, and the lack of substantial studies that deploy AI-oriented SE applications in organizational settings all mean that the research community must perform effective research to show the processes, approaches, methods, and techniques
Fig. 6. Applying AI to software development
for AI-focused software architecture, development, testing, and maintenance. An important challenge, namely improving the current software process for AI-oriented software development, is to help organizations improve the quality and efficacy of intelligent software development by adhering to the proper methodology throughout the AI-oriented software development life cycle. Organizations benefit from Software Process Improvement (SPI) because it improves product quality while reducing project costs and development times. Large-scale software development is difficult in terms of time and cost due to the variety of existing processes and approaches, especially when it comes to novel and emerging technology. Emerging AI-focused software development needs to concentrate on SPI to overcome obstacles such as knowledge management, high cost, resource management, and change in workplace culture; these obstacles also include dependency on a single body of standardization for certification. We would like to suggest a domain called "AI-Oriented Software Process Improvement (AIOSPI)" and a suitable framework that would provide efficiency in SPI, minimize time, cost, and resources, and assist in managing the knowledge utilized to accomplish SPI. Software process improvement phases such as SPI Planning, SPI Implementation, SPI Evaluation, and SPI Current Stance are shown in Fig. 7 for organizations. In order to minimize project costs by improving the process and averting problems, redundancies,
Fig. 7. Display of the software process improvement phases
and flaws, SPI is crucial in the creation of AI-oriented software. For increased productivity in AIOSE development, the research community should concentrate on automated AIOSE-SPI frameworks in the near future.

4.6 AIOSE: Software Project Management

Software project management (SPM) aims to produce high-quality products on schedule, within budget, and in alignment with business objectives. Stakeholders play a key role in determining the processes and techniques that will meet the requirements of the intended products. Planning, organizing, monitoring, and modifying are the four primary phases of a project, together with team allocation, time management, budget, and maintenance [31, 32]. There are several research efforts in the area of managing software projects with an AI emphasis. The evolution of artificial intelligence (AI) in project management is seen in Fig. 8. Adopting machine learning techniques for software project management can help in a variety of areas, such as project risk assessment to reduce project losses, boost success rates, effectively lower project failure probabilities, boost output ratios for growth, and analyze software fault prediction based on accuracy (a small illustrative sketch follows at the end of this subsection). The entire team has to adopt a better model and framework in order to establish the baseline for assigning the project work, deadline, and budget based on the stakeholder
Fig. 8. Evolution of AI in project management
request. Due to the complexity of software development, managing the development of large-scale software systems is a problem for all software project managers. AI-enabled software development would be much more difficult still; thus, the research community should place emphasis on this field. Although we have not done any high-level SPM research for this study, we do plan to do explicit research on AI-oriented software project management in the future.
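To make the project-risk use case of Sect. 4.6 more tangible, the sketch below fits a logistic-regression model on a few assumed project features (team size, requirement volatility, schedule pressure) and estimates a probability of project trouble for a new project. Both the feature set and the tiny data set are invented for illustration and assume scikit-learn; they are not drawn from the studies surveyed here.

```python
# Illustrative project-risk sketch (features and data are invented; assumes scikit-learn).
from sklearn.linear_model import LogisticRegression

# Features per past project: [team_size, requirement_changes_per_month, schedule_pressure_0_1]
X = [[5, 1, 0.2], [12, 6, 0.8], [8, 2, 0.4], [20, 9, 0.9], [6, 1, 0.3], [15, 7, 0.7]]
y = [0, 1, 0, 1, 0, 1]   # 1 = project ran into serious trouble

model = LogisticRegression().fit(X, y)

new_project = [[10, 5, 0.6]]
risk = model.predict_proba(new_project)[0][1]
print(f"Estimated risk of trouble: {risk:.2f}")
```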
5 AIOSE: Discussion and New Research Direction

The Software Engineering community's appetite for AI techniques reflects the need for future research that intersects both fields for the betterment of both communities. Artificial Intelligence (AI) techniques are becoming more prevalent in real-world applications, and researchers and practitioners focusing on best practices strive to design intelligent systems capable of addressing software engineering issues. Besides, conventional software engineering can be applied to the development of AI-based systems, but it may compromise not only the security of the software product but also its quality as a whole. The characteristics and challenges of software engineering principles for intelligent systems are unique. Although there has been substantial advancement in the intersection of software engineering and artificial intelligence in recent years, significant research efforts are still needed to transform this field from an idea into a working paradigm. Thus, extensive research shall be carried out in the near future to facilitate the principles of software engineering, including requirements engineering, software architecture, software development, software testing, software project management, and software process improvement, and to prepare SE for futuristic software development by adopting state-of-the-art concepts.
6 Conclusion

In this paper, we provided a substantial overview of AI-oriented software engineering and acknowledged the most pertinent studies for state-of-the-art AIOSE practice. We specifically addressed the software development phases, including requirements, design, testing, release, and maintenance, in relation to AI-based techniques and methodologies. We also discussed the challenges and limitations, and proposed future research directions for AIOSE. We concluded, based on our research, that adopting artificial intelligence for software engineering can facilitate many subdomains such as software development, requirements engineering, and project risk assessment; this would increase project success rates, allow for efficient software development, and facilitate accurate analysis of software fault prediction. We suggest future extensive studies for the betterment of both the SE and AI communities.

Acknowledgment. The lead author would like to express special thanks and gratitude to Professor Hassan Pournaghshband for his meaningful advice and continuous guidance throughout the research. The work is partially supported by the U.S. National Science Foundation Award #2209638. Any opinions, findings, and conclusions or recommendations expressed in this
material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
References

1. Hossain Faruk, M.J., Subramanian, S., Shahriar, H., et al.: Software engineering process and methodology in blockchain-oriented software development: a systematic study. In: 2022 IEEE/ACIS 20th International Conference on Software Engineering Research, Management and Applications (SERA), pp. 120–127 (2022). https://doi.org/10.1109/SERA54885.2022.9806817
2. Neesi: Software Engineering: Key Enabler for Innovation (2014)
3. Batarseh, F.A., Mohod, R., Kumar, A., Bui, J.: The application of artificial intelligence in software engineering: a review challenging conventional wisdom. Data Democr Nexus Artif. Intell. Softw. Dev. Knowl. Eng. 179–232 (2020). https://doi.org/10.1016/B978-0-12-818366-3.00010-1
4. Sundaresan, N.: Research talks: AI for software development. In: Microsoft Res. Summit 2021 (2021). https://www.microsoft.com/en-us/research/video/research-talks-ai-for-software-development
5. Upama, P., et al.: Evolution of quantum computing: a systematic survey on the use of quantum computing tools. In: COMPSAC 2022: Computer Software and Applications Conference. Torino, Italy (2022)
6. Nazim, M.T., et al.: Systematic analysis of deep learning model for vulnerable code detection. In: 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 1768–1773 (2022)
7. Kaiwartya, O., et al.: Guidelines for performing systematic literature reviews in software engineering. 4, 5356–5373 (2022)
8. Hosseini, S., Turhan, B., Gunarathna, D.: A systematic literature review and meta-analysis on cross project defect prediction. IEEE Trans. Softw. Eng. 45, 111–147 (2019). https://doi.org/10.1109/TSE.2017.2770124
9. Harman, M.: The role of artificial intelligence in software engineering. In: 2012 First International Workshop on Realizing AI Synergies in Software Engineering (RAISE), pp. 1–6 (2012). https://doi.org/10.1109/RAISE.2012.6227961
10. Mahmoud, L., Zohair, A.: The future of software engineering by 2050s: will AI replace software engineers? Int. J. Inf. Technol. Lang. Stud. 2, 1–13 (2018)
11. Masuda, S., Ono, K., Yasue, T., Hosokawa, N.: A survey of software quality for machine learning applications. In: 2018 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pp. 279–284 (2018). https://doi.org/10.1109/ICSTW.2018.00061
12. Washizaki, H., Uchida, H., Khomh, F., Guéhéneuc, Y.G.: Studying software engineering patterns for designing machine learning systems. In: 2019 10th International Workshop on Empirical Software Engineering in Practice (IWESEP), pp. 49–54 (2019). https://doi.org/10.1109/IWESEP49350.2019.00017
13. Serban, A., Van Der Blom, K., Hoos, H., Visser, J.: Adoption and effects of software engineering best practices in machine learning. In: Proceedings of the 14th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pp. 1–12 (2020). https://doi.org/10.1145/3382494.3410681
14. Vogelsang, A., Borg, M.: Requirements engineering for machine learning: perspectives from data scientists. In: 2019 IEEE 27th International Requirements Engineering Conference Workshops (REW), pp. 245–251 (2019). https://doi.org/10.1109/REW.2019.00050
AI-Oriented Software Engineering (AIOSE)
19
15. Tariq, A., et al.: Software measurement by using artificial intelligence. J. Nanomater (2022). https://doi.org/10.1155/2022/7283171 16. Muenchaisri, P.: Literature reviews on applying artificial intelligence/machine learning to software engineering research problems: preliminary. CEUR Workshop Proc 2506, 30–35 (2019) 17. Iannino, A., Musa, J.D.: Software reliability. Adv. Comput. 30, 85–170 (1990). https://doi. org/10.1016/S0065-2458(08)60299-5 18. Rech, J., Althoff, K.-D.: Artificial intelligence and software engineering: status and future trends. Themenschwerpkt KI SE, KI 3, 5–11 (2004) 19. Hema Shankari, K.: A survey on using artificial intelligence techniques in the software development process. J Eng Res Appl 4, 24–33 (2014). www.ijera.com 20. Meziane, F., Vadera, S.: Artificial intelligence in software engineering, 278–299 (2010). https://doi.org/10.4018/978-1-60566-758-4.ch014 21. Shehab, M., et al.: Artificial intelligence in software engineering and inverse: Rev. Int. J. Comput. Integr. Manuf. 33, 1129–1144 (2020). https://doi.org/10.1080/0951192X.2020.178 0320 22. Sejans, J., Nikiforova, O.: Problems and perspectives of code generation from UML class diagram. Sci. J. Riga Tech. Univ. Comput. Sci. 44, 75–84 (2012). https://doi.org/10.2478/v10 143-011-0024-3 23. Ford, L.: Artificial intelligence and software engineering: a tutorial introduction to their relationship. Artif. Intell. Rev. 1, 255–273. https://doi.org/10.1007/BF00142926 (1987) 24. Dalpiaz, F., Niu, N.: Requirements engineering in the days of artificial intelligence. IEEE Softw. 37, 7–10 (2020). https://doi.org/10.1109/MS.2020.2986047 25. Zollinger, P.: Advancing Requirements Engineering by Applying Artificial Intelligence. Evocean 26. Heyn, H.M., et al.: Requirement engineering challenges for AI-intense systems development. In: 2021 IEEE/ACM 1st Workshop on AI Engineering-Software Engineering for AI (WAIN), pp. 89–96 (2021 ). https://doi.org/10.1109/WAIN52551.2021.00020 27. Karpathy, A.: Software 2.0. Medium (2017) 28. Barenkamp, M., Rebstadt, J., Thomas, O.: Applications of AI in classical software engineering. AI Perspect. 2(1), 1–15 (2020). https://doi.org/10.1186/s42467-020-00005-4 29. Yao, M.: 6 Ways AI Transforms How We Develop Software. Forbes Media LLC (2018) 30. Meziane, F., Vadera, S., Global, I.: Artificial intelligence applications for improved software engineering development: new prospects. Artif. Intell. Softw. Eng. 278–299 (2010) 31. Wessling, F., Gruhn, V.: Engineering software architectures of blockchain-oriented applications. In: 2018 IEEE International Conference on Software Architecture Companion (ICSA-C), pp. 45–46 (2018).https://doi.org/10.1109/ICSA-C.2018.00019 32. Ortu, M., Orru, M., Destefanis, G.: On comparing software quality metrics of traditional vs blockchain-oriented software: an empirical study. In: 2019 IEEE International Workshop on Blockchain Oriented Software Engineering (IWBOSE), pp. 32–37 (2019). https://doi.org/10. 1109/IWBOSE.2019.8666575
Give Me the Right Sensor and I Can Learn Anything – Even Beat the Burnout Andreea Ionica and Monica Leba(B) University of Petrosani, Universitatii 20, 332006 Petrosani, Romania {andreeaionica,monicaleba}@upet.ro
Abstract. The aim of the research is to present the results of the design and use of a system relating organizational climate assessment to burnout level. Through appropriate methods, techniques and tools, the system integrates extrinsic influences, related to particular aspects of the organizational climate, and intrinsic ones, individualized by personal physiological parameters, as reflected in the burnout state of school principals in southern Israel. The approach is based on integrating research results on the influence of organizational climate on burnout with research on the design of a system for assessing and predicting the burnout state of school principals in southern Israel, with the goal of validating this system. Keywords: Burnout · Organizational climate · Prototype · Machine learning · Prediction
1 Introduction
The proposed approach transfers methods, techniques and tools from different disciplines to obtain a burnout evaluation/prediction system whose key elements are the results of the organizational climate analysis, the burnout evaluation and the measured physiological parameters, combined by a machine learning algorithm. Through this approach, the specialist is assisted by a system that learns on its own and becomes more and more accurate as more data is entered. The input elements are: general elements, resulting from the conceptual delimitations regarding the burnout phenomenon, the inventory of organizational and individual factors affecting burnout and the burnout assessment tools used in different fields of activity; and specific elements that have allowed the identification of the research area and context, such as the characteristics of the Arab schools in southern Israel (Bedouin area), the elements that define the profile of the school principal in the studied area, and the specific stressors and applicable coping strategies, which are all potential sources of burnout. The output elements represent the results of the evaluation, by the proposed system, of the burnout level of school principals in relation to the organizational climate and the individual factors reflected in the measured physiological parameters.
The identification of the research premises, area and context, and the justification of the system input elements, are presented in the section Conceptual Background and Related Works. The data collection protocol, the developed prototype and the machine learning algorithm are presented in the section Materials and Methods. The outcomes are presented in the section Results, based on the evaluation by the proposed system of the burnout level of school principals in relation to the organizational climate and the individual characteristics reflected in the measured physiological parameters. In the Discussions and Conclusions section, the obtained results are compared with the results of other research, both regarding the parameters considered in evaluating the stress state and regarding the machine learning model. The resulting system is designed for school principals in southern Israel (Bedouin area), but can be adapted to other areas of activity; both the burnout assessment questionnaire and the organizational climate analysis questionnaire change at the input level. The wearable device offers a new sensor, but a person must close the loop and take action with the help of the specialist, based on knowledge of coping strategies (cognitive and emotional), in a person-centered approach.
2 Conceptual Background and Related Works Figure 1 shows the conceptual approach that was the basis for the study of the literature on establishing the relationship between the key variables (constructs) of the research topic - burnout and the organizational climate - and the context and premises of the research.
Fig. 1. Conceptual approach.
In order to evaluate the relationship between the two constructs - burnout and organizational climate - the following were analyzed from the literature: burnout (conceptual delimitations, determinants and physiological parameters); organizational climate in general and the school climate in particular; aspects of work satisfaction and motivation in connection with the two constructs; and the methods and tools that link the two constructs.
The evaluation of the relationship results from applying the methodological tool based on the study of the methods and tools used in different research works. Burnout is generated by chronic occupational stress, the occurrence of which is determined by aspects related to the organizational climate. The existence of psycho-socio-professional risks and of a certain level of work-related stress is a reality, but also a current challenge, because of the significant incidence, at the level of the entire organization, of problems identified at the employee level such as cardiovascular disease, gastrointestinal problems, musculoskeletal disorders, accidents, substance abuse, conflicts between employees and family, anxiety, depression, fatigue, but also burnout syndrome [1]. Burnout syndrome should not be seen only at the individual level, but as a symptom of the health of the entire organization. The last decades have been marked by major changes in economic, political, social and cultural terms, with a profound impact on the activities carried out in organizations. Their influence, also reported by [2], concerns the occurrence of depression, panic disorder, stress and burnout syndrome, mental disorders, substance use disorders and more, each of these disorders being correlated with activity in the professional environment. Burnout became a topic of interest in psychology and psychiatry, being first described in the mid-1970s [3]. Generally viewed as a work-induced condition [4], the concept has also become a topic of interest in the field of occupational safety and health [5]. The syndrome has been associated with a variety of negative occupational consequences, including decreased work performance and absenteeism [6, 7], but also with serious effects on employees' health [8]. Research on burnout has intensified in recent years, with various recommendations for managing occupational stress and burnout [9, 10]. According to [11], the term burnout tries to integrate symptoms such as fatigue, emotional exhaustion, reduced personal fulfillment and distance from colleagues. Regarding the second key variable, although the climate has been consistently described as employees' perceptions of their organizations, the construct has suffered over the years due to contradictory definitions and inconsistencies in operationalization. The dominant approach conceptualizes climate as employees' common perceptions of organizational events, practices, and procedures. These perceptions are assumed to be more descriptive than evaluative [12]. Later, [13] contradicted this view, suggesting strong evaluation components. At the level of individual analysis, called the psychological climate [14], these perceptions represent how work environments are cognitively assessed and represented in terms of their meaning and significance for individual employees within organizations. The analysis of the organizational climate deals with the evaluation of the level of professional satisfaction. If what is achieved is a description of personal perceptions referring to certain aspects of work in the organization, or an assessment of how the employee is motivated/encouraged/stimulated to complete a task, then this is an analysis of the organizational climate and not an assessment of the attitude at work. The general features specific to various schools in terms of organizational climate analysis are: the presence of subjective variables (attitudes of students, principals, teachers, etc.)
and objective variables (environment, time, resources, etc.), general axiomatic aspects (pedagogical management, curriculum design), the latent action with lasting effects in the development process of the organization, relative stability in relation to the
subjective and objective variables, the specific cultural dimension manifested through a set of actions and multiple pedagogical influences. The climate of the school organization also takes into account the definition of the psychosocial state expressed at the cultural level through the internal and external context, all of these being generated by objective and subjective variables. Each school is characterized by a unique climate, because schools operate in different ways, and the type of climate that prevails in a school is a mixture of the behavior of the school principal with that of the teachers, students and parents in the school. The climate varies from school to school and is seen as constantly changing.
2.1 Related Works
A series of studies addresses the relationship between school organizational climate and occupational stress [15], teachers' motivation, and their performance at work, given their convergence towards burnout. The relationship between organizational climate and the motivation of teachers [16] and their performance at work [17] has been explored in various studies, but the literature does not sufficiently capture how the organizational climate and the motivation of employees influence the occurrence of burnout. However, [18] shows that the organizational climate and low motivation at work are predictors of the occurrence of occupational stress that can lead to burnout. The authors of [19] conducted a study on the relationship between teacher motivation and organizational climate, emphasizing the importance of the management style. The relationship between organizational climate and workplace burnout was studied by [20, 21], while [22] followed the relationship between organizational climate, workplace stress and burnout syndrome. In terms of methods and tools used, research has shown that the most commonly used model for operationalizing burnout is the one proposed by Maslach and Jackson (Maslach Burnout Inventory, MBI), which, however, has been criticized, with the recommendation to take into account the causal relationship between stressors and climate dimensions. Arguments regarding the need to design a new instrument are also supported by the fact that, without an adequate and unanimously accepted conceptualization applicable in all cases of burnout, the results of applying measurement scales can be challenged and will have questionable applicability [23]. Thus, [24, 25] recommended using multidimensional burnout questionnaires in combination with valid depression scales in order to assess exhaustion clinically, more effectively than using the MBI exclusively. As the research area includes school principals in the southern part of Israel, the premises for the development of the research were identified on the basis of the studies undertaken by [26, 27], having as subjects principals of Israeli schools, whose main purpose was to identify the main risk factors and the organizational and individual stressors on burnout, and by [28] on the burnout phenomenon of Israeli Arab school principals, considered a vulnerable group exposed to burnout due to working in a complex environment, but also due to the fact that they are members of a minority community in the state of Israel, both numerically and in religious affiliation. The influence of the changes of the last 20 years and of the changes in the organizational climate has led to the need for further investigations into the phenomenon, and for the continuation and updating of research on the occurrence of burnout in the principals of Israeli Arab
schools. The relationship between burnout and the organizational climate was evaluated in two phases. In the first phase, the methodological toolset included questionnaires with stable reliability coefficients that had also been validated in the educational contexts and systems of Israel. The Israeli Arab teachers who participated in this study were selected from a sample of 4 schools in southern Israel (representing 22% of the total number of schools), meaning a total of 142 teachers aged between 22 and 62, from whom 120 validated questionnaires were obtained. Prior to this research, the application, analysis and interpretation of the questionnaires had been made at the level of each school. There were no significant differences between the results obtained at each school level, which allowed, in the second phase of the research, the processing and interpretation of the results to provide an overview of the issues concerning the communication between teachers and the school principal, the dimensions of the organizational climate (harmony, interpersonal relationships and trust), and the level of burnout of principals. Cross-sectional research has been done using stratified sampling [29]. The results obtained in this study highlight aspects that constitute the premises for the next phase of research. The second phase aimed to extend the research area, which was initially established at four schools from districts of southern Israel (Ber Mshash, Abutal, Elbaian, Aatam). Investigating a larger population (of both principals and teachers) can lead to more accurate and relevant findings expressing the trends that might be useful for estimating burnout levels for school principals. For reaching the objective of the second phase, a quantitative method was used as well, namely a questionnaire-based survey. From a total of 35 schools in the Bedouin area of southern Israel, 30 agreed to participate. The questionnaire was applied to the total number of employees of the 30 schools, namely 705, with 686 questionnaires validated (filled in by 30 principals and 656 teachers). The research was conducted during the period June–August 2019 [30]. The statistical research confirmed that there is a significant negative correlation between the principals' burnout state and the perception regarding the organizational climate (p < 0.01, r = –0.637). The results obtained are: the identification of the dimensions of the organizational climate with the greatest influence on the appearance of the burnout state (these are work satisfaction and feeling of security), with no influence from the demographic variables. Also, based on these results and the results presented by [27], a synthesis of the relationship between the dimensions of burnout, the dimensions of the organizational climate and the risk factors was made, pointing out that the risk factors are taken into account through their influence on the climate dimensions, as presented in the paper [30].
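As a minimal illustration of the kind of correlation analysis reported above, the snippet below computes a Pearson correlation between two score lists with SciPy; the arrays are invented placeholders, not the study's data, and they do not reproduce the reported value of r = –0.637.

```python
from scipy import stats

# Hypothetical per-principal scores (placeholders, not the study's data):
burnout_scores = [3.8, 2.1, 4.5, 2.9, 3.2]   # higher = more burnout
climate_scores = [2.0, 4.1, 1.7, 3.5, 3.0]   # higher = better perceived climate

r, p_value = stats.pearsonr(burnout_scores, climate_scores)
print(f"r = {r:.3f}, p = {p_value:.4f}")      # a negative r mirrors the reported relationship
```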
3 Materials and Methods Based on the obtained research results, the wearable device prototype [31] is designed, predicting the occurrence of the burnout state of the school principals in the Bedouin area of southern Israel under the influence of the organizational climate related risk
factors. The device is calibrated using both the principals' answers to questionnaires and the values of the physiological parameters.
3.1 Prototype
The burnout estimation device was built with HR and SpO2 sensors placed on the finger, in order to acquire the data necessary to develop the estimation algorithm. Fig. 2 shows the current prototype for physiological parameters measurement.
Fig. 2. Current prototype of the burnout device.
The prototype from Fig. 2 consists of HR and SpO2 sensors, a microcontroller-based system and an RGB LED output that signals, by a color code, the stress class resulting from the classification.
3.2 Protocol
Prior to running the tests and data acquisition, the administration, scoring, and test interpretation were extensively analyzed. Therefore, the research focused on a pilot study on 30 subjects. For this purpose, the following steps were taken: 1. Assessment of the burnout state based on questionnaires [32]; 2. Application of the protocol aiming to obtain data by inducing different stress states (hyperventilation – HV and discussions on conflicting events – DEC) to establish the link between the physiological parameters and the level of discomfort rated by the specialist (Y1 – Discomfort; Y2 – Anxiety; Y3 – Body sensations; Y4 – Non-control of thoughts). HR/SpO2/GS measurements are taken under the influence of the stressors (HV and DEC). Previous research [33] showed that the order of the stress stimuli (HV, DEC) does not matter. Thus, the subjects participated in an experimental session using physiological and psychological signals under the following conditions: calm state (baseline L1), physiological stress stimulus (HV), psychological stress stimulus (DEC). The emotional state was assessed by the specialist after each session part (baseline (L1), threat (DEC), stimulation (HV) and post-stress baseline (L2)), using a Likert scale (0–100), and represents the training outputs of the evaluation system. Specifically, the
subjects were asked to evaluate the following emotional parameters: the level of discomfort, the level of anxiety, the bodily sensations and the uncontrolled thoughts. In short, as a general presentation, there are four stages. The first stage (L1): induction of a state of relaxation, suggesting the subject be comfortable. The second stage (HV): hyperventilation, i.e. deep and rapid breathing. The third stage (DEC): discussions about conflicting events. The fourth stage (L2): resting state. Following the application of the first two stages, data are obtained for the machine learning algorithm.
3.3 Machine Learning Algorithm
We treat the problem of classifying the state of stress considering four classes, as follows: y1 = Low stress, y2 = Medium stress, y3 = High stress and y4 = Burnout. For this we use a logistic regression algorithm based on the sigmoid function:

h_\theta(X) = \frac{1}{1 + e^{-\theta^T X}}   (1)

This is the estimated probability that the output will be one of the four classes above for a given input X. As previously justified, in estimating the degree of stress a number of n = 4 parameters are relevant, as follows: x1 = principal's perception of the climate, x2 = teachers' perception of the climate, x3 = HR, x4 = SpO2, giving

X^T = [1 \;\; x_1 \;\; x_2 \;\; x_3 \;\; x_4]   (2)
For parameter x3 we considered a reference value, the base value for each person, relative to which the current value is expressed. No such normalization was required for x4, because oximetry can be considered to vary within standard limits without losing generality. The values of the parameters x1 and x2 are the result of questionnaires, and thus are already homogenized. As training data we used the data obtained from a group of 30 people through the implementation of the stress measurement protocol [33, 34], obtaining a number of m = 120 values for the training set, of the form: (X^{(1)}, y^{(1)}), (X^{(2)}, y^{(2)}), ..., (X^{(120)}, y^{(120)}). The usual applications for determining the level of stress, such as those that exist in smartphones or smartwatches, use two physiological parameters, namely HR and SpO2. For the target group considered and the training data collected, we represented graphically, using Octave, in Fig. 3 the four stress classes considered, on the basis of the two parameters mentioned above. As can be seen, there is no clear demarcation between the four classes based only on these two parameters. This means that additional parameters are needed to provide extra information and allow for clearer class differentiation. Previous studies have shown the significant influence of the organizational climate, and we therefore considered two more parameters, related to the organizational climate perceived by the principal and by the teachers.
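To make the feature construction above concrete, the following is a minimal sketch (not the authors' Octave code) of how the input vector of Eq. (2) and the hypothesis of Eq. (1) could be assembled; the function and variable names, as well as the numeric values, are illustrative assumptions.

```python
import numpy as np

def build_features(climate_principal, climate_teachers, hr, hr_baseline, spo2):
    """Assemble X = [1, x1, x2, x3, x4] as in Eq. (2).

    x3 is the current heart rate expressed relative to the person's
    baseline value, as described in the text; SpO2 (x4) is used as-is.
    """
    return np.array([1.0, climate_principal, climate_teachers, hr / hr_baseline, spo2])

def hypothesis(theta, x):
    """Sigmoid hypothesis h_theta(X) of Eq. (1)."""
    return 1.0 / (1.0 + np.exp(-theta @ x))

# Illustrative values only (not data from the study):
x = build_features(climate_principal=4.5, climate_teachers=3.8,
                   hr=75.0, hr_baseline=70.0, spo2=94.0)
theta = np.zeros(5)            # untrained parameters
print(hypothesis(theta, x))    # 0.5 before any training
```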
Fig. 3. Stress level as function of heart rate (HR) and oximetry (SpO2).
Figure 4, Fig. 5 and Fig. 6 present the classification of the stress level into the four considered classes based on combinations of the four parameters found to be relevant. As can be seen, the combination of these four parameters allows a good classification based on the 120 training records collected from the 30-subject group.
Fig. 4. Stress level as function of heart rate (HR), SpO2 and climate perception from the principal's point of view.
Fig. 5. Stress level as function of HR and climate perception from the principal's and teachers' point of view.
Fig. 6. Stress level as function of SpO2 and climate perception from the principal's and teachers' point of view.
4 Results
Given that the problem here is multi-class classification, we will address the one-vs-all option and we will develop 4 binary classification solutions, one for each class. Thus, we will have 4 hypotheses of the following form:

h_\theta^{(i)}(X) = P(y = y_i \mid X; \theta), \quad i = 1, 2, 3, 4   (3)

which, for each class i, determines the probability that y = y_i. For each X entry we make the 4 predictions and choose the class i that maximizes the hypothesis:

\text{prediction} = \max_i \, h_\theta^{(i)}(X)   (4)

To choose the theta parameters we use the cost function:

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta\!\left(X^{(i)}\right) + \left(1 - y^{(i)}\right) \log\!\left(1 - h_\theta\!\left(X^{(i)}\right)\right) \right]   (5)
which is a convex function and provides us with a global minimum. Using this function, we apply the gradient descent algorithm, whose update rule has the same form as that of linear regression. For our problem we initially considered a hypothesis with linear (first-degree) features and obtained the following decision boundaries:
Decision boundary for class 1 (Low stress), with a cost at the optimized theta of 0.235666: 2.77 · x1 + 1.35 · x2 − 20.41 · x3 + 0.08 · x4 = 3.43.
Decision boundary for class 2 (Medium stress), with a cost at the optimized theta of 0.531876: 0.33 · x1 − 1.96 · x2 − 4.39 · x3 + 0.37 · x4 = 24.4.
Decision boundary for class 3 (High stress), with a cost at the optimized theta of 0.483802: 0.6 · x1 + 0.6 · x2 + 2.57 · x3 − 0.3 · x4 = −20.16.
Fig. 7. Gradient descent for: a. second degree characteristics in hypothesis, b. Third degree characteristics in hypothesis.
Decision boundary for class 4 (Burnout), with a cost at the optimized theta of 0.110065: 12.49 · x1 − 7.2 · x2 − 18.91 · x3 + 0.94 · x4 = 82.15.
For the set of 120 training data, we obtained the following train accuracies: Low stress: 90.000000; Medium stress: 48.333333; High stress: 72.500000; Burnout: 89.166667. We also performed the training with nonlinear functions of second and third degree, and the train accuracies were similar. Figure 7.a and Fig. 7.b present the gradient descent for these two nonlinear functions for each of the four classification problems. For testing, we made a controlled collection of data from the 30 subjects, i.e. the classification by the psychologist into one of the four classes considered. There are 7 subjects in state y1 (Low stress), 9 in y2 (Medium stress), 11 in y3 (High stress) and 3 in y4 (Burnout). We tested the 3 variants (one linear and 2 nonlinear) and obtained the errors shown in Table 1.
Table 1. Prediction errors.

Hypothesis | False positive (Low stress / Medium stress / High stress / Burnout) | False negative (Low stress / Medium stress / High stress / Burnout)
Linear | 4 / 6 / 5 / 1 | 5 / 4 / 6 / 1
Second order | 0 / 5 / 3 / 0 | 1 / 3 / 4 / 0
Third order | 1 / 5 / 4 / 1 | 2 / 4 / 5 / 0
As can be seen, a well-fitting variant is the second order function; the linear variant has too many errors, and the third order is comparable to the second order. Most of the false positives in all cases are for Medium stress, while most of the false negatives are for High stress. Also, in the linear case there are many false negative cases for Low stress, which can be an issue. Even if there are not many false positive cases of Burnout for the linear and third order cases, the unnecessary alarming could also be an issue. Below, two of the 30 test cases are presented with the exact values obtained by running the trained machine learning model.
For example, for the following entry, which corresponds to Low stress, X^T = [4.5 3.8 1.07 94], we obtained the following values:
Linear: predicted probabilities of Low stress = 0.422307, Medium stress = 0.358564, High stress = 0.443746, Burnout = 0.000000.
Second order: predicted probabilities of Low stress = 0.561328, Medium stress = 0.534520, High stress = 0.075111, Burnout = 0.009682.
Third order: predicted probabilities of Low stress = 0.289657, Medium stress = 0.270074, High stress = 0.150221, Burnout = 0.002823.
It is observed that the linear system classifies the case as High stress, while the other two classify it as Low stress. By contrast, for the case corresponding to Burnout, X^T = [2.5 2.4 1.4 94], all three considered hypotheses fall into the Burnout category:
Linear: predicted probabilities of Low stress = 0.000001, Medium stress = 0.515083, High stress = 0.194848, Burnout = 0.997406.
Second order: predicted probabilities of Low stress = 0.126824, Medium stress = 0.037977, High stress = 0.000000, Burnout = 0.999982.
Third order: predicted probabilities of Low stress = 0.013348, Medium stress = 0.000000, High stress = 0.000000, Burnout = 1.000000.
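The class probabilities reported above follow from the one-vs-all scheme of Eqs. (3)–(5). The sketch below is a compact NumPy illustration of that scheme — one binary logistic regression per stress class trained by batch gradient descent, with prediction by the class of maximum probability. It is not the authors' Octave implementation; it uses first-degree features and randomly generated stand-in data, so the resulting numbers are purely illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Logistic regression cost J(theta) of Eq. (5) for a binary 0/1 target y."""
    h = sigmoid(X @ theta)
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / len(y)

def train_binary(X, y, alpha=0.1, iters=5000):
    """Batch gradient descent for one of the one-vs-all classifiers."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        theta -= alpha * X.T @ (sigmoid(X @ theta) - y) / len(y)
    return theta

def train_one_vs_all(X, labels, n_classes=4):
    """One classifier per class; labels take the values 1..4 (Low stress..Burnout)."""
    return [train_binary(X, (labels == c).astype(float)) for c in range(1, n_classes + 1)]

def predict(thetas, x):
    """Eq. (4): the predicted class maximizes the per-class hypothesis."""
    probs = [sigmoid(theta @ x) for theta in thetas]
    return int(np.argmax(probs)) + 1, probs

# Stand-in for the 120-record training set (random data, illustrative only).
rng = np.random.default_rng(0)
X = np.hstack([np.ones((120, 1)), rng.normal(size=(120, 4))])   # columns: 1, x1..x4
labels = rng.integers(1, 5, size=120)                           # classes 1..4
thetas = train_one_vs_all(X, labels)
print(cost(thetas[0], X, (labels == 1).astype(float)))          # cost at the trained theta
print(predict(thetas, X[0]))                                    # (class, per-class probabilities)
```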
5 Discussions and Conclusions
The usefulness of the machine learning model is related to the inclusion, as parameters, of the dimensions of the organizational climate, which allows a better classification than one using exclusively the physiological parameters. Data measured by a psychologist
specialized in educational field stress issues were used to train the model, and the target group chosen for this study was from schools in the Bedouin area of southern Israel. The present research has shown the need to introduce organizational climate issues, in addition to physiological data, in assessing the burnout state of principals; this is supported by previous research that has shown, based on data analysis, the existence of a negative relationship between climate and burnout. This need is also visible in the graphical representation of the burnout state based only on the physiological data, which does not allow a clear differentiation into stress classes, these being very much overlapping (Fig. 3). The research highlighted the results of the evaluation of the organizational climate from the perspective of principals and teachers for each school. The study was conducted using as subjects 30 school principals from southern Israel and a total of 658 responding teachers. The aim of the research was to create a tool that warns the user about the danger of reaching the state of burnout in order to prevent it. There are currently wearable smart bracelet devices that use only physiological data to assess stress levels. What they do is not enough in the context of burnout prevention, which has become a topic of great interest, and previous research has looked at it from an occupational disease standpoint. The hardware and software prototype supporting the research allows the collection of physiological data with dedicated sensors, includes a logistic regression machine learning algorithm for estimating the current state of stress, and is an important step towards assessing the tendency to burnout as quickly and accurately as possible. The tool resulting from the research, i.e., the prototype embedding physiological sensors and a machine-learning-based burnout prediction algorithm, under testing on the 30 principals considered, provided over 90% accuracy for the Low stress and Burnout class predictions and 75% accuracy for the High stress class. This accuracy can be improved by increasing the number of subjects and the data collected. The critical class in the context of predicting the state of Burnout is High stress, which has an accuracy of 75%. The incorrectly classified cases were all in the Medium stress category, and these subjects, in the psychologist's assessment based on the 4 items of the burnout assessment protocol (1. Discomfort, 2. Anxiety, 3. Body sensations, 4. Uncontrolled thoughts), were ranked close to the lower limits of the High stress category and the upper limits of the Medium stress category. This is certainly a limitation of this research, but it also represents concrete results for the case, given the limited number of subjects from the research area, the Bedouin area in southern Israel, which contains 30 schools, all analyzed in this research. Adopting organizational climate questionnaires per occupational area and geographical region will impact the machine learning model by training it with data specific to an occupation and a context. Another direction of research would be to assess the state of burnout in pandemic conditions. For this, the specific elements of the organizational climate in the analyzed area can be considered a very good reference point in dealing with the problem of burnout in the current pandemic situation.
References
1. Cox, T., Griffiths, A., Rial-Gonzalez, E.: Research on work-related stress. Office for Official Publications of the European Communities: European Agency for Safety & Health at Work, Luxembourg (2000)
2. Lima, M.E.: A polêmica em torno do nexo causal entre distúrbio mental e trabalho. Psicologia em Revista 10(14), 82–91 (2003)
3. Maslach, C., Pines, A.: The burn-out syndrome in the day care setting. Child Youth Care Forum 6(2), 100–130 (1977)
4. Schaufeli, W.B., Taris, T.W.: The conceptualization and measurement of burnout: common ground and worlds apart. Work Stress 19(3), 256–262 (2005)
5. Schonfeld, I.S., Chang, C.-H.: Occupational Health Psychology: Work, Stress, and Health. Springer, New York (2017)
6. Swider, B.W., Zimmerman, R.D.: Born to burnout: a meta-analytic path model of personality, job burnout, and work outcomes. J. Vocat. Behav. 76(3), 487–506 (2010)
7. Weber, A., Jaekel-Reinhard, A.: Burnout syndrome: a disease of modern societies. Occup. Med. J. 50, 512 (2000)
8. Toker, S., Melamed, S., Berliner, S., Zeltser, D., Shapira, I.: Burnout and risk of coronary heart disease: a prospective study of 8838 employees. Psychosom. Med. 74(8), 840–847 (2012)
9. Epstein, R.M., Privitera, M.R.: Doing something about physician burnout. Lancet 388(10057), 2216 (2016)
10. Shanafelt, T.D., Dyrbye, L.N., West, C.P.: Addressing physician burnout: the way forward. JAMA 317(9), 901–902 (2017)
11. Hillert, A., Marwitz, M.: Die Burnout-Epidemie oder brennt die Leistungsgesellschaft aus? C.H. Beck, München (2006)
12. Schneider, B.R.: On the etiology of climates. Pers. Psychol. 36, 19–39 (1983)
13. Patterson, M.G., Warr, P.B., West, M.A.: Organizational climate and company productivity: the role of employee affect and employee level. J. Occup. Organ. Psychol. 77, 193–216 (2004)
14. James, L.R., Jones, A.P.: Organizational climate: a review of theory and research. Psychol. Bull. 81, 1096–1112 (1974)
15. Ghosy, A.: The role of school organizational climate in occupational stress among secondary school teachers in Tehran. Int. J. Occup. Med. Environ. Health 21(4), 319–329 (2008)
16. Russel, J.: Work motivation of secondary school teachers in relation to organizational climate. Int. J. Educ. Psychol. Res. (IJEPR) 3(1), 62–67 (2014)
17. Utami, P.: The impact of working climate and motivation towards job satisfaction that implies the employee performance in PT Indonesia power generation business unit of Suralaya Banten. Int. J. Sci. Res. Publ. 6(7), 26–31 (2016)
18. Samuel, E.: Perceived organizational climate and job motivation as predictors of teachers' attitude to work. Niger. J. Appl. Behav. Sci. 1, 1–10 (2013)
19. Viseu, J., de Jesus, S.N., Rus, C., Canavarro, J.M., Pereira, J.: Relationship between teacher motivation and organizational variables: a literature review. Paidéia 26(63), 111–120 (2016)
20. Salari, F.: Investigation the relationship between organizational climate and job burnout of personnel in university of Bandar Abbas. Ac. J. Psy. Stud. 2(2), 39–46 (2013)
21. Branch, A.: The relationship between school organizational climate and physical education teachers' burnout (Case study: Ramian-Iran). Europ. J. Exper. Biol. 4(1), 600–602 (2014)
22. Karimi, M.: The relationship between organizational climate, job stress, job burnout of the physical education teachers: a case of the schools at Islamshahr, Iran. Res. J. Rec. Sci. 4(5), 114–119 (2015)
23. Hanebuth, D., Aydin, D., Scherf, T.: Burnout and related conditions in managers: a five-year longitudinal study. Innsbruck J. Psychol. des Alltagshandelns/Psychol. Everyday Act. 5(2), 1–36 (2012)
24. Kristensen, T.S., Borritz, M., Villadsen, E., Christensen, K.B.: The Copenhagen burnout inventory: a new tool for the assessment of burnout. Work Stress 19(3), 192–207 (2005)
25. Wurm, W., et al.: Depression-burnout overlap in physicians. PLoS ONE 11(3), e0149913 (2016)
26. Friedman, I.: Multipathways to burnout: cognitive and emotional scenarios in teacher burnout. Anxiety Stress Coping 9, 245–259 (1996)
27. Friedman, I.: Burnout in school principals: role related antecedents. Soc. Psychol. Educ. 5, 229–251 (2002)
28. Kremer-Hayon, L., Faraj, H., Wubbels, T.: Burn-out among Israeli Arab school principals as a function of professional identity and interpersonal relationships with teachers. Int. J. Leadersh. Educ. 5(2), 149–162 (2002)
29. Ionica, A., Nassar, Y., Mangu, S.: Organizational climate aspects and principal's burnout in Southern Israel schools. In: MATEC Web of Conferences, vol. 290, p. 07009 (2019). https://doi.org/10.1051/matecconf/201929007009
30. Rocha, Á., Adeli, H., Reis, L.P., Costanzo, S., Orovic, I., Moreira, F. (eds.): WorldCIST 2020. AISC, vol. 1159. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-45688-7
31. Nassar, Y., Ionica, A., Leba, M.: Burnout Status Identification and Alarming System. Israel Patent Office, Israel (2019)
32. Maslach, C., Jackson, S.E.: Maslach Burnout Inventory Manual. Consulting Psychologists Press, Palo Alto, CA (1986)
33. De Santos, A., Sánchez Ávila, C., Casanova, J., Bailador, G.: Real-time stress detection by means of physiological signals. In: Rec. Appl. Biometr., pp. 23–44. IntechOpen (2011)
34. Riurean, S., Leba, M., Ionica, A., Nassar, Y.: Technical solution for burnout, the modern age health issue. In: IEEE 20th Mediterranean Electrotechnical Conference (MELECON), pp. 350–353. IEEE, Palermo (2020)
A Profile of Practices for Reporting Systematic Reviews: A Conference Case Gabriel Alberto García-Mireles(B) Depto. de Matemáticas, Universidad de Sonora, Blvd. Encinas Y Rosales S/N, 83000 Hermosillo, Sonora, México [email protected]
Abstract. Several criticisms about the quality of reporting of systematic reviews (SRs) have been published, and new guidelines propose using standardized instruments to report them. To identify practices for reporting SRs, I reviewed 32 SRs published in the International Conference on Software Process Improvement (CIMPS). Well-reported practices are related to the execution of the automatic database search process and the identification of selection criteria. However, issues arise related to the completeness of the search process and the procedures for dealing with inconsistencies during selection, extraction, and classification of data. Besides, validity threats are addressed by only a third of the SRs. In conclusion, the identification of reporting practices can help SR authors identify both strengths and improvement opportunities in conducting and reporting SRs. Keywords: Systematic literature review · Systematic mapping study · Practices for reporting systematic reviews
1 Introduction
In the Software Engineering (SE) field, systematic reviews (SRs) are a well-accepted type of literature review that supports both scientific research goals and the evaluation of specific technologies to support decisions made by practitioners [1, 2]. SRs support the acquisition of new knowledge, the assessment of the effectiveness of specific technologies, and the identification of research gaps, among other goals. In contrast to ad hoc literature reviews, SRs follow a rigorous method to identify, assess, and aggregate empirical evidence that answers a research question [3]. The most common types of SRs are Systematic Literature Reviews (SLRs) and Systematic Mapping Studies (SMSs) [4]. The former ask narrow questions about the effectiveness of an intervention to synthesize evidence, while the latter focus on a broad question to map research trends of a given topic [3]. Both SLRs and SMSs are carried out following guidelines (e.g., [3, 5]) whose purpose is to produce a reliable SR, considering both the method used and the findings reported. Although the quality of SRs is improving [4, 6], several issues have arisen as regards conducting and reporting SRs. Budgen and Brereton [4] note the unreliable use of the SLR term when the content of a review corresponds to an SMS. Other researchers have
reported issues about the synthesis of qualitative evidence [7] and the quality assessment of primary papers [8]. Regarding validity threats, there is no accepted general taxonomy to identify and mitigate threats when conducting SRs [9]. Considering the reproducibility of searching procedures in SRs, Li [10] declares: "urge the community to define reproducibility-oriented regulations for reporting and reviewing EBSE [evidence-based software engineering] studies." Similarly, Budgen et al. [11] state: "our aim was to persuade authors and reviewers of the urgent need to improve the quality of published review". It is suggested that reviewers should follow the SR methods [12] and report any deviation from the protocol that could impact the credibility of findings [13]. Indeed, the rigor with which an empirical study is conducted, i.e., the extent to which the SR guidelines are followed, is related to the credibility of its findings [14]. An important activity in an SR is identifying "potential methodological flaws that can bias the outcome of primary studies" and the risk of bias of the synthesis provided in the SR [13]. Thus, the way an SR is reported provides information about its rigor. In a previous work, we reported 34 SRs that were found in the CIMPS conference [15] and applied the DARE (Database of Abstracts of Reviews of Effects) criteria to assess their quality [16]. These SRs have a mean quality score of 2.79, which is similar to the mean value of SRs published in other conferences [6]. The DARE criteria are composed of five items that assess selection criteria, completeness of the searching process, quality assessment of primary papers, synthesis, and the description of each one of the primary studies. However, the DARE criteria provide a high-level view of the quality of an SLR. SRs contain details that are not described in the DARE criteria but are related to SR guidelines. Thus, this work aims at investigating the reporting practices of SRs and comparing them with SR guidelines. The purpose is to improve the quality with which SRs are conducted and reported in the context of the CIMPS conference. The main contribution is a reporting profile of SRs in CIMPS that can be used as a baseline to address improvement initiatives. Section 2 provides related work and briefly summarizes the main characteristics of SR guidelines. Section 3 describes the methodology for synthesis applied in the review. Besides, Sect. 4 shows the main results of analyzing the 32 SRs. Section 5 discusses results, while Sect. 6 addresses limitations. Finally, Sect. 7 presents conclusions and future work.
2 Related Work
Studying the effectiveness of SR guidelines is an important research topic for producing reliable findings. Kitchenham et al. [17] evaluated 20 SLRs and reported that the quality of 17 of them is greater than 2 (out of 4) using the DARE criteria. Kitchenham and Brereton [12] assessed the guidelines for conducting SLRs and provided new advice for conducting SLRs. Besides, a book describes the SLR methodology [18]. Considering SMSs, Petersen et al. [5] assessed 52 SMSs to identify the practices used by researchers. As a result, they proposed an update to the SMS guidelines and a checklist to assess the report of an SMS. Besides, Khan et al. [6] assessed the quality of 210 SMSs, and the main issues identified were the lack of a quality assessment of primary papers and the lack of details about synthesis methods [6]. Other studies focus on the quality of specific activities of an SR. Analyzing the research questions of SRs, da Silva et al. [19] suggest that the inconsistent use between SLR and
SMS can be identified by analyzing research questions. Considering the synthesis methods used in SLRs, Cruzes and Dyba [7] found that almost 50% of the SRs analyzed (49 studies) lack a reference to a synthesis method, and few papers provide details about its application. Regarding validity threats, Ampatzoglou et al. [9] found that validity threats are discussed using different frameworks, which makes it difficult to compare results. Managing validity threats is a critical aspect that is related to the credibility of the SR results [9]. SRs are considered empirical studies; as such, they are "subject to both systematic and random errors" [7]. The report of an empirical study is "the main source of information for judging the quality of the study" [14] (p. 69). Therefore, general SR guidelines include a section for reporting both the method and the findings of an SR [3, 5]. However, reporting guidelines are not very detailed and do not promote a standardized presentation of results [19]. Indeed, Budgen et al. [20] provide 12 lessons learned for improving the reporting of SLRs. Recently, a new guideline for reporting SRs has been proposed that is based on a standardized reporting instrument [13]. A summary of the activities recommended by SR guidelines in the SE field is presented in Table 1. They were based on the most frequently reported guidelines for conducting SRs [3, 21], an update of the SMS guidelines [5], and lessons learned for reporting SLRs [20]. The Kitchenham and Charters guidelines [3] address the nine activities of Table 1. The Petersen et al. guidelines [5] used a review protocol to conduct the mapping study, but it is not included in the checklist; thus, the 'Plan the SR' activity is coded with 'I'.
3 Methodology
In a previous work, I conducted a tertiary study about the SRs published in the CIMPS conference [15]. They were evaluated using the DARE criteria. Of the 34 SRs (Appendix A), one is a tertiary study (TS) (S08) and another is a multi-vocal literature review (MLR) (S34). These two studies were excluded because they are not comparable with similar SRs. Thus, this study considers 32 SRs, SLRs and SMSs mainly addressing topics in the SE field. The same IDs as in the previous work are used, and the references to the original papers can be found in Appendix A and the Reference Section. To identify the practices reported in SRs, the thematic synthesis method was applied [22]. The goal is to identify recurring themes for the reporting practices of the SRs based on the activities depicted in Table 1. To code data, a deductive approach was applied, in which a list of codes of potential practices used in each of the SR activities was built. The focus was on manifest content. Data was extracted and analyzed by one researcher in two steps. The first step extracted verbatim text segments into a word document, and codes were assigned. In the second step, the code assigned to each text segment was verified and used as input in a spreadsheet tool to analyze the frequency of the reported themes. For analyzing the validity threats reported in SRs, the classification provided by [9] was used.
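As a rough illustration of the second analysis step (tallying how often each code appears across the reviewed SRs), the following sketch shows the kind of frequency count the spreadsheet supported; the SR identifiers and code names below are hypothetical placeholders, not the actual coding.

```python
from collections import Counter

# Hypothetical coded extraction: SR identifier -> set of practice codes assigned
# during the first (verbatim extraction and coding) step.
coded_srs = {
    "S02": {"protocol_built", "search_date_reported", "snowballing"},
    "S07": {"search_date_reported", "quality_assessment_as_filter"},
    "S14": {"protocol_built", "search_date_reported", "snowballing"},
}

theme_frequency = Counter(code for codes in coded_srs.values() for code in codes)
for code, count in theme_frequency.most_common():
    print(f"{code}: reported in {count} SRs")
```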
Table 1. Summary of activities to conduct a SR (Y: Yes, I: Implicit in guidelines).

Activity | Description | References [3] [21] [5] [20]
Describe the need for the review | This activity describes the problem or issue that requires a SR. It states both the objective and research question(s). It identifies similar reviews | Y Y Y
Plan the SR | This activity develops a review protocol that describe the methods used for conducting a SR | Y
Establish searching strategy | Reviewers should determine the exhaustiveness of the search process. This activity includes several searching procedures | Y Y Y Y
Determine selection process | The SR reports both inclusion and exclusion criteria and validation procedures | Y Y Y Y
Assess quality of primary studies | The purpose of quality assessment of primary studies is to reduce systematic bias that may be introduced by low quality empirical studies | Y I Y
Extract data from primary studies | The data extraction form should be piloted and adjusted when necessary. Procedures for validation should be included | Y Y
Synthesize or classify data | Identify methods used for synthesis and classification of evidence | Y Y Y Y
Report results of the review | Report the list of primary papers and tables where readers can compare them. Furthermore, report the strength of evidence | Y Y Y Y
Discuss validity threats | Discuss the validity threats that could impact on results provided by primary papers as well as the SR | Y Y
4 Results Considering the activities in Table 1, the following paragraphs report the main findings.
4.1 Describe the Need for Review
The 32 SRs provide sufficient information to identify the domain and the main problems that could require an SR to answer a research question or fulfill an objective. Considering the SR name reported in the study, 26 SRs are identified by their authors as SLRs and the remaining 6 as SMSs (S13, S14, S22, S23, S25, S29). Regarding the type of research question, 91% of the SRs posed exploratory research questions to determine the existence of a technology or to characterize it in terms of its salient features. Seven SRs (22%) address research questions about trends or the evolution of the phenomenon under study (S05, S10, S11, S23, S14, S22, S23). On the other hand, six SRs (S06, S07, S14, S19, S20, S21) posed causality questions about the influence of the technology under study, as well as its benefits and drawbacks. About reviewing previous SRs related to the research questions, 11 SRs reported them (S02, S04, S05, S07, S13, S14, S21, S22, S23, S24, S32). There is no information about assessing their quality, and seven SRs used the main findings of these reviews to support the value of conducting a new SR (S04, S05, S07, S13, S21, S22, S24).
4.2 Plan the SR
The planning stage was considered in 23 SRs. Most of them reference the SLR guidelines that include a planning activity [3]. However, few SRs inform that a protocol was built (S02, S10, S12, S14, S19, S21, S24, S32), and few SRs mentioned that the protocol was assessed (S19) or validated (S32). There is no information about how the protocol was modified during the execution of the SR study.
4.3 Establish a Searching Strategy
An automatic database search strategy is reported in all 32 SRs. Nine SRs report the date when the database search was conducted (S02, S07, S13, S14, S19, S21, S27, S32, S33). The number of databases used in the SRs is in the range of 3 to 8, with a median of 4 databases. The most frequently reported databases are IEEE Xplore (28), ACM Digital Library (27), Scopus (21), Science Direct (19) and Springer (13). Besides, two SRs (S19, S21) mentioned a manual search procedure, and ten SRs also used a snowballing approach to identify relevant primary papers (S02, S04, S09, S11, S12, S14, S15, S18, S21, S27). However, few papers show details about the snowballing procedures. Some studies reported the number of primary studies included by means of snowballing (S12, S14, S18, S21, S27), and only one reported the database used for carrying out the forward snowballing (S12). For composing the search string, SRs report one (14 SRs) or two (11 SRs) actions. 13 SRs informed that keywords were obtained from the research questions. Similarly, ten SRs noted that the PICO/PICOC (Population, Intervention, Comparison, Outcome, Context) acronyms were used to define the search string. Besides, ten SRs reported the usage of synonyms (or alternative terms) for defining the search string, while eight SRs mentioned that the search string was adapted to each database. Procedures for assessing the completeness of the search strategy were not found.
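As an illustration of the search-string practices summarized above (keywords derived from the research questions, synonyms grouped by PICOC facet, and adaptation per database), the sketch below composes a generic boolean string; the term groups are invented examples, not those used by the reviewed SRs.

```python
# Hypothetical PICOC-style term groups: each inner list contains a keyword
# derived from the research question together with its synonyms/alternative terms.
term_groups = [
    ["software process improvement", "SPI"],            # population
    ["systematic literature review", "mapping study"],  # intervention
]

def or_block(terms):
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

search_string = " AND ".join(or_block(group) for group in term_groups)
print(search_string)
# ("software process improvement" OR "SPI") AND ("systematic literature review" OR "mapping study")
```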
4.4 Determine the Selection Process
Most SRs (31 out of 32) provide at least one inclusion criterion. Similarly, 30 out of 32 SRs provide exclusion criteria. In general, it is easy to identify each criterion, except in four SRs that use running text to describe them. Considering the description of the procedure for selecting primary studies, 14 out of 32 SRs include tables that can show data from the number of records retrieved from each database to the final number of primary papers selected. Similarly, 12 SRs depict a figure that shows the stages of the selection procedure. Only two SRs (S04, S25) refer to a list of candidate studies that were considered as potential primary studies. About the reliability of the selection process, seven SRs briefly mentioned some actions carried out. S04 notes that the adviser validated the analysis, while S25 reports a kappa indicator for measuring the agreement between researchers during the selection process. Other approaches were based on the review of peers, team members, or experts (S07, S14, S19, S21, S32).
4.5 Assess the Quality of Primary Studies
Quality assessment instruments are reported in 18 SRs. The instruments used to assess the quality of primary papers range from 2 to 11 items. Instruments composed of three items are the most commonly reported (S05, S06, S10, S16, S31, S33). From the 32 SRs, only 11 provide a reference for the quality instrument used. The most frequently reported reference is [23], used in S14, S20, S21, and S32. Besides, some papers cite quality assessment instruments used in other SRs (e.g., S10, S19). Considering the purpose of assessing the quality of primary papers, two SRs used it to measure the reliability of primary studies to improve the credibility of the SRs' findings (S04, S19). Besides, ten SRs used quality assessment as another filter to select primary studies (S02, S06, S07, S16, S18, S24, S27, S31, S32, S33). Six SRs reported the quality of each primary study (S02, S04, S14, S21, S24, S27). However, only four SRs use information about the quality of primary studies to explore the value of the findings of the SR (S04, S24, S25, S27).
4.6 Extract and Synthesize Data
A data extraction form was mentioned in 24 SRs. Most of them note the fields considered in the form. There is no information about piloting the data extraction forms. Regarding validation procedures for extracting data, S07 and S11 briefly reported them. Regarding qualitative synthesis methods, only six papers mention the method used. Narrative synthesis is reported by S04, S14, S19, S21, and S32. Content analysis is mentioned by S07. However, data illustrating the way the methods were applied is lacking, except for S14, S21, and S32, which mentioned the 'grouping and clustering' step of the narrative synthesis method.
4.7 Results and Discussion
The list of primary studies is included in the SR paper as a table (12 out of 32), in an appendix (8 out of 32) or in a reference to an external file (4 out of 32). Two references
to external files fail to provide documents. Seven SRs do not include the list of primary studies. Regarding the extent to which tables allow comparison between primary studies, 23 SRs provide at least one characteristic that could be used to compare the distinct primary studies. Besides, all papers answer the research questions posed and provide a summary of the main findings in the running text. Tables are the main approach to summarize results. These can be based on a general classification or on a classification derived from the topics addressed in the primary papers. However, few general taxonomies are used to classify papers. In total, 12 SRs present some form of general classification scheme, based on standards or citing a relevant reference. ISO/IEC 25010 is used for classifying primary papers in S02 and S07, while ISO/IEC 12207 is used in S23. CMMI Process Areas are used in S28. A classification of papers based on [25] is used by S16 and S27. On the other hand, 78% (25 SRs) used a classification of papers that emerged from the content of the set of primary studies addressed in each SR.

Table 2. Validity threats identified in SRs.

Threat category | Identification | Mitigation action | SRs
Selection of databases (DBs) | No access to DBs; cumbersome interface; uncertain about retrieving all relevant papers | – | S04, S07, S24, S27, S30, S31, S32
Researcher bias | SR executed by a participant of a team; author understanding of the topic | Protocol developed; maintain logs | S04, S14, S20, S21, S23, S24, S32
Study inclusion/exclusion bias | SR results depend on selected papers; SMS accept papers of different quality level; inaccessible papers | Validating search process by a random selection of papers; protocol developed; adviser validates search; crosschecked papers | S07, S14, S23, S25, S27, S31, S32
Construction of search string | – | Keywords from literature; synonyms, PICOC; expert recommendation; search string adapted to DBs | S09, S23, S25, S27, S31
Robustness of initial classification | Classification could overlook categories; no standardization | Researcher built their own classification | S24, S26
Data extraction bias | – | Data extraction form to extract verbatim data | S23, S32
Few graphical charts were found in the set of SRs. The most commonly used are bar graphs, followed by pie charts. Bar graphs are used to depict the frequency of studies by year (S03, S09, S11, S13, S14, S17, S31, S33), the frequency of papers per database (S05, S31), and the frequency of papers per country (S13, S14, S33), among others. Pie charts are used to depict papers per database (S29) or papers per type of venue (S23), among others. On the other hand, only one paper (S22) includes a bubble chart and another a word cloud (S30). Indeed, 13 SRs do not include figures or charts to summarize the results of the study. Concerning the strengths and weaknesses of the review, few papers discuss the quality of evidence for describing findings. At least two studies report either awareness of the potential meaning of findings by contrasting them with previous SRs (S13) or the lack of studies to conduct a quality assessment (S25). On the other hand, 14 SRs present threats to validity, using distinct terms to refer both to the name of the threat and to the actions taken to mitigate its impact (Table 2). The most common validity threats reported in SRs are selection of databases (7), researcher bias (7), and study inclusion/exclusion bias (7) (Table 2). Threats are described either as the identification of an issue or as a mitigation action. In the case of selection of databases, threats are only identified; the main problem reported is that reviewers cannot access specific database resources for conducting the search process. Considering researcher bias, the most common issue is the subjectivity of decisions when only one participant executes an SR activity. The mitigation actions carried out are to develop a protocol and to keep records of the conduct of the SR. On the other hand, threats to the selection process of primary studies report the uncertainty of selecting all relevant papers for answering the research question and of the extent to which their quality is appropriate. Mitigation actions address the inclusion of objective rules for selecting primary papers and the validation of the selection of papers. Both construction of the search string and data extraction bias provide information about mitigation actions that are suggested for inclusion in the protocol when planning the SR. In the case of robustness of initial classification bias, the issue is the validity of the adapted classification approach. Based on the aforementioned findings, a profile of the practices for reporting SRs in the CIMPS conference 2015–2021 is presented: 1. SRs correspond to SMSs because the main contribution is to identify research trends by means of a classification of primary studies. This proposition is derived from the
pattern that 91% of SRs posed exploratory questions and few pieces of research reported the method used for conducting the qualitative synthesis (19%). 2. The search strategy of SRs is based on automatic database search. This approach can impact the completeness of the search process. A third of the SRs briefly mention snowballing tasks, but without sufficient details to assess or replicate them. 3. Quality assessment of primary papers is used as a selection filter. Quality assessment of primary papers also contributes to analyzing the quality of evidence. 4. The majority of SRs use a topic-based classification that emerged from the primary papers. When the categories used in a classification are not defined, the comparison with similar SRs could be related to construct bias. 5. Procedures for validating selection, data extraction, classification, or analysis are barely reported. The SRs lack details about the extent to which the reviewer team participates in resolving inconsistencies or validating the different stages of the process. 6. The quality of the body of evidence of the SR is barely addressed. A few SRs discuss confidence in findings using an ad hoc approach. The reader needs information about the quality of the synthesis to evaluate the credibility of the results.
5 Discussion

Several researchers have reported an inconsistent use of the SLR term to describe SMS work [4, 19], as well as a tendency to publish qualitative SMSs [4]. Thus, the description of the type of SR should be reviewed. Besides, the majority (66%) of SRs lack explicit information about related published SRs that could help to compare the results of the current review. Given that the majority of research questions are focused on knowledge that describes how the world is [26], it is relevant for the SE field to report the contributions of current SRs in terms of previous secondary studies. The planning stage is described in SLR guidelines [3], which suggest the construction of a protocol. However, only 25% of SRs mention that a protocol was built, and only 6% briefly report that the protocol was validated. A protocol can mitigate researcher bias [3], and it should be developed and validated: "The protocol is a critical element of a review and researchers need to specify and carry out procedures for its validation" [27]. Regarding the search strategy, 100% of the reviews used a database search procedure and provide the keywords of the search string. Besides, four databases are appropriate and are among those suggested for the SE field [5, 12]. However, automatic search alone is insufficient for the completeness of the search process [3]. The search process could include a snowballing procedure or a manual search of a set of specific journals and conferences [5, 12]. Besides, the lack of details to reproduce the searches in the databases is an issue that could affect the credibility of SRs' findings [10]. On the other hand, the selection process is the best-reported activity in the SRs. However, only 20% of SRs briefly mention a validation of the search process; guidelines suggest applying validation and agreement procedures [3, 5]. Over half of the SRs mention that a quality assessment was conducted, and 31% used these results to support the selection of primary papers. Given that quality assessment is not required in SMSs [5] and it could be conducted for different purposes, SRs should provide the reason for assessing the quality of primary papers [8, 13]. Besides, quality assessment results in SLRs can be used "to evaluate the level of confidence that can be placed in the results of a specific synthesis" [13] or to carry out a sensitivity analysis [3]. About results and discussion, for further research activities by any research group in the context of SMSs, it is suggested that "all the references are cited and the classification information for each study is reported" [28]. Besides, using a published classification improves the comparability among findings of SMSs [5]. Finally, validity threats are discussed by 44% of the SRs under review. There are issues in identifying the specific threats and the mitigation actions [9, 13].
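As a concrete illustration of the agreement procedures recommended by the guidelines (and reported by S25 in the form of a kappa indicator), the following Java sketch computes Cohen's kappa for two reviewers' include/exclude decisions. It is illustrative only: the class name, method name, and sample data are hypothetical and are not taken from any of the reviewed SRs.

```java
import java.util.List;

public class KappaExample {
    /** Cohen's kappa for two raters making binary include/exclude decisions. */
    static double cohensKappa(List<Boolean> rater1, List<Boolean> rater2) {
        if (rater1.size() != rater2.size() || rater1.isEmpty()) {
            throw new IllegalArgumentException("Ratings must be non-empty and of equal length");
        }
        int n = rater1.size();
        int bothYes = 0, bothNo = 0, yes1 = 0, yes2 = 0;
        for (int i = 0; i < n; i++) {
            boolean a = rater1.get(i), b = rater2.get(i);
            if (a) yes1++;
            if (b) yes2++;
            if (a && b) bothYes++;
            if (!a && !b) bothNo++;
        }
        // Observed agreement and agreement expected by chance.
        double observed = (bothYes + bothNo) / (double) n;
        double expected = (yes1 * yes2 + (n - yes1) * (n - yes2)) / ((double) n * n);
        return (observed - expected) / (1.0 - expected);
    }

    public static void main(String[] args) {
        // Hypothetical decisions of two reviewers over eight candidate papers.
        List<Boolean> r1 = List.of(true, true, false, true, false, false, true, true);
        List<Boolean> r2 = List.of(true, false, false, true, false, true, true, true);
        System.out.printf("kappa = %.3f%n", cohensKappa(r1, r2));
    }
}
```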
6 Limitations

Several validity threats should be considered in this work. Selection bias was mitigated because, in a previous tertiary study about SRs in the CIMPS conference [15], a team of researchers participated in the selection of SRs. However, the completeness of the selection procedure was not validated. Data extraction bias refers to problems when reviewers identify text fragments for answering the research questions. In this research, a list of tasks for each of the activities of an SR was identified and used to code text fragments; to mitigate extraction bias, the extraction procedure was based on this list of codes. Regarding researcher bias, the fact that one researcher conducted this synthesis cannot eliminate subjectivity in extracting data and analyzing the codes to derive higher-order themes. However, the results presented in this work are consistent with other tertiary studies mentioned in Sects. 2 and 6. Publication bias can refer to situations in which primary papers are identified in a specific publication venue; in this work I analyzed only SRs published in the CIMPS conference. The purpose of this study is to understand the extent to which the practices used in our community for reporting SRs conform to general SR guidelines and to provide an initial assessment that could support the introduction of standardized reporting guidelines [13]. Although the results provided in this work are not generalizable to other conferences, the findings and the descriptive profile can help authors of SRs understand the strengths of their reporting practices.
7 Conclusions

This paper presented the identification of practices for reporting systematic reviews in the CIMPS conference. Most studies are better categorized as systematic mapping studies, and the strengths in reporting are focused on the automatic database search process and the description of selection criteria. However, issues arise regarding the completeness of the search process as well as the validation procedures. The higher-order themes can provide a descriptive profile of practices for reporting SRs. As further work, there is a need to verify the extent to which reporting practices are consistent with the way the SR is conducted and to identify the potential barriers to applying general SR guidelines, particularly for novice researchers. Besides, standardized guidelines for reporting SRs would impact current practices. Thus, it is necessary to determine the actions that could be implemented by reviewer teams to improve the quality of SRs and the way they are reported.
Appendix A

The categories of the studies are low (L) (0.5 ≤ quality score ≤ 2), medium (M) (2.5 ≤ quality score ≤ 3), and high (H) (3.5 ≤ quality score ≤ 5). Table 3 lists the SRs; the quality assessment is based on [15].

Table 3. List of SRs published in CIMPS 2015–2021

ID | Reference | Type | Quality assessment
S01 | [29] | SLR | M
S02 | [31] | SLR | H
S03 | [33] | SLR | M
S04 | [35] | SLR | H
S05 | [37] | SLR | M
S06 | [39] | SLR | M
S07 | [41] | SLR | H
S08 | [43] | TS | H
S09 | [45] | SLR | M
S10 | [47] | SLR | M
S11 | [49] | SLR | L
S12 | [51] | SLR | H
S13 | [53] | SMS | L
S14 | [55] | SMS | H
S15 | [57] | SLR | L
S16 | [59] | SLR | M
S17 | [30] | SLR | L
S18 | [32] | SLR | H
S19 | [34] | SLR | H
S20 | [36] | SLR | H
S21 | [38] | SLR | H
S22 | [40] | SMS | L
S23 | [42] | SMS | L
S24 | [44] | SLR | H
S25 | [46] | SMS | M
S26 | [48] | SLR | M
S27 | [50] | SLR | H
S28 | [52] | SLR | M
S29 | [54] | SMS | L
S30 | [56] | SLR | L
S31 | [58] | SLR | H
S32 | [60] | SLR | M
S33 | [61] | SLR | L
S34 | [62] | MLR | L
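The banding rule defined above can also be expressed as a small helper. The following Java sketch is illustrative only; the class name, method name, and the handling of scores outside the listed bands are assumptions, since Appendix A defines only the three ranges.

```java
public class QualityBands {
    /** Maps a quality score (in 0.5 increments) to the bands defined in Appendix A. */
    static String category(double score) {
        if (score >= 0.5 && score <= 2.0) return "L";   // low
        if (score >= 2.5 && score <= 3.0) return "M";   // medium
        if (score >= 3.5 && score <= 5.0) return "H";   // high
        throw new IllegalArgumentException("Score outside the bands of Appendix A: " + score);
    }

    public static void main(String[] args) {
        System.out.println(category(1.5)); // L
        System.out.println(category(3.0)); // M
        System.out.println(category(4.5)); // H
    }
}
```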
References 1. Dyba, T., Kitchenham, B.A., Jorgensen, M.: Evidence-based software engineering for practitioners. IEEE Softw. 22, 58–65 (2005) 2. Kitchenham, B.A., Dyba, T., Jorgensen, M.: Evidence-based software engineering. In: Proceedings. 26th International Conference on Software Engineering, pp. 273–281 (2004) 3. Kitchenham, B., Charters, S.: Guidelines for performing systematic literature reviews in software engineering. Tech. report, Ver. 2.3 EBSE Tech. Report. EBSE (2007) 4. Budgen, D., Brereton, P.: Evolution of secondary studies in software engineering. Inf. Softw. Technol. 145, 106840 (2022) 5. Petersen, K., Vakkalanka, S., Kuzniarz, L.: Guidelines for conducting systematic mapping studies in software engineering: an update. Inf. Softw. Technol. 64, 1–18 (2015). https://doi. org/10.1016/j.infsof.2015.03.007
6. Khan, M.U., Sherin, S., Iqbal, M.Z., Zahid, R.: Landscaping systematic mapping studies in software engineering: a tertiary study. J. Syst. Softw. 149, 396–436 (2019). https://doi.org/ 10.1016/j.jss.2018.12.018 7. Cruzes, D.S., Dyba, T.: Research synthesis in software engineering: a tertiary study. Inf. Softw. Technol. 53, 440–455 (2011). https://doi.org/10.1016/j.infsof.2011.01.004 8. Zhou, Y., Zhang, H., Huang, X., Yang, S., Babar, M.A., Tang, H.: Quality assessment of systematic reviews in software engineering: a tertiary study. In: Proceedings of the 19th International Conference on Evaluation and Assessment in Software Engineering, pp. 1–14 (2015). https://doi.org/10.1145/2745802.2745815 9. Ampatzoglou, A., Bibi, S., Avgeriou, P., Verbeek, M., Chatzigeorgiou, A.: Identifying, categorizing and mitigating threats to validity in software engineering secondary studies. Inf. Softw. Technol. 106, 201–230 (2019) 10. Li, Z.: Stop building castles on a swamp! the crisis of reproducing automatic search in evidence-based software engineering. In: 2021 IEEE/ACM 43rd International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER), pp. 16–20 (2021) 11. Budgen, D., Brereton, P., Williams, N., Drummond, S.: What support do systematic reviews provide for evidence-informed teaching about software engineering practice? e-informatica Softw. Eng. J. 14, 7–60 (2020) 12. Kitchenham, B., Brereton, P.: A systematic review of systematic review process research in software engineering. Inf. Softw. Technol. 55, 2049–2075 (2013) 13. Kitchenham, B.A., Madeyski, L., Budgen, D.: SEGRESS: software engineering guidelines for reporting secondary studies. IEEE Trans. Softw. Eng. XX. 1 (2022). https://doi.org/10. 1109/tse.2022.3174092 14. Wohlin, C., Runeson, P., Höst, M., Ohlsson, M.C., Regnell, B., Wesslén, A.: Experimentation in Software Engineering. Springer Science \& Business Media (2012) 15. García-Mireles, G.A., Mejía, J., Arroyo-Morales, L., Villa-Salas, L.: Systematic Reviews in the International Conference on Software Process Improvement: A Tertiary Study. Manuscript Submitted for Publication (2022) 16. CDR: Welcome to the CRD Database. About DARE, CDR, University of York (2020). https:// www.crd.york.ac.uk/CRDWeb/AboutPage.asp. Accessed 05 Mar 2020 17. Kitchenham, B., Brereton, O.P., Budgen, D., Turner, M., Bailey, J., Linkman, S.: Systematic literature reviews in software engineering–a systematic literature review. Inf. Softw. Technol. 51, 7–15 (2009) 18. Kitchenham, B.A., Budgen, D., Brereton, P.: Evidence-Based Software Engineering and Systematic Reviews, vol. 4. CRC Press (2015) 19. Da Silva, F.Q.B., Santos, A.L.M., Soares, S.C.B., França, A.C.C., Monteiro, C.V.F.: A critical appraisal of systematic reviews in software engineering from the perspective of the research questions asked in the reviews. In: Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 1–4 (2010) 20. Budgen, D., Brereton, P., Drummond, S., Williams, N.: Reporting systematic reviews: Some lessons from a tertiary study. Inf. Softw. Technol. 95, 62–74 (2018) 21. Petersen, K., Feldt, R., Mujtaba, S., Mattsson, M.: Systematic mapping studies in software engineering. In: 12th International Conference on Evaluation and Assessment in Software Engineering (EASE) 12, pp. 1–10 (2008). https://doi.org/10.14236/ewic/ease2008.8 22. Cruzes, D.S., Dybå, T.: Recommended steps for thematic synthesis in software engineering. 
In: 2011 International Symposium on Empirical Software Engineering and Measurement, pp. 275–284 (2011). https://doi.org/10.1109/esem.2011.36 23. Dybå, T., Dingsøyr, T.: Empirical studies of agile software development: a systematic review. Inf. Softw. Technol. 50, 833–859 (2008)
24. Garousi, V., Felderer, M., Mäntylä, M.: V: Guidelines for including grey literature and conducting multivocal literature reviews in software engineering. Inf. Softw. Technol. 106, 101–121 (2019) 25. Wieringa, R., Maiden, N., Mead, N., Rolland, C.: Requirements engineering paper classification and evaluation criteria: a proposal and a discussion. Requir. Eng. 11, 102–107 (2006) 26. Easterbrook, S., Singer, J., Storey, M.-A., Damian, D.: Selecting empirical methods for software engineering research. In: Shull, F., Singer, J., Sjøberg, D.I.K. (eds.) Guide to Advanced Empirical Software Engineering, pp. 285–311. IEEE (2008). https://doi.org/10.1007/978-184800-044-5_11 27. Brereton, P., Kitchenham, B.A., Budgen, D., Turner, M., Khalil, M.: Lessons from applying the systematic literature review process within the software engineering domain. J. Syst. Softw. 80, 571–583 (2007) 28. Kitchenham, B.A., Budgen, D., Pearl Brereton, O.: Using mapping studies as the basis for further research - a participant-observer case study. Inf. Softw. Technol. 53, 638–651 (2011). https://doi.org/10.1016/j.infsof.2010.12.011 29. Galeano-Ospino, S., Machuca-Villegas, L., Gasca-Hurtado, G.P.: Knowledge transfer in software development teams using gamification: a systematic literature review. In: Mejia, J., Muñoz, M., Rocha, Á., Quiñonez, Y. (eds.) CIMPS 2020. AISC, vol. 1297, pp. 115–130. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-63329-5_8 30. Teixeira, S., Martins, J., Branco, F., Gonçalves, R., Au-Yong-Oliveira, M., Moreira, F.: A theoretical analysis of digital marketing adoption by startups. In: Mejia, J., Muñoz, M., Rocha, Á., Quiñonez, Y., Calvo-Manzano, J. (eds.) Trends and Applications in Software Engineering. CIMPS 2017. Advances in Intelligent Systems and Computing, vol. 688, pp. 94–105. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-69341-5_9 31. Monzón, I., Angeleri, P., Dávila, A.: Design techniques for usability in m-commerce context: a systematic literature review. In: Mejia, J., Muñoz, M., Rocha, Á., Quiñonez, Y. (eds.) CIMPS 2020. AISC, vol. 1297, pp. 305–322. Springer, Cham (2021). https://doi.org/10.1007/978-3030-63329-5_21 32. Rea-Guaman, A.M., San Feliu, T., Calvo-Manzano, J.A., Sanchez-Garcia, I.D.: Systematic review: cybersecurity risk taxonomy. In: Mejia J., Muñoz M., Rocha Á., Quiñonez Y., CalvoManzano J. (eds) Trends and Applications in Software Engineering. CIMPS 2017. Advances in Intelligent Systems and Computing, vol. 688, pp. 137–146. Springer, Cham. (2018). https:// doi.org/10.1007/978-3-319-69341-5_13 33. Solís-Galván, J.A., Vázquez-Reyes, S., Martínez-Fierro, M., Velasco-Elizondo, P., GarzaVeloz, I., Caldera-Villalobos, C.: Towards development of a mobile application to evaluate mental health: systematic literature review. In: Mejia, J., Muñoz, M., Rocha, Á., Quiñonez, Y. (eds.) CIMPS 2020. AISC, vol. 1297, pp. 232–257. Springer, Cham (2021). https://doi. org/10.1007/978-3-030-63329-5_16 34. Linares, J., Melendez, K., Flores, L., Dávila, A.: Project portfolio management in small context in software industry: a systematic literature review. In: Mejia, J., Muñoz, M., Rocha, Á., Quiñonez, Y., Calvo-Manzano, J. (eds.) Trends and Applications in Software Engineering. CIMPS 2017. Advances in Intelligent Systems and Computing, vol. 688, pp. 45–60. Springer, Cham. https://doi.org/10.1007/978-3-319-69341-5_5 35. Ordoñez-Pacheco, R., Cortes-Verdin, K., Ocharán-Hernández, J.O.: Best practices for software development: a systematic literature review. 
In: Mejia, J., Muñoz, M., Rocha, Á., Quiñonez, Y. (eds.) CIMPS 2020. AISC, vol. 1297, pp. 38–55. Springer, Cham (2021). https:// doi.org/10.1007/978-3-030-63329-5_3
36. Iriarte, C., Bayona Orè, S.: Soft skills for IT project success: a systematic literature review. In: Mejia, J., Muñoz, M., Rocha, Á., Quiñonez, Y., Calvo-Manzano, J. (eds.) Trends and Applications in Software Engineering. CIMPS 2017. Advances in Intelligent Systems and Computing, vol. 688, pp. 147–158 Springer, Cham. https://doi.org/10.1007/978-3-319-69341-5_14 37. Hernández-Velázquez, Y., Mezura-Godoy, C., Rosales-Morales, V.Y.: M-learning and student-centered design: a systematic review of the literature. In: Mejia, J., Muñoz, M., Rocha, Á., Quiñonez, Y. (eds.) CIMPS 2020. AISC, vol. 1297, pp. 349–363. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-63329-5_24 38. Palomino, M., Dávila A., Melendez, K., Pessoa, M.: Agile practices adoption in CMMI organizations: a systematic literature review. In: Mejia, J., Muñoz, M., Rocha, Á., San Feliu, T., Peña, A. (eds.) Trends and Applications in Software Engineering. CIMPS 2016. Advances in Intelligent Systems and Computing, vol. 537, pp. 57—67. Springer, Cham (2017). https:// doi.org/10.1007/978-3-319-48523-2_6 39. Milán, A., Mejía, J., Muñoz, M., Carballo, C.: Success factors and benefits of using business intelligence for corporate performance management In: 2020 9th International Conference on Software Process Improvement (CIMPS), pp. 19–27 (2020). https://doi.org/10.1109/CIM PS52057.2020.9390108 40. Berntsen, K.R., Olsen, M.R., Limbu, N., Tran, A.T., Colomo-Palacios, R.: Sustainability in software engineering - a systematic mapping. In: Mejia, J., Muñoz, M., Rocha, Á., San Feliu, T., Peña, A. (eds.) Trends and Applications in Software Engineering. CIMPS 2016. Advances in Intelligent Systems and Computing, vol. 537, pp. 23–32. Springer, Cham (2017)https:// doi.org/10.1007/978-3-319-48523-2_3 41. Céspedes, D., Angeleri, P., Melendez, K., Dávila, A.: Software product quality in devops contexts: a systematic literature review. In: Mejia, J., Muñoz, M., Rocha, Á., A. CalvoManzano, J. (eds.) CIMPS 2019. AISC, vol. 1071, pp. 51–64. Springer, Cham (2020). https:// doi.org/10.1007/978-3-030-33547-2_5 42. García-Mireles, G.A.: Environmental sustainability in software process improvement: a systematic mapping study. In: Mejia, J., Muñoz, M., Rocha, Á., San Feliu, T., Peña, A. (eds.) Trends and Applications in Software Engineering. CIMPS 2016. Advances in Intelligent Systems and Computing, vol. 537, pp. 69–78. Springer, Cham (2017). https://doi.org/10.1007/ 978-3-319-48523-2_7 43. García-Mireles, G.A., Morales-Trujillo, M.E.: Gamification in software engineering: a tertiary study. In: Mejia, J., Muñoz, M., Rocha, Á., A. Calvo-Manzano, J. (eds.) CIMPS 2019. AISC, vol. 1071, pp. 116–128. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-335472_10 44. Saavedra, V., Dávila, A., Melendez, K., Pessoa, M.: Organizational maturity models architectures: a systematic literature review. In: Mejia, J., Muñoz, M., Rocha, Á., San Feliu, T., Peña, A. (eds.) Trends and Applications in Software Engineering. CIMPS 2016. Advances in Intelligent Systems and Computing, vol. 537, pp. 33–46. Springer, Cham (2017).https://doi. org/10.1007/978-3-319-48523-2_4 45. García-Ramírez, M.O., De-la-Torre, M., Monsalve, C.: Methodologies for the design of application frameworks: systematic review. In: 2019 8th International Conference on Software Process Improvement (CIMPS), pp. 1–10 (2019). https://doi.org/10.1109/CIMPS49236.2019. 9082427 46. 
Espinel, P., Espinosa, E., Urbieta, M.: Software configuration management for software product line paradigm: a systematic mapping study. In: 2016 International Conference on Software Process Improvement (CIMPS), pp. 1–8 (2016). https://doi.org/10.1109/CIMPS.2016. 7802801 47. Chancusig, J.C., Bayona-Orè, S.: Adoption model of information and communication technologies in education. In: 2019 8th International Conference on Software Process Improvement (CIMPS), pp. 1–6 (2019). https://doi.org/10.1109/CIMPS49236.2019.9082425
48. Martínez, J., Mejía, J., Muñoz, M.: Security analysis of the Internet of Things: a systematic literature review. In: 2016 International Conference on Software Process Improvement (CIMPS), pp. 1–6 (2016). https://doi.org/10.1109/CIMPS.2016.7802809 49. Ponce-Corona, E., Sánchez, M.G., Fajardo-Delgado, D., Castro, W., De-la-Torre, M., AvilaGeorge, H.: Detection of vegetation using unmanned aerial vehicles images: a systematic review. In: 2019 8th International Conference on Software Process Improvement (CIMPS), pp. 1–7 (2019). https://doi.org/10.1109/CIMPS49236.2019.9082434 50. Cohn-Muroy, D., Pow-Sang, J.A.: Can user stories and use cases be used in combination in a same project? a systematic review. In: Mejia, J., Muñoz, M., Rocha, Á., Calvo-Manzano, J. (eds.) Trends and Applications in Software Engineering. AISC, vol. 405, pp. 15–24. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-26285-7_2 51. Peña, M.R., Bayona-Oré, S.: Process mining and automatic process discovery. In: 2018 7th International Conference on Software Process Improvement (CIMPS), pp. 41–46 (2018). https://doi.org/10.1109/CIMPS.2018.8625621 52. Miramontes, J., Muñoz, M., Calvo-Manzano, J.A., Corona, B.: Establishing the state of the art of frameworks, methods and methodologies focused on lightening software process: a systematic literature review. In: Mejia, J., Muñoz, M., Rocha, Á., Calvo-Manzano, J. (eds.) Trends and Applications in Software Engineering. AISC, vol. 405, pp. 71–85. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-26285-7_7 53. Cáceres, S.V., Pow-Sang, J.A.: A systematic mapping review of usability evaluation methods for educational applications on mobile devices. In: 2018 7th International Conference on Software Process Improvement (CIMPS), pp. 59–68 (2018). https://doi.org/10.1109/CIMPS. 2018.8625629 54. Medina, O.C., Cota, M.P., Damiano, L.E., Mea, K.D., Marciszack, M.M.: Systematic mapping of literature on applicable patterns in conceptual modelling of information systems. In: Mejia, J., Muñoz, M., Rocha, Á., Avila-George, H., Martínez-Aguilar, G.M. (eds.) CIMPS 2021. AISC, vol. 1416, pp. 41–54. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-899 09-7_4 55. Palomino, M., Dávila, A., Melendez, K.: Methodologies, methods, techniques and tools used on SLR elaboration: a mapping study. In: Mejia, J., Muñoz, M., Rocha, Á., Peña, A., PérezCisneros, M. (eds.) CIMPS 2018. AISC, vol. 865, pp. 14–30. Springer, Cham (2019). https:// doi.org/10.1007/978-3-030-01171-0_2 56. Damasceno, E., Azevedo, A., Perez-Cota, M.: The state-of-the-art of business intelligence and data mining in the context of grid and utility computing: a prisma systematic review. In: Mejia, J., Muñoz, M., Rocha, Á., Avila-George, H., Martínez-Aguilar, G.M. (eds.) CIMPS 2021. AISC, vol. 1416, pp. 83–96. Springer, Cham (2022). https://doi.org/10.1007/978-3030-89909-7_7 57. Gutiérrez, L., Keith, B.: A systematic literature review on word embeddings. In: Mejia, J., Muñoz, M., Rocha, Á., Peña, A., Pérez-Cisneros, M. (eds.) CIMPS 2018. AISC, vol. 865, pp. 132–141. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-01171-0_12 58. Méndez-Becerra, L.R., Rosales-Morales, V.Y., Alor-Hernández, G., Mezura-Godoy, C.: User research techniques for user interface design of learning management systems: a decade review. In: Mejia, J., Muñoz, M., Rocha, Á., Avila-George, H., Martínez-Aguilar, G.M. (eds.) CIMPS 2021. AISC, vol. 1416, pp. 218–232. Springer, Cham (2022). https://doi.org/10.1007/ 978-3-030-89909-7_17 59. 
Machuca-Villegas, L., Gasca-Hurtado, G.P.: Gamification for improving software project management processes: a systematic literature review. In: Mejia, J., Muñoz, M., Rocha, Á., Peña, A., Pérez-Cisneros, M. (eds.) CIMPS 2018. AISC, vol. 865, pp. 41–54. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-01171-0_4
60. Alfaro, F., Silva, C., Dávila, A.: CMMI adoption and retention factors: a systematic literature review. In: Mejia, J., Muñoz, M., Rocha, Á., Avila-George, H., Martínez-Aguilar, G.M. (eds.) CIMPS 2021. AISC, vol. 1416, pp. 15–28. Springer, Cham (2022). https://doi.org/10.1007/ 978-3-030-89909-7_2 61. Hernández, L., Muñoz, M., Mejia, J., Peña, A.: Gamification in software engineering teamworks: a systematic literature review. In: 2016 International Conference on Software Process Improvement (CIMPS), pp. 1–8 (2016). https://doi.org/10.1109/CIMPS.2016.7802799 62. García, C., Timbi, C.: From a N layers distributed system to oriented service architecture: systematic review. In: 2021 10th International Conference on Software Process Improvement (CIMPS), pp. 10–23 (2021). https://doi.org/10.1109/CIMPS54606.2021.9652718
A Look Through the SN Compiler: Reverse Engineering Results Pedro de Jesús González-Palafox(B), Ulises Juárez-Martínez, Oscar Pulido-Prieto, Lisbeth Rodríguez-Mazahua, and María Antonieta Abud-Figueroa Instituto Tecnológico de Orizaba, Orizaba, Veracruz, México {m15011160,maria.af}@orizaba.tecnm.mx, {ujuarez, lrodriguezm}@ito-depi.edu.mx
Abstract. Naturalistic programming is a topic of great interest because it reduces the gap between the problem domain and the solution domain in the software creation process. SN is a naturalistic language prototype that produces Java bytecode, uses indirect references, allows a higher level of description to define procedures, defines nouns that optionally possess a plural form and adjectives that combine with nouns either during definition or instantiation, and presents a limited ability to describe circumstances. The objective of the present work is to apply a reverse engineering process to the SN compiler in order to know the details of the implementation, identify its characteristics, and document the elements of the language of which there is little or no record. Reverse engineering recovers information about the design of the language and its compiler, which facilitates the understanding of the language. This paper presents the results of the reverse engineering process applied to the SN language compiler. The results include the organization of the SN code by class diagrams and package diagrams; in addition, the compilation process of SN programs is detailed. Both results provide insight into the current state of SN and its operation. Keywords: Naturalistic · Compiler internals · Naturalistic programming · Reverse engineering
1 Introduction 1.1 Motivation The use of English as a programming language to reduce the gap between the problem domain and the solution domain inherent in the software development process can be traced back to the 1960s [15]. Computers can process, without any difficulties, programming languages based on unambiguous formalisms at the expense of adapting the requirements of the problem domain. Programming languages have a strong influence on the requirements transformation process resulting in a loss of expressiveness. Understanding expressiveness as the ability to communicate and expose ideas, it is desirable that programming languages achieve a high degree of expressiveness to accurately preserve the ideas in the best possible way. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 J. Mejia et al. (Eds.): CIMPS 2022, LNNS 576, pp. 50–62, 2023. https://doi.org/10.1007/978-3-031-20322-0_4
Existing implementations of naturalistic languages are mainly focused on solving specific problems such as software development automation [6] and simple mathematical problems [2]; however, few efforts have been made toward the formal development of general-purpose naturalistic programming languages. Domain-specific naturalistic languages have the limitation of being useful only to deal with the formalisms of a specific domain, whereas a general-purpose naturalistic language is expected to integrate the properties of natural language to deal with the formalisms of any domain. In that sense, the SN implementation presents a general-purpose naturalistic language prototype whose grammar is a formalized subset of the English language and which generates Java bytecode [13]. This article is structured as follows: Sect. 2 presents the analysis of the physical architecture of the prototype; Sect. 3 reports the analysis of the logical architecture of the prototype; Sect. 4 brings forward a discussion of the implications of the reverse engineering results.

1.2 Background

From its origins in binary systems, through assembly languages, to high-level paradigms such as object-oriented programming or functional programming, programming has constantly evolved to adapt to the needs of software development [3]. Natural languages are the means that people use to describe their needs, which are later interpreted as software requirements; however, they are ambiguous and inefficient for programming. In contrast, programming languages employ mathematical formalisms to eliminate the use of ambiguous definitions. As a result of the differences between natural languages and programming languages, a gap is created between the ambiguity with which the client expresses itself and the formality required to program the system.

Computational Models. A computational model is a formal system that defines a language and how computations are executed by an abstract machine. Each computational model has its own set of techniques for programming and reasoning about programs; the above definition of a computational model is very general. What is a reasonable computational model? A reasonable model is used to solve many problems, has simple and practical reasoning techniques, and is implemented efficiently [16].

Naturalistic Programming. Naturalistic programming is a paradigm based on integrating elements of natural languages to design programming languages whose syntax approaches the descriptive power of natural languages. However, natural languages lack usefulness for programming because of their ambiguity. In contrast, programming languages are difficult to maintain if the code lacks up-to-date documentation, mainly because developers must read and write comments and documentation in order to explain the code they create.
The code itself is not expressive enough to be self-explanatory to other programmers, possibly not even to those involved in the same project [5]. Therefore, a formalized programming language with elements of natural languages has been proposed as a middle ground where a solution may be found [7]. Writing programs with a natural-language approach could reduce the gap between the problem and the solution. On the other hand, the ability to write programs in a naturalistic way would reduce the burden on developers of repeatedly learning new programming languages.
1.3 Related Work

In [7] the authors reflect on how object-oriented programming and aspect-oriented programming improve the encapsulation of scattered elements; however, they point out that their capacity is insufficient. Difficulties in capturing system requirements create, according to [5], a gap between programming techniques and desired system behavior; the authors of [8] reflect on how difficult it is for non-native English-speaking developers to adapt to development environments in an international setting. In [4] the authors take up and extend the reflections carried out in [5]; they conclude that, although natural language has the same purpose as programming languages, natural language has mechanisms that programming languages lack. In [11] the authors go over several concepts previously explored in [4, 5, 7], as well as the use of the referential power of natural language. In [12] and [13] the author details, in the first article, the creation of his model to support naturalistic programming; in the second article he reviews some of the elements previously shown but focuses his efforts on exemplifying the particular implementation of the model. Some examples of naturalistic languages are the following. Pegasus is a tool that allows the development of software from an input program expressed in natural language, from which an executable program, known as the output program, is generated. This program is a representation of the ideas described in the naturalistic language, and the tool offers support to generate code in Java [5]. Pegasus bases its mechanism on an abstraction of the human brain, which needs a dictionary of entities that associates concepts and words, as well as their conjugations. Metafor is a tool designed to support programmers by describing a problem using natural language and generating a codebase in Python. The generation is done through a "dialog" between the programmer and the tool: the computer generates class structures from the programmer's descriptions [6]. Cal 4700 is a development environment and programming language in the form of a Windows program, including a dedicated graphical interface, a file manager, a text editor, a document editor, and a compiler/analyzer that generates native code compatible with the Microsoft Windows operating system and Intel architecture. The entire project is written in Plain English, with a source code of about 25 thousand sentences. Plain English is the style and way of writing instructions in Cal-4700, designed by its creators [14].
SN differs from Pegasus and Metafor in that SN aims to serve as a naturalistic programming language. Although SN uses Scala and AspectJ as an intermediate step (it was implemented through intermediate languages given the time constraints of the original work), the goal of SN is to obtain executable code. In addition, SN only allows writing in English as the natural language, since the solution focuses on a subset of English. The authors of Pegasus define it as a programming system that adjusts to human needs and does not force a human programmer to comply with the requirements of a programming language. According to the authors, it will be possible to code using natural languages such as German or English. Natural-language programs written in Pegasus are translated into several target languages, e.g., Java, C, C++, Ruby, Python, and Haskell. Metafor is a system that aims to help beginner programmers develop programming intuition and facilitate system planning for intermediate programmers. Like Pegasus, Metafor returns source code in another programming language, in this case Python. Cal 4700 and SN share the essence of being naturalistic programming languages rather than code generation tools. The main difference is that Cal 4700 offers a programming language and a complete naturalistic development environment, whereas SN is a general-purpose naturalistic language but does not offer an integrated development environment.

1.4 Reverse Engineering Generalities

The main objective of this work is to report the results of the reverse engineering process applied to the SN compiler; exposing every detail of that process exceeds the scope of this work. However, it is necessary to present the generalities of the process in order to understand the origin of the results, so they are summarized next. The procedure followed the general recommendations for reverse engineering of [9] and focused on extracting the design information from the source code of the SN compiler. As a first step, the unstructured ("dirty") source code was restructured to make it easier to read and to provide the basis for subsequent activities. The core of reverse engineering consists of the abstraction extraction activity, whereby the source code must be evaluated to obtain a meaningful specification of the processing that is performed. In the case of the SN compiler, the abstraction extraction process consists of two major parts: reverse engineering to understand the data and reverse engineering to understand the processing. The reverse engineering to understand the data focused on analyzing the internal data structures used by the compiler by identifying and grouping the related program variables.
Reverse engineering to understand the processing began with an attempt to understand first and then extract procedural abstractions represented by the source code. First, the entire system’s global functionality must be understood to perform a more detailed reverse engineering process where each system package represents a functional abstraction with a high level of detail; the previous allows the generation of a processing narrative for each component. Finally, the code inside each component was analyzed for code sections that represent generic procedural patterns; in almost every component, one section of code prepares the data for processing, a different section of code performs the processing, and another prepares the processing results for export from the component. Smaller patterns may be found within each of these sections, for example, data validation.
2 SN Compiler - Physical Architecture

SN1 is a general-purpose naturalistic language prototype created to validate the model proposed by [10]. That work defines the conceptual model with which SN is built but does not present the details regarding the operation of the SN compiler. Through a reverse engineering process, the physical architecture of the compiler was obtained to assess the state of the code, with its corresponding class diagram and package diagram. Figure 1 shows the representation of the physical architecture of the compiler, which consists of two main components. The main package contains the main classes of the compiler; the starting point of the compiler is in this package.
1 The name SN is inspired by the Latin locution Sicut Naturali which means “as the natural”, but
its creator says the correct name is SN.
Fig. 1. SN compiler package diagram
Figure 2 shows the naturalistic package, which contains the grammar package with the lexical analysis, syntactic analysis, and semantic analysis classes. The other package contained in the naturalistic package is the library package; besides containing the circumstances and instructions packages, the library package contains the classes for code generation of the naturalistic abstractions (nouns and adjectives), the main abstraction, singular and plural attributes, verbs, and the compilation unit. Figure 3 shows the circumstances package, which contains the classes necessary for the code generation of circumstances. In SN, a circumstance is a mechanism to set constraints on which adjectives are allowed for composition, which adjectives are required, or which adjectives are not allowed [1]. Figure 4 shows the instructions package, which contains the blocks and values packages as well as the code generation classes of the instructions. Internally, the SN compiler labels instructions as BasicInstruction or as BlockInstruction. Instruction blocks are
Fig. 2. Class diagram of the library package
composed of basic instructions. In addition, SN has two instructions that it deals with independently; these are: Embedded Grammar2:
Fig. 3. Class diagram of the circumstances package
It is used to work with embedded grammars; and VerbTypeInstruction: It is used for instructions that return some value when called. Figure 5 shows the blocks package, which contains the classes for managing instruction blocks, both repetition and decision blocks. In both cases, the instruction blocks are categorized as LineInstruction or MultipleInstruction: MultipleInstruction blocks are instructions in which each instruction of the block is written on a different line, while LineInstruction blocks, although they also work with multiple instructions, are characterized by being written entirely on a single line.
Fig. 4. Class diagram of the instructions package
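To make the instruction categories described above easier to follow, the sketch below gives one possible Java rendering of that hierarchy. The class names (BasicInstruction, BlockInstruction, LineInstruction, MultipleInstruction) follow the labels used in the text, but the fields and methods are illustrative assumptions rather than the compiler's actual declarations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative hierarchy only; names follow the text, members are assumptions.
abstract class Instruction {
    abstract String generate();                       // emits intermediate code
}

class BasicInstruction extends Instruction {
    final String statement;
    BasicInstruction(String statement) { this.statement = statement; }
    @Override String generate() { return statement; }
}

abstract class BlockInstruction extends Instruction {
    final List<Instruction> body = new ArrayList<>(); // blocks are composed of basic instructions
}

class MultipleInstruction extends BlockInstruction {
    // Each instruction of the block is written on its own line.
    @Override String generate() {
        return body.stream().map(Instruction::generate).collect(Collectors.joining("\n"));
    }
}

class LineInstruction extends BlockInstruction {
    // Several instructions written on a single source line.
    @Override String generate() {
        return body.stream().map(Instruction::generate).collect(Collectors.joining("; "));
    }
}
```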
2 The SN language is focused on describing naturalistic expressions, so it requires a mechanism
to define formalisms of a particular domain to work properly. The mechanism implemented to describe instructions of a specific domain is an embedded grammar.
Fig. 5. Class diagram of the blocks package
Figure 6 shows the values package, which contains the literals package as well as the code generation classes necessary for SN to work with values. Among the most important elements are the class to work with identifiers, the class to define the type of the value, and the classes necessary to work with plural values3. Finally, Fig. 7 shows the literals package, which contains the classes for generating the code of the data types implemented by default in SN.
Fig. 6. Class diagram of the values package
The literals4 implemented in SN are: Boolean, Character, Integer, Null, Real, and String.
3 Plurals are expected to have attributes and verbs, but since their definition depends on the noun they represent, their elements are defined in the noun itself by prefixing the reserved word plural.
4 A literal is a constant value consisting of a sequence of characters. Any statement in SN that defines a constant value - a value that does not change during program execution - is a literal.
Fig. 7. Class diagram of the literals package
3 SN Compiler - Logical Architecture

An analysis of the logical architecture of the compiler was done by reverse engineering in order to understand its internal operation. As a result of this analysis, the scheme of the logical architecture of the SN compiler was obtained. The SN compiler is divided into three stages, which are described below.

3.1 Analysis Phase

The analysis phase is the starting point of the compilation. In this phase, the SN source code goes through the lexical analyzer, which generates the token table and the symbol table. Once this step is finished, the syntactic analysis of the input text is carried out, from which the abstract syntax tree (AST) of the input program is obtained. Subsequently, the semantic analysis of the source code is done; this analyzer reviews the syntactic restrictions of SN and, additionally, creates a data stack that is used to support indirect references. Once the parsing phase is completed, a valid SN AST and the symbol table are obtained as a result. The schematic of the analysis phase of the compiler is shown in Fig. 8.

3.2 Intermediate Processing Stage

The intermediate processing phase starts with the AST generated in the analysis phase. First, the intermediate representation of the SN program is generated: the CompilationUnit. This intermediate representation contains different data structures to manage the various components of the SN program. Among the most notable structures are: 1. HashMap importList, used to store and manage the import of the packages necessary
Fig. 8. SN compiler architecture - analysis phase
for the correct operation of the intermediate code; 2. ArrayList naturalisticAbstractions, used to contain all naturalistic abstractions; 3. ArrayList fileNames, containing the names of the intermediate code files that will need to be generated. A minimal sketch of this structure is shown below. Each naturalistic abstraction (nouns and adjectives) is in charge of managing its instructions individually. Additionally, the main abstraction is in charge of managing all instructions, and blocks of instructions, that are not contained in any noun or adjective. The schematic of the intermediate processing phase is shown in Fig. 9.
Fig. 9. SN compiler architecture - intermediate processing phase
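The CompilationUnit structures named above (importList, naturalisticAbstractions, fileNames) can be pictured with the following minimal Java sketch. The field names come from the text; the types, the marker interface, and the helper method are assumptions made only for illustration and are not the compiler's actual code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the intermediate representation described in Sect. 3.2.
interface NaturalisticAbstraction {               // nouns and adjectives
    String name();
}

class CompilationUnit {
    // Packages that the generated intermediate code must import.
    final Map<String, String> importList = new HashMap<>();

    // Every naturalistic abstraction (noun or adjective) declared in the program.
    final List<NaturalisticAbstraction> naturalisticAbstractions = new ArrayList<>();

    // Names of the intermediate-code files (Scala and AspectJ) to generate.
    final List<String> fileNames = new ArrayList<>();

    void register(NaturalisticAbstraction abstraction) {
        naturalisticAbstractions.add(abstraction);
        // The compiler emits a Scala file per abstraction and, when circumstances
        // apply, an AspectJ validator (e.g., Point.scala and Point_ValidatorAspect.aj).
        fileNames.add(abstraction.name() + ".scala");
    }
}
```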
From the intermediate representation, and through code generation rules, the intermediate code is generated. Each language, Scala and AspectJ, has its own generation rules. As a result of this phase, a series of Scala and AspectJ files are obtained, with which the final stage of the SN compilation process begins.

3.3 Destination Processing Phase

The last phase of the SN compilation process is the destination processing phase. In this phase, the Scala and AspectJ files obtained in the intermediate processing phase are compiled using automated commands.
Internally, priority is given to the Scala code, which is the first to be compiled; once this compilation is completed, the conditions described by the circumstances are applied. For the circumstances, the compiler uses aspect-oriented programming, translating the circumstances into AspectJ code. As a final result of this process, several bytecode files are obtained, which are executed by the Java Virtual Machine. The schematic of the destination processing phase is shown in Fig. 10.
Fig. 10. SN compiler architecture - destination processing phase
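The paper does not list the "automated commands" used in this phase; the following Java sketch merely illustrates how such a driver could invoke the external Scala and AspectJ compilers through ProcessBuilder. The command names, flags, and file names are assumptions (the file names echo the example in Sect. 3.4), not SN's actual build commands.

```java
import java.io.IOException;
import java.util.List;

// Illustrative driver for the destination processing phase; commands are assumed.
public class TargetPhaseDriver {
    static void run(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).inheritIO().start();
        if (p.waitFor() != 0) {
            throw new IllegalStateException("Command failed: " + command);
        }
    }

    public static void main(String[] args) throws Exception {
        // 1) The Scala sources are compiled first...
        run(List.of("scalac", "-d", "out", "Point.scala"));
        // 2) ...then the AspectJ compiler weaves the circumstance validators into the bytecode.
        run(List.of("ajc", "-1.8", "-inpath", "out", "-d", "out", "Point_ValidatorAspect.aj"));
        // The resulting .class files in 'out' are executed by the JVM.
    }
}
```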
3.4 Example of Compilation Process

An example of the SN compilation process is presented in Fig. 11. The code below defines the noun Point and the adjective ThreeDimensional, and finally defines the compound noun ThreeDPoint from both abstractions. As a result of the analysis and intermediate processing phases, the compiler generates the Point.scala file and the Point_ValidatorAspect.aj file; then, in the destination processing phase, it invokes the corresponding Scala and AspectJ libraries to generate the .class executable files.

noun Point:
  attribute X is 0.
  attribute Y is 0.
adjective ThreeDimensional:
  attribute Z is 0.
noun ThreeDPoint is a ThreeDimensional Point.
Fig. 11. SN compiler example - three dimensional point example
4 Conclusions

The result of this reverse engineering process allows looking through the SN compiler. The reverse engineering process provides knowledge of the current state of the compiler, recovering information about its design, of which there was little or no record, and also provides knowledge of the language compilation process; understanding these details of the compiler's structure and operation sets a path for the language's extension and refinement. Understanding how the compiler works and identifying the parts that compose it is only the first step toward consolidating the language. To achieve this objective, it is necessary not only to attract the interest of developers towards the naturalistic paradigm but also to provide them with the necessary tools to integrate into the SN language's development. The naturalistic paradigm has high potential, and it is necessary to spread it so the developer community knows about its benefits; for this purpose, it is important not only to make the naturalistic languages themselves known, but also to supply developers with information about the functioning of naturalistic languages. The results presented in this work assist the understanding of the SN language and help remove the esoteric mantle that surrounds the concept of naturalistic languages. As future work, we consider continuing to refine the compiler, both in structure and processes, removing the intermediate languages, and expanding the language to add support for graphical interfaces and web capabilities. Acknowledgments. This work is supported by Consejo Nacional de Ciencia y Tecnología (CONACyT).
References 1. Alducin-Francisco, L.M., Juarez-Martinez, U., Pelaez-Camarena, S.G., Rodriguez- Mazahua, L., Abud-Figueroa, M.A., Pulido-Prieto, O.: Perspectives for soft- ware development using the naturalistic languages. In: 2019 8th International Conference on Software Process Improvement (CIMPS), pp. 1–11 (2019). https://doi.org/10.1109/CIMPS49236.2019. 9082428 2. Biermann, A.W., Ballard, B.W.: Toward natural language computation. Comput. Linguist. 6(2), 71–86 (1980) 3. Booch, G.: Object-Oriented Analysis and Design with Applications, 3rd edn. Addison Wesley Longman Publishing Co., Inc, USA (2004)
4. Knöll, R., Gasiunas, V., Mezini, M.: Naturalistic types. In: Proceedings of the 10th SIGPLAN Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, pp. 33–48 (2011) 5. Knöll, R., Mezini, M.: Pegasus: first steps toward a naturalistic programming language. In: Companion to the 21st ACM SIGPLAN Symposium on Object-Oriented Programming Systems, Languages, and Applications, pp. 542–559. OOPSLA 2006, Association for Computing Machinery, New York, NY, USA (2006). https://doi.org/10.1145/1176617.1176628, https:// doi.org/10.1145/1176617.1176628 6. Liu, H., Lieberman, H.: Metafor: Visualizing stories as code. In: Proceedings of the 10th International Conference on Intelligent User Interfaces, pp. 305 307. IUI 2005, Association for Computing Machinery, New York, NY, USA (2005). https://doi.org/10.1145/1040830. 1040908, https://doi.org/10.1145/1040830.1040908 7. Lopes, C.V., Dourish, P., Lorenz, D.H., Lieberherr, K.: Beyond aop: Toward naturalistic programming. SIGPLAN Not. 38(12), 34–43 (2003). https://doi.org/10.1145/966051.966058, https://doi.org/10.1145/966051.966058 8. Mefteh, M., Bouassida, N., Ben-Abdallah, H.: Towards naturalistic programming: Mapping language-independent requirements to constrained language specifications. Sci. Comput. Program. 166, 89–119 (05 2018). https://doi.org/10.1016/j.scico.2018.05.006 9. Presmman, R.: Ingeniería del Software un Enfoque Práctico. Mc. Graw Hill, Esta- dos Unidos de America (2010) 10. Pulido-Priedo, O.: Modelo conceptual para la implementación de lenguajes de programación naturalísticos de propósito general. PhD. thesis, Instituto Tecnológico de Orizaba (2019) 11. Pulido-Prieto, O., Ju rez-Martínez, U.: A survey of naturalistic programming technologies. ACM Comput. Surv. 50(5), 1–35 (2017). https://doi.org/10.1145/3109481, https://doi.org/10. 1145/3109481 12. Pulido-Prieto, O., Juárez Martínez, U.: A model for naturalistic programming with implementation. Appl. Sci. 9, 3936 (2019). https://doi.org/10.3390/app9183936 13. Pulido Prieto, O., Ju rez Martínez, U.: Naturalistic programming: model and implementation. IEEE Latin Am. Trans. 18(07), 1230–1237 (2020). https://doi.org/10.1109/TLA.2020.909 9764 14. Romero, J.A.J.: Desarrollo de software utilizando el lenguaje de programación naturalístico Cal-4700. PhD. thesis, Instituto Tecnológico de Orizaba (2022) 15. Sammet, J.E.: The use of English as a programming language. Commun. ACM 9(3), 228–230 (1966). https://doi.org/10.1145/365230.365274, https://doi.org/10.1145/365230.365274 16. Van Roy, P., Haridi, S.: Concepts, Techniques, and Models of Computer Programming. MIT Press (2004)
A Software Development Model for Analytical Semantic Similarity Assessment on Spanish and English

Omar Zatarain1, Efren Plascencia JR4, Walter Abraham Bernal Diaz3, Silvia Ramos Cabral1, Miguel De la Torre1, Rodolfo Omar Dominguez Garcia1(B), Juan Carlos Gonzalez Castolo2, and Miriam A. Carlos Mancilla5

1 Departamento de Ciencias Computacionales e Ingenierías, University of Guadalajara (Universidad de Guadalajara), 44450 Guadalajara, Jal, Mexico {omar.zatarain,silvia.ramos,miguel.dgomora,odomi}@academicos.udg.mx
2 Departamento de Sistemas de Información, Universidad de Guadalajara, Guadalajara, Mexico
3 Maestría en Ingeniería de Software, Universidad de Guadalajara, Ameca, Mexico
4 Ing. en Electrónica y Computación, Universidad de Guadalajara, Ameca, Mexico
5 (CIIDETEC-UVM), Universidad del Valle de México, Tlaquepaque, Mexico
miriam [email protected]
Abstract. We propose a software development method for semantic text similarity based on the analysis of structural properties of pairs of text snippets. An agile software method focuses on the detection of biases in the generated prototypes to enhance the quality of the expected features. A system for the detection of semantic text similarities is designed and implemented using the software method. The system design defines a set of structures for the recording and computation of similarities using algorithms with polynomial complexities. The algorithms exploit knowledge bases for information extraction. Two implementations, for English and Spanish, were produced to test the portability of the system on European languages. An experiment on Spanish and English text snippets is performed to test the performance of the prototypes. Results show that the accuracy of the proposed algorithms improves as the knowledge base content increases. The advantage of this methodology is the elicitation of similarities between texts and the assessment of the similarity degree from scratch, without prior knowledge assumptions.

Keywords: Knowledge base · Semantic analysis · Text similarity · Perception of textual structures

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 J. Mejia et al. (Eds.): CIMPS 2022, LNNS 576, pp. 63–83, 2023. https://doi.org/10.1007/978-3-031-20322-0_5

1 Introduction
Semantic text similarity is a challenging task due to the presence of synonyms, aphorisms, obscure meaning of words, ambiguity, incomplete knowledge, as well
as structural properties of the syntax of a natural language. The most successful methodologies in semantic text similarity are combinations of word embedding, deep learning, and wordnets [4, 9, 11]. These methods are based on learning strategies to improve performance; however, supervised learning strategies typically require large labeled datasets to learn the underlying properties of the data. At the same time, the digital activities of people generate scientific and factual knowledge every day; this phenomenon expands and updates the domains of knowledge and, in some cases, generates new knowledge domains. Despite the breakthroughs in natural language processing using deep learning, analytical strategies for the elicitation of new concepts remain key to assimilating new knowledge [8].

This paper presents a set of analytical rules and a modified agile software methodology to exploit knowledge bases and detect the similarities between two texts. The goal is to obtain the similarities of a pair of text snippets in Spanish using the structures of a natural language and a knowledge base to elicit the semantic properties and assess a degree of similarity. Our research is motivated by the following questions:

1. How is it possible to assess the similarity of a pair of texts without knowing or training on a given domain?
2. If such an assessment is possible, how can a method report the similarities and differences of documents using a set of reasoning rules and strategies?

The first question implies the inclusion of analysis as a replacement for memorization; the second refers to the skill of pinpointing the similarities and differences in a human-readable way instead of using vector representations of knowledge. Current text similarity methodologies use supervised strategies such as the training of a neural network or support vector machines; these methodologies demand big training sets in order to enhance their performance, and a few of them have reached outstanding results. However, it is desirable to find methodologies that are capable of assessing the knowledge contained within a pair of texts with resilience to scarce sources of information, capable of making inferences regardless of the domain of knowledge, and with no previous assumptions beyond those imposed by the language of use and its grammar. The objectives of this research are:

1. Provide an analytical method for the development of natural language processing systems.
2. Develop the structures for recording the semantic and syntactic properties of a text.
3. Adapt agile strategies to complex issues in natural language processing such as non-determinism of expressions, use of synonyms, and fault tolerance on incomplete information of the provided sources.

The scope of this research is to design a method that enables the extraction of information from a knowledge base to elicit the similarities of a pair of text
snippets. The knowledge base contains the definitions of terms, synonyms, and antonyms. The methodology focuses on the rules of syntax properties and the lexicon of a natural language contained in a knowledge base; therefore, the performance is affected by the content of the database. The rest of this article is structured as follows: Sect. 2 describes the current state of the art on semantic text similarity, Sect. 3 describes the development strategy focused on the detection of biases on features that the system should accomplish, Sect. 4 describes the normalized criteria for assessment of the parts of speech, Sect. 5 defines the structures used for recording the findings on the parsing of a sentence, verbal phrases, and similarity, Sect. 6 describes the algorithms used for the creation of the systems that find the similarities, Sect. 7 describes the specific features of the implementations in Spanish and English, Sect. 8 provides the details of the experiments performed for both implementations, and Sect. 9 shows the outcome of both experiments.
2 State of the Art
The most proficient semantic text similarity methods include a combination of techniques such as wordnets [6], word embedding [5], recurrent neural networks [1], and alignment. Word embedding [5] is a way to represent knowledge as high-dimensional vectors for managing several domains of knowledge. Recurrent neural networks [1] are supervised systems with the capacity to learn multiple features in natural language processing, such as phonology and named entity recognition; in semantic text similarity they are used in combination with word embedding to learn the assessment according to the SemEval 2017 scale [2]. One of the most successful methods in semantic text analysis is a universal model for multilingual and cross-lingual semantic textual similarity [9]. The universal model uses a combination of sequence features, syntactic parse features [14], alignment features [15], bag of words, dependency features, word embedding [5, 13, 19], regression algorithms, a deep learning module based on a deep averaging network [17], and a long short-term memory network [3]. Semantic information space [11] focuses on information content [20], which uses statistics of the frequency of concepts, applies a preprocessing step with named entity recognition [16], and develops three methods: the first method is unsupervised and uses the hypernym-hyponym relations of a wordnet together with probabilities based on a corpus, the second method uses support vector machines [21] and alignment, and the third is a deep learning method that uses word embeddings and linear regressions. Alignments, sentence-level embeddings and Gaussian mixture model [4] include the embedding strategy word2vec [5], wordnets [6], and the Hungarian algorithm for alignments [7], and produce sentence vectors to which the cosine similarity is applied. The use of convolutional neural networks [12] includes a preprocessing step to produce word embeddings [19] and uses high-dimension vectors to train the neural networks with pooling operations. Knowledge representation methodologies such as the Semantic Web and ontologies [22] and wordnets [6] describe explicit relationships between concepts and have the advantage of a direct extraction of such relations; the disadvantage is
that in many cases establishing such hard relations requires human specification on a case-by-case basis, and therefore the ontology or wordnet may have biases regarding the concepts and their relations when brand new knowledge is being analyzed.
3 Agile Development of the Similarity Assessment System
This section describes the methodology used to detect the biases produced by the complexity of the issues to solve; it also describes the risk assessment performed to reduce the number and complexity of issues, and a development process to enhance the quality of the prototypes. The development methodology considers the use of prototypes and stages of development similar to the stories and artifacts described in the agile methodology of SCRUM [10], with the following differences:

1. The development environment is immersed in research instead of commercial software environments.
2. The project backlog is described under the agreement of the student and the researcher.
3. Once a prototype is discarded, an analysis of faulty requirements detects the biases or missing parts.
4. If a prototype was discarded and/or there were biases that require new requirements and/or already considered requirements that are not implemented yet, a new sprint with the non-implemented requirements develops the new baseline to address the faulty/missing requirements.

The prototype produced in any iteration is evaluated towards the enhancement of the structures for the acquisition of linguistic features. The process of system development consists of the following steps:

1. Specification of the initial requirements,
2. Design of the functional properties for the prototype,
3. Development, including the detection of issues,
4. Specification of test cases,
5. Assessment of the prototype: analysis of the achieved features and detection of biases on the specification,
6. Re-engineering of the non-implemented requirements,
7. Specification of the integrated backlog for the next sprint.

The prototype assessment is key to enhancing the software product; thus, it is expected that each sprint may produce the detection of biases due to the complexity of the natural language under study.
3.1 Risk Assessment
Managing the assessment requires a qualitative and quantitative approach to avoid risks. The risk management has the following principles:

– Measure the state of maturity of the latest prototype against the biases found in the sprint; for each feature considered by the sprint requirements, contrast it against the test samples provided by the testing dataset. If the new requirements are difficult to address in the next sprint, record the needs in the plan, produce the analysis of requirements, and include them in the project's backlog.
– Prioritize the requirements of the next sprint taking into account the closest requirements according to old and new project needs.
– Avoid big changes unless a critical requirement is identified.
– Prioritize key requirements instead of regular requirements for the next sprint.
– Measure the degree of change proposed/identified in the detection of biases by reviewing the current sprint baseline, comparing it against the specification of the new requirements, and assessing the latter according to the scale [absent, partially considered, mostly considered].

3.2 Development Process
A method to detect requirement biases is proposed due to the complexity of linguistic phenomena. In fact, the analysis of any language is prone to biases that are likely to be ignored during requirement elicitation and analysis. The development process performs the assessment of achievements for the prototype produced in each sprint and applies a risk analysis to identify biases in the specification of requirements. Figure 1 shows the process of a sprint, which starts with the analysis of the requirements to be addressed in the sprint. The design of the solution and the implementation follow; besides the assessment of the quality of the implemented requirements, the testing part yields a set of identified cases where the test cases found unexpected results and linguistic features. These unexpected results and linguistic features are analyzed and described in their complexity to be added in the next or a future sprint.
4 Semantic Similarities
In this section, the equations to produce the assessment of semantic similarities of texts are defined. The linguistic types considered for similarity assessment are noun phrases (NP), subject-verb-object (SVO) clauses, and questions. The similarity assessments developed in this section are inspired by the Jaccard similarity; this type of metric was chosen instead of others, such as cosine similarity, because the objective of the research is to avoid transformations of the words, for example word-embedding metrics that map the words into high-dimensional vectors. The motivation to avoid the trend of knowledge representations with vectors is the need to provide analytical strategies for autonomous extraction of
Fig. 1. Methodology for prototype assessment on SCRUM [10]. The modification is the analysis to detect the biases not addressed by the sprint and the elicitation of updated/created requirements.
the knowledge and provide rule-based reasoning on the similarities of pairs of texts rather than training neural networks that use such high-dimensional vectors.

Definition 1. Similarity of synonyms. Let Tx and Ty be a pair of texts. The similarity by synonymy of a pair of words is given as the degree of common knowledge specified by the matching (1) of the synonym sets (synsets) S1 ∈ Tx and S2 ∈ Ty, where Tx and Ty are purged of stopwords¹ which are extracted from a knowledge base, dictionary or thesaurus.

M(S_1, S_2) = \begin{cases} 1 & \text{if } S_1 \cap S_2 \neq \emptyset \\ 0 & \text{if } S_1 \cap S_2 = \emptyset \end{cases} \qquad (1)
Definition 2. Similarity of NPs. Let Tx and Ty be a pair of texts containing only NPs. The similarity of a pair of NPs is defined as the degree of common knowledge specified by the matching (2) of the synonym sets Tx(i).S and Ty(k).S of a pair of words in Tx and Ty, where Tx and Ty are purged of stopwords which are extracted from a knowledge base, dictionary or thesaurus.
¹ A stopword is a word that is widely used in texts regardless of the domain of the text, for example determiners, pronouns, conjunctions, and adverbs, as well as a few verbs such as be, have, and do.
S(T_x, T_y) = \frac{\sum_{i=1}^{|T_x|} \sum_{k=1}^{|T_y|} M(T_x(i).S,\, T_y(k).S)}{|T_x \cup T_y|} \qquad (2)
Definition 3. Similarity of SVOs or VSOs. Let Tx and Ty be a pair of texts containing SVOs (sentences) and/or VSOs (questions). The similarity of a pair of SVO and/or VSO clauses is given as the degree of common knowledge specified by the matching of the parts of speech at the subjects S1 and S2, verbs V1 and V2, and objects O1 and O2, where α is a discount factor applied under conditions of partial similarity of related concepts in one or several of the parts of speech.

Sim(T_x, T_y) = \frac{S(S1, S2) + S(V1, V2) + S(O1, O2)}{3} - \alpha \qquad (3)

One case where the discount factor is non-zero is when there is similarity but the verbs are dissimilar after the analysis through synonyms; then α = 0.06. Another case is when differences are detected between the two sentences; in this case α = diff × 1/(2δ), where diff is the number of differences found between the two texts and δ stands for the number of different synsets found by applying Eq. (1).
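For illustration only, a minimal Python sketch of Eqs. (1)–(3) is given below; it is not the implementation used in this work, the synset lookup is a placeholder for a knowledge-base query, stopword removal is assumed to have been done beforehand, and the discount factor is passed in as a precomputed value.

# Sketch of Eqs. (1)-(3); synsets() stands in for a knowledge-base lookup.
def match(s1: set, s2: set) -> int:
    # Eq. (1): 1 if the two synsets share at least one element, 0 otherwise
    return 1 if s1 & s2 else 0

def np_similarity(tx: list, ty: list, synsets) -> float:
    # Eq. (2): tx and ty are lists of words already purged of stopwords
    matches = sum(match(synsets(w1), synsets(w2)) for w1 in tx for w2 in ty)
    return matches / len(set(tx) | set(ty))

def svo_similarity(sx, vx, ox, sy, vy, oy, synsets, alpha=0.0) -> float:
    # Eq. (3): average the part-of-speech similarities and subtract a discount
    s = np_similarity(sx, sy, synsets)
    v = np_similarity(vx, vy, synsets)
    o = np_similarity(ox, oy, synsets)
    return (s + v + o) / 3 - alpha

# Toy example (illustrative only), echoing the basketball/baseball case:
toy = {"basketball": {"game", "sport"}, "baseball": {"game", "sport"}}
print(match(toy["basketball"], toy["baseball"]))  # -> 1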
5 Structural Models
The information contained within the texts is analyzed and stored based on the structural properties of the clauses. A series of structures must be developed to capture the syntax structures on one hand, while on the other hand the semantics must be denoted. The structures designed for extraction on a natural language are the following:

1. Definition structure
2. Sentence structure
3. Verb tense structure
4. Similarity structure
5.1 Definition Structure
The definition structure describes the content of a concept's definition and is used for the population of the dictionary and the thesaurus that constitute the knowledge base. The definition structure has the following fields, as shown in Fig. 2: type, objects (divided into antonyms and synonyms), and attributes as a plain-text sentence describing the concept. The knowledge base is populated with entries that indicate implicit relations of synonymy with other concepts that may, but do not necessarily, appear within the dictionary or thesaurus. The reason for having implicit relations instead of explicit relations is that most available knowledge is non-structured and, in some cases, there is no available ontology or the existing ontology has limited relations between concepts, especially for brand new knowledge.
Fig. 2. The definition structure.
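As a reading aid, the definition structure of Fig. 2 could be rendered as in the following Python sketch; the field names follow the description above and are illustrative only, not the authors' code.

from dataclasses import dataclass, field
from typing import List

# Sketch of the definition structure (Fig. 2): a concept type, its related
# objects split into synonyms and antonyms, and a plain-text description.
@dataclass
class Definition:
    type: str                                        # lexical type of the concept
    synonyms: List[str] = field(default_factory=list)
    antonyms: List[str] = field(default_factory=list)
    attributes: str = ""                             # plain-text sentence describing the concept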
5.2 Sentence Structure
As shown in Fig. 3, the sentence structure has the purpose of capturing the analysis of the parts of speech of a sentence. It contains information related to the sentence: the type of each term (a term may be typed as noun, adjective, verb, adverb, conjunction, or determiner), the positions of the subject, the object, and the verb, and the sets of synonyms for each term.
Fig. 3. The sentence structure.
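For readability, the fields listed in Fig. 3 can be sketched as Python dataclasses as follows; this is only an illustrative rendering of the figure, not the authors' implementation, and the element types of some arrays are assumptions.

from dataclasses import dataclass, field
from typing import List

# Sketch of the per-term content and the sentence structure of Fig. 3.
@dataclass
class TermContent:
    word: str
    types: List[str] = field(default_factory=list)     # possible lexical types of the term
    root: str = ""                                      # root/infinitive form if the term is a verb
    synset: List[str] = field(default_factory=list)     # synonyms retrieved from the knowledge base

@dataclass
class Sentence:
    content: List[TermContent] = field(default_factory=list)
    vtenses: List[dict] = field(default_factory=list)   # candidate verb tense structures
    type: str = ""                                       # e.g. "SENTENCE" or "QUEST"
    sentence_type: str = ""                              # e.g. "SVO", "NP", "QUESTION"
    verb_positions: List[int] = field(default_factory=list)
    infinitives: List[str] = field(default_factory=list)
    adverbs: List[int] = field(default_factory=list)
    adjectives: List[int] = field(default_factory=list)
    pronouns: List[int] = field(default_factory=list)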
5.3 Verb Tense Structure
The verb tense structure shown in Fig. 4 holds the information related to the action within a clause. It may contain the features of composed verb phrases such as passive voice or continuous forms, and it includes fields to detect whether the verb phrase includes modal verbs or negations, the relative positions of the verbs, and the infinitive form of the verb.
Fig. 4. The verb tense structure.
5.4 Similarity Structure
The similarity structure shown in Fig. 5 captures the alignments of the parts of speech (POS) between texts and the order of content regarding the POS. The contents of this structure are used for the assessment of similarities.
Fig. 5. The similarity structure.
6 Process Models
This section describes the three key steps to produce the similarity assessment from scratch and using a knowledge base. The process models define the steps to
achieve the extraction and recording of relevant information from the sentences and the knowledge base to elicit the similarities through the analysis of synonyms and morphologies. Three process models are created:

1. Preprocessing
2. Similarities extraction process
3. Assessment process

6.1 Preprocessing
The preprocessing performs a parsing of a non-annotated text and produces the parts of speech, recording them into the sentence structure. The parsing also produces one or several verb tenses and records them into a verb tense structure. Algorithm 1 shows the preprocessing. The first step generates one or more blocks of text following the rules defined for the detection of SVO clauses or questions from the positions of punctuation signs. Once the blocks are defined, for each word in each block, the possible lexical types and its synsets are extracted from a knowledge base. Next, a set of rules to obtain the possible verb tenses is tested on each block. If the verb tenses are non-empty, a process selects one of the verb tenses as the most probable verb tense case. Finally, the blocks are regrouped by taking into account the SVO/Question cases against the noun phrases, producing either clauses with complex subjects or complex objects according to the relative positions of the noun phrases against the SVO/Question clause.

Algorithm 1. Preprocessing of a text
Input: a text S as a sequence of words, SW as stopwords, KB as a link to the knowledge base
Output: B as blocks of sentences, NPs and/or questions
1: B ← DetectBlocks(S, SW, KB)
2: blocksize ← size(B)
3: for i = 1 to blocksize do
4:   B(i).PS ← ParseBlock(B(i), SW, KB)
5:   B(i).VTenses ← GetVT(B(i), SW, KB)
6:   if B(i).Type = "SENTENCE" then
7:     if B(i).VTenses ≠ ∅ then
8:       B(i).SelVT ← SelVTs(B(i))
9:       B(i).SentenceType ← "SVO"
10:    else
11:      B(i).SentenceType ← "NP"
12:    end if
13:  end if
14:  if B(i).Type = "QUEST" then
15:    B(i).SentenceType ← "QUESTION"
16:  end if
17: end for
18: B ← RegroupBlocks(B, SW, KB)
Algorithm 2 shows the parsing of a block. The first step is to create a sentence structure; then, for each token in S, the token is assigned to the field word, the type (or types) of the token is extracted from the knowledge base, if one of the labels of the word is verb then the root is extracted, and finally the sets of synonyms are extracted from the knowledge base.

Algorithm 2. ParseBlock
Input: a text S as a sequence of words, SW as the stopwords, KB as a link to the knowledge base
Output: PS as a structure of the parsed sentence
1: PS ← CreateParsedStructure()
2: psize ← size(S)
3: PS.Content ← Struct(psize)
4: for i = 1 to psize do
5:   PS.Cont(i).word ← S(i)
6:   PS.Cont(i).type ← ExtLabels(S(i), SW, KB)
7:   PS.Cont(i).root ← ExtRoots(S(i), SW, KB)
8:   PS.Cont(i).Syn ← ExtSynsets(S(i), SW, KB)
9:   if PS.Cont(i).root ≠ ∅ then
10:    PS.Vpositions ← PS.Vpositions ∪ {i}
11:  end if
12: end for
6.2 Similarity Extraction
The similarity extraction performs a comparison of the parts of speech between two text snippets to bind and record them into the similarity structure. The similarity extraction is driven by six types of combinations, as described in Table 1. Class I (NP-NP) addresses the similarity as a combination of the Jaccard similarity, variations on the word forms, and the use of synonyms through the extraction of related concepts from a knowledge base. Class II (SVO-SVO) applies the similarity described in class I to the parts of speech of both SVO clauses and analyzes the appearances of the NPs with regard to subjects and objects; the ordering is necessary due to the actor or non-actor role that an NP may have. Class III also applies an analysis by POS, by reasoning analogous to class II. Class IV requires the extraction of the subject and object from the SVO to compare them against the NP; in this case the verb is secondary due to the absence of an action in one of the texts. Class V requires the analysis between the POS of both texts; however, the texts may not have full similarity because one of the texts is an SVO clause and the other is a question. Algorithm 3 shows the process for similarity extraction; it requires as inputs a pair of text blocks, a set of stopwords, and a link to the knowledge base. The process starts with a preprocessing of the blocks, as described in Algorithm 4, performing the morphological analyses of words, the analyses of sets of synonyms
between tokens from both sets of blocks at the levels of subjects, verbs, and objects, and the extraction of the similarity properties throughout both sets of blocks. Once the pair preprocessing is finished, and based on the type detected for each block, an analysis is performed based on the combination detected, as specified in Table 1.

Table 1. Classes of similarities from the syntactic properties

Type | Class | Criteria for similarity analysis
I | NP-NP | Analysis of BoW (Jaccard similarity, morphologies and synsets), morphological and sets of synonyms
II | SVO-SVO | Analysis by parts of speech (subject, objects and verbs), order of NPs
III | Q-Q | Use BoW for subjects, verbs and objects of question clauses
IV | SVO-NP | Use BoW between the NPs of the SVO against the NP of the other text
V | SVO-Q | Use BoW for subjects, verbs and objects of SVO and question clauses
VI | Q-NP | Use BoW between the NPs of the question against the NP of the other text

6.3 Assessment
The assessment produces a weighting of the similarities produced by the extraction to yield a degree of similarity within the range [0, 5] according to the criteria defined in SemEval. Table 2 is extracted from SemEval 2017 [2] and defines the degrees of similarity for the scale. Algorithm 5 for assessment takes as inputs the similarity structures and produces the assessment judgement as output. Since our functions to obtain the similarities of classes I to VI produce assessments in the range [0, 1], the average is multiplied by 5 to fit the SemEval 2017 scale [2].

Table 2. SemEval similarity scale [2]

Degree | Similarity criteria
5 | The two sentences are completely equivalent, as they mean the same thing
4 | The two sentences are mostly equivalent, but some unimportant details differ
3 | The two sentences are roughly equivalent, but some important information differs/is missing
2 | The two sentences are not equivalent, but share some details
1 | The two sentences are not equivalent, but are on the same topic
0 | The two sentences are completely dissimilar
Algorithm 3. Similarities Detection Algorithm
Input: B1, B2 as blocks from a pair of texts, SW as the set of stopwords, KB as a link to the knowledge base
Output: Sim as a set of similarity structures
1: b1size ← size(B1)
2: b2size ← size(B2)
3: Sim ← ∅
4: [B1, B2] ← PairPreprocessing(B1, B2)
5: sc ← 0
6: for i = 1 to b1size do
7:   for k = 1 to b2size do
8:     if B1(i).ST = "NP" ∧ B2(k).ST = "NP" then
9:       [flag, Similarity] ← Case I(B1(i), B2(k))
10:      if flag = true then
11:        sc ← sc + 1
12:        Sim(sc).similarity ← Similarity
13:      end if
14:    end if
15:    if B1(i).ST = "SVO" ∧ B2(k).ST = "SVO" then
16:      [flag, Similarity] ← Case II(B1(i), B2(k))
17:      if flag = true then
18:        sc ← sc + 1
19:        Sim(sc).similarity ← Similarity
20:      end if
21:    end if
22:    if B1(i).ST = "QUEST" ∧ B2(k).ST = "QUEST" then
23:      [flag, Similarity] ← Case III(B1(i), B2(k))
24:      if flag = true then
25:        sc ← sc + 1
26:        Sim(sc).similarity ← Similarity
27:      end if
28:    end if
29:    if B1(i).ST = "SVO" ∧ B2(k).ST = "NP" then
30:      [flag, Similarity] ← Case IV(B1(i), B2(k))
31:      if flag = true then
32:        sc ← sc + 1
33:        Sim(sc).similarity ← Similarity
34:      end if
35:    end if
36:    if B1(i).ST = "SVO" ∧ B2(k).ST = "QUEST" then
37:      [flag, Similarity] ← Case V(B1(i), B2(k))
38:      if flag = true then
39:        sc ← sc + 1
40:        Sim(sc).similarity ← Similarity
41:      end if
42:    end if
43:    if B1(i).ST = "QUEST" ∧ B2(k).ST = "NP" then
44:      [flag, Similarity] ← Case VI(B1(i), B2(k))
45:      if flag = true then
46:        sc ← sc + 1
47:        Sim(sc).similarity ← Similarity
48:      end if
49:    end if
50:  end for
51: end for
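The chained case analysis of Algorithm 3 can also be expressed as a table-driven dispatch. The following Python sketch is an illustrative alternative, not the authors' implementation; the case functions are placeholders that stand in for the class I–VI analyses of Table 1.

# Sketch of a dispatch-table version of Algorithm 3; the case functions are
# placeholders for the class I-VI analyses.
def _placeholder_case(b1, b2):
    return False, 0.0

case_i = case_ii = case_iii = case_iv = case_v = case_vi = _placeholder_case

CASES = {
    ("NP", "NP"): case_i,
    ("SVO", "SVO"): case_ii,
    ("QUESTION", "QUESTION"): case_iii,
    ("SVO", "NP"): case_iv,
    ("SVO", "QUESTION"): case_v,
    ("QUESTION", "NP"): case_vi,
}

def detect_similarities(blocks1, blocks2):
    # Mirrors the nested loops of Algorithm 3: every block of the first text
    # is compared against every block of the second text.
    similarities = []
    for b1 in blocks1:
        for b2 in blocks2:
            handler = CASES.get((b1.sentence_type, b2.sentence_type))
            if handler is None:
                continue
            found, similarity = handler(b1, b2)
            if found:
                similarities.append(similarity)
    return similarities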
Algorithm 4. Pair Preprocessing Algorithm
Input: B1, B2 as blocks from a pair of texts, SW as the set of stopwords, KB as a link to the knowledge base
Output: Updated blocks B1 and B2 and Properties
1: [B1, B2] ← Morpho(B1, B2)
2: [B1, B2] ← Synsets(B1, B2)
3: [B1, B2] ← VerbSynSets(B1, B2)
4: [Properties] ← GeneralProperties(B1, B2)
Algorithm 5. General Assessment
Input: SimStruct as the set of similarities
Output: GE as the assessment of the pair
1: simsize ← size(SimStruct)
2: GE ← 0
3: for i = 1 to simsize do
4:   GE ← GE + SimStruct(i).Similarity
5: end for
6: GE ← 5 * GE / simsize
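A brief Python sketch of the general assessment step (Algorithm 5) and its rescaling onto the SemEval range is shown below; the function and variable names are illustrative only.

# Sketch of Algorithm 5: average the per-pair similarities (each in [0, 1])
# and rescale to the SemEval [0, 5] range.
def general_assessment(similarities: list) -> float:
    if not similarities:
        return 0.0
    return 5 * sum(similarities) / len(similarities)

# Example: similarities of 0.8, 0.6 and 1.0 yield 4.0 on the SemEval scale.
print(general_assessment([0.8, 0.6, 1.0]))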
6.4 Complexity Analysis
The algorithms for the extraction of definitions and synsets for each word are O(x) and O(y), where x is the number of definitions found in the dictionary and y is the number of synonym sets found in a thesaurus. The algorithms for the comparison of a pair of texts described in this section have a complexity of O(N × M), where N is the number of words of the first text and M is the number of words in the second text. The polynomial complexities of the algorithms are minimal for the systematic comparison of a pair of texts; other strategies, such as neural networks using word embeddings, require the processing of 400-dimension vectors and the use of layers that usually require GPUs to reduce the processing time [4, 5, 15].
7 Implementations
The software methodology described in Sect. 3 was used to produce two prototypes for semantic text similarity. The first implementation was produced for English and the second for Spanish. Differences between natural languages require the specification of custom parsing rules. As an example, consider the word forms to which English may assign several interpretations, such as noun, verb, or pronoun. On the other hand, Spanish has multiple forms for a concept, such as the use of gender, more differentiated conjugations of verbs, and more tenses than English.

7.1 English Implementation

The English implementation focuses on the detection of verbs at a first stage; the second stage is the detection of the rest of the parts of speech. The most challenging
part is the detection of verb phrases that may contain one or more verbs; in some cases the verb is absent and participles or continuous forms appear alone. Table 3 specifies the structures of sentences according to the possible verb phrases.

Table 3. Types of syntax structures due to verbal phrases in English

Type | Description | Verb tenses
NP | A clause with no verbs at all | No verb tense
S[BE]O | An informal clause with no verbs and a place adverb | Present tense enacted
Be about to | A clause where the main verb is in infinitive form after an auxiliary verb be | (present be, past be) + about + infinitive
SV(O) | A subject verb (object) clause | Present, past, continuous, simple, passive, perfect, ...
Continuous | A clause in continuous form | be + ING
Cont. I | An informal continuous clause without auxiliary verb be | [be] ING
There be | Passive simple form where the subject appears as object | there + be
Passive PP | Passive past participle | be + participle
Perfect | A sentence where the main verb appears as participle | Auxiliary have + participle
Cond. 0 | A sentence that contains a cause and a consequence | (if + present simple, ... present simple)
Cond. 1 | A sentence that contains a cause and a consequence | (if + present simple, ... will + infinitive)
Cond. 2 | A sentence that contains a cause and a consequence | (if + present simple, ... would + infinitive)
Cond. 3 | A sentence that contains a cause and a consequence | (if + past perfect, ... would + have + PP)
Mixed | A set of clauses within a sentence composed by a combination of the types described above |

7.2 Spanish Implementation
The Spanish implementation focuses on the detection of conjugations of verbs when the verbs are regular in the tenses; the detection of the conjugation facilitates the identification of the parts of speech. For demonstrative purposes, this implementation uses a smaller knowledge base compared with the English version, and only the rules for the verbal tenses of present, past, and future are included,
even though Spanish has more verb tenses than English. The steps of similarity analysis and assessment are implemented using the same algorithms used for English.
8 Experiments
We use two datasets of SemEval 2017 [2]³ used at the semantic text similarity contest: the Spanish dataset consists of 250 pairs of text snippets, and the English dataset also has 250 pairs. For both datasets, a gold standard [2] is provided as the degree of similarity established by a human expert for each pair in the interval [0, 5]. Two knowledge bases in English and Spanish are provided to perform the extraction of information; these knowledge bases have the format of dictionaries, and we do not use other knowledge representation formats such as wordnets, nor annotated corpora, because we want to assess similarities from scratch. Once the results are generated, a Pearson correlation is computed between the results and the gold standard. The English knowledge base has 342,125 concepts and the Spanish knowledge base has 89,354 concepts.

³ The series of Semantic Text Similarity contests ended with the 2017 edition; although the organization that promoted the series still organizes other contests, such as sentiment analysis, this contest was stopped because even the most accurate deep learning methods did not reach a Pearson correlation of 86%.
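Since the evaluation compares the system scores with the gold standard through a Pearson correlation, a minimal Python sketch of this step is given below; the use of SciPy and the dummy values are assumptions for illustration, not part of the original setup.

from scipy.stats import pearsonr  # assumed tooling; any statistics package works

# Sketch of the evaluation step: correlate system assessments (in [0, 5]) with
# the SemEval gold-standard scores for the 250 test pairs of a dataset.
def evaluate(system_scores, gold_scores):
    r, p_value = pearsonr(system_scores, gold_scores)
    return r

# Example with dummy values (illustrative only):
print(evaluate([3.0, 5.0, 1.5], [2.4, 4.6, 1.0]))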
9 Results
The results generated by the methods for the Spanish test dataset and the English test dataset are compared against the gold standard of each dataset using a Pearson correlation. The sets of words constituting the lexicon of the Spanish and English pairs are described in Table 4; for the total words contained in each dataset, after consulting the Spanish and English knowledge bases respectively, it was found that the degree of incompleteness of the Spanish knowledge base was 52.61% and the degree of incompleteness of the English knowledge base was 5.2%. The results of testing the proposed algorithms on Spanish, shown in Fig. 6, achieve a Pearson correlation of 53%. The results for the English version, shown in Fig. 7, produce a Pearson correlation of 77.42%. As an example, Table 5 shows the first pairs of the English "test" dataset. In pair 1, the gold standard assigns a degree of similarity of 2.4, while the assessment of our method yields a 3 out of 5, since there are differences on the verb and basketball and baseball have a degree of similarity through the concept game, obtained by applying (1) on the synset of basketball and the synset of baseball, despite them not being exactly the same concept; therefore, by applying (3) times 5 (because SemEval uses a scale of [0, 5] and ours is in [0, 1]) we have the assessment depicted in (4).
Table 4. Lexicon contained in the SemEval Test datasets for Spanish and English

Dataset | Total words | Found | Not found | Pearson | Incompleteness
Spanish | 975 | 462 | 513 | 53% | 52.61%
English | 872 | 826 | 46 | 77.42% | 5.2%
Table 5. Example pairs from the English test dataset

Pair | Text 1 | Text 2 | GS | Our
1 | A person is on a baseball team | A person is playing basketball on a team | 2.4 | 3
164 | A woman is in the bathroom | The woman is in a bathroom | 4.6 | 5
Sim(P1_x, P1_y) = 5\left(\frac{S(S_x, S_y) + S(V_x, V_y) + S(O_x, O_y)}{3} - \alpha\right) = 5\left(\frac{1 + 0 + 1}{3} - 0.06\right) = 3 \qquad (4)
In another example from Table 5, pair 164 has a gold standard of 4.6; the similarity of 5 was computed as described in (5), where the difference reflected in the gold standard value of 4.6 was due to differences in the determiners.

Sim(P164_x, P164_y) = 5\left(\frac{S(S_x, S_y) + S(V_x, V_y) + S(O_x, O_y)}{3} - \alpha\right) = 5\left(\frac{1 + 1 + 1}{3} - 0\right) = 5 \qquad (5)
10 Discussion
Differences in the results of both versions are due to the following causes:

1. The degree of completeness of the knowledge base for each natural language: the Spanish knowledge base is smaller than the English knowledge base.
2. Both knowledge bases have content biases (missing concepts) that affect the performance of the similarity assessment.

Another key factor that affects the results compared to supervised methods is the structure of the knowledge bases; in our experiments we use dictionaries instead of wordnets. The latter have explicit relationships, whereas the dictionaries have the relations of synonymy and antonymy by their word forms with no explicit links between synonyms. The most closely related work to our proposal is the unsupervised first method that uses information content and frequencies on corpora [11]. The coincidence between our method and this related work is the use of the Jaccard similarity; the differences are:
Fig. 6. First results of the Spanish version.
Fig. 7. Results of the English version.
– Our method uses knowledge bases in the form of dictionaries to obtain the semantics of pairs of words through synsets; on the other hand, the related method uses statistics of word frequencies.
– In the preprocessing, our method extracts the sets of synonyms from the definitions of concepts in a dictionary and produces the parts of speech, while the compared method uses the output of a supervised method to get the parts of speech [16] and uses algorithms to find the relations between concepts through the search of paths from each concept to its root in a wordnet.
– Our method does not use probabilities based on the frequencies from a text corpus; instead, we remove the stopwords to facilitate the analysis and prevent unnecessary queries on the knowledge base.

We consider that analytical strategies are worthy precisely because of the incompleteness of the information in datasets, knowledge bases, wordnets, and other forms of knowledge representation. Undoubtedly, there are opportunity areas for analytical strategies to become as good as supervised strategies such as deep learning; however, this work is an example of machines mimicking analytic skills using linguistic rules on knowledge bases, producing descriptions of similarity instead of vector representations of knowledge and threshold-tuning learning that is hard for humans to understand. Our method is free of previous assumptions such as the specification of domain classes and does not require the transformation of knowledge into vectors or probabilities. The use of structures to represent human-readable information can boost the performance of simplified mathematical equations for the assessment of methodologies. The degree of completeness of a knowledge base is an issue for our method. However, knowledge biases are pervasive issues regardless of the kind of unsupervised or supervised method and the forms of the information sources.
11 Conclusion
This work presents a software engineering method and an analytical similarity method for a pair of texts based on linguistic rules. The methodology for prototype assessment enables the detection of biases in the elicitation of requirements to enhance the features of the software product. The implementation of the methodology regardless of the natural language of study demonstrates the portability of the strategy on European languages such as English and Spanish. The results show that the degree of completeness of the knowledge bases has a direct impact on the similarity assessment and that the performance improves as the knowledge base includes more concepts. Future work may include the development of knowledge bases that include explicit relationships on concepts with a degree of synonymy/antonymy, hypernyms, and meronyms.
References
1. Botvinick, M.M., Plaut, D.C.: Short-term memory for serial order: a recurrent neural network model. Psychol. Rev. 113(2), 201–233 (2006)
2. Cer, D., Diab, M., Agirre, E., Lopez-Gazpio, I., Specia, L.: SemEval-2017 Task 1: semantic textual similarity multilingual and cross-lingual focused evaluation. In: Proceedings of the 11th International Workshop on Semantic Evaluations (2017)
3. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
4. Maharjan, N., Banjade, R., Gautam, D., Tamang, L.J., Rus, V.: DT Team at SemEval-2017 Task 1: semantic similarity using alignments, sentence-level embeddings and Gaussian mixture model output. In: Proceedings of the 11th International Workshop on Semantic Evaluations (SemEval-2017), pp. 120–124 (2017)
5. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., Dean, J.: Distributed representations of words and phrases and their compositionality. In: NIPS 2013: Proceedings of the 26th International Conference on Neural Information Processing Systems, pp. 3111–3119 (2013)
6. Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
7. Kuhn, H.W.: The Hungarian method for the assignment problem. Naval Res. Logist. Q. 2(1–2), 83–97 (1955)
8. Navigli, R., Ponzetto, S.P.: BabelNet: the automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artif. Intell. 193, 217–250 (2012)
9. Tian, J., Zhou, Z., Lan, M., Wu, Y.: ECNU at SemEval-2017 Task 1: leverage kernel-based traditional NLP features and neural networks to build a universal model for multilingual and cross-lingual semantic textual similarity. In: Proceedings of the 11th International Workshop on Semantic Evaluations (SemEval-2017), pp. 191–197 (2017)
10. Schwaber, K., Beedle, M.: Agile Software Development with Scrum, 1st edn. Prentice Hall PTR, Upper Saddle River (2001)
11. Wu, H., Huang, H., Jian, P., Guo, Y., Su, C.: BIT at SemEval-2017 Task 1: using semantic information space to evaluate semantic textual similarity. In: Proceedings of the 11th International Workshop on Semantic Evaluations (SemEval-2017), pp. 77–84 (2017)
12. Shao, Y.: HCTI at SemEval-2017 Task 1: use convolutional neural network to evaluate semantic textual similarity. In: Proceedings of the 11th International Workshop on Semantic Evaluations (SemEval-2017), pp. 130–133. Association for Computational Linguistics (2017)
13. Wieting, J., Bansal, M., Gimpel, K., Livescu, K.: Towards universal paraphrastic sentence embeddings, pp. 1–19 (2016)
14. Moschitti, A.: Efficient convolution kernels for dependency and constituent syntactic trees. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 318–329. Springer, Heidelberg (2006). https://doi.org/10.1007/11871842_32
15. Sultan, A., Bethard, S., Sumner, T.: DLS@CU: sentence similarity from word alignment and semantic vector composition, pp. 148–153 (2015)
16. Manning, C.D., Bauer, J., Finkel, J., Bethard, S.J.: The Stanford CoreNLP natural language processing toolkit. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55–60. Association for Computational Linguistics (2014)
17. Iyyer, M., Manjunatha, V., Boyd-Graber, J., Daumé III, H.: Deep unordered composition rivals syntactic methods for text classification, pp. 1681–1691 (2015)
18. Maharjan, N., Banjade, R., Rus, V.: Automated assessment of open-ended student answers in tutorial dialogues using Gaussian mixture models. In: Proceedings of the Thirtieth International Florida Artificial Intelligence Research Society Conference, pp. 98–103 (2017)
19. Pennington, J.: GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543. Association for Computational Linguistics (2014)
20. Resnik, P.: Using information content to evaluate semantic similarity in a taxonomy. In: IJCAI 1995: Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp. 448–453 (1995)
21. Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 1–27 (2011)
22. Berners-Lee, T., Hendler, J., Lassila, O.: The semantic web: a new form of web content that is meaningful to computers will unleash a revolution of new possibilities, pp. 1–3 (2001)
Effects of Pilot, Navigator, and Solo Programming Roles on Motivation: An Experimental Study Marcel Valový(B) Prague University of Economics and Business, W. Churchill Sq. 1938/4, 130 67 Prague, Czech Republic [email protected]
Abstract. [Objective] We face a period in time where alternative ways of motivating software personnel must be explored. This study aimed for a detailed description and interpretation of the topic of pair programming roles and motivation. [Method] Using a mixed-methods approach, the present study examined a proposed nomological network of personality traits, programming roles, and motivation. Three experimental sessions produced (N = 654) motivation inventories in two software engineering university classrooms which were quantitatively investigated using student’s t-test, χ2 test, and hierarchical cluster analysis. Consequently, the author conducted semi-structured interviews with twelve experiment participants and utilized the thematic analysis method in an essentialist’s way. [Results] Eight produced themes captured that pair programming carries both positive and negative motivational consequences, depending on personality variables. The statistical analysis confirmed that the suitability of a given role for a programmer can be determined by his personality: (i) pilot – openness, (ii) navigator – extraversion and agreeableness, (iii) solo – neuroticism and introversion. Keywords: Agile development · Intrinsic motivation · Big five · Thematic analysis · Hierarchical cluster analysis · Software engineering
1 Introduction

This paper presents mixed-methods research on the agile software development practice of pair programming and motivational issues in software teams during the pandemic times. Why is this important? First, the topic of pair programming is now actively discussed in both scientific and industrial communities in relation to remote work settings, and secondly, traditional motivation techniques are now less effective. In pair programming, two programmers collaborate on the same task using one computer and a single keyboard [35]. One of them takes on the pilot role and writes the code, and the other concurrently takes on the navigator role, thinking about the problem and solution, double-checking the written code, and addressing issues. Pair programming has
been used since the 1950s for its positive effects, such as faster development, increased quality and security of code, and knowledge transfer [9, 27]. During the ongoing pandemic, many software engineers writing computer programs in solitude have suffered from the frustration of basic psychological needs: competency, relatedness, and autonomy [11]. A study based on factor analysis of 2225 reports from 53 countries indicated the pandemic had put stress on the psychological needs and, in turn, significantly hurt developers' wellbeing and productivity [28]. Pair programming could be employed to satisfy the needs and boost intrinsic motivation, but it could also be harmful. Based on previous analyses, it might have positive yet also adverse effects on the motivation and performance of software developers [18]. What effects do its roles have separately, and do they affect programmers of different personalities the same? That was not examined by previous studies, but this study sets out to find out. The author proposes a nomological network to represent the constructs of interest in this study, their observable manifestations, and the inter-relations between them. The core constructs are programming role (independent variable), intrinsic motivation (dependent variable), and personality (moderating variable). To perform statistical tests, motivation and personality must be operationalized into observable variables. The former will be modeled within the Self-determination framework [11] and the latter with the Big Five model [14]. Data will be collected in laboratory experiments.

Research Problem: How to increase intrinsic motivation in software teams using pair programming.

Research Questions:
RQ1: Do distinct pair programming roles affect programmers' motivation differently?
RQ2: Can psychometric tests improve the assignment of pair programming roles?
2 Background

The following chapter provides the primary rationale for the research. It delineates the theoretical boundaries of this study, defines key concepts, reviews the pertinent literature, compares our novel approach with previous studies, and postulates operational assumptions and hypotheses.

2.1 Pair Programming

Intensive cooperation on a single task brings both the benefits and problems commonly present in small-group collaborations [9, 13]. A seminal meta-analysis paper [18] reported that pair programming is faster than solo when programming task complexity is low and yields code solutions of higher quality when task complexity is high. It also recommended studying moderating factors of the effects of pair programming, which this study will do. Hannay et al. also published research on the effects of personality on pair programming [17]. The resulting relations were rather insignificant, and
they recommended studying other moderating factors. Au contraire, they did not study the programming roles of pilot and navigator separately and tested the total performance of the couple, not individual motivation. Our paper will take a novel approach and will clearly differentiate the roles and account for the individual effects on motivation. Remarkably, pair programming was practiced since the 1950s, long before it got its name, and might be one of the remedies to the omnipresent multitasking [9].

2.2 Personality

Operations related to programming are prevalently cognitive. Therefore, they are influenced by behavioral characteristics, including personality, affect, and motivation [11]. Personality, in psychology, is used to describe the array of variables in which individuals differ and refers to an individual's characteristic patterns of behavior [7]. We incorporate personality variables in our research as moderators. Moderators or boundary variables are able to amplify or attenuate the effect of the independent variable (programming role) on the dependent variable (intrinsic motivation) [21]. Personality traits in the Big Five model [14] can be defined as probabilistic descriptions of relatively stable patterns of emotion, motivation, cognition, and behavior in response to classes of stimuli that have been present in human cultures throughout evolution [10]. Each of the examined Big Five dimensions of human personality can be considered the result of an evolutionary trade-off [24]. Since there is no unconditional optimal value, it is to be expected that genetic diversity will be retained in the population, and clusters will be formed [26] because the traits were found to be 60–80% inherited [30]. That is why we initiate our quantitative analysis by applying hierarchical cluster analysis, as opposed to the previous studies [cf. 17].

2.3 Intrinsic Motivation

The revolutionary concept of intrinsic motivation stems from White's [34] landmark paper which, contrary to previously prevalent theories, posits that behaviors such as exploration, manipulation, and play could be considered not as drives but as "innate psychological tendencies" endowed in every developing organism. White has labeled these propensities as a motive to produce effects, or "effectance motivation", the concept that represents a theoretical forerunner of intrinsic motivation. Intrinsic motivation, as defined by Self-determination theory [SDT; 11], exists in the relation between individuals and activities. Each individual is intrinsically motivated for some activities and not others, and only in certain social contexts and not others [ibid.]. In humans, intrinsic motivation is a prototypical example of autonomous behavior, being willingly or volitionally done, as opposed to heteronomous actions, which are subject to external rewards or pressures [ibid.]. That is why no extrinsic rewards nor pressures were allowed in our controlled experiments.

2.4 Hypotheses

Five hypotheses were drawn on the synthesis of knowledge in multiple disciplines, including psychology, organizational behavior, and software engineering, as follows:
H1: Programmers have distinct personalities.
H2: Distinct personality types prefer different pair programming roles.
H3: Openness positively moderates motivation in the pilot role.
H4: Extraversion and agreeableness are essential for a motivated navigator.
H5: Neuroticism and introversion are detrimental to both pilot and navigator roles.

The hypotheses are supported by pre-existing empirical studies in diverse fields, e.g.:

H1 and H2. First, the IT work environments have evolved to become sociable and incorporate agile methods. Secondly, programmers are expected to manifest in characteristic Big Five clusters with distinct preferences [24, 26] (see Sect. 2.2).
H3. Since software engineering is a highly creative profession, writing code frequently requires divergent thinking and the ability to imagine. Openness to experience is, in fact, suggested to be strongly associated with divergent thinking and the ability to generate multiple unusual and creative solutions to problems [10].
H4. A higher level of agreeableness is associated with the preference to work in teams and make small contributions throughout a project; extraversion corresponds with the preference to work in teams and also on multiple tasks at once [12].
H5. Neurotic people are more sensitive toward negative emotions. As a result, they prefer jobs that prioritize hygiene factors [21]. Pair cooperation is contrary to that (see Sect. 2.1). Introversion and its effect on pairing are inverse to extraversion.
3 Methods

The following chapter presents the inquiry framework of our research and explains the selection of methods, including their location on the epistemological landscape.

3.1 Context

The proposed research questions revolve around the research problem of effectively increasing motivation using agile methods, such as pair programming. To answer them, we opted for an experimental, mixed-methods research strategy [23]. In the context of an IT undergraduate software engineering course, three rounds of a controlled experiment were carried out. Afterward, the quantitative data from the experiments were analyzed using contemporary statistical methods to establish empirical links between personality dimensions and software engineers' attitudes [12, 15]. Additionally, we conducted semi-structured interviews with the experiment participants and evaluated them using qualitative methods such as thematic analysis.

3.2 Experimental Design

The subjects were students who signed up for an advanced software engineering course at the undergraduate university level. The author ran three laboratory sessions of 60-min net programming time each, during which the (N = 40) subjects took a break every 10 min to self-report their
motivation with a seven-item questionnaire and rotate in pairs and receive a new task (yet being able to continue on the previous one because the tasks were continuous). One group of subjects worked in pairs, and the subjects in the "control group" worked alone. The partners of each pair were either designated "pilot," who controls the keyboard and codes, or "navigator," who conceptualizes the solution to the given task and looks for defects, with the subjects told to switch roles every 10 min. The last session was without a control group. This effectively put each individual in three different conditions or "roles" (solo, PP-pilot, PP-navigator) for 6x10 minutes, yielding 6 motivation measures for each individual in each condition. Each individual's "preferred role" is then related to his or her personality. The preferred role is defined as the condition with the highest average reported motivation level. The subjects' personalities were measured with the Big Five personality test at the beginning of each session. The subjects were instructed on how to pair-program during a 60-min pilot session that preceded the three experimental sessions. The subjects were working on predefined tasks in a static order. During the first session, the purpose of the tasks was to develop a contextual menu for their semestral work, which is an adventure game in Java with a graphical user interface in JavaFX. In the second session, the tasks were about animating elements in the game. No external motivators were used, i.e., no credits were given for correct solutions. The task difficulty was the main concern for the validity of the experimental design, and that is why the first two sessions contained a "control group". In each session, the motivational differences between each 10-min round were compared in both the control (solo programming) and the test group, and statistical tests confirmed that there was, in fact, no relation between the tasks and self-reported intrinsic motivation. This is consistent with the findings of Vanhanen and Lassenius [33]. It is also worth noting that the subjects were of various backgrounds and abilities, and this would have a greater effect on the results if performance were measured [1] as performance = ability x motivation [21]. We diminished those effects by measuring intrinsic motivation, which depends on autonomy, competence, and relatedness. The effect of "pair jelling", i.e., relative improvement after the first task mentioned by Williams et al. [36], was tested statistically and did not manifest in motivation. Pairs were allocated randomly and irrespective of personality, similar to two experimental studies, one with 564 students and 90% of pairs reporting compatibility [20] and another with 1350 students and 93% reported compatibility [36].

3.3 Inquiry Framework

There have recently been many arguments in favor of using instruments coming from psychology and related fields for systematic studies on human aspects of software engineering [12, 15], labeled as "behavioral software engineering" [22]. The psychometrics of our choice were Big Five Inventory (BFI) and the first subscale of Intrinsic Motivation Inventory (IMI). Several versions of the BFI questionnaire were developed to measure the five-factor OCEAN dimensions: openness to experience, conscientiousness, extraversion, agreeableness, and neuroticism. We chose the BFI-10 version [29], which contains just 10 five-point Likert-scale questions, yet maintains
similar statistical properties (reliability, consistency, validity, less redundancy) than other questionnaires and was co-authored by John, the author of the original BFI-44 from 1999 [19]. The overall mean correlation of BFI-10 to BFI-44 is .83. IMI is a "multidimensional measurement device intended to assess participants' subjective experience related to a target activity in laboratory experiments" [31]. It is particularly useful in experiments related to intrinsic motivation and self-regulation. Its first sub-scale, called "interest/enjoyment," contains seven Likert-scale questions and is considered the self-report measure of intrinsic motivation. The items have been shown to be factor-analytically coherent and stable across a variety of tasks, conditions, and settings [31]. The instruments were implemented in their original English language version. The instruments were not validated by the authors.

3.4 Experimental Questionnaire and Interview Design

The final questionnaire used in the experiments consisted of 63 questions and was dispersed using Office Forms prior to the experiment. It had the following structure:

– informed consent
– only two demographic questions, as the psychological literature stresses that demographic variables are not suitable for the study of motivation [21]
– 10 questions on Big Five dimensions (BFI-10)
– six times seven questions on intrinsic motivation (IMI) + one question asking what the performed pair programming role was
– two administrative questions: participant's initials and email address for the psychometric results (optional)

The first three parts were filled before the experiments' task rounds had begun. After each round, the participants reported their intrinsic motivation. Finally, they could answer two administrative questions and receive psychometric results. After the third experimental session, the participants were queried by email and asked for optional online interview participation (MS Teams platform) in the next week. According to Guest et al. [16], "saturation," or the point at which no new information or themes are observed in the data, occurs at 6 to 12 interviews. The author conducted semi-structured interviews with experiment participants until saturation was reached at a data set of N = 12 items. The semi-structured interviews consisted of 25 questions, divided into seven sections, with the actual length of the interviews ranging from 14 (I#01) to 82 (I#11) minutes, the average being 30.5 min and the median 26.5 min. A considerable amount of data extracts was collected from the items, but only a selection of these featured in the final analysis. Each interview commenced with informed consent. The anonymity of the results and confidentiality of audio recordings have been guaranteed.

3.5 Thematic Analysis

The inquiry framework presented in this article uses thematic analysis (TA) developed for use within a qualitative paradigm, subjecting data to analysis for commonly recurring
themes. The chosen qualitative method, TA, is theoretically flexible but not atheoretical or inherently realist, meaning the author had to choose where it would be situated in the epistemological landscape [6]. The author has chosen the inductive (bottom-up) way of identifying patterns in the data instead of the deductive (top-down) approach. Inductive analysis was used to code the data without trying to fit it into a pre-existing coding frame or the researchers' analytic preconceptions. In this sense, the thematic analysis employed is data-driven (as opposed to being theory-driven). Epistemology: It is essential to note that the researcher has not freed himself of his theoretical and epistemological commitments. The chosen research epistemology guided what the author said about the data and informed how he theorized meaning. The chosen essentialist approach is generally suitable for theorizing motivations, experiences, and meaning straightforwardly because a simple, largely unidirectional relationship is assumed between meaning, experience, and language [25]. Instead of coding for a specific research question, the codes evolved incrementally, mapping onto the inductive process. An interpretative approach is appropriate when one seeks to understand complex real-world phenomena from a people-centric perspective [25]. The author chose a more experiential version of interpretative TA to objectively capture how the participants feel about pair programming and grasp their insights into the topic, instead of the critical version, which tries to excavate the latent meanings. The author treated pair programming as an orthogonal axis to people's points of view, experiences, and understandings. Theoretical flexibility means reflexivity is crucial to successfully implementing TA; the researcher makes active choices and reflects on his or her data-reading assumptions [5]. Seven steps by Braun and Clarke [6] were applied flexibly to fit the research questions and data: transcribing, becoming familiar with the data, generating initial codes, discovering themes, reviewing themes, defining themes, and writing up. The first step, transcribing, can be seen as a key phase of interpretative data analysis, as the meanings are created during this thorough act [3]. For initial pre-processing, a professional tool (Descript, v30.1.0) was used to generate a basic transcription and remove filler words like "um". The subsequent manual transcription (step 1) proved to be an excellent way of familiarizing oneself with the data (step 2). Final transcripts were imported into a computer-aided qualitative data analysis tool (MAXQDA, v22.0.1), coded, and analyzed for themes (steps 3–4). A precise method of theme construction (steps 3–6) proposed by Vaismoradi et al. [32] was utilized. The author worked systematically through the data set, giving each item full and equal attention. Then, codes were aggregated into high-level, descriptive (non-theoretical) themes. Usually, themes are identified by measuring their "prevalence," which can be done in several ways [4]. The key indicator was the number of participants who talked about the pattern. The lower limit was set to three. In the end, the author developed the final version of the thematic map and reported all identified themes, subthemes, and encoded extracts of data.
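The prevalence criterion just described (a pattern counts as a theme only if at least three participants mention it) amounts to a simple counting step over the coded extracts. The following Python sketch is purely illustrative; the data structure and the sample codes are assumptions, not an export of the actual MAXQDA project.

```python
from collections import defaultdict

# Hypothetical coded extracts: (interview_id, code) pairs; the code labels below
# are examples of the kind reported in Sect. 4.2, with invented counts.
coded_extracts = [
    ("I#01", "pairing helps with mood"),
    ("I#03", "pairing helps with mood"),
    ("I#07", "pairing helps with mood"),
    ("I#02", "coding style differences"),
    ("I#05", "coding style differences"),
]

MIN_PARTICIPANTS = 3  # prevalence threshold used in the study

def prevalent_codes(extracts, threshold=MIN_PARTICIPANTS):
    """Return codes mentioned by at least `threshold` distinct participants."""
    participants_per_code = defaultdict(set)
    for interview_id, code in extracts:
        participants_per_code[code].add(interview_id)
    return {code: len(ids) for code, ids in participants_per_code.items()
            if len(ids) >= threshold}

print(prevalent_codes(coded_extracts))  # {'pairing helps with mood': 3}
```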
4 Results 4.1 Quantitative Results Of the 40 students, 2 were female and 38 were male. The students' software engineering experience ranged from half a year to six years (μ = 2.2, σ = 1.5): 19 had up to one year of experience, 13 had more than one and up to three, and 8 had more than three years. 4.1.1 Clustering Personality variables were mapped using clustering methods. The statistical software OriginPro (v2021) and RStudio (v4.0.5) were used. It was crucial to keep the clusters as distinctive as possible. The Dunn index metric [2] indicated the use of the complete linkage method with the number of clusters set to three. The three centroids created by the cluster analysis are presented in Fig. 1 with their respective means and standard deviations. The first cluster is mainly characterized by the predominant personality dimension "openness to experience" (μ = 8.29, σ = 1.21). The second cluster is characterized by two dominant personality dimensions, "extraversion" (μ = 7.36, σ = 1.36) and "agreeableness" (μ = 7.91, σ = 0.83). The third and last cluster is characterized by the predominant personality dimension "neuroticism" (μ = 7.82, σ = 1.40) and very low "extraversion" (μ = 3.55, σ = 1.04).
Fig. 1. Cluster centroids – characterized by their means and standard deviations
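The clustering step can be sketched in outline as follows. The study used OriginPro and R, so this Python/SciPy version is only an illustrative reconstruction; the score matrix is a randomly generated placeholder rather than the study's data, and the Dunn index implementation shown is one common variant.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform

# Placeholder score matrix: one row per student, columns = O, C, E, A, N sums
# (assumed 2-10 range, as for summed BFI-10 item pairs).
rng = np.random.default_rng(0)
scores = rng.integers(2, 11, size=(40, 5)).astype(float)

# Agglomerative clustering with complete linkage, cut into three clusters.
Z = linkage(scores, method="complete")
labels = fcluster(Z, t=3, criterion="maxclust")

def dunn_index(data, labels):
    """Smallest between-cluster distance divided by the largest within-cluster diameter."""
    d = squareform(pdist(data))
    clusters = np.unique(labels)
    min_between = min(d[np.ix_(labels == a, labels == b)].min()
                      for a in clusters for b in clusters if a < b)
    max_within = max(d[np.ix_(labels == c, labels == c)].max() for c in clusters)
    return min_between / max_within

print("Dunn index:", round(dunn_index(scores, labels), 3))
for c in np.unique(labels):
    print("cluster", c, "centroid:", scores[labels == c].mean(axis=0).round(2))
```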
The relations between clusters and preferred roles were computed by the maximum intercept and are displayed in Fig. 2 with their respective intercept counts. Cluster 1 prefers the Pilot (11x), cluster 2 the Navigator (6x), and cluster 3 the Solo (6x) role.
Fig. 2. Contingency table – clusters and preferred roles
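The statistical tests reported in the next subsection operate on exactly this kind of cluster-by-role table and on per-group trait scores. A minimal SciPy sketch follows; apart from the three row maxima (11, 6, 6) quoted above, all counts and scores are hypothetical placeholders rather than the study's data.

```python
import numpy as np
from scipy import stats

# Cluster-by-preferred-role counts (rows: clusters 1-3; columns: Pilot, Navigator, Solo).
# Only the row maxima (11, 6, 6) come from the paper; the remaining cells are invented.
table = np.array([
    [11, 3, 2],
    [4, 6, 3],
    [2, 3, 6],
])
chi2, p, dof, _ = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")

# One-tailed, one-sample t-test: e.g., openness of the pilot-preferring group vs. a population mean.
openness_pilot_group = np.array([8, 9, 7, 8, 10, 9, 8, 7, 9, 8, 9])  # hypothetical BFI-10 sums
population_mean_openness = 6.5                                        # hypothetical population value
res = stats.ttest_1samp(openness_pilot_group, population_mean_openness, alternative="greater")
print(f"t = {res.statistic:.2f}, one-tailed p = {res.pvalue:.5f}")
```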
4.1.2 Hypotheses Testing Three sessions of six rounds each and 40 students produced 654 motivation inventories (90%+ participation). The first hypothesis was confirmed by the three generated clusters, which were internally homogeneous and externally heterogeneous. The χ2 test confirmed the dependence between personality clusters and role preference (p-value = 0.0126*). Hypotheses H3, H4, and H5 were verified using the one-tailed Student's t-test to compare the means of the given role group and of the studied population. Despite our data set not having a normal distribution of Big Five dimensions, the variables are normally distributed in the general population [26], allowing us to use parametric tests. The p-values were 0.00116** for openness (H3), 0.00207** for extraversion, 0.00016*** for agreeableness (H4), 0.000012**** for neuroticism, 0.00169** for extraversion (H5). Thus, all hypotheses are statistically sound and largely significant (2–4 stars) at significance level α = 0.05. 4.2 Qualitative results From the qualitative data set, 68 codes with 184 occurrences and eight themes were generated: four positive, two negative, and two neutral. Each theme is listed in its own section; some contain subthemes. Each interview transcript was labeled I#01–12. We begin by discussing programmers' positive attitudes towards pair programming. Afterward, we compare them to the neutral and negative ones. 4.2.1 Positive Attitudes Firstly, the majority of occurring codes (31/68), with a total number of 106 occurrences, were positive. The themes generated from them included psychological, pedagogical, and therapeutic effects, cognitive load, and performance. 4.2.1.1 Activation of the "Hawthorne Effect" A very salient theme is the Hawthorne Effect, which encompasses all positive effects on someone's work when he or she is "being observed" or is in the presence of someone. The participants argued that they experienced a lot of "positive stress" from various sources. The sources were coded as "being part of something" (5x), "working with someone has a larger meaning" (4x), "out of comfort zone" (4x), and "my partner helped me to not get distracted" (3x). More positive effects of pairing on motivation could be cited, e.g.: "It was motivating that we did it together (I#10)," or "I experienced a good feeling of connection (I#7)." 4.2.1.2 Essential Complexities of Software Engineering The second positive theme generated was about separating the concerns and facing the essential complexities of software engineering. While going through the steps of thematic analysis recursively, the author generated several codes conveying the latent meaning of the responses, such as "safety net (when the partner was navigating me)" and "partner saves time when I am stuck." Nevertheless, the experiential codes with the most occurrences were: "I get to focus on one thing" (7x), "I just think of the goal, not
how I write the code/algorithms" (5x), and the complementary, pilot's, part, "I am just writing code and do not have to think about what I will be writing next" (4x). 4.2.1.3 Improved Performance (and Costs) Cost is a controversial topic: "Does pair programming justify double personnel costs?" Although this study does not reveal answers to that question, it generates the "improved performance" theme. The related excerpts from students' responses were coded as "while solo, I could not finish on time but in pair we always did" (7x), "in pair, we managed to finish all the tasks" (6x), "two heads together know more" (5x), and "I was more effective because of pressure to ask quickly and thanks to fast answers from the partner" (3x). 4.2.1.4 Teaching and Knowledge Sharing A prominent theme was the practice of social skills and dealing with problems by speaking about them. The variety of codes under this theme included: "I felt like a teacher" (5x), "speaking about the problem made me understand it" (4x), "perfectly formulated questions were required" (4x), "knowledge transfer" (3x), and "learning with two brains" (3x). In addition, a large number of positive reports were directly related to the role of the navigator and received code labels such as "being the navigator pushes you to think more" (5x) and "you are the boss, it feels nice" (3x). The subtheme generated under teaching and learning was "helping." The programmers reported pair programming's positive, even therapeutic effects. The codes included: "pairing helps with mood" (6x), "helping someone makes you feel better" (4x), "pleasure in the work" (3x), "improved cooperation also in the whole team" (3x), and "training for becoming a manager" (3x). 4.2.2 Neutral Attitudes A large proportion of the analyzed codes (21 out of 68) came under a neutral tone and captured statements about the relation between the agile development practice and personality factors, and stressed the importance of rules. 4.2.2.1 Psychometric Testing and Personality's Moderating Effect First, the students unanimously agreed that "psychometric results describe me accurately." This code was applied in 11 interviews. The remaining participant (I#11) responded that his personality "depends on the current mood of the day." The rest of the feedback on the psychometric questionnaire used in this study was overwhelmingly positive and is captured by the codes: "it was all me" (5x), "I really found myself" (4x), and "personality determines everything about the person" (4x). The major sub-theme was personality's moderating effect on pair programming. The code "personality determines what role I prefer" was used ten times. High extraversion was mentioned as the prerequisite for being good at pair programming (6x). Two participants mentioned that high agreeableness is essential for the role of navigator. They explained that someone with low agreeableness would become stressed if his pilot took on a different coding trajectory than planned. Three participants mentioned they would not recommend pair programming to neurotics.
These results offer qualitative support to H4-5 and answer the RQ2 positively. 4.2.2.2 Importance of Rules This theme is related to the frequent switching of roles and picking up on the partner’s tasks. For instance, most participants responded to question 20 “How well was your partner respecting the frequent change of roles?” equivocally and produced the code “these were the rules, it was good” (8x). 4.2.3 Negative Attitudes A minority of codes (16/68) with two themes was negative. 4.2.3.1 Attention and Focus Deficits This first theme was generated upon frequent rhetoric about attention and focus deterioration. Two salient arguments under this theme included (1) losing focus when not using the keyboard and (2) difficulty following everything the partner does. Regarding the former argument, four interviewees explicitly stated they realized it would be easier to work by themselves than with someone, which made them lose attention, e.g., “I had to explain it several times until it got through my partner (I#08).” However, the problem was not limited to frustration in explaining. Another participant stated he had trouble multitasking: “When I was the pilot, I was not (able) listening to my partner (while simultaneously writing the code) (I#10).” Concern about time was raised too, and the code “You spend not only your time but also someone else’s” was applied three times. Another constituent code for this theme that stands out is: “When I am solo, I can enjoy finding information,” which occurs twice in the analysis. 4.2.3.2 Personality and Competency Differences Another prominent theme actively identified in the transcripts was “pair’s personality and (perceived) competency differences.” The standard argument was that it is difficult to work with introverts. The most frequently applied codes were: “I knew I could have an impact, but partner did not wait/listen” (4x), “partner was bad at explaining what he wants from me” (3x), “cooperation was minimal, the partner was not absorbing my advice” (3x), “partner was so experienced he did not need me” (3x). These codes were all related to complaints coming from the navigator’s point of view and directed at their pilot partners. However, some also voiced their frustration related to a perceived sub-par competency when being in the navigator role, e.g., “When your partner is waiting on you, and you do not know the answer, it can be frustrating (I#06).” While no negative attitudes under this theme came from the pilot’s point of view, some role-agnostic difficulties were raised, such as the argument: “Some people want to keep their personal space, and you cannot change this (I#06),” or the excerpts marked by the code “when the person does not talk (about the problem)” (4x), generating the sub-theme “pair programming is less viable for introverts.” A sub-criterial code was “coding style differences” (2x), saying that it is difficult to pick up on work done by someone who uses different naming conventions in the shared code. These results support the hypotheses H4-5 in a qualitative way.
5 Discussion The primary goal of this study was to shed light on the nomological network linking pair programming roles to the Big Five personality traits and motivation. Our study extends prior research in the field by employing several novel approaches, e.g., measuring individual intrinsic motivation within the Self-determination framework [65], measuring the effects separately for distinct roles, and using cluster analysis. Our study also delivers new results, e.g., that individual differences among software engineering undergraduate students are reflected in distinct pair programming role preferences and that the instruments and analytical and interpretative methods employed in this study can detect such connections. The remainder of the discussion is structured into three parts: (i) debating how our results answer the research questions and comparing them with previous studies, (ii) debating threats to validity, and (iii) discussing the implications and their scope. 5.1 Answering the research questions Many rounds in each experimental session were vital to the rich data corpus of 654 inventories, which enabled a proper application of the recommended quantitative methods [14]. A subsequent follow-up with post-experimental interviews was essential to gather the qualitative insight required to answer the research questions: RQ1: Do distinct pair programming roles affect programmers' motivation differently? We analyzed the first question in a quantitative way. A rich data corpus had to be gathered to allow statistical inquiry about significant relationships between the role and motivation levels moderated by personality variables. Our analysis confirmed the hypothesized relations and measured the strengths of the links. Programmers do possess varied personalities, mapped onto three distinct, yet internally homogeneous clusters. Moreover, each personality type is differently affected by pair programming. Answer: "Pilot, navigator, and solo roles have different effects on the intrinsic motivation of software professionals, depending on their personality. Pilots are motivated if their openness is high, navigators require extraversion and agreeableness, and introverted professionals with neuroticism do not thrive in pair programming." This answer is in slight contrast to the meta-analytical study [18], where the effects of pair programming were found less significant. But we used a different nomological network than any of the mentioned studies, with more specific constructs, and also analyzed a variable – intrinsic motivation – less externally affected than performance. RQ2: Can psychometric tests improve the assignment of pair programming roles? The second question was analyzed qualitatively in an essentialist's interpretative way. As Sect. 4.2.2.1 has shown, the Big Five is accurate and can be regarded as a predictor of intrinsic motivation. Answer: "Psychometrics explain which pair programming role will suit a programmer and also explain why. Thus, knowledge of the Big Five dimensions provides meaningful ways of increasing the overall motivation in software teams."
This answer is in sharp contrast to the previous results in the seminal paper [17], which also analyzed the moderating effects of the Big Five on pair programming and declared them rather insignificant. The reasoning behind the contrast might be mainly that we examined the roles and individuals separately.
5.2 Internal validity The subjective nature of interpretation imposes threats to the internal validity of our qualitative results. However, the data has been processed systematically and in an epistemologically grounded way that introduces as little subjective bias as possible. The internal validity of our quantitative analysis refers mainly to the suitability of our data sets for the application of the selected statistical methods (Student's t-test, χ2, hierarchical cluster analysis) and the choice of the number of clusters. This discussion was covered in Sect. 4.1. It also pertains to conclusions about causation. We analyzed the interrelations between the created clusters and the preferred role and were able to find significant relationships between these two variables; therefore, we consider our assumptions, drawn upon psychology's standard theories, valid. 5.3 Implications and external validity Our study proposes that pair programming can be employed to help satisfy the basic psychological needs (autonomy, relatedness, competency) and, in effect, increase the intrinsic motivation of software personnel. The nomological network proposed in this study recognized which moderators are important for software professionals to perform in the triad of solo, pilot, and navigator roles. Software engineering managers could try to infer from the psychometrics of their team and act based on our findings. Regarding external validity, the results might not be applicable in settings where programmers come from different personality clusters or are far more experienced.
6 Conclusion Personality was confirmed to be a valid predictor of intrinsic motivation in software teams as well. Subjects of our treatment belonged to three personality clusters, each having significantly different inclinations toward pair programming. For instance, some find satisfaction in not being alone, yet others must work solo to enjoy it. The third identified cluster (high neuroticism, low extraversion) was congruent with the controversially aged phenotype identified by Couger and Zawacki [8]. Even though some of the participants' views are contrasting, they are not contradictory. Quite the contrary, they represent the opposite ends of the spectrum of programmers' attitudes. Therefore, all opinions are equally valid for implementing pair programming and enhancing the motivation of a software engineering team and must be considered. As a result, pair programming ought to be implemented cautiously, with strict rules defined beforehand (as theme 4.2.2.2 suggests), because adverse effects on motivation
are a possibility (as themes 4.2.3.1 and 4.2.3.2 suggest). Our results may serve as actionable recommendations, with their applicability limited to teams where the members fall within the three clusters identified by our study and are not experts. The author wishes to pursue his endeavors further by employing a similar experimental design with a focus on the effects of specific constellations in pairing. Acknowledgments. This work was supported by an internal grant funding scheme (F4/34/2021) administered by the Prague University of Economics and Business.
References 1. Arisholm, E., Gallis, H., Dyba, T., Sjøberg, D.I.: Evaluating pair programming with respect to system complexity and programmer expertise. IEEE Trans. Software Eng. 33(2), 65–86 (2007) 2. Bezdek, J.C., Pal, N.R.: Cluster validation with generalized Dunn’s indices. In: Proceedings 1995 Second New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems, pp. 190–193. IEEE (1995) 3. Bird, C.M.: How I stopped dreading and learned to love transcription. Qual. Inq. 11(2), 226–248 (2005) 4. Boyatzis, R.E.: Transforming Qualitative Information: Thematic Analysis and Code Development. Sage (1998) 5. Braun, V., Clarke, V.: Successful Qualitative Research: a Practical Guide for Beginners. Sage (2013) 6. Braun, V., Clarke, V.: Using thematic analysis in psychology. Qual. Res. Psychol. 3(2), 77–101 (2006) 7. Corr, P.J., Matthews, G. (eds.) The Cambridge Handbook of Personality Psychology. Cambridge University Press (2020) 8. Couger, J.D., Zawacki, R.A.: Motivating and Managing Computer Personnel. Wiley (1980) 9. DeMarco, T., Lister, T.: Peopleware: Productive Projects and Teams. Addison-Wesley (2013) 10. DeYoung, C.G.: Cybernetic big five theory. J. Res. Pers. 56, 33–58 (2015) 11. Ryan, R.M., Deci, E.L.: Self-Determination Theory: Basic Psychological Needs in Motivation, Development, and Wellness. Guilford Publications (2017) 12. Feldt, R.L., Angelis, R.T., Samuelsson, M.: Links between the personalities, views and attitudes of software engineers. Inf. Softw. Technol. 52(6), 611–624 (2010) 13. Forsyth, D.R.: Group Dynamics. Cengage Learning (2018) 14. Goldberg, L.R.: The structure of phenotypic personality traits. Am. Psychol. 48(1), 26 (1993) 15. Graziotin, D., Lenberg, P., Feldt, R., Wagner, S.: Psychometrics in behavioral software engineering: a methodological introduction with guidelines. ACM Trans. Softw. Eng. Methodol. (TOSEM) 31(1), 1–36 (2021) 16. Guest, G., Bunce, A., Johnson, L.: How many interviews are enough? an experiment with data saturation and variability. Field Meth. 18(1), 59–82 (2006) 17. Hannay, J.E., Arisholm, E., Engvik, H., Sjøberg, D.I.: Effects of personality on pair programming. IEEE Trans. Software Eng. 36(1), 61–80 (2009) 18. Hannay, J.E., Dybå, T., Arisholm, E., Sjøberg, D.I.: The effectiveness of pair programming: a meta-analysis. Inf. Softw. Technol. 51(7), 1110–1122 (2009) 19. John, O.P., Donahue, E.M., Kentle, R.L.: Big five inventory. J. Personal. Soc. Psychol. (1991) 20. Katira, N., Williams, L., Wiebe, E., Miller, C., Balik, S., Gehringer, E.: On understanding compatibility of student pair programmers. In Proceedings of the 35th SIGCSE Technical Symposium on Computer Science Education, pp. 7–11 (2014)
21. Latham, G.P.: Work Motivation: History, Theory, Research, and Practice. Sage (2012) 22. Lenberg, P., Feldt, R., Wallgren, L.R.: Behavioral software engineering: a definition and systematic literature review. J. Syst. Softw. 107, 15–37 (2015) 23. Mertens, D.M.: Transformative mixed methods research. Qual. Inq. 16(6), 469–474 (2010) 24. Nettle, D.: The evolution of personality variation in humans and other animals. Am. Psychol. 61(6), 622 (2006) 25. Patton, M.Q.: Qualitative Evaluation and Research Methods. SAGE Publications, Inc. (1990) 26. Penke, L., Denissen, J.J.A., Miller, G.F.: The evolutionary genetics of personality. Eur. J. Pers. 21(5), 549–587 (2007) 27. Plonka, L., Sharp, H., van der Linden, J., Dittrich, Y.: Knowledge transfer in pair programming: an in-depth analysis. Int. J. Hum Comput Stud. 73, 66–78 (2015) 28. Ralph, P., et al.: Pandemic programming. Empir. Softw. Eng. 25(6), 4927–4961 (2020). https://doi.org/10.1007/s10664-020-09875-y 29. Rammstedt, B., Kemper, C.J., Klein, M.C., Beierlein, C., Kovaleva, A.: A short scale for assessing the big five dimensions of personality: 10 item big five inventory (BFI-10). Meth. Data Anal. 7(2), 17 (2013) 30. Riemann, R., Kandler, C.: Construct validation using multitrait-multimethod-twin data: the case of a general factor of personality. Eur. J. Pers. 24(3), 258–277 (2010) 31. Self-Determination Theory: Intrinsic Motivation Inventory (2022). http://selfdeterminationtheory.org/intrinsic-motivation-inventory/. Accessed 12 2022 32. Vaismoradi, M., Jones, J., Turunen, H., Snelgrove, S.: Theme development in qualitative content analysis and thematic analysis (2016) 33. Vanhanen, J., Lassenius, C.: Effects of pair programming at the development team level: an experiment. In: 2005 International Symposium on Empirical Software Engineering, p. 10. IEEE (2005) 34. White, R.W.: Motivation reconsidered: the concept of competence. Psychol. Rev. 66(5), 297 (1959) 35. Williams, L., Kessler, R.: Pair Programming Illuminated. Addison-Wesley Professional (2003) 36. Williams, L., Layman, L., Osborne, J., Katira, N.: Examining the compatibility of student pair programmers. In: AGILE 2006 (AGILE'06), p. 10. IEEE (2006)
Video Game Development Process for Soft Skills Analysis Adriana Peña Pérez Negrón1 , David Bonilla Carranza1(B) , and Mirna Muñoz2 1 CUCEI, Universidad de Guadalajara, Blvd. Marcelino García Barragán 1421, 44430
Guadalajara, Jal, Mexico [email protected], [email protected] 2 CIMAT Zacatecas, C. Lasec y And. Galileo Galilei, M 3, L 7 Quantum Ciudad del Conocimiento, 98160 Zacatecas, Zac, Mexico [email protected]
Abstract. The study of soft skills within the organization has acquired great relevance in recent years. Because software development is a human endeavor mainly based on teamwork, soft skills represent a key factor in its success. The analysis of soft skills is currently mainly conducted through questionnaires and interviews, both subject to interpretation and normally not in a situational setting. This is probably why video games have been presented as an alternative to contextualize the situation in which soft skills are present. There are different proposals with processes for soft skills analysis by means of video games; however, there are no unified criteria on how to develop video games for soft skills analysis. This work presents a process for the analysis of soft skills from the conception of the development of a video game to the necessary metrics for its comprehension. Keywords: Player-style analysis · Videogames · Development · Soft skills
1 Introduction With the player-style analysis, the game industry put on the table the study of the player's features or characteristics for the benefit of the gameplay experience. The gameplay experience is understood as the player's interaction with the virtual space, what the player can do in it, and the processes and outcomes of such interactions. In a video game, all the actions and reactions of the player, known as the player replay, represent valuable information helpful to observe the player's actual behavior in a virtual scenario. Therefore, replay analysis has been proposed to assess the psychological characteristics of a person who is playing a game [1, 2]. The traditional approach to identifying play style is to develop a player model [3]; this includes detection, modeling, prediction, and expression of the player characteristics displayed in the game that generate behavioral patterns [4]. Under the idea of comprehending the player's style lies adaptability, a system that automatically adapts the user experience to the actual user [5]. User modeling adaptability aims
at the users' needs and preferences, and it can be based on anything from information in the user profile to more sophisticated systems applying Artificial Intelligence (AI) [4, 6, 7]. Player modeling typically uses information gathered both during the game session and from the player profile. Also, player modeling approaches can be model-based or model-free. Model-based approaches are built on a theoretical framework representing the expected categorization of the player, for example, an emotional model. A model-free approach generates the player model with no strong initial assumptions; this approach tends to identify behavioral patterns. Hybrid approaches can also be applied, with elements from both approaches [2]. When a game or a video game has an objective other than just its inherent ludic one, it is considered a serious game (SG) [8]. Serious games' scope is broad; their objectives include training, learning, encouraging specific behavior, giving insights on particular problems, and psychological profile determination [9], and they also represent a suitable alternative for soft skills analysis. Soft skills or non-technical skills are closely related to the way people behave. They comprise a combination of abilities, attitudes, habits, and personality traits that allow people to perform better [10]. Contrary to hard or technical skills, which are naturally measured according to performance in the field of interest, soft skills represent a challenge in both formalization and evaluation [11]. At an individual level, soft skills covered in the literature include 1) psychological issues like personality traits, characteristics, mental patterns, judgment, bias, or motivations; 2) cognitive issues such as learning skills and types, abstract thinking, knowledge sharing, or training; and 3) management skills such as planning or making estimates [6]. At a team level, the team members' soft skills have been recognized as impacting how they relate to each other and the team as a whole [12]. Consequently, the players' behavior is here proposed as a means for the analysis of soft skills. A video game represents situational media where data can be gathered in a discreet, non-invasive, non-disturbing way. Video games provide virtual environments for the players to be themselves without the feeling of being observed. They also allow creating situations similar to real life, in a controlled environment where diverse data can be automatically collected. However, even though video games facilitate a proper setup to understand people's individual and team behavior characteristics, their use for psychological studies is at an early stage, typically using a psychological test to validate or corroborate the video game data analysis [7, 13, 14]. This paper presents a process to design video games to understand soft skills for software development teams. The proposed process is based on a set of different video game designs for soft skills analyses.
2 Player Behavior Linked to Soft Skills Personality and soft skills are closely related. As Garcia [10] pointed out, personality development is the acquisition of life skills, those skills required to live successfully, and life skills are soft skills. Developing such soft skills is a systematic, conscious and continuous process.
For the study of personal characteristics and behavior, the most common instruments are those based on self-reports, interviews, and questionnaires. Unfortunately, a self-report is prone to bias because of social desirability, a tendency to present a favorable or expected image instead of the real one, memory, or motivation [3]. As an alternative, a situational evaluation represents a non-invasive approach for behavioral observation and analysis. For that, video games are immersive and engaging computer-generated scenarios where people can interact by making choices without suffering consequences. These are features that will very probably encourage the players to stay true to themselves [1]. Also, in a video game, the control of the situations is simultaneous for each participant. In the analysis of soft skills based on video games, the play-style identification practice can be applied, considering that all the elements related to the player interaction within the game determine the player style. This analysis includes game metrics and statistical spatial-temporal features; such data are usually mapped to cognitive states such as attention, challenge, and engagement [16]; this is the gameplay input. Another way of collecting video game data is by using body measurement sensors, such as body-worn devices, electroencephalogram (EEG), or electrocardiogram (ECG), but users might perceive them as invasive [17]. The model's output is usually a set of player states, where such states can help to map the soft skills of interest. The game design should support the players' immersion and engagement. The game design is composed of the game mechanics and the game dynamics. The actions, behaviors, and control mechanisms, that is, the building blocks used to construct the game, are considered the game mechanics. Those game mechanics generate the player experience based on the players' desires and motivation, denominated the game dynamics [18]. Extracting data from the player interaction should then provide a great amount of information beyond traditional questionnaires or interviews [4]. According to [19], observing behavior through the player inputs in video games represents the same concept as observer reports, and it potentially avoids the aforementioned bias problems.
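As an illustration of the gameplay input described above, the sketch below shows one way a game could log player actions for later mapping to soft-skill indicators. It is a generic, hypothetical design; the event names, fields, and file format are assumptions rather than part of any of the cited proposals.

```python
import json
import time
from dataclasses import dataclass, asdict, field

@dataclass
class GameplayEvent:
    """One observable player action: the raw unit of the gameplay input."""
    player_id: str
    timestamp: float
    action: str          # e.g. "chat_message", "resource_shared", "retry_after_failure"
    context: dict = field(default_factory=dict)  # situational data (level, teammates, ...)

class ReplayLogger:
    """Collects events during a session so they can later be mapped to soft-skill indicators."""
    def __init__(self):
        self.events = []

    def log(self, player_id, action, **context):
        self.events.append(GameplayEvent(player_id, time.time(), action, context))

    def dump(self, path):
        with open(path, "w", encoding="utf-8") as fh:
            json.dump([asdict(e) for e in self.events], fh, indent=2)

# Usage sketch: the game engine would call log() from its own event callbacks.
logger = ReplayLogger()
logger.log("p01", "resource_shared", recipient="p02", level=3)
logger.log("p01", "retry_after_failure", attempts=4)
logger.dump("session_replay.json")
```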
3 Related Works Guardiola and Natkin [14] presented a game design methodology to generate a psychological profile of the players. Their process included nine steps: 1) identifying the psychological model, 2) considering audience and constraints, 3) working with experts for validation (Experts pipeline), 4) setting the game concept according to the psychological model, 5) defining the psychometric items and including them in the gameplay loop, 6) rationalizing the connection between the game design and the psychological dimensions, 7) building an interface for results, 8) evaluating the items' quality, and 9) performing a loop until finding a strong connection between the generated profiles and the traditional questionnaire. In this method, the authors highlight the importance of comparing the game results with a reliable data collection instrument such as a traditional questionnaire. The framework is thus based on a scientifically validated psychological model, where the data has to be collected during the gameplay session, the psychological model must match the profiling intention, and the result has to be compared through reliable means. The experts'
validation or "Experts pipeline" represents the game conception and production phases in the model. In the Conception phase, the game design document (GDD) is developed and technical and expert evaluations are conducted. Then, in the Production phase, the specifications are determined and the production is performed; afterward, the player tests and functional tests are conducted, and finally, an expert review of the psychometric function takes place. Both the Conception and the Production phase follow an incremental, iterative process. This process was applied in the JEU SERAI SG, a vocational guidance game for students, intended not to substitute a career advisor but to offer a self-assessment test for the student. Alloza et al. [20] argued that commercial video games (not SGs) are a helpful tool for soft skills identification and as training scenarios. Their soft skills assessment process starts with identifying open-source video games that allow modifications to the code, in order to be able to include indicators to measure the skills. They designed a telemetry algorithm based on a literature review to support the selection of video games. Based on the selected games, they decide on the soft skill to be measured. In their study, a psychometric pre-test was used along with facial recognition with the intention to predict the players' affective states. At the end of the sessions, the players re-completed the psychometric instrument to compare results and establish whether or not there was an improvement in the selected soft skills. The next three steps depict the Alloza et al. [20] process: 1. Exploration for the selection of a commercial game. 2. Selection of the soft skills to improve. 3. Validation of the soft skills improvement. In step three, the validation consists of pre- and post-tests since the proposal is for soft skills improvement. We consider that the Alloza et al. [20] proposal of using open-source video games is a valid alternative to include in the process for the analysis of soft skills through video games. This option can be considered in order to use resources efficiently in the development process, although it is limited by the open-source game design. Mayer [6] aimed to demonstrate the validity of game-based training and assessment. They used a commercial, multiplayer, three-dimensional game for team training and psychometric tests. They gathered data in pre- and post-game surveys, and both instruments aim to evaluate team characteristics on an individual basis, through the gameplay logs. This proposal aims at both team and individual evaluations. The game's advantage of generating situations similar to real life applies perfectly to team evaluation. As a result, individual evaluations in a team context, as well as team evaluations, are other alternatives to be considered for the analysis of soft skills in the video game context. McCord et al. [5] conducted a study to compare traditional personality instruments to game-like instruments. Their game design is contextually grounded in a real narrative scenario with options for the course of action. Although the game gives the player the impression that progress depends on the chosen option, the options are only related to a personality characteristic. The participants answered a traditional personality test, and then the results were compared. The McCord et al. [5] proposal highlights the importance of avoiding situations where the player chooses options, maybe contrary to his or her nature,
but with the expectation of getting a better score in the game. This is a design point that has to be carefully taken into consideration to validate the analysis. Pouezevara et al. [21] described the process of developing and testing a problem-solving game. They used Evidence-Centered Design (ECD) as a framework for designing tasks related to the data generated by the game. ECD uses domain analysis, domain modeling, and conceptual assessment to break down the skill to be evaluated into measurable parts. For that, they accomplished a literature review and used experts' consultation to define a set of skills that were prioritized by importance, by being teachable, and by their measurability. The game design was a co-development process with a set of focus group sessions. The game development was validated initially with a functionality test, then with a user test, and then with a user interface testing process. In order to validate the game, a triangulation was made through the gameplay, a personality test, and self-assessments. A test of the abilities required to play the game was previously performed; in it, the participants were asked about the frequency of their digital game playing habits to establish their abilities in video games. Next, a psychological test for the soft skills of interest was applied. Data was collected during the game sessions. Finally, they compared the psychometric instruments, self-assessment, and game metrics. The Pouezevara et al. [21] proposal presents several important points for the methodology design. First, it evidences the importance of using a design framework for the game tasks or goals, and second, the importance of breaking the tasks into measurable parts. Also, the game requires both functionality and user tests, along with the regular test processes. Another important issue that can be inferred from their proposal is that the player's abilities should not interfere in the game outcome; in this case, we suggest using games that do not require physical abilities and, in all cases, establishing that the abilities required to play the game do not interfere with the soft skills analysis. The last point, also important, is the consideration of a self-assessment other than the psychometric instrument that can also support the evaluation. Zulkifly [22] described the process of adapting a personality theoretical framework to the design of a SG. The proposed method is first to consult experts on personality and game design with the purpose of identifying questionnaire items associated with the behavior to be measured, and then making a relation with the game metrics that capture the identified behavior. In this second step, the author suggests compiling the best ideas into an exhaustive list of game metrics. Next, game mechanics are established to implement the identified game metrics, debating how the game mechanics best fit in a cohesive game and listing their possible combinations. Based on the previous steps, the game design specifications can be determined. This proposal is interesting because it delves into the design from the questionnaire items to the game metrics and the game mechanics. Ammannato and Chiesi [13] conducted a study using information about the way the player acts and reacts in a competitive video game to assess personality traits by applying machine learning to extract data from the game logs.
The authors selected a personality model, and soft skills (honesty-humility and emotionality traits) related to a multiuser game assuming that a game includes the pleasure of engaging with others and uncertainty in the outcomes. The video game Massive Online Battle Arena (MOBA) with cooperative and competitive dynamics was selected. An electronic form of the psychometric test was applied to the participants. And, the game logs were codified to
perform the analysis of the data through deep learning models based on the psychometric dimensions of the personality assessment instrument. They correctly predicted personality traits based on the players' video game actions in their preliminary study. In this proposal, two different approaches for multiuser games are identified: the competitive and the cooperative modes. Also, for the analysis, they mined data not just for comparison with a traditional psychometric instrument, but also to understand other hidden player behaviors that might contribute to the disclosure of the psychological metric. Haizel et al. [23] presented a study to establish whether a game can predict personality, and they compared the game-based evaluation with traditional methods. They developed a role-playing game (RPG) with freedom of choice. Participants were previously asked about their gaming habits. Then they played the game before they answered a personality test. In this proposal, the gaming habits were included in the evaluation; this can be seen as part of the abilities of the player, but also as a way to understand certain player tendencies to consider in the psychometric evaluation. Peña Pérez Negrón et al. [24] presented a study for the evaluation of interactive styles, that is, consistent behavior within certain situations, associated with software development teams. Their process is first to select a contingency arrangement according to the interactive style to be analyzed. Then, the expected values for the team members have to be established. Based on the previous steps, the game is selected. Finally, the data that will define the evaluation of the participants is specified. They used a pre-questionnaire of self-evaluation regarding the abilities to evaluate, and a post-questionnaire to validate the player's perception of the interactive style in the game. In this proposal, expected values for the team performance are included. Based on these proposals and their highlighted characteristics, a process for the analysis of soft skills based on video games is presented next.
4 Game Design Process for the Analysis of Soft Skills This process has been designed from the nine studies in the Related Works section. Figure 1 depicts the stages proposed for the video game development considered for a general process for the analysis of soft skills [25]. Roughly, in the Conception phase the key aspects of designing the video game are determined. In the second phase, the Design phase, the methodologies and metrics for the development of the game are determined. In the design phase, the gameplay must be related to the soft skill to be analyzed. In the third phase, the Development phase, the game is coded and the first digital prototypes are made. The Implementation phase, the fourth one, is the one in which a protocol is created for the target audience, and tests are made with a pilot group to obtain a functional version as a result. The last phase, Results Evaluation, is when the analysis of the player's soft skills is done. The Conception phase is detailed in Fig. 2. It begins with the identification of the soft skills to be evaluated. Once the soft skills are identified, the situations or contingencies in which they can be present are specified. After linking the skills to scenarios within the video game, the target audience must be identified. Then an individual or group evaluation is defined, or even both, and the aspects and requirements of the video game are analyzed in detail; it also has to be determined whether the game meets the necessary objectives for the analysis of soft skills. In the individual evaluation, the scope of the
Fig. 1. Phases for the development of video games
evaluation is determined, then the restrictions that the evaluation will have, and, finally, the development framework for the video game is defined. It is worth remembering that in the definition of the group evaluation there are two aspects to take into account: the competitive aspect and the cooperative aspect. For the competitive situation, the features and restrictions that will be implemented in the development framework of the video game are evaluated; in short, it is the evaluation of how competitive the video game under development will be. The cooperative aspect, in turn, evaluates the cooperation criteria and their restrictions for the players within the video game, leading to a general development framework in which the competitiveness and cooperation between players are evaluated.
Fig. 2. Conception phase
For the design of the video game shown in Fig. 3, the previously developed framework will be used based on the defined evaluations, so the Design phase is broken down into the following tasks:
• Definition of design criteria: taking care that the player's skills are not involved in the overall video game design and eliminating repetitive patterns within the design.
• Definition of metrics: relating the tasks and objectives to the metrics established in the video game design (see the sketch after this list).
• Definition of the elements of the video game: here, some of the most important aspects of the video game are defined, such as the mechanics, dynamics, and aesthetic aspects of the video game.
• Gameplay design: basically, this is the creation of the video game design document (GDD).
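As referenced in the metrics item above, the link between target soft skills and in-game metrics can be recorded as a simple design-time mapping. The following Python sketch only illustrates that idea; the skill names, metric names, and event types are hypothetical and would come from the GDD of a concrete game.

```python
# Hypothetical design-time mapping from target soft skills to in-game metrics.
# Skill names, metric names, and source events are illustrative; each metric must be
# computable from the events logged during gameplay.
SOFT_SKILL_METRICS = {
    "communication": [
        {"metric": "chat_messages_per_minute", "source_event": "chat_message"},
        {"metric": "questions_asked", "source_event": "chat_message"},
    ],
    "collaboration": [
        {"metric": "resources_shared", "source_event": "resource_shared"},
        {"metric": "joint_objectives_completed", "source_event": "objective_completed"},
    ],
    "perseverance": [
        {"metric": "retries_after_failure", "source_event": "retry_after_failure"},
    ],
}

def metrics_for(skill):
    """Return the metric definitions the designers attached to a soft skill."""
    return SOFT_SKILL_METRICS.get(skill, [])

print([m["metric"] for m in metrics_for("collaboration")])
```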
Fig. 3. Design phase
Acerenza et al. [16] developed what they called the SUM methodology, which uses the Scrum framework for the development of video games. In SUM, two phases were added to the Scrum framework: the Pre-production and Pre-game phases, which correspond to the Planning and Elaboration phases, respectively. A third phase, Post-production and Post-game, corresponding to the Beta and Closing phases, was also included; see Fig. 4. The SUM phases are briefly described next:
Phase 1: Concepts, where the basic elements required for the design are specified, such as target audience, business model, game elements, main features, gameplay, characters, and story, and the technical languages and tools are determined; that is, the game concept is developed.
Phase 2: Planning, where the project is scheduled with its main milestones. The video game's functional and non-functional characteristics are estimated and prioritized, and the teams are defined. In other words, the administrative plans and their specifications are established.
Phase 3: Elaboration, where the video game is implemented through an iterative and incremental approach. The phase is broken down into three threads: objectives planning, execution of the tasks, and evaluation.
Phase 4: Beta, where evaluations and adjustments of different aspects are made, such as the gameplay, its fun, or the analysis of the learning curves.
Phase 5: Closing, which concludes the process. A final version of the video game is developed based on the requirements, and feedback is gathered from the different phases.
Fig. 4. The development phase steps
Risk management has to be carried out during the whole project to minimize the consequences and impact of possible problems. Figure 5 shows the Implementation phase, where a protocol must be drawn up covering the following points: the video game must have a suitable environment for the player and the gameplay; it has to be considered whether the video game must use any artifact external to the environment where it is executed, for example, a wireless controller or a virtual reality device; the time in which the video game is going to be implemented has to be taken into account; contingency plans have to be considered in case something goes wrong; and finally, a functionality test has to be performed to ensure that both the game and its implementation work correctly.
Fig. 5. Implementation phase
In the Results Evaluation phase, shown in Fig. 6, the results obtained have to be compared with the expectations. The evaluation criteria for the results can be selected among interviews, expert judgment, self-evaluation, and pattern analysis based on Artificial Intelligence. The soft skills measurements obtained after the implementation of the video game should be collected. Finally, the overall results obtained should be analyzed and interpreted through graphs or other visual means that allow seeing the results and comparing them with the expectations in order to reach a conclusion and determine whether the objectives were met.
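One concrete form this comparison can take is correlating the game-derived soft-skill scores with an external measure (a psychometric test, expert judgment, or self-assessment) and checking the result against an expectation fixed in the Conception phase. The sketch below is illustrative only; the scores and the acceptance threshold are invented values.

```python
import numpy as np
from scipy import stats

# Hypothetical per-player scores: one derived from the game metrics, one from an external
# psychometric instrument (all values are invented for illustration).
game_scores = np.array([0.62, 0.71, 0.55, 0.80, 0.47, 0.66, 0.73, 0.58])
test_scores = np.array([3.4, 3.9, 3.1, 4.2, 2.8, 3.6, 4.0, 3.2])  # e.g. 1-5 Likert averages

r, p = stats.pearsonr(game_scores, test_scores)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")

# A simple pass/fail check against an expectation defined in the Conception phase.
EXPECTED_MIN_CORRELATION = 0.5  # hypothetical acceptance threshold
print("objective met" if r >= EXPECTED_MIN_CORRELATION else "revisit the game metrics")
```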
5 Framework Validation Steps The validation framework presents the steps to be evaluated in each phase. Figure 7 corresponds to the evaluation of the Conception phase. In this validation it has to be
Fig. 6. Results Evaluation phase
compared whether the targeted soft skills are similar to those obtained in the evaluation of the video game framework; if so, we move on to the Design phase; if not, the soft skills are reanalyzed with the objectives to be evaluated [14].
Fig. 7. Conception phase validation
Figure 8 shows the validation of the Design phase, where a basic prototype of the video game is created with its mechanical characteristics. It has to be tested whether the video game meets the design and mechanics objectives before the Development phase; otherwise, a better design has to be created and tested until the objectives of this phase are met. The validation of the Development phase is shown in Fig. 9. A digital prototype of the video game is first tested and compared with the objectives of the Development phase. If it is correct, then the Implementation phase can follow; otherwise, the prototype has to be improved and tested until the objectives are met.
Fig. 8. Design phase validation
Fig. 9. Development phase validation
Figure 11 shows the validation with a question: Is there a psychology test that evaluates soft skills? If so, the results are compared with such psychological test to determine if the general objectives of the framework in the video game have been met. In case there is no psychological test to evaluate the soft skills, then if it is available an expert might evaluate the results. If so, the results obtained by the expert are compared in the same way with the assessments raised and the expected objectives for the video game framework. If not, a self-assessment by the players can be performed and the results are compared with the evaluations proposed and the objectives expected for the video game framework.
Fig. 10. Implementation phase validation
Fig. 11. Results Evaluation phases validation.
6 Conclusions and Future Work Soft skills present a challenge for measurement within video games, and they currently represent a key factor for project success. The definition of metrics to evaluate them is a difficult task. There are different efforts and initiatives that develop evaluation methodologies to measure soft skills, but it is difficult to find examples of concrete analyses of soft skills and of how these metrics are used in practical evaluations. This project presents a process for the development of a video game to assess and analyze soft skills. The next logical step is the implementation of the process in an exploratory study to determine its validity.
References 1. Ventura, M., Shute, V., Kim, Y.J.: Video gameplay, personality and academic performance. Comput. Educ. 58(4), 1260–1266 (2012). https://doi.org/10.1016/j.compedu.2011.11.022
2. Ontanon, S., Zhu, J.: The personalization paradox: the conflict between accurate user models and personalized adaptive systems. In: 26th International Conference on Intelligent User Interfaces. Presented at the IUI 2021: 26th International Conference on Intelligent User Interfaces, College Station, TX, USA, pp. 64–66. ACM (2021). https://doi.org/10.1145/3397482. 3450734 3. Desurvire, H., El-Nasr, M.S.: Methods for game user research: studying player behavior to enhance game design. IEEE Comput. Graphics Appl. 33(4), 82–87 (2013). https://doi.org/10. 1109/MCG.2013.61 4. Hullett, K., Nagappan, N., Schuh, E., Hopson, J.: Data analytics for game development (NIER track). In: Proceedings of the 33rd International Conference on Software Engineering. Presented at the ICSE 2011: International Conference on Software Engineering, Waikiki, Honolulu, HI, USA, pp. 940–943. ACM (2011). https://doi.org/10.1145/1985793.1985952 5. McCord, J.-L., Harman, J.L., Purl, J.: Game-like personality testing: an emerging mode of personality assessment. Pers. Individ. Differ. 143, 95–102 (2019). https://doi.org/10.1016/j. paid.2019.02.017 6. Mayer, I.: Assessment of teams in a digital game environment. Simul. Gaming 49(6), 602–619 (2018). https://doi.org/10.1177/1046878118770831 7. Aggarwal, S., Saluja, S., Gambhir, V., Gupta, S., Satia, S.P.S.: Predicting likelihood of psychological disorders in PlayerUnknown’s Battlegrounds (PUBG) players from Asian countries using supervised machine learning. Addict. Behav. 101, 106132 (2020). https://doi.org/10. 1016/j.addbeh.2019.106132 8. Mildner, P., ‘Floyd’ Mueller, F.: Design of serious games. In: Dörner, R., Göbel, S., Effelsberg, W., Wiemeyer, J. (eds.) Serious Games, pp. 57–82. Springer, Cham (2016). https://doi.org/ 10.1007/978-3-319-40612-1_3 9. van Lankveld, G., Spronck, P., van den Herik, J., Arntz, A.: Games as personality profiling tools. In: 2011 IEEE Conference on Computational Intelligence and Games (CIG 2011). Presented at the 2011 IEEE Conference on Computational Intelligence and Games (CIG), Seoul, Korea (South), pp. 197–202. IEEE (2011). https://doi.org/10.1109/CIG.2011.6032007 10. Garcia, I., Pacheco, C., Méndez, F., Calvo-Manzano, J.A.: The effects of game-based learning in the acquisition of “soft skills” on undergraduate software engineering courses: a systematic literature review. Comput. Appl. Eng. Educ. 28(5), 1327–1354 (2020). https://doi.org/10. 1002/cae.22304 11. Muzio, E., Fisher, D.J., Thomas, E.R., Peters, V.: Soft Skills Quantification (SSQ) Foi Project manager competencies. Proj. Manag. J. 38(2), 30–38 (2007). https://doi.org/10.1177/875697 280703800204 12. Milczarski, P., Podlaski, K., Hłoba˙z, A., Dowdall, S., Stawska, Z., O’Reilly, D.: Soft skills development in computer science students via multinational and multidisciplinary GameDev project. In: Proceedings of the 52nd ACM Technical Symposium on Computer Science Education. Presented at the SIGCSE 2021: The 52nd ACM Technical Symposium on Computer Science Education, Virtual Event, USA, pp. 583–589. ACM (2021). https://doi.org/10.1145/ 3408877.3432522 13. Ammannato, G., Chiesi, F.: Playing with networks: using video games as a psychological assessment tool. Eur. J. Psychol. Assess. 36(6), 973–980 (2020). https://doi.org/10.1027/ 1015-5759/a000608 14. Guardiola, E., Natkin, S.: A game design methodology for generating a psychological profile of players. In: Loh, C.S., Sheng, Y., Ifenthaler, D. (eds.) Serious Games Analytics, pp. 363– 380. Springer, Cham (2015). 
https://doi.org/10.1007/978-3-319-05834-4_16
112
A. P. P. Negrón et al.
15. Dhaouadi, S., Ben Khelifa, M.M.: A multimodal physiological-based stress recognition: deep learning models’ evaluation in gamers’ monitoring application. In: 2020 5th International Conference on Advanced Technologies for Signal and Image Processing (ATSIP). Presented at the 2020 5th International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), Sousse, Tunisia, pp. 1–6. IEEE (2020). https://doi.org/10.1109/ATSIP4 9331.2020.9231666 16. Acerenza, N., et al.: Una Metodología para Desarrollo de Videojuegos. Presented at the 38° JAIIO - Simposio Argentino de Ingeniería de Software, pp. 171–176 (2009). https://www. fing.edu.uy/sites/default/files/biblio/22811/asse_2009_16.pdf 17. Saputra, R., Iqbal, B.M., Komarudin: Stress emotion evaluation in Multiplayer Online Battle Arena (MOBA) video game related to gaming rules using electroencephalogram (EEG). In: Proceedings of the 2017 4th International Conference on Biomedical and Bioinformatics Engineering. Presented at the ICBBE 2017: 2017 4th International Conference on Biomedical and Bioinformatics Engineering, Seoul, Republic of Korea, pp. 74–77. ACM (2017). https:// doi.org/10.1145/3168776.3168797 18. Bonilla Carranza, D., Peña Pérez Negrón, A., Contreras, M.: Videogame development training approach: a Virtual Reality and open-source perspective. JUCS J. Univ. Comput. Sci. 27(2), 152–169 (2021). https://doi.org/10.3897/jucs.65164 19. Rahman, E.: Gamers’ experiences in playing video games – a theoretical thematic analysis (2017). https://doi.org/10.13140/RG.2.2.22549.37608 20. Alloza, S., Escribano, F., Delgado, S., Corneanu, C., Escalera, S.: XBadges. Identifying and training soft skills with commercial video games (2017). https://doi.org/10.48550/ARXIV. 1707.00863 21. Pouezevara, S., Powers, S., Moore, G., Strigel, C., McKnight, K.: Assessing soft skills in youth through digital games. Presented at the 12th Annual International Conference of Education, Research and Innovation, Seville, Spain, pp. 3057–3066 (2019). https://doi.org/10.21125/ iceri.2019.0774 22. Zulkifly, A.: Personality assessment through the use of video games, 190 (n.d.) 23. Haizel, P., Vernanda, G., Wawolangi, K.A., Hanafiah, N.: Personality assessment video game based on the five-factor model. Procedia Comput. Sci. 179, 566–573 (2021). https://doi.org/ 10.1016/j.procs.2021.01.041 24. Negrón, A.P.P., Muñoz, M., Carranza, D.B., Rangel, N.: Towards the evaluation of relevant interaction styles for software developers. In: Mejia, J., Muñoz, M., Rocha, Á., Avila-George, H., Martínez-Aguilar, G.M. (eds.) CIMPS 2021. AISC, vol. 1416, pp. 137–149. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-89909-7_11 25. Abdul-Kadir, M.R., Price, A.D.F.: Conceptual phase of construction projects. Int. J. Project Manag. 13(6), 387–393 (1995). https://doi.org/10.1016/0263-7863(96)81776-5
29110+ST: Integrated Security Practices. Case Study

Perla Maciel-Gallegos1, Jezreel Mejía1(B), and Yadira Quiñonez2

1 Centro de Investigación en Matemáticas, Zacatecas Unit, A.C., Jalisco S/N, Col. Valenciana, Guanajuato, GJ, Mexico
{perla.maciel,jmejia}@cimat.mx
2 Facultad de Informatica Mazatlan, Universidad Autonoma de Sinaloa, 82000 Mazatlan, Mexico
[email protected]
Abstract. Data security has become a significant area of interest for everyone involved in developing mobile applications. Therefore, it is crucial to consider that most of the applications' security issues are introduced during the development process. According to A. Semeney, Founder of DevTeam.Space, mobile applications are generally developed by small teams or software development VSEs. In this context, this article presents a tool, named 29110+ST, that shows the proposal of security improvements to the ISO/IEC 29110. To validate the proposal and the 29110+ST tool, an expert's judgment method was implemented with a survey in which the answers gave a positive response of 90.5% to the improvements made to the base framework and to the 29110+ST tool.

Keywords: 29110 · Security practices · Expert's judgment method · 29110+ST
1 Introduction
Currently, approximately 4.66 billion users are connected to the internet and can potentially fall victim to an attack. Of these connected users, approximately 4.28 billion access the internet via mobile devices [1]. It is essential to identify the vulnerabilities present on each platform in order to fix the security issues and to use security frameworks to solve them. In this context, mobile applications are generally developed by small teams or software development Very Small Entities (VSEs), which, according to A. Semeney, Founder of DevTeam.Space [2], have 5–9 developers working on an application for any mobile OS. Examples of these are Swenson He and Blue Whale Apps [3], among the Top 10 Web application development companies in 2020–2021. Therefore, there is a need for models, standards, or methodologies suitable for VSEs to help develop secure mobile applications. In the software development industry, there are few standards suitable for VSEs, which can be a barrier to implementation in these very small software development organizations. The ISO/IEC 29110 standard [4] is being used to certify software development; however, it does not have
tasks or activities oriented to good practices of secure software development that can be used in developing secure mobile applications. Therefore, this article presents a tool that shows the proposal of security improvements to the ISO/IEC 29110. This proposal allowed the development of a web tool named 29110+ST. In order to validate this web tool, an expert's judgment method was implemented with a survey, and the results confirmed the feasibility of integrating security practices into the ISO/IEC 29110 standard so that it can be applied in very small software development entities. The remainder of this paper is organized as follows: in Sect. 2, the background is established by referring to previous works that have been published; Sect. 3 describes the 29110+ST tool; Sect. 4 presents the case study; and finally, Sect. 5 presents a brief conclusion of the article.
2 Background
In order to develop the 29110+ST web tool, two articles were previously developed and published. The first article identifies the standards and methodologies on which several authors build the mobile software development frameworks they propose to avoid serious security issues. To obtain these results, a Systematic Literature Review (SLR) was performed. An SLR is a research protocol proposed by B. Kitchenham [5] that helps identify and explain all the relevant information available for a specific research subject, which can be defined by a question, a topic, or an area of interest. The results are described in [6]. As a result, the following standards and methodologies were identified and selected:
• ISO/IEC 27001 [7].
• ISO/IEC 27034 [8].
• Microsoft SDL [9].
• Seven Touchpoints by McGraw [10].
• CORAS Methodology [11].
The second article [12] shows the security practices integrated into ISO/IEC 29110 from the frameworks identified in the first article. It establishes the ISO/IEC 29110 standard as the base framework into which secure software development practices are integrated so that it can be used by VSEs to develop mobile applications. After the common practices among the security frameworks were determined, they were introduced into the base framework to create the proposal. The security practices to improve the ISO/IEC 29110 were divided into two overall improvements: 1) additions or improvements at the task level and the output-product level, and 2) deployment packages that integrate security practices at the task level and can be used in the development of secure mobile applications (see Fig. 1). The results and more details are described in [12]. The changes to the activities' tasks in the two processes defined in the ISO/IEC 29110 affected the Project Management process and the Software Implementation process.
Fig. 1. General view of the improvements to the Base Framework [12].
3 29110+ST
After integrating all the frameworks identified as a result of the systematic literature review and the developed proposal, a website named 29110+ST was developed. This website is divided into sections for a better understanding of the information. The link to the website is https://i29110-plus-st.herokuapp.com/index.html. The main page is a summary of the sections and has direct access to the improvements to the ISO/IEC 29110 section (Fig. 2). Within the page, there is a menu containing access to 1) general information on ISO/IEC 29110; 2) the security frameworks; 3) the security improvements introduced to the base framework (Improvements), which is divided into the security practices introduced directly into the base framework (Security Practices) and the deployment packages that can be implemented for the development of secure applications, both mobile and general (Deployment Packages); 4) the Documentation section, where the documents generated in this proposal (the modified ISO/IEC 29110 guide and the deployment packages generated) are made available to the users; 5) the description of the team; and finally 6) a contact section at the bottom of the page (Contact Us). The section in Fig. 3 briefly describes the security frameworks used for the improvement of the base framework, which include the ISO/IEC 27001 and 27034 standards, the Microsoft SDL framework, and McGraw's Seven Touchpoints, as well as the CORAS risk assessment methodology and OWASP's Top 10 Mobile Risks.
Fig. 2. The main page is a summary of the sections.
Fig. 3. The security frameworks section.
The Improvements subsection Security Practices describes the improvements introduced in each task. In addition, it defines the specific tasks and the additions that were made to each, as shown in Fig. 4. On the other hand, the Deployment Packages subsection, shown in Fig. 5, describes the packages developed for the Requirements, Design, and Construction tasks of the Software Implementation process. It also describes the considerations for a minimum level of security that mobile applications must meet to guarantee their safe use from the point of view of the five main aspects previously mentioned.
Fig. 4. The security practices section.
There is also a section called Documentation, shown in Fig. 6, that includes all the documents generated in the solution, including the deployment packages and the modified ISO/IEC 29110. The Documentation section has a classification of the documentation. It shows a thumbnail of each document, a quick view of the deployment package or modified standard, and ways to access the PDFs by selecting the document name or the More details icon (see Fig. 6).
4 Case Study
To validate the usability of the proposal, the Expert Judgment Method was chosen because it is often considered a very reliable option for solving problems or making decisions in various domains and fields. Expert judgment is the data provided by an expert to answer a problem or question [13]. There is no exact number of experts that must be defined, although some authors suggest a group of between 15 and 25 [14]. Some applications of expert judgment are the following [13]: 1) determine the probability of an event and assess the impact of changes; 2) determine the current state of field knowledge; 3) predict product or process performance; 4) determine the validity
Fig. 5. The deployment packages section.
Fig. 6. The documentation section.
of assumptions; 5) select input and response variables for the selected model; and 6) provide essential elements for decision making when multiple options are available. The steps followed to evaluate the proposal were taken from the thesis of Miguel González [15].
4.1 Presentation of the Proposal and Definition of the Features to Evaluate
The first step in evaluating the proposal is to prepare a presentation that explains to the experts the issue context, the solution, and the proposal, in order to gain their opinion. The features required to execute the evaluation are described in Table 1.

Table 1. Features required for the evaluation.
4.2 Definition of the Competences of the Experts to Be Selected
For the Expert Judgment, the critical step is to define the competencies that an expert must have, aligned with the object of evaluation. In this case, two profiles were defined, along with the criteria that a subject must meet to be considered an expert. The criteria are a) professional formation and b) experience. The description of the profiles can be seen in Table 2.

Table 2. Experts' profiles and criteria.
Table 3. 5-point Likert Scale.
4.3 Creation of the Data Collection Tool
The creation of the data collection tool consists of the development of a survey that evaluates the utility, presentation, and usability of the proposal and gathers the opinion of the experts. The technology used to generate the survey was Google Forms. It facilitates the collection and analysis of the answers and generates a link that can be sent to the experts, who answer the survey via the Internet. A 5-point Likert scale was chosen, with the parameters shown in Table 3, to define the evaluation of the questions; an illustrative sketch of how such answers can be aggregated is given after the question list. The questions included in the evaluation survey of the proposal are the following:
Q1 Do you consider that ISO/IEC 29110 indicates tasks or activities for secure software development?
Q2 Do you consider ISO/IEC 29110 to be compatible with mobile application development?
Q3 Do you consider that including security practices in ISO/IEC 29110 is useful for you?
Q4 Do you consider that the security frameworks (ISO/IEC 27001 and 27034, Microsoft SDL, McGraw's Seven Touchpoints and CORAS) considered in the proposal meet what is needed to integrate security into ISO/IEC 29110?
Q5 Do you consider that the security improvements made to ISO/IEC 29110 Part 5 are clear?
Q6 Do you consider that the security improvements in tasks, work products and roles made to ISO/IEC 29110 Part 5 are sufficient for the development of secure applications on VSEs?
Q7 Do you consider that the deployment packages that correspond to the tasks in the Software Implementation Process are clear?
Q8 Do you consider that the deployment packages corresponding to common mobile security concerns are clear?
Q9 Do you consider that the way the information is presented on the website is clear?
Q10 Do you consider that the information included in the website is sufficient to understand the security improvements in ISO/IEC 29110?
Q11 What are your suggestions for improvement?
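For illustration only (this sketch is not part of the original survey or its analysis), the following minimal Python fragment shows how 5-point Likert answers can be aggregated into percentages of the kind reported in Sect. 4.5. The function name and the distribution of the 19 hypothetical answers are assumptions, chosen to reproduce the proportions reported for Q1 (14 negative answers and 5 neutral ones).

```python
# Illustrative sketch only: aggregating 5-point Likert answers
# (1 = "Totally Disagree" ... 5 = "Totally Agree") into percentages.

def share(responses, levels):
    """Percentage of answers that fall within the given Likert levels."""
    hits = sum(1 for r in responses if r in levels)
    return round(100 * hits / len(responses), 2)

# Hypothetical answers from the 19 experts to one question:
# 9 "Totally Disagree", 5 "Disagree", 5 "Neither Disagree nor Agree".
answers = [1] * 9 + [2] * 5 + [3] * 5

print(share(answers, {1, 2}))  # 73.68 -> 14 of 19 experts disagree
print(share(answers, {3}))     # 26.32 -> 5 of 19 experts are neutral
```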
4.4 Collection of the Experts' Answers
In this step, the experts who complied with the profiles and criteria were invited to an online presentation of the work done, which indicated the issue at hand, the solution, and the proposal, explaining each element generated and the 29110+ST tool. The link to the tool was given to them to navigate so they could get to know its contents and the elements introduced in it. A question-and-answer session about the proposal was held at the end of the presentation. After clearing any doubts about the work done, the link to the survey was given to the experts for them to answer.

4.5 Analysis and Interpretation of the Results
This step presents the analysis and interpretation of the evaluation results obtained with the Expert Judgment Method, with 19 experts answering the survey.

Q1: Do you consider that ISO/IEC 29110 indicates tasks or activities for secure software development?
Regarding whether ISO/IEC 29110 indicates tasks or activities for secure software development, 73.68% of the experts answered "Totally Disagree" or "Disagree". The remaining 26.32% gave the neutral answer "Neither Disagree nor Agree". Most respondents (14 experts) agreed that ISO/IEC 29110 does not consider security practices in software development.

Q2: Do you consider ISO/IEC 29110 to be compatible with mobile application development?
The second question asked if ISO/IEC 29110 is compatible with mobile application development. 79% of the experts answered "Totally Agree" or "Agree", which supports the choice of ISO/IEC 29110 as the base framework, since mobile applications are typically developed by VSEs, the kind of organization that ISO/IEC 29110 helps in the software management and development process. The remaining 21.1% were neutral to the question.

Q3: Do you consider that including security practices in ISO/IEC 29110 is useful for you?
Regarding whether including security practices in ISO/IEC 29110 is useful to the expert, the results show that 100% of the answers were "Totally Agree" or "Agree", denoting the great need for security improvements in ISO/IEC 29110.

Q4: Do you consider that the security frameworks (ISO/IEC 27001 and 27034, Microsoft SDL, McGraw's Seven Touchpoints and CORAS) considered in the proposal meet what is needed to integrate security into ISO/IEC 29110?
The results show that 100% of the answers were "Totally Agree" or "Agree", which reinforces the decision of taking those security frameworks as the minimal security practices that must be considered in developing secure software.

Q5: Do you consider that the security improvements made to ISO/IEC 29110 Part 5 are clear?
The fifth question asked if the security improvements made to ISO/IEC 29110 Part 5 are clear. 78.95% of the answers were "Totally Agree" or "Agree", which highlights the clarity of the improvements made to Part 5 of the ISO/IEC 29110 guide. The remaining 21.1% considered the improvements neutral or not clear enough.

Q6: Do you consider that the security improvements in tasks, work products and roles made to ISO/IEC 29110 Part 5 are sufficient for the development of secure applications on VSEs?
The sixth question asks if the expert considers that the security improvements in tasks, work products, and roles made to ISO/IEC 29110 Part 5 are sufficient for developing secure applications on VSEs. 89.4% of the answers were "Totally Agree" or "Agree", supporting the statement that the improvements to ISO/IEC 29110 Part 5 are enough to help develop secure applications. The remaining 10.5% were neutral.

Q7: Do you consider that the deployment packages that correspond to the tasks in the Software Implementation Process are clear?
Regarding the clarity of the deployment packages for the Software Implementation Process, 100% of the answers were "Totally Agree" or "Agree", which can be interpreted as an indication of the usability and clarity of the Deployment Packages generated for the Software Implementation Process.

Q8: Do you consider that the deployment packages corresponding to common mobile security concerns are clear?
Regarding the clarity of the deployment packages corresponding to common mobile security concerns, 100% of the answers were "Totally Agree" or "Agree", which can be interpreted as an indication of the usability and clarity of the Deployment Packages generated for typical mobile security concerns.

Q9: Do you consider that the way the information is presented on the website is clear?
The ninth question addresses the structure and classification of the information on the website. 100% of the answers were "Totally Agree" or "Agree", which can be interpreted as great clarity in the structure of the information classified on the website.

Q10: Do you consider that the information included in the website is sufficient to understand the security improvements in ISO/IEC 29110?
This question asks if the information on the website is enough to understand the security improvements made to ISO/IEC 29110. 84.2% of the answers were "Totally Agree" or "Agree" with the information provided on the website. A further 10.5% answered that it was not enough
information to understand the improvements, and just 5.3% (one person) was neutral to the question.

Q11: What are your suggestions for improvement?
Regarding suggestions, the following list describes the most relevant suggestions for improvement:
• Show more about this improvement's impact on ISO/IEC 29110.
• Create formats for the development of secure mobile applications.
• I recommend using different diagrams to follow the traceability of the different procedures so that they can be visualized and understood; for example, how a procedure is activated, what its tasks of origin and destination are, what roles are involved, and the different states presented in the formats as the process goes on (new, accepted, and rejected). This is for a better understanding from a general overview.

4.6 Report of Results
The results and suggestions for improvement obtained with the survey applied to the experts' group can be reported as follows:
a) Regarding the selection of ISO/IEC 29110 as the base framework, the answers were positive: the organizations or teams that develop mobile software fall into the category of VSEs, which is the type of organization that ISO/IEC 29110 targets. The experts also confirmed that the base framework does not consider security practices, so the improvement makes a real and useful contribution.
b) The security frameworks integrated into the base framework were considered sufficient because they met the need to introduce security elements into the base framework. The experts stated that the improvement was useful to implement.
c) The improvements to Part 5 of ISO/IEC 29110 were considered clear enough to understand the security practices introduced. Here, the experts suggested that the proposal could be complemented with additional formats to help in the implementation of the improvements.
d) The Deployment Packages, which define the process of implementing security practices in the Software Implementation Process and for the mobile concerns, were perceived as clear by the experts.
e) Regarding the website, the structure in which the information was presented was clear enough for the experts to understand the proposal, but the amount of information explaining the improvements was insufficient. Here, the experts suggested that the proposal could be complemented with more diagrams to understand the traceability of the new additions.
5 Conclusions
In this work, we presented a tool, named 29110+ST, that shows the proposal for security improvements to the ISO/IEC 29110. To validate the proposal and the 29110+ST tool, an
Expert Judgment Method was implemented with a survey in which the answers gave a positive response, 90.5% overall, to the improvements made to the base framework and to the 29110+ST tool. The survey can be divided into five points of evaluation: 1) whether the selection of ISO/IEC 29110 as the base framework was justified by the context of the issue, which had a positive response of 76.34%, although the experts suggested improvements to make it better; 2) whether the security additions were helpful in the defined context, as well as whether the security frameworks considered helped in the improvement of the base framework, which had 100% acceptance; 3) whether the security improvements made to Part 5 of ISO/IEC 29110 were clear and sufficient for the development of secure applications in VSEs, where 84.18% of the experts gave positive answers, a reasonable rate of acceptance; 4) the deployment packages generated for the Software Implementation Process and for the secure development of mobile applications, which had 100% acceptance, leading to the conclusion that the deployment packages contain the information necessary to understand the implementation of the artifact; and 5) the usability of the website, considering whether the information structure was presented clearly and whether the information contained on the website was enough to understand the security improvements to the base framework, which had a 92.1% rate, an excellent mark for the structure, although the amount of content needed to understand the proposal must be improved.
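For transparency, the following small sketch is an assumption about how the overall figure can be reproduced, not a description of the authors' actual computation: it checks that the reported overall acceptance of 90.5% is consistent with the simple average of the five evaluation points listed above.

```python
# Illustrative check: the overall acceptance reported above matches the
# arithmetic mean of the five evaluation points (assumed aggregation).
points = [76.34, 100.0, 84.18, 100.0, 92.1]
overall = sum(points) / len(points)
print(round(overall, 1))  # 90.5
```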
References 1. DataReportal: Internet users in the world 2020 | Statista. Statista GmbH, October 2020 (2022). https://www.statista.com/statistics/617136/digital-population-worldwide/ 2. Newzoo: Smartphone users 2020 | Statista. Statista GmbH (2020). https://www.statista.com/ statistics/330695/number-of-smartphone-users-worldwide/ 3. App Annie; TechCrunch: Annual number of mobile app downloads worldwide 2020 | Statista. Statista, September 2020 (2020). https://www.statista.com/statistics/271644/worldwide-freeand-paid-mobile-app-store-downloads/ 4. Poniszewska-Maranda, A., Majchrzycka, A.: Access control approach in development of mobile applications. In: Younas, M., Awan, I., Kryvinska, N., Strauss, C., Thanh, D.V. (eds.) MobiWIS 2016. LNCS, vol. 9847, pp. 149–162. Springer, Cham (2016). https://doi.org/10. 1007/978-3-319-44215-0_12 5. Kitchenham, B., Brereton, O.P., Budgen, D., Turner, M., Bailey, J., Linkman, S.: Systematic literature reviews in software engineering - a systematic literature review. Inf. Softw. Technol. 51, 7–15 (2009) 6. Mejía, J., Maciel, P., Muñoz, M., Quiñonez, Y.: Frameworks to develop secure mobile applications: a systematic literature review. In: Rocha, Á., Adeli, H., Reis, L.P., Costanzo, S., Orovic, I., Moreira, F. (eds.) WorldCIST 2020. AISC, vol. 1160, pp. 137–146. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-45691-7_13 7. ISO/IEC: ISO/IEC 27001:2013, Information technolog. Security techniques. Information security management systems. Requirements 8. ISO/IEC: ISO/IEC 27034 — Information technology — Security techniques — Application security 9. Microsoft: Security Development Lifecycle | SDL Process Guidance Version 5.2 (2012) 10. McGraw, G.: Software Security: Building Security In. Addison-Wesley Professional, Richmond (2006)
11. Lund, M.S., Solhaug, B., Stølen, K.: The CORAS Model-Based Method. SINTEF, Oslo (2006) 12. Mejía, J., Muñoz, M., Maciel-Gallegos, P., Quiñonez, Y.: Proposal to integrate security practices into the ISO/IEC 29110 standard to develop mobile apps. In: Mejia, J., Muñoz, M., Rocha, Á., Avila-George, H., Martínez-Aguilar, G.M. (eds.) CIMPS 2021. AISC, vol. 1416, pp. 29–40. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-89909-7_3 13. Amer, M., Daim, T.: Expert judgment quantification. In: Daim, T., Oliver, T., Kim, J. (eds.) Research and Technology Management in the Electricity Industry. Green Energy and Technology, pp. 31–65. Springer, London (2013). https://doi.org/10.1007/978-1-4471-509 7-8_3 14. García, L., Fernández, S.J.: Procedimiento de aplicación del trabajo creativo en grupo de expertos. Ingeniería Energética XXIX(2), 46–50 (2008) 15. González Pacheco, M.Á., Muñoz Mata, M.A., Hernández Reveles, J.G.: Creación de una biblioteca de juegos serios para hacer más efectiva la enseñanza de Kanban acorde a las necesidades de la Pyme. M.S. thesis, CIMAT, Zacatecas (2021)
Comprehension of Computer Programs Through Reverse Engineering Approaches and Techniques: A Systematic Mapping Study

Yazmin Alejandra Luna-Herrera(B), Juan Carlos Pérez-Arriaga, Jorge Octavio Ocharán-Hernández, and Ángel J. Sanchéz-García

Facultad de Estadística e Informática, Universidad Veracruzana, Xalapa, Veracruz, México
[email protected], {juaperez,jocharan,angesanchez}@uv.mx
Abstract. The maintenance phase is an activity carried out by software engineers that requires an understanding of how computer programs work. However, most legacy systems lack associated documentation and have poorly designed artifacts. As a result, technical debt is generated, which causes a significant increase in maintenance costs. Reverse engineering is applied to help software engineers understand how the program was designed. Within reverse engineering, different approaches reduce the effort required for comprehension. In addition, static, dynamic, and hybrid analysis techniques are used to generate artifacts where the program's behavior can be easily visualized. This paper presents the results of a Systematic Mapping Study (SMS) conducted to identify reverse engineering approaches that help software engineers understand computer programs. Forty-eight studies were selected. Ten different approaches were identified in these studies, the main ones being Model-Driven Reverse Engineering (MDRE) and visualization graphics; static, dynamic, and hybrid analysis techniques were found; and fifteen artifacts were identified, with visualization graphs, class diagrams, and sequence diagrams standing out.

Keywords: Computer programs · Reverse engineering · Systematic mapping study · Approach · Software engineering
1 Introduction
Understanding computer programs is an analysis-phase activity that software engineers carry out before performing system maintenance tasks. This activity aims to acquire the necessary knowledge of the dynamic and static aspects of the program for its modification. It is common for software engineers to work on legacy systems. However, most of these systems lack associated documentation and design artifacts, or these do not have the required quality. There are several reasons for inadequate and outdated documentation. These reasons are associated with a lack of resources and time during software development activities and with limitations in developer skills. Some developers incidentally introduce code smells into the source code.
Code smells are symptoms in the code that indicate that things are not being done correctly, which can cause problems in the future. This leads to technical debt: design decisions that provide benefits in the short term but, in the long term, cause failures in the system's structure. As a result, there is a significant increase in maintenance costs [1]. Because of technical debt, software maintenance entails a more significant expenditure of resources. It is estimated that between 50% and 80% of the life cycle cost of a software product is spent on maintenance [2]. Additionally, it is essential to point out that the process of understanding legacy programs alone consumes between 47% and 62% of the resources allocated to maintenance [2]. Reverse engineering is the process of analyzing a system to identify its components and interrelationships so that representations of the system can be created at a higher level of abstraction [3]. This helps software engineers improve their comprehension of how the program was designed and how it works. In reverse engineering, different approaches reduce the effort required for comprehension. An approach is defined as directing interest toward an issue or problem from previous assumptions in order to solve it correctly [4]. These approaches use static, dynamic, and hybrid analysis techniques to generate artifacts where the program's behavior can be easily visualized. The need to modify a computer program leads to the obligation to comprehend it. According to the review of the current literature, no Systematic Mapping Study (SMS) was found that serves as a reference source for this problem. Therefore, the objective of this SMS is to analyze reverse engineering approaches whose purpose is to support software engineers in comprehending computer programs. This paper is organized as follows: Sect. 2 presents the background and related work; Sect. 3 describes the method followed; Sect. 4 presents the results; finally, Sect. 5 concludes and defines future work.
2 Background and Related Work
In the manual search for related works, two systematic reviews were identified. In 2018, Ghaleb et al. [5] investigated program comprehension through reverse-engineered sequence diagrams. They emphasized the creation of sequence diagrams as a practical method for recovering the behavior of software systems, mainly those with inadequate documentation. In addition, they cover various approaches focused on visualization and on dynamic, static, or hybrid techniques mentioned in the state of the art. A second systematic review is the one prepared in 2017 by Raibulet et al. [6]. In it, they present the approaches found to support model-based reverse engineering and expose the application of MDE (Model-Driven Engineering) to develop functional software understanding tools during reverse engineering activities. Given that the first review emphasizes only the generation of sequence diagrams as artifacts and static and dynamic techniques, and the second focuses on model-based reverse engineering, there is a need for an updated systematic mapping focusing on the diversity of approaches, the techniques that are applied, and the diversity of artifacts generated to aid comprehension.
3 Research Method
The Systematic Literature Review (SLR) method proposed by Kitchenham and Charters [7] was followed as a guide to carry out this research work. It consists of three stages: planning, conducting, and reporting the results. Within planning, the need for the review is identified. During conducting, the primary studies are selected, their quality is evaluated, and data extraction, follow-up, and synthesis are carried out. Finally, the results report establishes how the results are disseminated.

3.1 Planning
Within the planning phase, the research questions were established. From these, the keywords were defined to build the search string. In addition, the sources of information, the inclusion and exclusion criteria, the selection procedure for primary studies, the quality evaluation criteria, and the data to be extracted were specified.

Research Questions. The Systematic Mapping Study (SMS) aims to answer the research questions shown in Table 1.

Table 1. Research questions and motivation.
RQ1: Which are the current reverse engineering approaches used to comprehend computer programs? Motivation: identify approaches used in reverse engineering to reduce the effort needed to comprehend computer programs.
RQ2: Which are the techniques exposed by reverse engineering for the comprehension of computer programs? Motivation: it is important to identify the techniques used by reverse engineering to help understand the benefits of each.
RQ3: Which reverse engineering artifacts have been generated to help comprehension of computer programs? Motivation: this question helps identify the artifacts generated by the approaches and determine the most used ones.
Keywords. Once the research questions were defined, the keywords were identified, as shown in Table 2.

Search String. The keywords allowed the creation of the search string used in this paper: (“reverse engineering” AND (“program comprehension” OR “software comprehension”) AND (“approach” OR “tool” OR “technique”)). An illustrative sketch of this string applied as a screening filter is given after Table 4.

Information Sources. The sources used were the following virtual libraries: ACM Digital Library, IEEE Xplore, SpringerLink, and ScienceDirect. These were chosen because they are repositories of articles in computing and related disciplines.

Primary Study Selection Criteria. Table 3 and Table 4 show the inclusion and exclusion criteria used to select the primary studies, respectively.
Table 2. Keywords to build the search string.
Concept: Reverse engineering; synonym: software reverse engineering.
Concept: Program comprehension; synonym: software comprehension.
Concepts without synonyms: Technique, Approach, Tool.
Table 3. Inclusion criteria.
IC1: The study must be written in English.
IC2: The study is a journal article, conference article, or workshop article.
IC3: There is evidence in the title or abstract that at least one of the research questions is answered.
IC4: The complete work can be accessed.
IC5: The full text of the study answers at least one research question.
Table 4. Exclusion criteria.
EC1: The study is a shorter or repeated version of a study found in a different source.
EC2: The study covers the comprehension of computer programs from a different perspective than software engineering.
EC3: The study covers reverse engineering approaches with a focus other than the comprehension of computer programs.
EC4: The study was published before 2015 or after 2020.
EC5: The study addresses software maintenance without emphasizing the comprehension process.
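For illustration only, the following minimal sketch (not part of the review protocol) shows how the search string defined in the previous subsection can be applied as a coarse screening filter over a title or abstract. The function name and the sample text are hypothetical, and real digital libraries apply their own query syntax.

```python
# Illustrative sketch: the boolean search string applied as a text filter.
# A record matches if it mentions reverse engineering, program or software
# comprehension, and at least one of approach / tool / technique.

def matches_search_string(text: str) -> bool:
    t = text.lower()
    return (
        "reverse engineering" in t
        and ("program comprehension" in t or "software comprehension" in t)
        and any(term in t for term in ("approach", "tool", "technique"))
    )

sample = ("A reverse engineering approach that generates sequence diagrams "
          "to support program comprehension of legacy systems.")
print(matches_search_string(sample))  # True
```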
Primary Study Selection Procedure. The phases followed for the primary study selection procedure are shown in Fig. 1.

Quality Assessment. After conducting the study selection procedure, the quality of the studies was assessed. Although no study was ruled out at this stage, the quality assessment process added accuracy to the information collected from each study: 1 point is given if the study meets a quality criterion, 0.5 if it partially meets it, and 0 if it does not. Table 5 shows the quality criteria followed in this evaluation.
Fig. 1. Primary studies selection process (four phases, each applying a subset of the inclusion and exclusion criteria).

Table 5. Quality evaluation criteria.
1. Does the study clearly show its objectives?
2. Are there clear links between the objectives, the information displayed, and the conclusions?
3. Is the study consistent?
4. Does the study answer the research questions?
5. Is the technique, approach, artifact, or tool for program comprehension easily identified?
6. Is the technique, approach, artifact, or tool for program comprehension widely discussed?
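As an illustration of the scoring rule described above (1 point if a criterion is met, 0.5 if partially met, 0 if not), the following minimal sketch computes the quality score of a single study; the example scores are hypothetical.

```python
# Illustrative sketch: scoring one study against the six criteria of Table 5
# (1 = met, 0.5 = partially met, 0 = not met).

def quality_score(scores):
    assert len(scores) == 6 and all(s in (0, 0.5, 1) for s in scores)
    return sum(scores)

# Hypothetical assessment of a single primary study:
print(quality_score([1, 1, 0.5, 1, 1, 0.5]))  # 5.0 out of a maximum of 6
```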
Data Extraction. The template shown in Table 6 lists the data fields extracted from each primary study.

Table 6. Data extraction template.
Publication details: Title, Author, Year, Source, Publication type, Reference, Abstract.
Context: Approaches, Techniques, Artifacts, Proposed tools, Tools mentioned.
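For illustration, a minimal sketch of how the extraction template in Table 6 could be represented as a structured record during data collection; the class and the field values shown are hypothetical and not part of the original protocol.

```python
# Illustrative sketch: one data extraction record following Table 6.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ExtractionRecord:
    # Publication details
    title: str
    author: str
    year: int
    source: str
    publication_type: str
    reference: str
    abstract: str
    # Context
    approaches: List[str] = field(default_factory=list)
    techniques: List[str] = field(default_factory=list)
    artifacts: List[str] = field(default_factory=list)
    proposed_tools: List[str] = field(default_factory=list)
    tools_mentioned: List[str] = field(default_factory=list)

record = ExtractionRecord(
    title="Hypothetical study on sequence diagram recovery",
    author="Doe, J.", year=2018, source="IEEE Xplore",
    publication_type="Conference article", reference="S01", abstract="...",
    approaches=["Generation of sequence diagrams (MDRE)"],
    techniques=["Dynamic analysis"],
    artifacts=["Diagram of sequence"],
)
```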
3.2 Execution
The conducting process began by entering the search string into the databases and obtaining the initial results, after which the selection phases were applied in the four selected databases; the number of articles selected in each phase is shown in Table 7.

Table 7. Primary studies selection phases.
Database | First results | Phase 1 | Phase 2 | Phase 3 | Phase 4
ACM Digital Library | 415 | 108 | 91 | 9 | 9
IEEE Xplore | 620 | 64 | 58 | 22 | 22
SpringerLink | 1,698 | 512 | 307 | 21 | 13
ScienceDirect | 278 | 82 | 61 | 9 | 4
Total | 3,011 | 766 | 517 | 61 | 48
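As a quick consistency check of Table 7 (illustrative only, not part of the original study), the following sketch verifies that the per-database counts add up to the reported totals in every phase.

```python
# Illustrative check: per-database counts in Table 7 sum to the totals row.
table7 = {
    "ACM Digital Library": [415, 108, 91, 9, 9],
    "IEEE Xplore":         [620, 64, 58, 22, 22],
    "SpringerLink":        [1698, 512, 307, 21, 13],
    "ScienceDirect":       [278, 82, 61, 9, 4],
}
totals = [sum(column) for column in zip(*table7.values())]
print(totals)  # [3011, 766, 517, 61, 48]
```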
4 Results
The distribution of the selected studies by year of publication was analyzed.

Fig. 2. Distribution of studies per year (2015: 8, 2016: 9, 2017: 5, 2018: 11, 2019: 5, 2020: 5, 2021: 5).
In Fig. 2, it can be observed that the year with the highest number of studies was 2018, with 11 studies. It can be noted that during the first two years the number of studies increased, but in 2017 it decreased by almost 50%. Furthermore, during the last three years the number of studies has decreased compared to 2018, but it has remained constant at five.

RQ1: Which Are the Current Reverse Engineering Approaches Used to Comprehend Computer Programs?
In this paper, the term approach considers how a problem is addressed from previous assumptions [4]. The main problem is the comprehension of computer programs. By analyzing the selected studies, it was found that ten approaches apply to understanding computer programs. It is important to note that a study may apply one or more reverse engineering approaches, as these are not mutually exclusive.
The Model-Driven Reverse Engineering approach predominates; this approach was classified by the type of artifact it generates to obtain a more detailed visualization of the information. It has the advantage of generating a comprehensive view based on structural and behavioral models of legacy systems. The second most used approach is visualization graphics, which seeks to represent information visually. It provides a more accessible and easier way to analyze data by using symbolic elements such as lines, circles, and squares. The approaches can be used in general contexts because they all help to understand the program, the most flexible being visualization graphics. Depending on the approach, however, a deeper analysis will be obtained in a certain area. For example, if an understanding of program behavior is needed, it is recommended to use the use case, activity, or state representation diagram approaches. If, in addition to the behavior, it is necessary to know the interaction, sequence diagrams are recommended. On the other hand, if it is required to know the structure of the system, class and entity-relationship diagrams are recommended. The approaches and their frequency can be visualized in Fig. 3 and Table 8.
Fig. 3. Frequency of reverse engineering approaches to comprehension (quantity of mentions per approach; the per-study detail is given in Table 8).
RQ2: Which Are the Techniques Exposed by Reverse Engineering for the Comprehension of Computer Programs?
A software technique is a procedure for designing, developing, documenting, or maintaining computer programs [8]. The most frequently identified technique is static analysis, which has the advantage of examining the computer program from its source code, so the program does not need to be executed. The second most identified technique is dynamic analysis; it examines the program while it is running and has the advantage of supporting code instrumentation. When a study uses both dynamic and static analysis, it is considered a hybrid analysis. Studies that apply hybrid analysis mention that their artifacts are more robust since they come from analyzing information from both the source code and the program's execution.
Table 8. Approaches applied by primary studies.
Generation of sequence diagrams (MDRE): [11, 13, 15, 17, 20, 21, 26, 33, 38, 40, 46]
Generation of class diagrams (MDRE): [10, 16, 18, 22, 28, 33, 35, 36, 39, 43, 47, 53]
Elaboration of visualization graphics: [10, 35, 38, 39, 42, 45, 48, 52–55]
Generation of state representation models (MDRE): [15, 16, 23, 48, 51]
Design pattern detection: [14, 24, 27, 32, 45]
Generation of domain models (MDRE): [34, 47, 50]
Documented code generation: [29, 41, 56]
Presentation Layer Analysis: [14, 30]
Extracting class dependencies: [12, 49]
Generation of use case diagrams (MDRE): [34, 39]
Model mining: [25, 31]
Generation of UCM diagrams: [46]
Preparation of an executive summary: [37]
Extraction of dynamic data structures: [19]
Flowchart generation (MDRE): [36]
Generation of Entity-Relationship models (MDRE): [34]
It should be noted that both static and hybrid analysis can only be carried out when the program code is available. This is not necessary in dynamic analysis, except when code instrumentation is required. Figure 4 and Table 9 show the techniques mentioned, the number of studies, and which studies apply each.
Fig. 4. Applied reverse engineering techniques for comprehension (static analysis: 24 studies; dynamic analysis: 17; hybrid analysis: 7).
RQ3: Which Reverse Engineering Artifacts Have Been Generated to Help Comprehension of Computer Programs?
Table 9. Techniques applied by primary studies.
Static analysis: [10, 12, 14, 18, 20, 22, 23, 25, 27, 31, 34–37, 40, 44, 47, 49, 51–54, 56]
Dynamic analysis: [13, 16, 17, 19, 21, 26, 29, 30, 33, 38, 39, 41–43, 45, 50, 55]
Hybrid analysis: [11, 15, 24, 28, 32, 46, 48]
A software artifact is an element of software development processes. It collects all the information necessary to specify, develop, and maintain a software-based system [9]. Visualization graphics, the most generated artifact, have the advantage of being easy to read and understand. In addition, this category offers a variety of artifacts, such as the following: call graph, atomic section graph, force-directed graph, state flow graph, dependency graph, Abstract Semantic Graph (ASG), thread chart, graph model, inter-component transition graph, graphical event traces, treemap views, and node-link diagram. The second most generated artifact is the class diagram, which has the advantage of representing the program's structure by showing its classes, interfaces, characteristics, and relationships. Sequence diagrams, which describe the flow of a sequence of messages with their specifications, are also artifacts found in the studies analyzed. Two types of sequence diagrams are generated: the first one is obtained from static analysis, and the second one from dynamic analysis. Static analysis has broad coverage, but the final diagram increases the horizontal size and complexity of the diagram. Dynamic analysis includes only methods that are executed; however, it can skip important methods, and the diagram will grow vertically if the execution time is too long. The idea of "optimized sequence diagrams" applying hybrid analysis has been proposed to preserve the advantages of both techniques. Figure 5 and Table 10 show the diversity of artifacts and the number of studies that mention each. It should be noted that a study may generate one or more types of artifacts.
Fig. 5. Reverse engineering artifacts to aid comprehension (quantity of mentions per artifact; the per-study detail is given in Table 10).
Table 10. Artifacts generated by primary studies.
Visualization graph: [10, 12, 14, 14, 18, 19, 27, 30, 35, 35, 37–39, 42, 45, 49, 52–55]
Class diagram: [10, 16, 18, 22, 28, 31–33, 35, 36, 39, 43, 47, 53]
Diagram of sequence: [10, 11, 13, 15, 17, 20, 21, 26, 32, 33, 38, 40, 46]
State Representation: [15, 16, 23, 25, 30, 31, 48, 51]
Domain model: [27, 34, 47, 50]
XML: [12, 15, 30]
Documented code: [29, 41, 56]
Pattern Instances: [24, 44]
Activity diagrams: [30, 33]
Petri nets: [26, 30]
Use case diagram: [34, 39]
UCM Diagram: [46]
JSON: [10]
Entity relationship model: [34]
Flowchart: [36]
A wide variety of tools are used in reverse engineering to generate artifacts and facilitate comprehension. Tools for model generation stand out, but tools are also used to generate visualization graphs, detect pattern instances, generate documentation, determine class dependencies, identify dynamic structures, generate UCM diagrams, and search for expressions. Figure 6 shows the types of tools used or created by the selected studies.
Fig. 6. Proposed reverse engineering tools, by type (model generation: 14; generation of visualization graphs: 13; generation of sequence diagrams: 5; detection of design pattern instances: 3; automated documentation generation: 3; determining class dependencies: 1; identification of dynamic data structures: 1; generation of UCM diagrams (Use Case Map): 1; expression search: 1).
Tools that are only mentioned, but not proposed, were also found in the primary studies. Tools for generating UML diagrams stand out, but tools for decompiling code, model-based analysis, static
analysis, identification of classes, dynamic structures, and generation of visualization graphics are also mentioned. These tools are mentioned in Fig. 7.
Fig. 7. Reverse engineering tools mentioned in the primary studies, by type (generation of UML diagrams: 16; code decompiler: 5; model-based analysis: 5; static analysis: 5; class identification: 4; identification of dynamic data structures: 3; generation of visualization graphs: 3).
5 Conclusions and Future Work
A Systematic Mapping Study (SMS) was carried out to answer three research questions, applying the method proposed in [7] to four sources of information, where 48 articles selected as primary studies reported reverse engineering approaches for the comprehension of computer programs through techniques and artifacts. After analyzing the results of this research, ten reverse engineering approaches currently used to understand computer programs can be noted. The main ones are Model-Driven Reverse Engineering (MDRE) and visualization graphics. A variety of 15 artifacts were identified, with visualization graphs, class diagrams, and sequence diagrams being the most used. These approaches apply only one of the three techniques (static, dynamic, or hybrid analysis). In most cases, artifacts are obtained using a tool. This work fulfills the function of collecting the diversity of reverse engineering approaches for comprehending computer programs and making known the techniques used and the artifacts generated. In this way, a reference document is generated that software engineers can consult when deciding how to understand a system. As future work, it is proposed to explore further reverse engineering approaches through a Multivocal Literature Review (MLR) to obtain information related to computer program comprehension practices reported in gray sources such as white papers, development blogs, and corporate websites, among others.
References 1. Marinescu, R.: Assessing technical debt by identifying design flaws in software systems. IBM J. Res. Dev. 56(5), 9:1-9:13 (2012). https://doi.org/10.1147/JRD.2012.2204512 2. Nelson, M.L.: A survey of reverse engineering and program comprehension. arXiv:cs/050 3068 (2005) 3. Canfora, G., Di Penta, M., Cerulo, L.: Achievements and challenges in software reverse engineering. Commun. ACM 54(4), 142–151 (2011). https://doi.org/10.1145/1924421.192 4451 4. Real Academia Española:. Diccionario de la lengua española [Dictionary of the Spanish Language], Madrid, Spain, 22nd edn. (2001) 5. Ghaleb, T.A., Alturki, M.A., Aljasser, K.: Program comprehension through reverseengineered sequence diagrams: a systematic review. J. Softw. Evol. Process 30(11), e1965 (2018). https://doi.org/10.1002/smr.1965 6. Raibulet, C., Arcelli Fontana, F., Zanoni, M.: Model-driven reverse engineering approaches: a systematic literature review. IEEE Access 5, 14516–14542 (2017). https://doi.org/10.1109/ ACCESS.2017.2733518 7. Kitchenham, B., Charters, S.: Guidelines for performing systematic literature reviews in software engineering. University of Durham, Durrham, UK (2007) 8. Gallegos, F.: Software Tools and Techniques, 20 (1985). https://www.gao.gov/assets/128750. pdf 9. Silva, M., Oliveira, T.: Towards detailed software artifact specification with SPEMArti. In: Proceedings of the 2011 International Conference on Software and Systems Process, pp. 213– 217 (2011). https://doi.org/10.1145/1987875.1987912 10. Cloutier, J., Kpodjedo, S., El Boussaidi, G.: WAVI: a reverse engineering tool for web applications. In: 2016 IEEE 24th International Conference on Program Comprehension (ICPC), pp. 1–3 (2016). https://doi.org/10.1109/ICPC.2016.7503744 11. Srinivasan, M., Yang, J., Lee, Y.: Case studies of optimized sequence diagram for program comprehension. In: 2016 IEEE 24th International Conference on Program Comprehension (ICPC), pp. 1–4 (2016). https://doi.org/10.1109/ICPC.2016.7503734 12. Nanthaamornphong, A., Leatongkam, A., Kitpanich, T., Thongnuan, P.: Bytecode-based class dependency extraction tool: Bytecode-CDET. In: 2015 7th International Conference on Information Technology and Electrical Engineering (ICITEE), pp. 6–11 (2015). https://doi.org/ 10.1109/ICITEED.2015.7408903 13. Lyu, K., Noda, K., Kobayashi, T.: SDExplorer: a generic toolkit for smoothly exploring massive-scale sequence diagram. In: 2018 IEEE/ACM 26th International Conference on Program Comprehension (ICPC), pp. 380–384 (2018) 14. Cosma, D.C., Mihancea, P.F.: Understanding web applications using component based visual patterns. In: 2015 IEEE 23rd International Conference on Program Comprehension, pp. 281– 284 (2015). https://doi.org/10.1109/ICPC.2015.39 15. Ghaleb, T.A.: The role of open source software in program analysis for reverse engineering. In: 2016 2nd International Conference on Open Source Software Computing (OSSCOM), pp. 1–6 (2016). https://doi.org/10.1109/OSSCOM.2016.7863684 16. Garzón, M.A., Aljamaan, H., Lethbridge, T.C.: Umple: a framework for model driven development of object-oriented systems. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER), pp. 494–498 (2015). https://doi.org/10. 1109/SANER.2015.7081863 17. Kaixie, L., Noda, K., Kobayashi, T.: Toward interaction-based evaluation of visualization approaches to comprehending program behavior. In: 2019 IEEE Workshop on Mining and Analyzing Interaction Histories (MAINT), pp. 19–23 (2019).https://doi.org/10.1109/MAINT. 
2019.8666933
138
Y. A. Luna-Herrera et al.
18. Varoy, E., Burrows, J., Sun, J., Manoharan, S.: From code to design: a reverse engineering approach. In: 2016 21st International Conference on Engineering of Complex Computer Systems (ICECCS), pp. 181–186 (2016). https://doi.org/10.1109/ICECCS.2016.030 19. Rupprecht, T., Chen, X., White, D.H., Boockmann, J.H., Lüttgen, G., Bos, H.: DSIbin: identifying dynamic data structures in C/C++ binaries. In: 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 331–341 (2017). https://doi.org/ 10.1109/ASE.2017.8115646 20. Ghaleb, T.A., Aljasser, K., Alturki, M.A.: Enhanced visualization of method invocations by extending reverse-engineered sequence diagrams. In: 2020 Working Conference on Software Visualization (VISSOFT), pp. 49–60 (2020) 21. Noda, K., Kobayashi, T., Toda, T., Atsumi, N.: Identifying core objects for trace summarization using reference relations and access analysis. In: 2017 IEEE 41st Annual Computer Software and Applications Conference (COMPSAC), vol. 1, pp. 13–22 (2017). https://doi.org/10.1109/ COMPSAC.2017.142 22. Jolak, R., Le, K.-D., Sener, K.B., Chaudron, M.R.V.: OctoBubbles: a multi-view interactive environment for concurrent visualization and synchronization of UML models and code. In: 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 482–486 (2018). https://doi.org/10.1109/SANER.2018.8330244 23. Said, W., Quante, J., Koschke, R.: Reflexion models for state machine extraction and verification. In: 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 149–159 (2018). https://doi.org/10.1109/ICSME.2018.00025 24. Yang, S., Manzer, A., Tzerpos, V.: Measuring the quality of design pattern detection results. In: 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER), pp. 53–62 (2015). https://doi.org/10.1109/SANER.2015.7081815 25. Said, W.: Interactive model mining from embedded legacy software. In: 2018 IEEE/ACM 40th International Conference on Software Engineering: Companion (ICSE-Companion), pp. 484–487 (2018) 26. Baidada, C., Bouziane, E.M., Jakimi, A.: A new approach for recovering high-level sequence diagrams from object-oriented applications using petri nets. Procedia Comput. Sci. 148, 323– 332 (2019). https://doi.org/10.1016/j.procs.2019.01.040 27. Ujhelyi, Z., et al.: Performance comparison of query-based techniques for anti-pattern detection. Inf. Softw. Technol. 65, 147–165 (2015). https://doi.org/10.1016/j.infsof.2015. 01.003 28. Kakarontzas, G., Pardalidou, C.: Improving component coupling information with dynamic profiling. In: Proceedings of the 22nd Pan-Hellenic Conference on Informatics, pp. 156–161 (2018). https://doi.org/10.1145/3291533.3291576 29. Liu, Z., Wang, S.: How far we have come: testing decompilation correctness of C decompilers. In: Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 475–487 (2020). https://doi.org/10.1145/3395363.3397370 30. Martins, L.C.G., Garcia, R.E., Marçal, I.: Using Information Visualization to comprehend user interface layer: an application to web-based systems. In: Proceedings of the XVI Brazilian Symposium on Human Factors in Computing Systems, pp. 1–10 (2017). https://doi.org/10. 1145/3160504.3160558 31. Said, W., Quante, J., Koschke, R.: Do extracted state machine models help to understand embedded software? In: Proceedings of the 27th International Conference on Program Comprehension, pp. 191–196 (2019). 
https://doi.org/10.1109/ICPC.2019.00038 32. Lucia, A.D., Deufemia, V., Gravino, C., Risi, M.: Detecting the behavior of design patterns through model checking and dynamic analysis. ACM Trans. Softw. Eng. Methodol. 26(4), 13:1–13:41 (2018). https://doi.org/10.1145/3176643
33. Haendler, T., Sobernig, S., Strembeck, M.: Deriving tailored UML interaction models from scenario-based runtime tests. In: Lorenz, P., Cardoso, J., Maciaszek, L.A., van Sinderen, M. (eds.) ICSOFT 2015. CCIS, vol. 586, pp. 326–348. Springer, Cham (2016). https://doi.org/ 10.1007/978-3-319-30142-6_18 34. Reis, A., da Silva, A.R.: Evaluation of XIS-reverse, a model-driven reverse engineering approach for legacy information systems. In: Pires, L.F., Hammoudi, S., Selic, B. (eds.) MODELSWARD 2017. CCIS, vol. 880, pp. 23–46. Springer, Cham (2018). https://doi.org/10.1007/ 978-3-319-94764-8_2 35. Anquetil, N., et al.: Modular Moose: a new generation of software reverse engineering platform. In: Ben Sassi, S., Ducasse, S., Mili, H. (eds.) ICSR 2020. LNCS, vol. 12541, pp. 119–134. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-64694-3_8 36. Yadav, R., Patel, R., Kothari, A.: Critical evaluation of reverse engineering tool Imagix 4D! Springerplus 5(1), 1–12 (2016). https://doi.org/10.1186/s40064-016-3732-x 37. Sora, ¸ I.: Helping program comprehension of large software systems by identifying their most important classes. In: Maciaszek, L.A., Filipe, J. (eds.) ENASE 2015. CCIS, vol. 599, pp. 122–140. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-30243-0_7 38. Hammad, M., Al-Hawawreh, M.: Generating sequence diagram and call graph using source code instrumentation. In: Latifi, S. (ed.) Information Technology – New Generations. AISC, vol. 558, pp. 641–645. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-549781_81 39. Dugerdil, P., Sako, R.: Dynamic analysis techniques to reverse engineer mobile applications. In: Lorenz, P., Cardoso, J., Maciaszek, L.A., van Sinderen, M. (eds.) ICSOFT 2015. CCIS, vol. 586, pp. 250–268. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-30142-6_14 40. Alvin, C., Peterson, B., Mukhopadhyay, S.: Static generation of UML sequence diagrams. Int. J. Softw. Tools Technol. Transf. 23(1), 31–53 (2019). https://doi.org/10.1007/s10009019-00545-z 41. Sulír, M.: Integrating runtime values with source code to facilitate program comprehension. In: 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 743–748 (2018). https://doi.org/10.1109/ICSME.2018.00093 42. Majumdar, S., Chatterjee, N., Sahoo, S.R., Das, P.P.: D-Cube: tool for dynamic design discovery from multi-threaded applications using PIN. In: 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS), pp. 25–32 (2016). https://doi.org/10. 1109/QRS.2016.13 43. Abualese, H., Sumari, P., Al-Rousan, T., Al-Mousa, M.R.: Utility classes detection metrics for execution trace analysis. In: 2017 8th International Conference on Information Technology (ICIT), pp. 469–474 (2017). https://doi.org/10.1109/ICITECH.2017.8080044 44. Khan, M., Rasool, G.: Recovery of mobile game design patterns. In: 2020 21st International Arab Conference on Information Technology (ACIT), pp. 1–7 (2020). https://doi.org/10.1109/ ACIT50332.2020.9299966 45. Lessa, I.M., de F. Carneiro, G., Monteiro, M.P., Brito e Abreu, F.: On the Use of a multiple view interactive environment for MATLAB and Octave program comprehension. In: Gervasi, O., et al. (eds.) ICCSA 2015. LNCS, vol. 9158, pp. 640–654. Springer, Cham (2015). https:// doi.org/10.1007/978-3-319-21410-8_49 46. Braun, E., Amyot, D., Lethbridge, T.C.: Generating software documentation in use case maps from filtered execution traces. In: Fischer, J., Scheidgen, M., Schieferdecker, I., Reed, R. (eds.) SDL 2015. 
LNCS, vol. 9369, pp. 177–192. Springer, Cham (2015). https://doi.org/10.1007/ 978-3-319-24912-4_13 47. Mendivelso, L.F., Garcés, K., Casallas, R.: Metric-centered and technology-independent architectural views for software comprehension. J. Softw. Eng. Res. Dev. 6(1), 1–23 (2018). https://doi.org/10.1186/s40411-018-0060-6
48. Duarte, L.M., Kramer, J., Uchitel, S.: Using contexts to extract models from code. Softw. Syst. Model. 16(2), 523–557 (2015). https://doi.org/10.1007/s10270-015-0466-0 49. Dias, M., Orellana, D., Vidal, S., Merino, L., Bergel, A.: Evaluating a visual approach for understanding JavaScript source code. In: Proceedings of the 28th International Conference on Program Comprehension, pp. 128–138 (2020). https://doi.org/10.1145/3387904.3389275 50. Harth, E., Dugerdil, P.: Document retrieval metrics for program understanding. In: Proceedings of the 7th Forum for Information Retrieval Evaluation, 8–15 (2015). https://doi.org/10. 1145/2838706.2838710 51. Yamamoto, R., Yoshida, N., Takada, H.: Towards static recovery of micro state transitions from legacy embedded code. In: Proceedings of the 1st ACM SIGSOFT International Workshop on Automated (2018) 52. Satish, C.J., Mahendran, A.: The effect of 3D visualization on mainframe application maintenance: a controlled experiment. J. King Saud Univ. Comput. Inf. Sci. 31(3), 403–414 (2019). https://doi.org/10.1016/j.jksuci.2017.03.003 53. Hoff, A., Nieke, M., Seidl, C.: Towards immersive software archaeology: regaining legacy systems’ design knowledge via interactive exploration in virtual reality. In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 1455–1458 (2021). https://doi.org/10.1145/ 3468264.3473128 54. Alanazi, R., Gharibi, G., Lee, Y.: Facilitating program comprehension with call graph multilevel hierarchical abstractions. J. Syst. Softw. 176, 110945 (2021). https://doi.org/10.1016/j. jss.2021.110945 55. Dashuber, V., Philippsen, M.: Trace visualization within the software city metaphor: a controlled experiment on program comprehension. In: 2021 Working Conference on Software Visualization (VISSOFT), pp. 55–64 (2021). https://doi.org/10.1109/VISSOFT52517.2021. 00015 56. Aghajani, E., Bavota, G., Linares-Vásquez, M., Lanza, M.: Automated documentation of Android apps. IEEE Trans. Softw. Eng. 47(1), 204–220 (2021). https://doi.org/10.1109/TSE. 2018.2890652
Data Science Based Methodology: Design Process of a Correlation Model Between EEG Signals and Brain Regions Mapping in Anxiety Julia Elizabeth Calderón-Reyes1, Humberto Muñoz-Bautista1, Francisco Javier Alvarez-Rodriguez1, María Lorena Barba-Gonzalez2, and Héctor Cardona-Reyes3(B)
1 Universidad Autónoma de Aguascalientes, Aguascalientes, Mexico
[email protected], [email protected], [email protected] 2 Universidad de Guadalajara, Jalisco, Mexico [email protected] 3 CONACYT, CIMAT, Zacatecas, Mexico [email protected]
Abstract. This work addresses the difficulty of obtaining data on brain activity at the moment of a panic attack, together with the lack of methodologies and models oriented to mental disorders, particularly anxiety, that could support the development of solutions based on the user's anxiety level and overall condition and thereby optimize the related processes. An approach encompassing data science and software engineering is proposed to contribute to the development of data-driven solutions and the partial obtention of results, ready to be deployed as MVPs under the Lean UX methodology, and to cross-validate the data transparency of the clinical history of people with anxiety using an EEG device as the instrument. The proposal prioritizes the identification of guidelines that allow the analysis of the monitored data and the determination of correlating factors in people with anxiety and a history of panic attacks, paving the way for the health specialist to optimize processes by identifying the common triggers. Keywords: Data science · Lean UX · EEG signals · Anxiety
1 Introduction
Software engineering has become a core area for the development of frameworks and the integration of services based on the input provided by the user [16], thus establishing a bridge between volumes of data and services through guidelines present in methodologies, models, and processes. Although the blueprints for the data flow and the feedback obtained to improve the quality of the service, among other factors, are key for the development of products and their deployment, the central role of human interaction and its indicators of User Experience (UX) and User Interface (UI) mean that the data flow is not only to be evaluated in terms of accessibility and availability, but is also to be considered as a whole starting point to acquire insights in the earlier stages of data acquisition and treatment, hence applying data science principles for data treatment and its processes.
The methodology in this article introduces the design process for the proposed correlation model between EEG signals and brain regions mapping in the study of anxiety and panic attacks, via the integration of core study areas of computer science, namely software engineering and data science, as well as their branches of human-computer interaction, data treatment, and data analysis, encompassed in the Lean UX methodology approach [11]. The aim is to prioritize the basis of Brain-Computer Interfaces (BCI) and the obtention of partial results within each Minimum Viable Product (MVP) for the analysis and treatment of data obtained through case studies of anxiety and the presence of panic attacks in given moments of a person's life [13], therefore providing a guideline for monitoring anxiety on a common basis, covering the identification of opportunity areas and the optimization of current solutions based on the use of an EEG device.
2 Literary Review
Guidelines of all kinds are set in the software engineering area to identify the main actors of a given problem, allowing each interaction to be traced back and the workflow for each actor to be determined given the requirements to cover and the target solution. This makes it feasible to adhere to a given methodology that covers the needs of the user, or to a tailored approach that creates a replicable model of the situation observed, so it can be studied further and the products obtained within each iteration can be replicated if necessary. However, not all methodologies are an optimal solution for a problem even if they do provide the target results, so the approach of the specialty area or areas involved in the solution must be evaluated thoughtfully to avoid an outdated procedure or an unfit philosophy that could compromise the time invested in the project or task, the resources to be used, the instruments to be applied, and the human and financial capital.
The approach proposed in this article encompasses the Lean UX methodology and the Data Science methodology to unify the analytic approach to the problem with the input provided by the user, the level of their interactions, and the obtention of partial results through structuring and segmentation of the raw data, culminating in the creation of models and the assignation of processes. It thereby covers the theoretical basis of a user-centered software product and the practicality of computational methods and techniques.
2.1 Methodologies: Software Engineering and Data Science
Lean UX. Methodologies play the leading role in the design, development, and implementation of software, along with the degree of acceptance by users and stakeholders; however, each methodology is structured according to the knowledge area where it is going to be implemented and the requirements to cover, which has led to advances regarding the implementation of Intelligent Agents and their regulation in the software engineering area and data analysis [2, 10]. Thus, although Lean and agile methodologies are commonly applied, they leave a pathway for the combination of traditional iterative models and user-oriented design as pursued in the Lean UX methodology, which provides quality products without compromising cost and time requirements [1], accomplishing the generation of minimum viable products per iteration that can be further improved with the feedback it collects. The range of its applications covers not only the software industry, but also areas in which the improvement of a process relies on optimizing cost and time requirements to improve the user experience, considering user behavior [14], the implications of the scalability of the software product to cover user demands, the role of each team member [8], the key tasks to accomplish per stage, and the approach to be executed upon experimental design, case study, or trial towards the targeted goal and the transitions per cycle or iteration.
IBM Data Science Methodology. Data science is an area that derives from computer science and has grown remarkably given the constant increase in data volume, which is reflected in its fields of application; hence the existence of several methodologies with a data science approach and data-driven solutions for target objectives or applications regarding the interconnection of devices and the origin of the data, specifications that can pose challenges for obtaining analytics and applying given techniques [15]. Mathematically speaking, one of the areas with the highest demand for data science methodologies and their particular implementation is the field of statistics [4], given the application of pre-existing knowledge and the obtention of new insights to create indicators in every area while applying basic to complex techniques. In particular, this publication follows the IBM Data Science Methodology [17] due to its analytic approach to the problem, which prioritizes the data flow from its conception in the earlier stages to its analysis and culmination in the feedback obtained to optimize the solution achieved after each iteration. Given the connection between the user interaction and the data flow in the case study of the proposed methodology, the data science approach makes the Lean UX methodology a common ground for the evaluation, implementation, and feedback to be obtained in a data-driven cycle for the solution of problems.
3 Related Work
Regardless of the methodological approach or the proposal of models to convey solutions oriented towards the mental health area, the views of the contributions can vary regarding the depiction of anxiety and the technological basis to cover, given the key indicators and the approach to be taken. While anxiety remains the key indicator to be studied, it is worth noting that the cognitive functions of the human brain and the capability to regulate emotions can provide important data for the identification of emotional states and their effect on behavior, which is why the related works also cover the paradigm of other kinds of anxiety disorders, the techniques deployed that provide guidance regarding the algorithms to consider, and other technologies and related areas that could be considered in future work. To elaborate on the related works, Table 1 shows the technologies applied to the constructs of anxiety in conjunction with the cognitive and emotional processes.
Table 1. Panic attacks: related methodologies and applications.

Reference | Technology | Construct | Description | Approach
Khessiba, S. et al. [12] | Deep Learning | Inference | Brain activity study via EEG signals and Deep Learning architectures | Neural Computing
Chen, C. et al. [7] | EEG Signal | Anxiety | Evaluation of neurofeedback for anxiety relief | Neuropsychiatry Applications
Balan, O. et al. [5] | Machine Learning, Deep Learning | Fear, Acrophobia | Research of techniques to automatize fear level detection | Artificial Intelligence and Virtual Reality
Suhaimi, N. et al. [18] | EEG Signals | Cognition | Identification of human emotional states using EEG signals | Computational Intelligence and Neuroscience
Beaurenaut, M. et al. [6] | STAI Questionnaire | Anxiety | Threat of scream paradigm to study physiological and subjective anxiety | Statistic Correlation
Francese, R. et al. [9] | Virtual Reality | Emotions | User centered methodology for an emotion detection system | User centered for detection
Although the literature review analysis shown above highlights the technological and methodological applications for anxiety primarily in regard to artificial intelligence techniques, neuropsychology, and neuroscience applications, it is important to consider the degree of impact of the methodological proposal in its initial stages and its capability for exponential growth, either in the same areas of application established or in related areas. The current methodological proposal and its core model prioritize the lean approach in conjunction with the data science view of the data flow, which altogether provides a guideline for health specialists based on the insights obtained, and which can lead to the optimization of processes and the identification of correlating factors.
4 Methodological Proposal
An integral focus is key to achieving target results that can be adapted from the sample population to its extrapolation regarding a specific stage, or to the implementation of a change in a process or task to be iterated. This can be achieved by combining the structure provided by a Brain-Computer Interface (BCI), its actors, and the devices for input and output, while wrapping the processes of user segmentation of the Lean UX methodology and the data collection and treatment necessary for a case study. In particular, the Lean UX guidelines were applied with regard to the Minimum Viable Products, covering the formulation of a hypothesis based on an expected outcome, the design of the proposal, the creation of the MVPs, and further research along with the inherent learning obtained from the iteration [11].
Shown below, Fig. 1 outlines the primary stages of the core methodology in a linear sequence: segmentation, design, treatment, launching, and evolution, with an iteration between the segmentation and evaluation stages to reevaluate each of the subsections. The segmentation covers the hypothesis, the proposal is then formulated in the design, followed by the creation of the MVPs given their proper treatment, and the research and learning are carried out in the launching and evolution stages.
Fig. 1. Methodology design inspired by Lean UX [11].
4.1 Structural Design of a Model
To sustain the methodological proposal, its development, and its design, the primary stages of the methodology were divided according to their contributions to the subsequent design of a model, thus prioritizing the segmentation stage and the design stage.
Segmentation. As the opening stage of the methodology, the segmentation process allowed confirming the analysis of requirements of the sample population, identifying the target users, and matching the established requirements with the needs to be covered while actively solving the problem within each advance through the stages. The initial segmentation therefore contributed to the formulation of objectives and timeframes for their accomplishment and the approval of the minimum viable products (MVPs), thus creating samples and appointing an approach.
Design. Taking into consideration the segmentation of the raw data provided by the sampling and approach substages, a layout of said data was proposed as the blueprint for the MVP projections of results to be obtained from the initial stage to the last one, abiding by an initial prototype to define the levels of interaction of the user according to the requirements identified in a preliminary analysis, and carrying the representation of its schemes, main tasks, and processes as a workflow to guide their assignation.
4.2 Segmentation and Assignation of Processes
With the segmentation and design as guidelines for the implementation of the findings obtained through the segmentation, the stages of treatment, launching, and evaluation were profiled towards the assignation of processes within the substages, linking the projections with the findings regarding the MVPs and their contribution to the data science model obtained as the main product.
Treatment. The specifications provided by the previous stages allowed the design of an exploratory analysis script that was applied to the raw data and generated a structure of the data types and variables to be represented within the sample dataset used for the tests in the upcoming launching stage. A data model was therefore created based on the segmentation, abstraction, modelling, and profiling of the data, executing not only a cleaning process between the substages and monitoring the margin of error, but also creating an MVP to cover the need in terms of data, so that the computing techniques can be launched and evaluated without further complications.
Launching. For the deployment of the software solution, the hypotheses on which the objectives were based were evaluated by implementing the computing techniques best suited for the sample data and the cost and time restrictions inherent to the problem. Hence, given the categorical quality of the data, an early classification solution was tested by applying the K-Means clustering algorithm and the ID3 and J48 classification algorithms, collecting information in two sample clusters.
Evaluation. Encapsulated within the evaluation substages, the processes for the verification and validation of the model were held, respectively, in terms of the scenario and the feedback. A linear data flow of the methodology was obtained based on the scenario run locally on the training data, which was then analyzed and verified to be consistent and accurate given the precision obtained in the partial results as MVPs in each of the previous stages, thus concluding the design verification for the initial iteration of the model. The percentage of accuracy achieved through the methods applied, given the optimal solution based on the algorithms, was then validated by the feedback obtained from the stakeholders, where a member of the medical team along with a health specialist deemed the experimental design of the model successful in the exploration of the data characteristics and the behavior of the model given the techniques applied for the classification and categorization of the data.
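As an illustration of the launching stage described above, the following minimal sketch (not the authors' implementation) clusters a sample dataset with K-Means and trains an entropy-based decision tree. The features and labels are synthetic placeholders, and scikit-learn's DecisionTreeClassifier (a CART tree with an entropy criterion) is used only as an approximation of ID3/J48.

```python
# Minimal sketch under assumptions: synthetic features stand in for the
# EEG-derived sample dataset; the entropy-based tree approximates ID3/J48.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))             # placeholder feature matrix
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # placeholder categorical labels

# Two sample clusters, as in the launching stage.
X_scaled = StandardScaler().fit_transform(X)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)

# Entropy-based decision tree evaluated on a held-out split.
X_tr, X_te, y_tr, y_te = train_test_split(X_scaled, y, test_size=0.3, random_state=0)
tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X_tr, y_tr)

print("cluster sizes:", np.bincount(clusters))
print("tree accuracy:", accuracy_score(y_te, tree.predict(X_te)))
```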
5 Results
The results obtained through the first iteration of the Lean UX methodology design, along with the data science approach, cover the basics of the verification and validation processes for software development and the exploratory analysis of the model for further applications regarding the correlation between brain signals and brain regions mapping in anxiety. Figure 2 shows the ramifications of the results obtained: the Lean UX approach [11] is represented as the core methodology, taking into consideration the minimum viable products (MVPs) obtained through each of the methodology stages and prioritizing the verification and validation processes in its last stage, whereas the data science approach is represented as the design process that encompasses the current advances in terms of the data science methodology proposed by IBM [17] and its focus on the problem at hand, which was particularly explored in the comparison and selection of algorithmic techniques for the categorization and classification of the data. Described in the subsections below are the stages identified as part of the key data science model of the treatment stage encompassed by the methodological proposal, along with their contributions to the design process for the EEG signals correlation, highlighting the primary products of the model: the stakeholders review, the constructs proposal, and the instrumental contextualization.
Fig. 2. Exploratory analysis of the model.
5.1 Data Segmentation
As established in the methodological proposal, the data segmentation was used as a filter to explore the possible variables involved and their effects, while assessing the structure of the raw data as a set of elements and identifying which elements would be valuable for obtaining K and its clusters, so as to have a clear classification and identify the type of data to be downloaded, collected, and represented. This resulted in the comparison of data sources according to the type of EEG device and the classification of its signals. Taking into consideration the need to explore a dataset without bias for the design of the instrument oriented towards anxiety, a sample dataset of auditory evoked potentials [3] was chosen, adding to this its compatibility with the headset for comparing further readings and the allocation of the sensors in the brain regions: frontal (F8), parietal (P4), and temporal (Tz).
5.2 Data Abstraction
Once the dataset source was established and its parameters were identified in a preliminary analysis, the data abstraction stage was implemented to prepare the dataset prior to its evaluation, reshaping the dataset so that the names of the variables reflect the sensors and their allocation, tailored for the categorization and classification of its variables. The treatment to be applied was therefore selected along with the normalization of variables and parameters, thus validating the projections and calculations for the algorithmic techniques to be implemented for the classification and categorization, in order to identify a control variable for future sampling.
5.3 Data Modelling
The final transformation of the dataset was cross-validated for each of the algorithmic techniques and control variables to be considered from the changes present between the raw data and the transformed data, creating a sample dataset that was trained according to the specification provided for each algorithm during the experimentation stage and the resulting categories. The implementation highlighted the output of the partial results and the insights provided by the application of each algorithmic technique, thus making it feasible to profile the data obtained for future implementations and to shape the model according to the 12 categories obtained with K-Means, ID3, and J48, corresponding to the experiment and session record of one of the subjects from the sample population within the dataset.
5.4 Data Profiling
The design process for the methodology and the core aspects of the model helped to establish the instrumental design of the computing techniques and their application for the profiling of the provided data. The contribution to the evaluation and optimization of the processes within the methodology and the model itself therefore made the analysis of the partial results and the MVPs a decision factor to compare the level of progress regarding the implementation of the project and the optimality of the data and solutions provided toward the correlation of anxiety and its indicators.
Shown below, Fig. 3 introduces the stages of the design process according to its main function, the subroutines to develop, and the constructs identified for its design given the involvement of the stakeholders in the process, hence complementing the methodology in Fig. 1 and the exploratory analysis in Fig. 2 by providing inflexion points that will reinforce the reproducible research and will become the bridge between the software engineering basis and the design and application of the instrumentation with the identified techniques.
Fig. 3. Design of the instrumental process.
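As a complement to the segmentation, abstraction, and modelling stages described in Sects. 5.1–5.3, the following sketch illustrates the kind of data-preparation steps involved. The file name, column names, and layout are hypothetical assumptions, not the authors' actual dataset or script.

```python
# Illustrative sketch only: hypothetical file and column names, not the
# authors' preprocessing script for the auditory evoked potential dataset.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

raw = pd.read_csv("auditory_evoked_potential_sample.csv")  # hypothetical export

# Data abstraction: rename variables so they reflect the sensor allocation.
raw = raw.rename(columns={"ch1": "F8_frontal", "ch2": "P4_parietal", "ch3": "Tz_temporal"})

# Data segmentation: keep the EEG channels plus the experiment/session keys.
channels = ["F8_frontal", "P4_parietal", "Tz_temporal"]
segments = raw[["subject", "session", "experiment"] + channels].copy()

# Normalization of the variables prior to the modelling stage.
segments.loc[:, channels] = MinMaxScaler().fit_transform(segments[channels])

# Data profiling: one summary row per experiment/session record of a subject.
profile = segments.groupby(["subject", "session", "experiment"])[channels].agg(["mean", "std"])
print(profile.head())
```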
6 Conclusions and Future Work
The problem and case study must remain a top priority to evaluate the degree of advance made towards the target objective or the projections of the results, not only to contrast against the initial hypothesis and measure the research capabilities for further investigations or the replication of certain studies, but also to implement the analytical approach iteratively after each partial advance, launching, or evaluation, applying the feedback obtained either from the most recent iteration or from the record kept of the MVPs and the established projections. Hence, even though the Lean UX methodology is the one providing the framework for the data-science-based model and the data-driven processes, its principles are integrated within the Lean UX philosophy, providing data transparency to the results and an increasing capability for improvement, optimization, and scalability that builds up to the correlation between the EEG signals and, eventually, the mapping of the brain regions and the anxiety presented, with the intervention of health specialists.
To conclude, the guidelines obtained from the process designed under the current methodology allowed allocating the tasks corresponding to the stages within the dataflow, highlighting the implications of the algorithmic techniques deployed, and processing the results towards a unifying profile applied to data analysis and processing, so as to obtain further information regarding the EEG device, the key components of the BCI, the categorical variables used as controls in the sample population, and the improvements to be acknowledged. Therefore, in future work, the insights obtained through the local experimentation and the reevaluation of the methodological design that culminated in the introduction of the data science model will be adjusted through meetings with health specialists to identify the instrument to be used along with the EEG device for obtaining data from the users.
References 1. Aarlien, D., Colomo-Palacios, R.: Lean UX: a systematic literature review. In: Gervasi, O., et al. (eds.) ICCSA 2020. LNCS, vol. 12254, pp. 500–510. Springer, Cham (2020). https:// doi.org/10.1007/978-3-030-58817-5_37 2. Abdalla, R., Mishra, A.: Agent-oriented software engineering methodologies: analysis and future directions. Complexity 2021, 1–21 (2021). https://doi.org/10.1155/2021/1629419 3. Alzahab, N.A., et al.: Auditory evoked potential EEG-biometric dataset (2021). https://doi. org/10.13026/ps31-fc50. https://physionet.org/content/auditoryeeg/1.0.0 4. Ashofteh, A., Bravo, J.M.: Data science training for official statistics: a new scientific paradigm of information and knowledge development in national statistical systems. Stat. J. IAOS 37, 771–789 (2021). https://doi.org/10.3233/SJI-210841 5. B˘alan, O., Moise, G., Moldoveanu, A., Leordeanu, M., Moldoveanu, F.: An investigation of various machine and deep learning techniques applied in automatic fear level detection and acrophobia virtual therapy. Sensors 20(2), 496 (2020) 6. Beaurenaut, M., Tokarski, E., Dezecache, G., Gr‘ezes, J.: The ‘threat of scream’ paradigm: a tool for studying sustained physiological and subjective anxiety. Sci. Rep. 10(1), 1–11 (2020) 7. Chen, C., et al.: Efficacy evaluation of neurofeedback-based anxiety relief. Front. Neurosci. 15, 758068 (2021) 8. Follett, J.: What is Lean UX? (2017). https://www.oreilly.com/radar/what-is-lean-ux/ 9. Francese, R., Risi, M., Tortora, G.: A user-centered approach for detecting emotions with low-cost sensors. Multimedia Tools Appl. 79(47–48), 35885–35907 (2020). https://doi.org/ 10.1007/s11042-020-09576-0 10. Fujita, H., Guizzi, G.: Proceedings of Intelligent Software Methodologies, Tools and Techniques: 14th International Conference, SoMet 2015, Naples, Italy, 15–17 September 2015 (2015) 11. Gothelf, J., Seiden, J.: Lean UX. O’Reilly Media, Inc., Sebastopol (2021) 12. Khessiba, S., Blaiech, A.G., Khalifa, K.B., Abdallah, A.B., Bedoui, M.H.: Innovative deep learning models for EEG-based vigilance detection. Neural Comput. Appl. 33(12), 6921– 6937 (2020). https://doi.org/10.1007/S00521-020-05467-5. https://link.springer.com/article/ 10.1007/s00521-020-05467-5 13. Kircanski, K., Craske, M.G., Epstein, A.M., Wittchen, H.U.: Subtypes of panic attacks: a critical review of the empirical literature. Depress. Anxiety 26, 878–887 (2009). https://doi. org/10.1002/DA.20603
14. Kompaniets, V., Lyz, A., Kazanskaya, A.: An empirical study of goal setting in UX/UIdesign. In: 2020 IEEE 14th International Conference on Application of Information and Communication Technologies (AICT), pp. 1–5 (2020). https://doi.org/10.1109/AICT50176. 2020.9368570 15. Martinez, I., Viles, E., Olaizola, I.G.: Data science methodologies: current challenges and future approaches. Big Data Res. 24, 100183 (5 2021). https://doi.org/10.1016/J.BDR.2020. 100183 16. Nadeem, A.: Human-centered approach to static-analysis-driven developer tools. Commun. ACM 65, 38–45 (2022). https://doi.org/10.1145/3486597 17. Rollins, J.B.: Metodolog´ıa fundamental para la ciencia de datos (2015) 18. Suhaimi, N.S., Mountstephens, J., Teo, J.: EEG-based emotion recognition: A state-of-the-art review of current trends and opportunities. Comput. Intell. Neurosci. 2020 (2020). https:// doi.org/10.1155/2020/8875426, https://pubmed.ncbi.nlm.nih.gov/33014031/
Can Undergraduates Get the Experience Required by the Software Industry During Their University? Mirna Muñoz(B) Centro de Investigación en Matemáticas A.C. - Sede Zacatecas, Zacatecas, Mexico [email protected]
Abstract. The use of software standards in the software industry as a requirement to be part of the software development chain has been increasing in recent years. This fact is not an exception for Very Small Entities, as suppliers for medium and large organizations. However, achieving the use of software engineering models and standards is not an easy task for them. One of the principal reasons is the human factor: most engineers in this type of organization are junior engineers with little or no experience working under international software engineering models or standards. This paper provides the results of implementing an international standard in universities to reduce the gap between the requirements of the software engineering industry and the knowledge received at universities. The results show that students can get the required experience by developing real projects following the ISO/IEC 29110 standard. Keywords: Undergraduates · Industry requirements · International standard · ISO/IEC 29110 · Software Development Centers
1 Introduction
Software engineering standards, such as ISO/IEC 12207, 15504, and 25010, are targeted to help software development organizations achieve the development of quality products within budget and schedule by optimizing their efforts and resources [1]. This fact becomes especially critical in VSEs, the most significant percentage of software companies worldwide, where developing high-quality products or services is fundamental for their growth and survival [2].
Implementing international standards in VSEs can be a path full of obstacles due to the effort required to achieve a correct implementation. As a solution, the members of the ISO WG 24 developed the ISO 29110 series of standards and guides. This standard provides a set of proven practices that tie in with the needs of VSEs and allow them to obtain benefits such as increasing their product quality, reducing their delivery time, and reducing their development costs [3, 4].
However, some specific characteristics of VSEs, such as (a) the lack of previous experience in the use of development processes and the implementation of international standards, (b) the pressure to work harder to survive in the software market, (c) few employees with little or no experience in the use of international standards, and (d) the lack of budget to perform activities related to software process improvement, are a current barrier to overcome in order to help them achieve the production of high-quality software products and services [5].
This paper addresses the human-factor barrier related to employees having little or no experience working with international standards. The research question for this analysis is: Can undergraduates get the experience required by the software industry during their university studies? To answer the research question, this paper analyses universities that have implemented an international standard through Software Development Centers (SDCs) that allow undergraduates to gain experience in working under international standards. After the introduction, the rest of the paper is structured as follows: Sect. 2 provides the background of this research; Sect. 3 describes the material and methods used to collect data; Sect. 4 presents the results; and Sect. 5 provides the conclusion and future work.
2 Background
2.1 Previous Work on Industry Requirements
The author of this paper participated in 2016 in an analysis that aimed to collect the requirements of a set of software development enterprises regarding the expected knowledge and practical experience related to hard and soft skills [6]. The instrument was composed of seven questions, and 32 organizations answered it [6]; this paper focuses on those questions addressing hard and soft skills. The list of expected hard and soft skills is briefly described in Table 1. The table includes deficiencies mentioned by at least nine organizations, around 30% of the sample.

Table 1. The software industry expected hard and soft skills (adapted from [6]).

Question | Results
Which are the main knowledge deficiencies you find when hiring computer personnel? | Deficiencies of knowledge in: • Using of methodologies and best practices • Implementing quality assurance • Using quality models and standards • Collecting software needs • Leading a project
What are your main abilities and deficiencies when hiring a computer person? | Deficiencies in: • Decision-making • Risk management abilities • Capacity to solve problems • Teamwork
Expected knowledge related to project management | Knowledge in managing best practices for: • Project planning • Risk management • Project monitoring and control
Expected knowledge related to software development | Knowledge in managing best practices for: • Requirement development • Software validation • Requirement management
Expected knowledge related to support in project management | Knowledge in managing best practices for: • Measurement and analysis • Process and product quality assurance • Configuration management
What are you looking for in a graduated student of informatics engineering, software engineering, computer science, or computer engineering | Skills for: • Operative and maintenance-operation levels (40%) • High management and operative levels (21%) • High management and operative management levels (16%)
Even though the results are from 2016, the validity of the data was demonstrated by comparing them with the study of software industry needs published in 2020, titled "Closing the gap between software engineering education and industrial needs" by Garousi et al. [7]. That paper highlights a set of soft and hard skills found by reviewing 33 studies, taking the SWEBOK areas as a reference. The paper's results are listed below:
• Skills requested in the industry: software engineering professional (professionalism, group dynamics, and communication skills), project management, requirements engineering, design, and testing.
• The knowledge gaps identified in that paper are configuration management, software engineering models and methods, and software engineering process.
• Soft skills: teamwork and communication, leadership, and critical thinking.
2.2 ISO/IEC 29110
ISO/IEC 29110 is a series of standards and guides that targets helping VSEs that have little or no experience or expertise to improve their software development processes and achieve quality software products [8, 9].
This standard has three main characteristics that make it ideal for covering the needs of VSEs: (1) it is composed of four profiles to be selected according to the specific VSE needs (entry, basic, intermediate, and advanced); (2) it aims to be implemented with any lifecycle such as waterfall, iterative, incremental, evolutionary, or agile; and (3) it has as its core two processes, project management and software implementation, for all its profiles [8, 9]. Figure 1 presents an overview of ISO 29110.
Fig. 1. Overview of ISO 29110 profiles and processes
It is important to highlight that the Basic profile of ISO 29110 is the only profile in which a VSE can be certified. Therefore, it is the target profile in this research. Figure 2 provides an overview of the two processes of the basic profile.
Fig. 2. Overview of the basic profile processes [8, 10].
As the figure shows, the trigger of the project management process is a statement of work or a list of needs provided by the customer. Then, if the VSE accepts the project, the planning activity initiates the project. After that, the execution of the project starts, launching the implementation process until the customer receives the work products described in the statement of work (e.g., user documentation, installation manual, code). During the project execution, the evaluation and control activities are performed. Finally, when the project ends, the closure activity should be performed.
One thing to highlight is that the two processes of the Basic profile cover many knowledge areas of the Software Engineering Body of Knowledge (SWEBOK) [8].
2.3 Software Development Centers
Software Development Centers (SDCs) aim to provide a location where students can implement the knowledge acquired in their subjects through the practical execution of software development projects. The SDCs' purpose is to enable students to experience working on projects with real internal or external customers [11]. According to [11], the benefits that SDCs certified in ISO/IEC 29110 can promote in universities are:
• Customers: (1) an SDC can widen customers' interest in making requests for new projects; (2) customers increase their level of trust because ISO/IEC 29110 gives the SDC the credential of working according to an international standard; (3) the SDC increases customer satisfaction because it can implement better management to achieve the commitments with its customers.
• Educational: (1) the students have the opportunity to participate in real projects; (2) the learning curve of students is reduced by allowing them to develop software for real customers using ISO/IEC 29110, so that they implement concepts taught in class.
3 Material and Methods
The instrument created for collecting data is a survey. The survey consists of six questions, as specified below; the first three of them collect general data:
1. In what year did you start using the basic profile of the ISO/IEC 29110 standard?
2. What institution do you belong to?
3. How many students have you trained in the standard since you started using ISO/IEC 29110?
4. From the following list of requirements of the software industry, could you mark the hard skills you consider your students to reinforce with the implementation of the basic profile of the ISO/IEC 29110 standard?
5. From the following list of requirements of the software industry, could you mark the soft skills that you consider your students to reinforce with the implementation of the basic profile of the ISO/IEC 29110 standard?
6. Do you consider that your students obtain the experience required to work in companies with a culture of processes? Yes/No, and why?
The instrument was shared via Google Forms (https://forms.gle/w96zR7sKxn9NHh1j7) with SDCs certified in ISO/IEC 29110 during the period from 2017 to 2020; nine SDCs met this characteristic, and the following section presents the results of the analysis.
4 Results
This section analyzes the information collected with the instrument to answer the research question established in this paper. Seven answers were received, which means 78% of the invited sample.
4.1 General Data
This section provides the results of the first three questions of the questionnaire, related to the institution name, the year in which the institution started using ISO 29110, and the number of students trained in the standard. Table 2 provides a summary of the obtained results. For confidentiality, the SDCs are named SDC plus a consecutive number (e.g., SDC 1).

Table 2. SDC ID, year of starting to use the standard, and the number of trained students.

SDC ID | Year | # of trained students
SDC1 | 2019 | 57
SDC2 | 2019 | 55
SDC3 | 2019 | 35
SDC4 | 2019 | 50
SDC5 | 2019 | 12
SDC6 | 2017 | 12
SDC7 | 2019 | 40
4.2 Hard Skills Reinforced Using ISO 29110 in SDCs
This section provides the results of question four of the instrument, related to the hard skills: From the following list of requirements of the software industry, could you mark the hard skills you consider your students to reinforce with the implementation of the basic profile of the ISO/IEC 29110 standard? To develop this question, the knowledge requested by the industry mentioned in Sect. 2.1 was taken as the basis. The analysis of the answers to this question is presented in Fig. 3.
Fig. 3. Hard knowledge reinforced by students using ISO29110 in SDCs
As the figure shows, the hard skills that are always reinforced are project management (mentioned by 6 SDCs), project monitoring and control, requirements management, and project closure (mentioned by 5 SDCs). Besides, the hard skills that are often reinforced are testing and quality assurance (mentioned by 5 SDCs), and risk management and configuration management (mentioned by 3 SDCs).
4.3 Soft Skills Reinforced Using ISO 29110 in SDCs
This section provides the results of question five of the instrument, related to soft skills: From the following list of requirements of the software industry, could you mark the soft skills that you consider your students to reinforce with the implementation of the basic profile of the ISO/IEC 29110 standard? To develop this question, the knowledge requested by the industry mentioned in Sect. 2.1 was taken as the basis. The analysis of the answers to this question is presented in Fig. 4.
Fig. 4. Soft knowledge reinforced by students using ISO29110 in SDCs
As the figure shows, the soft skills that are always reinforced are teamwork and the capacity to solve problems (mentioned by 6 SDCs), leading a team and leading a project (mentioned by 5 SDCs), and decision-making and continuous improvement (mentioned by 4 SDCs).
4.4 Experience Required by the Software Industry
This section provides the results of question six of the instrument, related to the experience required by the software industry: Do you consider that your students obtain the required experience to work in companies with a culture of processes? Yes/No, and why? 100% of the institutions answered "Yes." Regarding the reasons, Table 3 shows the set of reasons given.

Table 3. Reasons mentioned by SDCs to consider that their students get the experience required by the software industry using ISO 29110.

SDC ID | Reason mentioned by the SDC
SDC1 | Students are placed in the industry faster than ever. CEOs mentioned that students' training has improved in the last few years
SDC2 | The hard and soft skills requested by enterprises are those reinforced by using ISO 29110
SDC3 | Students learn how to implement software development processes based on the project management and software implementation processes of ISO 29110
SDC4 | Because ISO 29110 allows reviewing the software products resulting from implementing the project management and software implementation activities, students are aware of the project
SDC5 | Students are involved in topics closer to the reality of the software industry
SDC6 | The insertion of students into organizations having a process culture is more fluid. Therefore, the integration of students in important software development organizations is growing
SDC7 | Implementing ISO 29110 in the SDC allows students to work under a culture of documenting and monitoring processes
After analyzing the answers to questions four to six provided by the seven institutions, the research question defined for this research, Can undergraduates get the experience required by the software industry during their university studies?, can be answered as yes in the context of this research, with the sample of seven institutions having a certified Software Development Center. SDCs allow students to practice implementing and using an international standard aimed at Very Small Entities. Unfortunately, the results of this research cannot be generalized because the sample size is not as large as the author would like. However, the work remains relevant since the author did not find an analysis similar to the one presented in this paper.
5 Conclusions and Future Work
One highlighted characteristic of VSEs in this paper is that they have few employees with little or no experience using international standards. Besides, most of them hire junior engineers who are recent university graduates.
In this context, the academic field should be aware that the software industry needs engineers having both hard and soft skills in order to achieve a natural transition from an academic to an industrial environment. Achieving this transfer is related to the universities' facilities that allow students to practice software development under real scenarios. Institutions having SDCs certified in the ISO/IEC 29110 standard have confirmed that the Basic profile processes allow students to cover the knowledge required by the software industry. The standard covers knowledge such as requirements, design, construction, testing, quality (e.g., reviews, verification, validation), configuration management (e.g., version control, change requests, release management), engineering management, engineering models and methods (e.g., traceability), and engineering process, as well as the management of the relationship between a customer and the organization that develops software.
As future work, the author would like to broaden the research by focusing on the students trained in SDCs to obtain data related to the skills reinforced by using the standard, providing innovative instruments to collect data such as gamification or serious games. Another interest of the author is developing resources that could improve the development of hard and soft skills related to software engineering using gamification and serious games.
Acknowledgments. The author wants to thank the seven Mexican institutions that accepted to participate in the six-question questionnaire: Instituto Tecnológico Superior Zacatecas Occidente, Unidad Profesional Interdisciplinaria de Ingeniería Campus Zacatecas IPN, Universidad Tecnológica del Estado de Zacatecas, Instituto Tecnológico Superior de Loreto, Instituto Tecnológico de Jerez, Instituto Tecnológico Superior Zacatecas Norte, Instituto Tecnológico Superior de Nochistlán.
References 1. Laporte, C.Y., Muñoz, M., Mejia Miranda, J., O’Connor, R.V.: Applying software engineering standards in very small entities-from startups to grownups. IEEE Softw. 35(1), 99–103 (2017) 2. Ibarra, G., Vullinghs S., Burgos F.J.: Panorama Digital de las Micro, Pequeñas y Medianas Empresas (MiPymes) de América Latina 2021, Santiago. GIA Consultores (2021) 3. Nyce: Certificación de sistemas ISO/IEC 29110 (2022). https://www.nyce.org.mx/certifica cion-isoiec-29110/ 4. Muñoz M., Mejía J.: ¿Qué es el Grupo de Trabajo 24?. Software Guru, 56. https://sg.com. mx/revista/56/wg24 5. Muñoz, M., Mejia, J., Laporte, C.Y.: Implementing ISO/IEC 29110 to reinforce four very small entities of Mexico under an agile approach. IET Softw. 14, 75–81 (2020). https://doi. org/10.1049/iet-sen.2019.0040 6. Muñoz, M., Negrón, P.P.A.., Mejia, J., López, G.: Actual state of coverage of Mexican software industry requested knowledge regarding the project management best practices. Comput. Sci. Inf. 13(3), 849–873 (2016). 25p 7. Garousi, V., Giray, G., Tuzun, E., Catal C., Felderer M.: Closing the gap between software engineering education and industrial needs. IEEE Softw. 37(2), 68–77 (2018). https://doi.org/ 10.1109/MS.2018.2880823
8. ISO/IEC TR 29110-5-1-2:2011. Software Engineering - Lifecycle Profiles for Very Small Entities (VSEs) - Part 5-1-2: Management and Engineering Guide: Generic Profile Group: Basic Profile. International Organization for Standardization (2011). Freely available from ISO. http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html 9. Muñoz, M., Mejia, J., Laporte, C.Y.: Reinforcing very small entities using agile methodologies with the ISO/IEC 29110. In: Mejia, J., Muñoz, M., Rocha, Á., Peña, A., Pérez-Cisneros, M. (eds.) CIMPS 2018. AISC, vol. 865, pp. 88–98. Springer, Cham (2019). https://doi.org/10. 1007/978-3-030-01171-0_8 10. Laporte, C.Y., O’Connor, R.V.: Software process improvement in industry in a graduate software engineering curriculum. Softw. Qual. Prof. J. 18(3), 4–17 (2016) 11. Mirna, M., Mejia, J., Peña, A., Lara, G., Laporte, C.Y.: transitioning international software engineering standards to academia: analyzing the results of the adoption of ISO/IEC 29110 in four Mexican universities. Comput. Stand. Interfaces 66, 103340 (2019). https://doi.org/ 10.1016/j.csi.2019.03.008. ISSN 0920-5489
Knowledge Management
Data Mining Prospective Associated with the Purchase of Life Insurance Through Predictive Models José Quintana Cruz and Freddy Tapia(B) Department of Computer Science, Universidad de las Fuerzas Armadas ESPE, Sangolquí, Ecuador {jaquintana,fmtapia}@espe.edu.ec
Abstract. This work proposes the creation of an analytical model to improve sales effectiveness through the use of business intelligence and data mining methodologies, which allow analyzing the historical information of corporate clients and determining the probability of buying a product. In this research, methodologies and tools are used that allow structuring the steps to define the task, collect and analyze data, choose and configure the model, format data, evaluate results, and report them to decision-makers. This makes it possible to test various analytical models, train them and compare them against historical data, and provide new data, which eventually will help increase sales effectiveness. The data used for this analysis are demographic data, socio-economic aspects, and any information that contributes to having a framework that can be reused in future sales campaigns. Keywords: Life insurance · Data mining · Data warehouse · Data mart · CRISP-DM · SEMMA · KDD · Data science · Machine learning platforms · Random forest · Decision tree · Neural networks · Bayesian
1 Introduction
All companies currently want to analyze information to make more timely business decisions, but not having the right tools at hand means opportunities are lost. In this context, the goal is to take advantage of the information and create an analytical model that indicates the probability of purchase. The first step is to define a methodology to structure a reusable model. The next step is to identify the data that the company has available to run analytical models and take advantage of that information for efficient decision-making.
In their article, Thuring F., Nielsen J. P., Guillén M., and Bolancé C. mention that "insurance policies or credit instruments are financial products that involve a long-term relationship between the customer and the company." For many companies a possible way to expand their business is to sell more products to preferred customers in their portfolio. Data on the customers' past behavior is stored in the company's database, and this data can be used to assess whether or not more products should be offered to a specific customer. In particular, data on past claiming history, for insurance products, or past information on defaulting, for banking products, can be useful for determining how the client is expected to behave in other financial products. This information can be used to select preferred customers and cross-sell them products they do not yet possess [1].
Data mining has been defined as the process of discovering patterns in data; the process must be automatic or semi-automatic [2]. Data mining not only plays a key role in understanding the interactions between historical data and outcomes, but also in characterizing those interactions in a way that can predict future outcomes and feed those results back into further analysis and decision-making [3] (Fig. 1).
Fig. 1. Workflow of what data mining involves [3].
In the 90s, some of the main data mining methodologies emerged to help carry out data mining projects in a more orderly and standardized way. The main methodologies are CRISP-DM, the KDD Process, and SEMMA. CRISP-DM: a proven method to guide data mining work, consisting of six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. The sequence of the phases is not rigid; going back and forth between different stages is often necessary [4]. SEMMA: an iterative process divided into five stages represented by the acronym SEMMA: Sample, Explore, Modify, Model, and Assess [5]. KDD Process: Knowledge Discovery in Databases refers to the broad process of finding knowledge in data and emphasizes the "high-level" application of particular data mining methods. Five stages are considered: Selection, Preprocessing, Transformation, Data Mining, and Interpretation or Evaluation. The KDD process is interactive and iterative, involves numerous steps and many user decisions, and is preceded by the development of an understanding of the application domain [6]. The objective of this work is to design a framework and find one or several data mining models that allow analyzing current and historical information on sales and their campaigns, through the use of methodologies, techniques, and business intelligence tools, in order to suggest and identify the best prospects and promote cross-selling of individual insurance.
2 Related Works
In their article, Devale and Kulkarni mention that "data mining can be defined as the process of selecting, exploring, and modeling large amounts of data to uncover previously unknown patterns. In the insurance industry, data mining can help firms gain business advantages. For example, by applying data mining techniques, companies can fully exploit data about customers' buying patterns and behavior - as well as gain a greater understanding of their business to help reduce fraud, improve underwriting, and enhance risk management. Specifically, data mining can help insurance firms in business practices such as: (1) Acquiring new customers; (2) Retaining existing customers; (3) Performing sophisticated classification; (4) Correlation between policy design and policy selection" [7]. In his article, T. Kaewkiriya indicates that the objective of his work is to propose a framework for the prediction of life insurance clients based on multi-algorithms. This framework consists of three modules: the first is data preparation, the second is data cleaning, and the third is data extraction. The process of data extraction is divided into three steps: (1) feature selection, which selects the optimal features for data analysis; (2) clustering of the data using the K-means algorithm; and (3) extraction of data based on a neural network algorithm in order to create a recommendation model. The test results showed that the use of multi-algorithms had the highest predictive accuracy, at 92.83% [8]. The recommendation of a product is critical to attract clients, as Kumar and Singh mention in their article, particularly in life insurance, where the company has multiple options for the customer. Accordingly, improving the quality of a recommendation to fulfill customers' needs is important in competitive environments. The authors developed a two-stage product recommendation methodology that combines data mining techniques and the analytic hierarchy process (AHP1) for decision-making. Firstly, a clustering technique was applied to group customers according to age and income, because these two variables are very important in deciding on an insurance product. Secondly, the AHP was applied to each cluster to determine the relative weights of the variables used to evaluate the most suitable product for them [9]. In their article, Jandaghi et al. indicate that one of the important issues in service organizations is to identify customers, understand their differences, and classify them. Recently, customer value as a quantitative parameter has been used to segment customers. A practical solution for analytics development is to use techniques such as dynamic clustering algorithms and programs to explore the dynamics in consumer preferences. The goal of this research is to understand current customer behavior and suggest the right policy for new customers in order to achieve the highest profits and customer satisfaction. To identify this market in life insurance customers, fuzzy clustering with the Fuzzy K-Means with Noise Cluster (FKM.pf.noise2) technique has been used to classify customers based on their demographic and behavioral data [10].
1 AHP: a method that selects alternatives based on a series of criteria or variables, normally hierarchical, which usually conflict.
2 FKM.pf.noise: a k-means fuzzy clustering algorithm with noise clustering.
In their article, Qadadeh and Abdallah mention that customer segmentation is important in designing marketing campaigns to improve business and increase revenue, and that clustering algorithms can help experts achieve this goal, especially given the exponential growth of high-dimensional databases and data warehouses, such as Customer Relationship Management (CRM3) systems. In that article, different data analysis algorithms are investigated, specifically K-Means4 and Self-Organizing Maps (SOM)5 [11].
3 CRM: business software that keeps all communications with customers in one place, accessible throughout the company.
4 K-means: a clustering method whose goal is to partition a set of n observations into k groups, each observation belonging to the group whose mean value is closest.
5 SOM: a type of artificial neural network trained using unsupervised learning to produce a representation called a map.
3 Experimental Setup and Methodology
3.1 Business Intelligence
The purpose of Business Intelligence is to convert raw data into knowledge, so that business leaders and managers can make decisions based on real data. Business analysts use Business Intelligence tools to create support products for optimal business management [12].
3.2 Data Warehouse, Data Mart
A data warehouse is a data repository that provides a global, common, and comprehensive view of the organization's data, regardless of how it is going to be used later by consumers or users, with the following properties: stable, consistent, reliable, and with historical information. In short, data warehouses are topic-oriented, integrated, time-varying, and non-volatile. According to Ralph Kimball (considered the main proponent of the dimensional approach to data warehouse design), a data warehouse is a copy of transactional data specifically structured for query and analysis. According to W. H. Inmon (considered by many to be the father of the Data Warehouse concept), a Data Warehouse is a set of subject-oriented, integrated, time-varying, and non-volatile data that is intended to support decision-making [13]. Table 1 details certain aspects of the two different approaches:
Table 1. Data warehouse versus data mart [14].
Data warehouse | Data mart
Corporate/company-wide | Departmental
Union of all Data Marts | A single business process
Data received from the data preparation area (Staging Area) | Star-join (facts & dimensions)
Queries on presentation resource | Technology optimal for data access and analysis
Structure for corporate view of data | Structure to suit the departmental view of data
Organized in Entity Relationship model | Organized in Star or Snowflake Pattern
3.2.1 Framework Selected
In their work, Azevedo and Santos concluded that the Data Analysis Framework for Small and Medium Enterprises (SME) provides guidelines for carrying out data analysis work in a structured and methodological way in six steps that take the best of three of the most widely used methodologies in the field, achieving a simpler process that helps obtain results more quickly by analyzing the information while the data is being collected, so that it can be used to choose the model and evaluate it in a cycle until the best results are achieved [15]. This framework is based on six steps. Here is a brief description of them: (1) Define the task: business ideas or goals for a suitable data mining application should be generated by senior management with the help of work groups internal or external to the company. All ideas collected must be evaluated, considering their cost-benefit. Finally, the company must choose between several standard data mining tasks. These tasks include customer aggregation, customer behavior prediction, sales prediction, or shopping cart analysis; (2) Collect and analyze data: The activity of collecting data should be a priority since, unlike large companies, SMEs' databases, or even their data warehouses, are not well organized. Therefore, it is recommended to use external data or data from government entities, which have a high degree of quality and are often free. The data collected should be analyzed using scatter diagrams to identify outliers that can negatively influence the results; (3) Choose and configure the model: Depending on the task defined in the first step, a model must be chosen. Each model can be customized by configuring its parameters. It is usually not possible to choose and configure the correct model on the first attempt; this is achieved through trial and error until the best result is obtained; (4) Format data: The first thing to check is the type of data with which the model can work. For example, while decision tree models can work with almost all data types, neural networks work only with numeric data types. The handling of missing, erroneous, or atypical data is done on a case-by-case basis, since such values can distort the data, so it may be advisable to exclude them; (5) Evaluate results: There are many possibilities to evaluate the quality of the model, depending on the type of task that was performed. A classification model can be evaluated by split validation, which consists of randomly dividing the data into two parts. One part is used to build the
model, and the other part is used to evaluate the performance of the model. Numerical models can be evaluated using statistical techniques. These results must be reviewed by the company's experts, who determine whether they are useful; if not, the model must be reconfigured until optimal results are obtained; (6) Report to decision makers: The easiest way to implement this phase is by making the knowledge obtained from the data mining activities available to decision makers through a reliable tool, so that they can make an optimal decision. As can be seen in Fig. 2, the SME framework is strongly related to CRISP-DM; a small sketch illustrating steps (3) to (5) is given after Fig. 2.
Fig. 2. SME framework process. Source [15]
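As a rough illustration of steps (3) to (5) of the SME framework, the sketch below configures several candidate models, trains them on one part of the data, and evaluates them on the held-out part (split validation), keeping the best one. It is only an assumed example in Python with scikit-learn, not code from the original work; the model choices and parameter values are illustrative.

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

def choose_model(X, y):
    # Split validation: one part builds the model, the other part evaluates it.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=42)
    candidates = {
        "decision_tree": DecisionTreeClassifier(max_depth=8),
        "random_forest": RandomForestClassifier(n_estimators=200),
        "naive_bayes": GaussianNB(),
    }
    scores = {}
    for name, model in candidates.items():   # choose and configure the model
        model.fit(X_train, y_train)           # train on the first part
        scores[name] = accuracy_score(y_test, model.predict(X_test))  # evaluate on the second
    best = max(scores, key=scores.get)        # the result reported to decision makers
    return best, scores

In practice the loop would be repeated with different parameter settings until the company's experts consider the results useful, as the framework prescribes.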
3.3 Data Mining Tools
With the growing need and interest in analyzing massive data sets (Big Data), a new generation of tools called Data Science and Machine Learning Platforms has appeared in organizations. These tools allow data scientists, analysts, or business users to interact with their data, and they support the complete data mining cycle to create, deploy, and manage advanced analytics models [16]. The Gartner quadrant as of January 2021, shown in Fig. 3, presents the main data science and machine learning tools, and Fig. 4 shows a comparison of these tools. Knime: open-source software for creating and producing data science using an easy and intuitive environment, allowing the development of data mining models in a visual environment, with advanced features including collaboration, automation, and deployment. RapidMiner: a cloud-based and on-premises solution that comes with a visual drag-and-drop tool and a machine learning library, which enables developers to build and deploy predictive models. It helps users identify data issues like correlations, missing values, and more. Alteryx: offers end-to-end automation of data science, machine learning, and analytics processes, which in turn provides the agility needed to accelerate digital transformation.
Fig. 3. Gartner’s Quadrant, January 2021 Data Science and Machine Learning [16].
Anaconda: built by data scientists for data scientists. More than 20 million people use this technology to solve the most difficult problems. It is a serious technology for real data science and ML applications. DataRobot: an enterprise AI platform that accelerates and democratizes data science by automating the journey from end to end, from data to value, and enables trusted AI applications to be deployed at scale within the organization. Databricks: designed to work with a large number of use cases (batch processing, real-time data processing, data warehousing, graphs, machine learning, and deep learning). In addition, it allows working with several programming languages (Python, R, Scala, and SQL). Azure Machine Learning Studio: a drag-and-drop tool that allows creating, testing, and deploying machine learning models, and publishing models as web services that can be easily consumed in custom applications or business intelligence tools like Excel.
Fig. 4. Tool analysis [17].
4 Data Analysis
4.1 Task Definition
The first step is to understand the business and how this work affects the results of the search for new prospects for the sale of individual life insurance. Analyzing the current situation, we find that the campaigns carried out so far have low effectiveness in closing sales, so the task is to find a model that helps identify the probability of purchase by customers.
4.2 Data Collection and Analysis
Data Understanding. At this point, an analysis of the information available to the company is carried out in order to identify the data sources and variables that are going to be used in the model.
Information Sources. The main sources used to create the master data table are those found in the sales management system. The data related to the issuance of policies is the core of the business and resides in a SQL Server database. The demographic data of the clients resides in a DB2 database. IBM's Netezza data warehouse is also used, along with
external data queries to enrich and complement the information that already exists in the sources. All these data provide the information needed for the data analysis and modeling process.
4.3 Data Format
Definition and Exploration of Variables. After identifying the sources available to the company, about 845 thousand records managed since the end of 2019 are taken, yielding approximately 280 columns. Redundant columns are excluded from this set, as are sequential codes, loading dates, and other columns that do not contribute to the analysis of this work. As a result, there are 91 variables with which a first exploratory data analysis is carried out in R.
Data Quality. As a result of the descriptive analysis, we find 35 variables that have between 90 and 100% missing data, but they should not necessarily be excluded, since they may be data that must be imputed; it is therefore important to know the business in order to identify whether a variable must be excluded or not.
Data Preparation. Upon analyzing the variables and their correlations, repeated variables and variables that contain unique values, such as identification numbers, names, and surnames, which do not contribute or can distort the result of the prediction, are eliminated, leaving 37 variables selected from the initial 280. Two flows are created: the first extracts the data from the different sources and performs a first filtering and general cleaning job, and the second performs data cleaning and imputation when necessary, leaving ready the Master Data Table with which the different analysis models will be trained.
Data Cleaning and Construction. In this step, we proceed with the validation of the information contained in the variables. If the data is complete and consistent, we impute default values for null data or change them to a value that allows a more accurate analysis, as in the case of age, which must be greater than 18 and less than 99 years; values outside this range do not correspond to reality but to the quality of the data. A correlation analysis and a frequency analysis are also carried out, which identify, for each variable, the value that is most repeated in the data set. For example, for the variable number of children, 31.28% corresponds to people who do not have children, which represents 236088 records; on the other hand, 58.42% are married people. This analysis is made for all the variables taken into account for the model and helps to define the default value in cases where a missing value must be imputed. A cross-table analysis is also carried out between some variables to determine their incidence with respect to others and whether the selected variables contribute significantly to the predictive models used in this study. The next step is to separate the data used to train the models from the data of the new campaign that will be scored with those models. The flow starts with the file generated in the previous processes; the data for training and the data for prediction are then filtered taking as reference the date of the last campaign, and as a result of the flow we have the files that will be used in the prediction models.
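A minimal sketch of the cleaning and preparation steps just described is shown below. It is an assumed illustration in Python with pandas, not the flow actually built by the authors; the column names (customer_id, children, age, campaign_date) and default values are hypothetical.

import pandas as pd

def prepare_master_table(df: pd.DataFrame, last_campaign_date: str):
    # Drop identifier-like columns that do not contribute to the prediction.
    df = df.drop(columns=["customer_id", "first_name", "last_name"], errors="ignore")
    # Impute a default value found through the frequency analysis.
    df["children"] = df["children"].fillna(0)
    # Constrain age to the plausible 18-99 range mentioned in the text.
    df["age"] = df["age"].clip(lower=18, upper=99)
    # Separate training data from the new campaign using the last campaign date.
    df["campaign_date"] = pd.to_datetime(df["campaign_date"])
    cutoff = pd.Timestamp(last_campaign_date)
    training_data = df[df["campaign_date"] < cutoff]
    new_campaign = df[df["campaign_date"] >= cutoff]
    return training_data, new_campaign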
4.4 Choose and Configure the Model
In this work, some models commonly used for prediction are mentioned, so the following models have been taken into consideration: (1) Neural Networks: they represent the first learning algorithm or "machine learning" approach (as opposed to traditional statistical approaches) for predictive modeling. (2) Bayesian Classifier: this creates a binomial or multinomial probabilistic classification model of the relationship between a set of predictor variables and a categorical target variable. The Simple Bayesian Classifier assumes that all predictor variables are independent of each other and predicts, on the basis of sample input, a probability distribution over a set of classes; that is, it calculates the probability of belonging to each class of the target variable. (3) Decision Tree: used to create a set of if-then split rules to optimize model building criteria based on decision tree learning methods. (4) Random Forest: creates a model that builds an ensemble of decision tree models to predict a target variable based on one or more predictor variables. The different models are built using random samples of the original data, a procedure known as bootstrapping (a resampling technique used in statistics more and more frequently thanks to the power of today's computers) [18].
4.4.1 Model Construction
The designed flow consists of four steps: (1) Selection of processed data, where an automatic process determines the type of data and field, as well as the selection of columns to be used in the models; in addition, the records are filtered based on whether a purchase was made or not; with all this, the sample data is generated for the test and for the comparison of the results of the models. (2) Sample preparation, which separates 70% of the negatives and 100% of the positives; the remaining negatives, together with 100% of the positives, are left to compare the models. (3) Training of the models with the test data; each model processes the data and returns the results of its analysis. (4) Comparison of the models, which allows obtaining the results of each of them.
4.5 Evaluate Results
To evaluate a model, some of the measures described below must be considered. Table 2 shows the confusion matrix, and Table 3 shows the calculation formulas for these measures. Precision: the number of correct predictions across all classes divided by the total number of samples (this corresponds to the overall Accuracy column of Table 4). Precision [Yes/No]: the number of cases that are correctly predicted as class [YES/NO], divided by the total number of cases that actually belong to class [YES/NO]; this measure is also known as recall (the "Accuracy Yes/No" columns of Table 4). AUC: area under the ROC curve, only available for two-class classification (the ROC curve is created by plotting the true positive rate (TPR), or sensitivity, versus the false positive rate (FPR), or 1-specificity, at various threshold values, and the AUC is the estimate of the trapezoidal area between each point). F1: The F1 score or measure of accuracy is the percentage of actual
members of a class that were predicted to be in that class divided by the total number of cases that were predicted to be in that class. In situations where there are three or more classes, the average accuracy and average recall values across all classes are used to calculate the F1 score [18].
Table 2. Confusion matrix
                    | Predicted Positive (P)                            | Predicted Negative (N)
Actual Positive (P) | True Positive (TP): hit                           | False Negative (FN): miss, underestimation
Actual Negative (N) | False Positive (FP): false alarm, overestimation  | True Negative (TN): correct rejection
Table 3. Model rate calculation formulas
Total population: PT = P + N  (1)
True positive rate (recall, sensitivity): TPR = TP / P = 1 - FNR  (2)
False negative rate (miss rate): FNR = FN / P = 1 - TPR  (3)
Accuracy: ACC = (TP + TN) / (P + N)  (4)
True negative rate (specificity, selectivity): TNR = TN / N = 1 - FPR  (5)
False positive rate: FPR = FP / N = 1 - TNR  (6)
Positive predictive value (precision): PPV = TP / PP = 1 - FDR  (7)
False omission rate: FOR = FN / PN = 1 - NPV  (8)
Negative predictive value: NPV = TN / PN = 1 - FOR  (9)
False discovery rate: FDR = FP / PP = 1 - PPV  (10)
F1 score: F1 = 2 * PPV * TPR / (PPV + TPR) = 2TP / (2TP + FP + FN)  (11)
Prevalence: PV = P / (P + N)  (12)
where PP = TP + FP is the number of predicted positives and PN = FN + TN the number of predicted negatives.
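The measures of Table 3 can be computed directly from the four cells of a confusion matrix. The following sketch is added here only as an illustration; it feeds in the Bayesian-model cells of Table 5 and reproduces the corresponding values of Table 4 (Accuracy 0.9174, "Accuracy No" 0.9791, "Accuracy Yes" 0.3859).

def rates(tp, fn, fp, tn):
    p, n = tp + fn, fp + tn        # actual positives and negatives
    pp, pn = tp + fp, fn + tn      # predicted positives and negatives
    return {
        "ACC": (tp + tn) / (p + n),         # Eq. (4)
        "TPR": tp / p,                       # recall / sensitivity, Eq. (2)
        "TNR": tn / n,                       # specificity, Eq. (5)
        "PPV": tp / pp,                      # precision, Eq. (7)
        "NPV": tn / pn,                      # Eq. (9)
        "F1": 2 * tp / (2 * tp + fp + fn),   # Eq. (11)
        "PV": p / (p + n),                   # prevalence, Eq. (12)
    }

# Bayesian-model cells from Table 5: TP=10451, FN=16631, FP=4872, TN=228342
print(rates(tp=10451, fn=16631, fp=4872, tn=228342))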
4.6 Result of the Models
From a universe of about 800,000 selected records, 70% of the data is used for the training process, and the remaining 30%, about 240,000 records, is used for validation. It is important to mention that only 27,000 data points are positive purchases, so in this type of sample, in which positive results are a small proportion, it is preferable to take all the positive cases for training in order to obtain a better prediction. Table 4 shows the results of the four selected models, obtained by comparing the test sample against the real data.
Table 4. Model results
Model          | Accuracy | Accuracy No | F1     | AUC    | Accuracy Yes
Bayesian       | 0.9174   | 0.9791      | 0.9550 | 0.8030 | 0.3859
Neural Network | 0.9177   | 0.9958      | 0.9559 | 0.7685 | 0.2446
Random Forest  | 0.9243   | 0.9989      | 0.9594 | 0.8136 | 0.2825
Decision Tree  | 0.9448   | 0.9940      | 0.9699 | 0.8399 | 0.5212
The confusion matrix shown in Table 5 compares the projection of a positive purchase against the actual data of a positive purchase. It can be seen that the best prediction is obtained by the decision tree model, with 52% effectiveness, versus 38.25% for the Bayesian model, 28% for the Random Forest model, and 24% for the Neural Networks model.
Table 5. Confusion matrix
Model            | Prediction | Actual NO | Actual YES
Bayesian Method  | NO         | 228342    | 16631
Bayesian Method  | YES        | 4872      | 10451
Neural Network   | NO         | 232238    | 20459
Neural Network   | YES        | 976       | 6623
Random Forest    | NO         | 232948    | 19431
Random Forest    | YES        | 266       | 7651
Decision Tree    | NO         | 231816    | 12966
Decision Tree    | YES        | 1398      | 14116
Finally, 11 thousand records of a new campaign are processed with the 4 selected prediction models. These generate a new column with the output result. By verifying the results and taking those with a purchase probability greater than 50%, we obtain the results of Table 6. It can be seen that the number of records predicted as "YES" is low in 3 of the 4 models, which helps explain the results obtained in the real campaigns and supports taking corrective measures to achieve better sales results.
Table 6. New campaign result
Model           | YES prediction >= 50%
Neural Networks | 8
Random Forest   | 6
Bayesian        | 148
Decision Tree   | 6
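As an assumed illustration of how the new campaign could be scored, the sketch below applies already-trained scikit-learn models to the new records and counts the prospects whose predicted purchase probability is at least 50%, as in Table 6; the trained_models dictionary and the feature matrix X_new are hypothetical names, not artifacts of the original work.

def score_new_campaign(trained_models, X_new, threshold=0.5):
    counts = {}
    for name, model in trained_models.items():
        # Probability of the positive ("YES") class for each prospect.
        proba = model.predict_proba(X_new)[:, 1]
        counts[name] = int((proba >= threshold).sum())
    return counts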
5 Discussion
This work is a methodological guide for all those who wish to start a data analytics process from the beginning, since it highlights the main data mining methodologies, the definition of possible frameworks, and the use of the most common predictive models; a review of the best data science and machine learning tools on the market is also included, all with the aim of strengthening the creation of analytical data models. The present work coincides with some reviewed articles in finding that the use of specific data and adequate modeling of the relationships between them influence the purchase of a certain product or service. Many of the related works seek the generation of predictive models, but it must be taken into account that each region or country has its own characteristics and aspects that influence these models, so the specific data should be defined accordingly. One of the weaknesses that we can mention is that any automation process requires trained people who help define and understand the possible results or trends over a given time, all with the aim of obtaining a better result and therefore a more effective decision. An important point to consider is the universe used: about 800 thousand records with 95% completeness of information; on the other hand, the real purchases are 27,000, which represents 3.37% of the entire universe, which can affect the result of the prediction. For these cases, it is advisable to use this entire set both in training and in testing the models. As possible future work, it is expected to use the prediction model in the electronic sales process, integrating it online as a web service that enriches the data and returns the result of the prediction automatically. Likewise, it is also intended to train it with other prediction criteria in such a way that the resulting work is more effective.
6 Conclusions and Recommendations
• With the use of a traditional prospecting model, it was evidenced that average sales reach 26.6%; with the implementation of the proposed model these sales can be raised to 30%, and effectiveness can be increased by including online enrichment with demographic and socio-economic data from external sources and integrating them into the sales processes so that profiling can be carried out immediately.
• With the traditional prospecting model, it is evident that customer profiles are not completely covered; with the proposed model, in contrast, it was possible to segment by age group, sex, and other attributes, which would allow raising the percentage of access to specific groups by searching directly for the prospects that meet these characteristics.
• The model that gave the best result in its accuracy was the decision tree, with 52%.
• Access to data is a critical issue in any analysis of information, so the process of obtaining, cleaning, and loading data is long and requires a lot of attention in order to obtain data that allow an effective analysis.
• It is important to have people who know the business and how the data is structured in the different systems of the company, since this is an important contribution to creating the master data table that will be used for the analysis and prediction of the data.
• The tests must be carried out with several models and their results compared to determine which one provides a better answer to the question to be addressed.
• The use of business intelligence and data analysis tools helps reduce model creation times and allows more focus on data analysis, preparation, and results to answer business questions.
References
1. Thuring, F., Nielsen, J.P., Guillén, M., Bolancé, C.: Selecting prospects for cross-selling financial products using multivariate credibility (2012)
2. Witten, I.H., Frank, E., Hall, M.A.: Data Mining: Practical Machine Learning Tools and Techniques (2011)
3. Hornick, M.F., Marcadé, E., Venkayala, S.: Chapter 1 - Overview of Data Mining. The Morgan Kaufmann Series in Data Management Systems, pp. 3–24 (2007)
4. Pete, C., et al.: CRISP-DM 1.0. CRISP-DM Consortium (2000)
5. SAS Institute Inc.: Introduction to SEMMA (2017). [Online]. Available: https://documentation.sas.com/doc/en/emref/14.3/n061bzurmej4j3n1jnj8bbjjm1a2.htm
6. Azevedo, A., Santos, M.F.: KDD, SEMMA and CRISP-DM: a parallel overview. In: Proceedings of the IADIS European Conf. Data Mining (2008)
7. Devale, A.B., Kulkarni, R.V.: Applications of data mining techniques in life insurance. Int. J. Data Min. Knowl. Manage. Process, 31–40 (2012)
8. Kaewkiriya, T.: Framework for prediction of life insurance customers based on multi-algorithms (2017). [Online]. Available: https://ieeexplore.ieee.org/document/8102444
9. Kumar, P., Singh, D.: Integrating Data Mining and AHP for Life Insurance Product Recommendation (2011). [Online]
10. Jandaghi, G., Moazzez, H., Moradpour, Z.: Life insurance customers segmentation using fuzzy clustering (2015). [Online]. Available: http://www.worldscientificnews.com/wp-content/uploads/2015/07/WSN-21-2015-38-49.pdf
11. Qadadeh, W., Abdallah, S.: Customers segmentation in the insurance company (TIC) dataset (2018). [Online]
12. Pierson, L.: Data Science for Dummies (2015)
13. Curto Díaz, J., Coneca Caralt, J.: Introducción al Business Intelligence (2011)
14. Ponniah, P.: Datawarehouse: The Building Blocks (2010)
15. Dittert, M., Härting, R.C., Reichstein, C., Bayer, C.: Data Analytics Framework for Business in Small and Medium-Sized Organizations. Smart Innovation, Systems and Technologies (2018)
16. Lisdatasolutions: https://www.lisdatasolutions.com. [Online]. Available: https://www.lisdatasolutions.com/blog/herramientas-del-data-mining/
17. Gartner: https://www.gartner.com/reviews/market/data-science-machine-learning-platforms (2021). [Online]
18. Alteryx: www.alteryx.com (2021). [Online]
A Review of Graph Databases
Jaime I. Lopez-Veyna1(B), Ivan Castillo-Zuñiga2, and Mariana Ortiz-Garcia1
1 División de Estudios de Posgrado e Investigación, Tecnologico Nacional de Mexico, Instituto Tecnologico de Zacatecas, Carr. Panamericana Km. 7, La Escondida, Zacatecas, Mexico
{ivanlopezveyna,mariana.ortiz}@zacatecas.tecnm.mx
2 Tecnologico Nacional de Mexico, Instituto Tecnologico del Llano Aguascalientes, Km. 18 Carretera Ags-SLP, Municipio del Llano, Aguascalientes, México
[email protected]
Abstract. Graph databases are becoming a topic of interest in the research community because of the possibilities they offer in the era of big data. Recently, different types of graph databases have been proposed; however, most graph databases are less than fifteen years old and are constantly being improved. This article presents a review of graph databases and introduces an architecture that represents the basis for most of them. In addition, a graph database taxonomy is proposed according to the data storage type and the data model; this taxonomy allows a categorization of the graph databases studied in this research. In the last part we present a survey that describes in depth some popular graph databases, with the aim of providing a guideline for the selection of one of these databases. Keywords: Graph databases · Data models · Graph data management
1 Introduction
With the large amount of data generated today, users must deal with massive, complex, irregular, sparse, and interconnected datasets, and graph databases have emerged as a new option to represent, model, and query these datasets. The study of graph databases is an interesting field of research for several reasons. The first is the inherent property of graphs to model data together with its relationships. The second is their application in domains such as the Semantic Web, social or biological networks, DNA modeling, and recommendation systems, among others. The third is the demand for storing, successfully retrieving, and managing big volumes of data. And the fourth is the need to find new storage alternatives to traditional relational databases. Graph databases have become one of the best methods to represent and query interconnected data [1]. Interconnected data are data that have relationships between their items. The connections between data represent the different relationships between things. These connections are crucial for many applications in which the relationships between objects or entities are as important as the objects themselves. It means that the significance of information depends on the relationships almost at the same level as the entities.
For example, the entities could be a person, product, email, city, etc., and the relations could be specified between two or more persons, or between persons and products, among others. Using a data graph, the connected data represents the interaction between users, indicating that person "A" is a friend of person "B", or that person "B" buys product "X". Using these relationships, we can suggest new friends to user "A", as social networks do, or recommend new products to user "B". Many current graph databases employ graph data structures and information analysis techniques to allow users to perform basic operations such as inserting, updating, querying, and computing over data. Also, graph databases must be able to support data sources with billions of edges and, consequently, a big quantity of updates per second. The use of graph databases can bring some advantages, such as a) a new way of modeling data in graph structures by introducing a new level of abstraction, b) querying the graph structure directly using proprietary operators and new query languages, and c) data structures adapted for storing graph data [2]. The purpose of this research is to present a general architecture of a graph database. We also propose a graph database classification and a taxonomy based on the type of storage and the graph data model used by these systems. Next, we summarize some graph database systems, describing their main characteristics, data model, query languages, and other features, in order to explore their benefits and drawbacks and to provide the reader with an overview of these systems, helping in the selection among different graph databases. The rest of the paper is organized as follows: Sect. 2 is dedicated to reviewing graph database foundations and presents a generic architecture of a graph database; a classification and a taxonomy are also proposed. Section 3 presents the related work. The state-of-the-art is presented in Sect. 4. Finally, Sect. 5 presents conclusions and future work.
2 Graph Database Foundations
A graph database is another type of database management system, one which is able to contain, represent, and query a graph data structure, with capabilities to Create, Read, Update, and Delete (CRUD) a data graph. It includes two basic parts: first, a set of nodes or vertices and, second, a set of edges which connect some pairs of vertices. In their most basic form, the graphs in graph databases are finite and directed, and the edges have a label [3]. The data are characterized in nodes or vertices, edges, and properties. Vertices represent entities and edges express the connection between nodes. Properties can be included on nodes and on edges to describe their specific characteristics. In this kind of system, every node or edge is linked directly to other nodes, which results in no join operations. A graph database commonly provides its own query methods with the aim of retrieving stored data using the graph structure. These systems must include an option to visit the nodes and edges (the structure of the graph), which is called a "traversal". According to Robinson et al. [1], there exist two useful properties to consider in research on graph database technologies: first, "the underlying storage", which includes the storage and management of data, and second, "the processing engine", which includes the management of CRUD operations and the capability to include an indexing method.
In this index, each element includes a pointer to its adjacent elements. This avoids the need to search each data block using a global index, which is the reason for the name index-free adjacency [4].
2.1 Architecture of Graph Databases
According to Angles et al. in [2], it is necessary that a graph database include some main components, such as a user interface or API (Application Programming Interface) as an external interface; a mechanism for data definition, management, and querying included in a database language; query optimization; and an engine with the functions of storage, transactions, and data manipulation. In this research, for the selected graph databases, we identify three common layers in the majority of these databases that can be associated with the ANSI/SPARC architecture [5]: the "External" layer, which is closest to the user; the "Conceptual" layer (also called the logical level); and the "Internal" layer (also called the storage level), which is responsible for managing the way that the data is stored inside the system. Also, inside the layers, we recognize some components or options that are used by different graph databases. Figure 1 shows a general architecture of a graph database that follows the ANSI/SPARC model for a general database.
Fig. 1. General architecture of graph databases.
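The index-free adjacency mentioned above can be pictured with a small, purely illustrative Python sketch; it is not code from any of the surveyed systems. Each node keeps direct references to its outgoing edges, so a traversal follows pointers instead of looking relationships up in a global index.

class Node:
    def __init__(self, **properties):
        self.properties = properties
        self.out_edges = []               # direct pointers to adjacent elements

class Edge:
    def __init__(self, label, source, target, **properties):
        self.label, self.source, self.target = label, source, target
        self.properties = properties
        source.out_edges.append(self)     # no global index is consulted

def traverse(start, label, depth):
    # Visit nodes reachable from `start` through edges with the given label.
    frontier, seen = [start], {id(start)}
    for _ in range(depth):
        frontier = [e.target for n in frontier for e in n.out_edges
                    if e.label == label and id(e.target) not in seen]
        seen.update(id(n) for n in frontier)
        yield from frontier

a, b, c = Node(name="A"), Node(name="B"), Node(name="C")
Edge("FRIEND", a, b)
Edge("FRIEND", b, c)
print([n.properties["name"] for n in traverse(a, "FRIEND", depth=2)])  # ['B', 'C']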
2.1.1 External Layer
In this layer we can identify three types of interfaces: an API, a user interface, or an application as the first means to connect with the database. In the case of the API (Application Programming Interface), graph databases define a set of routines related to one or more programming languages. Another option used by graph databases is a user interface, which accepts commands or instructions introduced through a command line interface or a Web page. There are also graph databases that use an application to interact with the database; in this case it is necessary to install a software package or service on one machine or a cluster, and all the graph database operations can be performed through this application. Table 1 shows the type of external layer used by some graph databases.
Table 1. Type of external layer interface.
Type of interface | Graph database
API               | AllegroGraph, InfiniteGraphDB, HyperGraph, Giraph, Neo4j, Sparksee-Dex, OrientDB
User Interface    | InfoGrid, Neo4j, Titan, VertexDB, Virtuoso
Application       | AllegroGraph, ArangoDB, Giraph, InfiniteGraphDB, OrientDB, Titan, Virtuoso
2.1.2 Conceptual Layer
The "Conceptual" layer in graph databases handles tasks like indexing, querying, modeling, traversing, and preparing the data to be stored. Some approaches complement the conceptual layer using the Hadoop Framework [6] or a P2P distribution framework. Table 2 presents the type of conceptual layer used by some graph databases.
Table 2. Type of conceptual layer.
Type of conceptual layer   | Graph database
Basic conceptual layer     | InfiniteGraphDB, InfoGrid, Sparksee-Dex, Neo4j, HyperGraphDB
Hadoop framework           | Giraph, Titan
P2P Distribution Framework | HyperGraphDB
2.1.3 Internal Layer
The "Internal" layer has the responsibility of saving data in different filesystems, databases, or files. In this case, we can find two different approaches: native graph databases and non-native graph databases. A binary format or a file is used by native graph databases to store the data; indexing algorithms and graph data structures are implemented and specified to query and store graphs. Native graph databases do not depend on an index, since the graph structure includes by its nature an adjacency index [1]. Non-native graph databases use other systems to store the data graph, such as key-value stores, the Hadoop Distributed File System (HDFS), MySQL, or PostgreSQL, among others. Table 3 shows the type of storage used in the internal layer. As can be seen, native storage is used only by some graph databases; in some cases a relational database or other types of storage with serialized data are supported.
Table 3. Type of storage in internal layer.
Type of internal layer | Graph database                                               | Storage system
Native                 | InfiniteGraphDB, Sparksee-Dex, Neo4J, ArangoDB, HypergraphDB |
Non-native             | HyperGraphDB                                                 | Key value storage
Non-native             | AllegroGraph, Giraph, InfoGrid                               | Hadoop, files, MySQL, PostgreSQL, HDFS
Non-native             | Titan                                                        | Cassandra, HBase, BerkeleyDB, ElasticSearch, Lucene
2.2 Graph Data Models
Graph databases use a graph model to allow persistent storage of the vertices and the relations between them (edges). The data model implemented by these databases is one of their most important features, because it guarantees the adequate functionality for storing and querying data in the selected application domain. In contrast with relational databases, which use only the relational model, different types of data models are employed in graph databases. The design of a graph database is based on a data model which determines the data structures, traversal options, query operations, and integrity constraints used by the system. According to Robinson et al. in [1], Property Graph, RDF (Resource Description Framework) Graph, and Hypergraph are the three dominant graph data models.
2.2.1 Property Graph
The property graph is a graph data model in which the main elements of the graph (nodes and edges) can have attributes defined by the user [7]. The attributes are specified in a key-value structure: the key (normally represented as a string) is the name of the attribute, and the value can be a string or a user-defined data type. Entities can be represented by nodes, and one or more labels can be included. The connections between nodes are called relationships, which organize nodes in an interconnected data structure and allow finding related data. These relationships always include two vertices: the first is called the start node and the second the end node; in some approaches, a direction can be included. Properties can be included in both nodes and relationships. Adding properties to relationships is one of the most useful abilities exploited by graph algorithms, since it offers additional metadata which allows adding semantics to relationships [1].
2.2.2 RDF Model
The RDF model is a data model specialized for managing RDF data; the graph is described in terms of its edges. An edge is composed of three parts: Subject, Predicate, and Object (SPO), known as a triple. In this triple, the Subject is known as the source
node, the Object is the destination node (also known as the target node), and both nodes are linked by the Predicate [8]. A large collection of triples forms an RDF dataset, in which all triples form an RDF graph. Robinson in [1] refers to the RDF model as the second most used graph data model.
2.2.3 Hypergraph
In the hypergraph model, a graph is formed by a group of sets that belong to a universal set of vertices V, similar to an undirected graph, where any number of vertices (>0) can be connected through an edge [9]. In a normal graph database, each edge links one pair of nodes, but in the hypergraph model, each edge can connect an arbitrary number of nodes; in other words, an edge is a subset of vertices, and such an edge is called a hyperedge. In domains where many-to-many relationships must be represented, the hypergraph can be convenient [1]. The hypergraph is known as a universal data model and is a convenient option in domains such as artificial intelligence, natural language processing, bioinformatics, applications with large-scale knowledge representation, or models with a highly complex data model. After explaining the three main graph models, it is important to specify that this research includes graph databases of all three graph data models.
2.3 Graph Database Classification
Based on the data model and the type of storage, we propose a classification of graph databases. Figure 2 shows the classification of the graph databases considered in this research; we distinguish between native and non-native storage and place each database into a data model.
Fig. 2. Classification of graph databases.
From the above classification we can propose a taxonomy of graph databases. This taxonomy includes two ways to identify a graph database: by the storage type, as native or non-native, and by the data model used. Figure 3 summarizes the proposed taxonomy.
Fig. 3. Taxonomy of graph databases.
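To make the contrast between the three data models of Sect. 2.2 concrete, the schematic Python fragment below shows how the same co-authorship fact could be represented as a property-graph edge, as RDF triples, and as a single hyperedge. The identifiers are invented for illustration and are not tied to any surveyed system.

# One fact, three representations (illustrative only).
property_graph_edge = {
    "label": "WROTE", "start": "author_1", "end": "paper_1",
    "properties": {"year": 2022},          # key-value attributes on the edge
}

rdf_triples = [                             # (Subject, Predicate, Object)
    ("author_1", "wrote", "paper_1"),
    ("author_2", "wrote", "paper_1"),
]

hyperedge = {                               # a single edge over any number of vertices
    "label": "co_authorship",
    "vertices": {"author_1", "author_2", "paper_1"},
}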
3 Related Work
Comparing graph databases is not a new subject area. In this research we study in depth and classify the main proposals that have been made to date. Angles in [2] presents a survey that compares nine graph databases: AllegroGraph [10], Sparksee-DEX [15], Filament [22], G-Store [23], HyperGraphDB [12], InfiniteGraphDB [50], Neo4j [14], Sones [24], and VertexDB [19], concentrating on their data model features, query facilities, and integrity constraints. He also describes the logical and physical levels and evaluates the support for querying. One of the main conclusions was that there exist some areas of graph databases that justify new developments. Thompson in [8] presents a literature survey describing different kinds of graph databases, such as clustered, key-value, and main-memory systems, as well as the physical schema and data structures of some graph databases, but focused mainly on RDF technologies. In that paper, graph databases like RDF3X [25], Diplodocus [26], Giraph [13], 4store [27], Virtuoso [20], CumulusRDF [28], and Urika [19] were analysed. Kolomičenko et al. in [30] present a benchmark called BlueBench that compares six graph databases: Tinkerpop Pro Stack [31], Sparksee-DEX [15], InfiniteGraphDB [18], Neo4j [14], OrientDB [16], and Titan [18]. They establish that Sparksee-DEX and Neo4j have better performance than the rest of the systems, due to unique characteristics of their backends for some defined queries. Another conclusion was that InfiniteGraphDB has the lowest-performing implementation, clarifying, however, that InfiniteGraphDB is focused on a distributed solution with horizontal scaling, not on a single machine. Jouili and Vansteenberghe in [32] perform a graph database comparison in a distributed framework. They include four graph databases: Sparksee-DEX [15], Neo4j [14], Titan [18], and OrientDB [16]. Their results showed that Neo4j, Sparksee-DEX, Titan, and OrientDB achieved similar performance in a read-only intensive workload, and for read-and-write testing Sparksee-DEX worked properly, while Titan, OrientDB, and Neo4j degraded their performance sharply. McColl et al. in [42] perform an evaluation of twelve graph databases, including Stinger [33], Boost [34], Giraph [13], Mtgl [35], NetworkX [36], Titan [18], Bagel [37], OrientDB [16], and Pegasus [38], measuring the amount of memory consumed while an application processes a graph, as well as the efficiency of saving each edge of the graph. Tiny, small, medium, and large graphs with different
numbers of nodes and edges were used in the tests. They report that all twelve open-source packages complete all the tests with tiny and small graphs (32K nodes and 256K edges). With medium graphs (8M edges) only nine of the twelve packages successfully completed all benchmark tests, and for large graphs (16M nodes and 128M edges), only five of the original twelve packages (Boost, Giraph, MTGL, Pegasus, and Stinger) successfully completed all benchmark tests. They also show a matrix with some technical information such as license, language features, and memory versus disk space consumed, among others. Furthermore, they extend the comparison by including information on other NoSQL databases. Capotă et al. in [39] present Graphalytics, a big data benchmark that includes three popular platforms: Giraph [13], GraphX [40], and Neo4j [14], several datasets such as Graph500, Patents, and Social Network Benchmark 1000 (SNB), and five graph algorithms: General Statistics (STATS), Graph Evolution (EVO), Breadth First Search (BFS), Community Detection (CD), and Connected Components (CONN). In their comparison, they reported that Neo4j [14] is not capable of processing graphs that do not fit in a computer's memory; however, because of its non-distributed nature it achieves the best performance. They also found that for the CONN algorithm, Giraph [13] achieved slightly better performance than GraphX [40] and was unable to process some workloads. Kumar in [7] presents a data model description and the internal architecture of seven graph databases: Neo4j [14], Sparksee-DEX [15], InfiniteGraphDB [11], Infogrid [17], HyperGraphDB [12], Trinity [41], and Titan [18]. He also describes some of the main features of these graph databases. A survey of graph databases for huge unstructured data is presented by Patil et al. in [53]. They compare six graph databases, which include Neo4j [14], HyperGraphDB [12], Sparksee-Dex [15], Trinity [41], InfiniteGraphDB [11], and Titan [18], and they also include other areas of comparison such as the efficiency of storing data, subgraph extraction methods, data and graph indexing methods, social networking, and semantic queries. Finally, Timon-Reina et al. in [54] present a survey of the use of graph databases in the biomedical domain and their applications. They compare the evolution and performance of several graph databases such as OrientDB [16], ArangoDB [21], Microsoft Azure DB [55], HyperGraphDB [12], Titan [18], InfiniteGraphDB [11], Oracle Spatial and Graph [55], Sparksee-DEX [15], and Neo4j [14], among others. We divide these surveys into two categories: descriptive and experimental. In this case, descriptive means that the survey only reports a description of some graph databases, including their technical and software features, while experimental means that the survey includes a comparison between several graph databases using characteristics such as the amount of memory consumed, the efficiency of storing nodes or edges, etc. Table 4 summarizes graph database research comparisons. It shows the year of the research, identifies the type of project, i.e., whether it is descriptive or experimental, and, in the latter case, the name of the proposed benchmark. With the related work described previously and with the information shown in Table 5, we can observe that HyperGraphDB, InfiniteGraphDB, Neo4j, OrientDB, Sparksee-DEX, and Titan are widely used in several descriptive and experimental research projects, while Giraph, InfoGrid, Sones, and VertexDB are less used.
Table 4. Graph database research comparisons.
Research                       | Year | Descriptive | Experimental | Benchmark
Kolomičenko et al. [30]        | 2010 |             | •            | BlueBench (based on Blueprints)
Angles [2]                     | 2012 | •           |              |
Thompson [8]                   | 2013 | •           |              |
Jouili and Vansteenberghe [32] | 2013 |             | •            | Blueprints
McColl et al. [43]             | 2014 |             | •            | Graphdb-testing
Capotă et al. [39]             | 2015 |             | •            | Graphalytics
Kumar [7]                      | 2015 | •           |              |
Patil et al. [53]              | 2018 | •           |              |
Timon-Reina et al. [54]        | 2021 | •           | •            | Own Implementation, Linked Data Benchmark Council, Blueprints
Systems such as 4Store, AllegroGraph, Bagel, Boost, Microsoft Azure DB, and Oracle Spatial and Graph, among others, were used in only one research project.
4 Graph Databases Survey
Most current graph databases are free or commercial products developed fifteen years ago or less and are in continuous improvement. This section describes some graph databases that use one of the three most popular graph data models. Next, we describe the multi-model databases, which have recently emerged as another option to manage graph data.
Table 5. Use of graph databases in research projects.
Graph database/Research
Angles [2]
4Store AllegroGraph
Thompson [8]
Kolomičenko et al. [30]
Jouili and Vansteenberghe [32]
McColl et al. [43]
Capotă et al. [39]
Kumar et al. [7]
Patil et al. [53]
Timon-Reina et al. [54]
• •
ArangoDB
•
Azure DB
•
Bagel
•
Boost
•
(continued)
A Review of Graph Databases
189
Table 5. (continued) Graph database/Research
Angles [2]
CumulusRDF
Kolomic ênko et al. [30]
Joulili and Vansteenberghe [32]
McCollo et al. [43]
Capotâ et al. [39]
•
•
Patil et al. [53]
•
GraphX
•
GraphChi
•
G-Store
•
HyperGraphDB
•
InfiniteGraphDB
•
•
InfoGrid
•
•
•
•
•
•
•
•
•
Mtgl
• •
•
•
NetworkX
•
•
•
Oracle Spatial and Graph
•
OrientDB
•
•
Pegasus
•
•
•
RDF3X
•
Redis
•
Sones
•
Sparksee-DEX
•
•
•
Stringer •
Titan
•
Trinity Urika
Virtuoso
•
•
•
•
•
•
•
•
•
Tinkerpop Pro Stack
VertexDB
Timon-Reina et al. [54]
•
Giraph
Neo4J
Kumbar et al. [7]
•
Diplodocus Filament
Thompsoin [8]
•
•
• • •
4.1 Property Graph Databases
This section is devoted to describing several graph databases in which the property graph data model is used. Giraph [13] is a big data graph processing system included in an Apache project. Scaling to hundreds or thousands of machines was one of its design purposes. Apache Hadoop's MapReduce implementation is used to process graphs. The initial release was based on Pregel (an architecture for graph processing proposed by Google [48]), but with new functionalities such as master computation, sharded aggregators, and edge-oriented input, among others. It has been used by Facebook, with some performance improvements, to process up to one trillion edges [45]. According to [46], in June 2022 Giraph was ranked in the fourteenth position.
InfiniteGraphDB [11] is a distributed-oriented system written in Java with a C++ core. Efficient graph analysis of large-scale graphs is supported. It is designed to handle very high throughput on a graph-like structure. Scalability, graph partitioning, parallel processing, and a distributed approach were some of the priorities of this graph database. To support distributed properties and to allow scaling and replication, this graph database utilizes Objectivity/DB as its backend [30]. Also, high-performance queries over an index on multiple key fields are provided; this index is known as a graphwise index [7]. A generic Java API for graph databases called Blueprints is supported. Blueprints fundamentally models a directed and distributed multigraph with labels on edges, and specifies certain methods that can be used to add, remove, or retrieve nodes [32]. Likewise, access through Rexster or Gremlin is also supported [30]. According to [46], in June 2022 InfiniteGraph was ranked in the nineteenth position. Neo4j [14] is a highly scalable, native graph database. It is an embedded, persistent graph database engine that saves structured graph data to disk. An optimized graph data structure, instead of tables, is used to store the data. A graph framework for traversal operations is included. It does not support sharding. Multiple languages for graph operations are supported. It scales to networks with billions of relationships and nodes. According to [46], in June 2022 Neo4j was ranked in the first position as the best graph database. Sparksee-DEX [15] is a scalable and high-performance graph database written in C++. Query performance for exploration and retrieval in large networks is its main characteristic. Pattern recognition and social network analysis are possible through graph querying. Remote access to the database server is provided via a REST interface. It is compatible with the Blueprints interface. The latest version supports Python, Java, .NET, C++, and Objective-C programming. All major operating systems are supported, and a simple Java JAR is required for execution. Sparksee-DEX was also the first graph database with a version for mobile devices. According to [46], in June 2022 Sparksee-Dex was ranked in the twenty-ninth position. Titan [18] is a scalable graph database, distributed over a cluster of several machines, designed for querying and storing graphs that contain billions of vertices and edges. ACID support, fault tolerance, linear and elastic scalability, high availability, replication, multi-data-center support, hot backup, and data distribution are its main characteristics. It is an open-source project written in Java and was initially released in 2012. Support for various storage backends such as Apache HBase, Oracle BerkeleyDB, or Apache Cassandra is included. It is compatible with the Blueprints implementation and includes a superset of Blueprints functionality called TitanGraph. It follows a property graph model and supports the Gremlin query language. According to [46], in June 2022 Titan was ranked in the thirty-third position. Bisty [47] is an embeddable in-memory graph database that is compatible with the Blueprints implementation. ACID transactions are guaranteed as a feature, along with serialization using a JSON processor, support for online backups through an interface, and good performance for writes and reads in multi-threaded scenarios.
Filament [22] is a framework and toolkit based on a traversal query style. It integrates a persistent store for graph objects, including their properties, built on top of PostgreSQL. Support for SQL queries through JDBC is the core of this graph library. FlockDB [11] is a distributed, open-source graph database developed by Twitter. It is fault tolerant and can manage network graphs of shallow depth. According to [46], in June 2022 FlockDB was ranked in the thirty-first position. G-Store [23] is presented as a prototype for storing large graphs with labels on nodes. It uses an optimized disk storage layout that exploits the internal structure of the graph to provide access to patterns discovered in queries. InfoGrid [17] is a graph database with a web interface and additional software components that ease the creation of web applications based on RESTful API components. Pregel [44] is a framework for vertex-based graph processing, built on top of Apache Hadoop. It provides good performance, scalability, and fault tolerance for graphs with billions of nodes. Phoebus [49] is a distributed framework for large-scale graph processing written in the Erlang language; it is an implementation of Google's Pregel for the distributed processing of very large graphs. Trinity [41] is a general-purpose graph database built on a distributed memory cloud framework. A cluster offers global addressing for key values stored in the memory cloud, and this memory addressing allows better performance. Low-latency online query processing is offered, together with high-performance offline analysis of large-scale graphs with billions of nodes.
4.2 RDF Graph Databases
This section describes some RDF graph databases. AllegroGraph [10] is an up-to-date RDF graph database that maintains high performance through the combination of disk-based persistent storage and efficient memory management. It can scale with superior performance for large graphs, and its design includes the Linked Data format as a standard format to store RDF triples. It is considered a precursor of current graph databases; the current version is devoted to complying with the Semantic Web standards, having evolved from a basic graph database. This graph database supports SPARQL, RDFS++, and OWL. According to [46], in June 2022 AllegroGraph was ranked in the twenty-third position. CloudGraph [7] is a graph database based on the .NET Framework that uses key-value pairs in memory or on disk to store graphs. Its characteristics include full transaction support, hot backup, resource balancing, and an intuitive Graph Query Language (GQL). GraphDB [50] is a robust RDF graph database that supports the SPARQL standard with high efficiency. Its core provides an infrastructure for requirements where data integration, exploration of relationships, agility in modelling, and data consumption and publishing are important. This graph database also complies with the Semantic Web standards for RDF.
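To make the triple-based model concrete, the following sketch builds a tiny RDF graph and runs a SPARQL query with the rdflib Python library; the namespace and resource names are invented for the example and do not come from the surveyed systems:

# Small RDF/SPARQL example using rdflib (pip install rdflib).
# The namespace and resources are illustrative, not taken from any surveyed system.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")
g = Graph()

# Each statement is a (subject, predicate, object) triple.
g.add((EX.alice, RDF.type, EX.Person))
g.add((EX.alice, EX.knows, EX.bob))
g.add((EX.bob, EX.name, Literal("Bob")))

query = """
PREFIX ex: <http://example.org/>
SELECT ?name WHERE {
    ex:alice ex:knows ?friend .
    ?friend ex:name ?name .
}
"""
for row in g.query(query):
    print(row[0])   # -> "Bob"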
uRika [29] is an analytics platform that separates data from its representation, allowing new data sources and new relationships without complex data model changes. It uses a triple-store database architecture and is well suited to discovering hidden relationships in areas as diverse as life sciences, financial services, and government operations. VertexDB [19] is a graph database with high performance. Automatic garbage collection is supported, and JSON is used as the response data format for HTTP requests. 4Store [27] stores objects as quads that include object, subject, predicate, and model, where the model is similar to a SPARQL graph. This graph database was initially developed to meet the data needs of Garlik, a Semantic Web-based company.
4.3 Hypergraph Databases
HyperGraphDB [12] is an open-source, general-purpose, extensible, distributed, cross-platform, embeddable graph database. A directed hypergraph mechanism is used to store the data. Among its main characteristics are storage management and customizable indexing, and the stored data graph includes the orientation of edges. It can manage N-ary relationships between nodes and edges, and graph traversal uses a relational query style. A dynamic database schema is included with fully transactional operations. It is extensible, and knowledge representation and data modelling facilities are also included; object-oriented and relational aspects are combined in a semi-structured general data model [52]. According to [46], in June 2022 HyperGraphDB was ranked in the thirty-fourth position.
4.4 Multi-model Graph Databases
Multi-model graph databases are another type of graph database that supports the graph data model together with other data models such as document stores or key-value stores. In this section we describe some multi-model graph databases. ArangoDB [21] is an open-source multi-model graph database. It has been designed as a native multi-model system since its first version and can model documents, data graphs, and key-value data. A JavaScript extension with an SQL-like query format is supported with good performance. This database was originally released under the name AvocadoDB, but it was renamed ArangoDB in 2012. According to [46], in June 2022 ArangoDB was ranked in the twelfth position. OrientDB [16] is a graph database considered part of the second generation of distributed graph databases. It offers the flexibility to store graphs and documents in one database, and it provides multi-master replication, horizontal scaling, sharding (partitioning), and fault tolerance. To obtain maximal performance, this graph database can be spread across different servers, thereby gaining robustness and scalability. Different types of schemas are supported, such as schema-full, schema-mixed, and schema-less. It is written in Java and includes SQL and a custom SQL-based language as its query languages. Since it is a multi-model graph database, it supports graph, key-value, object, and document models. It also provides a new mode to manage graphs without transactions, called non-transactional graphs. According to [46], in June 2022 OrientDB was ranked in the thirteenth position.
Virtuoso [20] is considered an enterprise graph database because it offers several features for relational data management, such as RDF triples based on predicates or properties, or tables in native SQL. Support for several data models, such as column stores and relational data graphs, is included. Likewise, content management for JSON, XML, HTML, or plain text is also supported. Finally, it can work with web repositories such as Open Data or WebDAV and interact with web services that use technologies such as SOAP or REST. According to [46], in June 2022 Virtuoso was ranked in the eleventh position. Microsoft Azure SQL Database [54] is a multi-model database platform offered as a service and built for the cloud. It is a fully managed SQL service that automates updating, provisioning, backups, monitoring, and upgrading in the cloud, and it provides a serverless compute option. It uses the latest SQL Server capabilities and is based on the Microsoft SQL Server database engine. According to [46], in June 2022 Microsoft Azure SQL Database was ranked in the ninth position. Oracle Spatial and Graph Database [55] is a graph database designed to manage advanced characteristics for spatial data and for social applications that use semantic graphs. Data analysis for logical or physical networks is also supported. Retrieval, storage, querying, and updating of data collections with spatial features are facilitated through functions and schemas stored in an Oracle database. According to [46], in June 2022 Oracle Spatial and Graph was ranked in the first position (this graph database is counted together with the Oracle DBMS).
5 Conclusions and Future Work
In this article we present an overall summary of different graph databases. We present a common architecture of graph databases, identifying three main layers used in most of the systems (the external layer, the conceptual layer, and the internal layer) and specifying how the selected systems use these layers. Besides, we classify some products as native and non-native graph databases based on the index algorithms and storage type used. We also describe the three graph data models that are most used and place each system in the most adequate category. With these two classifications, a taxonomy is proposed; to the best of our knowledge, this is the first graph database taxonomy. Next, we present a general description of some graph databases that use the three graph data models. Additionally, we include the software description and technical characteristics of some graph databases to explore the possibilities and limitations of these systems, with the aim of making it easier for the user to select one of these approaches. There are many possibilities of programming languages for developing graph database applications, and the user also has distinct options to store data in different systems such as Hadoop, Cassandra, HBase, Lucene, etc., either in a distributed form or using an in-memory approach. There are many query languages; however, at this moment there is no agreed-upon standard graph query language for graph databases. Nevertheless, we can observe that Gremlin has been adopted as a common option in some graph engines. This research shows that users have many good options to manage their data in graph databases. Our future work includes the use of a benchmark (Blueprints, Graphalytics, or another) to test the performance, with respect to memory and disk space consumed, of various
graph databases against our own graph database, a new proposal which is currently under development.
References 1. Robinson, I., Webber, J., Eifrem, E.: Graph Databases, new opportunities for connected data. O’Reilly books (2015). ISBN 978-1-491-93200-1 2. Angles, R.: A comparison of current graph database models. In: Proceedings of the 2012 IEEE 28th International Conference on Data Engineering Workshops. IEEE Computer Society, pp. 171–177 (2012). ISBN 978-0-7695-4748-0 3. Barceló Baeza, P.: Querying graph databases. In: Proceedings of the 32Nd ACM SIGMODSIGACT-SIGAI Symposium on Principles of Database Systems. ACM, pp. 175–188 (2013). ISBN 978-1-4503-2066-5 4. Patil, S., Vaswani, G., Bhatia, A.: Graph databases- an overview. In: International Journal of Computer Science & Information Technologies 5(1), 657–660 (2014). ISSN 0975-9646 5. Date, C., Kannan, A., Swamynathan, S.: An Introduction to Database Systems, 15 edition. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA (1999). ISBN 0321197844 6. Apache hadoop: http://hadoop.apache.org 7. Kumar, R.K.: Graph databases: a survey. In: Proceedings of International Conference on Computing, Communication and Automation. IEEE, pp. 785–790 (2015). ISBN 978-1-47998890-7 8. Thompson, B.: Literature survey of graph databases. In: Technical Report. SYSTAP (2013) 9. Iordanov, B.: Hypergraphdb: A generalized graph database. In: Proceedings of International Conference on Information Integration and Web-based Applications & Services. ACM, pp. 115–124 (2013). ISBN 978-1-4503-2113-6 10. Allegrograph: Retrieved 28 June 2022. From: http://franz.com/agraph/allegrograph/ 11. Infinitegraph: Retrieved 14 June 2022. From: https://infinitegraph.com/ 12. Hypergraphdb: Retrieved 27 June 2022. From: http://www.hypergraphdb.org/ 13. Giraph: Retrieved 16 July 2022. From: http://giraph.apache.org/ 14. Neo4j: Retrieved 26 June 2022. From: https://neo4j.com/ 15. Sparksee-DEX: Retrieved 22 June 2022. From: http://sparsity-technologies.com/ 16. Orientdb: Retrieved 27 June 2022. From: http://orientdb.com/orientdb/ 17. Infogrid: Retrieved 22 June 2022. from: http://infogrid.org/trac/ 18. Titan: Retrieved 29 June 2022. From: http://titan.thinkaurelius.com/ 19. Vertexdb: Retrieved 16 June 2022. From: http://www.dekorte.com/projects/opensource/ver texdb/ 20. Virtuoso: Retrieved 12 June 2022. From: http://virtuoso.openlinksw.com/ 21. Arangodb: Retrieved 2 June 2022. From: https://www.arangodb.com/ 22. Filament: Retrieved 13 June 2022. From: https://sourceforge.net/projects/filament/ 23. G-store: Retrieved 28 June 2022. From: http://g-store.sourceforge.net/ 24. Sones: Retrieved 29 June 2022. From: https://github.com/sones/sones/ 25. Neumann, T., Weikum, G.: The rdf-3x engine for scalable management of rdf data (2009) 26. Diplodocus: Retrieved 26 June 2022. From: http://diuf.unifr.ch/main/xi/diplodocus/ 27. 4store: Retrieved 15 June 2022. From: http://4store.org/ 28. Cumulusrdf: Retrieved 1 July 2022. From: https://github.com/cumulusrdf/cumulusrdf 29. Urika: Retrieved 21 June 2022. From: http://www.cray.com/sites/default/files/resources/ Urika-GD-WhitePaper.pdf/ 30. Kolomicênko, V., Svoboda, M., Mlynková, I.H.: Experimental comparison of graph databases. In: Web-Age Information Management, pp. 25–36. Springer Berlin Heidelberg (2010). ISBN ISBN 978-3-642-16720-1
31. Tinkerpop Pro Stack: Retrieved 11 June 2022. From: http://tinkerpop.apache.org/ 32. Jouili, S., Vansteenberghe, V.: An empirical comparison of graph databases. In: Proceedings of the 2013 International Conference on Social Computing, pp. 708–715. IEEE Computer Society (2013). ISBN 978-0-7695-5137-1 33. Stinger: Retrieved 0 June 2022. From: http://www.stingergraph.com/ 34. Boost: Retrieved 2 June 2022. From: http://www.boost.org/ 35. Mtgl: Retrieved 3 June 2022. From: https://software.sandia.gov/trac/mtgl 36. Networkx: Retrieved 7 June 2022. From: https://github.com/frewsxcv/mbz2nx/ 37. Bagel: Retrieved 29 June 2022. From: https://github.com/mesos/spark/wiki/Bagel-Progra mming-Guide/ 38. Pegasus: Retrieved 16 June 2022. From: http://www.cs.cmu.edu/~pegasus/ 39. Capotâ, M., Hegeman, T., Iosup, A., Prat-Perez, A., Erling, O., Boncz, P.: Graphalytics: a big data benchmark for graph-processing platforms. In: Proceedings of the GRADES’15. ACM, 7:1–7:6 (2015). ISBN 978-1-4503-3611-6 40. Graphx: Retrieved 21 June 2022. From: https://www.mapr.com/products/product-overview/ graphx/ 41. Trinity: Retrieved 1 July 2022. From: https://www.microsoft.com/en-us/research/project/tri nity/ 42. McColl, R.C., Ediger, D., Poovey, J., Campbell, D., Bader, D.A.: A performance evaluation of open-source graph databases. In: Proceedings of the First Workshop on Parallel Programming for Analytics Applications, 11–18. ACM (2014). ISBN 978-1-4503-2654-4 43. Malewicz, G., et al.: Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, pp. 135–146. ACM (2010). ISBN 978-1-4503-0032-2 44. Ching, A., Edunov, S., Kabiljo, M., Logothetis, D., Muthukrishnan, S.: One trillion edges: Graph processing at Facebook-scale. volume 8,12. VLDB Endowment, pp. 1804–1815 (2015). ISSN 2150-8097 45. Db-engines ranking of graph dbms: Retrieved 21 June 2022. From: https://db-engines.com/ en/ranking/graph+dbms 46. Bisty: Retrieved 29 June 2022. From: https://bitbucket.org/lambdazen/bitsy/wiki/Home/ 47. Flockdb: Retrieved 27 June 2022. From: https://github.com/twitter/flockdb/ 48. Phoebus: Retrieved 11 June 2022. From: https://github.com/xslogic/phoebus/ 49. Cloudgraph: Retrieved 1 June 2022. From: http://www.cloudgraph.com/ 50. Graphdb: Retrieved 7 June 2022. From: http://graphdb.ontotext.com/ 51. Iordanov, B.: Hypergraphdb: a generalized graph database. In: Proceedings of the 2010 International Conference on Web-age Information Management, pp. 25–36. Springer-Verlag (2010). ISBN 3-642-16719-5, 978-3-642-16719-5 52. Patil, N.S., Kiran, P., Kavya, N.P., Naresh Patel K.M.: A survey on graph databases management techniques for huge unstructured data. International Journal of Electrical and Computer Engineering (IJECE), 1140–1149 (2018). ISSN 2088-8708, https://doi.org/10.11591/ijece. v8i2.pp1140-1149 53. Timon-Reina, S., Rincon, M., Martinez-Tomas, R.: An overview of graph databases and their applications in the biomedical domain. Journal of Biological Databases and Curation (2021). https://doi.org/10.1093/database/baab026 54. Azure SQL Database: Retrieved 1 July 2022. From: https://azure.microsoft.com/en-us/pro ducts/azure-sql/database/ 55. Oracle Spatial and Graph: Retrieved 1 July 2022. From: https://docs.oracle.com/database/ 121/SPATL/what-is-oracle-spatial-and-graph.htm#SPATL440
Implementation of Sentiment Analysis in Chatbots in Spanish to Detect Signs of Mental Health Problems Eduardo Aguilar Yáñez, Sodel Vazquez Reyes(B) , Juan F. Rivera Gómez, Perla Velasco Elizondo, Alejandro Mauricio Gonzalez, and Alejandra García Hernández Autonomous University of Zacatecas, 98000 Zacatecas, Zac, Mexico {37180547,vazquezs,jf_riverag,pvelasco,amgdark, alegarcia}@uaz.edu.mx
Abstract. The detection of mental health problems such as depression has become increasingly important; their incidence has grown due to various factors, such as the rise of social networks and a global health crisis that has forced people to isolate themselves in their homes, among others. Unfortunately, these problems are often not detected in time because many people do not have the confidence to express their problems to doctors, psychologists, or the relevant authorities, causing them to advance to a more critical stage of their condition. However, many people nowadays feel more confident sharing their thoughts through social media platforms or chatbots, which offer a form of release in which the social pressure of expressing one's thoughts can be lessened through anonymity. That is why this study proposes a system with which possible signs of mental health problems can be detected. The scenario involves high school students (young people from 14 to 18 years old on average) interacting with the chatbot of the high school of the Autonomous University of Zacatecas. Using conversation flows, the chatbot was deployed over a period in which the implemented sentiment analysis algorithm presented an accuracy of 0.86, showing promising results for the detection of signs of mental health problems. Keywords: Chatbot · Conversational bot · Sentiment analysis · Mental health problems
1 Introduction
Nowadays, due to the preventive social isolation caused by the SARS-CoV-2 pandemic, the number of cases of mental health problems has been increasing. A post-COVID study of 1,210 inhabitants of 194 cities in China showed that 53.8% of the participants had a moderate to strong psychological impact; 16.5% had moderate to strong depressive symptoms; 28.8% had moderate to strong anxiety symptoms; and 8.1% had moderate to strong stress levels [1].
China, being the country in which the SARS-CoV-2 pandemic originated, had an increase in mental health illnesses, and this is also reflected worldwide: according to the WHO (World Health Organization), it was calculated that the COVID-19 pandemic caused a 27.6% increase (95% uncertainty interval (UI): 25.1–30.3) in cases of major depressive disorder (MDD) and a 25.6% increase (95% UI: 23.2–28.0) in cases of anxiety disorders (AD) worldwide in 2020 [2]. Taking into account the above statistics, we can deduce that there is clearly an increase in cases of anxiety, stress, and depression, or of symptoms linked to these diseases. This is a consequence of various factors, such as social isolation, the pressure on and danger to medical personnel and civilians, and changes in people's routines, among others. Technology in the field of mental health has advanced significantly in recent years; for example, speech recognition, Natural Language Processing (NLP), and Artificial Intelligence (AI) have been used to support the detection of symptoms of psychiatric disorders [3]. Research has shown that artificial intelligence assistants such as Alexa, Siri, or the Google Assistant are often considered by people as a friend or family member [4]. Currently, the use of chatbots is becoming more and more normalized, because they make it possible to automate time-consuming tasks such as the attention provided by trained personnel in different areas and companies. In addition, chatbots are used to hold conversations with people, whether for research purposes or as an alternative to combat loneliness and depression. The problem with most chatbots, whether for customer service or for the detection of certain patterns, is that very few of them offer a personalized experience for the user; they show a poor understanding of context and can only solve problems of little complexity. They are simply programmed to understand certain sentences and respond with limited options, ignoring the context or simplifying the interaction to a limited set of questions. As for conversation with chatbots, most are programmed to respond only with sentences that the algorithm has determined correspond to the message the user has entered, making the experience with the chatbot short; in addition, the coherence of the conversation is often lost, or the response is simply not well suited to what the user has expressed. Tuva Lunde Smestad, in a master's thesis on improving the user experience of chatbot interfaces, mentions that most chatbots are not doing their task properly and result in faulty interfaces that fail to predict the simplest of questions; furthermore, chatbot interactions are described as unintelligent, unhelpful, and ineffective [4]. This tells us that chatbots only satisfy certain basic needs of a conversation, but in a limited and non-personalized way, creating in the user a feeling of a lack of understanding or "tact" on the part of the bot. This is worrying, given that 57% of companies have implemented or are planning to implement a chatbot as part of the services they provide in the near future [5]. If this problem persists, it could cause a loss of reliability and even revenue for companies that implement chatbots for customer service, not to mention a loss of interest in this type of application, and it takes chatbots far from the goal of simulating a conversation with a real person.
In the area of health and psychology, this problem means that the objective of helping people to detect a disease or disorder is undermined by the simplicity of the chatbot
conversation flow and by a lack of processing of user responses. For this reason, this work aims to use sentiment analysis in a conversational chatbot, in the area of psychology, to detect signs of mental health problems. The rest of this article is organized as follows: Sect. 2 presents the state of the art, comparing papers in which sentiment analysis is used in chatbots and in the area of mental health; Sect. 3 describes the development process of the prototype and its architecture; Sect. 4 presents the results obtained; and Sect. 5 contains the conclusions and future work of this investigation.
2 State of the Art
Next, we describe what sentiment analysis is, its use in chatbots, improvements that can be made to chatbots, and how it has been used in the mental health context.
2.1 Chatbots
Chatbots, or conversational bots, are applications that use natural language processing and/or rules defined in a question-and-answer system [6]. Their objective is to simulate a conversation with a person through text. Such a conversation may have the aim of simply chatting with someone, or of providing support and help to customers of a company that practices e-commerce, among others. Chatbots use natural language processing (NLP), which is the way a computer can interpret human language based on reasoning, learning, and understanding [7]. NLP is not only about implementing an algorithm to respond to a text input or to process and classify the received inputs; it also combines different algorithms and techniques so that a computer achieves a better understanding of the received inputs and the context they refer to.
2.2 Sentiment Analysis in the Detection of Depression
Sentiment analysis, or emotion recognition, is the study of extracting specific information, such as opinions or sentiments, from texts by using NLP [8]. But what are sentiments? Feelings are a state of mind that arises in relation to external inputs; they are produced when an emotion is processed in the brain and the person becomes aware of that emotion and the mood it produces [9]. In other words, emotions are reactions of our body based on the context the person is living in, while feelings are a state of mind produced when a person experiences an emotion, although nowadays many people use the two terms as synonyms. Feelings have been classified in several different ways over time, but most commonly they are divided into positive, negative, and neutral, and these three classes are further subdivided. For example, among the positive ones we find happiness, euphoria, and love; among the negative ones, sadness, anger, and fear; and the neutral class refers to expressions without any emotion or feeling involved [9].
Because of the above, by means of sentiment analysis and the use of NLP, it is possible to extract from texts or audio the sentiments involved in them, as well as the type of opinion a certain person holds, by classifying the entries into positive or negative sentiment, or even with a more specific classification. One paper that proposes a way to analyze sentiments specifically to detect depression is the article "EmoCure" by Basantani et al. [10], which uses formulas available on github.com for the classification of depressive texts. These formulas are the following: in (1), a for loop goes through each word of the text to be analyzed and checks whether the word is in the list of positive words, adding one unit to the variable sum, or in the list of negative words, subtracting one unit from sum; in (2), the sentiment score is calculated by dividing sum by the number of words in the analyzed text; and in (3), based on the score obtained, the text receives a label, where 1 is a positive text, 0 is a neutral message, and −1 is a negative message:

for each word in the text do
  if the word is in the positive list then
    sum = sum + 1
  else
    if the word is in the negative list then
      sum = sum - 1
    end if
  end if
end for                                            (1)

score = sum / (number of words in the text)        (2)

if score >= 0.2 then label = 1
else if -0.5 < score < 0.2 then label = 0
else if score <= -0.5 then label = -1              (3)

METs = 0.001092(VA) + 1.336129, when VM cpm > 2453;
Kcals/min = 0.001064 × VM + 0.087512(BMI) − 0.500229, when VM cpm > 2453. (16)
Kcals/min = CPM × 0.0000191 × BMI. (17)
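The word-list scoring of Eqs. (1)–(3) can be illustrated with a short Python sketch; the word lists used here are placeholders, and the code is only an illustration of the described procedure, not the EmoCure implementation itself:

# Illustrative implementation of the word-list scoring of Eqs. (1)-(3).
# The positive/negative word lists are placeholders, not the EmoCure lexicons.
POSITIVE = {"feliz", "bien", "tranquilo"}
NEGATIVE = {"triste", "solo", "cansado"}

def label_text(text: str) -> int:
    words = text.lower().split()
    if not words:
        return 0
    total = 0
    for word in words:                 # Eq. (1): +1 for positive words, -1 for negative words
        if word in POSITIVE:
            total += 1
        elif word in NEGATIVE:
            total -= 1
    score = total / len(words)         # Eq. (2): normalize by the number of words
    if score >= 0.2:                   # Eq. (3): map the score to a label
        return 1                       # positive text
    elif score > -0.5:
        return 0                       # neutral message
    return -1                          # negative message

print(label_text("me siento triste y muy solo"))   # -> 0 with these placeholder lists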
Khan et al. [36] propose a framework for human activity recognition in body area networks (BANs). A BAN consists of the use of multiple wearable devices
placed on the body to collect a larger amount of information. The local energy-based shape histogram (LESH) technique is used for activity detection. In this study, they use classifiers such as the Simple Linear Regression (SLR) model, the Naïve Bayes classifier, and the Sequential Minimal Optimization (SMO) algorithm. The study combines five sensors, whose type and brand are not specified, distributed on the left hand (label 1), the right hand (label 2), the waist (label 3), the left foot (label 4), and the right foot (label 5). The placement tests were performed with each sensor separately, with three sensors, and with five sensors at the same time; the combination of sensors 1, 3, and 4 at the same time gave the best results for the three classifiers. Attal et al. [37] present a systematic review of different classification techniques for human activity recognition from inertial sensor data and of the effect of sensor position. For their development, they used three Xsens Xbus Kit inertial sensor units. These sensors were worn by healthy participants at key points on the chest as well as on the upper and lower extremities of the body (right thigh and left ankle). Both supervised and unsupervised machine-learning techniques were used for the experimentation, with model inputs consisting of raw data from the inertial sensors. Data were collected from six healthy subjects with different profiles, with a mean age of 26 years and a mean weight of 65 kg. The results show that the supervised algorithms perform better than the unsupervised ones, especially k-nearest neighbors (K-NN) and random forests (RF). Various proposals for the measurement of EE are presented in the literature. Several of these works consider the use of devices such as the ActiGraph GT3X, the MTw™, the Xsens, or the triaxial Actiwave Motion sensor [26, 33, 35]. In general, the use of triaxial inertial sensors with three degrees of freedom (X, Y, Z axes) is suggested for data acquisition; mathematical techniques such as linear regression or artificial neural network models are used for the estimation of EE; and a standard technique, such as indirect calorimetry using a respiratory gas analyzer or a Series 7450 V2 oro-nasal mask, is used to validate the results obtained [25, 29]. The use of multiple devices placed on the body may allow greater accuracy, but it requires processing a larger amount of sensor data, which considerably increases the processing requirements [36, 37]. It can be concluded that each of the studies presented is valuable for the devices it employs and the computational methods it presents. We consider that the use of a single sensor would be sufficient to obtain acceptable results if it is placed either on the waist or on one of the lower extremities. However, if greater precision is required, we would recommend, as in the work of Khan et al. [36], the combination of sensors on the left or right hand, on the waist, and on the left foot. For the above reasons, we decided to use a single sensor position and to select, from the formulas analyzed, the one that gives acceptable measurement data with low computational resources. We did this because we are more interested in developing a solution for detecting increases or decreases in physical activity, as well as observing compliance with daily physical activity recommendations, than in obtaining exact data for a given physical activity.
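As a simple illustration of the kind of count-based estimation these works rely on, the following Python sketch applies the regression and work-energy style expressions quoted earlier as Eqs. (16)–(17). The branching on the 2453 cpm cut-point and the variable names are assumptions made for the example; only the coefficients transcribed above are used:

# Illustrative kcal/min estimation from per-minute vector-magnitude counts,
# using the coefficients quoted in Eqs. (16)-(17). The branching on the 2453 cpm
# cut-point follows a combination-style approach and is an assumption of this
# sketch, not a prescription taken from the text.

CUT_POINT = 2453  # counts per minute

def kcal_per_minute(vm_cpm: float, bmi: float) -> float:
    if vm_cpm > CUT_POINT:
        # Regression branch, as in Eq. (16)
        return 0.001064 * vm_cpm + 0.087512 * bmi - 0.500229
    # Work-energy style branch, as in Eq. (17)
    return vm_cpm * 0.0000191 * bmi

# Example: one minute of moderate and one minute of light activity, BMI 24.
print(round(kcal_per_minute(3000.0, 24.0), 2))   # ~4.79 kcal/min
print(round(kcal_per_minute(800.0, 24.0), 2))    # ~0.37 kcal/min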
4 Proposal for Measuring Energy Expenditure
This research work seeks to measure energy expenditure in older adults. This information will be provided to the older adult or to a third person, such as a family member or a physician. The intention is to know to what extent an older adult meets the minimum recommendations of the WHO, how much energy he or she uses during the day, and how his or her physical activity is distributed. We propose the use of non-invasive wearable devices, specifically inertial sensors such as accelerometers and gyroscopes, to monitor energy expenditure. The research will be developed in five phases: data acquisition, selection of models for energy expenditure, determination of energy expenditure, implementation of the system, and tests on older adults. For data acquisition, a study of the wearable devices market will be carried out considering cost, portability, and reliability. Data acquisition from the selected device will be performed through the development of a mobile application, and the data will be obtained and classified depending on the type of movement, whether walking, jogging, or running. The positioning of the sensors will be considered according to the work done by Khan et al. [36]. The result of this phase is the determination of the sensor to be used, the determination of the positioning of the sensors, and the data classified by type of movement. For the selection of models for energy expenditure, an analysis of the different models and methods for the calculation of EE reported in the literature was carried out. The next step consists of selecting the models or methods that fit the needs of this research, considering aspects such as computational cost and accuracy. Once the models or methods have been selected, they will be implemented in a prototype that will be used in the next phase. The models considered for experimentation are those of Carneiro et al. [12], Kingsley et al. [13], Caron et al. [32], Howe et al. [39], and ActiGraph LLC [40, 41]. For the determination of energy expenditure, the prototype implemented on a mobile device will be analyzed and tested, and its results will be compared with those in the compendium of physical activities and with a standard technique. This standard technique could be indirect calorimetry with a mask for the measurement of oxygen consumption and carbon dioxide exhalation. The results obtained should be compared statistically and the quadratic error calculated; corrections should then be made to the prototype for the measurement of EE. Implementation of the system: the development of the experimental system for the measurement of physical activity in older adults will be carried out based on the prototype developed at that time. The implemented system will allow older adults to keep a count of calories expended and of how active they are during the day, and this information will also be accessible to their caregivers. Tests on older adults: in the last phase, three main steps will be performed. The first consists of selecting and obtaining a sample of older adults to whom the tests of the developed application will be applied. The second step is the final testing for the validation of the application and of the method for EE estimation. In the third step, corrections will be made in case of errors.
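The data-acquisition phase will stream inertial samples from the wearable over Bluetooth Low Energy. Purely as an illustration (written in Python for brevity, whereas the prototype described in the next subsection uses React Native), reading accelerometer notifications from a BLE IMU might look like the sketch below; the device address, characteristic UUID, and packet layout are assumptions:

# Illustrative BLE notification reader for a triaxial IMU using the bleak library
# (pip install bleak). Address, characteristic UUID and packet layout are assumed.
import asyncio
import struct
from bleak import BleakClient

DEVICE_ADDRESS = "AA:BB:CC:DD:EE:FF"                          # assumed wearable address
IMU_CHARACTERISTIC = "00002a58-0000-1000-8000-00805f9b34fb"   # assumed characteristic UUID

def handle_sample(_sender, data: bytearray):
    # Assumed packet layout: three little-endian floats (x, y, z acceleration in g).
    x, y, z = struct.unpack("<fff", data[:12])
    vector_magnitude = (x * x + y * y + z * z) ** 0.5
    print(f"ax={x:.2f} ay={y:.2f} az={z:.2f} |a|={vector_magnitude:.2f}")

async def main():
    async with BleakClient(DEVICE_ADDRESS) as client:
        await client.start_notify(IMU_CHARACTERISTIC, handle_sample)
        await asyncio.sleep(10.0)                             # collect ten seconds of samples
        await client.stop_notify(IMU_CHARACTERISTIC)

asyncio.run(main())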
4.1 Prototype for Energy Expenditure Measurement
For the experimentation, a prototype was developed using React Native, a framework for developing native Android or iOS applications with the JavaScript language, rendered with native mobile code [42]. The development of this prototype focused on the acquisition of data from the Arduino Nano 33 BLE. The application uses BLE services for data acquisition, and the user can see the device from which the information is being obtained. This device has 9 degrees of freedom, since it incorporates a triaxial accelerometer, a gyroscope, and a magnetometer; these sensors are integrated in the LSM9DS1 inertial measurement unit (IMU). The data were collected and sent via its Bluetooth Low Energy (BLE) card. Figure 1 (a) shows the operation of this prototype. The data were obtained from three different positions: the left hand, the right side of the waist, and the right leg.
4.2 Experimental Design
Energy expenditure data were collected from twelve participants, all young men, with an average age of 29.5 years. The experimentation was carried out with four devices: three belonging to the prototype developed and a fourth, the Polar M430, a reference device from the Polar company. The Polar M430 uses an algorithm called Smart Calories [43], patented by the company, and the device has been validated in previous research, such as that of Henriksen et al. [44]. Figure 1 (b) shows the placement of the devices: a) the left wrist, b) the right side of the waist, c) the right ankle, and d) the right wrist with the reference device. The experimentation was carried out on a basketball court following a protocol previously established with the participants. It was divided into two sections: the first was the walking activity, where participants performed 10 min of constant activity at a pace at which they did not feel fatigued; the second activity was jogging, where participants performed six minutes at a pace at which they did not feel fatigued. Figure 1 (b) and (c) show the placement of the sensing devices. For the experimentation, the formulas proposed by ActiGraph [40, 41], Sasaki [27], Hildebrand et al. [28], Caron et al. for the waist [32], Caron et al. for the ankle [32], Howe et al. for males without body mass index [39], and Howe et al. for males with body mass index [39] were selected. The selected formulas use variables such as the vector magnitude, the body mass index, and other characteristics of the participants.
Fig. 1. (a) Mobile application prototype, (b) Placement of wearable sensors, (c) Arduino nano BLE based sensing on ankle, waist and left wrist and reference device on the right wrist.
The first step consisted of an analysis of variance (ANOVA) of the walking activity, seeking to determine whether there is variation among the means of the different treatments, each formula being considered a treatment [43]. A post hoc test comparing against the reference method was then performed; this analysis was carried out with the Minitab statistical software [44]. For this post hoc analysis, Dunnett's test was used to contrast the means against the reference method and to verify graphically which formulas are close to it. Figure 2 (left) shows the results of the different formulas, represented as Fn, compared against the reference method R; the zero line represents how far they are from the reference method. The formulas closest to the reference method, represented by R, with a confidence interval of 95%, are: i) ActiGraph [40, 41], see Eq. (1); ii) Caron et al. for the ankle [32], see Eq. (5); and iii) Howe et al. without body mass index (BMI) [39], see Eq. (6). Once the general result of the formulas was obtained, a second post hoc analysis was performed for each formula individually, in its different positions and their average, with the following results: i) ActiGraph [40, 41]: waist with a single device, hand with a single device, and the average of the three positions, see Eq. (1); ii) Caron et al. [32]: ankle with a single device and the average of the three positions, see Eq. (5); iii) Howe et al. without body mass index (BMI): ankle with a single device and the average of the three positions, see Eq. (6). After obtaining the post hoc test results for the best positions and numbers of devices, a second analysis was performed using measures such as accuracy with Pearson's correlation index (r), reliability with the intraclass correlation coefficient (ICC), the level of error with the standard error of measurement (SEM), and finally the p-values of each test. Table 1 shows the results of this analysis.

Table 1. Results of the walking activity

Formula               ICC    r      SEM    p-values
Equation (1) Waist    0.74   -0.01  10.68  0.081
Equation (1) Hand     0.87   0.07   7.61   0.223
Equation (1) Average  0.71   0.71   15.29  0.067
Equation (5) Ankle    0.94   0.58   3.92   0.429
Equation (5) Average  0.32   0.60   14.17  0.003
Equation (6) Ankle    0.97   0.44   2.60   0.562
Average               0.74   0.32   7.63   0.083
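The ANOVA-plus-Dunnett comparison described above can be reproduced with standard Python tooling. The sketch below assumes per-participant kcal estimates for two candidate formulas and for the reference device; the arrays are made-up examples, not the study's measurements, and scipy.stats.dunnett requires SciPy 1.11 or later:

# Sketch of the ANOVA + Dunnett's test workflow (illustrative data only).
import numpy as np
from scipy import stats

reference = np.array([52.1, 48.9, 55.3, 50.2, 49.7])    # reference device estimates, kcal
formula_a = np.array([50.8, 47.5, 54.0, 49.1, 48.3])    # one candidate formula
formula_b = np.array([61.2, 58.4, 66.0, 59.9, 60.5])    # another candidate formula

# One-way ANOVA: is there any difference among the treatment means?
f_stat, p_anova = stats.f_oneway(reference, formula_a, formula_b)
print(f"ANOVA: F={f_stat:.2f}, p={p_anova:.4f}")

# Dunnett's test: compare each formula against the reference (control) only.
result = stats.dunnett(formula_a, formula_b, control=reference)
print("Dunnett p-values vs. reference:", result.pvalue)

# Pearson correlation between a formula and the reference, as used for the accuracy analysis.
r, p_r = stats.pearsonr(formula_a, reference)
print(f"Pearson r={r:.2f} (p={p_r:.3f})")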
For the second activity, the analysis process was repeated using the ANOVA test and Dunnett's post hoc test to verify the differences. Figure 2 (right) shows the comparison of the seven formulas mentioned for the jogging experiment. The post hoc test determined that the following formulas are the most similar to the reference method: i) Caron et al. for the waist [32], Eq. (4); ii) Caron et al. for the ankle [32], Eq. (5); iii) Howe et al. with body mass index (BMI) [39], Eq. (7). Once the formulas had been compared, a second Dunnett's test was performed for each of them in their different positions and their average, obtaining the following results: i) Caron et al., waist with a single device, Eq. (4); ii) Caron et al., ankle with a single device, Eq. (5); iii) Howe et al. with body mass index (BMI), hand with a single device and the average of the three positions, Eq. (7). As in the previous analysis for walking, reliability and accuracy tests were performed with the intraclass correlation and Pearson's correlation measures, obtaining the results shown in Table 2.
Fig. 2. ANOVA analysis of (right) walking, and (left) jogging

Table 2. Results of the jogging activity

Formula               ICC          r      SEM         p-values
Equation (4) Waist    0.999991025  0.446  0.06353698  0.993
Equation (5) Ankle    0.999985099  0.402  0.08186886  0.990
Equation (7) Average  0.998829272  0.921  0.75075719  0.915
Equation (7) Hand     0.99599775   0.898  1.73990595  0.229
At the end of the experimentation, two formulas were selected, one for each type of activity. For the walking activity, the formula from ActiGraph [40, 41], using the combination of the three devices and their average, presented the best results. For the jogging activity, the seventh formula, from Howe et al., which considers gender and body mass index [39], was determined to give the best results. We emphasize that the formulas analyzed were not designed for the prototype developed here, and other technologies were used for sensing, which presents an opportunity to develop new formulas based on those analyzed. Another point to analyze is the existence of formulas that present better results for activities other than walking or jogging,
which presents an opportunity for new research, including the possibility of identifying the formula or formulas that allow the most appropriate measurement of EE for a set of activities.
5 Conclusions
It has been shown that compliance with the physical activity recommendations proposed by the WHO can help to reduce the probability of suffering from cardiovascular diseases and to perceive an improvement in physical condition. The information generated is very valuable both for older adults themselves and for health specialists or caregivers, since it would allow a correct analysis for the assignment of diets and physical conditioning programs [39]. This article reviews the state of the art of the most appropriate techniques and methods for estimating EE for walking and jogging activities: on the one hand, the technologies available for measuring the activity and their most appropriate use, and on the other, the different models for performing the calculation. To carry out this research, we resorted to the design of experiments so that they can be replicated under similar conditions. For the selection of the wearable device, the market was searched for low-cost sensing devices with acceptable accuracy and with the required data-acquisition and transmission functionalities and characteristics. A study was conducted to evaluate the different options, advantages, and disadvantages for the placement of the wearable devices, as well as the accessible alternatives for having a reference device against which to contrast the results. During the EE analysis, seven formulas that presented encouraging results for the walking activity were selected. With the use of precision and reliability metrics, it was determined that the ActiGraph formula, with the use of the three devices, presented the best results for the walking activity. For the jogging activity, a total of four formulas with good accuracy and reliability were selected, the seventh formula, from Howe et al., which includes the body mass index (BMI), with the configuration of the three devices and their average, being the best formula for measuring the jogging activity. Finally, as a result of this work, a total of two formulas with sufficient accuracy were obtained, one for each activity, walking or jogging. Two options are proposed here: the first is that the user indicates the type of activity to be performed, and the second is to use the same sensor to automatically detect the type of activity performed, differentiating between activities and selecting the most appropriate formula automatically. As future work, we propose the automatic detection of the different types of activities using artificial intelligence techniques such as neural networks or random forests, and the development of new formulas for the calculation of energy expenditure based on those selected in this study. Building on this work, the development of a reliable, accurate, and low-cost industrial prototype for monitoring physical activity in older adults will be pursued.
Acknowledgments. This work was carried out in the context of the project 14923.22P.
References 1. Instituto de Seguridad y Servicios Sociales de los Trabajadores del Estado: ISSSTE, Sedentarismo afecta al 58.3 por ciento de los mexicanos mayores de 18 años (2019). https://www.gob.mx/issste/prensa/sedentarismo-afecta-al-58-3-por-ciento-delos-mexicanos-mayores-de-18-anos?idiom=es. Accessed 15 Juin 2022 2. INEGI: Estadísticas a Propósito del Día Internacional de las Personas de Edad. Instituto Nacional de Estadística y Geografía, pp. 1–9 (2019). https://www.inegi.org.mx/contenidos/ saladeprensa/aproposito/2019/edad2019_Nal.pdf. accessed 15 Juin 2022 3. INEGI: Menos de la mitad de la población realiza en su tiempo libre la práctica de algún deporte o ejercicio físico. Comunicado de Prensa Núm. 25(18), 1–12 (2018). https://www.inegi.org.mx/contenidos/saladeprensa/boletines/2020/EstSociod emo/mopradef2020.pdf. Accessed 15 Juin 2022 4. Ponce, G., Kánter, I. del R.: Día del adulto mayor. Instituto Belisario Domínguez. Núm 70, 16 (2015). http://bibliodigitalibd.senado.gob.mx/handle/123456789/3180. Accessed 15 Juin 2022 5. Organización Mundial de la Salud OMS: Inactividad física: un problema de salud pública mundial. https://www.who.int/dietphysicalactivity/factsheet_inactivity/es/. Accessed 15 Juin 2022 6. Gaetano, A.: Relationship between physical inactivity and effects on individual health status. Journal of Physical Education and Sport 16(2), 1069–1074 (2016). https://doi.org/10.7752/ jpes.2016.s2170 7. Paramio-Pérez, G.: Beneficios psicológicos de la actividad física y el deporte. Revista de Educación, Motricidad e Investigación 7, 1 (2017). https://doi.org/10.33776/remo.v0i7.3133 8. Organización Mundial de la Salud OMS: La actividad física en los adultos mayores.https:// www.who.int/es/news-room/fact-sheets/detail/physical-activity. Accessed 15 Juin 2022 9. Herrera, E., Pablos, A., Chiva-Bartoll, O., Pablos, C.: Effects of Physical Activity on Perceived Health and Physical Condition on Older Adults. Journal of Sport and Health Research 9(1), 27–40 (2017). http://repositori.uji.es/xmlui/handle/10234/166401. Accessed 15 Juin 2022 10. Sabido Rangel, J.L., Apolo-Arenas, M.D., Montanero Fernández, J., Caña-Pino, A.: Measurement of energy expenditure in submaximal functional assessment tests through accelerometry. Rehabilitacion 52(4), 223–229 (2018). https://doi.org/10.1016/j.rh.2018.06.003 11. Vargas, M., Lancheros, L., Barrera, M. del P.: Gasto energético en reposo y composición corporal en adultos. Revista de la Facultad de Medicina 59, S43–S58 (2011) 12. Carneiro, S., et al.: Accelerometer-based methods for energy expenditure using the smartphone. In: 2015 IEEE International Symposium on Medical Measurements and Applications, MeMeA 2015 - Proceedings, pp. 151–156 (2015). https://doi.org/10.1109/MeMeA.2015.714 5190 13. Kingsley, M.I.C., et al.: Wrist-specific accelerometry methods for estimating free-living physical activity. J. Sci. Med. Sport 22(6), 677–683 (2019). https://doi.org/10.1016/j.jsams.2018. 12.003 14. Farinola, M.: Técnicas de valoración de la actividad física. Calidad de Vida y Salud 3(2), 23–34 (2010) 15. Sirard, J.R., Pate, R.R.: Physical activity assessment in children and adolescents. Sports Med. 31(6), 439–454 (2001). https://doi.org/10.2165/00007256-200131060-00004 16. Arias-Vázquez, P.I., Balam-de la Vega, V., Sulub-Herrera, A., Carrillo-Rubio, J.A., RamírezMeléndez, A.: Beneficios clinicos y prescripcion del ejerccicio en la prevencion cardiovascular primaria: Revisión. Revista Mexicana De Medicina Y Rehabilitacion 25(2), 63–72 (2013). 
https://www.medigraphic.com/pdfs/fisica/mf-2013/mf132e.pdf. Accessed 15 Juin 2022
17. Costill, D.L., Wilmore, J.H.: Fisiología del esfuerzo y del deporte, 3rd ed. España Barcelona, Paidotribo (2001). [Online]. Available: https://profesoradoonline.com/wp-content/uploads/ 2020/06/Fisiologia-del-Esfuerzo-y-del-Deporte-Jack-Wilmore-Costill.pdf. Accessed 15 Juin 2022 18. Hernández, A.L., Cortés, M.C.B., Barón, A.Á., Tinjacá, L.A.T., Ávila, H.A.G.: Tecnología vestible una ventaje competitiva en el entrenamiento deportivo 11(1) (2020).https://doi.org/ 10.15332/dt.inv.2020.01161 19. Farinola, M.G., Lobo, P.R.: Técnicas de Medición de la Actividad Física en Investigaciones Argentinas: Necesidad de Incorporar Técnicas Objetivas 18, 9–19 (2017) 20. Cuauhtémoc, M.: Dispositivos vestibles. Gaceta 1(132), 2014 (2015). http://gacetaii.iingen. unam.mx/GacetaII/index.php/gii/article/view/2040. Accessed 15 Juin 2022 21. Pinheiro Volp, A.C., Esteves de Oliveira, F.C., Duarte Moreira Alves, R., Esteves, E.A., Bressan, J.: Gasto energético: Componentes y métodos de evaluación. Nutricion Hospitalaria 26(3), 430–440 (2011). https://doi.org/10.3305/nh.2011.26.3.5181 22. Ainsworth, B.E., et al.: Compendium of physical activities: An update of activity codes and MET intensities. Medicine and Science in Sports and Exercise 32(suppl. 9) (2000). https:// doi.org/10.1097/00005768-200009001-00009 23. Ainsworth, B.E., et al.: Compendium of physical activities: an update of activity codes and MET intensities BARBARA. Historical Semantics and Cognition 12, 61–89 (2013). https:// doi.org/10.1515/9783110804195.61 24. Redondo, R.B.: Gasto energético en reposo; métodos de evaluación y aplicaciones. Nutr. Hosp. 31, 245–254 (2015). https://doi.org/10.3305/nh.2015.31.sup3.8772 25. Hans Rudolph, I.: Hans Rudolph 7450 Series V2 Mask. https://www.amronintl.com/hans-rud olph-7450-series-v2-mask.html. Accessed 15 Juin 2022 26. ActiGraph, L.: GT3X. https://actigraphcorp.com/support/activity-monitors/gt3x/ 27. Sasaki, I.E., John, D., Freedson, P.S.: Validation and comparison of ActiGraph activity monitors. J. Sci. Med. Sport 14(5), 411–416 (2011). https://doi.org/10.1016/j.jsams.2011. 04.003 28. Hildebrand, M., van Hees, V.T., Hansen, B.H., Ekelund, U.: Age group comparability of raw accelerometer output from wrist-and hip-worn monitors. Med. Sci. Sports Exerc. 46(9), 1816–1824 (2014). https://doi.org/10.1249/MSS.0000000000000289 29. van Hees, V.T., et al.: Separating movement and gravity components in an acceleration signal and implications for the assessment of human daily physical activity. PLoS ONE 8(4), 1 (2013). https://doi.org/10.1371/journal.pone.0061691 30. Perez Acevedo, I.M., Valencia Varona, J.A.: Propuesta de sistema electrónico para la estimación del gasto energético en actividad física. Universitaria Autónoma del Cauca (2017) 31. Tumnark, P., Cardoso, P., Cabral, J., Conceição, F.: An ontology to integrate multiple knowledge domains of training-dietary-competition in weightlifting: a nutritional approach. ECTI Trans. Comp. Info. Technol. 12(2), 140–152 (2018). https://doi.org/10.37936/ECTI-CIT.201 8122.135896 32. Caron, N., Peyrot, N., Caderby, T., Verkindt, C., Dalleau, G.: Estimating energy expenditure from accelerometer data in healthy adults and patients with type 2 diabetes. Experimental Gerontology 134(December 2019), 110894 (2020). https://doi.org/10.1016/j.exger.2020. 110894 33. Xsens: MTw: Easy Integration Of EMG And Wireless Inertial 3D Kinematics. https://www.xsens.com/press-releases/mtw-easy-integration-emg-wireless-inertial-3dkinematics. Accessed 15 Juin 2022
34. Tsukahara, I., Nakanishi, M., Izumi, S., Nakai, Y., Kawaguchi, H., Yoshimoto, M.: Lowpower metabolic equivalents estimation algorithm using adaptive acceleration sampling. In: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, vol. 2016-Octob, pp. 1878–1881 (2016). https://doi.org/10. 1109/EMBC.2016.7591087 35. CamNtech, C.L.: Actiwave Motion. https://www.camntech.com/actiwave-motion/. Accessed 15 Juin 2022 36. Khan, I.U.S., et al.: On the correlation of sensor location and human activity recognition in body area networks (BANs). IEEE Syst. J. 12(1), 82–91 (2018). https://doi.org/10.1109/ JSYST.2016.2610188 37. Attal, F., Mohammed, S., Dedabrishvili, M., Chamroukhi, F., Oukhellou, L., Amirat, Y.: Physical human activity recognition using wearable sensors. Sensors (Switzerland) 15(12), 31314–31338 (2015). https://doi.org/10.3390/s151229858 38. G.R. GmbH and Kasernenstraße: ERGOSTIK. https://www.geratherm-respiratory.com/pro duct-groups/cpet/ergostik/. Accessed 15 Juin 2022 39. Howe, C.C.F., Moir, H.J., Easton, C.: Classification of physical activity cut-points and the estimation of energy expenditure during walking using the GT3X+ accelerometer in overweight and obese adults. Meas. Phys. Educ. Exerc. Sci. 21(3), 127–133 (2017). https://doi. org/10.1080/1091367X.2016.1271801 40. Williams, R.: Kcal estimates from activity counts using the Potential Energy Method. CSA, Inc., ActiGraph, 49 (1998) 41. ActiGraph, L.L.C.: What is the difference among the Energy Expenditure Algorithms? (2018). https://actigraphcorp.my.site.com/support/s/article/What-is-the-differenceamong-the-Energy-Expenditure-Algorithms. Accessed 15 Juin 2022 42. Inc. Facebook: react-native-fetch-blob. https://www.npmjs.com/package/react-native-fetchblob. Accessed 15 Juin 2022
Software Systems, Applications and Tools
A New Proposal for Virtual Academic Advisories Using ChatBots Carmen Lizarraga1 , Raquel Aguayo1(B) , Yadira Quiñonez1(B) , Víctor Reyes1 , and Jezreel Mejia2 1 Universidad Autónoma de Sinaloa, 82000 Mazatlán, Mexico {carmen.lizarraga,raquelaguayog,yadiraqui, victorreyes}@uas.edu.mx 2 Centro de Investigación en Matemáticas, Zacatecas Unit A.C, Jalisco S/N, Col. Valenciana, Guanajuato, GJ, Mexico [email protected]
Abstract. Today, Artificial Intelligence is already part of our environment and is used in various social, economic, and educational fields. Therefore, incorporating Artificial Intelligence technology in any field is an emerging need because it generates new opportunities to offer better products and services. However, to stay ahead, it is necessary to implement the right technology to change the competitive landscape in organizations and educational institutions. This work proposes the incorporation of ChatBots to provide virtual academic advisories within the Institutional Tutoring Program of the Universidad Autonoma de Sinaloa, in order to strengthen the Peer Advisors program. In this sense, an instrument in Google Forms has been designed and validated to gather the opinions of the students and professors of the Faculty of Computer Science before its development. Keywords: Artificial intelligence · Natural language · Conversational agents · Academic advisories
1 Introduction
In the last decade, many developments have been made in the field of Artificial Intelligence (AI) [1]; various studies presented by researchers in the scientific community focus mainly on the creation of intelligent machines and devices capable of imitating human functions and movements [2, 3]. Currently, AI is a topic that has become increasingly relevant and an essential reference for researchers in different fields of application, such as home applications [4], education [5], organizations [6], and manufacturing industry [7], due to the development of various tools, devices, products, and services that simplify daily activities. As a result, new opportunities and development challenges have been generated to solve complex industrial and social problems and to create innovative products and solutions [8]. In this sense, today many companies and industries have incorporated Artificial Intelligence into different tasks or activities that directly impact process improvement [9]
and, consequently, have obtained a competitive advantage over other organizations. For this reason, the influence of Artificial Intelligence technologies is increasingly present in society. In this context, research on Artificial Intelligence has focused on finding efficient and robust methods to solve complex problems in different application areas. Nowadays, Artificial Intelligence can be considered a fully consolidated scientific discipline, because new areas of knowledge are emerging that try to improve the efficiency, performance, and robustness of the algorithms, techniques, and tools used [10, 11]. As a result, a wide variety of AI algorithms have been modeled, and numerous technological advances have been made that focus mainly on the massive automation of services to increase productivity, flexibility, and quality and, above all, to improve safety and reduce risky tasks for people. Society, in general, has become a potential client of the leading companies investigating AI developments. For this reason, these companies have invested in and adopted digital technologies based on Artificial Intelligence systems, mainly due to economic factors, since it is possible to reduce operating costs and increase revenues by expanding the sales market. Furthermore, with the incorporation of digital technology, expectations and capabilities in companies increase; educational institutions are no exception, since using Artificial Intelligence technology in educational environments facilitates teaching and learning for students [12, 13]. In this sense, ChatBots have become very popular in most companies in recent years. This type of conversational agent is generally used to automate tasks and improve the user experience. Moreover, implementing them reduces operating costs, specifically in the human resources in charge of attending to the different user queries. The most common applications of ChatBots are automated customer services [14, 15], health [16, 17], financial transactions [18, 19], education [20, 21], call centers [22], and electronic commerce [23, 24]. In this work, the incorporation of ChatBots is proposed to perform virtual academic advisories in the Mazatlan Faculty of Informatics of the Universidad Autonoma de Sinaloa (UAS). The remainder of this paper is organized as follows: Sect. 2 addresses the background of the tutoring program and the types of tutoring handled within the Universidad Autonoma de Sinaloa. Section 3 presents the description of the proposal of this work. Section 4 describes the methodological strategy applied in this work. Section 5 presents the results obtained with the digital questionnaire applied to both students and teachers, in addition to the general diagram of the proposal, the functional architecture of the ChatBot, and technical aspects of the software development. Finally, Sect. 6 presents a summary and a critical analysis of the conclusions obtained from this research work.
2 Background
In 2006, the UAS launched the Institutional Tutoring Program (ITP), approved by the Honorable University Council. This program was implemented as a necessity and an exigency of the globalized era, which demands a new conception and a different way of organizing academic work, one that focuses on higher quality in education and on the transition from the paradigm of teaching to that of learning. This implies greater attention, more
personalized, and constant evaluation of the learning processes that account for teaching effectiveness. Tutoring, according to Lara et al., is defined as a guiding activity carried out by the tutor, closely linked to the educational process and teaching practice, to promote the comprehensive training of students in an autonomous way [25]. It consists of a set of interventions developed with the students, their families, and the teaching team, in which the figure of the tutor becomes, for different reasons, the articulating axis of the teaching function. According to Díaz et al. [26], tutoring refers to the extra-class academic activity carried out in each Academic Unit by a previously trained tutor to support, accompany, and guide a student, or a small group of students, systematically in achieving their best school performance and comprehensive training. Within the tutorial process established in the ITP of the UAS, a human network is woven that contributes to shaping the tutoring itself to obtain the best results. The human factor is central: the tutor and the tutored must establish a psychological and pedagogical contract, and both must exercise their rights and assume their responsibilities for the process to work properly. In this sense, the ITP comprises programs of individual tutoring, group tutoring, mixed tutoring, distance tutoring, disciplinary advisors, and peer advisors. The central objective of these programs is to contribute to improving the educational quality of the UAS through a process of attention, accompaniment, and orientation of the student that promotes their best school performance and integral development. The Peer Advisors program is formed by students of the Academic Unit with an excellent academic level. These students must possess specific skills and abilities that allow them to approach others, create empathy, and have a greater understanding and appropriation of the disciplinary contents in order to provide the academic help their peers need. The objective of the peer advisor is to provide academic assistance to students who require support in a subject or to clarify doubts about a particular subject in which some difficulty or insufficiency is perceived. In this sense, this document describes an automated solution to help students achieve the training established within the study plans and programs through specialized ChatBots for each subject.
3 Proposal Description
Currently, the process of carrying out academic advisory services in the faculty depends directly on the tutoring department. Therefore, a call is usually made at the beginning of the semester to select the candidates who will be part of the peer advisor program. Any student with a significant track record who is willing to share their knowledge with their peers voluntarily can be a peer advisor. Later, in a meeting with the Honorable Technical Council of the Academic Unit, the files are analyzed, and the students who will be peer advisors are approved. In this sense, this document proposes an alternative to perform automated virtual academic advising using specialized ChatBots for different subjects, as a support tool for the peer advisor program and, at the same time, to help students achieve the training established within the study plans and programs. A ChatBot is a computer program that uses artificial intelligence techniques to establish conversations with humans using text or voice input. ChatBots can typically help
address a variety of needs in a variety of settings. One of their main features is that they can save human time and resources by collecting or distributing information instead of requiring a human to perform these tasks. ChatBots can be beneficial because they provide a more convenient way to interact with a computer or software program. Another advantage of ChatBots is consistency; most ChatBots respond to queries based on algorithms and pre-programmed data. ChatBots typically provide the same answers to the same questions because they rely on a core set of features to support their conversation skills and automated decision-making. In this sense, to carry out this virtual academic advisory proposal, it is essential to know the opinion of the students who have benefited from the peer advisor program. Therefore, an instrument in Google Forms has been designed and validated to gather the opinions of the students and professors of the Faculty of Computer Science before development.
4 Methodology
A quantitative, non-experimental, cross-sectional study with a descriptive scope was carried out [27]. The population of this research consisted of the students and professors of the Universidad Autonoma de Sinaloa (UAS). The sample was selected by convenience, with professors from the Faculty of Informatics of Mazatlan and students who benefited from the peer advisor program of the faculty. The survey technique was used to collect and analyze information on the opinions of the teachers and the benefited students; the instrument used for this work was a digital questionnaire in Google Forms. A population of 50 students from the faculty was considered, and the sample considered was 25 students. Concerning the professors, a population of 30 professors from the faculty was considered, and the resulting sample was 15 teachers. This research was performed in an organized and systematic manner. First, the director was notified, and permission was requested to perform this research with students and professors of the faculty. Second, the instrument was designed and validated for students and teachers to determine their opinion about the peer advisor program. Third, the information was collected by applying the instrument in a Google form. Finally, the results obtained from the research context were analyzed.
5 Analysis of the Results
5.1 Instrument Results for Students
The participants in this research were 25 students and 15 professors from the Faculty of Computer Science Mazatlan. According to the instrument applied to the benefited students, Fig. 1a shows that 100% of the students affirmed that the peer advisor program serves as support to pass failed subjects. Concerning the difficulty of communication with peer advisors, Fig. 1b shows that 84% of the students never have communication problems, and only 16% mention that they rarely have problems establishing communication. Figure 1c shows that 92% strongly agreed that peer advisors have sufficient skills to support and guide students with failed subjects, while the other 8% somewhat agreed. Regarding the space assigned to the advisories, Fig. 1d shows that 84%
Fig. 1. Instrument results: (a) Do you consider that the Peer Advisors program serves as a support for students with failed subjects? (b) Do you have difficulty establishing communication with the peer advisor? (c) Do peer advisors have sufficient skills to support and guide students with failed subjects? (d) What is the assigned space, or where are the academic advisories usually carried out?
confirmed that the classroom is where the advisories are usually carried out, and 16% of the students mentioned that they are performed in the computer center. In this sense, to carry out this proposal for virtual academic advisories, it is essential to know the opinion of the students who benefited from the peer advisor program. They were therefore asked whether they considered it feasible to perform the advisories virtually. Figure 2a shows that 88% of the students strongly agree with this alternative, and the other 12% agree. Finally, they were asked which platform they considered more viable to carry out virtual advisories; according to the results, Fig. 2b shows that 84% of the students consider it feasible to use ChatBots as part of the advisories, 12% mentioned videoconference platforms, and only 4% prefer advisories through learning management platforms.
5.2 Instrument Results for Teachers
Concerning the professors, a population of 30 professors from the faculty was considered, and the resulting sample was 15 teachers. First, a presentation was made to the teachers to introduce the proposal, because the teachers are a fundamental part of the development and implementation of the ChatBots. Then, an instrument was applied in Google Forms to determine whether the teachers agreed with the development of this proposal. They were asked if they considered it feasible to perform virtual advisories using a ChatBot; 80% of the teachers said
Fig. 2. Instrument results: (a) Do you consider it feasible to perform virtual academic advisories? (b) What platform do you consider most feasible to perform virtual academic advisories?
that they strongly agreed with the proposal, 13.33% indicated that they agreed, and 6.67% were neutral about the proposal (see Fig. 3a). Also, for the development of this proposal, the extraction of information is fundamental; in this sense, the teachers were asked which platform is more viable for the extraction (see Fig. 3b). According to the results, 66.66% of the teachers considered a web page the most viable option, because it makes it possible to develop each learning unit according to the subject's study program. Only 13.33% mentioned that information could be manipulated better using a database, and 6.67% of the teachers each indicated that the best alternative is a mobile application, collaborative mental maps, or Google Forms, respectively.
Fig. 3. Instrument results: (a) Do you consider it feasible to perform virtual academic advisories using ChatBots? (b) What platform is more feasible to extract information from the different subjects?
Once the information from the applied instruments was analyzed, the general idea of the proposal was outlined, as shown in Fig. 4. First, it will be necessary to develop a web page to extract the information about each subject; the teacher will be asked to capture the number of learning units of the subject they teach, each of the subtopics that make up the study program, the information necessary for their development, and all the resources, tools, and strategies used. Second, once the information is extracted, it is stored in a database; this collected information is the key to developing the specialized
ChatBot. Finally, with this information, the ChatBot is integrated into text-based platforms such as Telegram, WhatsApp, or Facebook, which the students who require it will use.
Fig. 4. General diagram of the proposal
Figure 5 shows the functional architecture of the ChatBot; as can be seen, it presents the interactions of the two main actors of the ChatBot, the teacher and the student. Regarding the technical aspects of the software, a web page is required, which will be developed in the Django framework [28]. This framework is one of the most used for web development; it is free and open source, and it offers fast, secure, and highly scalable development. With the flexibility of Django, it is possible to scale the web page according to the requirements. In addition, it is proposed to use Firebase as the database, because it offers real-time database management and a wide variety of tools that facilitate the development of an application, such as easy management of user authentication; it includes analytics, file storage, databases, configurations, and automated messaging tools [29]. Finally, the Dialogflow platform is proposed as the natural language processing engine to design the ChatBot and integrate it with the most used social networks, such as WhatsApp, Facebook, and Telegram [30].
Fig. 5. Functional architecture of the ChatBot and technical aspects of software development
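To make the interaction between these components concrete, the sketch below shows how a webhook could answer a student query by matching the Dialogflow intent against subject content captured by the teacher. The paper does not prescribe the webhook runtime or data model; the Express-style endpoint, the lookupTopic helper, and the field names are illustrative assumptions only, and in the actual proposal the content would come from the Firebase database described above.

```typescript
import express from "express";

// Hypothetical in-memory stand-in for the store of learning units.
// In the proposed system this lookup would query the real database.
const topics: Record<string, string> = {
  normalization: "Normalization organizes tables to reduce redundancy (1NF, 2NF, 3NF).",
};

function lookupTopic(topic: string): string | undefined {
  return topics[topic.toLowerCase()];
}

const app = express();
app.use(express.json());

// Dialogflow ES fulfillment webhook: reads the matched intent and its
// parameters, and returns a fulfillmentText answer for the student.
app.post("/dialogflow-webhook", (req, res) => {
  const query = req.body.queryResult ?? {};
  const intent: string = query.intent?.displayName ?? "";
  const topic: string = query.parameters?.topic ?? "";

  let answer = "Sorry, I do not have material on that topic yet.";
  if (intent === "AskTopic") {
    answer = lookupTopic(topic) ?? answer;
  }
  res.json({ fulfillmentText: answer });
});

app.listen(3000, () => console.log("ChatBot webhook listening on port 3000"));
```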
6 Conclusion and Future Work
This article presents a context analysis of the processes of the Peer Advisors program, which the Universidad Autonoma de Sinaloa has been running since 2006 as part of the Institutional Tutoring Program. In this program, the key actors are the students who are willing to provide support to other students who require it. Therefore, this work focuses on the students who benefited from the program and on the faculty teachers, because they are the primary key to developing this proposal. In this sense, an instrument was applied to know the opinion of the students who benefited from the program. According to the results, the students mentioned that they very much agree with the program because they receive advice on failed subjects in order to present the extraordinary exams. They also commented that they feel comfortable with their classmates' advice. However, scheduling the meetings for the advisories is sometimes impossible due to time availability or a saturation of tasks or projects. The students were then asked if they considered it feasible to perform academic advising virtually; according to the results, 88% stated that they agreed with this alternative. The other 12% mentioned that it was a good alternative but difficult for them, because they do not have an internet connection at home or mobile data to perform the advisories. Regarding the possible platforms to carry out virtual advisories, 84% of the students indicated that it was feasible to use ChatBots as part of the academic advisories, 12% mentioned videoconferencing platforms, and 4% preferred learning management platforms. Therefore, according to the results of the applied instruments, developing this proposal as a support tool for the Peer Advisors program is considered feasible. It has been verified that students are willing to receive virtual academic advisories through ChatBots, and teachers have shown interest and willingness to extract the information from the subjects. This extraction is the key to developing and implementing specialized ChatBots for virtual academic advising. Historically, ChatBot development was complex because building natural language processing engines was challenging and time-consuming. However, many natural language processing programming libraries exist today, and cloud-based services have lowered this significant barrier. Therefore, it is only necessary to import the natural language processing engine from the available libraries (Google, Amazon, Microsoft) and then incorporate the essential functionality to interpret human language and establish the conversation through queries. In this sense, to continue with the implementation of this proposal, in future work a web page will be developed to perform the extraction of information, together with an analysis of the available natural language processing libraries to create the ChatBot.
References 1. Liu, W., Zhuang, G., Liu, X., Hu, S., He, R., Wang, Y.: How do we move towards true artificial intelligence. In: IEEE International Conference on High Performance Computing and Communications; International Conference on Data Science and Systems; International Conference on Smart City; International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application, pp. 2156–2158. IEEE Press, Haikou, Hainan, China (2021)
2. Ahmed, I., Jeon, G., Piccialli, F.: From artificial intelligence to explainable artificial intelligence in industry 4.0: a survey on what, how, and where. IEEE Trans. Industr. Inform. 18(8), 5031–5042 (2022) 3. Radanliev, P., De Roure, D., Maple, C., Santos, O.: Forecasts on future evolution of artificial intelligence and intelligent systems. IEEE Access 10, 45280–45288 (2022) 4. Huang, Z.: Analysis of IoT-based Smart Home Applications. In: IEEE International Conference on Computer Science, Artificial Intelligence and Electronic Engineering, pp. 218–221. IEEE Press, SC, USA (2021) 5. Chen, L., Chen, P., Lin, Z.: Artificial intelligence in education: a review. IEEE Access 8, 75264–75278 (2020) 6. Zel, S., Kongar, E.: Transforming digital employee experience with artificial intelligence. In: IEEE International Conference on Artificial Intelligence for Good, pp. 176–179. IEEE Press, Geneva, Switzerland (2020) 7. Yang, T., Yi, X., Lu, S., Johansson, K.H., Chai, T.: Intelligent manufacturing for the process industry driven by industrial artificial intelligence. Engineering 7(9), 1224–1230 (2021) 8. Kumpulainen, S., Terziyan, V.: Artificial general intelligence vs. industry 4.0: do they need each other? Proc. Com. Sci. 200, 140–150 (2022) 9. Kamran, S.S., Haleem, A., Bahl, S., Javaid, M., Prakash, C., Budhhi, D.: Artificial intelligence and advanced materials in automotive industry: potential applications and perspectives. Mater. Today: Proc. 62(6), 4207–4214 (2022) 10. van der Maas, H.L.J., Snoek, L., Stevenson, C.E.: How much intelligence is there in artificial intelligence? a 2020 update. Intelligence 87, 101548 (2021) 11. Quiñonez, Y.: An overview of applications of artificial intelligence using different techniques, algorithms, and tools. In: Peña, A., Muñoz, M. (eds.) Latin American Women and Research Contributions to the IT Field, pp. 325–347. IGI Global, Hershey, Pennsylvania (2021) 12. Hwang, G.J., Xie, H., Wah, B.W., Gaševi´c, D.: Vision, challenges, roles and research issues of artificial intelligence in education. Comput Educ: Artif Intell 1, 100001 (2020) 13. Chassignol, M., Khoroshavin, A., Klimova, A., Bilyatdinova, A.: Artificial Intelligence trends in education: a narrative overview. Procedia Comput. Sci. 136, 16–24 (2018) 14. Darapaneni, N., et al.: Customer support chatbot for electronic components. In: IEEE Interdisciplinary Research in Technology and Management, pp. 1–7. IEEE Press, Kolkata, India (2022) 15. Dihingia, H., Ahmed, S., Borah, D., Gupta, S., Phukan, K., Muchahari, M.K.: Chatbot implementation in customer service industry through deep neural networks. In: IEEE International Conference on Computational Performance Evalua- tion, pp. 193–198. IEEE Press, Shillong, India (2021) 16. Christopherjames, J.E., et al.: Natural language processing based human assistive health conversational agent for multi-users. In: IEEE Second International Conference on Electronics and Sustainable Communication Systems, pp. 1414–1420. IEEE Press, Coimbatore, India (2021) 17. Softic, A., Husic, J. B., Softic, A., Barakovic, S.: Health chatbot: design, implementation, acceptance and usage motivation. In: IEEE International Symposium Infoteh-Jahorina, pp. 1– 6. IEEE Press, East Sarajevo, Bosnia and Herzegovina (2021) 18. Bhuiyan, M.S.I., Razzak, A., Ferdous, M.S., Chowdhury, M.J.M., Hoque, M.A., Tarkoma, S.: BONIK: a blockchain empowered chatbot for financial transactions. In: IEEE International Conference on Trust, Security and Privacy in Computing and Communications, pp. 1079– 1088. 
IEEE Press, Guangzhou, China (2020) 19. Nicoletti, B.: Artificial intelligence in support of customer proximity in banking 5.0. In: Banking 5.0. PSFST, pp. 153–172. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-75871-4_5
20. Kasthuri, E., Balaji, S.: A chatbot for changing lifestyle in education. In: IEEE International Conference on Intelligent Communication Technologies and Virtual Mobile Networks, pp. 1317–1322. IEEE Press, Tirunelveli, India (2021) 21. Sophia, J.J., Jacob, T.P.: EDUBOT-a chatbot for education in Covid-19 pandemic and VQAbot comparison. In: International Conference on Electronics and Sustainable Communication Systems, pp. 1707–1714. IEEE Press, Coimbatore, India (2021) 22. Al-Abbasi, L.M.S., Elmedany, W., Hewahi, N.M.: An intelligent agent for E-government call center. In: IEEE Smart Cities Symposium, pp. 558–565. IEEE Press, Online Conference, Bahrain (2021) 23. Khan, M.M.: Development of an e-commerce sales Chatbot. In: IEEE International Conference on Smart Communities: Improving Quality of Life Using ICT, IoT and AI, pp. 173–176. IEEE Press, Charlotte, NC, USA (2020) 24. Rakhra, M., et al.: E-Commerce assistance with a smart Chatbot using artificial intelligence. In: IEEE International Conference on Intelligent Engineering and Management, pp. 144–148. IEEE Press, London, United Kingdom (2021) 25. Sánchez-Cabezas, P.P., Luna-Álvarez, H.E., López-Rodríguez, M.M.: The tutoring in higher education and its integration in the pedagogical activity of the university teacher. Conrado 15(70), 300–305 (2019) 26. Díaz, C., et al.: Programa Institucional de Tutorías. Editorial UAS, México (2006) 27. Creswell, J.W., Creswell, J.D.: Research Design. Qualitative Quantitative and Mixed Method Approaches, 5th edn. Sage, Thousand Oaks, California (2017) 28. Django: The web framework for perfectionists with deadlines. https://www.djangoproject. com/ 29. Firebase: Goolge. https://firebase.google.com/ 30. Dialogflow Documentation: https://cloud.google.com/dialogflow/docs
A Telemonitoring System for Patients Undergoing Peritoneal Dialysis Treatment: Implementation in the IONIC Multiplatform Framework Juan Manuel Sánchez Juárez1 , Eduardo López Domínguez2(B) , Yesenia Hernández Velázquez3 , Saúl Domínguez Isidro1 , María Auxilio Medina Nieto4 , and Jorge De la Calleja4 1 Laboratorio Nacional de Informática Avanzada, 91100 Xalapa, Veracruz, Mexico
{jsanchez.mca15,saul.dominguez}@lania.edu.mx
2 Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional,
07360 Ciudad de México, CDMX, Mexico [email protected] 3 Universidad Veracruzana, 91020 Xalapa, Veracruz, Mexico [email protected] 4 Universidad Politécnica de Puebla, 72640 Cuenalá, Puebla, Mexico {maria.medina,jorge.delacalleja}@uppuebla.edu.mx
Abstract. Patients with chronic kidney disease (CKD) undergoing peritoneal dialysis (PD) treatment require continuous monitoring by the medical staff. In this context, the National Laboratory on Advanced Informatics (LANIA) developed a telemonitoring system focused on patients with CKD undergoing PD treatment. However, one of the system's limitations is that the patient application was produced only for the Android operating system, leaving out potential patients whose devices run other operating systems. In this work, we present the implementation of the patient-oriented application in the IONIC multiplatform framework. IONIC allows a single application to be developed for multiple mobile operating systems, optimizing development time and access to the device hardware. The performance tests on the patient application's services developed in IONIC show a consumption of resources, in terms of CPU and RAM, comparable to that of the application developed in Android. Nevertheless, it obtains higher response times than the native application. Keywords: Telemonitoring system · Peritoneal dialysis patients · IONIC multiplatform framework · Performance
1 Introduction
Chronic kidney disease (CKD) is one of the most severe health conditions in the population [1]. According to the National Institute of Statistics and Geography (INEGI) [2], CKD is one of Mexico's leading causes of death. CKD patients have three treatment options: kidney
transplant, hemodialysis, and peritoneal dialysis (PD) [1]. PD is a treatment that uses the lining of the patient's abdomen (peritoneum) to filter the blood inside the body. Within PD, there are two modalities: Continuous Ambulatory Peritoneal Dialysis (CAPD) and Automated Peritoneal Dialysis (APD). PD is a continuous home treatment carried out throughout the day, and it is only necessary to go to the hospital for tests and check-ups, usually every three or four months [3]. Among the main disadvantages of PD is that the patient, or a relative acting as caregiver, is in charge of carrying out the treatment: they must fill in a dialysis biomedical data sheet daily, analyze it to detect dehydration or fluid retention, record the characteristics of the drained fluid, and assess the presence of pain and discomfort [4]. The records are delivered during the periodic examinations at the hospital. Furthermore, in the period between check-ups, there is no monitoring of biomedical indicators nor immediate control of risk situations, such as imbalances or values outside the limit range provided by the doctor. Finally, the records in the dialysis sheets are prone to errors in data clarity, loss, or forgetfulness by the patient. In this context, a telemonitoring system was developed at LANIA for monitoring and controlling patients with CKD undergoing CAPD and APD treatment [5]. The telemonitoring system is composed of a patient-oriented native Android application and a doctor-oriented mobile web application. According to the study presented by Statcounter GlobalStats [6], Android was the dominant mobile operating system in the market in 2021 with 70.52% of users, with iOS in second position with 28.7%; the remaining percentage is divided among other mobile operating systems. This fact implies that some patients with iOS mobile devices may not be able to take advantage of the LANIA telemonitoring system. In this work, we present the implementation of the patient-oriented component for telemonitoring patients with CKD undergoing PD treatment in the IONIC multiplatform framework. IONIC allows a single application to be developed for multiple mobile operating systems, optimizing development time and access to the device hardware. Furthermore, the applications are compiled natively, creating a specific high-performance version for each target platform [7]. The performance tests on the patient application's services developed in IONIC showed a consumption of resources, in terms of CPU and RAM, comparable to that of the application developed in Android. Nevertheless, it obtains higher response times than the native application.
2 Cross-Platform Analysis
Cross-platform refers to the ability of software or hardware to operate identically on different platforms. In this regard, cross-platform frameworks are the tools that allow the development of applications with the behavior described previously [8]. In the context of mobile apps, these frameworks allow developers to create hybrid apps deployable on multiple operating systems (OS). These types of apps are not native applications, because the rendering is done through web views and not with graphical interfaces specific to an OS, but they are not web applications either, considering that they are packaged to be deployed on the device while still working with the native system API [8]. As a result of an analysis carried out in this work, some useful frameworks (PhoneGap [9], Apache Cordova [10], Titanium [11], Kony Visualizer [12], IONIC [13], and Xamarin [14]) were identified for developing cross-platform software for mobile devices. The
studied frameworks were chosen according to the needs and objectives of this project; e.g., the framework should allow the generation of apps for the predominant operating systems in the market, including Android, iOS, and Windows. Besides, the app must be hybrid, i.e., the framework must be able to assemble the mobile app from web development technologies. From the analysis of the technologies used, the supported platforms, and the advantages, disadvantages, and costs of these frameworks, we conclude that IONIC is the framework that best fits the project needs. IONIC [13] is an open-source framework for developing hybrid mobile applications using web technologies such as CSS, HTML5, and SASS (Syntactically Awesome Stylesheets). In addition, it provides access to the main application development interfaces in JavaScript for greater customization, allowing the application to be compiled and distributed to mobile operating systems such as Android, iOS, and Windows through the native application stores [13]. By including Angular, IONIC provides its own components and custom methods to interact with them and works on the Model View Controller (MVC) architecture [13].
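As a brief illustration, not taken from the LANIA apps, the sketch below shows how an IONIC page combines an Angular component class (the controller/model side of MVC) with IONIC's own UI elements in the template; the component name and its content are illustrative assumptions only.

```typescript
import { Component } from "@angular/core";

// Minimal IONIC (Angular) page: the TypeScript class acts as the controller,
// while the inline template uses IONIC UI components (ion-header, ion-content,
// ion-button) instead of platform-specific widgets.
@Component({
  selector: "app-home",
  template: `
    <ion-header>
      <ion-toolbar>
        <ion-title>{{ title }}</ion-title>
      </ion-toolbar>
    </ion-header>
    <ion-content class="ion-padding">
      <ion-button (click)="greet()">Tap me</ion-button>
      <p>{{ message }}</p>
    </ion-content>
  `,
})
export class HomePage {
  title = "Telemonitoring demo";
  message = "";

  greet(): void {
    // Model update: Angular re-renders the bound template automatically.
    this.message = "Hello from a single code base for Android and iOS";
  }
}
```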
3 Implementation of Patient and Medical Staff Apps in IONIC
The telemonitoring system developed at LANIA [5] is characterized by offering different services for the monitoring, control, and remote treatment of patients undergoing CAPD and APD. This system is composed of a native Android application, which provides services to the patient, and a mobile web application, which provides a set of services to the medical staff (see Fig. 1).
Fig. 1. Telemonitoring system for patients undergoing peritoneal dialysis treatment.
The implementation of the telemonitoring system carried out with the IONIC cross-platform framework allows the apps to be developed with a syntax and structure based on HTML and AngularJS and executed on Android and iOS [13]. The graphical user interface (GUI) component includes style sheets generated from SASS, a scripting language that is translated into CSS [15]. The app follows the MVC architectural pattern and is developed to be fast, minimizing the manipulation of the DOM (Document Object Model), which is essentially a platform interface that provides the standard set of objects for representing HTML, XHTML, and
XML documents. Iterative calls are made asynchronously, waiting for the callback [13]. The IONIC architecture is composed of three main blocks or modules:
• IONIC: helps structure the application.
• CORDOVA: used to add native device support via plugins.
• GULP: its build file is non-configurable code; the minify step helps concatenate files, and tasks run with maximum concurrency.
The following sections describe the implementation of the apps oriented to the medical staff and to the patients of the telemonitoring system developed at LANIA [5].
3.1 Medical Staff Oriented App
The implementation of the app oriented to the medical staff considers its execution on Android and iOS. The developed services are: 1) Patients' management, 2) Search exchange records, 3) Search lab results, 4) Manage clinical history, 5) Range settings, 6) Generate notifications, and 7) Review alerts. The implementation of the main services of the medical staff app is detailed below.
Search exchange records: This service shows the list of exchanges recorded by the previously selected patient (see Fig. 2a). The list is specified by date and record number; the doctor sets the corresponding date ranges and selects the treatment modality (APD/CAPD) to search for records. The detail is displayed when a record is selected (see Fig. 2b).
Search lab results: This service displays a list of the laboratory results of a selected patient (see Fig. 3a). The user interface displays two buttons to indicate the assignment or absence of lab results. In addition, the interface offers a search for records using date ranges. Finally, the detail of each selected laboratory result is displayed through an AlertController (see Fig. 3b).
Fig. 2. Search exchange records GUI, (a) list of exchanges (b) detail of a selected exchange.
Fig. 3. Search lab results GUI, (a) list of lab results and (b) detail of the lab result.
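The search services above share a common pattern: query the backend with a date range and show the selected record's detail in an AlertController dialog. The sketch below illustrates that pattern; the endpoint URL, the Exchange fields, and the service name are assumptions made for the example, not details taken from the LANIA system.

```typescript
import { Injectable } from "@angular/core";
import { HttpClient, HttpParams } from "@angular/common/http";
import { AlertController } from "@ionic/angular";
import { Observable } from "rxjs";

// Illustrative record shape; the real exchange sheet has more fields.
export interface Exchange {
  id: number;
  date: string; // ISO date, e.g. "2022-06-15"
  modality: "APD" | "CAPD";
  ultrafiltration: number;
}

@Injectable({ providedIn: "root" })
export class ExchangeService {
  // Hypothetical REST endpoint exposed by the telemonitoring backend.
  private readonly baseUrl = "https://example.org/api/exchanges";

  constructor(private http: HttpClient, private alertCtrl: AlertController) {}

  // Query the exchanges of one patient filtered by modality and date range.
  find(patientId: number, modality: string, from: string, to: string): Observable<Exchange[]> {
    const params = new HttpParams()
      .set("patient", String(patientId))
      .set("modality", modality)
      .set("from", from)
      .set("to", to);
    return this.http.get<Exchange[]>(this.baseUrl, { params });
  }

  // Show the detail of a selected record in an AlertController dialog.
  async showDetail(exchange: Exchange): Promise<void> {
    const alert = await this.alertCtrl.create({
      header: `Exchange #${exchange.id}`,
      message: `${exchange.modality} on ${exchange.date}, ultrafiltration: ${exchange.ultrafiltration} ml`,
      buttons: ["OK"],
    });
    await alert.present();
  }
}
```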
Generate notifications: This service provides doctor-to-patient communication. The doctor visualizes the list of notifications, from the current date on which it is accessed up to one month later (see Fig. 4a). Selecting a record allows viewing the details of that notification (see Fig. 4b); if the notification has a valid date, it can be edited, and its synchronization status is displayed. To generate a new notification, the type, a brief description, and the expiration date must be indicated (see Fig. 4c). An AlertController confirms the saving status (see Fig. 4d).
Fig. 4. Generate notifications GUI, (a) list of notifications, (b) notification detail, (c) fields to create a new notification, and (d) confirmation message.
Range settings: This service allows the doctor to set the minimum and maximum values (see Fig. 5a) of biomedical data such as hematocrit, sodium, potassium, albumin, and ultrafiltration for APD patients (see Fig. 5b). AlertController confirms the saving status (see Fig. 5c).
Fig. 5. Range settings GUI, (a) fields to set minimum and maximum values, (b) biomedical data, and (c) confirmation message.
3.2 Patient Oriented App
The implementation of the app oriented to the patient considers its execution on Android and iOS. The developed services are: 1) Register exchanges APD and CAPD, 2) Register lab results, 3) Search clinical history, 4) Alerts generation, 5) Search notifications, and 6) Settings. The implementation of the main services of the patient app is detailed below.
Fig. 6. Register exchanges APD GUI, (a) required data for APD treatment and (b) required data for vital signs.
Register exchanges APD: This service allows recording the data required for the APD treatment (see Fig. 6a), validating the mandatory fields and the permitted number of characters. In addition, vital signs can optionally be recorded (see Fig. 6b) via an AlertController. Register exchanges CAPD: This service allows registering the data required for the CAPD treatment (see Fig. 7a and 7b), validating the mandatory fields and the permitted number of characters. In addition, vital signs can optionally be recorded (see Fig. 7c) via an AlertController. Finally, the status of the exchange submission is confirmed through an AlertController, allowing the patient to know whether the exchange record was synchronized with the doctor's app and can be viewed by the doctor (see Fig. 7d).
Fig. 7. Register exchanges CAPD GUI (a - b) required data for CAPD treatment, (c) required data for vital signs, and (d) submission status.
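The mandatory-field and length validation mentioned for the exchange forms maps naturally onto Angular reactive forms. The following sketch shows one way to express it; the field names and limits are illustrative assumptions, not the actual validation rules of the patient app.

```typescript
import { Component } from "@angular/core";
import { FormBuilder, FormGroup, Validators } from "@angular/forms";

// Illustrative CAPD exchange form: mandatory fields plus character limits,
// mirroring the validation described for the register-exchange GUIs.
@Component({
  selector: "app-capd-exchange",
  templateUrl: "./capd-exchange.page.html",
})
export class CapdExchangePage {
  exchangeForm: FormGroup;

  constructor(private fb: FormBuilder) {
    this.exchangeForm = this.fb.group({
      infusedVolume: ["", [Validators.required, Validators.maxLength(5)]],
      drainedVolume: ["", [Validators.required, Validators.maxLength(5)]],
      fluidCharacteristics: ["Transparent", Validators.required],
      observations: ["", Validators.maxLength(120)], // optional free text
    });
  }

  submit(): void {
    if (this.exchangeForm.invalid) {
      // Mark fields so the template can show the validation messages.
      this.exchangeForm.markAllAsTouched();
      return;
    }
    // Here the record would be sent to the backend or stored locally.
    console.log("Exchange ready to submit:", this.exchangeForm.value);
  }
}
```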
Search notifications: This service allows visualizing the doctor’s notifications to the patient from the medical staff app. The list shows the date, type, and a brief description of the notification (see Fig. 8a). A given range of dates can filter this list. When the user selects a record, the complete notification’s detail and the validity status are displayed through AlertController (see Fig. 8b).
Register lab results: The patient captures the lab results through this service; ion-tabs elements were implemented for each of the exams: blood biochemistry (Fig. 9a), complete blood count (Fig. 9b), lipid profile (Fig. 9c), and determination of albumin and liver enzymes (Fig. 9d).
Fig. 8. Search notification GUI, (a) notification list and (b) notification detail.
Alerts generation: This service sends alerts via SMS to a receiver (a doctor or a patient's relative) previously designated by the patient. Notifications include the type of alert and its description and can be generated by the three factors described below:
• Fluid characteristics: if the fluid characteristic recorded is different from the "Transparent" option in the register exchanges APD and CAPD service, the app generates an alert.
• Biomedical index: the doctor previously establishes ranges for some of the biomedical data that the patient records. Fields for bounded biomedical data are grayed out in the APD and CAPD exchanges registration GUI. If the patient enters a value outside the established range, the app generates an alert.
• Ultrafiltration: for patients undergoing APD treatment, there is a capture field for ultrafiltration, which must be within a range established by the doctor and is grayed out in the APD exchanges registration GUI. An alert is generated if the patient enters a value outside the set range.
The patient is warned about the alert's generation by a message shown on the screen for three seconds.
3.3 Integration Tests
This section describes the integration tests carried out independently for each module that composes the application, to verify whether it operates as expected and no errors are found. Tests were based on scenarios in which two patients and a doctor were simulated. Table 1 presents the general report of the integration tests performed on the modules of the doctor's application.
Fig. 9. Register exam lab GUI, (a) biochemical data, (b) blood count data, (c) lipid profile data, and (d) albumin and liver enzyme data.

Table 1. Summary of doctor application integration tests (Errors Found (EF), Errors Resolved (ER), Errors Pending (EP) and Iterations (I))

Module                    Sub-module   EF   ER   EP   I
Log in                                 2    2    0    3
Register person           Patient      3    3    0    4
                          Nurse        3    3    0    4
                          Doctor       3    3    0    4
User manager                           4    4    0    6
Patient selection                      2    2    0    3
Search exchange record                 4    4    0    6
Search lab results                     1    1    0    3
Manage clinical history                3    3    0    5
Range setting                          1    1    0    3
Generate notifications                 2    2    0    4
Review alerts                          3    3    0    5
Sign off                               1    1    0    3
Based on the summary in Table 1, we identified that the most common errors found in the tests were the asynchronous ones, because the processes must wait for responses, which had not been considered. It was also found that the ion-datetime components return the date in a different format, causing an error when saving data on the servers; the value sent must be of date type instead of text type. Patient app tests were based on scenarios in which two patients were simulated. Table 2 presents the general report of the integration tests performed on the patient application modules. As in the medical staff app, the ion-datetime components return the date in a different format, causing an error when saving data on the servers; the value sent must be of date type instead of text type. In addition, the asynchronous functions require adjustments in the times assigned to obtain the response, due to the number of services that are triggered simultaneously and depend on each other.

Table 2. Summary of patient application integration tests (Errors Found (EF), Errors Resolved (ER), Errors Pending (EP), and Iterations (I))
Module                       Sub-module         EF   ER   EP   I
Log in (with connection)                        0    0    0    3
Log in (with no connection)                     0    0    0    3
Register exchanges           CAPD               1    1    0    3
                             APD                1    1    0    3
                             Pre-Dated CAPD     1    1    0    3
                             Pre-Dated APD      1    1    0    3
Notifications                                   0    0    0    3
Register lab results         Biochemistry       0    0    0    3
                             Hemogram           0    0    0    3
                             Lipidic profile    0    0    0    3
                             Protein            0    0    0    3
                             Hepatic enzymes    0    0    0    3
Clinical history                                0    0    0    3
Treatment history                               1    1    0    3
Contacts                                        1    1    0    3
Doctor setting                                  0    0    0    3
Relative setting                                2    2    0    3
Alerts                       CAPD               1    1    0    3
                             APD                1    1    0    3
                             Lab results        1    1    0    3
Sync now                                        3    3    0    5
Sign off                                        0    0    0    3
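A simple way to avoid the ion-datetime format mismatch reported above is to normalize the component's ISO string before the record is sent to the server. The helper below is only an illustration; the expected server-side format ("YYYY-MM-DD") and the field names are assumptions, not the actual contract of the LANIA backend.

```typescript
// ion-datetime returns an ISO 8601 string such as "2022-06-15T09:30:00-06:00".
// If the backend expects a plain date, send only the date part instead of the
// raw text value.
export function toServerDate(ionDatetimeValue: string): string {
  // Validate that the value really is a parseable date/time string.
  if (isNaN(new Date(ionDatetimeValue).getTime())) {
    throw new Error(`Invalid date value: ${ionDatetimeValue}`);
  }
  // ISO 8601 strings start with the calendar date, e.g. "2022-06-15".
  return ionDatetimeValue.substring(0, 10);
}

// Example: building the payload of an exchange record before submission.
const payload = {
  patientId: 42, // illustrative value
  date: toServerDate("2022-06-15T09:30:00-06:00"), // -> "2022-06-15"
  ultrafiltration: 350,
};
console.log(payload);
```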
4 Performance Tests
Performance tests helped us obtain and validate response times and assess scalability and stability. Performance-related activities have to do with the use of resources in terms of RAM, CPU, and response time, providing data that indicate the probability of user dissatisfaction with the performance characteristics of the system [16]. The elements considered in the performance tests are presented below.
4.1 Factors to Analyze
The performance tests consider the measurement of three aspects that compare the performance of the application developed in IONIC with that of the application developed in Android. The elements that were analyzed are described below:
• Response time (seconds): it is important to measure the time taken by each of the activated services; we measure the time from the call to the service until the response from the server is received and processed. Therefore, it is necessary to measure the times in order to denote the possible differences.

Response time = (RT − SRT + IRT)   (1)

RT - Request Time, SRT - Server Response Time, IRT - Information Response Time.
• RAM consumption (MB): it allowed identifying the range of memory consumed by the application and the minimum capacities of the devices that support it, because excessive RAM consumption causes slow app operation.
• CPU consumption (%): as with RAM consumption, excessive CPU consumption would lead to the application crashing.
4.2 Comparison Against the App on Android
The performance comparison, based on factors such as RAM, CPU, and response time, between the application developed in IONIC and the patient-oriented application developed in Android allowed us to assess whether development in multiplatform frameworks is better than, worse than, or equal to native app development, using the same services and integration test scenarios. Based on the tests conducted in three iterations on the application aimed at the patient, the following graphs were obtained. The list of tests used is: PUI_AP-001: Log in (Authenticate), PUI_AP-003: CAPD exchange, PUI_AP-004: APD exchange, PUI_AP-005: Notifications, PUI_AP-006: Lab results, PUI_AP-007: Clinical history, PUI_AP-008: Treatment history, PUI_AP-010 PUI_AP-011: Settings, PUI_AP-012: Sync. The data averaged over the three iterations and presented in the graphs were obtained through Android Studio. Although the process executed the complete flow of each listed service, the Android Studio tool provided the data on the consumption of RAM and CPU and on the response time, see Eq. (1).
CPU Consumption
Figure 10 shows the services tested and evaluated in terms of CPU consumption. In services such as Authenticate, Exchange APD, Exchange CAPD, and Sign off, an average difference of 5% in consumption was found. For the Lab results test, the application built at LANIA (native Android) consumed 20% more than the application built in IONIC, due to the difference in how the information sent to the server is handled.
Fig. 10. CPU consumption graph.
RAM Consumption
Figure 11 shows the services tested and evaluated in terms of RAM consumption. In services such as APD Exchanges, Notifications, Lab Results, Clinical History, Treatment
History, Settings, and Sync, we found an average difference of 7.6 MB in consumption. For the Log in and CAPD Exchange service tests, the application built in IONIC consumed 13 MB more than the application built at LANIA, since objects had to be created in which the app stores the patient's temporary information, in addition to storing the data locally. Lastly, for the Sign-off service, the consumption of the application built at LANIA was greater by 24 MB, due to the difference in handling the information stored in objects, which must be cleaned; in the application built in IONIC, the objects are destroyed.
Fig. 11. RAM consumption graph.
Response Time
Regarding response times, it is notable that IONIC takes longer to execute the services, because IONIC works asynchronously on specific tasks such as requests to the server, local storage, and queries to local storage, forcing the application to wait a few seconds before using the responses (see Fig. 12).
Fig. 12. Response time graph.
For the Log in, APD exchange, CAPD exchange, Load Notification, Lab Results, Settings, and Sync services, the time accumulates because of the multiple functions performed internally that are necessary for the correct operation of the application. A possible solution to this problem is to reimplement the services using cascading (chained) calls for the asynchronous functions, without blocking while waiting for each response.
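The kind of rework suggested here can be sketched as follows: instead of awaiting each request sequentially, independent calls are launched in parallel and dependent work is chained, so the application does not sit idle while responses arrive. The service names and endpoints are placeholders, not the actual calls of the patient app.

```typescript
// Illustrative refactoring: three independent requests that were awaited one
// after another are launched together, and dependent work is chained instead
// of blocking the caller.

async function loadSequentially(api: { get(url: string): Promise<unknown> }) {
  // Original style: total time is roughly the sum of the three calls.
  const notifications = await api.get("/notifications");
  const labResults = await api.get("/lab-results");
  const history = await api.get("/clinical-history");
  return { notifications, labResults, history };
}

async function loadInParallel(api: { get(url: string): Promise<unknown> }) {
  // Reworked style: total time is roughly the slowest single call.
  const [notifications, labResults, history] = await Promise.all([
    api.get("/notifications"),
    api.get("/lab-results"),
    api.get("/clinical-history"),
  ]);
  return { notifications, labResults, history };
}

// Dependent steps can be chained so the caller is never blocked waiting.
function syncExchange(api: { post(url: string, body: unknown): Promise<unknown> }, record: unknown) {
  return api
    .post("/exchanges", record)
    .then(() => console.log("Exchange synchronized"))
    .catch((err) => console.error("Sync failed, will retry later", err));
}
```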
5 Conclusions and Future Work
This paper presented the implementation, in the IONIC multiplatform framework, of a telemonitoring system for the control and follow-up of patients with chronic kidney disease (CKD). This system comprises two applications, one aimed at the patient and the other at the specialist doctor and nurse (medical staff). The selection of IONIC as the development framework was based on its attributes, among which stand out the support for the three dominant platforms in the market, the support for AngularJS, an acceptable performance for an application that does not need heavy graphics, and the use of a generic language. The results of the performance tests carried out on the services of the patient application show a consumption of resources, in terms of CPU and RAM, comparable to that of the application developed on Android. However, it shows a longer response time than the native app. As future work, we plan to identify and re-implement the services that produced longer times than the application built at LANIA, using cascading programming for the asynchronous functions so that there is no need to block while waiting for responses, which will reduce the response times.
References
1. Treviño Becerra, A.: Protección Renal. Revista Oficial del Colegio de Nefrólogos 41, 1 (2020)
2. INEGI: Características de las defunciones registradas en México durante 2020. COMUNICADO DE PRENSA NÚM. 592/21, pp. 1–4 (2021). https://www.inegi.org.mx/contenidos/saladeprensa/boletines/2021/EstSociodemo/DefuncionesRegistradas2020preliminar.pdf. Accessed 21 Apr 2022
3. Méndez-Durán, A., Méndez-Bueno, J., Tapia-Yáñez, T., Muñoz-Montes, A., Aguilar Sánchez, L.: Epidemiología de la insuficiencia renal crónica en México. Elsevier España S.L. 31(1), 8–10 (2010)
4. Contreras, F., Espinosa, J.C., Esguerra, G.A.: Quality of life, self-efficacy, coping styles and adherence to treatment in patients with chronic kidney disease undergoing hemodialysis treatment. Psicología y Salud 18, 165–179 (2008)
5. Rodiz-Cuevas, J., Lopez-Dominguez, E., Hernandez Velazquez, Y.: Telemonitoring system for patients with chronic kidney disease undergoing peritoneal dialysis. IEEE Lat. Am. Trans. 14(4), 2000–2006 (2016)
6. Statcounter GlobalStats: Mobile Operating System Market Share Worldwide. https://gs.statcounter.com/os-market-share/mobile/worldwide. Accessed 21 Mar 2022
7. Delía, L., Galdamez, N., Thomas, P., Pesado, P.: Un Análisis Experimental de Tipo de Aplicaciones para Dispositivos Móviles. XVIII Congreso Argentino de Ciencias de la Computación, pp. 766–776 (2013). http://sedici.unlp.edu.ar/bitstream/handle/10915/32397/Documento_completo.pdf?sequence=1. Accessed 21 Apr 2022
8. Rodríguez, C., Enríquez, H.: Características del desarrollo en frameworks multiplataforma para móviles. INGENIUM 15(30), 101–117 (2014)
9. A. S. Inc.: Adobe PhoneGap. https://phonegap.com (2008). Accessed 14 Sep 2018
10. T. A. S. Foundation: CORDOVA. https://cordova.apache.org (2010). Accessed 14 Sep 2018
11. Axway: Appcelerator. https://www.appcelerator.com/Titanium/ (2008). Accessed 14 Sep 2018
12. I. Kony: KONY. https://www.kony.com/products/visualizer/ (2012). Accessed 14 Sep 2018
13. D. Co.: IONIC. https://ionicframework.com/ (2013). Accessed 27 Sep 2017
14. Xamarin, Xamarin Inc.: https://www.xamarin.com/ (2011). Accessed 27 Sep 2017
15. Sass: https://sass-lang.com (2006). Accessed 21 Sep 2018
16. Molyneaux, I.: Chapter 1: Why Performance Test? In: The Art of Application Performance Testing. O'Reilly Media (2009)
System for Monitoring and Control of in Vitro Ruminal Fermentation Kinetics Luis Manuel Villasana-Reyna1 , Juan Carlos Elizondo-Leal1(B) , Daniel Lopez-Aguirre1 , Jose Hugo Barron-Zambrano1 , Alan Diaz-Manriquez1 , Vicente Paul Saldivar-Alonso1 , Yadira Quiñonez2 , and Jose Ramon Martinez-Angulo1 1 Facultad de Ingenieria y Ciencias, Universidad Autonoma de Tamaulipas, 87120 Victoria,
Mexico {jcaelizondo,dlaguirre,hbarron,amanriquez,vpsaldiv, jrangulo}@docentes.uat.edu.mx 2 Facultad de Informatica Mazatlan, Universidad Autonoma de Sinaloa, 82000 Mazatlan, Mexico [email protected]
Abstract. The ruminant digestive system is characterized by pre-gastric retention and fermentation. In vitro gas production techniques help predict the fermentation kinetics of feeds for ruminants. In this paper, an integrated automated system for monitoring the gas produced by ruminal fermentation is presented. The system uses a microcontroller equipped with pressure and temperature sensors whose function is to collect data from the experimental samples and to release the generated gas. The database is generated on a local server with remote internet access and is in turn processed by a mobile application for data visualization, experiment control, and real-time monitoring. With the proposed system, experimental tests are run on ruminants, and from the data obtained the producer can decide how to establish diets for weight gain, growth, or animal health. Keywords: Remote monitoring and control · Arduino · Web service · Mobile application · Fermentation kinetics
1 Introduction
Domesticated grazing ruminants (i.e., cattle and sheep) are an efficient way to produce food for humans. Ruminants have a highly developed and specialized mode of digestion that allows them better access to energy in the form of fibrous feeds. The study of their digestive system is characterized by pregastric retention and fermentation [1]. Such studies are necessary to evaluate feedstuffs for ruminant diets [2]. However, these methods are becoming increasingly less attractive because of animal welfare issues, the costs associated with maintaining surgically modified animals, and the limited number of samples that can be examined at one time [3]. To identify the nutritional contribution that feeds offer to ruminant species,
in vitro cumulative gas production techniques were traditionally developed to predict the fermentation of feedstuffs. First, a feedstuff is incubated with buffered rumen fluid, and the gas produced is measured as an indirect indicator of fermentation kinetics [4, 5]. Menke et al. [6] described an in vitro system in which the gas produced from the fermentation of a substrate was used to estimate digestibility and metabolizable energy content. The syringe was incubated in a horizontally rotating rack at 39 °C, and cumulative gas production was measured by reading the position of the piston at various time intervals. The most straightforward pressure measurement technique requires manual measurement of the headspace pressure, using a syringe connected to a pressure equalizer valve and removing a sample of the headspace gas to measure gas production [7]. On the other hand, Wilkins [8] describes a different approach to measuring fermentation kinetics in vitro: the fermentation takes place in a sealed vessel, and the gas produced is determined using a pressure transducer to measure the pressurized gas accumulating in the vessel headspace. This principle of measuring pressure with a sensor or transducer has been widely adopted as a simple yet sensitive method of determining fermentation kinetics. It is important to mention that works describing the semi- and fully automated recording of headspace pressure have been developed by Pell and Schofield [9], Cone et al. [10], Mauricio et al. [3], and Davies et al. [11]. Recently, a gas production system (ANKOM RF Gas Production Measurement System) has been developed (Ankom Technology, Macedon, NY, USA), consisting of a bottle kit equipped with a pressure detector and wirelessly connected to a personal computer. Pressure values are recorded and transmitted to the computer at a set time interval. When a set threshold pressure has been reached, the gas accumulating in the headspace of the bottles is automatically released by an open-close valve. In recent years, intelligent sensors have attracted significant attention in livestock [12–14] and agriculture [15–17]. Moreover, the use of sensors and information technologies applied to the precision nutrition of animals makes it possible today to adopt a model-data fusion approach in which the data provided by sensors can be used for more accurate, certain, and timely predictions. The literature reports models in which human intervention is minimal in the different processes where animals and monitoring technology are involved, using sensors or databases, for example by establishing the start and finish times of the feed supplementation of grazing animals in a timely fashion. These databases help deliver, with more precision, the amount of feed required daily for a target production level, according to observed trends in live weight (LW) and diet quality. With these systems and the databases generated, users can optimize time and monitor variables in real time during experimental processes to support decision strategies [18, 19]. The use of smart sensor-based monitoring systems represents an area of interest in application development, since obtaining environmental information and physiological parameters from animals can also be used to predict health levels, and with this information health experts and producers can build a decision-making basis for ruminant animals [20].
In this work, we present an integrated automated system for the real-time measurement of variables such as pressure and temperature. The information is processed through a local server and a mobile application for data visualization, experiment control,
260
L. M. Villasana-Reyna et al.
and monitoring with objective of takes decisions in experimental process in ruminal fermentation kinetics.
2 Materials and Methods Starting with the design of the system, it is necessary to define a set of functional requirements for its development. The main requirements for the correct functioning of the system are listed below:
• Creating an experiment. The user must be able to create a new experiment through the mobile application by providing the name of the experiment and the reading times that the system must execute.
• Displaying information about the experiment. Within the mobile application, the information obtained by the compilation system of the experiment must be displayed in real time, taking into consideration pressure, temperature, and time. The information can be presented in general form as well as in detail for each sample.
• Adjusting the experiment. Throughout the experiment, the user should be able to make label changes to some of the monitored elements or change the programmed reading times, whether it is a new entry, an edit, or a deletion.
• Making adjustments in the application. It is intended to establish the configuration elements that directly affect the operation of the application; the aspects taken into consideration are manipulating the frequency with which user information is updated and receiving alerts about the process.
• Unfolding a list of records of the concluded experiments. Once an experiment has finished, the system closes the processes and keeps the records of such experiments, which the user can access.
• Eliminating experiment records. The records of users authenticated in the system may be deleted by their author.
• Exporting experiment logs. The records of the experiments completed by the user can be exported from the system to Excel sheets.
The non-functional requirements of the system are shown next.
• Compilation mechanisms. It is necessary to establish a mechanism that allows the information of each experiment sample to be reliably collected with the least need for intervention by the user.
• Database server. Once the information is taken from each sample, it is necessary to store it in a database that allows the manipulation of the information.
• Web server. An architecture is required to make the information available, for which it is necessary to implement a web server.
• Web service. An interface must be established between the information stored in the database and the mobile application.
• Mobile application. The information must be presented remotely through a mobile application developed for the Android operating system.
Figure 1 shows a general outline of the process and components that integrate the designed system. As a result of the analysis of the problems described, a solution that satisfies the aforementioned needs is presented. The circuit is made up of four essential phases or stages for its execution: designing a compilation system, incorporating a local server, implementing a web service, and presenting the data in a mobile application.
Fig. 1. General outline of the components that integrate the system.
2.1 Designing the Compilation System Figure 2 shows the design of the data compilation and actuator control stage. It shows the software and hardware involved in carrying out, as a compilation system, the task of measuring the values presented in each sample. The circuit is divided into modules, each with the ability to monitor six samples. Each module is made up of six gage pressure sensors of the Honeywell brand, model SSCDANN015PGAA5. In addition,
Fig. 2. Schematic diagram of the compilation system.
six 2-way pinch valves of the brand NResearch Incorporated, model 225P011-21, directly applied to the samples and controlled by an Arduino UNO. Moreover, a primary module incorporates a DS18B20 temperature sensor and an actuator that carries out the sample shaking. Finally, the circuits were coated with a commercial insulating spray since the modules work in a place with high humidity levels. The data compilation module has an execution software whose behavior is expressed in Algorithm 1. The algorithm establishes the functionality of the compilation module (Arduino UNO), whose operation is driven by requests found in a repetitive execution block in which the serial port is read to search for a new execution command. When a command is found, a comparison is made to identify the corresponding case, which must return the associated value to the serial port. The functionalities that can be carried out within the software are listed next:
• Identification: a mechanism was established to identify each module controlled by the local server. The local server requests the module identifier through the serial port, and the module answers with it.
• Opening and closing valves: each sample handled by the Arduino UNO platform has a gas accumulation valve. The local server controls the device's behavior and requests each order independently through the serial port.
• Shaking: as part of the established requirements, all the samples must be shaken at determined moments; in the primary module, an actuator can be activated by the local server through the serial port.
• Pressure and temperature: the corresponding sensor is read, the obtained data is converted, and the result is made available on the serial port.
The algorithm, described below as the data collection protocol on the module side, is as follows.
    Input: commands from the serial port
    Output: hardware actions and data writing
    while (module is on)
        if (SerialPort.isAvailable)
            switch (SerialPort.getValue)
                case 1: OpenValves(); break;
                case 2: CloseValves(); break;
                case 3: StartAgitation(); break;
                case 4: StopAgitation(); break;
                case 5: SerialPort.write(ID); break;
                case 6: SerialPort.write(Temperature); break;
                case 7: SerialPort.write(Pressure);
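For illustration only, the following is a minimal C-style sketch of how the command dispatch of Algorithm 1 could be organized on the module. It is not the authors' original firmware: the helper functions (serial_available, serial_read_byte, serial_write_value, open_valves, close_valves, start_agitation, stop_agitation, read_temperature, read_pressure) are hypothetical placeholders for the corresponding Arduino serial and sensor routines.

    #include <stdint.h>

    /* Hypothetical hardware helpers; on the Arduino UNO these would wrap
       the serial reads/writes, the digital outputs driving the pinch valves
       and the shaker, and the DS18B20 / SSCDANN015PGAA5 reading routines. */
    extern int     serial_available(void);
    extern uint8_t serial_read_byte(void);
    extern void    serial_write_value(float value);
    extern void    open_valves(void);
    extern void    close_valves(void);
    extern void    start_agitation(void);
    extern void    stop_agitation(void);
    extern float   read_temperature(void);
    extern float   read_pressure(void);

    #define MODULE_ID 1.0f  /* identifier reported to the local server (assumed) */

    /* Repetitive execution block: wait for a command byte and dispatch it,
       mirroring cases 1-7 of Algorithm 1. */
    void module_loop(void)
    {
        for (;;) {
            if (!serial_available())
                continue;
            switch (serial_read_byte()) {
            case 1: open_valves();                           break;
            case 2: close_valves();                          break;
            case 3: start_agitation();                       break;
            case 4: stop_agitation();                        break;
            case 5: serial_write_value(MODULE_ID);           break;
            case 6: serial_write_value(read_temperature());  break;
            case 7: serial_write_value(read_pressure());     break;
            default: /* unknown command: ignore */           break;
            }
        }
    }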
2.2 Local Server The local server can be considered the central element of the system, as it is in charge of coordinating the activities necessary for monitoring the experiment and takes part in the second stage of the data flow, from acquisition until its complete presentation to the user. The server establishes a connection with the compilation modules implemented in the experiment through USB ports. The number of modules that can be implemented will depend on the capacities of the server. The server functions as a controller of the compilation system, grouping the information collected by the sensors into a SQL database and making it available by implementing a local web service. The software and task elements executed inside the local server are described next. The exchange of information between the compilation modules and the local server is done through the serial ports. For this task, a program was developed and implemented in the C language, which in a general way is divided into two sections according to its operation. The first phase consists of the identification of the modules (Arduino) connected to the local server. The process to complete the first phase is as follows:
1. A scan is performed over a given number N of ports.
2. Once communication is established with a module, its identification is requested.
3. A connection to the database is opened, and it is determined whether the identifier received from the module exists. If it has not been registered, a new record is created, and then the existence of the records of the module samples is queried.
4. The connection to the database is closed.
Once the modules connected to the local server have been identified, the second phase follows, which in a general way consists of monitoring the readings programmed by the applier. The software connects to the database, followed by a search of the records related to the times of the programmed readings. When the reading list has been obtained, it is sorted in ascending order of date and time to determine the upcoming execution time. At intervals of T, previously established in the software, a request is made to the server and the current time is compared with the one obtained from the database. This is done until the time of the local server is equal to or later than the record time in the database. Once this stage is completed, the software connects with the previously identified modules. Then, each module is identified, and the temperature and pressure of each sample of the consulted module are requested. Subsequently, the information obtained through the serial port is inserted as a new record in the database; a relationship is established between the data obtained and the sample, time, and experiment to which it corresponds. The process is then marked as successful, and the waiting status is changed to "done." Finally, the reading search process starts again. 2.3 Database Within the server runs a SQL database manager, which is in charge of handling the compilation of the information gathered by the electronic system and its operation settings, as follows:
• A user does an experiment.
• An experiment regularly has notifications that include actions.
• An experiment has changes.
• An experiment has samples.
• A sample belongs to an Arduino.
• A sample includes readings.
• The readings have times.
To develop the logical structure of information storage, it is necessary to identify the entities involved and the attributes that make them up; nine entities were identified, and their function in the structure is shown in Fig. 3.
Fig. 3. Entity-relationship diagram: identification of entities and attributes.
2.4 Web Service An Apache server is running and is in charge of making the web services available to perform operations on the SQL database. The web service is developed in PHP, and the operations it provides are table queries, value insertion, record updates, and record deletion. To expose the information to the Internet, a Java program was developed whose function is to query the local database using the previously mentioned web service. The returned information is then checked and sent to an external database on a server (the cloud) by using a web service running on that server. This service is executed by an external server or a cloud domain. Its task is to store and manage a database with the values gathered by the compilation system, and it also stores the configuration values that the user enters from the mobile application. For this, the service carries out the functions of inserting, querying, modifying, and deleting data and records from the database so that the counterpart can access them at a given moment. A mobile application is used to show the desired information and parameters.
2.5 Mobile Application To present the results of the experiments, a native mobile application for the Android operating system was developed to display the information obtained by the collection system (pressure and temperature). This information is presented in graphs with the real-time history of the evolution of the observed experiments. The application's structure is made up of five general modules, which are mentioned and described next.
• Access control module: due to the sensitivity and importance of the integrity of the information managed by the system, an access control module is incorporated, through which accessibility restrictions are established by validating a previously registered username and password.
• New experiment module: within this module, the configuration tools necessary for creating a new execution cycle corresponding to an experiment are presented.
• Current module: it is in charge of querying the external web service and presenting the information obtained during the execution of an experimentation cycle. The received information is presented in detail in graphics and records, where the temperature is presented in a general way for all the samples belonging to the cycle, while the generated pressure is detailed and presented individually.
• Archives module: this module keeps access to previously concluded experimentation cycles for a determined time so the information can be recovered at a given moment. Such documents cannot be visualized within the application, for which reason the module has the function of sharing or exporting the information in Excel spreadsheets, which a spreadsheet reader can later work on. In the same way, the included documents can be eliminated; once this action has been performed, they cannot be recovered.
• Adjustments module: this module contains the controls through which the operation of the system can be adjusted, such as the modification of previously entered reading times and the frequency with which the information displayed in the current module is updated, among other application parameters.
3 Results This section presents the results of the project, mainly consisting of the compilation system, the local server, and the mobile application. A mechanism was designed to cover the needs identified in the analysis stage at the beginning of the project. This mechanism can perform pressure and temperature measurements and, at the same time, can release the gas contained in the sample vials used in feed experimentation by experts in animal nutrition. The proposed and developed prototype is made up of elements that can easily be found in online stores. The Arduino UNO, a microcontroller widely available on the market and commonly used for education or small-scale projects, was used. Differential pressure sensors were used for the pressure measurement of all samples; relays were also used to energize the release valves, together with a 12-V power supply. Figure 4a shows the internal view of the developed data compilation module, and Fig. 4b an external view of the module.
Fig. 4. Electronic system module, a) data compilation module, internal view, b) data compilation module, external view
Another essential phase of the system's functioning is the local server; as mentioned above, it is the fundamental part of the compilation mechanism since it coordinates the system's complete operation. The tasks performed on the server can be divided into three groups:
1. Serial port readings. A communication protocol between the local server and the compilation modules was implemented for requesting and receiving the information.
2. Storage. The information necessary for the system's operation and the experimentation is stored in a MySQL database, tailor-made based on the system's requirements and using established database development techniques.
3. Local server synchronization with an external server. Because the system's structure uses a local database and an external one, the need to establish a synchronization procedure arises.
For performing the reading tasks on the serial port, as well as the storage of data, a C language program was developed whose tasks are (a sketch of this reading loop is given after the list):
• Authenticating the user.
• Identifying the stand-by experiments of the user.
• Identifying the modules connected to the local server.
• Creating the registers in the database for each identified module.
• Starting the experiment selected by the user.
• Monitoring the upcoming reading times.
• Obtaining the information of each module.
• Storing the obtained information.
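The following is a minimal C sketch of that reading loop, intended only to illustrate the flow described in Sect. 2.2; the functions db_next_scheduled_time, db_store_reading, db_mark_reading_done, module_request_temperature, and module_request_pressure are hypothetical wrappers around the MySQL and serial-port calls actually used by the system, and the fixed polling interval and module/sample counts are assumptions.

    #include <stdbool.h>
    #include <time.h>
    #include <unistd.h>

    #define NUM_MODULES        4   /* number of connected Arduino modules (assumed) */
    #define SAMPLES_PER_MODULE 6   /* six samples per module, as described above    */
    #define POLL_SECONDS       10  /* interval T between clock checks (assumed)     */

    /* Hypothetical helpers wrapping the database and serial-port code. */
    extern bool   db_next_scheduled_time(time_t *when);
    extern void   db_store_reading(int module, int sample, double temp,
                                   double pressure, time_t when);
    extern void   db_mark_reading_done(time_t when);
    extern double module_request_temperature(int module);
    extern double module_request_pressure(int module, int sample);

    /* Phase 2 of the server software: wait for the next programmed reading
       time, then query every sample of every identified module and store
       the results, relating them to the sample, time, and experiment. */
    void reading_loop(void)
    {
        time_t next;

        while (db_next_scheduled_time(&next)) {
            /* Poll until the local server clock reaches the programmed time. */
            while (time(NULL) < next)
                sleep(POLL_SECONDS);

            for (int m = 0; m < NUM_MODULES; m++) {
                double temp = module_request_temperature(m);
                for (int s = 0; s < SAMPLES_PER_MODULE; s++) {
                    double pressure = module_request_pressure(m, s);
                    db_store_reading(m, s, temp, pressure, next);
                }
            }
            db_mark_reading_done(next);  /* waiting status changed to "done" */
        }
    }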
On the other hand, to perform the third task, the server has software developed in Java that constantly consults the databases (local and external) in search of changes, followed by the corresponding update. The last part of the experimentation is displaying the results to the user in the mobile application. The information can be accessed from any geographic point as long as there is Internet access. To access the information, it is necessary to launch the mobile application; when doing so, the application will immediately ask for the access credentials to authenticate the user, as shown in Fig. 5a. Once the user's credentials are successfully validated, the application provides access to the options menu; as shown in Fig. 5b, five options can be found.
Fig. 5. Mobile application, a) authentication screen, b) main menu screen
The elements necessary for creating a new experiment can be found in the new experiment module. It is essential to mention that created experiments start with a not-started status. The data required to create a new experiment are the name the experiment will take and the programming of the times at which the information obtained by the compilation modules will be saved, as can be observed in Fig. 6a. Once a new experiment is created, it is left on standby so the user can start it on the local server. After this is done, the generated information can be visualized in real time in the processing module; from there, the general temperature recorded by the compilation mechanism inside the incubator can be visualized. In the same way, the modules recognized by the local server can be visualized (see Fig. 6b). If the user wants to know in more detail the measurements made so far on a particular sample, it is only necessary to touch the sample in question; a new screen is displayed with the history of the measurements, the temperature at each moment, the time when each measurement was made, and the module to which the sample belongs. An example of this is shown in Fig. 6c. If, throughout the execution of the experiment, the user considers it necessary to modify the reading times programmed at the moment of creation of the experiment, eliminating times or creating new ones, this can be done on the process screen by selecting the time configuration; a new screen is then displayed with the list of the experiment's pending times, as shown in Fig. 6d. In order to make a change, it is only necessary to select the record at hand, followed by the operation to perform in the upper part of the screen.
Fig. 6. Mobile application, a) new experiment screen, b) experiment in execution screen, c) details of a sample screen, d) reading time screen.
It is assumed that the experiment ends when the system executes the last scheduled reading. In this way, the registry is closed. To save the collected information, it is necessary to access the registry module, where the list of completed experiments is presented. There, the user can select the records and export them to a file or, where appropriate, delete the records from the database, as shown in Fig. 7a. In the mobile application configuration module, a settings option was defined where the information update frequency can be set, in addition to the configuration of the application alerts. This can be seen in Fig. 7b.
Fig. 7. Mobile application, a) records screen, b) settings screen.
4 Discussion Throughout this project, the design, development, and implementation of an electronic circuit and a mobile application were proposed for monitoring and controlling the gas generated by ruminal fermentation, whose experiments are designed by experts in animal nutrition [6]. For the development of the prototype, a data compilation mechanism (measurement of temperature and pressure) was implemented for each sample, requiring as little human intervention as possible [15]. For this, an Arduino UNO board was chosen, which serves as the interface between the hardware and software elements. The prototype also uses actuators (release valves), power sources, and pressure and temperature sensors, considering the precision of the product. In addition, software was developed to manage the data and integrate all the elements in a mobile application to obtain a functional prototype. Once the laboratory tests had been performed, it was observed that the project had excellent acceptance by animal nutrition experts because it allowed them to observe in detail the evolution of their experiments. Therefore, the experts consider that the project is an excellent technological contribution to research on the ruminant digestive system. In addition, it was shown that, through a mobile application, it is easy to collect the data from the experiments performed in animal nutrition, and developing versions for multiple platforms in future work is possible given the use of web services.
5 Conclusions In this work, the design, development, and implementation of a mobile application for the monitoring and control of the experiments designed by animal nutrition experts was proposed. The use of this application will facilitate the collection of the data obtained from the experiments. In addition, it is possible to develop versions of the application for multiple platforms, making use of the developed web service. As future work, we must test the functionality on a larger scale to demonstrate and test the scalability of the system, as well as apply security measures that comply with established standards in the different stages of the information flow and extend the development to multiple platforms by consuming the existing web service. Also, a long-term goal is to extend the capabilities of the application, for example, to make behavioral estimates or to analyze the data generated in real time by applying the analysis metrics of animal nutrition experts. This represents ample opportunity for exploitation in the mobile application.
References
1. Van Soest, P.J.: Nutritional Ecology of the Ruminant. Cornell University Press (1994)
2. Orskov, E.R., Hovell, F., Mould, F.: Use of the nylon bag technique for protein and energy evaluation and for rumen environment studies in ruminants. Livestock Res. Rural Dev. 9, 19–23 (1997)
3. Mauricio, R.M., Mould, F.L., Dhanoa, M.S., Owen, E., Channa, K.S., Theodorou, M.K.: A semi-automated in vitro gas production technique for ruminant feedstuff evaluation. Anim. Feed Sci. Technol. 79(4), 321–330 (1999)
4. Elghandour, M., et al.: Effects of exogenous enzymes on in vitro gas production kinetics and ruminal fermentation of four fibrous feeds. Anim. Feed Sci. Technol. 179(1–4), 46–53 (2013)
5. Rymer, C., Huntington, J.A., Williams, B.A., Givens, D.J.: In vitro cumulative gas production techniques: history, methodological considerations and challenges. Anim. Feed Sci. Technol. 123, 9–30 (2005)
6. Menke, K.H., Raab, L., Salewski, A., Steingass, H., Fritz, D., Schneider, W.: The estimation of the digestibility and metabolizable energy content of ruminant feedingstuffs from the gas production when they are incubated with rumen liquor in vitro. J. Agric. Sci. 93(1), 217–222 (1979)
7. Theodorou, M.K., Williams, B.A., Dhanoa, M.S., McAllan, A.B., France, J.: A simple gas production method using a pressure transducer to determine the fermentation kinetics of ruminant feeds. Anim. Feed Sci. Technol. 48(3–4), 185–197 (1994)
8. Wilkins, J.R.: Pressure transducer method for measuring gas production by microorganisms. Appl. Microbiol. 27(1), 135–140 (1974)
9. Pell, A., Schofield, P.: Computerized monitoring of gas production to measure forage digestion in vitro. J. Dairy Sci. 76(4), 1063–1073 (1993)
10. Cone, J.W., van Gelder, A.H., Visscher, G.J.W., Oudshoorn, L.: Influence of rumen fluid and substrate concentration on fermentation kinetics measured with a fully automated time related gas production apparatus. Anim. Feed Sci. Technol. 61(1–4), 113–128 (1996)
11. Davies, Z.S., Mason, D., Brooks, A.E., Griffith, G.W., Merry, R.J., Theodorou, M.K.: An automated system for measuring gas production from forages inoculated with rumen fluid and its use in determining the effect of enzymes on grass silage. Anim. Feed Sci. Technol. 83(3–4), 205–221 (2000)
12. Benito-Lopez, F., et al.: Applicability of ammonia sensors for controlling environmental parameters in accommodations for lamb fattening. J. Sensors 2018, 4032043 (2018)
13. Zhang, L., Kim, J., Lee, Y.: The platform development of a real-time momentum data collection system for livestock in wide grazing land. Electronics 7(5), 71 (2018)
14. Germani, L., Mecarelli, V., Baruffa, G., Rugini, L., Frescura, F.: An IoT architecture for continuous livestock monitoring using LoRa LPWAN. Electronics 8(12), 1435 (2019)
15. Lakhiar, I.A., Jianmin, G., Syed, T.N., Chandio, F.A., Buttar, N.A., Qureshi, W.A.: Monitoring and control systems in agriculture using intelligent sensor techniques: a review of the aeroponic system. J. Sensors 2018, 8672769 (2018)
16. Micheletto, M., Zubiaga, L., Santos, R., Galantini, J., Cantamutto, M., Orozco, J.: Development and validation of a LiDAR scanner for 3D evaluation of soil vegetal coverage. Electronics 9(1), 109 (2020)
17. Robles Algarín, C., Callejas Cabarcas, J., Polo Llanos, A.: Low-cost fuzzy logic control for greenhouse environments with web monitoring. Electronics 6(4), 71 (2017)
18. González, L., Kyriazakis, I., Tedeschi, L.: Precision nutrition of ruminants: approaches, challenges and potential gains. Animal 12(s2), s246–s261 (2018)
19. Kim, W.-S., Lee, W.-S., Kim, Y.-J.: A review of the applications of the internet of things (IoT) for agricultural automation. J. Biosyst. Eng. 45(4), 385–400 (2020)
20. Zhang, M., Feng, H., Luo, H., Li, Z., Zhang, X.: Comfort and health evaluation of live mutton sheep during the transportation based on wearable multi-sensor system. Comput. Electron. Agric. 176, 105632 (2020)
Delays by Multiplication for Embedded Systems: Method to Design Delays by Software for Long Times, by Means of Mathematical Models and Methods, to Obtain the Algorithm with Exact Times Miguel Morán(B) , Alicia García, Alfredo Cedano, and Patricia Ventura Centro Universitario de Ciencias Exactas E Ingenierías (CUCEI), Departamento de Electro-Fotónica, Universidad de Guadalajara, Blvd. Marcelino García Barragán #1421, Esq. Calzada Olímpica, Guadalajara, Jalisco, México {miguel.moran,alicia.garrreola,alfredo.cedano, maria.vnunez}@academicos.udg.mx
Abstract. Delays by Multiplication for Embedded Systems presents a design methodology to achieve accuracy in the execution of software delay algorithms. A mathematical model is obtained, which translates into a first-degree linear algebraic equation with three unknowns; from it, and from a given total time, it is possible to establish the values of the variables involved so that the total time consumed by the algorithm is exact. However, the mathematical model obtained is, by itself, a challenge to solve mathematically. The procedure that makes the solution possible is described and implemented to find the values of the three unknowns involved in the model. Keywords: Delay by software · Delay by multiplication · Embedded systems delays · Mathematical modeling of software · MCS-51® · PIC16FXXX® · AVR®
1 Introduction This paper addresses the lack of a mathematical model and method to establish, with certainty, a software delay for long times (uncertainty by multiplication). It helps to determine the values that the variables involved in the algorithm will require to achieve accuracy in time; a compensation model is provided, based on the mathematical model, by which the remaining error is corrected with a few instruction cycles. The analysis is developed in assembly language to lay a foundation that can be transferred to other languages.
2 Delay by Multiplication The proposed delay by multiplication algorithm is developed as a function with the intention that it can be easily invoked within any program.
2.1 Description of the Delay by Multiplication Algorithm A short program called “Flashing” is illustrated (see Fig. 1.); this program will be used to measure the delay by multiplication function from the time it is invoked until the end of the delay program. The algorithm invokes the delay by multiplication function at the beginning, continuing with an operation where the state of a bit is logically negated. The call to the delay by multiplication function is repeated, thus generating a closed loop. This action could represent an LED that turns on and off “Flashing” until you choose to turn off the circuit; the delay by multiplication function will be responsible for controlling the on and off times.
Fig. 1. Delay by multiplication, using three registers R0, R1, and R2 with assigned variables X, Y, and Z.
The delay by multiplication (illustrated in Fig. 1) is composed of three 8-bit variables, named R0, R1, and R2; these elements can be declared as local variables within the function, or registers of the processor itself can be used. The registers R0, R1, and R2 are assigned different constant values with the denominations X, Y, and Z, respectively, which will be calculated later, so each register has the assignment R0 = X, R1 = Y, R2 = Z. Once the values are assigned, the sequence requires that R0 be decremented first (R0 = R0 − 1); the next step is to compare the value in R0 to find out whether it has already reached zero (R0 = 0), and if this condition is not met, decrementing continues.
Once the state R0 = 0 is reached, we proceed with the decrement in R1 (R1 = R1 − 1), and in the same way a comparison is made to find out whether R1 is equal to zero (R1 = 0); if this condition is not met, we continue decrementing the value in R0, which is now zero, so decrements are made again until R0 = 0. This means that for each decrement in R1, 256 decrements are made in R0. At the moment the condition R1 = 0 is fulfilled, the decrement in R2 is performed, continuing with the decrement in R0 (256 decrements until R0 = 0), which results in a decrement in R1, which was at zero and is now 255; since R1 is different from zero, this causes a series of execution cycles where, again, for each decrement in R1, 256 decrements are made in R0 until the condition R1 = 0 is reached, and consequently R2 is evaluated to verify whether it is zero. If the evaluation is false, decrementing continues, R0 being decremented R1 times by R2 times; on the contrary, once this condition is true, the function call is terminated, and a return (exit of the function call) is triggered. The series of decrements of the algorithm consumes instruction cycles that are translated into time, so the routine can consume the necessary time cycles until the target time is reached. 2.2 Assembler Coding In the algorithm's analysis, the device's general characteristics are contemplated. Still, the times and execution speed of the instructions are where the assembly language of the selected embedded system plays an important role. The embedded family originally from Intel, the "MCS-51," was chosen because it is one of the best known in the field of 8-bit embedded systems. Based on the flowchart (see Fig. 1), the program code is created using the data sheets where the assembler instructions are found [1]; the first column refers to the name of the routine and the labels used in the program, and the second column contains the code or program (see Table 1). The specifications of the MCS-51 embedded system family [1, 2] define that the processor performs one instruction cycle in 12 oscillator periods (IC = 12 cycles) and that the typical frequency is 12 MHz (frecTyp = 12 MHz), which means that one instruction cycle is executed in 1 µs, according to formula (1).
TIC = IC / frecTyp = 12 cycles / (12 × 10^6 cycles/second) = 1 × 10^−6 s = 1 µs
(1)
Table 1. Delay by multiplication encoding for the MCS-51 embedded system, including the machine cycles of the instructions and notes concerning the values for X, Y, and Z.
Name of the function or routine | Code | Instruction cycles (µs) | Notes
Flashing: | ACALL Delay_by_multiplication | 2 |
 | CPL P1.0 | 2 |
 | SJMP Flashing | 2 |
Delay_by_multiplication: | MOV R0, #X | 1 | 0 ≤ X ≤ 255
 | MOV R1, #Y | 1 | 0 ≤ Y ≤ 255
 | MOV R2, #Z | 1 | 0 ≤ Z ≤ 255
Loop1: | DJNZ R0, Loop1 | 2 |
 | DJNZ R1, Loop1 | 2 |
 | DJNZ R2, Loop1 | 2 |
 | RET | 2 |
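For illustration only (this is not part of the original code of the paper), the control structure of Table 1 can be rendered in C as follows; the do-while form mirrors the DJNZ "decrement and jump if not zero" semantics, in which an 8-bit register that holds zero wraps to 255 on the next decrement, so a value of 0 effectively means 256 iterations. Exact timing, however, is only guaranteed at the assembly level.

    #include <stdint.h>

    /* C rendering of the delay-by-multiplication structure of Table 1.
       Each do-while corresponds to one DJNZ instruction; unsigned 8-bit
       wraparound reproduces the "0 means 256" behavior of the registers. */
    void delay_by_multiplication(uint8_t x, uint8_t y, uint8_t z)
    {
        uint8_t r0 = x, r1 = y, r2 = z;

        do {
            do {
                do {
                    /* innermost loop: DJNZ R0, Loop1 */
                } while (--r0 != 0);
            } while (--r1 != 0);   /* DJNZ R1, Loop1 */
        } while (--r2 != 0);       /* DJNZ R2, Loop1 */
    }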
From (1), the duration of the instruction cycles necessary to execute the instructions in the delay by multiplication algorithm is defined in units of µs, as shown in the instruction cycles column (Table 1). Observations are incorporated in the notes column, where the minimum and maximum values of the variables are specified, given the 8-bit conditions previously defined in Sect. 2.1, to avoid losing sight of them in the process. 2.3 Obtaining the Mathematical Model of the Code To obtain the total time (TT) of the delay by multiplication function in µs, we take the first two µs (K1 = 2) consumed by the ACALL instruction, which invokes the delay by multiplication function (see Table 1). Once the processor is positioned in the delay by multiplication routine, three MOV Rn, #data instructions follow, each executed in 1 µs, which add three more µs (K2 = 3) to the total time; these instructions are used to load the variables X, Y, and Z into the respective registers R0, R1, and R2. After the execution of the three previous instructions, we continue with the execution of Loop1, which adds its own time to the total, as described in the following paragraphs. For code analysis purposes, the variables are assigned the values X = 4, Y = 3, Z = 3. The first round of decrements of R0 in Loop1 is shown in Table 2; it consumes eight µs, displayed in the "Instruction cycles accumulated" column. The execution of the corresponding instruction in assembly language consumes two instruction cycles; this situation is repeated three more times until R0 = 0, so we infer (2), given that X = 4 and that each execution cycle takes 2 µs. These conditions apply to all values of X within the established range 0 ≤ X ≤ 255, as observed in the notes (see Table 1). TR0 = 2X = 2(4) = 8 µs.
(2)
Table 2. Delay by multiplication: execution of Loop1 with R0, considering that X = 4
Register | Execution of instruction | Instruction cycles (µs) | Instruction cycles accumulated (µs)
Loop1: R0 = 4 − 1 = 3 | First decrement & R0 ≠ 0 | 2 | 2
R0 = 3 − 1 = 2 | Second decrement & R0 ≠ 0 | 2 | 4
R0 = 2 − 1 = 1 | Third decrement & R0 ≠ 0 | 2 | 6
R0 = 1 − 1 = 0 | Fourth decrement & R0 = 0, end Loop1 & decrement R1 | 2 | 8
Next, it is considered that R1 = 3 and that R0 = 0 (see Table 3), which implies that R0 will be decremented up to 256 times, in this case from 0 to 0, and this happens R1 − 1 times, in addition to the proper decrements in R1, which is expressed in (3). TR1 = 2(256)(Y − 1) + 2Y = 514Y − 512 = 514(3) − 512 = 1, 030 µs.
(3)
Table 3. Delay by multiplication: execution of Loop1 with R1 and R0, considering R1 = 3 and R0 = 0
Register | Execution of instruction | Instruction cycles (µs) | Instruction cycles accumulated (µs)
R1 = 3 − 1 = 2 | First decrement & R1 ≠ 0, go to Loop1 | 2 | 2
R1 = 2 − 1 = 1 | Second decrement & R1 ≠ 0, go to Loop1 | 2 | (512) + 4
R1 = 1 − 1 = 0 | Third decrement & R1 = 0, & decrement R2 | 2 | 2(512) + 6
Loop1: R0 = 0 − 1 = 255 | First decrement & R0 ≠ 0 | |
… | … | |
R0 = 1 − 1 = 0 | Final decrement & R0 = 0, end Loop1 & decrement R1 | 2(256) | (512)
For the analysis of the decrements in R2 (see Table 4), it is considered that Z = 3, so R2 = 3, and for this case R1 = 0 and R0 = 0. The first decrement of R2 is reached by decrementing the initial values of R0 and R1 as previously explained (see Table 2 and Table 3), leaving R2 = 3 − 1 = 2. For the next decrement of R2, R1 = 0 and R0 = 0; R0 is decremented from 0 to 0, and for each sequence of these 256 values in R0, R1 is decremented, whereby R1 = 256 − 1 = 255; again R0 is decremented from 0 to 0, from which it is inferred that for each decrement of R1, 256 decrements of R0 are performed. It is only when R1 = 0 that R2 is decremented; therefore, for each decrement of R2, 256 decrements of R1 multiplied by 256 decrements of R0 are performed, from which Eq. (4) is derived and with which the final equation will be defined almost in its entirety.
Table 4. Delay by multiplication: execution of Loop1 with R2, R1 and R0, considering that Z = 3, R1 = 0 and R0 = 0
Register | Execution of instruction | Instruction cycles (µs) | Instruction cycles accumulated (µs)
R2 = 3 − 1 = 2 | First decrement & R2 ≠ 0, decrement R0 | 2 | 2
R2 = 2 − 1 = 1 | Second decrement & R2 ≠ 0, decrement R0 | 2 | (131,072) + 4
R2 = 1 − 1 = 0 | Third decrement & R2 = 0, then exit routine | 2 | 2[131,584] + 6 = 263,174
R1 = 0 − 1 = 255 | First decrement & R1 ≠ 0, go to Loop1 | |
… | … | |
R1 = 1 − 1 = 0 | Final decrement & R1 = 0, & decrement R2 | (512) + (512)(256 − 1) + 2(256) | 131,584
Loop1: R0 = 0 − 1 = 255 | First decrement & R0 ≠ 0 | |
… | … | |
R0 = 1 − 1 = 0 | Final decrement & R0 = 0, end Loop1 & decrement R1 | 2(256) | 512
TR2 = [2(256) + 2(256)(256 − 1) + 2(256)](Z − 1) + 2Z = 131, 584(Z − 1) + 2Z = 131, 586Z − 131, 584.
(4)
Substituting Z = 3 in (4), we have: TR2 = 131, 586Z − 131, 584 = 131, 586(3) − 131, 584 = 263, 174 µs.
(5)
Finally, the return time (RET at the end of the code in Table 1) which consumes 2 µs of processor time (K3 = 2), will be added to the Total Time, so that by adding K 1 , K 2 , K 3 , (2), (3) and (4) the general equation can be integrated to obtain the total time delay (T T ) of the algorithm by multiplication expressed as: TT = K1 + K2 + TR0 + TR1 + TR2 + K3 .
(6)
The constants are substituted and added, as shown in (7) K1 + K2 + K3 = 2 + 3 + 2 = 7.
(7)
Substituting in (6) Eqs. (2), (3), (4) and (7) we have (8) TT = 2X + (514Y − 512) + (131, 586Z − 131, 584) + 7 = 2X + 514Y + 131, 586Z − 132, 089.
(8)
The accumulated times considering the variables where: X = 4, Y = 3, Z = 3, based on the values obtained in (2), (3), (5), (7) are: TT = (8 + 1, 030 + 263, 174 + 7) µs = 264, 219 µs.
(9)
The final equation obtained, (10), is characterized by having a high coefficient for Z and a negative constant quite similar in magnitude to the coefficient of Z, because of the multiplications caused by the variables X, Y, and Z in the registers R0, R1, R2, which suggests or originates the name "delay by multiplication." TT = 2X + 514Y + 131, 586Z − 132, 089.
(10)
In (10) the values of the variables are substituted to check the times obtained in (9) leaving: TT = 2(4) + 514(3) + 131, 586(3) − 132, 089 = 264, 219 µs.
(11)
2.4 Method for Solving the Mathematical Model In (12), a first-order equation with three unknowns can be seen; therefore, finding each of the variables by some known method becomes practically impossible, since such methods require a second equation or a reference point. A possible approach to the solution is through "Lagrange indeterminate coefficients" or "Lagrange multipliers," which is also impractical because there is no second function that can help to find the answer, nor a reference coordinate for this equation. TT = 2X + 514Y + 131, 586Z − 132, 089.
(12)
An option that can help is to find the minimum total time (TTMin) that the "delay by multiplication" can consume and the maximum total time (TTMax) that the function can consume and, in this way, find the domain interval of the function to help in the solution of the equation. This is achieved by considering X = 1, Y = 1, Z = 1, substituting in (12), and determining (13): TTMin = 2(1) + 514(1) + 131, 586(1) − 132, 089 = 13 µs.
(13)
To find TTMax, we place the variables X = 256, Y = 256, Z = 256, remembering that in the code it is equivalent to X = 0, Y = 0, Z = 0, and substituting in (12), we have that: TTMax = 2(256) + 514(256) + 131, 586(256) − 132, 089 = 33, 686, 023 µs.
(14)
In summary, the total time interval of the delay by multiplication function is expressed in (15), where it is possible to see a quite wide range, from 13 µs to a little more than 33 s. 13 µs ≤ TT ≤ 33, 686, 023 µs.
(15)
The above clearly marks the domain of the function, or the range of application, so it is not possible to carry out any operation below or above these values; it must also always be considered that the values of the variables in Eq. (12) will be 1 ≤ X ≤ 256,
1 ≤ Y ≤ 256, 1 ≤ Z ≤ 256, remembering that "256" in the equation is equivalent to "0" in the registers R0, R1, R2 at the code level. Once the mathematical model of the delay by multiplication algorithm and the domain range of the function have been identified, the results obtained may seem satisfactory; however, the issue is not completely solved, because it must now be possible, starting from a given total time (TT), to find the values of X, Y, Z, which is what makes the work done to find the mathematical model important. 2.5 Obtaining Equations for Variables X, Y, Z In the steps described below, it is assumed that the value of TT is known, since it is a value that will be provided within the range known from (15). 1st step: When observing the equation in (12), Z is the most significant variable or, in other words, the one that contributes the most to the final value of the equation; therefore, as a first intention, it is considered that X = 0, Y = 0, to observe the magnitude of Z, from which, using (12), it follows that TT = 2(0) + 514(0) + 131, 586Z − 132, 089 = 131, 586Z − 132, 089.
(16)
By solving (16) for Z, the following equation is obtained:
Z = (TT + 132, 089) / 131, 586
(17)
Assuming the known notation for real numbers R [4, 5] in (18): W = [W ] + {W }.
(18)
where W is a real number that can be integer or fractional, [W] is the largest integer (floor) function, and {W} is the fractional part of W. It is then defined from (17) that only the integer value of Z will be taken, that is, [Z], as illustrated below:
[Z] = [(TT + 132, 089) / 131, 586]
(19)
2nd step: Once the value of the most significant variable is obtained, Eq. (12) is taken again and X = 0 is considered, which turns out to be the least significant variable: TT = 2(0) + 514Y + 131, 586[Z] − 132, 089 = 514Y + 131, 586[Z] − 132, 089.
(20)
From (20), Y is solved for to find its integer value, that is, [Y]:
[Y] = [(TT + 132, 089 − 131, 586[Z]) / 514]
(21)
3rd step: Now that we have the values of [Y] and [Z], we can find the integer value of X, that is, [X], from (12), as follows:
[X] = (TT + 132, 089 − 131, 586[Z] − 514[Y]) / 2
(22)
4th step: Once the values of [X], [Y], and [Z] are obtained, an analysis of the fractional part of X must be performed, that is, {X}, according to (18), since in (12) the coefficient of X is two, and this is reflected in (22) as the divisor of the whole equation; therefore, two cases will be obtained for the fractional part {X}, summarized in the following points: If the fractional part of X is {X} = 0, the routine will be exact. If the fractional part of X is {X} = 0.5, the routine will not be exact, and an instruction cycle will be needed. This means that when {X} = 0 the code will not require any programming adjustment to meet the accuracy perspective (see Table 1). On the other hand, when {X} = 0.5 it will be necessary to adjust the original code (see Table 1); this adjustment corresponds to the use of the "NOP" instruction of the processor's assembly language [1]. In the great majority of processors, the "NOP" instruction fulfills the characteristic of consuming one instruction cycle without affecting the state of the processor or the behavior of the algorithm; this instruction must be added before the "RET" in the algorithm or program (see Table 1). It has been identified in the mathematical model, and with other processors, that multiplying the coefficient of X in the main formula ("2," as in the case of (12)) by the fractional part {X} gives the number of instruction cycles needed for compensation. So, to formalize it with an equation, the coefficient of X will be called KX, therefore KX = 2, and it follows that the compensation of the algorithm in instruction cycles (CIC) is: CIC = KX · {X}.
(23)
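As a complement to steps 1–4, the following is a minimal C sketch of the solution procedure for the MCS-51 model (Eqs. (19), (21), (22), and (23)). It is only an illustration: the function and variable names are not part of the original work, TT is assumed to be given in µs and to lie within the interval of (15), and the special cases discussed later (e.g., [Y] = 0) are not handled here.

    #include <stdio.h>

    /* Solve TT = 2X + 514Y + 131,586Z - 132,089 for the integer values
       [Z], [Y], [X] (Eqs. (19), (21), (22)) and compute the compensation
       in instruction cycles C_IC = K_X * {X} (Eq. (23)). */
    static void solve_delay_by_multiplication(long tt_us)
    {
        const long kx = 2;                       /* coefficient of X in Eq. (12)       */
        long t = tt_us + 132089L;                /* common numerator of the three steps */

        long z    = t / 131586L;                 /* [Z]: integer division acts as floor */
        long y    = (t - 131586L * z) / 514L;    /* [Y]                                 */
        long rest = t - 131586L * z - 514L * y;  /* equals 2X, possibly odd             */
        long x    = rest / kx;                   /* [X]                                 */
        long c_ic = rest % kx;                   /* K_X * {X}: 1 NOP when {X} = 0.5     */

        long t_adj = kx * x + 514L * y + 131586L * z - 132089L + c_ic;

        printf("TT=%ld us -> [X]=%ld [Y]=%ld [Z]=%ld C_IC=%ld (adjusted total %ld us)\n",
               tt_us, x, y, z, c_ic, t_adj);
    }

    int main(void)
    {
        solve_delay_by_multiplication(10000L);   /* reproduces Eqs. (24)-(27): 111, 20, 1, 1 NOP */
        return 0;
    }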
2.6 Demonstration To demonstrate all the above, we propose a TT = 10, 000 µs, which is within the range expressed in (15), and then apply Eqs. (19), (21), (22), and (23) to find [X], [Y], [Z] and, if applicable, {X} and CIC.
[Z] = [(10, 000 + 132, 089) / 131, 586] = 1.
(24)
[Y] = [(10, 000 + 132, 089 − 131, 586[1]) / 514] = 20.
(25)
[X] = (10, 000 + 132, 089 − 131, 586[1] − 514[20]) / 2 = 111.
(26)
In this case KX = 2 and X = 111.5 so substituting in (23): (CIC ) = 2 · 0.5 = 1.
(27)
This means that the compensation of the algorithm shown in (27) (see Table 5) is included in the original code (see Table 1). The algorithm is verified in the EdSim51 [6] and Proteus [7] simulators, taking the reading of the time consumed by the algorithm from the Flashing label to the CPL P1.0 instruction. In both cases, it is TT = 10,000 µs (see Fig. 2).
Table 5. Final coding of the delay by multiplication for the MCS-51 embedded system, with the values obtained for [X], [Y], and [Z] and including the compensation of the necessary instruction cycle
Name of the function or routine | Code | Instruction cycles (µs) | Notes
Flashing: | ACALL Delay_by_multiplication | 2 |
 | CPL P1.0 | 2 |
 | SJMP Flashing | 2 |
Delay_by_multiplication: | MOV R0, #111 | 1 | [X]
 | MOV R1, #20 | 1 | [Y]
 | MOV R2, #1 | 1 | [Z]
Loop1: | DJNZ R0, Loop1 | 2 |
 | DJNZ R1, Loop1 | 2 |
 | DJNZ R2, Loop1 | 2 |
 | NOP | 1 | CIC
 | RET | 2 |
Based on the mathematical model for the Intel MCS-51 (28), a series of simulation runs have been carried out to verify the robustness and effectiveness of the previously obtained model. TT = 2X + 514Y + 131, 586Z − 132, 089.
(28)
In the runs, we start from a total time (TT) determined within the range established in (15), from which we obtain [X], [Y], [Z]; the valid ranges obtained from (28) must be 1 ≤ X ≤ 256, 1 ≤ Y ≤ 256, 1 ≤ Z ≤ 256, where the value 256 becomes 0 when transferred to the processor registers (R0, R1, R2); any different value could be seen as an inconsistency or an error in the model, which it certainly is not. However, three special cases have been detected, which are mentioned below together with the procedure for making the relevant adjustments for subsequent implementation:
Fig. 2. Simulation of the delay by multiplication algorithm: a) Result of the implementation in EdSim51, b) Result of the implementation in Proteus.
1st case: [Y] = 0, which implies that [Z] = [Z] − 1, [Y] = 256, and [X] = [X] + 1, as seen in lines 2 to 5 and 10 of Table 6.
2nd case: [Y] = 1 and [X] = 0, which implies that [Z] = [Z] − 1, [Y] = 256, and [X] = 256, plus four "NOP" before exiting the function, as can be seen in lines 6 and 7 of Table 6.
3rd case: [Y] > 1 and [X] = 0, which implies that [Y] = [Y] − 1 and [X] = 256, plus four "NOP" before exiting the function, as can be seen in lines 8 and 9 of Table 6.
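For illustration, the three adjustments can also be expressed as a small C routine that post-processes the raw [X], [Y], [Z] values. This is only a sketch derived from the three cases above, not code from the original paper; extra_nops stands for the four compensating "NOP" instructions mentioned in cases 2 and 3.

    /* Adjust the raw [X], [Y], [Z] values according to the three special cases.
       x, y, z are the integer values obtained from Eqs. (19), (21), (22);
       *extra_nops receives the number of additional NOP instructions required. */
    static void adjust_special_cases(long *x, long *y, long *z, int *extra_nops)
    {
        *extra_nops = 0;

        if (*y == 0) {                      /* 1st case */
            *z -= 1;
            *y  = 256;
            *x += 1;
        } else if (*y == 1 && *x == 0) {    /* 2nd case */
            *z -= 1;
            *y  = 256;
            *x  = 256;
            *extra_nops = 4;
        } else if (*y > 1 && *x == 0) {     /* 3rd case */
            *y -= 1;
            *x  = 256;
            *extra_nops = 4;
        }
        /* 256 is written as 0 when loading the 8-bit registers R0, R1, R2. */
    }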
Table 6. Special cases obtained by applying the mathematical model of the "delay by multiplication" for MCS-51; lines 1 and 10 indicate the model's limits, the column TTADJ shows the value obtained after adjustment, and TT − TADJ the equivalent of the "NOP" instructions to be used before exiting the delay function. Columns 2–7 are the values obtained from (30); columns 8–12 are the adjusted values.
# | TT | [X] | [Y] | [Z] | {X} | CIC | [X] | [Y] | [Z] | TTADJ | TT − TADJ
1 | 13 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 13 | 0
2 | 13,026,511 | 0 | 0 | 100 | 0 | 0 | 1 | 256 | 99 | 13,026,511 | 0
3 | 26,185,111 | 0 | 0 | 200 | 0 | 0 | 1 | 256 | 199 | 26,185,111 | 0
4 | 13,026,711 | 100 | 0 | 100 | 0 | 0 | 101 | 256 | 99 | 13,026,711 | 0
5 | 26,185,511 | 200 | 0 | 200 | 0 | 0 | 201 | 256 | 199 | 26,185,511 | 0
6 | 9,079,445 | 0 | 1 | 70 | 0 | 0 | 256 | 256 | 69 | 9,079,441 | 4
7 | 19,606,325 | 0 | 1 | 150 | 0 | 0 | 256 | 256 | 149 | 19,606,321 | 4
8 | 15,719,911 | 0 | 120 | 120 | 0 | 0 | 256 | 119 | 120 | 15,719,909 | 2
9 | 32,496,611 | 0 | 247 | 247 | 0 | 0 | 256 | 246 | 247 | 32,496,609 | 2
10 | 33,686,023 | 255 | 0 | 257 | 0 | 0 | 256 | 256 | 256 | 33,686,023 | 0
The same procedure of obtaining the mathematical model and its solution has been carried out with two other devices such as the original Microchip PIC16F8XX [8] and with the original AVR architecture of Atmel [9], considering and respecting their individual specifications, finding a similar program and model in both cases, with which it was possible to establish multiple target times in each of them, managing to evaluate the different data both in simulation and in real execution, obtaining similar results satisfactorily.
3 Conclusions The results obtained have demonstrated the contribution of the "methodology to find the mathematical model of the delay by multiplication algorithm" as a first step to abstract, in a mathematical way, the behavior of delay by multiplication routines. The model obtained, at first sight, produces a bittersweet sensation, since it leads to a first-order linear algebraic equation with three unknowns (X, Y, Z), which by its nature becomes a dead end. The "method to solve the equation of the mathematical model" provides the solution to the model through a methodology consistent with the logical operation of the system and within the domain interval of the function. This method allows, in the future, increasing or reducing the number of variables according to the needs of each situation in order to increase or reduce the time. With the certainty, reliability, and accuracy that the method provides, it is possible to determine the necessary parameters and build this algorithm for different models or brands of embedded systems and/or processors of 8, 16, 32, or more bits. It is also possible to build, with the algorithm, functions where the time parameter becomes a variable in the applications that require it. Finally, the methodology presented here reduces simulation times and inaccuracies that may hinder the development of certain applications where software delays are implemented. Future work aimed at larger architectures will present other challenges due to the number of bits handled, and different, although very similar, models and methods may be determined for them.
References
1. Atmel: http://ww1.microchip.com/downloads/en/DeviceDoc/doc0509.pdf
2. Atmel: https://ww1.microchip.com/downloads/en/DeviceDoc/doc4316.pdf
3. Atmel: https://ww1.microchip.com/downloads/en/DeviceDoc/doc1919.pdf
4. Graham, R.L., Knuth, D.E., Patashnik, O.: Concrete Mathematics, pp. 67–70. Addison Wesley (1994)
5. Kumar, V.: Functions and Graphs for IIT JEE. Tata McGraw-Hill Education 1.125 (2013)
6. Rogers, J.: https://www.edsim51.com
7. Labcenter: https://www.labcenter.com
8. Microchip: https://ww1.microchip.com/downloads/aemDocuments/documents/OTH/ProductDocuments/DataSheets/40001291H.pdf
9. Microchip: https://ww1.microchip.com/downloads/en/DeviceDoc/Atmel-2486-8-bit-AVR-microcontroller-ATmega8_L_datasheet.pdf
Monitoring System for Dry Matter Intake in Ruminants Jesus Sierra Martinez1 , Juan Carlos Elizondo Leal1(B) , Daniel Lopez Aguirre1 , Yadira Quiñonez2 , Jose Hugo Barron Zambrano1 , Alan Diaz Manriquez1 , Vicente Paul Saldivar Alonso1 , and Jose Ramon Martinez Angulo1 1 Facultad de Ingeniería y Ciencias, Universidad Autónoma de Tamaulipas, 87120 Victoria,
Mexico {jcaelizondo,dlaguirre,hbarron,amanriquez,vpsaldiv, jrangulo}@docentes.uat.edu.mx 2 Facultad de Informática Mazatlán, Universidad Autónoma de Sinaloa, 82000 Mazatlán, Mexico [email protected]
Abstract. Among the Sustainable Development Goals approved by the UN is "2 zero hunger." One of its targets is to increase investments in agricultural research and extension services and technological development to improve production capacity. Dry Matter Intake (DMI) is a fundamental parameter in animal nutrition research because it estimates the overall supply of nutrients, especially those evaluated during research experiments. This project describes the development of a system embedded in sheep cages that experts in animal production use to estimate feed consumption and rejection. This system allows, in an automated way, monitoring and storing information about the weight of feed consumed by production animals and the temperature and humidity of the animal's environment. Furthermore, the system extracts this information through software that allows data collection while minimizing human intervention. Keywords: Automation · Microcontrollers · Animal production · Dry matter intake · 2 zero hunger
1 Introduction Animal products have been constituents of human food. The demand for these products (e.g., milk and beef) in tropical regions is expected to increase due to population growth and climate change. The need to increase animal production yields must be met through improved management and production technologies [1]. Dry Matter Intake (DMI) is a fundamental parameter in animal nutrition research because it estimates the overall supply of nutrients, especially those evaluated during research experiments. DMI in dairy cattle is essential for correlating health status during periods of stress, such as the transition period in dairy cows, and is critical for calculating efficiency in dairy cows (i.e., kg milk/kg DM consumed) [2]. The DMI of animals can be affected by the physical and chemical characteristics (e.g., dry matter and fiber content)
of dietary ingredients and their interactions [3]. These factors and their interactions make it challenging for any sensor system to estimate the actual DMI of animals fed a thoroughly mixed diet. In contrast, the above factors and their interactions are less problematic when DMI is measured by the disappearance of the thoroughly mixed diet (i.e., DM offered – DM rejected) in research experiments through a gating system or an automatic feeding system that can continuously record intake [2, 4]. Automated systems are innovative solutions that typically use software and hardware to provide high levels of reliability of information while minimizing human intervention. In this sense, using an automated system to track the weight of the feed of small ruminants and the temperature and humidity of the animal's environment gives the applicator freedom to perform other activities. Sensor system technology has been used to estimate behaviors such as resting behavior [5, 6], rumination [7–9], and feeding behavior [7, 10–12]. In [7], Zehner et al. presented the RumiWatch noseband sensor, which allows monitoring of rumination and feeding behavior for stable-fed cows. RumiWatch classifies chewing as rumination or feeding and stores the information in a microSD memory. Compared to direct observation, the system yields data with high reliability. However, it only obtains rumination time and feeding time, without providing data on the amount of feed the animal has consumed. Bikker et al. [10] present an evaluation of the Cow-Manager SensOor system, which is attached to the ear of a cow to classify 1 of 4 behavioral categories, namely "ruminating," "eating," "resting," and "active." Data are sent through a wireless connection, via routers and coordinators, to a computer. Büchel and Sundrum [11] present the DairyCheck system for measuring feeding time and rumination time in dairy cows; this system implements accelerometers and electromyography, and the collected data are sent wirelessly to a computer. The system demonstrates a good degree of reliability; however, it only measures feeding and rumination time, without measuring the amount of feed consumed. Mattachini [12] uses a HOBO Pendant G logger positioned on the neck of cows; the captured information is downloaded to a computer via a USB interface and processed in a data sheet to determine the feeding time. By using the HOBO Pendant G logger, the authors demonstrate the existence of a relationship between the information obtained by the sensor and the feeding time; however, the data collection process requires a wired connection between the device and the computer, and the captured information concerns the feeding time rather than the amount of feed consumed by the animal. The proposals mentioned above [7, 10–12] demonstrate that the use of sensors can provide reliable information on feeding behavior; nevertheless, they do not directly obtain the amount of feed consumed by the animals, which is fundamental data for animal nutrition experts in the elaboration of diets. Recording individual DMI dynamics collected in real time through sensor systems is a valuable tool for animal production research trials. This article describes the steps for developing a system embedded in sheep cages for data collection, such as temperature and humidity, as well as feed weight throughout the nutritional experiment. In addition,
the development of a script that wirelessly retrieves the collected data and saves it to a CSV file, from which graphs and information for future analyses can be obtained, is presented.
2 Problem Description and Proposed Approach

2.1 Analysis of the Problem

In livestock experimentation laboratories, cages are used in which experiments are conducted to analyze the animal's diet. Experiments are currently carried out by placing the animal in a cage for some time to collect data on its consumption; each cage has a feeder and a drinker. Currently, data collection consists of weighing the feed before dispensing it to the animal and, after some time, weighing the rejected material, with consumption being the difference between these two measurements. However, given the great demand for human capital in data collection, an automated system was developed that is capable of capturing data more frequently, determining the dynamics of consumption, and considering new variables such as temperature and humidity. In this way, collecting a large amount of data for subsequent analysis becomes possible.

2.2 Proposed Solution

As a result of the problem analysis, we present a solution that meets the needs described above. For the development of our system, we decided to use the model proposed by Liggesmeyer and Trapp [13]. This process allows partial improvements to be developed after each modification, implementing the most important aspects of the development first. The advantage of using this model is that it does not modify the flow of the life cycle and increases the probability of success: customers usually do not know exactly what they want, but by offering them a prototype on which they can test the system, they realize what they really do not want in it, thus giving us more specific and clearer requirements for its development. Figure 1 shows the stages of the model used in this development. Each step of the system development process is described below.

Requirements: To gather requirements, we interviewed experts in animal nutrition and obtained the main ideas about the operation of the system; we then developed an initial prototype, which served to perform performance tests and, from these, to work on improvements to the data collection system and the software responsible for processing the captured data.

Functional Design: Once the system requirements were defined, UML diagrams were created for subsequent documentation.

System architecture: We developed an architecture for the operation of the data capture system; an architecture diagram was used to define it, showing the components used for its development.

Software architecture: We used flow charts to define the algorithm to be implemented in the software, allowing us to document it and supporting the logic of the development.
Fig. 1. Software development model for embedded systems proposed by Liggesmeyer and Trapp (retrieved from [13]).
Software design: We designed a communication protocol that connects the system with the software and allows user requests to be served.

Implementation: We developed the system using the Arduino Nano microcontroller as a base; the components were integrated using the development environment provided by Arduino.

Unit tests: When coding each requirement, we performed separate tests of each component to check the proper functioning of the system; the testing technique used was the component test.

Software integration and testing: We integrated the components and performed integration tests to find problems in the operation of and communication between the software components; the technique used was end-to-end testing.

System integration and testing: We integrated the hardware components and performed integration tests to find faults; we also tested the communication between software and hardware, treating them as a single set; the technique used was example testing.

Functional tests: We performed the tests taking into account the requirements requested by the animal nutrition experts, in order to verify that the system complies with what was requested.

Validation tests: Validation tests were conducted by animal nutrition experts in order to validate the product.

In summary, the system consists of an information collection system and a desktop application for information management. The collection system combines software and hardware to measure the values present in each sample at times set by the person running the experiment; it has a 10 kg load cell together with an HX711 transmitter module. The load cell is responsible for capturing the weight of the animal's feed. The HX711 module is calibrated when it is first installed; calibration basically consists of finding the scale value to be used, i.e., the conversion factor that converts the raw reading into a value with weight units. The
scale is different for each load cell and changes depending on the installation method, the maximum weight, or the load cell model. Moreover, at the beginning of each experiment, the tare is performed in situ with the container empty, which gives a reliable capture of the weight of the food. The system also includes a DHT22 temperature and relative humidity sensor, a Bluetooth module, and an LCD screen to show information on-site; an Arduino Nano board controls these devices. On the other hand, the desktop application is in charge of downloading and sharing the information collected by the system and of indicating the experiment's start and end. The proposed system can obtain a large amount of data over time, saving this information in an external EEPROM memory, which allows recording at a sampling interval of 15 min for up to 33 consecutive days. The information captured by our system makes it easier for animal nutrition experts to obtain consumption patterns and their possible correlation with physical variables such as humidity and environmental temperature. It also provides an effective tool for digitizing data with little human intervention. Figure 2 shows the architecture of the monitoring system, the software in charge of data management, and the wireless communication through Bluetooth technology.
Fig. 2. Monitoring system architecture and data management software.
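As a rough illustration of the calibration just described, the sketch below shows the scale-factor and tare arithmetic in Python; the raw HX711 readings and the 1000 g reference mass are hypothetical values chosen for the example, not figures reported here.

```python
# Hypothetical raw HX711 readings; real values depend on the load cell,
# its mounting and the electronics.
raw_empty = 84200     # reading with the feed container empty (in-situ tare)
raw_known = 151700    # reading with a known 1000 g reference mass on the scale
known_mass_g = 1000.0

# Conversion factor: ADC counts per gram (the "scale value" found at installation).
scale = (raw_known - raw_empty) / known_mass_g

def to_grams(raw_reading, tare_offset=raw_empty, factor=scale):
    """Convert a raw HX711 reading into grams of feed on the scale."""
    return (raw_reading - tare_offset) / factor

print(round(to_grams(118000), 1))  # e.g. about 500 g of feed remaining
```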
As previously mentioned, the system has an LCD screen that shows the temperature and humidity, the weight of the food, and the date and time (day, month, year, hour, minute, and second). It also indicates whether the experiment has started or finished and whether the tare is being performed, and it displays confirmation messages during button manipulation.
The system contains three buttons for basic and quick configuration, avoiding the use of Python code (if the operator wants to perform another function, it is necessary to connect through the computer). In addition, a key switch was added to disable the buttons and prevent outsiders from manipulating them. Figure 3 shows the schematic diagram of the developed system: button one (green) starts the experiment at that moment, button two (red) ends the experiment at that moment, and button three (yellow) tares the scale.
Fig. 3. System schematic diagram.
Figure 4 shows the PCB design, in which spacing was minimized as much as possible to obtain a board of reduced size. The system was embedded in a prototyping box measuring 15 × 6 × 9.9 cm. Because the power supply may be intermittent, a backup battery is included, which only comes into operation during outages. With this backup battery the system guarantees 18 h of continuous operation, enough time for the site's power to be restored. For the software, a communication protocol was defined for data management to establish communication between the software and the system, so that requests can be answered in both directions.
Communication is done through Bluetooth. Before starting the experiment, the following must be defined: first, the start of the experiment (day, month, year, hour, and minute); second, the sampling interval (minutes); and third, the end of the experiment (day, month, year, hour, and minute). Once the experiment has been configured, it is initialized, and when it is finished, the data saved in the system can be downloaded. In addition, the application can query the start-of-run settings, view the end-of-run settings, read current system data, tare the system scale, modify the system clock, and graph the data.
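A minimal sketch of such a download script is shown below, assuming the Bluetooth module is exposed as a serial port and that the protocol sends one comma-separated record per line; the port name, baud rate, command string, and record layout are placeholders, not the actual protocol definition.

```python
import csv
import serial  # pyserial

PORT = "COM5"   # assumption: serial port assigned to the paired Bluetooth module
BAUD = 9600     # assumption: module's default baud rate

with serial.Serial(PORT, BAUD, timeout=5) as link, \
        open("experiment_data.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["date", "time", "weight_g", "temperature_c", "humidity_pct"])
    link.write(b"DOWNLOAD\n")  # placeholder for the protocol's download request
    while True:
        # At a 15-min interval the EEPROM holds on the order of 3000 records
        # (33 days x 96 samples/day), so this transfer loop is short.
        line = link.readline().decode("ascii", errors="ignore").strip()
        if not line or line == "END":  # placeholder end-of-transfer marker
            break
        writer.writerow(line.split(","))
```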
Fig. 4. PCB design
3 Experimental Results

This section presents the results obtained with the development of the embedded system, its operation, and the tests performed in an elevated metabolic cage in the Metabolic Unit of the Zootechnical Posta "Ingeniero Herminio García González" of the Faculty of Engineering and Science of the Autonomous University of Tamaulipas. Figure 5 shows the PCB board of the design, while Fig. 6 shows the inside of the monitoring system, where the different components that make it up can be seen, such as the Arduino Nano, Bluetooth module, EEPROM memory, battery, and Real Time Clock (RTC), among others. Figure 7a shows the confirmation message for the start of the experiment, which is displayed after pressing the green button; the operation is canceled by pressing the red button. Figure 7b shows the screen after accepting the start of the experiment. The system displays a message indicating the start of the experiment and shows the different parameters that will later be captured at the defined time interval. It is worth mentioning that each time a new experiment is started, the previously saved data are erased from the EEPROM memory.
Fig. 5. PCB board: a) front side, b) back side.
Fig. 6. Internal view of the monitoring system.
Figure 8a shows the confirmation message for the end of the experiment, which is confirmed by pressing the red button and canceled by pressing the green button. Figure 8b shows the screen after confirming the operation, where a message is displayed indicating the end of the experiment together with the different environmental and time parameters. Figure 9a shows the system's main screen, which displays the environmental parameters, the weight, and the current time. When the yellow button is pressed, Fig. 9b shows a message indicating that the tare is in progress; therefore, the user should not add weight to the scale at that moment. Figure 10a and b show the system implemented in an elevated metabolic cage located in the Metabolic Unit of the Zootechnical Posta "Ingeniero Herminio García González" of the Faculty of Engineering and Science of the Autonomous University of Tamaulipas, located in the municipality of Güémez, Tamaulipas (23° 56′ N, 99° 06′ W). A test was conducted for a week to check the operation of the system, both hardware and software, through an experiment conducted by experts in animal nutrition, with
Fig. 7. System screens when starting an experiment: a) confirmation message when pressing the green button (Start), b) experiment start screen.
Fig. 8. System screens when ending an experiment with the red button: a) confirmation message when pressing the red button (Finish), b) experiment finish screen.
Fig. 9. Extra system screens: a) home screen, b) display when the Tare button (yellow) is pressed.
the support of two sheep with an average live weight of 24.5 kg, which were fed a commercial feed for growing sheep (Uni-Ovino Engorda, Alimentos Union Tepexpan®; 15% protein). DM consumption data were collected with a sampling interval of 15 min for three days for animal no. 1 and four days for animal no. 2.
Fig. 10. Final prototype of the system installed in the metabolism cage designed for small ruminants: a) prototype installed in a metabolic cage, b) load cell installed in the feed container of the metabolic cage.
The results obtained for the DM consumption dynamics of animal no. 1 and the recording of environmental factors such as humidity and temperature are shown in Fig. 11. The green line indicates the recorded weight (g) of the feed, and the red line indicates the recorded temperature during the period evaluated. The graphical results show a peak when the daily feed is offered (approximately 1200 g/day). Every day of the experiment at 7:00 a.m. the rejected feed is removed and the new diet is manually added by an operator. A higher DM consumption was recorded during the first hours of the day, and dry matter consumption decreased with temperatures close to 34 °C. In addition, the graph also shows a lower amount of feed at the end of the day, which
Fig. 11. Graph of consumption dynamics with respect to temperature.
can be interpreted as the animal not reaching its voluntary DM consumption; therefore, more feed should be offered in the coming days. The moisture percentages recorded were in the range of 40–85% during the period evaluated. Figure 12 shows the results obtained for the DM consumption dynamics of animal no. 2 and the recording of environmental factors such as humidity and temperature. The graphical results show a peak when the daily feed is offered (approximately 1400 g/day). According to the results, higher DM consumption was recorded during the first hours of the day, and DM consumption decreased at the end of the day. In addition, the recorded humidity percentages ranged between 38% and 85% during the evaluated period, with a temperature peak above 36 °C.
Fig. 12. Graph of consumption dynamics versus temperature.
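As an illustration of how the logged weights can be turned into intake figures, the following pandas sketch computes a per-day consumption estimate from the CSV file produced by the download script; the column names and date format are assumptions, and the result is on an as-fed basis (multiplying by the diet's DM fraction would be needed to express it as DM).

```python
import pandas as pd

df = pd.read_csv("experiment_data.csv")
df["timestamp"] = pd.to_datetime(df["date"] + " " + df["time"], dayfirst=True)
df = df.sort_values("timestamp").set_index("timestamp")

# Feed that disappears between consecutive 15-min samples counts as intake;
# weight increases (the 7:00 a.m. refill, vibration noise) are discarded.
drop = -df["weight_g"].diff()
intake_per_interval = drop.clip(lower=0)
daily_intake_g = intake_per_interval.resample("D").sum()
print(daily_intake_g)
```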
3.1 Short Discussion of Both Graphs

The literature has reported that the feed intake of ruminants depends on their body weight. In all ruminant species, protein content positively influences DMI, whereas fiber fractions negatively influence DMI [14]. Another essential factor to take into account is the environmental temperature. Silanikove [15–17] has published detailed reviews on the effects of temperature on DMI and feed digestion in ruminants. Figures 11 and 12 graphically show the information collected by our system. These data demonstrate that the developed system is efficient in obtaining the dynamics of DMI, which is of significant help when evaluating the diets generated by animal nutrition experts.
4 Conclusions and Future Work

The use of sensors and the development of systems that automate a process greatly help the research area; by reducing human intervention, the error factor is also reduced, thus producing more accurate data without the need for as much human capital.
A dry matter intake monitoring system was developed, consisting of an electronic system on one side and, on the other, software that collects the data captured by the system. This system captures data such as the weight of feed offered to the animal, the temperature and relative humidity, and the date and time of capture. The results show that the developed system is of great help to animal nutrition experts, collecting, over time, essential information such as the weight of the feed and the humidity and temperature of the environment where the animal is being fed. In general, it allows obtaining the dynamics of dry matter intake, which is fundamental data when analyzing the diets provided by animal nutrition experts. In future work, it would be interesting to add a filter to smooth abrupt variations caused either by the weather or by the animal's movement, and to develop a mobile application that performs all the functions so that a laptop is not required. In addition, more rigorous tests are needed in which, for example, experiments are carried out in controlled environments to evaluate the effectiveness of the data obtained.
References 1. Ribeiro, D.M., et al.: The application of omics in ruminant production: a review in the tropical and sub-tropical animal production context. J. Proteomics 227, 103905 (2020). https://doi. org/10.1016/j.jprot.2020.103905 2. Carpinelli, N.A., Rosa, F., Grazziotin, R.C.B., Osorio, J.S.: Technical note: a novel approach to estimate dry matter intake of lactating dairy cows through multiple on-cow accelerometers. J. Dairy Sci. 102, 11483–11490 (2019). https://doi.org/10.3168/jds.2019-16537 3. Allen, M.S.: Effects of diet on short-term regulation of feed intake by lactating dairy cattle. J. Dairy Sci. 83, 1598–1624 (2000). https://doi.org/10.3168/jds.S0022-0302(00)75030-2 4. Schirmann, K., Chapinal, N., Weary, D.M., Heuwieser, W., von Keyserlingk, M.A.G.: Shortterm effects of regrouping on behavior of prepartum dairy cows. J. Dairy Sci. 94, 2312–2319 (2011). https://doi.org/10.3168/jds.2010-3639 5. Ledgerwood, D.N., Winckler, C., Tucker, C.B.: Evaluation of data loggers, sampling intervals, and editing techniques for measuring the lying behavior of dairy cattle. J. Dairy Sci. 93, 5129–5139 (2010). https://doi.org/10.3168/jds.2009-2945 6. Rodriguez-Jimenez, S., Haerr, K.J., Trevisi, E., Loor, J.J., Cardoso, F.C., Osorio, J.S.: Prepartal standing behavior as a parameter for early detection of postpartal subclinical ketosis associated with inflammation and liver function biomarkers in peripartal dairy cows. J. Dairy Sci. 101, 8224–8235 (2018). https://doi.org/10.3168/jds.2017-14254 7. Zehner, N., Umstätter, C., Niederhauser, J.J., Schick, M.: System specification and validation of a noseband pressure sensor for measurement of ruminating and eating behavior in stable-fed cows. Comput. Electron. Agric. 136, 31–41 (2017). https://doi.org/10.1016/j.compag.2017. 02.021 8. Grinter, L.N., Campler, M.R., Costa, J.H.C.: Technical note: Validation of a behaviormonitoring collar’s precision and accuracy to measure rumination, feeding, and resting time of lactating dairy cows. J. Dairy Sci. 102, 3487–3494 (2019). https://doi.org/10.3168/jds. 2018-15563 9. Zambelis, A., Wolfe, T., Vasseur, E.: Technical note: Validation of an ear-tag accelerometer to identify feeding and activity behaviors of tiestall-housed dairy cattle. J. Dairy Sci. 102, 4536–4540 (2019). https://doi.org/10.3168/jds.2018-15766 10. Bikker, J.P., et al.: Technical note: evaluation of an ear-attached movement sensor to record cow feeding behavior and activity. J. Dairy Sci. 97, 2974–2979 (2014). https://doi.org/10. 3168/jds.2013-7560
11. Büchel, S., Sundrum, A.: Technical note: evaluation of a new system for measuring feeding behavior of dairy cows. Comput. Electron. Agric. 108, 12–16 (2014). https://doi.org/10.1016/ j.compag.2014.06.010 12. Mattachini, G., Riva, E., Perazzolo, F., Naldi, E., Provolo, G.: Monitoring feeding behaviour of dairy cows using accelerometers. J. Agric. Eng. 47, 54 (2016). https://doi.org/10.4081/jae. 2016.498 13. Liggesmeyer, P., Trapp, M.: Trends in embedded software engineering. IEEE Softw. 26, 19–25 (2009). https://doi.org/10.1109/MS.2009.80 14. Riaz, M.Q., Südekum, K.-H., Clauss, M., Jayanegara, A.: Voluntary feed intake and digestibility of four domestic ruminant species as influenced by dietary constituents: a meta-analysis. Livest. Sci. 162, 76–85 (2014). https://doi.org/10.1016/j.livsci.2014.01.009 15. Silanikove, N.: Effects of water scarcity and hot environment on appetite and digestion in ruminants: a review. Livest. Prod. Sci. 30, 175–194 (1992). https://doi.org/10.1016/S03016226(06)80009-6 16. Silanikove, N.: The struggle to maintain hydration and osmoregulation in animals experiencing severe dehydration and rapid rehydration: the story of ruminants. Exp Physiol. 79, 281–300 (1994). https://doi.org/10.1113/expphysiol.1994.sp003764 17. Silanikove, N.: Effects of heat stress on the welfare of extensively managed domestic ruminants. Livest. Prod. Sci. 67, 1–18 (2000). https://doi.org/10.1016/S0301-6226(00)001 62-7
Speaker Identification in Noisy Environments for Forensic Purposes Armando Rodarte-Rodríguez1 , Aldonso Becerra-Sánchez1(B) , José I. De La Rosa-Vargas1 , Nivia I. Escalante-García2 , José E. Olvera-González2 , Emmanuel de J. Velásquez-Martínez1 , and Gustavo Zepeda-Valles1 1 Universidad Autónoma de Zacatecas, Campus Siglo XXI, Carr. Zacatecas-Guadalajara Km. 6,
Ejido “La Escondida”, 98160 Zacatecas, Mexico [email protected], {a7donso,gzepeda}@uaz.edu.mx, [email protected], [email protected] 2 Tecnológico Nacional de México Campus Pabellón de Arteaga, Carretera a la Estación de Rincón Km. 1, Pabellón de Arteaga, 20670 Aguascalientes, Mexico [email protected], [email protected]
Abstract. Speech is a biological and physical feature unique to each person, and it is widely used in speaker identification tasks such as access control, transaction authentication, and home automation applications, among others. The aim of this research is to propose a connected-words speaker recognition scheme based on a closed-set, speaker-independent voice corpus in noisy environments that can be applied in contexts such as forensics. Using a KDD analysis, MFCCs were used as the filtering technique to extract speech features from 158 speakers, in order to later carry out the speaker identification process. The paper presents a performance comparison of ANN, KNN and logistic regression models, which obtained F1 scores of 98%, 98.32% and 97.75%, respectively. The results show that schemes such as KNN and ANN can achieve similar performance on full voice files when applying the proposed KDD framework, generating robust models applicable in forensic environments. Keywords: Artificial intelligence · KDD · Prototyping · Speaker identification · Speech processing
1 Introduction

Speech is a unique biological feature of each person, caused by differences in the organs of phonation, articulation and breathing. Thus, the particular characteristics of the speech and the way of speaking of each person are their biometric signature [1, 2]. In this context, speaker recognition is the process of extracting information that describes the identity of people from their voice features [3, 4]. The speaker identification process is normally made up of three stages: pre-processing, speaker or speech feature extraction, and classification [3, 5, 6]. The pre-processing stage consists of modifying the speech signal to make it suitable for feature extraction analysis [6–8]. Meanwhile, feature extraction is a
procedure in charge of obtaining the salient characteristics of each frame (a short piece of voice) of the quasi-stationary signal, in order to obtain useful and relevant information and to delete redundant or irrelevant data. The final step is to learn to distinguish the features of the different speakers by applying different classification techniques [6, 8, 9]. The aim of this paper is to present and provide a model capable of identifying speakers in noisy environments for gender-independent and text-independent tasks on a closed-set, connected-word corpus. This approach can be taken to a higher level of abstraction, i.e., it can be applied in the creation of different software tools or systems to support criminal and forensic investigations, surveillance and voice-controlled security access, particularly in noisy scenarios. The motivation of this work is based on the idea of creating an auxiliary model to contribute to impartial judgments or decisions in forensic speaker identification tasks or in distinguishing the voice of a suspicious person in surveillance tasks. Thus, this work provides a performance comparison of KNN (k-nearest neighbors), logistic regression (LR) and ANN (artificial neural network) models. Following the guidelines of the Prototyping and KDD (knowledge discovery in databases) methodologies, k-fold cross-validation was used in a nested manner with the grid search technique; meanwhile, the data were normalized as an additional noise-reduction technique, concluding with the analysis and classification of patterns. For the development of the suggested models, a data set with 158 different people (classes) was used, of which 122 are men and 36 are women. The models' results showed a suitable ability to classify speaker voice samples and audio files, obtaining an F1 score of 85.71% for the ANN model and an F1 of 98.32% for KNN. For the ANN, KNN and logistic regression models, the accuracy obtained was 87.49%, 87.17% and 84.39% for voice samples, and 98.15%, 98.34% and 97.88% for full audio files. This shows that pre-processing and feature extraction are crucial stages for improving performance, coupled with the classification power of the approaches. This paper is organized as follows: Sect. 2 presents some related works; Sect. 3 depicts the proposal of this work, as well as the basic idea of its formulation in a comparison context; the models' results and their comparisons are shown in Sect. 4; finally, Sect. 5 concludes the paper with a discussion and future work.
2 Related Works

Speaker recognition has various applications such as personal assistants, banking transactions and forensic analysis [10]. In a general sense, this type of task can also be applied in other scopes, such as speaker emotion recognition; for instance, Miao et al. [11] performed speaker depression identification through speech using the DAIC-WOZ dataset, where the models used were SVM (support vector machines), KNN and CNN (convolutional neural networks), with CNN proving an efficient method for validating the proposed characteristics with an accuracy of 85%. In the same line, Simic et al. [12] used a constrained CNN for speaker recognition in different emotional states. A closed set of speakers (SEAC dataset) was used, which contains five different emotions recorded in the Serbian language, obtaining an average accuracy of 99.248% for neutral speech, 79.84% for sadness and about 85% for the remaining emotions. On the other hand, Shahin et al. [13] proposed a model to identify speakers with an Emirati accent using MFCC (Mel frequency cepstral coefficients) and hidden Markov
models, obtaining precisions between 58.6% and 65%. In the same sense, Al Hindawi et al. [14] proposed an SVM with an accuracy of 95% for disguised voices under an extremely high pitch condition in a neutral conversation environment, while Ge et al. [15] designed a neural network model for text-independent speaker classification and verification, obtaining less than a 6% error rate in testing. Besides, Ozcan et al. [16] examined the precision and speed of classifiers that can be used in speaker identification using MFCC variations; in these tests, they used SVM, linear discriminant analysis, KNN and naive Bayes, obtaining 98.3% as their best accuracy. Moreover, Aboelenein et al. [17] implemented a text-independent speaker identification model using MFCC with Vector Quantization (VQ) and Gaussian Mixture Models (GMM) as the classifier, obtaining a recognition rate of 91%. Other researchers [18] evaluated deep models for speaker identification taking speech spectrograms as input, obtaining accuracies of 98.96% by combining a 2-D CNN algorithm with a multi-layered Gated Recurrent Unit (GRU) architecture. These experiments were carried out using the Aishell-1 dataset, which contains speech files in Mandarin. Additional work [19] developed a hybrid method combining CNN and GMM for speaker recognition in Chinese using short utterances based on spectrograms. This model makes it possible to describe speakers better and distinguish them from each other more effectively, reducing the recognition error from 4.9% to 2.5%. Alsulaiman et al. [20] performed speaker recognition by segmenting Arabic phonemes using the LDC KSU voice database and features such as MFCC, MDLF, and MDLF-MA, with a Gaussian Mixture Model as the classifier. The recognition accuracy rates for Arabic vowels were greater than 80%, while the recognition rates for consonants were between 14% and 94%. However, current systems still have multiple complications [17, 21, 22], such as dealing with variations due to gender, speech speed, pronunciation and accents by region, noise, distortion and the mood of the speaker. These problems cause low accuracy, little generalization, and high time and computational cost.
3 Speaker Identification by Means of Speech Processing

3.1 Speech Processing

The characteristics of the voice depend on its rhythm or speed, pitch level, tone and accent. Speech processing is therefore the analysis, processing and study of voice signals for their understanding and recognition. In this sense, speaker identification (SI) is the process of identifying who is speaking, based on analysis, processing and feature extraction of the voice signal. The extraction of these features can be done by obtaining the MFCCs; Fig. 1 illustrates this process. The unknown speaker's voice samples are compared with a trained model or pattern (classification), in such a way that if a threshold is exceeded, the identity verification is declared. In this process, feature extraction is a crucial part of model performance, and MFCCs are commonly used as cepstral coefficients for the biologically inspired representation and description of speech. Thus, the goal is to extract features from a speech signal that describe the differences in speech with relevant information and to delete information that is not useful, such as background noise, emotions, volume, and tone, among others [23].
Fig. 1. Basic speaker recognition model.
ANN, logistic regression and KNN algorithms were employed to perform a performance comparison and choose the most suitable model for speaker identification tasks. These algorithms are first used to classify the audio samples, and then the majority vote technique is applied to these samples to classify audio files by speaker.

3.2 Development Methodology

The agile software development methodology called Prototyping was used in the presented task, whereby the system or tool is iteratively designed, modified and created [24]. A prototype is conceived as a preliminary, intentionally incomplete version of a reduced system [25]. This type of model allows for feedback from some of the interested parties in early stages; thus, system features are discarded or suppressed while new functionalities and necessities are added as they are needed. Under these premises, the use of prototypes is a useful tool that can be applied to almost all activities of software design and creation. Requirement gathering is based on information provided by the user; later, development and validation are required to show and prove the missing functionality defined in stage 1; this process is then repeated (see Fig. 2) [26].
Fig. 2. Prototyping methodology for software development.
During the whole process in Prototyping, KDD (knowledge discovery in databases) was used as a basis for the complete information analysis process, which is also responsible for preparing the data and interpreting the results obtained. One of the major premises of KDD is that knowledge is discovered using intelligent learning techniques that examine data through automated processes. KDD is an interactive process that involves numerous steps and includes many decisions that must be made by the user, and it is structured in stages such as i) data collection, ii) feature extraction, iii) use of classification techniques, iv) interpretation and knowledge acquisition (see Fig. 3) [27].
Fig. 3. Knowledge discovery databases data flow.
3.3 Proposed Architecture of the Speaker Identification Task

Figure 4 describes, in a sequential manner, the architecture of the proposed model, which includes the extraction procedure for its modeling, assessment, analysis and knowledge consolidation. The first step consists of the audio front-end procedure, which involves extracting the features, obtaining the MFCCs for each frame in each audio file; subsequently, the resulting data are normalized. In step 2, machine learning algorithms are applied to these data for modeling; the hyperparameters are optimized and the model is evaluated by applying cross-validation and grid search. Once all the models have been obtained, we proceed to step 3, where the performance of the models is compared and analyzed. Finally, based on the analysis of the results, the most suitable model is selected and proposed as a solution (step 4). In this case, two types of classification are performed: i) isolated samples and ii) samples belonging to the same speaker audio file; in both cases the goal is to identify the speaker who produced them.
Fig. 4. Architecture of the proposed SI task model based on KDD.
Data Overview. The dataset used consists of 158 classes (people), of which 122 are male and 36 female. There are 4911 audio files, of which 3395 are used for training (85%, of which 758 are used for validation) and 758 for testing (15%). All files are saved with PCM encoding, 16 bits per sample and a 16 kHz sample rate on a single channel. The SI task has been developed using a personalized, mid-vocabulary, speaker-independent voice corpus of connected words in Spanish from the northern central part of Mexico in a closed-set environment. With the purpose of strengthening the scope of the voice corpus, it was complemented with utterances from audio files generated through online text-to-speech applications with similar natural Mexican-sounding voices. The online applications used were ispeech, oddcast (SitePal) and vocalware. Most audios were recorded with several types of noise: microphone distortion, rain, cars, animals, high- and low-volume audio, and environmental noise in general. In addition, the age of the human participants ranges from 18 to 26.
Feature Extraction. The librosa package, a Python package for audio analysis, is used for feature extraction through the librosa.feature.mfcc() function. For the extraction of these MFCCs, windows (frames) of 88 ms overlapped by 35 ms are used. We justify their use since they help attenuate background noise, emotions, volume and tone, among others; in addition, this technique is inspired by biology, thus improving its performance. When extracting the features, 3 subsets of samples are obtained: training, validation and testing, each with 39 features. The subsets have different numbers of samples (frames): the training subset has 429,309 samples, the validation subset contains 96,138 samples, and the test subset has 97,286 samples. The parameters defined in the extraction of the MFCCs were the following: n_mfcc = 39, htk = True, n_fft = 0.088 * sampling rate, hop_length = 0.035 * sampling rate.

Modeling. ANN, logistic regression and KNN algorithms were suggested, employed and compared with the aim of finding the most suitable scheme (with the best balance of bias and variance) in terms of performance and runtime. Project dependencies include the following packages and libraries: Numpy, Pandas, SciPy, Scikit-Learn, librosa, Tensorflow and Matplotlib.

Data Normalization. Converting the different variables to the same scale is very important: it allows models to converge more quickly, preserves useful information in the presence of outliers, and makes the models less sensitive to them [28]. This can be performed using Eq. (1):
\( x_{\text{normalized}} = \frac{x_n - \mu}{\sigma_x} \)   (1)

where \(x_n\) is an element of the vector, \(\mu\) is the mean, and \(\sigma_x\) is the standard deviation of \(x\).
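A minimal sketch of this front end is shown below, using the librosa parameters stated above and z-score normalization with statistics estimated on the training frames; the list of training file paths (train_paths) is a placeholder.

```python
import librosa
import numpy as np

def extract_mfcc_frames(path, sr=16000, n_mfcc=39):
    """Return one row per 88 ms frame (35 ms hop), each with 39 MFCCs."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc, htk=True,
                                n_fft=int(0.088 * sr), hop_length=int(0.035 * sr))
    return mfcc.T

# train_paths: assumed list of training .wav files (speaker labels kept separately)
X_train = np.vstack([extract_mfcc_frames(p) for p in train_paths])

# Eq. (1): z-score normalization with training statistics, reused on validation/testing data.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
X_train_norm = (X_train - mu) / sigma
```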
Modeling with Artificial Neural Network. A neural network is a series of interconnected layers and neurons, where each layer can be composed of p neurons or a set of n inputs. Neural networks commonly allow patterns to be classified, predicted and recognized using the backpropagation algorithm. The aggregation function simulates the synaptic process (soma), which makes it possible to obtain the postsynaptic potential value of the neuron, while the activation function emulates the information output process (the axon of the neuron) [29–32]. In Fig. 5, the vector xn simulates electrical pulses or neurotransmitters arriving at the neuron and represents an input to the neuron, whilst Wn represents a weight of the neuron. This vector is modified during learning; it simulates the dendrites of the biological neuron and establishes the importance of one input with respect to the others. The weights of the neuron and its inputs simulate the synapse process between two neurons [29–31]. The proposed fully connected ANN is made up of a 39-dimensional input layer (MFCC size), 5 hidden layers and a softmax output layer, where the corresponding numbers of neurons are 1024, 513, 257, 129, 65 and 50 (see Fig. 5). The ANN also implements different types of activation functions: the first 4 hidden layers use the sigmoid function, while the fifth layer uses the ReLU activation function. This network has a softmax output layer to predict the probabilities of each class, and
the class with the highest probability is chosen. The Adam algorithm was used as the optimizer for network training with a learning rate of 0.00095 for 20 epochs and a binary cross-entropy loss function; the type of model chosen was Sequential() with Dense() layers, and accuracy was used as the performance metric. Moreover, the weights were initialized using kernel = "uniform", which yields initial network weights very close to 0.
Fig. 5. Proposed ANN model architecture.
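The sketch below is a hedged Keras rendering of the architecture just described (hidden layers of 1024/513/257/129/65 units, sigmoid then ReLU activations, softmax output, Adam with a 0.00095 learning rate). The number of output classes is left as a parameter, the "random_uniform" initializer stands in for the reported kernel = "uniform" setting, and a categorical cross-entropy loss is used here as the conventional companion of a softmax output; none of these substitutions is taken from the original text.

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

def build_ann(num_features=39, num_classes=158):
    # Four sigmoid hidden layers followed by one ReLU layer, as described in the text.
    model = Sequential([
        Dense(1024, activation="sigmoid", kernel_initializer="random_uniform",
              input_shape=(num_features,)),
        Dense(513, activation="sigmoid", kernel_initializer="random_uniform"),
        Dense(257, activation="sigmoid", kernel_initializer="random_uniform"),
        Dense(129, activation="sigmoid", kernel_initializer="random_uniform"),
        Dense(65, activation="relu", kernel_initializer="random_uniform"),
        Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.00095),
                  loss="sparse_categorical_crossentropy",  # assumption; see lead-in
                  metrics=["accuracy"])
    return model

model = build_ann()
# model.fit(X_train_norm, y_train, validation_data=(X_val_norm, y_val), epochs=20)
```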
According to various authors, the sigmoid activation function most accurately emulates the concept of a neuron in the brain. However, these functions can be problematic, causing the neural network to learn very slowly and fall into a local minimum during training. In this case their use is justified since the sigmoid activation function combined with the ReLU function allows the algorithm to converge faster and with less bias than when implementing the hyperbolic tangent activation function. Also, the ReLU activation function is used because it solves the vanishing-gradient problem of logistic and hyperbolic tangent functions. Adam is used because it is computationally efficient, has low memory requirements and is well suited to problems that are large in terms of data and parameters.

Modeling with Logistic Regression. This scheme models the probability of a discrete outcome (dependent variable) given input variables (independent variables), which can be employed for the classification of categorical variables and can be extended to multiclass purposes. Since the output yielded by the model is a probability prediction (that an input sample belongs to a certain class), the dependent variable is bounded between 0 and 1 according to Eq. (2) [33, 34]:
\( \phi(x) = \frac{1}{1 + e^{-z}} \)   (2)
The trade-off parameter (hyperparameter) of logistic regression that determines the strength of the regularization is called C [35]. In this sense, regularization penalizes extreme parameter values, which lead to overfitting of the training data. Accordingly, the range of values used in the grid search to find the most suitable value of C was 0.001, 0.003, 0.007, 0.01, 0.05, 0.1, 0.5, 1.0, 3.0, 6.0 and 10.0. The optimal hyperparameters returned by the grid search for LR were C = 6.0 and random_state = 42.
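A minimal scikit-learn sketch of this search is given below, using the C grid listed above and 10-fold cross-validation; the max_iter value is an assumption, and X_train_norm/y_train refer to the normalized frames and speaker labels introduced earlier.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

param_grid = {"C": [0.001, 0.003, 0.007, 0.01, 0.05, 0.1, 0.5, 1.0, 3.0, 6.0, 10.0]}

lr = LogisticRegression(random_state=42, max_iter=1000)  # max_iter raised as an assumption
search = GridSearchCV(lr, param_grid, cv=10, n_jobs=-1)   # 11 candidates x 10 folds = 110 fits
search.fit(X_train_norm, y_train)

print(search.best_params_)        # the paper reports C = 6.0 as the optimum
best_lr = search.best_estimator_
```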
Figure 6a presents the validation curve for the defined range of values of C. When the regularization strength is increased too much, the model tends to generalize better but its variance increases; that is, it abstracts the data well and reduces the bias committed by the model, whereas otherwise the model tends to memorize the data, causing a severe overfitting problem. Decreasing the regularization strength generates an increase in bias and a decrease in variance. The optimal value calculated for the regularization parameter C is 6.0; this value maintains the best balance between bias and variance. For this type of model, common ways to approach this problem are to increase the amount of data, build additional features, or reduce the degree of regularization; the latter was attempted without a positive impact. To improve performance, we chose variables with descriptive information about the phenomenon, selecting a moderate number of features that provide meaningful information about the audio files.
Fig. 6. Validation curves. (a) With Logistic regression. (b) With KNN.
Modeling with K-Nearest-Neighbors. The idea of KNN is to find the closest match to the test data in the feature space. The algorithm assigns each new sample to the corresponding group depending on whether its k nearest neighbors are closer to one group or another. In other words, it calculates the distance of the new sample to each of the existing ones, sorts these distances from smallest to largest, and selects the group to which the sample belongs; this group will therefore be the one with the highest frequency among the smallest distances [36]. The algorithm is based on the classical distance between two points, although different functions can be employed. The parameters of the KNN model were obtained through grid search and cross-validation in a nested way, using k = 10 folds. The range of values explored for the number of neighbors was between 1 and 25, considering only odd values. In this case, the Euclidean distance was used to build the KNN algorithm (p = 2), and the best parameter returned by GridSearchCV() for n_neighbors was 3. Figure 6b presents the validation curve for the range of values defined for the proposed model. As a result, and after various tests with k ranging from 1 to 25, we chose k = 3, a value large enough to reduce noise without excessive smoothing. To solve the problem of high variance or overfitting, more samples could be collected, or the number of extracted features could be increased.
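The corresponding scikit-learn sketch is shown below, mirroring the LR search above; the odd-k grid and Euclidean distance follow the text, while the variable names are the same assumed placeholders.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {"n_neighbors": list(range(1, 26, 2))}  # odd values between 1 and 25

knn = KNeighborsClassifier(p=2)                      # p=2 -> Euclidean distance
search = GridSearchCV(knn, param_grid, cv=10, n_jobs=-1)
search.fit(X_train_norm, y_train)

print(search.best_params_)        # the paper reports n_neighbors = 3
best_knn = search.best_estimator_
```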
Modeling with Majority Vote. This scheme produces a final decision by taking a majority vote over the outputs of a previous process [37]. In this case, the samples of each audio file are classified independently, and then a majority vote is applied to classify the audio file according to the class that appears most often among those samples (sketched below).

Hyperparameter Optimization. For optimal performance of the models, a grid search was performed over a grid of chosen parameters to obtain the best-performing set of parameters. Grid search was implemented using the GridSearchCV() function from the scikit-learn library. Subsequently, validation and learning curves (see Fig. 6) were used to diagnose and solve underfitting or overfitting problems [38, 39].

Cross Validation and Evaluation. The training subset is randomly divided into k = 10 folds without replacement, where k–1 folds are used for model training and one fold for performance evaluation [40, 41].

Results Analysis. The results indicate that the audio front-end procedures are the key phases for good performance in speaker identification tasks, independently of the model used to classify; incorrect pre-processing or incorrect feature extraction will decrease classification performance. Proper pre-processing allows simple algorithms such as KNN and logistic regression to achieve performance similar to that of more robust models.

Model Selection. The ANN and KNN models are proposed as the most suitable estimators, given that they present a better bias-variance balance; they also require tolerable time and computational cost. For the selection of the best model, cross-validation, learning curves and different metrics were used to select the model with the best performance and the best bias-variance balance.
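A minimal sketch of the file-level majority vote, assuming a scikit-learn-style estimator whose predict() returns one speaker label per MFCC frame (frames are expected to be normalized as in Eq. (1)):

```python
import numpy as np

def classify_audio_file(model, frames):
    """Predict a speaker for every frame of one audio file, then take a majority vote."""
    frame_predictions = model.predict(frames)                  # one label per frame
    labels, counts = np.unique(frame_predictions, return_counts=True)
    return labels[np.argmax(counts)]                           # most frequent label wins

# Example: speaker = classify_audio_file(best_knn, (extract_mfcc_frames("test.wav") - mu) / sigma)
```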
3.4 Evaluation

To evaluate the models, the accuracy, precision, specificity and recall performance metrics are considered [42]; however, since the dataset used may be moderately unbalanced, the F1 score was included as an additional metric for comparison.
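These macro-averaged metrics can be obtained with scikit-learn as sketched below; the specificity computation from the per-class confusion matrices is an assumption about how that score was derived, since scikit-learn has no direct specificity function.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, multilabel_confusion_matrix)

def macro_metrics(y_true, y_pred):
    mcm = multilabel_confusion_matrix(y_true, y_pred)      # per-class [[tn, fp], [fn, tp]]
    tn, fp = mcm[:, 0, 0], mcm[:, 0, 1]
    return {
        "accuracy":    accuracy_score(y_true, y_pred),
        "precision":   precision_score(y_true, y_pred, average="macro", zero_division=0),
        "recall":      recall_score(y_true, y_pred, average="macro", zero_division=0),
        "f1":          f1_score(y_true, y_pred, average="macro", zero_division=0),
        "specificity": float((tn / (tn + fp)).mean()),      # macro-averaged specificity
    }
```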
4 Experiments and Discussion

4.1 Implementation and Runtime Details

The feature extraction, training and testing phases for the different models were executed on a Dell XPS 9550 (Intel(R) Core(TM) i7-6700HQ CPU @ 2.60 GHz with 16 GB 2666 MHz DDR4 RAM), while the training of the ANN model was accelerated using an NVIDIA GeForce GTX 960M card, which contains 2 GB of GDDR5 RAM, a 2500 MHz memory clock and 640 processing cores. In addition, the CUDA 11.0 library was used on Windows 10 to speed up the matrix operations.
Table 1. Runtime in minutes.

Phase      ANN      Logistic regression   KNN
Training   32.89    1095.56               752.22
Testing    0.0005   0.21                  52.85
Table 1 shows the corresponding data processing time for the training and testing phases of each implemented model. The runtime for the feature extraction phase was 17.57 min. The table reflects the time to train and test 120 different KNN models and 110 different logistic regression models, which is why the computing time is high.

4.2 Metric Analysis

It is important to mention that the macro-average of each metric is computed from the respective score of each class with the aim of summarizing the performance of the model. Table 2 shows the average training and testing performance (macro-average) for speech-sample identification per speaker of the 3 models used. It can be seen that all the models achieve a high level of abstraction, since they are able to correctly model all the speakers. The best model during training was the KNN model, which obtained an F1 score of 0.9989, i.e., the algorithm manages to differentiate and learn to detect the classes with a high confidence of 99.89%, while the precision obtained was 0.9990, that is, when it predicts a speaker, the algorithm is correct with 99.90% confidence. The algorithm had a recall of 0.9989; in other words, it correctly identifies 99.89% of the speakers. Finally, the model was correct on 99.87% of all samples of each class. On the other hand, when comparing the testing results with the training values, all the estimators present a moderate problem of overfitting. Thus, the model that presents better generalization and less memorization of the data is the ANN, with an F1 score of 85.71%, an accuracy of 87.49% and a precision of 85.91% in testing.

Table 2. Speech samples classification results.
Metrics       ANN                  KNN                  Logistic regression
              Training   Testing   Training   Testing   Training   Testing
Accuracy      0.9715     0.8749    0.9987     0.8717    0.9160     0.8439
F1 score      0.9710     0.8571    0.9989     0.8560    0.9187     0.8315
Recall        0.9705     0.8681    0.9989     0.8615    0.9155     0.8363
Precision     0.9728     0.8591    0.9990     0.8586    0.9225     0.8356
Specificity   0.9998     0.9992    0.9999     0.9991    0.9994     0.9990
The results obtained for the different estimators in the classification of audio files in the testing phase are presented in Table 3. When applying the majority vote technique
on the classified samples, an almost null bias and variance are obtained. The model with the highest performance is KNN, with an F1 score of 0.9832, an accuracy of 98.34% and a precision of 98.55% in the testing phase. To reduce overfitting, it would be necessary to consider more voice samples per class, improve the filtering techniques and build a more robust dataset. What affects the performance of these models is the variability of the speech and/or the noise. Variability can be caused by factors such as noise introduced by the environment or by microphones of different qualities, other voices or secondary sounds, the mood of the speaker and the tone of speech; for example, given the random partitioning of our dataset (testing and training), most noisy speech signals may end up in the testing set. This can cause the model to underfit, since it was not trained with sufficient noisy data.

Table 3. Audio files classification results during the testing phase.
Metrics       ANN      KNN      Logistic regression
Accuracy      0.9815   0.9834   0.9788
F1 score      0.9800   0.9832   0.9775
Recall        0.9823   0.9831   0.9780
Precision     0.9807   0.9855   0.9878
Specificity   0.9833   0.9999   0.9998
4.3 Models Comparison

Comparing the models and their implications, they present several advantages, similarities and drawbacks, which are shown in Table 4. The schemes present greater precision, identifying speakers in possibly noisy scenarios and independently of gender (they generalize better). Otherwise, these models require almost the same computational cost and are also very similar in terms of the time required to predict and train. One of the disadvantages is that these models present greater overfitting. Our proposed predictive models differ in several ways from other models. First, a set of models taken to a higher level of abstraction, with performance slightly superior or similar to other existing models, is proposed. Second, this analysis uses many samples in Spanish with speakers of different genders. Another advantage is the use and validation of widely used prediction models. On the other hand, this study has several limitations in addition to those mentioned above. It was conducted with people from a single region; it needs to be replicated in large, multi-region settings in other countries or states to generalize to an international scale and thus improve the identification of accents by region. Our estimator needs to be externally validated in prospective studies using data from the aforementioned situations before application in real scenarios. Nevertheless, the scope achieved so far is a solid starting point.
Table 4. Advantages and disadvantages comparison of the proposed models.

ANN
Advantages: 1. Fast and very accurate model. 2. Insensitive to noise. 3. Excellent data generalization. 4. Good rate of over-learning. 5. Good bias index. 6. Reduced time in training and testing. 7. Portable model.
Disadvantages: 1. High computational cost. 2. High variance. 3. Complex model. 4. Requires a GPU to speed up training and prediction time. 5. Sensitive to overfitting.

KNN
Advantages: 1. Simple and low-complexity model. 2. Lower variance and bias. 3. Portable model. 4. Performance near the best.
Disadvantages: 1. High computational cost. 2. Significantly slower as the number of samples and features increases. 3. Sensitive to noise.

LR
Advantages: 1. Simple and low-complexity model. 2. Fast with large amounts of data. 3. Low bias. 4. Portable model. 5. Low computational cost. 6. Performance near the best.
Disadvantages: 1. Sensitive to data normalization. 2. High variance. 3. Low robustness. 4. Adequate pre-processing required to significantly improve performance.
5 Conclusion and Future Work

In this work, logistic regression, KNN and ANN were implemented and compared in speaker identification tasks. Nested cross-validation was implemented with the grid search technique to optimize the models, and validation and learning curves were used to diagnose and solve underfitting or overfitting problems. Also, to develop this project, the agile Prototyping development methodology was used in conjunction with KDD, which are useful tools that can be applied in several stages of the design, modification and creation of multipurpose software. The proposed models for speech sample classification (MFCC vectors) and speaker identification (full speech audio) were taken to a higher level of abstraction, and they can be applied in multiple contexts such as criminal and forensic investigations. These models achieved an F1 score of 85.71% (ANN) in voice sample classification and an F1 score of 98.32% (KNN) in audio file classification. This study showed that simple algorithms such as KNN and LR can achieve performance similar to that of more robust models, as long as an efficient filtering (denoising) process and a suitable phonetic content extraction process are applied. In addition, this research will allow the creation of different software tools to support criminal and forensic investigations involving recorded voice samples, as well as surveillance and voice-controlled security access, particularly in noisy scenarios. For future work, it would be interesting to build a new and more robust dataset for training these models, measuring the performance of the implemented algorithms with a greater number of speakers from different regions (accents). In addition, the improvement and implementation of new methods to reduce overfitting is suggested, for instance, applying CNN as a filtering and classification technique. Furthermore, conversational audio files
can be used to perform more extensive testing in real situations. In this sense, it would be pertinent to analyze scenarios with extreme noise, since these are part of the use in forensic environments. The results obtained generate guidelines for applying this experimentation in other areas, e.g., classifying people's age, mood classification, and applications in the health sector, such as identifying speech-related pathologies.
Author Index
A
Abud-Figueroa, Mara Antonieta, 50
Aguayo, Raquel, 233
Alvarez-Rodriguez, Francisco Javier, 141

B
Barba-Gonzalez, María Lorena, 141
Barron-Zambrano, Jose Hugo, 258, 286
Becerra-Sánchez, Aldonso, 299

C
Cabral, Silvia Ramos, 63
Calderón-Reyes, Julia Elizabeth, 141
Cardona-Reyes, Héctor, 141
Carranza, David Bonilla, 99
Castillo-Zuñiga, Ivan, 180
Castolo, Juan Carlos Gonzalez, 63
Cedano, Alfredo, 272
Cruz, José Quintana, 165

D
de J. Velásquez-Martínez, Emmanuel, 299
de Jesús González-Palafox, Pedro, 50
De la Calleja, Jorge, 243
De La Rosa-Vargas, José I., 299
Diaz Manriquez, Alan, 286
Diaz, Walter Abraham Bernal, 63
Diaz-Manriquez, Alan, 258
Domínguez, Eduardo López, 243

E
Elizondo Leal, Juan Carlos, 286
Elizondo, Perla Velasco, 196
Elizondo-Leal, Juan Carlos, 258
Escalante-García, Nivia I., 299
Estrada-Esquivel, Hugo, 215

G
García, Alicia, 272
Garcia, Rodolfo Omar Dominguez, 63
García-Mireles, Gabriel Alberto, 34
Gómez, Juan F. Rivera, 196
Gonzalez, Alejandro Mauricio, 196

H
Hernández, Alejandra García, 196
Hernández, Yasmín, 215
Hossain Faruk, Md Jobair, 3

I
Ionica, Andreea, 20
Isidro, Saúl Domínguez, 243

J
JR, Efren Plascencia, 63
Juárez, Juan Manuel Sánchez, 243
Juárez-Martinéz, Ulises, 50

L
Leba, Monica, 20
Lizarraga, Carmen, 233
Lopez-Aguirre, Daniel, 258, 286
Lopez-Veyna, Jaime I., 180
Luna-Herrera, Yazmin Alejandra, 126
M
Maciel-Gallegos, Perla, 113
Mancilla, Miriam A. Carlos, 63
Martinez-Angulo, Jose Ramon, 258, 286
Martinez-Rebollar, Alicia, 215
Mejía, Jezreel, 113, 233
Miguel-Ruiz, Juan Antonio, 215
Morán, Miguel, 272
Muñoz, Mirna, 99, 152
Muñoz-Bautista, Humberto, 141

N
Negrón, Adriana Peña Pérez, 99
Nieto, María Auxilio Medina, 243

O
Ocharán-Hernández, Jorge Octavio, 126
Olvera-González, José E., 299
Ortiz-Garcia, Mariana, 180
Ortiz-Hernandez, Javier, 215

P
Pérez-Arriaga, Juan Carlos, 126
Pournaghshband, Hasan, 3
Pulido-Prieto, Oscar, 50

Q
Quiñonez, Yadira, 113, 233, 258, 286

R
Reyes, Sodel Vazquez, 196
Reyes, Víctor, 233
Rodarte-Rodríguez, Armando, 299
Rodríguez-Mazahua, Lisbeth, 50

S
Saldivar-Alonso, Vicente Paul, 258, 286
Sanchéz-García, Ángel J., 126
Shahriar, Hossain, 3
Sierra Martinez, Jesus, 286

T
Tapia, Freddy, 165
Torre, Miguel De la, 63

V
Valový, Marcel, 84
Velázquez, Yesenia Hernández, 243
Ventura, Patricia, 272
Villasana-Reyna, Luis Manuel, 258

Y
Yáñez, Eduardo Aguilar, 196

Z
Zatarain, Omar, 63
Zepeda-Valles, Gustavo, 299