Sustainable Smart Cities. Theoretical Foundations and Practical Considerations 9783031080203, 9783031088148, 9783031088155


English Pages [342] Year 2023

Table of contents :
Preface
Contents
Social Aspects of Making City Smart
A Smart City Analytical Framework in Economics
1 Introduction
2 Literature Review
3 Smart City Analytical Framework
4 Data in Vietnam
5 Conclusion
6 Limitations of the Research and the Next Researches
References
Smart City—Development Trend in the World and Vietnam
1 Introduction
2 Concepts About the Smart City
3 Smart City Model in the World
4 Smart City Development in Vietnam
5 Conclusion and Implications
References
Smart City from a Standards Perspective
References
Point-of-Interests Recommendation Service in Location-Based Social Networks: A Survey, Research Challenges, and Future Perspectives
1 Introduction
2 Literature Review
2.1 Content-Based Recommendation Systems
2.2 Collaborative Recommendation Systems
2.3 Hybrid Recommendation Systems
3 Pros and Cons of Recommendation Algorithms
4 Research Challenges
5 Conclusion
References
San Marcos Smart City: A Proposal of Framework for Developing ISO 37120:2018-Based Smart City’s Services for Lima
1 Introduction
1.1 Why Smart City for Lima
1.2 Chapter Organization
2 Architecture Proposal
2.1 San Marcos Smart City Architecture
2.2 Physical Layer
2.3 Communication Layer
3 Proposal Implementation
3.1 WIFI Component
3.2 LoRaWAN Component
3.3 Facial Recognition Component
4 Results
4.1 WiFi Component
4.2 LoRaWAN Component
4.3 Facial Recognition Component
5 Summary and Future Directions
6 Conclusions
References
Social and Technical Challenges in Eco-Sustainable Smart City in India—An Analysis
1 Introduction
2 A Smart City Model and Its Components
3 Role of ICT in the Development of Smart City
3.1 Smart Neighbourhood
3.2 Smartivists
3.3 Smart Community
4 Infrastructure and Transport
5 Smart Agriculture
6 Smart Healthcare
7 Smart Energy Management
8 Smart Resource Management, Governance, Waste Management
9 Education, Training and Security
10 Smart City Challenges in India
10.1 Mission Smart Cities
10.2 Implementation
10.3 Challenges and Issues
11 Conclusion
References
5G and Other Networking Technologies for Smart Cities
A Framework for Designing Long Term Digital Preservation System
1 Introduction
2 General Assumption About Document Retention and Preservation by Organization
3 Literature Survey
4 Methodology for Digital Data Preservation System (DPS)
4.1 Existing System Study Analysis and Design
4.2 Examples of Various Categories of Operating Organizations in India
4.3 Questionnaire Prepared for Survey in Assistance of Carrying out Face to Face Interview Sessions with Various Organizational Stakeholders
4.4 Select the Organization Needs
5 The Open Archival Information System Reference Model (OAIS)
6 Digital Repository Development
7 Conclusion and Future Scope
References
Towards Sustainable Smart Cities: The Use of the ViaPPS as Road Monitoring System
1 Introduction
2 Sensing Technologies in On-Road Object Inventory
3 Case Study: The ViaPPS
3.1 System Design
3.2 Road Inspection—Data Acquisition
3.3 Data Handling
3.4 Features Extraction and Image Anonymization
3.5 Reports and Data Management Systems
4 Discussion
5 Conclusion
References
Optimal Resource Allocation for Public Safety Device to Device Communication Using PSO
1 Introduction
2 Literature Survey
3 Methodology
3.1 Particle Swarm Optimization (PSO)
3.2 Resource Allocation Using PSO
3.3 Weighted Average Throughput and Penalty for Resource Constrained Settings
3.4 Implementation Details
4 Results and Analysis
5 Conclusion and Future Work
References
Research Progress in Internet of Things (IoT) Application in Smart Cities Development: A Bibliometric Analysis
1 Introduction
2 Methodology
3 Results and Discussion
3.1 Document Types and Language of Publication
3.2 Annual Growth of Publications on IoT Application in Smart Cities
3.3 Most Productive SCOPUS Subject Categories and Journals
3.4 Most Productive Countries
3.5 Highly Productive Institutions in IoT Application in Smart Cities
3.6 Highly Cited and Impactful Articles Related to the IoT Application in Smart Cities Development
3.7 Author Keyword Analysis and Research Hotspot
3.8 Application of Bibliometric Studies in Assessing the Future Research in IoT
4 Conclusion
References
Neural Network Based Task Scheduling in Cloud Using Harmony Search Algorithm
1 Introduction
2 Related Work
2.1 A. Framework that is Used to Optimize Task Scheduling in the Cloud Environment [1]
2.2 B. Harmony Search Algorithm: Strengths and Weaknesses [2]
2.3 C. Particle Swarm Optimization: Development, Applications, and Resources [3]
2.4 D. NN-Based Secure Task Scheduling in Computational Clouds [4]
2.5 E. Ant Colony Optimization [5]
2.6 An Idea Based on Honey Bee Swarm for Numerical Optimization [6]
2.7 Job Scheduling for Cloud Computing Using Neural Networks [7]
3 Proposed Model
3.1 Phases of Harmony Search Algorithm
3.2 Flow Diagram of the Harmony Search Algorithm
3.3 Input Definitions
4 Simulation Result
5 Conclusion
References
Neural Inspired Ant Lion Algorithm for Resource Optimization in Cloud
1 Introduction
2 Related Work
3 Proposed Model
4 Results
5 Conclusion
References
Data Science and Business Analytics, IoT, AI and ML for Smart Cities
Smart School Selection with Supervised Machine Learning
1 Introduction and Related Work
1.1 Dataset Description
2 Preprocessing
3 Experiments and Results
3.1 Experiment-I
3.2 Experiment-II
4 Model Evaluation
5 Conclusion
References
Artificially Intelligent and Sustainable Smart Cities
1 Introduction
2 Literature Review
3 Background Study
3.1 Internet of Things (IoT)
3.2 Artificial Intelligence (AI)
3.3 Augmented Reality (AR)
3.4 Drones
3.5 Cloud Computing
3.6 Big Data
4 Different Portfolios of Smart Cities
4.1 Smart Traffic Solution
4.2 Smart Fire Brigades
4.3 Smart Policy Making and Planning
4.4 Smart Farming
4.5 Smart Electricity Grids
4.6 Smart Parking Solutions
4.7 Smart Security Management (Law Enforcement)
4.8 Smart Waste Management
4.9 Smart Pollution Control
4.10 Smart Self-Sustainable Public Toilets
4.11 Smart Healthcare Facilities
5 Self-Building AI Model
6 An Ideal Smart City
6.1 α-Command
6.2 β-Command
6.3 γ-Command
7 Major Drawbacks
8 Conclusion and Future Scope
References
Machine Learning Self-Tuning Motivation Engine for Telemarketers
1 Introduction
2 State of the Art
3 Motivation and Serious Games
4 A New Perspective: MOTIVARNOS
5 Architecture and Basic Concepts
5.1 Loading of the Raw Data
5.2 Structure of the Data
5.3 Data Pre-processing
5.4 Exploratory Data Analysis
5.5 Telemarketer Profile
6 Conclusions and Future Work
References
QROWD—A Platform for Integrating Citizens in Smart City Data Analytics
1 Introduction
2 Related Work
2.1 IOT for Smart Cities
2.2 Mobile Crowdsensing and Crowdsourcing
3 The QROWD Platform and Architecture
4 Crowdsourcing Services
4.1 Design Guidelines for Human Tasks
4.2 Crowdsourcing Service Implementation Framework
5 Data Acquisition and Generation
5.1 Pre-existing Data Sources
5.2 Data from Citizens Devices
5.3 Citizen Challenges
5.4 Annotations from Street-Level Imagery
6 Data Models and Storage
6.1 Data Models
6.2 Big Data Storage
7 Data Integration
8 Use Cases
8.1 Generating and Managing Mobility Infrastructure Data
8.2 Modal Split Surveys
9 Summary and Conclusion
References
Estimation of Short-Time Forecast for Covid-19 Outbreak in India: State-Wise Prediction and Analysis
1 Introduction
2 Related Work
3 Theoretical Background
4 Data Pre-processing
4.1 Population
4.2 Weather
5 Result Analysis
5.1 Time-Series Plots
6 Results
6.1 Short Term Predictions and Their Analysis
6.2 Kalman X-days Prediction Discussion for Indian Subcontinents
6.3 Comparative Analysis of Different Nations
7 Conclusion
References

Studies in Computational Intelligence 942

Pradeep Kumar Singh Marcin Paprzycki Mohamad Essaaidi Shahram Rahimi   Editors

Sustainable Smart Cities Theoretical Foundations and Practical Considerations

Studies in Computational Intelligence Volume 942

Series Editor Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence, quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output. This series also publishes Open Access books. A recent example is the book Swan, Nivel, Kant, Hedges, Atkinson, Steunebrink: The Road to General Intelligence, https://link.springer.com/book/10.1007/978-3-031-08020-3

Indexed by SCOPUS, DBLP, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.

Editors
Pradeep Kumar Singh, Narsee Monjee Institute of Management Studies (NMIMS), School of Technology Management and Engineering, Chandigarh Campus, India
Marcin Paprzycki, Polish Academy of Sciences, Warsaw, Poland
Mohamad Essaaidi, ENSIAS College of Engineering, Mohammed V University, Rabat, Morocco
Shahram Rahimi, Department of Computer Science and Engineering, Mississippi State University, Starkville, MS, USA

ISSN 1860-949X
ISSN 1860-9503 (electronic)
Studies in Computational Intelligence
ISBN 978-3-031-08814-8
ISBN 978-3-031-08815-5 (eBook)
https://doi.org/10.1007/978-3-031-08815-5

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

This book is organized into three major sections: (i) Social Aspects of Making City Smart, (ii) 5G and Other Networking Technologies for Smart Cities and (iii) Data Science and Business Analytics, IoT, AI and ML for Smart Cities. The first section covers the social aspects of smart cities, the key challenges, expert opinions on smart cities and the standards perspective. The second section explores the role of the latest networking technologies, especially 5G, in improving communication and connectivity among smart devices. The third section explores the role of data science, Internet of Things (IoT), artificial intelligence (AI) and machine learning (ML) solutions in smart city applications. A brief summary of each chapter is given below. The first and second sections contain six chapters each.

Khoi et al. present “A Smart City Analytical Framework in Economics”. This chapter covers data analytics for Vietnam. The authors discuss challenges in the Vietnamese context in terms of six factors: smart economy, smart governance, smart environment, smart citizen, smart traffic and smart living. These six factors are the primary focus of the analysis. A second chapter by Khoi et al., “Smart City—Development Trend in the World and Vietnam”, covers the most recent trends in Vietnam's efforts to develop smart cities. It states that Vietnam has no smart cities as such yet, so the findings could serve as future guidelines for smart city planners in Vietnam.

Piotr Karocki contributed “Smart City from a Standards Perspective”. This chapter covers many standard definitions of the smart city and what experts suggest a smart city should have. There are a few opinions about smart cities, for example that “it is not only about technology advancements”; a smart city goes well beyond them. Sustainable development goals (SDGs) are also an integral part of smart cities. Finally, the author concludes with the remark that a “smart city is a new way of thinking in city planning”.

Asaad et al. present “Point-of-Interests Recommendation Service in Location-Based Social Networks: A Survey, Research Challenges, and Future Perspectives”. This work aims to address most of the currently proposed ideas and
proposed techniques for offering accurate Point-of-Interest (POI) recommendation systems. It can be regarded as a state-of-the-art survey that also outlines future research possibilities in terms of taxonomies of POI-based recommendation solutions.

Guerra et al. contributed “San Marcos Smart City: A Proposal of Framework for Developing ISO 37120:2018-Based Smart City’s Services for Lima”. The main objective of their work is to build and test the main technological components for a smart city, including a layered architecture analysed from the physical layer to the application layer in terms of implementation and other challenges. This framework might be useful in related smart city areas such as education, environment, law, government and the private sector.

Saxena et al. carried out an in-depth analysis in “Social and Technical Challenges in Eco-Sustainable Smart City in India—An Analysis”. Their work emphasizes the need for cooperation among government, industry, citizens and others to develop an ideal environment for smart communities and cities. Various success factors for smart cities are discussed in the Indian context.

Sinha et al. reported “A Framework for Designing Long Term Digital Preservation System”. Their contribution discusses data preservation practices for the digital content of smart cities. Long-term and short-term goals for digital data preservation, and the associated challenges, are the key findings.

Giudici et al. discussed their findings in “Towards Sustainable Smart Cities: The Use of the ViaPPS as Road Monitoring System”. Their contribution emphasizes the role of ICT in road monitoring and showcases the ViaPPS, a mobile Pavement Profiling System, and its usefulness in helping smart city authorities track road networks and road conditions at various locations within a smart city.

Dhruvik et al. wrote about “Optimal Resource Allocation for Public Safety Device to Device Communication Using PSO”. Their work proposes an improved, optimized resource allocation algorithm for device-to-device (D2D) applications. It further prioritizes public safety communication (PSC) over commercial applications, which makes it useful for improving D2D communication in smart city applications.

Along similar lines, Shri Ram conducted a bibliometric analysis exploring research progress in Internet of Things (IoT) applications in smart cities and the ongoing challenges. Supported by published papers, the findings identify IoT as one of the most in-demand technology areas in smart city applications.

Anand et al. presented “Neural Network Based Task Scheduling in Cloud Using Harmony Search Algorithm”. The findings address cloud computing task scheduling challenges, and the authors compare the efficiency of the Moth, GA and harmony search (HS) techniques.

Gulati et al. presented “Neural Inspired Ant Lion Algorithm for Resource Optimization in Cloud”. The chapter concludes that the ant lion algorithm is the best option for resource allocation in cloud computing.

Kumar et al. detailed their findings in “Smart School Selection with Supervised Machine Learning”, in which a recommender system is proposed. The smart school selection application illustrated by the authors was found beneficial to students
and may help parents choose a school for their wards, which might in turn help improve grades through the right choice of school.

Gourisaria et al. stated their findings in “Artificially Intelligent and Sustainable Smart Cities”. This chapter covers the key aspects of smart cities: the technologies involved, their implementation, and the drawbacks associated with the proposed techniques are discussed with in-depth analysis.

Daniela López De Luise et al. submitted their results in “Machine Learning Self-Tuning Motivation Engine for Telemarketers”. They closely analysed the challenges of the telemarketing job and how the introduction of serious games may help telemarketers improve their overall performance.

Ibáñez et al. discussed “QROWD—A Platform for Integrating Citizens in Smart City Data Analytics”. This chapter presents a platform that provides a framework to help smart city planners and their team members design and implement human computation tasks. The work supports citizen sensing through crowdsourcing services which, according to the authors, can be integrated with machine processes to produce city data analytics for various applications, and it may be helpful in city planning as well.

Bawa et al. reported their findings in “Estimation of Short-Time Forecast for COVID-19 Outbreak in India: State-Wise Prediction and Analysis”. The work focuses on the state-wise impact of COVID-19 pandemic cases in India and how such analysis may be helpful in future crises and epidemic situations.

Pradeep Kumar Singh, Ghaziabad, India
Marcin Paprzycki, Warsaw, Poland
Mohamad Essaaidi, Rabat, Morocco
Shahram Rahimi, Starkville, USA

Contents

Social Aspects of Making City Smart

A Smart City Analytical Framework in Economics
Nguyen Thi Ngan and Bui Huy Khoi

Smart City—Development Trend in the World and Vietnam
Nguyen Thi Ngan and Bui Huy Khoi

Smart City from a Standards Perspective
Piotr Karocki

Point-of-Interests Recommendation Service in Location-Based Social Networks: A Survey, Research Challenges, and Future Perspectives
Safar Maghdid Asaad, Kayhan Zrar Ghafoor, Halgurd Sarhang, and Aos Mulahuwaish

San Marcos Smart City: A Proposal of Framework for Developing ISO 37120:2018-Based Smart City’s Services for Lima
Jorge Guerra Guerra, Marco Rios, Alvaro Aspilcueta, Juan Gamarra, Jorge Zavaleta, and Felix Fermin

Social and Technical Challenges in Eco-Sustainable Smart City in India—An Analysis
Devanshi Saxena, Shaweta Khanna, Sangeeta Mangesh, Manisha Chaudhry, and Kayhan Zrar Ghafoor

5G and Other Networking Technologies for Smart Cities

A Framework for Designing Long Term Digital Preservation System
Anand Kumar Sinha, Santosh Kumar, and H. M. Singh

Towards Sustainable Smart Cities: The Use of the ViaPPS as Road Monitoring System
Henri Giudici, Boris Mocialov, and Aslak Myklatun

Optimal Resource Allocation for Public Safety Device to Device Communication Using PSO
Navadiya Dhruvik, Rakesh Pavan, Neeraj, and M. Kiran

Research Progress in Internet of Things (IoT) Application in Smart Cities Development: A Bibliometric Analysis
Shri Ram

Neural Network Based Task Scheduling in Cloud Using Harmony Search Algorithm
Arnaav Anand, Pratyush Agarwal, Dinesh Kumar Saini, and Punit Gupta

Neural Inspired Ant Lion Algorithm for Resource Optimization in Cloud
Devansh Gulati, Mehul Gupta, Dinesh Kumar Saini, and Punit Gupta

Data Science and Business Analytics, IoT, AI and ML for Smart Cities

Smart School Selection with Supervised Machine Learning
Deepak Kumar, Chaman Verma, Veronika Stoffová, Zoltán Illes, Anish Gupta, Brijesh Bakariya, and Pradeep Kumar Singh

Artificially Intelligent and Sustainable Smart Cities
Mahendra Kumar Gourisaria, Gaurav Jee, G. M. Harshvardhan, Debanjan Konar, and Pradeep Kumar Singh

Machine Learning Self-Tuning Motivation Engine for Telemarketers
Daniela López De Luise and Rodrigo Borgia

QROWD—A Platform for Integrating Citizens in Smart City Data Analytics
Luis-Daniel Ibáñez, Eddy Maddalena, Richard Gomer, Elena Simperl, Mattia Zeni, Enrico Bignotti, Ronald Chenu-Abente, Fausto Giunchiglia, Patrick Westphal, Claus Stadler, Gordian Dziwis, Jens Lehmann, Semih Yumusak, Martin Voigt, Maria-Angeles Sanguino, Javier Villazán, Ricardo Ruiz, and Tomas Pariente-Lobo

Estimation of Short-Time Forecast for Covid-19 Outbreak in India: State-Wise Prediction and Analysis
Puneet Bawa, Virender Kadyan, Anupam Singh, Kayhan Zrar Ghafoor, and Pradeep Kumar Singh

Social Aspects of Making City Smart

A Smart City Analytical Framework in Economics
Nguyen Thi Ngan and Bui Huy Khoi

Abstract The idea of a smart city has grown in popularity, and communities have expressed a strong desire to turn into smart cities. Vietnamese cities face many challenges: urbanization pressure, population growth, traffic congestion, a polluted environment, and overloaded electricity, water, and traffic infrastructure. To overcome these challenges, urban paradigms with smart management of socio-economic, environmental, and transport infrastructure are inevitable. However, assessing the current level of development towards the smart city is a relatively recent issue in Vietnam. The development of smart cities is expected to be one of the most significant achievements of societies around the world in the twenty-first century. The paper proposes a smart city analytical framework based on applying statistical algorithms from theoretical and empirical studies on smart cities worldwide. The paper focuses on the theoretical structure of a smart city model and its implementation outcomes. The results reveal that the smart city is influenced by six factors. The research results will be a meaningful reference for researchers and policymakers, since the smart urban model is still quite new in the Vietnamese context. The paper also offers recommendations on some factors that influence the smart city model in Vietnam. The literature review on smart cities and our proposed research model are summarized in Sect. 2, while the Smart City Analytical Framework used in this study is presented in Sect. 3. The data in Vietnam, which are important for our discussion, are shown in Sect. 4, and the conclusion is given in the final section.

Keywords Vietnam · Smart city · Analytical framework · Factors

N. T. Ngan · B. H. Khoi (corresponding author)
Industrial University of Ho Chi Minh City, Ho Chi Minh City, Vietnam
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_1

1 Introduction
The development of smart cities is expected to be one of the most significant achievements of societies around the world in the twenty-first century. One of the most
critical and complex aspects of smart cities is the development of roles and relationships among the key actors involved in envisioning and building them as this growth continues [19]. We are living at the convergence of two major phenomena in human history: the emergence of global urbanization and the digital revolution. According to the United Nations, 55% of the world’s population currently lives in cities, and that share is expected to increase to 68% by 2050. In 30 years, about 2.5 billion more people are expected to live in urban areas, mostly in Asia and Africa, based on current population growth rates [4].

Currently, Vietnamese cities face many challenges: urbanization pressure, population growth, traffic congestion, a polluted environment, and overloaded electricity, water, and traffic infrastructure. To overcome these challenges, urban paradigms with smart management of socio-economic, environmental, and transport infrastructure are inevitable. However, assessing the current level of development towards the smart city is a relatively recent issue in Vietnam. Therefore, this research mainly aims to produce an analytical framework for the pillars of the smart city approach, in line with international and local practice, together with measures to evaluate these pillars through selected context-specific scales. The proposed analytical framework and the selected criteria are then used to assess how closely Vietnamese cities approach the smart city model. From there, we offer some initial key policy suggestions to promote steps towards smart cities. The research results will be a meaningful reference for researchers and policymakers, since the smart urban model is still quite new in the Vietnamese context but attracts strong interest [9].

The paper is structured as follows. The literature review on smart cities and our proposed research model are summarized in Sect. 2, while the Smart City Analytical Framework used in this study is presented in Sect. 3. The data in Vietnam, which are important for our discussion, are presented in Sect. 4, and the conclusion is given in the final section. The paper also offers recommendations on some factors that affect the smart city model in Vietnam.

2 Literature Review
The idea of a smart city has grown in popularity, and communities have expressed a strong desire to turn into smart cities. As a result, such a transformation necessitates a strategic approach [19]. A smart city is one that not only uses ICT in its infrastructure but can also incorporate people, information, and technology to create an effective, sustainable, and resilient infrastructure that provides high-quality services and improves residents’ quality of life [21]. The idea of a “smart city” was first implemented in 1994. The smart city concept is based on environmental sustainability, with the primary goal of reducing greenhouse gas emissions in urban areas through the use of cutting-edge technology. The
growing interest in the smart city concept, as well as the need to address the challenges of urbanization, has resulted in many private and public investments in technology creation and implementation [1].

There are many concepts of the “smart city,” and the term is used inconsistently. The ideas of previous researchers are presented below. Mitchell [15] argues that the origin of this concept lies in the “Connection City,” which promised that new telecommunications technologies would provide an unprecedented amount of documents and data to families and businesses via “information highways,” resulting in a world dominated by the media. The Digital City, a city characterized by technology that uses broadband infrastructure to enable electronic governance and is a “global world” for public transactions, is another forerunner of the smart city. Giffinger et al. [8] argue that a smart city is a well-functioning city founded on a clever combination of resources and the activities of self-determined, autonomous, and informed people. According to Hollands [10], to ensure sustainable growth, smart cities depend on the use of infrastructure networks to increase economic and political performance. Caragliu et al. [5] believe that a city is smart when it receives investments in people and social capital to create a sustainable economy, improve the quality of life, and manage natural resources effectively. Alam et al. [2] present the smart city as one that uses ICT infrastructure and in which economic growth, social sustainability, and quality of life all depend on human resources, social capital, and environmental resources. Beyond this definition, the authors want to show that smart cities are critical for long-term urban growth, since they can alleviate many essential issues associated with the current urbanization process, such as traffic congestion, environmental pollution, and limited natural resources.

Petrolo et al. [17] suggested six factors affecting smart cities, including smart governance, smart people, smart economy, smart living, and smart environment. The smart city model usually combines the city’s economic, social, and environmental components in a way that sustainably maximizes the productivity of the city’s primary structures. Elhoseny et al. [7] listed smart government, smart living, smart economy, smart education, smart mobility, smart service, and smart community among nine factors relevant to smart cities. Talari et al. [20] listed five aspects of smart cities that are affected: smart houses, smart buildings, attentive consumers, smart energy, and smart grids. Kumar and Dahiya [13] highlight six main building blocks that make up a smart city system: (i) smart people, (ii) smart city economy, (iii) smart mobility, (iv) smart climate, (v) smart living, and (vi) smart governance. Ngan and Khoi [16] noted that policy analysts face a challenge in developing an effective quantitative model based on economic theory and empirical evidence. To find realistic evidence for the model, their research team gathered data from 362 people in Ho Chi Minh City, Vietnam. Their analysis showed that the smart city is affected by Smart Economy, Smart Governance, Smart Environment, Smart Citizens, Smart Traffic, and Smart Living. Ji et al. [11] show that to achieve people-centric smart city growth, city officials must have a clear understanding of citizens’ needs and expectations of smart city
services. The findings show that the majority of our respondents viewed SC services as both necessary and useful to their continued life, relatedness, and development and that they preferred SC services to function in the “hard” domain—such as smart energy, smart transportation, or smart safety—rather than the “soft” domain, such as smart living. Finally, six factors are affecting Smart Cities include Smart Economy, Smart Governance, Smart Environment, Smart Citizens, Smart traffic, and Smart living.

3 Smart City Analytical Framework

Smart Economy

Smart economy characteristics cover elements of competitiveness, such as innovation, entrepreneurship, branding, productivity, flexibility of the labor market, and integration into domestic and international markets. A typical smart economy relies on electronic transactions (E-Business, E-Commerce) built on information and communication technology systems to maximize transaction efficiency, save costs, and encourage innovation in products, services, and business models. Information and communication technology systems help to create smart ecosystems through digital forms of business and entrepreneurship. In addition, the application of high technology in the economy creates flows of tangible and intangible products, services, and knowledge over the network, promoting the connectivity of local and global economic transactions [9].

Smart Governance

Smart governance is evaluated by people’s participation in urban management, utility services, and administrative activities [8]. A municipality with smart governance is one where residents can contribute ideas to management activities to help the city operate more efficiently. To achieve these goals, information and communication technologies play a key role in smart city development [14]. Information and communication technology systems include hard and soft infrastructures that provide an open database through which people can connect with one another or with the community, for example to monitor urban activities. These electronic services and E-Government can be provided through mobile applications. Smart governance is the first important pillar, and it connects the remaining pillars of Smart Cities.

Smart Environment

The smart environment is assessed on habitat conditions, green area, pollution, and the effectiveness of pollution treatment and remediation measures. As with the features described above, a smart city applies high technology in environmental management, especially in the energy sector. According to Manville et al. [14], smart cities must use “smart energy,” including renewable energy sources, and must apply information and communication technology systems in energy grids as well as in measuring, monitoring, and controlling pollution. Information and communication technology also needs to be applied to create green buildings and green urban planning. In addition, urban services such as street lights, waste discharge systems, water supply, and drainage need to be monitored through information and communication technology systems to evaluate the effectiveness of environmental management.

Smart Citizens

The smart citizen pillar is based not only on urban residents’ education but also on the level of individual interaction with the community and the connections among people in society [8]. The Smart Cities report [6] assessed that cities with smart people are cities in which individuals have the opportunity to learn throughout their lives, where ways are found to increase social integration, improve the quality of people’s lives, and create opportunities and motivation to enhance people’s creative spirit, and where people are guaranteed access to open data systems anytime, anywhere. In addition, Manville et al. [14] argued that smart people are individuals who have e-skills and can work in ICT-enabled settings (ICT-Enabled Working).

Smart Traffic

The smart traffic system in the city not only meets normal travel needs but also provides traffic information to people through applications, upgrading the existing transport system to be more modern and sustainable. Specifically, the traffic network must be safe, clean, and above all flexible and effective, covering trams, buses, trains, cars, bicycles, and pedestrians. People can easily change between transport modes to save the most time and money.
To do so, the smart transport system must provide real-time vehicle data so that people can access the system and choose efficient means of transportation. As a result, the system brings many benefits to both the people and the government, such as cost savings, reduced CO2 emissions, and an improved feedback system built on the connection of electronic citizens to the smart traffic system.

Smart Living

This pillar includes all aspects used to assess people’s quality of life, such as health, safety, culture, housing, and tourism. Smart living is a life that applies ICT-enabled lifestyles and behavior. Smart living must be healthy and safe, with high-quality housing, social capital, and high levels of social cohesion [14]. In cities with smart living, natural resources must be managed in a “smart” way, and a sustainable urban environment should be created with “smart” plans for roads, public spaces, and facilities. Most importantly, urban development and management plans should work toward the common goal of improving and assessing the quality of life of urban residents.

The six pillars above are often applied in assessing a smart city according to international practice; they are also the development goals that urban managers aim to achieve. These six pillars are built on three important foundations: (1) technology, (2) institutions, and (3) people [14]. Technology includes physical infrastructure, smart technology, mobile technology, virtualization technology, and digital networks; people include human resources and social capital; and institutions include governance, policy, and regulations. These three foundations act as vehicles and tools that help achieve the six pillars of smart cities.

Thus, the analytical framework of smart cities includes six pillars, (1) smart governance, (2) smart economy, (3) smart traffic, (4) smart environment, (5) smart citizens, and (6) smart living, built on three foundations: (1) technology, (2) institutions, and (3) people [6, 8, 14].

4 Data in Vietnam

First, the research of Khoi and Ngan [12] showed that six factors influence the smart city. Their quantitative research was carried out with a sample of 314 people in Vietnam. According to the authors, the challenge is to persuade people that the Smart City is fair and in line with current trends, and that everyone should cooperate. The six factors influencing the smart city account for 73.3 percent of the total (Fig. 1). Table 1 shows that the six hypotheses are accepted because their p-values are less than 0.05.

Second, the indexes of the six pillars of smart cities were calculated, with each indicator composed of 1–6 components; the smart city index and the six pillar indexes for Southeast Vietnam are shown in Table 2. The indexes are computed with principal component analysis, a machine learning algorithm [18]. Overall, Ho Chi Minh City leads the way in the smart city approach, with the highest smart city index of 0.86, followed by Binh Duong, Dong Nai, Tay Ninh, Ba Ria–Vung Tau, and Binh Phuoc, whose smart city indexes are 0.36, 0.07, –0.30, –0.41, and –0.90, respectively (Table 2). Although Ho Chi Minh City leads in most of the indexes, it ranks behind Binh Duong in the smart governance index and only fourth in the smart living index (after Binh Duong, Tay Ninh, and Dong Nai). Another noteworthy point is that, of the six provinces and cities in the Southeast, Binh Phuoc currently has the least potential for a smart urban approach. Overall, the figures in Table 2 show the superiority of Ho Chi Minh City over the other areas of Southeast Vietnam on the pillars of a smart urban model.
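The two readings of the tables made in this section can be reproduced from the published numbers. The sketch below is only an illustration, not the authors' procedure: the values are transcribed from Table 1 and Table 2, and the names `two_sided_p`, `pillar_mean`, and `ranking` are hypothetical helpers introduced here. The p-values use a standard normal approximation, which is an assumption; the source reports results from SmartPLS and does not say how its p-values were obtained. The comparison of the overall index with the unweighted mean of the six pillar indexes is likewise only a check on the table, since the source does not state how the overall index is aggregated.

```python
import math

# Reported t-statistics, transcribed from Table 1 (Khoi and Ngan [12]).
T_STATS = {
    "Smart citizen": 4.73,
    "Smart economy": 5.53,
    "Smart environment": 7.19,
    "Smart governance": 4.87,
    "Smart living": 6.79,
    "Smart traffic": 6.76,
}

# Transcribed from Table 2 (Hoai et al. [9]): six pillar indexes in the
# order given by PILLARS, followed by the reported overall smart city index.
PILLARS = ("economy", "citizen", "governance", "traffic", "environment", "living")
TABLE2 = {
    "HCM City": ((0.67, 1.65, 0.55, 1.05, 1.38, -0.16), 0.86),
    "Dong Nai": ((0.02, -0.70, 0.30, 0.06, 0.41, 0.31), 0.07),
    "Binh Duong": ((0.19, 0.39, 0.58, 0.25, -0.31, 1.07), 0.36),
    "Ba Ria-Vung Tau": ((-0.34, -0.78, -0.14, -0.95, 0.68, -0.93), -0.41),
    "Binh Phuoc": ((-0.10, -0.48, -0.90, -1.42, -1.16, -1.34), -0.90),
    "Tay Ninh": ((-0.44, -0.09, -0.39, -0.93, -1.00, 1.04), -0.30),
}


def two_sided_p(t: float) -> float:
    """Two-sided p-value of t under a standard normal approximation.

    Assumption: with 314 respondents the normal curve is an acceptable
    stand-in for the exact reference distribution.
    """
    return math.erfc(abs(t) / math.sqrt(2.0))


def pillar_mean(area: str) -> float:
    """Unweighted mean of the six pillar indexes for one area."""
    pillars, _reported = TABLE2[area]
    return sum(pillars) / len(pillars)


def ranking(column: str = "city") -> list[str]:
    """Areas sorted from highest to lowest on one pillar or the overall index."""
    def key(area: str) -> float:
        pillars, reported = TABLE2[area]
        return reported if column == "city" else pillars[PILLARS.index(column)]
    return sorted(TABLE2, key=key, reverse=True)


if __name__ == "__main__":
    for factor, t in T_STATS.items():
        print(f"{factor}: t = {t:.2f}, approx. p = {two_sided_p(t):.1e}")
    for area in ranking():
        print(f"{area}: mean of pillars = {pillar_mean(area):+.2f}, "
              f"reported index = {TABLE2[area][1]:+.2f}")
    print("Governance ranking:", ranking("governance"))
    print("Living ranking:", ranking("living"))
```

With the values as transcribed, every approximate p-value is far below 0.05 and rounds to 0.00, and the unweighted mean of the six pillar indexes agrees with the reported smart city index to two decimals for every area. That agreement suggests, but does not establish, that the overall index is a simple average of the pillar indexes.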

Fig. 1 Smart city model. Source: Khoi and Ngan [12]

Table 1 Factors impacting the smart city
Relationship | Coefficient | SD | t | p-value | Decision
Smart citizen → Smart city | 0.19 | 0.042 | 4.73 | 0.00 | Supported
Smart economy → Smart city | 0.29 | 0.05 | 5.53 | 0.00 | Supported
Smart environment → Smart city | 0.27 | 0.04 | 7.19 | 0.00 | Supported
Smart governance → Smart city | 0.21 | 0.04 | 4.87 | 0.00 | Supported
Smart living → Smart city | 0.30 | 0.04 | 6.79 | 0.00 | Supported
Smart traffic → Smart city | 0.29 | 0.04 | 6.76 | 0.00 | Supported
Source: Khoi and Ngan [12]

Table 2 Smart city assessment index of some areas in Vietnam
Area | Smart economy index | Smart citizen index | Smart governance index | Smart traffic index | Smart environment index | Smart living index | Smart city index
HCM City | 0.67 | 1.65 | 0.55 | 1.05 | 1.38 | –0.16 | 0.86
Dong Nai | 0.02 | –0.70 | 0.30 | 0.06 | 0.41 | 0.31 | 0.07
Binh Duong | 0.19 | 0.39 | 0.58 | 0.25 | –0.31 | 1.07 | 0.36
Ba Ria–Vung Tau | –0.34 | –0.78 | –0.14 | –0.95 | 0.68 | –0.93 | –0.41
Binh Phuoc | –0.10 | –0.48 | –0.90 | –1.42 | –1.16 | –1.34 | –0.90
Tay Ninh | –0.44 | –0.09 | –0.39 | –0.93 | –1.00 | 1.04 | –0.30
Source: Hoai et al. [9]

5 Conclusion

In short, to realize the goal of building a smart city that improves people’s quality of life and moves toward sustainable economic development, Vietnam first needs to focus on three main tasks: improving the quality of infrastructure, improving human capital, and developing information and communication technologies, in particular applying the achievements of communications technology to governance, the economy, the environment, and mobility. Vietnam must take advantage of favorable human, institutional, and technological conditions to play a pioneering role in the development of smart urban models. From there, Vietnam’s development can become a good practice that spreads across Southeast Asia. The market is filled with smart devices that are simple to use and to integrate into an IoT system [3], which gives access to the smart city models used around the world.

Implications

First, recommendations on life in Vietnam: Vietnam has a diverse and rich cultural foundation; health care services satisfy the needs of the people; the safety of the people is always guaranteed; the quality of housing is very good; educational institutions are of good quality; Vietnam has great potential for tourism development; and, finally, happiness and social benefits for people in Vietnam are very good.

Second, recommendations on mobility in Vietnam: people in Vietnam easily access information from state agencies and from around the world, Vietnam has a good information technology infrastructure, and Vietnam can develop a sustainable transport system.

Third, recommendations on economic development in Vietnam: Vietnam has a creative economy; business owners in Vietnam show high competitiveness; the brand image of Vietnamese enterprises is highly appreciated by consumers; labor productivity is highly efficient; the labor market is very flexible; and international linkage is very high.

Fourth, recommendations on the living environment in Vietnam: Vietnam must have a good natural environment, good air conditions, public awareness of a good ecosystem, and sustainable management of its resources.

Fifth, recommendations for the development of the people in Vietnam: education levels and qualifications must be good, residents show a spirit of continuing education, and the people are inclined toward open thinking (open and capable of accepting the new).

Sixth, recommendations on urban management in Vietnam: policies facilitate people’s integration into community life, community and social services are good, and urban management policy moves toward transparency.

6 Limitations of the Research and the Next Researches

This study is still limited in terms of data and the selection criteria for the proposed smart city framework; both will be improved in further studies when resources are available. Currently, Vietnam has only the idea of a smart city, and no city yet meets the standards or is recognized as a smart city in the country or the region. Therefore, there is still much debate about the criteria used as proxies to evaluate local smart city efforts. Although the study tried to select criteria that represent the pillars of smart cities from previous research, these criteria are limited by the data available in Vietnam, especially the observational data for the smart environment and smart living pillars.

Acknowledgements This work was supported in part by Industrial University of Ho Chi Minh City, Vietnam.

Authors Contribution Bui Huy Khoi contributed to the study of data and the gathering of research-related references. Nguyen Thi Ngan contributed to the compilation of data and revised the manuscript.

Conflicts of Interest The authors declare no conflict of interest.

References

1. Ahvenniemi, H., Huovila, A., Pinto-Seppä, I., Airaksinen, M.: What are the differences between sustainable and smart cities? Cities 60, 234–245 (2017)
2. Alam, F., Mehmood, R., Katib, I., Albeshri, A.: Analysis of eight data mining algorithms for smarter Internet of Things (IoT). Procedia Comp. Sci. 98, 437–442 (2016)
3. Anand, P., Singh, Y., Selwal, A.: Internet of Things (IoT): vulnerabilities and remediation strategies. Paper presented at The International Conference on Recent Innovations in Computing (2020)
4. Bach, K.H.V., Kim, S.-K.: Developing smart city: based on the assessment of smart projects in medium-size cities, Vietnam. Am. Scien. Res. J. Eng. Technol. Sci. (ASRJETS) 56(1), 38–49 (2019)
5. Caragliu, A., Del Bo, C., Nijkamp, P.: Smart cities in Europe. J. Urban Technol. 18(2), 65–82 (2011)
6. Cities, S.: Regional Perspectives. The Government Summit Thought Leadership Series (2015)
7. Elhoseny, H., Elhoseny, M., Riad, A., Hassanien, A.E.: A framework for big data analysis in smart cities. Paper presented at the International Conference on Advanced Machine Learning Technologies and Applications (2018)
8. Giffinger, R., Fertner, C., Kramar, H., Meijers, E.: City-ranking of European medium-sized cities. Cent. Reg. Sci. Vienna UT, 1–12 (2007)
9. Hoai, N.T., Dung, N.V., Duyen, T.T.P., Vien, N.V.: Smart urban analysis framework: case study of Southeast Vietnam (Vietnamese). J. Asian Bus. Econ. Stud. 29(6), 05–26 (2018)
10. Hollands, R.G.: Will the real smart city please stand up? intelligent, progressive or entrepreneurial? City 12(3), 303–320 (2008)
11. Ji, T., Chen, J.-H., Wei, H.-H., Su, Y.-C.: Towards people-centric smart city development: investigating the citizens’ preferences and perceptions about smart-city services in Taiwan. Sustain. Cities Soc. 67, 102691 (2021)
12. Khoi, B.H., Ngan, N.T.: Factors impacting to smart city in Vietnam with smartpls 3.0 software application, vol. 10, pp. 1–8. IIOAB (2019)
13. Kumar, T.V., Dahiya, B.: Smart economy in smart cities. In: Smart Economy in Smart Cities, pp. 3–76. Springer (2017)
14. Manville, C., Cochrane, G., Cave, J., Millard, J., Pederson, J.K., Thaarup, R.K., … Kotterink, B.: Mapping smart cities in the EU (2014)
15. Mitchell, W.J.: Designing the digital city. Paper presented at the Kyoto Workshop on Digital Cities (1999)
16. Ngan, N., Khoi, B.: Determinants influencing to smart city. J. Adv. Res. Dyn. Control Syst. 12, 676–681 (2020)
17. Petrolo, R., Loscri, V., Mitton, N.: Towards a smart city based on cloud of things, a survey on the smart city vision and paradigms. Trans. Emerg. Telecommun. Technol. 28(1) (2017)
18. Singh, P.K., Kar, A.K., Singh, Y., Kolekar, M.H., Tanwar, S.: Proceedings of ICRIC 2019: Recent Innovations in Computing, vol. 597. Springer Nature (2019)
19. Stojanović, P.D.B., Kostić, P.D.Z., Vučić, P.D.V.: Sustainable business models in the light of the digital transformation: smart city perspective. Contemp. Econ. Bus. Issues 105 (2021)
20. Talari, S., Shafie-khah, M., Siano, P., Loia, V., Tommasetti, A., Catalão, J.P.: A review of smart cities based on the internet of things concept. Energies 10(4), 421 (2017)
21. Wu, Y.J., Chen, J.-C.: A structured method for smart city project selection. Int. J. Inf. Manage. 56, 101981 (2021)

Smart City—Development Trend in the World and Vietnam Nguyen Thi Ngan and Bui Huy Khoi

Abstract Technology and data are used by smart cities to improve the productivity, economic growth, sustainability, and quality of life of urban people. In order to encourage social development, smart cities must be formally recognized by national and international authorities and organizations. First, this paper aims to research smart city concepts and domains. To better understand the smart city concept, the well-known smart city norms will be presented. Well-defined principles make concrete comparisons between implementing smart cities possible. How smart city projects make a city smarter and enhance the quality of life will be addressed. This review highlights that realistic requirements for implementing smart cities are significant. This paper serves as a guide to the most recent trends in Vietnam’s criteria for smart cities. Keywords Smart city · Trend · Development · Vietnam

1 Introduction

As the world’s population grows at an exponential rate, global energy demand rises geometrically, and technology breakthroughs make infrastructure more resilient than ever before, there is a pressing need for smart cities to build innovative energy systems [1]. Rapid urbanization and advancements in information and communication technology [19] are two of the most significant factors influencing urban security planning and governance. The latter, in particular, has influenced the concept of smart cities, which has become increasingly popular in recent years [10]. By 2050, 66% of the global population is projected to live in urban areas. The task will be to provide vital services for these communities, including adequate electricity, clean water, and nutritious food, while guaranteeing maximum economic, social, and environmental sustainability [9].

N. T. Ngan · B. H. Khoi (B) Industrial University of Ho Chi Minh City, Ho Chi Minh City, Vietnam e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_2


The process of urbanization has put enormous pressure on cities to provide the infrastructure, housing, services, and environment needed to meet the living needs of present and future generations. Smart city construction is a strategic solution to the problems that arise from rapid population growth and rapid urbanization in today’s major cities. According to the latest report, the APEC Dialogue on Urban Sustainability, organized by the Friends of the Chair (FotC) on Urbanization, was held at the Second Senior Officials Meeting (SOM2) of the Asia–Pacific Economic Cooperation (APEC) Forum and related meetings. Currently, about 1.8 billion people in the region (about 60% of the population) live in urban areas. The urban population is expected to reach 2.4 billion people in 2050, an increase of 33%; some economies will be more than 80% urbanized, while others will continue to urbanize rapidly, and 14 of the world’s 37 megacities will be in the Asia–Pacific region [5].

These numbers pose a common and difficult problem for many countries: how to develop a sustainable economy while ensuring a healthy living environment for people. Exploding populations leave urban centers overcrowded, slums have grown, and many problems have arisen in infrastructure, trade, and sanitation. With rapid urbanization, cities are grappling with major challenges in health, utilities, livelihoods, environmental sustainability, and community infrastructure, while sanitation, education, housing, and medical facilities in cities become unsafe and weak. In that context, the concept of the “Smart City” is receiving more attention than ever, as leading economists agree that building modern infrastructure will help reduce the burden and impact of population growth and of the increasingly common problems of urbanization.

Vietnam is no exception to this general trend. For many years, many provinces and cities in the country have been gradually building and applying the Smart City model, but the results are still not appropriate. Rapid population growth creates many problems that need to be solved, such as housing, traffic jams, encroachment on pedestrian sidewalks, and urban pollution, all of which are becoming extremely difficult. These issues cause headaches for managers, planners, and city residents alike, and they raise the urgent need to build a Smart City model that meets current demands.

2 Concepts About the Smart City

The concept of a smart city has been around since the late 1990s and has evolved over time. Many smart city ideas have been put forward since then. The expression “smart city” refers to the adoption and implementation of mobile computing systems through practical data management networks among all city components and layers [25]. Cities are putting more emphasis on using data management networks, such as the Internet of Things [18], big data, and cloud computing technologies, to become smarter. These data management systems improve different components of operations and organizations in the smart city, including traffic lights, sustainable resource management, quality of life, and infrastructure [8].

First, Smart Cities are innovative cities that use information and communication technology and other means to improve quality of life, the efficiency of urban operations and services, and competitiveness, while meeting the needs of present and future generations in economic, social, environmental, and cultural terms [23].

Second, a smart city or smart community is a community in which innovative, advanced, and reliable information and communication technologies, energy technologies, and other related mechanisms are used to improve people’s health and quality of life; increase effectiveness and efficiency in operating costs and in providing civil services; promote economic development; and create a community that people find safe, secure, sustainable, resilient, and worth living and working in [24].

Third, smart cities are places where traditional networks and services are made more efficient through the use of digital and telecom technologies for the benefit of residents and businesses [4].

3 Smart City Model in the World

Cities around the world have determined that to operate smartly, they need to become smart cities. To see how cities around the world have changed thanks to smart city strategies, consider the city-specific report in [20]. To grow smarter, cities are focusing more on data management networks, such as the Internet of Things, big data, and cloud computing technologies [8]. The Internet of Things (IoT) refers to a growing system of digital instruments, smart devices, and smart household appliances. The Internet of Things is attracting residents’ attention, and this increased awareness of these technologies will improve people’s quality of life [16]. It is a prime example of the applications now being developed for upcoming smart cities. Nevertheless, every city’s priorities and key implementations for smart cities vary, and these kinds of smart gadgets and systems will be used in the future [3]. Moreover, given the growing number of inhabitants in urban zones, adequate facilities and environmental necessities are difficult to provide; hence, IoT technologies have emerged as a viable answer for establishing a functioning smart city [6]. Although establishing an IoT architecture is a difficult endeavor, data management approaches have been cited frequently in recent literature on the development of smart cities. As noted in [13], smart cities should include a wide range of devices, link-layer technologies, and services. Furthermore, these systems may be easily adapted to a variety of habitats, including floating cities, and smart cities, as another smart environment on the planet, should grow [26].

Data production is thought to be feasible in practically every aspect of human activity, promising new perspectives on our environment. The availability of data demonstrates how Big Data may aid in the most efficient use of resources and in making well-informed decisions. The Internet of Things (IoT) and Artificial Intelligence (AI) can contribute greatly to this process [2]. The rapid development of IoT technology motivates researchers and scientists to develop novel application areas and Internet of Things services [11], and these new smart services should be able to suit the needs of citizens all across the world. Human needs will also be taken into account by exchanging and gathering data inside IoT services in order to enhance awareness of smart city concepts around the world. As a result, actuating, networking, processing, and sensing should all be integrated into the network [28]. Monitoring, collecting, archiving, and sharing open device data from IoT campaigns are also significant aims for smart city development and research [14]. There are many studies in the current literature on essential smart city subjects, such as environmental monitoring of smart cities [12]; people’s quality of life in a smart city, focusing primarily on four city-scale phenomena (weather, environment, public transportation, and people flow); and data gathering and quality analysis in a semantic web context [7, 8]. The study of Allam and Dhunny [2] examines AI’s urban potential and presents a new paradigm for connecting AI and cities while assuring the integration of crucial cultural, metabolic, and governance components.

Seoul City

The city of Seoul is one of the leading cities for e-management. The Smart Seoul Plan (2015) was launched to affirm Seoul’s position as a high-tech leader and to maintain the city’s competitiveness. The city government also found that citizens’ expectations were increasing on many issues, including the digital divide, aging, transportation, and sustainability (Fig. 1).

Toronto City

Toronto is the highest-rated Smart City in North America. IBM (International Business Machines) has opened a Business Solutions Analysis Center in Toronto. Toronto, which is also an active member of the C40 (a network of major cities around the world committed to tackling climate change), is looking to move to a low-carbon economy. The private sector in Toronto is creating a new smart transportation initiative in the hope of improving transportation efficiency in urban areas. Toronto also recently started using natural gas from landfills to fuel its garbage collection trucks. Even though Toronto is recognized as a leading smart city, it will take time to deal with rapidly growing problems like affordable housing, low birth rates, and rising unemployment.


Fig. 1 Seoul Smart city based on data [21]

Singapore City

In the article “Smart cities: the Singapore case” (1999), the researcher Arun Mahizhnan pointed out the factors that contribute to making Singapore a Smart City: smart technology and smart people. In particular, the article highlights the importance of IT and IT users for Smart City development. However, the article only explores how to apply IT in building a Smart City, strongly emphasizing the importance of technology and ignoring other factors. It also does not produce a quantitative model of the Smart City.

Barcelona City

The four main factors in Barcelona’s Smart City model are smart governance, smart economy, smart living, and smart people. Smart City initiatives are launched and thoroughly addressed through specific reforms in each area, with ICT as the main means of developing and improving each field. The weakness of this model is that it has not undergone quantitative analysis to survey the opinions of people in Barcelona; it rests only on the opinions of politicians. The four main factors are highly generalized but still lack some factors needed for the comprehensive development of a Smart City.

From the analysis of these Smart City models, the following lessons for building a Smart City can be drawn. Each city, depending on its specific conditions and needs, will choose an appropriate way to build a smart city model for its locality. An effective measure is to improve the environment and living conditions of the city’s people. When people’s lives and living conditions improve, they cherish their surrounding environment, the government is closer to the people, and the people trust the management and administration of the government; this is the most practical smart city model.

4 Smart City Development in Vietnam

Since the launch of Doi Moi (Renovation), a series of economic reform programs, in 1986, the Vietnamese government has been eager to keep pace with global economic development. Urbanization has been accelerated by this shift in institutional and political authority. Between 1990 and 2017, the urban population of Vietnam increased from 13.8 million (or 20.2% of the total population) to 33.6 million (or 35.2%) [22].

In recent years, 10 provinces and cities in Vietnam have developed and approved smart urban projects based on the models of corporations and information technology enterprises. These projects focus only on promoting the development of information technology infrastructure rather than on investment in technical infrastructure. Five smart city projects in Vietnam are listed in Table 1. In 2012, Da Nang was the first city chosen by IBM as one of 33 cities in the world. At that time, Da Nang received funding from a smarter city program, with a total value of over 50 million USD, to use a smart operations center solution to ensure the quality of water sources serving the people, provide the best public transport, and reduce traffic congestion. Ho Chi Minh City and Hanoi have also taken concrete steps toward different aspects of smart cities, such as experimenting with cards in place of traditional bus tickets, implementing city Wi-Fi in some places, proposing the use of mobile phones to transmit traffic information, and supporting ideas from some businesses to digitize daily life activities.

Table 1 Smart city projects in Vietnam
Unit | City | Time | Field
1 | Ha Noi | 2020–2030 | The smart transport system, smart public administrative reform procedures
2 | Ho Chi Minh City | 2017–2025 | Model solutions for financial attraction, media, and construction; build a startup ecosystem; the framework of information technology architecture (open-oriented communication)
3 | Da Nang | 2030–2045 | Smart governance; smart economy; smart traffic; smart environment; smart life; and smart citizens
4 | Ha Long | 2017–2020 | Smart mobility, smart governance, smart citizens, smart economy
5 | Can Tho | 2017–2025 | Digital government model, maximizing accessibility, saving energy, and business-friendly


Vietnam’s three major cities, Hanoi, Ho Chi Minh City, and Da Nang, will be part of the ASEAN smart cities network proposed by Singapore before the 32nd ASEAN Summit. According to data from Viettel Group, a favorable factor for the development of smart cities in Vietnam is the fairly high proportion of Internet users in the total population, which ranks in the top 10 in Asia. Vietnam currently has about 49 million Internet users, a penetration rate of 51.5%. Cities with suitable conditions for applying smart urban models include Hai Phong, Da Nang, Thanh Hoa, Thai Nguyen, Ha Long, Hue, Can Tho, Rach Gia, Phu Quoc, Nha Trang, and Quy Nhon.

Alongside the development of smart cities in Vietnam, real estate developers are also transforming urban areas in line with the new smart urban trend. The “Smart City” urban area in Hai Boi commune, Vinh Ngoc, Dong Anh district, Hanoi, developed by a joint venture of Sumitomo (Japan) and BRG Group, was launched in 2018 with an area of 272 hectares and is one of the first smart city urban projects in Vietnam. This “Smart City” in the north of Hanoi is planned to be friendly to nature, to develop management systems, and to apply high technology in the fields of energy, quality, health, and education.

Another investor that quickly followed the world trend is Vingroup. In 2019, it renamed its urban project on the Thang Long Avenue axis, formerly known as Vinhomes Sportiva, to Vinhomes Smart City (“Dynamic Smart City”). On an area of 280 hectares, Vinhomes Smart City has studied and applied smart urban models from around the world, such as Singapore, Songdo in Korea, and Fujisawa in Japan. The investor quickly grasped the trend and changed direction to introduce a new urban area model with a smart ecosystem based on four core axes: Smart Security, Smart Management (smart operation), Smart Community, and Smart Home.

Vinhomes Smart City promises smart security and safety: a multi-layer camera system with integrated artificial intelligence that automatically recognizes faces and license plates and raises alerts about unfamiliar objects in the urban area, a smart elevator system, intelligent fire protection, air quality monitoring, environmental pollution warnings, and up-to-date traffic information for the city and the urban area via a mobile phone application [17] (Fig. 2).

Fig. 2 Vinhomes Smart City in Vietnam [15]


For example, Vinhomes introduced smart operation through a centralized, 24/7 operation center that applies artificial intelligence and the Internet of Things (IoT) to supervise and operate the entire megacity. Other businesses are also developing smart cities in Vietnam. Ecopark Group quickly caught up with the smart city trend: in May 2019, it signed a strategic cooperation agreement with Ecotek Technology Services Joint Stock Company and Fundacion Metropoli (under Metropoli Ecosystems, Spain) to implement a smart city project in the new Ecopark subdivision, a shopping mall area of 70 hectares. Ecopark aims to become a pilot smart city under the Prime Minister’s decision approving the “Vietnam Smart City Sustainable Development Scheme for the period 2018–2025 and orientation 2030”.

5 Conclusion and Implications

Although most city-specific smart city action plans target human capital, the question of how to nurture and integrate urban people has remained at a strategic level for lack of efficient implementation methods. Similarly, despite its significance, the role of urban form in determining the future of smart cities has received little attention. The interplay between ICT infrastructure, human capital, and a city’s urban form has not been addressed; urban form has been considered solely as context [22].

Vietnam joined the global internet network in 1997, and the smart city concept was introduced and implemented in other countries in the 1990s. The government and academics believe that the smart city development strategy can help the country catch up and even compete with the rest of the world, as a tool for managing rapid urbanization, boosting local and national economies, improving socioeconomic equality, and achieving long-term development goals [27].

Smart city development brings many benefits to the community, the environment, and society. The world and Vietnam have been accelerating development towards smart cities. The quality of life of people in smart cities will be enhanced through these benefits: reduction of greenhouse gas emissions; crime prevention solutions and emergency support services; smart transport solutions that save time; smart medical solutions that help increase longevity; job creation; and housing and energy-saving solutions that cut costs and support the implementation of the UN sustainable development targets.

Acknowledgements This paper was funded by IUH, Vietnam.

Authors Contribution Both authors contributed to the paper. Bui Huy Khoi contributed to the study of data and the gathering of research-related references. Nguyen Thi Ngan contributed to the compilation of data and revised the manuscript. Bui Huy Khoi conducted a data survey.

Conflicts of Interest The authors declare no conflict of interest.


References

1. Abu-Rayash, A., Dincer, I.: Development of integrated sustainability performance indicators for better management of smart cities. Sustain. Cities Soc. 67, 102704 (2021)
2. Allam, Z., Dhunny, Z.A.: On big data, artificial intelligence and smart cities. Cities 89, 80–91 (2019)
3. Atitallah, S.B., Driss, M., Boulila, W., Ghézala, H.B.: Leveraging deep learning and IoT big data analytics to support the smart cities development: review and future directions. Comp. Sci. Rev. 38, 100303 (2020)
4. EC.: What are smart cities? (2020). https://ec.europa.eu/info/eu-regional-and-urban-development/topics/cities-and-urban-development/city-initiatives/smart-cities_en#:~:text=What%20are%20smart%20cities%3F,-A%20smart%20city&text=A%20smart%20city%20goes%20beyond,to%20light%20and%20heat%20buildings
5. Hien, H.: APEC 2017: Creating a new driving force for sustainable urbanization development (Vietnamese) (2017). Retrieved 16 June 2017 from http://dangcongsan.vn/kinh-te-va-hoi-nhap/apec-2017-tao-dong-luc-moi-cho-phat-trien-do-thi-hoa-ben-vung-437793.html
6. Jin, J., Gubbi, J., Marusic, S., Palaniswami, M.: An information framework for creating a smart city through internet of things. IEEE Internet Things J. 1(2), 112–121 (2014)
7. Kassa, W.-E., Billabert, A.-L., Faci, S., Algani, C.: Electrical modeling of semiconductor laser diode for heterodyne RoF system simulation. IEEE J. Quantum Electron. 49(10), 894–900 (2013)
8. Kirimtat, A., Krejcar, O., Kertesz, A., Tasgetiren, M.F.: Future trends and current state of smart city concepts: a survey. IEEE Access 8, 86448–86467 (2020)
9. Lai, C.S., Jia, Y., Dong, Z., Wang, D., Tao, Y., Lai, Q.H., … Lai, L.L.: A review of technical standards for smart cities. Clean Technol. 2(3), 290–310 (2020)
10. Laufs, J., Borrion, H., Bradford, B.: Security and the smart city: a systematic review. Sustain. Cities Soc. 55, 102023 (2020)
11. Mohammadi, M., Al-Fuqaha, A., Guizani, M., Oh, J.-S.: Semisupervised deep reinforcement learning in support of IoT and smart city services. IEEE Internet Things J. 5(2), 624–635 (2017)
12. Montori, F., Bedogni, L., Bononi, L.: A collaborative internet of things architecture for smart cities and environmental monitoring. IEEE Internet Things J. 5(2), 592–605 (2017)
13. Orlowski, C.: Management of IOT open data projects in smart cities. Academic Press (2020)
14. Pflanzner, T., Leszko, K.Z., Kertész, A.: SUMMON: Gathering smart city data to support IoT-Fog-Cloud simulations. Paper presented at the 2018 Third International Conference on Fog and Mobile Edge Computing (FMEC) (2018)
15. Phuong, L.: Vingroup officially launches smart city Vinhomes (Vietnamese) (2019). Retrieved 13 Dec 2020 from https://vneconomy.vn/vingroup-chinh-thuc-ra-mat-dai-do-thi-thong-minhvinhomes-smart-city-2019042314214912.htm
16. Rudra, B.: Impact of Internet of Things in smart cities. In: IOT Technologies in Smart-Cities: From Sensors to Big Data, Security and Trust, 41 (2020)
17. Siemens.: With Help from Siemens, Copenhagen will meet 2025 carbon neutrality target (2015). Retrieved 13 Dec 2020 from https://www.3blmedia.com/News/Help-Siemens-CopenhagenWill-Meet-2025-Carbon-Neutrality-Target
18. Singh, P.K., Kumar, N., Gupta, B.K.: Wireless sensing with radio frequency identification (RFID): instrumental in intelligent tracking. Paper presented at The International Conference on Recent Innovations in Computing (2020)
19. Singh, P.K., Veselov, G., Pljonkin, A., Kumar, Y., Paprzycki, M., Zachinyaev, Y.: Futuristic trends in network and communication technologies. In: Third International Conference, FTNCT 2020, Taganrog, Russia, October 14–16, 2020, Revised Selected Papers, Part II (vol. 1396). Springer Nature (2021)
20. Slater, R., Khandelwal, R.: Report on Case Studies of Smart Cities. International Benchmark: MPUIIP. ICF International (2016)
21. SmartCitiesWorld.: Seoul: A city based on data (2020). Retrieved 13 Dec 2020 from https://www.smartcitiesworld.net/special-reports/special-reports/seoul-a-city-based-on-data
22. Thai, H.M.H., Khuat, H.T., Kim, H.M.: Urban form, the use of ICT and smart cities in Vietnam. In: Smart Cities for Technological and Social Innovation, pp. 137–156. Elsevier (2021)
23. U4SSC.: United 4 smart sustainable cities (2020). Retrieved 10 Dec from https://www.itu.int/en/ITU-T/ssc/united/Pages/U4SSC-info.aspx
24. US.: Cities of the 21st century (2017). https://www.usmayors.org/wp-content/uploads/2017/02/2016SmartCitiesSurvey.pdf
25. Varshney, T., Singh, R., Rai, A.K., Sharma, N., Bhushan, B.: Prevailing privacy and security technologies in smart cities. Available at SSRN 3747928 (2020)
26. Venkatesh, J., Aksanli, B., Chan, C.S., Akyurek, A.S., Rosing, T.S.: Modular and personalized smart health application design in a smart city environment. IEEE Internet Things J. 5(2), 614–623 (2017)
27. Vu, K., Hartley, K.: Promoting smart cities in developing countries: policy insights from Vietnam. Telecommunications Policy 42(10), 845–859 (2018)
28. Wazid, M., Das, A.K., Khan, M.K., Al-Ghaiheb, A.A.-D., Kumar, N., Vasilakos, A.V.: Secure authentication scheme for medicine anti-counterfeiting system in IoT environment. IEEE Internet Things J. 4(5), 1634–1646 (2017)

Smart City from a Standards Perspective

Piotr Karocki

P. Karocki, Independent Consultant, Kraków, Poland, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_3

Abstract All of us have heard the term ‘smart city’. Most of us are convinced that we know what this term means, and we are sure that everyone else understands it the same way as we do. This chapter shows what the term ‘smart city’ means for standards bodies (like ISO) and what the formal boundaries are of activities labeled as “smart city” (e.g. as written by UN agencies). It also recalls some of the more far-reaching visions and the threats to humanity posed by technology taken too far.

Keywords Smart city · Standards · ISO · Well-being · Urbanization · World trends · Resilient city · Human factors

Smart city… everyone, or at least everyone in the developed part of our Pale Blue Dot, has heard this buzzword. We also know that, e.g., a phone can be smart, meaning that it is something completely different from a plain old phone, that it serves a completely different purpose, etc. … OK, the part about purpose was a joke (or maybe not…).

We associate ‘smart’ with technology, especially IT. Let us assume, at least for now, that this is the correct association (but we should remember that we have two cultures using the same words with different meanings, see [1], and that our language not only influences our thinking but even forms it [2, thesis 5.6]).

Why do we want to make cities smart? Is this some real progress, something to meet our real needs, or just a marketing term designed to help sell us new hardware or software? Why do we want to build cities at all? Do we know what we really want, what our goal is? To answer these questions, let’s look at international efforts, mainly (but not exclusively) standardization efforts.

First, why cities? Short answer: because we need new cities for around 2.5 billion people by 2050. Long answer: the primary document presenting global trends is World Urbanization Prospects, a publication by the United Nations Department of Economic and Social Affairs, the latest edition from 2018. It includes data such as: “Globally, more people live in urban areas than in rural areas, with 55 percent of the world’s population residing in urban areas in 2018. In 1950, 30 percent of the world’s population was urban, and by 2050, 68 percent of the world’s population is projected to be urban” [3, p. 1]. Urbanization is not uniform, with the share of the urban population being lowest in Africa (42.5 %) and highest in North America (82.2 %); in Europe, it is 74.5 %. Four years earlier, these indicators were lower: 40 %, 81.5 %, and 73.4 %, respectively (data from a similar document for 2014 [4, p. 8]). As we see in this document, further urbanization will be prominent in Africa and Asia, but much less so in North America and Oceania.


Our current cities cannot absorb these 2.5 billion people, even when turned into ‘megacities’, which are cities with more than 10 million inhabitants. And we have more and more such cities: in 1990 there were only 10, in 2014 already 28, and in 2018 there were 33. In 1990, 153 million people lived in such cities, which was less than 7 % of the urban population. Today it is 529 million, or 13 %. There are 37 million people in Tokyo (including the agglomeration), more than in Poland … Another 10 cities are expected to become megacities by 2030 (7 in Asia, 2 in Africa, 1 in Europe) [3, pp. 17–18].

Not so long ago, we thought that as population density increased, so did the level of stress and aggression in individuals. Today we know that it is not that simple [5]: although this observation is true for rats, it is not true for monkeys, and humans are more like monkeys than rats. Population density seems to make us more selfish, but no more aggressive [6]; we are also less interested in the community [7]. Moreover, if any relationship existed at all, it would not be linear [8]. By the way, ancient Rome at the dawn of Christianity had over a million inhabitants [9] (residents, not citizens: e.g. St. Paul had Roman citizenship, but e.g. slaves living in Rome did not). Today, two thousand years later, Rome has only four times as many inhabitants.

Second question: why smart? Couldn’t they be ‘cities as usual’, just bigger and more numerous? We can get part of the answer by adding the following perspective to the statement “new cities for about 2.5 billion people”: cities are now home to about 3.8 billion people. Thus, the population of the cities would increase by 66 % in a few decades. And yet some cities were built (in stages) over thousands of years; for example, Damascus is about 5000 years old. Another part of the answer is that one percent of the world’s population consumes 0.54 % of the world’s energy, but only if that percent lives outside the cities.
The same number of people living in cities consume 1.4 % of the world’s energy. Almost three times more! Now we have 55 % of the population living in urban areas, 45 % in the countryside, and 100 % of the energy. If all people moved to the cities, we would need 40 % more energy (and that’s just for the current population, not counting its growth).

A slightly longer second part of the answer to the second question: the difference between the energy consumption of urban and rural residents is not only a matter of “more wasteful” energy consumption by residents. It should be remembered that there are many facilities in cities that simply do not exist in the countryside: offices, universities, hospitals, cinemas, theaters, philharmonics, museums, galleries, illuminated streets¹ …

¹ Street lighting is also a source of “light pollution”; it disturbs optical astronomy, but also harms humans and animals, disturbing their daily cycle. See below.

The threat of a nearly threefold increase in energy consumption per person applies not only to the pace of urbanization; it is necessary to take into account the projected population growth in general. Thus, when estimating energy demand for 2050, we must take into account not only the change in the urbanization rate from 54 to 66 % (although this in itself means that we must produce 9 % more energy than we produce today) but also the increase in the Earth’s population from 7 billion to 9.5 billion. Both of these factors together demonstrate the need to increase energy production by 47 % within 30 years! Increasing energy production means faster environmental degradation, especially when energy is produced from coal.

Let’s see what it looks like according to the forecasts for the year 2100. The population is projected to reach 11 billion, of which 84 % are to live in cities. It means that we should produce 98 % more energy than we do today. Note that these estimates are based on a forecast of a significant slowdown in population growth (over the next 30 years, the population will increase by 2.5 billion, or 35 %, and in the following 50 years only by 1.5 billion, which is 16 %). These estimates are very simplified and do not even take into account any changes in energy consumption caused by new inventions; for example, it is estimated that the energy consumption of the IT sector will reach up to 8 % of global energy production in 2030 [10].

These numbers “speak for themselves”, and that is why humanity began to take seriously the problem of urbanization, including its energy aspect. For example, in the European Union, the Strategic Implementation Plan [11] was created in 2013 (as part of the European Innovation Partnership on Smart Cities and Communities). It considers various scenarios. In the most dangerous one, “business as usual” (i.e. nothing changes), energy consumption reaches 1842 Mtoe in 2020; even the most radical plan for reducing energy consumption brings it down only to 1474 Mtoe.² This document uses the non-SI unit ‘toe’, “tonne of oil equivalent”, in an attempt to summarize total energy consumption: not only electricity consumption but also oil for transport, coal for heating, etc. Because such a unit is used, the planned process of re-electrification of transport is irrelevant to our considerations.
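The back-of-the-envelope energy estimates above (the all-urban case, 2050, and 2100) can be retraced with a few lines of Python. This is only a sketch under stated assumptions: the function names are illustrative, today's baseline is taken as a 55 % urban share and 7 billion people, and the two per-capita weights are the 1.4 % and 0.54 % figures quoted in the text. Because the chapter rounds intermediate values, the results land within a few percentage points of its 40 %, 47 %, and 98 % rather than matching them exactly.

```python
# Weights quoted in the text: 1 % of the world's population uses about
# 1.4 % of the world's energy if urban and about 0.54 % if rural.
URBAN_WEIGHT = 1.4
RURAL_WEIGHT = 0.54


def per_capita_index(urban_share: float) -> float:
    """Average energy weight per person for an urban share given as 0..1."""
    return urban_share * URBAN_WEIGHT + (1.0 - urban_share) * RURAL_WEIGHT


def relative_energy_demand(urban_share: float,
                           population_bn: float,
                           base_urban_share: float = 0.55,   # assumption: today's share
                           base_population_bn: float = 7.0   # assumption: today's population
                           ) -> float:
    """Energy demand relative to today (1.0 means today's demand)."""
    urbanization_factor = (per_capita_index(urban_share)
                           / per_capita_index(base_urban_share))
    population_factor = population_bn / base_population_bn
    return urbanization_factor * population_factor


if __name__ == "__main__":
    # Scenario inputs are the urban shares and populations named in the text.
    scenarios = {
        "everyone urban, same population": (1.00, 7.0),
        "2050 forecast": (0.66, 9.5),
        "2100 forecast": (0.84, 11.0),
    }
    for label, (share, population) in scenarios.items():
        increase = (relative_energy_demand(share, population) - 1.0) * 100.0
        print(f"{label}: about {increase:.0f} % more energy than today")
```

Separating the urbanization factor from the population factor makes it easy to see how much of each increase comes from people moving to cities and how much from there simply being more people.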
Why ‘re-electrification’ and not ‘electrification’? Because in the first decade of the twentieth century, 38 % of cars in the US were electric [12]. In some countries, including Poland, this re-electrification means a significant increase in air pollution, but we do not analyze the environmental effects here.³ The International Energy Agency provides a number of conversion factors, including 1 toe = 11.63 MWh = 41.87 GJ = 1.43 tons of coal [13]. For conversion from diesel to petrol etc. see [14].

² The data is for Europe, not for the whole world.

³ If you want, try to find fuel consumption data for your country. Assuming an average energy value of the fuel of 45 MJ/kg (average for diesel and gasoline) and the conversion factor 1 kWh = 3.6 MJ, you can estimate how many TWh are consumed per year in fuels. Then compare this with the country’s electricity production. For Poland, it is 18 million tons, or 225 TWh, and the current production is 160 TWh.

How can you stop polluting the environment despite generating more and more energy? One direction is to force the abandonment of the most poisonous methods of energy production, and the other, perhaps more important, is simply not to produce more energy but to save it. Let us refer to another EU document, Control and Optimization for Energy Positive Neighborhoods, issued under CORDIS (Community Research and Development Information Service) [15]. Here we find that buildings “consume” 40 % of energy (and are responsible for 36 % of CO2 emissions, while transport is responsible for only 30 %). Compare this with the figures [16] that a standard building needs 200 kWh/m² per year while an energy-efficient (or passive, intelligent) building needs only 85 kWh/m², and there is a potential for significant savings: if 1479 Mtoe was used in the EU in 2018 [17] (down from 1606 Mtoe four years earlier [18]), of which 40 % (= 592 Mtoe) is consumed by buildings, the savings may amount to 340 Mtoe, which is 3954 TWh when converted into terms of electricity. That is almost eight times more than the annual electricity production in Germany⁴ … Let’s use this as the upper limit of savings.

The lower limit can be calculated using other data. When we take the average floor area per person (inhabitant) in Europe (data for 2011 [19]), i.e. 42.56 m², and multiply it by the EU population in the same year, i.e. 439.94 million [20], and by the savings of 115 kWh/m² (200 minus 85), we get 2153 TWh (on heating households alone, without e.g. hospitals, libraries, offices, etc.).

Let’s try to convert it into money. In Europe, the average gross price of 1 kWh is EUR 0.2126 (data for 2020, H1 [21]), which gives us savings of between EUR 457 billion and EUR 840 billion. Taking into account that energy for industry costs almost half as much as for individuals (see the second and third ‘tweets’ in [21]), that most types of energy are much cheaper than electricity, and that some buildings have already been converted to energy-saving ones, we can assume that our savings could amount to approximately EUR 40 billion a year, which is still a considerable amount.

Another often considered area of savings is the replacement of street lighting with an LED version. We invest in LED light sources hoping that this will reduce maintenance costs. Note, however, that a sodium lamp has a lifetime of over 20 000 h and an efficacy of 180 lm/W, while an LED lamp has a lifetime of about 30 000 h but an efficacy of only 120 lm/W; LED light is less monochromatic, so we perceive it as brighter [22, p. 47].
However, we often forget about the effect of such polychromatic light: a disturbance of circadian rhythms (the melatonin cycle is affected by the blue component of the spectrum), not only in humans but in all animals, and maybe also in plants. It is estimated that the influence of LED lamps on the circadian rhythm associated with sleep is five times greater than that of conventional street lamps [22, p. 48]. This is why Microsoft introduced ‘night light’ in the Windows 10 Creators Update in 2017; this function reduces the blue component of the light. It is a ‘developed’ night mode, available at least from Windows 8.1 (2013), and now also available on Android (from Android 10, introduced in 2019 [23]).

Another side effect comes from the fact that air molecules scatter blue light more strongly than yellow and orange light, so LEDs produce a stronger glow of the night sky [22, p. 48]. In some cities, the newly installed ‘bluish’ LED street lamps have since been replaced by ‘reddish’ ones, which means spending more money (around $500 per lamp) [22, p. 57]. LoNNe (Loss of the Night Network), which deals with problems of this kind, was established within the EU [24].

So we want more energy-efficient cities (preferably even self-sustained), but without jeopardizing our health.

⁴ This is possible because most of the heating of buildings is done not with electricity, but with coal, gas, etc.
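The upper and lower limits of building energy savings discussed above, and their conversion into money, can be retraced step by step. The sketch below is an illustration only: the function names are hypothetical, every number is one quoted in the text, and the Mtoe-to-TWh factor follows from the IEA figure 1 toe = 11.63 MWh. The text rounds intermediate values (592 Mtoe, 340 Mtoe), so the computed upper limit differs slightly from the quoted 3954 TWh.

```python
TWH_PER_MTOE = 11.63    # from the IEA factor in the text: 1 toe = 11.63 MWh
EUR_PER_KWH = 0.2126    # average gross EU electricity price, H1 2020, per the text
KWH_PER_TWH = 1e9


def upper_limit_twh(total_mtoe: float = 1479.0,
                    building_share: float = 0.40,
                    standard_kwh_m2: float = 200.0,
                    efficient_kwh_m2: float = 85.0) -> float:
    """Savings if all building energy use dropped to the efficient level."""
    buildings_mtoe = total_mtoe * building_share
    saved_fraction = (standard_kwh_m2 - efficient_kwh_m2) / standard_kwh_m2
    return buildings_mtoe * saved_fraction * TWH_PER_MTOE


def lower_limit_twh(area_m2_per_person: float = 42.56,
                    population_million: float = 439.94,
                    saving_kwh_m2: float = 115.0) -> float:
    """Savings on household floor area only."""
    total_area_m2 = area_m2_per_person * population_million * 1e6
    return total_area_m2 * saving_kwh_m2 / KWH_PER_TWH


def value_billion_eur(twh: float, eur_per_kwh: float = EUR_PER_KWH) -> float:
    """Value of the saved energy if all of it were priced as electricity."""
    return twh * KWH_PER_TWH * eur_per_kwh / 1e9


if __name__ == "__main__":
    for label, twh in (("lower limit", lower_limit_twh()),
                       ("upper limit", upper_limit_twh())):
        print(f"{label}: {twh:.0f} TWh, about EUR {value_billion_eur(twh):.0f} billion")
```

Pricing everything as household electricity is the same simplification the text makes before scaling the result down to roughly EUR 40 billion a year.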


We should also deal with the climate catastrophe; more and more countries are declaring a climate emergency. The first was Scotland (April 28, 2019); on other continents, Canada (June 27, 2019) and Japan (November 20, 2020) followed. The EU Parliament declared this state (“climate and environmental emergency” [25]) on November 28, 2019. In June 2019, Pope Francis declared a climate emergency (“ecological crisis, especially climate change” [26]); we can take it as a declaration of such a state by the Vatican or by the Catholic Church. A list of such declarations can also be found in Wikipedia [27].

But what does the climate catastrophe mean? First, it means acknowledging more and more noticeable events such as hurricanes, earthquakes, floods, etc. (some events are just more frequent, but we also have a denser population). Second, recognizing this condition changes our practice from ‘business as usual’ to a more nature-friendly one. Thus, new cities should be (1) resilient and (2) environmentally friendly.

A resilient city is also a good thing in the event of a terrorist attack, for example. Two towers were destroyed, but there was infrastructure beneath them: transportation, pipelines, energy, etc. Re-wiring power and telecommunications in Manhattan after 9/11 was neither easy nor cheap: in 2002 it was estimated to cost $2.3 billion, twice as much as the reconstruction of the metro lines and one station beneath the towers [28].

So we want to rebuild cities, but the main parts of cities are not computers, not buildings, but people. Citizens. The city is financed by them, and the city is to be their place and for them. I mention this as the last factor, but all the factors mentioned before are less important; it’s all about the city for citizens, not technology.

By the way, what is a city? Do we know what we are talking about? What other terms are we just using?
Obliged by the title of this chapter, we should take the definition of a city from some standard document, for example from IEC 60050-831 International Electrotechnical Vocabulary (IEV), Part 831: Smart city systems. Some definitions from this standard⁵:

city: built-up area or place with a name and defined geographical boundaries considered together with its inhabitants (definition 831-01-03).

smart city: city where improvements in quality of life, services, sustainability and resilience are accelerated by the effective integration of many and various types of physical, digital and social systems and the transformative use of data and technology (definition 831-01-26).

city sustainability: ability of a system to meet its present needs without compromising its ability to meet the needs of future generations (definition 831-01-06).

resilience: ability of a system or a system element to resist being affected by disruptions (definition 831-01-21, taken from ISO/IEC 27031).

system: combination of interacting elements organized to achieve one or more stated purposes (definition 831-01-30, taken from ISO/IEC/IEEE 21840).

⁵ The quotes are from the draft version. It should be published in 2021.


data: representation of facts of objective reality in a formalized manner (definition 831-02-02).

physical system: set of physical objects and processes that work together to fulfill one or more specific functions. EXAMPLES: electrical power distribution systems, logistics systems, metro systems (definition 831-03-03).

social system: patterned series of interrelationships existing between individuals, groups, and institutions and forming a coherent whole. EXAMPLES: social systems include nuclear family units, communities, cities, nations, college campuses, corporations, and industries (definition 831-04-03).

digital system: system consisting of hardware, software, and possibly network components, used to generate and/or use data to fulfill one or more specific functions (definition 831-02-03).

As you can see, the primary goal of a smart city is the quality of life. And IT (or, actually, any technology) is just a tool, a medium, to achieve the goals of a smart city. But what exactly is a smart city? Can we measure the ‘smartness’ of a city, especially the “social system” part of it? The answer is, as you might guess, “yes, we can”; if the answer were different, I would not be asking this question. But how? We have three standards with “indicators” or “metrics” that measure different aspects of the city:

• ISO 37120, Indicators for city services and quality of life [29]
• ISO 37122, Indicators for Smart Cities [30]
• ISO 37123, Indicators for Resilient Cities [31]

All three standards have a generally available table of contents,⁶ therefore I will present only some selected indicators.

⁶ Click on the “Preview” button on the pages referenced above.

But let’s check the definition first. What does a smart city mean in terms of these ISO standards? A smart city is a “city that increases the pace at which it provides social, economic and environmental sustainability outcomes and responds to challenges such as climate change, rapid population growth, and political and economic instability by fundamentally improving how it engages society, applies collaborative leadership methods, works across disciplines and city systems, and uses data information and modern technologies to deliver better services and quality of life to those in the city (residents, businesses, visitors), now and for the foreseeable future, without unfair disadvantage of others or degradation of the natural environment”. This is technically the same as in IEC 60050-831, with one important new element: engaging society.

A few examples of indicators that clearly show that a smart city is not about technology:

• Percentage of public buildings that are accessible by persons with special needs (ISO 37122, 13.1)
• Percentage of the city’s cultural records that have been digitised (ISO 37122, 17.2)


• Percentage of city population that are active public library users (ISO 37122, 17.4)
• Annual number of citizens engaged in the urban planning process per 100 000 population (ISO 37122, 21.1)
• Number of new patents per 100 000 population per year (ISO 37120, 5.6)
• Number of higher education degrees per 100 000 population (ISO 37120, 6.6)
• Noise pollution (ISO 37120, 8.8)
• Percentage change in number of native species (ISO 37120, 8.9)
• Women as a percentage of total elected to city-level office (ISO 37120, 10.1)
• Number of convictions for corruption and/or bribery by city officials per 100 000 population (ISO 37120, 10.2)
• Voter participation in last municipal election (as a percentage of registered voters) (ISO 37120, 10.4)
• Average life expectancy (ISO 37120, 11.1)
• Number of physicians per 100 000 population (ISO 37120, 11.3)
• Percentage of city population living below the national poverty line (ISO 37120, 13.2)
• Square metres of public outdoor recreation space per capita (ISO 37120, 14.2)
• Percentage of municipal budget allocated to cultural and sporting facilities (ISO 37120, 17.2)
• Percentage of population living within 0.5 km of public transit running at least every 20 min during peak periods (ISO 37120, 19.6)
• Percentage of city population that is overweight or obese, by Body Mass Index (BMI) (ISO 37120, 20.4)
• Green area (hectares) per 100 000 population (ISO 37120, 21.1)
• Number of trees per 100 000 population (ISO 37120, 21.5)

Even a ‘resilient city’ is not just about technology, because we have indicators like:

• Percentage of properties with insurance coverage for high risk hazards (ISO 37123, 5.4)
• Percentage of schools that teach emergency preparedness and disaster risk reduction (ISO 37123, 6.1)
• Percentage of emergency preparedness publications provided in alternative languages (ISO 37123, 6.4)
• Magnitude of urban heat island effects (atmospheric) (ISO 37123, 8.1)
• Percentage of natural areas within the city that have undergone ecological evaluation for their protective services (ISO 37123, 8.2)
• Annual frequency of extreme rainfall events (ISO 37123, 8.4)
• Percentage of city land area covered by tree canopy (ISO 37123, 8.8)
• Annual expenditure on green and blue infrastructure as a percentage of total city budget (ISO 37123, 9.4)
• Percentage of essential city services covered by a documented continuity plan (ISO 37120, 13.4)
• Percentage of population with basic health insurance (ISO 37123, 11.4)
• Percentage of children that are fully immunized (ISO 37123, 11.5)

Smart City from a Standards Perspective

• Percentage of population with access to social assistance programs (ISO 37123, 13.2)
• Percentage of city population that can be served by city food reserves for 72 h in an emergency (ISO 37123, 19.1)
• Percentage of the city’s population living more than one kilometre from a grocery store (ISO 37123, 19.2)
• Percentage of city population that can be supplied potable water by alternative methods for 72 h (ISO 37123, 22.2)

Someone may ask, “Where’s any ‘computational intelligence’ in this?” Does this volume really talk about the same smart cities featured in this chapter? In short, yes, because technology can be a valuable tool for reaching the smart city described by the above indicators. Also, some indicators mention technology directly, such as:

• Total end-use energy consumption per capita (GJ/year) (ISO 37120, 7.1)
• Number of internet connections per 100 000 population (ISO 37120, 18.1)
• Number of computers, laptops, tablets or other digital learning devices available per 1000 students (ISO 37122, 6.2)
• Storage capacity of the city’s energy grid per total city energy consumption (ISO 37122, 7.5)
• Number of real-time remote air quality monitoring stations per square kilometre (ISO 37122, 8.2)
• Percentage of payments to the city that are paid electronically based on electronic invoices (ISO 37122, 9.2)
• Percentage of city services accessible and that can be requested online (ISO 37122, 10.2)
• Average downtime of the city’s IT infrastructure (ISO 37122, 10.4)
• Percentage of the city’s population with an online unified health file accessible to health care providers (ISO 37122, 11.1)
• Percentage of households with smart energy meters (ISO 37122, 12.1)
• Percentage of public garbage bins that are sensor-enabled public garbage bins (ISO 37122, 16.5)
• Percentage of city streets and thoroughfares covered by real-time online traffic alerts and information (ISO 37122, 19.1)
• Percentage of vehicles registered in the city that are autonomous vehicles (ISO 37122, 19.11)
• Percentage of city electronic data with secure and remote back-up storage (ISO 37123, 10.5)

What is the threat arising from the assumptions that “smart = technology” and “new = good, desired”? Many ‘smart’ items are smart mainly at getting our money. Dishonest sellers often use the adjective ‘smart’ to sell products that are not smart at all. A smartwatch measuring blood oxygen saturation? You can have it, although it gives a random number from 96 to 99 % (e.g. Makibes Smart Bracelet N108, and actually


all devices that use the “Hey Band” app; it also does not measure your blood pressure, but tries to calculate it from your age, gender, and heart rate). A smart scale with information about body fat etc.? You can have it, although the scale always reports a bioelectric impedance of 500 and does not measure anything (because it gives a constant value, this scam is easily detected by listening to the Bluetooth communication with the scale; this is how many scales using the OKOK protocol work, and probably other scales as well).

OK, we know we want many cities, not ordinary but smart ones, and we even know the indicators that measure a city’s ‘smartness’. Is that all? Or do we have some additional guidance available? Since I am asking, we probably do have something to help us create such cities. We have a plethora of standards at many levels of abstraction (or many levels of detail). These standards were (and still are) written in a top-down manner, as in the analysis stage of the formal scientific method: breaking problems down into smaller problems and solving them recursively. ISO created the first level of very abstract standards, then further levels of more and more detail, until almost everything is described: for example, how close together (or how far apart) tram/bus stops should be, how to pay the transport fee,7 or how the parking allocation system should work.8 This standardization effort is still ongoing.

A system built with this “global image” in mind, using ideas (or suggestions) from the standards, would be a “future-proof” system. It would also be much cheaper for cities, because it avoids dead ends. ISO uses a cascade approach rather than trial and error; after all, we all know that “creativity does not replace knowledge.” Correcting a solution at the model stage, while it is still an idea, is easier, faster, and cheaper than correcting ideas that have already been implemented (e.g. a change to the requirements for a new processor is only one line; making the same correction after “resolving” the processor to the transistor level is difficult, error-prone, expensive and time-consuming; and it may not be possible at all once the chips have been produced).

Each city is one big system with many subsystems that should connect with each other to obtain the network effect (when the value of the whole is greater than the sum of the values of its elements). After all, our goal is to use the system, not to keep implementing it. Each system implemented in a city should be treated as a subsystem of one large system, and that large system is the city itself. This is clearly stated in another standard, ISO 37106 Sustainable development and communities—Guide to establishing strategies for smart cities and communities: “A smart city should be described as one that ‘dramatically increases the pace at which it improves its sustainability and resilience… by fundamentally improving how it engages society, how it applies collaborative leadership methods, how it works across disciplines and city systems, and how it uses data and integrated technologies… in order to transform services

7 ISO 37165 Smart community infrastructures—Guidance on smart transportation by non-cash payment for fare/fees in transportation and its related or additional services.
8 ISO 37163 Smart community infrastructures—Guidance on smart transportation for parking lot allocation in cities.

and quality of life to those in and involved with the city (residents, businesses, visitors).”9

Treating a subsystem in isolation can backfire. For example, we can invest in solar PV production (prosumers), but this exacerbates the ‘urban heat island’ problem,10 so more energy is needed for air conditioning. The fact that cities are warmer than villages was described two centuries ago [32], and now we systematically counteract it; see, among others, the European Union UHI Project, which “is implemented through the CENTRAL EUROPE Programme co-financed by the European Regional Development Fund (ERDF). The main target of this transnational project is the development of mitigation, risk prevention and management strategies to counteract the urban heat island (UHI) phenomenon” [33].

But why ISO? Because it is the only body that can create rules that formally bind the whole world. ISO is a global network of national standardization bodies. “Its members are the foremost standards organizations in their countries and there is only one member per country. Each member represents ISO in its country” [34]. Currently, there are 165 national standardization bodies among its members [35], so almost every country has an ISO representative. Such members are, inter alia, PKN from Poland, DIN from Germany, BSI from Great Britain, ANSI from the USA, GOST from Russia, SAC from China, and BIS from India.

Will smart cities be similar to our current cities? The answer depends on who is asking. Why? Because although ISO members are national standardization organizations, not all regions of the world are equally active in smart city committees. You may remember from the beginning of this chapter that most urbanization will happen in Africa and Asia. These two continents will face the biggest changes. But Africa is too poor to take an active part in such work, and Europe is not as interested as, for example, China. Standardization work related to smart cities is carried out in ISO Technical Committee 268 Sustainable cities and communities [36]. It has published 26 standards so far, and another 18 are under development. Its members are 46 actively participating countries (marked in blue below) and 26 observing countries (marked in orange).

9 From the “Introduction” to the standard, page vii.
10 Photovoltaic panels convert only part of the absorbed solar energy to electricity, and they significantly reduce the albedo. Much of the solar energy that would normally be reflected ends up heating the city.

Let’s go one level down, to the Smart community infrastructures group (ISO/TC 268/SC 1). We currently only have 27 active members and 17 observing members. There are more white spots on our map, and almost all of Africa is white:

A quick look at who convenes the five ISO/TC 268/SC 1 working groups shows that four conveners are from Japan and one is from China. The technical details are developed in cooperation between ISO and the IEC (International Electrotechnical Commission), in this case between ISO TC 268 and IEC SyC Smart Cities. The IEC committee has 19 participating countries and 14 observing countries [37]. Let’s take a look at the members of WG2 of this IEC committee. There are 94 members, of whom 17 are from China, 16 from India, 11 from South Korea, and 15 from XP. All other countries have fewer than 10 members each, e.g. Australia 7, USA 6, UK 5, and Russia 4. Asia has a total of 63 members (the 59 mentioned above, plus 3 from Japan and 1 from Singapore), or even 67 if we count Russia as Asia. Even without Russia, 63 of the 94 members is 67 %. In many ways, this means that the foundations of smart cities will be more Asian, and that such cities will be built on different principles than Western cities.
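The share quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the variable and function names are illustrative assumptions, the counts are those given in the text, and “XP” is reproduced exactly as it appears there and grouped with Asia as the text does.

```python
# Sketch: recompute the Asian share of IEC SyC Smart Cities WG2 members.
# Counts come from the paragraph above; all names are illustrative only.
TOTAL_MEMBERS = 94

# Delegations the text counts as Asian ("XP" is kept exactly as in the text).
ASIAN_MEMBERS = {
    "China": 17,
    "India": 16,
    "South Korea": 11,
    "XP": 15,
    "Japan": 3,
    "Singapore": 1,
}
RUSSIA_MEMBERS = 4


def share_percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


asia_total = sum(ASIAN_MEMBERS.values())
asia_share = share_percent(asia_total, TOTAL_MEMBERS)
asia_with_russia = asia_total + RUSSIA_MEMBERS

print(f"Asia: {asia_total} of {TOTAL_MEMBERS} members ({asia_share:.0f} %)")
print(f"Counting Russia as Asia: {asia_with_russia} members "
      f"({share_percent(asia_with_russia, TOTAL_MEMBERS):.0f} %)")
```

Keeping the delegations in a dictionary makes it easy to move a country in or out of the group and see how sensitive the share is to that choice.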

Western civilization is based on the ‘person’, a term developed in the Trinitarian and Christological discussions of the fourth century (mainly by the Cappadocian Fathers: Basil the Great (330–379), Bishop of Caesarea; his brother Gregory of Nyssa (c. 335–c. 395), Bishop of Nyssa; and Gregory of Nazianzus (329–389), Patriarch of Constantinople). Asian civilization is based on a different principle, i.e. that the basis of civilization is society, not a person. The West assumes an individual existence after death, and the East assumes vanishing in Nirvana.11 Returning to the question “will smart cities be similar to our present-day cities?”: when it is asked by someone from the East, our answer is “very”, but when it is asked by someone from the West, the answer is “slightly”.

So, we have ISO standards (the 371xx family; for the current list of ISO TC 268 standards, see [38]) and IEC standards (for the current list of IEC SyC Smart Cities standards, see [39]), but that is not all, at least in Europe. The European Commission has created “The Rolling Plan for ICT Standardisation”. It provides “a unique bridge between EU policies and standardisation activities in the field of information and communication technologies (ICT)” [40]. It covers 33 topics in four areas (Key enablers and security, Societal challenges, Innovation for Digital Single Market, Sustainable growth). This document can be regarded as a grand schedule of what will be done and when over the course of many years (even decades), taking into account all interdependencies. For example, to implement the “Internet of Things” we should have “5G”, but the same “5G” is also a requirement for “Intelligent Transport Systems—Cooperative, Connected and Automated Mobility (ITS-CCAM) and Electromobility”. For each topic, we can find out what policies will be binding, what the long-term goals are, what the short-term goals (for the next year) are, what European law on this topic is (and will be), and what is being done (and will be done) by CEN, ETSI, ISO, IEC, IEEE, ITU-T, IETF, etc. In the “Sustainable growth” area, we have the topic “Smart cities and communities/technologies and services for smart and efficient energy use”. This is another “must-know” document for anyone who tries to build any system for any smart city (I think not only for European systems) and for the mayor of any city that wants to procure such a system.

A slightly broader perspective on the “smart city movement” can be obtained from the United Nations Sustainable Development Goals. These goals are [41]:

1. No poverty
2. Zero hunger
3. Good health and well-being
4. Quality education
5. Gender equality
6. Clean water and sanitation
7. Affordable and clean energy
8. Decent work and economic growth
9. Industry, innovation and infrastructure
10. Reduced inequalities
11. Sustainable cities and communities
12. Responsible consumption and production
13. Climate action
14. Life below water
15. Life on land
16. Peace, justice and strong institutions
17. Partnerships for the goals

11 It is a simplification, but this book is not about philosophy or theology.

All ISO standards carry labels showing the goals to which they contribute, so we can see that the three standards listed above (ISO 37120, ISO 37122, ISO 37123) contribute to the achievement of goals 3, 4, 5, 6, 9, 10, 11, 13 and 16.

The last topic related to the smart city concept is a somewhat futuristic one: the automatic city … It seems that nothing better can be invented. Remember, however, that “Any sufficiently advanced technology is indistinguishable from magic.”12 Today, hardly anyone understands the processes taking place in their environment, such as the passage of electrons through the wall.13 Probably only a few percent of the population would be able to explain how an LED light source works. Will humanity not return to mythologizing its surroundings? Will there not be deities of light summoned by a spell in a long-forgotten language from thousands of years ago, “turn on the light”? It may seem absurd, but we encounter such visions many times in literature and film. In one of the Planet of the Apes movies, astronauts land on Earth after a long journey to the stars. They find an ape civilization for which humans are farm animals, treated more or less as chimpanzees and gorillas are treated today. One of the apes is friendly towards people; the astronauts take her to a subway station that is still illuminated.14 The ape treats the lighting almost like a deity, despite the fact that she uses technology (although limited to mechanics) on a daily basis. Another work, dating from 1895, is Herbert George Wells’ The Time Machine. The hero travels hundreds of thousands of years into the future, and on the surface of the Earth he finds only people living in a primitive state, the Eloi. As it turns out, there is also a second ‘sub-race’, the Morlocks, who live underground, know the basics of technology and breed the Eloi for meat. In the world of A Canticle for Leibowitz by Walter M. Miller Jr., we have the Albertian Order of St. Leibowitz, whose monks copy the so-called memorabilia, texts from the First Civilization, which annihilated itself. When redrawing drawings (which are electrical diagrams, although the scribes do not know it, because their civilization is similar to the medieval one), they embellish them with lines and flourishes, because it seems to them that straight lines and right angles should be embellished. The same goes for the Thorgal series and many other works. A good summary of these examples is the following quote: “For a while, the more powerful electrobrains will still perform actions that people will be able to understand, grasp, at least

12 One of the three “laws” formulated by Arthur C. Clarke.
13 Such tunneling takes place in every transistor.
14 Let’s ignore the fact that this is not possible: no light source will work for thousands of years.

approximately. Later, however, the gap that once started will widen. The thinking machines will present us with the results of their theoretical considerations that we perhaps would be able to apply, but which we will no longer be able to understand. The field of phenomena supervised by automatons will become more and more extensive. In the end, people will dwindle to the dimension of brainless servants of iron geniuses and perhaps begin to worship them” [42, p. 166].15

How is human civilization developing? This development can be described very briefly: it is the civilization of lazy people, in line with the saying “laziness is the mother of invention”. The first stage of this development consisted in removing the need to use human muscle power. Instead of walking, you could travel on a horse; you did not have to pull the plow yourself, because an ox or a horse could pull it … Simple machines were used (e.g. a lever, pulley, bow). In recent times, excavators, cars, and elevators have appeared. The human was reduced to intelligence, in line with Aristotle’s definition of man as a rational animal. Now, force is needed primarily to press buttons, and the buttons are sometimes only drawn on a screen. After eliminating the need to use physical force, humanity is also trying to eliminate the need to use intelligence. We have computers, we try to build artificial intelligence, and instead of forming a hypothesis and planning an experiment to verify it, we even let computers find trends (similarities) in data (the so-called big data). What is left for people? It would seem that only emotions will remain, but in this way we become merely animals (after all, every animal has emotions). No longer homo habilis, a skillful man,16 no longer homo sapiens, a thinking man,17 but homo ludens, a man of fun.18

It would seem perfect, but we know it is suicide; this was shown by John Calhoun’s experiment, begun in 1968 [44] and repeated many times since then by various teams. Briefly, the course of the long-term experiment was as follows: four pairs of mice were placed in a large, square enclosure with a side of 2.7 m. Food and toys were freely available. Initially, the population grew rapidly, reaching 620 individuals by day 315. Then, however, a decline in the birth rate, a breakdown in social relationships, and behavioral changes became visible. The last mouse was born on day 600, by which time the males had withdrawn completely (no fights, no mating rituals). The mice only ate, slept, and cleaned their fur, each on its own. On day 1588, the population was only 27 individuals (23 females and 4 males).

15 The quote is from dialogue VI between Filonus and Hylas; this dialogue was already in the original 1957 edition (Dialogues were expanded in 1972).
16 Proper taxonomic name: †Homo habilis Leakey, Tobias & Napier, 1964; a species of Homo that lived 2.3–1.65 million years ago. The Latin name reflects that this Homo was “able, handy, mentally skillful, vigorous” [43]. The term is used here as a metaphor.
17 Proper taxonomic name: Homo sapiens Linnaeus, 1758; a species of Homo that has lived since 300 000 years ago. The Latin name means “wise, knowing, sapient”. Biologically we are all Homo sapiens sapiens, a subspecies of Homo sapiens; other subspecies are extinct (†Homo sapiens idaltu White et al. 2003). Used here as a metaphor.
18 It is a philosophical name, not a biological one. Ludens is a Latin participle of “to play, play at a game”. The term was coined by Johan Huizinga in his 1938 book Homo ludens. Also used metaphorically here.

If mankind is to survive, people need some real things to do, some challenges. Our cities cannot create for us the comfort that Calhoun created for his mice. When planning systems for a smart city, we should consider both issues: understanding of the environment, and challenges.

For the first issue, we have a solution: education organized more along Chinese lines (a side effect of the participation levels in smart city working groups). Western education fails, and we have known it since 1959: “So the great edifice of modern physics goes up, and the majority of the cleverest people in the western world have about as much insight into it as their neolithic ancestors would have had.” [1]. 90 % of this “majority” are university graduates! Perhaps that is why one of the indicators is “Number of science, technology, engineering and mathematics (STEM) higher education degrees per 100 000 population” (ISO 37122, 6.3). To know about the dangers and to see the broad perspective, we must not forget about humanists: they can warn against the consequences (although technologists were aware of the fatal effects of ‘information cocoons’, the effect of personalization in search engines, years earlier than sociologists [45]). Only humanists can create measures of ‘happiness’; we know that adding technology19 to cities should make citizens happy, but how can we measure happiness (or well-being)? Do we have to implement every possible solution and then, for example, run a survey a decade later? No, we can use existing standards, such as IEEE 7010 IEEE Recommended Practice for Assessing the Impact of Autonomous and Intelligent Systems on Human Well-Being. The purpose of this standard is defined as: “provides A/IS20 creators (designers, developers, engineers, programmers, and others) with impact-related insights that should be taken into account throughout the lifecycle of any A/IS to increase and help safeguard human well-being at the individual, population, and societal levels” [46]. This standard covers the following domains (clause 6 of the standard):

• Satisfaction with Life
• Affect
• Psychological Well-being
• Community
• Culture
• Education
• Economy
• Environment
• Government
• Health
• Human Settlements
• Work

In it, for example, the “culture domain” is defined as “that complex whole which includes knowledge, beliefs, arts, morals, laws, customs, and any other capabilities and habits acquired by [a human] as a member of society”; and Information

19 More specifically: “properly adding good technology”.
20 A/IS is the abbreviation for “Autonomous and Intelligent Systems” from the title of the standard. It applies also to technology used in smart cities.

and Communication Technologies (ICT) is just one sub-item in the Human Settlements domain. Each system in (or “for”) a smart city should be checked for whether it improves or degrades, among other things, the psychological well-being of the city’s inhabitants. Would citizens’ satisfaction with life be better or worse with the use of a potential technology or system? And we really do know how to check it.

I think this is the right time to apply what Kohelet said: “of making many books there is no end; and much study is a weariness of the flesh. Let us hear the conclusion of the whole matter” (Eccl 12: 12b-13a) and write one last sentence. Let this one sentence serve as a summary of the entire chapter: the smart city is a new way of thinking about city planning (a new trend in urban planning), not a set of new tools for achieving the same goals that today’s cities have.

References

1. Snow, C.P.: The Two Cultures. Rede Lecture at the University of Cambridge (1959). http://sf-walker.org.uk/pubsebooks/2cultures/Rede-lecture-2-cultures.pdf, accessed on 2021-01-17
2. Wittgenstein, L.: Tractatus Logico-Philosophicus (1921)
3. UN Department of Economic and Social Affairs: World Urbanization Prospects 2018. United Nations, New York (2018). Online: https://population.un.org/wup/Publications/Files/WUP2018-Highlights.pdf, accessed on 2021-01-17
4. UN Department of Economic and Social Affairs: World Urbanization Prospects 2014. United Nations, New York (2014). Online: https://population.un.org/wup/publications/files/wup2014highlights.Pdf, accessed on 2021-01-17
5. Weiss, R.: Does crowding cause aggressive behavior? Washington Post, July 14, 1994. Online: https://www.washingtonpost.com/archive/lifestyle/wellness/1994/07/12/does-crowding-cause-aggressive-behavior/2fec87ed-e5df-4a4d-a640-544266fe72f8/, accessed on 2021-01-17
6. Levine, R.V., Martinez, T.S., Brase, G., Sorenson, K.: Helping in 36 U.S. cities. J. Pers. Soc. Psychol. 67(1), 69–82 (1994)
7. Milgram, S.: The experience of living in cities. Science 167, 1461–1468 (1970)
8. Regoeczi, W.C.: The impact of density: the importance of nonlinearity and selection on flight and fight responses. Soc. Forces 81(2), 505–530 (2002)
9. Twine, K.: The city in decline: Rome in late antiquity. Middle States Geogr. 25, 134–138 (1992)
10. Jones, N.: How to stop data centres from gobbling up the world’s electricity. Nature 561, 163–166 (2018). Online: https://www.nature.com/articles/d41586-018-06610-y, accessed on 2021-01-17
11. European Commission: Smart cities. http://ec.europa.eu/eip/smartcities/files/sip_final_en.pdf, accessed on 2021-01-17
12. Doppelbauer, M., Winzer, P.: A lighter motor for tomorrow’s electric car. IEEE Spectrum 28 (2017)
13. International Energy Agency: Unit converter and glossary. https://www.iea.org/reports/unit-converter-and-glossary#energy-units, accessed on 2021-01-17
14. Eurostat: Glossary: Tonnes of oil equivalent (toe). https://ec.europa.eu/eurostat/statistics-explained/index.php/Glossary:Tonnes_of_oil_equivalent_(toe), accessed on 2021-01-17
15. European Commission CORDIS: Control and Optimisation for Energy Positive Neighbourhoods. http://cordis.europa.eu/project/rcn/105699_en.html, accessed on 2021-01-17
16. Sastry, G.: The role of smart cities in meeting future energy demand. http://tech.firstpost.com/news-analysis/cities-378194.html, accessed on 2021-01-17
17. Eurostat: Energy statistics - an overview. https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Energy_statistics_-_an_overview, accessed on 2021-01-17
18. Eurostat: Consumption of energy. http://ec.europa.eu/eurostat/statistics-explained/index.php/Consumption_of_energy, accessed on 2021-01-17
19. European Commission: Housing space per person. https://ec.europa.eu/energy/content/housing-space-person, accessed on 2021-01-17
20. Statista: European Union: total population from 2010 to 2020. https://www.statista.com/statistics/253372/total-population-of-the-european-union-eu/, accessed on 2021-01-17
21. Eurostat: Electricity price statistics. https://ec.europa.eu/eurostat/statistics-explained/index.php/Electricity_price_statistics, accessed on 2021-01-17
22. Hecht, J.: The early-adopter blues. IEEE Spectrum 44–48, 57–58 (2016). Online: https://spectrum.ieee.org/green-tech/conservation/led-streetlights-are-giving-neighborhoods-the-blues, accessed on 2021-01-17
23. Dark theme. https://developer.android.com/guide/topics/ui/look-and-feel/darktheme, accessed on 2021-01-17
24. EU Loss of the Night Network. http://www.cost-lonne.eu/, accessed on 2021-01-17
25. European Parliament resolution of 28 November 2019 on the climate and environment emergency. https://www.europarl.europa.eu/doceo/document/TA-9-2019-0078_EN.html, accessed on 2021-01-17
26. Francis: Address of his holiness Pope Francis to participants at the meeting promoted by the dicastery for promoting integral human development on the theme: the energy transition & care of our common home (2019). http://www.vatican.va/content/francesco/en/speeches/2019/june/documents/papa-francesco_20190614_compagnie-petrolifere.html, accessed on 2021-01-17
27. Wikipedia: Climate emergency declaration. https://en.wikipedia.org/wiki/Climate_emergency_declaration, accessed on 2021-01-17
28. Bram, J., Orr, J., Rapaport, C.: Measuring the Effects of the September 11 Attack on New York City. Federal Reserve Bank of New York Economic Policy Review (2002). Online: https://www.newyorkfed.org/research/epr/02v08n2/0211rapa.html, accessed on 2021-01-17
29. ISO 37120:2018. https://www.iso.org/standard/68498.html, accessed on 2021-01-17
30. ISO 37122:2019. https://www.iso.org/standard/69050.html, accessed on 2021-01-17
31. ISO 37123:2019. https://www.iso.org/standard/70428.html, accessed on 2021-01-17
32. Howard, L.: The climate of London, deduced from Meteorological observations, made at different places in the neighbourhood of the metropolis. 2 vols., London (1820)
33. The European Portal For Energy Efficiency in Buildings: Counteracting Urban Heat Island Effects in a Global Climate Change Scenario. https://www.buildup.eu/en/node/55456, accessed on 2021-01-17
34. ISO. https://www.iso.org/members.html, accessed on 2021-01-17
35. ISO. https://www.iso.org/about-us.html, accessed on 2021-01-17
36. ISO. https://www.iso.org/committee/656906.html, accessed on 2021-01-17
37. IEC. https://www.iec.ch/dyn/www/f?p=103:187:8086545846385::::FSP_ORG_ID,FSP_LANG_ID:13073,25#3, accessed on 2021-01-17
38. ISO. https://www.iso.org/committee/656906/x/catalogue/p/1/u/0/w/0/d/0, accessed on 2021-01-17
39. IEC. https://www.iec.ch/dyn/www/f?p=103:23:1460069074161::::FSP_ORG_ID,FSP_LANG_ID:13073,25, accessed on 2021-01-17
40. European Commission: Rolling Plan 2020. https://joinup.ec.europa.eu/collection/rolling-plan-ict-standardisation/rolling-plan-2020, accessed on 2021-01-17
41. UN Department of Economic and Social Affairs: The 17 Goals. https://sdgs.un.org/goals, accessed on 2021-01-17
42. Lem, S.: Dialogi. Wydawnictwo Literackie, Kraków (1984)
43. Leakey, L.S.B., Tobias, P.V., Napier, J.R.: A new species of the genus homo from Olduvai Gorge. Nature 202(4927), 7–9 (1964)
44. Calhoun, J.B.: Death squared: the explosive growth and demise of a mouse population. Proc. R. Soc. Med. 66, 80–88 (1973)
45. Pariser, E.: The Filter Bubble: What The Internet Is Hiding From You, City of Westminster (2011)
46. IEEE 7010. https://ieeexplore.ieee.org/document/9084219, accessed on 2021-01-17

Point-of-Interests Recommendation Service in Location-Based Social Networks: A Survey, Research Challenges, and Future Perspectives

Safar Maghdid Asaad, Kayhan Zrar Ghafoor, Halgurd Sarhang, and Aos Mulahuwaish

Abstract Accurate Point-Of-Interest (POI) recommendation, specifically in location-based services (LBS), has attracted wide attention from social-network developers. This is because the POI service plays a significant role in helping users to locate target places, including hospitals, airports, stations, billing addresses, post offices, shopping malls, and other POIs. Accordingly, many attempts have been made in both the commercial and academic sectors to provide accurate POI recommendation solutions. However, these solutions have their strengths and weaknesses in terms of initial check-ins, accuracy, the behavior of the users’ activities, and historical visited locations. Given the state of the art, a survey of such solutions and the techniques they use is needed. Therefore, this paper aims to address most of the currently proposed solutions and implemented techniques for offering accurate POI recommendation systems. Further, this paper presents a taxonomy of POI recommendation solutions in which the solutions are classified into content-based filtering, collaborative filtering, and hybrid filtering solutions, with a particular focus on the details of the implemented techniques/algorithms and utilized features. Providing an accurate POI recommendation solution and other related issues are listed as directions for future research.

Keywords Recommendation systems · Point of interest (POI) · POI recommendation · Smart city · Location-based social networks (LBSNs)

S. M. Asaad
Department of Information System Engineering Techniques, Erbil Technical Engineering College, Erbil Polytechnic University, Erbil, Kurdistan Region-F.R., Iraq
Department of Software Engineering, Faculty of Engineering, Koya University, University Park, Koya, Erbil, Kurdistan Region-F.R., Iraq

K. Z. Ghafoor
Department of Computer Science, Knowledge University, University Park, Kirkuk Road, 44001 Erbil, Iraq

H. Sarhang
Department of Software Engineering, Faculty of Engineering, Koya University, Koysinjaq, Kurdistan Region-F.R., Iraq

A. Mulahuwaish (corresponding author)
Department of Computer Science and Information Systems, Saginaw Valley State University, Allendale, MI, USA

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_4

1 Introduction

With the rapid spread of social networks around the world, the demand for Location-Based Services (LBSs) is increasing dramatically. Location-Based Social Networks (LBSNs) are the most popular service among all LBSs. In LBSNs, consumers can check in and share their location with their friends during their daily activities. For example, while a user is in a coffee shop, the user may take photos of the coffee cup on the table and then share the photos with friends through an LBSN. This kind of check-in activity quickly spreads everyday real-life activities and their associated knowledge across the internet. The rapid spread of LBSN check-in data has encouraged both commercial companies and researchers to propose and implement numerous location-based models in different domains, including recommendation systems [1–4] and user profiling [5–10]. Instagram, Facebook, and TripAdvisor are examples of current social networks that allow users to locate their places via Global Navigation Satellite Systems (GNSS). Such services can be implemented via specific smartphone or tablet applications/technologies [11, 12]. On the research side, many real LBSN datasets are available. The most popular datasets are Gowalla, Foursquare, Brightkite, and Yelp, and researchers construct their models based on chosen samples of these datasets. Gowalla contains 6,442,890 user check-ins collected from Feb. 2009 to Oct. 2010 [13]; Foursquare contains 801,131 check-ins collected in New York and Tokyo over ten months (from 12 April 2012 to 16 February 2013) [14]; Brightkite consists of 4,491,143 check-ins gathered from Apr. 2008 to Oct. 2010 [13]; and Yelp contains 1,320,761 POIs by 1,968,703 users [15, 16].

Numerous studies have used these datasets to improve the accuracy of recommendation systems and user profiling services. However, such research and the designed models/techniques need a survey that identifies the open issues and shows model improvements in a standardized manner [17]. Therefore, the main contribution of this chapter is to present current techniques and solutions that use location information in recommendation systems. The chapter also investigates the issues of recommendation solutions in terms of content-based, collaborative-based, and hybrid recommendation systems. Finally, the chapter is intended as a guideline that informs the reader of recent progress in applications that use recommendation systems.


The remainder of the paper is organized as follows. Section 2 discusses the current location-based recommendation solutions and compares them. The solutions are classified into three groups: content-based, collaborative-based, and hybrid-based algorithms. Section 3 summarizes the pros and cons of the recommendation algorithms. Section 4 discusses the open research challenges. Section 5 concludes with a set of observations on the current POI solutions.

2 Literature Review

In several LBSNs, accurate point-of-interest (POI) recommendation from the obtained location information remains a big issue. This section presents current studies and attempts to improve next-POI or recommended-POI prediction, and compares them. The comparison is based on criteria that include the datasets, the features considered, and the techniques/algorithms implemented in the recommendation system. Figure 1 depicts the taxonomy of the filtering algorithms used in the current studies. Across a large number of studies, most of the proposed solutions fall into one of the following classes:

● Content-Based Recommendation Systems
● Collaborative Recommendation Systems
● Hybrid Recommendation Systems

Fig. 1 Taxonomy of POI recommendation solutions

2.1 Content-Based Recommendation Systems

Content-based recommendation deals with extracting descriptions from the items determined by user preference. This approach matches features extracted from locations, including tags, price ranges, and categories, with user preferences to make predictions. Additionally, it uses the spatial distances and similarity measures of the matching contents to calculate the prediction score. The approach is robust against the cold-start issue of users and locations. However, building new users' profiles is still a big challenge [18, 19]. The details of the proposed solutions are listed in Table 1.

Table 1 Proposed content-based recommendation solutions

Year | Solution name | Considered features | Datasets | Techniques/algorithms
2021 | Anchor-LDA [2] | Initial check-in data | Gowalla | LDA; tenfold cross-validation
2020 | CBRS [20] | Contextual information | Movies in LDOS-CoMoDa [21]; songs in InCarMusic [21]; apps in Frappé [22]; points of interest in STS [23]; hotels in the TripAdvisor datasets [24]; Apple store ratings [25]; drug review [26] | k-means (KM); single-linkage (SLINK); fuzzy c-means (FCM); self-organizing maps (SOM)
2020 | CPAM [27] | Contextual influence; user preferences | Gowalla; Foursquare | Similarity metric; logistic matrix factorization (LMF)
2019 | STPR [4] | Spatio-temporal information of clients' check-in mobility | Gowalla; Foursquare | Kernel density estimation; time interval and the POI investigation mechanism
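The content-based matching described at the start of this section can be illustrated with a minimal sketch. It is not taken from any of the surveyed papers: the Jaccard tag similarity, the exponential distance decay, the `decay_km` parameter, and all function and field names are assumptions chosen for illustration only.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def jaccard(a, b):
    """Overlap between two tag collections, in [0, 1]."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def content_score(user_tags, user_location, poi, decay_km=5.0):
    """Tag similarity damped by distance (hypothetical scoring rule)."""
    similarity = jaccard(user_tags, poi["tags"])
    distance = haversine_km(user_location, poi["location"])
    return similarity * math.exp(-distance / decay_km)

def recommend(user_tags, user_location, pois, k=2):
    """Return the names of the k highest-scoring POIs."""
    ranked = sorted(pois,
                    key=lambda p: content_score(user_tags, user_location, p),
                    reverse=True)
    return [p["name"] for p in ranked[:k]]
```

Because the score depends only on the target user's own preferences and the POI descriptions, no other users' data is required, which is why a brand-new POI with tags can be ranked immediately.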


Most current studies overlook the significance of users' initial check-in information. To address this, the authors of a 2021 paper [2] proposed the Anchor-LDA (Latent Dirichlet Allocation) model to find the Points of Interest (POIs) of users in their primary activity area, as presented in Fig. 2. The model is based on initial check-in data and its metadata (anchor) in Location-Based Social Networks (LBSNs). Users check in to the places they visit through Gowalla, one of the earliest LBSN platforms. To evaluate the model, three Gowalla datasets (one for each of three cities) were used, considering only check-in and geo-coordinate information. In the evaluation process, predictive performance was assessed on 10% of each city's dataset by performing tenfold cross-validation. On that basis, the authors compared their model with three baselines, namely pure-LDA, weighted-LDA, and an LDA mixture with mean coordinates. According to their comparisons, Anchor-LDA performed better than the other LDA-based recommender algorithms. However, the accuracy of the proposed solution might be increased by considering more data sources besides the initial check-ins.

Fig. 2 Architecture framework of Anchor-LDA [2]

Because of the scalability and sparsity issues in existing state-of-the-art recommender systems, including collaborative filtering approaches, the authors in [20] suggested a clustering-based recommendation framework for the Internet of Things (IoT) context and environment. The proposed framework uses the vector space model from the information extractor to generate highly accurate recommendations. The proposed algorithm uses four well-known clustering techniques: KM, SLINK, FCM, and SOM. To evaluate the performance of the proposed recommendation scheme, various experiments were performed on seven IoT ranking datasets from various fields, including POIs. The authors argued, based on the experimental findings, that the suggested algorithm achieved a major increase in recommendation precision.

Robust associations between the POIs a consumer has visited and the POIs to be visited next are still difficult to model accurately, which results in low POI recommendation accuracy. In [27], the authors proposed a context- and preference-aware model (CPAM) for POI estimation that incorporates both contextual influence and consumer preferences. Initially, the authors created a Skip-Gram-based POI Embedding Model, known as SG-PEM.
SG-PEM extracts the contextual influence of POIs and learns the vector representation (embedding) of POIs from consumers' check-in sequences. The learned embeddings are used to extract the consumers' preferences for the targeted POIs through a similarity measure. After that, the authors used the Logistic Matrix Factorization (LMF) technique to model the consumers' personalized preferences for POIs based on the implicit feedback found in the check-in data. Finally, they fused SG-PEM and LMF to form the CPAM model, which uses contextual influence and consumer preferences to provide customized recommendations. The proposed framework significantly outperformed the state-of-the-art baselines in experiments on two real-world datasets, Foursquare and Gowalla.

The trip purpose behind a client's mobility and the memory effect of historical check-in behavior (at the same position) significantly affect POI recommendation, yet most previous studies fail to take them into account. The authors in [4] therefore proposed a framework with Spatio-Temporal influences based on Purpose Ranking (STPR) that estimates the top-k POIs for clients based on these patterns, as shown in Fig. 3. STPR contains two critical steps. In the first step, POIs in the spatio-temporal database are classified into four classes: entertainment, tasting-delicacy, exercise, and daily-travel. The classification is based on the client's behavior, and each category is considered a trip purpose. The user's next trip purpose is predicted via a purpose ranking model based on the user's historical check-in data.


Fig. 3 Framework of the Spatio-Temporal influences based on purpose ranking (STPR) model [4]

In the second step, the score of each candidate POI is calculated from its spatial and temporal properties in order to predict the top-k POIs for target users. For the spatial property, the likelihood of visiting a particular POI is predicted using kernel density estimation. For the temporal property, the time interval and the POI investigation mechanism are used. A set of experiments was conducted on the Foursquare and Gowalla datasets, in which the model's accuracy and runtime performance compared favorably with recent recommendation approaches.
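As a rough illustration of how kernel density estimation can turn a user's past check-ins into a spatial visiting likelihood, consider the sketch below. It is not the STPR implementation: the isotropic Gaussian kernel, the fixed `bandwidth`, and the treatment of latitude/longitude degrees as planar coordinates are simplifying assumptions, and the function name is hypothetical.

```python
import math

def kde_likelihood(candidate, checkins, bandwidth=0.01):
    """Gaussian kernel density estimate at `candidate` from past check-ins.

    `candidate` and every element of `checkins` are (lat, lon) pairs in degrees.
    Assumption: degrees are treated as planar coordinates, which is only
    reasonable over small (city-scale) areas.
    """
    if not checkins:
        return 0.0
    total = 0.0
    for lat, lon in checkins:
        sq_dist = (candidate[0] - lat) ** 2 + (candidate[1] - lon) ** 2
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    # Normalization constant of a 2-D isotropic Gaussian, averaged over check-ins.
    return total / (2 * math.pi * bandwidth ** 2 * len(checkins))

history = [(40.000, -74.000), (40.001, -74.001), (40.002, -74.000)]
near = kde_likelihood((40.001, -74.000), history)
far = kde_likelihood((40.100, -74.100), history)
print(near > far)  # True: candidates close to past check-ins score higher
```

A candidate POI's spatial score would then be combined with a temporal score to rank the top-k POIs; how the two are weighted is model-specific and not specified here.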


2.2 Collaborative Recommendation Systems

These systems apply collaborative filtering algorithms in the recommendation process. Collaborative filtering is one of the most widely used approaches compared with traditional recommendation solutions. With this approach, users with similar interests are matched as the basis of recommendation, so user behavior has a vital role in collaborative recommendation. Based on the recommendation technique, there are two categories of collaborative filtering algorithms: memory-based and model-based. Memory-based algorithms process the entire dataset to produce recommendations. They approximate users or items via statistical techniques, including K-Nearest Neighbor (KNN), the Pearson Correlation Coefficient, Cosine Similarity, etc. In contrast, model-based algorithms use only a subset of the overall dataset to create a model that learns from users' preferences. Model-based algorithms use techniques such as the Bayesian-Network model, cluster methods, etc. to develop models. One advantage of collaborative filtering approaches is that they provide high-quality recommendations by using the opinions of similar users or locations. However, collaborative filtering approaches suffer from data sparsity when there are few client ratings and the user-item (location, etc.) rating matrix is very sparse [19, 28, 29].

Most LBSN applications also suffer from deficiencies that include not employing users' shared behaviors, cold-start users, and data sparsity. To tackle these issues, the authors in [3] investigated a novel method called Behavior-based Location Recommendation (BLR). Figure 4 illustrates the overall framework structure of the method. The BLR method depends on the target user's repetitive behaviors and on similar users' behaviors. A behavior is defined by the combination of a location category and a time interval. Further, the BLR framework comprises two components: (1) a behavior prediction component and (2) a spatial model component that processes the check-in data. In the behavior prediction component, the Random Walk with Restart (RWR) method is used and extended to find users who have the same behaviors based on the users' Behavior Graph. The Behavior Graph is built from the check-in history of the pre-stored users' behaviors. A Behavior Transition Model (BTM) is then used to estimate clients' behaviors at any given time. The second component, the spatial model, finds Hot Check-in Areas (HCA) based on the check-in history. The spatial model's check-in information is used by two further models (a behavior-based spatial model and a user-specific spatial model) to learn behavior-based and user-specific spatial likelihoods depending on the HCA. In essence, BLR uses the user's behavior model together with the user-specific and behavior-specific spatial models to evaluate the probability of locations and recommends those with a higher likelihood. The experiments in the study showed that BLR's precision is up to 30% higher than that of existing recommendation approaches. In addition, for cold-start users, the precision of BLR is up to 50% higher than that of the baseline models.

Fig. 4 Behavior-based location recommendation (BLR) framework [3]

The Social Spatio-Temporal Probabilistic Matrix Factorization (SSTPMF) framework was proposed by Davtalab and Alesheikh [30] to resolve (1) the check-in cold-start issue and (2) implicit user and POI correlations. SSTPMF extends probabilistic matrix factorization (PMF) by combining user similarity, POI similarity, and the spatio-temporal neighborhood. Extending PMF also enhances the consistency of the POI recommendation and reduces the problem of sparse check-in matrix data. The study performed several experiments on two standard datasets, namely Foursquare and Gowalla.
The outcome of the experiments demonstrates that using POI heuristic information and user input increases the efficiency of POI recommendation systems, because the cold-start problem is addressed.

Because state-of-the-art models pay little attention to the spatio-temporal intervals between neighboring check-ins, the authors in [31] proposed a new mechanism for POI recommendation, the Spatio-Temporal Gated Network (STGN). The mechanism is based on an enhancement of the long short-term memory (LSTM) network, which is used to model the client's visiting behavior for next-POI recommendation. The spatio-temporal correlations between successive check-ins are captured by introducing spatio-temporal gates. Further, the short-term and long-term interest updates are controlled by a pair of time and distance gates. The time and distance gates are coupled together to reduce the number of parameters and improve the recommender system's performance. The study uses three real datasets (Gowalla, Foursquare, and Brightkite) for the evaluation. The experimental findings indicate that the performance of the model is significantly higher than that of the baseline approaches.

Frequent changes in the user's situation are not considered in most existing studies. An initial study in [32], however, addresses flexible POI suggestion that depends on the user's situation. The study constructs a graph model for each factor of the user's situation, including trajectory, distance, and preference factors. After that, the weight of each factor is adjusted using a weight-threshold algorithm (w-TA). The weight adjustment is thus based on the user's current situation and affords flexibility in the ranking of POIs. Experiments performed on the Foursquare dataset show that the proposed model produces a flexible ranking of POIs when the weights of the factors change. The details of the proposed solutions are presented in Table 2.

Table 2 Proposed collaborative-based recommendation solutions

Year | Solution name | Considered features | Datasets | Techniques/algorithms
2020 | BLR [3] | Spatial and temporal patterns of user behaviors | Gowalla; Foursquare; Brightkite | Density-based spatial clustering algorithm; two-dimensional kernel probability distribution
2020 | SSTPMF [30] | User similarity; POI similarity; spatio-temporal neighborhood | Gowalla; Foursquare | Extended probabilistic matrix factorization
2019 | STGN [31] | Time and distance intervals between neighbor check-ins (user behaviors) | Gowalla; Foursquare; Brightkite | LSTM
2019 | [32] | Trajectory, distance, and preference factors | Foursquare | Graph modeling; weight-threshold algorithm (w-TA)
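The memory-based collaborative filtering idea introduced at the start of this section (match users with similar check-in behavior, then recommend what the neighbors visited) can be sketched as follows. This is a generic user-based K-Nearest-Neighbor example with cosine similarity, not the method of any surveyed paper; the data layout (a dictionary of per-user check-in counts) and all names are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse check-in count vectors (dicts)."""
    shared = set(u) & set(v)
    dot = sum(u[p] * v[p] for p in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend_pois(target, checkins, k_neighbors=2, top_n=3):
    """Rank POIs the target has not visited by similarity-weighted neighbor visits."""
    sims = {other: cosine(checkins[target], visits)
            for other, visits in checkins.items() if other != target}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k_neighbors]
    scores = {}
    for other in neighbors:
        if sims[other] <= 0:
            continue  # a user with no overlap carries no evidence
        for poi, count in checkins[other].items():
            if poi not in checkins[target]:
                scores[poi] = scores.get(poi, 0.0) + sims[other] * count
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

checkins = {
    "user1": {"library": 3, "stadium": 2},
    "user2": {"library": 2, "stadium": 2, "cinema": 1},
    "user3": {"gym": 5},
}
print(recommend_pois("user1", checkins))  # ['cinema']
```

The example also shows the sparsity weakness mentioned above: `user3` shares no POI with `user1`, so nothing can be inferred from that user, and a brand-new user with no check-ins would receive no recommendations at all.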

2.3 Hybrid Recommendation Systems

The hybrid recommendation approach combines two or more collaborative filtering techniques, or combines them with content-based techniques, in different ways to improve recommendation accuracy. However, it suffers from high time complexity [18, 29]. There are many combining techniques, including the following [33, 34]:

(1) Weighted method: the scores of several recommendation techniques are combined numerically to produce a single recommendation.
(2) Switching method: the system switches among recommendation techniques depending on the current situation and uses the selected one.
(3) Mixed method: recommendations from separate recommenders are presented concurrently.
(4) Feature Combination method: features extracted from distinct recommendation data sources are fused and then used in a single recommendation approach.
(5) Feature Augmentation method: one technique computes an output feature or a set of output features, which is then used as an input feature to another technique.
(6) Cascade method: a recommender refines the result generated by another recommender technique.
(7) Meta-level method: the model generated by one recommender is used as the input of another recommender technique.

One of the critical points of POI recommendation systems is extracting efficient features from the client's check-in data. To support this claim, a model based on a deep neural network, called DLM, is proposed in [35]. Figure 5 depicts the overall structure of the DLM framework, which combines a deep network, the Latent Dirichlet Allocation model, and the Matrix Factorization algorithm. The DLM model incorporates three kinds of features into the POI recommendation model: (1) user preference features, (2) topic features, and (3) geographical factor features. The aim is to enhance the performance of a personalized POI recommender in most LBSN applications. The experiments conducted on the Foursquare dataset illustrate that using a deep neural network improves the efficiency of POI recommendation and alleviates the data sparsity problem relative to the baseline methods.

Fig. 5 Overall framework of the DLM model [35]

The study in [36] proposes the Neural Collaborative Filtering Tree (NCFT) framework, shown in Fig. 6, to overcome problems in traditional POI recommendation algorithms. The traditional algorithms use only a simple linear function to model social relationships, client check-in geographic information, and distribution features. In the NCFT, the client's long-term check-in distribution features are extracted by an unsupervised client check-in distribution feature extractor (CD-Ex), while the client's short-term check-in distribution feature is learned through gated recurrent units (GRU). In addition, two modules are proposed: a user-based collaborative filtering tree and an item-based collaborative filtering tree. The NCFT also learns deep representations of clients and objects through message passing. In these modules, multi-head attention and vanilla attention mechanisms are used to learn the representations of clients and POIs. Comparative experiments were conducted on the Gowalla and Yelp real datasets. The results show that the proposed framework achieves a notable improvement in AUC and F1 compared with baseline models.

Fig. 6 Neural collaborative filtering tree (NCFT) framework [36]

The network of users visiting POIs and the variety of existing location-based data are other factors that affect POI selection, specifically in a smart city. The study in [37] addresses this issue by proposing a new approach, the Network Representation Learning-Enhanced Multi-Source Information Fusion Model (MSI), for POI prediction. The proposed approach fuses three factors for POI prediction: (1) user preference, (2) geographical influence, and (3) social influence. Specifically, network representation learning methods are applied to the constructed co-visiting user networks to measure the hidden, complex social relationships between consumers automatically. After the user preference and geographical influence are modeled, the fusion model considers all factors to provide accurate POI recommendations. A large number of experiments on two real datasets (Foursquare and Gowalla) are presented. The results show that the fusion model outperforms state-of-the-art methods for POI recommendation. However, the temporal influence on POI recommendation is not considered, even though the relationships between users and POIs change over time and affect the accuracy of POI recommendation.

Social trust among individuals is another issue, since it has not been considered in most studies. Moreover, existing studies focus only on similarity, popularity, or geographical influence.
The authors in [1], however, argued that trust relationships have an essential role in collaborative recommendation. They proposed a Friend Recommendation system based on Trust Cluster (FRTC) and a Hybrid POI Recommendation (TSG). The study first generates trust clusters (co-clusters, i.e., groups of both trusters and trustees with similar patterns) among users using the fuzzy c-means (FCM) algorithm. The trust values are then predicted iteratively and combined with the similarity among clients to recommend friends to a target user. This is followed by a hybrid framework for POI recommendation. The hybrid framework improves the recommendation system's quality and time efficiency by combining trust relationships, geographical influence, and user preference. The proposed methods are evaluated on two real LBSN datasets, Foursquare and Gowalla. The results indicate that the proposed algorithms are more accurate than existing algorithms for friend recommendation and POI recommendation.

The accuracy of POI recommendation is also improved using the tensor factorization (TF) approach in [38], as presented in Fig. 7. With the TF approach, the topic information of each consumer and the POI-topic distribution are generated from the consumers' comments by the Latent Dirichlet Allocation (LDA) method. In the next step, each consumer's check-in data are separated into multiple categories, one for each hour of the day. The check-in data are then combined with the POI-topic distributions to generate user-topic distributions. In the final step, the third-order User-Topic-Time (UZT) tensor is decomposed with a Higher-Order Singular Value Decomposition (HOSVD) method, yielding a dense User-Topic-Time tensor that improves the POI recommendation. The results obtained on a real dataset from WW (World Wide) show that the accuracy of the proposed approach is higher than that of the baseline methods.

Users' social influence is often ignored during modeling. For this reason, the authors in [39] proposed a novel GeoEISo approach for POI recommendation. The study pursued three key goals. The first goal is to model the geographical influence between POIs by developing an adaptive kernel density estimation technique (AKDE) with a self-adaptive kernel bandwidth. For the second goal, two prediction models are built.
The models are (1) a model of explicit trust values among users and (2) a novel trust-based recommendation model. The explicit trust values between users are predicted using support vector regression (SVR) with a Gaussian radial basis function kernel, while the trust-based recommendation model incorporates two types of social trust, explicit and implicit, into the POI recommendation method. For the third goal, users' preference for a POI, the geographical influence, and social relationships are combined into a unified geo-social framework. Experimental results on Foursquare (one of the popular LBSN datasets) show that GeoEISo outperformed state-of-the-art POI solutions. However, the proposed approach requires a large amount of storage and memory because of its complex computations.

The studies mentioned above concentrate on personalized behavior differences when modeling geographical influence and on exploiting implicit social influence. Last but not least, another study [40] focused on incorporating diverse features of check-in data into existing POI recommendation algorithms. In the study, the authors proposed the Check-in and Temporal Features based Adaptive Recommendation Algorithm (CTF-ARA). The study uses probability statistics and collaborative filtering methods to filter and obtain check-in and temporal features. It combines four features using a probabilistic analysis method: user activity and similarity features from check-in behavior, and variability and consecutiveness features from temporal data.


Fig. 7 Overall framework of the UZT approach [38]

Additionally, the users are clustered into active and inactive users using the K-Means algorithm. Based on the proposed features, the Filtering Similar Users Algorithm (FSUA) is devised. A smoothing method based on the cosine similarity of different time slots is then used to improve the POI recommender. The improvement is achieved by integrating user activity, similar-user filtering, and smoothed similarity calculation. A set of hypotheses and experiments was set up, and CTF-ARA was compared with baseline POI recommendation methods on the Foursquare and Gowalla datasets. The results show that higher precision and recall can be achieved. However, the study does not consider spatial features, which could make the POI recommendation more accurate. The details of the proposed solutions are presented in Table 3.
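Of the combining techniques listed at the start of this section, the weighted method is the simplest to illustrate. The sketch below is a generic example under stated assumptions, not the scheme of any surveyed system: scores from each recommender are min-max normalized so they are comparable, then blended with fixed weights. The function names, the normalization choice, and the sample scores are all hypothetical.

```python
def min_max(scores):
    """Rescale a {poi: score} dict to [0, 1]; constant or empty inputs map to 0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {poi: 0.0 for poi in scores}
    return {poi: (s - lo) / (hi - lo) for poi, s in scores.items()}

def weighted_hybrid(score_sets, weights):
    """Blend several recommenders' scores and return POIs ranked best first."""
    combined = {}
    for scores, weight in zip(score_sets, weights):
        for poi, s in min_max(scores).items():
            combined[poi] = combined.get(poi, 0.0) + weight * s
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical outputs of a content-based and a collaborative recommender.
content = {"cafe": 0.9, "museum": 0.2, "park": 0.5}
collab = {"cafe": 1.0, "museum": 4.0, "park": 2.0}

print(weighted_hybrid([content, collab], [0.7, 0.3]))  # ['cafe', 'park', 'museum']
print(weighted_hybrid([content, collab], [0.2, 0.8]))  # ['museum', 'park', 'cafe']
```

Shifting the weights changes the ranking, which is the main design lever of this method; running every component recommender for each request is also one source of the time complexity noted above.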

3 Pros and Cons of Recommendation Algorithms

Each algorithm for recommending POIs has advantages and disadvantages. Table 4 summarizes the most common benefits and drawbacks of the most widely used algorithms in POI recommendation systems. Current and future recommendation systems based on location information include shop item recommendation, control/surveillance systems, e-card applications, and a large number of applications within smart cities. The vision of this chapter for recommendation systems based on geolocation information is depicted in Fig. 8, which shows that most smart city applications around us use geolocation. The model on the left side of the figure is based on content-based algorithms for making POI recommendations in a smart city environment. In that model, User 4 is involved in three points, the stadium, restaurant A, and the coffee shop, and the model recommends Restaurant B to User 4 based on the user's historical details. The right-hand side of the figure depicts a model based on collaborative algorithms. In this model, User 1 and User 2 are social partners who are both involved in the library and the stadium. When User 2 checks in at the cinema, the model recommends that cinema to User 1. When User 1 checks into a new location, such as a coffee shop, the coffee shop is recommended to User 2.

4 Research Challenges

The main issues of POI recommendation systems that remain open challenges center on the following points:

1. Many of the current solutions based on Location-Based Services (LBS) use only a single form of recommendation data source. Using multiple data sources would make it easier for Location-Based Recommendation Systems (LBRSs), including POI recommenders, to offer more accurate POI recommendations. Different data sources might involve the distance from the location, road patterns, weather conditions, alternative routes to the location, and the time of day (morning, evening, or night). However, involving different data sources might make the process more complex by increasing the time and storage complexity, because a large volume of data has to be processed to make an accurate recommendation.


Table 3 Proposed hybrid-based recommendation solutions

Year | Solution name | Considered features | Datasets | Techniques/algorithms
2020 | MSI [37] | User preference; geographical influence; social influence | Gowalla; Foursquare | Distance functions (linear regression, power-law function, exponential function); network representation learning methods (DeepWalk)
2019 | DLM [35] | User preference; topic features; geographical factor | Foursquare | Deep neural network; latent Dirichlet allocation (LDA); matrix factorization (MF)
2019 | NCFT [36] | Social relationships; user check-in distribution features; geographic information | Gowalla; Yelp | Collaborative filtering; gated recurrent units (GRU); multi-head attention and vanilla attention mechanisms
2019 | FRTC, TSG [1] | Check-in similarity and trust relationship; user similarity, geographic influence, and trust influence | Gowalla; Foursquare | Co-clustering technique; fuzzy c-means (FCM)
2018 | UZT [38] | Comment information | WW (World-Wide) | Tensor factorization (TF); latent Dirichlet allocation (LDA); higher order singular value decomposition (HOSVD)
2018 | GeoEISo [39] | Consumers' preference on a POI; geographical influence between POIs; social relationship | Foursquare | Kernel function density estimation method based on adaptive kernel bandwidth; RBF kernel function-based SVR model
2017 | CTF-ARA [40] | Activity and similarity features; temporal features | Gowalla; Foursquare | Probability statistics; collaborative filtering; K-Means; cosine similarity smoothing


Table 4 Summarization of pros and cons of recommendation algorithms

Content-based recommendation systems
Pros:
● Do not need other users’ data: The recommendations are specific to a single user, and the model does not need any data about other users. This makes scaling to a large number of users much more straightforward
● Able to recommend niche POIs: The model can capture a user’s personal preferences and recommend niche items that very few other users are interested in
● Adaptive: The quality of the recommendations improves over time
Cons:
● Prior knowledge of the domain is needed: This approach requires a significant amount of domain knowledge, since the feature representation of the objects is hand-engineered to some degree. As a result, the model can only be as good as the hand-engineered features
● Quality depends on a large historical data set: The model can only make suggestions based on the user’s existing interests

Collaborative recommendation systems
Pros:
● No prior knowledge of the domain is needed: Since the embeddings are learned automatically, domain knowledge is not needed
● Serendipity: The model can help users discover new POIs. Even if the model does not know that the user is interested in a particular POI, it may still suggest it because other users are interested
● A good starting point: The feedback matrix is all that is needed to learn a matrix factorization model. In particular, the model does not need contextual features. In practice, it can be used as one of several candidate generators
● Adaptive: The quality of the recommendations improves over time
Cons:
● Cold-start problem: The model’s prediction for a given (user, item) pair is the dot product of the corresponding embeddings. So, if an item is not seen during training, the system cannot create an embedding for it and cannot query the model with this item. This issue is often called the cold-start problem
● Difficult to include side features for query/item: Side features are any features beyond the POI or item ID. For example, region, gender, or age can serve as side features in a movie recommendation model. Including available side features improves the quality of the model, but it is not straightforward in this approach
● “Gray sheep” problem: Users whose preferences do not consistently match those of any other group of users are referred to as “gray sheep,” and they do not benefit from collaborative recommendation systems

Hybrid recommendation systems
Pros:
● Combining two or more strategies to improve efficiency: The key goal is to remove the disadvantages of each of the techniques discussed above, namely content-based algorithms and collaborative recommendations
Cons:
● Increased time and storage complexity: A large volume of data has to be processed to make an accurate POI recommendation

Fig. 8 The vision of the chapter for the recommendation systems based on geolocation information
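The cold-start entry in Table 4 describes the prediction for a (user, item) pair as the dot product of the two embeddings. A minimal sketch of that idea follows, assuming tiny hand-written embeddings; the values, the names, and the `predict` helper are hypothetical and only illustrate why a POI without an embedding cannot be scored.

```python
# Hypothetical learned embeddings (two dimensions only, for readability).
USER_EMB = {"user1": [0.9, 0.1], "user2": [0.2, 0.8]}
POI_EMB = {"cinema": [0.8, 0.3], "library": [0.1, 0.9]}


def predict(user, poi):
    """Score a (user, POI) pair as the dot product of their embeddings.

    Returns None when either side has no embedding, which is the
    cold-start situation: there is nothing to multiply.
    """
    user_vec = USER_EMB.get(user)
    poi_vec = POI_EMB.get(poi)
    if user_vec is None or poi_vec is None:
        return None
    return sum(u * p for u, p in zip(user_vec, poi_vec))


if __name__ == "__main__":
    print(predict("user1", "cinema"))    # both embeddings exist
    print(predict("user1", "new_cafe"))  # POI unseen during training
```

A hybrid system would typically fall back to content features (for example, the POI category) in the `None` case, which is one reason hybrid approaches are presented as a remedy for cold start.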

2. Another limitation of LBSNs is extracting the most useful features from the collected (and sometimes noisy) user data, which contains many features. A vast number of statistical and non-statistical techniques have been applied to extract useful features. However, to make the system more accurate, machine learning techniques that can extract increasingly accurate features from users’ historical information have to be used.

3. In POI recommendation systems, scalability remains a major issue. In user-based collaborative filtering algorithms, the computational complexity of user similarity is O(n²k), where n is the number of users and k is the average number of objects rated per user. The cost of computing similarities therefore rises sharply as the numbers of users and objects grow. Solutions to this issue include (1) dimensionality reduction through factorization and (2) narrowing the search space by adding contextual information from users. Establishing profiles for new users is also still a big challenge in content-based recommendation solutions, since content-based techniques rely on users’ historical data to create a profile, which is then used as the starting point for making recommendations. Hybrid filtering algorithms, on the other hand, can address the weaknesses of current solutions and improve the efficacy of POI recommendation systems. They handle cold-start issues well and scale with respect to user preferences and location data. However, they exhibit higher time complexity than content-based filtering.

4. When users check in on LBSNs, they frequently weigh privacy concerns such as disclosure of their address or of confidential relationships. When people use location-sharing applications in particular, privacy is the most significant consideration for users. In practice, by posting their current positions on LBSNs, users receive a large number of location-based services. At the same time, improper disclosure of users’ location information poses threats to their confidentiality. The protection of privacy in LBSNs has drawn interest from both academic research and industry [41]. This chapter therefore takes the view that all LBSN services, including POI recommender systems, should encrypt users’ private information using strong cryptographic techniques, so that users’ privacy is protected from malicious parties.
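The scalability point in the list above can be made concrete with a short sketch. It assumes sparse rating dictionaries and cosine similarity; the chapter does not prescribe a particular similarity measure, so `cosine`, `all_pair_similarities`, and `pair_count` are illustrative names. The point is only that a naive user-based approach compares every pair of users, and each comparison costs time proportional to the number of rated objects.

```python
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {poi: rating}."""
    shared = a.keys() & b.keys()
    dot = sum(a[poi] * b[poi] for poi in shared)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def all_pair_similarities(ratings):
    """Similarity for every unordered user pair: n(n-1)/2 pairs, about k work each."""
    return {
        (u, v): cosine(ratings[u], ratings[v])
        for u, v in combinations(sorted(ratings), 2)
    }


def pair_count(n):
    """Number of user pairs a naive all-pairs comparison must evaluate."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    ratings = {
        "u1": {"library": 5, "stadium": 3},
        "u2": {"library": 4, "cinema": 2},
        "u3": {"stadium": 1, "cinema": 5},
    }
    print(all_pair_similarities(ratings))
    # Doubling the number of users roughly quadruples the number of comparisons.
    print(pair_count(1000), pair_count(2000))
```

Both remedies named in the text reduce this cost: factorization shrinks the per-pair work, and contextual filtering shrinks the set of pairs that need to be compared.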

5 Conclusion

The rapid improvement of mobile device technologies and the remarkable growth of social networks have led to the generation of billions of user data records. Researchers have therefore been motivated to analyze that data to develop new models and provide new services to users. One of the most attractive of these services is Location-Based Social Networks (LBSNs), and specifically POI recommendation. In this study, most of the currently proposed solutions and implemented techniques for accurate POI recommendation have been reviewed. First, a taxonomy of POI recommendation solutions was presented, classifying the solutions into three categories: (1) content-based filtering, (2) collaborative filtering, and (3) hybrid filtering. The techniques and methods implemented and the features used in each solution were then explained. This study also presented the strengths of state-of-the-art solutions that provide accurate POI recommendations. However, some recommendation solutions suffer from data sparsity, cold-start users, and cold-start locations. Some solutions also overlook essential features that might improve the accuracy of POI recommendations, including initial check-in information, users’ social influence, temporal features, spatial features, the influence of journey purpose on consumers’ movement, and the memory influence of historical check-in behavior.

References

1. Zhu, J., Wang, C., Guo, X., Ming, Q., Li, J., Liu, Y.: Friend and POI recommendation based on social trust cluster in location-based social networks. Eurasip J. Wirel. Commun. Netw. 2019(1) (2019). https://doi.org/10.1186/s13638-019-1388-2
2. Seo, Y.D., Cho, Y.S.: Point of interest recommendations based on the anchoring effect in location-based social network services. Expert Syst. Appl. 164, 114018 (2021). https://doi.org/10.1016/j.eswa.2020.114018
3. Rahimi, S.M., Far, B., Wang, X.: Behavior-based location recommendation on location-based social networks. GeoInformatica 24(3), 477–504 (2020). https://doi.org/10.1007/s10707-019-00360-3
4. Huang, F., Qiao, S., Peng, J., Guo, B., Han, N.: STPR: a personalized next point-of-interest recommendation model with spatio-temporal effects based on purpose ranking. IEEE Trans. Emerg. Topics Comput. PP(c), 1–1 (2019). https://doi.org/10.1109/tetc.2019.2912839
5. Kanoje, S., Mukhopadhyay, D., Girase, S.: User profiling for university recommender system using automatic information retrieval. Procedia Comput. Sci. 78, 5–12 (2016). https://doi.org/10.1016/j.procs.2016.02.002
6. Zhao, S., Li, S., Ramos, J., Luo, Z., Jiang, Z., Dey, A.K., Pan, G.: User profiling from their use of smartphone applications: a survey. Pervasive Mobile Comput. 59, 101052 (2019). https://doi.org/10.1016/j.pmcj.2019.101052
7. Gu, Y., Ding, Z., Wang, S., Yin, D.: Hierarchical user profiling for e-commerce recommender systems. In: Proceedings of the 13th International Conference on Web Search and Data Mining, pp. 223–231. Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3336191.3371827
8. Eke, C.I., Norman, A.A., Shuib, L., Nweke, H.F.: A survey of user profiling: state-of-the-art, challenges, and solutions. IEEE Access 7, 144907–144924 (2019). https://doi.org/10.1109/ACCESS.2019.2944243
9. Kanoje, S., Girase, S., Mukhopadhyay, D.: User Profiling Trends, Techniques and Applications 1(1) (2015). Retrieved from http://arxiv.org/abs/1503.07474
10. Chen, W., Gu, Y., Ren, Z., He, X., Xie, H., Guo, T., … Zhang, Y.: Semi-supervised user profiling with heterogeneous graph attention networks. In: IJCAI International Joint Conference on Artificial Intelligence, 2019-August, pp. 2116–2122 (2019). https://doi.org/10.24963/ijcai.2019/293


11. RoyAnimesh, C., Shamsul Arefin, M.: An intelligent recommendation system based on collaborative filtering and grid structure. In: Internet of Things and Connected Technologies, p. 12. SPRINGER-AISC Series, Patna (2020)
12. Singh, P.K., Veselov, G., Vyatkin, V., Pljonkin, A., Dodero, J.M., Kumar, Y.: Futuristic trends in network and communication technologies: Third International Conference, FTNCT 2020, Taganrog, Russia, October 14–16, 2020, Revised Selected Papers. Part I (n.d.)
13. Cho, E., Myers, S.A., Leskovec, J.: Friendship and mobility: user movement in location-based social networks. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1082–1090 (2011). https://doi.org/10.1145/2020408.2020579
14. Yang, D., Zhang, D., Yu, Z., Yu, Z.: Fine-grained preference-aware location search leveraging crowd sourced digital footprints from LBSNs. In: UbiComp 2013 - Proceedings of the 2013 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pp. 479–488 (2013). https://doi.org/10.1145/2493432.2493464
15. Liu, Y., Pham, T.A.N., Cong, G., Yuan, Q.: An experimental evaluation of point of interest recommendation in location based social networks. Proc. VLDB Endowment 10(10), 1010–1021 (2017). https://doi.org/10.14778/3115404.3115407
16. Yelp.: Yelp Dataset Challenge (2015). https://www.kaggle.com. Retrieved December 10, 2020, from https://www.yelp.com/dataset/
17. Singh, P.K., Kar, A.K., Singh, Y., Kolekar, M.H., Tanwar, S.: In: Proceedings of ICRIC 2019, Recent Innovations in Computing, 2020, Lecture Notes in Electrical Engineering book series (LNEE). Lecture Notes in Electrical Engineering 597, vol. 597 (2020). Retrieved from http://www.springer.com/series/7818
18. Narvekar, M., Nayak, S., Bakal, J.: A survey on location recommendation systems. Commun. Comput. Inf. Sci. 721, 3–12 (2017). https://doi.org/10.1007/978-981-10-5427-3_1
19. Sielis, G.A., Tzanavari, A., Papadopoulos, G.A.: Recommender systems review of types, techniques, and applications. Encyclopedia Inf. Sci. Technol. Third Edition, 7260–7270 (2014). https://doi.org/10.4018/978-1-4666-5888-2.ch714
20. Kashef, R.: Enhancing the role of large-scale recommendation systems in the IoT context. IEEE Access 8, 178248–178257 (2020). https://doi.org/10.1109/ACCESS.2020.3026310
21. Kosir, A.: LDOS-CoMoDa dataset (2012). Retrieved April 20, 2021, from https://www.lucami.org/en/research/ldos-comoda-dataset/
22. Baltrunas, L., Church, K., Karatzoglou, A., Oliver, N.: Frappe: Understanding the Usage and Perception of Mobile App Recommendations In-The-Wild (2015). Retrieved from http://arxiv.org/abs/1505.03014
23. Braunhofer, M., Elahi, M., Ricci, F.: STS: A context-aware mobile recommender system for places of interest. CEUR Workshop Proc. 1181, 75–80 (2014)
24. Zheng, Y., Mobasher, B., Burke, R.: Context recommendation using multi-label classification. In: Proceedings—2014 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT 2014, 2(May), pp. 288–295 (2014). https://doi.org/10.1109/WI-IAT.2014.110
25. Ramanathan.: Mobile App Statistics (Apple iOS app store) (2018). Retrieved January 20, 2021, from https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps
26. Gräßer, F., Kallumadi, S., Malberg, H., Zaunseder, S.: Aspect-based sentiment analysis of drug reviews applying cross-domain and cross-data learning. In: Proceedings of the 2018 International Conference on Digital Health, pp. 121–125. ACM, New York, NY, USA (2018). https://doi.org/10.1145/3194658.3194677
27. Yu, D., Wanyan, W., Wang, D.: Leveraging contextual influence and user preferences for point-of-interest recommendation. Multimedia Tools Appl. 80(1), 1487–1501 (2021). https://doi.org/10.1007/s11042-020-09746-0
28. Bao, J., Zheng, Y.: Location-based recommendation systems. In: Encyclopedia of GIS, pp. 1–9. Springer International Publishing, Cham. https://doi.org/10.1007/978-3-319-23519-6_1580-1
29. Yin, C., Ding, S., Wang, J.: Mobile marketing recommendation method based on user location feedback. Human-centric Comput. Inf. Sci. 9(1) (2019). https://doi.org/10.1186/s13673-019-0177-6


30. Davtalab, M., Alesheikh, A.A.: A POI recommendation approach integrating social spatiotemporal information into probabilistic matrix factorization. Knowl. Inf. Syst. (2020). https://doi.org/10.1007/s10115-020-01509-5
31. Zhao, P., Zhu, H., Liu, Y., Xu, J., Li, Z., Zhuang, F., … Zhou, X.: Where to go next: a spatiotemporal gated network for next POI recommendation. In: 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, pp. 5877–5884 (2019). https://doi.org/10.1609/aaai.v33i01.33015877
32. Jang, S., Kim, J.H., Nasridinov, A.: Flexible POI recommendation based on user situation. In: Proceedings—2019 IEEE International Congress on Cybermatics: 12th IEEE International Conference on Internet of Things, 15th IEEE International Conference on Green Computing and Communications, 12th IEEE International Conference on Cyber, Physical and So, pp. 1257–1260 (2019). https://doi.org/10.1109/iThings/GreenCom/CPSCom/SmartData.2019.00211
33. Li, X., Xing, J., Wang, H., Zheng, L., Jia, S., Wang, Q.: A hybrid recommendation method based on feature for offline book personalization. arXiv (Xx), 1–15 (2018)
34. Ma, K.: Content-based Recommender System for Movie Website, 22 (2016). Retrieved from http://kth.diva-portal.org/smash/get/diva2:935353/FULLTEXT02.pdf
35. Gao, Y., Duan, Z., Shi, W., Feng, J., Chiang, Y.-Y.: Personalized recommendation method of POI based on deep neural network. In: 2019 6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC), pp. 1–6 (2019). https://doi.org/10.1109/BESC48373.2019.8963449
36. Zhu, J., Ma, S., Li, J.: POI recommendation based on first-order collaborative filtering tree. In: Proceedings—2019 15th International Conference on Mobile Ad-Hoc and Sensor Networks, MSN 2019, pp. 265–270 (2019). https://doi.org/10.1109/MSN48538.2019.00058
37. Hu, H., Jiang, Z., Zhao, Y., Zhang, Y., Wang, H., Wang, W.: Network representation learning-enhanced multi-source information fusion model for POI recommendation in smart city. IEEE Internet of Things J. 1 (2020). https://doi.org/10.1109/JIOT.2020.3006989
38. Liao, G., Jiang, S., Zhou, Z., Wan, C., Liu, X.: POI recommendation of location-based social networks using tensor factorization. In: 2018 19th IEEE International Conference on Mobile Data Management (MDM), pp. 116–124 (2018). https://doi.org/10.1109/MDM.2018.00028
39. Gao, R., Li, J., Li, X., Song, C., Zhou, Y.: A personalized point-of-interest recommendation model via fusion of geo-social information. Neurocomputing 273, 159–170 (2018). https://doi.org/10.1016/j.neucom.2017.08.020
40. Si, Y., Zhang, F., Liu, W.: CTF-ARA: an adaptive method for POI recommendation based on check-in and temporal features. Knowl.-Based Syst. 128, 59–70 (2017). https://doi.org/10.1016/j.knosys.2017.04.013
41. Bilogrevic, I., Huguenin, K., Agir, B., Jadliwala, M., Gazaki, M., Hubaux, J.P.: A machine-learning based approach to privacy-aware information-sharing in mobile social networks. Pervasive Mob. Comput. 25, 125–142 (2016). https://doi.org/10.1016/j.pmcj.2015.01.006

San Marcos Smart City: A Proposal of Framework for Developing ISO 37120:2018-Based Smart City’s Services for Lima

Jorge Guerra Guerra, Marco Rios, Alvaro Aspilcueta, Juan Gamarra, Jorge Zavaleta, and Felix Fermin

Abstract This chapter shows a framework proposal for developing smart city supported services based on the ISO 37120 standard. To this effect, a project called “San Marcos Smart City”, located in the campus of the National University of San Marcos, was engineered. The aim was to build and test the basic technological components for a Smart City, from the physical layer to the applications layer. An architecture and layers of the proposal are shared below, as well as an example with some components developed on campus. The initial results show that this framework could be implemented at several other Peruvian locations, with the support of multidisciplinary groups and institutions in various areas of expertise: education, environmental, law, government, private sector, among others.

Keywords Smart City · ISO 37120:2018 · Internet of Things · LPWAN

J. G. Guerra (corresponding author) · M. Rios · A. Aspilcueta · J. Gamarra · J. Zavaleta · F. Fermin
Internet of Things Research Group, National University of San Marcos, Jr. Amezaga 375, Lima, Peru

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_5

1 Introduction

Smart cities are envisioned as the future of tech-enabled, resilient, sustainable, creative, and livable urban human settlements in the world. The concept is increasingly becoming part of the vision of national governments, where technologies are used on a massive scale to improve processes across all dimensions of urban development, planning, and management [1]. Allam and Newman [2] review the characteristics used by cities that define themselves as “smart” in order to develop a framework that includes them for cities of the future, while [3] gives an account of topics relevant to intelligent governance and of application cases in Indonesian cities. Generally, what stands out most in a smart city is its technological aspect. For example, [4] highlights the management and monitoring frameworks of the cities of Busan (Korea) and Santander (Spain). Another important criterion is defined in [5], where data is reviewed based on factors related to the Sustainable Development Goal (SDG) on sustainable and resilient cities. These indicators show that quality of life, smart planning, and sustainability are fundamental themes for a “Smart City” [6–13]. The need for a standard Smart City implementation model is the main reason for regulation. The most important standard is ISO 37120:2018, a set of indicators for steering and measuring 19 subject areas [14]. Arman et al. [15] describe a proposal for a step-by-step process to identify the initiatives for each of the ISO 37120 indicators. In addition, [16] describes a study of the ISO 37120 indicators and their alignment with the United Nations “Sustainable Development Goals” for the 2030 agenda [16, 17]. The results of these initiatives suggest that they can be applied to Peruvian local governments, such as municipalities, to improve them. One of the initial problems in adopting the Smart City concept is the challenge of defining what it entails, including its own terminology.
Consequently, the problem becomes one of defining how a city can be classified as a Smart City; thus, [18] carries out a literature review on the characteristics that would define a smart city and the standards established for its validation. In the literature, topics such as urban planning, economic sustainability, and environmental management are evaluated. In [19, 20], the researchers propose a Smart City implementation model that takes elements from different existing norms, as well as a framework adapted to this proposal. Among the standards defined for a Smart City, the best known is ISO 37120 [21], defined by the International Organization for Standardization (ISO) [22], which covers the main indicators of sustainable cities. The current edition dates from 2018. In [23], a list of the ISO indicators covering 19 subject areas is given. Using the ISO standard is a way to validate the development of our Smart City proposal, which in the case of Lima must cover, in addition to the usual indicators, other indicators such as energy, economy, and telecommunications, in order to address sensitive issues such as citizen security, equal opportunity, and discrimination, identified as current problems in today’s societies. In [15], it is clearly shown that the use of these indicators improves the quality of life of citizens.


1.1 Why Smart City for Lima

The City of Lima, the capital of Peru, has a population of 9,485,405 inhabitants according to the most recent census by the National Institute of Statistics and Informatics, published in 2017 [2]. Metropolitan Lima’s 43 districts must coordinate with the provincial mayor, the main political and administrative authority in the city, on all comprehensive plans to solve the city’s main problems. A detailed study of the socioeconomic conditions experienced by the city’s inhabitants is presented in [24]; it emphasizes the unequal distribution of water, infrastructure conditions, and economic growth between the different districts, which constitutes the main impediment to the effective integration of all its citizens.

1.2 Chapter Organization

This chapter consists of six sections. Section 1 introduces the characteristics of a Smart City and explains why the city of Lima needs to become one. Section 2 describes the proposal, called UNMSM Smart City, covering the architecture of the solution and its physical model, communications, cloud model, and applications. Section 3 presents the implementation of the proposal, covering the physical and communications layers as well as the organization of the accumulation layer. Section 4 shows the results obtained with the available tools. Section 5 proposes future work based on the progress made in the project. Section 6 presents the conclusions, followed by the acknowledgments and the references.

2 Architecture Proposal

There is no standard model for implementing a smart city, owing to the diversity of definitions. Various implementation approaches can be found in the literature; for example, [25] presents a model for the development and management of a multicultural smart city that emphasizes the diversity of its citizens as well as issues of inclusion and equal opportunity. “San Marcos Smart City” is implemented on the campus of the National University of San Marcos (UNMSM), at 375 Amezaga Street, Lima (−12.0578404, −77.086382), as shown in Fig. 1. The UNMSM’s buildings and facilities are contained within a 679,037.04 m² [3] campus, which simulates conditions similar to those of the City of Lima and its districts and presents similar challenges involving transport, education, waste management, and security, among others, with the advantage of being a controlled and contained environment in which data can be collected and analyzed for precise results. The project will determine the architectural model of the proposed solution as well as the methodological model. At this time, the physical and communications layers of the proposed architecture, as well as the implemented storage layer model, have been completed.

Fig. 1 Campus of National University of San Marcos, Google Maps

2.1 San Marcos Smart City Architecture

The need to build a smart city for Lima has motivated the authors of this project to determine an architectural model and a future methodological model based on these considerations (Fig. 2):

(a) ISO 37120:2018 compliance.
(b) A physical layer based on open hardware and constrained devices.
(c) No public Wi-Fi; use of LPWAN and local Wi-Fi for implementation.
(d) Use of different cloud providers for a heterogeneous model.
(e) Use of the MQTT protocol for IoT communication.
(f) Use of Big Data and Machine Learning components in the application layer.
(g) Use of Open Data to access free information.
(h) Use of Edge Computing for data entry authentication/authorization processes through constrained devices.
(i) Implementation of an application model based on multiculturalism, diversity, and access for all citizens.


Fig. 2 Proposal smart city architecture

2.2 Physical Layer

The physical layer comprises sensors, actuators, and controllers, whose characteristics must be defined for proper operation; [26] gives a detailed study of the physical-layer elements that are considered for any IoT-based application. The wireless components to be implemented are based on the following devices:

(a) LoRa intelligent sensors.
(b) WizNote 811, ESP32 and Raspberry Pi 4 controllers.
(c) GPS tracker, environmental sensors, locked-door security sensor and face recognition camera.

2.3 Communication Layer

The communication component of the project is shown in Fig. 3.

Fig. 3 Components of communication layer and cloud layer of proposal

3 Proposal Implementation

For the implementation of the San Marcos Smart City model, three initial modules have been developed:

1. WiFi component: tracking of vehicles within the physical space of the project.
2. LoRaWAN component: intelligent sensors for distance, temperature, vehicle tracking, and point-to-point communication.
3. Facial recognition component: a Raspberry Pi 4 with a high-resolution camera and a web application with image capture via the web.

This implementation covers the physical layer, communications, and the initial modules of the accumulation layer.

3.1 WIFI Component

As part of the Mobility dimension of a Smart City, an IoT tracking system was developed for the bus that transports students, professors, and other members of the UNMSM between the university campus and different locations in the districts of Lima, so that its location can be visualized in real time. The architecture of the solution (see Fig. 4) has been divided into 5 layers:

● Physical layer: For the hardware stage of the geolocation project, a SIM808 GSM/GPRS + GPS module connected via serial port to an ESP32 module has been used. The ESP32 sends AT commands to the module to read the positioning data. The power supply stage requires a battery that delivers at least 2 A and two LM2596 3 A step-down DC-DC converters.
● Communication layer: The lightweight MQTT messaging protocol has been used to send the GPS data collected by the ESP32 MCU via WiFi. The UNMSM campus has portable internet modules with WiFi signal output, so it is convenient to communicate by this means. Other options considered, such as communication over the cellular network with a SIM card, were not convenient because they would have required contracting a data plan, whereas the existing WiFi communication modules already cover the associated data plan cost. Low-power long-range networks (LPWAN) such as LoRaWAN or Sigfox are not suitable either: there is no public LoRaWAN network, and the Sigfox limit of 140 messages per day is not sufficient to track the location of the bus.
● Smart City accumulation layer: Google’s cloud computing platform was used to collect data from the IoT device and store it in a database. Four services were used. Cloud IoT Core handles device management and Cloud Pub/Sub acts as the messaging broker. Firebase is also used; it is a cloud platform that facilitates the development of web and mobile applications by managing server-side services (authentication, databases, etc.) and is integrated with the Google Cloud Platform, from which two further services are used: Cloud Functions (which executes code without the need to manage servers and, in this case, stores the data arriving from the ESP32 in the database) and Cloud Firestore (a flexible and scalable NoSQL database).
● Application layer: A native mobile application for iOS and Android was developed using React Native, a hybrid mobile development tool based on React JS, which is a JavaScript library for web UI development developed and maintained by Facebook. This allows the application to be cross-platform and to perform better when displaying the bus location. The data collected by the GPS module will be sent through the mobile network; therefore, AT commands have been used, which program the A7 chip to transmit the data to the Google cloud for logging.

Fig. 4 Architecture of the WiFi component

The proposed module provides the following benefits:

● Use of ICT tools in its implementation: geolocation devices, data transmission over WiFi, cloud software on GCP, applications built with React Native, and cybersecurity tools.
● It refers to a delimited urban area: the predetermined bus routes, namely North, South, West and East.
● It seeks a better quality of life for the inhabitants: by making the bus positioning data public, citizens can see where the nearest bus is and manage their time accordingly.
● Decision-making based on city information: it will be possible to detect the city’s busiest points and monitor driver behavior on the streets.
● Eco-sustainability and environmental protection and improvement: because it runs on a battery, the system does not need to be connected to the traditional energy supply.
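Two technical points in this subsection lend themselves to a short sketch: the Sigfox message cap and the step from an AT-command reply to an MQTT message body. The code below is a hedged illustration, not the project's firmware. It assumes the positioning reply has the form of the SIM808 `AT+CGNSINF` response (the chapter does not name the command), and the field order, the sample reply, the bus identifier, and all function names are assumptions made for the example.

```python
import json

SECONDS_PER_DAY = 24 * 60 * 60


def min_interval_seconds(messages_per_day):
    """Shortest average reporting interval allowed by a daily message cap."""
    return SECONDS_PER_DAY / messages_per_day


def parse_cgnsinf(reply):
    """Extract time and position from a '+CGNSINF: ...' reply.

    Assumed field order (not stated in the chapter): GNSS run status,
    fix status, UTC time, latitude, longitude, then further fields.
    Returns None if the reply has another form or reports no fix.
    """
    prefix = "+CGNSINF:"
    if not reply.startswith(prefix):
        return None
    fields = reply[len(prefix):].strip().split(",")
    if len(fields) < 5 or fields[1] != "1":
        return None
    return {"utc": fields[2], "lat": float(fields[3]), "lon": float(fields[4])}


def to_payload(fix, bus_id):
    """Serialize a fix as a compact JSON string usable as an MQTT message body."""
    return json.dumps({"bus": bus_id, **fix}, separators=(",", ":"))


if __name__ == "__main__":
    # A cap of 140 messages per day allows, on average, one update every ~10 minutes.
    print(round(min_interval_seconds(140) / 60, 1), "minutes between messages")
    # Hypothetical reply using the campus coordinates quoted in Section 2.
    sample = ("+CGNSINF: 1,1,20210301120000.000,-12.057840,-77.086382,"
              "150.0,0.00,0.0,1,,1.0,1.3,0.8,,10,8,,,40,,")
    print(to_payload(parse_cgnsinf(sample), bus_id="bus-north"))
```

The arithmetic makes the stated design choice concrete: an update roughly every ten minutes is too coarse for real-time bus tracking, which is consistent with the decision to send positions over WiFi instead.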

3.2 LoRaWAN Component This project used a LPWAN module for data collection because the City of Lima does not have a public WiFi. As a result, tracking tasks, traffic control tasks, and recognition of patterns used technologies available in the local market. In the development of the LoRaWAN component, point-to-point network and also multipoint mode were considered (Fig. 5). The multipoint component is organized as follows: ● Physical layer: LoRa RAK sensors that control temperature, distance, GPS Tracker and open-door detection. ● Communication layer: Based on the LoRAWAN RAK 2245 PI HAT Gateway, a shield type card for use with Raspberry PI 3 or 4. This card has antennas for LoRA communication, and GPS. The communication via WIFI will be done through Raspberry PI. ● LoRAWAN cloud: The Thing Network (TTN) open technology service is being used. This cloud is ideal for collecting data sent by LoRA gateways and offers transfer applications to other cloud services such as Cayenne and others.

San Marcos Smart City: A Proposal of Framework for Developing …


Fig. 5 LoRaWAN component for Smart City model

● Cloud Server: The Cloud service used is Amazon Web Services (AWS), which receives the data from TTN and stores it in its S3 service. The Big Data and Machine Learning modules of AWS have been activated to process the data obtained (Fig. 6).

Fig. 6 Amazon web services (AWS) architecture


J. G. Guerra et al.

3.3 Facial Recognition Component

The Facial Recognition module has been created, and billing and the API have been activated for the account. A service account must be created in order to communicate with the Raspberry Pi (which runs the client library), as well as a virtual machine to run the application. To capture the photo used by the facial recognition system, the following elements were used: a Raspberry Pi computer, a Raspberry Pi camera (or alternatively a USB webcam), a standard keyboard and a 7” HDMI monitor. The following Python 3 libraries were installed on the Raspberry Pi to provide a suitable working environment: google-cloud-storage, Pillow, picamera. The photo is taken with the camera and sent to Google Cloud Platform for image processing; the virtual machine invokes the deep learning (DL) algorithm and processes the image, returning the data of the detected person in JSON format.

To implement the image recognition system in the virtual machine, OpenCV, Python and deep learning tools are used. The image recognition system uses the deep metric learning technique, in which the network returns a vector of floating-point feature values instead of a single label. The network architecture used for facial recognition is based on ResNet-34 [27], but it has fewer layers and half the number of filters. Davis King, the creator of dlib (a set of machine learning algorithms and tools for creating complex C++ programs), trained the network on a data set of approximately 3 million images. King tested the network on the “Labeled Faces in the Wild (LFW)” dataset, reaching an accuracy of 99.38% using deep metric learning [28–30]. The dlib library provides the deep metric learning implementation used to build the facial features for the recognition process. The face_recognition library wraps dlib’s facial recognition functionality, making it easier to use (Fig. 7).
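Because deep metric learning returns a feature vector rather than a label, two faces are compared by measuring how far apart their vectors are. The following is a minimal sketch of that comparison using only the Python standard library. The four-value vectors are toy placeholders for the much longer embeddings a real network produces, and the 0.6 threshold and all function names are assumptions chosen for illustration.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_same_person(known, candidate, threshold=0.6):
    """Treat two embeddings as the same face when they are close enough.

    The threshold is an assumed value; it would need tuning on real data.
    """
    return euclidean_distance(known, candidate) <= threshold


# Toy embeddings standing in for real network output
known = [0.10, 0.20, 0.30, 0.40]
close = [0.12, 0.18, 0.33, 0.41]
far = [0.90, -0.50, 0.10, 0.80]

same = is_same_person(known, close)
different = is_same_person(known, far)
```

A lower threshold makes matching stricter (fewer false matches, more missed ones); a higher threshold does the opposite.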

4 Results The following results were obtained.

4.1 WiFi Component See Fig. 8.


Fig. 7 The facial recognition module

Fig. 8 WiFi component for GPS tracking

4.2 LoRaWAN Component

4.2.1 Point-To-Point Communication

To collect data from LoRa nodes, RAK gateways were configured and operated as shown in Figs. 9 and 10. The code and console output for the data read from the DHT11 sensor are shown in Fig. 11.
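The code in Fig. 11 is not reproduced in the text. As a hedged illustration of how a temperature and humidity reading might be kept small enough for a LoRa frame, the sketch below packs the two values into two bytes. The byte layout, the function names and the assumption that the sensor reports whole-number values are hypothetical and not taken from the project.

```python
import struct

# Hypothetical payload layout: one unsigned byte for temperature (°C),
# one unsigned byte for relative humidity (%)
PAYLOAD_FORMAT = "BB"


def encode_reading(temperature_c: int, humidity_pct: int) -> bytes:
    """Pack a whole-number reading into a 2-byte payload."""
    return struct.pack(PAYLOAD_FORMAT, temperature_c, humidity_pct)


def decode_reading(payload: bytes) -> dict:
    """Unpack the 2-byte payload on the receiving node or gateway."""
    temperature_c, humidity_pct = struct.unpack(PAYLOAD_FORMAT, payload)
    return {"temperature_c": temperature_c, "humidity_pct": humidity_pct}


payload = encode_reading(24, 61)
reading = decode_reading(payload)
```

A binary layout like this trades readability for airtime; sending the same reading as JSON text would take roughly ten times as many bytes.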


Fig. 9 RAK card sending data

Fig. 10 RAK card received data

4.2.2 Multipoint Communication

Gateway 2245 PI HAT

The selected gateway was enclosed in a resistant plastic housing and linked with LoRa sensors for distance, GPS, door opening and humidity (Fig. 12).


Fig. 11 Code and data between LoRa nodes

Fig. 12 Gateway 2245 with case for outdoor operation

The Things Network (TTN)

A RAK gateway is registered in The Things Network through the TTN console in order to collect data from the LoRa sensors (Figs. 13 and 14).

AWS Services

TTN delivers data to the AWS cloud and its Simple Storage Service (S3) so that it can be processed with tools such as Big Data and Machine Learning. The AWS sign-up process is straightforward, and its management console has the options to deploy the cloud as required. Figure 15 shows the AWS console and the data stored in S3 every three minutes.

Fig. 13 Temperature and humidity sensor registration in TTN

Fig. 14 Distance sensor registration in TTN
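The chapter does not describe how the stored objects are named. One common approach, sketched here under stated assumptions, is to give each reading a time-partitioned object key so that periodic uploads never overwrite each other. The key layout, device name and function name below are hypothetical, and the sketch only prepares the key and body; it does not call AWS.

```python
import json
from datetime import datetime, timezone


def build_s3_object(device_id: str, reading: dict, received_at: datetime) -> tuple:
    """Return (key, body) for one sensor reading.

    The key layout <device>/<YYYY>/<MM>/<DD>/<HHMMSS>.json is an assumed
    convention, not the layout used by the project.
    """
    key = f"{device_id}/{received_at.strftime('%Y/%m/%d/%H%M%S')}.json"
    body = json.dumps(
        {
            "device_id": device_id,
            "received_at": received_at.isoformat(),
            "reading": reading,
        },
        sort_keys=True,
    )
    return key, body


# Illustrative values only
ts = datetime(2021, 5, 3, 14, 6, 0, tzinfo=timezone.utc)
key, body = build_s3_object("humidity-node-1", {"humidity_pct": 61}, ts)
```

The pair could then be handed to an S3 client for upload, for example through the `put_object` call of the boto3 library with the bucket name supplied by the deployment.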


Fig. 15 Data sent by LoRa humidity sensor through a RAK gateway

4.3 Facial Recognition Component

The algorithm was implemented in Python, based on Adrian Rosebrock’s tutorial on pyimagesearch.com,1 in order to store the facial encodings in a pickle file.

# import the necessary packages
# construct the argument parser and parse the arguments
# grab the paths to the input images in our dataset
# initialize the list of known encodings and known names
# loop over the image paths
# extract the person name from the image path
# load the input image and convert it from BGR (OpenCV ordering)
# to dlib ordering (RGB)
# detect the (x, y)-coordinates of the bounding boxes
# corresponding to each face in the input image
# compute the facial embedding for the face
# loop over the encodings
# add each encoding + name to our set of known names and
# encodings
# dump the facial encodings + names to disk (pickle file)

1 pyimagesearch; https://www.pyimagesearch.com/2018/06/18/face-recognition-with-opencv-python-and-deep-learning/.

The facial recognition algorithm that processes each image follows:

# import the necessary packages
# construct the argument parser and parse the arguments
# load the known faces and embeddings
# load the input image and convert it from BGR to RGB
# detect the (x, y)-coordinates of the bounding boxes corresponding
# to each face in the input image, then compute the facial embeddings
# for each face
# initialize the list of names for each face detected
# loop over the facial embeddings
# attempt to match each face in the input image to our known
# encodings
# check to see if we have found a match
# find the indexes of all matched faces then initialize a
# dictionary to count the total number of times each face
# was matched
# loop over the matched indexes and maintain a count for
# each recognized face
# determine the recognized face with the largest number of
# votes (note: in the event of an unlikely tie Python will
# select first entry in the dictionary)
# update the list of names
# loop over the recognized faces
# show the output image

Google Cloud stores all images and hosts the web application for image capture and ML processing for facial recognition. In the process defined by the module, the photos obtained were stored in Google Cloud, as shown in Figs. 16 and 17.

Fig. 16 Images stored in Google Cloud

Another way of capturing images for face recognition is a web-based image capture module (https://storage.googleapis.com/docentes-js/captura) developed to use the camera of a PC or laptop. This application captures up to three images of the person who is registering; after the user enters the DNI identification number and presses the Send button, the photos are registered and sent to the cloud (Fig. 18).
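Two steps of the listings in this section, dumping the encodings with their names to a pickle file and choosing the recognized face by counting votes, can be sketched with the standard library alone. This is not the authors' code: the embeddings are short placeholder lists rather than vectors from a face library, the match results are made up, and every function name is hypothetical.

```python
import os
import pickle
from collections import Counter


def name_from_path(image_path):
    """Take the person name from the parent folder: dataset/<name>/<file>."""
    return image_path.split(os.path.sep)[-2]


def serialize_encodings(entries):
    """Pickle (image_path, embedding) pairs as parallel lists of encodings and names."""
    data = {"encodings": [], "names": []}
    for image_path, embedding in entries:
        data["encodings"].append(embedding)
        data["names"].append(name_from_path(image_path))
    return pickle.dumps(data)


def vote_for_name(matches, known_names, unknown_label="Unknown"):
    """Return the known name with the most matches, or unknown_label if none matched."""
    matched = [name for name, hit in zip(known_names, matches) if hit]
    if not matched:
        return unknown_label
    # On a tie, Counter returns the name that was seen first
    return Counter(matched).most_common(1)[0][0]


# Placeholder embeddings; real ones would be computed from the images
entries = [
    (os.path.join("dataset", "ana", "1.jpg"), [0.1, 0.2]),
    (os.path.join("dataset", "ana", "2.jpg"), [0.1, 0.3]),
    (os.path.join("dataset", "luis", "1.jpg"), [0.9, 0.8]),
]
data = pickle.loads(serialize_encodings(entries))

# Hypothetical match results for one detected face against the three encodings
matches = [True, True, False]
recognized = vote_for_name(matches, data["names"])
```

Voting over several stored encodings per person makes the result less sensitive to a single poor photo, which is one reason to register more than one image per person.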

5 Summary and Future Directions

The preliminary phase of this project has been completed, covering the physical layer and communication layer as well as the creation of the first cloud-level services. Next, a machine-learning-based security module for the edge computing layer will handle authentication, authorization and non-repudiation. The application layer will support transport, public safety, education, sewage management and public health. All of these will be developed according to the criteria set out in the project, i.e. multiculturalism and access to services for all citizens. The project is planned to run for two years, during which the main guidelines for building the smart city implementation methodology for the city of Lima will be established from the results obtained.

Fig. 17 Images captured by the web system or by camera on Raspberry PI

Fig. 18 Image capture, web application for security


6 Conclusions

Implementing a smart city is hard work that takes many years of development and planning and involves several dimensions, of which the use of ICT tools is only one. Implementing and testing this proposal in a district of Lima was difficult because of the COVID-19 pandemic. For that reason the proposal was implemented on the campus of the National University of San Marcos, whose area and characteristics are very similar to those of a district at small scale, which made testing and data collection easier. An architecture for a smart city such as Lima is feasible by linking LoRaWAN gateways through The Things Network (TTN) open technology cloud service, which transfers data to other cloud services. The proof of concept successfully sent and processed light sensor data through TTN and Amazon AWS services, so applications at the accumulation level can be implemented in a short time. Including all citizens in the smart city model is very important for a real Smart City implementation, but this requires the involvement of specialists in different fields of knowledge, such as civil engineers, environmental engineers, education specialists and other professionals who can contribute to this multidimensional project.

Acknowledgements The authors are grateful to the following members of UNMSM’s Internet of Things Research Group: Clara Justino, Kevin Sanchez, Milton Rios, Max Cossio, Jurgen Guerra, Luis Mallqui, Eduardo Peña, Fernando Gutierrez, Alexander Soto and Brixany Ponce for helping to test this proposal. Special thanks to Marco Barrueto for help with translation and technical support.

References

1. Singh, A.: Advances in smart cities: smarter people, governance, and solutions. J. Urban Technol. 1–4 (2019). https://doi.org/10.1080/10630732.2019.1637606
2. Allam, Z., Newman, P.: Redefining the smart city: culture, metabolism and governance. Smart Cities 1(1), 4–25 (2018). https://doi.org/10.3390/smartcities1010002
3. Anindra, F., Supangkat, S.H., Kosala, R.R.: Smart governance as smart city critical success factor (Case in 15 Cities in Indonesia). In: 2018 International Conference on ICT for Smart Society (ICISS) (2018). https://doi.org/10.1109/ictss.2018.8549923
4. Hwang, J., An, J., Aziz, A., Kim, J., Jeong, S., Song, J.: Interworking models of smart city with heterogeneous internet of things standards. IEEE Commun. Mag. 57(6), 74–79 (2019). https://doi.org/10.1109/mcom.2019.1800677
5. Treude, M.: Sustainable smart city—opening a black box. Sustainability 13(2), 769 (2021). https://sci-hub.se/10.3390/su13020769
6. Costa, A., Teixeira, L.: Testing strategies for smart cities applications: a systematic mapping study. In: Proceedings of the III Brazilian Symposium on Systematic and Automated Software Testing (SAST’18). Association for Computing Machinery, New York, NY, USA, pp. 20–28 (2018). https://doi.org/10.1145/3266003.3266005


7. Rayes, A., Samer, S.: Internet of things from hype to reality. Internet of Things Security and Privacy 195–223 (2017). https://doi.org/10.1007/978-3-319-44860-2 (Chapter 8). https://doi.org/10.1007/978-3-319-44860-2_8
8. Borsekova, K., Koróny, S., Vaňová, A., Vitálišová, K.: Functionality between the size and indicators of smart cities: a research challenge with policy implications. Cities 78, 17–26. ISSN: 0264-2751. https://doi.org/10.1016/j.cities.2018.03.010
9. Alfaro-Navarro, J., López-Ruiz, V.R., Nevado-Peña, D.: The effect of ICT use and capability on knowledge-based cities. Cities 60, 272–280. ISSN: 0264-2751. https://doi.org/10.1016/j.cities.2016.09.010
10. Albino, V., Berardi, U., Dangelico, R.M.: Smart cities: definitions, dimensions, performance, and initiatives. J. Urban Technol. 22(1), 3–21 (2015). https://doi.org/10.1080/10630732.2014.942092
11. OECD: Smart Cities and Inclusive Growth. OECD Publishing, Korea (2020). Available at https://www.oecd.org/cfe/cities/OECD_Policy_Paper_Smart_Cities_and_Inclusive_Growth.pdf
12. Jeong, Y., Jong Hyuk Park, J.: IoT and Smart City Technology: Challenges, Opportunities, and Solutions. J. Inf. Process Syst. 15(2), 233–238 (2019). https://doi.org/10.3745/JIPS.04.0113
13. Liu, X., Heller, A., Nielsen, P.S.: CITIES Data: a smart city data management framework. Knowledge and Information Systems (2017). https://doi.org/10.1007/s10115-017-1051-3
14. ISO, ISO/TC 268: Sustainable cities and communities. Technical report (2018)
15. Arman, A.A., Abbas, A.E., Hurriyati, R.: Analysis of smart city technology initiatives for city manager to improve city services and quality of life based on ISO 37120. In: Proceedings of the 2015 2nd International Conference on Electronic Governance and Open Society: Challenges in Eurasia—EGOSE’15 (2015). https://doi.org/10.1145/2846012.2846025
16. Moschen, S.A., Macke, J., Bebber, S., Correa, B., da Silva, M.: Sustainable development of communities: ISO 37120 and UN goals. Int. J. Sustain. High. Educ. (2019). https://doi.org/10.1108/ijshe-01-2019-0020
17. ODS Peru, Objetivos de Desarrollo Sostenible. Available at https://www.pe.undp.org/content/peru/es/home/sustainable-development-goals.html
18. Bibri, S.E., Krogstie, J.: Smart sustainable cities of the future: an extensive interdisciplinary literature review. Sustain. Cities Soc. 31, 183–212 (2017). https://doi.org/10.1016/j.scs.2017.02.016
19. Elfiky, I.A., Abouzeid, M.N., Plattus, A.J.: A proposed assessment scheme for smart sustainable urban development. In: Proceedings, Annual Conference—Canadian Society for Civil Engineering (2019)
20. Gordon, F., Shane, M.: Smart City Framework. A Systematic Process for Enabling Smart+Connected Communities. Point of View—CISCO ISBG (2012)
21. Ya Arman, A., Abbas, A., Hurriyati, R.: Analysis of smart city technology initiatives for city manager to improve city services and quality of life based on ISO 37120. In: Proceedings of the 2015 2nd International Conference on Electronic Governance and Open Society: Challenges in Eurasia (EGOSE’15). ACM, New York, NY, USA, pp. 193–198 (2015). https://doi.org/10.1145/2846012.2846025
22. McCarney, P.: The evolution of global city indicators and ISO37120: the first international standard on city indicators. Stat. J. IAOS 31, 103–110 (2015). https://doi.org/10.3233/SJI150874
23. Berman, M., Orttung, R.W.: Measuring progress toward urban sustainability: do global measures work for arctic cities? Sustainability 12(9), 3708 (2020). https://doi.org/10.3390/su12093708
24. Ezquiaga, A.S.y.T.: Propuestas de ordenamiento urbano territorial en Lima, Perú. Banco Interamericano de Desarrollo (BID), discussion paper IDB-DP-00648 (2018)
25. Šulyová, D., Vodák, J.: The impact of cultural aspects on building the smart city approach: managing diversity in Europe (London), North America (New York) and Asia (Singapore). Sustainability 12, 9463 (2020). https://doi.org/10.3390/su12229463


26. Ray, P.: A survey on Internet of Things architectures. J. King Saud Univ. Comput. Inf. Sci. 30(3), 291–319 (2018). ISSN: 1319-1578. https://doi.org/10.1016/j.jksuci.2016.10.003
27. He, K., Zhang, X., Ren, S.J., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90
28. Dubois, A.: Facial recognition using deep neural networks. Master en ingénieur civil en informatique, University of Liège, Belgium (2018). Available at http://hdl.handle.net/2268.2/4650
29. Elsaeidy, A., Munasinghe, K., Sharma, D., Jamalipour, A.: Intrusion detection in smart cities using Restricted Boltzmann Machines. J. Netw. Comput. Appl. 135, 76–83 (2019). https://doi.org/10.1016/j.jnca.2019.02.026.Fec.pub
30. Ciaburro, G., Ayyadevara, V.K., Perrier, A.: Hands-on machine learning on Google Cloud Platform, 1st edn. Packt Press, 500 p (2018). ISBN: 9781788393485

Social and Technical Challenges in Eco-Sustainable Smart City in India—An Analysis

Devanshi Saxena, Shaweta Khanna, Sangeeta Mangesh, Manisha Chaudhry, and Kayhan Zrar Ghafoor

Abstract A smart city is defined by the use of technology and ICT-based innovation to provide efficient services within a system. The smart city concept increases the efficiency of urban services, enhances quality of life and generates new economic opportunities. Yet the debate on the term “SMART CITIES” still continues. It is a dynamic term whose definition varies from place to place and from person to person. Basically, smart cities refer to the use of digital innovation to make urban service delivery more efficient. Our honorable Prime Minister Shri Narendra Modi has always focused on the digitalization of our country. His dream of creating 100 smart cities, and the subsequent emphasis on creating self-sufficient smart villages, can only be achieved by integrating information and communication technologies (ICT) to provide an interface between denizens and governance with improved infrastructure. This chapter focuses on creating better neighbourhoods by combining a technology interface with community participation for inclusive development. Smart communities also play a major role in achieving the target of smart cities. Smart communities can be global, as they exist all over the world, as well as local, since they are based on efforts at the local level. A smart community is described as a geographical area, ranging in size from a neighbourhood to a multi-country region, whose citizens, governments and organizations use information technology to transform their area in efficient and beneficial ways. This needs cooperation among government, industry, citizens, etc. This chapter also compares different approaches towards smart communities and presents various challenges that hinder the process of smart city development.

D. Saxena · S. Mangesh · M. Chaudhry
JSS Academy of Technical Education, Noida, UP, India

S. Khanna (B)
ITS Engineering College, Greater Noida, UP, India

K. Z. Ghafoor
Department of Computer Science, Knowledge University, University Park, Kirkuk Road, 44001 Erbil, Iraq

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_6

1 Introduction

According to a United Nations document, cities are places where large numbers of people live and work; they consist of government and private offices, trade and transport [1]. The notion of the “smart city” was first introduced with the idea of using smart, ICT-based digital innovations to create and develop new economic opportunities in cities. With the rise of smart city projects around the globe, more attention needs to be paid to whether the benefits and costs of smart cities are spread across all segments of society [2]. The concept of the smart city presented in the literature to date is quite relative, depending in particular on the area in focus. A smart city is a successful combination of physical, digital and human systems in the built environment that delivers a sustainable, prosperous and inclusive future for its citizens [3]. A city becomes smart when investments in social capital and in conventional and modern communication infrastructure stimulate sustainable economic growth and quality of life with proper utilisation of the available natural resources [4]. Smart cities have a strong impact on the quality of life of the general public and aim to create more learned, knowledgeable and participatory citizens. Smart city projects allow members of the city to take part in its governance and management and encourage them to become active users. Individuals must connect in order to develop social and cultural capital and to achieve mass economic gains in production [5]. ICT and social networking sites play a vital role in the creation of smart neighbourhoods. Information and communication technology (ICT) provides an interface between citizens and governance that improves the working of government schemes meant for the people, such as the Smart City concept. Smart living, smart mobility, smart energy, smart neighbourhoods, smart workspaces and community participation intertwined with a technology interface act as powerful driving forces for creating smart cities [6–9].

Critical success factors for making a city smart include [10]:
(a) Communication infrastructure that ensures speedy, economical and reliable communication networking among different systems.
(b) Education and training for a workforce that can be employed locally to operate, maintain and sustain the systems independently and efficiently.
(c) Policies and programs that ensure citizen participation, bridging the social and cultural divide among different groups of people. Projects that ensure the development of creativity that enhances quality of life and harmony in society.
(d) A creative, innovative and progressive mindset that accepts new challenges, leading to industrial as well as economic growth.
(e) Marketing and promotion of developments to attract new minds. Marketing also ensures deeper outreach, widening the scope of the program (Fig. 1).

Fig. 1 Critical success factors in making of a Smart city [10] (factors shown: communication and networking; education and training; policies and programs; industrial and economic growth; broadening scope through marketing)

Smart cities can be differentiated from traditional cities in four major aspects [11]:
(a) First, smart cities will be equipped with global intelligent platforms whose smart modules can interact intelligently with the environment and store, retrieve, send and collect huge quantities of data, and which can interact with any system or sub-system that is an integral part of the smart city. The connectivity can be real time to ensure quicker reaction times [11].
(b) Second, smart cities will have strategies that ensure active participation and support from educated and well-informed citizens. Promotional and motivational strategies lead to significant behavioural changes in citizens and, in turn, within society.
(c) The third difference is the continuous adoption of newer technologies, which will ensure progressive but diverse improvement of smart cities according to the needs of their citizens.
(d) The fourth and final difference is the approach: smart cities will be human-centric. All the constituent sub-systems of a smart city will be developed to address the needs of its citizens, and all concerns related to societal health and safety will be addressed through its strategies [11].

This work presents a survey that elaborates the essential smart city constituents and the possible barriers to them with respect to India.


2 A Smart City Model and Its Components

In recent years, with the rapid growth of information and communication technology, IT-enabled e-governance, management, education and networking have brought the community and the system together. The challenges posed by the migrating population, with urbanization burdening existing systems and infrastructure, have led to the development of the smart city. The smart city hypothesis has been analysed vigorously, and several publications propose a conceptual smart city model. One such model, based on smart city projects, was presented by the ASCIMER project team [21] in 2007. Abstract models have also been presented that mainly focus on the use of ICT, environmental aspects, livelihood, economy and governance [22]. The triple helix model focuses on the integrating relationship between university, industry and government [23]. Some models combine institutional, technological and human factors. One model identifies a set of internal parameters (technology, organisations and policy) that affect the smart city most directly, with external factors (people, governance, natural environment and infrastructure) at a second level of impact. Recent conceptual visions of smart cities focus on governance as the key to the success of smart city initiatives. The evolution of the models reflects the increasing importance of governance and the shift toward citizen-centric approaches [5, 10, 21, 22]. One model in the literature proposes a multidimensional system that addresses the relationships among the different stakeholders in urban development; it considers stakeholders from four domains, namely social, economic, knowledge and political [22]. A summary of smart city models along with their components is given in Table 1. Integrating the concepts available in the literature, the essential components of an eco-sustainable smart city are indicated in Fig. 2.

Table 1 Different smart city models and components as described in the available literature

Smart city model | Components of the smart city | Literature
Smart city | Management and organization, technology, governance, policies, people and community, economy, infrastructure and natural environment | [12]
Smart city | Land, technology, citizen, government |
Smart city | Technology, community, people | [13]
Smart city | Contextual conditions, governance models, and the assessment of public value | [14] [15]
Smart city | Technology, community and people |
Smart city | Government, mobility, services, community, economy, natural environment and built environment | [16]
Smart city | Smart environment, smart economy, smart people, smart governance, smart living and smart mobility | [17, 18]
Smart sustainable city (eco-city) | Government, authorities, institutions, agencies, companies, communities, and citizens | [19]
Smart city | Smart agriculture, smart homes, smart city service, smart industry, smart energy, smart infrastructure, smart health, smart transport | [20]

Fig. 2 Components of the eco-sustainable smart city (components shown: smart infrastructure that will lead to industrial and economic growth; smart agriculture for enhancing quality and yield; smart communication and networking that facilitates e-governance, resource management, and sanitation and waste management; smart transportation; smart health care system; smart green energy systems for sustainable growth; smart and affordable gadgets that enhance quality of life; smart education that can bring harmony in society)

3 Role of ICT in the Development of Smart City

According to the literature [24], the smart city is defined as “a well-defined geographical area, in which high technologies, such as Information Communication Technologies (ICT), logistic, energy production, and so on, cooperate to create benefits for citizens in terms of well-being, inclusion and participation, environmental quality, intelligent development; it is governed by a well-defined pool of subjects, able to state the rules and policy for the city government and development.” According to another definition, “A smart sustainable city is an innovative city that uses information and communication technologies (ICTs) and other means to improve quality of life, efficiency of urban operation and services, and competitiveness, while ensuring that it meets the needs of present and future generations with respect to economic, social and environmental aspects” [25]. ICT-based infrastructure, alongside the traditional utility and service infrastructures, plays a vital role in the growth of smart cities. Information systems help to optimise the infrastructure, inform and interconnect people, build inter- and intra-city communication networks, and enable customization of applications and service delivery to meet real-world demands. The convergence of ICT and city services will serve as a catalyst in forming the smart city ecosystem with enhanced economic growth [26]. Indeed, IoT forms the technology backbone of a smart city [27]. Even though the use of ICT tools varies with the services and applications, ICT can be considered an integration of software, hardware, data, data storage, data security, cloud computing, data communication, portability and networking infrastructure [28]. IoT, or IoE (Internet of Everything), has aided the development of systems and applications that can play a critical role in designing health monitoring systems and health care equipment. Access to a scientific knowledge base and instrumentation has aided soil analysis, optimization of fertilizer use, water management and resource management, enhancing crop yield and building a system called smart agriculture. The use of sensing devices and their massive connectivity has enabled the information exchange needed to develop applications and systems that make community life cost-effective and time-efficient. The major thrust areas that benefit from ICT are [29]:
(a) Information and knowledge sharing: ICT-enabled social networking, community education and the sharing of valuable, precise information can help to gain insight into a problem at an early stage.
(b) ICT-enabled forecasting: ICT-enabled sensors, information collection and processing, pattern prediction through data analysis and massive networking can help in preparing for natural calamities such as cyclones, rainfall and storms. Efficient forecasting and management aid in being proactive.
(c) ICT information integration: integrating information gives knowledge of strengths and weaknesses for efficient customization and optimization of IT-enabled services and applications to cater for users’ needs.

With the proliferation of the internet, the term IoT or IoE (Internet of Things or Internet of Everything) has come to be used rather than ICT. IoT is an integration of ICT-enabled services, gadgets and devices that ensures speedy connectivity and reliable communication through the internet. The use of IoT/IoE in developing smart cities facilitates the creation of smart communities and smart neighbourhoods.

3.1 Smart Neighbourhood

ICT/IoT-enabled services can be employed effectively to build communities and make the world a better place to live in. The smart neighbourhood concept is about organizing services effectively and efficiently and making them accessible to every member of society. Creating a smart neighbourhood involves the participation of people in creating and maintaining the infrastructure accessible to them. The level of participation can vary from information access to collaboration and the empowerment of citizens to execute a particular project or service. With a smart neighbourhood and responsible community participation, there can be a positive impact on environment management; management of community-led programmes such as fairs and festivals; inclusive planning for specially abled or chronically ill people; finance mobilisation or microfinancing; employment opportunities; promotion of product development, design innovation and advertising; management of facilities such as parks and roads; community sensitization and awareness programmes; urban conservation and cultural identity; public safety and security; and waste management [29]. The ultimate objective of all smart city efforts is to enhance quality of life and to assure (economic) prosperity for all. The citizens of a city or society play a vital role in creating and implementing the right processes and solutions. Technology and data are key enablers, but without the smartness of citizens, proper engagement will not be achieved and smart city projects will fail. Many cities have learned this lesson and turned their strategies around by 180° [19, 30].

3.2 Smartivists

As described in the My Smart Community podcast episode SCP E72, The Empowerment of Smartivists, with Tom Mueller (11 April 2021), the “Smart City concept describes the ability for utilizing the capacity of a city or community to create and adopt solutions for overcoming challenges and seizing opportunities that help transform the places to a more prosperous and more livable place for all stakeholders.”1 Recently, the new term “smartivist” (smart activist) has been coined for a smart citizen: “an individual who steps forward to actively support the process of creating a better place on a voluntary basis.” Smartivists can promote smart city development through the collective use of their intelligence, expertise, and commitment to create a better and smarter place within shorter timespans [5, 15, 30–33] (Fig. 3).

3.3 Smart Community

A key objective of the “smart cities” approach relates to community development: empowering local individuals and groups by providing them with the necessary assistance and infrastructure to effect changes in their own communities [10]. A smart community makes use of IoT-enabled gadgets and services to boost community functions and performance. Every individual within the community is well informed and aware of their roles and responsibilities towards self and society. The “smart community” concept relies on the belief that local leaders know far better than state or national officials how next-generation technologies can best be marshalled to a community’s benefit. Each community has unique challenges, which makes a one-size-fits-all approach unviable on the ground. Therefore, only local political, civic, business, and education leaders, working in cooperation, can bring people and technology together in time to capture the competitive and civic advantages that the telecommunications revolution makes possible [34]. As all smart city models focus on people’s participation and on enhancing quality of life, a smart community plays a central role in a smart city [12, 20, 21, 24, 26, 35].

Fig. 3 Smart community [Image source: https://www.nsf.gov]

1 The podcasts from My Smart Community are available at https://www.youtube.com/playlist?list=PLhFh0RHqNVF1DFrkCh6v7J0RxiFrPMVC4

4 Infrastructure and Transport

Infrastructure plays a fundamental role in improving the quality of life of any citizen. The development of infrastructure projects, as well as their maintenance and usage, creates numerous employment opportunities for the local population. Projects such as bridges, roads, and buildings serve to accelerate the growth of the economy. Improved access for both consumers and businesses acts as a force multiplier for value generation. Infrastructure development therefore leads to massive savings in both cost and time. With GPS devices in cars and mobile phones carried by nearly every driver, many approaches use GPS data to track driver behaviour and traffic patterns [36]. This real-time data is already used for route mapping in applications such as Google Maps, as well as for trip scheduling in public transport. IoT-enabled road traffic control and parking systems can assist vehicular movement and navigation and facilitate hassle-free parking for the urban population, saving both cost and time [20].
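As a rough illustration of how GPS traces can be turned into a traffic indicator, the following sketch estimates a vehicle's average speed between consecutive GPS fixes using the haversine great-circle distance. The `GpsFix` record layout and the 15 km/h congestion threshold are assumptions made for illustration only; they are not taken from [36] or from any deployed system.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class GpsFix:
    """One GPS reading (hypothetical record layout)."""
    lat: float        # latitude in degrees
    lon: float        # longitude in degrees
    timestamp: float  # seconds since an arbitrary epoch


def haversine_km(a: GpsFix, b: GpsFix) -> float:
    """Great-circle distance between two fixes, in kilometres."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def average_speed_kmh(trace: list) -> float:
    """Average speed over a time-ordered trace of at least two fixes."""
    if len(trace) < 2:
        raise ValueError("need at least two GPS fixes")
    elapsed_h = (trace[-1].timestamp - trace[0].timestamp) / 3600.0
    if elapsed_h <= 0:
        raise ValueError("fixes must span a positive time interval")
    distance = sum(haversine_km(p, q) for p, q in zip(trace, trace[1:]))
    return distance / elapsed_h


def is_congested(trace: list, threshold_kmh: float = 15.0) -> bool:
    """Flag a trace as congested when the average speed is below the
    (assumed) threshold."""
    return average_speed_kmh(trace) < threshold_kmh
```

A real traffic system would aggregate many such traces per road segment and would also need map matching and noise filtering, which this sketch leaves out.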

5 Smart Agriculture

Food security is one of the most important parts of the United Nations Sustainable Development Goals for 2030. With an increasing world population and worsening climate change causing erratic weather in the food centres of the world, ensuring that food production is sustainable and that dwindling resources such as water are used efficiently has become a high priority for countries around the world. Smart agriculture is the use of sensors embedded in plants and fields to measure various parameters that help in decision making and in preventing diseases, pests, etc. [37]. A part of the smart agriculture paradigm is precision agriculture, in which sensors are placed in plants to provide targeted measurements and thereby allow targeted care mechanisms to be deployed. Precision agriculture will be necessary for food security in the future [38] and is therefore an essential part of the fight for sustainable food production. The major applications of AI in IoT for agriculture are crop monitoring and disease detection, and data-driven crop care and decision making [20].
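The idea of targeted measurements leading to targeted care can be sketched with a small example. Here, per-plant soil-moisture readings are averaged and only the plants below a threshold are flagged for irrigation. The data layout, the function name, and the 30% threshold are hypothetical values chosen for illustration; real thresholds depend on crop and soil type.

```python
from statistics import mean


def plants_needing_water(readings: dict, min_moisture: float = 30.0) -> dict:
    """Return {plant_id: average_moisture} for plants whose average
    soil moisture (in percent) is below min_moisture.

    readings maps a plant id to a list of recent sensor values.
    Plants with no readings are skipped rather than flagged.
    """
    flagged = {}
    for plant_id, values in readings.items():
        if not values:
            continue  # no data from this sensor; cannot decide
        average = mean(values)
        if average < min_moisture:
            flagged[plant_id] = average
    return flagged


# Example usage with made-up readings
sample = {"plant-A": [20, 25], "plant-B": [40, 50], "plant-C": []}
to_irrigate = plants_needing_water(sample)
```

The same pattern extends to other parameters (temperature, leaf wetness, pest-trap counts) by swapping the measured quantity and the decision rule.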

6 Smart Healthcare

The use of mobile and communication technology, as well as IoT-enabled services, is rapidly transforming the way health services interact with the average consumer. With increased approachability comes the possibility of greater personalization and individual-focused public health and medical care. A World Health Organization survey report on m-health monitoring services and their benefits indicated that mobile-enabled health services, if implemented strategically and systematically, can revolutionize health outcomes, providing virtually anyone who has a mobile phone with medical expertise and knowledge in real time. This is a game changer, particularly for those who are marginalized or live in remote areas and would otherwise not have access to this information or healthcare [39]. Data analytics and data mining tools can aid in enabling personalized health care and monitoring services. Once incorporated, such systems would deploy various types of diagnostic sensors widely across the population to collect data about the overall state of health in the smart city. The physical and medical examination of a patient can be broken down into four parts, namely (1) inspection, (2) palpation, (3) percussion, and (4) auscultation. Each of these four parts can be represented by a distinct type of sensor that collects data remotely, allowing doctors to diagnose and treat patients accurately in real time. The data can be analysed further using data mining and big data analytics tools to design smart systems, enabled with artificial intelligence, that help medical professionals [40, 41].

Fig. 4 Smart energy system for a smart city [Figure panels: renewable, low-carbon energy resources (solar energy, hydel power, wind energy, OTEC/tidal/wave energy, biomass, geothermal); smart energy distribution system and smart infrastructure (smart grids, cross-border grids, use of artificial intelligence and information technology); smart power management (low-power devices and systems, efficient energy storage systems, smart metering, no-emission and no-radiation appliances)]

7 Smart Energy Management

With the industrial revolution and urbanization, total energy consumption has increased, affecting the environment through greenhouse gas emissions and the gaseous and particulate pollutants released as by-products. Moreover, the demand for uninterrupted energy has also increased as migrant populations move towards cities and as gadgets, services, and applications that are now part of everyday life are used on a massive scale. Given the rate at which energy resources are being depleted, energy management is critical in developing a smart city (Fig. 4). Smart energy as a concept includes smart power generation, smart power grids, smart storage, and smart consumption. The application of information and communication technology (ICT) to traditional energy production, transmission, and consumption gives rise to the smart energy paradigm [27]. Such systems consist of the intelligent integration of decentralized sustainable energy sources, efficient distribution, and optimized power consumption. Modern technology and connectivity allow for more accurate base load and peak load calculations in electricity production, a major reduction in transmission losses, and a reduction in energy usage at the consumer end through efficient appliances and power systems that tailor energy consumption to actual user needs, as opposed to traditional inefficient mechanisms [26, 27, 42, 43].
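To make the base load and peak load calculation concrete, the sketch below derives both from a series of metered demand readings, together with the load factor (average demand divided by peak demand). Treating base load as the minimum observed demand over the period is a common simplification assumed here; the function name and the sample readings are illustrative and do not come from the cited sources.

```python
def load_profile(hourly_kw: list) -> dict:
    """Summarize a series of demand readings (in kW).

    base_load_kw : minimum demand over the period (simplifying assumption)
    peak_load_kw : maximum demand over the period
    average_kw   : arithmetic mean of the readings
    load_factor  : average / peak, between 0 and 1 for non-negative demand
    """
    if not hourly_kw:
        raise ValueError("need at least one reading")
    base = min(hourly_kw)
    peak = max(hourly_kw)
    average = sum(hourly_kw) / len(hourly_kw)
    load_factor = average / peak if peak > 0 else 0.0
    return {
        "base_load_kw": base,
        "peak_load_kw": peak,
        "average_kw": average,
        "load_factor": load_factor,
    }


# Example usage with made-up smart meter readings
profile = load_profile([2.0, 4.0, 6.0, 8.0])
```

A load factor close to 1 indicates flat demand, while a low value points to pronounced peaks that generation and storage must be sized for.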

8 Smart Resource Management, Governance, Waste Management

There is a need to regulate the rampant use of the internet and of IoT-enabled gadgets and services so that they are used safely and securely. Inappropriate use of technology may affect physical and mental health, disrupt societal harmony, and cause financial losses, loss of property, etc. Good, efficient administration plays a dominant role in ensuring the right and ethical use of IoT/ICT services. An administration that is familiar with trends in the evolution of technology is essential for effective regulation, which in turn is necessary for the efficient implementation of IoT at a wider scale [7, 12, 14, 22, 23, 44–46]. E-governance processes that expand the reach and efficiency of social services are a prerequisite for enabling smart cities. Further, such processes keep citizens involved in the wheels of governance and help keep the decision and implementation process transparent [32]. Smart water management is especially important in the Indian context: smart cities with high population densities face demand and supply shortages, which need to be addressed specifically [15, 47–49]. Another important aspect of smart city design is smart waste management, which focuses on recycling and by-product usage. The issues of industrial waste, garbage dumps, and silicon waste generated by the rampant use of affordable gadgets require immediate attention, and systems need to be put in place for smart waste management [4, 25, 50, 51].

9 Education, Training and Security

The reach of ICT and IoT-enabled gadgets leads to huge amounts of data being created in audio, video, or text format. Making a smart city essentially requires smart citizens who have adequate awareness of the rules and regulations for using these gadgets and systems. The systems and applications that constitute the smart city need to ensure data security for their users. Empowering citizens with knowledge and ethics will ensure societal harmony [52–55].

10 Smart City Challenges in India

10.1 Mission Smart Cities

Under the leadership of Honourable Prime Minister Narendra Modi, a smart city mission was launched in India in June 2015 after a year of preparations. According to the Indian government website (https://smartcities.gov.in), the purpose of the Smart Cities Mission is to promote cities that provide core infrastructure and a high quality of life (QoL) to their residents, a clean and sustainable environment, and the application of “smart” solutions that use available technology to increase efficiency. There is a clear focus on sustainability as well as inclusive development. The vision targets compact areas and aims to create a replicable model that can then be recreated across the country, keeping in mind the unique local challenges present on the ground. These examples can thus act as a catalyst for the creation of a broader interconnected network of smart cities across the country [56]. Among the core infrastructure elements in a smart city are adequate water supply; guaranteed and uninterrupted supply of electricity; sanitation, including solid waste management; efficient urban mobility with a key focus on public transport; affordable housing; robust IT connectivity and digitalization, particularly for traditionally underserved communities; good governance, with a focus especially on e-governance and citizen participation; law and order; and access to affordable healthcare and education.

10.2 Implementation

According to the Indian government webpage (https://smartcities.gov.in), the smart city projects will be implemented through the incorporation of Special Purpose Vehicles (SPVs). The SPV will be responsible for the planning, appraisal, approval, funding, implementation, management, operation, monitoring, and evaluation of the development projects. Each smart city will have an SPV headed by a full-time CEO. The board of the SPV shall include nominees of the central and state governments as well as of the Urban Local Body (ULB). The funding for the SPVs will take the form of tied grants drawn from a separate grants fund created expressly for this purpose. All expenditure will be monitored at the local as well as the state and national levels.

The implementation of the programme has been evaluated with Delhi as a case study [57], which finds that all five implemented projects show some deficiencies in execution and maintenance. There has been significant positive change in terms of the Government's information portal, digitization of fund transfers, information sharing through digital boards, smart parking systems, and infrastructure development. However, a lot more needs to be done to address the rehabilitation of poor migrants and to improve quality of life, water, sanitation, and health care [57].

10.3 Challenges and Issues

After independence, the democratic Indian government began functioning in 1951 against a backdrop of spatially and structurally imbalanced urbanization, with 17% of the population located in urban areas. The four metro cities, namely Mumbai, Calcutta, Delhi, and Chennai, were predominant and were supported by around 3000 small and medium-sized towns. More than half of the urban population came from these small and medium-sized towns. Over a period of seventy years, several government initiatives and plans were implemented that have steered the technology revolution in India. Internationally, smart cities have acted as a driver of economic growth and boosted employment generation. The same may be expected in the Indian context, as the Smart Cities Mission will span a period of five years for the first 100 cities identified.

The major challenge faced by India is its large population. According to the report in [47], the following issues need to be addressed to accomplish the Smart Cities Mission:

● The government, citizens, and responsible bodies all have to work in coordination and respond in a responsible manner. This applies to central and state bodies, which need to function without any political influence.
● Ample opportunities are to be created for sharing ideas and experiences, which can subsequently be used to revamp the project for the better.
● Proper representation is required from all religions, all income groups, industry personnel, and working professionals for inclusive growth. Recommendations may be sought on water management, sanitation, transport, migrant rehabilitation, public safety, health care facilities, etc.
● Adequate training and knowledge sharing have to be carried out for efficient implementation of the mission. The government can take help from educational institutions for the training and education of a workforce with expertise in the technical and financial domains. This also includes sensitization of citizens to encourage their active participation.
● Civic and administrative bodies need to be empowered for law-and-order enforcement.
● Citizens are expected to behave responsibly, pay taxes, and help in generating revenue. It is also expected that the local population should make use of the employment opportunities created through infrastructure, rather than expecting the migrant population to participate, which would raise rehabilitation concerns.

In the same report, the following concerns are raised about the implementation of the Smart Cities Mission by the Indian government under the present leadership [47]:

● As per the mission, infrastructure development will take place on open land near urban land in order to cater for the smart city services. This may destroy small and medium-sized adjacent city areas with existing marginal infrastructure.
● The development of infrastructure and its components will ensure the availability of services within a restricted area of the city, while adjacent areas may develop at the usual rate. This rapid transformation in services may in turn create a divide within the population, which will face inequality in the services provided and in the progressive development happening within these areas.
● The special purpose vehicle (SPV) is given responsibility for implementing the mission, bypassing the existing local administrative workforce. This raises serious questions about delays in operations and decision making, leading to shortcomings in implementation.
● Lack of adequate funding for projects, as well as the very slow approval process, act as speed bumps in the creation of physical infrastructure.

The other critical challenges include:

● Data Security and Hackers: Internet connectivity also brings with it new security concerns. Both citizen privacy and the functioning of critical infrastructure can be targets of malicious actors, and strong security practices are therefore essential for the overall success of such projects. Blockchain technology, a much-discussed topic in the tech industry, could possibly be incorporated into such security systems to add a further layer of protection [58].
● Privacy Concerns: Collecting data to streamline and improve services also brings with it privacy concerns. Not only can such data be prone to leaks and hacks, but the idea of the government having this degree of information on the private lives of its citizens is a source of concern for many. It is very important that proper regulations are placed not only on the collection but also on the use of such data, and that individual information be anonymized wherever possible. Further, it is important to have enhanced security for personal information, especially where the loss or release of such information may have financial implications [6, 58–60].
● Educating and Engaging the Community: For a smart city to truly exist and thrive, it needs “smart” citizens who are engaged and actively taking advantage of new technologies. Adoption of new technology can often be low in undereducated and remote communities, so promoting the use of new technology through education and increased access must be a special concern [5, 30, 44, 45, 61].
● Being Socially Inclusive: It is also important that the benefits of technology percolate down to the neediest among us and are not limited to the top. A focus on niche improvements as opposed to systems that benefit the masses, such as private cars over public transit, can often lead to lopsided development. To prevent this, special focus must be given to the most underserved, marginalized, and remote communities, and the special needs of these communities must be taken into account in overall planning [5, 10, 12, 27, 47].
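One simple building block for the anonymization mentioned under privacy concerns is to replace citizen identifiers with keyed hashes before data is analysed. The sketch below uses HMAC-SHA256 from the Python standard library. The field names (`citizen_id`, `name`, `phone`) and the record layout are hypothetical, and the technique shown is strictly pseudonymization: anyone holding the secret key can recompute the mapping, and the remaining fields may still allow re-identification, so it should not be read as full anonymization.

```python
import hashlib
import hmac

# Fields treated as direct identifiers and dropped (hypothetical list)
DIRECT_IDENTIFIERS = {"name", "phone"}


def pseudonymize(record: dict, secret_key: bytes,
                 id_field: str = "citizen_id") -> dict:
    """Return a copy of record with the identifier replaced by a keyed
    hash and direct identifiers removed.

    The same identifier and key always yield the same pseudonym, so
    records belonging to one person can still be linked for analysis.
    """
    raw_id = str(record[id_field]).encode("utf-8")
    token = hmac.new(secret_key, raw_id, hashlib.sha256).hexdigest()
    cleaned = {
        key: value
        for key, value in record.items()
        if key != id_field and key not in DIRECT_IDENTIFIERS
    }
    cleaned["pseudonym"] = token
    return cleaned


# Example usage with a made-up record and key
key = b"example-secret-key"
row = {"citizen_id": 1042, "name": "A. Citizen", "phone": "000",
       "ward": 7, "water_use_l": 310}
safe_row = pseudonymize(row, key)
```

In practice the key would be stored separately from the data, and coarse-graining of fields such as location or age would be considered as well.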

11 Conclusion

In the smart city conceptual model, sustainability and citizen participation are the key aspects that make the city and community “smart”. ICT and IoT-enabled devices, applications, and services form the backbone of a smart city. When technology, connectivity, networking, governance, resource management, energy management, waste management, well-informed citizens, and IoT-enabled services are integrated to develop a smart city, there is a major impact on the overall standard of living. Only a smart community can form a “smart city”. Under the present leadership, India initiated the Smart Cities Mission in 2015 for one hundred cities. Though India as a country faces major challenges due to its large population, cultural divide, unequal population distribution, urbanization, issues with infrastructure development, sanitation, political interventions, waste management, delays in decision making, and inadequate knowledge base or skills, this mission is expected at least to drive economic and industrial growth, build infrastructure, create job opportunities, and bring technology to the doorstep. The major thrust is being given by the government through its renewable energy management projects; promotion of research in alternative energy resources; water harvesting; promotion of solar energy system manufacturing and usage; e-governance; digitization; encouragement of small entrepreneurs and start-ups through the Make-in-India initiative; and Swachh Bharat Abhiyan to improve sanitation. Under the prevailing pandemic situation, Indian health infrastructure and health care facilities have managed to cope well. If the issues of migrating population and unequal population distribution are to be resolved, India has no choice other than to implement this mission successfully. Researchers have observed some key concerns in the making of a smart city, but knowing Indians, it is not a far-fetched dream given the right policy decisions, the availability of technology, funds from international bodies such as the World Bank, and finally, most important of all, the participation of local citizens. One hopes that the Smart Cities Mission in India is able to overcome its numerous obstacles and steer the community towards improving the quality of life for all in society.

References

1. United Nations, Department of Economic and Social Affairs, Population Division: The World’s Cities in 2018. Data Booklet (ST/ESA/SER.A/417), p. 34 (2018)
2. OECD: The OECD Programme on Smart Cities and Inclusive Growth, no. July, pp. 1–59 (2019) [Online]. https://www.oecd.org/cfe/cities/smart-cities.htm
3. Thompson, E.M.: Smart city: adding to the complexity of cities: a critical reflection. In: Complex. Simplicity—Proceedings of 34th eCAADe Conference, vol. 1, no. Graham 2004, pp. 651–660 (2014) [Online]. http://papers.cumincad.org/data/works/att/ecaade2016_225.pdf
4. Caragliu, A., Del Bo, C., Nijkamp, P.: Smart cities in Europe. In: 3rd Central European Conference on Regional Science—CERS, pp. 45–59 (2009)
5. Hernàndez, C.A.: The role of citizens in smart cities and urban infrastructures. In: Solving Urban Infrastructure Problem Using Smart City Technology, no. October, pp. 213–234 (2021). https://doi.org/10.1016/b978-0-12-816816-5.00010-3
6. Kumar, R., Banga, H.K., Kaur, H.: Internet of things-supported smart city platform. IOP Conf. Ser. Mater. Sci. Eng. 955(1) (2020). https://doi.org/10.1088/1757-899X/955/1/012003
7. Kogan, N., Lee, K.J.: Exploratory research on the success factors and challenges of smart city projects. Asia Pacific J. Inf. Syst. 24(2), 141–189 (2014). https://doi.org/10.14329/apjis.2014.24.2.141
8. Hernández-Muñoz, J.M., et al.: Smart cities at the forefront of the future internet. Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics) 6656, 447–462 (2011). https://doi.org/10.1007/978-3-642-20898-0_32
9. Pourzolfaghar, Z., Helfert, M.: Taxonomy of smart elements for designing effective services. AMCIS 2017 Am. Conf. Inf. Syst. A Tradit. Innov. 2017-August, 1–10 (2017)
10. Stratigea, A.: The concept of ‘smart cities’. Towards community development? Netcom 26–3(4), 375–388 (2012). https://doi.org/10.4000/netcom.1105
11. Iqbal, A., Olariu, S.: A survey of enabling technologies for smart communities. Smart Cities 4(1), 54–77 (2020). https://doi.org/10.3390/smartcities4010004

12. Chourabi, H., et al.: Understanding smart cities: An integrative framework. In: Proceedings of Annual Hawaii International Conference on System Science, pp. 2289–2297 (2012). https://doi.org/10.1109/HICSS.2012.615
13. Dameri, R.P., Cocchia, A.: Smart city and digital city: twenty years of terminology evolution. In: X Conference of the Italian Chapter of AIS, ITAIS 2013, pp. 1–8 (2013)
14. Meijer, A.J., Gil-Garcia, J.R., Bolívar, M.P.R.: Smart city research: contextual conditions, governance models, and public value assessment. Soc. Sci. Comput. Rev. 34(6), 647–656 (2016). https://doi.org/10.1177/0894439315618890
15. Arroub, A., Zahi, B., Sabir, E., Sadik, M.: A literature review on smart cities: paradigms, opportunities and open problems. In: Proceedings of 2016 International Conference of Wireless Networks Mobile Communication WINCOM 2016 Green Communications Network, pp. 180–186 (2016). https://doi.org/10.1109/WINCOM.2016.7777211
16. Mosannenzadeh, F., Vettorato, D.: Defining smart city. A conceptual framework based on keyword analysis. TeMA J. L. Use, Mobil. Environ., no. Special, p. 998 (2014)
17. Giffinger, R., Haindl, G.: Smart Cities Ranking: an Effective Instrument for the Positioning of Cities?, pp. 703–714 (2007) [Online]. https://upcommons.upc.edu/bitstream/handle/2099/11933/05_PROCEEDINGS_M5_01_0014.pdf
18. Balakrishna, C.: Enabling technologies for smart city services and applications. In: Proceedings of 6th International Conference Next Generation Mobile Applications Services Technology NGMAST 2012, pp. 223–227 (2012). https://doi.org/10.1109/NGMAST.2012.51
19. Bibri, S.E., Krogstie, J.: Generating a vision for smart sustainable cities of the future: a scholarly backcasting approach. Eur. J. Futur. Res. 7(1), 1–20 (2019). https://doi.org/10.1186/s40309-019-0157-0
20. Syed, A.S., Sierra-Sosa, D., Kumar, A., Elmaghraby, A.: IoT in Smart Cities: A Survey of Technologies, Practices and Challenges (2021). https://doi.org/10.3390/smartcities4020024
21. Monzon, A.: Smart cities concept and challenges: Bases for the assessment of smart city projects. In: SMARTGREENS 2015—4th International Conference Smart Cities Green ICT System Proceedings, pp. IS-11–IS-21 (2015)
22. Fernandez-Anez, V., Fernández-Güell, J.M., Giffinger, R.: Smart city implementation and discourses: An integrated conceptual model. The case of Vienna. Cities 78(November), 4–16 (2018). https://doi.org/10.1016/j.cities.2017.12.004
23. Gebhardt, C.: The spatial dimension of the triple helix: the city revisited—towards a mode 3 model of innovation systems. Triple Helix 2(1), 0–4 (2015). https://doi.org/10.1186/s40604-015-0024-3
24. Dameri, R.P.: Council for innovative research. J. Adv. Chem. 10(1), 2146–2161 (2014)
25. ITU Academy: ICT role for smart sustainable cities (2018)
26. Belli, L., et al.: IoT-enabled smart sustainable cities: challenges and approaches. Smart Cities 3(3), 1039–1071 (2020). https://doi.org/10.3390/smartcities3030052
27. Mohanty, S.P., Choppali, U., Kougianos, E.: Everything you wanted to know about smart cities. IEEE Consum. Electron. Mag. 5(3), 60–70 (2016). https://doi.org/10.1109/MCE.2016.2556879
28. Park, E., del Pobil, A.P., Kwon, S.J.: The role of Internet of Things (IoT) in smart cities: Technology roadmap-oriented approaches (2018)
29. T. U. of Sihou Zhang, T.: The role of information and communication technology for smart city development in China. Tallinn University of Technology
30. Gurstein, M.: Smart cities vs. smart communities: empowering citizens not market economics. J. Commun. Inform. 10(3), 1 (2014). https://doi.org/10.15353/joci.v10i3.3438
31. Harrison, C., Donnelly, I.A.: A Theory of Smart Cities, pp. 68–70, 1377
32. Kapur, D., Sequeira, R.C.: Smart Cities in India—the role of m2m + iot
33. Li, F., et al.: Smart transmission grid: vision and framework. IEEE Trans. Smart Grid 1(2), 168–177 (2010). https://doi.org/10.1109/TSG.2010.2053726

34. Lindskog, H.: Smart communities initiatives. In: Proceedings 3rd ISOneWorld Conference, no. April, p. 16 (2004)
35. Guedes, A.L.A., Alvarenga, J.C., Goulart, M.D.S.S., y Rodriguez, M.V.R., Soares, C.A.P.: Smart cities: The main drivers for increasing the intelligence of cities. Sustain. 10(9), 1–19 (2018). https://doi.org/10.3390/su10093121
36. Wang, Y., Ram, S., Currim, F., Dantas, E., Sabóia, L.A.: A big data approach for smart transportation management on bus network. In: IEEE 2nd International Smart Cities Conference Improving Citizens Quality Life, ISC2 2016—Proceedings, pp. 0–5 (2016). https://doi.org/10.1109/ISC2.2016.7580839
37. Koubaa, A., et al.: Smart palm: An IoT framework for red palm weevil early detection. Agronomy 10(7), 1–21 (2020). https://doi.org/10.3390/agronomy10070987
38. O’Grady, M.J., Langton, D., O’Hare, G.M.P.: Edge computing: A tractable model for smart agriculture? Artif. Intell. Agric. 3, 42–51 (2019). https://doi.org/10.1016/j.aiia.2019.12.001
39. WHO (World Health Organization): mHealth New horizons for health through mobile technologies (2018)
40. Clim, A., Zota, R.D., Tinica, G.: Big data in home healthcare: a new frontier in personalized medicine. Medical emergency services and prediction of hypertension risks. Int. J. Healthc. Manag. 12(3), 241–249 (2019). https://doi.org/10.1080/20479700.2018.1548158
41. SMARTCITY: Smart Healthcare Solutions for Smart Cities, no. December (2017)
42. Jiang, A., Yuan, H., Li, D., Tian, J.: Key technologies of ubiquitous power Internet of Things-aided smart grid. J. Renew. Sustain. Energy 11(6) (2019). https://doi.org/10.1063/1.5121856
43. Lai, C.S., et al.: A review of technical standards for smart cities. Clean Technol. 2(3), 290–310 (2020). https://doi.org/10.3390/cleantechnol2030019
44. Dudzevičiūtė, G., Šimelytė, A., Liučvaitienė, A.: The application of smart cities concept for citizens of Lithuania and Sweden: comperative analysis. Indep. J. Manag. Prod. 8(4), 1433 (2017). https://doi.org/10.14807/ijmp.v8i4.659
45. Simonofski, A., Asensio, E.S., De Smedt, J., Snoeck, M.: Citizen participation in smart cities: Evaluation framework proposal. In: Proceedings of 2017 IEEE 19th Conference on Business Informatics, CBI 2017, vol. 1, no. July, pp. 227–236 (2017). https://doi.org/10.1109/CBI.2017.21
46. Aghimien, D.O., et al.: A fuzzy synthetic evaluation of the challenges of smart city development in developing countries. Smart Sustain. Built Environ. 1–25 (2020). https://doi.org/10.1108/SASBE-06-2020-0092
47. Aijaz, R.: Challenge of Making Smart Cities in India, vol. 87, no. October (2016)
48. ONU: World Urbanization Prospects, vol. 12 (2018)
49. Gates, A.Q.: Research Challenges Toward the Implementation of Smart Cities in the United States, no. December (2015) [Online]. https://cait.rutgers.edu/cait/research/research-challenges-toward-implementation-smart-cities-united-states; http://trid.trb.org/view/1360872
50. Gil-Garcia, J.R., Pardo, T.A., Nam, T.: What makes a city smart? Identifying core components and proposing an integrative and comprehensive conceptualization. Inf. Polity 20(1), 61–87 (2015). https://doi.org/10.3233/IP-150354
51. Joshi, S., Saxena, S., Godbole, T., Shreya: Developing smart cities: an integrated framework. Procedia Comput. Sci. 93(September), 902–909 (2016). https://doi.org/10.1016/j.procs.2016.07.258
52. Kumar: Smart neighborhood to enhance social sustainability and inclusive planning in smart cities. In: Theme 1 Urban Regeneration Sustainability or Theme 4 Smart City, no. April 2015, pp. 1–11 (2016)
53. Sangeeta Mangesh, A.S., Chopra, P., Saini, K.: Advances in Intelligent Systems and Computing II, pp. 513–524 (2019)
54. Aljunid, M.F., Manjaiah, D.H.: Data Management, Analytics and Innovation, vol. 808 (2019)
55. Alam, M., Porras, J.: Architecting and designing sustainable smart city services in a living lab environment. Technologies 6(4), 99 (2018). https://doi.org/10.3390/technologies6040099
56. https://smartcities.gov.in/

57. Aijaz, R.: The Smart Cities Mission in Delhi, 2015–2019: An Evaluation (2020) [Online]. https://www.orfonline.org/research/the-smart-cities-mission-in-delhi-2015-2019-an-evaluation-60071/
58. Skwarek, V.: Blockchains as security-enabler for industrial IoT-applications. Asia Pacific J. Innov. Entrep. 11(3), 301–311 (2017). https://doi.org/10.1108/apjie-12-2017-035
59. Chhaya, L., Sharma, P., Bhagwatikar, G., Kumar, A.: Wireless sensor network based smart grid communications: Cyber attacks, intrusion detection system and topology control. Electron. 6(1) (2017). https://doi.org/10.3390/electronics6010005
60. Ismagilova, E., Hughes, L., Rana, N.P., Dwivedi, Y.K.: Security, privacy and risks within smart cities: literature review and development of a smart city interaction framework. Inf. Syst. Front. (2020). https://doi.org/10.1007/s10796-020-10044-1
61. Degbelo, A., Granell, C., Trilles, S., Bhattacharya, D., Casteleyn, S., Kray, C.: Opening up smart cities: citizen-centric challenges and opportunities from GIScience. ISPRS Int. J. Geo-Inform. 5(2) (2016). https://doi.org/10.3390/ijgi5020016

5G and Other Networking Technologies for Smart Cities

A Framework for Designing Long Term Digital Preservation System

Anand Kumar Sinha, Santosh Kumar, and H. M. Singh

Abstract Nowadays, we are moving towards a smart urbanized world in which rich and easily accessible information systems can help to manage a dynamically changing urban environment. Digital data preservation plays an important role in making strategies for smart cities. A digital preservation system is composed of policies, approaches and actions with the aim of keeping digital objects authentic and accessible to users for a long period of time, irrespective of challenges like failures, disasters or attacks. This has become a key issue in recent years: owing to the continuous increase in the amount of digital assets and to advances in technology, the hardware and software required to store and access digitized information become obsolete. So, there is a need to preserve data for the long term in a way that can be easily accessed from anywhere and at any time to support strategic decisions in making cities smart. This paper first presents a method, as a research strategy, to carry out a system study and analysis of an organization’s current operational status with respect to its existing long-term preservation practices for digital data. Further, a digital repository development and digital preservation system model is proposed, which acts as a reference model for developing the digital preservation system.

Keywords Digital preservation · Digital repository · Digital archives · Preservation action · Preservation policies · Archival information system reference model

A. K. Sinha · H. M. Singh
Department of Computer Science and Information Technology, Sam Higginbottom University of Agriculture, Technology, and Sciences (SHUATS), Allahabad, India
e-mail: [email protected]

S. Kumar (B)
Department of Computer Science and Engineering, Galgotias University, Greater Noida, Uttar Pradesh 201310, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_7


1 Introduction

Over the last two hundred years, acidic pulp paper has been the standard medium across the globe for recording any kind of information (handwritten, typed or printed) for individual, institutional or organizational purposes. Paper documents were therefore retained and preserved by organizations according to the estimated longevity of the physical medium, the statutory rules and regulations of national governments, and the priorities of the business. For example, the National Archives, New Delhi, is responsible for archiving file records of the national government that are 25 years old or more, on paper or microfilm. The established record-keeping office in each government office is supposed to have a retention schedule for the various kinds of files. However, owing to the IT revolution of the last two decades, record keeping in every organization, large or small, is going through vast changes that depend on the digital landscape of the organization and the country. Record keeping also includes all preservation management activities with a future perspective, in terms of monitoring, planning, action and control functions in a continuously changing operating environment. Thus, each organization, small or big, needs to take on the tasks and responsibility of maintaining and retaining its records in digital form under its own arrangement, either with its own dedicated technical infrastructure or through the services of a specialized third party. Further, the evolution of technical solutions for long-term maintenance and retention of records in digital form, popularly known as “digital preservation”, is the need of the hour for organizations in every country across the globe, including Indian organizations [1].
As the world moves towards smart cities, rich and easily accessible information systems can help to manage the dynamically changing urban world, and digital data preservation plays an important role in making strategies for smart cities. In this paper, a method is given as a research strategy to carry out a system study and analysis of an organization’s current operational status with respect to its existing long-term preservation practices for digital data. Further, a digital repository development and digital preservation system model is proposed, which acts as a reference model for developing the digital preservation system.

2 General Assumption About Document Retention and Preservation by Organization

In the day-to-day operational life of every organization, the data content created is recorded in paper or digital document form. Depending on the nature and quality of the information generated, each organization decides on and makes arrangements to retain it for a short, medium or long term. This is a universal process of maintenance and retention of records in most organizations across the globe; of course, the percentage of documents destroyed or retained for a specified short, medium or perceived long term varies [2].

3 Literature Survey

A digital preservation system is composed of policies, approaches and actions with the aim of keeping digital objects authentic and accessible to users for a long period of time, irrespective of challenges like failures, disasters or attacks [3, 4]. For long-term preservation of digital assets, the data has to be made redundant. Redundancy can be achieved with replication or with parity techniques such as those used in RAID [5]. Erasure codes can also be used to achieve redundancy without the overhead of large storage [6]. The researchers in [7] proposed a redundancy manager integrated with iRODS, a data grid system for digital preservation [8]. In Ref. [9], the author gives a digital preservation framework for a project; this framework offers a comprehensive approach for assessing and validating preservation in digital libraries, digital repositories and data centers. Moore [2] gives an assessment workflow for digital preservation systems. In Ref. [10], the author proposes an approach for storing and accessing digital assets with the help of programs written in the language of a Universal Virtual Computer (UVC). Reference [11] presents the PLANETS Preservation Planning approach for optimally maintaining digital assets. A simulation engine can be used to design the digital preservation system and the policies for maintaining digital data; such an engine has been used to evaluate redundancy techniques for data grids [7]. Based on the survey, it has been found that preservation systems need more maturity, specifically in India. Here, a methodology for preserving data in digital form for a long duration is given. Moreover, the different organizations producing records are categorized for easy reference. Finally, a preservation system is proposed for long-term preservation.
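The difference between replication and parity-based redundancy can be illustrated with a minimal Python sketch. It is not taken from any of the surveyed systems: the function names `xor_parity` and `rebuild_missing` and the sample blocks are hypothetical, and a single XOR parity block is assumed, which tolerates the loss of exactly one block.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized blocks (one parity block)."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single lost block from the surviving blocks and the parity."""
    return xor_parity(surviving + [parity])


# Hypothetical example: three 8-byte data blocks of one archived object.
data = [b"smart-ci", b"ty-recor", b"ds-2023!"]
parity = xor_parity(data)

# Simulate the loss of the second block and recover it from the rest.
recovered = rebuild_missing([data[0], data[2]], parity)
```

In this sketch one extra block protects three data blocks, whereas keeping one replica of every block would need three extra blocks. Erasure codes generalize the same idea so that more than one lost block can be tolerated.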

4 Methodology for Digital Data Preservation System (DPS)

In our endeavour to ‘design an integrated system for long term preservation of digital data’, the method of System Analysis and Design is followed as the research strategy/methodology to achieve the defined objectives of the project. Accordingly, a path for designing the DPS, starting from existing preservation practices, is decided using the proposed digital archival system given in Fig. 1. The scope and type of tasks to be performed during the research work, in aid of developing an organizational digital data preservation system, are identified below.

Fig. 1 Long term digital preservation process

Step-1: Search and enumerate the various kinds/types of operating organizations that produce or hold digital data.

Step-2: Categorize and club them into various sets of sample organizations, and carry out a macro level study of each group of sample organizations to get an overview of their policies and practices, such as:

i. The process of digital data creation out of business transactions by an individual or a set of individuals within the organization.
ii. Statutory or non-statutory requirements of record keeping by the organization in the short (3 years), medium (10 years) or long term (beyond 25 years), as required by government regulation or by the business policies and practices of the organization.
iii. The level of general awareness and concern about preservation of created digital data within the organization as a whole or among its stakeholders (an individual or a set of individuals).
iv. The level of technical knowledge and expertise.
v. The availability and adequacy of technical infrastructure.
vi. Current preservation practices and the foreseeable gap with the prevailing standards, if any.
vii. Problems and challenges faced due to change and obsolescence in digital technology.
viii. The general Indian technological scenario in the digital preservation context.

Step-3: Accordingly, in order to gain a macro level insight into the current state of affairs with respect to long term preservation of the digital documents produced/held by various types of organizations in their normal course of operations, the “Organizational Survey” route is to be taken, using the methods mentioned below:

i. Literature survey: Self-study/desk research by the researcher, going through the respective websites as well as available printed literature, pamphlets, brochures, etc. of the various Govt./non-Govt. organizations of interest, which indicate functional overviews, policy documents, and the processes and procedures followed for document management, including basic concerns about long retention and preservation.
ii. Surveys through face-to-face interviews: After formulating a core questionnaire of common questions as well as stakeholder-specific questions regarding the task of digital preservation, personal face-to-face interviews with key persons of the organizations (with an inquisitive approach) are used to carry out the required study over a selected set of different categories of organizations. Generally, the people approached need to be directly or indirectly involved in organizational data processing and storage/retention, with a technical or administrative background (some of them may be from the higher echelons of the organization, forming part of the decision-making team).
iii. Case study: Within a set of different organizations, a couple of case studies may be performed in order to gain in-depth knowledge of the different views on digital preservation.

Step-4: After gaining a broad level of insight into the current state of affairs among a set of various categories of organizations, a specific organization needs to be selected for an in-depth study and analysis of the existing system, and a logical design of a digital data preservation system is to be proposed.

Step-5: To propose the required processes and practices for sustainable operational functioning of a digital data preservation system, which must be compatible with the organization’s vision with respect to its data preservation.

These steps are discussed in detail in the following sections.
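The short, medium and long retention terms surveyed in Step-2 can be sketched as a small Python classifier that a survey tool might use to bucket the retention period stated by an organization. This is only an illustration under stated assumptions: the text gives 3 years, 10 years and beyond 25 years as reference values, and the cut-off rule used here (up to 3 years is short, up to 10 years is medium, anything longer is long) as well as all names and sample records are hypothetical.

```python
from enum import Enum


class RetentionTerm(Enum):
    SHORT = "short"
    MEDIUM = "medium"
    LONG = "long"


# Assumed upper bounds in years; the source only names 3, 10 and
# "beyond 25" years as reference values for the three terms.
SHORT_MAX_YEARS = 3
MEDIUM_MAX_YEARS = 10


def classify_retention(years: float) -> RetentionTerm:
    """Map a stated retention period in years to a retention term."""
    if years < 0:
        raise ValueError("retention period cannot be negative")
    if years <= SHORT_MAX_YEARS:
        return RetentionTerm.SHORT
    if years <= MEDIUM_MAX_YEARS:
        return RetentionTerm.MEDIUM
    return RetentionTerm.LONG


# Hypothetical survey answers: record type -> stated retention in years.
survey_answers = {"daily transaction log": 3, "audit report": 10, "land record": 30}
schedule = {name: classify_retention(years) for name, years in survey_answers.items()}
```

Keeping the boundaries in named constants makes it easy to align the sketch with whatever retention schedule an organization actually follows.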

4.1 Existing System Study Analysis and Design

4.1.1 Study and Analysis of Various Existing Organizations

Looking at the wider canvas of organizations from an operational angle, organizations can be broadly grouped as below:

(a) National & State Level Governance: Central & state government ministries/offices and the associated departmental setup down to tehsil/village panchayat level organizations.
(b) Product Specific Manufacturing Organizations: industrial organizations in both the government and private sector engaged in value addition through production/manufacturing.
(c) Specialist Service Sector Organizations in both the government & private sector, e.g. Health care, Transportation, Aviation, Banking, Finance, Insurance, IT Services, Film & Entertainment, Marketing, Trading and other service organizations.
(d) Domain specific Educational & Academic institutions of higher learning in the government/private sector.
(e) Domain specific Research & Development organizations in both the Govt./private sector.
(f) Knowledge collector & retainer institutions in the fields of Culture, Science & Technology, History and Social Sciences in both the Govt. & private sector, e.g.
● Public & private libraries, Museums, Archives.

Viewing the operational nature of the above organizations and their user groups at macro level, they may be categorized as:

(i) The data/document creators, encompassing all of the top four classes of organization, i.e. national/state level governing bodies, manufacturing and service sector organizations, as well as domain specific educational/research & development institutions.
(ii) The data/document custodians, comprising knowledge collector & retainer institutions like archives, museums, libraries and other cultural institutions.
(iii) The data/document users, comprising auditors, regulatory bodies, investigating agencies, the general public, students and scholars.

Further, at micro level, the key stakeholders contributing to an organization’s past and present documents (in paper or digital form) can be visualized as below:

(i) Data creators: In any Govt. or business organization, a floating population of functionaries, decision makers, business owners and employees from top, middle and lower management creates documents out of day-to-day business transactions. In academic/R&D institutions, a floating population of students, scholars, teachers, researchers, scientists, publishers and (indirectly) research funding organizations creates new knowledge in the form of books, research papers, journals, study reports, etc.
(ii) Data custodians: Functionaries and professionals at various levels, known as librarians and archaeologists, at various archives, museums and libraries are responsible for the collection and retention of knowledge documents.
(iii) Data users: Functionaries at various levels, such as auditors, investigators, domain specific professionals/scholars, students, researchers and the general public, form the keen users of produced and stored documents.
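The macro-level grouping above can be written down as a simple lookup structure. The following Python sketch is only an illustration: it uses the five organization categories of Table 1 as keys (the shorthand labels are chosen for this sketch), and the names `Role`, `ORGANIZATION_ROLES` and `organizations_with_role` are hypothetical. Data/document users are people rather than organization categories, so no category maps to that role.

```python
from enum import Enum


class Role(Enum):
    CREATOR = "data/document creator"
    CUSTODIAN = "data/document custodian"
    USER = "data/document user"


# Shorthand labels for the five organization categories of Table 1.
# The first four classes create documents; the fifth only collects
# and retains them.
ORGANIZATION_ROLES = {
    "government machinery": Role.CREATOR,
    "specialist service sector": Role.CREATOR,
    "product specific manufacturing": Role.CREATOR,
    "educational/academic/R&D": Role.CREATOR,
    "library/archives/museum": Role.CUSTODIAN,
}


def organizations_with_role(role: Role) -> list[str]:
    """Return the organization categories that play the given role."""
    return sorted(name for name, r in ORGANIZATION_ROLES.items() if r is role)


creators = organizations_with_role(Role.CREATOR)
custodians = organizations_with_role(Role.CUSTODIAN)
```

A structure of this kind could be used to route survey questions, since creators and custodians are asked about different parts of the document life cycle.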

Further, for better appreciation, examples of the different classes/categories of existing organizations operating across India are summarized below.

4.2 Examples of Various Categories of Operating Organizations in India

Table 1 shows the various organizations generating voluminous data that needs to be preserved in digital form for a long duration. Broadly, these have been categorized into five parts for easy reference.

4.2.1 Sample Set of Organization for Macro Level Study

To carry out a macro level study of a sample set of organizations belonging to the five categories of document producing/holding organizations across India, the following organizations from each group were earmarked:

(a) National/State level governance: Ministry of Defence, GOI; Central Statistical Organization (CSO), GOI; National Informatics Centre (NIC).
(b) Specialist service sector organizations: LIC of India; Air HQ (VB), New Delhi; HCL, Noida.
(c) Domain specific education, academic and R&D organizations: Indian Institute of Technology, New Delhi; Delhi University; JNU; DRDO.
(d) Memory collector & retainer organizations: National Archives of India (NAI), Janpath, New Delhi; Indira Gandhi National Centre for Arts (IGNCA), Janpath, New Delhi; Central library, IIT Delhi; Central library, Delhi University, Delhi.

This organizational survey was carried out through a literature survey/desk research as well as face-to-face interviews between stakeholders and the researcher. Further, to facilitate a highly focused organizational survey, a prepared questionnaire (placed at App ‘A’) was used as an assisting tool. It was a long-drawn process but highly useful and interesting, and the experience was highly motivating for the research in progress. The broad macro level findings are summarized in the subsequent paragraphs.

Table 1 Categories of organizations generating voluminous data for preservation

S. No: 1
Types of organization: Govt. machinery
Nature of operation: Governance
Data creators/producer: Organizational employee/functionary
Arrangement of record keeping: Specialized record keeper like National/state level Archives
Examples of data/record producing organization:
Govt. sector: Ministries/Govt. offices, for Central Govt.: National Archives of India, Janpath, New Delhi
Judiciary, Govt. sector: Supreme Court of India, High Courts in all state capitals
Defense, Govt. sector: Indian Army, Indian Air Force, Indian Navy

S. No: 2
Types of organization: Specialist service sector organization
Nature of operation: Govt./private sector based organization
Data creators/producer: Organizational employee/functionary
Arrangement of record keeping: In both Govt. & private sector, internal organization-owned record keeping arrangement with archival responsibility
Examples of data/record producing organization:
Transportation, Govt. sector: Indian Railways, Air India, Shipping Corp. of India
Transportation, Pvt. sector: Jet Airways, IndiGo, SpiceJet, GoAir, state roadways
Healthcare, Govt. sector: All India Institute of Medical Sciences (AIIMS), Maulana Azad Medical College & Hospital
Healthcare, Pvt. sector: Max Super Specialty Hospitals, Fortis Hospital, Vedanta
Banking, Govt. sector: Reserve Bank of India, State Bank of India, PNB, UCO, Canara Bank, Corporation Bank, Syndicate Bank, IDBI Bank
Banking, Pvt. sector: Kotak Mahindra Bank, ICICI Bank, Yes Bank, HDFC Bank
Insurance, Govt. sector: Life Insurance Corp. of India (LIC), GIC, National Insurance, United India Insurance
Insurance, Pvt. sector: AXA Life, Max Bupa, Bajaj Allianz

S. No: 3
Types of organization: Product specific manufacturing organization
Nature of operation: Govt./private sector based organization
Data creators/producer: Organizational employee/functionary
Arrangement of record keeping: In both Govt. & private sector, internal organization-based record keeping arrangement with archival responsibility
Examples of data/record producing organization:
Govt. sector (PSUs): ONGC, Coal India, NTPC, Indian Oil, Bharat Petroleum, SAIL, BHEL, BEL, Cement Corp. of India
Pvt. sector: Tata Steel, Tata Motors, Maruti Suzuki, Hyundai, ACC, Jaypee Cements, Apollo Tyres, Ranbaxy; TCS, Infosys, Wipro, HCL, Airtel, Vodafone, Idea, Reliance

S. No: 4
Types of organization: Educational/Academic/Research & Development Organization
Nature of operation: Acquisition of existing knowledge · Imparting of existing knowledge · Creation of new knowledge
Data creators/producer: Students, Teachers, Researchers, Publishers, Research funding agency
Arrangement of record keeping: Internal arrangement of knowledge record keeping & storage through institutional library · Publishers of research papers & journals
Examples of data/record producing organization:
Govt. sector: Indian Institute of Science, Bangalore · JNU, Delhi · Delhi University · IIM, Ahmedabad · Other Central and state Universities & Institutions
Pvt. sector: ISB, Hyderabad · MDI, Gurgaon · XLRI, Jamshedpur · BMS College of Engg., Bng · VIT Vellore · BITS Pilani · BIT Mesra

S. No: 5
Types of organization: Library, Archives, Museum
Nature of operation: Document selection, collection, storage & preservation
Data creators/producer: No role in document production
Arrangement of record keeping: Document collection, storage, retention & preservation through specialized professionals
Examples of data/record producing organization:
Govt. sector: National Archives of India (NAI), New Delhi · Nehru Memorial Museum & Library, New Delhi · National Library of India, Kolkata, West Bengal, etc.
Pvt. sector: Personal collections of renowned individuals · Privately funded public libraries

S. No: 5(a)
Types of organization: Internal library attached with Academic/R&D Organization
Examples of data/record producing organization:
Govt. sector: Central library, IIT Delhi · Central library, Delhi University · Central library, JNU · Central library, AIIMS · Central library, DRDO HQ, New Delhi, etc.

4.2.2 Broad Outline of Organizations

An outline of the process of document production and retention in day-to-day operations, for the different categories of organization (data creators & custodians), is given below:

(a) National/State level Governance, Manufacturing & service sector organizations: In organizations pertaining to governance and in commercial/business organizations, including the service sector, information is created by employees/functionaries at various levels of the organizational hierarchy through business transactions. The information records produced in the form of documents are stored safely as required by statutory regulation and by the business needs of the organization concerned.
(b) Academic Institutions/Research & Development Organizations/Publishers: In academic and research & development organizations, the information/new knowledge is created by various categories of knowledge workers such as students, teachers, researchers and scientists as part of their day-to-day duties.
(c) Archives, Libraries and Museums: Typical information/knowledge collector institutions such as libraries, archives and museums in the public and private domain, including the exclusive institutional libraries maintained by academic, R&D or business organizations, are a little different from normal data producing units. As mentioned above, libraries, archives and museums are primarily knowledge collectors and data warehouses, typically operated and maintained by specialized professionals called librarians/library science professionals, who may be identified as data managers/data store keepers.

Additionally, most libraries are making efforts to digitize old analog documents as scanned documents in PDF format, apart from microfilming. A computerized library information system is also in place in a good number of libraries, for ease of search, identification and accessibility of the desired document from the stored inventory. Having given a brief outline of the four major organizational stakeholders (i.e. individuals/collective teams of functionaries, students/researchers/teachers/scholars, publishers and librarians/archaeologists), the survey observations on the current state of affairs regarding digital preservation are addressed in four focus areas:

(a) General awareness about the need for digital preservation
(b) Views on the technological reasons for, and threats to, digital documents vis-à-vis digital preservation
(c) Current state of affairs in terms of in-house processes/practices of digital preservation
(d) Future outlook and need for technical infrastructure for digital preservation.

4.2.3 Data/Document Creators in Govt./Non-Govt. Organization Including Private Sector Commercial Enterprises

The individual white collar worker, i.e. a manager, or collectively the departmental team of executives of a Govt. or non-Govt. organization (small or large, commercial or non-commercial), creates documents on a day-to-day basis through business transactions. Generally they write, type, print or electronically save the content produced as text documents or databases. During the decade 1995–2005, most of them switched from mechanical typewriters to standalone desktop computers or to a network of interconnected working terminals/PCs with central servers, using an intranet for data transfer within the organization and the internet for communication with the outside world. The work done on an individual terminal is saved on the standalone PC’s hard disk or on the central server. When past data (past work outputs) is required for present work, it is fetched from the data stored on the PC/central servers. The data processing technical infrastructure (hardware & software) is looked after by a set of specialist IT professionals from the CIO (Chief Information Officer)/data processing department. A few line managers from the Finance/Marketing/Production/HR/Purchase/Stores departments, along with their counterparts from the IT department, were asked a few questions on the issue of digital preservation. The observations are:

(a) Awareness: The majority of them, including senior managers, are not aware of the need for digital preservation. After it was explained to them in layman’s language, they conveyed that they are non-IT professionals, interested only in ‘saving’ their work on the central server and retrieving and using it the next working day. Data management over its life cycle is, in their view, the responsibility of another set of specialists from the IT department. Later on, even IT professionals were found to be quite unconcerned and indifferent to the issue.
(b) Reason: When asked about reasons for preserving organizational transaction data, the general remark was that an organization should preserve past data if it has economic value, or if it is unique data pertaining to the company whose analysis helps decision making and future growth. They were quite conversant with the current practice of big data analysis and the study of consumer behavior.
(c) Threats: They appreciate the fast pace of technological change in the IT field (both hardware & software) and are aware of the importance of sustained continuity of a working IT platform for retrieval, reuse and access of past data, since otherwise loss of old data is imminent. However, they perceive the relevant useful life of data to be about 3–5 years; a long term of 10–20 years is seen as passé.
(d) Current state of affairs: Mostly MS-Office based office documents, network based data (websites, email, chat, etc.) and files in formats such as JPEG, PDF, GIF and TIF are used for day-to-day work and for recording business transactions, apart from customized domain-specific ERP applications like SAP or applications developed in-house.

(e) Future outlook and need for an infrastructure: On the need to establish a specialized in-house technical infrastructure or to make arrangements through a third-party vendor, the majority of them left it to the wisdom of the company’s top management or the higher echelons of the Govt. organization. It was also astonishing to learn their reasoning. First, whether working in a private organization or in an office of a state/central Govt. department, they move on every 3–5 years, through a job switch or a departmental or interdepartmental transfer. Second, they feel that they are not professionals from the IT field. So, in their view, it is not worth worrying about retention/preservation of their company’s or department’s work outputs beyond 3–5 years or so; if it is needed in future, the required arrangements will be taken care of by the organization with the help of technologists. This indifferent outlook was also found to be quite prevalent among bank managers at retail branches, who are now quite conversant with automated core banking applications.

4.2.4 Knowledge Creations by Scholars Researchers (Student/Teacher/Scientist) at Academic/R&D Organizations

Research is the vocation of creating new knowledge, i.e. something out of the blue, something not known so far in the chosen field of work. Research involves many actors: primarily the individual or group of individuals pursuing a particular area of interest, the guru (the guide/mentor), and the technical facilities made available by the institution. The research output is the final outcome, obtained by creating new findings through the use/reuse of existing knowledge and data. As a knowledge creator, the research scholar is supposed to manage his/her project data, make it available to others in a commonly understandable language, compile it in a standard data reporting format and share the output through publication with the help of publishers. Further, as a user of existing research data, the scholar is supposed to honor the referred authors/creators through citations/references and to adhere to copyright and other license/ownership related restrictions and rules of use when deriving from or using previous data. The following facts were revealed after interacting in person with a few of them (PG students, Ph.D. scholars and faculty members of eminent technology and non-technology academic institutions, and scientists from scientific research & development laboratories of the GOI).

(a) Awareness: Because they regularly refer to existing knowledge in their area of interest, through paper documents such as papers, articles and journals as well as online e-journals and books made available by institutional libraries and archives, the community of researchers is quite aware of and concerned about the importance and practice of long-term retention of knowledge resources. However, the need for preservation of digital documents produced since the 1990s was a little disruptive and head-scratching for them. The majority were astonished, as if this particular aspect of the survival of digital documents had not struck them so far. Generally, digital preservation is mixed up with the recent trends of digitization of analogue outputs, digital libraries, open access, e-learning/online courses, etc., and the general understanding is a little vague or absent altogether.

(b) Reason: Everybody agreed that knowledge created in the form of digital documents, like its counterpart in paper form, needs to be preserved for its obvious usefulness: it facilitates the advancement of further research, analysis/reanalysis, modification/upgrading and validation of old work in future; a particular piece of knowledge may be unique and have high economic value; and it may act as a stimulus for future intra- or interdisciplinary research work.

(c) Threat: Based on individual experience of working with hardware and software tools and platforms such as PCs, laptops and high-tech scientific devices over the past 10–15 years (i.e. since the era of Windows 95/98, Office and high-level programming languages), by and large everybody agreed that the biggest threat to the long-term survival of currently produced digital data, for access, retrieval, use and reuse, is the fast pace of change in digital technology. Further, there is a realization that hardware and software technologies are playing a game of one-upmanship, chasing each other in their forward motion. Hence the lack of sustainable hardware/software and of a supported computing environment will make the knowledge thus created inaccessible, adversely affecting the basic ecosystem of knowledge creation and dissemination.

(d) Current state of affairs: Keeping an eye on researchers’ day-to-day work practices brought out the following facts, which may be directly or indirectly relevant to their research work:

(i) In pursuit of their research work, researchers generally work with office documents (Microsoft Office/open source applications), network based data (websites, e-mail, exchange of data, chat, database tools), files in formats such as JPEG, PDF and GIF, specialized software, programming language compilers, and scientific tools & applications.
(ii) Researchers in the scientific, engineering and technology domains create huge amounts of source code, raw data, databases and software applications. The majority of them were unable to quantify the amount of data generated by their regular work; a wild guess was 1 GB to 1 TB for a project of short to long duration.
(iii) Across institutions, researchers store their working project/research data on personal computers allotted to them by the department, on the departmental intranet, on personal laptops or home PCs, or on portable external hard disks. The practice of depositing data in an institutional archive is non-existent. Only the final research output, such as a Ph.D. thesis or published papers, is submitted to the designated coordinator in hard copy and in soft copy on CDs or online.
(iv) The practice of sharing one’s own data is limited to the personal guide or to project group members working closely together. Otherwise, a sense of apprehension/reservation about openly sharing data prevails, and researchers are quite secretive towards others. There may be a sense of insecurity, a competitive feeling, or a fear of misuse or of fellow colleagues from within or outside the institution unethically stealing the thunder. By and large, there is distrust in sharing research-in-progress data with any institutional coordinator or central organizational repository, if one exists. In other words, it can safely be said that, as authors of the research outcome, they want some control over their documents/research data out of concern for safety, security and misuse. That is why only the formally frozen write-up with the final results is submitted to the publisher of a journal, without the source code and the extensive supporting data/raw details.
(v) Most existing international journals of repute are accessed online through institutional subscriptions for reference to existing knowledge, and so far researchers have not faced any problem accessing even the older ones.

(e) Future outlook and need for infrastructure: With the growing economy of a country like India, there is already a big push for better quality higher education and a bigger thrust on research and innovation, so the quantum of research data will obviously grow substantially in the years to come. Against this background, the majority of researchers agreed on establishing the necessary technical infrastructure to safeguard and extend the longevity of Indian research outcomes in digital form. There is a need for a specialized technical setup for preserving research data for long-term storage and accessibility, perhaps in the shape of a standalone, domain-specific dedicated repository at the level of a small/medium/large organization, or in a distributed format for a pool of institutions/organizations, to be established, managed and maintained by a set of trained specialist professionals. The UGC’s ‘Shodh Ganga’ project is appreciated as a first step toward centralized collection and retention of academic research outcomes. A second interesting input from the researcher community is that all of them favour governmental (national/state) efforts to establish digital repositories/archives for the collection, retention and archiving of the country’s research outcomes, rather than initiatives by private enterprise or even a Public Private Partnership mode. The general feeling is that, in India, private enterprises are not born to live for decades or centuries, as their longevity is market driven, with a short-term vision and an eye on profit generation and fast accumulation of wealth. So it would be better if the private sector is kept aside for a task of such strategic nature.

4.2.5 Publishers

Publishers of academic books and journals play an important role in disseminating knowledge to the user community. In the process of knowledge creation through research, the researcher community, funded by a funding agency or engaged with an academic/R&D organization, creates new knowledge or analyzes/updates existing knowledge. This research output is provided to a publisher as the manuscript of an academic book or research paper, which the publisher publishes. The new information then finds its place in libraries and archives after due commercial transaction, and finally becomes readily available to the interested user community. This is the dynamics of the ecosystem from knowledge creation to knowledge dissemination and consumption. Publishers play a key role in this supply chain and are thus one of the major stakeholders in the dissemination of research output, in both qualitative and quantitative terms. Additionally, without going into much detail, publishers are responsible for protecting authors’ intellectual property rights and for following all norms and restrictions under copyright rules, while managing a viable revenue model in the business of knowledge dissemination.

As regards the roles and responsibilities of publishers with respect to the preservation of the digital documents they produce, the experience of interacting with Indian publishers did not turn out to be encouraging. Accordingly, the STM Report [12] of the International Association of Scientific, Technical & Medical Publishers, UK, was consulted to appreciate the status of digital preservation even at international publishers. The following facts are quoted as excerpts from the said report [13]:

(a) The STM book market (worth about $3.3 billion annually) is moving very fast towards digital publishing. E-books covered around a third of the market in 2016.

(b) There are around 10,000 journal publishers globally, of which 5000 are included in the Scopus database. About 650 publishers publish around 11,550 journals, which is about 50% of the total journals. Of these, some 480 publishers (73%) and about 2300 journals (20%) are not-for-profit (2018 Report).

(c) There were around 33,100 English-language journals in mid-2018 (plus a further 9400 non-English-language journals), publishing around 3 million articles a year. The numbers of articles and journals have been growing by about 3% and 3.5% per year respectively. This growth has increased to 4% per year for articles and over 5% for journals in recent years.

(d) Publishing is done through both formal elements (e.g. journal articles, books) and informal ones (conference presentations, pre-prints). Besides this, there are two main agencies in the scholarly communication supply chain:
● Publishers, who are responsible for quality control, production and distribution.
● Librarians, who are responsible for access and navigation to the content, and for its long-term preservation.

(e) Books & e-books: Electronic books are also offered by STM publishers on the same electronic platforms as their journals. Academic libraries are also developing e-book collections according to their teaching interests. According to one survey (Sharp & Thompson 2009), the main reasons for this are:
● Readers’ convenience (off-campus access)
● A strategic move to electronic access
● Avoiding multiple copies
● Easy accessibility
● Reduced pressure on physical space

(f) Current status with respect to digital preservation at the publisher’s end: Earlier, long-term preservation was mainly the responsibility of librarians rather than publishers. The fundamental issue is that the problems of long-term digital preservation are not yet resolved. Another important practical issue is that most electronic journals are accessed from the publisher’s server: the subscribing library does not itself possess a copy to preserve, and it cannot rely on the publisher necessarily being in existence at an arbitrary date in the distant future. Thus, in the absence of a proven solution for long-term preservation, and lacking ownership of the articles of a journal or of an e-book, most libraries subscribe to print as well as electronic versions of the same title instead of opting for an electronic-only subscription. Meanwhile, most publishers resort to the interim, makeshift arrangement of regularly transferring data to new media as old media become obsolete, and of keeping the relevant hardware and operating systems available. Publishers are also turning to third-party digital archiving services at institutions in the EU, USA and Australia where research programmes addressing the technical issues of digital preservation are under way as a concerted effort of the national government, for instance at the National Library of the Netherlands, the Digital Curation Centre and the British Library in the UK, and the National Library of Australia. Many major publishers, including Elsevier, Springer, Blackwell, UUP and Sage, have volunteered to use the services of the National Library of the Netherlands for the preservation of their publications.

4.2.6 Librarians/Archivists

The knowledge data warehouses/stores cover government and non-government, profit and non-profit traditional memory institutions such as archives, libraries and museums, including research and development organizations engaged in the field of document preservation technology. Data archives are centres of excellence for data acquisition, dissemination, promotion and preservation at national or international level, and include discipline-specific institutions. They collect data, make them accessible to the user community and preserve them for use in the distant future. Institutions such as the ‘National Archives of India’ (NAI), New Delhi, and the Indira Gandhi National Centre for the Arts (IGNCA), New Delhi, are archives of national importance operated by the Govt. of India.

Libraries are also memory institutions like archives and have similar operational functions, but their collections have a broader focus, encompassing all facets of human society and its culture. Nevertheless, many exclusively run institutional libraries are domain specific, covering for example science and technology, medicine and health care, or pharmacy. Libraries also collect all kinds of information, make them accessible to users and preserve them for the long term. Nowadays, in modern libraries, such collections include born-digital and digitized documents in addition to information on paper and other physical media.

Archives and libraries are run and managed by trained specialist professionals called archivists and librarians respectively, who may broadly be designated as Data Store Managers. In the course of their day-to-day functions they are responsible for the following:

(a) Search, identification, verification and selection of published documents.
(b) Procurement of identified suitable materials based on budgetary provisions and the general demands of their user base.
(c) Providing facilities for easy access, including support for re-use.
(d) Protecting the rights of authors by adherence to copyright and IPR liabilities.
(e) Performing preservation-related activities and training of personnel.

The salient observations on the issue of digital preservation, gathered while interacting with such specialist professionals from some of the prominent institutions, are given below:

(a) Awareness: For this community, preservation of stored knowledge materials is considered one of the major responsibilities of the profession. Hence awareness of the reasons for and importance of preserving paper-based documents was generally quite high, and the professionals were conversant with it and well qualified. However, awareness of the need to preserve digital documents was limited to a few senior and experienced staff. Among lower and middle level staff there is a general feeling of happiness that, with the procurement and stocking of digital documents such as e-books and e-journals and the ongoing digitization of existing analogue holdings, they will finally be rid of traditional preservation activities, which are considered cumbersome and painstaking. Per se, general awareness of the need for digital document preservation is directly proportional to the percentage and age of the born-digital and digitized holdings of the institution. At present most digital materials held are vendor/publisher-supplied e-books, e-journals and book CDs. Most libraries have of late started digitizing selected precious analogue documents, and some archives have started producing multimedia (audio, video, still images, graphs) audiovisual DVDs/CDs on cultural heritage of national importance. During these ongoing processes, some institutions are finding it difficult to access in-house born-digital products held on CDs/magnetic spools based on 10–15-year-old technology. So it may safely be concluded that general awareness of the issue of digital preservation is quite low among the majority of archivists, librarians and staff in most of the prominent institutions in the country.

(b) Reason: The majority agreed that knowledge documents in digital form that have permanent intellectual value are useful for the betterment of humanity and need to be preserved for the distant future, like their counterparts in paper form. Their usefulness in future lies in acting as a stimulus for the advancement of research, in the analysis and validation of old work, in further upgradation/modification, and in the uniqueness of the content.

(c) Threat: Once the limitations and basic nature of fast-changing digital technology were explained in layman’s language, with live examples from the ongoing digital activities on their premises, there was a clear realization of the impending threat to the existing digital inventory over the next 10–15 years. Lack of continuity in a sustainable hardware- and software-based computing environment is considered the vital threat to information accessibility and dissemination, which are the primary functions of the archiving and library science professions. However, after realizing the inevitable threat to technology-dependent digital documents, there is a feeling of helplessness in the absence of the requisite knowledge of the tools, techniques and concepts of information technology, and an obvious sense of dependence on IT specialists to take on this difficult task. There was even a murmur about the need for training/re-training in this vital technological area, as such incompetency is foreseen as a threat to the survival of their profession.

(d) Current state of affairs: Initiatives on digital preservation in the major archives and libraries are not yet in sight. Interaction during the survey of various libraries (mostly government-funded public and institutional ones) revealed the following:

(i) So far they have been traditional institutions holding paper or other physical-medium documents, and they generally lack the necessary budgetary support to create the required technical infrastructure and the temperature-controlled, dust-free environment needed to run even a reasonably sized digital library.

(ii) Within the limited resources available, the administrative focus is on dividing the allotted funds between capital expenditure on digital equipment and regular revenue expenditure on subscriptions to digital documents such as e-journals and e-books, as well as on consumables for the digitization of selected analogue holdings. As part of this modernization and digitalization effort, it can safely be claimed that the majority of institutions are making an all-out effort to run a digital library and have a project in hand for the digitalization of old records.

(iii) Nobody at the moment is thinking of the gigantic project of digital preservation, which, I suppose, need not be taken as a sweeping statement. Nevertheless, with the support of the Govt. of India, a couple of highly responsible and prominent institutions have initiated a preservation project, whose status is said to be confidential.

(e) Future outlook and need for infrastructure: The majority of archives/libraries managed by highly reputed professionals think that their institutions are at present not prepared for the future task of digital preservation. However, they do feel that it may become a pressing need in future. They find even their present technical infrastructure inadequate for operating a good quality digital library and supporting ongoing digitization projects. Scaling up the technical infrastructure, along with the necessary technical staff support, may be considered a dream to be fulfilled in the distant future. However, they all agree that a central/national-level digital repository, or a pool of repositories, can be established with strong government support. Also, because of the disruption of the preservation process by advancing digital technology, many of them foresee a shift in the conventional roles and responsibilities of the custodians of archives and libraries.

4.3 Questionnaire Prepared for Survey in Assistance of Carrying out Face to Face Interview Sessions with Various Organizational Stakeholders (Organizational Functionaries/Faculty/PG Students/Research Scholars/Scientific Officers/Librarians/Archivists/Publishers)

A. General perceptions and existing practices of data creators (individual/team) in respect of preservation of the data they produce in their daily vocation

1. What do you mean by ‘Preservation’?
2. What do you understand by ‘Digital Preservation’?
3. Where do you store your research data/documents?
4. Are you concerned about the preservation of the data you produce through the daily work practices of your research effort?
5. If yes, what kind of data do you want to be preserved?
6. Where do you preserve your data, and what volume of data do you normally preserve?
7. How much data do you expect to generate in the short/medium term, say 5 years?
8. Do you share your data that have an intellectual or commercial bearing, now or in future? If yes, with whom do you prefer to share your data?
9. Do you have any organizational support for the preservation of the data you create in day-to-day work practice or in your project at hand?
10. Are there any policies, rules or regulations in respect of the preservation of data?
11. What safety measures are applied to saved data against natural/environmental calamities or technological obsolescence?
12. Is a specialized team of personnel dedicated to the preservation of individual or organizational data available?

B. Reasons for preserving digital data (born digital and digitized)

1. What kind of digital data are you creating/producing, storing or publishing?
(i) Academic research output.
(ii) Day-to-day raw data produced by daily operational transactions/statistics.
(iii) Well-analyzed useful information from transactional data, with future value as a decision support system.
(iv) Statutory and mandatory reports for regulatory authorities.
(v) Digital data of entertainment and commercial value in the form of audio, video, picture, text or multimedia files.
(vi) Textual/numerical/graph-based documents inferred from statistical surveys/research of national/social importance.

2. Kindly give your opinion on the following issues:
(i) Is it necessary to properly preserve the results of publicly funded research, as they are public property?
(ii) Is it necessary to properly preserve the results of privately funded research, as they are private property of commercial value, e.g. the drug development research outcomes of a pharmaceutical company?
(iii) Since new research is built on existing knowledge, will the preservation of ongoing research outputs have future value as a stimulus for the advancement of knowledge in a particular field of science/technology/humanities/social sciences etc.?
● Availability of and access to preserved existing data will support:
(i) Verification and validation activities
(ii) Future enquiry and analysis
(iii) Future research of an interdisciplinary nature
(iv) Creation of potential economic value due to its uniqueness.

C. Threat perception to existing digital data in the absence of long-term preservation

Kindly respond to the probable potential threats identified below:

(i) Due to fast-changing computing hardware and software technologies, the existing information may not be accessible to the user community in the absence of a sustainable supporting computing environment, i.e. because of obsolescence of technology.
(ii) The semantics, format or algorithms of existing data files may not be understandable or usable by the user community.
(iii) The ability to locate the data will be lost.
(iv) The origin and authenticity of the original data cannot be confirmed in the absence of contextual evidence.
(v) The present custodian institution of the existing data may cease to exist in the distant future for unknown reasons.
(vi) A change of hands among custodians of digital holdings may cost the trust and faith of the user community.

D. Requirement of Specialized Technical Infrastructure

(a) Do you have any idea/knowledge of the existence of any digital archive? If yes, are there enough of them?

(b) Do you feel or believe there is a need for some kind of specialized/domain-specific technical infrastructure at organizational/national/state level to cater to the long-term preservation requirements of digital data?
(c) If so, would you like such specialized infrastructure/services to be organized by the national government, or by the private sector/third-party vendors running on commercial terms?
(d) Would you favour building a central digital archive/repository, or would you prefer a domain-specific network of such institutions at suitable places in the country/state, interconnected with each other and storing the data of user organizations redundantly?
(e) Running any such infrastructure or service would be an expensive proposition, requiring a large amount of initial capital as well as working capital on a continuous basis over time. In the context of funding, what kind of initiative would you like to see for a seamless long-term service?

E. Librarians/Archivists

(a) Do the memory institutions have policies and procedures to determine what kind of data is to be accepted for storage/preservation? Secondly, when and how does data need to be submitted?
(b) Precisely, does any kind of data selection activity take place at the time of submission by the data creator community, or is it like the internet, where everyone is free to dump anything and everything for posterity?
(c) If a submission policy does exist, what is the liability arrangement in the event of loss, modification, deletion or damage of the original stored data?
(d) Are data security issues taken care of by a technical infrastructure that protects stored data from unauthorized access and misuse?
(e) Are the archive/library staff prepared and technically qualified for the future requirements of digital preservation? Are they conversant with the IT-based tools and techniques of digital preservation?
(f) As a publisher, how many academic books have you published and sold nationally and internationally? If you also publish journals, how many print and e-journals do you represent?
(g) Do they consider the long-term preservation of published books/journals/articles an important responsibility of theirs, and do they follow some kind of preservation strategy and policy as part of business ethics?
(h) If not, why not, and what should be the arrangement for creating and maintaining a centralized technical infrastructure at national/state level, or a network of domain-specific repositories, for academic/research outputs?
(i) What kind of financing/funding arrangements should there be, and what level of role and responsibility would they like to take in the preservation of their publications?
(j) Will the current practice of institutional subscription for recently published articles and open access for old articles/old volumes of journals remain the standard model of user interaction in future, or is it going to take a new line?
(k) Can you give an idea of how many peer-reviewed journals are published worldwide and how many publishers are involved?
(l) Do you have any system for categorizing publishers as large, medium and small based on their annual turnover of books and journals?
(m) What kind of digital data/range of formats do authors submit to a publisher?
(n) Being a parent Govt. agency for various kinds of academic/scientific research/other core individual sectoral field organizations.
(o) Since the publishing of books and journals is operated on a for-profit commercial/business model by the private sector around the world, what is the future revenue model as far as user institutions/organizations and user communities are concerned?
(p) Do you have any apprehension that the people whom you trust to look after your data may not fulfil the natural confidence criteria?

4.4 Select the Organization Needs

The organizational needs of the target organization must be understood before the detailed design of the digital preservation system. These needs may include storage methods, accessibility options, security plans, organizational policies, data encryption methods, backup criteria, and the categorization of data assets as important or less important.
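As a rough illustration, the needs listed above could be captured in a simple record before design begins. The sketch below is only one possible shape: the class name `OrganizationalNeeds`, its fields and the `missing_items` helper are hypothetical and are not part of the proposed framework.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OrganizationalNeeds:
    """Hypothetical record of the needs gathered from a target organization."""
    storage_methods: List[str]
    access_options: List[str]
    encryption_method: str
    backup_interval_days: int
    important_assets: List[str] = field(default_factory=list)
    less_important_assets: List[str] = field(default_factory=list)

    def missing_items(self) -> List[str]:
        """Return the names of mandatory needs that are still unspecified."""
        missing = []
        if not self.storage_methods:
            missing.append("storage_methods")
        if not self.access_options:
            missing.append("access_options")
        if not self.encryption_method:
            missing.append("encryption_method")
        if self.backup_interval_days <= 0:
            missing.append("backup_interval_days")
        return missing
```

A designer could call `missing_items()` after each interview round to see which needs still have to be clarified with the organization.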

5 The Open Archival Information System Reference Model (OAIS)

OAIS is the basic reference model for developing a digital preservation system. The Submission Information Package (SIP) carries the technical information and includes the procedures for handling confidential and sensitive data. A Dissemination Information Package (DIP) is derived from one or more Archival Information Packages (AIPs). AIPs contain the content information and the associated Preservation Description Information (PDI), which is maintained inside the digital preservation system. The DIP delivers the digital information to consumers in response to their queries, with the help of the access control system. OAIS also has preservation planning and administration modules, and the whole OAIS system is controlled by the management module [14, 15]. The OAIS model is shown in Fig. 2.
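The flow from SIP to AIP to DIP described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the OAIS specification itself: the class fields, the `ingest` and `disseminate` functions, and the use of a SHA-256 digest as fixity information are all choices made for this example.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class SIP:
    """Submission Information Package as received from a producer."""
    producer: str
    content: bytes
    metadata: Dict[str, str]


@dataclass
class AIP:
    """Archival Information Package: content plus its PDI."""
    content: bytes
    pdi: Dict[str, str]  # Preservation Description Information


@dataclass
class DIP:
    """Dissemination Information Package built from one or more AIPs."""
    contents: List[bytes]
    descriptions: List[Dict[str, str]]


def ingest(sip: SIP) -> AIP:
    """Turn a SIP into an AIP, recording provenance and fixity in the PDI."""
    pdi = dict(sip.metadata)
    pdi["provenance"] = sip.producer
    pdi["fixity_sha256"] = hashlib.sha256(sip.content).hexdigest()
    return AIP(content=sip.content, pdi=pdi)


def disseminate(aips: List[AIP], user: str, allowed_users: Set[str]) -> DIP:
    """Build a DIP for a consumer after a (hypothetical) access control check."""
    if user not in allowed_users:
        raise PermissionError(f"user {user!r} is not allowed to access the archive")
    return DIP(
        contents=[aip.content for aip in aips],
        descriptions=[aip.pdi for aip in aips],
    )
```

Keeping the PDI next to the content in each AIP means that a DIP can always carry the provenance and fixity details a consumer needs to judge authenticity.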


Fig. 2 OAIS reference model for DPS

6 Digital Repository Development

The digital repository development process is shown in Fig. 3. Initially we have archives and digital repositories. Based on these, policies are created and the basic architecture of the preservation system is developed. Using this architecture, a prototype model is prepared; after this prototype has been modified, the detailed design of the system is developed and implemented. The model is tested for transfer, backup and recovery of digital objects. Enhancements, if any, are made later.

The proposed digital preservation system model is shown in Fig. 4. In this model, the producers of digital objects are digital archives and digital data generators. These digital data generators could be writers, publishers or any other entities generating digital data. The archival storage system consists of a document storage system and an access control system. The document storage system contains digital data in the form of databases or flat files. With the help of the access control system, digital objects are imported into the archival storage system from archives and catalogues or from independent computers. Record custodians use the archival storage system through the available administrative services. Consumers can access the archival storage system through the available preservation services (Fig. 5).
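One simple way to test transfer, backup and recovery of digital objects, as mentioned above, is to compare checksums before and after each copy. The sketch below assumes file-based objects and SHA-256 digests; the function names and the `backup`/`restored` directory layout are hypothetical and only illustrate the idea.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, target: Path) -> bool:
    """Copy a digital object and confirm the copy is bit-identical."""
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return sha256_of(source) == sha256_of(target)


def round_trip_ok(original: Path, workdir: Path) -> bool:
    """Back up an object, restore it from the backup, and check both steps."""
    backup = workdir / "backup" / original.name
    restored = workdir / "restored" / original.name
    return copy_and_verify(original, backup) and copy_and_verify(backup, restored)
```

A matching checksum only shows that the bits survived the copy; it does not show that the format will remain readable, which is the separate obsolescence threat discussed in the survey.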

Fig. 3 PDI

Fig. 4 Digital repository development

Fig. 5 Proposed DPS (Digital Preservation System) model


7 Conclusion and Future Scope

This paper presents a framework for designing a system for long-term digital data preservation. An exhaustive approach to designing an integrated system for long-term preservation follows a research strategy to achieve the defined objectives of the project. A macro-level study and analysis of a set of sample organizations is required to understand the policies and practices they use. At the micro level, the key contributors and stakeholders for the organization’s past and present documents, in paper or digital form, are identified for designing the target system. A methodology for preserving data in digital form for a long duration has been given. Moreover, the different organizations producing records have been categorized for easy reference. Finally, a digital repository development process and a digital preservation system model have been proposed. The proposed framework and model provide guidelines that are helpful in designing a digital preservation system. The future scope of the proposed model is to test it on real data and on a large number of organizations.

References

1. Sinha, A.K., Kumar, S., Singh, H.M.: Risk management based approach for long-term digital preservation. Int. J. Sci. Technol. Res. 9(1) (2020)
2. Moore, R.: Towards a theory of digital preservation. Int. J. Digit. Curation 1(3), 63–75 (2008)
3. Barateiro, J., Antunes, G., Cabral, M., Borbinha, J., Rodrigues, R.: Using a grid for digital preservation. In: Buchanan, G., Masoodian, M., Cunningham, S.J. (eds.) Digital Libraries: Universal and Ubiquitous Access to Information. ICADL 2008. Lecture Notes in Computer Science, vol. 5362. Springer, Berlin, Heidelberg (2008). https://doi.org/10.1007/978-3-540-89533-6_23
4. Baker, M.G., Shah, M., Rosenthal, D., Roussopoulos, M., Maniatis, P., Giuli, T., Bungale, P.: A fresh look at the reliability of long-term digital storage. In: Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006 (2006)
5. Chen, P., Lee, E., Gibson, G.A., Katz, R., Patterson, D.: RAID: high-performance, reliable secondary storage. ACM Comput. Surv. 26, 145–185 (1994)
6. Weatherspoon, H., Kubiatowicz, J.: Erasure Coding Vs. Replication: A Quantitative Comparison. IPTPS (2002)
7. Antunes, G., Barateiro, J., Cabral, M., Borbinha, J., Rodrigues, R.: Preserving digital data in heterogeneous environments. JCDL ’09 (2009)
8. Hedges, M., Hasan, A., Blanke, T.: Management and preservation of research data with iRODS. CIMS ’07 (2007)
9. Innocenti, P., Ross, S., Maceviciute, E., Wilson, T., Ludwig, J., Pempe, W.: Assessing digital preservation frameworks: the approach of the SHAMAN project. MEDES (2009)
10. Lorie, R.: A methodology and system for preserving digital data. JCDL ’02 (2002)
11. Strodl, S., Becker, C., Neumayer, R., Rauber, A.: How to choose a digital preservation strategy: evaluating a preservation planning procedure. JCDL ’07 (2007)
12. Johnson, R., Watkinson, A., Mabe, M.: The STM Report. An Overview of Scientific and Scholarly Publishing, 5th edn. (2018). https://www.stm-assoc.org/2018_10_04_STM_Report_2018.pdf
13. https://www.stm-assoc.org/
14. Hockx-Yu, H.: Digital preservation in the context of institutional repositories. Program 40, 232–243 (2006)
15. Gladney, H.: Long-term preservation of digital records: trustworthy digital objects. Am. Archivist 72, 401–435 (2009)
16. Mukul, K.S.: Digital repository on cloud infrastructure: issues & challenges. In: APA/CDAC International Conference on Digital Preservation and Development of Trusted Digital Repositories (2014)

Towards Sustainable Smart Cities: The Use of the ViaPPS as Road Monitoring System

Henri Giudici, Boris Mocialov, and Aslak Myklatun

Abstract Smart cities are an opportunity to address the concerns raised by the rapid growth of densely populated urban areas. Using ICT, smart cities make urban areas greener and more sustainable while increasing their competitiveness and economic growth. Road networks play a significant role in improving the sustainability of smart cities. Indeed, deteriorated roads cause restricted mobility, traffic congestion, CO2 emissions and economic damage to cities and their citizens. To prevent these negative effects, the road network has to be continuously maintained. Satisfactory road maintenance relies on continuous monitoring of the road network, which can be facilitated by deploying ICT. This article presents the role of ICT in road monitoring and showcases the ViaPPS, a mobile Pavement Profiling System. The ViaPPS provides accurate and detailed geo-referenced information on the state of the road and the corresponding road furniture by deploying LiDAR and computer vision techniques. After presenting the capabilities of the ViaPPS, this article discusses the strategy of the smart city of Oslo towards its sustainable development goals and the role of ICT in road maintenance.

Keywords Mobile pavement profiling technologies · LiDAR

H. Giudici (B) · B. Mocialov · A. Myklatun
ViaTech AS, Dyrmyrgata 35, 3611 Kongsberg, Norway
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_8


H. Giudici et al.

1 Introduction
According to [83], approximately half of the world’s population lives in cities today, and this share is expected to reach 60% by 2030. This population growth is reflected in a higher number of densely populated urban areas whose services and liveability are substantially challenged [61]. Such growth can increase transport demand; as a consequence, a considerable volume of public and private transport uses the road network, which deteriorates quickly and creates negative effects for the city and its inhabitants [17, 61]. Cities with deteriorated road networks are prone to traffic congestion [16]. While delays in the transportation of goods negatively affect the economic growth of the city, higher traffic congestion increases CO2 emissions and lowers the accessibility of city services for citizens [16, 17]. This puts a strain on decision makers, who strive for the sustainable development goals [46] and often find opportunities to reach them in smart cities. “Smart city” is an emerging concept that lacks a clear definition. The emphasis in the literature varies from improving liveability on one side [19] to optimising the efficiency of services on the other [10, 34]. The role of data and sensors as a source of knowledge for smart cities has been recognised in previous studies [10, 36, 92]. Satisfactory road maintenance relies, among other tasks, on effective road monitoring. The fundamental task of road monitoring is to detect and assess road defects before they develop into major distress. However, inspection of the road network is an expensive and time-consuming activity, which practitioners have to conduct continuously, accurately and in detail. ICT solutions can facilitate road monitoring and reduce the associated cost.
Due to the importance of the monitoring task and the related budget concerns, different ICT solutions are available nowadays. Among them, Light Detection and Ranging (LiDAR) and Computer Vision (CV) are recognised as the most prominent technologies for future innovation in many sectors, including road transport. The information from LiDAR and CV camera-based sensors, if properly handled and managed, can improve the performance of the monitoring task and, at the same time, reduce the associated costs. However, despite their high potential, monitoring solutions are usually based on either LiDAR or CV, while very few adopt both at the same time. Understanding the existing solutions, and how information from multiple sensors can be used together to perform road inventories, is a priority that needs to be addressed. This chapter presents the state of the art in mobile sensing technologies for pavement inspection and presents as a case study the ViaPPS (Pavement Profiling System) [31], a mobile multi-sensor system based on LiDAR and CV technologies. After describing the ViaPPS and its data processing, the chapter proposes a definition for the deployment of ICT in road maintenance. Finally, it discusses the smart city strategy of Oslo and its ICT deployment towards its sustainable development goals.


2 Sensing Technologies in On-Road Object Inventory
Nowadays, ICT is at the heart of smart cities, with Intelligent Transport Systems (ITS) ensuring more effective and efficient connectivity between road assets and asset stakeholders in the form of safer mobility. Losurdo et al. [54] list the requirements for an ITS as (a) traffic monitoring, (b) a Closed-Circuit Television (CCTV) system, (c) monitoring of weather conditions, and (d) monitoring of road pavement conditions. This chapter focuses on monitoring the on-road object inventory, which includes categories such as pavement defects (e.g. cracks, potholes, etc.), pavement markings, edges, manholes and road furniture (e.g. light poles, etc.). Road surface condition is directly linked to traffic safety and efficiency [66, 82]. Kaare [40] lists categories of road monitoring, identifying intrusive, non-intrusive, off-road, and on-demand techniques to monitor road deterioration, each of which requires a different set of equipment. On the one hand, traffic volume influences the road condition and can be measured with force sensors embedded in the road [51] as well as with passive [64, 89] or active [29, 91] imaging sensors. Besides the low cost of passive imaging sensors, many cities have already installed cameras, which makes data acquisition easier [2]. On the other hand, active imaging can be used to analyse the pavement condition under the surface directly, while profilometers as well as active imaging can give insights into the road surface [81]. In response to various road maintenance standards, multiple systems for pavement evaluation have been developed over time [7, 38, 45, 80]. These mobile systems are usually equipped with a range of state-of-the-art sensors, such as a Global Positioning System (GPS) receiver, multiple high-resolution cameras, profilometers, one or more LiDARs, custom-designed friction measuring equipment, odometers (or Distance Measurement Instruments, DMI), and an Inertial Measurement Unit (IMU).
LiDARs are based on retro-intensity and have been used extensively in the last decades for capturing 3D point clouds of the surroundings. Mounted on vehicle platforms (e.g. cars, drones and Unmanned Aerial Vehicles (UAV)), these sensors can be adopted for mapping and surveying urban infrastructure, road condition, and related furniture (e.g. manholes, traffic signs, etc.) [25, 35]. Once the point cloud of the inspection area is collected, post-processing algorithms have to be deployed to detect objects of interest. In this regard, many authors have presented automatic approaches to the detection and classification of ground objects, pole-like objects, and mobile and immobile objects. In the case of ground objects, earlier works such as [93] describe a combined approach of a supervised deep learning model and a random forest model for detecting manhole covers, obtaining completeness and correctness of 95% on average. Wei et al. [90] report approximately 95% completeness and correctness by applying Histogram of Oriented Gradients (HoG) [22] and Support Vector Machine (SVM) [84] to intensity-based images. A more recent approach utilises the You Only Look Once version three (YOLOv3) model [69] and achieves a 97% F1-measure on intensity images [67].
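The completeness, correctness, quality and F1 figures quoted throughout this section are derived from counts of true positives (TP), false positives (FP) and false negatives (FN). The sketch below shows how these measures are commonly computed; the function name and the example counts are illustrative assumptions and are not taken from any of the cited studies.

```python
def detection_scores(tp: int, fp: int, fn: int) -> dict:
    """Common detection measures from true/false positive and false negative counts."""
    # Completeness (recall): share of real objects that were detected.
    completeness = tp / (tp + fn) if (tp + fn) else 0.0
    # Correctness (precision): share of detections that are real objects.
    correctness = tp / (tp + fp) if (tp + fp) else 0.0
    # Quality: penalises both missed objects and false detections.
    quality = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    # F1: harmonic mean of completeness and correctness.
    total = completeness + correctness
    f1 = 2 * completeness * correctness / total if total else 0.0
    return {
        "completeness": completeness,
        "correctness": correctness,
        "quality": quality,
        "f1": f1,
    }


# Hypothetical counts for a manhole detector evaluated on 100 reference objects.
scores = detection_scores(tp=95, fp=5, fn=5)
print(scores)
```

Note that quality is always the lowest of the three count-based measures, because its denominator includes both kinds of error.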


Li et al. [50] used information about elevation, intensity, roughness, and curvature to classify pavement deformation with 92.3% accuracy, noting that a higher sensor resolution could improve the accuracy of the method, as the experiments revealed that smaller artefacts in the data were not salient. Similarly, Su et al. [78] quickly build precise road surface Curved Regular Grid (CRG) models [11] using elevation data. El Issaoui et al. [24] evaluate mobile LiDAR systems on their ability to accurately measure rut depth and crossfall slopes. Zai et al. [95] test road boundary extraction by voxelising the point cloud data to avoid objects present in the data, applying the α-shape algorithm to obtain the boundaries of segments, and then adjusting the segments with iterative graph cuts. The algorithm scores approximately 95% on completeness, correctness, and quality measures. Li et al. [49] use road geometry features and perform filtering on the prior to achieve more accurate road boundary extraction, with 97% accuracy on straight patches and slight degradation on curves. Nagai et al. [59] use intensity values to distinguish between grass and asphalt by employing adaptive thresholding and achieve more than 90% accuracy. Guerrero et al. [33] propose a road segmentation approach that achieves an 86% F1-measure when incorporating reflectivity in addition to geometric features such as height, smoothness, and tangent vector. Rastiveis et al. [68] use the Hough Transform method [39] to detect road markings and achieve an 88% F1-score, suggesting that the lower score is due to the minimum line length of the algorithm, which missed smaller lines on the road. Cheng et al. [20] utilise intensity-driven imaging obtained from the point cloud for lane marking detection with a U-Net model trained on synthesised data generated by intensity thresholding. Results show that the model is capable of very high segmentation precision, with a deviation of about 1–2 cm from the ground truth.
As for pole-like object detection and classification, Yu et al. [94] propose a methodology, based on a marked point process and Bayesian inference, for automatic detection of trees and show that the algorithm struggles with irregular tree crown shapes. This is a common problem when trying to fit one model to objects with high variation in shape and size, and it has also been observed when detecting sculptures [87] and even buildings [52]. For other pole-like objects, Cabo et al. [18] present an algorithm for the identification and extraction of pole-like road furniture, obtaining an average completeness of 92.3% and correctness of 83.8%. Gargoum et al. [28] filter the LiDAR data based on intensity thresholds and apply clustering to segregate the high-intensity and high-density points with a high success rate. Karsten et al. [41] take this one step further and classify the signs detected in 3D data using the corresponding RGB image, with approximately 50% accuracy using a simple sliding window classification. Kargah-Ostadi et al. [60] show how to achieve a higher mAP score of 86.8% on traffic sign recognition from RGB images of 43 signs using the SSDLite model with a pre-trained MobileNetV2 backbone [72]. With the explosion in popularity of machine learning and its influence on computer vision, recent approaches utilise data-driven pattern recognition techniques, which require data from Red, Green and Blue (RGB) cameras. Koch and Brilakis [43]
report greater than 80% precision and recall for pavement defect detection using the histogram of greyscale images, edge thinning, and regression. Tedeschi and Benedetto [79] report a 0.7 F-score for cascade classifiers trained on Local Binary Patterns for mobile devices that can be used by road inspectors. These were common methods in earlier image processing. Later, Pan et al. [63] compare SVM [84], ANN [53], and RF [15] on a number of features, such as pixel means and standard deviations as well as the detected object geometries, in images taken by a UAV, reporting greater than 90% accuracy. Using more contemporary methods, Silva et al. [76] report more than 90% AP using the YOLO v4 model [14]. Lee et al. [47] use a model similar to U-Net [71] to detect multiple anomalies on the pavement, without reporting the results of the model. Unfortunately, the pavement condition also depends on the allocation of road funds, which could be below expectations according to some reports [40, 55]. Such limitations could, however, serve as an innovation catalyst for projects that utilise low-cost sensors for road condition monitoring [3] in order to support pre-emptive maintenance. For example, mobile phones can be used for the detection of road defects [44, 79]. The literature describes multiple applications for mobile phones on the road. For example, mobile phones are used to send pictures of road defects, together with their location from the embedded GPS, to the maintenance personnel via a mobile app. This information is promptly reported to the local road authorities, who are then responsible for taking action [37, 62]. Crowdsourcing such activities could mitigate the extensive costs associated with road inspection. GPS, together with the IMU present in most phones, can also be used for the detection of road anomalies [9, 65, 75]. In addition, information fusion from other phone sensors (e.g.
accelerometers and gyroscopes), combined with machine learning techniques, can be adopted to predict the state of the road [5, 12]. Sattar et al. [73] present an extensive review of the current use of mobile phones as road anomaly detectors, looking at mobile phones as data collectors and at what is needed to process the data to extract useful information. Finally, the authors stress the challenges of the reviewed methods and assess how an ideal hybrid approach could overcome them. All in all, the adoption of 3D scanning sensors has opened up new opportunities for ITS in the form of faster, more accurate, and cheaper data that is used for managing road assets and informing the stakeholders. Unsurprisingly, such data is redundant, lacks semantic knowledge, and contains noise, especially in the case of mobile systems due to their dynamic nature. Processing power increasingly becomes a bottleneck in overcoming these drawbacks, and expectations are stretched even further when such systems process the data in real time. Voxelisation [88] and other data compression methods [77] can reduce the data redundancy; however, additional processing is currently necessary to infer semantics [26] or adjust for the noise [30]. It is worth mentioning that vision sensors are not the only ones used for road inspection. The reader is invited to consult the review by Kim and Ryu [42] for a more diverse selection of equipment for road surface inspection, going beyond computer vision.
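To make the redundancy-reduction idea concrete, the following is a minimal sketch of voxel-grid downsampling with NumPy, in which all points that fall into the same cubic cell are replaced by their centroid. It is a generic illustration and not the method of [88] or of the ViaPPS; the function name, the voxel size and the sample points are assumptions.

```python
import numpy as np


def voxel_downsample(points: np.ndarray, voxel_size: float) -> np.ndarray:
    """Replace all points that share a cubic voxel by their centroid.

    points: (N, 3) array of x, y, z coordinates in metres.
    voxel_size: edge length of a voxel in metres (assumed value, tune per survey).
    """
    # Integer voxel index of every point along each axis.
    voxel_idx = np.floor(points / voxel_size).astype(np.int64)
    # Group identical voxel indices; `inverse` maps each point to its voxel.
    _, inverse, counts = np.unique(
        voxel_idx, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)  # keep 1-D regardless of NumPy version
    # Sum the coordinates per voxel, then divide by the point count.
    sums = np.zeros((counts.size, 3))
    np.add.at(sums, inverse, points)
    return sums / counts[:, None]


# Hypothetical points: the first two share a 1 m voxel, the third does not.
cloud = np.array([[0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [1.5, 0.1, 0.1]])
reduced = voxel_downsample(cloud, voxel_size=1.0)
print(reduced)
```

A larger voxel size removes more redundancy but also smooths away small defects, so the choice depends on the smallest feature the inspection needs to resolve.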


3 Case Study: The ViaPPS
ViaPPS is a commercial mobile pavement profiling system designed to monitor pavement surfaces. It is composed of multiple sensors mounted on a vehicle platform. The functionality of these sensors can be divided into two categories: (a) perception and (b) position/navigation. During the inspection of a pavement surface, the ViaPPS collects pavement information, which is analysed in post-processing stages and can be imported into appropriate data management systems. The conceptual model and the work flow behind the ViaPPS system are shown in Figs. 1 and 2. The complexity of the data collected during the pavement inspection requires a specific handling scheme. According to [25], data handling can be reduced to a core procedure that includes data collection, synchronization, calibration, georeferencing, and data fusion. Once the data are correctly fused, the ViaPPS operator can extract the features of the objects of interest and, if needed, the collected images can be anonymized. Automatic reports are generated for the features of interest and can be stored in appropriate data management systems. The following sections describe the stages of the ViaPPS workflow shown in Fig. 2.

Fig. 1 Conceptual model of ViaPPS

Fig. 2 ViaPPS work flow


Fig. 3 ViaPPS system design with various components for perception and position/navigation. Sources [31, 86]

3.1 System Design
The operator controls the system using a PC controller. At the front of the vehicle are two RGB cameras, a point-laser and a DMI (attached to a wheel rim). Two GPS antennas/receivers are located at the front and at the back of the vehicle platform, respectively. Three LiDAR sensors are mounted on the back of the vehicle, with the IMU located close to them. Two further cameras are located above and below the LiDAR sensors: a 360° Field Of View (FOV) camera (top) and an RGB camera (below). Figure 3 shows the ViaPPS system design. Here we extend the technical description provided by Giudici et al. [31]. The perception sensors comprise four laser scanners, namely three LiDARs (nr. 1 in Fig. 3) and a point (or texture) laser (nr. 5 in Fig. 3), and four cameras, namely three RGB cameras (nr. 3, 9 in Fig. 3) and a 360° FOV camera. The characteristics of the perception sensors are:


(a) LiDAR: Two Velodyne laser scanners are mounted on the left and right side of the main scanner, a Z+F Profiler, as shown in nr. 1 of Fig. 3. The Velodyne sensors have a 360° FOV and collect over 300,000 points per second (pps) within a range of 100 m. This sensor adopts Class 1 eye-safe 905 nm laser technology [85]. The Z+F Profiler is based on phase technology with a 360° FOV. It captures over a million pps with an accuracy better than 1 mm within a range of 119 m. Its high rotation speed is 200 rotations per second (rps). The laser is classified as eye-safe Class 1, and the scanner is resistant to dust and water (protection class IP54) [96].
(b) Point-laser: A Riftek laser scanner (nr. 5 in Fig. 3), based on laser triangulation, recreates the texture profile of the scanned surface point by point. This scanner has shock-absorbent technology and high resistance to solar radiation [70].
(c) Cameras: Three RGB Basler cameras (nr. 3, 9 in Fig. 3) collect 5 MP images at regular intervals, typically 10 m, with a resolution of at least 2448 × 2048 pixels [13]. A Ladybug 5+ camera (nr. 2 in Fig. 3) covers a field of view of 90% of the full sphere. It collects 30 MP images at a frame rate of 14.5, with a resolution of 2048 × 2464 [27].
The navigation/position system adopted in the ViaPPS is the Applanix POS LV. The POS LV adopts inertial technology integrated with the Global Navigation Satellite System (GNSS) to provide high-rate (200 Hz), reliable and accurate position data. The generated data provide a continuous position and orientation of the vehicle, represented by its Coordinate Reference System (CRS) axes with the relative orientation angles. The Applanix POS LV includes several components, such as the IMU (nr. 4 in Fig. 3), the DMI (nr. 6 in Fig. 3), the GPS antennas/receivers (nr. 7 in Fig. 3) and a POS.
The IMU contains accelerometers and gyroscopes that measure acceleration and angular speed while driving. The DMI mounted on the vehicle wheel measures the travelled distance. Combined with the IMU, the GPS antennas, supported by the GNSS, provide a continuous and highly accurate position and orientation of the vehicle in motion [8].

3.2 Road Inspection: Data Acquisition
The road inspection begins by activating the ViaPPS system. Once activated, the perception and navigation/position sensors are ready to operate and the driver starts driving over the road stretch to be inspected. While driving, the sensors of the ViaPPS collect data from the road and its surrounding environment. Once
the road has been inspected, the data acquisition phase is ended and the operator terminates the system.

3.3 Data Handling
The data collected during the road inspection have to be analysed in post-processing stages. These stages include the synchronization, calibration, georeferencing and fusion of the data from the perception sensors and the navigation/position sensors [25]. The first step is synchronization. Mobile sensors collect data at different time instants, which creates difficulties when merging their data. To avoid differences in timestamps, each sensor’s dataset has to be accurately synchronized. Data synchronization thus refers to the procedure in which common timestamps are assigned to the collected datasets, enabling data from the different sources to be compared with high accuracy. After synchronization, the calibration stage takes place. Calibration has to be handled with extreme care and accuracy, because in this phase the data from the perception and navigation/position sensors are related to each other. A successful calibration between the perception and navigation/position sensors leads to a robust and reliable georeferenced 3D point cloud. Two calibration processes are described here: between the LiDAR and GNSS/IMU sensors, and between the LiDAR and RGB camera perception sensors. The LiDAR and IMU sensors are mounted on the vehicle platform with different orientations and positions. The difference in orientation between the (absolute) CRS of the IMU and the (relative) CRS of the LiDAR leads to boresight misalignment angles which, if not accurately compensated, cause significant distortion of the generated point cloud when it is merged with the position (GNSS/IMU) data.
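As an illustration of the synchronization step described above, the sketch below assigns a vehicle position to each LiDAR timestamp by linearly interpolating a 200 Hz trajectory. It is one simple way to bring two sensors onto common timestamps and is not claimed to be the ViaPPS procedure; the function name and sample values are assumptions, and orientation interpolation (which needs rotation-aware handling) is deliberately left out.

```python
import numpy as np


def interpolate_positions(pose_t, pose_xyz, query_t):
    """Linearly interpolate trajectory positions at the query timestamps.

    pose_t:   (M,) increasing timestamps of the navigation solution, in seconds.
    pose_xyz: (M, 3) positions at those timestamps.
    query_t:  (N,) timestamps of another sensor (e.g. LiDAR points).
    Timestamps outside the trajectory are clamped to its first/last position.
    """
    pose_t = np.asarray(pose_t, dtype=float)
    pose_xyz = np.asarray(pose_xyz, dtype=float)
    query_t = np.asarray(query_t, dtype=float)
    # np.interp works on one coordinate at a time, so interpolate per axis.
    columns = [
        np.interp(query_t, pose_t, pose_xyz[:, axis])
        for axis in range(pose_xyz.shape[1])
    ]
    return np.column_stack(columns)


# Hypothetical 200 Hz trajectory (5 ms spacing) moving along x.
trajectory_t = [0.000, 0.005, 0.010]
trajectory_xyz = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.2, 0.0, 0.0]]
# Hypothetical LiDAR timestamps falling between trajectory samples.
lidar_t = [0.0025, 0.0075]
lidar_xyz = interpolate_positions(trajectory_t, trajectory_xyz, lidar_t)
print(lidar_xyz)
```

The sketch assumes that all sensors already share one clock; in practice a clock offset between devices has to be estimated or removed by hardware triggering before interpolation is meaningful.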
To avoid orientation distortions, appropriate three-dimensional transformations have to be applied to the 3D surface point cloud, as described by Magnusson [57]. Transferring the LiDAR point cloud from its relative CRS to the absolute CRS requires, as a minimum, the following parameters: the measurement of the lever-arm (l), the calculation of the rotational matrix (R), and the compensation of the misalignment angles with the corresponding correction matrix. The lever-arm is the distance between the relative CRS and the absolute CRS. The rotational matrix is the orientation of the relative CRS with respect to the absolute CRS. Different techniques, such as the least squares method, can be adopted to compensate the misalignment angles. Once these angles have been correctly compensated, a correction matrix (C) is created as a function of the boresight angles. Once these parameters have been determined, the LiDAR point cloud can be transferred from its relative CRS to the absolute CRS and is thereby georeferenced. Equation 1 shows the transfer of the LiDAR point cloud from its relative CRS to the absolute CRS:
$$\mathrm{point}_i^{abs} = p^{GNSS} + R \left( l + C \, R^{rel} \, \mathrm{point}_i^{rel} \right) \qquad (1)$$
where $\mathrm{point}_i^{abs}$ is point-cloud element $i$ in the absolute CRS; $p^{GNSS}$ is the IMU position; $R$ is the rotational matrix; $l$ is the lever-arm; $C$ is the orientation angle correction matrix; $R^{rel}$ is the relative CRS orientation matrix; and $\mathrm{point}_i^{rel}$ is point-cloud element $i$ in the relative CRS.
Assuming the pinhole camera model, the calibration between the perception sensors (LiDAR sensors and RGB cameras) demands the estimation of the intrinsic camera parameters ($f$: focal length, $c$: principal point, $k_n$: lens distortion coefficients) and the extrinsic camera parameters ($R$: rotation and $t$: translation), which can be done by combining optimisation algorithms such as RANSAC and Levenberg-Marquardt with machine learning techniques. The Direct Linear Transform (DLT) [1] and Levenberg-Marquardt algorithms [48, 58] can be used to correct the lens distortions. Based on the already captured point-cloud data, the DLT is used to find a projection matrix (homography). To improve the performance of the algorithm, the data are first normalised by translating (using the mean) and scaling (using the standard deviation). Secondly, the homography equation is transformed into a homogeneous system of linear equations:
$$\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \alpha H \begin{pmatrix} x \\ y \\ z \end{pmatrix}$$
The homogeneous system is solved using the Singular Value Decomposition (SVD) method [32], taking the eigenvector corresponding to the smallest eigenvalue, followed by a de-normalisation to calculate H (the projection matrix). The Levenberg-Marquardt algorithm is used to minimise the projection errors. The details of this calibration procedure are described in [31]. Figure 4 shows the calibrated 3D point cloud (red dot for each laser point) on RGB images collected during the inspection. In order to increase the processing speed, the 3D point cloud is reduced in volume and/or made more sparse.
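The DLT steps named above (normalise, build the homogeneous system, solve with SVD, de-normalise) can be sketched as follows. For brevity the example estimates a 3 × 3 homography between two sets of 2D points, which is a simplified planar stand-in for the LiDAR-to-image projection and not the authors' implementation; all function names and sample values are assumptions, and the Levenberg-Marquardt refinement is omitted.

```python
import numpy as np


def normalise(points):
    """Shift points to zero mean and scale each axis by its standard deviation.

    Returns the normalised points in homogeneous form and the 3x3 transform T.
    """
    mean = points.mean(axis=0)
    std = points.std(axis=0)
    T = np.array([
        [1.0 / std[0], 0.0, -mean[0] / std[0]],
        [0.0, 1.0 / std[1], -mean[1] / std[1]],
        [0.0, 0.0, 1.0],
    ])
    homogeneous = np.column_stack([points, np.ones(len(points))])
    return (T @ homogeneous.T).T, T


def dlt_homography(src, dst):
    """Estimate H such that dst ~ H @ src (up to scale) from >= 4 point pairs."""
    src_n, T_src = normalise(np.asarray(src, dtype=float))
    dst_n, T_dst = normalise(np.asarray(dst, dtype=float))
    rows = []
    for (x, y, _), (u, v, _) in zip(src_n, dst_n):
        # Two linear equations per correspondence in the nine entries of H.
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    H_n = Vt[-1].reshape(3, 3)
    # Undo the normalisation and fix the arbitrary scale.
    H = np.linalg.inv(T_dst) @ H_n @ T_src
    return H / H[2, 2]


# Hypothetical ground-truth homography and noise-free correspondences.
H_true = np.array([[1.2, 0.1, 5.0], [-0.2, 0.9, 3.0], [0.001, 0.002, 1.0]])
src = np.array([[0, 0], [100, 0], [100, 50], [0, 50], [40, 20], [70, 10]], dtype=float)
src_h = np.column_stack([src, np.ones(len(src))])
dst_h = (H_true @ src_h.T).T
dst = dst_h[:, :2] / dst_h[:, 2:3]
H_est = dlt_homography(src, dst)
print(np.round(H_est, 4))
```

With noisy real correspondences the SVD solution only minimises an algebraic error, which is why a non-linear refinement of the reprojection error is typically run afterwards.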
After the data from the perception and navigation/position sensors have been synchronized, calibrated and georeferenced, they can finally be fused together. Data fusion is therefore the final step of the post-processing procedure and plays a vital role in the overall accuracy and robustness of the monitoring activity. Indeed, a successful data fusion gives the data of each perception sensor the same timestamps and georeferenced information, which makes it possible to monitor specific road sections and/or time intervals of particular interest using data from different perception sensors. For example, images can be used to verify the correctness of the features extracted from the point cloud.
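The georeferencing transformation of Eq. 1, which precedes the fusion step, can be written in a few lines of NumPy. The sketch below applies it to an array of LiDAR points using a single vehicle pose; the function name and the numerical values are assumptions chosen so that the result is easy to verify by hand, and a real survey would use the pose interpolated at each point's own timestamp.

```python
import numpy as np


def georeference(points_rel, p_gnss, R, lever_arm, C, R_rel):
    """Apply Eq. 1: point_abs = p_GNSS + R (l + C R_rel point_rel).

    points_rel: (N, 3) points in the LiDAR (relative) CRS.
    p_gnss:     (3,) position from the GNSS/IMU solution.
    R:          (3, 3) rotation of the vehicle body in the absolute CRS.
    lever_arm:  (3,) offset l between the relative and absolute CRS origins.
    C:          (3, 3) boresight correction matrix.
    R_rel:      (3, 3) orientation matrix of the relative CRS.
    """
    points_rel = np.asarray(points_rel, dtype=float)
    # Rotate the raw points into the body frame and apply the boresight correction.
    corrected = (C @ R_rel @ points_rel.T).T
    # Add the lever-arm, rotate into the absolute CRS and translate.
    return np.asarray(p_gnss, dtype=float) + (R @ (lever_arm + corrected).T).T


# Hypothetical pose: vehicle heading rotated by 90 degrees about the z axis.
R_heading = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
points_abs = georeference(
    points_rel=[[1.0, 0.0, 0.0]],
    p_gnss=[10.0, 20.0, 30.0],
    R=R_heading,
    lever_arm=np.array([1.0, 0.0, 0.0]),
    C=np.eye(3),       # no boresight misalignment in this toy case
    R_rel=np.eye(3),   # LiDAR axes aligned with the body axes
)
print(points_abs)
```

In the toy case the point and lever-arm add up to 2 m along the body x axis, which the heading rotation turns into 2 m along the absolute y axis before the GNSS position is added.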


Fig. 4 Calibrated 3D point cloud (red dot for each laser point) laid over an RGB image. The volume of the point cloud is decreased to 203 m and the resolution is not modified in software. Source [31]

3.4 Features Extraction and Image Anonymization
The 3D point cloud generated by the LiDAR scanners enables the reconstruction of the pavement condition and the related furniture. With respect to pavement monitoring, a dedicated model based on dynamic adaptive algorithms for pavement inventories is constructed from the 3D point cloud data. The adaptive algorithms are based on the detection of normalized road parameters and provide a mechanism for detecting road features based on pattern recognition. This dynamic adaptive model detects parameters such as road defects (cracks, potholes, ravelling, and joints), edges, and markings. Figure 5 shows features of interest detected in the 3D point cloud as part of the road surface inspection routine. Object detection using region-based convolutional models is employed to detect and recognise road features in the high-resolution RGB camera images. The data-driven models are fine-tuned partly on real annotated data and partly on additional synthesised road inventory data.

Fig. 5 Calibrated 3D point cloud (red dot for each laser point) laid over an RGB image. The volume of the point cloud is decreased to 203 m and the resolution is not modified in software. Source [31]


Fig. 6 Detection of road features using computer vision techniques. The figure shows a bounding box over an object of interest (e.g. manhole). Source [86]

Figure 6 shows results from the region-based convolutional model after it has been trained to recognise manholes. The model outputs four values describing the corners of a bounding box and a confidence value for the class recognised inside the bounding box. Cameras are both a blessing and a curse: they can capture huge amounts of potentially useful information in a very short time, but they also capture sensitive information, such as personal information of individuals in the field of view. From images and videos, it is possible to detect and recognise road users (e.g. vehicles, cyclists, scooters, pedestrians) and track their movements, which poses a threat to their privacy. Since there is growing pressure regarding privacy, industries that use cameras have to provide solutions that take privacy concerns into account. In accordance with the General Data Protection Regulation (GDPR), the ViaPPS adopts advanced privacy algorithms to protect the privacy of road users in the collected images, as can be seen in Fig. 7.
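The combination of bounding boxes, confidence values and anonymization can be illustrated with a small NumPy sketch that pixelates every detected region whose confidence exceeds a threshold. This is not the privacy algorithm used by the ViaPPS, which the chapter does not specify; the function name, the box format (x_min, y_min, x_max, y_max), the threshold and the synthetic image are all assumptions.

```python
import numpy as np


def pixelate_box(image: np.ndarray, box, block: int = 8) -> np.ndarray:
    """Return a copy of `image` with the region `box` replaced by coarse blocks.

    box: (x_min, y_min, x_max, y_max) in pixel coordinates (assumed format).
    block: edge length of the averaging blocks in pixels.
    """
    x0, y0, x1, y1 = box
    out = image.copy()
    region = out[y0:y1, x0:x1]  # a view, so writes below modify `out`
    for y in range(0, region.shape[0], block):
        for x in range(0, region.shape[1], block):
            tile = region[y:y + block, x:x + block]
            # Replace every pixel in the tile by the tile's mean colour.
            tile[...] = tile.mean(axis=(0, 1), keepdims=True).astype(image.dtype)
    return out


# Synthetic 16 x 16 RGB image and two hypothetical detections (box, confidence).
image = (np.arange(16 * 16 * 3) % 256).astype(np.uint8).reshape(16, 16, 3)
detections = [((4, 4, 12, 12), 0.91), ((0, 0, 2, 2), 0.20)]

anonymised = image
for box, confidence in detections:
    if confidence >= 0.5:  # assumed threshold; low-confidence boxes are skipped
        anonymised = pixelate_box(anonymised, box, block=4)
print(anonymised[4:8, 4:8, 0])
```

For privacy purposes a lower threshold is usually the safer choice, since blurring a false detection costs little while missing a real face or licence plate does not.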

3.5 Reports and Data Management Systems
Data is collated in the form of reports after the inspection, and any personal information is stripped out of them. Filtered road inspection data is then pre-processed using the steps described above and can be imported into data management systems, each of which is responsible for one type of data in its own format (e.g. RGB images, 360° FOV images, LiDAR 3D point cloud, etc.). As can be seen in Fig. 8, the data is then referenced using the GPS information and the detected feature information


Fig. 7 Privacy protection by blurring sensitive information, such as vehicles of different kind and people. Source [86]

Fig. 8 a High-resolution RGB camera management system with automated traffic sign detection and public database synchronisation. b ViaPhoto 360◦ image management system. Source [86]

(e.g. cracks, traffic signs, etc.) so that it can be found easily. Moreover, the data is also tracked in the temporal domain, which means that the evolution of a specific item of road furniture can be tracked over time. This helps when predicting future road surface deterioration. In addition, various services, such as Google Maps or the road furniture databases created by the road authorities, are integrated into the data management system for reference and verification purposes.
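As a simple illustration of how temporal tracking supports deterioration prediction, the sketch below fits a straight line to repeated measurements of one georeferenced feature and extrapolates it. The linear model, the function name and the rut-depth values are assumptions made for the example and are not data or methods from the ViaPPS.

```python
import numpy as np


def linear_trend(days, values):
    """Least-squares straight line through (days, values); returns slope, intercept."""
    slope, intercept = np.polyfit(days, values, deg=1)
    return slope, intercept


# Hypothetical rut depth (mm) of one road section measured in four surveys.
survey_days = np.array([0.0, 180.0, 360.0, 540.0])
rut_depth_mm = np.array([4.0, 5.0, 6.0, 7.0])

slope, intercept = linear_trend(survey_days, rut_depth_mm)
# Extrapolate to the next planned survey, 720 days after the first one.
forecast_mm = slope * 720.0 + intercept
print(round(forecast_mm, 2))
```

Real deterioration is rarely linear over long periods, so a trend like this is only useful for short look-ahead horizons or as a baseline against which richer models are compared.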

4 Discussion
To date, an enormous amount of effort has gone into the sustainable development of smart cities around the world. The work described in this chapter has been performed in relation to the case of the smart city of Oslo. The strategy of the
city officials is to deploy ICT extensively to make the city greener and more sustainable [21]. In line with this strategy, and with a focus on the regional and local level, the recent Norwegian guidelines [23] suggest a roadmap to ensure the integration of the smart cities framework. These guidelines prioritise the deployment and use of technologies aimed at reducing traffic and air pollution as well as the adoption of greener transport solutions. Deploying such technologies requires a joint effort from local authorities as well as organisations, researchers and industry. The guidelines suggest that local authorities need to develop expertise and collaborate with industry, academia, and organisations to exploit new ideas all the way from inception to deployment. This chapter discussed how ICT can improve the efficiency of road monitoring and presented a specific case of a mobile pavement monitoring system, which is used to foster sustainable smart cities. Transportation solutions rely on a well-maintained road network with acceptable levels of service, where individual, public, and freight transport can use roads freely without experiencing traffic congestion, thereby contributing to CO2 reduction [16, 17]. However, road defects, if not promptly repaired, can lead to high repair costs that affect the economy [6]. Therefore, maintaining the road network as soon as a defect is detected is an important part of the sustainability of urban and local mobility. Monitoring the road condition with data-driven ICT is one of the activities of road maintenance, which can therefore be described as smart road maintenance. Compared to the smart city concept, smart road maintenance is less visible in the academic literature, which focuses on how ICT could be deployed for road maintenance [37, 56, 74] and on the obstacles encountered by innovations in the maintenance field [4].
As the focus is more on the practical adoption of such technologies, to the knowledge of the authors a definition of smart road maintenance is still missing. To fill this gap, we propose a possible definition, which we hope will be useful for further discussion in this field. Smart road maintenance could be defined as a collection of ICT-supported efforts that keep the road network safe and at an acceptable level of service for all road users in any weather by monitoring and improving pavement surface conditions.

5 Conclusion This chapter presents the importance of road monitoring for the enhancement of sustainable smart cities. Safe and reliable road network in smart city is like a nervous system in a human body in that the smart cities would not be ‘smart’ without the safe and reliable roads in the same way as the body would not be able to function without the nervous system. In short, sustainable smart cities are built on safe and reliable road networks. A deteriorated road network can lead to reduced mobility, traffic congestion, and CO2 emissions which are unacceptable in the sustainable smart cities. The use of data-driven ICT can facilitate the road monitoring by processing and

Towards Sustainable Smart Cities …


analysing accurate and reliable data that describe the condition of the road surface. This chapter makes a case for ViaPPS, a robust solution for road maintenance that provides reliable and accurate geo-referenced data integrating LiDAR and computer vision. ICT systems such as ViaPPS are required to assess road surface conditions and to prompt road maintenance actions.

References

1. Abdel-Aziz, Y., Karara, H.: Direct linear transformation into object space coordinates in close-range photogrammetry. In: Proceedings of the Symposium on Close-Range Photogrammetry, Urbana, IL, pp. 1–18 (1971)
2. Akbar, M.A., Azhar, T.N.: Concept of cost efficient smart CCTV network for cities in developing country. In: International Conference on ICT for Smart Society (ICISS), pp. 1–4. IEEE (2018)
3. Akinmade, O.D., Cinfwat, K.Z., Ibrahim, A.I., Omange, G.N.: The use of Roadroid application and smart phones for road condition monitoring in developing countries. In: 8th Africa Transportation Technology Transfer Conference (2017)
4. Akkermans, H., Besselink, L., Van Dongen, L., Schouten, R.: Smart moves for smart maintenance. Findings from a Delphi study on ‘Maintenance Innovation Priorities’ for the Netherlands, Dutch Institute of World Class Maintenance (DIWCM) (2016)
5. Allouch, A., Koubâa, A., Abbes, T., Ammar, A.: RoadSense: smartphone application to estimate road conditions using accelerometer and gyroscope. IEEE Sens. J. 17(13), 4231–4238 (2017)
6. American Automobile Association: Pothole damage: fact sheet (2016). http://publicaffairsresources.aaa.biz/wp-content/uploads/2016/02/Pothole-Fact-Sheet.pdf. Accessed 11 March 2021
7. Amoureus, L., Bomers, M.P.H., Fuser, R., Tosatto, M.: Integration of LiDAR and terrestrial mobile mapping technology for the creation of a comprehensive road cadastre. In: 5th International Symposium on Mobile Mapping Technology, pp. 29–31 (2007)
8. Applanix. https://www.applanix.com (2021)
9. Astarita, V., Vaiana, R., Iuele, T., Caruso, M.V., Giofrè Vincenzo, P., De Masi, F.: Automated sensing system for monitoring of road surface quality by mobile devices. Procedia—Soc. Behav. Sci. 111, 242–251 (2014)
10. Bakıcı, T., Almirall, E., Wareham, J.: A smart city initiative: the case of Barcelona. J. Knowl. Econ. 4(2), 135–148 (2013)
11. Barsi, A., Poto, V., Tihanyi, V.: Creating OpenCRG road surface model from terrestrial laser scanning data for autonomous vehicles. In: Vehicle and Automotive Engineering, pp. 361–369. Springer (2018)
12. Basavaraju, A., Du, J., Zhou, F., Ji, J.: A machine learning approach to road surface anomaly assessment using smartphone sensors. IEEE Sens. J. 20(5), 2635–2647 (2019)
13. Basler. https://www.baslerweb.com (2021)
14. Bochkovskiy, A., Wang, C.-Y., Liao, H.-Y.M.: Yolov4: optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934 (2020)
15. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
16. Bull, A., NU. CEPAL, German Agency for Technical Cooperation: Traffic Congestion: The Problem and how to Deal with it. Economic Commission for Latin America and the Caribbean (2003). ISBN: 92-1-121432-7
17. Burningham, S., Stankevich, N.: Why road maintenance is important and how to get it done. Transport Notes Series; No. TRN 4, World Bank, Washington, DC (2005)
18. Cabo, C., Ordoñez, C., García-Cortés, S., Martínez, J.: An algorithm for automatic detection of pole-like street furniture objects from mobile laser scanner point clouds. ISPRS J. Photogramm. Remote Sens. 87, 47–56 (2014)


H. Giudici et al.

19. Caragliu, A., Del Bo, C., Nijkamp, P.: Smart cities in Europe. J. Urban Technol. 18(2), 65–82 (2011)
20. Cheng, Y.-T., Patel, A., Wen, C., Bullock, D., Habib, A.: Intensity thresholding and deep learning based lane marking extraction and lane width estimation from mobile light detection and ranging (LiDAR) point clouds. Remote Sens. 12(9), 1379 (2020)
21. City of Oslo: Oslo smart city strategy (2018). https://www.oslo.kommune.no/politics-andadministration/smart-oslo/smart-oslo-strategy. Accessed 11 March 2021
22. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 1, pp. 886–893 (2005)
23. Design and Architecture Norway (DOGA), the Norwegian Smart City Network, Nordic Edge: Roadmap for smart and sustainable cities and communities in Norway. A guide for local and regional authorities (2019). https://doga.no/globalassets/pdf/smartby-veikart-19x23cm-engv1_delt.pdf. Accessed 11 March 2021
24. El Issaoui, A., Feng, Z., Lehtomäki, M., Hyyppä, E., Hyyppä, H., Kaartinen, H., Kukko, A., Hyyppä, J.: Feasibility of mobile laser scanning towards operational accurate road rut depth measurements. Sensors 21(4) (2021)
25. El-Sheimy, N.: An overview of mobile mapping systems. In: Proceedings of the FIG Working Week, pp. 16–21 (2005)
26. Feng, D., Haase-Schuetz, C., Rosenbaum, L., Hertlein, H., Glaeser, C., Timm, F., Wiesbeck, W., Dietmayer, K.: Deep multi-modal object detection and semantic segmentation for autonomous driving: datasets, methods, and challenges. IEEE Trans. Intell. Transp. Syst. 22(3), 1341–1360 (2020)
27. Flir. https://www.flir.com (2021)
28. Gargoum, S., El-Basyouny, K., Sabbagh, J., Froese, K.: Automated highway sign extraction using LiDAR data. Transp. Res. Rec. 2643(1), 1–8 (2017)
29. Gézero, L., Antunes, C.: Road rutting measurement using mobile LiDAR systems point cloud. ISPRS Int. J. Geo-Inf. 8(9) (2019)
30. Gilroy, S., Jones, E., Glavin, M.: Overcoming occlusion in the automotive environment—a review. IEEE Trans. Intell. Transp. Syst. 22(1), 23–35 (2019)
31. Giudici, H., Mocialov, B., Myklatun, A.: ViaPPS: A Mobile Pavement Profiling System. arXiv pre-print: arXiv:2101.11267 (2021)
32. Golub, G., Kahan, W.: Calculating the singular values and pseudo-inverse of a matrix. J. Soc. Ind. Appl. Math. Ser. B: Numer. Anal. 2(2), 205–224 (1965)
33. Guerrero, J., Chapuis, R., Aufrère, R., Malaterre, L., Marmoiton, F.: Road curb detection using traversable ground segmentation: Application to autonomous shuttle vehicle navigation. In: 16th International Conference on Control, Automation, Robotics and Vision (ICARCV), pp. 266–272 (2020)
34. Hall, R.E., Bowerman, B., Braverman, J., Taylor, J., Todosow, H., Von Wimmersperg, U.: The vision of a smart city. Technical report, Brookhaven National Lab., Upton, NY (US). No. BNL-67902; 04042 (2000)
35. Harrap, R., Lato, M.: An overview of LIDAR: collection to application. NGI Publ. 2, 1–9 (2010)
36. Harrison, C., Eckman, B., Hamilton, R., Hartswick, P., Kalagnanam, J., Paraszczak, J., Williams, P.: Foundations for smarter cities. IBM J. Res. Dev. 54(4), 1–16 (2010)
37. Hashem, S., Cardiño, C.: Innovative pavement materials and design: smart roadways and smart road maintenance for the future. In: The International Conference on Civil Infrastructure and Construction (CIC 2020) (2020)
38. Hernández-García, D.-E., Gonzalez-Barbosa, J.-J., Hurtado-Ramos, J.-B., Ornelas-Rodríguez, F.-J., Castaneda, E.C., Ramírez, A., Garcia, A.I., Gonzalez-Barbosa, R., Aviña-Cervantez, J.G.: 3D city models: mapping approach using LiDAR technology. In: 21st International Conference on Electrical Communications and Computers, pp. 206–211 (2011)
39. Hough, P.V.: Method and means for recognizing complex patterns. US Patent 3,069,654 (1962)


40. Kaare, K.K.: Performance Measurement of a Road Network: A Conceptual and Approach for Estonia. PhD thesis, Tallinn University of Technology (2013)
41. Karsten, L., Gargoum, S., Saleh, M., El-Basyouny, K.: Automated framework to audit traffic signs using remote sensing data. J. Infrastruct. Syst. 27(3), 04021014 (2021)
42. Kim, T., Ryu, S.-K.: Review and analysis of pothole detection methods. J. Emerg. Trends Comput. Inf. Sci. 5(8), 603–608 (2014)
43. Koch, C., Brilakis, I.: Pothole detection in asphalt pavement images. Adv. Eng. Inform. 25(3), 507–515 (2011)
44. Kong, Y., Yu, Z., Chen, H., Wang, Z., Chen, C., Guo, B.: Detecting type and size of road crack with the smartphone. In: 2017 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC), vol. 1, pp. 572–579. IEEE (2017)
45. Kukko, A., Andrei, C.-O., Salminen, V.M., Kaartinen, H., Chen, Y., Rönnholm, P., Hyyppä, H., Hyyppä, J., Chen, R., Haggrén, H., et al.: Road environment mapping system of the Finnish Geodetic Institute-FGI roamer. In: ISPRS Workshop on Laser Scanning, vol. 36, pp. 241–247 (2007)
46. Lee, M.: Sustainable development in the EU: the renewed sustainable development strategy. Environ. Law Rev. 9(1), 41–45 (2007)
47. Lee, T., Chun, C., Ryu, S.-K.: Detection of road-surface anomalies using a smartphone camera and accelerometer. Sensors 21(2), 561 (2021)
48. Levenberg, K.: A method for the solution of certain problems in least squares. Q. Appl. Math. 2(2), 164–168 (1944)
49. Li, K., Shao, J., Guo, D.: A multi-feature search window method for road boundary detection based on LIDAR data. Sensors 19(7), 1551 (2019)
50. Li, Z., Cheng, C., Kwan, M.-P., Tong, X., Tian, S.: Identifying asphalt pavement distress using UAV LiDAR point cloud data and random forest classification. ISPRS Int. J. Geo-Inf. 8(1), 39 (2019)
51. Li, Z.-X., Yang, X.-M., Li, Z.: Application of cement-based piezoelectric sensors for monitoring traffic flows. J. Transp. Eng. 132(7), 565–573 (2006)
52. Lin, Y.-J.: Point Cloud-Based Analysis and Modelling of Urban Environments and Transportation Corridors. PhD thesis, Purdue University Graduate School (2019)
53. Livingstone, D.J.: Artificial Neural Networks: Methods and Applications. Springer (2008)
54. Losurdo, F., Dileo, I., Siergiejczyk, M., Krzykowska, K., Krzykowski, M.: Innovation in the ICT infrastructure as a key factor in enhancing road safety. A multi-sectoral approach. In: 25th International Conference on Systems Engineering (ICSEng), pp. 157–162 (2017)
55. Lyimo, B.J.: Development of information and technology framework for monitoring road maintenance projects in Tanzania, a case of Tanzania national roads agency. Olva Acad.-Sch. Res. 2(3), 2 (2019)
56. Madduri, H.: A smart road maintenance system for cities—an evolutionary approach. In: Innovative Technologies in Management and Science, pp. 43–56. Springer (2015)
57. Magnusson, M.: The three-dimensional normal-distributions transform: an efficient representation for registration, surface analysis, and loop detection. PhD thesis, Örebro University, School of Science and Technology (2009)
58. Marquardt, D.W.: An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Ind. Appl. Math. 11(2), 431–441 (1963)
59. Nagai, Y., Kusakari, R., Kuroda, Y.: Classification of point cloud using received light intensity according to the degree of separation. In: IEEE/SICE International Symposium on System Integration (SII), pp. 323–328 (2020)
60. Kargah-Ostadi, N., Waqar, A., Hanif, A.: Automated real-time roadway asset inventory using artificial intelligence. Transp. Res. Rec. 2674(11), 220–234 (2020)
61. OECD and European Commission: Cities in the World (2020). https://doi.org/10.1787/d0efcbda-en
62. Pak, B., Chua, A., Vande Moere, A.: FixMyStreet Brussels: socio-demographic inequality in crowdsourced civic participation. J. Urban Technol. 24(2), 65–87 (2017)


63. Pan, Y., Zhang, X., Cervone, G., Yang, L.: Detection of asphalt pavement potholes and cracks based on the unmanned aerial vehicle multispectral imagery. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 11(10), 3701–3712 (2018)
64. Pena-Gonzalez, R.H., Nuno-Maganda, M.A.: Computer vision based real-time vehicle tracking and classification system. In: 2014 IEEE 57th International Midwest Symposium on Circuits and Systems (MWSCAS), pp. 679–682. IEEE (2014)
65. Perttunen, M., Mazhelis, O., Cong, F., Kauppila, M., Leppänen, T., Kantola, J., Collin, J., Pirttikangas, S., Haverinen, J., Ristaniemi, T., et al.: Distributed road surface condition monitoring using mobile phones. In: International Conference on Ubiquitous Intelligence and Computing, pp. 64–78. Springer (2011)
66. Pilli-Sihvola, E., Aapaoja, A., Leviäkangas, P., Kinnunen, T., Hautala, R., Takahashi, N.: Evolving winter road maintenance ecosystems in Finland and Hokkaido, Japan. IET Intell. Transp. Syst. 9(6), 633–638 (2015)
67. Qing, L., Yang, K., Tan, W., Li, J.: Automated detection of manhole covers in mls point clouds using a deep learning approach. In: IGARSS 2020—2020 IEEE International Geoscience and Remote Sensing Symposium, pp. 1580–1583. IEEE (2020)
68. Rastiveis, H., Shams, A., Sarasua, W.A., Li, J.: Automated extraction of lane markings from mobile LiDAR point clouds based on fuzzy inference. ISPRS J. Photogramm. Remote Sens. 160, 149–166 (2020)
69. Redmon, J., Farhadi, A.: Yolov3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
70. Riftek. https://riftek.com (2021)
71. Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention, pp. 234–241. Springer (2015)
72. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
73. Sattar, S., Li, S., Chapman, M.: Road surface monitoring using smartphone sensors: a review. Sensors 18(11), 3845 (2018)
74. Seneviratne, D., Ciani, L., Catelani, M., Galar, D., et al.: Smart maintenance and inspection of linear assets: an Industry 4.0 approach. Acta Imeko 7(1), 50–56 (2018)
75. Seraj, F., van der Zwaag, B.J., Dilo, A., Luarasi, T., Havinga, P.: RoADS: a road pavement monitoring system for anomaly detection using smart phones. In: Big Data Analytics in the Social and Ubiquitous Context, pp. 128–146. Springer (2015)
76. Silva, L.A., Blas, H.S.S., García, D.P., Mendes, A.S., González, G.V.: An architectural multiagent system for a pavement monitoring system with pothole recognition in UAV images. Sensors 20(21), 6205 (2020)
77. Simonovsky, M., Komodakis, N.: Dynamic edge-conditioned filters in convolutional neural networks on graphs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3693–3702 (2017)
78. Su, J., Miyazaki, R., Tamaki, T., Kaneda, K.: High-resolution representation for mobile mapping data in curved regular grid model. Sensors 19(24), 5373 (2019)
79. Tedeschi, A., Benedetto, F.: A real-time automatic pavement crack and pothole recognition system for mobile android-based devices. Adv. Eng. Inform. 32, 11–25 (2017)
80. Thodesen, C.C., Lerfald, B.O., Hoff, I.: Review of asphalt pavement evaluation methods and current applications in Norway. Baltic J. Road Bridge Eng. 7(4), 246–252 (2012)
81. Tomiyama, K., Kawamura, A., Nakajima, S., Ishida, T., Jomoto, M., et al.: A mobile profilometer for road surface monitoring by use of accelerometers. In: 7th Symposium on Pavement Surface Characteristics: SURF (2012)
82. Trubia, S., Severino, A., Curto, S., Arena, F., Pau, G.: Smart roads: an overview of what future mobility will look like. Infrastructures 5(12), 107 (2020)
83. United Nations: The World’s Cities in 2018. Department of Economic and Social Affairs, Population Division, World Urbanization Prospects (2018)

84. Vapnik, V.: The Nature of Statistical Learning Theory. Springer (2013)
85. Velodyne. https://velodynelidar.com (2021)
86. ViaTechAS. https://www.viatech.no/ (2021)
87. Wang, R., Peethambaran, J., Dong, C.: LiDAR point clouds to 3D urban models: a review. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 11(2), 606–627 (2018)
88. Wang, Y., Cheng, L., Chen, Y., Wu, Y., Li, M.: Building point detection from vehicle-borne LiDAR data based on voxel group and horizontal hollow analysis. Remote Sens. 8(5), 419 (2016)
89. Wei, P., Shi, H., Yang, J., Qian, J., Ji, Y., Jiang, X.: City-scale vehicle tracking and traffic flow estimation using low frame-rate traffic cameras. In: Adjunct Proceedings of the ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the ACM International Symposium on Wearable Computers, pp. 602–610. Association for Computing Machinery, New York, NY, USA (2019)
90. Wei, Z., Yang, M., Wang, L., Ma, H., Chen, X., Zhong, R.: Customized mobile LiDAR system for manhole cover detection and identification. Sensors 19(10), 2422 (2019)
91. Yao, W., Hinz, S., Stilla, U.: Traffic monitoring from airborne LIDAR—feasibility, simulation and analysis. In: XXI Congress, Proceedings. International Archives of Photogrammetry, Remote Sensing and Spatial Geoinformation Sciences, Beijing, China, vol. 37, p. B3B (2008)
92. Yin, C., Xiong, Z., Chen, H., Wang, J., Cooper, D., David, B.: A literature survey on smart cities. Sci. China Inf. Sci. 58(10), 1–18 (2015)
93. Yu, Y., Guan, H., Ji, Z.: Automated detection of urban road manhole covers using mobile laser scanning data. IEEE Trans. Intell. Transp. Syst. 16(6), 3258–3269 (2015)
94. Yu, Y., Li, J., Guan, H., Wang, C., Cheng, M.: A marked point process for automated tree detection from mobile laser scanning point cloud data. In: International Conference on Computer Vision in Remote Sensing, pp. 140–145. IEEE (2012)
95. Zai, D., Li, J., Guo, Y., Cheng, M., Lin, Y., Luo, H., Wang, C.: 3-D road boundary extraction from mobile laser scanning data via supervoxels and graph cuts. IEEE Trans. Intell. Transp. Syst. 19(3), 802–813 (2018)
96. ZF (2021). https://www.zf-laser.com

Optimal Resource Allocation for Public Safety Device to Device Communication Using PSO

Navadiya Dhruvik, Rakesh Pavan, Neeraj, and M. Kiran

Abstract Device to Device (D2D) communication allows two devices in close proximity to communicate directly with each other without relaying through the base station (eNodeB or eNB). D2D communication offloads traffic from the eNB and thus has many advantages, including higher throughput and lower end-to-end delay. Though D2D communication was originally introduced for Public Safety Communication (PSC) and to help first responders, its distinct advantages have attracted commercial applications as well. The eNB treats all D2D applications equally and allocates resources uniformly, even when one application is engaged in commercial activity while another is helping to save a life. Thus, in this work the authors propose a novel optimized resource allocation algorithm for D2D applications that prioritizes PSC over commercial applications. To achieve this objective, the Particle Swarm Optimization (PSO) technique is employed. Furthermore, a new weighted average fitness function is designed for PSO to suit the requirements. The proposed algorithm was simulated in NS-3, and the results were recorded for different iterations. It was observed that the PSO algorithm with the designed fitness function reached the local and global optimum values in a considerable amount of time. The results show that PSC D2D pairs produced convincing results when compared with D2D pairs running commercial applications.

Keywords Device to Device Communication (D2D) · Public Safety Communication (PSC) · Particle Swarm Optimization (PSO) · Network · Resource allocation · NS-3 Network Simulator · Optimization · Network throughput · Network latency · Fitness function

N. Dhruvik · R. Pavan · Neeraj (B) · M. Kiran Department of Information Technology, National Institute of Technology Karnataka (NITK), Surathkal, India e-mail: [email protected] M. Kiran e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_9


Fig. 1 Different types of D2D communication

1 Introduction

The rapid increase in the number of cellular users and bandwidth-hungry devices has made spectrum scarce and has increased the demand for network resources. This rapid growth has made it difficult for the traditional infrastructure-centric architecture to manage all the users who communicate simultaneously within a cell. This led to the development of 5G, the fifth generation of cellular networks, which aims to improve spectrum efficiency and thereby meet the increasing demand for network resources. One of the promising verticals of 5G technology is Device to Device (D2D) communication, which was first introduced in 3rd Generation Partnership Project (3GPP) Release 12 [1] and whose functionality was further optimized in subsequent Releases. In LTE standards, this type of communication is also called sidelink communication. In D2D communication, a transmitter $D2D_T$ and a receiver $D2D_R$ form a D2D pair that can communicate directly when the two devices are in close proximity, meaning that the communication is not relayed through the eNB (eNodeB). Thus, the architecture shifts from infrastructure-centric to user-centric. D2D communication offloads some of the responsibilities of the eNB and increases network reliability and capacity. Since the communication is not relayed through the eNB, D2D communication has considerably lower delay and is energy efficient. It also offers spectrum efficiency and a good transmission data rate [2, 3]. D2D communication can be of three types: in-coverage, partial-coverage, and out-of-coverage. In in-coverage communication, the D2D pair is within the vicinity of the eNB, as shown in Fig. 1a. In this type of D2D communication, both $D2D_T$ and $D2D_R$ are controlled by the eNB, and the resource allocation is done explicitly by the eNB (but the eNB does not interfere during the communication). In partial-coverage D2D communication, by contrast, one of the D2D users in a D2D


pair is within the vicinity of the eNB while the other user is outside it, as shown in Fig. 1b. In this situation too, the eNB controls the D2D pair in terms of resource allocation. Fig. 1c shows an infrastructure-less (out-of-coverage) scenario in which the devices of a D2D pair have to synchronize among themselves for the communication [3]. In D2D communication, the eNB is involved only in communicating control information to the D2D pairs, while the data transfer happens directly between the devices. Thus, D2D users can effectively reuse the resources within a cell and increase the throughput. There are two types of spectrum allocation techniques in D2D communication. The first is Inband communication, which uses the licensed spectrum. It is further divided into Overlay, in which separate, non-overlapping portions of the spectrum are used for D2D and cellular users, and Underlay, in which the same spectrum is used for both D2D and cellular users. The second is Outband communication, which uses the unlicensed spectrum and is divided into Controlled and Uncontrolled categories [4, 5]. Because of its distinguishing characteristics, D2D technology has attracted many researchers, and resource allocation, mode selection, communication mechanisms, and relay technology are widely studied in current research. Although D2D communication was first introduced mainly for Public Safety Communication (PSC) and first responders, it was soon used for commercial applications such as gaming, social media, and advertising. The eNB does not differentiate between PSC and commercial applications during resource allocation, even though the former save lives while the latter serve sales, advertising, or gaming. This raises the question of whether the eNB should give equal preference to commercial and PSC applications or give absolute priority to PSC applications, where someone’s life is at stake.
This motivated the authors to design a novel algorithm that gives PSC the utmost preference over commercial applications during emergencies. This can be achieved by reusing Resource Blocks (RBs) and ensuring that more RBs are available for PSC events than for commercial applications. In this work, the authors propose a dynamic resource allocation scheme for D2D pairs in an in-coverage scenario based on the Particle Swarm Optimization (PSO) algorithm. A weighted average fitness function based on throughput is designed for the PSO to prioritize PSC over commercial applications. The main contributions of this work are:

1. A novel dynamic resource allocation scheme based on the PSO algorithm for D2D communication.
2. A weighted average fitness function based on throughput, designed for the PSO to prioritize PSC applications over commercial applications during resource allocation.
3. Support for first responders by ensuring the Quality of Service (QoS) during emergencies.

The rest of the article is organized as follows. In the second section, the authors discuss the work reported in the literature and summarize the current state of the problem. The authors discuss the methodology, detailing how PSO


is incorporated in D2D resource allocation, in the third section. The implementation, results, and analysis are presented after that. Finally, the conclusion is drawn and future work is discussed in the final section.

2 Literature Survey

A number of works in the literature address resource allocation in D2D communication. This section highlights only those that are in line with our objectives.

In [6], the authors ensure the Quality of Experience (QoE) of D2D users through a flexible resource allocation scheme based on a dynamic Stackelberg game. The proposed algorithm uses a single-leader (eNB), multiple-follower (D2D users) architecture, which reduces interference by means of a pricing scheme for optimal RBs and transmit power. To further enhance QoE, three classes of D2D applications are defined, each with a corresponding utility function. A distributed algorithm is proposed to achieve Stackelberg equilibrium across all users. The authors simulate the proposed algorithm and show that it achieves optimal transmit power and increases throughput.

In [7], the authors propose a bargaining game approach for efficient resource allocation and relay selection. They design a multi-relay model in which a D2D user can act not only as a source but also as a relay node. Based on the cooperation behaviour among selfish UEs while relaying, a fair relay selection scheme is derived. Using the bargaining game approach, fair and efficient resource sharing is also proposed. The authors demonstrate the efficacy of the proposed algorithm through simulation.

In [8], the authors propose a probabilistic integrated resource allocation strategy and a quasi-convex optimization algorithm for mode selection and optimized resource allocation in D2D communication. They consider the difference in channel quality between users to select each user’s communication mode dynamically in each scheduling period. The proposed algorithm also achieves fairness by allocating the RBs appropriately, thereby maximizing the throughput. Through theoretical results, the authors show that the proposed algorithm can form the basis for other heuristic algorithms.

In [9], the authors propose a game-theoretic approach that opportunistically exploits the channels not allocated to conventional users in a dedicated mode of D2D communication. After a minimum bandwidth is ensured for each D2D pair, channel reuse is facilitated by grouping the D2D pairs into coalitions. Within a coalition, D2D pairs can reuse each other’s channels. Two approaches are used to create coalitions: dynamic programming and sequential bargaining. The former provides the upper-bound capacity but suffers from high complexity, while the latter comes close to the optimal capacity with lower complexity. Through simulations, the authors claim that the proposed algorithm triples the sum capacity.


In [10], the authors propose a two-stage resource allocation algorithm centred on weighted utilitarian and meta-bargaining solutions. The classical bargaining solutions, namely Nash and Kalai-Smorodinsky, are used to implement the meta-bargaining game. By employing a step-by-step iterative process, the authors significantly reduce the complexity of the proposed algorithm. The simulation results show that the proposed algorithm is fair and efficient.

In [11], the authors try to minimize the transmission power using a distributed strategy based on softmax decision-making and Q-learning. The Hungarian algorithm is then used to search for the optimal spectrum matching scheme that maximizes the sum rate of the D2D users. In [12], the Gale-Shapley algorithm, i.e., the stable matching algorithm, is used to address power allocation to D2D users. First, game theory is used to analyze the UEs’ interactions, and a power allocation algorithm is applied to establish preferences. The Gale-Shapley algorithm is then used to match D2D users with cellular users. In [13], the authors tackle resource allocation between D2D pairs and cellular user equipment using genetic algorithms, and further improve the result using the harmony search algorithm.

Other papers have adopted intelligent optimization algorithms to solve the resource allocation problem. In [14], a joint resource allocation and user matching scheme based on the genetic algorithm is used to reduce intra-cell interference. It searches globally for optimal user matching solutions in order to increase system throughput substantially. In [15], the authors propose a scheme (user machine) based on the genetic algorithm for optimal power allocation to D2D users. The proposed algorithm achieves multidimensional optimization, and the genetic algorithm helps to attain near-optimal user matching across the whole network. Resource allocation is also dealt with in [16, 17].

The literature study reveals that considerable work has been done on resource allocation between D2D users and cellular users. Some of the work uses intelligent optimization algorithms such as PSO and genetic algorithms, while other work uses a game theory approach. Most of the articles consider interference as the main parameter, and some also address power distribution among the UEs. However, none of the existing work specifically tackles resource allocation among the D2D user equipment themselves, i.e., between PSC applications and commercial applications, which is the focus of this work.

3 Methodology

In this work, the authors tackle the problem of optimal allocation of resources, specifically RBs, to PSC D2D devices over commercial D2D devices. In the default case, equal resources are allocated to both PSC D2D devices and commercial D2D devices, as all communication is treated equally at the eNB. However, one kind of traffic serves to save lives (PSC) while the other carries, for example, an advertisement. Hence, an


effort is made in this work to prioritize PSC over commercial applications during resource allocation. Overall, the work concentrates on achieving a higher throughput and a lower end-to-end delay for PSC D2D pairs. To meet these objectives, the PSO technique is used to optimize resource allocation. The methodology is explained in detail in the following subsections.
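The prioritization idea described above can be illustrated with a weighted average of per-pair throughput. The following is only a minimal sketch and not the authors' fitness function: the weights `W_PSC` and `W_COMMERCIAL`, the fixed rate per RB, and the function name are all assumptions introduced here for illustration.

```python
from typing import Sequence

# Hypothetical priority weights (assumptions, not values from the chapter).
W_PSC = 0.8
W_COMMERCIAL = 0.2

# Assumed throughput contributed by one Resource Block, in Mbit/s.
RATE_PER_RB_MBPS = 0.5


def weighted_average_throughput(rbs: Sequence[int], is_psc: Sequence[bool]) -> float:
    """Weighted average throughput of one RB allocation.

    rbs[i]    -- number of RBs given to D2D pair i
    is_psc[i] -- True if pair i runs a public safety application
    """
    if len(rbs) != len(is_psc):
        raise ValueError("rbs and is_psc must have the same length")
    weighted_sum = 0.0
    weight_total = 0.0
    for n_rb, psc in zip(rbs, is_psc):
        weight = W_PSC if psc else W_COMMERCIAL
        weighted_sum += weight * n_rb * RATE_PER_RB_MBPS
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0


# Ten RBs shared by one PSC pair and one commercial pair.
pairs_are_psc = [True, False]
favour_psc = weighted_average_throughput([8, 2], pairs_are_psc)
uniform = weighted_average_throughput([5, 5], pairs_are_psc)
favour_commercial = weighted_average_throughput([2, 8], pairs_are_psc)
```

Under these assumed weights, an allocation that gives more RBs to the PSC pair scores higher than the uniform one, which in turn scores higher than an allocation favouring the commercial pair. An optimizer that maximizes such a function is therefore steered towards PSC-friendly allocations.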

3.1 Particle Swarm Optimization (PSO)

PSO is a metaheuristic optimization algorithm inspired by the characteristic behaviour of bird flocking and fish schooling. It is based on the idea that swarm intelligence can be used to solve complex mathematical optimization problems in engineering. Each particle has two main attributes, a velocity and a coordinate (position), both of which are initialized in the solution space. Each particle also has a fitness value, calculated using a fitness function. In the proposed algorithm, the weighted average throughput function is used as the fitness function. Every particle knows its own best performance as well as the group's best performance, and both influence its velocity in the next iteration: each particle adjusts its velocity based on its previous best performance and on the group's best performance. As the algorithm iterates, the particles converge towards the best solution in the solution space [18]. PSO has several advantages:

1. It has a relatively low computational complexity.
2. It has fewer hyper-parameters to select than other optimization algorithms.
3. It is versatile and can be applied to many different settings and engineering problems.
4. It is appropriate for optimizing nonlinear continuous functions.

The main objective of an optimization problem is to find a variable, represented by a vector $X = [x_1, x_2, x_3, \ldots, x_n]$, that minimizes or maximizes the chosen optimization function $f(X)$. The position vector $X$ is an $n$-dimensional vector, where $n$ is the total number of variables to be determined in the problem. The function $f(X)$ is referred to as the fitness function or objective function; it estimates how good or bad a position $X$ is [18].

3.2 Resource Allocation Using PSO

In the proposed work, PSO is used to find an optimal resource allocation scheme that allocates an appropriate number of RBs to each PSC and commercial D2D pair. Consider a swarm of $P$ particles, where each particle $G_p$ encodes a unique resource allocation scheme. A particle is defined as the tuple

Optimal Resource Allocation for Public Safety Device …


$$G_p = \{X_p, V_p\} \tag{1}$$

Here $X_p$ is the position of the particle and $V_p$ is its velocity. With respect to the proposed objectives, each element $X_{p,i}$ of the position vector represents the number of resource blocks to be allocated to the $i$th D2D pair. If $X_p^t$ and $V_p^t$ denote the position and velocity vectors of the $p$th particle at iteration $t$, then the new velocity vector $V_p^{t+1}$ and the new position $X_p^{t+1}$, i.e., the new resource allocation scheme, are given by Eqs. 2 and 3, respectively:

$$V_p^{t+1} = \alpha_1 V_p^t + \alpha_2 r_1^t \left(X_p^* - X_p^t\right) + \alpha_3 r_2^t \left(g^* - X_p^t\right) \tag{2}$$

$$X_p^{t+1} = X_p^t + V_p^{t+1} \tag{3}$$

Equation 2 updates the velocity of the particle, while Eq. 3 updates its position. The parameter $\alpha_1$ is the inertia constant; in the classical PSO version it is a positive constant. This parameter balances global search, also known as exploration, against local search, known as exploitation. Equation 2 shows that the velocity of a particle in each iteration has three parts, which are detailed below.

The first term in Eq. 2 is the product of $\alpha_1$ and the previous velocity of the particle, which carries the particle's previous motion into the current iteration. For example, if $\alpha_1 = 1$, the motion of the particle is fully influenced by its previous motion, and the particle may keep going in the same direction. If $0 \le \alpha_1 < 1$, this influence is reduced, so the particle tends to move to other regions of the search domain. Therefore, as the inertia weight is reduced, the swarm may explore more areas of the search domain, which improves the chance of reaching the global optimum.

The second term of Eq. 2, the individual cognition term, is based on the difference between the particle's own best position $X_p^*$ and its current position. As the particle moves away from its personal best position, the difference $X_p^* - X_p^t$ grows, and the growth of this term attracts the particle back towards its own best position. The parameter $\alpha_2$ is the individual cognition parameter; it scales the importance of the particle's own previous experience. The parameter $r_1$ is a random value in the range [0, 1]; it plays a vital role in avoiding premature convergence.

Finally, the third term of Eq. 2 is the social learning term, based on the difference between the swarm's best position $g^*$ and the particle's current position, scaled by $\alpha_3$ and the random value $r_2$. PSO should converge not only towards local optima but also towards the global optimum, so this term pulls all particles towards the global best. All particles in the swarm share information about the best position found so far, which motivates every particle to move towards the global best.
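The update rules in Eqs. 2 and 3 can be sketched in a few lines of Python. This is a minimal, generic illustration and not the authors' NS-3 implementation: the names `pso_step`, `run_pso`, and `sphere`, the coefficient values, and the toy quadratic objective (minimised here, whereas the chapter maximises a throughput-based fitness) are all assumptions made for the example. Random factors are drawn per vector element, which is one common variant.

```python
import random

def pso_step(positions, velocities, pbest, gbest, a1=0.7, a2=1.5, a3=1.5, rng=random):
    """Apply Eq. 2 (velocity) and Eq. 3 (position) once to every particle, in place."""
    for p in range(len(positions)):
        for i in range(len(positions[p])):
            r1, r2 = rng.random(), rng.random()  # random factors in [0, 1)
            velocities[p][i] = (a1 * velocities[p][i]                          # inertia
                                + a2 * r1 * (pbest[p][i] - positions[p][i])    # cognition
                                + a3 * r2 * (gbest[i] - positions[p][i]))      # social
            positions[p][i] += velocities[p][i]                                # Eq. 3

def sphere(x):
    """Toy objective to minimise (an assumption; not the chapter's fitness)."""
    return sum(v * v for v in x)

def run_pso(n_particles=10, dim=3, iters=50, seed=0):
    """Run a small swarm and return (initial best value, final best value)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    gbest = min(pbest, key=sphere)[:]
    initial_best = sphere(gbest)
    for _ in range(iters):
        pso_step(pos, vel, pbest, gbest, rng=rng)
        for p in range(n_particles):
            if sphere(pos[p]) < sphere(pbest[p]):  # keep each particle's best-so-far
                pbest[p] = pos[p][:]
        gbest = min(pbest, key=sphere)[:]          # best position over the swarm
    return initial_best, sphere(gbest)
```

Because the personal and global bests are only ever replaced by better positions, the final best value returned by `run_pso` can never be worse than the initial one.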

3.3 Weighted Average Throughput and Penalty for Resource Constrained Settings

As the total resources available are scarce, the proposed algorithm prioritizes these resources for the PSC D2D pairs over the commercial D2D pairs. Accordingly, a fitness function has been designed as a weighted combination of the throughput of the PSC D2D pairs and the commercial D2D pairs, wherein the latter is assigned a lower weight. This weighted average throughput fitness function guides the PSO algorithm to prioritize the total throughput of the PSC D2D pairs. The fitness of the $p$th particle is formulated as:

$$F(X_p) = \frac{w_{psc}\,\mathit{Throughput}_{psc} + w_{com}\,\mathit{Throughput}_{com}}{w_{psc} + w_{com}} \tag{4}$$

Here, $w_{psc} > w_{com}$ so that more priority is assigned to the PSC D2D pairs. To compute $\mathit{Throughput}_{psc}$ and $\mathit{Throughput}_{com}$, the simulation is run for a fixed duration, and the packets received are totaled in bytes for the PSC and commercial D2D pairs separately. The proposed algorithm also contains a penalty scheme that penalizes the commercial D2D pairs by reducing the number of RBs allocated to them whenever the total number of allocated RBs exceeds the maximum capacity, MaxResBlockAvailable. The penalty scheme is as follows:

    nD2DPairs = 10  # total D2D pairs (Sect. 3.4); second half are commercial

    def penalty(res, num):
        """Penalize the commercial pairs' allocated resources if the total
        number of resources is more than MaxResBlockAvailable."""
        while num > 0:
            reduced = False
            for i in range(nD2DPairs // 2, nD2DPairs):
                # never take a commercial pair below one RB
                if num > 0 and res[i] > 1:
                    res[i] -= 1
                    num -= 1
                    reduced = True
            if not reduced:
                break  # nothing left to take from any commercial pair
        return res

Here the variable num is the difference between the total number of allocated RBs and MaxResBlockAvailable, and the second half of the array res stores the number of RBs allocated to the commercial pairs. The penalty scheme iteratively reduces the number of RBs allocated to commercial D2D pairs, one RB per pair per pass, until the excess is removed or every commercial pair is down to a single RB.
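The fitness in Eq. 4 and the trigger for the penalty can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names `weighted_fitness` and `excess_rbs` and the weights 0.7 and 0.3 are hypothetical, since the source only requires that the PSC weight exceed the commercial weight.

```python
def weighted_fitness(thr_psc, thr_com, w_psc=0.7, w_com=0.3):
    """Eq. 4: weighted average of PSC and commercial throughput.

    The default weights are placeholders; the source only states w_psc > w_com.
    """
    if w_psc <= w_com:
        raise ValueError("w_psc must exceed w_com so that PSC traffic is prioritised")
    return (w_psc * thr_psc + w_com * thr_com) / (w_psc + w_com)

def excess_rbs(allocation, max_res_block_available):
    """RBs by which an allocation exceeds capacity (0 if within capacity).

    This is the value passed as `num` to the penalty routine.
    """
    return max(0, sum(allocation) - max_res_block_available)
```

For instance, with these placeholder weights a particle whose allocation yields 2000 Kbps for PSC and 1000 Kbps for commercial traffic scores about 1700, whereas the reverse split scores about 1300, so the swarm is steered towards allocations that favor PSC pairs.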


Fig. 2 The PSO algorithm overall

The overall PSO algorithm proposed in this work is shown in Fig. 2. Figure 3 illustrates how the proposed algorithm drives each particle in the swarm to update its position, thereby improving the resource allocation scheme so as to favor the public safety D2D pairs and raise the weighted average throughput of the network.


Fig. 3 Working principle of PSO with fitness function for optimized resource allocation

3.4 Implementation Details

The proposed algorithm is simulated in NS-3 [19] with the psc-3.0.1 patch [20]. A network of 10 D2D pairs and one eNB is considered for the simulation. Of the 10 pairs, 5 are treated as PSC D2D pairs and the remaining 5 as commercial D2D pairs. All the D2D pairs are within the vicinity of the eNB, i.e., an in-coverage scenario is simulated. The default settings for transmission power and the path loss model were used. The physical layer variables were left unchanged, and RrSlFfMacScheduler, which is based on the round-robin method, was used as the scheduler. Sidelink communication channels are established between the D2D pairs, through which they communicate with each other directly without the eNB's assistance. For communication, applications were installed on each of the D2D pairs. It is assumed that the public safety applications serve scenarios such as pandemics and earthquakes, and that the commercial applications are social media, gaming, etc.

In the classical approach, the round-robin scheduler allocates a static number of resources to all the D2D pairs. It neither gives preference to PSC D2D pairs nor dynamically changes the RBs assigned to a pair, even in an emergency. This behaviour has been replaced with the proposed PSO-based dynamic resource allocation algorithm, and the results are analyzed. During resource allocation, the scheduler refers to the resource allocation table before allocating resources to the D2D pairs. The resource allocation table consists of the RNTI (Radio Network Temporary Identifier) of each D2D pair and the corresponding number of RBs to be allocated to it. In the proposed algorithm, this table, specifically the column holding the number of RBs, is updated based on the PSO algorithm, and the scheduler allocates resources to the D2D pairs accordingly. If no entry is present for a D2D pair, the scheduler allocates the default number of resources to that pair. Initially, a random number of resource blocks between 1 and 3 was allocated to each D2D pair.

In the PSO algorithm, each particle encodes a different resource allocation scheme as described in Sect. 3.2, i.e., the position vector of each particle stores a resource allocation scheme. The velocity of a particle is its tendency to change the current resource allocation scheme so as to maximize its fitness value (the fitness function is discussed in Sect. 3.3). Initial positions and velocity vectors are assigned randomly for all the particles. The personal best and global best are variables maintained by the PSO algorithm and are initially set from the particles' starting positions. Each particle runs the simulation with its assigned parameters and generates the output. The simulation produces the network throughput, which is calculated separately for PSC D2D pairs and commercial D2D pairs. These throughput values are then used to calculate the fitness for the current iteration. If the fitness value is better than the particle's current personal best, the personal best is replaced by it, and the global best is then checked and updated in the same way. The proposed algorithm also calculates the throughput and average end-to-end delay of PSC D2D pairs and commercial D2D pairs. The results are discussed in the next section.
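The table lookup with its default fallback and the personal/global best bookkeeping described above can be sketched as below. This is a hedged illustration only: the dictionary-based table, the names `rbs_for` and `update_bests`, and the default of 2 RBs are hypothetical (the source does not state the default value), and none of this mirrors the NS-3 scheduler's actual API.

```python
DEFAULT_RBS = 2  # hypothetical default; the source does not give the value

def rbs_for(table, rnti, default=DEFAULT_RBS):
    """Return the RBs listed for a D2D pair's RNTI, or the default if no entry exists."""
    return table.get(rnti, default)

def update_bests(position, fitness, personal, global_):
    """Update best-so-far records; each record is a (fitness, position) tuple.

    Fitness is maximised, matching the throughput-based fitness of Sect. 3.3.
    """
    if fitness > personal[0]:
        personal = (fitness, list(position))
    if personal[0] > global_[0]:
        global_ = personal
    return personal, global_

# Example: RNTI 9 has no entry, so the scheduler falls back to the default.
allocation_table = {1: 3, 2: 1}
rbs_known = rbs_for(allocation_table, 1)
rbs_missing = rbs_for(allocation_table, 9)
```

Keeping the table as a plain mapping from RNTI to RB count makes the "missing entry" case explicit, which is the only scheduler behaviour the text specifies.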

4 Results and Analysis

While running the proposed algorithm, the authors observed that as the PSO iterations increased, so did the performance of PSC applications relative to commercial applications. The PSO was run for 15 iterations. From the simulation results, the total network throughput and the throughput of the PSC and commercial D2D pairs were calculated. As a second QoS parameter, the average end-to-end delay of the PSC and commercial D2D pairs was also calculated and tabulated. Figures 4 and 5 show screenshots of the D2D pairs for PSC and commercial applications, respectively, including the RNTI number of each node, while Fig. 6 shows a snapshot of the overall output. The results are detailed below.

Fig. 4 Public safety communication D2D pairs

Fig. 5 Commercial communication D2D pairs

Fig. 6 Resource allocation using PSO and evaluation parameters

Table 1 shows the throughput values extracted for each iteration. Network throughput is measured as the amount of data transmitted and received per unit time, in Kbps. In the table, the Total column refers to the total throughput of the network, while the other two columns refer to the PSC and commercial communication throughput, respectively. One can observe that as the PSO iterations increase, the PSC D2D pairs increasingly dominate the commercial D2D pairs. The PSO dynamically allocates an appropriate number of RBs to the PSC and commercial pairs based on the preference and learning; hence, as the iterations increase, the algorithm converges by allotting more RBs to PSC than to commercial applications. More RBs result in more data transfer, which explains the dominance of PSC over commercial communication in terms of throughput.

Table 1 Throughput values for D2D device pairs in Kbps

Itr  Total (Kbps)  PSC (Kbps)  Com (Kbps)
1    1603.2        841.68      761.523
2    1723.3        922.68      800.62
3    1843.69       1082.16     761.523
4    2284.12       1246.25     1037.87
5    2602.15       1523.56     1078.59
6    2736.96       1689.45     1074.51
7    3058.65       1905.68     1152.97
8    3428.69       2047.54     1381.15
9    3847.69       2344.48     1503.2
10   4308.41       2905.61     1402.8
11   4308.41       2905.61     1402.8
12   4308.41       2905.61     1402.8
13   4308.41       2905.61     1402.8
14   4308.41       2905.61     1402.8
15   4308.41       2905.61     1402.8

Figure 7 presents the same results graphically, making the dominance of PSC over commercial applications clearer. The graph shows that the throughput of commercial applications stays below the PSC throughput throughout the first 10 iterations. After 10 iterations, the throughput saturates for both PSC and commercial applications, which means that the PSO algorithm has converged; no further changes are made to the resource allocation table of the scheduler. From the first to the last iteration in Table 1, there is a 168.73% increase in the total network throughput and a 245.21% increase in the PSC application throughput, while the throughput for commercial applications increases by only 84.20%.

Fig. 7 Throughput versus iterations

Next, the proposed algorithm was analyzed for average end-to-end delay. The end-to-end delay, also known as One-Way Delay (OWD), is the total time taken by a packet to reach the destination from the source across the network. For D2D communication, it measures the time taken for a packet to travel from the D2D transmitter to the D2D receiver and is measured in nanoseconds (ns). The end-to-end delay in iteration 1 is almost the same for PSC and commercial applications, because in the first iteration almost the same number of RBs is allocated to both PSC and commercial application devices, and an equal number of RBs contributes an equal queuing delay. As the number of iterations increases, the end-to-end delay decreases for both PSC and commercial applications, because the allocated RBs increase for both parties: the more RBs, the lower the queuing delay. After some iterations, however, the end-to-end delay decreases more for PSC applications than for commercial applications, as can also be seen in the graph in Fig. 8. The reason is that the proposed algorithm gives more preference to PSC than to commercial applications, resulting in more RBs for PSC. These additional RBs reduce the queuing delay for PSC compared with commercial applications, which have relatively fewer RBs. Table 2 shows an 89.45% decrease in end-to-end delay for PSC application D2D devices, against only a 74.24% decrease for commercial application D2D devices. Overall, the simulation results reveal that PSC applications achieve better metrics than commercial applications as the number of iterations increases and the RB allocation improves while the PSO algorithm converges.

Table 2 Average end-to-end delay in nanoseconds (ns)

Itr  PSC avg delay (ns)  Com avg delay (ns)
1    8.03e+8             8.33e+8
2    6.52e+8             7.52e+8
3    452082000           740072000
4    252082000           520556000
5    224869000           440896000
6    198956000           398956000
7    158963000           356989000
8    115469000           296815000
9    1.05e+8             256365000
10   8.46e+7             214565000
11   84638200            214565000
12   84638200            214565000
13   84638200            214565000
14   84638200            214565000
15   84638200            214565000

Fig. 8 Average end-to-end delay versus iterations
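The percentage figures quoted for throughput and delay can be reproduced from the first and last rows of Tables 1 and 2. The short sketch below does this; the helper name `percent_change` is an assumption for illustration, while the numeric inputs are copied from the two tables.

```python
def percent_change(first, last):
    """Relative change from first to last, in percent (positive means an increase)."""
    return (last - first) / first * 100.0

# (iteration 1, iteration 15) values copied from Table 1 (Kbps) and Table 2 (ns).
throughput_kbps = {
    "Total": (1603.2, 4308.41),
    "PSC": (841.68, 2905.61),
    "Com": (761.523, 1402.8),
}
delay_ns = {
    "PSC": (8.03e8, 84638200),
    "Com": (8.33e8, 214565000),
}

throughput_gain = {k: percent_change(a, b) for k, (a, b) in throughput_kbps.items()}
delay_drop = {k: -percent_change(a, b) for k, (a, b) in delay_ns.items()}

for name, value in throughput_gain.items():
    print(f"{name} throughput increase: {value:.2f}%")
for name, value in delay_drop.items():
    print(f"{name} delay decrease: {value:.2f}%")
```

The computed values agree with the reported 168.73%, 245.21%, 84.20%, 89.45%, and 74.24% to within about 0.01 percentage points, the small differences being a matter of rounding.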

5 Conclusion and Future Work

Though D2D communication was primarily devised for PSC and first responders, it eventually came to be used for other, commercial applications. The eNB allocates an equal amount of resources to both PSC and commercial applications during resource allocation, even though PSC has a crucial job to do. Hence, in this work, an effort is made to prioritize the PSC D2D pairs over commercial D2D pairs using the PSO algorithm. Accordingly, a weighted average fitness function based on throughput was designed for the PSO. The proposed algorithm was simulated in NS-3, and the results show that the resource allocation was optimized successfully. A significant improvement in the throughput of PSC applications over commercial D2D pair devices was observed: a 245.21% increase in throughput for PSC applications compared with only an 84.20% increase for commercial D2D pairs. With end-to-end delay as the metric, an 89.45% decrease was observed for PSC compared with only a 74.24% decrease for commercial D2D pair devices. The success of PSO depends on the fitness function, and in the proposed algorithm a weighted average throughput fitness function was designed. In the near future, the authors would like to explore fitness functions with additional parameters.

References

1. 3GPP Official Website. https://www.3gpp.org/specifications/releases/68-release-12. Accessed 16 May 2021
2. Osseiran, A., Monserrat, J.F., Marsch, P. (eds.): 5G Mobile and Wireless Communication Technology, 1st edn. Cambridge University Press (2016). ISBN 978-1-107-13009-8
3. ns-3 Public safety Communication Documentations Release psc-4.0. National Institute of Standards and Technology, 15 April 2021. https://github.com/usnistgov/psc-ns3/blob/psc-4.0/ns3psc-documentation.pdf. Accessed 15 May 2021
4. Lin, X., Andrews, J.G., Ghosh, A.: Spectrum sharing for device-to-device communication in cellular networks. IEEE Trans. Wirel. Commun. 13(12), 6727–6740 (2014). https://doi.org/10.1109/TWC.2014.2360202
5. Song, L., Han, Z., Xu, C.: Resource Management for Device-to-Device Underlay Communication. Springer (2014). ISBN 978-1-4614-8192-8
6. Sawyer, N., Smith, D.B.: Flexible resource allocation in device-to-device communications using Stackelberg game theory. IEEE Trans. Commun. 67(1), 653–667 (2019). https://doi.org/10.1109/TCOMM.2018.2873344
7. Zhang, G., Hu, J., Heng, W., Li, X., Wang, G.: Distributed power control for D2D communications underlaying cellular network using Stackelberg game. In: 2017 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–6 (2017). https://doi.org/10.1109/WCNC.2017.7925878
8. Li, J., Lei, G., Manogaran, G., Mastorakis, G., Mavromoustakis, C.X.: D2D communication mode selection and resource optimization algorithm with optimal throughput in 5G network. IEEE Access 7, 25263–25273 (2019). https://doi.org/10.1109/ACCESS.2019.2900422
9. Najla, M., Becvar, Z., Mach, P.: Reuse of multiple channels by multiple D2D pairs in dedicated mode: game theoretic approach. IEEE Trans. Wirel. Commun. (2021). https://doi.org/10.1109/TWC.2021.3057825
10. Kim, S.: D2D enabled cellular network spectrum allocation scheme based on the cooperative bargaining solution. IEEE Access 8, 53710–53719 (2020). https://doi.org/10.1109/ACCESS.2020.2981290
11. Pérez-Romero, C.J., Sánchez-González, J., Agustí, R., Lorenzo, B., Glisic, S.: Power-efficient resource allocation in a heterogeneous network with cellular and D2D capabilities. IEEE Trans. Veh. Technol. 65(11), 9272–9286 (2016)
12. Zhou, C.Z., Ota, K., Dong, M., Xu, C.: Energy-efficient matching for resource allocation in D2D enabled cellular networks. IEEE Trans. Veh. Technol. 66(6), 5256–5268 (2017)
13. CVassilakopoulos, M., et al.: Resource allocation schemes based on intelligent optimization algorithms for D2D communications underlaying cellular networks. Mob. Inf. Syst., Hindawi (2018)
14. Yang, C.C., Xu, X., Han, J., Rehman, W.U., Tao, X.: GA based optimal resource allocation and user matching in device to device underlying network. In: Proceedings of 2014 IEEE Wireless Communications and Networking Conference Workshops, Istanbul, Turkey, April 2014, pp. 242–247
15. Yang, C.C., Xu, X., Han, J., Tao, X.: GA based user matching with optimal power allocation in D2D underlaying network. In: Proceedings of IEEE Vehicular Technology Conference, Vancouver, Canada, Sept 2014, pp. 1–5
16. Wang, C.F., Li, Y., Wang, Z., Yang, Z.: Social-community-aware resource allocation for D2D communications underlaying cellular networks. IEEE Trans. Veh. Technol. 65(5), 3628–3640 (2016)
17. Tang, C.H., Ding, Z.: Mixed mode transmission and resource allocation for D2D communication. IEEE Trans. Wirel. Commun. 15(1), 162–175 (2016)
18. Erdogmus, P. (ed.): Particle Swarm Optimization with Applications. IntechOpen (2018). ISBN 9781789231489, 1789231485
19. NS-3 Official Website. https://www.nsnam.org/. Accessed 16 May 2021
20. NS-3 Appstore. https://apps.nsnam.org/app/tag/publicsafety/. Accessed 16 May 2021

Research Progress in Internet of Things (IoT) Application in Smart Cities Development: A Bibliometric Analysis

Shri Ram

Abstract The Internet of Things (IoT) is a paradigm shift in web technology with the capability to connect household electronic devices. A large opportunity is emerging as IoT is applied to monitoring different devices through the Internet. Countries around the world are harnessing web technologies to improve the facilities of their cities, turning them into smart cities that improve the lifestyle of their residents. IoT is thus expanding its horizon and being applied to the development of smart cities. The purpose of this paper is to analyze the bibliographic data available from the SCOPUS database to understand the scope of IoT application in smart cities development. The keywords associated with IoT and smart cities returned 18,241 documents published from 136 countries. The subject is proliferating into different disciplines, and new research areas are being explored where IoT, as an emerging technology, can be applied to develop smart cities.

Keywords Internet of Things · Smart cities · Bibliometric analysis · Research trends

S. Ram (B)
Central Library, Sikkim University, Gangtok 737102, Sikkim, India
e-mail: [email protected]; [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_10

1 Introduction

The Internet of Things (IoT) is a paradigm shift in Internet technology and wireless communication. The term 'Internet of Things' was first used by Kevin Ashton in 1999 in the context of optimizing supply chain systems. As the concept has grown, it has promised to have an impact on 'things' such as home appliances, medical equipment and devices, and consumer electronic goods. IoT has become an emerging technology over time, with enormous improvements in its integration, data transfer, and analytics enabling better applications. Initially, IoT found its application in major industrial and high-tech companies [48]. However, as technological innovation progressed with the application of Radio Frequency Identification (RFID) and Wireless Sensor Networks, IoT moved towards connecting the world's devices to make city life comfortable and of high quality [8, 21, 32, 40]. Over time, IoT has also found application in medical health [14, 26], machine learning in health care [19], energy conservation [29, 46], and transport-centric applications [1, 28], among others. As the Internet of Things progresses, its applications extend services that make city life more comfortable through the use of microcontrollers and transceivers [51].

An enormous amount of research is under way, focused especially on market applications and on the political, technical, and financial dimensions. This chapter analyses the published literature on IoT application in the development of smart cities in order to assess the current status of research and the key authors, institutions, countries, and journals in which most IoT publications appear. The direction of research progression was analysed on the basis of keywords to understand the proliferation of new research areas. Bibliometric studies provide such an avenue for describing research trends in a scientific domain [20]. Many bibliometric studies have analysed research publications related to the Internet of Things. Some of these studies were generic in nature [5, 36], while others dealt with IoT application in agriculture [42], blockchain [27], and business, supply chain and marketing [12, 33, 37]. However, no bibliometric study of IoT application in smart cities has been traced. Bibliometric studies are often used as a tool to sharpen the research question in a given field and help in understanding the direction of research [34]. Bibliometric analysis helps in assessing research growth; publication types; the languages in which articles are published; the productivity of authors, institutions, countries, and journals; impact in terms of citations; and future growth and networking in a given subject. The rationale of this study is therefore to assess the research productivity of IoT application in the development of smart cities through bibliometric methods. The study was taken up with the specific objectives to:

• Understand the extent of research progress on IoT application in the development of smart cities using bibliometric methods
• Map the research activities in future applications and in the development of new areas where IoT can be applied for the development of smart cities

2 Methodology

The research is based on the bibliographic literature on IoT and smart cities retrieved from the SCOPUS database. SCOPUS is a multidisciplinary indexing, abstracting, and citation database covering over seventy-five million bibliographic records from over 24,600 publications and over sixteen million author profiles. The database is used extensively for identifying and analysing journals for publication, monitoring research impact, differentiating research topics, and for bibliometric analysis. The keywords "Internet of Things", "Internet of Thing", "Web of things", "IoT", "Smart Cities", and "Smart City", occurring in the Title, Abstract, and Keyword fields of the database, were used to build the search query. The time period covered was 2011 to 2020. The citation count was taken as the number of citations received by an article from its publication up to 2020. The Journal Impact Factor (JIF), the ratio of citations received to items published by a journal in the preceding two years, was taken from the Journal Citation Reports published by Clarivate Analytics for the year 2019. VOSviewer was used to create network diagrams. The results were analysed for productivity by document type and language; annual productivity; productive authors, institutions, countries, and journals; citation impact; keyword proliferation and subject development; and most cited articles.
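The search described above could be assembled along the following lines. This is only a sketch: `TITLE-ABS-KEY` and `PUBYEAR` are Scopus advanced-search field codes, but joining the IoT terms and the smart-city terms with AND is an assumption, since the source lists the keywords without stating the Boolean operators; the helper names `or_group` and `build_query` are likewise hypothetical.

```python
IOT_TERMS = ["Internet of Things", "Internet of Thing", "Web of things", "IoT"]
CITY_TERMS = ["Smart Cities", "Smart City"]

def or_group(terms):
    """Quote each phrase and join the phrases with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

def build_query(start_year=2011, end_year=2020):
    """Build a title/abstract/keyword query restricted to an inclusive year range.

    The AND between the two keyword groups is an assumption (see lead-in).
    """
    terms = f"TITLE-ABS-KEY({or_group(IOT_TERMS)} AND {or_group(CITY_TERMS)})"
    return f"{terms} AND PUBYEAR > {start_year - 1} AND PUBYEAR < {end_year + 1}"

query = build_query()
print(query)
```

Keeping the two keyword groups separate makes it easy to rerun the search with a different Boolean combination if the original query turns out to have been structured differently.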

3 Results and Discussion

The search strategy involving the keyword combination "Internet of Things", "Internet of Thing", "Web of things", "IoT", "Smart Cities", and "Smart City" returned 18,241 documents from the SCOPUS database. The following sections discuss the general characteristics of the research publication trends.

3.1 Document Types and Language of Publication

The 18,241 documents on IoT application in smart cities were distributed across 15 document types. A significant share of the literature was indexed as conference papers (9270; 50.82%) and articles (7172; 39.32%). These two document types are the crucial components of research on IoT application in smart cities. The high number of conference papers indicates that research in this field is at a new and emerging stage and that conference deliberations serve as a medium of fast communication [38, 41]. Other significant document types included book chapters (757; 4.15%), reviews (673; 3.69%), and conference reviews (231; 1.27%). Document types contributing less than one percent were editorials (63; 0.35%), books (52; 0.29%), notes (6; 0.03%), and short surveys (5; 0.03%). Data papers, letters, an erratum, and some papers in an unindexed category were also found (Table 1).

In general, citations are taken as a reflection of research impact or quality [3, 49]. Many studies describe a relationship between document types and citation impact [24, 44]. The review document type had the highest Average Citations Per Paper (ACPP), with an ACPP2020 of 37.75 citations per paper. This can be ascribed to three classic review articles with 500 or more citations (TC2020 ≥ 500): Agiwal et al. [2] with 1263 citations, Lu [30] with 824 citations, and Borgia [11]. Short surveys, with five documents, averaged 26.20 citations, with one article [31] having 126 citations. Books were also among the highest, with an ACPP2020 of 18.08 citations per book; Holler et al. [23] and Vermesan and Friess [47] each had more than 150 citations. Articles, which report original research, averaged 16.12 citations per paper.


Table 1 Document types of IoT application in smart cities research

Document type      TP    %      TC2020   ACPP2020
Conference paper   9270  50.82  47,874   5.16
Article            7172  39.32  115,636  16.12
Book chapter       757   4.15   2295     3.03
Review             673   3.69   25,403   37.75
Conference review  231   1.27   0        0
Editorial          63    0.35   723      11.48
Book               52    0.29   940      18.08
Note               6     0.03   19       3.17
Short survey       5     0.03   131      26.2
Retracted          3     0.02   47       15.67
Data paper         2     0.01   8        4
Letter             2     0.01   0        0
Erratum            1     0.01   0        0
Undefined          4     0.02   0        0

TP = Total Publications; TC2020 = Total Citations up to 2020; ACPP2020 = Average Citations Per Paper up to 2020

There were eighteen classic papers with 500 or more citations (TC2020 ≥ 500), and Gubbi et al. [21] was among the most highly cited. Though conference papers were the most productive document type, their average was only 5.16 citations per paper. As English is the dominant language of scientific communication, 99.04% of the documents were published in English; the remaining documents appeared in thirteen other languages.
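The percentage share and ACPP2020 columns of Table 1 follow directly from the TP and TC2020 columns. The sketch below recomputes them for three rows; the function names `share_percent` and `acpp` are assumptions for illustration, and the inputs are copied from Table 1.

```python
TOTAL_DOCS = 18241  # total documents retrieved from SCOPUS

def share_percent(tp, total=TOTAL_DOCS):
    """Share of a document type in the whole set, in percent (two decimals)."""
    return round(tp / total * 100, 2)

def acpp(tc, tp):
    """Average citations per paper: total citations divided by total publications."""
    return round(tc / tp, 2) if tp else 0.0

# (TP, TC2020) pairs copied from Table 1.
rows = {
    "Conference paper": (9270, 47874),
    "Article": (7172, 115636),
    "Review": (673, 25403),
}

for doc_type, (tp, tc) in rows.items():
    print(f"{doc_type}: {share_percent(tp)}% of documents, ACPP = {acpp(tc, tp)}")
```

The same two ratios underlie the per-journal figures reported later, so a single helper of this kind is enough to check any row of the productivity tables.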

3.2 Annual Growth of Publications on IoT Application in Smart Cities As shown in Fig. 1, only 10 IoT application in smart cities documents were published in 2011 in contrast to 5420 documents in 2020. So, over a period of ten years, the literature was grown to 541 times than that published in 2011. Till 2019, the publication growth has been found exponentially with squared value of r = 9609, which is an indication that 96% positive growth in IoT application in smart cities development. Considering this growth trends, it is expected that the publication in 2020 must be much higher, the number of publications were slightly higher than 2019. The citation count of an articles is one indicator and it denotes the impact of a publication in given field or related research areas [3, 49, 54]. Though the number of research publication on IoT application in Smart cities increased over ten years,the average citations per publication (ACPP) declined. The documents with longer life

Research Progress in Internet of Things (IoT) Application …

Fig. 1 The publication growth of IoT application in smart cities development

span have a higher ACPP than more recent documents. The documents published in 2011 have an ACPP of 107,790 citations, compared with an ACPP of 0.20 citations per paper for documents published in 2020. Citation impact is time-sensitive, and these publications still need to be tested over time to assess their real impact.
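The growth described in this section can be illustrated with two small calculations: a compound annual growth rate from the 2011 and 2020 counts reported above, and a log-linear least-squares fit of the kind that yields an r² value for exponential growth. This is only a sketch under stated assumptions: the function names are hypothetical, the yearly counts between 2011 and 2020 are not given in the text, and the series passed to `fit_exponential` below is synthetic and for illustration only.

```python
import math


def cagr(first: float, last: float, periods: int) -> float:
    """Compound annual growth rate over the given number of yearly intervals."""
    return (last / first) ** (1 / periods) - 1


def fit_exponential(years, counts):
    """Least-squares fit of ln(count) = a + b * year.

    Returns (b, r_squared), where b is the continuous growth rate per year.
    """
    xs = [float(y) for y in years]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


# Reported endpoints: 10 documents in 2011 and 5420 in 2020 (nine yearly intervals).
print(f"CAGR 2011-2020: {cagr(10, 5420, 9):.2%}")

# Synthetic, perfectly exponential series (NOT the study's data), to show the fit.
demo_years = [2011, 2012, 2013, 2014, 2015]
demo_counts = [10 * 2 ** i for i in range(5)]
slope, r2 = fit_exponential(demo_years, demo_counts)
print(f"growth rate per year: {slope:.4f}, r^2: {r2:.4f}")
```

With the reported endpoints, the annual output roughly doubled each year on average. Real yearly counts would give an r² below 1, such as the 0.9609 reported for the period up to 2019.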

3.3 Most Productive SCOPUS Subject Categories and Journals

The analysis of the distribution of categories and journals is a basic part of a bibliometric study [13, 15, 35]. SCOPUS divides its whole database into 334 subject categories [39]. The research publications in this field were distributed across twenty-six different subject categories. Because IoT belongs to engineering & technology (computer science) and smart cities to civil engineering & architecture, most of the publications fall under computer science, engineering, mathematics, decision science, physics, energy and materials science. However, some interdisciplinary subject areas, such as environmental science and business & economics, are also represented. In total, 7927 documents on this theme (leaving aside conference publications, books and book chapters) were published in 1203 journals. The top fifteen journals that published the most documents on this theme are listed in Table 2 with their total publications, percentage share, rank, total citations, average citations per paper and Impact Factor 2019 (IF2019). Of

S. Ram

Table 2 Most productive journals in IoT application in smart cities development

| Source title | TP | R(% Share) | TC | ACPP | IF (IF2019) | Subject Category |
| --- | --- | --- | --- | --- | --- | --- |
| IEEE Access | 718 | 1(9.06) | 14,983 | 20.87 | 3.745 | Computer Science, Information Systems |
| IEEE Internet of Things Journal | 624 | 2(7.87) | 19,670 | 31.52 | 9.936 | Computer Science, Information Systems |
| Sensors (Switzerland) | 607 | 3(7.66) | 9158 | 15.09 | 3.275 | Instruments & Instrumentation |
| Future Generation Computer Systems | 236 | 4(2.98) | 14,029 | 59.44 | 6.125 | Computer Science, Theory & Methods |
| Computer Communications | 118 | 5(1.49) | 2624 | 22.24 | 2.816 | Computer Science, Information Systems |
| Electronics (Switzerland) | 116 | 6(1.46) | 551 | 4.75 | 2.412 | Engineering, Electrical & Electronic |
| Sustainability (Switzerland) | 104 | 7(1.31) | 772 | 7.42 | 2.576 | Environmental Sciences |
| International Journal of Innovative Technology and Exploring Engineering | 93 | 8(1.17) | 59 | 0.63 | n/a | Computer Science, Theory & Methods |
| Journal of Network and Computer Applications | 88 | 9(1.11) | 3591 | 40.81 | 5.57 | Computer Science, Hardware & Architecture |
| Applied Sciences (Switzerland) | 87 | 10(1.10) | 498 | 5.72 | 2.474 | Chemistry, Multidisciplinary |
| International Journal of Recent Technology and Engineering | 81 | 11(1.02) | 47 | 0.58 | n/a | Computer Science, Theory & Methods |
| Wireless Communications and Mobile Computing | 76 | 12(0.96) | 630 | 8.29 | n/a | Telecommunications |
| Journal of Advanced Research in Dynamical and Control Systems | 74 | 13(0.93) | 75 | 1.01 | n/a | Automation & Control Systems |
| IEEE Transactions on Industrial Informatics | 71 | 14(0.90) | 2750 | 38.73 | 9.112 | Automation & Control Systems |
| Wireless Personal Communications | 71 | 15(0.90) | 409 | 5.76 | 1.061 | Telecommunications |

TP = Total Publications; R = Rank; TC = Total Citations; ACPP = Average Citation Per Paper; IF2019 = Impact Factor as per Journal Citation Report 2019; n/a = no impact factor recorded

the top fifteen productive journals, three each belong to Computer Science, Information Systems and Computer Science, Theory & Methods; two each belong to Telecommunications and Automation & Control Systems; and one each belongs to Instruments & Instrumentation; Environmental Sciences; Chemistry, Multidisciplinary; Computer Science, Hardware & Architecture; and Engineering, Electrical & Electronic. This distribution of subject categories shows that IoT research is truly multidisciplinary, with subjects such as engineering, architecture and environmental science all contributing to the development of smart cities [7]. IEEE Access (IF2019 = 3.745) ranked first with the most articles (718 articles, 9.06% share) on the theme. IEEE Internet of Things Journal (IF2019 = 9.936) ranked second with 624 articles, and Sensors (Switzerland) (IF2019 = 3.275) ranked third with 607 articles. Future Generation Computer Systems (IF2019 = 6.125), Computer Communications (IF2019 = 2.816), Electronics (Switzerland) (IF2019 = 2.412) and Sustainability (Switzerland) (IF2019 = 2.576) were the other journals that had published more than one hundred articles in the field. Among the top fifteen most productive journals on IoT applications in smart cities development, Table 2 shows that articles published in Future Generation Computer Systems (IF2019 = 6.125) had the highest ACPP of 59.44 citations per paper, followed by Journal of Network and Computer Applications (IF2019 = 5.57) with an ACPP of 40.81 citations per paper and IEEE Transactions on Industrial Informatics (IF2019 = 9.112) with an ACPP of 38.73 citations per paper. The least cited journal was International Journal of Recent Technology and Engineering (ACPP = 0.58 citations per paper). Four journals have no impact factor recorded in the Journal Citation Report of 2019.

3.4 Most Productive Countries

Country productivity is based on the authors' affiliations in the article. An article with at least one contributing author affiliated to a country was counted as one contribution by that country. The 18,241 articles for which author affiliations were present in the SCOPUS database were published from 136 countries. The evaluation parameters include total publications (TP), percentage share and rank; total citations (TC) and rank; and average citations per paper (ACPP) and rank [52]. The top 15 productive countries, given in Table 3, include six Asian countries, five European countries, three American countries and Australia. Six of the seven major industrialized countries of the world (G7), namely USA, Italy, United Kingdom, Canada, France and Germany, were ranked among the top 15 productive countries, whereas Japan (346 articles; ranked 18th) was not. IoT application in smart cities development is most actively researched in the Asian region, as the results reflect. India is the most productive country, with the largest number of papers (TP = 3120 articles; 17.10% share), followed by China (TP = 2851 articles; 15.63% share) and USA (TP = 2323 articles; 12.74% share). In terms of total citations (TC), the articles published from USA have accumulated the highest number of citations (TC = 48,442

Table 3 Most productive countries in IoT application in smart cities development

| Country | TP | R(% TP) | R(TC) | R(ACPP) |
| --- | --- | --- | --- | --- |
| India | 3120 | 1(17.10) | 5(21,311) | 15(6.83) |
| China | 2851 | 2(15.63) | 2(34,954) | 13(12.26) |
| United States | 2323 | 3(12.74) | 1(48,442) | 2(20.85) |
| Italy | 1258 | 4(6.90) | 4(23,503) | 5(18.68) |
| United Kingdom | 1250 | 5(6.85) | 3(24,980) | 4(19.98) |
| South Korea | 872 | 6(4.78) | 8(15,869) | 6(18.20) |
| Spain | 812 | 7(4.45) | 7(16,247) | 3(20.01) |
| Australia | 672 | 8(3.68) | 6(20,115) | 1(29.93) |
| Canada | 669 | 9(3.67) | 9(11,647) | 7(17.41) |
| France | 663 | 10(3.63) | 10(9742) | 9(14.69) |
| Pakistan | 602 | 11(3.30) | 11(8747) | 10(14.53) |
| Germany | 598 | 12(3.28) | 12(7635) | 12(12.77) |
| Saudi Arabia | 584 | 13(3.20) | 13(7501) | 11(12.84) |
| Brazil | 529 | 14(2.90) | 15(4632) | 14(8.76) |
| Malaysia | 437 | 15(2.40) | 14(6762) | 8(15.47) |

TP = Total Publication; R = Rank; TC = Total Citations; ACPP = Average Citation Per Paper

citations; ACPP = 20.85 citations), followed by China (TC = 34,954 citations; ACPP = 12.26 citations) and United Kingdom (TC = 24,980 citations; ACPP = 19.98 citations). Measured by average citations per paper, the greatest impact comes from Australian articles: articles published from Australia are the most frequently cited, averaging 29.93 citations per paper, followed by USA (ACPP = 20.85 citations) and Spain (ACPP = 20.01 citations). Indian research has the lowest citation impact among the top fifteen countries, averaging 6.83 citations per paper despite having the largest number of articles.
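The counting rule described in this section, in which a paper counts once for every country that appears among its author affiliations, is often called whole counting. A minimal sketch follows, assuming each paper is represented as a list of affiliation countries; the function name and the three sample papers are hypothetical and are not taken from the SCOPUS data.

```python
from collections import Counter


def count_by_country(papers):
    """Credit each country once per paper in which it appears (whole counting)."""
    counts = Counter()
    for affiliations in papers:
        # set() removes repeats, so two authors from one country count once.
        counts.update(set(affiliations))
    return counts


# Hypothetical sample: each inner list holds the authors' countries for one paper.
papers = [
    ["India", "India", "United States"],
    ["China"],
    ["India", "China", "Australia"],
]

counts = count_by_country(papers)
total_papers = len(papers)
for country, n in counts.most_common():
    print(f"{country}: TP = {n}, share = {100 * n / total_papers:.2f}%")
```

Under this rule an internationally co-authored paper is credited to several countries, so the country shares can sum to more than 100%.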

3.5 Highly Productive Institutions in IoT Application in Smart Cities

Institutions' publication performance was compared on the basis of eight parameters proposed in similar studies [25, 44]. The top eleven highly productive institutions, each with more than one hundred publications, are given in Table 4. Among them, four are from China, two are located in Pakistan, and one each is from Saudi Arabia, India, Portugal, France and Italy. King Saud University in Saudi Arabia ranked first with 181 articles (0.99% share) and also ranked first in total citations (4491 citations; ACPP = 24.81 citations) and in h-Index (34). Its high productivity in IoT research publication may be due to a dedicated course

Table 4 Productive institutions in IoT applications in smart cities development

| Institutions | TP | R(%Share) | R(TC) | R(ACPP) | R(h-Index) |
| --- | --- | --- | --- | --- | --- |
| King Saud University, Saudi Arabia | 181 | 1(0.99) | 1(4491) | 1(24.81) | 1(34) |
| COMSATS University Islamabad, Pakistan | 158 | 2(0.87) | 3(2802) | 3(17.73) | 2(29) |
| Beijing University of Posts and Telecommunications, China | 156 | 3(0.86) | 2(2902) | 2(18.60) | 4(24) |
| Vellore Institute of Technology, India | 154 | 4(0.85) | 7(1810) | 9(11.75) | 3(25) |
| Instituto de Telecomunicacoes, Portugal | 153 | 5(0.84) | 6(1873) | 8(12.24) | 5(24) |
| Chinese Academy of Sciences, China | 148 | 6(0.81) | 4(2306) | 5(15.58) | 6(22) |
| University of Electronic Science and Technology of China, China | 121 | 7(0.66) | 5(1962) | 4(16.21) | 7(21) |
| CNRS Centre National de la Recherche Scientifique, France | 119 | 8(0.65) | 11(1178) | 11(9.90) | 11(15) |
| Alma Mater Studiorum Università di Bologna, Italy | 116 | 9(0.64) | 8(1540) | 7(13.28) | 10(19) |
| National University of Sciences and Technology, Pakistan | 115 | 10(0.63) | 10(1290) | 10(11.22) | 9(20) |
| Ministry of Education China, China | 106 | 11(0.58) | 9(1482) | 6(13.98) | 8(20) |

TP = Total publication; R = Rank; TC = Total Citations; ACPP = Average Citation Per Paper

on Internet of Things Application. King Saud University is followed by COMSATS University Islamabad, Pakistan, with 158 articles (0.87% share; 2nd rank), which ranks 3rd in total citations (2802 citations) and has an h-Index of 29. COMSATS University has a Ubiquitous Connectivity and Internet of Things (UCIoT) research group, which might be one of the reasons for its high productivity. The third-ranked institution was Beijing University of Posts and Telecommunications, China, with 156 articles (0.86% share), ranking second in total citations (TC = 2902; ACPP = 18.60 citations) with an h-Index of 24. Vellore Institute of Technology (VIT), India, ranked 4th in number of publications, 3rd in h-Index (h-Index = 25) and 7th in total citations (TC = 1810).
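The h-Index used to rank institutions in Table 4 is, by its standard definition, the largest number h such that h of an institution's papers have at least h citations each. A minimal sketch of that definition follows; the function name and the sample citation counts are illustrative assumptions, since per-paper citation counts for the institutions are not given in the text.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            # Counts are sorted in descending order, so no later paper qualifies.
            break
    return h


# Hypothetical per-paper citation counts for one institution.
sample = [25, 8, 5, 3, 3]
print(h_index(sample))
```

Because the h-Index depends on how citations are spread across papers, an institution can rank higher on h-Index than on total citations, as Vellore Institute of Technology does in Table 4.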

3.6 Highly Cited and Impactful Articles Related to the IoT Application in Smart Cities Development

Citation indexes [16] offer a new approach to controlling subject literature. They provide ready-to-use data on how frequently an article is cited, which can be used to measure research performance [17]. The citation count is perhaps the most important quality indicator in bibliometric analysis to assess the importance

of research in the academic world. Opinions differ on the pros and cons of citations as a measure of research impact, and various alternative methods have been adopted to judge research impact in different fields, but citation counting remains one of the popular methods. The Internet of Things is an emerging research area, and judging the research impact of recently published articles could be quite useful, as recommended in some bibliometric studies [22]. The citation trends of the eight most cited articles in IoT application in smart cities development are analysed and given in Table 4. The article “Internet of Things (IoT): A vision, architectural elements, and future directions”, published in 2013, was the most cited article with TC2020 of 5819 [21]. This article remains impactful, with 1091 citations in the most recent year (C2020). The second most cited article was “Internet of Things: A Survey on Enabling Technologies, Protocols, and Applications”, published in 2015, with TC2020 of 3250 citations [4]. “Fog computing and its role in the internet of things”, published in 2012, was the third most cited article with TC2020 of 3189 citations [10]. Almost all of these highly cited articles saw a decrease in citations in the most recent year (C2020), showing a decrease in citation impact. However, the article “Next generation 5G wireless networks: A comprehensive survey”, with TC2020 of 1256 citations and C2020 of 434 citations, shows increasing citation impact. The progression of citation impact of these most cited articles is given in Fig. 2. Almost all of them show a slow rise and slow decline over their citation life cycle [9]. In terms of average citations per year (ACPY), Gubbi et al. [21] was the most impactful with 727.38 citations per year, followed by Al-Fuqaha et al. [4] (ACPY = 541.67 citations) and Zanella et al. [51] (ACPY = 401.71 citations). Stankovic [43] had the lowest, at 153.43 citations per year.
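The average citations per year (ACPY) values above are consistent with dividing TC2020 by the number of years from publication through 2020, counting both the publication year and 2020. That inclusive counting is an inference from the reported figures, not a definition stated in the text, and the function name below is hypothetical.

```python
def acpy(total_citations: int, pub_year: int, census_year: int = 2020) -> float:
    """Average citations per year, counting publication and census years inclusively."""
    years = census_year - pub_year + 1  # assumption: inclusive year count
    if years <= 0:
        raise ValueError("census_year must not precede pub_year")
    return total_citations / years


# TC2020 values and publication years as reported in the text.
print(f"Gubbi et al. (2013): {acpy(5819, 2013):.2f}")
print(f"Al-Fuqaha et al. (2015): {acpy(3250, 2015):.2f}")
```

Normalizing by age in this way lets a 2015 article be compared with a 2013 one, which raw TC2020 alone does not allow.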

Fig. 2 Citation life cycle of the most cited article

3.7 Author Keyword Analysis and Research Hotspot

The keywords associated with an article indicate its research focus. Such keywords appear in the title, abstract and author keywords, and are used to identify the research trends of a given field [18, 53]. Author keywords in particular are a key parameter, because they best represent the central theme of the research addressed in the article. Therefore, an analysis was carried out for IoT and smart cities using the keywords assigned by the authors to their articles. Table 5 gives the top fifteen keywords, ranked by the percentage of articles in which they appeared in each two-year period. Internet of Things, Internet of Things (IOT), Smart City, IoT and Internet of Thing (IoT) are the common keywords referred to most frequently in the articles. These keywords are mostly ranked at the top across the study period, with minor variation in ranking.

Table 5 Most prominent author keywords and research hotspot

| Keywords | R(TP) | 2011–12 R(%) | 2013–14 R(%) | 2015–16 R(%) | 2017–18 R(%) | 2019–20 R(%) |
| --- | --- | --- | --- | --- | --- | --- |
| Internet of Things | 1(14,034) | 3(0.13) | 1(1.27) | 1(5.91) | 1(23.10) | 1(44.93) |
| Internet of Things (IOT) | 2(5033) | 1(0.15) | 2(0.72) | 3(1.81) | 3(7.96) | 2(16.38) |
| Smart City | 3(4099) | 5(0.05) | 5(0.33) | 5(1.35) | 2(8.49) | 4(11.77) |
| IoT | 4(3522) | 7(0.03) | 9(0.23) | 7(1.27) | 4(5.33) | 3(12.06) |
| Internet of Thing (IOT) | 5(2692) | 10(0.01) | 6(0.28) | 6(1.34) | 5(4.06) | 5(8.76) |
| Cloud Computing | 6(1489) | 6(0.05) | 7(0.24) | 9(0.93) | 7(2.60) | 7(4.18) |
| Big Data | 7(1483) | 14(0.00) | 11(0.12) | 8(1.02) | 6(3.07) | 10(3.76) |
| Smart Cities | 8(1340) | 2(0.14) | 3(0.70) | 2(2.93) | 15(1.35) | 15(2.10) |
| Automation | 9(1333) | 4(0.06) | 4(0.55) | 4(1.62) | 10(1.96) | 12(3.01) |
| Network Security | 10(1289) | 11(0.01) | 13(0.10) | 11(0.70) | 8(2.27) | 8(4.08) |
| Energy Efficiency | 11(1119) | 15(0.00) | 15(0.03) | 13(0.41) | 11(1.93) | 6(4.55) |
| Wireless Sensor Networks | 12(1090) | 9(0.02) | 12(0.10) | 14(0.40) | 13(1.72) | 9(3.77) |
| Energy Utilization | 13(1050) | 8(0.02) | 8(0.24) | 10(0.71) | 9(2.01) | 13(2.87) |
| Intelligent Buildings | 14(923) | 13(0.01) | 14(0.09) | 15(0.37) | 14(1.52) | 11(3.66) |
| Internet | 15(911) | 12(0.01) | 10(0.13) | 12(0.63) | 12(1.79) | 14(2.39) |

TP = Total Publication; R = Rank

Based on the keyword analysis, a network diagram was created using the VOSviewer software (Fig. 3). VOSviewer is a tool for bibliometric mapping using different bibliometric parameters [45]. Different network clusters can easily be identified from the author keywords. At the centre of the clusters are these fifteen most frequently used keywords. At the periphery are newer keywords or subjects that researchers working on the Internet of Things and smart cities have begun using most recently; these reveal the most recent technologies associated with smart cities research. The prominent clusters and emerging research hotspots applied to smart cities include Big Data, Fog Computing, Automation, Network Security, Sensors, Energy Utilization and Intelligent Buildings (Fig. 3). With the advancement of technology, Blockchain, Smart Parking, Edge Computing and Waste Management are new fields of research that have emerged most recently (Fig. 4).
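Keyword network maps of this kind are typically built from keyword co-occurrence, that is, how often two author keywords are assigned to the same article. The sketch below is a rough illustration of that counting step only; it is not VOSviewer's implementation, and the function name and sample articles are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count how often each unordered pair of keywords shares an article."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Lower-case and de-duplicate, then sort so each pair has one canonical order.
        unique = sorted({k.strip().lower() for k in keywords})
        pairs.update(combinations(unique, 2))
    return pairs


# Hypothetical author-keyword lists, one per article.
articles = [
    ["Internet of Things", "Smart City", "Big Data"],
    ["Smart City", "Blockchain", "Internet of Things"],
    ["Edge Computing", "Internet of Things"],
]

links = cooccurrence(articles)
for (a, b), weight in links.most_common(3):
    print(f"{a} <-> {b}: {weight}")
```

Each pair count can serve as an edge weight in a keyword network. The sketch only normalizes case, so variants such as "IoT" and "Internet of Things (IOT)" remain separate nodes, much as they appear as separate rows in Table 5.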

Fig. 3 Author keyword network in IoT application in smart cities development

Fig. 4 Author keywords and network linkage with smart cities

3.8 Application of Bibliometric Studies in Assessing the Future Research in IoT

Bibliometric analysis provides an understanding of new directions of research by mapping the field [19]. This study shows that IoT application in smart cities development has opened new directions of research, visible in the trends in keywords used by authors in their publications (Fig. 4). In the most recent publications, authors have very commonly used blockchain for security [19], data management through deep learning, artificial intelligence & machine learning [6], and edge computing [50]; these topics are emerging strongly and are likely to be beneficial in developing smart cities.

4 Conclusion

Research on the Internet of Things is increasing with time and is finding application in various subject areas. Smart cities are one such area, and have become an agenda for various countries seeking to make public life more comfortable. This literature-based study shows that a tremendous number of research publications appeared within the ten-year period and that the literature has grown exponentially. Research impact in terms of citations has declined: the average citations per paper fell over time while the number of research publications increased. Documents with a longer life span have a higher ACPP than more recent documents. Although IoT belongs to engineering & technology (computer science) and smart cities to civil engineering & architecture, the research involved many other interdisciplinary subject areas such as decision science, energy, materials science, business, management and economics. Most of the research publications came from the Asian region, and India is the top country with the highest number of publications. However, in terms of impact, Indian publications have the lowest average citations per paper among the most productive countries. Similarly, most of the productive institutions are from the Asian region; King Saud University in Saudi Arabia was the most productive, followed by COMSATS University Islamabad in Pakistan. The proliferation of new subjects and research topics is one of the characteristics of IoT application in smart cities development. Topics such as big data, blockchain, energy conservation, sensors and automated parking are evolving into research hotspots in the field. IoT applications are growing in importance and finding use in new areas. Smart cities are one of the areas where IoT has been applied prominently, as the large number of research publications shows. As the research progresses, new avenues may open up.

References

1. Abdel-Basset, M., Imran, M.: Special issue on industrial internet of things for automotive industry-new directions, challenges and applications. Mech. Syst. Signal Process. 142, 106751 (2020). https://doi.org/10.1016/j.ymssp.2020.106751
2. Agiwal, M., Roy, A., Saxena, N.: Next generation 5G wireless networks: a comprehensive survey. IEEE Commun. Surv. Tutorials (2016). https://doi.org/10.1109/COMST.2016.2532458
3. Aksnes, D.W., Langfeldt, L., Wouters, P.: Citations, citation indicators, and research quality: an overview of basic concepts and theories. SAGE Open (2019). https://doi.org/10.1177/2158244019829575
4. Al-Fuqaha, A., et al.: Internet of things: a survey on enabling technologies, protocols, and applications. IEEE Commun. Surv. Tutorials 17(4), 2347–2376 (2015). https://doi.org/10.1109/COMST.2015.2444095
5. Ali, R.R.M., Ahmi, A., Sudin, S.: Examining the trend of the research on the internet of things (IoT): a bibliometric analysis of the journal articles as indexed in the Scopus database. J. Phys.: Conf. Ser. 1529(2), 022075 (2020). https://doi.org/10.1088/1742-6596/1529/2/022075
6. Allam, Z., Dhunny, Z.A.: On big data, artificial intelligence and smart cities. Cities 89, 80–91 (2019). https://doi.org/10.1016/j.cities.2019.01.032

7. Andrisano, O., Bartolini, I., Bellavista, P., Boeri, A., Bononi, L., Borghetti, A., Brath, A., Corazza, G. E., Corradi, A., de Miranda, S., Fava, F., Foschini, L., Leoni, G., Longo, D., Milano, M., Napolitano, F., Nucci, C.A., Pasolini, G., Patella, M., et al.: The need of multidisciplinary approaches and engineering tools for the development and implementation of the smart city paradigm. Proc. IEEE 106(4) (2018). https://doi.org/10.1109/JPROC.2018.2812836
8. Atzori, L., Iera, A., Morabito, G.: The internet of things: a survey. Comp. Netw. 54(15), 2787–2805 (2010). https://doi.org/10.1016/j.comnet.2010.05.010
9. Aversa, E.: Citation patterns of highly cited papers and their relationship to literature aging: a study of the working literature. Scientometrics 7(3–6), 383–389 (1985). https://doi.org/10.1007/BF02017156
10. Bonomi, F., et al.: Fog computing and its role in the internet of things. In: Proceedings of the first edition of the MCC workshop on Mobile cloud computing, pp. 13–16 (2012)
11. Borgia, E.: The internet of things vision: key features, applications and open issues. Comput. Commun. (2014). https://doi.org/10.1016/j.comcom.2014.09.008
12. Cavalieri, A., Reis, J., Amorim, M.: Circular economy and internet of things: mapping science of case studies in manufacturing industry. Sustain. 13(6), 3299 (2021)
13. Chiu, W.T., Ho, Y.S.: Bibliometric analysis of homeopathy research during the period of 1991 to 2003. Scientometrics (2005). https://doi.org/10.1007/s11192-005-0201-7
14. Chiuchisan, I., Chiuchisan, I., Dimian, M.: Internet of Things for e-Health: an approach to medical applications. In: International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM), pp. 1–5 (2015). https://doi.org/10.1109/IWCIM.2015.7347091
15. Ellegaard, O., Wallin, J.A.: The bibliometric analysis of scholarly production: how great is the impact? Scientometrics (2015). https://doi.org/10.1007/s11192-015-1645-z
16. Garfield, E.: Citation indexes for science a new dimension in documentation through association of ideas. Science 1(3159), 10–111 (1953). https://doi.org/10.1093/ije/dyl189
17. Garfield, E.: Citation frequency as a measure of research activity and performance. Current Contents 5, 406–408 (1973)
18. Garfield, E.: KeyWords Plus-ISI’s breakthrough retrieval method. 1. Expanding your searching power on current contents on diskette. Curr. Contents 32, 5–9 (1990)
19. Ghazal, T.M., et al.: IoT for smart cities: machine learning approaches in smart healthcare – a review. Future Internet 13(8), 218 (2021). https://doi.org/10.3390/fi13080218
20. Glanzel, W.: Bibliometrics as a research field a course on theory and application of bibliometric indicators. Course Handout (2003). Available at http://nsdl.niscair.res.in/bitstream/123456789/968/1/Bib_Module_KUL.pdf
21. Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.: Internet of Things (IoT): a vision, architectural elements, and future directions. Future Gener. Comput. Syst. (2013). https://doi.org/10.1016/j.future.2013.01.010
22. Ho, Y.S., Hartley, J.: Classic articles published by American scientists (1900–2014): a bibliometric analysis. Current Sci. 111(7) (2016). https://doi.org/10.18520/cs/v111/i7/11561165
23. Holler, J., Tsiatsis, V., Mulligan, C., Avesand, S., Karnouskos, S., Boyle, D.: From machine-to-machine to the Internet of Things. Mach. Mach. Internet Things (2014). https://doi.org/10.1016/C2012-0-03263-2
24. Hsieh, W.H., Chiu, W.T., Lee, Y.S., Ho, Y.S.: Bibliometric analysis of Patent Ductus Arteriosus treatments. Scientometrics (2004). https://doi.org/10.1023/b:scie.0000027793.12866.58
25. Hsu, Y.H.E., Ho, Y.S.: Highly cited articles in health care sciences and services field in science citation index expanded: a bibliometric analysis for 1958–2012. Methods Inf. Med. 53(6) (2014). https://doi.org/10.3414/ME14-01-0022
26. Javdani, H., Kashanian, H.: Internet of things in medical applications with a service-oriented and security approach: a survey. Health Technol. 8(1), 39–50 (2018)
27. Kamran, et al.: Blockchain and Internet of Things: a bibliometric study. Comput. Electr. Eng. 81(2020), 106525 (2020). https://doi.org/10.1016/j.compeleceng.2019.106525
28. Kirk, R.: Cars of the future: the Internet of Things in the automotive industry. Netw. Secur. 2015(9), 16–18 (2015). https://doi.org/10.1016/S1353-4858(15)30081-7

29. Liu, X., et al.: Energy-efficient resource allocation for cognitive industrial Internet of Things with wireless energy harvesting. IEEE Trans. Indus. Inf. 17(8), 5668–5677 (2020). https://doi.org/10.1109/TII.2020.2997768
30. Lu, Y.: Industry 4.0: a survey on technologies, applications and open research issues. J. Ind. Inf. Integr. (2017). https://doi.org/10.1016/j.jii.2017.04.005
31. Mashal, I., Alsaryrah, O., Chung, T.Y., Yang, C.Z., Kuo, W.H., Agrawal, D.P.: Choices for interaction with things on Internet and underlying issues. Ad Hoc Netw. (2015). https://doi.org/10.1016/j.adhoc.2014.12.006
32. Miorandi, D., et al.: Internet of things: vision, applications and research challenges. Ad Hoc Netw. 10(7), 1497–1516 (2012). https://doi.org/10.1016/j.adhoc.2012.02.016
33. Miskiewicz, R.: Internet of things in marketing: bibliometric analysis. Mark. Manage. Innov. 3, 371–381 (2020). http://doi.org/10.21272/mmi.2020.3-27
34. Moed, H.F.: How evaluative informetrics relates to scientific, socio-historical, political, ethical and personal values. Sch. Assess. Rep. 2(1) (2020). http://doi.org/10.29024/sar.18
35. Moschini, U., Fenialdi, E., Daraio, C., Ruocco, G., Molinari, E.: A comparison of three multidisciplinarity indices based on the diversity of Scopus subject areas of authors’ documents, their bibliography and their citing papers. Scientometrics (2020). https://doi.org/10.1007/s11192-020-03481-x
36. do Nascimento, D.A., et al.: A bibliometric study about Internet of Things. Int. J. Adv. Eng. Res. Sci. 6(4), 213–20 (2019). https://dx.doi.org/10.22161/ijaers.6.4.24
37. Rejeb, A., et al.: Internet of Things research in supply chain management and logistics: a bibliometric analysis. Internet Things 12(2020), 100318 (2020). https://doi.org/10.1016/j.iot.2020.100318
38. Rowley-Jolivet, E.: The pivotal role of conference papers in the network of scientific communication. ASp (1999). https://doi.org/10.4000/asp.2394
39. SCOPUS.: What Is the Complete List of Scopus Subject Areas and All Science Journal Classification Codes (ASJC)? Elsevier (2020). https://service.elsevier.com/app/answers/detail/a_id/15181/c/10547/supporthub/scopus/
40. Shahid, N., Aneja, S.: Internet of Things: vision, application areas and research challenges. In: 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), IEEE, pp. 583–587 (2017). https://doi.org/10.1109/ISMAC.2017.8058246
41. Shamir, L.: The effect of conference proceedings on the scholarly communication in Computer Science and Engineering. Sch. Res. Commun. (2010). https://doi.org/10.22230/src.2010v1n2a25
42. Singh, et al.: Internet of things and agriculture relationship: a bibliometric analysis. J. Glob. Bus. Adv. 13(5), 643–664 (2020). https://dx.doi.org/10.1504/JGBA.2020.112821
43. Stankovic, J.A.: Research directions for the internet of things. IEEE Internet Things J. 1(1), 3–9 (2014). https://doi.org/10.1109/JIOT.2014.2312291
44. Usman, M., Ho, Y.S.: A bibliometric study of the Fenton oxidation for soil and water remediation. J. Environ. Manage. (2020). https://doi.org/10.1016/j.jenvman.2020.110886
45. Van Eck, N., Waltman, L.: Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 84(2), 523–538 (2010). https://doi.org/10.1007/s11192-009-0146-3
46. Varjovi, A.E., Babaie, S.: Green Internet of Things (GIoT): vision, applications and research challenges. Sustain. Comput.: Inf. Syst. 28, 100448 (2020). https://doi.org/10.1016/j.suscom.2020.100448
47. Vermesan, O., Friess, P.: Internet of things applications: from research and innovation to market deployment. In: Internet of Things Applications: From Research and Innovation to Market Deployment (2014)
48. Vilajosana, I., et al.: Bootstrapping smart cities through a self-sustainable model based on big data flows. IEEE Commun. Mag. 51(6), 128–134 (2013). https://doi.org/10.1109/MCOM.2013.6525605
49. Waltman, L.: A review of the literature on citation impact indicators. J. Informetrics (2016). https://doi.org/10.1016/j.joi.2016.02.007

50. Yu, W., et al.: A survey on the edge computing for the Internet of Things. IEEE Access 6, 6900–6919 (2017). https://doi.org/10.1109/ACCESS.2017.2778504
51. Zanella, et al.: Internet of things for smart cities. IEEE Internet Things J. 1(1), 22–32 (2014). https://doi.org/10.1109/JIOT.2014.2306328
52. Zhai, C., Ho, Y.S.: A bibliometric analysis of distributed control publications. Measur. Control 51(3–4), 113–121 (2018)
53. Zhang, J., et al.: Comparing keywords plus of WOS and author keywords: a case study of patient adherence research. J. Assoc. Inf. Sci. Technol. 67(4), 967–972 (2016). https://doi.org/10.1002/asi.23437
54. Zhao, Y., Jiang, Y., Zhou, Z., Yang, Z.: Global trends in karst-related studies from 1990 to 2016: a bibliometric analysis. Alex. Eng. J. (2021). https://doi.org/10.1016/j.aej.2020.12.052

Neural Network Based Task Scheduling in Cloud Using Harmony Search Algorithm

Arnaav Anand, Pratyush Agarwal, Dinesh Kumar Saini, and Punit Gupta

Abstract Cloud computing is growing in popularity every day for computation and storage purposes. Having already been adopted by companies like Google, Microsoft and IBM, it is slowly but steadily making its way into the mainstream market, which is evident from the exponential increase in the size of cloud servers over the past few years. Because cloud computing is a computational process in which extensible resources are delivered as services to customers online, it needs a method for choosing the correct resources for executing processes within a given framework. This is called task scheduling. Various task-scheduling models are currently in use, but the one we focus on is the NN (Neural Network)-based model. This model was set up to estimate the task execution status for resource allotment among the candidates. An NN-based model makes use of various scheduling algorithms to produce optimum results in terms of quality of service (QoS), total cost, service satisfaction, etc. In our work, we simulate various task-scheduling algorithms in a virtual environment and compare their efficiency based on the results of these simulations. While our focus is on an emerging metaheuristic optimization algorithm called the Harmony Search Algorithm, we also run simulations for the Moth algorithm and the Genetic Algorithm.

Keywords Task scheduling · Cloud computing · Harmony search · Quality of service · Moth · Genetic

A. Anand · P. Agarwal · D. K. Saini · P. Gupta (B)
Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_11

1 Introduction

Cloud computing is a promising field that is growing in popularity, not only among major companies such as Apple, Amazon, and Google but also among smaller businesses and individuals, who use it for computation and storage both as consumers and as providers. Cloud computing depends on data centres to run the cloud environment. The power consumption of these data centres depends on the number of requests received and the region from which they are served. An increase in requests increases the size of the cloud, which in turn increases the power consumption of the data centre. To keep power consumption at a feasible level, requests must be managed by an efficient algorithm that handles resource utilization, improves power consumption, and deals with request failures when they occur. Cloud computing has several features that make it attractive to users and that suggest demand for cloud services will keep growing. These include:

• Pooling of resources: Because cloud computing serves many consumers, resources are pooled together. This leads to multi-tenancy and to the dynamic allocation and de-allocation of these resources. The process is elastic: allocation follows demand.
• On-demand service: Being based on self-service models, cloud computing lets consumers carry out activities such as management, scheduling, and deployment, and obtain computing capacity, without interacting with the provider.
• Cheap pricing: With no upfront cost, cloud computing is billed entirely on usage. Customers can monitor their usage and manage costs according to the resources they want to consume.
• Quality of Service (QoS): Quality of service for users is of utmost importance in cloud computing. The service agreement must cover factors such as an adequate number of resources, good performance, no bandwidth issues, and 24/7 access to the cloud server.

Cloud computing also has several advantages over other computational methods that benefit everyone who uses the service.
These advantages include:

• Cloud computing helps reduce the cost of managing and maintaining IT systems. Instead of buying expensive systems and equipment for our businesses, we can reduce expenses by using the resources of a cloud service provider.
• We can quickly scale our business up or down according to our operational and storage needs, which gives us flexibility. Instead of buying and installing expensive upgrades ourselves, we can let the cloud service provider handle them.
• Cloud computing makes employees' work practices more flexible. We can access data from anywhere in the world; all we need is a good internet connection. If we need access to our data while away from the office, we can connect to our virtual office quickly and easily.


• Depending on the service provider, our systems can be updated regularly with the latest software and technology, as well as upgrades to servers and processing power.
• Cloud computing makes collaboration easier: work can be communicated and shared more readily than with earlier methods. For projects spread across several locations, cloud computing can give every project member access.

As cloud computing is still in its infancy, it has some issues that can deter consumers from trusting the technology. These issues include:

• As industry adopts cloud computing, the growth in daily users and in data centres increases total power consumption.
• Load is distributed between data centres without information about their power consumption, so a data centre may draw its usual power while running under capacity.
• The scheduling algorithms currently in use aim to balance the load when the request load increases, not when the power consumed increases.
• Heavily loaded data centres may slow down because of the higher number of computing requests, which also leads to higher power consumption.
• Lightly loaded data centres may consume more power than the computations require, because complete information is not available.
• Requests with time limits may miss their deadlines when a data centre is heavily loaded, which can be a serious problem for the consumer.

This paper focuses on task-scheduling simulations using the Harmony Search Algorithm, and compares the results with those obtained from simulations of other similar algorithms in order to find the most efficient algorithmic model. The Harmony Search algorithm was first proposed in 2001 and was applied to water distribution networks to optimize distribution.
Being an emerging metaheuristic optimization algorithm, Harmony Search is loosely based on a musician's improvisation of harmony. When musicians compose music, they try multiple combinations of pitches that they remember until they find the perfect tune. This process closely resembles the search for optimal solutions to various engineering problems. Over the years, the algorithm has proved helpful in solving optimization problems in fields such as behavioural and medical science, architecture, the IT industry, image processing, and power systems.

2 Related Work

This section discusses related work from the field of task scheduling in the cloud.


2.1 A. Framework that is Used to Optimize Task Scheduling in the Cloud Environment [1]

In this work, R. Jemina Priyadarsini and L. Arockiam explain the essence of cloud computing, load balancing, task scheduling, and task allocation, and how they are interconnected. They describe the growth of cloud computing and the stages through which optimization algorithms have evolved to reach their current state. They note that task scheduling is a well-known NP-hard (non-deterministic polynomial-time hard) problem, which motivates researchers to keep looking for better ways to optimize scheduling algorithms. The cloud is expected to provide its users with quality of service (QoS); once task scheduling becomes part of task allocation, cloud service providers (CSPs) must uphold their QoS commitments, which is why task scheduling algorithms play an important role. In recent years, progress has been made in optimizing task scheduling algorithms, moving from hybrid metaheuristic methods based on Genetic Algorithms to task scheduling with Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO). The authors conclude by elaborating a framework to enhance cloud scheduling, which generally proceeds in three steps:

• Resource Identification
• Resource Allocation
• Job Execution

Since the use of cloud computing has increased massively, efficient resource allocation strategies backed by Service Level Agreements (SLAs) are essential to achieve customer satisfaction, maintain QoS, and maximize profit for the CSP. The paper covers not only the main variations of task scheduling but also optimization strategies and their impact on the behaviour of the cloud system.

2.2 B. Harmony Search Algorithm: Strengths and Weaknesses [2]

In this paper, Milad Ahangaran and Pezhman Ramezani discuss the Harmony Search Algorithm and its three rules: the Harmony Memory Considering (HMC) rule, applied at the rate HMCR; the Pitch Adjustment (PA) rule, applied at the rate PAR; and the Random Selection (RS) rule. Their premise is simple: to understand metaheuristic algorithms better, a thorough analysis of their search mechanisms is essential. The authors provide such an analysis of the strengths and weaknesses of the Harmony Search Algorithm by describing and analysing the three rules and their effect on the algorithm's performance.


The authors describe the evolutionary process of solutions in the Harmony Search (HS) Algorithm, which consists of two stages. In initialization, the algorithm generates several solutions at random. In improvisation, HS generates a new solution using the RS, HMC, and PA rules: the value of each decision variable is selected either by the HMC rule, with probability HMCR, or by the RS rule, with probability 1 − HMCR. The PA rule is then used to adjust the values of decision variables selected by the HMC rule. The authors also point out a serious drawback of the algorithm: it cannot maintain an effective balance between global and local search. Apart from the PA rule, all other concepts in the HS algorithm can be applied comprehensively, which has turned researchers' attention to this shortcoming in order to improve the overall performance of the algorithm. Further research and testing led to the Improved Harmony Search (IHS) algorithm, which tries to improve the performance of HS by dynamically increasing the pitch adjustment rate and decreasing the bandwidth. The main drawback of IHS was the low value of PAR in early iterations, which prompted further study and the proposal of the Highly Reliable Harmony Search (HRHS) algorithm. The major difference between HRHS and IHS is the way they adjust PAR: to remove the drawback of IHS, HRHS keeps PAR high in its initial iterations and gradually lowers it. While HRHS and its later variants may have minor shortcomings of their own, researchers continue to develop the algorithm to remove even small deficiencies and make it as precise as possible.
After an extensive empirical investigation of the effects of the Harmony Search rules on performance, the authors conclude that the HMC rule gives this algorithm its advantage over others: it helps the algorithm focus on better-than-average solutions and thereby increases the rate of convergence. The RS rule acts as a comprehensive search tool through all iterations; by working as a global search in early iterations, it can greatly affect the algorithm's performance and helps prevent the search from clinging to local optima. In the final iterations, however, the RS rule can unsettle the algorithm's performance. In contrast, the PA rule runs a localized search through all iterations and, by working as a local search in the final iterations, helps the algorithm obtain better results. This shows that applying the PA rule to the HMC rule is a good idea, as it helps the algorithm balance exploration and exploitation.
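The dynamic parameter adjustment attributed to IHS can be sketched as follows. The linear increase of PAR and the exponential decrease of the bandwidth are one common formulation and are an assumption here; the summary above only states the direction of each change. The function name `ihs_parameters` and all default bounds are hypothetical.

```python
import math


def ihs_parameters(iteration, max_iterations,
                   par_min=0.1, par_max=0.9,
                   bw_min=0.001, bw_max=1.0):
    """Return (PAR, bandwidth) for one iteration of an IHS-style schedule.

    All bounds are illustrative assumptions, not values from the chapter.
    """
    progress = iteration / max_iterations
    # PAR grows linearly from par_min to par_max.
    par = par_min + (par_max - par_min) * progress
    # Bandwidth shrinks exponentially from bw_max to bw_min.
    bandwidth = bw_max * math.exp(math.log(bw_min / bw_max) * progress)
    return par, bandwidth


if __name__ == "__main__":
    for t in (0, 50, 100):
        par, bandwidth = ihs_parameters(t, 100)
        print(t, round(par, 3), round(bandwidth, 5))
```

Under this schedule PAR starts low, which matches the drawback of IHS described above; an HRHS-style variant would instead start with a high PAR and lower it over time.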

2.3 C. Particle Swarm Optimization: Development, Applications, and Resources [3]

In this paper, Russell C. Eberhart and Yuhui Shi review the developments made by various researchers in the Particle Swarm Optimization (PSO) algorithm, the major changes the algorithm has gone through since its inception in 1995, and what, in their view, the future holds for it. The authors first describe the original version of the PSO algorithm, followed by discussions of constriction factors, the inertia weight, and the tracking and optimization of dynamic systems. The PSO concept originated as a simulation of a simplified social system, whose initial goal was to simulate graphically the unpredictable movement of a flock of birds. These simulations were later developed to incorporate nearest-neighbour velocity matching, eliminate ancillary variables, and include multidimensional search and acceleration by distance. Because it starts from a population of random solutions, the algorithm is quite similar to the Genetic Algorithm. The one major difference is that, in PSO, a random velocity is assigned to every potential solution. These potential solutions, called particles, are then flown through the problem space. PSO has been used for purposes that include evolving neural networks (NNs); optimizing end milling, a common metal-removal operation in manufacturing; controlling voltage and reactive power at an electric utility; training a neural network as a charge-value estimator for electric vehicles; and ingredient mix optimization, where PSO is used with other algorithms to optimize an ingredient mixture for growing strains of microorganisms. The authors conclude that PSO, like other evolutionary computation algorithms, can be applied to most optimization problems and to problems that can be converted into optimization problems. PSO has numerous application areas; those with the most potential include multi-objective optimization, classification, biological system modelling, pattern recognition, system design, scheduling (planning), signal processing, robotic applications, decision making, simulation, and identification.
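The idea of particles with random velocities being flown through the problem space can be illustrated with a minimal sketch. It uses the inertia-weight form of the velocity update; the function name `pso_minimize`, the parameter values, and the sphere test function are assumptions chosen for illustration, not settings from the surveyed paper.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, iterations=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    span = hi - lo
    # Every particle gets a random position and a random velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[rng.uniform(-span, span) for _ in range(dim)] for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [objective(p) for p in pos]    # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term plus attraction to personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and keep it inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def sphere(x):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    print(pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0)))
```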

2.4 D. NN-Based Secure Task Scheduling in Computational Clouds [4]

In this paper on NN-based task scheduling, Tchórzewski, J., Respício, A., and Kolodziej, J. propose a smart system that supports decision making related to security and task scheduling in cloud services, with the aim of automating these processes. The system consists of two neural networks (NNs) and an evolutionary scheduler, and its prime target is to sort the tasks coming into the Computational Cloud (CC) according to their security demands. One NN is a classifier/sorter, which classifies and sorts the batches arriving in the cloud; the other is an expert NN that predicts the VM configurations. The Evolutionary Scheduler optimizes the scheduling of tasks using an evolutionary algorithm.

2.5 E. Ant Colony Optimization [5]

In this paper, the authors discuss ant colony optimization (ACO), which takes inspiration from the foraging behaviour of some ant species. ACO solves optimization problems with a mechanism similar to the way ants deposit pheromone on the ground to mark favourable paths that other members of the colony should follow. The article aims to familiarize readers with ACO and to examine its most prominent applications. Starting from the biological background on the foraging behaviour of ants, the authors go on to describe ACO for the Travelling Salesman Problem, the ACO metaheuristic, and its main variants such as the Ant System algorithm. They further describe the prominent theoretical results concerning ACO and its successful applications. A significant share of ACO research in the past few years is concerned with applications. Topics that are still under active research include:

• Dynamic Optimization Problems
• Stochastic Optimization Problems
• Multi-Objective Optimization
• Parallel Implementations
• Continuous Optimization

The authors also give an overview of various other ant-inspired algorithms, including:

• Algorithms Inspired by Foraging and Path Marking
• Algorithms Inspired by Brood Sorting
• Algorithms Inspired by Division of Labor
• Algorithms Inspired by Cooperative Transport

Many researchers worldwide now apply ACO to optimization problems such as the Travelling Salesman Problem, using the ACO metaheuristic and its main algorithms: Ant System, Max–Min Ant System, and Ant Colony System.
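The pheromone mechanism described in this subsection can be sketched in the style of the Ant System algorithm for a symmetric Travelling Salesman Problem. The function names, the exponents `alpha` and `beta`, the evaporation rate, and the deposit constant `q` are illustrative assumptions rather than values from the surveyed article.

```python
import random


def choose_next(current, unvisited, pheromone, distance,
                alpha=1.0, beta=2.0, rng=random):
    """Pick the next city with probability proportional to
    pheromone**alpha * (1 / distance)**beta."""
    weights = [
        (pheromone[current][j] ** alpha) * ((1.0 / distance[current][j]) ** beta)
        for j in unvisited
    ]
    return rng.choices(unvisited, weights=weights, k=1)[0]


def update_pheromone(pheromone, tours, tour_lengths, evaporation=0.5, q=1.0):
    """Evaporate all trails, then let each ant deposit q / tour_length
    on every edge of its closed tour (symmetric problem)."""
    n = len(pheromone)
    for i in range(n):
        for j in range(n):
            pheromone[i][j] *= (1.0 - evaporation)
    for tour, length in zip(tours, tour_lengths):
        for a, b in zip(tour, tour[1:] + tour[:1]):
            pheromone[a][b] += q / length
            pheromone[b][a] += q / length
```

Shorter tours deposit more pheromone per edge, so edges that appear in good tours become more likely to be chosen by later ants, while evaporation gradually removes the influence of poor early choices.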


2.6 An Idea Based on Honey Bee Swarm for Numerical Optimization [6]

In this work, Dervis Karaboga describes the foraging behaviour of honeybee swarms and proposes a new Artificial Bee Colony (ABC) algorithm. The ABC algorithm mimics the behaviour of real honeybees and is intended for solving multimodal and complex optimization problems. In the model, the colony comprises three groups of bees: employed bees, onlookers, and scouts. Employed artificial bees make up the first half of the colony and onlookers the second half. There is exactly one employed bee for every food source; when a food source is exhausted, its employed bee becomes a scout and searches for a new food source.

2.7 Job Scheduling for Cloud Computing Using Neural Networks [7]

In this paper on job scheduling, the authors aim to make maximum use of allocated resources and combine them to get the best results when solving large-scale arithmetic problems. They highlight the emergence of cloud computing as a model that provides network access to resources and allocates them with minimal management effort. Owing to features such as full customization, portability, 24/7 on-demand availability, and isolation, cloud computing has wide acceptance and a promising future. The authors also explain Genetic Algorithms and Neural Networks. A Genetic Algorithm (GA) is a search-based optimization technique based on the principles of genetics and natural selection. It is often useful for finding optimal or near-optimal solutions to problems that would otherwise take a long time to solve. In a GA, the individuals of a population compete with one another and evolve towards the best possible solution, which is chosen by means of a fitness function. The major steps of a GA are generating the initial population, evaluating the fitness function, selecting candidates, applying crossover among them, and finally applying mutation to find the best possible solution (Fig. 1).

Fig. 1 Basic Neural network architecture

With the help of cloud computing, various tasks can be performed much more efficiently. Every user pays for the service provided in this environment, and the central concern for cloud computing is job scheduling. Job scheduling of users' requests refers to allocating resources to the requests so that the required tasks finish in minimal time, based on the time defined in the user's request and taking into account dynamic factors and the statistics of users' jobs. The authors also point out that cloud computing has two sides: the user and the provider. From the user's point of view, the scheduling algorithm should achieve the best results for both execution time and the user's budget. From the provider's point of view, it should improve resource utilization and reduce maintenance cost and power usage. Being a combinatorial problem, job scheduling has no single algorithm or rule that is optimal in every case; optimality varies with the needs and satisfaction level of the user and with the provider's ability to meet them at maximum benefit. Many other optimization approaches exist, such as optimization using neural networks, whale optimization, swarm intelligence, cost-based dynamic scheduling, and ant lion based optimization [8–11].
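The GA steps listed in this subsection (initial population, fitness, selection, crossover, mutation) can be sketched for a toy task-to-VM assignment problem. This is an illustrative assumption about how such a scheduler might be encoded, not the method of the surveyed paper: each individual is a list mapping task index to VM index, fitness is the makespan, and the names `makespan` and `genetic_schedule`, tournament selection, and all parameter values are hypothetical.

```python
import random


def makespan(assignment, task_lengths, vm_speeds):
    """Finish time of the busiest VM; lower is better."""
    finish = [0.0] * len(vm_speeds)
    for task, vm in enumerate(assignment):
        finish[vm] += task_lengths[task] / vm_speeds[vm]
    return max(finish)


def genetic_schedule(task_lengths, vm_speeds, pop_size=30, generations=100,
                     mutation_rate=0.05, seed=0):
    """Evolve a task-to-VM assignment; needs at least two tasks."""
    rng = random.Random(seed)
    n_tasks, n_vms = len(task_lengths), len(vm_speeds)

    def fitness(individual):
        return makespan(individual, task_lengths, vm_speeds)

    # Initial population: random assignments of tasks to VMs.
    population = [[rng.randrange(n_vms) for _ in range(n_tasks)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        next_gen = [best[:]]  # elitism: carry the best individual over
        while len(next_gen) < pop_size:
            # Selection: the fitter of three random individuals wins.
            p1 = min(rng.sample(population, 3), key=fitness)
            p2 = min(rng.sample(population, 3), key=fitness)
            # Crossover: single cut point.
            cut = rng.randrange(1, n_tasks)
            child = p1[:cut] + p2[cut:]
            # Mutation: occasionally reassign a task to a random VM.
            child = [rng.randrange(n_vms) if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
        best = min(population, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    print(genetic_schedule([4.0, 3.0, 2.0, 1.0], [1.0, 1.0]))
```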

3 Proposed Model

The Harmony Search (HS) algorithm is an emerging metaheuristic optimization algorithm inspired by the principles musicians use to improvise harmonies. HS is distinguished by being a simple algorithm with high search efficiency. In recent years it has been applied successfully to numerous challenging tasks, including but not limited to function optimization, pipe network optimization, mechanical structure design, and the optimization of data classification systems. The method musicians use to improvise harmonies consists of three possible steps (processes):

• Playing a familiar note exactly from memory.
• Playing a note in the vicinity of the note previously selected.
• Selecting any random note.

These three processes have been moulded into the optimization process of the Harmony Search Algorithm. They have effectively become the three rules that govern the generation of solutions: the Harmony Memory Considering Rule (HMC), the Pitch Adjustment Rule (PA), and the Random Selecting Rule (RS). It is these three rules that are modified to improve the efficiency of the HS Algorithm.

3.1 Phases of Harmony Search Algorithm

The execution of the HS Algorithm is divided into four steps:

1. Initializing the HS Memory (HM)
2. Improvising a new solution from the given HM
3. Updating the HM
4. Repeating steps 2 and 3 until a predetermined termination criterion is met, for example a maximum number of iterations.
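The four steps can be sketched for a continuous minimization problem as follows. This is a minimal illustration of basic HS, not the scheduler used in the simulations; the function name `harmony_search`, the default parameter values, and the sphere test function are assumptions.

```python
import random


def harmony_search(objective, dim, bounds, hms=10, hmcr=0.9, par=0.3,
                   bandwidth=0.1, iterations=1000, seed=0):
    """Minimize `objective` over a box; returns (best_harmony, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 1: initialize the harmony memory (HM) with random solutions.
    memory = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(hms)]
    scores = [objective(h) for h in memory]
    # Step 4: repeat improvisation and update for a fixed number of iterations.
    for _ in range(iterations):
        # Step 2: improvise a new harmony, one decision variable at a time.
        new = []
        for d in range(dim):
            if rng.random() < hmcr:
                # HMC rule: reuse a value stored in memory.
                value = rng.choice(memory)[d]
                if rng.random() < par:
                    # PA rule: nudge the remembered value slightly.
                    value += rng.uniform(-bandwidth, bandwidth)
                    value = min(hi, max(lo, value))
            else:
                # RS rule: pick a completely random value.
                value = rng.uniform(lo, hi)
            new.append(value)
        # Step 3: update the HM if the new harmony beats the worst one.
        new_score = objective(new)
        worst = max(range(hms), key=lambda i: scores[i])
        if new_score < scores[worst]:
            memory[worst], scores[worst] = new, new_score
    best = min(range(hms), key=lambda i: scores[i])
    return memory[best], scores[best]


def sphere(x):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    print(harmony_search(sphere, dim=2, bounds=(-5.0, 5.0)))
```

For task scheduling, a harmony would more naturally be a discrete task-to-VM mapping and the pitch adjustment would move a task to a neighbouring VM index; that variant is left out here to keep the sketch short.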

3.2 Flow Diagram of the Harmony Search Algorithm

See Fig. 2.

Fig. 2 Proposed model flow diagram


3.3 Input Definitions

The Harmony Search Algorithm has three input parameters: the Harmony Memory Considering Rate (HMCR), the Pitch Adjustment Rate (PAR), and the Random Selection (RS) value. These three values determine the solutions produced by the algorithm and the efficiency of the optimization; by tuning them, we can obtain better performance when scheduling tasks in the cloud.

• Harmony Memory Considering Rate (HMCR): This is the exploitation component of the HS Algorithm and greatly increases its elitism. If we remove the HMC rule and apply the PA rule only to the RS rule, the intensity of exploration increases, but convergence and the quality of the final solutions decrease; the algorithm then behaves like a completely random search.
• Pitch Adjustment Rate (PAR): The PA rule does improve the performance of the algorithm, but the improvement is not as tangible as might be expected. Published research indicates that the PA rule is not very effective in the early iterations, when the algorithm is expected to search the whole solution space. Its main use is to prevent the algorithm from getting stuck in local optima and to direct it towards the global optimum. To increase the efficiency of the HS Algorithm, however, an efficient adjustment concept needs to be applied to the PA rule.
• Random Selection (RS): The probability of applying the RS rule is typically the complement of the probability of applying the HMC rule, that is, 1 − HMCR. In the absence of the RS rule, the PA rule works as a local search and the algorithm becomes trapped in local optima. The RS rule also has more influence in the initial iterations of the algorithm than in the final ones.
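As a worked illustration of how these parameters interact, assume the common formulation in which the PA rule is applied only to values drawn by the HMC rule. The per-variable probabilities of the three outcomes then follow directly from HMCR and PAR. The values HMCR = 0.9 and PAR = 0.3 are illustrative assumptions, not settings reported in this chapter.

```latex
\begin{align*}
P(\text{RS}) &= 1 - \mathrm{HMCR} = 1 - 0.9 = 0.10,\\
P(\text{HMC without PA}) &= \mathrm{HMCR}\,(1 - \mathrm{PAR}) = 0.9 \times 0.7 = 0.63,\\
P(\text{HMC with PA}) &= \mathrm{HMCR} \cdot \mathrm{PAR} = 0.9 \times 0.3 = 0.27.
\end{align*}
```

The three probabilities sum to 1, so with these values roughly one variable in ten is drawn at random and about a quarter of all variables are pitch-adjusted.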

4 Simulation Result

We ran simulations for the Harmony Search, Genetic, and Moth Algorithms and use their results for a comparative analysis that assesses the performance of each algorithm and shows how efficient the Harmony Search algorithm is. The same standard settings were used for all three algorithms: four data centres and five virtual machines were allocated to each algorithm, and five simulations were run for each of 500, 1000, 2000, 3000, 4000, and 5000 tasks. From these simulations we created charts of Total Time, Execution Time, Average Utilization (of the VMs), and Simulation Time to compare the three algorithms. Given in the table above are the results we obtained from simulating the Harmony Search Algorithm. Using these results, we compare the HS, GA, and Moth Algorithms on Total Time, Execution Time, Simulation Time, and Average Utilization.

Fig. 3 Total time comparison between Moth, GA, and HS Algorithms

In Fig. 3, we see that the total time taken by the three algorithms (Moth, Genetic, and Harmony Search) remains almost the same across the task counts assigned to them. While the Genetic and Harmony Search Algorithms perform almost identically for all ranges, the total time for the 4000–5000 task range is slightly higher for the Moth Algorithm. The values in Fig. 3 are too similar to show any notable difference, but Fig. 4 shows a clear difference between the three algorithms from an earlier stage. From the 2000-task mark, the Execution Time of the Moth Algorithm exceeds that of the Harmony Search and Genetic Algorithms, while the latter two stay around similar values for all the predefined task ranges.

Fig. 4 Execution time comparison between the Moth, GA, and HS Algorithms

Fig. 5 illustrates the difference between the three algorithms in Simulation Time. From this graph, we can infer that in a cloud environment the time consumed when using the Moth Algorithm will be extremely high, as larger task sets take longer and consume more bandwidth, making it uneconomical for both the cloud service provider and the client who uses the service.

Fig. 5 Simulation time comparison between the Moth, GA, and HS Algorithms

Fig. 6 shows the difference between the algorithms in the Average Utilization of the VMs allocated for the given tasks at a specific data centre. From this graph, it is evident that the Average Utilization of the Moth Algorithm is significantly lower than that of the Harmony Search and Genetic Algorithms. It is also evident that both HS and GA peak numerous times during the process, with GA reaching peaks more frequently and at higher levels than HS.

Fig. 6 Average utilization comparison between the Moth, GA, and HS Algorithms
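The chapter does not define the compared metrics formally. One plausible reading, given here purely as an assumption, is that the total time is the finish time of the busiest VM (the makespan), the execution time is the summed running time of all tasks, and the average utilization is the mean fraction of the makespan each VM spends busy. The function name `schedule_metrics` and the input encoding are hypothetical.

```python
def schedule_metrics(assignment, task_lengths, vm_speeds):
    """Compute simple metrics for a task-to-VM assignment.

    assignment[i] is the VM index that runs task i. The metric
    definitions are assumptions made for illustration only.
    """
    busy = [0.0] * len(vm_speeds)
    for task, vm in enumerate(assignment):
        busy[vm] += task_lengths[task] / vm_speeds[vm]
    makespan = max(busy)
    if makespan > 0:
        utilization = [b / makespan for b in busy]
    else:
        utilization = [0.0] * len(busy)
    return {
        "makespan": makespan,
        "total_execution_time": sum(busy),
        "average_utilization": sum(utilization) / len(utilization),
    }


if __name__ == "__main__":
    print(schedule_metrics([0, 1, 0, 1], [4.0, 2.0, 2.0, 2.0], [1.0, 2.0]))
```

In the usage example, VM 0 is busy for 6 time units and VM 1 for 2, so the makespan is 6, the total execution time is 8, and the average utilization is (1 + 1/3) / 2 = 2/3.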

5 Conclusion

The simulations and their results show that, to use the Harmony Search Algorithm in a cloud environment, it should be applied in a mode that takes into account both the Execution Time of the tasks and the Average Utilization of the virtual machines assigned to a particular entity. Since the Harmony Search Algorithm performed very uniformly on both measures, we conclude that the best place to use it in a cloud environment is a cost-aware structure, because a cloud service provider determines the amount to charge a client by analysing how much the virtual machines were utilized and how long they were running.

References

1. Priyadarsini, R.J., Arockiam, L.: A framework to optimize task scheduling in cloud environment. Int. J. Comput. Sci. Inf. Technol. 5(6), 7060–7062 (2014)
2. Ahangaran, M., Ramezani, P.: Harmony search algorithm: strengths and weaknesses. J. Comput. Eng. Inf. Technol. 2(1) (2013)
3. Eberhart, R.C., Shi, Y.: Particle swarm optimization: developments, applications and resources. In: Proceedings of the 2001 Congress on Evolutionary Computation (IEEE Cat. No. 01TH8546), vol. 1, pp. 81–86. IEEE (2001)
4. Tchórzewski, J., Respício, A., Kolodziej, J.: ANN-based secure task scheduling in computational clouds. In: ECMS, pp. 468–474 (2018)
5. Dorigo, M., Birattari, M., Stutzle, T.: Ant colony optimization. IEEE Comput. Intell. Mag. 1(4), 28–39 (2006)
6. Karaboga, D.: An idea based on honey bee swarm for numerical optimization. Technical Report TR06, Erciyes University, Engineering Faculty, Computer Engineering Department, vol. 200, pp. 1–10 (2005)
7. Maqableh, M., Karajeh, H.: Job scheduling for cloud computing using neural networks. Commun. Netw. 6(03), 191 (2014)
8. Abualigah, L., Diabat, A.: A novel hybrid antlion optimization algorithm for multi-objective task scheduling problems in cloud computing environments. Cluster Comput. 1–19 (2020)
9. Selvi, S.T., Valliyammai, C., Dhatchayani, V.N.: Resource allocation issues and challenges in cloud computing. In: 2014 International Conference on Recent Trends in Information Technology, Chennai, India, pp. 1–6 (2014)
10. Kilic, H., Yuzgec, U.: Improved antlion optimization algorithm via tournament selection. In: 2017 9th International Conference on Computational Intelligence and Communication Networks (CICN), Girne, Northern Cyprus, pp. 200–205 (2017)
11. Petrović, M., Petronijević, J., Mitić, M., Vuković, N., Miljković, Z., Babić, B.: The Ant Lion optimization algorithm for integrated process planning and scheduling. Appl. Mech. Mater. 834, 187–192 (2016)

Neural Inspired Ant Lion Algorithm for Resource Optimization in Cloud

Devansh Gulati, Mehul Gupta, Dinesh Kumar Saini, and Punit Gupta

Abstract Various task scheduling models are currently in use, but the one we focus on is the artificial neural network (ANN) based model. This model was set up to estimate the task execution status for resource allotment among the candidates. An ANN-based model uses various scheduling algorithms to find the best possible results in terms of quality of service (QoS), total cost, service satisfaction, and so on. In this paper, we simulate several task scheduling algorithms in a virtual environment and compare their efficiency based on the simulation results. While our focus is on an emerging meta-heuristic optimization algorithm called the Ant Lion Algorithm, we also run simulations for the Whale Optimization algorithm and the Genetic Algorithm. We use the Ant Lion Optimization Algorithm for the prediction and allocation of cloud resources, and an artificial neural network (ANN) for resource allocation. The results indicate that this approach performs better than existing methods, with proper allocation of resources and minimal cost.

Keywords Cloud computing · QoS · Resource allocation · Meta-heuristic · Load balancing · Artificial neural network

1 Introduction Cloud computing has grown into a topic of emerging research due to features such as wide accessibility, flexibility, and cost [1]. It integrates the capabilities of grid computing with an Internet-based communication model across data centers that host many computers. Cloud resources are provided remotely, so that users can share data and integrate their tasks into the cloud [2]. Its concerns include consolidated computing, resource sharing, raw compute capacity, balanced load management, and resource utilization, all of which carry production and management costs.
D. Gulati · M. Gupta · D. K. Saini · P. Gupta (B) Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_12


Several features make cloud computing attractive and suggest that the demand for cloud computing services will keep growing. These include:
a. Pooling of resources: The large number of consumers in cloud computing leads to resources being pooled together. This leads to multi-tenancy and to the dynamic allocation and de-allocation of these resources. It is an elastic process in which allocation follows demand.
b. On-demand service: Being based on self-service models, cloud computing lets consumers carry out activities such as management, scheduling, and deployment, and obtain computing capacity without interacting with the provider.
c. Cheap pricing: With no upfront cost, cloud computing is billed entirely on usage; customers can monitor their usage and manage costs based on the resources they want to consume.
d. Quality of Service (QoS): Quality of service for users is of utmost importance in cloud computing. The service agreement must cover factors such as an adequate number of resources, good performance, no bandwidth issues, and 24/7 access to the cloud server.

As cloud computing is still in its infancy, it has some issues that may need to be resolved before consumers put their trust in this technology. The issues include:
1. As the industry adopts cloud computing, the rapid increase in the number of users and data centers leads to an increase in total power consumption.
2. Load is distributed between data centers without information about their power consumption, so consumption cannot be compared with the usual power consumption under capacity load.
• The scheduling algorithms currently in use aim to balance the load when the request load increases, not when the power consumed increases.
• Data centers with high loads may slow down due to the higher number of compute requests, which also leads to higher power consumption.
• Data centers with less load may consume more power than the computations require, because the information is not completely available.
• Requests with time limits may miss their deadlines if there is a high load on a data center, which can be a serious problem for the consumer.

General issues of the cloud are the inadequacy of resources and expertise, cloud cost management, the resource optimization problem, dealing with multi-cloud environments, migration to hybrid cloud, vendor lock-in, and cloud integration.


2 Related Work In this section, some of the related work from the field of task scheduling in the cloud is discussed.

A. A Framework that is used to Optimize Task Scheduling in the Cloud Environment [3]
In this work, R. Jemina Priyadarsini and L. Arockiam explain the essence of cloud computing, load balancing, task scheduling, and task allocation, and how they are interconnected. They describe the growth of cloud computing and the steps through which optimization algorithms have reached their current stage. They note that task scheduling is a well-known non-deterministic polynomial time (NP) problem, which motivates researchers to keep looking for better ways to optimize scheduling algorithms. The cloud is generally expected to provide its users with Quality of Service (QoS); when task scheduling management becomes part of task allocation, Cloud Service Providers (CSPs) must uphold their QoS, which is why task scheduling algorithms play an important role. In recent years, progress has been made in optimizing task scheduling algorithms, moving from hybrid meta-heuristic methods based on Genetic Algorithms to task scheduling with Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) algorithms. The authors conclude by elaborating a framework to enhance cloud scheduling, which proceeds in three steps:
• Resource Identification
• Resource Allocation
• Job Execution
Since the use of cloud computing technologies has increased massively, efficient resource allocation strategies with Service Level Agreements (SLAs) are essential to achieve satisfaction, maintain QoS, and maximize profits for the CSP. The paper covers the main variations of task scheduling as well as the optimization strategy and its impact on the behavior of the cloud system.

B. Harmony Search Algorithm: Strengths and Weaknesses [4]
In this paper, Milad Ahangaran and Pezhman Ramezani discuss the Harmony Search (HS) Algorithm and its three rules: the Harmony Memory Considering (HMC) rule, the Pitch Adjustment (PA) rule, and the Random Selection (RS) rule. Their premise is that a thorough analysis of search mechanisms is essential to understanding metaheuristic algorithms, and they provide such an analysis in the form of the strengths and weaknesses of the Harmony Search Algorithm by describing

and analyzing the above-mentioned rules to better understand their consequences for the algorithm's performance. The authors describe the evolutionary process of solutions in the Harmony Search Algorithm, which consists of Initialization, in which the algorithm initializes several solutions at random, and Improvisation, in which HS generates a new solution using the RS, HMC, and PA rules. The value of each decision variable is selected either by the HMC rule, with probability HMCR (the harmony memory considering rate), or by the RS rule, with probability 1 − HMCR. The PA rule is then used to adjust the values of the decision variables chosen by the HMC rule. They also point out a serious drawback of the algorithm: it cannot maintain an effective balance between global and local search. Excluding the PA rule, all other concepts in the HS algorithm can be applied comprehensively, which has turned the attention of researchers to this shortcoming in order to improve the overall performance of the algorithm. Further research and testing led to the Improved Harmony Search (IHS) Algorithm, which tries to improve the performance of HS by dynamically increasing the pitch adjusting rate (PAR) and decreasing the bandwidth. The only drawback of IHS was the low value of PAR, which prompted further study to eliminate this shortcoming; hence, the Highly Reliable Harmony Search (HRHS) algorithm was proposed. The major distinguishing factor between HRHS and IHS is the way in which they adjust PAR. To remove the drawback of IHS, HRHS keeps the value of PAR high in its initial iterations and gradually brings it down.
While HRHS and its further variants may have minor shortcomings, it is noteworthy that researchers continue to develop the algorithm to remove even small deficiencies and make it as precise as possible. After an extensive empirical investigation of the effects of the Harmony Search rules on performance, the authors concluded that the HMC rule ensures the supremacy of this algorithm over the others: it helps the algorithm focus on better-than-average solutions and thereby increases the rate of convergence. The RS rule works as a comprehensive search tool through all iterations; acting as a global search in early iterations, it can greatly affect the algorithm's performance and helps prevent it from getting stuck in local optima. In the final iterations, however, the RS rule can unsettle the algorithm's performance. In contrast, the PA rule runs a localized search through all iterations and, acting as a local search in the final iterations, helps the algorithm obtain better results. This shows that applying the PA rule to the result of the HMC rule is a good idea, as it helps the algorithm balance exploration and exploitation.
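The improvisation step described above can be illustrated with a minimal Python sketch. It is not the implementation from [4]; the function names, the parameter values (hmcr, par, bandwidth), the greedy replacement of the worst harmony, and the sphere test function are all assumptions chosen for illustration.

```python
import random


def improvise(memory, bounds, hmcr=0.9, par=0.3, bandwidth=0.05, rng=random):
    """Build one new harmony from the harmony memory (all rates are assumed values)."""
    new = []
    for d, (low, high) in enumerate(bounds):
        if rng.random() < hmcr:
            # HMC rule: reuse the value of variable d from a stored harmony.
            value = rng.choice(memory)[d]
            if rng.random() < par:
                # PA rule: nudge the reused value within the bandwidth.
                value += rng.uniform(-1.0, 1.0) * bandwidth * (high - low)
        else:
            # RS rule: draw a fresh value, with probability 1 - HMCR.
            value = rng.uniform(low, high)
        new.append(min(max(value, low), high))  # keep the value inside its bounds
    return new


def harmony_search(objective, bounds, memory_size=10, iterations=2000, seed=0):
    """Minimize `objective`; the worst harmony is replaced when a better one is found."""
    rng = random.Random(seed)
    # Initialization: fill the memory with random solutions.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(memory_size)]
    for _ in range(iterations):
        candidate = improvise(memory, bounds, rng=rng)
        worst = max(memory, key=objective)
        if objective(candidate) < objective(worst):
            memory[memory.index(worst)] = candidate
    return min(memory, key=objective)


def sphere(x):
    """Hypothetical test objective: sum of squares, minimal at the origin."""
    return sum(v * v for v in x)


best = harmony_search(sphere, [(-5.0, 5.0)] * 3)
print(best, sphere(best))
```

Raising `hmcr` makes the search lean more on stored solutions, while `par` and `bandwidth` control how strongly the PA rule refines them locally; the IHS and HRHS variants discussed above change these values over the iterations instead of keeping them fixed.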

C. Particle Swarm Optimization: Development, Applications, and Resources [5]
In this paper, Russell C. Eberhart and Yuhui Shi assess the growth of the Particle Swarm Optimization (PSO) algorithm, the major developments it went through since its inception in 1995, and what, in their view, the future held for it. The authors first describe the original version of the PSO algorithm, followed by discussions of constriction factors, the inertia weight, and the tracking and optimizing of dynamic systems. The PSO concept began as a simulation of a simplified social system, whose initial goal was to graphically simulate the unpredictable movement of a flock of birds. These simulations were later developed to incorporate nearest-neighbor velocity matching, eliminate ancillary variables, and include multidimensional search and acceleration by distance travelled. Because it starts from a population of random solutions, the algorithm is quite similar to the Genetic Algorithm. The one major difference between the two is that, in particle swarm, a random velocity is assigned to every potential solution. These potential solutions, called particles, are then flown through the problem space. PSO has been used to evolve Neural Networks (NNs), to optimize end milling (a common metal removal operation in manufacturing), to control voltage and reactive power at an electric utility, to train a neural network as a charge-value estimator for electric vehicles, and for ingredient mix optimization, where PSO is used along with other algorithms to optimize an ingredient mixture for growing strains of different microorganisms.
They concluded the paper by stating that the PSO algorithm, like other evolutionary computation algorithms, could be used to solve most optimization problems and those problems that can be converted into optimization problems. PSO had numerous application areas, those with the most potential being multi-objective optimization, classification, biological system modelling, pattern recognition, system design, scheduling (planning), signal processing, robotic applications, decision making, simulation, and identification.

D. NN-based secure task scheduling in computational clouds [6]
In this paper, Tchórzewski, J., Respício, A., and Kolodziej, J. proposed a smart system to support and automate decision making related to security and task scheduling in cloud services. The system consisted of two Neural Networks (NNs) and an evolutionary scheduler, with the prime target of sorting the tasks arriving at the Computational Cloud (CC) according to their security demands. One of the NNs was a classifier/sorter NN, which classified and sorted batches arriving at the cloud; the other was an Expert NN that predicted the VM configurations. The

Evolutionary Scheduler optimized the scheduling of tasks relying on an Evolutionary Algorithm. The experiments in their paper demonstrated and confirmed the efficiency of the system. They aimed to design the system for cloud service providers and consumers using the Infrastructure as a Service (IaaS) or Platform as a Service (PaaS) cloud computing models. In the future, they wished to introduce genetic algorithms to support projects based on the automatic detection of security threats.

E. Ant Colony Optimization [7]
In this paper, the authors discuss Ant Colony Optimization (ACO), which takes inspiration from the foraging behavior of some ant species. ACO solves optimization problems with a mechanism similar to the way ants deposit pheromone on the ground to mark favorable paths that other members of the colony should follow. The article aims to familiarize readers with ant colony optimization and to examine its most prominent applications. Starting from the biological background on the foraging behavior of ants, the authors move on to describing ant colony optimization and its main variants, such as ACO for the Travelling Salesman Problem, the Ant Colony Optimization metaheuristic, and the Ant System algorithm. The authors further describe the prominent theoretical results concerning ACO and its successful applications. A significant segment of ACO research in the past few years is concerned with its usage and applications. The topics that are still under active research include:
• Dynamic Optimization Problems
• Stochastic Optimization Problems
• Multi-Objective Optimization
• Parallel Implementations
• Continuous Optimization
The authors also give an overview of various other ant-inspired algorithms. This includes topics like:
• Algorithms Inspired by Foraging and Path Marking
• Algorithms Inspired by Brood Sorting
• Algorithms Inspired by Division of Labor
• Algorithms Inspired by Cooperative Transport
Thousands of researchers worldwide now apply ACO to various optimization problems. The article illustrates ACO on the Travelling Salesman Problem and presents the Ant Colony Optimization metaheuristic along with the main ACO algorithms: Ant System, Max–Min Ant System, and Ant Colony System. In the Travelling Salesman Problem, a number of cities is given, and all that is known is the distance between them. The goal is to find the shortest tour that visits all the cities. In ACO, artificial ants move on a graph that encodes the problem in a simulated environment. We use vertices and edges to denote

cities and the links, respectively. Each edge carries a variable value called pheromone, which is read and modified by the ants when they arrive at the location. Because ACO is an iterative algorithm, multiple artificial ants are considered before the results are formulated. At each step of constructing a path, an ant chooses the next vertex with a stochastic mechanism based on the pheromone, avoiding the vertices it has previously visited.

F. An Idea Based on Honey Bee Swarm for Numerical Optimization [8]
In this paper, Dervis Karaboga discusses a specific foraging behavior of honeybee swarms and a new Artificial Bee Colony (ABC) algorithm. The ABC algorithm was designed by mimicking the behavior of real honeybees and is applied to multimodal and complex optimization problems. In the model, the colony consists of three groups of bees: scouts, onlookers, and employed bees. Employed artificial bees constitute the first half of the colony and onlookers the second half. There is only one employed bee for every available food source; when a food source is exhausted, its employed bee becomes a scout and helps find new food sources.
Many other optimization approaches exist, such as optimization using neural networks, whale optimization, swarm intelligence, cost-based dynamic scheduling, ant lion based optimization, and many more [9–18].
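The ACO tour construction summarized in subsection E above (pheromone on edges, a stochastic choice of the next vertex, and no revisiting of cities) can be sketched in Python as follows. This is a simplified Ant System style loop and not the code from [7]; the parameter names and values (alpha, beta, evaporation, ants, iterations) and the four-city instance are assumptions for illustration.

```python
import math
import random


def build_tour(dist, pheromone, alpha=1.0, beta=2.0, rng=random):
    """One artificial ant builds a closed tour that visits every city once."""
    n = len(dist)
    tour = [rng.randrange(n)]
    unvisited = set(range(n)) - {tour[0]}
    while unvisited:
        here = tour[-1]
        candidates = sorted(unvisited)  # visited vertices are never candidates
        # Attractiveness of an edge: pheromone^alpha * (1 / distance)^beta.
        weights = [pheromone[here][c] ** alpha * (1.0 / dist[here][c]) ** beta
                   for c in candidates]
        nxt = rng.choices(candidates, weights=weights)[0]
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def tour_length(dist, tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def ant_system(dist, ants=10, iterations=50, evaporation=0.5, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    pheromone = [[1.0] * n for _ in range(n)]
    best, best_len = None, float("inf")
    for _ in range(iterations):
        tours = [build_tour(dist, pheromone, rng=rng) for _ in range(ants)]
        # Evaporation: every edge loses a fixed fraction of its pheromone.
        for i in range(n):
            for j in range(n):
                pheromone[i][j] *= 1.0 - evaporation
        # Deposit: each ant reinforces its edges, shorter tours deposit more.
        for tour in tours:
            length = tour_length(dist, tour)
            if length < best_len:
                best, best_len = tour, length
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                pheromone[a][b] += 1.0 / length
                pheromone[b][a] += 1.0 / length
    return best, best_len


# Hypothetical instance: four cities on the corners of a unit square.
cities = [(0, 0), (0, 1), (1, 1), (1, 0)]
dist = [[math.hypot(ax - bx, ay - by) for bx, by in cities] for ax, ay in cities]
best, best_len = ant_system(dist)
print(best, best_len)
```

The exponents `alpha` and `beta` trade off the learned pheromone trail against the greedy preference for short edges; the variants named above (Max–Min Ant System, Ant Colony System) differ mainly in how the pheromone update is restricted.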

3 Proposed Model Ant lion optimization (ALO) is an emerging meta-heuristic optimization algorithm. The algorithm has two types of population: the ants and the ant lions. The ants explore the search space by random walking. The ant lions hold the best positions the ants have discovered and update their own position whenever a better one is found. A special ant lion, called the elite ant lion, also affects the random walks of all ants irrespective of its distance to them. The elite ant lion is replaced by any other ant lion that finds a better position in the search space. Once the random walking and the search for the best position are completed, the algorithm stops and the position of the elite ant lion is selected as the final optimal response. To sum up, the ants first move freely in the search space, and then the ant lions are allowed to hunt. The ant lion optimization algorithm is a swarm intelligence algorithm for solving optimization problems. ALO follows the method of random walking in a defined search landscape, as discussed. Every ant follows a different random walk, which increases the exploration of the algorithm. The random walks are normalized with respect to the lower and upper bounds of each variable. These walks should gravitate towards the ant lion, which reflects how ants get trapped in the sand pits made by ant lions in nature. The fittest ant lion has the highest probability of influencing the movement of the ants; this is simulated using a roulette wheel. Only two ant lions can affect an ant's movement, i.e., the ant lion selected by the roulette wheel and the elite ant lion. As the number of iterations increases, the range of the random walks decreases proportionately. The ant lions are updated in every iteration in which an ant with better fitness is found.

I. Phases of Ant Lion Algorithm
The proposed Ant Lion Algorithm consists of the following steps:

• Initialization: In this phase the random location of the ants is defined. This phase is responsible for the initialization of the basic cloud infrastructure:
\[ RW_i = \left[ RW_i^1, \ldots, RW_i^k, \ldots, RW_i^n \right] \tag{1} \]
The walk of the ant at the kth iteration can be defined by Eq. 2.
\[ NRW_i^k = \left( RW_i^k - a \right) / (b - a) \tag{2} \]
where RW is the position of an individual ant.
• Grab the prey: This phase defines the boundary limit of each ant, where LB is the lower bound and UB is the upper bound limit of the iteration,
\[ LB^k = LB/D; \quad UB^k = UB/D \tag{3} \]
\[ LB_j = AL_j + LB_j^k \tag{4} \]
• Create an ant ambush
\[ Ant_i = \left[ \left( NRW_{i,j}^k + NRW_{i,e}^k \right) \left( UB^k - LB^k \right) + LB_j^k + LB_e^k \right] / 2 \tag{5} \]
• Update fitness function: This step evaluates the fitness value of the new position of the ant after each iteration, where α + β = 1.
\[ \mathrm{Fitness\_value}_i = \alpha * \mathrm{Utilization} + \beta * \mathrm{Total\_Execution\_time}_i \tag{6} \]
\[ \mathrm{Total\_Execution\_time}_i = \sum_{i=1}^{n} \mathrm{Task\_Length}_i / \mathrm{MIPS}_j \tag{7} \]
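The phases above can be made concrete with a short Python sketch of Eqs. (1), (2), (5), (6), and (7). It is only an illustration under stated assumptions and not the authors' implementation: a and b are taken to be the minimum and maximum of the random walk (as in the original ant lion optimizer [7]), the roulette wheel is assumed to select proportionally to a non-negative score, Eq. (7) is assumed to sum over the tasks placed on one virtual machine, and all function names and numeric values are hypothetical.

```python
import random


def random_walk(steps, rng):
    """Eq. (1): cumulative sum of +1/-1 steps, starting from 0."""
    walk = [0.0]
    for _ in range(steps):
        walk.append(walk[-1] + (1.0 if rng.random() > 0.5 else -1.0))
    return walk


def normalized(walk, k):
    """Eq. (2), assuming a and b are the minimum and maximum of the walk."""
    a, b = min(walk), max(walk)
    return (walk[k] - a) / (b - a)


def roulette(scores, rng):
    """Pick an ant lion index with probability proportional to its score (assumed)."""
    return rng.choices(range(len(scores)), weights=scores)[0]


def ant_position(nrw_j, nrw_e, lb_k, ub_k, lb_jk, lb_ek):
    """Eq. (5): mean of the walks around the selected (j) and elite (e) ant lions."""
    return ((nrw_j + nrw_e) * (ub_k - lb_k) + lb_jk + lb_ek) / 2.0


def total_execution_time(task_lengths, mips):
    """Eq. (7), assuming all listed tasks run on one VM of speed `mips`."""
    return sum(length / mips for length in task_lengths)


def fitness_value(utilization, execution_time, alpha=0.5):
    """Eq. (6) with beta = 1 - alpha, so that alpha + beta = 1."""
    return alpha * utilization + (1.0 - alpha) * execution_time


rng = random.Random(42)
walk_j, walk_e = random_walk(100, rng), random_walk(100, rng)
nrw_j, nrw_e = normalized(walk_j, 10), normalized(walk_e, 10)
chosen = roulette([0.2, 0.5, 0.3], rng)

# Hypothetical numbers: bounds [0, 10], offsets 2 and 4 for the two ant lions.
position = ant_position(0.25, 0.75, 0.0, 10.0, 2.0, 4.0)

# Hypothetical VM: three tasks (in million instructions) on a 1000 MIPS machine.
exec_time = total_execution_time([4000, 6000, 10000], mips=1000)
score = fitness_value(utilization=0.8, execution_time=exec_time)
print(position, exec_time, score)
```

With these numbers Eq. (5) gives ((0.25 + 0.75) · 10 + 2 + 4) / 2 = 8, and Eq. (7) gives 4 + 6 + 10 = 20 time units. Because utilization and execution time are on different scales, the choice of α (and any normalization of the two terms) strongly affects Eq. (6); the text does not state the values used.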


Fig. 1 Flow diagram

II. Flowchart of Ant Lion Algorithm
The flow diagram of the proposed algorithm is shown in Fig. 1, which shows the working of the proposed scheduling algorithm. Figure 2 shows the layered architecture and the working of the proposed model for task scheduling.

Fig. 2 Proposed model algorithm

4 Results Simulations were run on three algorithms, i.e., the Whale Optimization Algorithm (WOA), the Genetic Algorithm (GA), and the Ant Lion Optimization (ALO) algorithm, to obtain results such as execution time, total time, simulation time, and average utilization. These results enable us to analyze the algorithms comparatively. Using CloudSim (Java), we allocated 4 data centers and 5 virtual machines to run the simulations on each of the above-mentioned algorithms, varying the number of tasks to be allocated. We started with 1000 tasks and added 1000 tasks in every subsequent simulation, so the range varied from 1000 to 5000 tasks. With the help of RStudio, we then ran these algorithms and used a common cost function FUN to obtain the minimal cost for every simulation. The simulations give the following results, which show how ALO is a better approach for cloud computing and which factors make it better than the others. On checking the total time taken by the algorithms (Fig. 3), we observe that GA takes the most time. The time taken by WOA and ALO is almost the same at the end, but ALO shows a dip in the time taken around 4000 tasks, whereas the time taken otherwise increases continuously.
Fig. 3 Total time taken by each algorithm


Fig. 4 Simulation time of each algorithm

So, from this trend we can conclude that ALO takes the minimum total time of all the algorithms considered. In Fig. 4 we see that the simulation time taken by our ALO is the maximum, while the other algorithms take relatively less time for simulation, with GA taking the minimum even with many scheduled tasks. From Fig. 5, plotted for the average time taken by each algorithm to run a varied number of tasks, we conclude that the Antlion algorithm performed best while WOA was the slowest. The Genetic Algorithm performed better than the Antlion algorithm when allotting 4000 tasks, but when we take the overall trend into consideration, we conclude that Antlion takes the minimum amount of time. In Fig. 6, we notice various spikes in the behavior of all the algorithms. GA stays consistently on top, thus utilizing the virtual machines the most. Around 4000 tasks WOA utilizes the VMs less, but on average ALO is the algorithm with the minimum virtual machine utilization.
Fig. 5 Execution time of each algorithm


Fig. 6 Average utilization of VM for each algorithm

5 Conclusion Our suggested solution dynamically forecasts and deploys cloud resources while overcoming numerous limitations of current systems. Based on the graphs and results above, we conclude that the Antlion Optimization Algorithm takes the minimum execution time as well as the least total time. Although it takes the maximum time for simulation, it performs well on the other factors and gives the most favorable results overall. The Whale Optimization Algorithm takes almost the same total time, but its execution time and virtual machine utilization are greater than those of ALO, which makes it more costly; we therefore get better results from our selected algorithm. The Genetic Algorithm utilizes the virtual machines the most, making it the most costly one; it also takes more time for execution and has the maximum total time, so it is not the most efficient. ALO, taking the least time and being the least costly, is the most efficient among the algorithms we chose and compared. Thus, it can be concluded that the Ant Lion algorithm should be the best option for resource allocation in cloud computing.

References
1. Abualigah, L., Diabat, A.: A novel hybrid antlion optimization algorithm for multi-objective task scheduling problems in cloud computing environments. Cluster Comput. 1–19 (2020)
2. Selvi, S.T., Valliyammai, C., Dhatchayani, V.N.: Resource allocation issues and challenges in cloud computing. In: 2014 International Conference on Recent Trends in Information Technology, Chennai, India, pp. 1–6 (2014)
3. Parpinelli, R.S., Lopes, H.S., Freitas, A.A.: Data mining with an ant colony optimization algorithm. IEEE Trans. Evol. Comput. 6(4), 321–332 (2002)
4. Dorigo, M., Birattari, M., Stutzle, T.: Ant colony optimization. IEEE Comput. Intell. Mag. 1(4), 28–39 (2006)
5. Abdi, S., Motamedi, S., Sharifian, S.: Task scheduling using modified PSO algorithm in cloud computing environment. Int. Conf. Mach. Learn. Electr. Mech. Eng. 37–41 (2014)
6. Karaboga, D.: An idea based on honey bee swarm for numerical optimization. Vol. 200. Technical report-tr06, Erciyes university, engineering faculty, computer engineering department, pp. 1–10 (2005)
7. Mirjalili, S.: The ant lion optimizer. Adv. Eng. Softw. 83, 80–98 (2015)
8. Maqableh, M., Huda K.: Job scheduling for cloud computing using neural networks. Commun. Network 6(03) (2014)
9. Kilic, H., Yuzgec, U.: Improved antlion optimization algorithm via tournament selection. In: 2017 9th International Conference on Computational Intelligence and Communication Networks (CICN), Girne, Northern Cyprus, pp. 200–205 (2017)
10. Petrović, M., Petronijević, J., Mitić, M., Vuković, N., Miljković, Z., Babić, B.: The Ant Lion optimization algorithm for integrated process planning and scheduling. Appl. Mech. Mater. 834, 187–192 (2016)
11. Kiliç, H., Yüzgeç, U.: Parallel Machine Scheduling using Improved Antlion Optimization Algorithm (2015)
12. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of ICNN’95: International Conference on Neural Networks, Perth, WA, Australia, vol. 4, pp. 1942–1948 (1995)
13. Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization. Swarm Intell. 1(1), 33–57 (2007)
14. L.D., D.B., Krishna, V.P.: Honey bee behavior inspired load balancing of tasks in cloud computing environments. Appl. Soft Comput. 13(5), 2292–2303 (2013)
15. Ramezani, M., Bahmanyar, D., Razmjooy, N.: A new optimal energy management strategy based on improved multi-objective antlion optimization algorithm: applications in smart home. SN Appl. Sci. 2(12), 1–17 (2020)
16. Wen, X., Huang, M., Shi, J.: Study on resources scheduling based on ACO algorithm and PSO algorithm in cloud computing. In: 2012 11th International Symposium on Distributed Computing and Applications to Business, Engineering & Science (2012)
17. Karaboga, D., Basturk, B.: A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm. J. Global Optim. 39(3), 459–471 (2007)
18. Ali, E.S., Abd Elazim, S.M., Abdelaziz, A.Y.: Ant lion optimization algorithm for optimal location and sizing of renewable distributed generations. Renew. Energy 101, 1311–1324 (2017)

Data Science and Business Analytics, IoT, AI and ML for Smart Cities

Smart School Selection with Supervised Machine Learning
Deepak Kumar, Chaman Verma, Veronika Stoffová, Zoltán Illes, Anish Gupta, Brijesh Bakariya, and Pradeep Kumar Singh

Abstract In today’s competitive academic environment, parents and students have faced the school selection problem for a decade. Keeping this question in mind, we propose to identify the significant features (academic, social, demographic, etc.) with the help of machine learning algorithms (Support Vector Machine (SVM), Extreme Gradient Boosting (XGB), and Random Forest (RF)). These features will help guardians/parents, schools, and teachers decide on the best school for a student's education. We used a statistical approach (one-way ANOVA) to investigate the impact of school selection reasons on students' grades. The standard open dataset of Portuguese secondary school students was used for the analysis. The Synthetic Minority Over-sampling Technique-Nominal Continuous (SMOTE-NC) was used to resample the imbalanced Reason target class. The proposed automatic school selection recommender might be helpful in every academic community and in intelligent education. We found that school selection reasons have a statistically significant impact on the final grade. RF comes out as the strongest predictor among all proposed models, with an accuracy of 71%. The final grade, going out with friends, parents’ job, and activities are the essential features for Smart School Selection.
D. Kumar · A. Gupta Apex Institute of Technology, Chandigarh University, Mohali, India
C. Verma (B) · Z. Illes Eötvös Loránd University, Budapest, Hungary
V. Stoffová Trnava University, Trnava, Slovakia
B. Bakariya I.K.Gujral Punjab Technical University, Jalandhar, India
P. K. Singh Narsee Monjee Institute of Management Studies (NMIMS), School of Technology Management and Engineering, Chandigarh Campus, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_13


Keywords One-way-ANOVA · Supervised learning · SVM · XGB · RF · SMOTE-NC

1 Introduction and Related Work Artificial intelligence tools and techniques have grown tremendously in the last five years. Applied problems that big IT companies can now resolve in a few hours or days exhibit the capabilities of modern machine learning methods. Moreover, the growing interest of academicians and researchers is visible in recent machine learning research articles that try to make education-related intelligent decisions. In the education sector, there is often a significant need to predict smart school decisions for future students, both to improve schools' curriculum-related decisions and to guide parents in making wise decisions for their wards. This is where intelligent statistical inference comes into the picture: its techniques help us analyze observations and extract meaningful information to assess final student performance. Supervised learning (SL) is an essential type of machine learning that builds a model mapping input to output [1]. Various machine learning techniques, such as SVM, clustering, and deep learning, exist to automate prediction tasks [2]. Students' performance has been predicted from their daily interaction with Moodle module events using RF and SVM algorithms; the RF method achieved higher prediction accuracy than the other predictive model [3]. SL algorithms can make predictions in a wide variety of applications. One study predicted whether students could obtain a certificate using SVM and other prevalent algorithms such as logistic regression and KNN [4], where SVM achieved the highest accuracy (97.78%). Marks from the previous and current semesters were used as input to SVM, RF, and Gradient Boosting (GB) algorithms to predict student grades, and SVM again performed similarly well in prediction [5–7].
Using the XGB model to predict dropout helps institutions manage student retention; the key XGB predictors of dropout are transfer status, institutional selectivity, and institutional control [8]. A supervised cluster-based classifier model combined a clustering technique with a classification algorithm [9], achieving higher accuracy (96.25%) with RF. SVM and RF algorithms were used to predict students’ guardians in Portuguese schools [10]. A study examined the effect of COVID-19 on students by applying the one-way technique to an available student dataset and found negative affect at a low to moderate level compared with positive affect [11]. One-way ANOVA was used to examine the relationship between the safety and security index and human development on a sample of 53 African countries, and a statistically significant difference was found [12]. A statistically significant result was also found in a study of the correlation between higher engagement with social media tools and final grade. Twitter

Smart School Selection with Supervised Machine Learning


use among students and the corresponding learning outcomes showed a significant increase in both engagement and grades for the experimental group [13]. These studies show that machine learning and statistical analysis have both been applied to extract meaningful insights from education datasets. As preliminary research, the main objective of this study is to determine the impact of school selection reasons (close to home, school reputation, course preference, other) on students’ final grades. Additionally, the authors present machine learning models that identify the reason for selecting a school. The results might be useful to school administrations, parents, and students themselves when choosing a school that offers quality education.

1.1 Dataset Description The student performance dataset consists of 650 observations with 33 features, all described in Table 1. It was collected from the UCI Machine Learning Repository [14]. Figure 1 shows that the Gabriel Pereira school has more students than the Mousinho da Silveira school, with more female students. Moreover, most students have internet access, and most students have final grades of 10–15. The majority of students are willing to pursue higher education. Most students come from families of more than three members, as evident from their family size, and their parents are generally well educated. Most of the students are given extra support by their schools. The authors conducted two experiments using the dataset. The first experiment used statistical analysis to explore the impact of school selection reasons (explanatory variable) on the students’ final grades (target variable). The second experiment predicted the variable reason from 30 explanatory variables, excluding G1 and G2 to remove multicollinearity from the dataset. This experiment solved a multiclass classification problem with machine learning algorithms (Fig. 1). Figure 2 shows that course preference accounts for 43.9% of school selections, followed by proximity of the school to home at 23%. School reputation contributes 22% and other reasons 11.1%. Figure 3 shows the boxplot of grade versus reason of school selection. The mean score for reputation is 12.5, which indicates that reputation has a greater impact on grades than home, course, and other reasons. Hence, students who chose a school for its reputation or for its courses have higher grades.

2 Preprocessing This section deals with the proposed supervised machine learning algorithms, namely SVM, RF, and XGB. The dataset under consideration is partitioned into a train


D. Kumar et al.

Table 1 Dataset description

| Feature | Data type with domain values | Description |
|---|---|---|
| School | Nominal: GP = Gabriel Pereira, MS = Mousinho da Silveira | Student’s school name |
| Guardian | Nominal: Mother, Father, Other | Student’s guardian |
| Sex | Nominal: F = female, M = male | Student’s gender |
| Traveltime | Numeric: 1 to 1 hr | Home to school travel time |
| Age | Numeric: 15–22 | Age of student |
| Studytime | Numeric: 1 to 10 hr | Student’s weekly study time |
| Address | Nominal: U = Urban, R = Rural | Residence community |
| Failure | Numeric: n if 1 ≤ n < 3, else 4 | No. of past class failures |
| Famsize | Nominal: LE3 = Less or equal to 3, GT3 = Greater than 3 | Family size of student |
| Schoolsup | Nominal: Yes, No | Whether the school supports the student’s study |
| Pstatus | Nominal: T = Living together, A = Apart | Student’s parents living together or apart |
| Famsup | Nominal: Yes, No | Whether the student has family support |
| Medu | Nominal: 0 = None, 1 = Primary education (4th grade), 2 = 5th to 9th grade, 3 = Secondary education, 4 = Higher education | Student’s mother’s education background |
| Paid | Nominal: Yes, No | Whether student attended extra paid classes |
| Fedu | Nominal: 0 = None, 1 = Primary education (4th grade), 2 = 5th to 9th grade, 3 = Secondary education, 4 = Higher education | Student’s father’s qualification |
| Activities (extra-curricular activities) | Nominal: Yes, No | Whether student attended extra-curricular activities |
| Mjob | Nominal: Teacher, Health care related, Civil services, At home, Other | Student’s mother’s job type |
| Nursery | Nominal: Yes, No | Whether student attended nursery school |
| Fjob | Nominal: Teacher, Health care related, Civil services, At home, Other | Student’s father’s job type |
| Higher | Nominal: Yes, No | Whether student wants to go for higher study |
| Reason | Nominal: 0 = Close to home, 1 = School reputation, 2 = Course preference, 3 = Other | Student’s reason to choose this school |
| Dalc | Nominal: 1 (very low) to 5 (very high) | Student’s workday alcohol consumption |
| Internet | Nominal: Yes, No | Whether student has internet at home |
| Health | Nominal: 1 (very bad) to 5 (very good) | Student’s current health status |
| Romantic | Nominal: Yes, No | Whether student is in a relationship |
| Walc | Nominal: 1 (very low) to 5 (very high) | Student’s weekend alcohol consumption |
| Famrel | Nominal: 1 (very bad) to 5 (excellent) | Quality of student’s family relationships |
| Absences | Numeric: 0–93 | School absenteeism |
| Freetime | Nominal: 1 (very low) to 5 (very high) | How much free time the student has after school |
| G1 | Numeric: 0–20 | Student’s first-sem grade |
| Goout | Nominal: 1 (very low) to 5 (very high) | How often the student goes out with friends |
| G2 | Numeric: 0–20 | Student’s second-sem grade |
| G3 | Numeric: 0–20 | Student’s final grade |

test split with an 80:20 ratio. Classification models are judged by their performance metrics (precision, recall, F1-score) and by graphical methods (confusion matrix and receiver operating characteristic curves). The experiments are conducted with three machine learning algorithms on sample data resampled with the SMOTE-NC technique to predict the smart school selection reasons. SMOTE-NC is a well-known variant of SMOTE for datasets that mix nominal and continuous features, including the case where most features are nominal [15]. In our dataset, out of 33 features, 8 are numerical and 25 are nominal, as described in Table 1. Figures 4 and 5 display the reason class counts before resampling and after


Fig. 1 Description of dataset

resampling. The school selection reason classes were imbalanced (Course Pref.: 285, Close to Home: 149, Other: 72, and Reputation: 143); after balancing with the SMOTE-NC algorithm, every reason has 285 instances.
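The balancing step can be illustrated with a minimal sketch. It assumes the majority class size of 285 (the per-group N reported in Table 2) together with the minority counts given in the text, and it shows a simplified version of how a SMOTE-NC style synthetic row is formed: numeric features are interpolated, nominal features take the most frequent neighbour value. The function names, the feature keys, and the omission of the nearest-neighbour search are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import Counter

# Assumed class counts: minority counts from the text, majority size from Table 2.
class_counts = {"course": 285, "home": 149, "reputation": 143, "other": 72}


def synthetic_needed(counts):
    """Number of synthetic rows each class needs to match the largest class."""
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


def make_synthetic(sample, neighbours, numeric_keys, nominal_keys, rng):
    """Build one synthetic row in the spirit of SMOTE-NC (simplified sketch).

    `neighbours` is assumed to already hold the nearest same-class rows;
    the distance computation of the real algorithm is left out here.
    """
    partner = rng.choice(neighbours)
    gap = rng.random()  # interpolation factor in [0, 1)
    row = {}
    for key in numeric_keys:
        # Numeric feature: a point on the segment between sample and partner.
        row[key] = sample[key] + gap * (partner[key] - sample[key])
    for key in nominal_keys:
        # Nominal feature: most frequent category among the neighbours.
        row[key] = Counter(n[key] for n in neighbours).most_common(1)[0][0]
    return row
```

Under these assumptions, `synthetic_needed(class_counts)` gives the number of rows to generate per class, and `make_synthetic` would be called that many times for each minority class. In practice a maintained implementation, such as the SMOTENC sampler in the imbalanced-learn package, would normally be preferred over this sketch.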


Fig. 2 School selection reason distribution

Fig. 3 Final grades based on school selection reason

Fig. 4 Imbalanced class


Fig. 5 SMOTE-NC balanced class

3 Experiments and Results This section describes the experimental framework of the study, which is used for data pre-processing and to generate the confusion matrices and performance metrics. The experimenter application is also used to compare the accuracy and CPU training time of the classifiers using one-way ANOVA at the 0.05 significance level. For Experiment II, which uses prediction algorithms, the following standard metrics are used to compare the proposed algorithms.

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{1} \]

Recall is, by definition, the number of true positive (TP) elements divided by the sum of TP and false negative (FN) elements [16], as stated in Eq. 1. Precision is TP divided by the sum of TP and false positive (FP) elements, as given in Eq. 2.

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{2} \]

The Matthews correlation coefficient (MCC) [17] is a more reliable statistical measure that yields a high score only if the prediction performs well in all four confusion matrix groups (TP, FN, TN, and FP), in proportion to the sizes of the positive and negative classes in the dataset, as stated in Eq. 3.

\[ \text{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{3} \]


The ratio of correctly classified samples to the total number of samples is the most intuitive performance metric; this is the accuracy, stated in Eq. 4.

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4} \]

The F1-score is the harmonic mean of recall and precision. It balances the two and so allows a proper assessment of the model’s success in classification [18], as stated in Eq. 5.

\[ F1\text{-Score} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN} \tag{5} \]

A non-linear analytic relationship exists between the Cohen’s kappa score and widely used classification metrics (e.g., sensitivity and specificity). A kappa score of 1 indicates perfect agreement between actual and predicted labels [19]. Mathematically, Cohen’s kappa is defined as:

\[ K_{\text{score}} = \frac{p_A - p_E}{1 - p_E} \tag{6} \]

where $p_A$ is the observed relative agreement between two annotators and $p_E$ is the hypothetical probability of agreement by chance.
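The metrics of Eqs. 1–6 can be computed together from the four confusion matrix counts. The following is a minimal sketch for the binary case; the function name and the example counts are hypothetical, the MCC uses the standard four-factor denominator, and for kappa the observed agreement is taken to be the accuracy while the chance agreement is derived from the row and column totals. No guard against zero denominators is included.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute the metrics of Eqs. 1-6 from binary confusion matrix counts."""
    total = tp + fp + fn + tn
    recall = tp / (tp + fn)                     # Eq. 1
    precision = tp / (tp + fp)                  # Eq. 2
    mcc = (tp * tn - fp * fn) / math.sqrt(      # Eq. 3
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    accuracy = (tp + tn) / total                # Eq. 4
    f1 = 2 * tp / (2 * tp + fp + fn)            # Eq. 5
    # Cohen's kappa (Eq. 6): p_a is the observed agreement, p_e the agreement
    # expected by chance from the predicted and actual class totals.
    p_a = accuracy
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (p_a - p_e) / (1 - p_e)
    return {
        "recall": recall,
        "precision": precision,
        "mcc": mcc,
        "accuracy": accuracy,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts, chosen only to exercise the formulas.
example = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

For a multiclass problem such as the four school selection reasons, these quantities are usually computed per class (one class against the rest) and then averaged; which averaging the chapter used is not stated.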

3.1 Experiment-I The mean differences in the descriptive statistics of Table 2 suggest a significant difference, which can be confirmed with the one-way ANOVA statistical method. The mean for close to home (0) is comparatively low (11.55), and the maximum mean, 13.04 with a standard deviation of 2.98, is for reputation (3).

Table 2 Descriptive statistics

| Reason | N | Mean | Std. deviation |
|---|---|---|---|
| 0 | 285 | 11.55 | 3.109 |
| 1 | 285 | 12.00 | 2.917 |
| 2 | 285 | 10.47 | 3.771 |
| 3 | 285 | 13.04 | 2.983 |
| Total | 1140 | 11.76 | 3.339 |


Table 3 One way ANOVA computation

| | Sum of square | df | Mean square | F | Sig |
|---|---|---|---|---|---|
| Between groups | 969.385 | 3 | 323.128 | 31.301 | 0.000 |
| Within groups | 11,727.193 | 1136 | 10.323 | | |
| Total | 12,696.578 | 1139 | | | |
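The one-way ANOVA figures of Table 3 can be approximately recovered from the group summaries of Table 2. The sketch below is an illustration under stated assumptions: it works from rounded means and standard deviations rather than the raw data, and it takes the standard deviation of group 0 as 3.109, the value consistent with the within-groups sum of squares in Table 3. The function name and the dictionary keys are hypothetical.

```python
def one_way_anova(groups):
    """One-way ANOVA from per-group summaries.

    `groups` is a list of (n, mean, sample_std) tuples, one per group.
    """
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    # Between-groups variation: group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups variation: rebuilt from each group's sample variance.
    ss_within = sum((n - 1) * std ** 2 for n, _, std in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# (N, mean, std) for reasons 0-3, following Table 2.
table2 = [
    (285, 11.55, 3.109),
    (285, 12.00, 2.917),
    (285, 10.47, 3.771),
    (285, 13.04, 2.983),
]
result = one_way_anova(table2)
```

With these inputs the degrees of freedom are 3 and 1136 and the F statistic comes out at about 31.3; small differences from Table 3 are expected because the means are rounded to two decimals.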

From Table 3 it is apparent that the reason for choosing a school has a significant effect on the final grade G3, F(3, 1136) = 31.30, p < 0.05 at the 5% significance level. The Bonferroni approach is used here to compare the groups (school selection reasons) pairwise; it examines the relationship between variables using thresholds based on the t-distribution. A quick inspection of the Bonferroni post hoc test in Table 4 reveals that close to home and school reputation show no significant difference (p > 0.05). Course preference is still significantly different from the other reasons, namely reputation, close to home, and other (p < 0.05). The mean plot in Fig. 6 displays the same result for close to home and school reputation. No difference was found between close to home and course preference (p > 0.05), but course preference and other reasons displayed a significant difference with p < 0.05 (Table 4). Figure 6 shows the mean final grade of course preference (0) at the y-axis value 11.55, rising to 12 for the close to home reason. It can also be seen from the Bonferroni results in Table 4 (p < 0.05) that there is a significant difference between both reasons. The mean plot also shows that other reasons are associated with the lowest final grades, with a mean value of 10.47, and that this group differs significantly from the others (course preference, close to home, and reputation) with p < 0.05 (Table 4). School reputation also has a major impact on the final grade, with a mean value of 13.04.

Table 4 Bonferroni w.r.t G3

| (I) reason | (J) reason | Mean difference (I-J) | Std. error | Sig. | 95% confidence interval, lower bound | 95% confidence interval, upper bound |
|---|---|---|---|---|---|---|
| 0 | 1 | −0.453 | 0.269 | 0.557 | −1.16 | 0.26 |
| 0 | 2 | 1.081* | 0.269 | 0.000 | 0.37 | 1.79 |
| 0 | 3 | −1.488* | 0.269 | 0.000 | −2.20 | −0.78 |
| 1 | 0 | 0.453 | 0.269 | 0.557 | −0.26 | 1.16 |
| 1 | 2 | 1.533* | 0.269 | 0.000 | 0.82 | 2.24 |
| 1 | 3 | −1.035* | 0.269 | 0.001 | −1.75 | −0.32 |
| 2 | 0 | −1.081* | 0.269 | 0.000 | −1.79 | −0.37 |
| 2 | 1 | −1.533* | 0.269 | 0.000 | −2.24 | −0.82 |
| 2 | 3 | −2.568* | 0.269 | 0.000 | −3.28 | −1.86 |
| 3 | 0 | 1.488* | 0.269 | 0.000 | 0.78 | 2.20 |
| 3 | 1 | 1.035* | 0.269 | 0.001 | 0.32 | 1.75 |
| 3 | 2 | 2.568* | 0.269 | 0.000 | 1.86 | 3.28 |

* The mean difference is significant at the 0.05 level

Fig. 6 Mean plot of school selection reason
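The pairwise comparison reported in Table 4 can be sketched from the group means of Table 2 and the within-groups mean square of Table 3. This is an illustration under stated assumptions rather than the statistical package output: the means are rounded, the two-sided p-value uses a normal approximation to the t-distribution (reasonable here because the error degrees of freedom are 1136), and the Bonferroni adjustment multiplies each raw p-value by the number of comparisons. The function and variable names are hypothetical.

```python
import math
from itertools import combinations

means = {0: 11.55, 1: 12.00, 2: 10.47, 3: 13.04}  # group means, Table 2
n_per_group = 285                                  # N per group, Table 2
ms_within = 10.323                                 # within-groups mean square, Table 3


def bonferroni_pairs(means, n, ms_within, alpha=0.05):
    """Bonferroni-adjusted pairwise comparisons for equally sized groups."""
    pairs = list(combinations(sorted(means), 2))
    n_tests = len(pairs)
    # Standard error of a difference between two group means.
    std_error = math.sqrt(ms_within * (2 / n))
    results = {}
    for i, j in pairs:
        diff = means[i] - means[j]
        t_stat = diff / std_error
        # Two-sided p-value under a normal approximation (large df assumed).
        p_raw = math.erfc(abs(t_stat) / math.sqrt(2))
        p_adj = min(1.0, n_tests * p_raw)
        results[(i, j)] = (diff, std_error, p_adj, p_adj < alpha)
    return results


comparisons = bonferroni_pairs(means, n_per_group, ms_within)
```

Under these assumptions the standard error is about 0.269 for every pair, matching the Std. error column of Table 4, and the pair (0, 1) is the only one whose adjusted p-value stays above 0.05.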

3.2 Experiment-II This experiment predicts the reason for selecting a school, that is, whether it is Course Choice, Near Home, Reputation, or Other. We trained and tested three algorithms: RF, XGB, and SVM. The model parameters set to achieve the best results are listed in Table 5. In the random forest, the Gini criterion measures the quality of a split, the minimum number of samples at a leaf is set to 2, the number of trees is set to 500 through n_estimator, and the maximum number of features considered when looking for the best split is set to auto. In XGB, gamma is the minimum loss reduction required to make a further partition; the larger the gamma, the more conservative the algorithm, and here it is set to zero. The learning rate, also called eta, is the step-size shrinkage applied in each update to prevent overfitting. The maximum depth of an XGB tree is set to 7, and the number of boosting stages is set to 300 through the parameter n_estimator. In SVM, the strength of the regularization is inversely proportional to C, which must be positive and is set to 1000; the kernel is set to RBF, which is commonly used for non-linear problems.


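Table 5 Model hyperparameters configurations

| Model name | Hyper-parameters | Values |
|---|---|---|
| RF | Criterion | Gini |
| RF | Min_sample_leaf | 2 |
| RF | N_estimator | 500 |
| RF | Max_feature | Auto |
| XGB | Gamma | 0 |
| XGB | Learning_rate | 0.2 |
| XGB | Max_Depth | 7 |
| XGB | Min_child_weight | 4 |
| XGB | N_estimator | 300 |
| SVM | C | 1000 |
| SVM | Decision_function_shape | OVO |
| SVM | Degree | 3 |
| SVM | Kernel | RBF |
| SVM | Gamma | Auto |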
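The configuration of Table 5 can be written down as a plain dictionary. The sketch below is an assumption-laden transcription: the chapter does not name the libraries it used, so the key names follow the scikit-learn and XGBoost spelling (for example n_estimators and min_samples_leaf), and the constant name MODEL_PARAMS is hypothetical.

```python
# Hyperparameters transcribed from Table 5, keyed by model.
# Key names follow scikit-learn / XGBoost conventions, which is an assumption:
# the chapter does not state which libraries were used.
MODEL_PARAMS = {
    "RF": {
        "criterion": "gini",
        "min_samples_leaf": 2,
        "n_estimators": 500,
        # Table 5 says "Auto". For scikit-learn classifiers "auto" historically
        # meant the square root of the feature count, and recent releases no
        # longer accept the string "auto", so "sqrt" is used here.
        "max_features": "sqrt",
    },
    "XGB": {
        "gamma": 0,
        "learning_rate": 0.2,
        "max_depth": 7,
        "min_child_weight": 4,
        "n_estimators": 300,
    },
    "SVM": {
        "C": 1000,
        "decision_function_shape": "ovo",
        # The degree only affects a polynomial kernel; it is kept for fidelity
        # with Table 5 even though the kernel is RBF.
        "degree": 3,
        "kernel": "rbf",
        "gamma": "auto",
    },
}
```

If scikit-learn and xgboost are installed, each dictionary could be unpacked as keyword arguments, for example `RandomForestClassifier(**MODEL_PARAMS["RF"])`, `XGBClassifier(**MODEL_PARAMS["XGB"])`, and `SVC(**MODEL_PARAMS["SVM"])`.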

4 Model Evaluation Table 6 displays the performance measures of the three models (RF, XGB, and SVM) for the school selection reason. The F1-score is a good measure of model strength because it is the harmonic mean of precision and recall. Table 6 shows that the RF model scores highest on F1-score and accuracy compared to the other models. The RF model’s accuracy (71%) and F1-score (73%) reflect the strength of its predictions. Cohen’s kappa statistic measures how close the predicted classes are to the actual classes. Its value typically lies between 0 and 1, and the closer the score is to one, the better the classifier. The kappa value of RF is the highest, indicating that it is the best predictor. The MCC is the correlation coefficient between the observed and predicted classifications; a value of 1 represents a perfect prediction, and zero represents a random prediction. RF also has the highest MCC. Figure 7 shows that, for the top-performing RF model, the final grade (G3) is an essential feature in predicting the school selection reason, along with school, going out with friends, and mother’s job. These features suggest that the mother’s and father’s jobs play a significant role in selecting a particular school. School selection also depends on the activities of the students. To evaluate classification accuracy in more detail, Fig. 8 depicts the confusion matrices of the proposed multiclass classification models (SVM, XGB, and RF). The share of correct predictions on the diagonal is highest for XGB (72%); still, RF has the lowest misclassification ratio (41%) among the models.

Table 6 Performance metrics

| Performance | RF | XGB | SVM |
|---|---|---|---|
| F1-score | 0.73 | 0.69 | 0.66 |
| Precision | 0.82 | 0.89 | 0.74 |
| Recall | 0.83 | 0.84 | 0.79 |
| Accuracy | 0.71 | 0.70 | 0.66 |
| Kappa score | 0.61 | 0.60 | 0.54 |
| Matthews correlation coefficient (MCC) | 0.62 | 0.60 | 0.55 |

[Chart: Important School Selection Reason Predicting Features]

Fig. 7 Key predictors to school selection

Fig. 8 Models confusion matrix


5 Conclusion We explored the impact of the school selection reason on students’ final grades and predicted the school selection reason of future students. One-way ANOVA showed a statistically significant difference between school selection reason groups, F(3, 1136) = 31.30, p < 0.05. The Bonferroni post hoc test found a statistically significant difference between course preference and the other reasons (school reputation and close to home). We also implemented three supervised machine learning models (XGB, RF, and SVM) to recognize the school selection reasons. The RF model outperformed the others in accuracy (71%) and F1-score (73%). The final grade (G3) and the parents’ jobs (mjob, fjob) play a significant role in selecting the school. Students also consider free time and going out with friends in school selection. School preference also depends on the activities students have done in the schools. Given the significant impact of school selection on students’ grades, an automatic school recommender could support an innovative school choice-based system. For this, the authors propose machine learning models with significant features to be implemented in the future with web-based technology such as cloud and IoT. This smart school selection application is beneficial to students and may help parents choose a school that enhances their children’s grades. Future work may include PCA [20], Chi2, gain ratio, info-gain, K-feature, and relief approaches for dimensionality reduction with the same machine learning models.

Acknowledgements The research has been supported by the Cultural and Educational Grant Agency of the Slovak Ministry of Education (KEGA) under grant No. 013TTU-4/2021: Interactive animation and simulation models for deep learning. The work of Chaman Verma and Zoltán Illes was supported by the “UNKP-21-4-I-ELTE-634” New National Excellence Program of the Ministry of Human Capacity of the Hungarian Government, and co-financed by the European Social Fund under the project “Talent Management in Autonomous Vehicle Control Technologies (EFOP-3.6.3-VEKOP-16-2017-00001)”.

References

1. Kotsiantis, S., Zaharakis, I., Pintelas, P.: Machine learning: a review of classification and combining techniques. Artif. Intell. Rev. 26, 159–190 (2006). https://doi.org/10.1007/s10462-007-9052-3
2. Tanwar, S., Bhatia, Q., Patel, P., Kumari, A., Singh, P.K., Hong, W.: Machine learning adoption in blockchain-based smart applications: the challenges, and a way forward. IEEE Access 8, 474–488 (2020). https://doi.org/10.1109/ACCESS.2019.2961372
3. Nespereira, C., Elhariri, E., El-Bendary, N., Vilas, A., Redondo, R.: Machine learning based classification approach for predicting students’ performance in blended learning. Adv. Intell. Syst. Comput. 407, 47–56 (2016). https://doi.org/10.1007/978-3-319-26690-9_5
4. Ma, C., Yao, B., Ge, F., Pan, Y., Guo, Y.: Improving prediction of student performance based on multiple feature selection approaches. In: Proceedings of the ICEBT 2017, Toronto, ON, Canada, 2017, pp. 36–41. https://doi.org/10.1145/3141151.3141160
5. Pushpa, S., Manjunath, T., Mrunal, T., Singh, A., Suhas, C.: Class result prediction using machine learning. In: Proceedings of the 2017 International Conference on Smart Technologies for Smart Nation (SmartTechCon), Bengaluru, India, 19 Aug 2018, pp. 1208–1212. https://doi.org/10.1109/SmartTechCon.2017.8358559
6. Rastrollo-Guerrero, J., Gomez-Pulido, J.A., Domínguez, A.: Analyzing and predicting students’ performance by means of machine learning: a review. Appl. Sci. 10, 1042 (2020). https://doi.org/10.3390/app10031042
7. Mantoo, B.A., Khurana, S.S.: Static, dynamic and intrinsic features based android malware detection using machine learning. In: Lecture Notes in Electrical Engineering, vol. 597, pp. 31–45. Springer (2020). https://doi.org/10.1007/978-3-030-29407-6_4
8. Huo, H., Cui, J., Hein, S., et al.: Predicting dropout for nontraditional undergraduate students: a machine learning approach. J. College Student Reten.: Res. Theory Pract. (2020). https://doi.org/10.1177/1521025120963821
9. Almasri, A., Alkhawaldeh, R.S., Çelebi, E.: Clustering-based EMT model for predicting student performance. Arab. J. Sci. Eng. 45, 10067–10078 (2020). https://doi.org/10.1007/s13369-020-04578-4
10. Verma, C., Stoffova, S., Zoltan, I., Kumar, D.: Towards prediction of student’s guardian in the secondary schools for the real-time. In: Proceeding of ICRIC 2019, Lecture Notes in Electrical Engineering (LNEE), pp. 159–175. Springer (2019)
11. Wang, Y., Jing, X., Han, W., et al.: Positive and negative affect of university and college students during COVID-19 outbreak: a network-based survey. Int. J. Public Health 65, 1437–1443 (2020). https://doi.org/10.1007/s00038-020-01483-3
12. Sow, M.: Using ANOVA to examine the relationship between safety & security and human development. J. Int. Bus. Econ. 2 (2014). https://doi.org/10.15640/jibe.v2n4a6
13. Junco, R., Heiberger, G., Loken, E.: The effect of Twitter on college student engagement and grades. J. Comput. Assist. Learn. 27(2), 119–132 (2011)
14. https://archive.ics.uci.edu/ml/datasets/student+performance
15. Koivu, A., Sairanen, M., Airola, A., Pahikkala, T.: Synthetic minority oversampling of vital statistics data with generative adversarial networks. J. Am. Med. Inform. Assoc. 27(11), 1667–1674 (2020). https://doi.org/10.1093/jamia/ocaa127
16. Kumar, D., Verma, C.: Towards recognition of normal versus pneumonia infected patients using deep neural network technique. In: Lecture Notes in Electrical Engineering, vol. 701, pp. 307–17. Springer Science and Business Media Deutschland GmbH (2021). https://doi.org/10.1007/978-981-15-8297-4_25
17. Chicco, D., Jurman, G.: The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. (n.d.) Accessed 25 Apr 2021. https://doi.org/10.1186/s12864-019-6413-7
18. Hakkoum, H., Idri, A., Abnane, I.: Assessing and comparing interpretability techniques for artificial neural networks breast cancer classification. Comput. Methods Biomech. Biomed. Eng.: Imag. Vis. (2021) https://doi.org/10.1080/21681163.2021.1901784
19. Wang, J., Yang, Y., Xia, B.: A simplified Cohen’s Kappa for use in binary classification data annotation tasks. IEEE Access 7, 164386–164397 (2019). https://doi.org/10.1109/ACCESS.2019.2953104
20. Sachdev, K., Gupta, M.K.: Predicting drug target interactions using dimensionality reduction with ensemble learning. In: Singh, P., Kar, A., Singh, Y., Kolekar, M., Tanwar, S. (eds.) Proceedings of ICRIC 2019. Lecture Notes in Electrical Engineering, vol. 597. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-29407-6_7

Artificially Intelligent and Sustainable Smart Cities

Mahendra Kumar Gourisaria, Gaurav Jee, G. M. Harshvardhan, Debanjan Konar, and Pradeep Kumar Singh

Abstract The ever-increasing population leads to congested cities that are difficult to manage, creating the need for modern approaches. The emerging concept of smart cities strongly encourages the use of sensors and automated systems to tackle these issues. Real-time data and feedback sent by such systems can be analyzed and processed for the sustainable socio-economic development of cities. All this is possible due to the introduction of Internet-of-Things (IoT) devices and complex sensors that can communicate with each other, creating a digital environment that helps make cities autonomous. Big data and advanced machine learning techniques can detect anomalies and alterations that the human eye cannot. Today, machine learning models are becoming non-deterministic and can grow with time. A smart city is a collection of basic amenities, such as traffic management, pollution control, waste management, and health care, that are provided in a smart, efficient, and eco-friendly manner, are interconnected, and give real-time feedback. In this chapter, we discuss the underlying technologies, their implementation, and the working of a model for the creation of a self-sustained smart city. The purpose of a smart city can be achieved if it is sustainable, self-reliant, and autonomous. To that end, a novel smart city central management system (SCAS) is introduced in the chapter to administer and cater to the city with minimal human intervention. The model is conceptual and focuses on the implementation and planning phases of smart city design.

M. K. Gourisaria · G. Jee · G. M. Harshvardhan School of Computer Engineering, KIIT Deemed to Be University, Bhubaneswar, Odisha 751024, India

D. Konar Computer Science and Engineering, SRM University AP, Amaravati, Mangalgiri, Andhra Pradesh, India

P. K. Singh Narsee Monjee Institute of Management Studies (NMIMS), School of Technology Management and Engineering, Chandigarh Campus, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_14


M. K. Gourisaria et al.

Keywords Artificial intelligence · IoT · Smart cities · Object detection · Action recognition · GSOM

1 Introduction According to the United Nations, the world population is projected to reach 8.5 billion in 2030 and 9.7 billion by 2050 [1]. Moreover, 68% of the world population is expected to be living in urban environments, directly increasing the stress on existing infrastructure and services such as traffic, congestion, resource management, and even law enforcement. Making a city smart simply means giving it the ability to make smart decisions based on data collected through a multitude of sensors and IoT devices. For example, smart traffic management is based on previously collected data about the day of the year and congested periods. In short, smart IoT devices are set up and used to gain information about a specific problem. The collected data is processed and studied by different AI or machine learning models, which produce a predictive analysis of that problem. The result is then used to take affirmative action [2, 3]. Figure 1 shows the information collection and implementation workflow for smart traffic as an example. While different people may perceive the idea of smart cities differently, some researchers have tried to standardize the term [4]. In general, a smart city may optimize the running or operating costs of the city and support smart governance by giving real-time insights into what decisions to make. It may increase energy savings through smart energy grids that estimate and optimize how much electricity to supply and when to start and stop the supply. An interconnected road traffic management system collects data and estimates the busiest routes during a certain time of the day, month, or year, leading to smart decisions about traffic flow so that the frequency of road jams declines. While the general idea of smart cities is clear, there is no single definition of a smart city. For our discussion, let us define what elements are necessary for a city to be smart [5]. The basic requirements are: sanitation, water supply, assured electricity, urban mobility and transport, affordable housing, robust IT connectivity and digitalization, health and education, safety and security, good governance, e-governance, and citizen participation [6, 7].

Fig. 1 Flow diagram of information collection and implementation


Fig. 2 Components of a smart city [5]

The components of a smart city are shown in Fig. 2. But these components alone are not sufficient for a city to be called smart if they operate in isolation. The very purpose of a smart city is to be fully connected, able to make decisions, and able to run autonomously. All this is possible if it has a centralized system that can interpret, process, and take affirmative actions. We introduce, at a conceptual level, a three-layered administration system that handles data collection, data interpretation, and decision-making all by itself. The rest of this chapter is organized as follows. Section 2, Literature Review, discusses the existing research in the field of smart cities. Section 3, Background Study, describes the prerequisite technologies required for the functioning of the smart city; it is divided into subsections for IoT, AI, Augmented Reality (AR), Drones, Cloud Computing, and Big Data. Section 4, Different Portfolios of Smart Cities, briefly discusses the important portfolios of a smart city. Section 5, Self-Building AI Model, describes a very important aspect of building a sustainable smart city. In Sect. 6, a novel smart central administration structure (SCAS) is proposed to run the smart city. Finally, in the penultimate section, the drawbacks of developing such a city and the fear of data privacy are discussed, followed by conclusions and references.

2 Literature Review There has been copious research in the field of smart cities, ranging from applications to the specific requirements that tailor to the demands of a smart city. A few of the requirements of building a smart city that are essential to our research are: the pursuit of a non-deterministic, generalized AI model that can self-grow, the integration of cloud computing with smart city management, and the effective and efficient use of


IoT devices, blockchain, and the security concerns of a smart city. Further requirements such as image processing, video processing, and language processing in machine learning are needed for the effective use of surveillance cameras and drones. By analyzing the related work, we come across the latest technologies used in the field of smart cities, and we can compare the works with one another based on some parameters. Technological growth and the daily increase in the amount of data are raising the number of parameters required to train AI models. While the models we regularly use are mostly deterministic and perform a definite, preassigned task, the demand grows for adaptive models that can self-build (i.e., self-structure, self-configure, and self-learn). Alahakoon et al. (2020) did a very thorough study in which they explained why AI models must be unsupervised to empower big data analytics and explored the benefits of having an adaptive AI model. They also explained the use of cloud computing platforms to run these dynamic models [8]. Navarathna and Malagi studied the planning part of smart cities and discussed the initial phases of developmental work in a smart city [5]. Lee and Lee (2015) discussed the applications, investments, and challenges that enterprises may face when implementing IoT devices. They discussed a total of five types of IoT technologies based on IoT-based products and services [9]. Khan and Salah (2018) discussed the security concerns of IoT technologies. They noted that IoT devices are easily breachable and are a potential danger if not checked. They carried out an investigative study of the security concerns and categorized the issues. Finally, they discussed the state-of-the-art solutions and the power of blockchain to solve such security problems [10]. Albawi et al. (2017) thoroughly discussed the working of convolutional neural networks. Convolutional neural networks are primarily used for image classification tasks but can also be used in other domains such as Natural Language Processing (NLP). They studied convolutional neural networks while discussing the most efficient parameters for better performance and efficiency [11]. Lopes et al. (2015) worked on facial expression recognition using convolutional neural networks. Facial expression recognition is not as simple as image classification, because it requires detecting even slight changes in facial features. They attained an accuracy of 97.81% in their implementation [12]. Jaouedi et al. (2020) proposed a hybrid learning model for human action recognition. Human action recognition plays a significant role in facilitating behavioral study, and its applications can be used in surveillance activities to enable automation. They used a Gated Recurrent Neural Network for sequential data and video classification. The study was conducted on the UCF Sports, UCF101, and KTH datasets, achieving an accuracy of 96.3% on the KTH dataset [13]. Mathe et al. (2020) used the skeletal information of a human body to recognize an action while tracking it. They trained their model on DFT (Discrete Fourier Transform) images resulting from raw sensor readings. Their model could detect ADLs (Activities of Daily Living) [14]. Another important aspect of detecting a person’s intention comes through his or her speech. In this regard, Noda et al. (2014) worked on building an audio-visual

Artificially Intelligent and Sustainable Smart Cities

241

speech recognition (AVSR) system. They introduced a model called connectionisthidden Markov (HMM). They firstly applied denoising autoencoders to remove noise from the audio. Then a CNN model is used to extract the visual information about the audio from mouth images. Altogether multi-stream HMM (MSHMM) is applied for integrating the acquired audio and visual HMMs independently [15]. Another task at hand is image segmentation in object detection. Felzenszwalb and Huttenlocher et al. (2004) treated the image as a graph and computed the dissimilarity index from the weights associated with the edges. Using this they found the segmentation and at last used a hierarchical grouping algorithm to differentiate the object from the background [16]. While image segmentation is best suited to still images and can’t be used in video detection techniques, for real-time object detection in video feeds Redmon et al. (2016) proposed a very popular method known as YOLO short for “you only look once”. Instead of first finding out the bounding boxes and then using a classifier to detect the object, they treat object detection itself as a regression problem wherein finding the boundaries in a single evaluation is done. The models process the images at 45 frames per second [17].
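The graph-based view of segmentation described above can be illustrated with a minimal sketch. It is not the implementation from [16]: it assumes a grayscale image given as a nested list, a 4-neighbour grid graph, the absolute intensity difference as the edge weight, and the merge test with a scale parameter k as it is usually stated for this method. The name `segment_image` and the values of `k` are illustrative, and the hierarchical grouping step is not included.

```python
def segment_image(image, k=50.0):
    """Label each pixel of a 2D grayscale image with a segment id.

    Pixels are graph nodes; edges join 4-neighbours and are weighted by
    the absolute intensity difference (an assumption made for this sketch).
    """
    height, width = len(image), len(image[0])
    parent = list(range(height * width))   # union-find forest
    size = [1] * (height * width)          # component sizes (valid at roots)
    internal = [0.0] * (height * width)    # largest merged edge per component

    def find(node):
        # Follow parent links to the root, halving the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    edges = []
    for r in range(height):
        for c in range(width):
            idx = r * width + c
            if c + 1 < width:
                edges.append((abs(image[r][c] - image[r][c + 1]), idx, idx + 1))
            if r + 1 < height:
                edges.append((abs(image[r][c] - image[r + 1][c]), idx, idx + width))

    # Process edges from most similar to most dissimilar.
    for weight, a, b in sorted(edges):
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            continue
        # Merge only if the edge is not much heavier than the variation
        # already present inside both components.
        limit = min(internal[root_a] + k / size[root_a],
                    internal[root_b] + k / size[root_b])
        if weight <= limit:
            parent[root_b] = root_a
            size[root_a] += size[root_b]
            # Edges arrive in ascending order, so this is the new maximum.
            internal[root_a] = weight

    return [[find(r * width + c) for c in range(width)] for r in range(height)]


if __name__ == "__main__":
    toy = [[10, 10, 200, 200],
           [10, 10, 200, 200]]
    for row in segment_image(toy, k=50.0):
        print(row)
```

A larger `k` favours larger segments, because it raises the tolerance for merging small components; with a very large `k` the whole toy image collapses into a single segment.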

3 Background Study

3.1 Internet of Things (IoT)

While there is no absolute definition of IoT, many researchers and academicians have defined it in their own way. Madakam et al. (2015) defined IoT as an open and comprehensive network of intelligent objects that can organize, share information and resources, and react and act accordingly. The industries in which IoT is deployed are shown in Fig. 3. IoT is growing by the day and is already one of the leading technologies in the IT industry. In the last decade, IoT has attracted a lot of attention as a way of creating a global infrastructure of interconnected physical objects. It can be described as a global network allowing things-to-things, things-to-humans, and humans-to-humans communication, whilst tagging each object so that it is uniquely identifiable. It is a world where one is connected to everything, be it wirelessly or by wire. All these interconnected networks create a great deal of data that demands analytical study. What is radical about all this is that IoT makes it possible to sense and communicate the complexities of the environment and to act upon such data. All this is mostly automated and does not require human intervention to be deployed or even to act. The fields in which IoT can be, or is being, used today vary widely; a few examples are agriculture, the automotive industry, instruments and sensors, infrastructure work, health care, entertainment, and automation. We shall look at a few IoT technologies in brief.


Fig. 3 Types of industries where IoT is deployed [9]

A. Wireless Fidelity (Wi-Fi)

A family of IEEE 802.11 wireless network protocols is known as Wi-Fi. It allows Wi-Fi-enabled electronic devices and other connected devices to communicate locally over a wireless network within a given range. Wi-Fi is a trademark of an international body known as the Wi-Fi Alliance, which consists of over 800 member companies and is responsible for certifying products that comply with its standards. Today Wi-Fi is found in many devices, from smartphones and notebooks to consumer electronics. As of 2019, there were about 14.96 billion WLAN-connected devices in the world [18]. The technology covers any type of WLAN product that supports any of the IEEE 802.11 standards, including dual-band, 802.11a, 802.11b, 802.11g, and 802.11n.

B. Bluetooth

Ericsson Mobile originally started a project named "Bluetooth" in 1994. It is a short-range wireless radio technology for sharing information in a Personal Area Network (PAN). Bluetooth uses Ultra-High Frequency (UHF) radio waves in the industrial, scientific, and medical (ISM) bands from 2.402 to 2.480 GHz. Bluetooth devices follow the IEEE 802.15.1 standard and mostly communicate at less than one Mbps [19].

C. Near Field Communication (NFC)

As the name suggests, it is a near-field technology with a range as short as 4 centimeters. It uses a frequency of 13.56 MHz and serves as a touch- or contact-based information exchange technology. NFC tags are used in applications such as touch payment systems, digital information exchange, smart cards, smart parking, toll tax tags, logistics, and access control devices (smart door locks). It does not require a direct line of sight, which makes it very useful, and it can also be used in dirt-prone environments.

D. Radio Frequency Identification (RFID)

It is a way of sharing the identity of an object or a person, in the form of a serial number, via radio waves. First used by the British during the Second World War to identify friend or foe, it is a cost-effective way of identifying nearby objects in IoT. There are three main types of RFID tags based on the method of power supply: active RFID, passive RFID, and semi-passive RFID. Its applications are varied and include library systems, Real-Time Location Systems (RTLS), logistics and supply chains, race time tracking, attendance tracking, and access control [20].

E. Actuators

A device that converts one form of energy into motion is defined as an actuator. Actuators can be electrical, hydraulic, or pneumatic, depending on the method they use to actuate motion. Most electric actuators generate force through the interaction of magnetic fields and current-carrying conductors. A dynamo and an alternator perform the reverse process, producing electrical energy from mechanical energy. While electric actuators are AC/DC or stepper motors, hydraulic actuators use hydraulic fluids to actuate motion and pneumatic actuators use compressed air for the same purpose. Of these, electric actuators are the most commonly used [21].

F. Wireless Sensor Networks (WSN)

A network of sensors and spatially distributed autonomous devices is used for monitoring environmental or physical conditions such as vibration, pressure, motion, temperature, pollutants, or sound at different locations. It is formed as an interconnected web of hundreds or thousands of endpoints spread over an area, in which each node can communicate with every other node. Its applications include agriculture monitoring, manufacturing, habitat monitoring, military, homeland security, healthcare, precision, forest fire, and flood detection [22]. For exhaustive knowledge on this topic, a good source is the book by Singh et al. (2020) [23], which covers topics from protocols, data aggregation, and energy conservation to security aspects and new advancements in WSN.

Our goal here is the creation of a smart city, which first requires a great deal of information collection and then resources to act upon that information judiciously. Supplying the information to act upon is exactly what IoT is meant to do. Also, the scale of the project is the size of a city; therefore, the equipment needs to be robust, automatic, and low-maintenance. The fact that IoT devices are small and easy to use makes them versatile. IoT is being used today in a multitude of fields, ranging from agriculture, automation, healthcare, and security to the automotive and entertainment industries [24–26]. In short, IoT devices are interconnected network devices fitted with sensors and other technologies, which send data to the cloud, where it is processed with AI so that it can be acted upon. AI and the cloud are discussed later in the chapter. For a deeper study, the books by Singh et al. [27] and Singh et al. [27] are good sources covering the necessary topics and more, from intelligent networking, image processing, computer vision, cloud, and big data to topics such as databases and even the study of camera parameters in the process of matching and reconstruction.
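The flow summarized above (sensors report readings, a central service aggregates them, and a decision is taken) can be sketched in a few lines. This is only an illustration under stated assumptions: the sensor identifiers, the readings, and the threshold are invented for the example, and a real deployment would receive the readings over a network protocol rather than from an in-memory list.

```python
from collections import defaultdict
from statistics import mean


def aggregate_readings(readings):
    """Group (sensor_id, value) pairs and return the mean value per sensor."""
    grouped = defaultdict(list)
    for sensor_id, value in readings:
        grouped[sensor_id].append(value)
    return {sensor_id: mean(values) for sensor_id, values in grouped.items()}


def sensors_needing_action(averages, threshold):
    """Return, in sorted order, the sensors whose average exceeds the threshold."""
    return sorted(sid for sid, avg in averages.items() if avg > threshold)


if __name__ == "__main__":
    # Hypothetical air-quality readings collected from two sensor nodes.
    readings = [("air-1", 40), ("air-1", 60), ("air-2", 90)]
    averages = aggregate_readings(readings)
    print(averages)
    print(sensors_needing_action(averages, threshold=70))
```

Keeping aggregation and decision-making in separate functions mirrors the split described in the text between collecting data and acting upon it.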

3.2 Artificial Intelligence (AI)

Let us also understand the meaning of AI. In broad terms, AI can be described as a computer-based model that mimics, or tries to mimic, the human brain by learning to solve problems on its own [26]. AI may also learn to grow and adapt by itself over time. AI was first introduced in 1956 at Dartmouth College. At the time of its inception, the idea could not take off because of the low computational power then available. Recently, AI has benefited from massive growth in computational speed and software efficiency and from a boom in data collection. AI owes its popularity to its ability to produce generalized results. AI is deployed in various fields, for example computer vision and image processing, where it can identify an object from a certain class of objects, and sound wave processing, where it can filter different waves accurately. It is also used for data analysis and study [26, 28]. Applications of AI are shown in Fig. 4.

Fig. 4 Fields of applications of AI

A. Image Recognition: This is a subfield of AI under computer vision in which an AI model is deployed to recognize the objects in an image or to classify the image. Working with images is very different from regular deep learning tasks. Hence, fully connected neural networks alone are not used; instead, a convolutional neural network is used, which is primarily designed for feature extraction, and the extracted and filtered features are fed into a fully connected system for classification. In Fig. 5, CONV denotes the convolution layer, POOL is the pooling layer, FC is the fully connected layer, and ACTIV is the activation function, such as ReLU or tanh.

$$ G[m, n] = (f * h)[m, n] = \sum_{j} \sum_{k} h[j, k] \, f[m - j, n - k] \tag{1} $$

where f is the input image, h is the kernel, m and n are the respective row and column indices, and j, k are the iterators [11].

Fig. 5 Generic convolution neural network

B. Action Recognition: Action recognition is a part of intelligent video surveillance. It is a video-based pattern recognition technique used to classify an action in a video, and it relies on the power of image recognition and object detection to perform this task. This is challenging because it involves not merely classifying but judging what the intentions behind the actions in a given video are. Different researchers have shown a variety of ways to tackle this task; let us see a few examples. Mathe et al. (2020) achieved this using a 3D model of a human skeleton superimposed on the human in the video, corresponding to its joints. The movement is then studied, and the model is trained with the help of diverse CNN models [14]. A major drawback of video detection is the presence of dynamic backgrounds, which degrades feature selection. To overcome this issue, Jaouedi et al. (2020) used a Gaussian mixture model to perform background modeling. This evenly distributes the pixels so that no significant motion is detected iteratively. For the action classification part, Recurrent Neural Networks are used. Also, to reduce the number of variables and parameters, the memory cell of the Gated Recurrent Unit was used [13]. Optimization functions are used to adjust the weights and minimize the loss function as follows:

$$ W_{t-1} = W_t - \gamma \frac{\partial E}{\partial W_t} \tag{2} $$

$$ U_{t-1} = U_t - \gamma \frac{\partial E}{\partial U_t} \tag{3} $$

where γ is the learning rate, ∂E/∂W_t and ∂E/∂U_t are the gradient values, and E is the error (loss) function.

C. Speech Recognition: It is a branch of AI that converts human speech audio into text. It uses deep learning to find patterns in the sounds of words. Then, with the trained data, each word is extracted and converted into textual form. What the machine hears cannot be converted straightforwardly to text because of contextual barriers, such as distinguishing between a person speaking a popular name and an ordinary word. To decipher such speech, NLP (Natural Language Processing) and deep learning are used. The purpose of NLP is to carry out a thorough semantic analysis. NLP commonly measures the similarity or difference of two sentences through cosine similarity, which is calculated from the angle between the vectors of the two sentences in a vector space.

$$ \cos\theta = \frac{\vec{u} \cdot \vec{v}}{\|\vec{u}\| \, \|\vec{v}\|} \tag{4} $$

where u and v are the two vectors formed by the words of the two sentences and θ is the angle between them [15].

D. Object Detection: This is very different from image recognition, because object detection first requires localizing the object to be found in the given image; only then is a classification model needed to classify that object. The task of object detection can be further divided into four categories: semantic segmentation, classification and localization (single object), object detection (multiple objects), and instance segmentation. The first task is to determine the region proposals, which give an idea of how many objects are present in a given image. Different researchers have proposed different ways to tackle this problem; a popular one, by Felzenszwalb and Huttenlocher (2004), is graph-based segmentation. They treated the whole image as a graph G = (V, E), where V is the set of nodes and E is the set of edges. Every edge (V_i, V_j) has a corresponding weight. These weights are non-negative measures of the dissimilarity between the neighboring elements V_i and V_j. In the process of creating bounding boxes corresponding to the dissimilarity, some crucial data is likely to be lost. For this, the original authors used a hierarchical grouping algorithm.

$$ s(r_i, r_j) = a_1 s_{color}(r_i, r_j) + a_2 s_{texture}(r_i, r_j) + a_3 s_{size}(r_i, r_j) + a_4 s_{fill}(r_i, r_j) \tag{5} $$


where the algorithm merges the regions based on different notions of similarity: color, texture, size, and fill [16]. After the regions have been distinguished, architectures such as R-CNN, Fast R-CNN, or Faster R-CNN can be used for classification. There is a multitude of ways to perform object detection. Another very popular method is YOLO ("you only look once"), originally introduced by Redmon et al. (2015). This method gives more weight to the speed of detection than to accuracy, although the newest versions of YOLO are both fast and accurate. The principal difference between YOLO and R-CNN is that YOLO is intended for real-time object detection, i.e. in video cameras or surveillance cameras [17].
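The convolution of Eq. (1), a weight update of the form used in Eqs. (2) and (3), and the cosine similarity of Eq. (4) can each be written out in a few lines. The following is a minimal sketch in plain Python rather than the code of any cited work: the function names are illustrative, images and kernels are nested lists, pixels outside the image are assumed to be zero, and weights are treated as flat lists.

```python
import math


def convolve2d(f, h):
    """Full 2D convolution: G[m][n] = sum_j sum_k h[j][k] * f[m-j][n-k]."""
    f_rows, f_cols = len(f), len(f[0])
    h_rows, h_cols = len(h), len(h[0])
    out = [[0.0] * (f_cols + h_cols - 1) for _ in range(f_rows + h_rows - 1)]
    for m in range(len(out)):
        for n in range(len(out[0])):
            total = 0.0
            for j in range(h_rows):
                for k in range(h_cols):
                    r, c = m - j, n - k
                    # Pixels outside the image contribute zero (assumption).
                    if 0 <= r < f_rows and 0 <= c < f_cols:
                        total += h[j][k] * f[r][c]
            out[m][n] = total
    return out


def gradient_step(weights, gradients, learning_rate):
    """One update of the form W - gamma * dE/dW, applied element-wise."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]


def cosine_similarity(u, v):
    """cos(theta) = (u . v) / (||u|| * ||v||) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    image = [[1, 2], [3, 4]]
    shift_right = [[0, 1]]  # kernel that shifts the image one column right
    print(convolve2d(image, shift_right))
    print(gradient_step([1.0, 2.0], [0.5, -1.0], learning_rate=0.1))
    print(cosine_similarity([1, 0], [0, 1]))
```

In practice the vectors passed to `cosine_similarity` would be word-count or embedding vectors built from the two sentences, and the convolution would be computed by an optimized library; the explicit loops are kept only so that each term can be matched to the equations.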

3.3 Augmented Reality (AR)

Augmented reality is a technology that provides a real-time experience of viewing the real environment in front of us with digital augmentation overlaid on it. It is a variation of VR (Virtual Reality): while VR shows only computer-generated visuals, AR shows the real world with computer-generated information overlaid on it. AR is used in fields such as medical training, repair and maintenance, classroom education, and military training [29]. Based on the type of environment on which the information is overlaid, there are five main types of AR:

A. Projection-based AR

In this type of AR, the information is projected onto surfaces. The surface on which the projection takes place is studied as a physical space. The projection may be interactive or non-interactive. A projected keyboard is an example of interactive projection-based AR. Another example is projecting a product you are buying to check whether it fits into a certain location.

B. Recognition-based AR

In this type, a pattern, text, or image is recognized in order to display some kind of information. One use is showing the features of a product through AR: either the product itself or a pattern associated with that product is scanned, after which an AR representation comes up.

C. Location-based AR

This kind of AR is usually used for real-world navigation, where information such as street names, direction arrows, and the distance to the destination is shown in real time. It eliminates the need to alternate between watching the environment and the information display while driving, which reduces risk.

D. Outlining AR

This can be regarded as an application of object detection, where the detected objects are outlined to highlight their boundaries, such as empty parking spots or road boundaries. Another fascinating application could be construction work, where the boundaries and pillars of the building to be constructed can be visualized.

E. Superimposition-based AR

This again makes use of object detection; the detected object can then be superimposed with information. For example, a human body can be superimposed with an X-ray scan or any other kind of diagnostics for doctors to study.

One important application of augmented reality is in education: AR makes learning very intuitive and thus gives it a larger impact. Growth was seen initially in the entertainment industry, where three-dimensional games such as the world-renowned AR game Pokémon Go were popularized, and sometimes in content consumption such as watching a movie [30, 31].

3.4 Drones

A drone is an unmanned aerial vehicle that is capable of flying autonomously without any human intervention. It may use GPS (Global Positioning System) to locate itself. Moreover, a drone today is equipped with high-tech camera modules and sensors which, with the help of AI, provide enough data for it to act and fly autonomously. Being pilotless has benefits such as less weight to carry, which makes the vehicle lighter and saves fuel. The absence of a person on board also means a drone can work for more hours without breaks, which makes it efficient and ideal for surveillance duties. While drones come in different shapes and sizes, they are broadly categorized into four classes [32, 33].

A. Multi-Rotor Drones

These are the most common drones, used by professionals and hobbyists alike. Their main function is aerial surveillance and photography. They are also the easiest to manufacture. They can be further subdivided by the number of rotors they have: tricopter (3 rotors), quadcopter (4 rotors), hexacopter (6 rotors), and octocopter (8 rotors), the quadcopter being the most popular among them. Although easy to manufacture and fly, they have their fair share of downsides, such as limited flying time, reduced endurance, and lower speed, and hence cannot be deployed for covering larger areas.

B. Fixed Wing Drones

These are completely different from rotor drones in terms of design and build. They have fixed wings like a regular airplane and use no extra effort to stay aloft; in fact, fixed-wing drones cannot hover like rotor drones. They keep flying forward and are maneuvered through guidance control systems, either remotely by humans or by themselves in the case of autonomous drones. They can fly for much longer than rotor drones, which makes them suitable where a large area needs to be covered. Their downsides are higher costs and the need for skilled people to fly them. They also require either a runway or a catapult to take off or land.

C. Single Rotor Drones

These drones are very similar to actual helicopters. Unlike multi-rotor drones, they have one large rotor and a small rotor at the tail for maneuvering or balancing. In terms of aerodynamics, a lower number of rotors implies lower drone spin; hence a single-rotor drone is more efficient than a quadcopter, which is itself more efficient than an octocopter. One major drawback of a single-rotor drone is the risk posed by its single large blade, which may cause fatal injuries.

D. Hybrid VTOL

Here, VTOL stands for Vertical Take-Off and Landing. These are a combination of rotor-based and fixed-wing drones, in which the rotors are used to lift the drone vertically from the ground. To keep the drone stable in the air, gyros and accelerometers function in an automated manner.

3.5 Cloud Computing

In layman's terms, cloud computing is the delivery of a service in which resources such as applications, data storage, databases, networking, and software are provided to a consumer on lease. The idea is that, instead of saving your data locally on a proprietary hard drive, the data is stored online on remote systems that are not only fast but also easily accessible. Cloud computing offers high reliability and immense computational power at low expense and without the worry of buying or managing the physical system. This increases speed and performance, reduces cost, and increases productivity and security [34]. The components of cloud computing are shown in Fig. 6.

Fig. 6 Components of cloud computing

Based on ownership, there are four types of cloud computing, namely private, public, community, and hybrid [35]. When all the computing infrastructure is hosted privately by a single owner, it is known as a private cloud. The level of security and control is at its maximum in this case. In a public cloud, the complete infrastructure lies with the company offering the service, while the cloud can be accessed by anyone. Security and control are minimal in a public cloud. A community cloud is a service in which a set of organizations with common needs share a cloud, with access limited to that community. It offers a moderate level of security and control and is usually set up for a specific goal. In a hybrid cloud, the public can access certain sections of the cloud, with controlled access provided to companies and the public depending on the purpose. It is a combination of public and private clouds, hence the name. Clouds can also be classified by the basic type of service they provide [36]:

A. Software-as-a-Service (SaaS): This involves the licensing or delivery of software on a subscription basis, hosted centrally on the cloud by a cloud company. It is also referred to as on-demand software. It benefits its users by reducing the time needed to install, manage, and upgrade software on proprietary systems, saving cost, time, and valuable office space and reducing the extra labor required to maintain such systems. This lets companies focus on their core business rather than on maintaining the systems that manage their workflow. Examples are Microsoft Office 365, Google G Suite, and Cisco WebEx.

B. Infrastructure-as-a-Service (IaaS): IaaS involves the delivery of everything from the operating system to servers and storage. Users can use the product without buying or managing the complete system, relying instead on a subscription-based, outsourced, on-demand service, which again saves the cost, space, and infrastructure required to handle such systems. This is probably the most flexible cloud computing model; networking, servers, and processing power are easy to automate. Clients retain complete control of their infrastructure, which is highly scalable. Examples are Microsoft Azure, IBM Cloud, and Amazon Web Services.

C. Platform-as-a-Service (PaaS): PaaS provides cloud components to certain software and is used mainly for applications. It is a virtual cloud environment in which companies or their users have access to resources that help them build the product they need with limited trouble. PaaS offerings are also called middleware; they are readily available in the cloud and have all the major characteristics of a cloud. PaaS is scalable and highly available, automates business policies, and enables simple, cost-effective development and deployment of apps. Examples are Windows Azure and Google App Engine.

D. Everything-as-a-Service (XaaS): Also known as anything as a service, XaaS refers to highly individualized services and offerings that are determined by customers. XaaS includes models such as SaaS, IaaS, PaaS, and even DaaS (Desktop as a Service).


3.6 Big Data

Big data is a collection of data that is huge in amount and still growing exponentially with time. The sheer scale of the data makes it very difficult to manage, maintain, and process efficiently with traditional methods [37–39]. To get a clear picture of the scale, let us discuss some examples. A jet engine generates 10+ terabytes of data in just 30 min of flight time. Scaled to 1000 flights daily, the data generated in a single day reaches the petabyte range. Other statistics show that 500+ terabytes of data are uploaded to social media sites every day. Storing a large amount of data is not the problem; rather, extracting something useful from the stored data is the difficult task. Big Data is further divided into structured, unstructured, and semi-structured data [26, 40].

A. Structured

Data that has a fixed form and can be stored, accessed, and processed in that form is termed structured data. A lot of development has taken place on how to work with such data, i.e. data with known formatting. However, the size of data has grown exponentially, making it difficult to manage even structured data. An example of structured data is a table with a header and well-defined rows.

B. Unstructured

Any data that does not have a fixed format is classified as unstructured data. The non-linear nature of the data makes it challenging to process and to derive results from it. Heterogeneous data that combines video, images, and text files is a typical example of unstructured data. A Google search result is a prime example of such heterogeneous data.

C. Semi-Structured

Data that contains both of the aforementioned forms is classified as semi-structured data. Semi-structured data looks structured but is not defined precisely. Examples of semi-structured data are XML files and table definitions in an RDBMS [41].

Let us look at some characteristics of Big Data.

A. Velocity

It refers to the speed at which data flows, i.e. between sensors, application processes, social media sites, or mobile devices. Velocity in terms of Big Data can be further subdivided into batch, near real-time, real-time, and streams.

B. Volume

It refers to the size of the data that is being generated. Its units are terabytes, petabytes, exabytes, and zettabytes.

C. Variety

It refers to the nature of the data, whether heterogeneous, structured, or unstructured. The data can be spreadsheets, PDFs, audio, video, text, email, pictures, etc.
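To make the notion of volume concrete, the jet engine example from this section can be worked through numerically. The sketch below takes the lower bound of 10 terabytes per 30-minute flight and the 1000 daily flights from the text, and assumes decimal units (1 petabyte = 1000 terabytes); the function and constant names are illustrative.

```python
TB_PER_PB = 1000  # decimal convention, assumed for this sketch


def daily_volume_pb(tb_per_flight, flights_per_day):
    """Return the data generated per day, in petabytes."""
    return tb_per_flight * flights_per_day / TB_PER_PB


if __name__ == "__main__":
    # 10 TB per flight (lower bound from the text) x 1000 flights per day.
    print(daily_volume_pb(tb_per_flight=10, flights_per_day=1000), "PB per day")
```

Under these assumptions the daily figure comes to 10 petabytes, which is why traditional storage and processing methods are considered inadequate at this scale.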

4 Different Portfolios of Smart Cities

To make a city smart, we need to use the above technologies in an interconnected manner so that they complement each other. Let us look at some portfolios that need to be addressed in a city, considering the most affected and influential topics. For each, we ask what the issues with the existing setup are and how they can be solved with the help of technology. The following section discusses some of these portfolios in brief.

4.1 Smart Traffic Solution

People face a lot of inconvenience from traffic jams and from the difficulty of finding proper parking places. This problem can be tackled by the proper use of IoT and AI, collecting and sharing real-time information among individuals, companies, and government agencies. The data collected from individual drivers on a road determines the traffic on that road, which allows the traffic control authority to make decisions and act upon the information. The real-time information can help individual drivers choose an alternative route or help the agency responsible for traffic control to divert the traffic [42]. Since the information is real-time, it also covers public transit systems such as buses, i.e. their current location and expected time of arrival. The concept is that all the sensors and IoT devices are connected to a central system, which then processes the data to decide how to react. This concept is illustrated in Fig. 7. Parking lots can likewise be made efficient with AI and IoT: every parking spot contains a sensor that signals whether the spot is empty or occupied. This information is either displayed on site or made public through an online platform such as a mobile app. Such an implementation may also show the nearest parking spots, saving time and fuel. This type of automated traffic control system has already been applied and tested, reportedly decreasing traffic jams by 40% nearly everywhere it was deployed [43]. Some places in the USA where this system was applied are San Diego, Los Angeles, and Pittsburgh, Pennsylvania [44].
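The parking scenario described above, in which each spot reports whether it is occupied and an app shows the nearest free one, can be sketched as follows. The data layout (a dictionary per spot with an identifier, planar coordinates, and an occupancy flag), the straight-line distance measure, and the function name are assumptions made for illustration; a real system would use road distances and live sensor feeds.

```python
import math


def nearest_free_spot(spots, x, y):
    """Return the closest unoccupied spot to (x, y), or None if all are taken.

    Each spot is a dict with keys "id", "x", "y", and "occupied"
    (a hypothetical layout chosen for this sketch).
    """
    free = [spot for spot in spots if not spot["occupied"]]
    if not free:
        return None
    return min(free, key=lambda spot: math.hypot(spot["x"] - x, spot["y"] - y))


if __name__ == "__main__":
    spots = [
        {"id": "P1", "x": 0.0, "y": 1.0, "occupied": True},
        {"id": "P2", "x": 3.0, "y": 4.0, "occupied": False},
        {"id": "P3", "x": 10.0, "y": 0.0, "occupied": False},
    ]
    print(nearest_free_spot(spots, x=0.0, y=0.0))
```

The occupied spot P1 is skipped even though it is closest, which is the behaviour a driver-facing app would need.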

Fig. 7 Smart traffic management system [42]

4.2 Smart Fire Brigades

The major problems faced by the fire department are (a) late reporting of fires, (b) traffic jams on the way to the destination, and (c) poor visibility for the firefighters while they perform rescue operations. Let us discuss how AI may help solve these problems. The first issue, the late reporting of fires, can be addressed in several ways. If an interconnected system is in place, the fire alarm system, the digital security cameras, and the IoT devices may raise the alarm as soon as they sense anything suspicious. The second problem, heavy traffic on the way to the destination, can be drastically reduced if all the traffic flow is centrally controlled and monitored. The traffic may be diverted or, if extreme measures are required, even stopped. The third issue may be solved with a hybrid approach. With helmets [45] equipped with infrared and sonar sensors to detect the surroundings, along with computer vision techniques, an exact image of the position of the victim to be rescued can be projected for the firefighter. The helmet comes with an AR (augmented reality) display, which gives real-time data and other important information such as temperature and even heartbeats. This is depicted in Fig. 8. Another route is the use of robots or drones fitted with thermal imaging sensors and cameras, which provide real-time data on which the firefighters can act. The drones may even carry a fireproof suit to the victims and guide them to the exit.

4.3 Smart Policy Making and Planning

A major prerequisite of policymaking is data to act upon; before any decision is made, committees are usually formed to evaluate the feasibility of a project or policy. The benefit of a monitored city is real-time data, which can be fed into an AI model for predictive analysis and statistical studies. The

M. K. Gourisaria et al.

Fig. 8 AR-enabled (augmented reality), camera-fitted firefighter’s helmet [45]

information so gained can be used to form policies without waiting for data to be gathered. This makes policymaking faster and hence helps speed up developmental work [46]. Policymaking can be automated by an AI system that suggests which policies need to be implemented in response to the changes happening in the city. Another benefit of having a lot of data is the assistance it gives in planning the city, which includes expansion of the city, changes to existing systems, construction work, and demolition activity [47].

4.4 Smart Farming

Farming is one of the sectors that has benefited, and will continue to benefit, the most from the advent of AI. AI can help farming in a multitude of ways: (a) with proper forecasting and detection of weather and climate conditions, AI can accurately predict the level of precipitation, which is possible because of historical data and new satellites that provide real-time data; (b) farming often requires fertilizers and nutrients to be added separately for the yield to be satisfactory or to keep the soil rich, and with the help of sensor assemblies AI can give the exact date on which the plants require them for growth [48]. This not only saves cost but also stops the over-use of inputs that may never be required, while reporting the actual requirements of the plants in real time. When farming is done over a large area, it becomes difficult to cover the fields; drones can be used for seed sprinkling, fertilizer sprinkling, and surveillance. At the same time, they can send the necessary information back to the farmers. Another vital application of machine learning is crop disease detection

Fig. 9 Components in smart farming [48]

or prediction. This can help take precautionary measures at the right time, stopping the spread and letting the farmers counteract with the necessary treatment [49]. The components of smart farming through IoT are shown in Fig. 9.

4.5 Smart Electricity Grids

As noted earlier, energy demand increases drastically with population growth, so an equilibrium needs to be established between supply and demand, in which AI can play a vital role. The basic requirement is real-time data collection, which makes it possible to track the consumption of each household or industrial consumer. AI can help decide when to generate less energy and when to increase production. This extends from the power stations to the primary power grids, which keep track of and maintain the supply and demand equilibrium [50, 51].
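As a rough illustration of matching supply to demand, the sketch below forecasts the next interval's demand from recent smart-meter totals and derives a generation target. The moving-average forecast, the 10% reserve margin, the plant limits, and all names are assumptions made for this example; they are not taken from [50, 51], and a real grid controller would use a much richer model.

```python
def forecast_demand(readings_kw, window=3):
    """Forecast next-interval demand as the mean of the last `window` readings."""
    recent = readings_kw[-window:]
    return sum(recent) / len(recent)

def generation_setpoint(forecast_kw, reserve_ratio=0.10, min_kw=0.0, max_kw=1000.0):
    """Target output: forecast plus a reserve margin, clamped to plant limits."""
    target = forecast_kw * (1.0 + reserve_ratio)
    return max(min_kw, min(max_kw, target))

# Hypothetical aggregated household consumption per interval, in kW.
household_totals_kw = [400.0, 420.0, 460.0, 500.0, 540.0]

forecast = forecast_demand(household_totals_kw)  # mean of the last three readings
setpoint = generation_setpoint(forecast)         # forecast plus 10% reserve
```

With rising readings the setpoint rises too, and a forecast beyond plant capacity is capped at `max_kw`, which is where a grid operator would have to import power or shed load.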

4.6 Smart Parking Solutions

This may seem very basic, but just as there are traffic problems on the roads, there is a lack of parking spots. As the number of vehicles grows day by day, the burden on the existing infrastructure grows. Smart parking can be implemented with sensors that detect whether a parking spot is empty or occupied, a camera that registers vehicles entering or exiting the parking area, and indicators pointing to empty spots. With all of these integrated and data collected regularly, there will be enough

information for the AI to learn and create workflows for managing the parking flow. This concept can be extended to automated parking lots, where drivers drop off their car and the rest is done by machines, which note the estimated time at which the car will be picked up. An efficient parking solution also saves fuel [52, 53].
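The sensor-based lookup described in this section can be sketched as follows. The data layout (a list of spots with coordinates in metres and an occupancy flag) and the function name `nearest_free_spot` are hypothetical; straight-line distance stands in for the driving distance that a real system would use.

```python
import math

def nearest_free_spot(spots, driver_xy):
    """Return the closest unoccupied spot, or None if the lot is full."""
    free = [spot for spot in spots if not spot["occupied"]]
    if not free:
        return None
    return min(free, key=lambda spot: math.dist(spot["xy"], driver_xy))

# Hypothetical sensor snapshot: position in metres and occupancy per spot.
spots = [
    {"id": "P1", "xy": (0.0, 0.0), "occupied": True},
    {"id": "P2", "xy": (3.0, 4.0), "occupied": False},
    {"id": "P3", "xy": (10.0, 0.0), "occupied": False},
]
choice = nearest_free_spot(spots, driver_xy=(0.0, 0.0))
```

Here the occupied spot P1 is skipped even though it is closest, and P2 is chosen over P3 because it is nearer to the driver.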

4.7 Smart Security Management (Law Enforcement)

With population and economic growth, law and order becomes an important portfolio in running a city. A city’s law and order can be maintained through a high level of surveillance and a quick response to any hostile situation that occurs. Surveillance can be carried out through interconnected camera systems powered by computer vision models that are able to detect and raise an alarm for hostile situations, be it a person carrying a gun in public, a road accident that has just happened, or an unlawful activity such as theft [37]. Detection is instantaneous and real-time, which reduces the time needed to take action. Drones can be deployed for surveillance where monitoring is difficult, for example because of a lack of infrastructure or because the surveillance needs to be continuous. Another task can be to chase a suspect or reach a hostile location without endangering human life [54, 55].

4.8 Smart Waste Management

Waste management plays a crucial role in a city’s sustainable growth, and waste collection bins are the first step and the most crucial one. Trash that is separated at source is highly likely to be recycled properly. Smart cities promote smart bins, which may be implemented in a plethora of ways. All trash bins are labeled for the class of waste they store (wet, dry, plastic, paper, biodegradable, etc.), creating scope for recycling. IoT-enabled smart bins that report the level or weight of the trash in them enable the collection units to decide whether to pick up a bin or not, saving both fuel and time for the pickup vehicle (Fig. 10) [57].
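A minimal sketch of the pick-up decision, assuming each bin reports a fill level between 0 and 1. The 0.8 threshold, the bin identifiers, and the function `bins_to_collect` are illustrative assumptions and do not come from [56] or [57].

```python
def bins_to_collect(bins, fill_threshold=0.8):
    """IDs of bins at or above the threshold, fullest first."""
    due = [b for b in bins if b["fill"] >= fill_threshold]
    return [b["id"] for b in sorted(due, key=lambda b: b["fill"], reverse=True)]

# Hypothetical readings sent by the smart bins (fraction of capacity used).
readings = [
    {"id": "bin-wet-01", "fill": 0.35},
    {"id": "bin-dry-02", "fill": 0.92},
    {"id": "bin-paper-03", "fill": 0.80},
]
route = bins_to_collect(readings)
```

The collection vehicle then visits only the bins in `route`, which is where the fuel and time savings mentioned above would come from.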

4.9 Smart Pollution Control

Pollution is a chief concern of any sustainable city. A great number of factors already play a role in bringing down pollution levels, but since we are planning to build a sustainable city, a specific portfolio needs to be assigned the role of pollution

Fig. 10 Smart interconnected bins that are bifurcated for different classes of waste [56]

control. Sensors spread across the city provide data on the pollution levels of air, water, and land. Proper steps can then be taken, such as flagging a vehicle emitting too much smoke or a factory releasing waste and effluents into water bodies and landfills above harmful levels [58].

4.10 Smart Self-Sustainable Public Toilets

Access to clean and hygienic toilets is a basic need, and the lack of toilets directly reflects the lack of development of a state. The lack of access to clean toilets leads to health issues that influence the average life expectancy of a country. Data from 2015 give a general picture; although the data are old, they still represent the overall situation. South Sudan is the hardest place to find a household toilet: 93.3% of the country’s population lacks access to one. This is reflected in the country’s average life expectancy of only 55 years. In India, 774 million people lack access to clean, usable toilets. In Russia, 28% of the population lacks access to a safe and private toilet, and in Ireland 9.5% of the population is affected in the same way [59]. To address this issue, smart, eco-friendly, and self-sustainable portable toilets could be introduced. These are maintained and monitored using sensors and IoT-enabled devices, which raise an alert when cleaning or maintenance work is required. The required power is generated using solar panels, and the excess energy

is stored in battery packs. For the collected waste, a bio-digester could be used. A bio-digester uses organic waste to produce fertilizer and biogas. The process is anaerobic, i.e. it takes place in the absence of oxygen, with bacteria responsible for decomposition. Self-maintenance is achieved through sensor-based flushing and RFID tag monitoring.

4.11 Smart Healthcare Facilities

A city cannot be smart without the use of the latest technology in the medical field. The use of machine learning and artificial intelligence increases the accuracy of diagnosing and predicting disease. Medical diagnostic data in the form of X-ray images, electrocardiograms (ECG), and other such information are quite challenging to read and require trained and experienced medical professionals to study them. Machine learning algorithms can do the same at a much faster pace and with high accuracy, such as 97–99%. Gourisaria et al. (2020) worked on malaria detection using convolutional neural networks on microscopic images of malaria-infected blood cells; their model can detect the presence of malaria in the early stages of diagnosis with an overall precision of 95.23%, saving patients from muscular paralysis or even death [60]. Das et al. (2020), Nayak et al. (2019), and Nayak et al. (2020) have worked on heart disease detection using, respectively, core machine learning and deep learning concepts, frequent item mining and classification techniques, and classification algorithms with Big Data analytics [61–63]. Another very important application of machine learning is the study of ECGs, which are a reliable way of monitoring the electrical impulses of the heartbeat in graphical form. While giving a lot of information, ECGs can be tricky to read and need expertise. A machine, on the other hand, can handle this task with ease, speed, and accuracy [64]. Sharma et al. (2020) did a detailed analysis of various machine learning and deep learning algorithms for ECG classification tasks and were able to classify ECGs according to the standards set by AAMI-EC57. The datasets they used were PhysioNet’s MIT-BIH and PTB Diagnostics.
During the pandemic spread of SARS-CoV-2 (coronavirus), we witnessed how a disease could bring the whole world to a halt, making advances in the current medical system ever more important. Jee et al. (2021) researched how chest radiographs can be used for the detection of coronavirus [65].

5 Self-Building AI Model

As the city and its population grow, the city’s core AI can keep up only to a certain threshold. The AI discussed above is a fixed-size model in terms of the number of parameters. It is a deterministic model that would not know how to adapt to a situation outside its domain. Here, out of domain means a situation that is unprecedented

or when an anomaly occurs. When infrastructure as large as a city depends on a core AI system for its functioning, it becomes very important that the system is robust. To handle such situations, a self-building AI model needs to be implemented, one that is adaptive and can grow as the city does. Alahakoon et al. (2020) presented a robust self-building AI that empowers big data analytics for smart cities [8]. They proposed a model that can self-structure, self-configure, and self-learn. The ability of the model to self-build reduces the need for human intervention or supervision. They used the growing self-organizing map (GSOM) algorithm, which overcomes the limitations of conventional AI models. GSOM is a variant of dynamic self-organizing maps originally proposed by Alahakoon (2000) [66]. Let us understand self-organizing maps (SOM) first. The self-organizing map is a neural network that produces a reduced-dimensional representation of the input space. The idea is to arrange the neurons as a lattice, with each neuron carrying a weight vector. The BMU (Best Matching Unit) is calculated based on the distance between the weight vector and the input vector, given by Eq. (6),

$$E = \lVert w_k - x_i \rVert \tag{6}$$

where $E$ is the quantization error, $w_k$ is the weight vector of neuron $k$, and $x_i$ is the input vector. The neuron with the lowest quantization error is selected as the BMU, also called the winning node. The next step is updating the neighboring neurons using Eq. (7),

$$w_k(t+1) = w_k(t) + \alpha\, h_{ck}(t)\,[x_i - w_k(t)] \tag{7}$$

where $w_k(t+1)$ is the updated weight of the neuron, $w_k(t)$ is its previous value, $\alpha$ is the learning rate, and $h_{ck}(t)$ is the decay (neighborhood) function between the BMU $c$ and neuron $k$. Although this algorithm is self-sustained, the size and dimensionality of the reduced-dimensional grid must be pre-defined, which makes it unfit for our purpose. To overcome this issue, growing self-organizing maps are used, which grow based on heuristics and the input representation. To decide whether to add or remove a neuron, the accumulated error of the BMU is compared with a threshold known as the growth threshold (GT), given by Eq. (8),

$$\gamma = -\delta \times \ln(\sigma) \tag{8}$$

where $\gamma$ is the growth threshold, $\delta$ is the number of dimensions in the input space, and $\sigma$ is the spread factor, which determines the spread of the network and is independent of the dimensionality. This gives GSOM structural adaptation and hierarchical clustering [66].
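Equations (6) to (8) can be traced with a small pure-Python sketch. It is a simplified illustration under stated assumptions, not the implementation of Alahakoon et al.: the Gaussian form of the neighborhood function, the 2 x 2 lattice, the learning rate, and all function names are chosen for the example, and the node-growing step itself is omitted.

```python
import math

def quantization_error(weight, x):
    """Eq. (6): Euclidean distance between a weight vector and the input."""
    return math.dist(weight, x)

def best_matching_unit(weights, x):
    """Index of the neuron with the lowest quantization error (the BMU)."""
    return min(range(len(weights)), key=lambda k: quantization_error(weights[k], x))

def neighbourhood(pos_c, pos_k, radius):
    """Assumed Gaussian decay h_ck with lattice distance from the BMU."""
    d2 = sum((a - b) ** 2 for a, b in zip(pos_c, pos_k))
    return math.exp(-d2 / (2.0 * radius ** 2))

def update(weights, positions, x, alpha=0.5, radius=1.0):
    """Eq. (7): pull every neuron towards x, scaled by alpha * h_ck."""
    c = best_matching_unit(weights, x)
    for k, w in enumerate(weights):
        h = neighbourhood(positions[c], positions[k], radius)
        weights[k] = [wi + alpha * h * (xi - wi) for wi, xi in zip(w, x)]
    return c

def growth_threshold(dimensions, spread_factor):
    """Eq. (8): gamma = -delta * ln(sigma), for 0 < sigma < 1."""
    return -dimensions * math.log(spread_factor)

# Four neurons on a 2 x 2 lattice with two-dimensional inputs.
positions = [(0, 0), (0, 1), (1, 0), (1, 1)]
weights = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
x = [0.9, 0.8]

bmu = best_matching_unit(weights, x)
error_before = quantization_error(weights[bmu], x)
update(weights, positions, x)
error_after = quantization_error(weights[bmu], x)
gt = growth_threshold(dimensions=2, spread_factor=0.5)
```

After one update the winning neuron lies closer to the input, so its quantization error drops. With $\delta = 2$ and $\sigma = 0.5$ the threshold is $-2\ln 0.5 \approx 1.386$; a smaller spread factor gives a larger threshold and therefore a less readily growing map.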

6 An Ideal Smart City

So far, we have seen the building blocks of a smart city, its requirements, its amenities, and the advantages of a connected city. We have seen the technologies used in a smart city and how they add up to the promise of a better administered and governed city. All of these technologies and setups would work only in a comprehensive, interconnected environment that takes advantage of even the smallest pieces of data across the city. All of this needs to be seamless and interconnected via the cloud. For the system to work and to act upon any kind of information, there needs to be a central administrative setup where all this information is gathered. Let us name this central command the Smart Central Administrative System (SCAS), which is responsible for the complete operation of the city. Considering the scale of the project, the SCAS is divided into three major levels, namely α-command, β-command, and γ-command, in decreasing order of hierarchy and area covered. The direct information, i.e. the first raw data of a local area, reaches the gamma command units. These units are present in large numbers across the city and form the first tier of SCAS. They filter the data and transfer it to the beta command. A β-command heads a certain number of gamma units, and beta units are fewer in number than gamma units. Like the gamma units, beta units filter the information, resolving what they can by themselves and otherwise passing it to the central command, i.e. the α-command. The α-command is a single unit with complete power over and accountability for the city. This can be viewed as a federal system of governance, with local bodies or municipalities (γ-command), state bodies (β-command), and central bodies (α-command).
The city can be divided into units of any shape or form according to need, but for our representation we have used hexagonal divisions, dividing the city into the smallest hexagonal blocks (γ-command). This structure is shown in Fig. 11. Let us look in detail at the responsibility of each of these levels, their distribution of power, their functioning, and how they interact with each other.

Fig. 11 Hierarchical structure of the SCAS

Fig. 12 Hierarchical position of α-command in SCAS

6.1 α-Command

This is the apex of the SCAS: all final decisions are made in this single unit (Fig. 12). Apart from handling the whole city, this command is responsible for the maintenance of the SCAS, i.e. controlling, monitoring, and diagnosing the working of the β-commands and, in turn, the γ-commands. A distinguishing feature of the α-command is that it can control the autonomy of its subordinate commands. Another important choice in the model is the involvement of human intelligence in very important decisions, i.e. at the very top: although the machine processes the data and suggests the steps to be taken, it is bound not to proceed without the green light from the human controllers. This may seem to slow the system down and make it prone to human error, but the advantage of this structure is that, in parallel, the lower commands, i.e. the β- and γ-commands, continue their basic task of running the city without hiccups. This is the division of power: the α-command retains authority over the important portfolios.

6.2 β-Command

This is second in line in the SCAS, between the α-command and the γ-commands (Fig. 13). This command is also responsible for filtering the data that reaches the α-command. However, its primary goal is to take decisions for the systematic running of the smart city based on the data it receives from the γ-commands. Decisions are taken considering the data received from all the γ units under that particular command. If the β-command is not able to handle a situation on its own, or is not permitted to act on it, the issue is raised to the α-command. As the α-command is responsible for the maintenance and

Fig. 13 A β-command consisting of a certain number of γ-commands

controlling of the β units, β units are responsible for the working and maintenance of the γ units. Again, β units are connected not only to their superior α unit and their γ units but also to the other β units across the city, forming a web of units.

6.3 γ-Command

This is the direct, first connection point of the SCAS with the city (Fig. 14). It includes all the interconnected technologies such as drones, traffic sensors, fire alarm sensors, surveillance cameras, and waste management systems. These commands will have the largest number of units spread across the city and will acquire the largest amount of raw data. In our model, the γ-command is just the data collection point and has no power or autonomy to act on that data. It is only responsible for preprocessing the data and sending it to its superior, that is, the

Fig. 14 A representation of the γ-commands consisting of several sensors and IoT devices

β-command. This is because the data collected needs to be compared with the data from the other γ-command units for an intelligent decision to be made (Table 1). The model has been discussed at an abstract level; many further aspects, such as the actual implementation, could be challenging and need more research. Our primary focus was to introduce a base level on which more research can be built. The whole setup is shown in Fig. 15.

Table 1 Detailed overview of the model

| S.N | Level | Decision autonomy | Interconnectivity |
|---|---|---|---|
| 1 | α-command | Partial human intelligence involvement in final decision-making steps | Single unit, only connected to the sub-level β-commands |
| 2 | β-command | Pre-assigned categories of decisions allowed; for those not included or for anything unprecedented, the decision or action is raised to the α-level | Connected to all the sub-level γ-units and also all the nearby β-level members for collective and intelligent decision making |
| 3 | γ-command | No decision-making is allowed (zero-level); all the data collected is passed over to the β-level | No interconnection between other γ-level units, only connected to the β-level units |

Fig. 15 The interconnectivity diagram between the different levels
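The escalation path summarized in Table 1 can be sketched in a few lines. This is only an illustration of the described hierarchy: the class names, the event format, the decision categories, and the returned messages are all hypothetical, since the chapter presents SCAS at a conceptual level and does not define an implementation.

```python
from dataclasses import dataclass, field

def gamma_preprocess(raw):
    """Gamma units take no decisions: they only clean readings and pass them up."""
    return {"category": raw["category"].strip().lower(), "value": raw["value"]}

@dataclass
class AlphaCommand:
    # Suggestions wait here until a human controller approves them.
    pending: list = field(default_factory=list)

    def review(self, event, raised_by):
        self.pending.append((raised_by, event))
        return f"alpha: {event['category']} awaiting human approval"

@dataclass
class BetaCommand:
    name: str
    allowed: set  # decision categories pre-assigned to this beta unit

    def handle(self, event, alpha):
        if event["category"] in self.allowed:
            return f"{self.name}: handled {event['category']}"
        # Anything outside the pre-assigned categories is raised to alpha.
        return alpha.review(event, raised_by=self.name)

alpha = AlphaCommand()
beta = BetaCommand(name="beta-1", allowed={"traffic", "waste"})

routine = beta.handle(gamma_preprocess({"category": " Traffic ", "value": 0.9}), alpha)
unusual = beta.handle(gamma_preprocess({"category": "Flood", "value": 1.0}), alpha)
```

The routine traffic event is resolved at the β level, while the unprecedented flood event is queued at the α level until a human controller gives the green light, mirroring the partial human involvement listed in Table 1.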

7 Major Drawbacks

With everyone and everything interconnected, data privacy is a concern for many people, not only because they are reluctant to share their information but also because the data can be misused. With this level of surveillance and monitoring, many may feel insecure about anything they do and may become anxious [67]. Developing a project such as this as a greenfield project requires a lot of planning and infrastructural development. The initial cost of setting up all the IoT devices and the network is high, and a huge workforce is needed to control and maintain the entire infrastructure, making a greenfield project very expensive. Since the entire city is connected, be it energy, traffic, or security, the whole city may halt if the central system crashes; placing so much control in a central machine that may fail is a severe concern. Another argument is that machines are incapable of making decisions beyond their remit: machines are reliable for repetitive tasks, but if a completely new and unprecedented situation arises, i.e. one the machine has never encountered before, it may cease to work or crash. That is why some kind of human supervision will always be required, and the system cannot be fully autonomous. With this rapid automation, many tasks that were earlier performed by humans would be taken over by machines. While this point is debatable, one cannot accurately predict the level of joblessness [68].

8 Conclusion and Future Scope

Different aspects of smart cities, i.e. the technologies involved, their implementation, and their drawbacks, have been discussed. The basic requirements for calling a city smart consist of various essential technologies, namely IoT, image recognition, action recognition, speech recognition, AR (augmented reality), drones, cloud computing, and big data. We briefly discussed the different portfolios, i.e. how the technologies need to be implemented to form an interconnected city, such as smart fire brigades, smart traffic systems, smart policymaking, smart farming, and smart security solutions. We discussed the need for a sustainable self-building AI system and a model for such an AI, establishing it as the backbone of any modern smart city: it must be interconnected, self-sustainable, and autonomous. We have introduced a novel three-tiered system of data collection, monitoring, and decision making, named SCAS, which is a centralized system for structured smart city management. It is a conceptual design and needs more research and understanding of real-world scenarios before proper implementation. Subsequent research could study the feasibility of such a system, including cost estimation, failure rates, and the calibrations required. Another fascinating study could be the design and implementation of a decentralized system that is free from human intervention and fully autonomous.

A few aspects of smart cities were not considered, either because the technology is not yet sufficiently advanced or because it is too immature to be incorporated; self-driving public transport is one example.

References

1. United Nations, Global Issues, Population. https://www.un.org/en/sections/issues-depth/population/#:~:text=The%20world%20population%20is%20projected,and%2011.2%20billion%20by%202100. Last accessed 12 Dec 2020
2. Voda, A.I., Radu, L.D.: Artificial intelligence and the future of smart cities. BRAIN. Broad Res. Artif. Intell. Neurosci. 9(2), 110–127
3. Yigitcanlar, T., Desouza, K.C., Butler, L., Roozkhosh, F.: Contributions and risks of artificial intelligence (AI) in building smarter cities: insights from a systematic review of the literature. Energies 13(6), 1473 (2020)
4. Bj, T.: How are smart cities perceived by project leaders and participants in an ongoing project: the challenge of evaluating smart cities. In: 2018 Smart City Symposium Prague (SCSP), May 2018, pp. 1–5. IEEE (2018)
5. Navarathna, P.J., Malagi, V.P.: Artificial intelligence in smart city analysis. In: 2018 International Conference on Smart Systems and Inventive Technology (ICSSIT), Dec 2018, pp. 44–47. IEEE (2018)
6. Yigitcanlar, T., Han, H., Kamruzzaman, M., Ioppolo, G., Sabatini-Marques, J.: The making of smart cities: are Songdo, Masdar, Amsterdam, San Francisco and Brisbane the best we could build? Land Use Policy 88, 104187 (2019)
7. Mathur, S., Modani, U.S.: Smart city-a gateway for artificial intelligence in India. In: 2016 IEEE Students’ Conference on Electrical, Electronics and Computer Science (SCEECS), March 2016, pp. 1–3. IEEE (2016)
8. Alahakoon, D., Nawaratne, R., Xu, Y., De Silva, D., Sivarajah, U., Gupta, B.: Self-building artificial intelligence and machine learning to empower big data analytics in smart cities. Inf. Syst. Front. 1–20 (2020)
9. Lee, I., Lee, K.: The internet of things (IoT): applications, investments, and challenges for enterprises. Bus. Horiz. 58(4), 431–440 (2015)
10. Khan, M.A., Salah, K.: IoT security: review, blockchain solutions, and open challenges. Futur. Gener. Comput. Syst. 82, 395–411 (2018)
11. Albawi, S., Mohammed, T.A., Al-Zawi, S.: Understanding of a convolutional neural network. In: 2017 International Conference on Engineering and Technology (ICET), August 2017, pp. 1–6. IEEE (2017)
12. Lopes, A.T., De Aguiar, E., Oliveira-Santos, T.: A facial expression recognition system using convolutional networks. In: 2015 28th SIBGRAPI Conference on Graphics, Patterns and Images, August 2015, pp. 273–280. IEEE (2015)
13. Jaouedi, N., Boujnah, N., Bouhlel, M.S.: A new hybrid deep learning model for human action recognition. J. King Saud Univ. Comput. Inf. Sci. 32(4), 447–453 (2020)
14. Mathe, E., Maniatis, A., Spyrou, E., Mylonas, P.: A deep learning approach for human action recognition using skeletal information. In: GeNeDis 2018, pp. 105–114. Springer, Cham (2020)
15. Noda, K., Yamaguchi, Y., Nakadai, K., Okuno, H.G., Ogata, T.: Audio-visual speech recognition using deep learning. Appl. Intell. 42(4), 722–737 (2015)
16. Felzenszwalb, P.F., Huttenlocher, D.P.: Efficient graph-based image segmentation. Int. J. Comput. Vis. (2004)
17. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779–788 (2016)

18. Statista, Thomas Alsop: https://www.statista.com/statistics/802706/world-wlan-connected-device/#:~:text=The%20statistic%20shows%20the%20number,to%20be%20connected%20via%20WLAN. Last accessed 05 Jan 2021
19. Scientific American: https://www.scientificamerican.com/article/experts-how-does-bluetooth-work/. Last accessed 05 Jan 2021
20. Atlas RFID Store, James Thrasher: https://www.atlasrfidstore.com/rfid-insider/what-is-rfid-used-for-in-applications/. Last accessed 05 Jan 2021
21. Frecker, M.I.: Recent advances in optimization of smart structures and actuators. J. Intell. Mater. Syst. Struct. 14(4–5), 207–216 (2003)
22. Bellavista, P., Cardone, G., Corradi, A., Foschini, L.: Convergence of MANET and WSN in IoT urban scenarios. IEEE Sens. J. 13(10), 3558–3567 (2013)
23. Singh, P.K., Bhargava, B.K., Paprzycki, M., Kaushal, N.C., Hong, W.C.: Handbook of Wireless Sensor Networks: Issues and Challenges in Current Scenario’s, Advances in Intelligent Systems and Computing (AISC), vol. 1132. Springer (2020)
24. Dlodlo, N., Gcaba, O., Smith, A.: Internet of things technologies in smart cities. In: 2016 IST-Africa Week Conference, May 2016, pp. 1–7. IEEE (2016)
25. Harmon, R.R., Castro-Leon, E.G., Bhide, S.: Smart cities and the internet of things. In: 2015 Portland International Conference on Management of Engineering and Technology (PICMET), August 2015, pp. 485–494. IEEE (2015)
26. Chin, J., Callaghan, V., Lam, I.: Understanding and personalising smart city services using machine learning, the internet-of-things and big data. In: 2017 IEEE 26th International Symposium on Industrial Electronics (ISIE), June 2017, pp. 2050–2055. IEEE (2017)
27. Singh, P.K., Singh, Y., Kolekar, M.H., Kar, A.K., Chhabra, J.K., Sen, A.: Recent Innovations in Computing, vol. 701. Springer Nature, Switzerland AG (2021). ISBN 978-981-15-8297-4
28. Harshvardhan, G.M., Gourisaria, M.K., Pandey, M., Rautaray, S.S.: A comprehensive survey and analysis of generative models in machine learning. Comput. Sci. Rev. 38, 100285 (2020)
29. Carmigniani, J., Furht, B., Anisetti, M., Ceravolo, P., Damiani, E., Ivkovic, M.: Augmented reality technologies, systems and applications. Multimedia Tools Appl. 51(1), 341–377 (2011)
30. Lee, K.: Augmented reality in education and training. TechTrends 56(2), 13–21 (2012)
31. Van Krevelen, D.W.F., Poelman, R.: A survey of augmented reality technologies, applications and limitations. Int. J. Virtual Real. 9(2), 1–20 (2010)
32. Floreano, D., Wood, R.J.: Science, technology and the future of small autonomous drones. Nature 521(7553), 460–466 (2015)
33. Hassanalian, M., Abdelkefi, A.: Classifications, applications, and design challenges of drones: A review. Prog. Aerosp. Sci. 91, 99–131 (2017)
34. Bharati, N., Das, S., Gourisaria, M.K.: A review on mobile cloud computing. In: Intelligent and Cloud Computing, pp. 209–218. Springer, Singapore (2021)
35. Dillon, T., Wu, C., Chang, E.: Cloud computing: issues and challenges. In: 2010 24th IEEE International Conference on Advanced Information Networking and Applications, April 2010, pp. 27–33. IEEE (2010)
36. Marston, S., Li, Z., Bandyopadhyay, S., Zhang, J., Ghalsasi, A.: Cloud computing—The business perspective. Decis. Support Syst. 51(1), 176–189 (2011)
37. Srivastava, S., Bisht, A., Narayan, N.: Safety and security in smart cities using artificial intelligence—a review. In: 2017 7th International Conference on Cloud Computing, Data Science & Engineering-Confluence, January 2017, pp. 130–133. IEEE (2017)
38. Mishra, S., Pandey, M., Rautaray, S.S., Gourisaria, M.K.: A survey on big data analytical tools & techniques in health care sector. Int. J. Emerg. Technol. 11(3), 554–560 (2020)
39. Prasad, A.G., Gourisaria, M.K., Vashishtha, L.K.: Building hybrid recommendation system based on Hadoop framework. In: 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), March 2016, pp. 3493–3499. IEEE (2016)
40. Allam, Z., Dhunny, Z.A.: On big data, artificial intelligence and smart cities. Cities 89, 80–91 (2019)
41. Sagiroglu, S., Sinanc, D.: Big data: A review. In: 2013 International Conference on Collaboration Technologies and Systems (CTS), May 2013, pp. 42–47. IEEE (2013)

42. Lorenčík, D., Zolotova, I.: Object recognition in traffic monitoring systems. In: 2018 World Symposium on Digital Intelligence for Systems and Machines (DISA), August 2018, pp. 277–282. IEEE (2018)
43. Javaid, S., Sufian, A., Pervaiz, S., Tanveer, M.: Smart traffic management system using internet of things. In: 2018 20th international conference on advanced communication technology (ICACT), February 2018, pp. 393–398. IEEE (2018)
44. Smart City Hub, Smart Traffic Control: The Pittsburg Example. http://smartcityhub.com/mobility/smart-traffic-control/#:~:text=The%20city%20of%20Pittsburg%2C%20Pennsylvanial ight%20to%20recognize%20traffic%20activity. Last accessed 25 Dec 2020
45. Forbes.com: Lee Bell, Consumer Tech. https://www.forbes.com/sites/leebelltech/2017/06/30/qwake-techs-ar-helmet-helps-firefighters-see-through-smoke-and-get-out-of-fire-five-times-faster/?sh=25a3a01f71f6. Last accessed 18 Dec 2020
46. Kumar, T.V., Dahiya, B.: Smart economy in smart cities. In: Smart Economy in Smart Cities, pp. 3–76. Springer, Singapore (2017)
47. Yang, C., Su, G., Chen, J.: Using big data to enhance crisis response and disaster resilience for a smart city. In: 2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA), March 2017, pp. 504–507. IEEE (2017)
48. Cordis, European Commission: https://cordis.europa.eu/article/id/413531-unleashing-the-full-potential-of-smart-agriculture. Last accessed 22 Dec 2020
49. Sharma, R., Das, S., Gourisaria, M.K., Rautaray, S.S., Pandey, M.: A model for prediction of paddy crop disease using CNN. In: Progress in Computing, Analytics and Networking, pp. 533–543. Springer, Singapore (2020)
50. Kok, K.: The powermatcher: Smart coordination for the smart electricity grid, pp. 241–250. TNO, The Netherlands (2013)
51. Vytelingum, P., Ramchurn, S.D., Voice, T.D., Rogers, A., Jennings, N.R.: Trading agents for the smart electricity grid (2010)
52. Aydin, I., Karakose, M., Karakose, E.: A navigation and reservation based smart parking platform using genetic optimization for smart cities. In: 2017 5th International Istanbul Smart Grid and Cities Congress and Fair (ICSG), April 2017, pp. 120–124. IEEE (2017)
53. Polycarpou, E., Lambrinos, L., Protopapadakis, E.: Smart parking solutions for urban areas. In: 2013 IEEE 14th International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM), June 2013, pp. 1–6. IEEE (2013)
54. Harikiran, G.C., Menasinkai, K., Shirol, S.: Smart security solutions for women based on the internet of things (IoT). In: 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), Mar 2016, pp. 3551–3554. IEEE (2016)
55. Aloul, F., Al-Ali, A.R., Al-Dalky, R., Al-Mardini, M., El-Hajj, W.: Smart grid security: Threats, vulnerabilities and solutions. Int. J. Smart Grid Clean Energy 1(1), 1–6 (2012)
56. Samann, F.E.: The design and implementation of smart trash bin. Acad. J. Nawroz Univ. 6(3), 141–148 (2017)
57. Wijaya, A.S., Zainuddin, Z., Niswar, M.: Design a smart waste bin for smart waste management. In: 2017 5th International Conference on Instrumentation, Control, and Automation (ICA), August 2017, pp. 62–66. IEEE (2017)
58. Kök, I., Şimşek, M.U., Özdemir, S.: A deep learning model for air quality prediction in smart cities. In: 2017 IEEE International Conference on Big Data (Big Data), December 2017, pp. 1983–1990. IEEE (2017)
59. World Economic Forum, Paul Muggeridge: https://www.weforum.org/agenda/2015/07/these-countries-have-the-fewest-toilets-per-person/. Last accessed 09 Jan 2021
60. Gourisaria, M.K., Das, S., Sharma, R., Rautaray, S.S., Pandey, M.: A deep learning model for malaria disease detection and analysis using deep convolutional neural networks. Int. J. Emerg. Technol. 11, 699–704 (2020)
61. Das, S., Sharma, R., Gourisaria, M.K., Rautaray, S.S., Pandey, M.: Heart disease detection using core machine learning and deep learning techniques: A comparative study. Int. J. Emerg. Technol. 11(3), 531–538 (2020)

268

M. K. Gourisaria et al.

62. Nayak, S., Gourisaria, M.K., Pandey, M., Rautaray, S.S.: Prediction of heart disease by mining frequent items and classification techniques. In: 2019 International Conference on Intelligent Computing and Control Systems (ICCS), May 2019, pp. 607–611. IEEE (2019) 63. Nayak, S., Gourisaria, M.K., Pandey, M., Rautaray, S.S.: Comparative analysis of heart disease classification algorithms using big data analytical tool. In: International Conference on Computer Networks and Inventive Communication Technologies, May 2019, pp. 582–588. Springer, Cham (2019) 64. Sharma, R., Gourisaria, M.K., Rautaray, S.S., Pandey, M., Patra, S.S.: ECG classification using deep convolutional neural networks and data analysis. Int. J. Adv. Trends Comput. Sci. Eng. 9, 5788–5795 (2020) 65. Jee, G., Harshvardhan, G.M., Gourisaria, M.K.: Juxtaposing inference capabilities of deep neural models over posteroanterior chest radiographs facilitating COVID-19 detection. J. Interdiscipl. Math. 1–27 66. Alahakoon, D., Halgamuge, S.K., Srinivasan, B.: Dynamic self-organizing maps with controlled growth for knowledge discovery. IEEE Trans. Neural Netw. 11(3), 601–614 (2000) 67. Smart Data Collective, Ryan Kh: https://www.smartdatacollective.com/big-data-privacy-iss ues-worry-every-internet-user/. Last accessed 18 Jan 2021 68. Smart Cities World, Radim Cmar: https://www.smartcitiesworld.net/opinions/opinions/thethree-biggest-challenges-to-building-a-commutable-smart-city. Last accessed 16 Jan 2021

Machine Learning Self-Tuning Motivation Engine for Telemarketers

Daniela López De Luise and Rodrigo Borgia

Abstract Telemarketing is a task that involves creativity, flexibility, and the ability to get other people interested in acquiring certain products. Many people perform this type of activity, and their effectiveness can have a strong impact on corporate revenues. For that reason, it is of interest to analyze and model telemarketers’ productivity in order to find proper ways to motivate them. Many studies confirm that offering a reward is not a solution for this type of job. The way incentives are offered matters, since findings show that they do not work the same way for creative positions. Technology is gradually handing less creative positions over to machines, while humans remain in jobs that require novel approaches to motivation. The main goal of the MOTIVARNOS project is an intelligent gamified environment with a self-tuning motivation engine based on sentiment analysis of each telemarketer’s playing peculiarities, personal preferences, personality, and well-defined social characteristics. The project targets managers, directors, vice presidents, and everyone in the chain of command who understands that without happy agents there are no happy clients. It also targets companies that care about both clients and employees. The main contribution of this short paper is a model built on a reduced set of shifting parameters considered to motivate individuals, with the aim of outperforming traditional payroll-based motivation of telemarketers. The paper covers an introduction to the topic, its relevance, and the presentation of MOTIVARNOS as a proposal.

Keywords Gamification · Telemarketing · Sentiment analysis · Machine learning · Computational intelligence

D. L. De Luise, CI2S Labs, C1180AAB Buenos Aires, Argentina
R. Borgia, Gamifica Group, S2000 Rosario, Argentina

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_15


1 Introduction

Gamification can be described as the application of typical elements of game playing (e.g. point scoring, competition with others, rules of play) to other areas of activity, typically as an online marketing technique to encourage engagement with a product or service. It has recently been applied to many aspects of the real world as a way to improve certain classical tasks. In those cases, it is usually referred to as serious games.

According to Gnauk et al. [1], Grant McKenzie considers that the theory behind gamification is that users are more likely to adopt, and actively use, an application when there is an aspect of gameplay associated with it. Although his interests are in the area of cognitive engineering for mobile GIS (Geographical Information Systems) and its relation with gaming, a number of services have recently adopted this model, combining it with fundamental concepts to give rise to a new form of social networking. He also found that gamification is one possible answer to the question of how to engage users.

In industry, other authors have explored video games as a tool to improve the value chain and to assess the influence of factors such as stress [2] on productivity. Among others, Gnauk, Dannecker, and Hahmann proposed a motivational framework that applies well-known game mechanics, such as points and leader boards, to engage customers in the system. This is accomplished by embedding a special scoring system and social competition aspects into a stimulating user interface for the definition and management of flexible energy demand. In a first user study, the system showed high user acceptance and the potential to engage consumers in participation.

Games in serious contexts have also been used to enhance the value of a service or a product for its users. One of the most productive fields for this type of application is academia, considered as an industry [3, 4]. The psychological foundations and social motives for this type of gamification are presented in [5]. The authors also provide a deep analysis of its principles and concepts, game rules (mechanics), and elements of games, as well as techniques and patterns of gamification.

Motivarnos can be thought of as one more exploration of this type of implementation, applied to the telemarketing field in order to improve the training and productivity of employees in this sector. The main contribution of this research is the design of a lightweight model on a reduced set of shifting parameters considered to motivate individuals, with the aim of outperforming traditional payroll-based motivation of telemarketers.

The rest of this paper is organized as follows: Sect. 2 is a very short state of the art in telemarketing and data science, Sect. 3 explains the need for new contributions to motivate people, Sects. 4 and 5 describe the project and the proposed model, and Sect. 6 presents conclusions and future work.


2 State of the Art

Although industry has often used games to perform deep analysis of other industries [6–10], most of the research effort on telemarketing analysis with data science is in the field of banking. In [11], an analysis of a Portuguese bank’s telemarketing covers the typical steps, with some derivations after mining the data. As in the other papers, the authors evaluate two main features of a given campaign and its effects on customers. Duration, distribution, and customer profiles are measured from a yes/no perspective, in this case whether the customer accepts a product of the bank.

The authors analyze the distribution of the main variables, resulting in a description of the client’s behavior: call length is about 3 min; most calls are relatively short; half of the clients were being contacted by the bank for the second time; “yes” clients were contacted fewer times and had longer calls; most “yes” clients were approached fewer than 10 times; and there are indications that the bank should refrain from calling a client more than five times, which can be disturbing and increase dissatisfaction. They also found specific information for future campaigns: a telemarketing campaign should start in fall or spring, and the following features favor success:

• Feature 1: age < 30 or age > 60
• Feature 2: students or retired people
• Feature 3: a balance of more than 5000 euros.

In general, the data sets [11–26] have attributes such as month, duration, campaign, days for payment, previous call, outcome payment, cost-price index, confidence index, etc., which are used to evaluate how telemarketing contributes to predetermined commercial goals. The analysis is centered on the consumer and on the impact of campaigns with predetermined characteristics, and the typical model consists of regression and classification approaches.
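The three success features reported for [11] can be written as simple predicates. The following is a minimal sketch in Python; the `Client` record, its field names, and the idea of listing which features a client satisfies are illustrative assumptions, not part of the cited study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Client:
    """Hypothetical client record; the field names are assumptions."""
    age: int
    occupation: str       # e.g. "student", "retired", "technician"
    balance_eur: float


def matched_features(client: Client) -> List[str]:
    """Return which of the three reported success features a client satisfies."""
    matched = []
    if client.age < 30 or client.age > 60:
        matched.append("feature_1_age")
    if client.occupation in {"student", "retired"}:
        matched.append("feature_2_occupation")
    if client.balance_eur > 5000:
        matched.append("feature_3_balance")
    return matched


# Illustrative client, not taken from any real data set.
example = matched_features(Client(age=24, occupation="student", balance_eur=1200.0))
```

Keeping the predicates separate, rather than collapsing them into one score, makes it easy to see which feature drives the selection of a given client.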
However, these models do not consider other important influencing factors [27], such as the type of enterprise, the organizational information management [28], the type of work [29], and the role of the employee in the business [30]. In [31], the telemarketing analysis remains focused on the client’s behavior, classifying phone calls as successful or not depending on the a-posteriori result. As in the previous case, the model requires a training data set. That work compares six supervised classification techniques in a credit environment: Nearest Neighbor, C4.5, Repeated Incremental Pruning to Produce Error Reduction (RIPPER), Multilayer Perceptron, Sequential Minimal Optimization (SMO) for training a Support Vector classifier, and Naive Bayes. Their performance is compared, based on the Area under the ROC Curve, over different well-known telemarketing data banks. The study showed that the simple Naive Bayes model and the RIPPER model were the best classifiers, each achieving the best performance in two data sets, while the SMO classifier had the worst performance in the comparison, even when different kernels were used. But this result was provided by the
authors without Kappa statistics and on an unbalanced test set, so it needs to be confirmed.

One problem that arises in all these cases is the imbalance of the data: there are many more negative samples than positive ones, or vice versa, because of the testing strategy, which evaluates only a portion of the clients, a type of product, a specific period of time, etc. Furthermore, an inaccurate or off-profile database can result in an underwhelming telemarketing campaign and a wasted budget [32]. Predictive performance may also decrease when the input data have many nominal features (like the bank customer information in the case of bank trading). There are two ways to overcome the imbalance in the data: organize the tests beforehand to avoid this type of problem, or apply an approach that compensates for it.

In [33], Chakarin Vajiramedhin and Anirut Suebsing focus on how to reduce the features of the input data and balance the training set of the predictive model, in order to help the bank increase its prediction rate. They report that the proposed approach improves the accuracy of every predictive model evaluated. The basic idea for enhancing the prediction rate of bank telemarketing is to combine a correlation-based feature subset selection algorithm with a balancing technique. The study first applies the balancing technique to equalize the class labels, and then uses the correlation-based feature subset selection algorithm to select the robust features. Another way of dealing with the imbalance problem is presented in [34], with a tree-based model, but it needs proper pre-processing of the metric data and domain extension considerations.

From the motivational perspective, the community developed at work plays an important role. Nowadays it frequently takes the form of social networking. A social network is a good tool for bringing together employees involved in similar activities. In this context, a person is a node that relates to one or more other nodes.
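Returning to the imbalance issue discussed above, two of its points can be illustrated with a small standard-library sketch: why a Kappa statistic matters on an unbalanced test set, and how a training set can be balanced before modeling. The labels, the 90/10 split, and the naive random oversampling are assumptions made for illustration; they are not the balancing technique used in [33].

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_chance == 1.0:
        return 0.0  # degenerate case: a single class on both sides
    return (p_observed - p_chance) / (1.0 - p_chance)


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until all classes have equal size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Illustrative unbalanced test set: 90 "no" calls and 10 "yes" calls.
y_true = ["no"] * 90 + ["yes"] * 10
y_majority = ["no"] * 100  # a classifier that always answers "no"

acc = accuracy(y_true, y_majority)
kappa = cohen_kappa(y_true, y_majority)

# Balance the same labels by oversampling the minority class.
balanced_rows, balanced_labels = random_oversample(list(range(100)), y_true)
```

In this toy case the always-"no" classifier reaches an accuracy of 0.9 while its kappa is 0, which is the kind of gap that a result reported without Kappa statistics can hide.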
Every interaction generates data that allow researchers to explore, with more insight, the feelings, status, and profiles of the people participating in it. The analysis of these social networks has made it possible to uncover the structure of a community. Many authors, as in [35], analyze not only the graphs and their dense areas but also apply textual analysis and user profiling approaches. Certain similarities between group dynamics and graphs show interesting characteristics. For instance, dense areas in the graph represent entities that are closely related and hence belong to a community. Many approaches and algorithms have been proposed in the field. It is important to note that traditional methodologies sometimes overlap with new categories in the fields of natural language processing, sentiment analysis, and deep learning. A good combination is required of more conservative community detection, clustering techniques, non-clique-based techniques, community detection using genetic algorithms, improved modularity approaches for uncovering the structure of communities, and so forth.

One approach from the large list of novel algorithms mentioned above that deserves a mention here is the new generation of software assistants, also known as chatter-bots or simply chat-bots. They rely on the best practices of
Natural Language Processing, combined with many software artifacts that enable them to interact in natural language in real-life settings and in real time. This new generation of intelligent assistant systems can be useful for interacting with clients and for complementing many web sites. But the technology does not yet dominate the sector, due to many problems that derive from the complexity of language, such as polysemy, semantic conflicts, and slang. Many authors in this field are working on reducing the gap between linguistic reasoning (typical of humans) and the system’s current ability to interact in a dialog [36]. The main challenges are to emulate the building of the ontological structure of knowledge (which in humans takes many years of full-time effort), the correct administration of information, the process of searching, accumulating, and processing information objects on the Internet, the contextualization of the dialog, the proper balance between resources and results, etc.

This paper presents a proposal called MOTIVARNOS that changes the focus of the analysis from telemarketing success to telemarketers’ motivation by means of serious games. Data science now shifts from profiling the telemarketers to modeling gamification relationships with telemarketers. One of the main clues to understanding the approach was given during the first pilot of Motivarnos, in a comment from the first user: “Motivarnos allowed us to improve the performance results of the group of participants (AHT, ACW, etc.), improve the spirit of the team, the integration, and also in some cases, the aspect of self-improvement. Entertainment, productivity, motivation, interest, heterogeneity, were part of an equation that resulted in a very satisfactory and innovative experience.” The observation came from Marcelo Greco, customer services manager of La Caja de Ahorro y Seguros, an insurance group based in Argentina. The group ranked second in the country’s insurance market, with a total share of 5.9% as of December 2017.

3 Motivation and Serious Games

The traditional approach to motivating people is to offer the employee a reward: for example, extra benefits in return for an effort, such as more money or a promotion. This type of incentive-based motivation is applied at many levels: at work, at school, even at home. But when social psychologists test whether incentives work, they get surprising results. One interesting result was obtained by Sam Glucksberg [37], from Princeton University, in the United States of America. He set people a problem to solve and told them he was going to time them to see how long they took. Then he put them in two groups. He offered one group a reward for finishing fast: five dollars for anyone finishing in the top 25% and 20 dollars for the person who finished fastest of all. To the other group he offered no incentive, but he told them he was going to use their times to calculate an average time. The first group, the one with the reward, took three and a half minutes longer than the group who just thought they were being timed. Incentives didn’t work. In
fact, they made people slower. This experiment has been repeated many times, with the same results. But in business, we still offer bonuses, promotions, and rewards to staff. A deeper analysis explains that this type of motivation works only for very simple tasks. For complex and/or creative tasks, rewards don’t work. They might even have the opposite result and make people perform worse. Another study, by Dan Ariely [38], the James B. Duke Professor of Psychology and Behavioral Economics at Duke University, showed that the bigger the reward, the worse the subjects performed on a complex task. The reward made them focus so hard on the result that they could no longer think creatively.

With the big and accelerated changes in industry, being able to engage people matters because simple jobs are becoming automated. It is important to find a way to motivate people to do the jobs that remain, now that traditional incentives have been shown not to work. Ariely found that what does work is giving workers freedom: freedom to work on the things they want to work on, and freedom to choose when, where, and how they work. Let them work from home some days of the week, get up late, and work into the night if they feel like it, always subject to job deadlines and quality requirements. Evidence shows that people who choose the way they work get results. Companies that give employees time during the week to work on things that interest them and are not part of their regular job achieve amazing things. Some of the big tech companies are good examples of this, with ping-pong tables and areas to relax in.

MOTIVARNOS aims to integrate these findings into game playing by applying Serious Games, which combine a serious intention with the game’s rules and targets. Among the traditional technologies that use games to engage individuals, Serious Gaming and Gamification are the most interesting and valuable approaches in this domain. While the first changes the goal of a game via different methods, in order to offer activities that go beyond mere entertainment, the second uses game design to enhance an individual’s willingness to participate in originally non-playful experiences [39]. This paper discusses different aspects related to the adaptation of Serious Games and their use for gamification as an innovation in the field of telemarketing.

Serious games are growing rapidly, both as a gaming industry and as a field of academic research [40]. There are many surveys in the field of digital serious games; however, most of them are specific to a particular area such as education or health. So far, little work has been done to survey digital serious games in general, across application areas including education, well-being, advertisement, cultural heritage, interpersonal communication, and health care. A taxonomy for digital serious games is proposed in [41].


4 A New Perspective: MOTIVARNOS

A call center or contact center is well known to be a stressful workplace. Most of the interactions are tough: the individual is exposed to six hours of non-stop yelling from clients. The main goal of this project is to recognize jobs well done at the agent level and to use game mechanics and gamification to produce more focused, performance-based decisions, while agents are recognized for their efforts and results. The contribution of the project is the specific combination of data features selected to provide an efficient way to model how each employee needs to be motivated with Serious Games.

The proposal is based on the fact that 80% of an operational budget goes to payroll (salaries), while most solutions are client-centered or IT-focused. There are few approaches centered on the employee, and only a few of them address motivation, recognition, and fun. As telemarketing is a complex engine that does not depend only on the telemarketer, the project targets those managers, directors, VPs, and everyone in the chain of command who understands that without happy agents, happy clients cannot be “produced”. It is also focused on companies that care both ways: about their clients and about their employees.

MOTIVARNOS began with a client that requested something fun, something to entertain while taking care of productivity, something that gives recognition in a professional manner. From that starting point, the focus is on how to achieve this complex combination of ROI, simple actions, and the behavioral change needed to ensure a climate of recognition in an organization. ROI (return on investment) is a performance measure used to evaluate the efficiency or profitability of an investment, or to compare the efficiency of a number of different investments; it tries to directly measure the amount of return on a particular investment relative to the investment’s cost. The following sections explain the solution obtained.

5 Architecture and Basic Concepts

Telemarketing is a marketing technique in which sales teams reach out to prospective clients via several channels, such as telephone, fax, or online. B2B (business-to-business) is a type of electronic commerce that consists of the exchange of products, services, or information between businesses, rather than between businesses and consumers. B2B contact data providers can generate a high-quality list of personalized leads, making the telemarketing process more efficient and lucrative. In this context, MOTIVARNOS has to perform every step needed to collect the telemarketing activity at each enterprise and manage it to derive information suitable for data science and motivation profiling. Figure 1 shows the steps to collect data, and Fig. 2 shows the steps to perform motivation profiling.


Fig. 1 Telemarketer data compilation

Fig. 2 Motivation profiling

In Fig. 1, the process is based on how people play. Certain factors are embedded into the project: if you are a “winner”, you will love rankings and bets; if you are an explorer, you will definitely like our missions; and if you are social, well, all activities can be liked for others’ enjoyment. A production-recognition mechanism is included to enhance results. The graph shows a typical implementation phase of the use of the platform. Over a total period of eight weeks, the platform combines communication to telemarketers, trivias (games or competitions where the competitors are asked questions about interesting but unimportant facts in many subjects) to validate operational knowledge, the delivery of badges to reinforce the good attitudes observed, a tournament based on a specific metric or KPI (key performance indicator) to be improved and, last but not least, the congratulation of winners, so that the business can see the specific achievements. This helps the habitual requirement of
having a specific ROI in place, even in cases when, as here, we are in the soft skills arena, which is mostly not measured with rock-solid metrics such as, for example, the amount of sales. The steps for building the motivation profile are described in the remainder of this section.

5.1 Loading of the Raw Data

The data set covers any activity performed during marketing campaigns that aim to promote different types of products among existing customers.

5.2 Structure of the Data

The database is complex and is mainly composed of fields quite different from those mentioned in the literature. Among others, the variables refer to game metric data, the user (instead of clients), avatar information, the log of challenges, teams, company, gift items, insignia, match, mission, skill, tournament, trivia, and role. As can be seen, the traditional yes/no classification approaches are not suitable here, since there are multiple dimensions to consider. Even regression models should be used with care, as the main goal here is to determine the clues that motivate each telemarketer and let them reach their best performance. Figure 3 shows a high-level diagram of the data organization.

Fig. 3 Information for metrics


A user belongs to a team (a logical organization of employees in an enterprise), which will typically have a set of activities designed for it. Accordingly, a number of Trivias, Missions, and Tournaments will be associated with those activities. Through these activities, a user (employee) obtains insignias, gifts, and a log of transactions. A participant is allowed to set just one Avatar. The result of every transaction updates the currently calculated skill and a set of metrics. Some of the metrics collected are highest score, average score, feeling (derived from a table that relates scoring, participation, and type of transactions), insignias earned in the last week, gifts earned in the last week, number of challenges, etc. The original data model was designed in a more conventional way to reflect employee activity; it was then tuned by recasting activity as a transaction (the Transaction entity) that covers the mapped game activities. Tables 1, 2, 3, 4 and 5 show the entire set of main tables in the architecture. Note that, from the current perspective, the enterprise goals must be parameterized as a set of challenges, trivias, and tournaments.

Table 1 Fields and indexes of the Administrative information in the database

Table name                        Number of fields   Number of indexes
admin_integracion                 4                  1
i_importmetricdata                31                 7
admin_integracion_configuracion   4                  2
m_integrationconf                 10                 5
m_session                         13                 5
m_syserrorlog                     17                 4
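The relationships described in this section (a user belongs to a team, sets at most one avatar, accumulates insignias, gifts, and a log of transactions, and has metrics recomputed from that log) can be sketched in a few lines of Python. The class names, field names, and the `"challenge"` transaction kind below are hypothetical and chosen only for illustration; they are not the columns of the actual database tables, and the feeling metric is left out because its lookup table is not given in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Transaction:
    """One mapped game activity (trivia, mission, tournament, ...)."""
    kind: str
    score: float


@dataclass
class User:
    """An employee; belongs to one team and may set at most one avatar."""
    name: str
    team: str
    avatar: Optional[str] = None
    insignias: List[str] = field(default_factory=list)
    gifts: List[str] = field(default_factory=list)
    log: List[Transaction] = field(default_factory=list)

    def record(self, transaction: Transaction) -> None:
        """Append a transaction to the user's log."""
        self.log.append(transaction)

    def metrics(self) -> dict:
        """Recompute a few of the metrics named in the text from the log."""
        scores = [t.score for t in self.log]
        return {
            "highest_score": max(scores, default=0.0),
            "average_score": sum(scores) / len(scores) if scores else 0.0,
            "number_of_challenges": sum(t.kind == "challenge" for t in self.log),
        }


# Illustrative usage with made-up values.
agent = User(name="agent-1", team="team-a", avatar="fox")
agent.record(Transaction(kind="trivia", score=8.0))
agent.record(Transaction(kind="challenge", score=6.0))
summary = agent.metrics()
```

Recomputing the metrics from the transaction log, instead of storing them as independent fields, mirrors the statement that every transaction updates the calculated skill and metrics.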

Table 2 Fields and indexes for USER information in the database

Table name           Number of fields   Number of indexes
i_user               27                 4
m_avatarpart         12                 4
m_challenge_team     2                  2
m_challenge_user     2                  2
m_company            14                 4
m_emailqueue         13                 5
m_feelingshistoric   9                  5
m_insignia           18                 6
m_insigniauser       17                 7
m_filedata           11                 4
m_skill              9                  5
m_user               23                 6
m_userrole           11                 6
m_user_avatarpart    2                  2

Table 3 Fields and indexes for TEAM information in the database

Table name          Number of fields   Number of indexes
i_user              27                 4
m_account           12                 4
m_insigniateam      10                 6
m_insignia_team     2                  2
m_insignia_user     2                  2
m_tournament_team   2                  2

Table 4 Fields and indexes for METRIC information in the database

Table name               Number of fields   Number of indexes
i_user                   27                 4
m_avatarpartuser         12                 7
m_giftitem_team          2                  2
m_giftitem_user          2                  2
m_metricclassification   9                  4
m_metricconf             27                 4
m_metricconf_team        2                  2
m_metricuser             12                 7
m_metricconf_user        2                  2
m_metricdata             13                 7
m_metric_user_summary    15                 0
m_skillcategory          8                  4
m_tournament             17                 5
m_team                   13                 5
m_tournament_user        2                  2
m_trivia_team            2                  2
m_trivia_user            2                  2

5.3 Data Pre-processing

This step covers cleaning the data set, dealing with missing data and outliers, and data transformations. By construction, there cannot be missing values. Nevertheless, there can be values such as “unknown” or “others”, which are as unhelpful as missing values. Thus, these ambiguous values must be removed from the original information. Some outliers might arise. In order to capture the general trend in the data set, outliers are dropped. Outliers are defined as values that lie more than three standard deviations from the mean. Finally, some changes must be made to fields with extensive domain values, units, and data types for easier analysis.
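The two cleaning rules above (removing ambiguous categorical values and dropping values more than three standard deviations from the mean) can be sketched with the Python standard library. The record layout, the field names `role` and `score`, and the use of the population standard deviation are assumptions made for this example only.

```python
from statistics import mean, pstdev

# Values treated as uninformative, following the examples given in the text.
AMBIGUOUS = {"unknown", "others"}


def drop_ambiguous(records, key):
    """Remove records whose categorical value is as unhelpful as a missing one."""
    return [r for r in records if str(r[key]).strip().lower() not in AMBIGUOUS]


def drop_outliers(records, key, n_sigmas=3.0):
    """Remove records whose value lies more than n_sigmas standard deviations from the mean."""
    values = [r[key] for r in records]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(records)  # all values identical: nothing to drop
    return [r for r in records if abs(r[key] - mu) <= n_sigmas * sigma]


# Made-up sample: 20 regular records, one ambiguous role, one extreme score.
records = [{"role": "agent", "score": 50.0} for _ in range(20)]
records.append({"role": "unknown", "score": 52.0})
records.append({"role": "agent", "score": 5000.0})

cleaned = drop_outliers(drop_ambiguous(records, "role"), "score")
```

Note that a three-sigma rule needs enough records to work: with very few values, a single extreme point inflates the standard deviation so much that it can never exceed the threshold.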

Table 5 Fields and indexes for TRANSACTION information in the database

Table name                Number of fields   Number of indexes
i_user                    27                 4
m_accounttransaction      17                 6
m_challenge               17                 6
m_challengeuser           22                 10
m_giftitem                14                 4
m_giftitemuser            15                 7
m_integration             10                 4
m_match                   29                 16
m_mission                 14                 5
m_missionuser             17                 7
m_mission_team            22                 2
m_mission_user            2                  2
m_news                    10                 4
m_news_team               22                 2
m_news_user               2                  2
m_notification            14                 6
m_notificationuser        11                 6
m_notificationvote        9                  6
m_notification_filedata   2                  2
m_trivia                  19                 4
m_triviaanswer            14                 6
m_triviaquestion          22                 5
m_triviauseranswer        11                 6
m_triviauserquestion      11                 6
m_triviauserscore         18                 7

5.4 Exploratory Data Analysis

In order to obtain a better understanding of the data collected, the distribution of key variables and the relationships among them are evaluated. As mentioned in previous sections, the model will require regression analysis in combination with classification on specific variables such as product and industry. The final model is required to be a set of rules, so that the output of the analysis is easy to use. A specific set of rules will be obtained and then implemented to improve every telemarketer’s proficiency.


5.5 Telemarketer Profile

The last step applies the performance model to a telemarketer according to their individual parameters, and closes the loop with the variations in their performance. The model then uses this feedback to self-tune the individual parameters.
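The text does not specify how the self-tuning is carried out, so the following is only one possible reading of the feedback loop, given as a sketch. It assumes that each telemarketer has a weight per game mechanic, that the weight of the mechanic just offered is nudged by the observed change in performance, and that weights are then renormalized. The mechanic names, the learning rate, and the update rule are all hypothetical.

```python
def self_tune(weights, used_mechanic, performance_delta, learning_rate=0.1):
    """Nudge the weight of the mechanic just used by the observed performance
    change, then renormalize so the weights stay positive and sum to 1."""
    updated = dict(weights)
    updated[used_mechanic] = max(
        1e-6, updated[used_mechanic] + learning_rate * performance_delta
    )
    total = sum(updated.values())
    return {name: value / total for name, value in updated.items()}


# Hypothetical starting profile: no preference among three mechanics.
profile = {"ranking": 1 / 3, "mission": 1 / 3, "trivia": 1 / 3}

# Hypothetical feedback: (mechanic offered, relative change in the agent's KPI).
feedback = [("mission", 0.5), ("ranking", -0.25), ("mission", 0.25)]
for mechanic, delta in feedback:
    profile = self_tune(profile, mechanic, delta)

# Mechanic the engine would favor next for this agent.
preferred = max(profile, key=profile.get)
```

With this made-up feedback the profile drifts toward missions and away from rankings, which is the kind of per-individual adjustment the closed loop is meant to produce.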

6 Conclusions and Future Work

Motivating an employee requires a certain conception of how the individual can be engaged with a certain goal. A short literature review shows that the solution is not to increase salaries or to put severe pressure on the environment, but to find a combination of gamification and real-life goals. Telemarketing is a stressful activity that also relies on creativity and the individual’s ability to sell. This paper presents a short introduction to the field of motivation and the relevance of introducing Serious Games to help telemarketers improve their performance. A blueprint of the project and certain steps required to implement the underlying approach are also included. From the first findings, certain data have been introduced to improve the metrics to be applied, in order to express the relationship between the current goals of the enterprise and the set of gamified artifacts in the platform. Due to the restricted domain of implementation and the time required to get a consistent set of data, the authors expect to be able to get the first results in the following couple of months. Once the test data set is available, motivation models will be constructed and the closed loop will be completed to evaluate the whole process. For now, the authors are just collecting data and monitoring the quality of the process.

References 1. Gnauk, B., Dannecker, L., Hahmann, M.: Leveraging gamification in demand dispatch systems. In: Proceedings of the 2012 Joint EDBT/ICDT Workshops (EDBT-ICDT’12), ACM (2012) 2. Schlömmer, M., Spieß, T., Schlögl, S.: Leaderboard positions and stress—experimental investigations into an element of gamification. In: MCI—The Entrepreneurial School, 6020 Innsbruck, Austria. Turkanovi´c, M., Heriˇcko, M. (eds.) Sustainability 13(12), 6608 (2021). https://doi.org/ 10.3390/su13126608 3. Tamminen, L.: Gamification and use engagement in self-learning software. University of Tampere School of Information Sciences Information Studies and Interactive Media. Master’s Thesis (2015) 4. Giannetto, D., Chao, J., Fontana, A.: Gamification in a social learning environment. In: Issues in Informing Science and Information Technology, vol. 10 (2013) 5. Ašeriškis, D., Damaševiˇcius, R.: Gamification of a project management system. In: ACHI: The Seventh International Conference on Advances in Computer-Human Interactions (2014) 6. Yang, W. et al.: Mining player in-game time spending regularity for churn prediction in free online games. In: 2019 IEEE Conference on Games (CoG), pp. 1–8. https://doi.org/10.1109/ CIG.2019.8848033

282

D. L. De Luise and R. Borgia

7. Zammitto, V., Ambinder, M., Lorusso, T., Hrennikoff, C.: Applied games user research: industry panel. In: 2013 IEEE International Games Innovation Conference (IGIC), pp. 291–292. https:// doi.org/10.1109/IGIC.2013.6659166 8. Li, Q., Bian, Y.: Game analysis of information collaboration in the linkage between manufacturing industry and logistics industry. In: 2020 Management Science Informatization and Economic Innovation Development Conference (MSIEID), pp. 408–411. https://doi.org/10. 1109/MSIEID52046.2020.00087 9. Wnag, N., Wang, S., Zhang, B.: Game analysis for the framework of green supply in construction industry. In: 2011 International Conference on Electric Technology and Civil Engineering (ICETCE), pp. 1283–1286. https://doi.org/10.1109/ICETCE.2011.5775351 10. Dhawan, S., Singh, K., Batra, A.: Defining and evaluating network communities based on ground-truth in online social networks. In: Recent Innovations in Computing, Proceedings of ICRIC 2020, pp. 151–163.https://doi.org/10.1007/978-981-15-8297-4_13 11. Asare-Frempong, J., Jayabalan, M.: Predicting customer response to bank direct telemarketing campaign. In: International Conference on Engineering Technology and Technopreneurship (ICE2T), Kuala Lumpur, Malaysia, pp. 1–4 (2017). https://doi.org/10.1109/ICE2T.2017.821 5961 12. Islam, Md. S., Arifuzzaman, M.: SMOTE approach for predicting the success of bank telemarketing. In: 4th Technology Innovation Management and Engineering Science International Conference (TIMES-iCON) (2019). 13. Jin, W., He, Y.: Three data mining models to predict bank telemarketing. IOP Conf. Ser. Mater. Sci. Eng. 490(6), 062075 (2019) 14. Chen, C., Chiu, H.: Applying AI techniques to predict the success of bank telemarketing. In: 4th International Conference on Deep Learning Technologies ICDLT 2020 (2020) 15. Moro, S., Cortez, P., Rita, P.: A data-driven approach to predict the success of bank telemarketing (2014).https://doi.org/10.1016/j.dss.2014.03.001DSS 16. 
Subarkah, P., Pri Pambudi, E., Oktaviani Nur Hidayah, S.: Data mining in bank telemarketing. Matrik Jurnal Manajemen Teknik Informatika dan Rekayasa Komputer 20(1), 139–148 (2020) 17. Sembiring, G., Huang, J., Prompreing, K., Prompreing, T.: Dynamic parameters and algorithm in predicting bank telemarketing success. Int. J. Bus. Inf. Syst. 1(1), 1 (2019) 18. Hassan, D., Rodan, A., Salem, M., Mohammad, M.: Comparative study of using data mining techniques for bank telemarketing data. In: Sixth HCT Information Technology Trends (ITT) (2019) 19. Kozak, J., Juszczuk, P.: The ACDF algorithm in the stream data analysis for the bank telemarketing campaign. In: 5th International Conference on Soft Computing & Machine Intelligence (ISCMI) (2018) 20. Che, J., Zhao, S., Li, Y., Li, K.: Bank telemarketing forecasting model based on t-SNE-SVM. J. Serv. Sci. Manag. 13(03), 435–448 (2019) 21. Zeinulla, E., Bekbayeva, K., Yazici, A.: Comparative study of the classification models for prediction of bank telemarketing. In: IEEE 12th International Conference on Application of Information and Communication Technologies (AICT) (2018) 22. Koumetio Tekouabou, C.S., Cherif, W., Hassan, S.: A data modeling approach for classification problems: application to bank telemarketing prediction. In: The 2nd International Conference on Telemarketing (2019) 23. Shamala, P., Mustapha, A., Mohd Foozy, C.F., Atan, R.: Customer profiling using classification approach for bank telemarketing.https://doi.org/10.30630/joiv.1.4-2.68 24. Gao, H., Wu Pan, X., Shan Fam, P., Chin Low, H.: Neural networks with different activation functions applied in bank telemarketing. In: 62nd ISI World Statistics Congress at KL, Malaysia (2019) 25. Yu, J.M., Cho, S.B.: Prediction of bank telemarketing with co-training of mixture-of-experts and MLP. In: International Conference on Neural Information Processing (2016) 26. Lahmiri, S.: A two-step system for direct bank telemarketing outcome classification. Intell. Syst. Account. 
Fin. Manage. 24(1) (2017)

Machine Learning Self-Tuning Motivation Engine for Telemarketers

283

27. Nummenmaa, T., Kankainen, V.: Social features in hybrid board game marketing material. In: FDG’19: Proceedings of the 14th International Conference on the Foundations of Digital Games, Article No.: 67, pp. 1–8 (2019). https://doi.org/10.1145/3337722.3341864 28. Lee, M., Jin, J.H., Ryu, G.: Motivated to share? Using the person-environment fit theory to explain the link between public service motivation and knowledge sharing. In: García-Holgado, A. (eds.) Sustainability 13(11), 6286 (2021). https://doi.org/10.3390/su13116286 29. Kleiman, F., Jassen, M.: Gaming for meaningful interactions in teleworking lessons learned during the COVID-19 pandemic from integrating gaming in virtual meetings. Digital Govern.: Res. Pract. 1(4), 1–5 (2020). https://doi.org/10.1145/3416308 30. Misiak-Kwit, S., Wi´scicka-Fernando, M., Dilruk Fernando, K.: The symbiotic mutualism ´ between co-creation and entrepreneurship. Kot, S., Slusarczyk, B. (eds.) Sustainability 13(11), 6285 (2021). https://doi.org/10.3390/su13116285 31. Serrano-Silva, Y.O., Villuendas-Rey, Y., Yáñez-Márquez, C.: Telemarketing success: evaluation of supervised classifiers (2020). ISSN 1870-4069 32. Dickens, A.: Fact: data is key to a successful telemarketing campaign. https://www.virtualsales.com/data-is-key-for-telemarketing-databases/VSL. Sales driven 2015. Last accessed 21 Mar 2021 33. Vajiramedhin, C.: Feature selection with data balancing for prediction of bank telemarketing. Appl. Math. Sci. 8(114), 5667–5672 (2014) 34. Telemarketing classification in banking institution: tree-based model. https://rpubs.com/Arifyu nan360/Telemarketing. Last Accessed 29 Mar 2021 35. Dhawan, S., Singh, K., Batra, A.: Defining and evaluating network communities based on ground-truth in online social networks. In: Singh, P.K., Singh, Y., Kolekar, M.H., Kar, A.K., Chhabra, J.K., Sen, A. (eds.) Recent Innovations in Computing, Proceedings of ICRIC 2020, pp.151–163. Lecture Notes in Electrical Engineering Book Series (LNEE), vol. 
701. Springer (2020) 36. Bova, V., Kravchenko, Y., Rodzin, S., Kuliev, E.: Simulation of the semantic network of knowledge representation in intelligent assistant systems based on ontological approach. In: Singh, P.K., Veselov, G., Vyatkin, V., Pljonkin, A., Dodero, J.M., Kumar, Y. (eds.) Futuristic Trends in Network and Communication Technologies. Third International Conference, FTNCT 2020, Taganrog, Russia, pp. 241–252, CCIS, vol. 1395. Springer (2020) 37. Glucksberg, S.: The influence of strength of drive on functional fixedness and perceptual recognition. J. Exp. Psychol. 63, 36–41 (1962) 38. Himmelstein, D., Ariely, D., Woolhandler, S.: Pay-for-performance: toxic to quality? insights from behavioral economics. Int. J. Health Serv. 44(2), 203–214 (2014) 39. Boughzala, I., Michel, H., de Freitas, S.: 48th Hawaii International Conference on System Sciences. IEEE Press (2015) 40. Ma, M.: Introduction to serious games development and applications. Entertain. Comput. 2(2), 59–60.https://doi.org/10.1016/j.entcom.2011.03.001 41. Laamart, F., Abdulmotaleb Saddik, M.: An overview of serious games. Int. J. Comp. Games Technol. (2014)

QROWD—A Platform for Integrating Citizens in Smart City Data Analytics

Luis-Daniel Ibáñez, Eddy Maddalena, Richard Gomer, Elena Simperl, Mattia Zeni, Enrico Bignotti, Ronald Chenu-Abente, Fausto Giunchiglia, Patrick Westphal, Claus Stadler, Gordian Dziwis, Jens Lehmann, Semih Yumusak, Martin Voigt, Maria-Angeles Sanguino, Javier Villazán, Ricardo Ruiz, and Tomas Pariente-Lobo

Abstract Optimizing mobility services is one of the greatest challenges Smart Cities face in their efforts to improve residents’ wellbeing and reduce CO2 emissions. The advent of IoT has created unparalleled opportunities to collect large amounts of data about how people use transportation. This data could be used to ascertain the quality and reach of the services offered and to inform future policy, provided cities have the capabilities to process, curate, integrate and analyse the data effectively. At the same time, to be truly ‘Smart’, cities need to ensure that the data-driven decisions they make reflect the needs of their citizens, create feedback loops, and widen participation. In this chapter, we introduce QROWD, a data integration and analytics platform that seamlessly integrates multiple data sources alongside human, social and computational intelligence to build hybrid, automated data-centric workflows. By doing so, QROWD applications can take advantage of the best of both worlds: the accuracy and scale of machine computation, and the skills, knowledge and expertise of people. We present the architecture and main components of the platform, as well as its usage to realise two mobility use cases: estimating the modal split, i.e., the share of citizens that use each available mode of transport, and urban auditing.

L.-D. Ibáñez (corresponding author) · R. Gomer
University of Southampton, Southampton, UK

E. Maddalena · E. Simperl
King’s College London, London, UK

M. Zeni · E. Bignotti · R. Chenu-Abente · F. Giunchiglia
University of Trento, Trento, Italy

P. Westphal · C. Stadler · G. Dziwis · J. Lehmann
Institute for Applied Informatics (InfAI), Leipzig, Germany

S. Yumusak
KTO Karatay University, Karatay, Turkey

S. Yumusak · M. Voigt
AI4BD AG, Zürich, Switzerland

M.-A. Sanguino · J. Villazán · R. Ruiz · T. Pariente-Lobo
ATOS Research and Innovation, Madrid, Spain

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_16


1 Introduction

In a world dominated by huge societal and environmental shifts, urban mobility is likely to remain one of the most pressing challenges cities need to tackle over the next decades. The UN 2030 development agenda states as one of its goals the provision of “access to safe, affordable, accessible and sustainable transport systems for all”. In terms of environmental impact, road transport represents nearly 30% of the CO2 emissions in Europe and the US; reducing this share is critical to fighting global warming and calls for novel approaches and tools to improve urban mobility. Furthermore, commuting time spent on the road has been shown to have substantial effects on productivity and wellbeing; for example, according to the European Commission, traffic congestion costs the EU economy more than 100 billion euros every year (footnote 1).

The advent of IoT has created unparalleled opportunities to collect large amounts of data about how people use transportation: real-time positions of public transport, traffic cameras and meters, weather reports, etc. However, to be truly ‘Smart’, city authorities must find ways to make sense of all this data to drive their decisions, and to ensure that those decisions reflect the needs and expectations of the people they serve [9]. This may include focus groups, co-design workshops, or ideas competitions, as well as crowdsourced data collection activities in which citizens share data about their own transport patterns via mobile phones, wearables and other sensor-equipped devices to improve and provide feedback on existing services [5].

In this chapter, we introduce the QROWD platform, a data integration and analytics platform designed to include citizens in the data value chain of Smart Cities. It incorporates advanced interlinking and analysis capabilities for different sources of data, including human computation to train and validate algorithms, alongside means to crowdsource data collection and feedback. The platform is designed to develop and deploy arbitrary hybrid workflows that bring together the accuracy and performance of machine computation with human skills, knowledge and expertise that machines cannot emulate. In addition to the QROWD platform, we report on the design and implementation of two end-to-end mobility use cases deployed in the city of Trento, Italy: the estimation of the modal split, i.e., the share of citizens that use each available mode of transport; and the auditing of infrastructure location and information data. We also report on the results and lessons learned from their deployment.

This chapter extends previous work published in the 5th International Smart Cities Workshop, co-located with the Web Conference 2019, where we presented the design and implementation of the modal split and mobility infrastructure use cases, and i-Log, a crowdsensing mobile application to collect data from personal mobile phones in an unobtrusive, responsible way [17]. This chapter introduces the following novel contributions:

1 https://ec.europa.eu/transport/themes/urban/urban_mobility_en.


• The QROWD platform: a data integration and analytics architecture compatible with the FIWARE set of open standards (footnote 2), which are in use in more than 100 cities around the world. The platform supports the design, development and deployment of hybrid human-machine workflows for data collection, curation, integration and analysis.
• A linked-data-enabled, big sensor data storage component (QROWDDB).
• Extensions of the FIWARE open standard data models for transport and mobility to include contributions and feedback from citizens and to manage potential rewards or compensation.
• Guidelines to help Smart City managers design human computation tasks as part of hybrid workflows.
• A redesign of the use cases following these guidelines.
• Extended technical details of the machine components developed for the use cases: the transport mode classifier used to analyse data contributed by citizens through the i-Log app, and the interlinking component that uses semantic technologies to integrate heterogeneous data sources.

The remainder of the chapter is organised as follows: Sect. 2 reviews previous frameworks for Smart Cities aimed at leveraging IoT devices and those that tackle the human and citizen perspective. Section 3 describes the QROWD platform and architecture. Section 4 introduces the guidelines for designing and implementing tasks for citizens and crowdworkers, and explains how to architect them as crowdsourcing services for their integration with machine processes. Section 5 describes the tools for acquiring data from pre-existing static and dynamic sources, and the crowdsourcing services included with the QROWD platform for data collection. Section 6 introduces the data models we developed to facilitate the inclusion of citizens, and the QROWDDB, a component to manage data related to citizens and their contributions in a privacy-preserving way. Section 7 describes the data integration capabilities of the QROWD platform. Section 8 describes how we used the QROWD platform to develop and pilot two urban mobility use cases: generating and curating mobility infrastructure data, and estimating the modal split. We report on the results and lessons learned from the pilots.
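To make the notion of modal split concrete before the use cases are discussed, the following minimal sketch computes the share of citizens per transport mode from a mapping of citizens to their main mode. The citizen identifiers, mode labels and the function name `modal_split` are hypothetical; this is not the estimation pipeline used in the pilots, which relies on a transport mode classifier over sensor data.

```python
from collections import Counter


def modal_split(primary_modes):
    """Return the share of citizens using each transport mode.

    `primary_modes` maps a citizen identifier to that citizen's main mode.
    """
    counts = Counter(primary_modes.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {mode: count / total for mode, count in counts.items()}


# Hypothetical sample; identifiers and mode labels are made up.
sample = {"c1": "bus", "c2": "car", "c3": "car", "c4": "bike"}
split = modal_split(sample)
```

The shares always sum to one, which makes the result directly comparable across survey rounds of different sizes.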

2 Related Work

2.1 IoT for Smart Cities

Several frameworks have been developed to help Smart Cities harness the power of IoT infrastructures and sensor devices [10]. They can be classified according to their provision of the following capabilities [22]: (1) data acquisition, (2) semantic interoperability, (3) real-time data analysis, and (4) application development support.

2 https://www.fiware.org/developers/.


SmartSantander created the first experimental test facility for the research and experimentation of architectures, key enabling technologies, services and applications for the IoT in the context of a Smart City. It provides the data acquisition and application development support dimensions [27]. The semantic interoperability dimension was considered in the SPITFIRE EU project, which developed vocabularies to integrate descriptions of sensors and things with the LOD cloud; semantic entities as an abstraction for things with high-level states inferred from embedded sensors; semi-automatic generation of semantic sensor descriptions; and efficient search for sensors and things, all on top of a unified service infrastructure. Another platform that focuses on the semantic dimension is OpenIoT, which leverages the W3C Semantic Sensor Network ontology to annotate data from sensor streams and also provides a toolkit for filtering, selecting and visualizing them [21].

STAR-CITY integrates data of heterogeneous variety, velocity and volume, and combines description logic, rule-based reasoning, machine learning inferences and stream-based correlation to provide spatio-temporal analysis of traffic conditions for their diagnosis and prediction. Its main application scenario is real-time data analysis and the detection of traffic events [15]. CityPulse aims at facilitating knowledge extraction from Smart City environments, using a combination of large-scale data stream processing modules and adaptive decision support, while providing application development support, demonstrated by the development of a prototype adaptive travel planner app [22].

The IOTManager is a versatile and resilient framework capable of storing and rearranging data collected by IoT sensors [2]. Its main contribution is software that can be easily deployed by public organisations and Smart City managers without being tied to “platform-as-a-service” contracts with large providers. The framework is demonstrated with a case study on traffic controllers and weather stations. Deepint introduces the concept of “City-as-a-platform” [8]. Its main innovation is the introduction of wizards to easily create and deploy AI models for common Smart City tasks. Deepint is demonstrated with the implementation of a Crowdcounter that monitors pedestrian traffic levels using video cameras in the City of Melbourne.

None of these frameworks considers the integration of citizens, or human computation in general, as part of its toolkit. The potential of human sensing for Smart Cities was first studied in [6]. For the particular case of social media, that work presents a methodology to extract perceptions that may be relevant to Smart City initiatives from social media updates, validated on a dataset of tweets geolocated in New York City. Beyond social media, [9] proposed a re-imagination of the role of citizens in Smart Cities, highlighting the importance of supporting them in playing an active role in urban innovation, from the crowdsourcing of initial ideas to facilitating their involvement in the realisation of community projects.

TCitySmartF outlines a Smart Cities roadmap from the technological, social, economic and environmental points of view. It puts both residents and urban dynamics at the forefront of development, with participatory planning and interaction for robust community- and citizen-tailored services. It also includes connections to other cities, in order to create a region- or country-wide network of data that could be used to implement further policies and share technical knowledge [13]. Its authors define a high-level architectural design but do not provide an implementation.

2.2 Mobile Crowdsensing and Crowdsourcing

The field of Mobile Crowdsensing studies the wide variety of sensing models by which individuals collectively share data and extract information to measure and map phenomena of common interest, and for which several platforms have been discussed in the literature (see for example the survey in [14]). A similar line of work studies efforts to make mobile crowdsourcing useful for Smart Cities. Note that mobile crowdsourcing can be used for other types of problems, for example to perform tasks within a private organisation (like a university campus), or to carry out food or package delivery. The particular application to Smart Cities has been surveyed by Kong et al. [12]. According to their classification, the QROWD platform is a technology enabler for the use of mobile crowdsourcing in the context of Smart Cities. Both fields are related to our work, as one of the goals of the QROWD platform is the integration of the human factor, and citizens in particular, in the data value chain of a Smart City. The QROWD platform includes its own crowdsensing application (the i-Log app), which implements best practices from the literature and enables the flow of collected data towards data analytics components, and back to citizens for further curation. In this subsection, we review other frameworks that highlight a citizen-centric approach.

A discussion of the challenges for sustainable people-centric sensing is presented in [24], highlighting the importance of being energy-efficient on users’ devices, guaranteeing the privacy and security of people’s data, and putting in place appropriate incentive mechanisms. Vol4All [28] enables the exchange of ideas and crowdsourcing by facilitating citizens’ involvement in the realization of community projects. Volunteering actors (initiators, participants, stakeholders) can easily interact via the Vol4All platform, which enables the dynamic sharing, evolution and monitoring of volunteering opportunities. Vol4All includes a point system to incentivize participation in volunteering activities, and a number of tools for monitoring volunteering activity and analysing the results of specific campaigns.

OrganiCity [11] provides an Experimentation as a Service (EaaS) framework that aims at providing a scalable platform to manage city services and a co-creation environment that includes citizens. It includes technical components to integrate different data sources and to host co-creation experiments that empower citizens at different stages of the urban service lifecycle. To this end, it provides a set of co-creation tools: (1) SensiNact studio, which aims at helping coders work with data streams from deployed data assets without the need to learn about the OrganiCity APIs; (2) TinkerSpace, a toolkit for creating mobile services (apps) without the need for extensive software training or experience; and (3) Smartphone Experimentation, a complementary framework that helps experimenters gather and process data from the sensors and communication interfaces of volunteers’ smartphones and use them to run experiments.

Both Vol4All and OrganiCity restrict citizen participation to tools for contributing data. The QROWD platform extends this in two ways: first, it enables the harnessing of human computation not only from citizens, but also from paid crowdworkers; second, human computation is not limited to data collection, but is also integrated into all the other steps of the data value chain, providing the building blocks to create hybrid data flows comprising several human and machine processing steps.

CitySpeed is an application and server to collect, manage and provide access to vehicular speed data. Participating citizens download a mobile application that monitors the speed of their vehicles [4]. The proposed mobile-phone-based monitoring was found to match the speed collected by the ECUs of a set of test vehicles. The whole system was piloted in two cities in Brazil. The theoretical framework for a similar application is described in [19]. CitySpeed could be refactored as a component of the QROWD platform, with the additional advantage of managing the incentives for the citizens that wish to participate.

3 The QROWD Platform and Architecture

The QROWD platform is designed to seamlessly connect human computation tasks (HCTs) with machine analytics processes, reducing friction for developers and enabling the continuous improvement of data and services. From an architectural point of view, it is divided into five sets of components, as shown in Fig. 1.

1. The Crowdsourcing services component set (bottom) is a repository of standalone HCTs. We detail how HCTs should be architected in Sect. 4.2.
2. The Data generation and acquisition component set (bottom right quadrant) includes a data storage component to host heterogeneous data sources that can be static (e.g., records of parking locations and fees) or dynamic (e.g., live streams of the occupancy of said parking). In this component set, we also include machine components that perform data harvesting, extraction and semantization.
3. Storage (bottom left): data acquired from citizens through crowdsourcing services and pre-existing data need to be integrated for further analysis. This component set includes the QROWDDB, a Big Data storage for personal big data generated by the mobile devices of citizens, and the associated. We describe it in detail in Sect. 6.
4. Data interlinking, fusion and analytics (top right and top left): components that take as input the data integrated in the QROWDDB and perform machine-based fusion and interlinking, or other data analytics.

Fig. 1 QROWD Platform architecture

To implement inter-component communication, we chose to adopt a technology stack consistent with the one promoted by the Open and Agile Smart Cities (OASC) alliance, a non-profit, international Smart City network that connects more than 140 Smart Cities globally, organised in national networks from 27 countries and regions, and that aims at establishing the Minimal Interoperability Mechanisms (MIMs) needed to create a Smart City market. The QROWD platform makes use of three core technologies:

1. Apache NiFi, to host and execute data flows. The use of NiFi is also consistent with the design decision of following the FIWARE architecture.
2. CKAN (footnote 3), an open source, fully-featured, mature data portal and data management solution that can be easily adapted and extended and that provides an API. CKAN is used by hundreds of data publishers around the world and is the standard platform recommended by OASC to store datasets at rest. We use it as the repository for acquired data.
3. The FIWARE context broker (Orion). Orion manages the entire lifecycle of context information, including updates, queries, registrations and subscriptions. Context information consists of entities (e.g., a car) and their attributes (e.g., the speed or location of the car). Orion implements the NGSIv2 specification (footnote 4). The QROWD platform uses Orion to manage streaming and sensor data, and to orchestrate message passing across different components.

3 https://ckan.org/.
4 https://fiware.github.io/specifications/ngsiv2/stable/.

Fig. 2 Integration of machine data analytics task as Apache NiFi processor and human computation task deployed as crowdsourcing service

To illustrate, consider the problem of annotating text with entities from a knowledge base. A machine learning model is trained to output annotations together with a confidence value. Machine annotations may or may not be correct; therefore, humans can be recruited to validate them and to provide new ones that can be used to re-train the model. How can the inputs and outputs of both types of processes be connected seamlessly? Fig. 2 illustrates how to do this with the QROWD platform. Annotation is implemented as a human computation task and deployed as a crowdsourcing service. The machine learning model is deployed as a NiFi processor that calls the crowdsourcing service through a crowdsourcing connector whenever the confidence value of an annotation is under a configurable threshold. The corrected result is used as input to re-train the model.
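The routing logic of this example can be sketched in a few lines of plain Python. This is not NiFi processor code and does not use any QROWD API: `machine_annotate` and `crowd_validate` are hypothetical stand-ins for the trained model and the crowdsourcing connector, and the knowledge-base identifiers are made up. The sketch only shows the decision of sending annotations below a configurable confidence threshold to humans and keeping the validated results for re-training.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    text: str
    entity: str
    confidence: float


def machine_annotate(text):
    """Stand-in for the trained model; returns fixed, made-up annotations."""
    known = {"Trento": ("kb:Trento", 0.95), "Adige": ("kb:Adige_River", 0.40)}
    entity, confidence = known.get(text, ("kb:unknown", 0.0))
    return Annotation(text, entity, confidence)


def crowd_validate(annotation):
    """Stand-in for the crowdsourcing connector.

    A real connector would publish a human computation task and wait
    for the contributors' answer; here the answer is hard-coded.
    """
    corrections = {"Adige": "kb:Adige"}
    entity = corrections.get(annotation.text, annotation.entity)
    return Annotation(annotation.text, entity, 1.0)


def annotate(texts, threshold=0.8):
    """Route low-confidence machine annotations to the crowd.

    Returns the final annotations and the crowd-validated subset,
    which could serve as additional training data.
    """
    final, retraining = [], []
    for text in texts:
        ann = machine_annotate(text)
        if ann.confidence < threshold:
            ann = crowd_validate(ann)
            retraining.append(ann)
        final.append(ann)
    return final, retraining


final, retraining = annotate(["Trento", "Adige"])
```

Keeping the threshold as a parameter mirrors the configurable threshold mentioned above: lowering it sends fewer items to humans, at the cost of accepting more unverified machine output.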

4 Crowdsourcing Services

When designing a hybrid human-machine workflow, the first step is to clearly separate the tasks to be executed by machines from the Human Computation Tasks (HCTs). Designing HCTs is fundamentally different from designing a machine-only data pipeline. Designers need to consider an appropriate user interface, the incentives humans have to perform the tasks, and the general unpredictability of human behaviour. A lack of proper thinking about what, how and why a human engages in a task might lead to poor-quality results, or even no results at all. In the following, we discuss how the QROWD platform helps Smart City managers and service providers with the design of HCTs and their implementation as services for their integration with machine data processing, as described in Sect. 3.

4.1 Design Guidelines for Human Tasks

Several frameworks in the literature have proposed taxonomies of the dimensions that need to be considered to design effective and efficient general-purpose human computation and crowdsourcing tasks [18, 23, 25]. However, they all overlook some dimensions that are important in a hybrid context: first, the characteristics and restrictions of the devices required to fulfill the task; second, for data in motion or streams, human tasks need to output results at a velocity consistent with the processing speed of a machine component, which suggests considering an acceptable delay dimension; third, depending on the type of contributors, only a certain delay in assigning tasks to them may be tolerable before they lose attention or consider the proposed incentives insufficient for the time they invest. To fill this gap, we developed a guideline (Table 1) that combines the most important dimensions of previous frameworks and adds five new ones tailored to hybrid human-machine workflows. The first column of the guideline gives the name of the dimension; the second indicates either the source framework or that the dimension is introduced by us; the third lists sample values for the dimension. To apply the guideline, HCT designers must ask themselves, for each dimension, which value in the third column corresponds to the task being designed. In the following, we expand on the questions associated with each dimension and on the sample values, proceeding in the same order as Table 1.

• What is going to be done? This dimension refers both to what is required of the crowd and to the goal the requester wants to achieve. In [18], possible values are information finding, verification and validation, and content creation. We add passive and active sensing as two activities often required in the crowdsensing context.
• Who is carrying out the task? The who represents the type of crowd. In some cases contributors are drawn from an undetermined group of people, meaning that no assumptions regarding their skills can be made. However, contributors with particular skill sets are often required, such as polyglots for translation tasks, or citizens of a particular city for location-dependent tasks. Online crowdsourcing platforms also often implement strategies aimed at identifying the best or most reliable contributors, who may receive access to special benefits and privileges. We propose a novel set of values with respect to the literature: (1) Expert, if specific knowledge is required; (2) Citizen, for location-dependent tasks; (3) Anyone, if the task could be assigned to any crowdworker; (4) Specific contributor, when the task can only be performed by a specific person, e.g., the verification of personal data provided by a citizen may only be done by the concerned citizen.


Table 1 Guideline for human computation task design

Dimension | Based on | Sample value (values in bold are those proposed by us)
What | Malone et al. [18] | Information finding; Verification and validation; Interpretation and analysis; Content creation; Surveys; Content access; Passive sensing; Active sensing
Who | Malone et al. [18] | Expert; Citizen; Anyone; Specific contributor
Why - Motivation | Malone et al. [18], Smart et al. [25] | Economic; Altruistic; Hedonic; Reputational; Other
Why - Reward | Malone et al. [18], Smart et al. [25] | None; Monetary; Prize; Fun; Other
How | Malone et al. [18] | Collection; Collaboration; Contest
Required skill | Quinn and Bederson [23] | Visual recognition; Language understanding; Basic communication; Physical
Required device | Novel | PC; Mobile; None
Device constraint | Novel | Battery; Storage; Bandwidth; CPU; None
Interaction limit | Novel | [0-N]
Acceptable question delay | Novel | Immediate (seconds); Short (minutes); Medium (hours); Long (days)
Acceptable resolution delay | Novel | Immediate (seconds); Short (minutes); Medium (hours); Long (days)


• Why is this task being performed? This dimension concerns the reason why a contributor would engage in the task. It is split into two dimensions: the motivation, related to the intrinsic value for the contributor (e.g., is she doing it for the money (economic), for reputation, or out of altruism?), and the concrete reward that she will get for completing the task.
• How will the task be organised? We consider three different ways of organising a task: (1) Collection: the task is partitioned into several independent micro-tasks that are assigned to one or more contributors, and the results are then collected and aggregated. (2) Collaboration: contributors collaborate in solving the task. (3) Contest: contributors compete to perform the task best, and rewards are higher for the winners of the contest.
• What skill is required to complete the task? Previous work considered visual recognition, language understanding and basic communication [23]. We added specific mobility requirements to accommodate location-based tasks. Note, however, that some tasks may have different skill requirements.
• Do contributors require a device to complete the task? Especially for sensing tasks, contributors might need to be in possession of a connected device.
• Does the device have any constraints? If a device is needed, it is important to consider the constraints it might have, especially when the device belongs to the contributor. Our framework considers the basic constraints of an IoT device: battery, storage, bandwidth and CPU.

A further set of dimensions to consider are those of quality and aggregation, that is, how individual contributions will be quality-checked and aggregated into a final result. Table 2 describes the necessary dimensions and values, which we mostly reuse from [23]. We added ‘Formula’ to the aggregation dimension to refer to the algorithm that aggregates the set of individual contributions into a final result, in order to include techniques beyond simple aggregation, such as clustering.

Once the guidelines have been applied, the next step is to implement the design choices in such a way that the task can exchange input and output with a workflow composed of machine and human processes. To this end, QROWD provides a specification of the high-level components that a crowdsourcing task needs to implement to become a crowdsourcing service and fit into the QROWD architecture.

4.2 Crowdsourcing Service Implementation Framework

Once a crowdsourcing task has been designed, the next step is to ensure that its input and output can be easily connected to other data processes. QROWD proposes a framework for implementing human tasks as crowdsourcing services that interact with other components of the QROWD platform. The elements of the framework are shown in Fig. 3, together with their interactions with inputs, crowdsourcing channels and the QROWDDB.


Table 2 Quality and aggregation dimensions of a human task

Dimension | Based on | Sample value
Quality control | Quinn and Bederson [23] | Output agreement; Input agreement; Economic models; Defensive task design; Redundancy; Statistical filtering; Multilevel review; Automatic check; Reputation system
Aggregation | Quinn and Bederson [23] | Collection; Statistical processing of data; Iterative improvement; Active learning; Statistical; Search; Iterative improvement; None; Formula
Task request cardinality | Quinn and Bederson [23] | One-to-one; Many-to-many; Many-to-one; Few-to-one

The first element is a repository of task and/or question templates. A task template is a set of source code, libraries, and resources (such as texts, images, and appropriate handlers for the incoming data item) that defines the logic of a human task, including the aggregation and quality assurance methods chosen from the list in Table 2.

The second element is a decision component that handles the who, why, reward, device constraints, interaction limit and acceptable delay dimensions of the design guidelines. More precisely, a decision component must include:
• The list of contributors to the task
• A register of how many tasks have been assigned to each contributor and a counter of interactions
• A register of each contributor’s answering time and its difference with respect to the acceptable delay
• If relevant, a record of any device constraints associated with each contributor
• An assignment function that, given a processing or generation request, instantiates a task template and decides which contributor(s) to assign it to.
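As an illustration, the bookkeeping and the assignment function of a decision component can be sketched in Python. This is a minimal sketch under our own assumptions, not the QROWD implementation: the names Contributor and assign, the single low_battery flag standing in for device constraints, and the rule of preferring the least-loaded contributor are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Contributor:
    """State kept per contributor (field names are illustrative)."""
    name: str
    interactions: int = 0                               # counter of interactions
    answer_times: list = field(default_factory=list)    # past answering times, seconds
    low_battery: bool = False                           # stand-in for a device constraint


def mean_delay(contributor):
    """Average answering time so far, or 0.0 if there is no history yet."""
    times = contributor.answer_times
    return sum(times) / len(times) if times else 0.0


def assign(contributors, interaction_limit, acceptable_delay_s):
    """Return the contributor who gets the next task, or None if nobody qualifies."""
    eligible = [
        c for c in contributors
        if c.interactions < interaction_limit          # interaction limit dimension
        and not c.low_battery                          # device constraint dimension
        and mean_delay(c) <= acceptable_delay_s        # acceptable delay dimension
    ]
    if not eligible:
        return None
    # Assumed policy: least-loaded contributor first, faster responders break ties.
    chosen = min(eligible, key=lambda c: (c.interactions, mean_delay(c)))
    chosen.interactions += 1
    return chosen
```

A real assignment function would also instantiate the task template for the chosen contributor; the sketch only covers the choice of contributor.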


Fig. 3 Architecture of a crowdsourcing service within QROWD

The instantiated task template output by the assignment function is passed to a Deployment Manager that handles the deployment of the task on the relevant crowdsourcing channel and collects the results. Results are passed to an Aggregation and Quality Component which, based on the logic provided in the task template, executes the relevant quality checks and aggregations. If the results are of insufficient quality, the decision component may decide to redeploy the task for further iterations. The final results are written back either to the context broker or to the QROWDDB.


5 Data Acquisition and Generation

5.1 Pre-existing Data Sources

Cities often have pre-existing data sources that they would like to integrate in order to perform analytics or to connect with crowdsourcing services. The QROWD platform supports the acquisition of pre-existing data both from static sources that have a low update rate (also called data at rest) and from dynamic sources such as streams and sensors (also called data in motion).

Static data acquisition is organised around the concept of a dataset that is the responsibility of an organisation, was produced by a particular source, is available in a number of different formats, and is annotated with provenance information. Consider the example of a map with the coordinates and types of bike racks in the city. A possible source of this information is a city expert who has collected them for a certain area; a second possible source is a Volunteered Geographical Information system, e.g., OpenStreetMap, or a bike-enthusiast association. The procedure to add a dataset is as follows:
1. The originator of the dataset is added as a CKAN organisation.
2. The visibility of the dataset is set (public or private).
3. The name of the dataset is constructed by concatenating the following inputs:
• Name of the dataset, e.g., Bike Racks
• Version, one of lastVersion or Historical
• Type of the dataset: Source (indicating that an organisation produced the dataset) or Fusion (indicating that the dataset is the result of a fusion via a QROWD automated or crowdsourced process).
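The naming rule in step 3 can be illustrated with a short Python function. This is only a sketch: the text does not state which separator is used or how spaces in the dataset name are handled, so the "-" separator, the removal of spaces, and the function name dataset_name are assumptions.

```python
ALLOWED_VERSIONS = {"lastVersion", "Historical"}
ALLOWED_TYPES = {"Source", "Fusion"}


def dataset_name(name, version, dataset_type, sep="-"):
    """Concatenate name, version and type into one dataset name."""
    if version not in ALLOWED_VERSIONS:
        raise ValueError(f"version must be one of {sorted(ALLOWED_VERSIONS)}")
    if dataset_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}")
    compact_name = name.replace(" ", "")   # assumed: spaces are dropped
    return sep.join([compact_name, version, dataset_type])


print(dataset_name("Bike Racks", "lastVersion", "Source"))
```

With these assumptions, the bike rack example produced by an organisation would be named BikeRacks-lastVersion-Source.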

When a dataset is updated, the platform manages versions automatically by backing up the contents of the current dataset in a new archive dataset tagged with the timestamp of archival. QROWD provides three acquisition process templates, depending on whether a data transformation needs to be executed on the acquired dataset:
• Upload/Update a dataset: takes a dataset available at a remote URL and creates a new dataset if the name combination does not exist in CKAN, or updates the dataset (including versioning) if it already exists.
• JSON transformations: implements a number of configurable JSON transformations.
• Custom transformations: after fetching the dataset, applies a custom transformation to a target format and adds the output as a format of the dataset.

For dynamic data acquisition, a single process receives the data, transforms the original schema into FIWARE data model entities (cf. Sect. 6.1), and uploads or updates the entities in Orion so that other processes can query them. The procedure is divided into the following steps:


1. Evaluate JSON path: processor in charge of getting the id of the entity.
2. Invoke HTTP: processor in charge of checking whether an entity with this id already exists in the Context Broker.
3. If it exists, the entity is posted to the Context Broker.
4. If it does not exist, a pre-processing step is carried out to add the FIWARE entity type (if missing), followed by the posting of the entity to the Context Broker.
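The four steps above can be sketched as follows. To keep the example self-contained, the Context Broker is replaced by a hypothetical in-memory class (InMemoryBroker) instead of HTTP calls to Orion; the function name ingest, the "BikeRack" entity type and the returned status strings are likewise assumptions made for illustration.

```python
class InMemoryBroker:
    """Stand-in for the Context Broker: stores entities by id in a dict."""

    def __init__(self):
        self.entities = {}

    def exists(self, entity_id):
        return entity_id in self.entities

    def post(self, entity):
        self.entities[entity["id"]] = dict(entity)


def ingest(record, broker, default_type):
    """Push one incoming record to the broker, following steps 1 to 4."""
    entity_id = record["id"]                  # step 1: extract the entity id
    if broker.exists(entity_id):              # step 2: is it already known?
        broker.post(record)                   # step 3: post the entity
        return "updated"
    entity = dict(record)
    entity.setdefault("type", default_type)   # step 4: add the type if missing
    broker.post(entity)
    return "created"


broker = InMemoryBroker()
print(ingest({"id": "rack-1", "capacity": 10}, broker, "BikeRack"))
print(ingest({"id": "rack-1", "type": "BikeRack", "capacity": 12}, broker, "BikeRack"))
```

In a deployment, exists and post would be implemented as requests to the Context Broker rather than dictionary operations.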

5.2 Data from Citizens Devices

Citizens are the principal human agents of a Smart City ecosystem. As such, a fundamental component of a hybrid human-machine platform is one that enables data collection from, and interaction with, them. The general idea is to leverage the power of devices owned by the citizens while balancing the level of intrusiveness of the solutions, in order to ensure a high rate of response without hurting the relationship between citizens and Smart Cities. The QROWD platform includes the i-Log mobile application [29], which collects data from the user in an unobtrusive, data protection compliant and efficient way. The application can be used to generate two very different types of data: (1) streams of value pairs generated by the device’s internal sensors, and (2) user input in different formats, from text to visual. The latter capability makes it possible to use i-Log as a channel for pushing crowdsourcing services, as seen in Fig. 3.

A simplified version of i-Log’s architecture is presented in Fig. 4. The system is composed of a set of modular, logically isolated components, each one enabling a subset of the overall functionality of the application. The modularity of the architecture makes it possible to personalize the application and adapt it to different contexts and projects by modifying only the components involved. This gives i-Log a significant advantage in terms of the adaptability and extensibility of its features. The four main components are:

Fig. 4 i-Log mobile application architecture


• Data collection module: responsible for efficiently collecting and storing the data from the smartphone’s internal sensors. Data collection has been designed to be remotely configurable in terms of (i) which sensors to use and (ii) the frequency at which to collect data from them. Enabled and disabled sensors can be configured per individual task, within the same infrastructure. Once collected, the data are temporarily stored in compressed and encrypted log files on the device and synchronized over WiFi whenever a connection is available.
• User contribution module: responsible for collecting the user’s knowledge in the form of answers to simple questions (a contribution). The knowledge can be of different types, from text to images and other use-case-dependent objects, e.g., coordinates on a map. The questions are sent by a remote server as JSON objects that are then visualized on the smartphone and made available to the user.
• Communication module: responsible for all outbound and inbound connections. In more detail, it contacts the backend infrastructure of the application to perform operations such as registering users and logging them in, and to synchronize the generated data logs and save them in a database. It also receives the questions that the user has to answer to provide her own knowledge and keep her in the loop.
• User interface module: i-Log’s main functionality is to collect data about the user while running in the background on the phone, because the collection process must be as unobtrusive as possible. For this reason the user interface is very limited: it consists of a notification that is always present in the notification area of the smartphone while data collection is active. This is a mandatory requirement from a data protection point of view, since the user must always be informed when someone is dealing with her data.
A second notification is present whenever the user is asked to provide her knowledge. From these two notifications the user can access the actual views of the application: two menus, Settings and Contributions, which respectively allow her to set up the application and to access all the contributions.

i-Log can assign points to users depending on the quantity of data they have provided. These points can be used to implement the economic, reputational and hedonic values of the ‘Why - motivation’ dimension and, by extension, the Monetary and Prize values of the ‘Why - reward’ dimension of the guidelines in Sect. 4. Points may be assigned depending on:
• The time users spent using the application throughout the day, and thus contributing their sensor data.
• The day and time of contribution, e.g., to incentivise contributions on the dates and at the times when they are most needed.
• The sensors enabled while i-Log is running. This aspect combines several factors: (1) each smartphone has a different set of sensors, and this has to be taken into account during point assignment so as not to penalize users whose personal device lacks a sensor, but rather those who have it and decided to turn it off; (2) not all sensors can be disabled, e.g., the accelerometer cannot while the GPS can; (3) not all sensors have the same importance for all use cases, for example, accelerometer data could be more valuable than position data for certain types of analytics.
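The sensor-related factors can be made concrete with a small scoring function. The following is a sketch under our own assumptions and not i-Log's actual scoring scheme: the function name sensor_points, the use of per-sensor weights to express importance, and the symmetric reward and penalty are hypothetical.

```python
def sensor_points(available, enabled, weights):
    """Score one device for the sensors it keeps enabled.

    available: sensors physically present on the device
    enabled:   sensors currently switched on
    weights:   importance of each sensor for the use case at hand
    """
    points = 0.0
    for sensor, weight in weights.items():
        if sensor not in available:
            continue                 # missing hardware: no reward, no penalty
        if sensor in enabled:
            points += weight         # reward sensors that contribute data
        else:
            points -= weight         # penalize sensors the user switched off
    return points


weights = {"accelerometer": 2.0, "gps": 1.0, "gyroscope": 0.5}
print(sensor_points({"accelerometer", "gps"}, {"accelerometer"}, weights))
```

Sensors that cannot be disabled, such as the accelerometer, always fall in the rewarded branch, so in practice they only matter through their weight.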

5.3 Citizen Challenges

Section 5.2 described how the QROWD platform supports passive data contributions from citizens, where the only action they need to take is to install the i-Log app and run it in the background of their phones. Other types of applications require humans to take a more active role, such as taking pictures or answering questions about a Point of Interest, that is, the crowdsensing dimension discussed in Sect. 2. To support these use cases, the QROWD platform includes a citizen challenges crowdsourcing service that can be run on top of the i-Log app. Challenges receive the following input parameters:
1. An area of interest in which the challenge will take place (defined as a geospatial polygon).
2. Optionally, a set of coordinates within the area of interest where challenge participants should go to perform actions. If this set is empty, it is assumed that the purpose of the challenge is to locate something within the area of interest, that is, to create a map instead of validating one.
3. An HTML form that allows data input by the citizen, e.g., capturing coordinates using the phone’s capabilities, uploading photos or answering questions.

Data contributions from challenges can be associated with points that may be redeemed for rewards, in the same way as described for passive data contributions in Sect. 5.2.
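The three input parameters can be represented with a small data structure. The sketch below is illustrative only: the class name Challenge, its field names and the helper point_in_polygon are hypothetical, and the containment test is a standard ray-casting check on planar coordinates rather than anything taken from the QROWD code base.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many polygon edges a ray to the right crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):                       # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


@dataclass
class Challenge:
    area: List[Point]                                  # parameter 1: polygon
    targets: List[Point] = field(default_factory=list) # parameter 2: optional
    form_html: str = ""                                # parameter 3: input form

    def __post_init__(self):
        for target in self.targets:
            if not point_in_polygon(target, self.area):
                raise ValueError(f"target {target} is outside the area of interest")

    @property
    def is_mapping_challenge(self) -> bool:
        """No targets given: participants create a map instead of validating one."""
        return not self.targets
```

Validating the targets at construction time is a design choice of the sketch; it simply enforces the requirement that the coordinates lie within the area of interest.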

5.4 Annotations from Street-Level Imagery

The third data acquisition crowdsourcing service included with the QROWD platform is the Virtual City Explorer (VCE). It allows contributors to explore cities through street-level imagery services and to provide annotations (e.g., coordinates or state) of points of interest on the map. The VCE accepts parameters similar to those of a citizen challenge: the area that needs to be explored by the contributors; the type of item contributors should locate; an HTML form with questions about the points of interest found; and the number of contributors to assign to a given area. The VCE can be regarded as a virtual alternative to citizen challenges that is not limited to contributors physically present in the city. In turn, the VCE depends on the existence of street-level imagery and on how up to date it is.


Fig. 5 Virtual city explorer interface

Figure 5 shows a screenshot of the VCE interface from a contributor’s perspective. Before exploration starts, the contributor reads the task instructions, which explain how the task works and which types of objects need to be located. The contributor then starts the exploration from a random point within the area of interest. When a contributor discovers a candidate item, she is required to take three photos of it from three different angles. In the background, the VCE triangulates the vectors of the different angles to determine the coordinates of the item and stores them in a database. After a pre-established number of items has been submitted, the task ends. If the task was completed by a crowdworker, she is redirected to the crowdsourcing recruiting platform to receive her payment. An extensive evaluation of the VCE with paid crowdworkers is available in [16].
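The text does not specify how the VCE performs the triangulation. One common way to combine several viewing directions, shown here purely as an illustrative sketch, is to find the point that minimizes the sum of squared perpendicular distances to the lines of sight. The sketch assumes a flat local coordinate system in metres, bearings measured clockwise from north, and treats each line of sight as an infinite line; the function name triangulate is hypothetical.

```python
import math


def triangulate(observations):
    """Least-squares intersection of lines of sight in the plane.

    observations: list of ((x, y), bearing_deg), where (x, y) is the camera
    position and bearing_deg is the viewing direction, clockwise from north.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), bearing in observations:
        theta = math.radians(bearing)
        dx, dy = math.sin(theta), math.cos(theta)      # unit direction vector
        # Projector onto the normal of the line: I - d d^T
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * px + m12 * py
        b2 += m12 * px + m22 * py
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("lines of sight are parallel; no unique intersection")
    # Solve the 2x2 normal equations with Cramer's rule.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)
```

With noisy bearings the three lines do not meet in a single point, and the least-squares solution returns the point closest to all of them.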

6 Data Models and Storage

Once data from citizens and pre-existing sources has been acquired, the next step is to store it appropriately in a unified data model that allows further analysis. To this end, the QROWD Platform provides: (i) an extension of the FIWARE data models for transportation that includes Citizens, Visitors, and Trips as first-class citizens; (ii) a Big Data Storage tailored towards data collection on personal devices that facilitates compliance with data protection regulations.


6.1 Data Models

To ensure data portability across different applications, including but not limited to Smart Cities, the QROWD Platform reuses data models developed by the FIWARE framework. We extended the FIWARE transportation data model5 with three classes (Citizen, Visitor, and Trip) to include citizens and their data and processing contributions.

First, the Citizen class is defined as an agent that lives or commutes in a city using the transport infrastructure. Citizen has two mandatory properties:
• citizenId: a UUID assigned to the citizen
• citizenType: the type of citizen according to its mobility: Resident or Commuter.

Second, the Visitor class is defined as an agent that does not reside in the city. Visitor has two mandatory properties:
• visitorId: a UUID assigned to the visitor
• visitorType: the type of visitor: Business or Tourist.

Both classes can be extended with further properties according to application needs. We describe an example of such an extension in Sect. 8.

When considering mobility data, an important concept is that of a trip a citizen makes within the city. Trips can be used to power a number of services, for example, estimating transport mode usage in a city, understanding the demand at certain times of the day, or suggesting an alternative transportation mode to a group of citizens. With a unified data model, different transport operators or providers can add data to a common shared data space where analytics can be conducted. As such, we included in our data model extension a Trip class, defined as follows:
• Mode: list of transport modes used by the trip
• Purpose: purpose of the trip, restricted to one of work, school, accompanying, errands, free time, working trip, return
• initDate: timestamp of the start of the trip
• endDate: timestamp of the end of the trip
• startCoordinate: coordinate of the start of the trip
• stopCoordinate: coordinate of the end of the trip
• Path: polyline representing the path of the trip
• Multitrip (Boolean): whether the trip has multiple subtrips
• Subtrips: list of identifiers of the trips that make up the trip. Subtrips are subject to the restrictions that their paths must be a subset of the path of the parent trip, that their initDate and endDate must be in the range formed by the initDate and endDate of the parent trip, and that they have a single Mode.

We also added four super-classes to facilitate further extension to related scenarios: (i) Event: events represent occurrences that can have temporal or spatial parts. We use it as a superclass of Trip. (ii) Location: represents spatial parts. We use it as a superclass of transportation areas. (iii) Structure: physical objects representing structural entities. We use it as a superclass of the various mobility infrastructure classes in the FIWARE data models. (iv) Person: we use it as a superclass of Citizen.

5 https://github.com/smart-data-models/dataModel.Transportation/tree/1278849c096d8ea0ceaa3e3d8d7b30d6940ab474.
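The Citizen and Trip classes and the subtrip restrictions can be sketched with Python dataclasses. This is a simplified illustration rather than the FIWARE schema itself: the tripId field is hypothetical, the purpose and start/stop coordinate properties are omitted for brevity, subtrips are held as objects instead of identifiers, and "path is a subset of the parent path" is approximated by comparing polyline vertices.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple


@dataclass
class Citizen:
    citizenId: str      # UUID assigned to the citizen
    citizenType: str    # Resident or Commuter

    def __post_init__(self):
        uuid.UUID(self.citizenId)   # raises ValueError if not a valid UUID
        if self.citizenType not in {"Resident", "Commuter"}:
            raise ValueError("citizenType must be Resident or Commuter")


@dataclass
class Trip:
    tripId: str                            # hypothetical identifier
    mode: List[str]                        # transport modes used by the trip
    initDate: datetime
    endDate: datetime
    path: List[Tuple[float, float]]        # polyline as a list of vertices
    subtrips: List["Trip"] = field(default_factory=list)

    @property
    def multitrip(self) -> bool:
        return bool(self.subtrips)


def valid_subtrip(parent: Trip, sub: Trip) -> bool:
    """Check the three restrictions that the data model places on subtrips."""
    single_mode = len(sub.mode) == 1
    within_time = parent.initDate <= sub.initDate <= sub.endDate <= parent.endDate
    within_path = set(sub.path) <= set(parent.path)   # vertex-level approximation
    return single_mode and within_time and within_path
```

A Visitor class would follow the same pattern as Citizen, with visitorId and a visitorType restricted to Business or Tourist.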

6.2 Big Data Storage

Including citizens in the data analytics of Smart Cities makes it possible to leverage their personal mobile devices to contribute data or to solve tasks. As such, it is crucial for Smart Cities to have the technical means to manage huge amounts of data from potentially thousands of citizens’ devices. Furthermore, in line with the recent entry into force of data protection laws like the European General Data Protection Regulation (GDPR), data controllers, i.e., organizations that collect personal data, are responsible for ensuring that any sharing of data with other organisations for further processing is consented to, or that appropriate anonymisation or pseudonymisation measures have been taken.

The QROWD Platform provides a Big Data Storage component based on Apache Cassandra.6 Cassandra offers robust support for clusters spanning multiple data centers, with asynchronous masterless replication allowing low latency operations for all clients. Its linear scalability and fault tolerance on commodity hardware or cloud infrastructure make it well suited to mission-critical data. To provide pseudonymisation, we included the following policies:
1. A Cassandra keyspace is associated with the data of a single user. This allows different consistency strategies for different users and, most importantly, isolates each user’s data for privacy purposes. If every user’s data is saved in a separate keyspace, it is easier to deal with data protection requests, e.g., deleting the data if the user wants to uninstall the application. Anonymization is provided at this level, since the name of the keyspace is a 160-bit salt string generated randomly using the Secure Hash Algorithm 1 (SHA-1). All data processing by other components of the platform in a hybrid workflow uses this anonymized identifier, so the users’ personal data is never used in such processing. Both the salt and any property of the citizen considered personal are stored in a disambiguation table that is accessible only by designated data controllers.
2. There is one table per query that we need to answer, per sensor. Since we are dealing with time series, we chose to allow querying the data by time and, in some limited cases, also by value. With time series, a client application most often needs the values in a time interval, e.g., the accelerometer data from 08:00 AM to 10:00 AM to understand whether the user is moving. In less common situations, we would like to query by value, e.g., to understand whether the user previously visited a specific location.
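The first policy can be illustrated with Python's standard library. This is a minimal sketch of one possible reading of the policy, not the platform's code: the "ks_" prefix (a cautious choice so that the identifier starts with a letter), the in-memory DisambiguationTable class and its method names are all assumptions.

```python
import hashlib
import secrets


def new_keyspace_name(prefix="ks_"):
    """Return (name, salt): a random 160-bit identifier and the salt behind it."""
    salt = secrets.token_bytes(20)                  # 20 bytes = 160 random bits
    digest = hashlib.sha1(salt).hexdigest()         # 40 hex characters = 160 bits
    return prefix + digest, salt


class DisambiguationTable:
    """Maps keyspace names to personal data; meant for data controllers only."""

    def __init__(self):
        self._rows = {}

    def register(self, personal_properties):
        keyspace, salt = new_keyspace_name()
        self._rows[keyspace] = {"salt": salt, "personal": dict(personal_properties)}
        return keyspace          # the only identifier other components ever see

    def lookup(self, keyspace):
        row = self._rows.get(keyspace)
        return row["personal"] if row else None

    def forget(self, keyspace):
        """Drop the link between the keyspace and the person."""
        self._rows.pop(keyspace, None)
```

Handling a deletion request would then consist of dropping the user's keyspace in Cassandra and calling forget on the corresponding row.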

6 https://cassandra.apache.org/.


7 Data Integration

Data integration is a fundamental task in most value-adding data processing workflows. The general problem statement is to obtain a single coherent dataset from a set of heterogeneous data sources. Data integration consists of data normalization, data interlinking and data fusion. The QROWD Platform supports data integration through two tools: Limes,7 a link discovery framework for the Web of Data with time-efficient approaches for large-scale link discovery based on the characteristics of metric spaces, which provides the interlinking capabilities; and Sparql-Integrate,8 for the normalization and fusion of data.

The first step of data integration is normalization, that is, representing data in a uniform way. Foremost, this involves data models (e.g. graph-based, hierarchical or tabular) and schemata (e.g. the domain of bicycle parking). However, it also affects units, lexical representations (e.g. date formats) and encodings. Traditionally, a distinction is made between schema and instance data, as there are different data integration problems and solutions related to them; for example, the approaches for aligning class hierarchies generally differ from those for fusing attributes of instance data.

Once data has been normalized, interlinking can be applied at both the schema and the instance level in order to find candidate matches. These matches serve as the basis for fusion. In general, the set of matches may suffer from data quality problems related to ambiguity (multiple candidates exist where only one is expected), faultiness and incompleteness. While ambiguity is resolved using conflict resolution strategies, these may themselves introduce additional errors. For this reason, it makes sense to treat the dataset of annotated candidate matches (e.g. with confidence scores and provenance) as a valuable asset in its own right, i.e., decoupled from the remaining fusion process.
For example, a search for Trento on OpenStreetMap yields the city in Italy as well as a Paseo del Trento in Mexico.

Data fusion refers to the merging of data records from a given set of datasets in order to complete information and enable the resolution of conflicts. A prerequisite to fusion is schema integration, such that the relevant properties of data records from multiple sources are represented uniformly. Interlinking can be applied to provide additional input to fusion processes in order to establish candidate equivalence relations between entities. Going back to the Trento search example above, while the remaining fusion process simply adds the geo-coordinates of the match marked as correct to the final dataset, the dataset of candidate matches allows for quick verification and revision.

The QROWD platform makes use of the Semantic Web technologies RDF and SPARQL. With RDF, we can represent schema and instance data uniformly in a graph-like structure often referred to as a knowledge graph, and SPARQL queries enable the retrieval and manipulation of the data stored in it. By relying on the linked data principles, this workflow minimizes the necessary groundwork. All data transformations that achieve the data integration tasks are defined by SPARQL queries and/or RDF config files. This leads to two main benefits: the user only has to be fluent in these two technologies, and the workflow can be integrated into any other dataflow where SPARQL processors can be added.

7 http://aksw.org/Projects/LIMES.html.
8 https://github.com/QROWD/SparqlIntegrate.

Traditionally, in the relational database world, fusion processes are specified using sequences of SQL statements that implement domain-specific rules. Although the SPARQL standard does not provide a feature set as rich as that of SQL (dialects), the basic principles can nonetheless be applied to SPARQL. For example, DBpedia recently introduced a very similar workflow for data fusion in [7], where the set of annotated candidate matches is referred to as the PreFuse dataset.

Following the same pattern described in Sect. 4 for crowdsourcing services, the integration of Limes and Sparql-Integrate into the QROWD Platform was achieved by writing corresponding Apache NiFi processors.9 Recall from Sect. 3 that the QROWD Platform is based on Apache NiFi, which is designed to automate the flow of data between software systems. A NiFi dataflow is defined by a network of processors, each of which may have multiple incoming and outgoing connections; flow-files are used to pass data along the connections.

Sparql-Integrate is a tool developed in QROWD that leverages SPARQL for the integration of heterogeneous data. The SPARQL specification itself allows for extension functions but also notes the risk of limited interoperability.10 For QROWD, we chose the Apache Jena Semantic Web framework as the basis for our own SPARQL extensions. Our extensions exploit this framework’s plugin system, making it possible to easily integrate them into any other Jena-based software project. The Sparql-Integrate project comprises two libraries and two interfaces:
• A standalone SPARQL extension library for Jena with support for data formats (XML, JSON, CSV), HTTP requests, and file system access.
Naturally, some of these extensions are meant only for internal processing and should not be exposed in, e.g., public SPARQL endpoints due to their potential for abuse.
• A small standalone core library with additional functionality, especially a parser for documents holding multiple SPARQL statements, and a corresponding processor that gives control over the output.
• A command line interface for processing files of RDF data and SPARQL queries. Additionally, it supports launching an embedded SPARQL endpoint with an HTML frontend and the provided extensions.
• An Apache NiFi processor integration. We wrapped Sparql-Integrate into a processor that can take either data or its configuration as the content of an input flow-file. With this, the Sparql-Integrate processor is able to access files over HTTP, locally on the file system, or as the content of a flow-file. With the most common file formats (CSV, XML and JSON) supported, it is possible to create an ontology-aligned graph out of these sources with one or more SPARQL queries.

9 https://github.com/QROWD/nifi-sparql-integrate-bundle.
10 https://www.w3.org/TR/sparql11-query/#extensionFunctions.


LIMES is a link discovery framework for the Web of Data that identifies similar entities as well as duplicates in Web datasets. LIMES can execute so-called link specifications, which contain heuristics for the similarity of entities in datasets. These specifications can be created either manually or via machine-learning techniques. Human feedback is required to assess and maximize the precision and recall of these link specifications as well as of the resulting output. We also wrapped Limes into an Apache NiFi processor; the processor accepts a configuration file as input and returns a list of the links found between entities. The configuration specifies the location of the datasets, which entities should be linked based on which specific properties, and what kind of metric should be used as a distance measure.

NiFi processors that wrap Limes and Sparql-Integrate can be used within the QROWD Platform with a simple “drag and drop”. While it is possible to engineer a data integration workflow with the QROWD Platform only, we also developed a complementary offline workflow. To deal with the increasing complexity of Sparql-Integrate queries as tasks become more complex, the complementary workflow11 treats those files as source code, which enables syntax checking, auto-formatting, completion and version control. When finished, these files can be used to configure the QROWD platform processors.

With the addition of the Sparql-Integrate and Limes processors, it is possible to create arbitrary data integration workflows within the QROWD Platform. In Sect. 8 we will see the processors in action for solving the urban auditing problem.

8 Use Cases

In this section we show how we used the QROWD platform to develop two hybrid human-machine data flows: one to generate and manage mobility infrastructure data, and another to estimate modal split. We report on the piloting and evaluation of both applications in the city of Trento, Italy.

8.1 Generating and Managing Mobility Infrastructure Data

Accurate information about current mobility infrastructure is crucial for the implementation of mobility policies. However, records may be incomplete because certain items are installed and owned by private parties, or because of digitization errors. Sending municipality employees to scout an area or to regularly check known infrastructure is expensive and does not scale with the size of the area. A smarter alternative is to involve citizens in the task. In this section, we describe a hybrid human-machine workflow that we deployed in a live setting in the city of Trento, Italy, for generating and curating a map of bike rack locations for the Limited Traffic Zone of Trento. In the following, we describe each of the steps we followed together with the Smart City managers in Trento.

11 https://github.com/QROWD/link-discovery-and-data-fusion.

L.-D. Ibáñez et al.

Table 3 Design guideline applied to bike rack map collection using the VCE
What         Who            Why (motivation)   Why (reward)
Collection   Crowdworkers   Economic
How          Required skill       Required device   Device constraint
Collection   Visual recognition   PC                None
Interaction limit Question delay Resolution delay Medium Medium 5

Acquisition. The municipality of Trento had an initial dataset of 39 bike racks in the area of interest, each one including the type of bike rack (single sided or double sided), the name of the street where it is located, and the capacity. The Smart City managers wanted to know if the dataset was complete, in the sense that all bike racks in the area were included, and accurate, in the sense that all properties of each bike rack in the dataset were correct. Using the tools described in Sect. 5.1, we acquired this dataset into the CKAN repository. A second bike rack dataset was openly available from OpenStreetMap. Two volunteers had contributed the locations and properties of 59 bike racks in the same area of interest as the Municipality dataset. However, 36 bike racks were missing at least one of the type or capacity properties. We acquired this dataset into the CKAN repository as well but, before further analysis, we decided to generate a new dataset, taking advantage of the availability of recent street-view level imagery in Trento.

Generation. We used the Virtual City Explorer (VCE) tool described in Sect. 4 to create a crowdsourcing task to collect bike racks from the Google Maps street-view imagery of the area of interest defined by the municipality. Table 3 shows the application of the design guidelines to this task. We deployed the task on a crowdsourcing platform and recruited 25 crowdworkers, who mapped 44 bike racks.

Interlinking and Fusion. Using the machine components described in Sect. 7, we ran an interlinking process between the three acquired datasets based on the bike racks’ geographical coordinates. The output is a fused dataset in which bike racks from different datasets judged to be the same are merged into a new representative entity that aggregates all properties from its parents in a set. Figure 6 shows a visualisation of the fused dataset.
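The interlinking and fusion step can be illustrated with a minimal sketch. It is not the LIMES configuration used in the pilot: the 15 m distance threshold, the record layout, the sample coordinates and the function names are all assumptions made for illustration. Records from different sources that lie within the threshold are grouped, and each group's properties are aggregated into sets, as described above.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def interlink_and_fuse(racks, max_dist_m=15.0):
    """Link racks from different sources that are at most max_dist_m apart
    (hypothetical threshold), then merge every linked group into one entity
    whose properties are sets of the values found in its parents."""
    parent = list(range(len(racks)))  # union-find forest over rack indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(racks)):
        for j in range(i + 1, len(racks)):
            different_source = racks[i]["source"] != racks[j]["source"]
            if different_source and haversine_m(racks[i]["pos"], racks[j]["pos"]) <= max_dist_m:
                parent[find(i)] = find(j)

    fused = {}
    for i, rack in enumerate(racks):
        entity = fused.setdefault(find(i), {"sources": set(), "type": set(), "capacity": set()})
        entity["sources"].add(rack["source"])
        for key in ("type", "capacity"):
            if rack.get(key) is not None:  # missing properties are simply skipped
                entity[key].add(rack[key])
    return list(fused.values())


# Made-up sample records; coordinates are only roughly in the Trento area.
RACKS = [
    {"source": "municipality", "pos": (46.06700, 11.12100), "type": "single sided", "capacity": 10},
    {"source": "osm", "pos": (46.06705, 11.12102), "type": None, "capacity": 10},
    {"source": "vce", "pos": (46.06698, 11.12095), "type": "single sided", "capacity": 12},
    {"source": "vce", "pos": (46.07000, 11.12500), "type": "double sided", "capacity": 8},
]

for entity in interlink_and_fuse(RACKS):
    print(entity)
```

In this made-up sample the first fused entity carries two conflicting capacity values, which is the kind of disagreement that a subsequent human verification step would have to resolve.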
In Fig. 6, bike racks from different datasets that are considered to be the same are shown as groups of circles with the same color. White circles represent bike racks that were not linked with any other. Shapes within the circles encode which dataset the bike rack comes from: squares represent bike racks from the data generated with the VCE, triangles bike racks from OpenStreetMap, and crosses bike racks from the Municipality dataset. Datasets for each source (OpenStreetMap, Municipality and VCE) and the fused dataset are openly available in the Zenodo repository.12

Fig. 6 Map of bike rack clusters

Curation. To validate the location and properties of the bike racks in the interlinked map, a human needs to verify them. We modeled this as a Citizen Challenge (Sect. 5.3) using as input the area of interest and the interlinked map. Table 4 shows the instantiation of the crowdsourcing guidelines given in Sect. 4 for this task. Citizens were asked to go to one of the locations in the interlinked map and confirm (Fig. 7 (middle)) whether the bike rack is there, using a form with the following input fields:
1. Their location, taken from the device’s GPS by the i-Log app. This step is needed to confirm that the citizen is at the point featured on the map. The location was only considered valid if the measured GPS accuracy provided by the app was below 10.0 m.

12 https://doi.org/10.5281/zenodo.3574485.

Table 4 Design guideline applied to bike rack verification challenges
What                          Who        Why (motivation)    Why (reward)
Verification and validation   Citizens   Economic, hedonic   Prizes
How          Required skill                 Required device   Device constraint
Collection   Visual recognition, physical   Mobile phone      Bandwidth
Question delay Resolution delay Interaction limit None Challenge duration Challenge duration

Fig. 7 Three i-Log interfaces: (left) a user decides whether to accept participating in a challenge, (middle) a user contributes a new item detection, and (right) a user takes a picture of a newly discovered item

2. Reply Yes/No to the question Is the bike rack still here? If the selected answer is No, then the contribution is submitted.
3. If the answer to the previous question was Yes, provide a photo of the bike rack (Fig. 7 (right)).
4. Answer the question What kind of bike rack do you see? The answer is picked from example pictures of three different types of bike racks.
5. Answer the question What is the capacity of the bike rack?
6. Answer the question How many available spots does the bike rack have?

Twelve citizens aged 18–25, all students of the University of Trento, agreed to participate in the challenge over a one-week period. For each verified bike rack, participants received 5 points; for every 20 points accumulated, a participant was entitled to a 5 euro voucher exchangeable for phone credit. All participants validated at least 4 bike racks, and all but one of the bike racks on the input map were verified as existent or not. The one bike rack missed by the challenge was checked by a municipality employee and found to be on private ground, and was therefore removed from the final result.

In light of the good results with bike racks, we decided to invite the same participants to a second challenge to collect locations, types and pictures of special parking spots (disabled, taxi ranks, freight load/unload). These types of parking spots are challenging to collect with the VCE because the available street-view imagery may show a vehicle parked on them, impeding their identification. The input form comprised the following three fields:
1. Take a picture of the parking spot.
2. Share the location using the phone’s capabilities.
3. Answer the question What type of parking spot is this: disabled, taxi rank or freight? Examples of each type were provided in the interface for clarity.

The new challenge ran for one week and allowed the collection of a dataset containing 401 special parking spots (Fig. 8).

Fig. 8 High level description of the hybrid human workflow for completing mobility infrastructure
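The validity rule and the reward scheme of the bike rack challenge described above amount to simple arithmetic, sketched below. The constants (accuracy below 10.0 m, 5 points per rack, one 5 euro voucher per 20 points) come from the text; the `Verification` class, its field names, and the rule that a Yes answer requires a photo are assumptions made for illustration and do not reflect the i-Log data model.

```python
from dataclasses import dataclass
from typing import Optional

MAX_GPS_ACCURACY_M = 10.0   # location is valid only if the accuracy is below this value
POINTS_PER_RACK = 5         # points awarded per verified bike rack
POINTS_PER_VOUCHER = 20     # points needed for one voucher
VOUCHER_VALUE_EUR = 5       # value of one voucher in euros


@dataclass
class Verification:
    """Hypothetical record of one submitted form."""
    gps_accuracy_m: float
    still_here: bool
    photo: Optional[str] = None  # file name of the picture, if any


def is_valid(v: Verification) -> bool:
    """Reject imprecise locations; assume a Yes answer must come with a photo."""
    if v.gps_accuracy_m >= MAX_GPS_ACCURACY_M:
        return False
    return (not v.still_here) or v.photo is not None


def vouchers_earned(n_verified: int) -> int:
    """Whole vouchers earned after verifying n_verified bike racks."""
    return (n_verified * POINTS_PER_RACK) // POINTS_PER_VOUCHER


def reward_eur(n_verified: int) -> int:
    return vouchers_earned(n_verified) * VOUCHER_VALUE_EUR


print(vouchers_earned(4), reward_eur(4))
```

Under these rules, the minimum of 4 validated racks reported for every participant corresponds to 20 points, i.e., exactly one voucher.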

8.2 Modal Split Surveys

Modal split is a fundamental indicator for understanding how citizens use various means of transport. It is defined as the percentage of citizens using a particular mode of transportation for their travel in a specified time period (e.g., 30% car, 30% bus, 20% bike, 20% walking). It is also an important input for designing and evaluating mobility policies. For example, if a large number of car trips are detected towards a certain district, the municipality can devise a policy to encourage other transportation modes. The same measurement can then be repeated, focused on trips to that district, to evaluate the effectiveness of the policy.

Traditionally, modal split is estimated through travel surveys, where citizens either fill in a paper form or answer an operator by telephone, giving details of their trips during a certain period of time. This is expensive and time consuming, which greatly limits how often the modal split can be measured. An interesting alternative is to use citizens’ mobile phones to automate the survey, and to use data analytics on the collected data to “fill the form” as automatically as possible, asking the user only to confirm trips for which we are not confident enough about the result provided by the machine.

Figure 9 shows a high level overview of the data flow for collecting trips from citizens. We describe below how we implemented it to satisfy the particular needs of the Municipality of Trento (MT) with the help of the QROWD platform.

Fig. 9 High level description of the hybrid human workflow for trip collection from citizens’ devices

First, we extended the Citizen data model described in Sect. 6.1 with the demographic properties required by MT to filter and aggregate modal split. Table 5 describes the added attributes. The data model was loaded into QROWDDB, and CRUD operations on Citizen and Trip were configured in QROWDDB (Sect. 6).

Table 5 Attributes added as an extension to Citizen data model for modal split survey application
Attribute           Description
Occupation          Principal work activity
numberCohabitants   Number of people that live in the same house
numberVehicles      Total number of vehicles available to all cohabitants
preferredMode       Preferred transportation mode
WorkSector          Sector where citizen works
HomeSector          Sector where citizen lives
Age                 Age of citizen
Gender              Gender of citizen
Email               e-mail address
streetAddress       Address
drivingLicense      Type of driving license owned, if any

For connected vehicles and public transport, it is relatively easy to generate trip data that is complete, with accurate start and stop points and a correct transport mode label. However, mobile devices simply push raw data upstream and do not have the capability to convert it into trips. Within the QROWD Platform we developed a Transport Mode Detection (TMD) component that uses machine learning models to process GPS and accelerometer time series to (i) separate a GPS trace into (multimodal) trips by detecting start and stop points, and (ii) for each trip, infer the transport mode of each leg. However, data from personal devices is often noisy and/or sparse, and training data for specific transport modes in the specific topological and traffic conditions of a city may not be available, leading to inaccurate trip classification. To solve this, we put citizens in the loop by providing a Trip Update Interface as a Crowdsourcing Service (cf. Sect. 4) to allow the confirmation and amendment of the trips inferred by the machine.

The overall data and workflow is shown in Fig. 10. We assume the sensor data captured by the citizens’ devices is available in the QROWDDB component. For each citizen, we analyse whole-day GPS trajectories. First we preprocess them to extract the actual traveling segments. After removing outliers, unsupervised machine learning techniques like space-time clustering are applied to find stop points, e.g., when a citizen only moved inside a building, where many captured GPS positions are in the near vicinity of each other and thus build a point cluster, as shown in Fig. 11. The actual traveling segments, or ‘trips’, are then the movements between clusters. To detect transportation modes, we apply supervised machine learning techniques trained on labeled data, i.e., example trips for which we knew the correct transportation modes. For training, the same preprocessing was performed. We use supervised machine learning approaches that can be grouped into two categories, ‘numeric’ and ‘symbolic’.
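To make the stop detection step more concrete, the sketch below uses a simple stay-point heuristic: a stop is a run of consecutive GPS fixes that remain within a given radius of the first fix of the run for a minimum duration. This is a deliberately simpler stand-in for the space-time clustering mentioned above, not the TMD implementation; the 50 m radius, the 300 s duration, the function names and the sample trace are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def detect_stops(fixes, radius_m=50.0, min_duration_s=300.0):
    """fixes: time-ordered list of (timestamp_s, lat, lon).
    Returns inclusive index ranges (first, last) of detected stops."""
    stops, i, n = [], 0, len(fixes)
    while i < n:
        j = i + 1
        # Extend the run while fixes stay within radius_m of the run's first fix.
        while j < n and haversine_m(fixes[i][1:], fixes[j][1:]) <= radius_m:
            j += 1
        if fixes[j - 1][0] - fixes[i][0] >= min_duration_s:
            stops.append((i, j - 1))
            i = j
        else:
            i += 1
    return stops


def trips_between(stops):
    """A trip runs from the last fix of one stop to the first fix of the next."""
    return [(prev_last, next_first)
            for (_, prev_last), (next_first, _) in zip(stops, stops[1:])]


# Made-up trace sampled every 60 s: six fixes at one place, three fixes on the
# move, six fixes at a second place.
HOME, WORK = (46.0670, 11.1210), (46.0750, 11.1290)
MOVING = [(46.0690, 11.1230), (46.0710, 11.1250), (46.0730, 11.1270)]
FIXES = [(60.0 * k, lat, lon)
         for k, (lat, lon) in enumerate([HOME] * 6 + MOVING + [WORK] * 6)]

STOPS = detect_stops(FIXES)
print(STOPS, trips_between(STOPS))
```

Real traces would first need the outlier removal mentioned above, since a single spurious fix breaks a run under this heuristic.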
Fig. 10 Data and workflow of the transport mode detection component

Fig. 11 Part of an example GPS trajectory showing a cluster at a stop inside a shopping mall

The numeric machine learning approaches work on features derived from the accelerometer data streams we captured on the users’ smart phones. Here we use several classification algorithms from the Scikit-Learn [20] machine learning library.13 The symbolic machine learning algorithms make use of the citizens’ trip trajectories and symbolic background data representing the traffic infrastructure and further geographic information of the model region. This background data is represented by means of the Resource Description Framework (RDF)14 and the Web Ontology Language (OWL).15 As supervised machine learning software for learning OWL class expressions that describe, and serve as classifiers to distinguish, e.g., bus trips from non-bus trips, we used the DL-Learner framework16 [1].

To be able to infer class expressions that reflect distinctive spatial relations, e.g., that a trip was probably made by bus if it started and ended near a point of interest of type ‘bus stop’ and went along a line feature which represents a known bus route, we extended the OWL reasoning components of the DL-Learner to enable ‘spatial reasoning’. This spatial reasoner component is able to make implicit knowledge stored in the background knowledge base explicit and thus usable in OWL class expressions. We set up an RDF vocabulary of ‘virtual’ spatial RDF properties covering the relations from the Region Connection Calculus (RCC) [3] and further relations that seemed suitable for expressing characteristic features of the different transportation modes. Those spatial properties are inferred by means of the spatial coordinates attached to spatial entities in the knowledge base. A simple example would be the near property, where an assertion a near b is inferred whenever the distance between the geographical coordinates of a and b is less than, e.g., 10 m (where the actual value can be configured). Taking into account that GPS trajectories recorded on general purpose commodity hardware like smart phones are usually not 100% accurate, the spatial reasoner also needs to handle a certain degree of fuzziness. Figure 12 exemplifies this for the runs along property.
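The two spatial relations discussed above can be approximated with plain coordinate geometry, as in the following sketch. It is not the DL-Learner spatial reasoner: it works on bare coordinate tuples instead of an RDF knowledge base, returns booleans rather than matching road segments, and the 15 m tolerance and 80% share used for runs along are assumptions. Only the configurable 10 m default for near is taken from the text.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def to_local_xy(point, origin):
    """Project (lat, lon) in degrees to metres east/north of origin
    (equirectangular approximation, adequate over a few kilometres)."""
    lat, lon = map(math.radians, point)
    lat0, lon0 = map(math.radians, origin)
    return ((lon - lon0) * math.cos(lat0) * EARTH_RADIUS_M,
            (lat - lat0) * EARTH_RADIUS_M)


def near(a, b, threshold_m=10.0):
    """a near b holds when the two points are less than threshold_m apart."""
    x, y = to_local_xy(b, a)
    return math.hypot(x, y) < threshold_m


def dist_to_segment_m(p, seg_start, seg_end):
    """Distance in metres from p to the closest point of a road segment."""
    px, py = to_local_xy(p, seg_start)
    ex, ey = to_local_xy(seg_end, seg_start)
    seg_len_sq = ex * ex + ey * ey
    t = 0.0 if seg_len_sq == 0 else max(0.0, min(1.0, (px * ex + py * ey) / seg_len_sq))
    return math.hypot(px - t * ex, py - t * ey)


def runs_along(trajectory, road, tolerance_m=15.0, min_share=0.8):
    """Fuzzy test: at least min_share of the fixes lie within tolerance_m of
    some segment of the road polyline (both parameters are hypothetical)."""
    if not trajectory or len(road) < 2:
        return False
    segments = list(zip(road, road[1:]))
    close = sum(1 for p in trajectory
                if min(dist_to_segment_m(p, s, e) for s, e in segments) <= tolerance_m)
    return close / len(trajectory) >= min_share


# Made-up east-west road and a slightly noisy ride recorded along it.
ROAD = [(46.0670, 11.1200), (46.0670, 11.1300)]
RIDE = [(46.06705, 11.1210), (46.06695, 11.1230), (46.06704, 11.1250), (46.06698, 11.1270)]

print(near(ROAD[0], RIDE[0]), runs_along(RIDE, ROAD))
```

The tolerance plays the role of the fuzziness discussed above: the ride never coincides exactly with the road geometry, yet the relation still holds.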
In Fig. 12, the spatial reasoner extension we developed returns all road segments from LinkedGeoData17 [26] along which the given GPS trajectory runs, even though the respective trajectory segments do not exactly match the road segments.

All trained classifiers are consolidated in an overall ‘meta’ transportation mode classifier, which may choose a classification outcome, e.g., from the classifier that achieved the highest confidence. However, both start-stop detection and trip classification may fail, e.g., due to very sparse GPS trajectories or a lack of training data. The citizen is the only one who knows exactly what the itinerary was. Therefore, to put them in the loop, we designed a Trip Update Interface (TUI) as a crowdsourcing service to allow the confirmation and amendment of the trips inferred by the machine. Table 6 shows the design guidelines applied to the TUI. The motivation was set as partly altruistic (desire to collaborate with the municipality) and partly economic (winning a prize for contributing data). To avoid annoying users, the interaction limit was set to one question per trip. To avoid issues of users forgetting the details of a trip, the question delay was fixed to 24 h. The resolution delay was set to 72 h.

13 https://scikit-learn.org.
14 https://www.w3.org/TR/rdf11-primer/.
15 https://www.w3.org/TR/owl2-overview/.
16 http://dl-learner.org/.
17 http://linkedgeodata.org.

Fig. 12 Bike ride trajectory exemplifying the spatial relation runs along; the lgdr prefix of the result resources resolves to http://linkedgeodata.org/triplify/

Table 6 Design guideline applied to the trip verification task
What                          Who               Why (motivation)       Why (reward)
Verification and validation   Trento citizens   Economic, altruistic   Prizes
How          Required skill       Required device   Device constraint
Collection   Visual recognition   Mobile phone      Battery, bandwidth
Interaction limit     Question delay   Resolution delay
1 question per trip   24 h             72 h

We implemented a decision component that generated one instance of the TUI for each trip detected by the TMD. Both the classifier and the decision component were executed every morning on data collected during the previous day, according to the value of the question delay property in the guidelines. In case no sensor data had been received, a failsafe question was issued instead, asking the reason why no data had been received (app failure or a conscious decision not to submit data) and providing the user with a blank map where they could manually input their trips if they wished to do so. As deployment manager, we implemented a simple connector that encapsulated the instantiated TUI into an i-Log question and called the i-Log API. As questions about a trip can only be answered by the citizen that made it, we did not implement any aggregation metric.

The TUI receives a trip as input and generates an HTML5 + JavaScript responsive interface (an example is shown in Fig. 13) that enables amendment as follows:


Fig. 13 The trip update interface

1. Each approximate start and stop point is shown as a highlighted circle on a section of the map. A user can drag the circle on the map to amend the location of any of the points.
2. Each pair of start/stop points is linked to an icon representing the detected transport mode used between the two points in question. In the example, bus was detected between the start point and the first stop point, while walking was detected between the first and second stop points. A user can tap on the icon to select another transportation mode.
3. Finally, a user can add or remove intermediate stops using the add stop button and the rubbish bin icon.

The TUI outputs a trip with the amended start/stop points and transport modes (if any). Amended trips are considered ‘ground truth’. As such, we can also use them for bootstrapping the training of the classifiers, by asking for confirmation of all trips at first and periodically re-training the classifiers. Once the classifiers achieve a certain accuracy, one can ask for amendment of only those trips with a confidence level below a certain threshold. Model re-training can be configured as an offline process, or as a step of the data flow triggered by a certain condition, e.g., the number of confirmed/amended trips collected.

To collect data from citizens, we used the i-Log application described in Sect. 5.2. In line with the data minimisation principle of the European General Data Protection Regulation (applicable as the use case is within Europe), we configured i-Log to collect only accelerometer and GPS data, as these are the only streams required by the TMD.
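The interplay between the ‘meta’ classifier, the confirmation policy and re-training described above can be summarised in a few lines. The sketch below is only one possible reading: the classifier names, the 0.8 confidence threshold and the batch size of 500 amended trips are hypothetical, and the selection rule shown (take the most confident classifier) is the example the text gives, not necessarily the only rule used.

```python
def meta_classify(predictions):
    """predictions maps a classifier name to (mode, confidence).
    Returns (mode, confidence, name) of the most confident classifier."""
    name, (mode, confidence) = max(predictions.items(), key=lambda kv: kv[1][1])
    return mode, confidence, name


def needs_confirmation(confidence, bootstrapping, threshold=0.8):
    """While bootstrapping, every trip is sent to the citizen; afterwards only
    trips whose confidence is below the (hypothetical) threshold."""
    return bootstrapping or confidence < threshold


def should_retrain(n_new_ground_truth, batch_size=500):
    """Trigger re-training once enough confirmed/amended trips were collected."""
    return n_new_ground_truth >= batch_size


# Hypothetical outputs of three trained classifiers for one trip leg.
PREDICTIONS = {
    "numeric_a": ("bus", 0.62),
    "numeric_b": ("car", 0.55),
    "symbolic": ("bus", 0.91),
}

mode, confidence, chosen = meta_classify(PREDICTIONS)
print(mode, confidence, chosen, needs_confirmation(confidence, bootstrapping=False))
```

With a policy of this kind, the number of questions per citizen would shrink as the classifiers improve, while the one-question-per-trip limit remains an upper bound.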


A pilot was run in the first week of October 2019 with 149 participants. 44 participants submitted either phone sensor data or an answer to the failsafe question every day of the week, 65 did so on at least one day of the week, and 40 abandoned the experiment without contributing any data. Users could provide qualitative feedback by email after the end of the pilot. 18 people chose to do so, from which we highlight the following comments.
1. On some phones, the impact of sensor data collection on the device’s battery was perceived as high, prompting users to uninstall the app. Lesson learned: further improvements in the engineering of the app would be required before going into production.
2. When the automatically inferred trips were very different from the real trips, it was hard to update them to reflect reality. Lesson learned: further research needs to be conducted on the user experience of interfaces to update trips, especially on mobile devices.
3. Users that were less skilled with their phones considered the interface too complicated, leading to worries about providing wrong data. Lesson learned: the rationale for using mobile devices for modal split surveys is to take advantage of their ubiquity, together with the assumption that embedding questions about the data on the same devices would increase the number and quality of the answers. However, for some demographics this needs to be balanced with the user experience. A possible way forward is allowing trip updates on a PC or tablet.

In terms of the experience of the Municipality, we identified as the main pain point the need to run a helpdesk to support citizens with questions and to resolve issues. This need partially offsets the savings of this approach with respect to the phone surveys that it intends to replace. Nevertheless, the i-Log approach was estimated to be 20% less expensive than an equivalent phone survey. We expect this percentage to increase with improvements in the battery usage of the app and in the user experience of the Trip Update Interface.
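Once trips have been confirmed or amended, the modal split defined at the beginning of this section reduces to a frequency count. The sketch below computes it over trips rather than over citizens, which is a simplifying assumption, and supports the district filter used for policy evaluation. The trip record layout and the district labels are hypothetical.

```python
from collections import Counter


def modal_split(trips, district=None):
    """Percentage of trips per transport mode, optionally restricted to the
    trips whose destination is the given district."""
    selected = [t for t in trips if district is None or t["to"] == district]
    counts = Counter(t["mode"] for t in selected)
    total = sum(counts.values())
    return {mode: 100.0 * n / total for mode, n in counts.items()} if total else {}


# Ten made-up confirmed trips with hypothetical destination labels.
TRIPS = (
    [{"mode": "car", "to": "centro"}] * 2 + [{"mode": "car", "to": "povo"}]
    + [{"mode": "bus", "to": "centro"}] * 3
    + [{"mode": "bike", "to": "centro"}, {"mode": "bike", "to": "povo"}]
    + [{"mode": "walk", "to": "povo"}] * 2
)

print(modal_split(TRIPS))
print(modal_split(TRIPS, district="centro"))
```

Running the same computation before and after a policy change, restricted to the affected district, gives the kind of comparison described above.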

9 Summary and Conclusion

In this chapter, we presented the QROWD Platform, a collection of crowdsourcing-enabled integrations within a FIWARE-compliant architecture to create hybrid human-machine data processing workflows. The platform provides a framework for helping Smart City managers and their IT teams with the design and implementation of human computation tasks and citizen sensing as Crowdsourcing services, such that they can be integrated with machine processes. We demonstrated QROWD’s capabilities by describing how we used it to develop two hybrid human-machine workflows to solve two real problems in the municipality of Trento, Italy: locating mobility infrastructure, and implementing modal split surveys leveraging mobile phone sensor data and citizens’ feedback. The approach was very successful for the first use case. For the second, there is still room for improvement before a large-scale implementation: better engineering of the sensor data collection application to reduce its impact on phone battery, and more interface options for citizens to validate the trip classifications provided by the machine.

As the amount of data available to Smart Cities grows, there will be a need for purposeful analytics for operational managers and decision makers, in addition to approaches that enable citizen inclusion towards more human-centric cities. QROWD paves the way towards this end, and at the same time provides tools for leveraging other types of human contributions, such as those available from crowd-working platforms.

Future work will focus on two areas. First, incorporating advanced data privacy mechanisms that allow citizens fine-grained control over what type of analysis they allow on their data. Second, when integrating data from different sources and different providers, the question arises of how to price queries on that data in the context of an analytic process. This also has implications for monetary or prize rewards for citizens contributing data.

Acknowledgements Research on this paper was supported by the QROWD project, part of the Horizon 2020 programme under grant agreement 732194. We also acknowledge the Smart City managers of the Municipality of Trento.

References

1. Bühmann, L., Lehmann, J., Westphal, P.: DL-Learner: a framework for inductive learning on the semantic web. J. Web Semantics 39, 15–24 (2016)
2. Calderoni, L., Magnani, A., Maio, D.: IoT Manager: An open-source IoT framework for smart cities. J. Syst. Architect. 98, 413–423 (2019). https://doi.org/10.1016/j.sysarc.2019.04.003. https://www.sciencedirect.com/science/article/pii/S1383762118306520
3. Cohn, A.G., Bennett, B., Gooday, J., Gotts, M.M.: Qualitative spatial representation and reasoning with the region connection calculus. GeoInformatica 1(3), 275–316 (1997)
4. Costa, D.G., Damasceno, A., Silva, I.: CitySpeed: a crowdsensing-based integrated platform for general-purpose monitoring of vehicular speeds in smart cities. Smart Cities 2(1), 46–65 (2019). https://doi.org/10.3390/smartcities2010004. https://www.mdpi.com/2624-6511/2/1/4
5. Delmastro, F., Arnaboldi, V., Conti, M.: People-centric computing and communications in smart cities. IEEE Commun. Mag. 54(7), 122–128 (2016). https://doi.org/10.1109/MCOM.2016.7509389
6. Doran, D., Gokhale, S., Dagnino, A.: Human sensing for smart cities. In: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 1323–1330. ASONAM ’13, Niagara, Ontario, Canada. ACM, New York, NY, USA (2013). https://doi.org/10.1145/2492517.2500240
7. Frey, J., Hofer, M., Obraczka, D., Lehmann, J., Hellmann, S.: DBpedia FlexiFusion the Best of Wikipedia > Wikidata > Your Data. In: Ghidini, C., Hartig, O., Maleshkova, M., Sviatek, V., Cruz, I., Hogan, A., Song, J., Lefrancois, M., Gandon, F. (eds.) The Semantic Web - ISWC 2019, pp. 96–112. Lecture Notes in Computer Science, Springer International Publishing, Cham (2019). https://doi.org/10.1007/978-3-030-30796-7_7
8. Garcia-Retuerta, D., Chamoso, P., Hernández, G., Guzmán, A.S.R., Yigitcanlar, T., Corchado, J.M.: An efficient management platform for developing smart cities: solution for real-time and future crowd detection. Electronics 10(7), 765 (2021). https://doi.org/10.3390/electronics10070765. https://www.mdpi.com/2079-9292/10/7/765
9. Gooch, D., Wolff, A., Kortuem, G., Brown, R.: Reimagining the role of citizens in smart city projects. In: Adjunct Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2015 ACM International Symposium on Wearable Computers, pp. 1587–1594. UbiComp/ISWC’15 Adjunct, Osaka, Japan. ACM, New York, NY, USA (2015). https://doi.org/10.1145/2800835.2801622
10. Gopinath, B., Kaliamoorthy, M., Ragupathy, U.S., Sudha, R., Nandini, D.U., Maheswar, R.: State-of-the-art and emerging trends in internet of things for smart cities. In: Maheswar, R., Balasaraswathi, M., Rastogi, R., Sampathkumar, A., Kanagachidambaresan, G.R. (eds.) Challenges and Solutions for Sustainable Smart City Development, pp. 263–274. EAI/Springer Innovations in Communication and Computing, Springer International Publishing, Cham (2021). https://doi.org/10.1007/978-3-030-70183-3_12
11. Gutiérrez, V., Amaxilatis, D., Mylonas, G., Muñoz, L.: Empowering citizens toward the co-creation of sustainable cities. IEEE Internet Things J. 5(2), 668–676 (2018). https://doi.org/10.1109/JIOT.2017.2743783
12. Kong, X., Liu, X., Jedari, B., Li, M., Wan, L., Xia, F.: Mobile crowdsourcing in smart cities: technologies, applications, and future challenges. IEEE Internet Things J. 6(5), 8095–8113 (2019). https://doi.org/10.1109/JIOT.2019.2921879
13. Kuru, K., Ansell, D.: TCitySmartF: a comprehensive systematic framework for transforming cities into smart cities. IEEE Access 8, 18615–18644 (2020). https://doi.org/10.1109/ACCESS.2020.2967777
14. Liu, J., Shen, H., Narman, H.S., Chung, W., Lin, Z.: A survey of mobile crowdsensing techniques: A critical component for the internet of things. ACM Trans. Cyber-Phys. Syst. 2(3) (2018). https://doi.org/10.1145/3185504
15. Lécué, F., Tallevi-Diotallevi, S., Hayes, J., Tucker, R., Bicer, V., Sbodio, M., Tommasi, P.: Smart traffic analytics in the semantic web with STAR-CITY: Scenarios, system and lessons learned in Dublin City. J. Web Semantics 27–28, 26–33 (2014). https://doi.org/10.1016/j.websem.2014.07.002. http://www.sciencedirect.com/science/article/pii/S157082681400050X
16. Maddalena, E., Ibáñez, L.D., Simperl, E.: Mapping points of interest through street view imagery and paid crowdsourcing. ACM Trans. Intell. Syst. Technol. 11(5), 1–28 (2020). https://doi.org/10.1145/3403931
17. Maddalena, E., Ibáñez, L.D., Simperl, E., Gomer, R., Zeni, M., Song, D., Giunchiglia, F.: Hybrid human machine workflows for mobility management. In: Companion Proceedings of The 2019 World Wide Web Conference, pp. 102–109. WWW ’19. ACM, New York, NY, USA (2019). https://doi.org/10.1145/3308560.3317056
18. Malone, T., Laubacher, R., Dellarocas, C.: The collective intelligence genome. IEEE Eng. Manage. Rev. 38(3), 38–52 (2010). https://doi.org/10.1109/EMR.2010.5559142. http://ieeexplore.ieee.org/document/5559142/
19. Olariu, S.: Vehicular crowdsourcing for congestion support in smart cities. Smart Cities 4(2), 662–685 (2021). https://doi.org/10.3390/smartcities4020034
20. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, È.: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
21. Pfisterer, D., Romer, K., Bimschas, D., Kleine, O., Mietz, R., Truong, C., Hasemann, H., Kröller, A., Pagel, M., Hauswirth, M., Karnstedt, M., Leggieri, M., Passant, A., Richardson, R.: SPITFIRE: toward a semantic web of things. IEEE Commun. Mag. 49(11), 40–48 (2011). https://doi.org/10.1109/MCOM.2011.6069708
22. Puiu, D., Barnaghi, P., Tönjes, R., Kümper, D., Ali, M.I., Mileo, A., Parreira, J.X., Fischer, M., Kolozali, S., Farajidavar, N., Gao, F., Iggena, T., Pham, T., Nechifor, C., Puschmann, D., Fernandes, J.: CityPulse: large scale data analytics framework for smart cities. IEEE Access 4, 1086–1108 (2016). https://doi.org/10.1109/ACCESS.2016.2541999
23. Quinn, A.J., Bederson, B.B.: Human computation: a survey and taxonomy of a growing field. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1403–1412. CHI ’11. ACM, New York, NY, USA (2011). https://doi.org/10.1145/1978942.1979148
24. Santos, F.A., Silva, T.H., Braun, T., Loureiro, A.A.F., Villas, L.A.: Towards a sustainable people-centric sensing. In: 2017 IEEE International Conference on Communications (ICC), pp. 1–6 (2017). https://doi.org/10.1109/ICC.2017.7997223
25. Smart, P., Simperl, E., Shadbolt, N.: A Taxonomic framework for social machines. In: Miorandi, D., Maltese, V., Rovatsos, M., Nijholt, A., Stewart, J. (eds.) Social Collective Intelligence: Combining the Powers of Humans and Machines to Build a Smarter Society, pp. 51–85. Computational Social Sciences, Springer International Publishing, Cham (2014)
26. Stadler, C., Lehmann, J., Höffner, K., Auer, S.: Linkedgeodata: a core for a web of spatial open data. Semantic Web J. 3(4), 333–354 (2012)
27. Sánchez, L., Gutiérrez, V., Galache, J.A., Sotres, P., Santana, J.R., Casanueva, J., Muñoz, L.: SmartSantander: experimentation and service provision in the smart city. In: 2013 16th International Symposium on Wireless Personal Multimedia Communications (WPMC), pp. 1–6 (2013)
28. Vakali, A., Dematis, I., Tolikas, A.: Vol4all: A volunteering platform to drive innovation and citizens empowerment. In: Proceedings of the 26th International Conference on World Wide Web Companion, pp. 1173–1178. WWW ’17, Perth, Australia. Companion, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland (2017). https://doi.org/10.1145/3041021.3054712
29. Zeni, M., Zaihrayeu, I., Giunchiglia, F.: Multi-device activity logging. In: Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication, pp. 299–302. UbiComp ’14 Adjunct, ACM, New York, NY, USA (2014). https://doi.org/10.1145/2638728.2638756

Estimation of Short-Time Forecast for Covid-19 Outbreak in India: State-Wise Prediction and Analysis Puneet Bawa, Virender Kadyan, Anupam Singh, Kayhan Zrar Ghafoor, and Pradeep Kumar Singh

Abstract The Covid-19 pandemic is a major concern that strongly affects the population and the growth of every country. India tried to manage the outbreak through a lockdown followed by controlled relaxation based on a zonal distribution strategy. A proper estimate of the outbreak is needed, as it helps in arranging adequate healthcare facilities in the different states of the country. India is highly diverse across its states, and the effects of temperature and population density are two key parameters that have been poorly studied at the state level. In this paper, we forecast the number of Covid-19 cases (8 Jan 2020 to 25 April 2020) using a Kalman filter at state and national levels to generate trends and patterns. Our analysis is evaluated on four classes of states: most affected, moderately affected, least affected and pandemic free states. The results take into account the temperature parameters (historical and forecast data) of each state. The national level estimates are further compared with those of other countries, namely the United States of America, Spain, France, Italy and Germany, in terms of confirmed, recovered and death cases. Under the current lockdown, our estimate shows that India should expect as many as 60,140 cases by May 24, 2020. The trends show that India has benefited from the lockdown decisions, although the measures failed in some regions because of social activities, high population density and temperature variation. This study will help state-level bodies to manage health care resources between and within states and to plan their administrative functions accordingly.

Keywords COVID-19 · Coronavirus · SARS CoV-2 · Predictive analysis · India

P. Bawa
Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India
e-mail: [email protected]

V. Kadyan · A. Singh (B)
Department of Informatics, School of Computer Science, University of Petroleum and Energy Studies (UPES), Energy Acres, Bidholi, Dehradun, Uttarakhand 248007, India
e-mail: [email protected]

V. Kadyan
e-mail: [email protected]

K. Z. Ghafoor
Department of Computer Science, Knowledge University, University Park, Kirkuk Road, 44001 Erbil, Iraq
e-mail: [email protected]

P. K. Singh
Narsee Monjee Institute of Management Studies (NMIMS), School of Technology Management and Engineering, Chandigarh Campus, India

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. K. Singh et al. (eds.), Sustainable Smart Cities, Studies in Computational Intelligence 942, https://doi.org/10.1007/978-3-031-08815-5_17

1 Introduction

In December 2019, Chinese health authorities reported the spread of unexplained pneumonia cases in their region. These cases were later confirmed to be an acute respiratory infectious disease, formally named Covid-19 by the WHO. It is caused by the novel coronavirus (SARS-CoV-2), which is believed to have originated in the live animal and seafood market of Wuhan [1]. Coronaviruses belong to a large family of viruses that are common in humans and in many animal species, including cats and bats [2]. In the following months, the infection spread to a large number of countries around the world, with case counts multiplying from day to day. In India, the first case of this deadly coronavirus was traced in the southern state of Kerala on 31st January 2020, about a month after the Wuhan outbreak [3]. The Indian government then imposed a 21-day lockdown (first phase) on the approximately 1341 million people of India on March 24, 2020, and later extended it with a second phase until 3rd May, with some relaxation for a few sectors such as agriculture and food transportation. The outbreak is divided into four stages of transmission in a dense population: (1) imported cases, (2) local transmission, (3) community transmission, and (4) widespread outbreak [4]. Most earlier prediction models for India failed to predict Covid-19 because they considered only a hot tropical climate. The weather conditions of the northern states and the more continental climate of East and Southern India were not taken into consideration, which affected their output. In temperate regions, the winter peak in Covid-19 cases is well characterized and even predictable [5, 6].

Since the outbreak of the pandemic, numerous mathematical algorithms and models have been studied to predict the rise in the number of cases in different countries and their regions. Some of these models forecast the intensity of the pandemic accurately in relation to temperature variations [7, 8]. However, differences in scale, population density and the unpredictability of weather conditions have not yet been considered for a thorough and reliable forecast of the outbreak in India and its most affected states. Although it remains hard to foresee the future course of the pandemic curve, such a forecast is one of the urgent needs of the public healthcare system in the country.


The daily sequence of observed Covid-19 case counts calls for a mathematical algorithm that can reduce the uncertainty inherent in continuous estimation. In this paper, the outbreak in all states of the country is therefore predicted with the Kalman filter, which combines state estimation and filtering. Seasonal patterns favour the spread of many such infections, although betacoronaviruses such as MERS-CoV and SARS-CoV are not thought to be seasonal [9]. We have therefore incorporated temperature dynamics, together with state latitude and longitude, into the Kalman prediction algorithm for a better predictive analysis of the epidemiological system model. The model also assumes a simulated scenario for human interaction based on the population density of a particular region of the state. Assimilating seasonal and population data in this way helps to improve the model predictions over short time scales. The improved predictions also help to understand the seasonal regularity of coronavirus cases in the states, which are divided into (i) most affected states, (ii) moderately affected states, (iii) least affected states, and (iv) pandemic free states, and thus to plan the required aid and facilities. They also help the public healthcare system to strengthen its surveillance. The short-term character of these forecasts is believed to make them more reliable.

The remainder of this study is organized as follows. Sect. 2 reviews the state of the art on Covid-19 prediction using the different mathematical models applied across the world. Section 3 presents the theoretical background of this study. Sections 4, 5 and 6 discuss the data pre-processing and the result analysis, along with the findings of the implemented prediction model. Finally, Sect. 7 provides a conclusion with future observations.

2 Related Work

There is a need for prediction models that use data science and machine learning algorithms to help control the rise in pandemic cases. Victor [10] tried to predict the infection rate by deriving a disease-free equilibrium for 129 countries, using the exposed and infected populations as parameters. However, these aspects may change with the topography of a particular region and are also affected by the measures adopted by individual governments. A detailed analysis of such factors in individual countries is therefore essential to improve the accuracy of the prediction models. Following this approach, Gao et al. [11] performed a Boltzmann function based analysis of the Covid-19 outbreak in China. The model forecast 3260 cumulative deaths, of which 2550 were predicted for Wuhan, the epicentre in China. Magal et al. [12] modelled the number of reported cases and the impact of asymptomatic cases in Italy, South Korea, France and Germany. Their model takes reported and unreported cases into account and incorporates various social distancing measures through a time-dependent transmission rate. Similarly, Dhanwant and Ramanathan [13] tuned an SIR model to the actual situation to predict the Covid-19 pandemic in India. The authors projected a rise in the number of cases despite the strict lockdown measures of the government and estimated a higher number of cases than the reported numbers in the near future. Wang [14], on the other hand, applied the piecewise Crow-AMSAA method rather than traditional epidemiological models [15, 16]. The author predicted the growth of the pandemic from daily data reported in New York City, the epicentre in the United States of America, and in many other countries.

Based on a detailed analysis of respiratory viruses, Pica et al. [17] presented the effects of environmental factors on the direct and indirect human transmission of viruses that infect the respiratory tract. The researchers highlighted how environmental determinants, alongside droplet sprays and aerosols, influence the spread, which relates to the rate of direct or indirect transmission in humans. Meteorological parameters are also believed to be among the driving factors with a large influence on infectious diseases such as severe acute respiratory syndrome (SARS). Ma [18] therefore used a generalized additive model to explore the impact of humidity and temperature on the mortality rate of COVID-19. Based on data from the Wuhan Health Commission, the author showed a positive association of daily mortality with the diurnal temperature range (DTR) and a negative association with humidity.

3 Theoretical Background

Tracking and prediction tasks are commonly solved with the Kalman filter [19]. The filter is well suited to systems with uncertain information and continuously varying data. It addresses the general problem of estimating the state $Q \in \mathbb{R}^{n}$ of a discrete-time controlled process governed by the linear stochastic difference equation

$$Q_i = A\,Q_{i-1} + B\,u_i + w_{i-1}, \tag{1}$$

with a measurement $Q' \in \mathbb{R}^{m}$ given by

$$Q'_i = H\,Q_i + v_i. \tag{2}$$


The random variables $v_i$ and $w_{i-1}$ in Eqs. (1) and (2) represent the measurement noise and the process noise, respectively. The uncertainty of the estimate is described by the state covariance matrix:

$$P' = H\,P\,H^{T} + N \tag{3}$$

where $N$ is the process noise covariance matrix, associated with the noise term $w_i$ in Eq. (1). It keeps the state covariance matrix from becoming too small or going to zero. The Kalman filter rests on two assumptions: Gaussian distributions and linear functions. Real-world problems, however, often involve nonlinear functions; for example, moving in one direction while taking measurements in another involves angles and sine and cosine functions. In such circumstances the extended Kalman filter (EKF) is used: the nonlinear function is evaluated at the mean of the Gaussian and then approximated by a first order Taylor expansion [20]. The transformation from the nonlinear space to a linear space is performed with a Jacobian matrix $H_j$, which is applied in the update step. The estimation and update process otherwise continues as before for the transition and measurement functions, since only the transformation matrix $H$ changes to the Jacobian matrix $H_j$.
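The predict and update cycle described above can be sketched for a scalar state. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `kalman_step` and `filter_series`, the measurement noise variance `r`, the default parameter values and the example series are all introduced here for illustration, and the variance is propagated with the transition coefficient `a`, as in the textbook form of the filter.

```python
def kalman_step(q, p, z, a=1.0, b=0.0, u=0.0, h=1.0, n=1e-3, r=1.0):
    """One predict/update cycle of a scalar Kalman filter.

    q, p    : previous state estimate and its variance
    z       : new measurement (Q' in Eq. (2))
    a, b, h : scalar stand-ins for A, B and H in Eqs. (1) and (2)
    u       : control input
    n       : process noise variance (N in Eq. (3))
    r       : measurement noise variance (assumed name, variance of v_i)
    """
    # Predict: propagate the state with Eq. (1), leaving out the noise term.
    q_pred = a * q + b * u
    # Predict: propagate the variance and add the process noise.
    p_pred = a * p * a + n
    # Update: weight the measurement by the Kalman gain.
    gain = p_pred * h / (h * p_pred * h + r)
    q_new = q_pred + gain * (z - h * q_pred)
    p_new = (1.0 - gain * h) * p_pred
    return q_new, p_new


def filter_series(measurements, q0, p0, **params):
    """Run the filter over a sequence and return the filtered estimates."""
    q, p = q0, p0
    estimates = []
    for z in measurements:
        q, p = kalman_step(q, p, z, **params)
        estimates.append(q)
    return estimates


if __name__ == "__main__":
    # Hypothetical daily counts, used only to exercise the functions.
    daily_counts = [500.0, 560.0, 610.0, 690.0]
    print(filter_series(daily_counts, q0=daily_counts[0], p0=100.0, r=25.0))
```

With `a = 1` the one-step-ahead prediction is simply the latest filtered estimate; a trend term would require a state with more than one component, which the scalar sketch leaves out.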

4 Data Pre-processing

Initially, the states were divided into four categories: most affected, moderately affected, least affected and pandemic free states. The states were categorized on the basis of their confirmed cases as of 25 April 2020. Key parameters such as population and temperature were then taken into consideration.

4.1 Population

India is the second largest nation in the world by population and the seventh largest by land area. The population is not uniformly distributed among the states: some places are very densely populated, while a few states have a smaller population and more land. Rajasthan is the largest state in the country, with 342,239 sq. km of land but a comparatively small population of 74 million people. Uttar Pradesh, in contrast, is one of the most densely populated states, with 205 million people living in 240,928 sq. km; its area is smaller and its population nearly three times that of Rajasthan [21]. Many areas in the country present such varied situations, which makes it difficult to account for the transmission network of Covid-19. In this paper, we gathered state-wise population data from the November 2019 report of the Technical Group on Population Projections [22] so that the model can make uniform predictions with respect to the states and their populations. Population density is one of the major factors behind the rise of cases in certain regions, which is studied in the following sections.
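The contrast between the two states can be made concrete with a short calculation. The sketch below uses only the population and area figures quoted above; the function name `population_density` is introduced here for illustration.

```python
def population_density(population, area_sq_km):
    """Return inhabitants per square kilometre."""
    if area_sq_km <= 0:
        raise ValueError("area must be positive")
    return population / area_sq_km


# Figures quoted in the text for Rajasthan and Uttar Pradesh.
rajasthan = population_density(74_000_000, 342_239)
uttar_pradesh = population_density(205_000_000, 240_928)

print(round(rajasthan), round(uttar_pradesh))  # 216 851
print(round(uttar_pradesh / rajasthan, 1))     # 3.9
```

Although the population of Uttar Pradesh is a little under three times that of Rajasthan, its density is close to four times higher, which is why density rather than raw population is the more informative parameter for a transmission model.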

4.2 Weather

India contains desert in its western region, glaciers in the north, alpine tundra, and moist tropical areas supporting rainforests in the southwest and on its surrounding islands. Studying the weather conditions of such diverse geography is therefore both challenging and interesting. The climate is classified with the Köppen system [23], which divides it into six primary climate types based on vegetation. In addition, the regions of India contain very different kinds of microclimates. State-wise temperature variation therefore needs to be considered for an adequate forecast of the pandemic. The weather data for the time series prediction were extracted through a Python wrapper of Weatherbit IO [24] and are classified into two types: (i) historical data and (ii) forecast data. For the historical data, the minimum and maximum temperatures of every state were extracted starting from the date of the first reported case. The forecast data likewise comprise observational data, obtained by setting the corresponding request parameter and specifying the location by latitude and longitude for the future period for which data is required.

5 Result Analysis

5.1 Time-Series Plots

The raw sample data are plotted in three forms: confirmed (C), recovered (R) and death (D) cases. These plots serve as valuable diagnostics for identifying trends in the Covid-19 pandemic. The COVID-19 statistics were collected up to 25 April through the COVID19-India API [25]. State level information was extracted on the basis of latitude and longitude. The cases were then evaluated and visualized in these three categories using the Kalman filter prediction model.

5.1.1 Representation of State-Wise Historical Findings

The Indian government announced its first phase of lockdown on 24 March 2020. This lockdown lasted 21 days and restricted all major activities, allowing only emergency services. During this period, a sharp growth of confirmed cases was observed in some states. Confirmed cases stood at 501 on 23 March 2020 and 565 on 24 March 2020, and increased afterwards. To study the impact of the lockdown further, a trend analysis was carried out for four categories of states.

(i) Most affected states: The ten most affected states, each with approximately more than 1000 confirmed cases, are shown in Fig. 1a. This information was evaluated from the time series plots. The corresponding R and D cases are shown in Figs. 1b and 1c.

(ii) Moderately affected states: Nine moderately affected states with a medium rise in confirmed (C) cases are shown in Fig. 2a. These states were selected on the basis of their current cases, which were approximately greater than or equal to 100. The corresponding plots for the R and D cases are shown in Figs. 2b and 2c.

Fig. 1a The number of confirmed Covid-19 cases till 25 April 2020 in most affected states

Fig. 1b The number of recovered Covid-19 cases till 25 April 2020 in most affected states


Fig. 1c The number of death cases of Covid-19 till 25 April 2020 in most affected states

Fig. 2a The number of confirmed (C) Covid-19 cases till 25 April 2020 in moderate affected states

Fig. 2b The number of recovered (R) Covid-19 cases till 25 April 2020 in moderate affected states

Fig. 2c The number of death cases of Covid-19 till 25 April 2020 in moderate affected states


(iii) Least affected states: Eight least affected states with minor fluctuations in their confirmed cases are shown in Fig. 3a and were considered for projection on time series plots. The corresponding plots for the R and D cases are given in Figs. 3b and 3c.

(iv) Pandemic free states: States where C ≥ 1 and C = R + D as of 25 April were considered pandemic free states and are listed in Table 1. The corresponding time series plot of their C cases is shown in Fig. 4. The selected states, Goa, Manipur, Tripura, Arunachal Pradesh and Mizoram, did not report any cases in recent days and have so far been declared pandemic free. The predictions for these states remain null for expected confirmed, recovered and death cases, as listed in Table 1.

Fig. 3a The number of confirmed (C) Covid-19 cases till 25 April 2020 in least affected states

Fig. 3b The number of recovered (R) Covid-19 cases till 25 April 2020 in least affected states

Fig. 3c The number of death cases of Covid-19 till 25 April 2020 in least affected states

Table 1 Zero growth rate of Covid-19 cases in pandemic free states

State              26-Apr  03-May  10-May  17-May  24-May
Goa                     7       7       7       7       7
Manipur                 2       2       2       2       2
Tripura                 2       2       2       2       2
Arunachal Pradesh       1       1       1       1       1
Mizoram                 1       1       1       1       1

Fig. 4 The number of confirmed cases of Covid-19 till 25 April 2020 in pandemic free states

From the above analysis, five states of India are expected to remain infection free zones in the upcoming month, with zero reporting of new Covid-19 cases. Other states, on the other hand, were struggling with very high spikes as per current findings.
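The four-way categorization described in this section can be summarised as a small rule set. The sketch below is an illustration only: the text gives the thresholds as approximate, and the authors also place individual states by judgment, so the function name `classify_state`, the exact cut-offs of 1000 and 100, the extra label for states without any reported case, and the last two example inputs are assumptions made here.

```python
def classify_state(confirmed, recovered, deaths,
                   most_threshold=1000, moderate_threshold=100):
    """Assign a state to one of the categories used in the analysis.

    The thresholds are the approximate values quoted in the text and
    can be overridden; they are not exact cut-offs from the authors.
    """
    if confirmed == 0:
        # Not one of the four categories; added for completeness.
        return "no reported cases"
    if confirmed == recovered + deaths:
        # C >= 1 and C = R + D: every confirmed case is closed.
        return "pandemic free"
    if confirmed > most_threshold:
        return "most affected"
    if confirmed >= moderate_threshold:
        return "moderately affected"
    return "least affected"


# 26 April values for Maharashtra and Jammu and Kashmir from Tables 2a-2c
# and 3a-3c, followed by two hypothetical inputs.
print(classify_state(8054, 1149, 332))  # most affected
print(classify_state(519, 122, 6))      # moderately affected
print(classify_state(50, 10, 1))        # least affected
print(classify_state(7, 7, 0))          # pandemic free
```

Checking the pandemic free condition before the thresholds matters: a state with many confirmed cases that are all closed would otherwise be labelled by its case count alone.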

6 Results

6.1 Short Term Predictions and Their Analysis

The Kalman filter was used for one-day predictions, whereas a linear model with additional features was employed for long-term predictions of the confirmed cases. The model was trained on features that combine the Kalman predictions, the infection spread relative to the population of the states, time-dependent (time series) features, and weather history and forecasts. The corresponding case-fatality ratio (CFR) was computed from the numbers of deaths $D$ and recoveries $R$ as:

$$\mathrm{CFR} = \frac{D}{D + R} \tag{4}$$
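Eq. (4) can be checked against the projected values reported in this section. The sketch below is a plain transcription of the formula; the function name `case_fatality_ratio` is introduced here, and the inputs are the 24 May projections for Gujarat and Delhi from Tables 2b and 2c.

```python
def case_fatality_ratio(deaths, recovered):
    """Case-fatality ratio over closed cases, as in Eq. (4)."""
    closed = deaths + recovered
    if closed == 0:
        raise ValueError("CFR is undefined when there are no closed cases")
    return deaths / closed


# 24 May projections: (deaths, recovered) from Tables 2c and 2b.
gujarat = case_fatality_ratio(399, 776)
delhi = case_fatality_ratio(128, 2625)

print(f"Gujarat: {100 * gujarat:.3f}%")
print(f"Delhi:   {100 * delhi:.3f}%")
```

Because the denominator counts only closed cases (deaths plus recoveries), this ratio is sensitive to how quickly recoveries are reported, which is worth keeping in mind when comparing states.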


Table 2a Doubling growth rate of confirmed Covid-19 cases by 24 May 2020 in lockdown situation with respect to most affected states

State            26-Apr  03-May  10-May  17-May  24-May
Maharashtra        8054  10,982  13,801  18,210  21,324
Gujarat            3315    4382    5691    6658    8356
Delhi              2716    3220    3512    3964    4667
Rajasthan          2218    2502    2818    3242    3891
Madhya Pradesh     2068    2357    3101    3345    4124
Uttar Pradesh      1868    2418    2721    2974    3245
Tamil Nadu         1878    2521    3078    3412    3921
Andhra Pradesh     1081    1463    1838    1969    2442
Telangana          1021    1102    1174    1411    1689
West Bengal         602     690     734     912    1211
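The caption of Table 2a refers to a doubling growth rate, which the chapter does not define further. One common reading is the doubling time implied by two projections; the sketch below assumes exponential growth between the first and last projection date and is offered as an illustration rather than as the authors' calculation. The function name `doubling_time` is introduced here.

```python
import math


def doubling_time(first, last, days):
    """Days needed for cases to double, assuming exponential growth
    from `first` to `last` over a span of `days` days."""
    if first <= 0 or last <= first:
        raise ValueError("cases must be positive and increasing")
    return days * math.log(2) / math.log(last / first)


# Maharashtra and Delhi projections for 26 April and 24 May (28 days apart).
print(doubling_time(8054, 21324, 28))
print(doubling_time(2716, 4667, 28))
```

A shorter doubling time indicates faster projected growth, so the same helper can be used to rank the states in the table.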

The state-wise analysis using forecasts was performed for the four categories mentioned above, which helped to generate trends and patterns for the different regions.

(i) Most Affected States

Maharashtra and Gujarat: Maharashtra and Gujarat were the two main focus states in India. They showed a steep rise in the coronavirus crisis, with forecasts of 21,324 and 8356 cases by 24 May 2020 as listed in Table 2a. Media reports have pointed to community transmission in the small region of Dharavi in Mumbai, Maharashtra [26], and in the city of Ahmedabad in Gujarat [27]. Compared with other parts of these states, these regions are densely populated. This factor promotes strong network based transmission among the local population, making around a lakh (100,000) anticipated Covid-19 cases likely in the near future. These regions have yet to be considered for precise prediction, which would require testing the entire population of the region. Based on the obtained predictions, the expected CFR values evaluated using Eq. (4) were 21.36% and 33.95% for Maharashtra and Gujarat respectively, based on the projected recoveries in Table 2b and the projected deaths in Table 2c.

NCT of Delhi: Delhi is one of the largest Union Territories and the capital of India. The NCT of Delhi was projected to see one of the largest coronavirus spikes because of a social gathering of more than 8000 people [28]. Although the earlier spike was large, the projection, which ranges from 2716 cases on 26 April to 4667 cases on 24 May, is more favourable owing to the effect of rising temperature. This can be attributed to a slower spread of cases among the highly dense population. Despite the rising number of cases, the projected CFR was 4.65%, far lower than in Maharashtra and Gujarat, based on the projected recoveries in Table 2b and the projected deaths in Table 2c.


Table 2b Growth rate of recovered Covid-19 cases by 24 May 2020 in lockdown situation with respect to most affected states

State            26-Apr  03-May  10-May  17-May  24-May
Maharashtra        1149    1390    1663    1879    2174
Gujarat             316     457     564     689     776
Delhi              1032    1650    2118    2415    2625
Rajasthan           571     880    1100    1282    1315
Madhya Pradesh      278     383     478     551     633
Uttar Pradesh       275     480     673     872     941
Tamil Nadu         1040    1505    1764    1998    2311
Andhra Pradesh      186     281     329     358     385
Telangana           317     431     481     561     601
West Bengal         110     142     158     181     214

Table 2c Growth rate of death Covid-19 cases by 24 May 2020 in lockdown situation with respect to most affected states

State            26-Apr  03-May  10-May  17-May  24-May
Maharashtra         332     404     452     526     600
Gujarat             143     199     249     327     399
Delhi                56      70      87     109     128
Rajasthan            36      44      49      56      60
Madhya Pradesh      101     123     147     170     203
Uttar Pradesh        29      35      42      50      52
Tamil Nadu           24      31      42      51      55
Andhra Pradesh       33      43      50      53      63
Telangana            26      28      33      37      43
West Bengal          29      47      70      80      95

Rajasthan: Rajasthan, as mentioned, is one of the largest states, with varying topography; most of the state is dominated by arid, dry regions. Some regions, namely Jaipur, Ajmer, Bhilwara and Tonk, have a high population density. Owing to regional network transmission, the cases were projected to range from 2218 to 3891 by 24 May 2020. The slow spread is attributed to the dry regions and the higher temperatures in the state. Because of the slow spread, the projected CFR was around 4.36%, lower than that of Delhi despite Rajasthan being the largest state by area, based on the projected recoveries in Table 2b and the projected deaths in Table 2c.


Madhya Pradesh: The temperature in Madhya Pradesh during summer, i.e. from March to June, stays above approximately 29.4 °C. The eastern part of the state is believed to be somewhat hotter than the western parts. Regions such as Gwalior, Morena and Datia record temperatures of over 42 °C in May. Higher temperatures can be related to a slower spread, but the projections showed a consistently high spike in coronavirus cases. This was attributed to a lack of testing or a slow testing process. Late reporting of cases or inadequate medical facilities in the state might explain the projected CFR of 24.28%, based on the projected recoveries in Table 2b and the projected deaths in Table 2c.

Uttar Pradesh: The sudden rise projected in the highly populated state of Uttar Pradesh can be seen as a result of the large-scale movement of migrant labourers. Migrant labourers or unskilled workers move from one region to another, offering their services on a temporary or seasonal basis, which ties them to some degree of social and economic development. Rising temperature led to a projection of slow spread, ranging from 1868 to 3245 cases by 24 May 2020 as listed in Table 2a. The rise can be attributed mainly to the large number of people coming from the most affected states such as Maharashtra. Despite the migrant crisis in the state, a low CFR of 5.33% was projected, owing to the strict actions of the Government, even though it is the largest state in terms of population. The projected recoveries are listed in Table 2b and the projected deaths in Table 2c.

Tamil Nadu, Andhra Pradesh, and Telangana: These southern regions of the country have dense forests, and the states of Tamil Nadu, Andhra Pradesh and Telangana are a mixture of hills and plains. These variations contribute to temperature differences, which may lead to high variation in the projected confirmed cases listed in Table 2a. The populations of these states also have more travel history. Moreover, the large rise in these states was due to the movement of people who attended social gathering events. Tamil Nadu reported more cases linked to such movement and was projected to rise from 1878 cases on 26 April 2020 to 3921 cases on 24 May 2020. The projected CFR of 14.06% remained relatively high for Andhra Pradesh in comparison with 6.67% in Telangana and 2.32% in Tamil Nadu. The projected recoveries are listed in Table 2b and the projected deaths in Table 2c.

West Bengal: Although the cases were fewer in the state, we decided to include West Bengal in the list of highly affected states. The historical time series in Fig. 1a reflects non-uniform reporting of the data related to the spread. The projection therefore shows disorganised trends, and such projections are often not found to be true. A large rise in the number of cases is expected in the near future because of the large number of people residing in the state's 88,752 sq. km area. Although the structure of the data was not clear, the state was expected to have a higher CFR than the projected value of 30.74%, based on the projected recoveries in Table 2b and the projected deaths in Table 2c.

Table 3a Growth rate of confirmed Covid-19 cases by 24 May 2020 in lockdown situation with respect to moderate affected states

State              26-Apr  03-May  10-May  17-May  24-May
Jammu and Kashmir     519     671     775     914    1182
Karnataka             512     610     706     909    1104
Kerala                469     490     512     540     608
Punjab                318     376     454     528     582
Haryana               291     328     342     381     408
Bihar                 262     420     502     568     612
Odisha                105     132     157     203     268
Jharkhand              74      94     106     119     138

(ii)

Moderate Affected States Jammu and Kashmir: The union territory of Jammu and Kashmir is located in the vicinity of Karakoram and western most mountain ranges of Himalaya. The major regions of the area remain very cold and thus the longer spread of the virus rises which ranged from 519 to 1882 cases on 24 May as listed in Table 3a was expected. Moreover, the non-uniform structure of the data indicated that the situation was similar to that of West Bengal. Therefore, the large rise in the number of cases in the state of Jammu and Kashmir may be visible in the near future along with consideration of temporal effect. There may have been the chances of large transmission being reported in the state [29] and such cases were not considered yet in our prediction analysis. The projected CFR value of 7.84% was expected but will move into the higher directions due to unstructured form of data. Karnataka: The topography of Karnataka comprises of the chain of high mountains, residual hills, plateaus and coastal plains. The state will witness a consistent fall in the temperature from 39.5 °C on 26 April to 33 °C on 24 May which certainly showed the least number of rises from 10 May and apparently consistent rise from 706 on 10 May to 1104 on 24 May due to sudden fall in temperature. Though these variations are due to the temperature which may lead to high variations in the projected confirmed cases as listed in Table 3a, also this state has population with more travel history. The CFR on 24 May was projected to 7.79% using Eq. (4) which clearly implied that the strict norms being followed and also highlighted the improved medical facilities in the state. Kerala: Kerala is the southernmost state of the country and it is the state where the first cases of the pandemic appeared. The state has consistently reported the cases and it was the state with most recoveries in the country. Despite the fall of temperature in the state, the state was expected to be not

Estimation of Short-Time Forecast for Covid-19 …

337

Table 3b Growth rate of recovered Covid-19 cases by 24 May 2020 in lockdown situation with respect to moderate affected states

State                26-Apr   03-May   10-May   17-May   24-May
Jammu and Kashmir       122      180      199      226      235
Karnataka               168      195      221      245      272
Kerala                  350      380      402      425      463
Punjab                   87      102      124      141      168
Haryana                 208      266      298      340      404
Bihar                    46       51       59       70       84
Odisha                   36       41       44       45       49
Jharkhand                10       19       22       24       28

much affected by the pandemic, with only (608 − 469) = 139 projected confirmed cases within the next 28 days as listed in Table 3a. The recovery rate in the state was very high and the projected CFR value was 1.27%, among the lowest in the country, as shown in Table 3b.

Punjab and Haryana: The temperature profiles of the two states are quite similar, with the temperature rising from 33 °C on 26 April to 39 °C on 24 May. The cases in Punjab were expected to increase from 318 on 26 April to 582 on 24 May as listed in Table 3a. A larger increase than projected is possible in Punjab because of the large number of travellers who have returned home since the worldwide spread of the pandemic, and because of the movement of people from social events [30]. Delays in reporting cases and inadequate testing might be the reason for our projected CFR value of 14.72% in the state. On the other hand, “the trend of Haryana suggested that it was the only state that would be free from the pandemic by 24 May, with 408 confirmed cases, 404 recovered and 4 deaths, and a CFR value of 0.98%, the lowest in the country, as shown in Tables 3b and 3c”.

Table 3c Low death rate of Covid-19 cases by 24 May 2020 in lockdown situation with respect to moderate affected states

State                26-Apr   03-May   10-May   17-May   24-May
Jammu and Kashmir         6       10       12       18       20
Karnataka                16       18       19       21       23
Kerala                    3        3        4        5        6
Punjab                   17       20       22       24       29
Haryana                   3        3        3        4        4
Bihar                     3        5        8       11       16
Odisha                    1        1        2        2        4
Jharkhand                 3        3        3        4        6
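The CFR values quoted for the moderately affected states can be cross-checked against the 24 May columns of Tables 3b and 3c. Eq. (4) is not reproduced in this part of the chapter, so the sketch below assumes it has the form CFR = deaths / (deaths + recovered) × 100. This assumed form agrees with the quoted percentages to within rounding. The function and variable names are illustrative, and the two unlabelled rows of Table 3b are read as Punjab and Haryana.

```python
# Projected values for 24 May, copied from Table 3b (recovered) and Table 3c (deaths).
recovered_24_may = {
    "Jammu and Kashmir": 235, "Karnataka": 272, "Kerala": 463, "Punjab": 168,
    "Haryana": 404, "Bihar": 84, "Odisha": 49, "Jharkhand": 28,
}
deaths_24_may = {
    "Jammu and Kashmir": 20, "Karnataka": 23, "Kerala": 6, "Punjab": 29,
    "Haryana": 4, "Bihar": 16, "Odisha": 4, "Jharkhand": 6,
}


def cfr_percent(deaths: int, recovered: int) -> float:
    """Deaths as a percentage of closed cases (assumed form of Eq. (4))."""
    closed = deaths + recovered
    if closed == 0:
        raise ValueError("CFR is undefined when there are no closed cases")
    return 100.0 * deaths / closed


cfr_by_state = {
    state: cfr_percent(deaths_24_may[state], recovered_24_may[state])
    for state in recovered_24_may
}

# Print the states from lowest to highest projected CFR.
for state, value in sorted(cfr_by_state.items(), key=lambda item: item[1]):
    print(f"{state:<18}{value:6.2f}%")
```

Sorting by CFR puts Haryana first and Jharkhand last, which matches the ordering discussed in the text.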


P. Bawa et al.

Bihar: Bihar is an eastern state of India adjacent to West Bengal, with a small land area and a very dense population. Moreover, the state is home to a large number of migrant or unskilled labourers hailing from different regions of the country. A sudden spike in cases in the state was therefore highly expected, with a rise from 262 to 612 cases by 24 May, more than in Kerala, as shown in Table 3a. Following this spike, the CFR was projected to rise to 16% in the state, a high value given the population residing in its small area.

Odisha and Jharkhand: The states of Bihar, Odisha and Jharkhand share the same temperature variations, from 33 °C on 26 April to 43.2 °C on 10 May and 39 °C on 24 May. Owing to a sudden rise in cases in recent days, Jharkhand was expected to have 268 cases on 24 May. Odisha, on the other hand, was expected to have fewer cases, with a change of 64 cases within the next 28 days. The corresponding projected CFR values were 7.55% for Odisha and 17.64% for Jharkhand. The value projected for Jharkhand might be due to the unavailability of appropriate medical facilities and the late reporting of cases.

(iii) Least Affected States

Only a few cases have been reported in the states in our least affected category. Of the 8 least affected regions listed in Table 4a, Andaman and Nicobar Islands was the only one where confirmed cases were expected to double, owing to the spike in confirmed cases in recent days. Ladakh, Uttarakhand and Himachal Pradesh are the colder regions of Northern India. These states were therefore projected to have slow recoveries, as listed in Table 4b, with the fewest projected casualties, as listed in Table 4c. A total of 6 deaths was projected in the least affected states, with CFR values below 10%, and none for the remaining states, which have N/A status.

Table 4a Growth rate of confirmed Covid-19 cases by 24 May 2020 in lockdown situation with respect to least affected states

Region                         26-Apr   03-May   10-May   17-May   24-May
Uttarakhand                        50       57       57       62       63
Himachal Pradesh                   40       41       41       42       44
Chattisgarh                        37       38       42       45       48
Assam                              36       36       37       37       38
Andaman and Nicobar Islands        33       42       56       62       71
Chandigarh                         28       30       30       32       32
Ladakh                             20       22       22       24       27
Meghalaya                          12       13       17       18       20
Puducherry                          8        9        9       11       12
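The statement that Andaman and Nicobar Islands is the only least affected region where confirmed cases were expected to double can be checked directly against Table 4a. The sketch below is an illustrative check rather than part of the original analysis. It takes the 26 April and 24 May columns and computes the ratio between them; the variable names and the threshold of 2.0 for "doubling" are assumptions made for the example.

```python
# (26 April, 24 May) projected confirmed cases, copied from Table 4a.
confirmed = {
    "Uttarakhand": (50, 63),
    "Himachal Pradesh": (40, 44),
    "Chattisgarh": (37, 48),
    "Assam": (36, 38),
    "Andaman and Nicobar Islands": (33, 71),
    "Chandigarh": (28, 32),
    "Ladakh": (20, 27),
    "Meghalaya": (12, 20),
    "Puducherry": (8, 12),
}

# Growth factor over the 28-day window: final count divided by initial count.
growth_factor = {
    region: last / first for region, (first, last) in confirmed.items()
}

# Regions whose projected confirmed cases at least double in 28 days.
doubled = [region for region, factor in growth_factor.items() if factor >= 2.0]

for region, factor in sorted(growth_factor.items(), key=lambda item: -item[1]):
    print(f"{region:<28}{factor:5.2f}x")
print("Doubled:", doubled)
```

Meghalaya and Puducherry show the next largest relative growth, but both remain below a factor of two.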


Table 4b Growth rate of recovered Covid-19 cases by 24 May 2020 in lockdown situation with respect to least affected states

Region                         26-Apr   03-May   10-May   17-May   24-May
Uttarakhand                        28       35       41       47       50
Himachal Pradesh                   22       27       28       32       36
Chattisgarh                        33       34       36       42       45
Assam                              21       24       25       28       32
Andaman and Nicobar Islands        11       12       15       24       30
Chandigarh                         15       18       21       24       26
Ladakh                             16       19       21       22       24
Meghalaya                          NA       NA       NA       NA       NA
Puducherry                          5        7        8       10       12

Table 4c Zero–one death rate of Covid-19 cases by 24 May 2020 in lockdown situation with respect to least affected states

Region                         26-Apr   03-May   10-May   17-May   24-May
Uttarakhand                        NA       NA       NA       NA       NA
Himachal Pradesh                    2        2        3        4        4
Chattisgarh                        NA       NA       NA       NA       NA
Assam                               1        1        1        1        1
Andaman and Nicobar Islands        NA       NA       NA       NA       NA
Chandigarh                         NA       NA       NA       NA       NA
Ladakh                             NA       NA       NA       NA       NA
Meghalaya                           1        1        1        1        1
Puducherry                         NA       NA       NA       NA       NA

6.2 Kalman X-days Prediction Discussion for Indian Subcontinents

The mathematical modelling used several parameters (Kalman predictors, the temporal classification of states and population metrics) to accumulate data for the 28-day prediction for India. This accumulation helps in understanding the trajectory of the pandemic over time, together with the state-wise parameters that drive its rise. The confirmed Covid-19 cases shown in Fig. 5a rise towards the trend peak from 50,605 to 60,140 between 17 and 24 May. The total number of recoveries projected in Fig. 5b was 13,946, which is very low compared with the confirmed cases in the country. Likewise, the number of deaths predicted in India rises from 865 on 26 April to 1812 on 24 May as shown in Fig. 5c. Our model predicted 44,382 active cases on 24 May, where the state of Haryana will


Fig. 5a Predicted number of confirmed Covid-19 cases in India for next 28 days

Fig. 5b Predicted number of recovered Covid-19 cases in India for next 28 days

Fig. 5c Predicted number of deaths in Covid-19 cases in India for next 28 days


be declared free from the pandemic only if the lockdown is implemented throughout the country and extended to 24 May. Moreover, India was likely to be in stage 2 (local transmission), as the projected rise in cases is not large relative to the country's population. The CFR for the country, based on the state-level values, was projected to fall from 11.80% on 26 April to 10.64% on 10 May and then rise to 11.50% on 24 May. Variations in temperature might slow the spread of the virus in many states, but the country was still far from being completely free of the pandemic, even with the extension of the lockdown. Relaxing some restrictions in states with fewer than 1000 cases, as shown in Fig. 6, is unlikely to lead to an exponential rise in cases by the end of May 2020. However, any relaxation of the lockdown in states with more than 3000 confirmed cases could have disastrous effects. It could further result in a severe shortage of healthcare facilities, making the outbreak even worse for the country.
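The national figures quoted in this subsection are mutually consistent, which a few lines of arithmetic can confirm. The sketch below assumes that active cases are confirmed cases minus recoveries and deaths, and that the national CFR uses deaths / (deaths + recovered) × 100. Both formulas are assumptions made for the check, since the chapter's own equations are not restated here, and the variable names are illustrative.

```python
# Projected national totals for 24 May, as quoted in the text for Figs. 5a-5c.
confirmed_24_may = 60_140
recovered_24_may = 13_946
deaths_24_may = 1_812

# Active cases: confirmed cases that have neither recovered nor died.
active_24_may = confirmed_24_may - recovered_24_may - deaths_24_may

# CFR over closed cases (assumed form), expressed as a percentage.
cfr_24_may = 100.0 * deaths_24_may / (deaths_24_may + recovered_24_may)

print(f"Active cases on 24 May: {active_24_may:,}")
print(f"CFR on 24 May: {cfr_24_may:.2f}%")
```

Under these assumptions the result matches the 44,382 active cases and the 11.50% CFR reported for 24 May.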

6.3 Comparative Analysis of Different Nations

The comparative analysis showed that the lockdown and other measures taken by the Government of India were very effective and helped to slow transmission. Although social problems, including social gatherings and the migrant labourer crisis, might lead to an increase in the projected numbers, the control achieved was still much better than in the countries most affected by the pandemic. In the USA, the novel coronavirus SARS-CoV-2 was first detected on 19 January 2020 by testing a 35-year-old man, on the fourth day after he returned from Wuhan to his home in Snohomish County, Washington [31]. As of 25 April, the United States had reported 938,154 confirmed cases with 100,372 recoveries and 53,755 deaths. Our study showed that the spike in the US, Spain, Italy, France and Germany was high because of the late implementation of lockdown measures. As a result, the projected numbers are likely to increase in all of these nations by the end of May, as presented in Table 5a, except for Germany. Germany had seen a sharp spike in recoveries, as listed in Table 5b, and fewer deaths relative to confirmed cases than the other countries, as listed in Table 5c. The active cases in Germany are projected to be 27,017 relative to 207,398 confirmed cases by 24 May as listed in Table 5a, which is even less than the number of active cases in India around this date. The lower temperatures in the European countries of Spain, Italy and France might lead to slower recoveries by the end of May. Moreover, a faster spread of the virus might be seen in these countries if the lockdown measures are relaxed or partially lifted (Tables 5b and 5c).


Fig. 6 State-wise projected number of Covid-19 cases on or before 24 May 2020

Table 5a Predicted number of confirmed Covid-19 cases in different nations in next 28 days

Country             26-Apr      03-May      10-May      17-May      24-May
United States      964,516   1,206,913   1,238,401   1,320,699   1,483,408
Spain              227,766      25,921     264,895     294,905     368,487
Italy              198,451     229,870     252,233     299,513     337,297
France             161,894     190,302     196,419     220,147     221,017
Germany            158,719     177,809     194,821     199,924     207,398


Table 5b Predicted number of recovered cases in different nations in next 28 days

Country             26-Apr      03-May      10-May      17-May      24-May
United States      104,382     158,945     183,266     203,755     242,923
Spain               97,842     121,899     145,249     154,321     160,435
Italy               65,613      84,603      99,472     104,946     114,034
France              46,026      50,560      54,429      55,914      58,633
Germany            114,868     124,141     130,951     139,923     168,095

Table 5c Predicted number of expected death rate in different nations for next 28 days

Country             26-Apr      03-May      10-May      17-May      24-May
United States       56,358      74,166      79,550      84,519      92,524
Spain               23,315      26,819      28,706      32,217      39,254
Italy               26,880      29,651      31,902      35,980      38,491
France              23,223      25,723      26,155      28,058      31,018
Germany               6141        7486        8819      10,797      12,286
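The comparison of active cases in Sect. 6.3 can be reproduced from the 24 May columns of Tables 5a, 5b and 5c. The sketch below is illustrative and assumes that active cases are confirmed cases minus recoveries and deaths. It also assumes that the first and last data rows of Table 5b belong to the United States and Germany respectively, a reading of the table layout that reproduces the 27,017 figure quoted for Germany. The variable names are hypothetical.

```python
# 24 May projections per country: (confirmed, recovered, deaths),
# copied from Tables 5a, 5b and 5c respectively.
projections_24_may = {
    "United States": (1_483_408, 242_923, 92_524),
    "Spain": (368_487, 160_435, 39_254),
    "Italy": (337_297, 114_034, 38_491),
    "France": (221_017, 58_633, 31_018),
    "Germany": (207_398, 168_095, 12_286),
}

# Projected active cases in India on 24 May, as stated in Sect. 6.2.
INDIA_ACTIVE_24_MAY = 44_382

# Active cases: confirmed cases that have neither recovered nor died.
active = {
    country: confirmed - recovered - deaths
    for country, (confirmed, recovered, deaths) in projections_24_may.items()
}

for country, count in sorted(active.items(), key=lambda item: item[1]):
    relation = "below" if count < INDIA_ACTIVE_24_MAY else "above"
    print(f"{country:<14}{count:>10,}  ({relation} India's projection)")
```

With these inputs Germany is the only one of the five countries whose projected active cases fall below the projection for India.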

7 Conclusion

This study presented the state-wise impact of Covid-19 pandemic cases in India. A short-term analysis (next 28 days) was carried out using a Kalman filter to generate trend patterns for different regions. The cases are presented in four categories that divide all states and union territories into most affected, moderately affected, least affected and pandemic-free states on the basis of confirmed, recovered and death cases. The current projections indicate an exponential rise in the coming days, which may pose a challenge for people and the government. Note that the most affected states, such as Maharashtra, Gujarat and Madhya Pradesh, require more focused attention than other parts of the country to overcome this situation. These aspects are computed through temperature and population parameters. The rise in cases reported in some states is attributed to migration and social gatherings. To gain more perspective, the current and future projections are also compared with those of countries with late, partial or lifted lockdowns. The comparison showed that the outbreak in India was better controlled, relative to the country's large population, owing to the early implementation of a lockdown. This short-term trend analysis may be useful for planning migration, arranging healthcare resources, administrative operations and the lifting of the lockdown in the country. It can also help to scale up testing in parts of the country where the situation may become alarming in the near future.

Acknowledgements We heartily thank the Ministry of Health and Family Welfare, Government of India (https://www.mohfw.gov.in/) and the various state-level bodies for their continuous efforts in providing timely, updated information on COVID-19 cases through their portals and press releases. We would also like to acknowledge the efforts of Johns Hopkins University


for publicly releasing up-to-date worldwide datasets on the COVID-19 pandemic, and the work of the COVID-19-India team made available through https://www.covid19india.org/.

Conflict of Interest The authors declare that they have no conflict of interest.

References

1. Hui, D.S., Azhar, E.I., Madani, T.A., Ntoumi, F., Kock, R., Dar, O., … Zumla, A.: The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health - the latest 2019 novel coronavirus outbreak in Wuhan, China. Int. J. Infect. Dis. 91, 264 (2020)
2. Lu, G., Wang, Q., Gao, G.F.: Bat-to-human: spike features determining ‘host jump’ of coronaviruses SARS-CoV, MERS-CoV, and beyond. Trends Microbiol. 23(8), 468–478 (2015)
3. Bhatnagar, T., Murhekar, M.V., Soneja, M., Gupta, N., Giri, S., Wig, N., Gangakhedkar, R.: Lopinavir/ritonavir combination therapy amongst symptomatic coronavirus disease 2019 patients in India: protocol for restricted public health emergency use. Indian J. Med. Res. 151(2), 184 (2020)
4. Jamwal, A., Bhatnagar, S., Sharma, P.: Coronavirus disease 2019 (COVID-19): current literature and status in India (2020)
5. Park, P.G., Kim, C.H., Heo, Y., Kim, T.S., Park, C.W., Kim, C.H.: Out-of-hospital cohort treatment of coronavirus disease 2019 patients with mild symptoms in Korea: an experience from a single community treatment center. J. Korean Med. Sci. 35(13) (2020)
6. Yap, T.F., Liu, Z., Shveda, R.A., Preston, D.: A predictive model of the temperature-dependent inactivation of coronaviruses (2020)
7. Deng, X., Mettelman, R.C., O’Brien, A., Thompson, J.A., O’Brien, T.E., Baker, S.C.: Analysis of coronavirus temperature-sensitive mutants reveals an interplay between the macrodomain and papain-like protease impacting replication and pathogenesis. J. Virol. 93(12), e02140-18 (2019)
8. Chan, K.H., Peiris, J.S., Lam, S.Y., Poon, L.L.M., Yuen, K.Y., Seto, W.H.: The effects of temperature and relative humidity on the viability of the SARS coronavirus. In: Advances in Virology (2011)
9. Bonilla-Aldana, D.K., Quintero-Rada, K., Montoya-Posada, J.P., Ramírez-Ocampo, S., Paniz-Mondolfi, A., Rabaan, A.A., … Rodríguez-Morales, A.J.: SARS-CoV, MERS-CoV and now the 2019-novel CoV: Have we investigated enough about coronaviruses? - A bibliometric analysis. Travel Med. Infect. Dis. 33, 101566 (2020)
10. Victor, A.: Mathematical predictions for COVID-19 as a global pandemic. Available at SSRN 3555879 (2020)
11. Gao, Y., Zhang, Z., Yao, W., Ying, Q., Long, C., Fu, X.: Forecasting the cumulative number of COVID-19 deaths in China: a Boltzmann function-based modeling study. Infect. Control Hosp. Epidemiol., 1–16 (2020)
12. Magal, P., Webb, G.: Predicting the number of reported and unreported cases for the COVID-19 epidemic in South Korea, Italy, France and Germany (2020) (March 19, 2020)
13. Dhanwant, J.N., Ramanathan, V.: Forecasting COVID 19 growth in India using susceptible-infected-recovered (SIR) model (2020). arXiv preprint arXiv:2004.00696
14. Wang, Y.: Use crow-AMSAA method to predict the cases of the Coronavirus 19 in Michigan and USA (2020). medRxiv
15. Awad, S.F., Critchley, J.A., Abu-Raddad, L.J.: Epidemiological impact of targeted interventions for people with diabetes mellitus on tuberculosis transmission in India: modelling based predictions. Epidemics 30, 100381 (2020)


16. Tuite, A.R., Fisman, D.N.: Reporting, epidemic growth, and reproduction numbers for the 2019 novel coronavirus (2019-nCoV) epidemic. Ann. Internal Med. (2020)
17. Pica, N., Bouvier, N.M.: Environmental factors affecting the transmission of respiratory viruses. Curr. Opin. Virol. 2(1), 90–95 (2012)
18. Ma, Y., Zhao, Y., Liu, J., He, X., Wang, B., Fu, S., … Luo, B.: Effects of temperature variation and humidity on the mortality of COVID-19 in Wuhan (2020). medRxiv
19. Gómez, V., Maravall, A.: Estimation, prediction, and interpolation for nonstationary series with the Kalman filter. J. Am. Stat. Assoc. 89(426), 611–624 (1994)
20. Reif, K., Gunther, S., Yaz, E., Unbehauen, R.: Stochastic stability of the discrete-time extended Kalman filter. IEEE Trans. Autom. Control 44(4), 714–728 (1999)
21. Chandramouli, C., General, R.: Census of India 2011. Provisional Population Totals, pp. 409–413. Government of India, New Delhi (2011)
22. Population Projections for India and States 2011–2036. [Online]. Available: https://nhm.gov.in/New_Updates_2018/Report_Population_Projection_2019.pdf
23. Subrahmanya, Y.: Climatic types of India according to the rational classification of Thornthwaite. Ind. Jour. Meteor. Geophy. 7, 1, 12 (1956)
24. Weatherbit.io. [Online]. Available: https://www.weatherbit.io/
25. COVID19-India API. [Online]. Available: http://api.covid19india.org
26. Suryawanshi, S.: Coronavirus outbreak has reached community transmission stage in Mumbai, says BMC. The New Indian Express (2020, April 7). Retrieved from https://www.newindianexpress.com
27. Ghosh, S., Kateshiya, G.: Gujarat: Source of infection unknown for at least 10 patients, officials suspect community transmission. The Indian Express (2020, March 31). Retrieved from https://indianexpress.com/
28. Bisht, A., Naqvi, S.: How Tablighi Jamaat event became India’s worst coronavirus vector. Aljazeera (2020, April 7). Retrieved from https://www.aljazeera.com/
29. Ashiq, P.: Coronavirus | Community transmission happens in Kashmir hamlet. The Hindu (2020, April 18). Retrieved from https://www.thehindu.com
30. Kumar, S.: Maharashtra begins sending back Sikh pilgrims stuck in Nanded’s Hazur Sahib. The Tribune (2020, April 24). Retrieved from https://www.tribuneindia.com/
31. Holshue, M.L., DeBolt, C., Lindquist, S., Lofy, K.H., Wiesman, J., Bruce, H., … Diaz, G.: First case of 2019 novel coronavirus in the United States. New England J. Med. (2020)