Tourism Analytics Before and After COVID-19: Case Studies from Asia and Europe 9811993688, 9789811993688

This book is a compilation of different analytics and machine learning techniques focusing on the tourism industry, partic

English Pages 248 [249] Year 2023

Table of contents :
Contents
Hong Kong Tourism Under COVID-19
Data Preparation
Modeling and Results Comparison
Feature Importance
Business Analysis
Impact on Airlines: Case Study on Cathay Pacific and Dragon Air
Conclusion and Future Studies
References
Tourism Analytics, the Case for Hainan China
Impacts on Tourism Industry
Analytics Methodology
Model Selection
Conclusions
Reference
Impacts of COVID-19 on Food, Aviation, and Accommodation in Europe
Dataset and Analysis
Methodology and Experimental Results
Recommendation and Conclusion
References
Tourism Rebounds Analysis—Lessons from Baltics Countries
Business Understanding and Approach
Data Model Analysis
Tourism Income Baseline Growth Trajectory 2020–2021, Without COVID
XGBoost
Model Evaluation
Prediction of International Arrivals in 2020 and 2021—an Outlook Without COVID-19
The Case of Travel Bubble in Estonia
Business Case Analysis
Policies Effectiveness Quantitative Analysis
Qualitative Analysis of Other Measures for Consideration
Conclusion
References
Compare and Contrast the Impact of COVID-19 from Small to Large Country
Tourism in Singapore
Tourism in China
Tourism Analytics—The Case for South Africa
References
Hotel Booking Cancellation Analytics on Imbalanced Data
Data Preparation
Data Visualization
Machine Learning
Business Insights and Solutions
Conclusion
References
Tourism Prediction Analytics
Dataset and Analysis
Current Situation of COVID-19
Prediction of COVID-19
Development of Tourism/Hotel Industry
Seasonality of Arrivals
Age of Visitors
Purpose of Trips
Places of Interest
Hotel Industry
Impact of COVID-19 on Singapore’s hotel industry
Descriptive Analysis
Time Series Prediction
Recommendation
Conclusion
References
Marketing Segmentation and Targeted Marketing for Tourism
Visualization with Descriptive Analytics
Business Solutions Using Machine Learning
Conclusion
References
Machine Learning for Tourism
Visualization-Based Analysis
Time Series Analysis
Machine Learning Analysis
Recommendation
Data Visualization on Tourism
Data Sources
Data Visualization and Analysis
Recommendation
Conclusion
References
Sustaining Tourism Sector Through Domestic Tourism and Analytics
Dataset and Analysis
Proposed Solution: Analytics-Enabled Domestic Tourism Model
References
Tourism Analytics with Price and Room Booking Simulation
Analytics Approach on Tourism
Price, Room Booking and Revenue Simulation
Scenario 1
Scenario 2
Scenario 3
Conclusion
Recommendation
References
Tourism Arrival Prediction
Proposed Solutions
Fiscal Stimulus
Domestic Tourism
Travel Bubble
Reshape the Travel Activities
References

Yok Yen Nguwi   Editor

Tourism Analytics Before and After COVID-19 Case Studies from Asia and Europe

Editor Yok Yen Nguwi Nanyang Business School Nanyang Technological University Singapore, Singapore

ISBN 978-981-19-9368-8 ISBN 978-981-19-9369-5 (eBook) https://doi.org/10.1007/978-981-19-9369-5 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Contents (chapters and authors)

Hong Kong Tourism Under COVID-19
Cui Yuting, Gao Yinan, Ge Xinyi, Hao Junyi, Jiang Zhongyang, and Yu Peichen

Tourism Analytics, the Case for Hainan China
He Pan, Lu Hengyu, Wei Yuzhi, Wang Qi, Wu Meng, and Zhang Qiqi

Impacts of COVID-19 on Food, Aviation, and Accommodation in Europe
Chen Shijing, Chen Yuheng, Chen Ziyan, Gong Manlin, Lai Zijun, Lin Dazheng, and Li Hongtao

Tourism Rebounds Analysis—Lessons from Baltics Countries
Long Zhaowen, Wei Kexian, Wu Mengran, Xiong Yike, Yang Yafeng, Zhao Chenxi, and Zhou Yang

Compare and Contrast the Impact of COVID-19 from Small to Large Country
Hu Yubin, Ma Defeng, Qiu Zicong, Tang Manhong, Wang Lyu, and Wang Yang

Tourism Analytics—The Case for South Africa
Yong Heng Michael Tan and Yok Yen Nguwi

Hotel Booking Cancellation Analytics on Imbalanced Data
Cai Yuxuan, Hsu Tuan-Chun, Jin Zhuofan, Tan Chian Wen Melvin, Vivek Goyal, and Zheng Yijun

Tourism Prediction Analytics
Chen Shuhua, Gao Yuan, Lin Desheng, Shen Yi, and Wu Di

Marketing Segmentation and Targeted Marketing for Tourism
Liu Ye Xin, Li Yiteng, Ritika Jain, Tran Thi Hong Van, William Lim, and Zhao Yilin

Machine Learning for Tourism
Chang Chai, Yanbo Chen, Taiying Kuang, Chun-Yu Lai, Jingyi Li, and Jian Zhang

Data Visualization on Tourism
Hanlin Xiao, Jie Cheng, Yunfan Lyu, Yuqing Ma, Dongxu Sun, and Qian Wu

Sustaining Tourism Sector Through Domestic Tourism and Analytics
Dingming Chen, Pou Ing Gan, Hoi Ming Lee, Ziye Li, Vadlamudi Santosh Krishna, and Quanxin Wang

Tourism Analytics with Price and Room Booking Simulation
Yile Cai, Ke Duan, Congcong Peng, Xiaodan Shao, Yichu Sun, Jiayi Wang, and Linghao Zeng

Tourism Arrival Prediction
Cao Wenfei, Gu Yichao, Wang Jingyi, Wang Yanan, Zhao Yifan, and Zhu Haoxiang

Hong Kong Tourism Under COVID-19

Cui Yuting, Gao Yinan, Ge Xinyi, Hao Junyi, Jiang Zhongyang, and Yu Peichen

This work analyzes in depth the impact of COVID-19 on Hong Kong’s tourism by applying several machine learning models to significant features. We first analyze the properties, characteristics, and meaning of the attributes we selected as potentially affecting the number of visitors to Hong Kong. After data wrangling, we apply three machine learning techniques: Ridge regression, Linear SVR, and XGBoost. The most suitable model is selected and then tuned to optimize its performance. Lastly, we analyze the strongest features to explore the causal links between them and the number of visitors, and we provide constructive recommendations supported by data visualization insights and case studies.

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_1.

C. Yuting · G. Yinan · G. Xinyi (corresponding author) · H. Junyi · J. Zhongyang · Y. Peichen
Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, Singapore 639798, Singapore

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_1

Before discussing the model, we review related literature on using machine learning models to predict tourism data, followed by studies of the Hong Kong tourism industry and COVID-19 recovery strategies.

In [1], on tourism recommendation using a machine learning approach, the authors forecast tourism in Puri, a popular tourist spot in Odisha. Supervised machine learning models, such as the Artificial Neural Network (ANN), and time series models, such as ARIMA, were used in the prediction. Among ANNs, they tried the Self-Organizing Map (SOM), which differs from other ANN models in that it applies competitive learning rather than error-correction learning (backpropagation with gradient descent) and uses a neighborhood function to preserve the topological properties of the input space. The researchers concluded that the SOM model is efficient and time-saving on large volumes of data, while time series analysis gives good forecasts as measured by Mean Absolute Percentage Error (MAPE).

Koutras et al. [2] developed models such as the Multilayer Perceptron (MLP), Support Vector Regression (SVR), and Linear Regression (LR) to estimate tourism demand in the accommodation industry, using data from 2005 to 2012 for the Western Region of Greece. The work concluded that Linear Regression (LR) and Support Vector Regression with a Radial Basis Function kernel (SVR-RBF) performed best, with the lowest Root Mean Square Error (RMSE).

Baldigara and Koić [3] modeled international tourism demand in Croatia using polynomial regression analysis, examining important patterns of German tourism demand in Croatia. The only model used in the paper is second-order polynomial regression, and the paper illustrates the model building in detail. Apart from measuring the adjusted R-squared and the Variance Inflation Factor (VIF), the authors applied the Jarque-Bera test to the residuals to check whether their skewness and kurtosis are consistent with normality, and the Durbin-Watson test to check for autocorrelation.

Claveria et al. [4] used an ARMA model as a benchmark to test whether selected machine learning models can improve the forecast accuracy of tourism demand. They picked three machine learning techniques: Support Vector Regression (SVR), Gaussian Process Regression (GPR), and Neural Network (NN) models. All three methods performed better (lower MAPE) than the benchmark model, especially in mid- and long-term forecasting.

Chen [5] compared short-haul and long-haul vacation tourists in Hong Kong and investigated why visitors go to particular attractions. Using Hong Kong as the case, the study compared multiple source markets: the Mainland China market, other short-haul markets (excluding Mainland China), and long-haul markets. The Departing Visitor Survey (DVS) by the Hong Kong Tourism Board (HKTB) was used for the analysis. The study concluded that the reasons for choosing a travel destination depend on each person's needs and the importance attached to them, and it suggested a framework for Destination Management Organizations (DMOs) to better allocate attractions according to visitor needs.

In the work on a COVID-19 recovery strategy for the tourism industry [6], Strielkowski reviewed the impact of the COVID-19 outbreak and compared it with that of SARS in 2002–2003. The study also analyzed and evaluated the measures taken in Mainland China and Hong Kong to prevent the disease from spreading further.

Data Preparation

We collected data on inbound visitors from several official sources, as shown in Fig. 1. The data cover January 2002 through December 2020, with 9 columns (including a monthly time index) and 2052 observations in total. The target variable Y is the Number of Visitors, taken from Visitor Arrivals Details by Country/Territory of Residence and Mode of Transportation. For example, the value 97,575 in the first row means that 97,575 visitors arrived in Hong Kong from Mainland China by air in January 2002. This column comes from the Hong Kong Tourism Board’s monthly visitor arrival statistics.

Among the feature (X) variables, a seasonal index ranging from 0 to 3 represents the four quarters. Transportation is the mode by which visitors travel to Hong Kong: Air, Sea, or Land. For the origin of visitors, we select only three countries/regions: Mainland China, the United States, and Singapore. We also include some macroeconomic factors. Because some travelers are price-sensitive, we use the Exchange Rate as one of the X variables. For example, when the Hong Kong currency weakens against the Mainland China currency, some visitors from Mainland China may consider spending their holidays in Hong Kong instead of other countries/regions. Those who live near Hong Kong, for instance in Shenzhen, may take multiple trips to and from Hong Kong as their visa permits. The exchange rate data were obtained monthly from the Yahoo Finance database.

Fig. 1 Data on Hong Kong arrivals

We also considered using Gross Domestic Product per capita (for both Hong Kong and the origin) as X variables, but faced several constraints: (1) monthly GDP data are not available, since the shortest reporting period is a quarter; (2) GDP for 2020 had not yet been computed at the time of writing; (3) although GDP is a comprehensive scorecard of a country’s economic health, it also includes components other than consumption, such as government spending, net exports, and investment. We therefore use the Consumer Price Index (CPI) instead. CPI is an important reference indicator for market economic activity and government monetary policy. It is one of the main indicators of inflation and directly reflects the purchasing power of a currency. Inflation in the destination country/region (here, Hong Kong) reduces the purchasing power of tourists (for example, visitors from Mainland China, the US, and Singapore), thereby affecting the number of tourists and the tourism income of the destination; conversely, inflation in the origin country prompts residents to travel abroad. In our dataset, the CPI data come from different official websites, and none of the series is seasonally adjusted. The Hong Kong CPI is from the Census and Statistics Department of the Government of the Hong Kong Special Administrative Region. The CPI data for China and the US are from the Economic Research Department of the Federal Reserve Bank of St. Louis. The CPI for Singapore was obtained from the website of the Department of Statistics Singapore. After combining the series, we found a problem in the consolidated CPI data: by definition, the CPI is computed relative to a base year.
However, the base years of the four countries/regions differ, although the gap between them is small. We normalize the CPI values during model preparation to reduce the resulting bias, and we therefore keep them in our dataset. The CPI is defined as

\[
\text{Consumer Price Index} = \frac{\text{Cost of Market Basket in a Given Year}}{\text{Cost of Market Basket at Base Year}} \times 100
\]
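The base-year mismatch described above can be handled by rescaling every series to a common reference period. The following is a minimal sketch of that idea, not the chapter's actual normalization step, which is not specified. The index values, the region names, and the helper `rebase_cpi` are hypothetical.

```python
def cpi(cost_given_year, cost_base_year):
    """CPI as defined above: basket cost relative to the base year, times 100."""
    return cost_given_year / cost_base_year * 100


def rebase_cpi(series, new_base):
    """Rescale a CPI series so that its value at `new_base` equals 100."""
    factor = 100 / series[new_base]
    return {period: value * factor for period, value in series.items()}


# Hypothetical index values for two regions published with different base years.
region_a = {"2002-01": 80.0, "2010-01": 100.0, "2020-01": 125.0}
region_b = {"2002-01": 100.0, "2010-01": 130.0, "2020-01": 169.0}

# Rebase both series to the same period so that they are directly comparable.
a_rebased = rebase_cpi(region_a, "2002-01")
b_rebased = rebase_cpi(region_b, "2002-01")
```

After rebasing, both series equal 100 in the shared reference period, so later values express cumulative price change on the same scale.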

In addition, we add a column called “COVID/SARS” that indicates whether there was an epidemic in a given month: “1” if so, “0” otherwise. This helps us measure the influence of the epidemic.

After collecting the dataset, we prepared the data. The original dataset is quite clean, with almost no null values, so we performed only two steps: (1) setting a baseline and (2) generating dummy variables.

For a given month and origin, the number of visitors traveling by land is much larger than the number traveling by other modes. For example, in January 2002 there were 284,667 visitors from Mainland China by land but only 97,575 by air, about one-third as many. Such large differences in scale may lead to inaccurate modeling results. We therefore set the first record of each transportation category as its baseline, as shown in Fig. 2. Within each origin, the number of visitors in every record is divided by the baseline of its category. For example, in Fig. 1, the value 259,258 in the 12th row is divided by the value 284,667 in the 3rd row.

Fig. 2 Baseline setting for each transportation category from each origin

Additionally, since “Transportation” is a text column, we use the get_dummies() function to convert it into numeric indicator columns, as required for model building. “Origin” does not need this treatment, because we split the dataset into three groups by “Origin” and model each group separately.
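The two preparation steps above can be sketched with pandas as follows. This is an illustration under assumptions rather than the authors' code. The column names and the month labels are hypothetical. Only the values 97,575, 284,667, and 259,258 come from the text, and the second air figure is invented.

```python
import pandas as pd

# Hypothetical excerpt in the layout described above. The month label of the
# 259,258 record and the value 100,000 are assumptions made for illustration.
df = pd.DataFrame(
    {
        "Month": ["2002-01", "2002-01", "2002-02", "2002-02"],
        "Origin": ["Mainland China"] * 4,
        "Transportation": ["Air", "Land", "Air", "Land"],
        "Visitors": [97_575, 284_667, 100_000, 259_258],
    }
)

# Step 1: the baseline is the first record of each (Origin, Transportation)
# group; every record in the group is divided by it.
df = df.sort_values("Month")
baseline = df.groupby(["Origin", "Transportation"])["Visitors"].transform("first")
df["VisitorsScaled"] = df["Visitors"] / baseline

# Step 2: one-hot encode the text column so the models receive numeric inputs.
features = pd.get_dummies(df, columns=["Transportation"])
```

With this scaling, each baseline record becomes 1.0 and later records are expressed as a ratio to it, which puts the Air, Sea, and Land series on a comparable scale.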

Modeling and Results Comparison

In our experiment, we use features such as transportation mode, CPI, and exchange rate to predict the monthly number of visitors to Hong Kong, which we frame as a regression problem. Another approach to predicting continuous data is time series forecasting. We did not use it here because a time series model summarizes the history of y through autocorrelation and moving averages of its past values in order to make a prediction. Predicting y is its primary goal, so we may get very accurate predictions without learning the underlying reasons. For instance, visitor numbers in Hong Kong vary by season, so last year’s data may be predictive of this year’s data, but the underlying drivers remain unknown. Regression lets us not only predict y but also explore the underlying reasons through feature importance or coefficients. Additionally, a time series model built only on past values of y does not respond to an external shock such as COVID-19, whereas regression may help us evaluate how COVID-19 affects visitor numbers by examining each feature.

Following this strategy, we trained three regression algorithms on all features: Ridge Linear Regression, Linear Support Vector Regression (SVR), and Extreme Gradient Boosting (XGBoost). We did not perform feature selection before training because the number of features is limited. The details of each model are given below.

· Ridge Linear Regression

Linear regression is a linear approach to modeling the relationship between the target and one or more variables. Ridge linear regression is a modified version of it that is used to analyze regression data affected by multicollinearity. It is often discussed together with Lasso linear regression: both use simple regularization techniques to reduce model complexity and avoid the over-fitting that may occur in ordinary linear regression. Ridge linear regression adds a penalty term, weighted by lambda, to the cost function, so that the cost is penalized when coefficients take large values; this introduces a degree of bias into the estimates. We performed ridge linear regression with three-fold cross-validation, in which the whole training set was divided into three parts for each training run. One hundred different penalty values (lambda) were tried, and the one with the best root mean square error (RMSE) was selected for the final model.

· Linear Support Vector Regression (SVR)

A support vector machine (SVM) is a supervised machine learning algorithm that is commonly used for classification. SVM can also be applied to regression problems, in which case it is called Support Vector Regression (SVR). Whereas an SVM classifier maximizes the margin between the hyperplane and the nearest sample points, SVR fits a hyperplane so that as many sample points as possible lie within a tolerance band around it, and penalizes only the points that fall outside the band. The objective of using linear SVR here is to fit the data and return a best-fit hyperplane. Rather than minimizing the observed training error alone, it trades training error against model complexity in order to achieve good generalization performance. In this experiment, we used one of the most basic SVR configurations, linear SVR, in which the fitted function (hyperplane) is linear. Linear SVR was performed with three-fold cross-validation, in which the whole training set was divided into three parts for each training run. Ten different values of the regularization parameter, cost (C), were tried; C sets the degree of importance given to the error. The value with the best RMSE was selected for the final model.

· Extreme-Gradient Boosting (XGBoost)

XGBoost is a decision-tree-based ensemble machine learning algorithm that uses a gradient boosting framework. Rather than training all decision trees in isolation from each other (bagging), boosting trains the models sequentially, and each new model is trained to correct the mistakes of the previous ones. Models are added one by one until no further improvement is obtained. XGBoost is considered one of the most advanced boosting algorithms. Compared with the gradient boosting machine (GBM), XGBoost uses a more regularized model formulation to control over-fitting. Randomized search with three-fold cross-validation was used to find the best hyperparameter combination for XGBoost. The hyperparameters include the maximum depth of each tree, the learning rate, the minimum number of samples in each node, and so on. One hundred hyperparameter combinations were sampled at random from the grid, and the combination with the best RMSE was selected as the final configuration for XGBoost.

To measure performance, we adopted Mean Absolute Error (MAE), R-Square (R^2), and Root Mean Square Error (RMSE) as the metrics for model comparison. We trained the three models on the Mainland China data. For Ridge Linear Regression, the best value of the parameter “alpha” is 1, MAE is 1.41, and R^2 and RMSE are 0.69 and 3.3, respectively. For SVR, the best value of the parameter “C” is 5, and MAE, R^2, and RMSE are 1.03, 0.73, and 2.86, respectively. For the XGBoost model, the best number of estimators is 150, the learning rate is 0.05, and the maximum depth is 3; MAE, R^2, and RMSE are 0.53, 0.93, and 0.69, respectively. The RMSE of XGBoost, one of the most important metrics, is therefore much lower than the RMSE of Ridge Linear Regression and SVR, and XGBoost was selected as the final prediction model. However, the RMSE on the predicted data is very high because of the sudden outbreak of COVID-19. For example, the RMSE for the Mainland China data is 6.75, which is much higher than the RMSE (3.29) for the data prior to the pandemic. The models' performance for the USA and Singapore is similar to that for Mainland China. Overall, XGBoost gave the best result: it had the highest R^2, which is close to 1, and the lowest MAE and RMSE. Our team therefore decided to use XGBoost to predict the data after the outbreak of COVID-19.
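The selection procedure and the three metrics described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it assumes a one-feature ridge model with a closed-form fit, synthetic data, and a log-spaced grid of 100 penalties, all of which are choices made here for the example. The helper names (`fit_ridge_1d`, `cv_rmse`) are hypothetical.

```python
import math
import random


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination (R^2)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_ridge_1d(x, y, lam):
    """One-feature ridge fit with an unpenalized intercept (assumed setup)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / (sxx + lam)  # the penalty lam shrinks the slope
    return slope, my - slope * mx


def cv_rmse(x, y, lam, k=3):
    """Average validation RMSE over k folds; fold i holds out every k-th point."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(x)) if i % k != fold]
        valid = [i for i in range(len(x)) if i % k == fold]
        slope, intercept = fit_ridge_1d([x[i] for i in train], [y[i] for i in train], lam)
        preds = [slope * x[i] + intercept for i in valid]
        scores.append(rmse([y[i] for i in valid], preds))
    return sum(scores) / k


# Synthetic data, for illustration only (not the tourism data set).
rng = random.Random(0)
x = [i / 10 for i in range(60)]
y = [2.0 * xi + 1.0 + rng.gauss(0, 0.5) for xi in x]

# 100 candidate penalties on a log scale from 1e-3 to 1e3 (an assumed grid).
lambdas = [10 ** (-3 + 6 * i / 99) for i in range(100)]
best_lam = min(lambdas, key=lambda lam: cv_rmse(x, y, lam))

slope, intercept = fit_ridge_1d(x, y, best_lam)
preds = [slope * xi + intercept for xi in x]
print(f"best lambda={best_lam:.4g}  MAE={mae(y, preds):.3f}  "
      f"R^2={r2(y, preds):.3f}  RMSE={rmse(y, preds):.3f}")
```

The same loop applies to the cost parameter C of linear SVR or to randomly sampled XGBoost hyperparameter combinations: only the model-fitting function and the candidate list change.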

Feature Importance

From the results of the three previously trained models, we selected XGBoost to predict our data and plotted feature importance graphs to find the most important features. As shown in Fig. 3 for visitors from Mainland China, the three features with the highest F scores are the CPI of Mainland China, the CNY/HKD exchange rate, and the CPI of Hong Kong. Figure 4 shows the feature importance graph for visitors from the United States, where the US CPI, land transportation, and the USD/HKD exchange rate are the top three features. For the Singapore data in Fig. 5, the important features are very similar to those of Mainland China: the Singapore CPI, the Hong Kong CPI, and the SGD/HKD exchange rate.

We now discuss the advantages and disadvantages of the machine learning models. The linear SVR and Ridge models are comparatively weak regressors, whereas the XGBoost regressor is a strong one that combines many trees through boosting. XGBoost adds a regularization term to the objective function. When the base learner is CART, the regularization term depends on the number of leaf nodes of the tree (T) and on the values of the leaf weights. Although XGBoost is a boosting method, it also offers parallelism, as bagging does. Unlike bagging, however, XGBoost's parallelism operates at the granularity of features. One of the most time-consuming steps in decision tree learning is sorting the feature values, because the best split point has to be determined. Before training, XGBoost sorts the data in advance, and then saves it as a block structure.


Fig. 3 Feature importance for visitors from Mainland China

Fig. 4 Feature importance for visitors from United States

This structure is reused in every iteration, which greatly reduces the amount of computation. The block structure also makes parallelization possible: when splitting a node, the gain of each feature must be calculated and the feature with the largest gain is selected for the split, so the gain calculations for the different features can be performed in multiple threads. XGBoost also applies a second-order Taylor expansion of the loss function on top of GBDT, so the calculation time is much shorter than that of other Boosting algorithms.
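The split search described above can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not XGBoost's actual code: it uses the standard regularized gain from the XGBoost formulation, 1/2 [G_L^2/(H_L + lambda) + G_R^2/(H_R + lambda) - (G_L + G_R)^2/(H_L + H_R + lambda)] - gamma, assumes squared-error loss with an initial prediction of 0 (so each gradient is g = prediction - y and each second derivative is h = 1), and runs single-threaded. The feature names `cpi` and `fx` and all data values are hypothetical.

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Regularized gain of splitting a node into a left and a right child."""
    def score(g, h):
        return g * g / (h + lam)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right)
                  - score(g_left + g_right, h_left + h_right)) - gamma


def best_split(x, grad, hess, lam=1.0, gamma=0.0):
    """Scan one feature in sorted order and return (best_gain, threshold)."""
    # This sort is the step that XGBoost performs once and caches per feature.
    order = sorted(range(len(x)), key=lambda i: x[i])
    g_total, h_total = sum(grad), sum(hess)
    g_left = h_left = 0.0
    best = (float("-inf"), None)
    for pos in range(len(order) - 1):
        i, nxt = order[pos], order[pos + 1]
        g_left += grad[i]
        h_left += hess[i]
        if x[i] == x[nxt]:
            continue  # no threshold can separate two equal values
        gain = split_gain(g_left, h_left, g_total - g_left, h_total - h_left, lam, gamma)
        if gain > best[0]:
            best = (gain, (x[i] + x[nxt]) / 2)
    return best


# Hypothetical toy data: two features and one target.
features = {
    "cpi": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "fx": [3.0, 1.0, 6.0, 2.0, 5.0, 4.0],
}
y = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
grad = [0.0 - t for t in y]   # g = prediction - y with prediction 0
hess = [1.0] * len(y)         # h = 1 for squared-error loss

# Each feature is scanned independently, which is why the work can be
# distributed over threads; here it is done in a plain loop.
results = {name: best_split(col, grad, hess) for name, col in features.items()}
chosen = max(results, key=lambda name: results[name][0])
print(chosen, results[chosen])
```

Because each call to `best_split` touches only one feature column, the per-feature scans are independent of one another, which is the property the block structure exploits.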


Fig. 5 Feature importance for visitors from Singapore

The criterion for finding the best split point in the CART regression tree mentioned above is to minimize the mean square error (MSE), whereas the criterion XGBoost uses to find the split point is to maximize the split gain, in which lambda and gamma are the regularization parameters. XGBoost allows cross-validation to be used in each iteration of Boosting, so the optimal number of Boosting iterations can be obtained easily.

XGBoost also has downsides. It has many algorithm parameters, and time has to be spent on selecting them when the model is built. XGBoost is only suitable for structured feature data: compared with deep learning algorithms, it does not handle unstructured, task-heavy data well, such as area-of-interest detection in images. XGBoost is also not suitable for ultra-high-dimensional feature data. It offers good processing speed and accuracy for low- and medium-dimensional data, but for large-scale image object recognition, or for the ultra-high-dimensional features that appear in some time intervals, XGBoost may not be ideal. Deep learning methods are more suitable in those cases.

Linear Regression is good at capturing linear relationships in the data set, and its training and prediction are fast. Ridge is a regularized linear regression model, so it helps to prevent over-fitting: adjusting the L2 penalty factor prevents the parameters from becoming too large as the dimension grows. Linear Regression is suitable when working with data that has few features. The downside of ridge regression is that it cannot shrink a parameter exactly to 0, that is, the total number of parameters cannot be reduced. Ridge is also not applicable to nonlinear data. Generally, the R-squared value of the ridge regression equation is slightly lower than that of ordinary regression analysis, but the significance of the regression coefficients is often considerably higher than in ordinary linear regression.
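The effect of the L2 penalty can be made explicit with the standard ridge formulation. The notation here is an assumption introduced for illustration and does not appear in the chapter: X is the centered feature matrix, y the centered target, beta the coefficient vector, lambda the penalty, and I the identity matrix.

```latex
\hat{\beta}^{\mathrm{ridge}}
  = \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2
  = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} y ,
  \qquad \lambda \ge 0 .

% Single centered feature x: the coefficient shrinks toward zero as \lambda grows,
% but for finite \lambda it is exactly zero only if \sum_i x_i y_i = 0.
\hat{\beta}^{\mathrm{ridge}} = \frac{\sum_i x_i y_i}{\sum_i x_i^2 + \lambda}
```

Setting lambda = 0 recovers ordinary least squares. The single-feature case shows why ridge reduces the size of the parameters without removing any of them.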


Support Vector Regression (SVR) is very effective for regression problems with high-dimensional features, and it retains good predictive ability when the feature dimension is greater than the number of samples. Many kernel functions are available for SVR, such as linear, polynomial, Radial Basis Function (RBF), and sigmoid, which makes it flexible enough to solve a variety of nonlinear classification and regression problems. SVR determines the hyperplane from only a subset of the samples, the support vectors, without relying on all of the data. The disadvantage of SVR is that its performance is only average when the feature dimension is much larger than the number of samples. There is no universal standard for choosing the kernel function for nonlinear problems, and it is difficult to choose a suitable one. Another issue is that SVR is sensitive to missing data.
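The statement that SVR depends on only part of the data follows from its epsilon-insensitive loss, sketched below. This is a minimal illustration under assumed values: the tolerance `epsilon` and the sample numbers are hypothetical and are not taken from the study.

```python
def eps_insensitive_loss(y_true, y_pred, epsilon=0.5):
    """Per-sample loss: zero inside the tolerance band, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Hypothetical targets and predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.2, 2.0, 4.0, 3.4]

losses = eps_insensitive_loss(y_true, y_pred)
outside = [i for i, loss in enumerate(losses) if loss > 0]

print("losses:", losses)
print("samples outside the band:", outside)
```

Samples that lie strictly inside the band contribute nothing to the loss, so moving them slightly does not change the fitted hyperplane. Only samples on or outside the band can become support vectors.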

Business Analysis

Hong Kong’s economy is driven by four key industries: financial services, tourism, trading and logistics, and professional and producer services, which together contribute around 55% of GDP and 45% of employment. However, the COVID-19 pandemic hit these industries asymmetrically: financial services and professional services were less affected because they rely less on face-to-face contact, while tourism and trading and logistics were hit the hardest.

From this time series data (2002 to 2020), a few insights can be extracted through the time index. Figure 6 illustrates inbound arrivals by place of origin, and Fig. 7 shows the trend line for each origin individually. It can be observed from Figs. 6 and 7 that the total number of US tourists has fluctuated around 1.2 million and that of Singapore tourists around 0.7 million since 2004, whereas during the same period the number of Mainland tourists soared and accounted for 80% of all visitors in the past 5 years. That trend continued in 2019, with over 40 million mainlanders visiting Hong Kong from January through July, almost four times the number from the rest of the world. These figures indicate that the attractiveness of Hong Kong to foreign tourists has not increased with globalization and economic growth, and that the main tourist group, from Mainland China, is the key to reviving Hong Kong's travel sector.

Figure 8 shows that the proportion of tourists arriving in Hong Kong by land keeps rising, while the number and proportion of mainland tourists visiting Hong Kong are also increasing. This is partly due to the seven land ports between Shenzhen and Hong Kong. The good connections between the land ports and Shenzhen's highways, passenger transportation, freight transportation, and entry facilities have greatly facilitated travel, saving time and transportation costs. Mainland tourism and cross-border trade are the biggest beneficiaries of this upward trend in tourism. Figure 8 also shows that the total number of visitors increased rather dramatically from 2002 to 2015, except during the SARS outbreak, which happened around


Fig. 6 Overall trend of Hong Kong inbound visitors by different origins

2003. After SARS, the number of visitors to Hong Kong grew rapidly until 2015. Between 2015 and the new peak in 2018, there is an obvious trough in the number of tourists. According to the data released by the Hong Kong Tourism Board on 29 January 2016, the overall number of visitors to Hong Kong fell by 2.5%, the first decline since SARS. In addition to mainland tourists, arrivals from Japan, Indonesia, Singapore, and other markets all declined. In this period, the decline in Hong Kong tourism had both internal and external causes. Internally, Hong Kong had become less affordable because of its stronger currency, and some policies (such as Shenzhen’s “one-trip per week” policy) were introduced that limited the number of visitors. Externally, the growth of major economies showed signs of slowing down, and currency depreciation in neighboring regions affected the number of tourists. Another reason is that aging scenic spots were unable to attract returning customers.

Following the anti-government protests that began in June 2019, Hong Kong was on the brink of recession. A leading cause of its economic troubles was the well-publicized plunge in the tourism industry. Additionally, since June 2019, thirty-one countries have issued travel advisories or alerts on Hong Kong, and the corresponding impact was very significant. Starting from July, the number of inbound travelers from Mainland China, the US, and Singapore dropped sharply, and the decline intensified in the following months. While the total number of visitor arrivals in Hong Kong maintained positive growth in the first half of 2019 (+52% vs. Jan–Jun 2018), the number in the second half of 2019 dropped by 50% compared with Jul–Dec 2018. Visitors from mainland China decreased by 45%, as shown in Table 1. For the whole of 2019, the total number of tourists dropped by 14% vs. 2018. All the figures indicate that the Hong Kong travel industry was at stake. There was no better time for Hong


Fig. 7 Hong Kong inbound visitors by different origins

Kong to regain its stability and reputation as a safe and hospitable tourist destination. However, the arrival of COVID-19 dashed the last hope of recovery. The coronavirus was first confirmed to have spread to Hong Kong on 23 January 2020, which shifted attention from the street protests to the negative growth of Hong Kong tourism under the pandemic. The impact of the COVID-19 pandemic on Hong Kong’s tourism industry is unprecedented. According to the data released by the Hong Kong Tourism Board, since February 2020 the pandemic has caused the number of visitors to Hong Kong to fall by more than 90% for 10 consecutive months. As noted above, the decline in inbound travelers from Mainland China, the US, and Singapore that began in July 2019 intensified in the following months, as shown in Fig. 9. In the first 11 months of 2020, the total number of visitors to Hong Kong was 3.564 million, shrinking 94% year-on-year, of which 2.704 million were from the mainland, down 94%. From Fig. 9, it can be seen


Fig. 8 Hong Kong visitors by different mode of transportations

Table 1 YoY growth for Hong Kong inbound visitor in 2018 and 2019 by origin

Nationality/Region   Variance 19’ vs. 18’   Variance 19’ vs. 18’   Variance 19’ vs. 18’
                     1st half (%)           2nd half (%)           total (%)
Mainland China       +27                    −45                    −14
Sub-total            +407                   −62                    −14
Long haul markets    +281                   −57                    −13
Short haul markets   +502                   −65                    −15
Total                +52                    −50                    −14

from the figures that the number of tourists in 2020 shows a precipitous decline. As listed in Table 2, total visitor arrivals in 2020 decreased by 94% compared with 2019. The downturn directly affects approximately 800,000 livelihoods in Hong Kong's tourism industry, and tourism-related industries such as retail and catering are also affected. Since January 2020, Ocean Park and Disneyland, the two iconic theme parks in Hong Kong, have each been closed three times because of the epidemic. For this reason, in May 2020, the Hong Kong SAR Government allocated about 5.4 billion yuan to Ocean Park to help relieve its plight. Thanks to government funding, travel agencies in Hong Kong have not closed down on a large scale. The impact of COVID-19 on the air travel segment is addressed in the next section, with a global view and a spotlight on Hong Kong air tourism.


Fig. 9 Trend of Hong Kong inbound traveler from 2019 to 2020 by origin

Table 2 YoY growth on Hong Kong inbound visitor in 2019 and 2020 by origin

Nationality/Region   2019 Total    2020 Total   Variance 20’ vs. 19’ (%)
Mainland China       43,607,058    2,685,108    −94
Sub-total            12,305,551    883,767      −93
Long haul markets    4,246,452     364,889      −91
Short haul markets   8,059,099     518,878      −94
Total                55,912,609    3,568,875    −94
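The variance column of Table 2 can be reproduced from the arrival totals with a short calculation. The sketch below assumes the usual year-over-year definition, (current - previous) / previous, rounded to the nearest whole percent. The arrival figures are those listed in Table 2, and the function and variable names are hypothetical.

```python
# Arrival totals (2019, 2020) as listed in Table 2.
arrivals = {
    "Mainland China": (43_607_058, 2_685_108),
    "Sub-total": (12_305_551, 883_767),
    "Long haul markets": (4_246_452, 364_889),
    "Short haul markets": (8_059_099, 518_878),
    "Total": (55_912_609, 3_568_875),
}


def yoy_pct(previous, current):
    """Year-over-year change in percent."""
    return (current - previous) / previous * 100


variance = {region: round(yoy_pct(prev, curr)) for region, (prev, curr) in arrivals.items()}

for region, pct in variance.items():
    print(f"{region:<20}{pct:+d}%")

# Consistency checks on the table itself: long haul plus short haul gives the
# sub-total, and the sub-total plus Mainland China gives the total.
for year in (0, 1):
    assert (arrivals["Long haul markets"][year] + arrivals["Short haul markets"][year]
            == arrivals["Sub-total"][year])
    assert (arrivals["Mainland China"][year] + arrivals["Sub-total"][year]
            == arrivals["Total"][year])
```

The rounded results match the variance column of Table 2, and the row sums confirm that the sub-total covers the long haul and short haul markets.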


Impact on Airlines: Case Study on Cathay Pacific and Dragon Air

The landscape of international aviation remains highly uncertain, with border restrictions and quarantine measures still in place across the globe. Although the industry has begun to see some initial developments, it has yet to see any significant signs of immediate improvement. The International Air Transport Association (IATA) released its latest global passenger forecast, which shows that global passenger traffic will not return to pre-COVID-19 levels until 2024. Additionally, short haul travel is still expected to recover faster than long haul travel. Like Singapore, and unlike Mainland China, Hong Kong has no domestic market of its own. Airlines in Hong Kong are entirely exposed to international competition, and their survival depends completely on the global recovery from the COVID-19 pandemic.

Cathay Pacific, Hong Kong’s flagship carrier and a key player in maintaining and enhancing Hong Kong's position as an aviation hub, is no exception in this unexpected industry recession. In the first half of 2020, Cathay Group’s passenger revenue decreased by 72.2%, compared with a 72.6% decrease in capacity. Cargo revenue increased by 8.8%, compared with a 24.6% decrease in capacity. The group announced an attributable loss of HK$9,865 million for the first half of 2020 (vs. a profit of HK$1,347 million in the first half of 2019). To tide the city’s ailing flagship carrier over the impact of the coronavirus pandemic, the Hong Kong government injected HK$1,060 million into a recapitalization of Cathay Pacific in June 2020. However, with the third wave of infections arriving in Hong Kong in early July and the explosion of the global spread, it seemed that the government injection would not be enough to help Cathay hold on until the pandemic is over.

On 21 October 2020, Cathay Pacific Airways’ stock price jumped to its highest level in a month after the company announced a HK$2.2 billion restructuring that will see the biggest job cuts in its history and the closure of the Cathay Dragon brand. Its decision to lay off 5,900 employees, accounting for 17% of its established headcount, has consequences beyond the company and further harms Hong Kong’s status as an international aviation hub and the city’s overall development. The future of Cathay Pacific seems unclear. The most optimistic scenario that the airline considers it can responsibly adopt is that, in 2021, it will operate at less than 50% of the passenger capacity it operated in 2019. This assumes that the vaccines currently under development prove effective and are successfully rolled out on a global scale by the summer of 2021. Cathay is expected to operate at well below 25% capacity in the first half of 2021, followed by a gradual recovery in capacity in the second half of the year. Figure 10 shows the trend line of the Cathay Pacific share price since March 2020.

Although the outlook for Hong Kong tourism in the short term remains unclear, the Hong Kong Tourism Board (HKTB) and the government have organized several key


Fig. 10 Trend of share price for Cathay Pacific since March 2020 as of Feb 2021 (dates marked on the chart: 9 June and 21 Dec)

initiatives to build the image that Hong Kong is ready to welcome visitors back with warm hospitality once the pandemic situation has stabilized. To facilitate leisure travel between Singapore and Hong Kong, the tourism boards of both places collaborated on an air travel bubble that was due to launch in November 2020. However, because the number of local unlinked cases in Hong Kong rocketed again in November 2020, both parties decided to defer the Travel Bubble to a later date. Both online and offline initiatives have been promoted to encourage Hong Kong citizens to be tourists in their own city as a way to boost domestic consumption. Local celebrities were also invited to engage with an international audience on social media platforms to express a sincere wish to see visitors return to the city.

Conclusion and Future Studies

In this study, we conclude that the XGBoost regressor performs the best among the three selected machine learning models, with the lowest MAE and RMSE (0.53 and 0.69, respectively) and the highest R^2 (0.93). The Feature Importance analysis shows that the CPI of the country of origin, the Hong Kong CPI, and the exchange rate play important roles in predicting Hong Kong’s tourism in the pre-COVID period. Since tourism demand refers to travelers' demand for tourism products that satisfy their desire to travel, more in-depth studies of the economic principles behind these relationships are needed in order to explore the economic implications of tourism more deeply. The preceding data analysis shows that Hong Kong’s tourism industry has indeed been greatly affected by COVID-19. In addition to analyzing the factors that affect tourism most, future studies can focus on which methods and factors are most effective for the recovery of Hong Kong’s tourism industry. For example, due


to the gradual recovery of the economy in this period, there will be great demand for business travel in the near future. Building travel plans around business travel customers in Hong Kong could therefore be a breakthrough for marketing innovation.

References

1. Dewangan A, Chatterjee R (2018) Tourism recommendation using machine learning approach. In: Progress in advanced computing and intelligent engineering. Springer, Singapore, pp 447–458
2. Koutras A, Panagopoulos A, Nikas IA (2017) Forecasting tourism demand using linear and nonlinear prediction models. Acad Tur Tour Innov J 9(1)
3. Baldigara T, Koić M (2015) Modelling the international tourism demand in Croatia using a polynomial regression analysis. Turist Pos 15(15):29–38
4. Claveria O, Monte E, Torra S (2016) Combination forecasts of tourism demand with machine learning models. Appl Econ Lett 23(6):428–431
5. Chen Z (2018) A comparison of short-and long-haul vacation tourists on evaluation of attractiveness: the case of Hong Kong. Int J Hum Soc Sci 5(2)
6. Strielkowski W (2020) COVID-19 recovery strategy for tourism industry. Center for Tourism Studies

Tourism Analytics, the Case for Hainan China

He Pan, Lu Hengyu, Wei Yuzhi, Wang Qi, Wu Meng, and Zhang Qiqi

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_2.

H. Pan · L. Hengyu · W. Yuzhi · W. Qi · W. Meng (B) · Z. Qiqi
Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, Singapore 639798, Singapore

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_2

Tourism is one of the major economic sectors in the world. For some countries, it can represent more than 20% of GDP, and it accounted for 7% of global trade in 2019 [1]. In some Small Island Developing States (SIDS), tourism accounts for up to 80% of exports, and it also represents a significant share of the national economy in both developed and developing countries. In addition, tourism supports one in ten jobs and provides livelihoods for millions of people in both developing and developed economies.

The COVID-19 pandemic has had a huge impact on the tourism industry, causing travel restrictions as well as a slump in traveler demand. As a result of the pandemic, many countries and regions have imposed quarantines, entry bans, or other restrictions on citizens of, or travelers from, the most affected regions. Others have imposed global restrictions on all foreign countries and regions or prevented their citizens from traveling abroad. According to the UNWTO Report on COVID-19 related Travel Restrictions, as of September 1, 2020, a total of 115 destinations (53% of all global destinations) had eased travel restrictions. Of these, two destinations had lifted all restrictions, while the remaining 113 continued to have some restrictions in place. At the same time, 93 destinations (43% of all global destinations) kept their borders completely closed to international travel. This represents a decrease of 22 destinations compared with July 2020, when most borders were shut. Business travel and international meetings were cancelled or moved online. International tourist arrivals fell by 72% in January–October 2020 compared with the same period in 2019, which translates to 900 million fewer international tourist arrivals.


This is equivalent to a loss of $935 billion in international tourism export earnings, more than 10 times the loss under the impact of the global economic crisis in 2009. The plunge in international tourism is estimated to cause an economic loss of more than $2 trillion in global GDP, more than 2% of world GDP in 2019. McKinsey’s COVID-19 global tourism recovery scenarios show a decline of 35–48% in tourism expenditure in 2020 compared with the previous year. That translates to the top 10 tourism markets losing $1.4 trillion to $1.9 trillion in travel spend within a year.

While the entire travel industry has been hit hard by COVID-19, some sectors have felt more impact than others. Airlines and cruises are among the hardest hit, while vacation rentals and online travel agencies have fared relatively better. Despite government support, most airlines are consistently reporting losses, and some have become insolvent. Air travel is not expected to reach pre-COVID-19 levels globally until 2024, though a faster recovery is possible in Asia. Two-thirds of the world’s aircraft fleet has been parked, and 18 airlines have filed for bankruptcy in a matter of months. The global airline industry is estimated to lose $315 billion in passenger revenue in 2020.

Like airlines, hotels have been hit hard by COVID-19. The hotel sector has seen revenue declines more than four times greater than during the previous two crises combined. Many hotels have made considerable workforce reductions and furloughs, and many were forced to close their doors, either temporarily or permanently. As many as 100 million direct tourism jobs are at risk, in addition to jobs in tourism-related sectors such as the labor-intensive accommodation and food service industries, which provide employment for 144 million workers worldwide. Small businesses (which account for 80% of the global tourism industry) are particularly vulnerable. Ninety percent of countries have closed World Heritage Sites, with significant socioeconomic impacts on communities that depend on tourism. In addition, 90% of museums have closed, and 13% may never reopen.

The way travelers research and purchase travel-related products and services digitally has long been changing profoundly. Recent trends in response to COVID-19 signal an increasing shift to mobile and digital platforms. COVID-19 has accelerated the digitization of most consumers’ daily lives, from grocery delivery to traditional e-commerce. For travel companies, this has meant a rise in the digital sophistication of the average customer during the pandemic. To succeed in this new environment, companies will need to ensure that their digital channels live up to these growing expectations.

Tourism is one of the vital pillars of the economy. Tourists who come into a country and spend money boost the revenue of the local government and the country. Additionally, tourism makes cultural exchange possible, adding to the internationalization of a nation and reinforcing its international standing. In Hainan’s case, tourism is crucial to the local economy, and the backbone sectors of Hainan tourism are:

· Transportation: Airline industry, car rental, and water transport.
· Accommodation: Hotels, shared accommodation, hostels, and cruises.


· Food and beverage: Restaurants, catering, bars, and cafés.
· Entertainment: Casino, shopping centers, travel agencies, and guides.

Hainan is a Chinese island located in the South China Sea. Lying at the same latitude as Hawaii, it is viewed as the only island in China with the potential to become a tropical island resort. The annual average temperature of Hainan is between 23 and 25 °C. Together with five crucial factors in international tourism, namely sunshine, sea, beach, fresh air, and forest, this has positioned Hainan Island as a famous travel destination whose natural capital is as good as that of Hawaii, Bali Island, and Phuket. Important attractions in Hainan include Haikou, Sanya, the east coast, the west coast, and central Hainan, and they encompass travel themes such as history, culture, family fun, natural scenery, wildlife, food, sports, and adventure.

Apart from the above-mentioned factors, government support also plays an important role in the development of tourism in Hainan. In 1988, Hainan ceased to be administered by Guangdong province. Instead, it became an independent province and was designated a Special Economic Zone, attracting both domestic and international investors. Later on, more favorable policies and measures such as “visa issued on spot” significantly increased the accessibility of the island. As a result, Hainan receives extensive attention from visitors all over the world.

Tourism in Hainan started to take off at the end of the twentieth century. In 2000, over 100 million visitors came to this island. In 2007, more than 753,100 international visitors traveled to Hainan, a threefold increase over 1988. These visitors brought in about 302 million dollars of revenue, an almost 20-fold increase over 1988, accounting for 13.94% of the GDP of Hainan province. Around 2009, more than 20 million people visited this island every year, and Hainan entered a new stage of tourism transition. In 2019, Hainan received approximately 83.1 million visitors, 82.1 million of whom stayed overnight. Domestic travelers made up 98.3% of visitors in 2019, and the rest came from Hong Kong, Macau, Taiwan, and foreign countries.

Impacts on Tourism Industry

We start by looking at the aviation industry, as it connects Hainan to the rest of the world and represents the most important part of Hainan’s tourism industry. There are two main airports in Hainan: Meilan International Airport and Sanya Phoenix International Airport. These two airports carry tourists in and out of Hainan Island, so their passenger throughput is an important indicator of the performance of tourism. Figure 1 shows that airport passenger throughput declined by a small margin from 2017 to 2019. An obvious seasonal trend can also be seen in the same figure, with a peak from November to February. There is a cliff-like drop after the outbreak of COVID-19, when the number of passengers was only 14% of that in the same period of 2019. However, there is also an obvious recovery with the improvement of


Fig. 1 Airport passenger throughput of Hainan

domestic epidemic situation. Throughput returned to the same level as in the previous year around August 2020. Although September is considered an off-peak period for Hainan tourism, air passenger numbers showed surprising positive growth. Figure 2 shows the declared runway capacity of Meilan and Phoenix airports. After the sharp decline, the number of aircraft arrivals and departures reached the same level as the previous year in July 2020. The third quarter is typically not a tourist season, so runway capacity declined slightly after September, but it still showed positive YoY growth.

The occupancy rate of flight seats relates significantly to aviation revenue. Because this measurement is not publicly accessible, we use another indicator, “the total number of passengers of single flight”, as a substitute, since it behaves similarly to the occupancy rate. Note, however, that domestic flights use small and medium aircraft with lower passenger loads. International flights were banned during the period, but the occupancy rate recovered surprisingly rapidly. It is worth mentioning that China’s major airlines launched various promotional activities during June 2020, contributing to the recovery of the aviation industry (Fig. 3).

Hospitality is a pillar industry of tourism, and it was badly hit by the outbreak of COVID-19. Figure 4 shows the Average Daily Rate (ADR) and Revenue per Available Room (RpAR) of Hainan hospitality. ADR reflects market demand. Although there are fluctuations caused by seasonal factors, ADR has remained at a stable level over the most recent 4 years. From the perspective of RpAR, on the other hand, revenue per available room kept declining, which means lower income and relatively higher operational cost. Although the daily price on

Tourism Analytics, the Case for Hainan China

Fig. 2 Declared runway capacity of Meilan and Phoenix Airport

Fig. 3 Number of passengers of single flight in-and-out Hainan Province


Fig. 4 Average daily rate of hotels in Hainan

average remained relatively high after the outbreak of COVID-19, it could not bring prosperity to hospitality because of the travel bans. Figure 4 also shows that ADR and RpAR bottomed out during the second quarter of 2020. We collected additional data on the occupancy of Hainan hospitality and plotted the results in Fig. 5. As shown in Fig. 5, hotel occupancy declined sharply after the outbreak of COVID-19 and reached its lowest level around March 2020. From our earlier analysis, the total revenue of hospitality improved gradually after June 2020. Both aviation and hospitality in Hainan suffered a severe shock after the outbreak of COVID-19, and both recovered as the domestic epidemic situation improved, with aviation recovering more rapidly. We concluded that aviation suffered a smaller impact for the following reasons:
· Signs of recovery appeared around the end of Q1 2020 for aviation, but only around the end of Q2 for hospitality. Aviation therefore recovered more rapidly than hospitality.
· In terms of cost, hospitality has relatively high fixed costs, while the main operational costs of aviation are more flexible. Although the government has given hospitality more preferential policies such as tax relief, rent remains a high fixed cost for hotels, and this expenditure is unavoidable regardless of whether any guests are staying. In contrast, the aviation industry can control some of its costs in times of crisis by reducing the number of flights.
Business insiders pointed out that aviation as well as hospitality will continue to recover for more than half a year. COVID-19 affected both leisure and business travel at the beginning of 2020, so instead of directly showing “retaliatory growth”, tourism revenue receipts may not reach the level of previous years. In


Fig. 5 Hotel occupancy rate on average of Hainan

addition, the domestic consumer market needs to go through recovery, cultivation, and psychological adjustment. COVID-19 has spread around the world, and the prevention and control measures introduced by various countries will also restrict the Hainan tourism market in various ways. Hence, aviation and hospitality will not develop too fast in the short term. Hainan tourism enjoys a few good months each year, especially in winter, and this seasonal effect can be observed in Fig. 6. When COVID-19 became a pandemic, few travelers came to Hainan in 2020 because of health risks and travel restrictions. Although international visitors account for only a small part of Hainan’s visitors, approximately 10% on average, foreigners’ average spending is typically much higher than that of domestic visitors. Hence, the overall decrease in foreign visitors significantly reduces Hainan’s total tourism revenue. The share of total revenue contributed by international visitors shrank accordingly, from 7% to less than 1% in 2020 (Fig. 7). Domestic and international visitors in Hainan have very different levels of consumption. From 2017 to the end of 2019, it can be inferred from the figure that the average revenue from overseas visitors is around RMB ¥4,200, which is 3 times higher than that from domestic visitors. However, the curve of average revenue from overseas visitors shows a sudden downward spike at the beginning of 2020. At that time, COVID-19 was just beginning to spread to most countries in the world and triggered panic among travelers. As a result, some travelers moved on more quickly and changed hotels and tourist attractions more frequently. This shows up in the figure because average revenue is calculated as total revenue divided by the number of visitors, where the number of visitors is the overnight-stay count provided by hotels, so a visitor who changes hotels may be counted more than once.
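The per-visitor figure discussed above is a simple ratio. The following minimal sketch illustrates the arithmetic and the double-counting effect; the function name and all input numbers are hypothetical placeholders, chosen only so that the results line up with the roughly RMB ¥4,200 and 3:1 ratio quoted in the text. They are not taken from the Hainan dataset.

```python
def revenue_per_visitor(total_revenue_yuan, overnight_visitors):
    """Average spending per counted overnight visitor (same currency as the revenue)."""
    if overnight_visitors <= 0:
        raise ValueError("visitor count must be positive")
    return total_revenue_yuan / overnight_visitors


# Hypothetical monthly figures, for illustration only.
domestic = revenue_per_visitor(total_revenue_yuan=7_000_000_000, overnight_visitors=5_000_000)
overseas = revenue_per_visitor(total_revenue_yuan=420_000_000, overnight_visitors=100_000)
ratio = overseas / domestic

# If 20% of overseas guests change hotels once, hotels report 120,000 stays
# for the same 100,000 people, and the computed average drops.
overseas_double_counted = revenue_per_visitor(420_000_000, 120_000)
```

Because the denominator is a count of hotel stays rather than of distinct people, any increase in hotel switching lowers the computed average even when actual spending per person is unchanged.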


Fig. 6 Tourism income

Fig. 7 Consumption comparison between domestic and overseas visitors

Analytics Methodology In January 2020, COVID-19 was reported in Wuhan, China. By February, outbreaks had been reported in cities and provinces across the whole country. Outbreaks of COVID-19 in more than 100 countries and regions around the world have been reported by the


World Health Organization (WHO), and the global tourism industry has suffered a huge impact. Although Hainan was not greatly affected by the epidemic itself, tourism and its related services, which are the major industry in Hainan, were inevitably affected. Accurate visitor forecasts are important for the development of the tourism industry, as the local government can use them to make decisions about tourism in Hainan. This section uses time series models to make short-term forecasts of the number of overnight domestic tourists in Hainan under a scenario without the epidemic and compares them with the actual number of overnight stays during the epidemic. From this comparison, we can evaluate the impact of the COVID-19 outbreak on tourism in Hainan. For the analysis, we collected five years of monthly tourism data from Hainan Province, China. Covering the period from November 2015 to December 2020, the dataset contains various indicators, such as the Tourism Consumer Price Index, Number of Overnight Visitors (Domestic Travelers and Foreign Visitors), Tourism Revenue, Tourist Throughput, and Aircraft Landings and Takeoffs. These indicators give a detailed overview of the tourism industry in Hainan, China. We use the monthly Number of Domestic Overnight Visitors, Number of Foreign Overnight Visitors, and Tourism Revenue to measure the overall situation of the tourism industry in Hainan. The Number of Overnight Visitors refers to visitors who stay for a continuous period from one day up to one year for tourism, including leisure, shopping, sightseeing, visiting relatives, attending conferences, and engaging in cultural, sports, and religious activities. It consists of the number of domestic overnight visitors and the number of foreign overnight visitors; both are good measures of tourism. Tourism revenue is calculated from all tourism expenses made by tourists. 
This indicator is widely used to measure the operational status of tourism. We now discuss the time series models adopted. We modeled the data with the Holt Linear (HL) model, the Holt Winter’s Exponential Smoothing (HWES) model, the Moving Average (MA) model, the Autoregressive Integrated Moving Average (ARIMA) model, and the Seasonal Autoregressive Integrated Moving Average (SARIMA) model. Holt Linear is a two-parameter model, also known as linear exponential smoothing. It is a popular smoothing model for forecasting data with a trend. The Simple Exponential Smoothing (SES) method models the next time step as an exponentially weighted linear function of observations at prior time steps. Holt Winter’s Exponential Smoothing (HWES) does the same while also taking trend and seasonality into account. Holt-Winters thus models three aspects of the time series: a typical value (average), a slope (trend) over time, and a cyclical repeating pattern (seasonality). In time series analysis, the moving average (MA) model, also known as the moving average process, is a common approach for modeling univariate time series. The moving average model specifies that the output variable depends linearly on the current and various past values of a stochastic term.
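The three smoothing methods described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names, the initialization of level, trend, and seasonal terms, and the smoothing parameters are all choices made here for illustration, and the chapter does not say which software or settings were used.

```python
def simple_exp_smoothing(series, alpha):
    """SES: return the one-step-ahead forecast after the last observation."""
    level = series[0]
    for y in series[1:]:
        # New level is a weighted mix of the latest value and the old level.
        level = alpha * y + (1 - alpha) * level
    return level


def holt_linear(series, alpha, beta, horizon):
    """Holt's linear method: smooth a level and a trend, then extrapolate."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


def holt_winters_additive(series, m, alpha, beta, gamma, horizon):
    """Additive Holt-Winters with season length m (m = 12 for monthly data)."""
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons")
    # Assumed initialization: level and trend from the first two season means,
    # seasonal terms as deviations from the first season mean.
    first = sum(series[:m]) / m
    second = sum(series[m:2 * m]) / m
    level, trend = first, (second - first) / m
    season = [series[i] - first for i in range(m)]
    for t in range(m, len(series)):
        y, s_old = series[t], season[t % m]
        prev_level, prev_trend = level, trend
        level = alpha * (y - s_old) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        season[t % m] = gamma * (y - prev_level - prev_trend) + (1 - gamma) * s_old
    n = len(series)
    return [level + h * trend + season[(n - 1 + h) % m] for h in range(1, horizon + 1)]


# Toy series (hypothetical): a repeating four-period pattern with no trend.
toy = [10, 20, 30, 20] * 3
toy_forecast = holt_winters_additive(toy, m=4, alpha=0.4, beta=0.2, gamma=0.3, horizon=4)
```

On a purely periodic toy series the Holt-Winters sketch reproduces the pattern, while Holt's linear method, which has no seasonal term, cannot. This is consistent with the large Holt Linear errors on seasonal tourism data reported in Table 1.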


In time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalization of the autoregressive moving average (ARMA) model, which combines an autoregressive component with a moving average component. Non-seasonal ARIMA models are generally denoted ARIMA(p, d, q), where the parameters p, d, and q are non-negative integers: p is the order (number of time lags) of the autoregressive model, d is the degree of differencing (the number of times the data have had past values subtracted), and q is the order of the moving average model. Seasonal ARIMA models are usually denoted ARIMA(p, d, q)(P, D, Q)m, where m is the number of periods in each season and the uppercase P, D, and Q are the autoregressive, differencing, and moving average terms for the seasonal part of the model. Seasonal Autoregressive Integrated Moving Average (SARIMA), or Seasonal ARIMA, is an extension of ARIMA that explicitly supports univariate time series data with a seasonal component. It models the next step in the sequence as a linear function of the differenced observations, errors, differenced seasonal observations, and seasonal errors at prior time steps. In other words, it combines the ARIMA model with the ability to perform the same autoregression, differencing, and moving average modeling at the seasonal level.
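The differencing steps behind the d and D parameters can be illustrated directly. The sketch below is a plain-Python illustration, not code from the study; the helper name `difference` and the toy series are assumptions made for this example.

```python
def difference(series, lag=1):
    """Subtract from each value the value `lag` steps earlier."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Toy series (hypothetical): a linear trend plus a four-period seasonal pattern.
pattern = [0, 5, 10, 5]
toy = [2 * t + pattern[t % 4] for t in range(12)]

# Seasonal differencing (D = 1 with m = 4) removes the repeating pattern
# and leaves the trend's contribution over one season, a constant.
seasonal_diff = difference(toy, lag=4)

# One further regular difference (d = 1) removes what is left of the trend.
stationary = difference(seasonal_diff, lag=1)
```

For monthly tourism data the seasonal lag would be m = 12. After differencing, the autoregressive and moving average terms are fitted to the remaining, roughly stationary series.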

Model Selection Seasonality is the most important feature in tourism data. A stationary time series is one whose statistical properties do not depend on the time at which the series is observed. Thus, time series with trends or seasonality are not stationary, because the trend and seasonality affect the value of the series at different times. Transformations such as logarithms can help to stabilize the variance of a time series. Differencing can help to stabilize the mean of a time series by removing changes in its level, thereby eliminating trend and seasonality. To select the best time series model for forecasting tourism indicators, we use the tourism data from Nov 2015 to Dec 2018 as the training set and the data from Jan 2019 to Nov 2019 as the testing set. Before Dec 2019, COVID-19 did not have any impact on tourism, so we can evaluate and verify model performance on these data. Model performance is shown in Table 1, where the Root Mean Square Error (RMSE) of each model is tabulated for the three indicators (Domestic Tourists, Foreign Tourists, and Tourism Revenue). We selected the best model by voting across the indicators. The best time series model is the Holt Winter model, which has the lowest RMSE on two of the three indicators (Domestic Tourists and Tourism Revenue). SARIMA and ARIMA also perform well, but the Holt Linear model has very high errors. The detailed results for Holt Winter and SARIMA are shown in Fig. 8. One of the most important techniques applied in the Holt Winter model is exponential smoothing. It is a rule-of-thumb technique for smoothing time series data using the exponential window function. Whereas in the simple moving average the

Table 1 RMSE of time series prediction models

Models            Domestic tourists   Foreign tourists   Tourism revenue
Holt Linear       165.42              2.99               33.26
Holt Winter       52.64               2.09               21.19
Moving average    79.87               1.92               22.55
ARIMA             86.35               2.13               21.79
SARIMA            53.82               2.16               21.42

past observations are weighted equally, exponential smoothing uses exponential functions to assign exponentially decreasing weights over time. With the seasonal parts in the model, SARIMA is capable of modeling a wide range of seasonal data. The seasonal part consists of terms that are similar to the nonseasonal components of the model but involve backshifts of the seasonal period. Since the tourism data of Hainan are typical seasonal data with 12 seasonal lags, SARIMA performs very well. When selecting a model, randomness and the shortcomings of RMSE should also be considered. The RMSE metric favors a model whose individual errors are of consistent magnitude, as large variations in error will increase the RMSE. Because the errors are squared, a few poorly predicted values in an otherwise good forecast can increase the RMSE. In our case, we can safely conclude that Holt Winter is the best model under the RMSE metric. COVID-19 was reported in China in Dec 2019, and from then on the pandemic had a constant impact on tourism. By comparing the predicted values with the actual statistics, it is possible to quantify the impact of COVID-19 on local tourism. Table 2 shows the comparison between predicted and actual values. From February to April 2020, the tourism industry was badly affected by the epidemic, with the number of domestic overnight visitors falling short of the forecast by over 70%. As the epidemic eased, the impact of COVID-19 on the tourism industry began to subside, and Hainan’s tourism industry began to recover in May. As of August, the number of domestic overnight visitors received by Hainan Province had recovered to around 55% of the forecast level. For Tourism Revenue, the predicted income for February 2020 was 99.94 hundred million yuan, but the actual value was only 14.45 hundred million yuan, a decrease of over 85%. 
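The selection and comparison steps described above can be sketched as follows. The RMSE values are those reported in Table 1, and the February 2020 revenue figures are those quoted in the text. The function names and the exact voting rule (one vote per indicator for the lowest RMSE) are assumptions, since the chapter does not spell out how its voting principle was implemented.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# RMSE values as reported in Table 1.
TABLE_1_RMSE = {
    "Holt Linear": {"domestic": 165.42, "foreign": 2.99, "revenue": 33.26},
    "Holt Winter": {"domestic": 52.64, "foreign": 2.09, "revenue": 21.19},
    "Moving average": {"domestic": 79.87, "foreign": 1.92, "revenue": 22.55},
    "ARIMA": {"domestic": 86.35, "foreign": 2.13, "revenue": 21.79},
    "SARIMA": {"domestic": 53.82, "foreign": 2.16, "revenue": 21.42},
}


def vote_best_model(scores):
    """Assumed voting rule: each indicator votes for its lowest-RMSE model."""
    votes = {}
    indicators = next(iter(scores.values())).keys()
    for indicator in indicators:
        winner = min(scores, key=lambda model: scores[model][indicator])
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get), votes


def percent_deviation(actual, predicted):
    """Shortfall (negative) or surplus (positive) relative to the forecast, in percent."""
    return (actual - predicted) / predicted * 100


best_model, votes = vote_best_model(TABLE_1_RMSE)
feb_2020_revenue_gap = percent_deviation(actual=14.45, predicted=99.94)

# Two toy error patterns with the same mean absolute error (1.0):
even_errors = rmse([0, 0, 0, 0], [1, 1, 1, 1])   # errors spread evenly
one_big_error = rmse([0, 0, 0, 0], [0, 0, 0, 4])  # one large miss
```

The last two lines illustrate the caveat about RMSE: a single large miss doubles the RMSE even though the average absolute error is the same in both cases.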
According to this estimate, the total loss of tourism revenue in Hainan amounted to over 310 hundred million yuan from Jan to June 2020. After July, tourism revenue started to recover and exceeded the forecast by about 20% in October and December. Tourism is the industry most affected by the COVID-19 epidemic. UNWTO estimated that global tourism traffic could fall by 20–30%, which translates to losses of US$ 30–50 billion in international tourism receipts. According to The Statistics Portal for Market Data, global revenue from the travel and tourism industry

Fig. 8 Prediction results of Holt Winter and SARIMA

Table 2 Comparison of predicted versus actual values

Date      Overnight domestic tourists              Overnight foreign tourists               Tourism revenue
          Actual    Prediction  Percentage (%)     Actual    Prediction  Percentage (%)     Actual    Prediction  Percentage (%)
2019/12   790.31    709.80      11.34              17.35     12.92       34.24              114.45    101.53      12.72
2020/1    427.94    771.22      −44.51             8.77      11.23       −21.90             65.81     128.76      −48.89
2020/2    121.81    797.39      −84.72             0.23      16.23       −98.58             14.45     99.94       −85.54
2020/3    207.59    771.06      −73.08             0.38      15.40       −97.53             26.88     79.29       −66.10
2020/4    273.21    817.73      −66.59             0.45      13.65       −96.70             31.95     76.61       −58.29
2020/5    339.55    826.53      −58.92             0.72      13.79       −94.78             41.52     72.52       −42.74
2020/6    354.42    897.49      −60.51             1.06      14.40       −92.64             47.07     81.78       −42.45
2020/7    447.93    962.24      −53.45             1.63      15.56       −89.52             63.22     87.66       −27.88
2020/8    522.49    966.94      −45.96             1.86      14.32       −87.01             80.68     83.24       −3.08
2020/9    509.69    1089.64     −53.22             1.45      15.32       −90.53             80.10     135.03      −40.68
2020/10   645.73    1176.75     −45.13             1.55      17.53       −91.16             149.49    124.27      20.30
2020/11   747.54    1250.48     −40.22             1.38      18.66       −92.61             134.22    125.46      6.98
2020/12   833.60    1167.38     −28.59             2.93      15.20       −80.72             136.95    113.11      21.08


is about to decrease by over 20%. However, several interesting trends can be observed in the Hainan case:
1. The number of foreign tourists was more heavily affected than that of domestic tourists. Figure 7 shows that COVID-19 had a more prolonged effect on foreign tourism than on domestic tourism. The main reason foreign tourists stopped traveling to Hainan is travel restrictions. From the beginning of March, China’s quarantine policy required all foreigners visiting China to stay at a designated quarantine location for 14 days and to leave only after testing negative for COVID-19 in a nucleic acid test. Starting from March 28, 2020, the issuing of various visas, such as port visas, 24-h visa-free transit, and Hainan 30-day visa-free entry, was suspended. From late March, only one of the airlines from each foreign country was allowed to fly to China, and the number of international flights dropped to 108, only 1.2% of the number before the outbreak of the COVID-19 pandemic. As confirmed cases decreased over time, people in China became more eager to go on vacation after staying at home for a long time. As a result, the number of domestic tourists visiting Hainan recovered toward the figure predicted by the time series model within a short time. This rapid recovery mainly comes from three factors: first, confidence that the spread of COVID-19 in China was under control; second, a willingness to relax in nature after a long time at home; and third, the more severe COVID-19 situation reported overseas. The Hainan case also shows that local tourism can help to cushion the deep potential impact of pandemic outbreaks on the tourism industry as well as the national economy. Policymakers and tourism practitioners could consider making it more convenient for people to visit local scenic spots to boost local tourism.
2. 
Although the total number of tourists fell, revenue in late 2020 grew more than expected. We collected the tourism price index and plotted it in Fig. 9. The price index shows an overall decreasing trend in late 2020, so the higher revenue reported earlier must come from other sources. Travel restrictions in different countries made it impossible for locals to travel overseas. This shifted high-spending overseas vacations to domestic travel, as tourists sought a better traveling experience, and revenue per person increased accordingly. Another factor arises from the quarantine policy. Domestic tourists are required to stay at a designated hotel for 14 days and pay for meals and accommodation at their own expense. Such spending can help to make up for the shortfall in the hospitality industry due to COVID-19. Policymakers and tourism practitioners understand the deep potential impact of pandemic outbreaks on the tourism industry as well as the national economy. An appropriate managerial and economic policy response can tame the potentially catastrophic consequences. The response should be both feasible and quick. Without an appropriate managerial and economic policy reaction, the consequences of the pandemic can


Fig. 9 Hainan weekly tourism price index from 2016 to 2020

push any economy into a deep recession. COVID-19 has had a tsunami-like effect on the overall tourism industry, leading to large-scale bankruptcies. The total potential economic impact of the pandemic on the tourism industry and the national economy should be classified as a super cycle event in the economic literature. The outbreak of the COVID-19 pandemic has also reduced certain negative effects of tourism, such as congestion, destruction of cultural icons, and ecosystem degradation, so the next question is whether the current crisis can generate innovative, sustainable solutions. Among the many alternatives, slow tourism and virtual tourism give reasons for optimism. Policymakers and tourism practitioners ought to develop a crisis-readiness mechanism to respond to the current pandemic as well as future ones. To do so, it helps to gain some empirical knowledge of the nature of the COVID-19 crisis. The COVID-19 epidemic has had a huge negative impact on domestic and international tourism in Hainan province. Since a series of travel restrictions was issued, the number of tourists and the tourism revenue of Hainan province have decreased severely. Although the epidemic will continue to exist in the near future, we propose several suggestions, based on the preceding data analysis, for helping Hainan’s tourism business recover from the blow. First, develop tourism programs to attract tourists within the province. Prior to the onset of COVID-19, over 97% of the tourists in Hainan province came from other provinces of China, with the remainder from overseas. The number of international flights has been reduced significantly because of COVID-19, which has reduced the number of international tourists. From 2016 to 2019, foreign tourists averaged 1.92% of all tourists, but in 2020 the average fell to only 0.41% (Fig. 10). 
As for domestic tourists, although things were largely restored in the second half of 2020, inter-provincial travel still comes with some restrictions; for example, tourists may be required to isolate after they return home. This has also reduced the number of tourists from other provinces in China. Under such circumstances, it is essential for tourist attractions and tourist cities in Hainan province to attract more tourists from within the province. Due to the epidemic, traveling from province to province


Fig. 10 Hainan weekly overnight foreign tourists amount from 2016 to 2020

is either restricted or comes with a long quarantine period, so travelers can only travel within the province to meet their travel needs. Tourist attractions and tourist cities could increase their publicity to attract domestic tourists. Second, attract more tourists through promotions on air tickets and hotels. The hassle of traveling since COVID-19 often puts tourists off traveling again. Promotions and discounts on air tickets and hotels would make traveling to Hainan more attractive and compensate for that hassle. According to the Hainan Tourism Consumer Price Index, the prices of tourism accommodation and meals have not fallen significantly, whereas the prices of air tickets have decreased. If tourist attractions and tourist cities in Hainan can offer more discounts on accommodation and meals, they will become more attractive to tourists. It is also important for scenic attractions, hotels, and restaurants to improve service quality and provide a better experience for visitors. Third, develop Virtual Reality (VR) tourism programs. VR technology has been widely used in the tourism industry. Many scenic attractions use VR technology to make scenic guide maps, promotional videos, and similar material. However, VR technology currently plays only a supporting role in tourism and cannot substitute for visiting in person. During the epidemic, some indoor attractions are not accessible or operate at reduced capacity, and VR technology can help to make up for what tourists miss. As VR technology develops, it can be applied in more areas, enhancing the traveling experience and keeping tourism going despite the epidemic.


Conclusions In this study, we analyzed the trends in overnight domestic tourists, overnight foreign tourists, and tourism revenue using time series models and discussed several insights from the visualizations. As shown in previous sections, the COVID-19 epidemic had a great negative impact on the tourism revenue of Hainan province. In February 2020, tourism revenue decreased by 88% compared with February 2019. Modeling the data with time series models, we found that although the number of tourists decreased significantly and has not fully recovered, tourism revenue increased continuously and even achieved a YoY growth rate of 20% in December 2020. We conclude that this could be due to three reasons. Firstly, traveling during this period involves extra hassle and cost, such as nucleic acid amplification testing requirements, so those who do travel are likely to spend more money, and per capita consumption is higher than before. Secondly, tourists who were quarantined in Hainan, for example after entering China from another country, increased the income of hotels and restaurants. Thirdly, people from Hainan province are more willing to travel within the province, which also contributes to the increase in overall tourism revenue. Finally, we made a few suggestions for increasing tourism revenue against the background of the epidemic. There are limitations to this study. The first concerns the calibration of data. The indicator “Number of Overnight Domestic Tourists” mainly comes from the Hainan Provincial Department of Tourism, Culture, Radio, Film, and Sports. Although the statistical caliber is basically the same, some changes may go unnoticed over the long time span. For example, starting from October 2019, “Migratory Bird Visitors” has been newly included in the “Number of Overnight Domestic Tourists”. 
Our study processed the data to compensate for this problem, but there could still be other unspecified changes that affect the modeling results. The second concerns the sufficiency of data. We could not obtain sufficient data for all the dimensions we needed from the public domain, so the results may contain bias arising from incomplete data. For example, when we calculated the per capita tourism consumption of domestic and foreign tourists, we divided total revenue by the number of visitors, and the number of visitors comes from the overnight-stay customer count provided by hotels. As a result, visitors changing accommodation may be counted more than once, which reduces the average revenue from overseas visitors. If sufficient data indicators become available later, this problem can be avoided. The third concerns the use of the time series ARIMA model. We used the ARIMA model to model the “number of domestic overnight tourists”. ARIMA requires the data, or the differenced data, to be stationary, but the number of tourists is often affected by factors such as policies and publicity. For example, Hainan’s promotion of the “International Tourism Island” in 2010 and the construction of the “Free Trade Pilot Zone” on the entire Hainan Island in 2018 have had a significant impact on the number of tourists in later periods. The fitted model does not yet fully reflect these effects. Based


on the characteristics of the ARIMA model, the prediction error of the fitted model will increase with the forecast horizon. The fourth concerns the use of provincial rather than municipal data. Because of data availability, this study uses provincial-level indicators in ARIMA modeling. However, tourism cities such as Haikou and Sanya may be affected more by the COVID-19 epidemic, while the impact on other cities in Hainan is relatively small. If more micro-level data could be obtained, a more detailed analysis would make the modeling results more practical.

Reference 1. United Nations World Tourism Organization, https://www.unwto.org/tourism-and-covid-19-unprecedented-economic-impacts

Impacts of COVID-19 on Food, Aviation, and Accommodation in Europe Chen Shijing, Chen Yuheng, Chen Ziyan, Gong Manlin, Lai Zijun, Lin Dazheng, and Li Hongtao

C. Shijing (B) · C. Yuheng · C. Ziyan · G. Manlin · L. Zijun · L. Dazheng · L. Hongtao Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, 639798, Singapore, Singapore e-mail: [email protected]

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_3.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_3

COVID-19 has been one of the worst epidemics in human history and is still evolving. It was first detected in Wuhan, China, in early 2020 and has caused health risks all over the world. The pandemic is exerting a huge impact on all sectors of the real economy, and global tourism has been one of the most affected during this public health crisis. Global travel and tourism revenue in 2020 was nearly half of the original forecast, reaching only about 396,370 million US dollars. According to a report from Statista 2020 [1], total travel and tourism revenue in North America fell from 181,805 million US dollars in 2019 to 129,230 million US dollars in 2020; total tourism revenue in Europe dropped from 211,972 million US dollars in 2019 to 124,209 million US dollars in 2020; and total earnings from the travel and tourism industry in Asia fell from 225,889 million US dollars in 2019 to 150,404 million US dollars in 2020. In terms of employment, Asia Pacific suffered the greatest loss, with 63.4 million jobs lost in the travel and tourism industry in 2020. This accounts for more than 60% of job losses worldwide, followed by Europe, where more than 13 million workers lost their jobs in the tourism industry during the COVID-19 epidemic. Countries announced full and partial lockdowns, while air travel dropped by over 70% in major cities around the world. The travel restrictions had a deeply negative influence on related industries, such as the restaurant industry. Using the United States as an example, total sales of food and beverage outlets declined from 65.4 billion dollars in February 2020 to 45.7 billion dollars in March 2020 according


C. Shijing et al.

to the National Restaurant Association. In addition, the worldwide year-on-year daily change in seated restaurant diners dropped sharply in March 2020 and showed a declining trend again in November 2020 after a steady recovery.

Dataset and Analysis In this work, we collected data on the food, aviation, and accommodation industries from the Eurostat website [2]. To gain a better understanding of the patterns in the data, we visualized these three industries (aviation, hotels, and restaurants). For the food industry, the data describe the food & beverage production index for the years 2017–2020. We created a dashboard for this data, the Food Industry Production Index Dashboard shown in Fig. 1. The monthly food & beverage production index shows a steady growth trend over the 12 months of each year from 2017 to 2019. However, in 2020, the index dropped rapidly from February, reaching a low of 30.8 in April. Compared to the same period in 2019, the production index decreased by 76% (as can be seen from the year-over-year ratio picture). The index started to recover in May and bounced up by 84% in June. A similar observation can be made in Fig. 2, which depicts the Accommodation Production Index Dashboard. The index rose back to a normal level in August. Overall, the monthly production index in 2020 is lower than in the past few years due to the impact of COVID-19. For the accommodation industry, we collected data on the monthly accommodation production index from 2017 to September 2020 and visualized it as shown in Fig. 2. The data have a similar pattern to the previous Food Industry Production Index

Fig. 1 Food Industry Production Index Dashboard


Fig. 2 Accommodation Production Index Dashboard

Dashboard and peaked in August in each of the previous 3 years. However, accommodation production saw a sharp decrease in the first quarter of 2020 due to COVID-19. The lowest point of the index occurred in April 2020, when it was 86.16% lower than in April 2019. After this decrease, the accommodation industry recovered: the accommodation index saw a 140% increase in June and eventually reached its annual peak in August. For the aviation industry, we collected flight number data and aviation production index data and plotted two dashboards. Figure 3 shows the dashboard for flight numbers and Fig. 4 shows the dashboard for the air transport production index. Looking first at Fig. 3, the number of flights in Europe decreased significantly in 2020 compared with 2019. The decrease occurred in the first quarter of 2020, and the number improved slightly from May. Flight numbers for 2020 peaked in August, but even then they were 53% lower than in the same month of 2019. The second figure for the aviation industry, Fig. 4, shows the Air Transport Production Index Dashboard. We can observe a significant decrease in the first quarter. The index bounced back a little from May and finally stabilized in July. In a year-on-year comparison, the air production index from April to September 2020 was reduced by almost 70%. In conclusion, the production index in all three tourism-related industries suffered greatly due to the pandemic in 2020. From April, production in all three industries started to grow. Specifically, the food industry recovered quickly and returned to a normal level in August. However, the number of flights and the air transport production index remained low compared with previous years. Overall, the food industry proved the most resistant to the pandemic.
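The year-on-year comparisons quoted above compare each month with the same month one year earlier. The sketch below shows that calculation on a monthly series; the function name and the index values are hypothetical placeholders for illustration and are not the Eurostat figures used in the dashboards.

```python
def year_over_year_change(monthly_values, period=12):
    """Percentage change of each month relative to the same month one year earlier."""
    return [
        (monthly_values[i] - monthly_values[i - period]) / monthly_values[i - period] * 100
        for i in range(period, len(monthly_values))
    ]


# Hypothetical production index: a flat reference year followed by a shock year.
reference_year = [100.0] * 12
shock_year = [100.0, 100.0, 80.0, 25.0, 40.0, 70.0, 85.0, 95.0, 90.0, 88.0, 85.0, 80.0]

yoy = year_over_year_change(reference_year + shock_year)
trough_month = yoy.index(min(yoy)) + 1  # 1 = January of the shock year
```

Comparing against the same month of the previous year, rather than against the previous month, keeps the normal seasonal peak (August in these industries) from being mistaken for a recovery or a decline.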


C. Shijing et al.

Fig. 3 Flight Number Dashboard

Fig. 4 Air Transport Production Index Dashboard

Methodology and Experimental Results

We built a Long Short-Term Memory (LSTM) neural network to estimate the impact of COVID-19 on the European economy. LSTM is an artificial neural network that can model time series data such as tourism data. Unlike regression-like models such as ARIMA and its variations, an LSTM model does not explicitly express the relationship between dependent and independent variables. The economic activities surrounding tourism are influenced by a wide variety of factors. A model that relies on explicit independent variables cannot incorporate all of them, which may result in omitted-variable bias. The strength of LSTM is its ability to model such variables indirectly by learning from past instances that already reflect them. Therefore,


Fig. 5 LSTM model for accommodation industry

the LSTM model is, in this case, more suitable for analyzing the impact of COVID-19 on the economy. A Recurrent Neural Network (RNN) is a deep learning model that takes sequence data as input. It is one of the most widely used models in Natural Language Processing. An RNN applies the same operation to each input in the sequence, combining the current input with information from past instances to predict the current output. When a step completes and produces a result, that result is fed back into the RNN, so each output depends on past inputs as well as the current one. RNNs have some inherent problems: vanishing gradients, exploding gradients, and the additional computation required for longer sequences.
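The recurrence described above can be sketched in a few lines. This is a minimal illustration, assuming a single hidden unit with scalar weights; the function names and weight values are hypothetical and are not taken from the chapter's models.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One step of a single-unit recurrent cell: h_t = tanh(w_x*x_t + w_h*h_prev + b)."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the cell, passing each output back in as the next state."""
    h = 0.0  # initial hidden state
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)  # the same weights are reused at every step
        states.append(h)
    return states

# A single non-zero input followed by zeros: later states are still non-zero,
# because each state carries information about earlier inputs.
states = run_rnn([1.0, 0.0, 0.0, 0.0])
```

In this toy run the influence of the first input fades step by step, which gives an intuition for why plain RNNs struggle to retain information over long sequences.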


LSTM was first proposed by Hochreiter and Schmidhuber in 1997 and was later improved and popularized by Alex Graves. It has since achieved considerable success and is widely used. The repeating module in a standard RNN contains a single layer. An LSTM has the same chain-like structure, but its repeating module is different from that of the RNN. We studied the three industries most severely affected by the pandemic: aviation, hotels and resorts, and restaurants. The study period runs from January 2010 to September 2020, with monthly records. We used the total production value of each industry to evaluate its performance. The baseline is January 2010, which has an index of 100. We used a decade of data so that the LSTM model would have sufficient training data for more precise modeling. We regarded COVID-19 as a shock to the market, so its impact is not reflected in data from before the outbreak. We therefore predicted how each industry would have performed in the absence of the pandemic, assuming that past trends continued, and compared the prediction with the actual figures after the outbreak to measure the impact on the industry. The COVID-19 pandemic first became known to the public when the first outbreak was recorded in Wuhan, China, where the government imposed movement restrictions to limit people’s activities. We trained our model on data from January 2010 to December 2019, a period that was not influenced by the pandemic. We then made a nine-step prediction (January 2020–September 2020) with the trained model. For the hotel and resort industry, we built an LSTM model with four hidden layers to model the performance of the accommodation sector, with 128 units in each of the first three layers and 64 units in the last layer. The dropout rates are 0.3, 0.3, 0.3, and 0.2 for the four layers. We used “nadam” as the optimizer.
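One common way to train on pre-pandemic months and then make a nine-step prediction is to build sliding windows for training and feed each prediction back in as an input. The sketch below illustrates that idea only. The 12-month lookback, the synthetic index, and the `naive_model` stand-in for a trained network are all assumptions; the chapter does not state its window length or forecasting loop.

```python
def make_windows(series, lookback):
    """Turn a series into (inputs, target) pairs: `lookback` past values -> next value."""
    return [(series[i:i + lookback], series[i + lookback])
            for i in range(len(series) - lookback)]

def recursive_forecast(predict_next, history, lookback, steps):
    """Forecast `steps` values, feeding each prediction back in as an input."""
    window = list(history[-lookback:])
    forecasts = []
    for _ in range(steps):
        y_hat = predict_next(window)
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]  # slide the window forward by one month
    return forecasts

# Hypothetical monthly index for January 2010 to December 2019 (120 values).
train = [100.0 + 0.5 * m for m in range(120)]
pairs = make_windows(train, lookback=12)  # training examples for a one-step model

def naive_model(window):
    """Stand-in for a trained one-step predictor: repeat the last monthly change."""
    return window[-1] + (window[-1] - window[-2])

# Nine-step baseline for January to September 2020, with no pandemic data involved.
baseline_2020 = recursive_forecast(naive_model, train, lookback=12, steps=9)
```

Replacing `naive_model` with a fitted network's one-step prediction function would leave the rest of the loop unchanged.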
The model performance is plotted in Fig. 5; the top graph shows the performance of the accommodation sector before the pandemic and the bottom graph shows its performance during the pandemic. The orange line is the predicted value and the blue line is the actual data. Based on our estimation, the accommodation industry shrank by an average of 45.76% each month relative to the projected values. The worst month was April 2020, when the industry produced only 14.35% of the value estimated for the absence of the pandemic. The industry then recovered slightly: in September, the actual value was 56% of the predicted value. This change may be because the industry slowly adapted to the pandemic and developed new strategies, such as staycations, to attract customers. For the aviation industry, we built another LSTM model with a similar structure: four hidden layers, with 128 units in each of the first three layers and 64 units in the last layer. The dropout rates are 0.5, 0.3, 0.3, and 0.2 for the four layers. We used “nadam” as the optimizer. The model performance is shown in Fig. 6. The orange line is the predicted value and the blue line is the actual data. Aviation shrank by an average of 55.12% each month. In the worst month, April, the actual value was only 21.92% of the predicted value in the absence of the pandemic. Aviation is still far from


recovery: in the most recent data collected, actual production was only 32% of the predicted value. At the time of writing, many countries still impose travel restrictions and ban foreigners from entering in order to control the spread of COVID-19. If these travel restrictions remain in place, it will be hard for the aviation industry to recover. We built another LSTM model for the food and beverage industry with four hidden layers, with 128 units in each of the first two layers and 64 units in each of the remaining layers. The dropout rates are 0.5, 0.5, 0.3, and 0.3 for the four layers. We used “nadam” as the optimizer. The model performance is shown in Fig. 7; the orange line is the predicted value and the blue line is the actual data. The food and beverage industry on average shrank by

Fig. 6 LSTM model for aviation industry


31.90% each month. In the worst month, April, the actual value was only 24.69% of the predicted value in the absence of the pandemic. Food and beverage recovered faster than the other two industries: actual production was 88.65% of the predicted value in August and 78.67% in September. As of September 2020, most places in Europe allowed restaurants to reopen with safe management measures.
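The comparisons above all relate actual production to the no-pandemic prediction. A minimal sketch of that arithmetic follows. It assumes that the average monthly shrinkage is the mean of the monthly shortfalls, which is one plausible reading of the text, and it uses made-up index values rather than the chapter's data.

```python
def share_of_baseline(actual, predicted):
    """Actual production as a percentage of the no-pandemic prediction."""
    return 100.0 * actual / predicted

def monthly_shortfalls(actual, predicted):
    """Percentage by which each month's actual value falls below the prediction."""
    return [100.0 - share_of_baseline(a, p) for a, p in zip(actual, predicted)]

# Hypothetical index values for three months (not the chapter's data).
predicted = [120.0, 125.0, 130.0]
actual = [90.0, 25.0, 65.0]

shortfalls = monthly_shortfalls(actual, predicted)
average_shortfall = sum(shortfalls) / len(shortfalls)  # mean monthly shrinkage
worst_month = max(range(len(shortfalls)), key=shortfalls.__getitem__)  # index of the deepest drop
```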

Fig. 7 LSTM model for food and beverage industry


Recommendation and Conclusion

In conclusion, the aviation industry suffered the most from the pandemic because of travel restrictions. Hotels and resorts were the second worst hit, but they could continue their business with a revised revenue model such as staycations. Restaurant and catering services were the least affected, as they could still draw revenue from local residents. The F&B industry suffered a temporary shutdown in April 2020, when governments ordered most restaurants to close, but it recovered quickly after the bans were lifted. Using hotels as temporary isolation facilities can help ease the losses suffered by hotels. Although governments in China and in some American cities have started hotel isolation programs with selected hotels, the practice has yet to be widely adopted in Europe and the United Kingdom. The need to comply with safe distancing measures and hygiene practices will drive service providers to adopt robotic automation in order to serve clients contactlessly. With the development of AI and robotic technologies, machines can serve people as well as human staff can. COVID-19 is a turning point for businesses to accept these technologies and reshape the future of the industry. Delivery is a promising service for the restaurant industry in a depressed environment. According to Kim et al. [3], delivery services contributed most of restaurants’ sales during the COVID-19 epidemic, while discounted dine-in options failed to lift sales. This may be because customers were concerned that dining in would put them at higher risk of contracting the coronavirus. Although most countries have travel restrictions in place, these apply to passengers only. Logistics services are still in strong demand, and airline operators can use their existing fleets to transport cargo. 
Some medicines and vaccines have very strict requirements for storage and transportation. For example, the COVID-19 vaccine developed by Pfizer-BioNTech must be stored at about −70 degrees Celsius. Airlines can therefore tap this demand to keep their fleets in operation. The COVID-19 pandemic has had a great impact on the world economy, and we believe that it will continue to affect our economy. In this work, we studied the aviation, hotel and resort, and restaurant and catering industries. Compared with 2019, these industries shrank by 70–80% in 2020 in the worst period. To quantify the impact of COVID-19, we built LSTM models to predict how these industries would have performed in 2020 had COVID-19 not happened, and compared the predictions with the actual data. From our estimation, restaurant and catering services were the first to recover from the pandemic, followed by the hotel and resort business. The aviation industry was still in a difficult period because travel restrictions still prevailed in most regions. Based on our study, we make two suggestions to better support the industries and economic regrowth. First, the excess capacity of hotels can be used to host people in mandatory isolation. Second, restaurants should promote delivery services and continue them even after the pandemic is over, to cushion the industry in the most difficult times.


References

1. Statista Report, https://www.statista.com/studies-and-reports/
2. Eurostat, https://ec.europa.eu/eurostat/web/main/data/database
3. Kim J, Kim J, Wang Y (2021) Uncertainty risks and strategic reaction of restaurant firms amid COVID-19: Evidence from China. Int J Hosp Manag 92:102752

Tourism Rebounds Analysis—Lessons from Baltics Countries

Long Zhaowen, Wei Kexian, Wu Mengran, Xiong Yike, Yang Yafeng, Zhao Chenxi, and Zhou Yang

L. Zhaowen · W. Kexian · W. Mengran (B) · X. Yike · Y. Yafeng · Z. Chenxi · Z. Yang, Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, Singapore 639798, Singapore

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_4

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_4.

Tourism is a major industry and contributor to Singapore’s economy. It attracted more than 17 million international tourists in 2017, more than three times Singapore’s total population, and accounted for 4.1% of Singapore’s GDP. Tourism’s contribution to Singapore’s GDP is projected to rise to 4.4% in 2028. However, the coronavirus (COVID-19) pandemic has changed everything and triggered a global economic crisis. To control the spread of the virus, governments around the world have introduced unprecedented measures, including restrictions on travel, business operations, and human interactions. The Singapore government responded promptly. On February 1, 2020, Singapore placed restrictions on tourists from China and progressively extended them to other countries. On April 3, 2020, Singapore’s Prime Minister announced the government’s decision to adopt a temporary measure called the "Circuit Breaker", further escalating travel and entry restrictions. The entry restrictions directly resulted in a decline in the number of inbound tourists. In this work, we reflect on Singapore’s tourism income, international arrival statistics, and specifically how the hotel industry fared during the pandemic. According to data from the Singapore Tourism Board [2], Singapore’s international arrivals reached about 1.6 million in January 2020. The tightening of border controls from February to April of the same year brought this number to almost 0. After several months of tightened measures, the pandemic began to come under control, the "Circuit Breaker" ended, and Singapore started to reopen. However, arrivals remained far below the January level. For example, in January, when no restrictions on international travel were in place, China and the ASEAN region contributed


to most international arrivals. As the virus continued to spread around the world and more restrictions were imposed, international arrivals dropped significantly. In July of the same year, Singapore started to reopen. Tourists from China (where COVID-19 was well controlled at that point in time) became the largest source of international arrivals, and other countries where COVID-19 was also well controlled, such as Japan, became the second major source. Gradually opening the border to different countries plays a significant role in lifting international arrivals and the tourism industry in Singapore.

Business Understanding and Approach

This study consists of three phases, as shown in Fig. 1: understanding the current situation, modeling the baseline growth trajectory, and evaluating policy effectiveness. The objective of phase I is in line with the first two steps of the Cross-Industry Standard Process for Data Mining (CRISP-DM), namely business understanding and data understanding: to understand what the Singapore hotel industry landscape looks like, and to understand what data we need, where to get it, and why we need it. Figure 2 shows how the Singapore hotel industry landscape can be divided into two segments: international receipts and domestic receipts. Domestic receipts make up only a thin slice of total receipts. Four key drivers influence international receipts. The number of international arrivals should exclude 1-day trips, as they do not contribute to hotel stays. Such 1-day trips include transit and travel to nearby locations such as Malaysia and Indonesia. The “Dollars spent per Night in Hotel” driver includes room service, hotel restaurants, amenities, etc.

Fig. 1 Tourism analytics approach in 3 phases


Fig. 2 Singapore hotel industry business framework

In phase II of the study, we reflect the actual data from 2020 and best-guess estimates of the 2021 hotel industry outlook, assuming that the government rolls out no new measures or policies. We attempt to collect the relevant data shown in the T-shaped table on the right of Fig. 2. It lists the data we are looking for, where we find them, what assumptions we make if we cannot find the data, and why we need the data. For instance, data on tourism income and international arrivals (excluding intra-day travel, most of which comes from Malaysia by land and Indonesia by sea) over the past years are needed to project hotel industry growth without the impact of COVID-19. We examined time series autoregression and linear regression for their suitability for this estimation. Linear regression requires feeding various independent variables into the model, such as global/ASEAN/regional economic indices, forex, and the cyclical effect of the 2020 Olympics in Asia, and the 2020 values of these variables are all skewed by the COVID-19 pandemic. Hence, autoregression using data prior to 2020 is employed in this phase of the study. Specifically, the following data are used:

• Quarterly tourism receipts for the period 2008–2019, with a specific sub-category called “Accommodation” (data obtained from the Singapore Tourism Board).
• Singapore inbound international visitor arrivals for the period 2008–2019, excluding arrivals of Malaysians by land (data obtained from https://www.singstat.gov.sg/).
• Tourism growth for the year 2004, post SARS.
• Travel bubble among the Baltic states of Estonia, Latvia, and Lithuania from May to September 2020 (data obtained from the Estonian government statistics bank statistika, https://www.eestipank.ee).

To enable our prediction of the effect of the travel bubble on international arrivals and the hotel industry, another model was built using data from the only travel bubble that had taken place, the one among the Baltic countries in 2020. We believe our growth trajectory prediction for how the hotel industry will rebound is


appropriate, as we have obtained credible data from Estonia to back this work, which is similar in nature to the tourism rebound after SARS in 2003. Two policies were investigated quantitatively in this study for their potential effectiveness: travel bubbles and local stimulus to generate higher demand for hotels. Other policies are analyzed qualitatively. Combining both approaches, we formulate a hotel industry revival playbook to pave the way forward.

Data Model Analysis

This first section presents predictions of important economic indicators in the absence of COVID-19. We selected three indicators to measure the performance of the tourism industry and the impact on it: tourism spending, international visitor arrivals, and tourism-related employment.

Tourism Income Baseline Growth Trajectory 2020–2021, Without COVID

Tourism spending is a key statistic that directly reflects the state of the tourism industry. It comprises five categories of expenditure: accommodation; food and beverages; shopping; sightseeing, entertainment, and gaming; and others. Because the target variable is a time series, we apply two time series models, autoregression (AR) and ARIMA, and one machine learning technique, XGBoost. The model configurations and results are explained in detail below. Autoregression is a time series model that uses observations from previous time steps as inputs to a regression equation to predict the value at the next time step. It assumes that the observations at previous time steps are useful for predicting the value at the next time step. This relationship between a series and its own past values is called autocorrelation. If the two move in the same direction (both go up together or down together), the correlation is positive. If they move in opposite directions (one goes up while the other comes down), the correlation is negative. In this section, we built two models: one relates each observation to the previous observation (AR1), and the other relates each observation to the previous two observations (AR2). After applying linear regression, the second model achieves a higher accuracy of 92.30%, which means that the observation two steps back still plays an important role in prediction. Figure 3 shows the actual values, the predictions from the first model (AR1), and the predictions from the second model (AR2). Both models give predictions close to the actual values.
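Fitting an autoregression by linear regression on lagged values can be sketched as follows. This is an illustrative implementation using ordinary least squares on a synthetic series. The helper names and the data are assumptions, and the chapter's own fitting code and accuracy metric are not shown in the text.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back-substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ar(series, p):
    """Least-squares fit of y_t = c + a_1*y_{t-1} + ... + a_p*y_{t-p}."""
    rows = [[1.0] + [series[t - k] for k in range(1, p + 1)]
            for t in range(p, len(series))]
    targets = series[p:]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)  # [c, a_1, ..., a_p]

def predict_next(series, coeffs):
    """One-step-ahead prediction from the most recent p observations."""
    p = len(coeffs) - 1
    return coeffs[0] + sum(coeffs[k] * series[-k] for k in range(1, p + 1))

# Synthetic series that follows an exact AR(2) rule (not the chapter's data).
series = [1.0, 4.0]
for _ in range(28):
    series.append(2.0 + 0.6 * series[-1] - 0.5 * series[-2])

ar1 = fit_ar(series, 1)  # uses only the previous observation
ar2 = fit_ar(series, 2)  # uses the previous two observations
```

On this synthetic series the AR2 fit recovers the generating coefficients, while the AR1 fit cannot represent the dependence on the second lag.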


Fig. 3 Comparison between actual and predicted values for AR1 and AR2

A popular and widely used statistical method for time series forecasting is the Autoregressive Integrated Moving Average (ARIMA) model. It captures a suite of standard temporal structures in time series data and is especially suited to non-stationary series with a random component. The ARIMA model has three important parameters: the autoregressive lag order (p), the degree of differencing (d), and the moving-average lag order (q). To apply the model, we first need to determine these parameters. The stationarity of the data indicates the value of d. As shown in Fig. 4, the data has a clear trend. This suggests that the data is not stationary and requires differencing of at least order 1 to make it stationary. A differencing order of 2 does not make much further difference, so d is set to 1. Figure 5 shows the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots. The ACF plot is used to determine the appropriate q value: the autocorrelation coefficients tail off gradually rather than cutting off, which corresponds to a q value of zero. The partial autocorrelation coefficients cut off at about lag 2. Before this cut-off, the sample partial autocorrelation coefficients of the differenced series are significantly greater than 2 times the standard deviation; after it, almost 95% of the sample partial autocorrelation coefficients fall within the range of 2 times the standard deviation. We estimated the range of each parameter in this way, built models with different combinations, and selected the optimal model based on AIC and BIC. A combination of (1,1,0) yields the best output. As shown in Fig. 6, the predicted values fit the actual values closely. Among all the models, ARIMA predicts with the highest accuracy, 94.00%.
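The identification steps described above, differencing to remove the trend and inspecting autocorrelations against a two-standard-deviation band, can be sketched as follows. The series is hypothetical, and the band uses the common large-sample approximation of 1/sqrt(n) for the standard error, which is an assumption about how the plots in Fig. 5 were drawn.

```python
def difference(series, order=1):
    """Apply first differencing `order` times."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def acf(series, max_lag):
    """Sample autocorrelation coefficients for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - lag] for t in range(lag, n)) / denom
            for lag in range(max_lag + 1)]

# Hypothetical quarterly series: a linear trend plus a small alternating component.
series = [100.0 + 5.0 * t + (2.0 if t % 2 else -2.0) for t in range(40)]

levels_acf = acf(series, max_lag=4)               # dominated by the trend
diff_acf = acf(difference(series, 1), max_lag=4)  # trend removed by d = 1

# Approximate 95% band (about 2 standard errors) for judging significance.
band = 2.0 / len(series) ** 0.5
```

A slowly decaying `levels_acf` is the usual sign that differencing is needed, and coefficients of the differenced series that fall inside `band` are treated as not significantly different from zero.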


Fig. 4 Comparison between actual values and differencing order of 1

XGBoost

XGBoost is an ensemble learning method that provides a tree boosting mechanism for solving machine learning problems efficiently. It is known for its speed and good performance in prediction problems. At each iteration, a new tree is fitted to the errors of the current ensemble, so that later trees concentrate on the observations that are still predicted poorly. The dependent variable in this problem is continuous, hence we chose the XGBoost regressor for prediction. The data are reported quarterly, so we used the four periods before the current period, which together make up a complete year, as inputs for prediction. After applying XGBoost, the final accuracy on the test set is 93.64%. Figure 7 shows the actual and predicted tourism income using the XGBoost model.
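The boosting idea behind XGBoost, in which each new tree is fitted to the errors left by the trees before it, can be illustrated with a toy version. The sketch below is not the XGBoost library and omits its regularization and second-order updates. It uses one-split trees (stumps) on a single feature with squared error, and the data, learning rate, and number of rounds are arbitrary assumptions.

```python
def fit_stump(x, residuals):
    """Best single-split regression stump on one feature, by squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # keep both sides of the split non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean

def boost(x, y, rounds=50, learning_rate=0.3):
    """Squared-error boosting: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)  # start from the mean prediction
    stumps = []

    def predict(xi):
        return base + learning_rate * sum(s(xi) for s in stumps)

    for _ in range(rounds):
        residuals = [yi - predict(xi) for xi, yi in zip(x, y)]
        stumps.append(fit_stump(x, residuals))
    return predict

# Hypothetical data: a piecewise-constant target over 12 periods.
x = list(range(12))
y = [10.0] * 4 + [20.0] * 4 + [15.0] * 4
model = boost(x, y)

mean_y = sum(y) / len(y)
mse_mean = sum((yi - mean_y) ** 2 for yi in y) / len(y)
mse_boost = sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y)) / len(y)
```

Comparing `mse_boost` with `mse_mean` shows how repeated fitting to residuals reduces the training error relative to predicting the mean.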

Model Evaluation

We evaluate the models using accuracy, an important indicator for deciding which model to select. Table 1 compares the accuracy of the different models. ARIMA has the highest accuracy among the four models. Although the models differ by only about 2 percentage points, when a model is applied to large quantities, such as thousands of people or millions in income, even 0.1% makes a huge difference. Besides, ARIMA is designed for time series data; it generates forecasts from the most recent observations and past random errors [6], after differencing the non-stationary series into a stationary one. XGBoost is a tree-based machine learning model rather than a dedicated time series model. We used the data across four periods before the current period to predict


Fig. 5 Plots of Autocorrelation ACF (top) and Partial Autocorrelation PACF (bottom)

the current data. The result therefore depends heavily on the time period we choose, and the model performance may be less stable than that of ARIMA. Therefore, we chose ARIMA as the prediction model for tourism income. The actual tourism income in the second quarter of 2020 is 160 million, whereas the forecast value is 7,357 million, about 46 times the actual value. Tourism income was therefore badly affected by COVID-19 (Fig. 8). We now look at the tourism employment baseline growth trajectory for 2020–2021, projected in the absence of COVID-19. Unlike tourism income, the employment data (employment arising from tourism) is yearly, with 25 instances from 1995 to 2019. ARIMA is sensitive to the number of observations and requires at least 30 instances. Hence, we use autoregression to perform the projection. From the previous study on tourism income, we found that using the past two observations to predict performs better than using a


Fig. 6 Comparison between the actual and predicted values for ARIMA

Fig. 7 Comparison between the actual and predicted tourism income for XGBoost

Table 1 Comparison of prediction accuracy on different models

          AR1     AR2     ARIMA   XGBoost
Accuracy  91.91%  92.30%  94.00%  93.65%


Fig. 8 Comparison of predicted value and actual impact of COVID-19

single independent value. Therefore, to predict tourism employment in 2020, we adopted a second-order autoregressive model after experimenting with different configurations. The model achieves an accuracy of 91.12%, and the predicted value of tourism’s contribution to employment in 2020 in the absence of COVID-19 is 105.83. The actual and predicted values of tourism employment are shown in Fig. 9; the two lines fit closely. We also obtained data on tourism income for Q1 and Q2 of 2020. We built a linear regression model of the relationship between tourism income and tourism employment, and then used the 2020 tourism income to estimate tourism employment under COVID-19. The linear regression model achieves an accuracy of 95.94%, and the estimated value of tourism employment in 2020 under COVID-19 is 28.79.
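Using one series to estimate another through a fitted line can be sketched with closed-form ordinary least squares. The income and employment figures below are invented for illustration, and the input of 6.0 stands in for a pandemic-affected income; none of these values come from the chapter.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical yearly pairs of (tourism income, tourism employment).
income = [10.0, 12.0, 14.0, 16.0, 18.0]
employment = [52.0, 61.0, 69.0, 81.0, 87.0]

intercept, slope = fit_line(income, employment)

# Plug in an observed, pandemic-affected income to estimate employment.
estimated_employment = intercept + slope * 6.0
```

Because 6.0 lies outside the range of the fitted incomes, such an estimate is an extrapolation and should be read with caution.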

Fig. 9 Comparison between actual and predicted values for AR2


Fig. 10 Comparison between actual and predicted data for tourism employment

Figure 10 compares the employment data with and without COVID-19. The influence of COVID-19 on tourism employment is severe: the number of people employed decreases by 45.59%. COVID-19 not only reduces Singapore’s tourism income but also affects other aspects such as employment, which ultimately affects GDP. Therefore, a timely response is crucial to reverse this trend.

Prediction of International Arrivals in 2020 and 2021—an Outlook Without COVID-19

Tourism arrivals are another important variable for evaluating the performance of the tourism sector. The previous section showed that the ARIMA model performs best, so we apply ARIMA to predict arrivals. Based on the autocorrelation and partial autocorrelation plots and the trend of the original data, we estimated suitable values for the parameters. We built models with different configurations and selected the optimal one according to AIC and BIC. A combination of (16,1,0) for (p, d, q) achieves the highest accuracy, 95.91%. Figure 11 plots the actual values and the values predicted by the ARIMA model. Figure 12 compares international arrivals with and without COVID-19. Graph (a) depicts the values for the year 2020 and graph (b) depicts the period from 1980 to 2025, including forecasted values. The plots show that COVID-19 has had a significant impact on arrivals. The first positive case of COVID-19 in Singapore was reported at the end of January 2020, and the number grew to more than 100 cases within a few weeks. Since March, the Singapore


Fig. 11 Comparison between actual and predicted values for ARIMA

government has introduced restrictions on visitor entry, which explains why the number stabilizes from April. Table 2 compares actual performance with predicted values. The table shows that COVID-19 has taken its toll on the tourism sector: tourism income decreased by 70% and the number of arrivals decreased by 73% in the first half of the year, and arrivals came to an almost complete halt from April to November.

Fig. 12 Arrivals data: (a) actual and predicted values for arrival data from 2020 to 2022; (b) arrivals data before and after COVID-19


Table 2 Comparison between tourism industry performance with and without COVID-19

                                   Tourism income  International visitor arrivals
                                                   First half   Apr to Nov
Actual values under COVID-19       4,189           2,664,866    57,129
Predicted values without COVID-19  15,831          9,755,777    13,062,684
Decrease rate                      −70%            −73%         −99.56%
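The decrease rates for international visitor arrivals in Table 2 can be reproduced from the actual and predicted values. The sketch below assumes the rate is computed as actual/predicted − 1, the same form as the gap column in Table 3; the function name is illustrative.

```python
def decrease_rate(actual, predicted):
    """Relative gap between the actual and the no-COVID predicted value, in percent."""
    return (actual / predicted - 1.0) * 100.0

# International visitor arrivals from Table 2.
first_half = decrease_rate(2_664_866, 9_755_777)  # rounds to -73
apr_to_nov = decrease_rate(57_129, 13_062_684)    # rounds to -99.56
```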

The Case of Travel Bubble in Estonia

The Baltic countries of Lithuania, Latvia, and Estonia removed travel restrictions and opened their common borders on May 15, 2020. This signaled an opportunity for businesses to reopen and for people to return to normal life. From late April 2020, the authorities loosened their lockdown measures, and the slower spread of COVID-19 led to the formation of a “travel bubble”, in which people can travel freely across borders. The three Baltic states are close partners and had similar epidemiological situations, so synergies could be achieved through a travel bubble. Economic factors also fostered the travel bubble: the three Baltic states have well-integrated economies and rank the lowest in the eurozone in terms of country wealth. They expected their economies to shrink by around 8% in 2020, and Lithuania was prepared for a double-digit contraction if the economies were not reopened in time. On September 11, 2020, Latvia imposed a 14-day quarantine on visitors arriving from Estonia, ending the only travel bubble in the world at the time. This was driven by the rising number of new COVID-19 cases in Estonia: 21 infections per 100,000 population were detected in the previous 14 days. Meanwhile, Latvia still had one of the lowest infection rates in the European Union. In this case study, we examine the outbound travel data of Estonia. The dataset was collected from the Bank of Estonia, an official source of data. The data are available only at quarterly frequency. The dataset starts in the first quarter of 2008 and ends in the third quarter of 2020. It covers the main active period of the travel bubble among the three states, the third quarter of 2020, so that data point is used to estimate the effectiveness of the travel bubble. 
To solve this time series prediction problem, we use the ARIMA model, which is suitable for this problem. We set the three parameters of the ARIMA model: the autoregressive lag order (p), the degree of differencing (d), and the moving-average lag order (q). The three parameters are kept the same across models to keep the data stationary and to make the results easier to compare. We first train the model on data from before COVID-19, covering 2008 Q1 to 2019 Q4. The model’s forecast serves as a non-COVID version of the tourism data, so that the absolute impact of COVID-19 can be illustrated.
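The counterfactual procedure described above, training on 2008 Q1 to 2019 Q4 and forecasting the quarters that follow, can be sketched with a hand-rolled ARIMA(1,1,0)-style model. Several things here are assumptions: the chapter does not report the order used for the Estonian data, the series is synthetic, and the AR coefficient is estimated by least squares rather than by the likelihood-based fitting that statistical packages typically use.

```python
def fit_ar1(values):
    """Least-squares fit of v_t = c + phi * v_{t-1}."""
    x, y = values[:-1], values[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi

def forecast_arima_110(series, steps):
    """ARIMA(1,1,0)-style forecast: AR(1) on first differences, then re-integrate."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    c, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    out = []
    for _ in range(steps):
        last_diff = c + phi * last_diff  # predict the next quarterly change
        level += last_diff               # add it back onto the level
        out.append(level)
    return out

# Synthetic quarterly visitors for 2008 Q1 to 2019 Q4 (48 values, not real data).
changes = [20.0 + 80.0 * 0.5 ** t for t in range(47)]
train = [1000.0]
for d in changes:
    train.append(train[-1] + d)

baseline = forecast_arima_110(train, steps=3)  # 2020 Q1, Q2, Q3 without COVID-19

# Hypothetical observed 2020 Q3 value, compared with the baseline as real/pred - 1.
actual_q3 = 0.5 * baseline[2]
gap_pct = (actual_q3 / baseline[2] - 1.0) * 100.0
```

Using the same order for every destination country, as the text describes, would mean calling `forecast_arima_110` on each country's series and comparing the resulting gaps.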

Table 3 Comparison between estimated and actual numbers of outbound visitors in 2020 Q3

Country             Real value (real)  Prediction (pred)  Gap (real/pred − 1) %
Latvia              146,374            120,262            16.04
Russian Federation  9,364              97,371             −90.38
Sweden              21,378             67,228             −68.20
Italy               6,564              21,428             −69.37

To compare the effect of the external shock of COVID-19 on tourism between regions, we selected three countries of different scales: the Russian Federation (large), Sweden (moderate), and Italy (small). These European countries were chosen because, being in close proximity, they also experienced the COVID-19 pandemic. We used the data from 2008 Q1 to 2019 Q4 to train the model, used it to predict the subsequent 10 quarters, and compared the predictions with the actual figures. The predicted values for the third quarter of 2020 are shown in Table 3, and the full results are shown as line charts in Fig. 13. Table 3 shows that Latvia, the country within the travel bubble, has an actual value higher than the non-COVID prediction, while the actual numbers for the other countries, Russia, Sweden, and Italy, are below the predictions. Other interesting findings can be observed in Fig. 13: all four countries experienced a downward trend from 2020 Q1 to Q2 because of the onset of COVID-19 infections and, to some extent, seasonal factors, which should push the numbers up in the third quarter of 2020. However, only Latvia achieved a result better than the prediction, which supports the effectiveness of the travel bubble.

Fig. 13 Prediction versus actual numbers of visitors from Estonia to Latvia (upper left), Russian Federation (upper right), Sweden (bottom left), and Italy (bottom right)


Fig. 14 Business analysis basis summary

Business Case Analysis

In this section, we evaluate the gaps created by the impact of COVID-19, the policies the government can consider, and how effective they are. Figure 14 summarizes the modeling results and background research that form the basis of our business case analysis.

Policies Effectiveness Quantitative Analysis

Given the 93% reduction in international arrivals in Singapore since the 2020 circuit breaker, and with circuit breaker phase 3 policies remaining in place in 2021, the hotel industry will not receive progressive stimulus in 2021. We project that international arrivals will remain low, with a possible reduction of 80%–90% compared to the pre-COVID level. This will yield hotel industry income of approximately 37% of the pre-COVID projection. To evaluate the effectiveness of the travel bubble and of additional local stimulus for the hotel industry, we computed the proposed figures based on our model’s recommendation, as shown in Fig. 14; the projections are illustrated in Fig. 15. The key points are as follows:

• The baseline of the y-axis in Fig. 15 represents the S$5.8 billion tourism income in 2019. Without the impact of COVID-19, tourism income is projected to grow by 6.2%, reaching S$6.2 billion by the end of 2021.
• With current policies unchanged, the tourism industry is expected to shrink by 60% in 2021, which threatens the survival of related businesses.
• To help bridge the gap, the government can initiate travel bubbles with selected countries or regions, which will provide better yields for the travel industry. Fig. 15 depicts the scenario of travel bubbles with Australia, New Zealand, Japan, Korea, Taiwan, China, and ASEAN countries. For instance, if we open a travel bubble with Australia and New Zealand, which generated 1.3 million visits in 201, we project an increase of 16% in tourist visits after reopening; this bubble can boost hotel industry income by 5.5% relative to the 2019 baseline income. The same rationale applies to Japan, Korea, Taiwan, China, ASEAN, etc.
• The second-to-last bar in Fig. 15 depicts the effectiveness of the existing Rediscover Singapore Program, and the last bar that of a potential enhanced local stimulus. These draw some income, but less than the proposed travel bubble arrangements.

Fig. 15 Policy effectiveness evaluation

From the above analysis, the following insights can be derived:

• With every additional 1 million international arrivals generated by a travel bubble or the easing of border controls, the hotel industry can recover 4% of its annual income in 2021.
• The two larger regions, China and ASEAN, will be the key enablers for the industry, contributing more than 60% of the income expected before the pandemic. This will give businesses more breathing room from the ripple effects of tightening, such as loan defaults, layoffs, and even bankruptcy. Hence, we recommend close monitoring and prioritizing the establishment of travel bubbles with these two regions.
• Stimulus to generate local hotel demand from residents or businesses plays a minor role in helping the industry recover. Singapore is a small island country, and unless the government considers a more radical policy of restricting and subsidizing quarantine stays in hotels to boost the demand for hotel rooms, we are unlikely to see a substantial boost from this initiative.

We also need to consider the current context of the pandemic, weighing the risk of easing border controls and establishing travel bubbles against the overall impact on the economy should the situation worsen after the policy takes effect.
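The headline figures in this analysis can be reproduced with simple arithmetic. The sketch below is a back-of-the-envelope check rather than the model behind Fig. 15: it assumes that the 37% share applies to the no-COVID projection for 2021 and that the 4%-per-million rule is linear, and all names in it are illustrative.

```python
# Figures quoted in the text (S$ billion and shares); everything else is illustrative.
BASELINE_2019_BN = 5.8        # tourism income in 2019
GROWTH_NO_COVID = 0.062       # projected growth to end-2021 without COVID-19
STATUS_QUO_SHARE = 0.37       # share of the pre-COVID projection under current policies
PCT_PER_MILLION_ARRIVALS = 4  # income recovered per extra million arrivals (rule of thumb)


def recovered_income_pct(extra_arrivals_millions: float) -> float:
    """Assumed linear rule: each extra million arrivals recovers about 4% of annual income."""
    return extra_arrivals_millions * PCT_PER_MILLION_ARRIVALS


no_covid_2021_bn = BASELINE_2019_BN * (1 + GROWTH_NO_COVID)
status_quo_2021_bn = no_covid_2021_bn * STATUS_QUO_SHARE

print(f"No-COVID projection for 2021: S${no_covid_2021_bn:.2f} billion")
print(f"Status-quo income for 2021:   S${status_quo_2021_bn:.2f} billion")
print(f"Recovery from 1.3 million extra arrivals: {recovered_income_pct(1.3):.1f}%")
```

Under these assumptions, the no-COVID projection rounds to the S$6.2 billion quoted above, and the linear rule gives about 5.2% for 1.3 million arrivals, in the same range as the 5.5% projected for the Australia and New Zealand bubble.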


Qualitative Analysis of Other Measures for Consideration

In this work, we also analyze Singapore’s tourism industry qualitatively, using a SWOT analysis of its main strengths, weaknesses, opportunities, and threats, as shown in Fig. 16. The strengths include Singapore’s position as a leading business travel destination in Asia, its strong national policy, and its efficient governance. The opportunities include its strategic location and the chance to convert some outbound tourism into domestic tourism. We make the following suggestions on the national policy and consumer fronts. The first is tax reduction and exemption, together with rent and loan deferment, to sustain cash flow for businesses, especially small and medium enterprises and hotel chains. Reducing the guaranteed rate for these enterprises and providing loan guarantees will help them roll out new business initiatives, such as converting hotel rooms or ballrooms into venues for conferences, company functions, and other events. For local tourists, staycation vouchers can be issued. Policy incentives can encourage corporations, community centers, and organizations to channel some of their year-end bonuses and employee awards into local weekend staycations and hotel dining packages, and to hold more conferences in hotels, which will promote business activity among retail, hotels, catering, and other companies. A high level of hygiene is mandated following the pandemic, and disinfection and epidemic prevention procedures can be stepped up so that hotels can be repurposed for quarantine. Looking more broadly, tourists’ demand for services may increase further after the pandemic, which requires the hotel sector and other tourism sectors to strengthen the professional skills of their staff in order to respond to market changes in a timely manner.
We believe that the tourism industry can adopt technology to enter a new normal. For Singapore’s tourism industry, the emerging field is online business. The first technologies with potential are VR and AR, which can be combined to promote online virtual tourism. Second, with businesses shifting online during the epidemic, the tourism industry can use its online presence to attract visitors and acquire new customers. Some projects have achieved initial

Fig. 16 SWOT analysis for Singapore tourism industry after COVID-19


successes, such as broadcasting tourism-related videos on online platforms like YouTube, and Safari Park has launched a new online virtual zoo experience. Finally, the hotel industry can introduce new technologies, including artificial intelligence hotel services such as contactless check-in and drone delivery.

Conclusion

To conclude this work, we recommend a four-step playbook to revive Singapore’s hotel industry, as shown in Fig. 17. First, we recommend that the government promptly reduce taxes and rent and loosen loan policies to sustain cash flow for businesses. Second, we should monitor and prioritize the negotiation of diplomatic travel bubble policies with China and ASEAN, if the risk is deemed manageable, to boost international arrivals in a timely manner. Third, policymakers can promote cooperation among businesses and provide platforms and incentives for cross-cooperation, such as combining company events with staycations or hosting corporate and annual events in hotels. Last, looking forward, banks are encouraged to lower loan requirements for hotels that are willing to take the initiative in transforming their business, to prepare for the worst and compete in the new normal after the pandemic.

Fig. 17 Singapore hotel industry revival playbook


References

1. Singapore Tourism Board: https://stan.stb.gov.sg/public/sense/app/877a079c-e05f-4871-8d878e6cc1963b02/sheet/3df3802e-2e5b-4c79-950d-d7265c4c07a9/state/analysis
2. Singapore Government Statistics: https://www.tablebuilder.singstat.gov.sg/publicfacing/createDataTable.action?refId=1991
3. Singapore Tourism Analytics Network: https://stan.stb.gov.sg/content/stan/en/tourism-statistics.html
4. Estonia Government Statistics Bank: https://statistika.eestipank.ee/#/en/p/MAKSEBIL_JA_INVPOS/1410
5. Shearer (2000) The CRISP-DM model: the new blueprint for data mining. J Data Warehousing 5:13–22
6. Durbin J, Koopman SJ (2012) Time series analysis by state space methods. Oxford University Press, Oxford
7. Petaling J (2020) How SARS impacted tourism. The Star, Jan 29. https://www.thestar.com.my/news/nation/2020/01/29/how-sars-impacted-tourism

Compare and Contrast the Impact of COVID-19 from Small to Large Country

Hu Yubin, Ma Defeng, Qiu Zicong, Tang Manhong, Wang Lyu, and Wang Yang

In this study, we first look at the changes in China’s domestic and international flights during the epidemic; developing domestic tourism is an alternative during the pandemic. The overview of China’s tourism data during the epidemic shows that although the number of international passengers dropped significantly, the number of domestic flights dropped by less than 10% compared to 2019. This means that domestic tourism has not suffered the same level of impact as international tourism. Domestic tourism has also been reshaped after the pandemic, shifting toward greener and healthier travel options and less sightseeing at a slower pace. Singapore’s small land area calls for more innovative measures to encourage residents to spend on tourism as domestic tourism develops. This work makes recommendations for the recovery of domestic tourism in Singapore and China. During the epidemic, the focus should be on domestic tourism for a period of time; this is a viable solution even if the pandemic drags on.

We first look at tourism from a global perspective. As a global public health emergency, COVID-19 has impacted tourism worldwide. In this section, we

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_5.

H. Yubin · M. Defeng · Q. Zicong · T. Manhong · W. Lyu · W. Yang
Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, Singapore 639798, Singapore

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_5


Fig. 1 Number of departure flights from 2003 to 2020

investigate the impact of COVID-19 on global tourism in terms of flights, international arrivals, accommodation, and the measures adopted by governments to revive tourism. First, we look at the global annual flight data from 2003 to 2020. Figure 1 displays the number of departure flights from 2003 to 2020, broken down into international and domestic flights. The number rose steadily from 2003 and peaked in 2019, then declined sharply in 2020, when there were fewer flights than in 2003. Next, we look at the monthly flight and seat data for 2019 and 2020, as shown in Fig. 2. In 2019, the numbers of flights and seats fluctuated but remained around a stable level. In 2020, however, they plummeted from January and reached their lowest values in April and May. They picked up gradually in June and remained relatively stable for the rest of the year, but were still far from the 2019 level. Compared to 2019, the numbers of flights and seats in 2020 were lower in every month except January and February. The number of flights fell by more than 70% in April and May, while the number of seats fell by more than 80%. Although both gradually rebounded after May, they were still down by more than 20% and 40%, respectively, compared to 2019. Figure 3 shows the number of flights coming from Africa, America, Asia, Europe, the Middle East, and the rest of the world in 2020. Compared to 2019, international arrivals decreased throughout 2020, by up to 97% in April. Although there was a rebound in June, the downward trend resumed after August, with a decrease of more than 80%. We now look at accommodation. The occupancy rates generally showed a downward trend. Although they showed a short upward trend from April, there was


Fig. 2 Difference in percentage change on flights

Fig. 3 Difference in the number of international arrivals in year 2020

a second downward trend after August with occupancy rates at less than 40%. As compared to 2019, short-term rentals in 2020 showed an increase from January to March, then plummeted and remained at a 25% decrease as compared to the same period in 2019. Hotel searches and hotel bookings showed the same pattern, decreasing throughout the year in 2020, with a maximum decrease of 80% in August as compared to the same period in 2019. Although there is a rebound afterwards, it


Fig. 4 Monthly accommodation indicators in year 2020

still remained more than 50% below the same period in 2019 and showed a further downward trend. The accommodation indicators are shown in Fig. 4. To mitigate the negative impact of COVID-19 on tourism and help the industry recover, various countries and regions are actively implementing new initiatives. Figure 5 shows the measures taken by different countries and regions. More than half of the countries and regions have implemented measures covering fiscal policy, monetary policy, jobs and skills, and restarting tourism. Pacific regions are more active across the various types of measures. At the same time, less than half of the countries and regions have implemented measures to develop domestic tourism.

Tourism in Singapore

Singapore’s tourism industry is an important contributor to the country’s economy, currently contributing 4% of GDP and attracting 19.11 million international visitors in 2019, more than three times the country’s total population. Tourism plays a vital role in cementing Singapore’s position as a vibrant global city that attracts capital, businesses, and talent. Figure 6 shows the top 10 destination cities by the number of international overnight visitors. Singapore was ranked 5th on Mastercard’s Global Top 10 Destination Cities and 4th on Top Cities by Dollars Spent, which positions it as a very competitive tourist destination worldwide. Next, we analyze the development of Singapore’s tourism industry in recent years, before it was affected by the COVID-19 epidemic.


Fig. 5 Measures taken by countries and regions

Fig. 6 The global top 10 destination cities in 2018

Figure 7 shows the number of international visitor arrivals from 2010 to 2019. From the international arrivals and tourism receipts, we can see that Singapore’s inbound arrivals and tourism revenue trended upward during 2010–2019, but the growth was volatile. Neither arrivals nor revenue grew significantly in 2014 and 2015. One possible reason is the decline in arrivals from neighboring countries like Malaysia and Indonesia due to the devaluation of the ringgit and rupiah against the Singapore dollar. Meanwhile, according to the Singapore Tourism Board, cuts in corporate travel budgets had a significant impact on tourism revenues, as the


average business traveler spends about twice as much as the average leisure traveler. In 2016, tourism revenue from shopping saw a promising increase, up 52% compared to 2015. From that point onwards, Singapore’s tourism entered another growth streak (Fig. 8). Singapore’s international arrivals by month, shown in Fig. 9, reveal a seasonal pattern. In addition to the general upward trend, arrivals show distinct peak seasons in July, August, and December each year, which coincide with national holidays in several major source countries. Figure 10 shows the origin countries of Singapore’s visitors in 2019. The top five source countries were China, Indonesia, India, Malaysia, and Australia. China showed the largest increase, 6.1% compared to 2018, with a total of 3.63 million visitors. Tourism receipts can be

Fig. 7 Singapore international visitor arrivals during 2010–2019

Fig. 8 Singapore tourism receipts by major components during 2010–2019


Fig. 9 Singapore international arrivals by month, 2015–2019

Fig. 10 Top 10 Countries visiting Singapore 2019

from accommodation, food and beverages, shopping, or other tourism components. Figure 11 depicts tourism revenue by source country; Chinese tourists lead the way with S$4.1 billion in spending. The top five spenders were China, Indonesia, India, Australia, and Japan. Chinese tourists spent the most on shopping: their shopping spending alone exceeds the total spending of visitors from India, and shopping also makes up a higher share of their total spending than for any other country. We now shift the focus to the top five countries by tourism revenue and look at the age and length of stay of their visitors. As shown in Fig. 12, the majority of visitors from China are in the young age group of 25–34 years, and a large percentage of them are likely to be business


Fig. 11 Singapore tourism receipts by country 2019

travelers. There is also a large proportion of tourists from the older age group, who are more inclined to leisure travel. In terms of length of stay, tourists from China generally make short trips of less than 5 days, with a high percentage staying for 1 day. Indonesians tend to stay longer, with nearly half of them staying longer than 2 days. The tourism sector in Singapore has experienced a difficult time since early 2020 due to the pandemic. International visitor arrivals and hotel performance indicators are the two main reflections of the tourism industry. Figure 13 shows the total number of international visitors from 2018 to 2020. International visitor arrivals fluctuated around 1,600 thousand per month from 2018 to the end of 2019, then plummeted in February 2020 when the epidemic started in China, falling to half of the January 2020 figure. Although the pandemic had not yet spread to Singapore and other parts of the world in February, people traveled less because of travel restrictions. International visitor arrivals remained below a thousand per month in April and May because border controls restricted the entry of foreigners into Singapore. After June, border controls were relaxed for certain long-term pass holders, such as employee pass holders and student pass holders. As a result, international visitor arrivals rose slowly and steadily to 15 thousand by November, still far fewer than before COVID-19. Figure 14 compares international visitor arrivals for the same months in 2019 and 2020. The area chart shows that the whole tourism industry was severely affected in 2020, especially from April 2020 onwards, when the Singapore government imposed border restrictions. The hotel industry is another crucial part of tourism.
Figure 15 shows the change in available room nights in Singapore from 2018 to 2020. Figure 16 shows the trend of room revenue and the number of gazetted hotels in Singapore. Gazetted hotels are hotels registered in Singapore, and their number reflects the development and scale of the hotel industry. The number of gazetted hotels remained stable from 2018 to 2020, so the total capacity of hotel rooms has not changed significantly in recent years. However, the available


Fig. 12 Top 10 Singapore visitors by features 2019


Fig. 13 International visitor arrivals

Fig. 14 International arrivals comparison by months

room-nights and room revenue declined in the same way as international arrivals. Singapore is a small country with a small domestic tourism market, so hotel rooms mostly serve foreign visitors. When international arrivals dropped dramatically from February 2020, the hotel industry was hit as a consequence. Unlike the overall trend in international visitors, the number of available room-nights and room revenue showed signs of rising from July 2020. This is because the government carried out a series of campaigns to help local hotels stay afloat. Staycations are one way to revive the hotel industry: with overseas travel restricted, local residents chose to spend a night or two in local hotels instead. As a result, hotel industry revenue rose from July to August 2020. Figure 17 shows the monthly year-on-year changes in the room revenue of hotels in Singapore. Total room revenue in January 2020 was 10.21% higher than in January 2019, which indicates that 2020 would have been a better year than 2019 in the absence of COVID-19. What followed was a series of declines starting in February. The decline was deepest in July, when room revenue was only 10% of that in the same month of the previous year. The bar chart also shows that the situation gradually improved from August 2020.
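The year-on-year changes plotted in Fig. 17 compare each month of 2020 with the same month of 2019. A minimal sketch of the calculation follows; the revenue values are hypothetical index numbers (2019 = 100) chosen only to reproduce the two percentages quoted above, not actual revenue data.

```python
def yoy_change_pct(current: float, previous: float) -> float:
    """Year-on-year change in percent: (current / previous - 1) * 100."""
    return (current / previous - 1) * 100


# Hypothetical room-revenue index, with the same month of 2019 set to 100.
revenue_2019 = {"Jan": 100.0, "Jul": 100.0}
revenue_2020 = {"Jan": 110.21, "Jul": 10.0}

changes = {
    month: round(yoy_change_pct(revenue_2020[month], revenue_2019[month]), 2)
    for month in revenue_2019
}
print(changes)
```

Revenue at 10% of the previous year's level corresponds to a year-on-year change of −90%, which is how the July trough would appear on a percentage-change chart.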


Fig. 15 Available room nights from 2018 to 2020

Fig. 16 Room revenue per million

Fig. 17 Monthly revenue change by percentage


Fig. 18 Revenue per available room, standard average room rate and percentage of occupancy rate

Figure 18 shows the standard average occupancy rate, standard average room rate, and revenue per available room of the four main types of hotels: luxury, upscale, mid-tier, and economy. The occupancy rate and revenue per available room were first affected by the pandemic in January 2020 and began to fall. The average room rate fell as a consequence, as hotels lowered their room rates to attract guests and cover daily operating expenses. In the second stage, when the epidemic was under control, the indicators began to recover, although unevenly: the occupancy rate showed no clear rebound, while the average room rate had recovered to half of its previous level by November 2020. Because local hotel guests amount to only a small fraction of the pre-epidemic number of foreign visitors, the overall outlook for tourism remained worrying. Figures 19, 20 and 21 show the percentage change in occupancy rate, average room rate, and revenue per available room for the different hotel classes, respectively. Among the four types of hotels, COVID-19 had the most profound impact on luxury hotels. Before COVID-19, the occupancy rates of the four types of hotels were similar, at 80 to 90%. After the pandemic began, the occupancy rates of luxury and upscale hotels dropped below 30%, while those of economy and mid-tier hotels fell to 40 to 50%. The changes in the average room rates of luxury and upscale hotels were also larger than those of economy and mid-tier hotels. The revenue per available room was similar for the four types of hotels during the most critical period, from March to May 2020. The situation improved after July, and the differences between hotel classes returned to normal. We develop an ARIMA model based on time series data to predict the number of international visitors to Singapore from January to November 2020 under normal conditions.
We then compare the results of the predictive model with the actual data to estimate quantitatively how much the tourism industry suffered in 2020 from the impact of COVID-19. From there, we present the estimated results on the road to recovery. International visitor arrivals form the first target variable, for which we develop a univariate time series model. We adopt time series models because they give reliable predictions even when the data span a long period.


Fig. 19 Percentage change of occupancy rate for different hotel classes

Fig. 20 Average room rate for different hotel classes

Fig. 21 Revenue per available room for different hotel classes


Information about the underlying conditions can be extracted from the series. Moreover, seasonal fluctuation patterns can be extracted through time series analysis and serve as the basis for future forecasts. This type of information is of particular importance to markets whose products fluctuate seasonally. For time series data, there are several commonly used models, such as Moving Average (MA), Autoregressive (AR), Autoregressive Moving Average (ARMA), and Autoregressive Integrated Moving Average (ARIMA). The moving average model expresses the next value in the series as a linear function of the current and past random error terms. The autoregressive model specifies that the output variable depends linearly on its own previous values and a random term. The autoregressive moving average model combines the two. The autoregressive integrated moving average model is a generalization of the ARMA model and is usually applied where the data show non-stationarity. We identify the most applicable model based on the characteristics of the time series data in this project. The monthly number of international visitor arrivals to Singapore from January 2008 to December 2019 was selected as the training set, as shown in Fig. 22. We can use an additive model to break the movement down into trend, seasonality, and residuals to see how these components contribute to the total movement. The decomposed components can be found in Fig. 23. The second line plot in Fig. 23 shows an upward trend, and the seasonal component in the third line plot repeats on a yearly basis. Stationarity is a critical property for time series modeling because a stationary series can be used directly for modeling. Besides the above visualization methods, some statistical tests such as the Augmented Dickey-Fuller

Fig. 22 Singapore international visitor arrival train set during 2008–2019


Fig. 23 Components of time series data

Test (ADF), KPSS, and Phillips-Perron test (PP) can be applied to test the stationarity of a time series. We use the Augmented Dickey-Fuller test here. We first ran the test on the raw series, without any adjustment for stationarity. The test statistic is −1.51 and the p-value is 0.52. Because the p-value is above the 5% significance level, we cannot reject the null hypothesis of a unit root, which points to a lack of stationarity in this series. We therefore differenced the series once to remove the trend and re-applied the test. The first-order differenced series no longer shows an upward trend (as shown in Fig. 24), and the test indicates that it is stationary, with a test statistic of −3.12 and a p-value of 0.02. Based on these characteristics of the data, we use the ARIMA model for prediction. Autocorrelation and partial autocorrelation measure the correlation between current and historical values and indicate which historical values are most useful for predicting future values. Figure 25 shows the autocorrelation and partial autocorrelation of the first-differenced series; these help to determine the optimal parameters for ARIMA. The Akaike Information Criterion (AIC) measures how well a model fits the data while penalizing the number of parameters. Based on the autocorrelation, partial autocorrelation, and AIC values, the model we finally chose is ARIMA(6,1,2), and the predicted results are shown in Fig. 26. The forecast number of international visitors from January to November 2020 was 18.13 million. However, the actual number was only 2.71 million, of which 2.4 million arrived in the January–February period.
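Two of the steps above are simple enough to sketch directly: first-order differencing (the d = 1 step) and the shortfall between forecast and actual arrivals. The series below is synthetic, a linear trend plus a repeating 12-month pattern, and stands in for the real arrivals data, which is not reproduced here; fitting the ARIMA(6,1,2) model itself would need a statistics library and is not shown.

```python
def difference(series, lag=1):
    """Return the differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Synthetic monthly arrivals (thousands): upward trend plus a 12-month seasonal pattern.
seasonal = [0, -50, 20, -10, 0, 30, 80, 90, -20, -10, 0, 70]
arrivals = [1000 + 5 * t + seasonal[t % 12] for t in range(48)]
diff1 = difference(arrivals)

# The yearly mean of the raw series keeps rising, while that of the
# differenced series stays constant: the trend has been removed.
raw_means = [mean(arrivals[12 * k:12 * (k + 1)]) for k in range(4)]
diff_means = [mean(diff1[12 * k:12 * (k + 1)]) for k in range(3)]
print(raw_means, diff_means)

# Shortfall for January to November 2020, using the figures quoted in the text (millions).
forecast_m, actual_m = 18.13, 2.71
shortfall_m = round(forecast_m - actual_m, 2)
shortfall_pct = round(shortfall_m / forecast_m * 100, 1)
print(shortfall_m, shortfall_pct)
```

The sketch only illustrates what differencing does to a trend; it is not a substitute for the Augmented Dickey-Fuller test reported above. On the quoted figures, actual arrivals fell short of the forecast by about 15.4 million, or roughly 85%.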
Based on the per capita spending of international visitors to Singapore in 2019, we estimate that Singapore’s total tourism industry lost 22.4 billion Singapore dollars due to


Fig. 24 Singapore international visitor arrival train set and 1-diff time series

Fig. 25 Autocorrelation and partial autocorrelation of 1-diff time series

COVID-19. As the epidemic in Singapore is gradually brought under control, the tourism industry is in dire need of recovery in order to reverse the losses. The Singapore Tourism Board (STB) has launched several new programs to help the tourism industry to stay afloat, strengthen their business base and deepen their


Fig. 26 Singapore international visitor arrival predication versus actual data

infrastructure. For travel agencies, additional support was provided to alleviate cash flow problems. Economy-wide measures with implications for the tourism industry were announced in the 2020 budget to support businesses affected by COVID-19. To ensure safe travel, the STB also implemented a mandatory cruise safety certification scheme, which stipulates strict hygiene and safety measures for the entire passenger journey from pre-departure to post-arrival. Singapore Rediscovery (SR) coupons are another means of helping the industry. In August 2020, the government announced that S$320 million would be distributed in the form of SR coupons to encourage Singaporeans to support the local tourism industry as it recovers from one of its worst-hit periods in history. In September, the city-state received the first passengers from Brunei and New Zealand under its Air Travel Pass program, which allows tourists to apply to travel to Singapore without a 14-day quarantine. Since launching the program in September, Singapore has expanded the list of eligible regions to Australia, China, Taiwan, and Vietnam. Later the same year, STB and Expedia (one of the world’s leading full-service online travel brands) established a two-year global marketing partnership. Its focus is on stimulating local tourism by supporting local businesses and strengthening Singapore’s position as a preferred destination when international travel resumes. Singapore Airlines did not pursue the idea of destinationless flights. Instead, in October it offered a grounded experience, serving customers dinner and movies aboard a double-decker A380 aircraft at prices ranging from US$37 to US$440.


Tourism in China

The market volume illustrates the total number of annual transactions. We first provide an overview of China’s tourism market volume, as shown in Fig. 27. The tourism industry in China was on a steady upward trend before 2019. We plot the quarterly profit of three leading tourism companies in China from 2015 Q2 to 2020 Q2 (Fig. 28); it shows substantial losses in 2020 Q2. The COVID-19 epidemic broke out in China at the beginning of 2020 and has spread globally since March. After the outbreak, thirty-one provinces successively initiated first-level responses to public health emergencies. The Ministry of Culture and Tourism (MCT) of China mandated travel agencies to suspend domestic group business from January and to suspend all outbound group business at the same time. Local epidemic prevention and control measures also led some tourist businesses to suspend operations entirely. By mid-2020, the domestic epidemic had started to come under control, but the situation abroad was still severe. Some domestic flights began to resume normal operations, and travel companies began to recover gradually from the cold, hard winter. Large-scale companies like Zhongqing Travel (CYTS) suffered larger losses when the epidemic struck, but they proved more resilient. Some main players like Caesars Tourism did not improve until the third quarter of 2020. These companies had to endure the seasonal low period on top of the uncertainty of the epidemic before things turned better. Hotels are an indispensable component of the tourism industry, and the hotel industry has also suffered very badly. The outlook of big hotel chains like BTG Hotels Group took a sharp turn in the first quarter of 2020: the losses in that quarter even exceeded the maximum quarterly profit of the past six years. As with

Fig. 27 Tourism market volume per year


Fig. 28 Quarterly profit of tourism companies: three leading tourism companies

the other travel companies, larger hotels have also recovered more rapidly. Solutions should look at increasing the room occupancy rate, standing out from competitors, and becoming the first choice for domestic travelers. From 2003 to 2019, China’s civil aviation market continued to expand, its transportation capacity gradually improved and its route network continued to develop. The number of people traveling by plane rose every year. Fig. 29 depicts total departures from 2003 to 2020 and shows how the pandemic reversed the upward trend. When the new coronavirus swept the world, China adopted a series of lockdown measures and suspended international flights. This is shown in Fig. 30, where the reversal is sharper than for the total departures in the previous figure. The rapid transmission of the new coronavirus greatly restricted the movement of people, especially overseas travel. Compared with 2019, international departures in 2020 fell by 69.88%, and the international aviation business did not have a good response plan. Only when the epidemic situation around the world improves will the international departure business see a turnaround. Domestic departures suffered less severe losses than international departures in 2020, as shown in Fig. 31. From May and June, the COVID-19 epidemic in China was under control, and the country resumed normal operation of most domestic flights. As a result, the number of travelers greatly increased. Looking at the overall trend, the number of passengers departing from China dropped by only 9.92%. Travel agencies resumed intra-provincial travel from March and inter-provincial travel from July. These are important milestones for domestic tourism, marking the move from a partial resumption of work and production to national recovery.
The tourism industry took immediate action to enrich its offering and


Fig. 29 Yearly total departures in hundreds from 2003 to 2020

Fig. 30 Yearly international departures from 2003 to 2020

seize the opportunity for recovery. Recreational car tours, healthcare tours and rural tours saw upticks in business. Domestic tourism after this revival differs from the past: it is shifting toward a greener and healthier lifestyle. Tourists opt for less sightseeing, more leisure activities and a slower pace, and they pay more attention to safety and to the quality of the experience. Attractions that were previously less popular, less crowded attractions and natural attractions became more popular after the pandemic. Since the beginning of the COVID-19 epidemic, China’s Ministry of Culture and Tourism (MCT) took the opportunity to promote the resumption of work and production. Through these policies, China not only effectively reduced the impacts of the epidemic


Fig. 31 Yearly domestic departures from 2003 to 2020

on tourism but also sped up the recovery, further developing tourist attractions and improving international connectivity. In February 2020, the MCT drafted the “Guidelines for the Reopening of Tourist Attractions under Conditions of Pandemic Prevention and Control”. In July of the same year, it issued a notice on the restoration of production for tourism. In September, the number of visitors was restricted to a maximum of 75% of capacity. Tickets had to be booked online and were sold for specific viewing times. Approximately US$900 million of travel service guarantee funds was returned to more than 25,000 travel agencies to ease the pressure on their cash flow. The Chinese government asked banks to extend the terms of commercial loans and asked commercial property owners to reduce rents. Another relevant measure concerned maintaining the work of tour guides in order to preserve resources for the restoration of the tourism industry. This measure protects the rights and interests of tour guides, encourages them to take up free online training courses, and ensures that preferential policies are provided for them, including exemption from annual fees for that year and an extension of the renewal period. Funding support was also provided: funds were allocated to support discounted loans for local projects, and local administrations were instructed to assist small, medium, and micro enterprises. A special epidemic control column was established so that enterprises could stay updated on the current situation and easily obtain relevant information. Support was provided for the development of national one-stop tourism pilot zones, the most visited destinations, and holiday destinations. Infrastructure construction was actively pursued during that time. The MCT identified a first set of 346 construction projects during the pandemic-hit year and funded them from the central budget.


Management in tourism should also be enhanced at the same time. Businesses should look to strengthen the supervision and management of tourism enterprises and improve the competitiveness and flexibility of the tourism industry. While effectively responding to the pandemic, the MCT of China scaled up its digitalization efforts. It launched free online training programs, initiated digital projects for cultural tourism companies, vigorously promoted the construction of smart tourism, and strengthened tourism market management and industrial rectification. This greatly scales up the capabilities of the tourism industry. During the 5-day Labor Day holiday in China, the number of domestic tourists reached 53.5% of the level in the same period of the previous year. In the first three quarters, China’s GDP grew by 0.7% year-on-year. Domestic tourists reached 1.93 billion, 42% of the figure for the same period of the previous year. Urban and rural residents’ travel in the fourth quarter remained at the same level as in the previous year. During China’s combined National Day and Mid-Autumn Festival holiday, local tourists made a total of 637 million trips, 79% of the previous year’s total. The tourism industry’s turnover was 466.56 billion yuan (US$68.71 billion), 69.9% of the figure for the same period of the previous year.
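The recovery ratios quoted above can be turned back into the prior-year baselines they imply. The following is a minimal sketch of that arithmetic, not part of the authors' analysis; the function name `implied_baseline` is a hypothetical helper, and the results are only as precise as the rounded figures in the text.

```python
def implied_baseline(current: float, share_of_previous: float) -> float:
    """Return the prior-period value implied by a current value and
    the share of the prior period that it represents."""
    if share_of_previous <= 0:
        raise ValueError("share_of_previous must be positive")
    return current / share_of_previous


# Figures quoted in the text for the National Day and Mid-Autumn Festival holiday.
holiday_trips_million = 637.0          # stated as 79% of the previous year
holiday_turnover_cny_bn = 466.56       # stated as 69.9% of the previous year
holiday_turnover_usd_bn = 68.71        # the same turnover expressed in US dollars

# Prior-year values implied by the quoted shares (derived, not reported in the text).
prev_trips_million = implied_baseline(holiday_trips_million, 0.79)
prev_turnover_cny_bn = implied_baseline(holiday_turnover_cny_bn, 0.699)

# Exchange rate implied by the two turnover figures.
implied_cny_per_usd = holiday_turnover_cny_bn / holiday_turnover_usd_bn

print(f"Implied prior-year trips: {prev_trips_million:.0f} million")
print(f"Implied prior-year turnover: {prev_turnover_cny_bn:.2f} billion yuan")
print(f"Implied exchange rate: {implied_cny_per_usd:.2f} yuan per US dollar")
```

Dividing by the share rather than subtracting a percentage is the key step: a value that is 79% of last year's corresponds to a baseline of value / 0.79.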

Tourism Analytics—The Case for South Africa

Yong Heng Michael Tan and Yok Yen Nguwi

Y. H. M. Tan (B) · Y. Y. Nguwi
Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, Singapore 639798, Singapore
e-mail: [email protected]
Y. Y. Nguwi e-mail: [email protected]

Tourism has been cited as one of the key drivers of South Africa’s national economy; it also contributes significantly to the country’s job creation. From 2016 to 2019, travel and tourism revenue accounted for 6.9% to 7.6% of South Africa’s total GDP. From 2017 to 2019, the tourism sector employed 2.8 million to 2.9 million workers. In the same period, total employment in South Africa was 16.2 million to 16.5 million workers, suggesting that the tourism sector accounts for around 17% of total employment. To determine which countries are most interested in visiting South Africa for tourism, we analyzed tourists from two segments. The first segment consists of the Southern African Development Community (SADC), an inter-governmental organization headquartered in Gaborone, Botswana. The SADC consists of 16 member states: Angola, Botswana, Comoros, Congo, Eswatini, Lesotho, Madagascar, Malawi, Mauritius, Mozambique, Namibia, Seychelles, South Africa, Tanzania, Zambia and Zimbabwe. The second segment comprises countries outside of the African continent. In the five years from 2016 to 2020, the number of visitors from SADC countries ranged from 2.1 million to 7.6 million. In the second segment, the highest average numbers of visits to South Africa came from the United Kingdom (UK), the United States of America (USA), Germany, France and the Netherlands. In 2020, we see a sharp downturn in tourism due to the COVID-19 pandemic. We attempt to look at the extent of its impact on South Africa. Foreign

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_6


tourism spending in 2020 dropped significantly, by 65.8% (from USD 8.2 billion to USD 2.8 billion), compared with 2019. The impact of the COVID-19 pandemic on South Africa’s tourism industry likely began when a 38-year-old man, who had traveled to Italy and returned to South Africa with nine others on March 1, 2020, tested positive for COVID-19 on March 5 [2]. International visitor spending in South Africa dropped 67.6%, from US$9.8 bn in 2019 to US$3.2 bn in 2020. It fell a further 38.2% in 2021, to US$1.98 bn. Domestic visitor spending dropped 54.1%, from US$12.7 bn in 2019 to US$5.8 bn in 2020, but recovered 47.2% to US$8.6 bn in 2021. This suggests that domestic tourism is more resilient during the COVID-19 period. Hence, during a pandemic, the tourism industry should spend more effort on the local market. When the pandemic eases, the tourism industry can then switch back to the international market, as the rate of recovery is likely to be higher. Following the COVID-19 recovery period, tourism to South Africa is expected to return to its pre-pandemic level. However, this trend is likely to be highly dependent on each country’s outbound policy. It is estimated that South Africa’s international tourism receipts, derived from expenditures by international inbound visitors and including payments to national carriers for international transport, will return to their pre-COVID level in 2024. Furthermore, some notable developments indicate a recovery in the tourism industry, especially in the accommodation sector. In January 2020, InterContinental Hotels Group and Valor Hospitality Partners Africa announced the co-development of an upscale hotel in Johannesburg. The Rockefeller Hotel and Hyatt House Johannesburg Rosebank were launched in 2021 and 2022, respectively. Kasada, a South Africa-based fund, purchased the Cape Grace Hotel in 2022, suggesting optimism in the recovery of the tourism industry.
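The year-on-year spending changes quoted above follow the usual percentage-change formula, (new − old) / old × 100. The sketch below recomputes each change from the rounded dollar amounts given in the text; `pct_change` is a hypothetical helper name, not something the chapter defines. The recomputed values differ slightly from the quoted percentages (by about one percentage point at most), probably because the dollar amounts in the text are themselves rounded.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new; negative values are declines."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100.0


# label -> (old value, new value, percentage quoted in the text)
# All amounts are in billions of US dollars, as rounded in the text.
quoted_changes = {
    "foreign tourism spending, 2019 to 2020": (8.2, 2.8, -65.8),
    "international visitor spending, 2019 to 2020": (9.8, 3.2, -67.6),
    "international visitor spending, 2020 to 2021": (3.2, 1.98, -38.2),
    "domestic visitor spending, 2019 to 2020": (12.7, 5.8, -54.1),
    "domestic visitor spending, 2020 to 2021": (5.8, 8.6, 47.2),
}

for label, (old, new, quoted) in quoted_changes.items():
    computed = pct_change(old, new)
    print(f"{label}: computed {computed:+.1f}%, quoted {quoted:+.1f}%")
```

Note that the 2021 rebound in domestic spending is measured against the depressed 2020 base, so a recovery of roughly 47% still leaves spending well below the 2019 level of US$12.7 bn.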
In the aviation sector, the United Nations World Tourism Organization (UNWTO) signed a memorandum of understanding (MoU) with the African Airlines Association (AFRAA) to strengthen the relationship between the tourism and aviation sectors. The MoU involves AFRAA’s Secretary-General Abdérahmane Berthé and UNWTO’s Secretary-General Zurab Pololikashvili. It aims to promote greater connectivity and travel across Africa and is expected to attract more travelers from the SADC countries. South Africa attracted about 437,000 tourists from the United Kingdom in 2019; with a total UK population of about 66.84 million, this represents roughly 0.65% of the United Kingdom’s population. We first look at the factors that make South Africa attractive to international tourists.

1. Affordability
South Africa’s real effective exchange rate fluctuates from time to time but is on a general downward trend. From the World Bank Data index of 100 in the year 2010, South Africa’s index declined to about 71 in 2010, followed by a 3% decline each year based on a straight-line depreciation method. The low currency rate makes the country attractive to visitors from countries with a stronger currency, like the United Kingdom [5].


2. Wildlife Tourism
One of the main reasons to visit South Africa is its wildlife tourism. It offers wildlife observation in national parks and in provincial and private game reserves. Tourists can self-drive or engage field guides. The Kruger National Park is one of the 42 game reserves in South Africa. Other well-known game reserves include Addo Elephant Park in the Eastern Cape, Hluhluwe Imfolozi Game Park in KwaZulu-Natal, Kgalagadi Transfrontier Park in the Northern Cape and the Lion Sands Game Reserve. Aside from close encounters with wild animals on land, tourists can also watch whales and dolphins from a close distance. At least 37 species of whales and dolphins are found in the waters of South Africa, but whale watching usually focuses on species such as the humpback and right whales. Hermanus, a seaside town southeast of Cape Town in the Western Cape province, is one of the top places for whale watching.

3. Adventures
Located in Cape Town, Table Mountain, with a height of 1086 meters, is one of the most well-known landmarks and one of the most photographed attractions in South Africa. Table Mountain also hosts approximately 2200 species of plants and 1470 floral species. It is accessible by cable car. Thrill seekers can experience different adventures in South Africa, such as bungee jumping. South Africa offers a number of bungee jumping and swing sites for tourists: Power Swing Across Soweto, Bridge Swing, Big Swing at Moses Mabhida Stadium and Bloukrans Bridge Bungee Jump. Located between the Eastern and Western Capes, Bloukrans Bridge boasts a height of 216 meters for tourists looking for adventure.

4. Wine Appreciation
Another reason for tourists to visit South Africa is to appreciate its varieties of wine, including Pinotage, semi-sweet white, sparkling and fortified wines.
The South African wine industry exported approximately 388 million liters of wine in 2021, with more than 2613 farmers cultivating over 90,000 hectares of land under vines. The industry boasts 24 red grape varieties and 22 white grape varieties; among the top recommended varietals are Sauvignon Blanc, Chardonnay, Cabernet Sauvignon and Pinotage.

5. Beaches
South Africa boasts more than 60 beaches for tourists to visit. Since each of the listed beaches consists of many smaller tourist spots, South Africa offers beach lovers a vast range of options. One of the beaches, the Cape of Good Hope, is a rocky headland on the Atlantic coast of the Cape Peninsula. Camps Bay, a small bay on the west coast of the Cape Peninsula, is another popular beach among tourists.


6. Heritage Sites
Located about 50 km northwest of Johannesburg in Gauteng province is the Cradle of Humankind, a paleoanthropological site believed to be home to the largest concentration of human ancestral remains in the world. The site covers about 47,000 hectares and was declared a World Heritage Site by UNESCO in 1999. Aside from the Cradle of Humankind, South Africa has seven other World Heritage Sites proclaimed by UNESCO. One such site is Robben Island, where many political prisoners, including the first democratically elected President of South Africa, Nelson Mandela, were imprisoned. It is now a museum for tourists to visit.

7. Accommodations and Transport Infrastructure
In addition to offering tourists many options for attractions and activities, South Africa is equipped with decent accommodation and transport infrastructure. The country has over 90 airports, including 23 international airports, and as of August 31, 2022, 32 international airlines provide services to South Africa. Furthermore, to make it easier for passengers to enter the country, citizens from 133 countries (68.2% of the world) do not require a visa to enter South Africa. During 2016–2019, the hotel occupancy rate hovered slightly above 50%, providing more than enough space for additional visitors.

Despite the above-mentioned factors that make South Africa a good tourist destination, we delve deeper to understand the other side of the story: the factors that suppress the growth of South Africa’s tourism. One study by Duha Altindag [3] found that violent crimes decrease the number of tourists visiting a country. For example, a 10% increase in the aggregate violent crime rate may lead to a decrease of approximately USD 140 million to USD 200 million in tourism revenue. Therefore, it is important to consider crime in South Africa as part of this tourism study.
According to statistics provided by South Africa’s Police Service, the top six crimes experienced by individuals in the past five years (2016/17–2020/21) are burglary at residential premises, common assault, assault with the intent to inflict grievous bodily harm, drug-related crime, robbery with aggravating circumstances and malicious damage to property. The department [6] also surveys random individuals about their perception of safety when walking in their neighborhoods. Over the same five-year period, an average of 33.68% of individuals who took part in the surveys indicated that they feel safe while walking in their neighborhood at night. Although this percentage is considerably lower than the perceived safety during the daytime, it has been on an increasing trend from 2017 onwards. Given its vast number of recreational facilities, its transport and accommodation support and its lower crime rate, it looks like South Africa is becoming more attractive to tourists. However, we observe that the number of tourist arrivals, both worldwide and in South Africa, fluctuates over the years, as shown in Fig. 1. We compare the growth of the South African and worldwide tourism industries to see whether it bears out the above observation. The following diagram shows the number of international

Fig. 1 Index comparison of number of international tourist arrivals (2016 = 100). Series: World Arrival and South Africa Arrival, plotted for 2016 to 2020

tourist arrivals converted to an index, with the base year 2016 set to 100 for benchmarking. The index in Fig. 1 suggests that South Africa’s tourism was lagging behind the rest of the world even before the pandemic, which contradicts the observation above. This suggests that South Africa is becoming less and less attractive as a tourist spot compared with the rest of the world. We now look at how to improve tourism in South Africa. A study conducted by Perry and Potgieter [4] looked at why South Africa continues to be a key tourism destination despite being viewed as the crime capital of the world. In part, this apparent contradiction could be attributed to the fact that foreign tourists visit established tourist areas such as ecotourism sites and selected locations in a few cities such as Cape Town. These are often high-end attractions with good infrastructure and with security measures, including strict access control, that ensure tourists’ safety. The same study noted another major concern: the under-reporting of crime in South Africa. Crime statistics (as a proxy for the level of violence) are not reliable due to underreporting, difficulties in interpreting the reports and a lack of reliable data. Furthermore, a significant percentage of crime in South Africa goes unrecorded because it is not reported to the police, often because there is deep-seated animosity between the police and civil society due to historical processes. In another study, by George [7], crime-related factors include the perceived safety of prospective tourists, existing tourists’ perception of safety at a destination outside of their accommodation, and word of mouth from past tourists who felt threatened while visiting South Africa. While most visitors to Cape Town felt reasonably safe, a significant number of them felt unsafe going out in the dark and taking the city’s transport.
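Returning to the index used for Fig. 1, rebasing a series so that the base year equals 100 makes destinations of very different sizes directly comparable. The following is a minimal sketch of that transformation. The arrival counts are illustrative placeholders chosen for the example, not the chapter's data, and `rebase_to_index` is a hypothetical helper name.

```python
def rebase_to_index(series: dict[int, float], base_year: int,
                    base_value: float = 100.0) -> dict[int, float]:
    """Rescale a yearly series so that the base year equals base_value."""
    base = series[base_year]
    if base == 0:
        raise ValueError("base-year value must be non-zero")
    return {year: value / base * base_value for year, value in series.items()}


# ILLUSTRATIVE arrival counts in millions (placeholders, not the chapter's data).
destination_arrivals = {2016: 50.0, 2017: 52.0, 2018: 53.0, 2019: 51.0, 2020: 14.0}
world_arrivals = {2016: 200.0, 2017: 214.0, 2018: 226.0, 2019: 234.0, 2020: 64.0}

destination_index = rebase_to_index(destination_arrivals, base_year=2016)
world_index = rebase_to_index(world_arrivals, base_year=2016)

# Gap between the two indices: positive values mean the destination lags the world.
index_gap = {year: world_index[year] - destination_index[year]
             for year in sorted(world_index)}

for year in sorted(index_gap):
    print(f"{year}: destination {destination_index[year]:.0f}, "
          f"world {world_index[year]:.0f}, gap {index_gap[year]:+.0f}")
```

Because both series start at 100, the gap between them in any later year can be read directly as the difference in cumulative growth since the base year.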
Moreover, visitors’ perception that a destination is unsafe and their past encounters with crime at a destination are likely to prevent them from visiting that destination. Another observation is that, prior to arriving at a particular destination, tourists’ risk perceptions are indirect, mostly coming from secondary information such as government propaganda, news reports and others’ comments. Their prior knowledge, such as travel experience, also plays a part in forming initial risk perceptions, which Li et al. [8] term “Naive Risk Perceptions” and which help determine whether they would

Fig. 2 Index comparison of number of different parameters discussed above (2016 = 100). Series: Tourist Arrival Index (South Africa), Tourist Arrival Index (World), Perception of Safety Walking in the day, and Total Crime in South Africa, plotted for 2016 to 2020

travel to a destination. If tourists decide that they would, they form what is termed a “Revised Risk Perception”, which draws on both their “Naive Risk Perceptions” and the experience from their travel. Li et al. [8] did not mention any statistical research on tourists in the paper. Figure 2 shows tourist arrivals in South Africa and in the world, together with perceived safety and the crime rate in South Africa. While crime is going down and the perception of safety among those in South Africa is going up, the gap between South Africa’s tourism growth and that of the world is widening. The gap does close between 2019 and 2020, when COVID-19 started, but this is likely to be an exception rather than the norm, arising because most other countries had stricter controls over tourism. South Africa continues to have lower tourist arrival growth than the rest of the world. This observation supports the earlier points that crime could be underreported and that tourists’ perception of the country is the main factor affecting tourism in South Africa. We obtained data on South Africa’s tourism and economy from Macrotrends [9] and plotted it in Fig. 3. It shows Gross Domestic Product (GDP), Gross National Product (GNP), the number of tourists (in millions) and tourism receipts (in billions of USD) from 1995 to 2020. We can see a generally upward trend for GDP, which is partially driven by increases in the number of tourists and in tourism receipts. In the following graph, Fig. 4, we examine the total receipts per tourist over the same period. This visualization shows that receipts per tourist have been falling over the years, despite the growth in the number of tourists. This indicates that although the country draws tourists to the region, tourists are not spending as much as before.
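Receipts per tourist, as plotted in Fig. 4, is simply total tourism receipts divided by the number of tourists, with the units (billions of USD and millions of tourists) brought into line. A minimal sketch follows. The yearly figures are illustrative placeholders, not the Macrotrends data, and `receipts_per_tourist` is a hypothetical helper name.

```python
def receipts_per_tourist(receipts_usd_bn: float, tourists_m: float) -> float:
    """Average receipts per tourist in USD, given receipts in billions of USD
    and tourist numbers in millions."""
    if tourists_m <= 0:
        raise ValueError("tourists_m must be positive")
    return (receipts_usd_bn * 1e9) / (tourists_m * 1e6)


# ILLUSTRATIVE values only: year -> (tourists in millions, receipts in USD billions).
illustrative = {
    2000: (6.0, 6.0),
    2010: (8.0, 7.2),
    2019: (10.0, 8.0),
}

per_tourist = {year: receipts_per_tourist(receipts, tourists)
               for year, (tourists, receipts) in illustrative.items()}

years = sorted(per_tourist)
tourists_growing = all(illustrative[a][0] < illustrative[b][0]
                       for a, b in zip(years, years[1:]))
spend_falling = all(per_tourist[a] > per_tourist[b]
                    for a, b in zip(years, years[1:]))

for year in years:
    print(f"{year}: {per_tourist[year]:.0f} USD per tourist")
print("Tourist numbers growing:", tourists_growing)
print("Receipts per tourist falling:", spend_falling)
```

The placeholder numbers reproduce the pattern described in the text: total receipts and arrivals both rise, yet the ratio falls because arrivals grow faster than receipts.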
Tourists are spending less despite the overall growth of the country observed in GDP. One direction for boosting South Africa’s tourism is therefore to find ways to increase the spending of


Fig. 3 Tourism in South Africa over the years on GDP, GNP, number of tourists and receipts

each tourist. Promoting the purchase of consumables is one way to increase spending. Figure 5 depicts inflation in the country; it shows a fluctuating trend that has slowly stabilized in recent years. Stabilizing inflation is unlikely to cause a significant increase or decrease in the cost of tourism. Hence, potential tourists are unlikely to change their decision on whether to travel to South Africa based on cost alone. The crime rate per hundred thousand population is depicted in Fig. 6; it shows a healthy decreasing trend. Tourism is a lagging indicator, and we believe it is heading in the right direction. Figure 7 shows a close-up view of tourism receipts over the years. Examined closely, tourism enjoyed good growth from 2002 to 2006, but that growth was not maintained, and from 2007 onwards growth has been slow. The lower currency rates play a part in the calculation of tourism receipts. In conclusion, South Africa has done well in providing recreational facilities and infrastructure support. If another pandemic strikes, the tourism sector can mitigate the damage by focusing on domestic tourism, but based on recent observations this would be a tough challenge. The most important factor in promoting its attractiveness for tourism is, therefore, to create a safe environment for tourists and, along with it, to run an effective marketing campaign that changes the country’s image from one of perceived high crime to one of safety. Perhaps a special task force on “Tourism Safeness” could be formed to address this issue and promote tourism in South Africa.


Fig. 4 Tourism in South Africa over the years on GDP, GNP, number of tourists and receipts per tourist

Fig. 5 Tourism in South Africa over the years on inflation, number of tourists, receipts and receipts per tourist


Fig. 6 Crime rate and tourism receipts in South Africa over the years

Fig. 7 Close up on tourism receipts in South Africa over the years


References
1. Current Crime Index. Numbeo. https://www.numbeo.com/crime/rankings_current.jsp
2. South Africa National Institute for Communicable Diseases (NICD) (2020, March 5) First case of Covid-19 coronavirus reported in SA. https://www.nicd.ac.za/first-case-of-covid-19-coronavirus-reported-in-sa/
3. Altindag D (2014, January) Crime and international tourism. Retrieved September 13, 2022, from https://cla.auburn.edu/econwp/Archives/2014/2014-01.pdf
4. Perry EC, Potgieter C (2013) Crime and tourism in South Africa. J Hum Ecol 43(1):101–111. https://doi.org/10.1080/09709274.2013.11906616
5. Real effective exchange rate index (2010 = 100)—South Africa | Data (n.d.) Retrieved September 23, 2022, from https://data.worldbank.org/indicator/PX.REX.REER?locations=ZA
6. South African Police Service (n.d.) Crime statistics: Integrity. Retrieved September 15, 2022, from https://www.saps.gov.za/services/crimestats.php
7. George R (2003) Tourist’s perceptions of safety and security while visiting Cape Town. Tour Manage 24(5):575–585. https://doi.org/10.1016/s0261-5177(03)00003-7
8. Li G, Sun X, Li J (2022) Identification of tourists’ dynamic risk perception—the situation in Tibet. Human Soc Sci Comms 9(1). https://doi.org/10.1057/s41599-022-01335-w
9. Macrotrends. https://www.macrotrends.net/countries/ZAF/south-africa/population

Hotel Booking Cancellation Analytics on Imbalanced Data

Cai Yuxuan, Hsu Tuan-Chun, Jin Zhuofan, Tan Chian Wen Melvin, Vivek Goyal, and Zheng Yijun

The outbreak of Covid-19 had an immense impact on the global economy, with the tourism industry suffering the most. Visa restrictions, flight suspensions, border closures, social quarantines and the like brought the global tourism industry to an almost complete halt. Airlines, travel agents, hotels, attractions, and other tourism-related services entered a period of unprecedented recession. According to the UNWTO World Tourism Barometer, international tourist arrivals (overnight visitors) fell by 72% in January–October 2020, which represents 900 million fewer international tourist arrivals than in the same period in 2019. Europe is one of the regions hardest hit by the epidemic. Portugal, as one of the Mediterranean countries in Europe, has always been a popular tourist destination. In 2019, Portugal received 27 million visitors. However, due to the pandemic, Portugal suffered $1.7 billion in losses between March and June 2020. Based on time-series forecasting, international tourism expenditure should have been around $27.03 billion in 2020 and $28.60 billion in 2021. Although experts have pointed to a rebound in international tourism by the second half of 2021, a return to the pre-pandemic level of international arrivals could take 2.5 to 4 years. In other words, Portugal’s tourism industry may continue to suffer losses for at least 2 years. Therefore, it is imperative for the tourism industry to develop special strategies to survive this period. Most research focuses on demand forecasting in the aviation sector. However, hospitality is also an important sector of the tourism industry. So, we focus this work on examining the booking data for hotels

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_7. C. Yuxuan · H. Tuan-Chun · J. Zhuofan · T. C. W. Melvin · V. Goyal · Z. Yijun (B) Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, Singapore 639798, Singapore e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_7


in Portugal, a tourist destination that was hit hard by the outbreak of Covid-19, in order to develop suitable strategies for boosting its tourism. We have two key objectives for this work:
1. To identify key factors affecting guests’ booking cancellation and provide business recommendations to improve the fulfilment of hotel orders.
2. To help hotels better organize resources by predicting cancellations.
By achieving these two objectives, we hope to provide actionable insights that help hotels become more profitable and thus mitigate the impact of the epidemic during this difficult time. Many European countries have relaxed their control measures amid the overall improvement of the epidemic. The tourism industry, which has been hit hard by the epidemic, is looking forward to pressing the “restart button”. Analysts pointed out that the road to recovery for the European tourism industry is uncertain and will face many challenges. Tourism is one of Europe’s pillar industries. According to reports, about 10% of the EU’s GDP comes from tourism, and tourism accounts for as much as 13% of Italy’s GDP. Since March 2020, European countries have issued “cities lockdown orders”, “foot-free orders”, and “home orders” in order to curb the spread of the epidemic. The tourism industry suffered heavy losses as a result. Data from the Swiss Federal Statistical Office showed that the number of foreign tourists staying in Swiss hotels in March 2020 dropped by about 70% compared with the same period of the previous year. Statistics show that within one month after France issued a “foot ban” in mid-March, tourism revenues dropped by 14 billion euros. The French Tourism Development Agency estimates that the tourism industry will lose nearly 45 billion euros in the first half of 2020. A research institution estimated that about 40,000 Italian tourism companies may go bankrupt and more than 180,000 people will lose their jobs in 2020.
Germany-based TUI Group is the largest travel company in Europe. It announced on May 13 that it planned to lay off 8,000 employees to control operating costs [1]. The tourism industry provides about 2 million jobs in France, and tourism revenue accounts for nearly 8% of French GDP. The government announced a plan to save the tourism industry as a national priority. The Italian Minister of Culture and Tourism announced a 55 billion euro “restart decree” supported by the Italian government, with 2.4 billion euros set aside for “vacation bonuses”: families can get up to 500 euros in subsidies for staying in farm hotels, rural hotels, and holiday camps in the second half of 2020. Other countries, such as Switzerland and Austria, have reached agreements with neighbouring European countries to reopen their borders and revive their tourism.

Industry insiders pointed out that the road to recovery of the European tourism industry will not be smooth and will face many challenges. First, the global epidemic is still spreading, and the epidemic situation differs from one European country to another, which restricts the development of European tourism. In the near term, European tourism is expected to be dominated by tourists from within the region. In addition, social distancing regulations will be maintained for a long time, and the number of

Hotel Booking Cancellation Analytics on Imbalanced Data


tourists will not increase soon. According to media reports, Greece plans to open more than 500 beaches to the public but stipulates that there can be no more than 40 people per 1,000 square meters. The willingness of travellers to take risks and the degree to which social and economic activities resume will also directly affect the recovery of the tourism industry. The Irish government stipulates that gatherings of more than 5,000 people are not allowed until the end of August 2020, and other countries have similar regulations. The latest report released by a French research institute stated that 80% of professionals believe that the tourism crisis caused by the epidemic will last at least 8 to 12 months, and some experts even believe that it may last until 2022.

Various factors in hotel bookings can affect order cancellation. We attempt to understand which factors determine whether a booking order will be cancelled and, from this understanding, derive means to improve the fulfilment of hotel orders. The goal of this work is to identify these factors and to predict whether a given booking order will be cancelled. We study this topic through the two questions below:

1. What are the main contributing factors to order cancellation?
2. Can a machine learning classifier be used to predict cancellation?

To address question 1, we analyze the dataset and implement several models, which are elaborated on in a later section. Analyzing the booking records yields insights on how to reduce cancellations while keeping the cancellation option open. This should provide a reference guide for hotels when planning personnel and facilities.
Question 2 is addressed by comparing different machine learning algorithms under different evaluation metrics, such as the recall rate. Predicting cancellation using machine learning classifiers is an important direction for the future, and it can inform the measures taken to improve local tourism. We place greater emphasis on the recall rate than on the accuracy rate. The two are computed differently:

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$$

$$\text{Recall rate} = \frac{TP}{TP + FN}$$

where TP denotes True Positive, TN denotes True Negative, FP denotes False Positive, and FN denotes False Negative. A False Positive is a negative case that the model predicts as positive, while a False Negative is a positive case that the model predicts as negative. In our setting, FP refers to non-cancelled orders that are predicted to be cancelled, which may require a more competitive price or advertisement spending to make up for the losses. FN refers to cancelled orders that are predicted to be non-cancelled; the loss in this category is idle rooms and redundant resources, which costs more than an FP. If many cancelled orders are predicted to be non-cancelled, the hotel can suffer a heavy loss of revenue. The higher the recall rate, the larger the share of cancelled orders that the model identifies. Thus, the analysis focuses on lowering FN by emphasizing the recall rate.
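The two metrics can be illustrated with a minimal Python sketch. The function names and the confusion-matrix counts below are hypothetical and chosen only for illustration; they are not results from this chapter.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all bookings that are classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fn):
    """Share of truly cancelled bookings that the model flags as cancelled."""
    return tp / (tp + fn)


# Hypothetical counts, for illustration only.
tp, tn, fp, fn = 80, 150, 30, 20
acc = accuracy(tp, tn, fp, fn)  # (80 + 150) / 280
rec = recall(tp, fn)            # 80 / (80 + 20)
print(f"accuracy={acc:.3f}, recall={rec:.3f}")
```

In this toy case, lowering FN from 20 to 10 while keeping the other counts fixed would raise recall, which is the quantity the analysis prioritizes.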

Data Preparation

Before analyzing data, it is important to ensure that the data is of good quality. Hence, we start by examining the number and percentage of missing values in the different variables of the dataset. Two variables, company and agent, contain company IDs and agent IDs, respectively. Both columns have a large number of missing values. In addition, the IDs themselves are unlikely to provide valuable information for our analysis, as there is no documented list that maps the IDs back to specific companies or agents. We therefore drop these two columns. Next, we remove 4 rows with missing values for the number of children, as they represent only a very small percentage of the available data instances. Another 488 rows have missing values for the variable country; we replace these missing values with “Unspecified”. Figure 1 shows the summary of missing data.

Fig. 1 Missing data summary

Some categorical columns contain values with very few entries, which can pose problems for subsequent machine learning. The first problem is that when a category has only a few entries, the train-test split performed before the learning phase may place all instances of that minority category in either the test set only or the training set only. Both scenarios cause further issues at the accuracy measurement stage. Although tree-based methods can circumvent this problem with the introduction of surrogates, other machine learning models are not able to handle such a case. The second problem is that a machine learning model can have difficulty learning such scarce instances well because they are grossly underrepresented in the data. Hence, to standardize our procedure, we address the sparse-category issue by either removing the specific categories or aggregating them into a larger, more generic category.

For the distribution channel column, there is only one instance in the undefined category, so we remove this category. For the country column, several countries have only one or very few instances. We aggregate all countries with fewer than 488 instances into one category called “Minority_sum”. There are two columns on room type: reserved room type and assigned room type. In both, categories P and L have very few instances, so we aggregate them into one category called “P&L”.

A total of 403 rows have a value of zero for the number of adult travellers. This does not make contextual sense, as any booking must have at least 1 adult, and we are unable to verify the values with the dataset provider. We suspect that zero was used as a placeholder for bookings where this information was initially not provided or unknown and was never updated. We drop these 403 rows, as they represent only a very small subset of the data.
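The missing-value replacement and the aggregation of sparse categories can be sketched as follows. This is a minimal pure-Python illustration rather than the authors' code: the helper names are hypothetical, the country codes are toy values, and a threshold of 3 stands in for the threshold of 488 used in the chapter.

```python
from collections import Counter


def fill_missing(values, placeholder="Unspecified"):
    """Replace missing entries (None or empty string) with a placeholder label."""
    return [placeholder if v is None or v == "" else v for v in values]


def aggregate_rare(values, min_count, other_label="Minority_sum"):
    """Merge every category with fewer than min_count rows into one label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


# Toy country column (hypothetical values).
country = ["PRT", "PRT", "PRT", "GBR", "GBR", "GBR", "FRA", None, "AGO"]
country = fill_missing(country)
# With this toy threshold the single "Unspecified" row is also merged.
country = aggregate_rare(country, min_count=3)
category_counts = Counter(country)
print(category_counts)
```

After aggregation every remaining category has at least `min_count` rows, so a stratified train-test split is less likely to leave a category entirely in one of the two sets.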

Data Visualization

In this work, we hope to decrease the booking cancellation rate by examining past data. The dataset spans a 3-year period, from 2015 to 2017, and records room bookings and cancellations for the Resort Hotel and the City Hotel. We first look at the cancellation data using the pie chart and bar chart shown in Fig. 2. The pie chart shows that more than one third of guests cancelled across both hotels. The bar chart on the right shows that City Hotel has a larger capacity, with nearly double the bookings of Resort Hotel. However, the cancellation rate of City Hotel (41.79%) is about 50% higher than that of Resort Hotel (27.77%).

We next look at the arrivals of hotel clients by year and month, as shown in Fig. 3. The year 2016 is the peak year for both hotels. The right figure shows the arrivals by month in 2016. May, June, July, and August see the most bookings; this is the summer season, when the hotels are most popular.

We now look at the staying patterns over weekends and the length of stay, as plotted in Fig. 4. Most City Hotel clients do not stay for more than 2 nights over the weekend; that is, most clients stay for at most one weekend during their trip. In Resort Hotel, however, more guests stay longer, for up to 2 weeks, compared with City Hotel.


Fig. 2 Room booking cancellation

Fig. 3 Arrivals by year and month

Fig. 4 Weekend stay for resort hotel and city hotel

In terms of cancellation rate, Resort Hotel customers who spend one whole weekend have a relatively higher cancellation rate of 33.09%, compared with the hotel’s average cancellation rate of 27.77%. In City Hotel, customers with or without weekend stays have cancellation rates of around 40%–43%, similar to the hotel’s average of 41.79%.

Figure 5 depicts cancellation by the number of weeknights stayed. Most guests stay for 1 to 3 weeknights. More guests at Resort Hotel stay for a whole week, but those who stay 2 or more weeknights have cancellation rates of more than 30%, which is higher than the Resort Hotel average. In City Hotel, those who stay 2 or more weeknights also have higher cancellation rates than those who stay 0 or 1 weeknights. From this, we may infer that customers who plan longer stays are more likely to change their minds and cancel.

Fig. 5 Weeknight stay for resort hotel and city hotel

Fig. 6 Cancellation for clients with and without children

Most bookings are made for 2 adults. The data also shows that bookings with 2 adults are the most likely to be cancelled, with a rate of 30.24% in Resort Hotel and 44.21% in City Hotel, both higher than the hotels’ averages. In City Hotel, customers with children are less likely to cancel than those without (No child: 42.22%; 1 child: 33.42%; 2 children: 38.93%) (Fig. 6). The opposite holds for Resort Hotel, where customers with children are more likely to cancel than those without (No child: 26.82%; 1 child: 30.38%; 2 children: 46.25%).

The data also contains information on babies travelling with adults. Although customers who come with babies are a minority, they are less likely to cancel than those without a baby. For example, in the Resort Hotel, which has a relatively higher number of bookings with children, the cancellation rate drops by about 9 percentage points with 1 baby (No baby: 42.22%; 1 baby: 33.42%). This suggests that travellers with young babies are less likely to make changes and tend to stick to their original travel plan.
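Group-wise cancellation rates such as those quoted in this section can be computed with a short aggregation. The sketch below uses toy records; the field names `hotel`, `children`, and `is_canceled` are assumed for illustration, and the helper function is hypothetical.

```python
from collections import defaultdict


def cancellation_rate_by(rows, key):
    """Return {group value: share of cancelled bookings} for the given key."""
    totals = defaultdict(int)
    cancelled = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        cancelled[row[key]] += row["is_canceled"]
    return {group: cancelled[group] / totals[group] for group in totals}


# Toy records, for illustration only (not the chapter's data).
bookings = [
    {"hotel": "City Hotel", "children": 0, "is_canceled": 1},
    {"hotel": "City Hotel", "children": 0, "is_canceled": 1},
    {"hotel": "City Hotel", "children": 0, "is_canceled": 0},
    {"hotel": "City Hotel", "children": 1, "is_canceled": 0},
    {"hotel": "Resort Hotel", "children": 0, "is_canceled": 0},
    {"hotel": "Resort Hotel", "children": 2, "is_canceled": 1},
    {"hotel": "Resort Hotel", "children": 0, "is_canceled": 0},
]
rates_by_hotel = cancellation_rate_by(bookings, "hotel")
rates_by_children = cancellation_rate_by(bookings, "children")
print(rates_by_hotel)
```

The same helper can be applied to any categorical field, for example the number of adults or babies, to reproduce the kind of comparison made above.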


Machine Learning

We look at the correlation among variables before constructing the machine learning models. In this problem, the target variable is a binary categorical variable that describes whether a hotel booking ends up being cancelled (value 1) or not (value 0). We perform a Cramer’s V correlation test, which measures the association between two categorical variables. The results are shown in Fig. 7. We observe that reservation status, which like the target is a binary categorical variable, has a perfect correlation with the target variable. This means that the target variable “is_canceled” and reservation status can be used interchangeably. Reservation status is hence redundant and is dropped from the data. We then perform a point biserial correlation test, which measures the correlation between a binary categorical variable (the target) and continuous variables. The results are shown in Fig. 8.

The data is imbalanced: non-cancellations outnumber cancellations. We perform a stratified 70–30 train-test split and then balance the training set by combining over-sampling and under-sampling. Before balancing, the training set contains 52,407 non-cancellations and 30,876 cancellations. We first randomly over-sample the minority class to 41,642 cancellations and subsequently under-sample the majority class to 41,642 non-cancellations. The training set thus contains 83,283 instances before balancing and 83,284 after balancing, with both classes equally represented.

Fig. 7 Cramer’s V correlation
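The balancing step can be sketched in a few lines of Python. The chapter does not state how the target of 41,642 per class was chosen; the sketch assumes it is half of the original training-set size, rounded up, which reproduces the reported counts. Sampling with replacement for the over-sampled class is also an assumption, and the function name is hypothetical.

```python
import math
import random


def balance_by_resampling(majority, minority, seed=0):
    """Over-sample the minority class and under-sample the majority class so
    that both end up with half of the original total (rounded up)."""
    rng = random.Random(seed)
    target = math.ceil((len(majority) + len(minority)) / 2)
    # Over-sampling: keep every minority row and draw extra rows with replacement.
    extra = [rng.choice(minority) for _ in range(target - len(minority))]
    # Under-sampling: draw a subset of majority rows without replacement.
    kept = rng.sample(majority, target)
    return kept, minority + extra


# Row indices stand in for the actual training rows.
non_cancelled = list(range(52_407))
cancelled = list(range(30_876))
balanced_majority, balanced_minority = balance_by_resampling(non_cancelled, cancelled)
print(len(balanced_majority), len(balanced_minority))
```

Only the training set is resampled in this scheme; the test set keeps its original class proportions so that the reported metrics reflect the real distribution.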


Fig. 8 Point biserial correlation
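The two correlation measures used in this section can be written out explicitly. The following is a minimal sketch under stated assumptions: Cramer’s V is computed from the chi-square statistic of the contingency table without bias correction, and the point biserial correlation is computed as the Pearson correlation between a 0/1 variable and a continuous variable, to which it is mathematically equivalent. The function names and toy inputs are illustrative only.

```python
import math
from collections import Counter


def cramers_v(x, y):
    """Cramer's V between two categorical sequences of equal length.
    Assumes each variable takes at least two distinct values."""
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    chi2 = 0.0
    for a in row_totals:
        for b in col_totals:
            expected = row_totals[a] * col_totals[b] / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))


def point_biserial(binary, values):
    """Pearson correlation between a 0/1 variable and a continuous variable."""
    n = len(values)
    mean_b = sum(binary) / n
    mean_v = sum(values) / n
    cov = sum((b - mean_b) * (v - mean_v) for b, v in zip(binary, values))
    var_b = sum((b - mean_b) ** 2 for b in binary)
    var_v = sum((v - mean_v) ** 2 for v in values)
    return cov / math.sqrt(var_b * var_v)


# A variable that mirrors the target exactly gives V = 1 (toy values).
target = [0, 0, 1, 1]
v_perfect = cramers_v(target, ["Check-Out", "Check-Out", "Canceled", "Canceled"])
v_none = cramers_v(target, ["a", "b", "a", "b"])
r_pb = point_biserial(target, [1.0, 2.0, 3.0, 4.0])
print(v_perfect, v_none, round(r_pb, 4))
```

A value of 1 for Cramer’s V, as in the first toy case, is the situation described above for reservation status, where one variable fully determines the other.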

Dummy encoding was adopted to pre-process categorical features, as the sklearn library requires such features in numerically encoded form. We then use logistic regression, decision tree, random forest, XGBoost, artificial neural network (ANN), and support vector machine (SVM) to build our machine learning models. For logistic regression, we aim to construct a model with an intercept term and hence employ drop-first dummy encoding, so that all base cases are absorbed by the intercept term. For all tree-based models and SVM, we employ non-drop-first dummy encoding. In addition, as SVM is a distance-based algorithm, we scale the features for SVM in a later step. For ANN, we construct a model without any bias term in the input layer and hence also employ non-drop-first dummy encoding.

For each machine learning model, we report the accuracy to assess its performance, with emphasis on the recall rate of the positive class (label value 1), which represents customer cancellation. In our business problem, we want to pick out as many cancellation cases as we can out of the total number of true cancellations, because cancellation cases require special attention and in-depth analysis of the factors that lead a customer to eventually cancel a booking. Failing to anticipate many booking cancellations can be very costly to the hotel business and the tourism sector in general, because it leads to both revenue loss and wasted logistics. On the other hand, wrongly predicting a cancellation is not as costly, since measures such as enhanced personalized services or more relevant customer service can avert cancellations and still benefit customers who would not have cancelled.

The first model we constructed is the logistic regression model. Logistic regression is one of the most fundamental machine learning techniques available. Through its statistical significance indicators, it can support a form of customer profiling in customer analytics. We first fit a multiple logistic regression model on all the available features. This model is called the full model, and it yielded an accuracy of 81.03% and a recall of 78.37%. However, not all the features in the model are statistically significant. We therefore perform backward selection to remove insignificant features from the model. For categorical variables with a mixture of significant and insignificant dummy variables, we use a rule of thumb whereby the feature is considered insignificant if more than half of its dummy variables are insignificant. The statistically insignificant features include arrival month, arrival week, arrival day, number of babies, and car park spaces required. The model that remains after backward selection is called the reduced model. Its results against the test set are shown in Fig. 9. The accuracy and recall dropped slightly in the reduced model. However, a simpler model is preferred, especially in this problem, where we are conducting a form of profiling study, as it zeroes in on the more statistically relevant features. Although the full model performs slightly better in terms of accuracy and recall, it does not separate the statistically significant features from the insignificant ones. The table in Fig. 10 summarizes the features remaining in the reduced model.
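The difference between the drop-first and non-drop-first dummy encodings mentioned earlier in this section can be shown with a small pure-Python sketch. The helper function, the column-name prefix, and the room-type labels other than “P&L” are hypothetical; in practice a library routine would normally be used.

```python
def dummy_encode(values, drop_first=False):
    """One 0/1 column per category; optionally drop the first (base) category."""
    categories = sorted(set(values))
    if drop_first:
        categories = categories[1:]
    return {
        f"room_{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
    }


# Toy room-type column; "A" and "D" are hypothetical labels.
room_type = ["A", "D", "P&L", "A"]
full_encoding = dummy_encode(room_type)                      # tree models, SVM, ANN
reduced_encoding = dummy_encode(room_type, drop_first=True)  # logistic regression
print(sorted(full_encoding), sorted(reduced_encoding))
```

With drop-first encoding, a row whose remaining dummy columns are all zero belongs to the base category, so in a model with an intercept the base case is represented by the intercept alone.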
All these features, such as hotel and lead time, are considered statistically significant in the profiling that we study using the logistic regression model.

Fig. 9 Confusion matrix for logistic regression model

Fig. 10 Important features

Another important machine learning technique capable of supporting customer profiling is the decision tree. We first grow the full tree and subsequently prune it via the cost complexity alpha parameter. At a value of 0, the tree is completely unpruned. As the value increases, the tree is pruned to a greater extent, until at the maximum value the tree is completely pruned. The two graphs in Fig. 11 show how the number of nodes and the depth of the tree vary as the cost complexity alpha parameter is increased.

Fig. 11 Nodes vs CCP alpha (top graph) and depth vs CCP alpha (bottom graph)

The full tree yielded an accuracy of 84.32% and a recall of 81.90%. We prune the full tree to the value of cost complexity alpha that achieves the maximum recall rate, so that the model identifies as many hotel booking cancellations as possible out of the total number of true cancellations. The graph in Fig. 12 shows how the average sixfold cross-validation recall varies with the cost complexity alpha over a restricted range of alpha. It can be observed that the accuracy and recall have increased slightly in the pruned tree model. The table in Fig. 13 summarizes the list of features and their relative importance, normalized to percentage values. The importance measures the contribution of each feature to the overall Gini gain, that is, the relative extent to which each feature contributes to purer splits of the target variable throughout the tree (Figs. 14 and 15).
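The pruning procedure described above can be sketched with scikit-learn. This is an illustrative sketch, not the authors' code: it uses synthetic data from `make_classification` as a stand-in for the hotel dataset, takes the candidate alpha values from the pruning path of the fully grown tree, and selects the alpha with the highest mean sixfold cross-validated recall. The variable names and the random seeds are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; the chapter's hotel dataset is not reproduced here.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)

# Candidate alphas come from the pruning path of the fully grown tree.
full_tree = DecisionTreeClassifier(random_state=0)
path = full_tree.cost_complexity_pruning_path(X, y)
# Drop the last alpha (it prunes the tree to its root) and clip tiny
# negative values that can arise from floating-point error.
alphas = [max(float(a), 0.0) for a in path.ccp_alphas[:-1]]

# Mean sixfold cross-validated recall for each candidate alpha.
mean_recall = []
for alpha in alphas:
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha)
    scores = cross_val_score(tree, X, y, cv=6, scoring="recall")
    mean_recall.append(float(scores.mean()))

best_alpha = alphas[mean_recall.index(max(mean_recall))]
pruned_tree = DecisionTreeClassifier(random_state=0, ccp_alpha=best_alpha).fit(X, y)
print(best_alpha, pruned_tree.get_n_leaves())
```

Selecting alpha on cross-validated recall rather than accuracy follows the emphasis placed on the recall rate in this chapter; swapping `scoring="recall"` for `scoring="accuracy"` would give the more conventional criterion.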


Fig. 12 Recall vs CCP alpha

Fig. 13 Confusion matrix for decision tree model


Fig. 14 Accuracy and confusion matrix for random forest model

Fig. 15 Accuracy and confusion matrix for XGBoost model

The third model we constructed is the random forest model. A random forest, as its name implies, consists of many individual decision trees that operate as an ensemble. Each individual tree in the random forest makes a class prediction, and the class with the most votes becomes the overall model’s prediction. Random forest builds its decision trees on bootstrap samples and provides an improvement over bagged trees by way of a slight modification that decorrelates the trees: each time a split in a tree is considered, a random sample of predictors is chosen as split candidates from the full set of predictors [1]. To enhance the performance of the decision tree models, we use pruning to remove nodes that do not provide additional information and thereby reduce the size of the trees, while checking performance on the cross-validation set so that predictive accuracy is not reduced. Random forest achieves the best accuracy among our models because splitting on a random subset of features decorrelates the trees: the model considers only a small subset of the features at each split rather than all of them. The final modelling results are shown in Fig. 14.

XGBoost and Gradient Boosting Machines (GBMs) are both ensemble tree methods that apply the principle of boosting weak learners using the gradient descent architecture. XGBoost improves upon the base GBM framework through systems optimization and algorithmic enhancements. XGBoost involves three general elements: parallelization, tree pruning, and optimization. For the modelling, several combinations of parameters were tried to find the optimal XGBoost model. In our experiments, the higher the number of estimators, the better the model performed; hence, we increased the number of estimators. By limiting the number of features used to build each tree, we may end up with trees that gain different insights from the data, as they learn to optimize for the target variable using different sets of features. By lowering the learning rate from the default setting, we slow down the pace at which the model learns, which mitigates the risk of diverging. The final modelling results are shown in Fig. 15.

Neural networks have been an actively researched area in artificial intelligence since the 1980s.
It abstracts the neural network of the human brain from the perspective of information processing and establishes simple models that form different networks according to different connection modes. A neural network is an operational model composed of a large number of nodes (neurons) connected with one another. Each node applies a specific output function, called the activation function. Each connection between two nodes carries a weighted value for the signal passing through it, called the weight; the weights are equivalent to the memory of the artificial neural network. The output of the network varies with the connection mode, the weight values, and the activation function of the network. The network itself is usually an approximation of some algorithm or function in nature and may also be the expression of a logic strategy [2].

In our modelling, we likewise first performed a 70–30 train-test split. The structure and configuration of the MLP are listed below:

1. One input layer with 85 variables (including dummy ones).
2. Two hidden layers with 16 nodes in each layer. The activation function is ReLU.
3. One output layer with sigmoid as the activation function (for the classification label).
4. Hyperparameter configuration: epochs = 16; batch_size = 120.

The final modelling results are shown in Fig. 16. This model shows high recall rate and moderate accuracy.
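To make the MLP structure concrete, the following pure-Python sketch runs a forward pass through a network with the layer sizes listed above (85 inputs, two hidden layers of 16 nodes with ReLU, one sigmoid output). It is an illustration only: the weights are random and untrained, a bias on every dense layer is an assumption, and the function names are hypothetical. Training with epochs = 16 and batch_size = 120 is not shown.

```python
import math
import random

LAYER_SIZES = [85, 16, 16, 1]  # input, two hidden layers, output


def init_layers(sizes, seed=0):
    """Random weights and zero biases for each pair of consecutive layers."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Apply ReLU on the hidden layers and sigmoid on the single output node."""
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i == len(layers) - 1:
            x = [1 / (1 + math.exp(-v)) for v in z]  # sigmoid output
        else:
            x = [max(0.0, v) for v in z]             # ReLU hidden layer
    return x[0]


layers = init_layers(LAYER_SIZES)
# Trainable parameters under the bias assumption: weights plus biases per layer.
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
prob = forward(layers, [1.0] * 85)
print(n_params, round(prob, 4))
```

The sigmoid output can be read as a predicted probability of cancellation; a threshold (for example 0.5, an assumed default) turns it into a class label, and lowering that threshold trades accuracy for recall.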


Fig. 16 Accuracy and confusion matrix for MLP model

A support vector machine (SVM) is a binary linear classifier whose classification boundary is chosen to minimize the generalization error. Unlike other classifiers, the SVM boundary is obtained using geometrical rather than algebraic reasoning. The generalization error is associated with the geometrical notion of a margin, which can be defined as the region along the classification boundary that is free of data points. An SVM therefore aims to discriminate between classes using the linear decision boundary with the largest margin, that is, the maximum-margin hyperplane. Maximizing the margin is akin to minimizing the generalization error [3]. In addition, kernels extend SVM to deal with nonlinear problems. The main idea is to obtain a linearly separable boundary by mapping the data into a higher-dimensional space using a kernel transformation. The kernel performs this transformation in a way that still allows us to map the result back to the original feature space.

In our modelling, we tried different kernels with two kinds of feature selection on the cleaned dataset. The first SVM model used the ‘poly’ kernel on the dataset with the full set of features. The second SVM model used the ‘linear’ and ‘rbf’ kernels on 6 features selected for their significance in random forest, because random forest is the model with the highest accuracy among all the models we built. The final result is shown in Fig. 17.

Fig. 17 Accuracy and confusion matrix for SVM model

The table indicates the final test set accuracy and recall rates of the models. In terms of overall accuracy, the three tree models perform better: decision tree at 84.5%, random forest at 88.6%, and XGBoost at 83.55%. For logistic regression, even though we filtered out the insignificant features by a statistical method (selection using p-values), the final accuracy is still lower than that of the three tree models, at below 80%.

In this work, we focus more on the recall rate due to the nature of the problem. By this measure, the artificial neural network model has the highest recall rate. However, the downside of ANN is that its results are difficult to explain, as it works like a black-box model. It is not possible to clearly illustrate the training procedure or to identify the significant features from the test result. Despite its good prediction performance, its usability for businesses seeking more valuable information is low.

In conclusion, since we focus on achieving a higher recall rate, we select ANN as our final model for predicting hotel booking cancellation. However, as random forest has the best accuracy, it is still a robust model that is useful for this tourism problem. Therefore, we use random forest to analyze business problems based on feature importance. In the random forest model, several features are extremely significant for prediction performance. We list the top 5 attributes below and provide business recommendations based on them in the next section:

1. deposit_type
2. market_segment
3. lead_time
4. country
5. total_of_special_requests

Business Insights and Solutions

In this section, we discuss the business insights that follow from the earlier modelling. After applying several machine learning models, we sum up our findings in five important features: Deposit Type, Market Segment, Lead Time, Country of Origin, and Number of Special Requests. Examining each of these features more closely yields insights into what causes bookings to be cancelled in Portugal. We also propose recommendations that the sector can adopt to reduce losses arising from booking cancellations, so that it can recover quickly once the pandemic is over.

First, on Market Segment and Deposit Type. Among the customers who cancelled, a disproportionately large number came from the “Groups” market segment: this segment represents 27.4% of total cancellations, compared with 10.3% of total non-cancellations. The “Groups” cancellations fall under two deposit types, “Non-Refund” and “No Deposit”. “Non-Refund” constitutes 75.3% of the “Groups” cancellations, while “No Deposit” accounts for 24.6%. These numbers show that group bookings, a major market segment in Portugal, also contribute the most cancellations, mainly under the non-refundable deposit type. Hence, for the group bookings market segment, we make the following recommendations:

· Increase the deposit amount for group bookings. A higher deposit would lead groups to make a reservation only when they have confirmed their travel plan or event, which can reduce last-minute cancellations.
· For customers with a “Non-Refund” deposit, introduce a flexible scheme in which customers have a one-time option to reschedule the hotel booking to another period or to change the booking to a partnering hotel, instead of having to cancel the original booking. This better caters to their changing needs and convenience in the case of event cancellation, and it is feasible for most local tourists from Portugal itself.

Figure 18 shows the number of bookings by market segment for customers who did not cancel (Fig. 18a) and for customers who cancelled (Fig. 18b).

(a) Market Segments for Non-Cancellation

(b) Market Segments for Cancellation

Fig. 18 Market segments by cancellation

Second, on the country of origin. Figure 19 plots the bookings by country of origin. The greatest number of bookings comes from within the country, and so does the greatest number of cancellations. Local Portuguese bookings account for 40% of total bookings, and around 57% of these domestic bookings are among the cancelled bookings.

Fig. 19 Cancellation based on different country of origins

One suggestion to reduce cancellations is therefore to focus on the local Portuguese market. We propose the following policy changes to reduce such cancellations:

· Offer Portuguese guests the option to defer their stay within 6 months, in exchange for a higher deposit than at present or a phased refund.
· The Portuguese Tourism Bureau can provide a platform on which local tourists who need to cancel can transfer their booking to other buyers at a lower price (or at the same price, depending on the season).
· Reduce the quota of hotel rooms for Portuguese guests, and give priority to local tourists who can show evidence of a confirmed travel plan, such as transportation tickets, booked attraction tickets, or other travel documentation from a travel agent.

Third, on the number of special requests. Some travellers make one or more special requests for their hotel stays. Figure 20 shows the breakdown by number of special requests for both cancelled and non-cancelled bookings. People without any special requests tend to cancel their bookings more than people with special requests. In the graph, red denotes cancelled bookings, and the proportion of cancellations among bookings with no special request is higher than among bookings with one or more special requests. This is consistent with the idea that people who have confirmed travel plans take this option seriously, so such bookings tend towards non-cancellation, as is also evident from the numbers shown in Fig. 20.

Fig. 20 Cancellation based on special requests

Following this observation, we suggest that hotel operators allow a higher overbooking quota to account for the cancellations that may arise, especially from bookings without any special requests. This allows better allocation of resources in the event of last-minute cancellations and reduces losses for hotel operators.

Fourth, on lead time. Lead time denotes the duration between the booking date and the arrival date. We analyze the lead time over the three-year period, which gives a good estimate of the average lead time for cancelled bookings and for non-cancelled bookings. Figure 21 shows the average lead time for the years 2015, 2016, and 2017. Cancelled bookings have an average lead time of 140–160 days, compared with 40–60 days for non-cancelled bookings. With this observation, we recommend limiting the advance booking period to 60–80 days, rather than the current unlimited period. Because there is no limit on the advance booking period, travellers tend to reserve well in advance, even though they may not have a concrete travel plan at that point, and are more likely to cancel at the last minute. Another recommendation is to increase the cancellation charges closer to the arrival date. This can discourage last-minute cancellations and encourage tourists to book when their plans are firmer.


C. Yuxuan et al.

Fig. 21 Average lead time for cancellation

Conclusion

Over the last few decades, increasing international tourism receipts have contributed to significant growth in global GDP. Increased human interaction from tourism also brings second-order benefits to international trade, foreign direct investment, supply chain integration, and jobs. However, tourism is vulnerable to disruptions in both destination and origin markets. Prior to the COVID-19 pandemic, significant disruptions to international tourism included Foot and Mouth disease in the UK (2001), the terrorist attack on Indonesia’s resort island of Bali (2002), the outbreak of severe acute respiratory syndrome (SARS) (2003), and the tsunami in South Asia (2004). Such disruptions have an immediate impact on the regional tourism sector. The time required for tourism to fully recover from a crisis depends on the nature of the disturbance and how exactly the tourism system has been affected [4]. Different approaches have been used to study tourism and its recovery from disruptions. In this work, we took a mixed-method approach to look at tourism in Portugal and reveal the factors contributing to hotel booking cancellation. Our suggestions follow from observations on the data itself and from the machine learning models constructed.

References

1. A Simple Introduction to The Random Forest Method (2020) https://arifromadhan19.medium.com/a-simple-introduction-to-the-random-forest-method-badc8ee6c408
2. Shruti_Iyyer (2019, 4) ANN and classification. https://www.kaggle.com/shrutimechlearn/deep-tutorial-1-ann-and-classification
3. Rogel-Salazar R (2017) Data science and analytics with Python. Chapter 9
4. Zeng B, Carter RW, De Lacy T (2005) Short-term perturbations and tourism effects: the case of SARS in China. Curr Issues Tour 8(4):306–322. https://doi.org/10.1080/13683500508668220

Tourism Prediction Analytics

Chen Shuhua, Gao Yuan, Lin Desheng, Shen Yi, and Wu Di

Globalization and economic growth have made Singapore one of the most important international destinations in Southeast Asia for business travelers as well as recreational visitors. As one of the most prosperous cities in Southeast Asia, Singapore attracts business travelers and event organizers from Asia, Europe, and the United States. In addition, well-established transport links to other Southeast Asian countries such as Malaysia, Indonesia, and Thailand contribute to Singapore’s popularity as a destination. These links are coupled with efficient airports such as Changi Airport, the 7th busiest airport in the world according to 2018 statistics from Airports Council International. With many direct routes to the rest of the world, Changi Airport provides convenience and flexibility to visitors. As a result, the tourism industry in Singapore grew steadily in the last decade, with the monthly total number of visitors rising from 800,000 in 2008 to around 1,800,000 in 2019.

Tourism and hospitality are important industries in Singapore, accounting for around 4 percent of its Gross Domestic Product. As a Southeast Asian country with a small land area, Singapore relies heavily on the economic contribution of inbound international visitors. However, the unexpected outbreak of the COVID-19 epidemic has adversely affected the inflow of tourists and hit the tourism sector hard. Following the outbreak of the Coronavirus Disease (COVID-19) in 2019, a sudden increase in cases in late February 2020 caused widespread concern around the world. Italy, South Korea, Iran, France, Germany, Spain, the United States, India, and Japan were among the most severely affected countries.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_8.

C. Shuhua (B) · G. Yuan · L. Desheng · S. Yi · W. Di
Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, Singapore 639798, Singapore

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_8


Collecting epidemiological data and predicting epidemic trends are critical to formulating and measuring public intervention strategies. This work examines the severity and global trend of COVID-19 and provides business recommendations to help the tourism industry recover.

Dataset and Analysis

We analyzed data on COVID-19 provided by the World Health Organization (WHO) and Johns Hopkins University. The dataset provides daily updates on confirmed cases, deaths, and recoveries around the world from January 2020 to January 2021. This enables us to understand the global spread of COVID-19 and forecast its severity in each country in the near future. In addition, we collected data on tourism and the hotel industry from the Department of Statistics of Singapore. The tourism industry dataset provides information on international visitor arrivals by mode of transportation and on tourism receipts before and during the COVID-19 period. The hotel industry dataset provides key factors that influence the total revenue of each hotel, including the standard average room rate and the average occupancy rate before and during the COVID-19 period. Analyzing these datasets gave us useful insights, such as the influence of COVID-19 on these industries and some potential remedies for them.

Current Situation of COVID-19

The following analysis first focuses on some global trends of COVID-19. Figure 1 depicts the recovery rate of COVID-19 and Fig. 2 illustrates the fatality rate of the virus. Figure 1 shows that the current situation of COVID-19 is still severe because of the high number of active cases. However, the plots also indicate that COVID-19 is controllable, because the recovery rate is significantly higher and the mortality rate significantly lower than for other viral diseases of the past fifty years.

Secondly, we look at country-level confirmed cases, which provide useful trend information. In this work, we also focus on the ranking of confirmed cases and active cases, because these data are important indicators of the severity of COVID-19 in each country. Understanding the situation around the world helps to establish suitable travel restrictions and stay-home notice (SHN) regulations. Figure 3 shows the countries with the highest numbers of confirmed cases. Based on the graph, Europe and the United States are still suffering badly from the pandemic. In terms of total confirmed cases, the top three countries are the United States, India, and Brazil, which are the key countries for the following analysis. Figure 4 depicts the number of active cases by country. The number of active and recovery cases


Fig. 1 Recovery rate of COVID-19

Fig. 2 Fatality rate of virus

indicates that the situations in Brazil and India are improving considerably. Combining this with the data provided by the Singapore government, we focus on countries such as China, India, the USA, and South Korea, where the situation is not yet well under control.
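The indicators discussed in this section can be derived from the cumulative counts in the dataset. The sketch below is illustrative: it assumes that active cases equal confirmed cases minus deaths and recoveries, and that both rates are taken as a share of confirmed cases; the chapter does not state its exact definitions. The country names and counts are hypothetical.

```python
def covid_rates(confirmed, deaths, recovered):
    """Return active cases, recovery rate and fatality rate for one region.

    Assumes active = confirmed - deaths - recovered, and that both rates
    are expressed as a share of confirmed cases.
    """
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    active = confirmed - deaths - recovered
    return {
        "active": active,
        "recovery_rate": recovered / confirmed,
        "fatality_rate": deaths / confirmed,
    }


# Hypothetical cumulative counts (confirmed, deaths, recovered), for illustration only.
counts = {
    "Country A": (100_000, 2_000, 90_000),
    "Country B": (50_000, 500, 20_000),
}
summary = {name: covid_rates(*c) for name, c in counts.items()}

# Rank countries by active cases, highest first.
ranking = sorted(summary, key=lambda name: summary[name]["active"], reverse=True)
print(ranking)
```

Ranking by active cases rather than by total confirmed cases highlights countries where the outbreak is still ongoing, which is the distinction the text draws between Fig. 3 and Fig. 4.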


Fig. 3 Confirmed cases by countries

Fig. 4 Active cases in each country

Prediction of COVID-19

In this section, we perform statistical modeling and related analysis. We first look at three indicators: the growth factor, the growth ratio, and the growth rate. Writing C(N) for the number of confirmed cases on day N, the growth factor on day N is (C(N) − C(N−1)) / (C(N−1) − C(N−2)), that is, the new cases on day N divided by the new cases on day N−1. The growth ratio on day N is C(N) / C(N−1). Finally, the growth rate is the first-order derivative of the number of confirmed cases.

We use these growth indicators to understand which of the selected countries may have reached an inflection point, which denotes a significant change in the situation. For example, if a country’s growth factor stabilizes at around 1.0, it may indicate that the country has reached an inflection point. We then use curve fitting to fit a logistic curve to the number of confirmed cases in each country. Based on these plots, all the countries, including India, the USA, and China, are suitable for fitting with the logistic model. Among the statistical and machine learning models that have been used to predict the development of epidemics as time series, the logistic model is simple in principle and computationally efficient, and it is often used for regression fitting of time series data. For example, as seen in the SARS outbreak, logistic growth is characterized by slow growth at the beginning, a rapid growth phase close to the peak of the incidence curve, and a slow growth phase nearing the end of the outbreak. Mathematically,


Fig. 5 Number of arrivals in Singapore from the year 2008 to 2021

the logistic model describes the dynamic evolution of infected individuals being controlled by the growth rate and population capacity. Based on the above design of the Logistic model, we can build some statistical models to predict the severity of COVID-19 in each country.
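The growth indicators and the logistic fit described in this section can be sketched as follows. This is an illustration under stated assumptions, not the authors’ implementation: the logistic curve is parametrized as C(t) = k / (1 + exp(−r (t − t0))) with capacity k, growth rate r, and midpoint t0, and the chapter does not say which fitting routine was used. Here a grid search over k combined with a linear least-squares fit is swapped in for a general nonlinear optimizer. All function names and the synthetic data are hypothetical.

```python
import math


def growth_indicators(cases):
    """Growth rate, growth ratio and growth factor per day.

    cases: cumulative confirmed cases, one value per day.
    Returns three lists aligned with `cases`; entries that cannot be
    computed (first days, division by zero) are None.
    """
    n = len(cases)
    rate = [None] * n    # daily increase, a discrete first derivative
    ratio = [None] * n   # C(N) / C(N-1)
    factor = [None] * n  # (C(N) - C(N-1)) / (C(N-1) - C(N-2))
    for i in range(1, n):
        rate[i] = cases[i] - cases[i - 1]
        if cases[i - 1] != 0:
            ratio[i] = cases[i] / cases[i - 1]
        if i >= 2 and rate[i - 1] != 0:
            factor[i] = rate[i] / rate[i - 1]
    return rate, ratio, factor


def logistic(t, k, r, t0):
    """Logistic curve with capacity k, growth rate r and midpoint t0."""
    return k / (1.0 + math.exp(-r * (t - t0)))


def fit_logistic(cases, k_steps=400, k_max_multiple=3.0):
    """Fit a logistic curve by grid search over k plus a linear fit.

    For a fixed capacity k, ln(k / C - 1) = -r * t + r * t0 is linear in t,
    so r and t0 follow from ordinary least squares. The k with the smallest
    squared error on the original scale is kept.
    """
    days = [t for t, c in enumerate(cases) if c > 0]
    obs = [cases[t] for t in days]
    peak = max(obs)
    t_mean = sum(days) / len(days)
    sxx = sum((t - t_mean) ** 2 for t in days)
    best = None
    for step in range(k_steps):
        # Candidate capacities from 1.01x to k_max_multiple x the observed peak.
        k = peak * (1.01 + (k_max_multiple - 1.01) * step / (k_steps - 1))
        z = [math.log(k / c - 1.0) for c in obs]
        z_mean = sum(z) / len(z)
        slope = sum((t - t_mean) * (zi - z_mean) for t, zi in zip(days, z)) / sxx
        r = -slope
        if r <= 0:
            continue  # not an increasing curve
        t0 = (z_mean - slope * t_mean) / r
        sse = sum((logistic(t, k, r, t0) - c) ** 2 for t, c in zip(days, obs))
        if best is None or sse < best[0]:
            best = (sse, k, r, t0)
    if best is None:
        raise ValueError("no increasing logistic curve fits the data")
    return best[1], best[2], best[3]


# Synthetic, noise-free data from a known curve (illustrative values only).
synthetic = [logistic(t, 1000.0, 0.2, 30.0) for t in range(50)]
k_hat, r_hat, t0_hat = fit_logistic(synthetic)
rate, ratio, factor = growth_indicators([1, 2, 4, 8])
print(round(k_hat), round(r_hat, 2), round(t0_hat))
print(factor)
```

On real case counts the fit is only as good as the assumption that the outbreak follows a single logistic wave; a growth factor hovering near 1.0 is a useful cross-check that the midpoint t0 has been reached.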

Development of Tourism/Hotel Industry

In this section, we take a closer look at the international visitor arrival statistics. In addition to the number of arrivals, the main metric of the tourism industry, variables such as the age of visitors, the purpose of trips, and places of interest are chosen to reflect the development of Singapore’s tourism industry and the impact of COVID-19. The number of arrivals directly shows the level of prosperity of the tourism industry. During the 10 years before COVID-19, the yearly growth rate of arrivals was positive in every year except 2009 and 2014, ranging from 1% (2015) to 13% (2011). This reflects a generally increasing trend in the number of arrivals, as shown in Fig. 5. Some countries and regions, such as the USA, Mainland China, and even India, showed an evident growth trend in the number of arrivals, while others showed a stable or decreasing trend in recent years. This pattern can be taken into account when a country wants to target more promising markets.

Seasonality of Arrivals

Apart from the increasing trend, the number of arrivals also shows apparent seasonality. This was validated by decomposing the number of arrivals using the base time series model. Prior to the outbreak of COVID-19, the peak months were January, July, August, and December, as can be seen in Fig. 6. The seasonality of arrivals varies by region. Visitors from Southeast Asia usually prefer to visit Singapore in December, during year-end holidays such as Christmas and the New Year. The second peak for visitors from Southeast Asia is around July. Figure 7 depicts the number of arrivals from different countries. For visitors from China, the school summer vacation in China sees the highest number of arrivals of the year. Another peak period for travel to Singapore is around Chinese New Year.
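Peak months of this kind can be identified with a seasonal index. The chapter does not name the decomposition routine behind Fig. 6, so the sketch below substitutes a much simpler technique: each calendar month’s mean divided by the overall mean, which ignores trend. The function name and the arrival figures are hypothetical.

```python
from collections import defaultdict


def seasonal_indices(monthly_values, start_month=1):
    """Ratio of each calendar month's mean to the overall mean.

    monthly_values: consecutive monthly counts, oldest first.
    start_month: calendar month (1-12) of the first observation.
    An index above 1.0 marks a month that is busier than average.
    """
    by_month = defaultdict(list)
    for i, value in enumerate(monthly_values):
        month = (start_month - 1 + i) % 12 + 1
        by_month[month].append(value)
    overall = sum(monthly_values) / len(monthly_values)
    return {m: (sum(v) / len(v)) / overall for m, v in sorted(by_month.items())}


# Two hypothetical years of monthly arrivals (in thousands), January first.
year_one = [120, 90, 95, 100, 95, 105, 125, 125, 90, 95, 100, 130]
arrivals = year_one + [v * 1.1 for v in year_one]

indices = seasonal_indices(arrivals)
peak_months = sorted(sorted(indices, key=indices.get, reverse=True)[:4])
print(peak_months)
```

Because the index ignores trend, it is best applied to a few complete years at a time; a full decomposition would separate trend, seasonal, and residual components explicitly.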

Fig. 6 Seasonality of the number of arrivals in Singapore

Fig. 7 Number of arrivals from different countries


Fig. 8 Average age of total arrivals

Age of Visitors

We also observe an age pattern among tourists visiting Singapore. Visitors in July and August tend to be student travelers, who are much younger in general. December also shows a relatively young average age among visitors, while other months show an average age of around 40 years. This is supported by the variation within each age group. The age group of 14 and below has the largest variation, since many of them are students. The age group of 35 to 44 also has a large variation, since many of them are parents. The average age of total arrivals is visualized in Fig. 8.

Purpose of Trips

Identifying visitors’ purpose of travel helps to pinpoint the target market and match it with a suitable marketing strategy. Over the last decade, the proportion of people traveling for holiday rose significantly from 40 to 60%, while the proportion traveling for business showed the reverse trend, dropping from 30% to nearly 10%. This change illustrates that, over the past 10 years, Singapore has become a tourism destination rather than a business center. Therefore, to boost the tourism industry in Singapore, the government should target recreational visitors rather than business travelers. Figure 9 shows the purpose of visit for visitors entering Singapore, with a significant uptick in the “Holiday” category.

Places of Interest

The popular places of interest prior to COVID-19 are shown as pie charts in Fig. 10. The chart on the left shows the percentage of visits to the main attractions in Singapore, and the chart on the right shows the free-access attractions, in the year 2019


Fig. 9 Purpose of visits

before the pandemic hit. To boost Singapore’s tourism industry, ticket bundles that pair popular attractions with less popular ones could be considered.

Hotel Industry

Singapore is consistently ranked as one of Asia’s premier tourist destinations. It offers a diverse range of hotels, from luxury hotels to chic boutique spaces, as well as no-frills budget establishments. Travelers can experience diverse hospitality offerings when traveling to Singapore for business or leisure. The types of hotel offerings include the following:

· Heritage: Hotels awarded the Architectural Heritage Awards or equivalent awards, and hotels gazetted as national monuments.
· Boutique: Hotels that usually have fewer than 150 rooms and are endorsed by Small Luxury Hotels of the World, Design Hotels, Leading Hotels of the World, or equivalent bodies.
· Resort: Hotels that are typically located away from the bustling city and offer recreational facilities such as golf, tennis, sailing, swimming, and snorkeling.

There are two typical business models: asset-heavy and asset-light. Under the asset-heavy model, brand operators operate hotels on leased or owned properties. Under the asset-light model, brand operators provide brand licenses and support to franchisees and collect fees in return. Hotel groups typically start with the asset-heavy model for better quality control, and then roll out their network rapidly under the asset-light model. The upstream of the hotel industry consists of commercial property developers and owners who provide operation sites for hotel operators, as well as suppliers of

(a) Percentage of major attractions and sites in 2019

(b) Percentage of free access attractions and sites in 2019

Fig. 10 Percentage of major attractions and free access attractions in 2019

facilities, equipment, and essential materials for daily operations. End-customers of hotel services are the downstream consumers of the hotel industry. They generally include individual guests on business and tourism trips as well as corporate customers attending Meetings, Incentives, Conferences, and Exhibitions (MICE). Increasingly, end-customers make direct hotel bookings through hotel websites, applications, corporate accounts, direct phone calls, and onsite “walk-in”. Indirect booking channels, such as online travel agency (“OTA”) platforms and other intermediaries, are also common customer acquisition channels.

Impact of COVID-19 on Singapore’s hotel industry

COVID-19 represents one of the most significant challenges the hotel industry has ever faced. After COVID-19 was reported in January 2020, it quickly escalated into a global pandemic, discouraging global business and leisure travel and negatively impacting the tourism segment. The severe slowdown in domestic and global economic activity, especially in the travel and hospitality industries, makes the timing and extent of a sustainable recovery uncertain. The following are the observed phenomena and changes brought by COVID-19 to Singapore’s hotel industry:

· Greater resilience of hotel chains. Hotel chains tend to be more resilient than independent hotels during the pandemic because of better brand recognition and more capital resources.
· Higher expectations on hygiene and safety measures. COVID-19 has reinforced guests’ awareness of and demand for safety and hygiene when choosing a hotel. Leading hotel chains, which have better branding, more stringent daily sanitation procedures, and quicker responses to guests’ needs, are better positioned to gain guests’ trust. In return, their occupancy rates tend to rebound faster.
· Accelerated application of innovative technology in the hotel industry. In response to the demand for social distancing, hotels with more advanced technologies (such as smart robot services and non-contact check-in/check-out services) have become increasingly appealing to guests and thus tend to recover more quickly from COVID-19.

Descriptive Analysis

The pandemic and the related restrictions on travel, business activity, and individual movement had an unprecedented impact on the tourism industry, especially on hotels. We measure and compare three metrics related to hotels’ daily operations before and after the outbreak of COVID-19. The impact of the pandemic on room price, occupancy rate, and the number of available rooms is depicted in the following visualization and analysis.

First, we look at revenue per available room. Before COVID-19, hotel revenue in Singapore was trending upward, apart from the economic crisis in 2009. However, revenue per available room dropped markedly in February 2020.


Because of the lagging effect of revenue computation, the situation quickly deteriorated as more and more countries started lockdowns. Low market demand and pessimism among investors drove hotel owners to adjust room rates. As the pandemic came under control in Singapore after June, room rates stopped dropping. Despite initial upticks later in 2020, the percentage of change remained low, making it hard for revenue to recover to its normal level. Figure 11 shows the revenue and percentage of change for revenue per room. The steep drop in 2020 shows how difficult it will be to return to the upward trend seen before the pandemic.

The second metric is the number of available room nights. The number of available room nights kept increasing over the past decade and has doubled since 2008, as shown in Fig. 12. However, this trend was disrupted by COVID-19. The seasonality of the past time series can no longer be observed, because the pandemic has had a long-lasting impact on Singapore as well as the whole world. The number of available rooms dropped to about 800 after the lockdown in February 2020. The increasing trend in hotel rooms implies that hotels in Singapore were providing more accommodation to tourists from all over the world. With tourist arrivals at a historical low, one would expect the supply of available hotel rooms to increase correspondingly. In reality, however, more than half of the 67,000 hotel rooms in Singapore were being used in the battle against COVID-19. These rooms were used as isolation or quarantine facilities, as well as accommodation for returning residents serving their 14-day stay-home notice. At the same time, the construction of new hotels was halted during the pandemic, and the government set aside some hotel rooms in case of a sudden surge of infections, so it will take considerable time for the number of available hotel rooms to get back on track.

The occupancy rate of hotel rooms is another key metric for measuring the impact of COVID-19. The occupancy rate for different types of hotels is shown in Fig. 13. This metric is more sensitive and intuitive, since it responds directly to the market.

Fig. 11 Percentage of change for revenue per room


Fig. 12 Percentage of change for average available room nights

Fig. 13 Occupancy rate of different types of hotels

The occupancy rate started dropping earlier than the other metrics, even before the beginning of 2020, when the first cases of COVID-19 were reported and led to the lockdown policy in China. Tourists from China are one of the largest groups of tourists visiting Singapore. The occupancy rate was over 80% for all room types in 2019 and quickly dropped to less than 50% in 2020. Although the occupancy rate gradually increased after March 2020, it is uncertain whether it will recover to its normal level, since it continued to fluctuate with the pandemic control measures in Singapore. Another insight is that the luxury room type suffered the most, while the economy room type was minimally affected and recovered quickly. Economy rooms are also expected to return to pre-pandemic levels fastest, while luxury and upscale hotels recover more slowly. This is in part because economy hotels are better able to tap on segments of demand that remain relatively healthy despite travel restrictions.

Time Series Prediction

In this section, we illustrate how the time series prediction model was developed. We first developed five different time series prediction models and then chose the best model to predict what the related metrics would have been without the influence of COVID-19, in order to show the impact on the Singapore tourism industry. The five models are the simple moving average (SMA), simple exponential smoothing, Holt, Holt-Winters, and Seasonal Autoregressive Integrated Moving Average (SARIMA) models, discussed below.

The simple moving average (SMA) model calculates the average of the target variable over a selected window. In our case, the window length is set to 12 to cover a whole year of data. The simple exponential smoothing method predicts from a weighted sum of past observations, with weights that decrease exponentially for older observations. The smoothing factor is set to 0.95 to assign a large weight to recent sample points. The Holt model includes two smoothing equations to account for the influence of trend. The Holt-Winters model takes into account three aspects of the time series: average, trend, and seasonality. It uses exponential smoothing to encode values from the past and uses them to predict values for the present and future. Finally, SARIMA was used to predict the time series target variable. It uses the parameters of ARIMA models plus seasonal elements, namely the seasonal autoregressive order, seasonal difference order, seasonal moving average order, and the number of time steps in a single seasonal period, to predict future values.

Data before 2020 was used as the training set, and the values in 2020 as the prediction target, to characterize the impact of COVID-19 by comparing predictions with real-world data. We use RMSE (root mean square error) to measure the performance of the models. The first target variable is the number of arrivals. SARIMA is chosen to predict the number of arrivals because its trend and seasonal components match the characteristics of this data.
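Two of the simpler models above, together with the RMSE metric, fit in a few lines. This is a minimal sketch rather than the authors’ code: the function names are hypothetical, the exponential smoothing level is initialised at the first observation (an assumption), and the sample series is made up. The window of 12 and smoothing factor of 0.95 follow the text.

```python
import math


def sma_forecast(series, window=12):
    """Forecast the next value as the mean of the last `window` values."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    return sum(series[-window:]) / window


def ses_forecast(series, alpha=0.95):
    """Simple exponential smoothing; returns the one-step-ahead forecast.

    The level starts at the first observation (an assumed initialisation)
    and is updated as level = alpha * y + (1 - alpha) * level.
    """
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical monthly arrivals (in thousands) for one year.
history = [1500, 1400, 1450, 1480, 1460, 1520, 1700, 1690, 1430, 1470, 1490, 1750]
print(sma_forecast(history), ses_forecast(history))
print(rmse([1, 2, 3], [1, 2, 5]))
```

With a smoothing factor as high as 0.95, the exponential smoothing forecast stays close to the latest observation, whereas the 12-month moving average removes seasonality entirely; neither captures the seasonal peaks, which is why seasonal models are compared on RMSE as well.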
Figure 14 shows the results from the SARIMA model. We can see a significant drop in the first quarter of 2020, when the COVID-19 epidemic started. Table 1 shows the actual arrivals versus the predicted arrivals. The data also reveals that arrivals in February 2020 were about half of the predicted number, because the number of visitors from China decreased dramatically after China reported the spread of the COVID-19 virus. Following the lockdown measures implemented by the Singapore government, the number of arrivals fell below 1000 in April 2020, a drop of more than 99% compared with the predicted number. Although Singapore moved to phase two and phase three of reopening after controlling the spread of COVID-19 in the community, the number of visitors remained extremely low because of quarantine requirements at both the point of departure and the destination.


Fig. 14 Results from the SARIMA model

Table 1 Actual and predicted arrivals

Month     Actual arrivals   Predicted arrivals   Drop rate (%)
Jan-20    1,688,099         1,697,900            −0.58
Feb-20    732,965           1,566,368            −53.21
Mar-20    240,001           1,638,323            −85.35
Apr-20    750               1,656,366            −99.95
May-20    880               1,544,135            −99.94
Jun-20    2,171             1,624,789            −99.87
Jul-20    6,843             1,859,266            −99.63
Aug-20    8,912             1,798,707            −99.50
Sep-20    9,500             1,524,511            −99.38
Oct-20    13,397            1,586,240            −99.16
Nov-20    14,676            1,577,346            −99.07
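The drop-rate column appears to be the percentage difference between actual and predicted arrivals. This definition is inferred from the values in Table 1 rather than stated in the text, and the function name below is hypothetical.

```python
def drop_rate(actual, predicted):
    """Percentage difference of actual arrivals from predicted arrivals."""
    return (actual - predicted) / predicted * 100.0


# (actual, predicted) pairs taken from Table 1.
table_1 = {
    "Jan-20": (1_688_099, 1_697_900),
    "Feb-20": (732_965, 1_566_368),
    "Mar-20": (240_001, 1_638_323),
    "Apr-20": (750, 1_656_366),
}
for month, (actual, predicted) in table_1.items():
    print(f"{month}: {drop_rate(actual, predicted):.2f}%")
```

Because the predicted value is the model’s estimate of arrivals without COVID-19, the drop rate can be read as the share of expected arrivals lost in each month.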

We next look at three other important target variables: revenue per available room, available room nights, and occupancy rate. Among the five models we developed, the SARIMA models have the lowest RMSE for all target variables, so we adopted SARIMA in the subsequent analysis. Figure 15 shows the predictions made by the SARIMA models. Figure 15(a) and (b) show the predictions for revenue per available room and average room nights, and Fig. 15(c)–(e) show the predictions for the occupancy rate of three room types: upscale, mid-tier, and economy. Table 2 lists the RMSE computed for these predictions. Although SARIMA performs best among the five prediction models, the RMSE is still very high, because the actual situation is highly unpredictable. The RMSE values for revenue per available room and average room nights are 294.1 and 866.1, and the RMSE values for the occupancy rate of the upscale, mid-tier, and economy room types are 0.383, 0.288, and 0.220. The lower RMSE for economy rooms is consistent with our earlier assessment that the economy room type would be the first to recover from the pandemic. For revenue per available room and available room nights, however, a return to normal levels remains difficult, and there is still a long way to go before the impact of COVID-19 is eliminated entirely.

(a) Predicted revenue per available room

(b) Predicted average room nights

(c) Predicted occupancy rate (upscale)

(d) Predicted occupancy rate (mid-tier)

(e) Predicted occupancy rate (economy)

Fig. 15 Detailed Results from SARIMA models

Table 2 RMSE of targeted variables

Targeted variable             RMSE
Revenue per available room    294.1
Average room nights           866.1
Occupancy rate (upscale)      0.383
Occupancy rate (mid-tier)     0.288
Occupancy rate (economy)      0.220

Recommendation

Despite the uncertainty of the COVID-19 epidemic, predicting the number of arrivals is extremely important for both policymakers and the tourism industry. Machine learning algorithms trained with seasonality factors are recommended for tourist flow prediction. If travel limitations remain in place in Singapore or other countries, the main customers of tourism attractions will be local residents or long-term pass holders in the community. During the epidemic, controlling the movement of people is essential to prevent outbreaks of COVID-19. If the number of visitors is predicted to be high in special periods such as holidays, tourism companies should adopt crowd control measures beforehand, such as pre-registered entry, to avoid overcrowding. Conversely, when the number of visitors is predicted to be low, companies can change their marketing strategies to attract more local tourists and earn more revenue to offset losses due to the epidemic.

Upon recovery from the epidemic, the prominent seasonality of the Singapore tourism market is both an opportunity and a challenge for the Singapore tourism industry. If the epidemic is under control and the border is opened again, the number of tourist arrivals is expected to follow the seasonal trend, especially during holidays. Therefore, the trained time series model can be used to predict tourist movement in Singapore. Some researchers have implemented similar methods to predict tourist flow. For example, Chen, R. [1] implemented Support Vector Regression (SVR) and an Artificial Neural Network (ANN) to predict tourist movement in China. Similar techniques can also be used in hourly crowd flow management to estimate peak and off-peak hours on weekdays and holidays. With such information, managers of tourism attractions can better control crowd flow and maximize their profits. The high demand during peak seasons such as July and December presents an opportunity for companies to recoup losses. Early prediction can help them prepare to handle a high volume of overseas tourists properly.
Otherwise, overcrowding can harm the tourist experience, the reputation of tourism attractions, and the impression left in tourists’ minds. Therefore, predictive models are recommended for use in the tourism industry.

Since the outbreak of COVID-19, visitors have stayed far longer than before, although the number of visitors has fallen significantly. This means that recent visitors are not holiday tourists but rather holders of long-term passes, such as international students, or business travelers. The first suggestion is to develop the local tourism market by redesigning multi-entry recreational packages that allow people to go for a holiday within certain periods of time. For example, Sentosa packages could consist of multi-entry Universal tickets and vouchers for restaurants and bars in Sentosa. This can attract international students and local teenagers to spend their free time there. Longer packages, for example two-month multi-entry tickets, allow them to enjoy multiple attractions and return within the period. Similar tourism packages can be designed for every group of people in Singapore, encouraging them to spend time and money on local tourism attractions and thus helping the tourism industry to survive this difficult time. From the policymaker’s perspective, packaging popular attractions with less attractive ones can also help. Tourists may be selective about which attractions on the list to visit, but the package can still encourage some people to spend time in less popular places.


Secondly, we suggest holding events and activities with strict epidemic prevention and control measures. We should treat the epidemic as a new normal that might persist for a long time. A prolonged pandemic will slow the economic recovery. Therefore, events can be held locally to attract residents for leisure and spending. Event organizers ought to manage crowd movement well to reduce the risk of infection.

Hotel operators need to find ways to increase revenue per available room, available room nights, and average occupancy rate. Even with fewer people traveling and staying in hotels, some techniques can be used to maximize business opportunities. One recommendation is to attract more direct bookings from corporate travelers, since the accommodation industry has high functional costs, inventory maintenance, and other expenditures to manage. Hotels can increase their visibility by connecting to well-ranked integrated platforms that are optimized for search engines. This will attract more direct bookings from corporate customers who are looking to host functions in venues recommended by popular integrated platforms.

Discounts, loyalty programs, and other perks should also be redesigned. Discounts motivate people to book hotel stays. When offered a longer stay at a discounted price, or a free upgrade, guests will see value in their booking. For example, as part of a loyalty program, visitors can be offered a complimentary night after they have booked 5 or 10 nights. This can also improve the hotel’s occupancy rate.

Food and beverages are another essential part of the tourism industry. The global pandemic has disrupted and greatly changed the way people live. For the hospitality industry, dining in at restaurants, hotel stays, and travel were no longer possible under worldwide restrictions. This pushes businesses to rethink their business models and operations so that guests can continue to enjoy comfortable experiences where possible while remaining safe. For luxury hotels in Singapore, food and beverages have always been a key focus. The commencement of takeaway and delivery services for food and beverages has a direct impact on the current situation.

It is evident that innovation, digitalization using the Internet of Things (IoT), and remote working technologies are going to be the primary enablers for hotels to implement the new normal of hospitality. Measures that can improve the customer experience include advanced sanitation, minimum-contact practices, efficient staff services, and personalized experiences. Thus, leveraging innovation to improve present offerings, and being transparent about the new safeguards, are the natural next steps for hotels to boost their sales. As the hotel industry prepares for the post-COVID world, many hotels around the world are doubling down on contactless check-in and check-out. Contactless check-in technologies enable visitors to have a contactless arrival experience, bypass the front desk, and go to their rooms directly. Guests can check in, select their room, access it, and check out with minimal human involvement. On top of that, by introducing contactless check-in and check-out, hotels can save on labor


costs. Furthermore, hotels can also collect health declarations from guests who are checking in to contain the risk of an outbreak. The other aspect is to supply home entertainment or exercise solutions. Each room can be equipped with a workstation for work, and guests can tap on the following online resources to enrich their stay in the hotel:

· Online electronic books: Hotels can sign up with Amazon or Audible to give guests access to e-book devices, e-books, audiobooks, magazine articles, and more. The online reading tool has millions of titles and programs available and can be accessed online or through its app.
· Game device rental: Hotels can offer game device rental services to consumers, covering Xbox, Animal Crossing, PS4, and more. Offering additional leisure game services can be a new revenue source.
· Online gym courses: Hotels can also provide virtual gym courses including yoga, meditation, strength training, cycling, running, and more. Hotels can tap on apps from providers such as Gold’s Gym, Planet Fitness, and Pure Barre. The classes are pre-recorded and range from 30 min to an hour long. Some studios also offer online guided breathing practices and meditations.

Based on the results of the predictive models developed in the previous section, we make the following recommendations related to the international travel restrictions:

· In order to promote the resurgence of tourism in Singapore, the travel restriction regulation for Chinese visitors should be eased. However, some stay-home periods can be imposed.
· The results earlier also showed that India has reached the tail-end of the COVID-19 outbreak. Therefore, March 2021 would be a good time to list India as a partially restricted country and shorten the stay-home period.

Conclusion

This work provides a brief overview of the outbreak of COVID-19 and examines the impact of the pandemic on the tourism industry of Singapore. It was shown that the outbreak has severely and adversely impacted Singapore’s tourism industry, as tourists from all across the world canceled bookings and delayed travel plans to Singapore due to worries about the virus and the corresponding travel restrictions and bans. We proposed time series prediction models developed using historical data to simulate how long the influence of the pandemic will last and to measure the economic losses brought by COVID-19 from the perspective of tourist arrivals and the hotel industry. Results show that the pandemic has exerted a long-term impact on the tourism industry in Singapore, and although various measures have been taken to mitigate the negative influence, it will still take some time for Singapore to recover from the worldwide pandemic.


References

1. Chen R, Liang CY, Hong WC, Gu DX (2015) Forecasting holiday daily tourist flow based on seasonal support vector regression with adaptive genetic algorithm. Appl Soft Comput 26:435–443
2. World Health Organization, https://covid19.who.int/

Marketing Segmentation and Targeted Marketing for Tourism

Liu Ye Xin, Li Yiteng, Ritika Jain, Tran Thi Hong Van, William Lim, and Zhao Yilin

COVID-19 has impacted how we live and, more importantly, the global economy that we depend on. Tourism, which contributes 4% to Singapore’s gross domestic product [1], is one of the worst-hit sectors. With restricted travel, Singapore is unable to rely on international visitors as a revenue source and has to turn to the domestic market in the short to mid term. With limited stimulus from the government, businesses in affected sectors such as hotels, F&B, and malls will have to tap on effective methods to reach the limited pool of local customers. With excess supply of tourism services and a much smaller pie in demand, there is heightened competition as businesses attempt to offer similar services to the same potential customers. In such circumstances, the use of data becomes highly significant in reaching the right audience and in determining whether a business can survive the pandemic. The Singapore government has been trying to reduce retrenchments by subsidizing salaries through grants and subsidies [2]. However, government assistance can only do so much as businesses struggle to stay afloat with minimal revenue. In the following sections, we explore different strategies that businesses and the government have implemented. Through these initiatives, we strive to find opportunities where analytics can help to improve the situation. The Singapore Tourism Board (STB) has encouraged Singapore residents to rediscover Singapore through a S$45 million stimulus package [3]. Singaporeans spent more than S$34 billion on overseas travel yearly, and this campaign was designed to redirect that spending to local businesses. Singaporeans are encouraged to visit and spend in the tourist hotspots, and the vouchers can also be spent on staycation deals at hotels that are heavily impacted.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_9.

L. Y. Xin · L. Yiteng · R. Jain · T. T. H. Van · W. Lim (B) · Z. Yilin
Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, Singapore 639798, Singapore

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_9


With border controls restricting travel to other destinations, airlines and cruise operators experimented with the ideas of “flights to nowhere” and “cruises to nowhere” [4]. However, Singapore Airlines (SIA) settled for dine-in experiences on its planes instead. The proceeds from such activities are deemed insignificant as a recovery strategy for COVID-19. Air travel bubble arrangements with selected countries were made so that travelers could bypass the confinement period and enter without restriction. However, these have not yet produced a successful outcome. For instance, the air travel bubble between Singapore and Hong Kong has been postponed [5].

Visualization with Descriptive Analytics

In this section, we adopt descriptive analysis to quantitatively assess the extent of the impact on areas such as arrival rates, tourism sector revenue, and tourism spending for aviation, hospitality, and restaurants. Most of the visualizations depict the sharp decline of a particular metric at the advent of COVID-19 and provide no key insight that can be used for further analytics. Technical analysis could not be performed on such trends, as the influence of fundamentals outweighs the trends. However, some useful insights were discovered through the visualization of Singapore demographics and household spending on travel. We intend to use them to assist businesses in reaching out to the right audience at a lower cost. Figure 1 shows the inbound arrival rates from 1980 to 2020 from all regions and from Asia. There is a drastic downward slope from the end of 2019 to the middle of 2020. The trends are the same for both the international and Southeast Asian categories, which is an effect of travel restrictions imposed by most governments globally and by Singapore itself. The Y-axis of this graph represents the logarithmic transformation of the total number of passengers. The following visualization in Fig. 2 shows the total arrivals and departures by aircraft for the past 20 years. We can see from the figure that since the beginning of the COVID-19 pandemic, the number has dipped to the level of the 1980s. The plot

Fig. 1 Total inbound arrivals from the year 1980 to 2020


on the left displays arrivals by air, while the plot on the right displays arrivals by all transportation modes combined, including land, air, and sea. Figure 3 displays arrivals from the three largest groups of Singapore’s inbound arrivals: India, Malaysia, and China. The pie charts in Fig. 4 compare the proportion of inbound arrivals by region before and after the pandemic. The proportion of inbound visitors from Southeast Asia and Greater China has risen most significantly, and we can leverage this big group for recovery plans. We now examine the revenue changes in the four tourism sectors:

· Sightseeing, Entertainment, and Gaming (Fig. 5);
· Shopping (Fig. 6);
· Food and Beverage (Fig. 7); and
· Accommodation (Fig. 8).

We obtained data from two sources: government data [6] and the Singapore Tourism Board [7]. Marina Bay Sands and its SkyPark officially opened on 23 and 24 June 2010 [8], following the casino’s opening on 27 April that year, which caused the sector’s revenue to skyrocket in 2010. The average quarterly revenue of Sightseeing, Entertainment, and Gaming (Fig. 5) decreased by 71.22% from 2019 to 2020. The average

Fig. 2 Total arrivals and departures by aircraft

Fig. 3 Top three countries with the most inbound tourists


Fig. 4 Arrivals distribution before and after COVID-19

Fig. 5 Revenue changes in sightseeing, entertainment, and gaming

quarterly revenue of Shopping (Fig. 6) decreased by 76.56% from 2019 to 2020. The average quarterly revenue of Food & Beverage (Fig. 7) decreased by 68.91% from 2019 to 2020, and that of Accommodation (Fig. 8) decreased by 68.35%. These figures are summarized in Table 1. From the summary table, we observe that the shopping sector was hit the hardest in terms of percentage decline. We further look into one of the sectors, accommodation. This sector can be broken down by hotel tier into Luxury, Upscale, Mid-tier, and Economy. We observe the differences in the following:


Fig. 6 Revenue changes in shopping

Fig. 7 Revenue changes in F&B


Fig. 8 Revenue changes in accommodation

Table 1 Summary table showing the impacts on revenue

Sector                                    Decline from 2019 to 2020 (%)
Sightseeing, entertainment, and gaming    71.22
Shopping                                  76.56
Food and beverage                         68.91
Accommodation                             68.35

· Hotel Room Rate;
· Occupancy Rate;
· Revenue.

The prices for all tiers of hotels had been stable for years before the pandemic. Figure 9 shows the hotel room rates of different tiers. As the number of visitors shrank, hotels lowered prices to attract locals in the face of significantly lower demand. Luxury hotel prices dropped by around 70% while other categories fell by 60%. Figure 10 shows the hotel occupancy rates by tier. Before the pandemic outbreak, all tiers of hotel had stable occupancy rates of around 80%. The occupancy rate began to decline at the beginning of 2020 across all tiers, and luxury hotels suffered the most. We can see a rebound in the middle of 2020, with economy hotels the first to recover. Figure 11 shows the hotel revenue per room, and we can observe that the drop in revenue per room grows with the tier of the hotel. In other words, the higher a hotel is


Fig. 9 Hotel room rates by different tiers

in terms of tier, the more it was impacted, since luxury hotels are not essential in times of a pandemic. On the other hand, economy hotels do not have much room for further reduction in prices. This leads to an interesting finding: during the worst times of the pandemic, all tiers of hotels generated approximately the same revenue per room. Some hotels have been designated for the mandatory 2-week quarantine. For such hotels, we create the visualization in Fig. 12, which displays the revenue generated for each room type and the occupancy rate of the gazetted hotels from 2015 Q4 to 2020 Q3. The blue bars represent the revenue generated and the orange bars represent the occupancy rate. The dip seen in Q2 2020 is attributed to the period of restricted travel, and the recovery in Q3 2020 is a result of the compulsory stay-home notice (SHN) served in these hotels by incoming travelers. Unfortunately, this applies only to the specific hotels selected for SHN quarantines.

Business Solutions Using Machine Learning

In this section, we discuss how to develop business solutions using machine learning. According to the Singapore 2018 Household Expenditure Survey, Singapore households spend roughly 3.5% of their annual income on tourism, amounting to 1 to 3 trips per year. On average, each household spends S$4,080 per year on vacation. The number of households in Singapore is around 1.37 million. By gathering the


Fig. 10 Hotel occupancy rates by different tiers

information, we can estimate that the market size of the tourism industry in Singapore is around S$5.67 billion. By applying descriptive analytics to Singapore demographics, we obtain an overview of the market and an estimate of its size. As highlighted earlier, the solution must be based on the domestic market due to restrictions on travel. This makes the COVID-19 business landscape extremely competitive, as the domestic market is much smaller. As such, marketing is arguably the most critical factor in determining whether a business lives to see a post-COVID-19 world. In order to match the right customers with the right business while using the least resources, we have built multiple models on different datasets that can be applied to the tourism industry. We first perform the classification of customers for hotel recommendation. The first model uses K-Neighbors Classification to classify customers on Expedia based on their historical searches on destinations, hotel country, and other variables, and adjusts hotel recommendations accordingly. The dataset used is the one provided by Kaggle’s Expedia Hotel Recommendations. According to 80 Days [9], a digital marketing agency specializing in the luxury travel sector, the average conversion rate from a hotel website is around 0.73% (independent hotel) to


Fig. 11 Revenue per room by different tiers

Fig. 12 Hotel revenues (blue bars) and average occupancy (orange bar) from the year 2008 to 2020

1.9% (group hotel), and from a booking engine is around 3.28% (independent hotel) to 6.8% (group hotel). The conversion rates are thus roughly 3.5 to 4.5 times higher on booking websites, regardless of the hotel type. The aggregation, lower prices, and apposite suggestions on booking websites are the reasons for this phenomenon. We recommend that hotels apply this machine learning model to target potential customers precisely by showing advertisements while they are searching for hotels online, which should improve the conversion rate on hotel websites.

Table 2 Summary of conversion prediction

Baseline performance            1% (randomly classify with 100 clusters in total)
Machine learning performance    15%
Increase in performance         14%
Cost savings                    5–10% of room revenue
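The instance-based idea behind K-Neighbors Classification can be sketched in a few lines of plain Python. This is only an illustration, not the model trained on the Expedia data: the two numeric features, the cluster labels, and the helper name `knn_classify` are all hypothetical.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train: list of (feature_tuple, label) pairs; query: feature tuple.
    """
    # Sort training points by Euclidean distance to the query and keep k.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical past searches: two numeric features -> hotel cluster label.
searches = [
    ((1.0, 1.0), "cluster_A"),
    ((1.2, 0.9), "cluster_A"),
    ((0.8, 1.1), "cluster_A"),
    ((5.0, 5.0), "cluster_B"),
    ((5.2, 4.8), "cluster_B"),
    ((4.9, 5.1), "cluster_B"),
]

prediction = knn_classify(searches, (1.1, 1.0), k=3)
print(prediction)
```

There is no training step beyond storing the labeled points, which is why the method is described as instance-based. With 100 hotel clusters, assigning a cluster at random is right about 1 time in 100, which corresponds to the 1% baseline in Table 2.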

On a side note, K-Neighbors Classification was chosen because its training phase is much faster than that of other classification algorithms: there is no need to train a model for generalization. That is why K-Neighbors Classification is known as a simple, instance-based learning algorithm. It can also be used on regression problems, as we illustrate next. The results are shown in Table 2, and we expect the use of machine learning to improve the performance by 14 percentage points. The second model identifies customers with higher spending potential on traveling. We used K-neighbors regression to predict each household’s yearly expenditure on tourism, which is a continuous variable. In this case, we simulated data based on general household statistics found on Singapore’s official statistics website. The dataset contains variables such as marital status, number of members in the household, household yearly income, and household yearly expenditure on tourism. We used Euclidean distance as the distance measure. The leaf number is set to 5. We found that when k is 11, the model generates the lowest Root Mean Square Error (RMSE), at 469.87. Figure 13 shows the performance of this model using different k values. RMSE is the square root of the mean of the squared residuals. It indicates the absolute fit of the model to the data, that is, how close the observed data points are to the model’s predicted values. It is a good measure of how accurately the model predicts the response, and it is the most important criterion for fit if the main purpose of the model is prediction. Once hotels are able to predict potential customers’ spending on travel, they can group households by spending level and advertise different hotel rooms or packages to different groups. By doing so, they can maximize profit by exploiting the market fully without paying booking engines such as Booking.com, which typically charge a commission of 15%. Table 3 summarizes the prediction of travel spending. We can expect an improvement of 17.47% from this prediction ability. The next step in building the machine learning system for tourism is to build a model capable of predicting clicks on an advertisement. We build Logistic Regression, Decision Tree, and SVM models for this prediction. Based on a survey conducted by CMO [10] and Deloitte Digital, companies in general spend about 12% of their revenue on marketing. According to the CMO (Chief Marketing Officer) survey [11], mobile and social media marketing spend spiked during the pandemic at 70% and 74% of total marketing expenses, respectively. This translates to approximately 9% of revenue on general digital


Fig. 13 Performance of machine learning model on spending prediction

Table 3 Summary of prediction of travel spending

Baseline performance (RMSE)            569.33 (when k = 1)
Machine learning performance (RMSE)    469.87 (when k = 11)
Increase in performance                17.47%
Cost savings                           5–10% of room revenue
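The search over k summarized in Table 3 can be illustrated with a small K-neighbors regression written in plain Python. The sketch below is not the authors' code: the simulated households, the noise level, and the helper names `knn_predict` and `rmse` are assumptions made for illustration, so the RMSE values it prints will differ from those in the table. The last lines only retrace the relative improvement from the two RMSE values reported in Table 3.

```python
import math
import random

def knn_predict(train, query, k):
    """Average the targets of the k training rows closest to query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(target for _, target in nearest) / k

def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Simulated households (hypothetical): features are (members, income in S$1,000),
# target is yearly tourism spending, set near 3.5% of income plus noise.
random.seed(0)
rows = []
for _ in range(200):
    members = random.randint(1, 6)
    income = random.uniform(40, 200)
    spend = 0.035 * income * 1000 + random.gauss(0, 500)
    rows.append(((members, income), spend))

train, test = rows[:150], rows[150:]
actual = [target for _, target in test]

# Evaluate several k values and keep the one with the lowest RMSE.
scores = {}
for k in (1, 3, 5, 7, 9, 11, 13, 15):
    predicted = [knn_predict(train, features, k) for features, _ in test]
    scores[k] = rmse(actual, predicted)
best_k = min(scores, key=scores.get)
print(best_k, round(scores[best_k], 2))

# Relative improvement implied by the RMSE values reported in Table 3.
improvement = (569.33 - 469.87) / 569.33
print(f"{improvement:.2%}")
```

Using k = 1 as the baseline, as Table 3 does, compares the tuned model against the noisiest variant of the same method rather than against a different algorithm.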

marketing activities. Getting active online to promote the company and its offerings came in as the top activity for marketing departments in the same survey. As such, a well-targeted marketing campaign will have a significant impact on revenue. For digital marketing, the average conversion rate (percentage of website visitors who purchase the service) is about 2% [9]. With a conversion rate of only 2%, getting the advertisement to the right audience in the first place helps substantially. In order to illustrate how machine learning might help in increasing the click-through rate, we have gathered some labeled data from Kaggle [12]. The dataset includes features of the audience that the advertisement was presented to and data on each customer’s click history. The data consists of the following features, with some original features removed to simplify the analysis:

· Daily Time Spent on Site;
· Age;
· Area Income;

· Daily Internet Usage (minutes);
· Male (Gender);
· Ad Topic Line (Removed due to high number of unique attributes);
· City (Removed due to high number of unique attributes);
· Country;
· Timestamp (Removed due to high number of unique attributes); and
· Clicked on Ad (Label 1 or 0).

Table 4 Accuracy of machine learning model to predict customer’s clicks

Model                  Accuracy (%)
Logistic regression    97
Decision tree          94.5
SVM                    96.5
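To make the click-prediction setup concrete, the sketch below fits one of the three model types named in this section, a logistic regression, from scratch in plain Python. It is a minimal illustration under stated assumptions: the data is synthetic and balanced, the two standardized features only stand in for columns such as time on site and internet usage, and the helper names `fit_logistic` and `predict` are hypothetical. The decision tree and SVM are not shown, and the accuracy printed here is not comparable with Table 4.

```python
import math
import random

def sigmoid(z):
    # Clamp z to avoid overflow in exp for extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, xi):
    """Return 1 (click) if the predicted probability is at least 0.5, else 0."""
    score = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1 if sigmoid(score) >= 0.5 else 0

# Synthetic, balanced data (hypothetical): two standardized features whose
# class centres differ, so the classes are largely separable.
random.seed(1)
X, y = [], []
for i in range(400):
    label = i % 2
    centre = -1.0 if label == 1 else 1.0
    X.append([random.gauss(centre, 0.7), random.gauss(centre, 0.7)])
    y.append(label)

X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]
weights = fit_logistic(X_train, y_train)
accuracy = sum(predict(weights, xi) == yi for xi, yi in zip(X_test, y_test)) / len(y_test)
print(f"accuracy: {accuracy:.2%}")
```

Because the two classes are equally frequent, a classifier that guesses at random would be right about half the time, which is the 50% baseline the chapter compares against.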

The outcome to predict is a binary variable, with a 1 for a successful click or a 0 for no click. We built three machine learning models for the prediction of the click-through. Table 4 shows the accuracy of the models trained. All three models predict customers’ clicks with high accuracy. With high accuracy, the cost of targeted marketing can be reduced greatly. The two classes of the binary label are equally distributed, which mimics a click-through rate of 0.5 under random targeting. Using our models, the accuracies are above 90%, which brings the click-through rate close to the maximum of 1. This doubling of the click-through rate may bring the conversion rate to 4%. In other words, presenting the advertisement to an audience that yields twice the number of visitors per impression will double the number of conversions as well. This estimated 4% conversion rate is lower than the 6.8% conversion rate of booking engines. However, it comes without the high cost of listing on booking engines. The accommodation sector makes up approximately 20% of the 5-billion-dollar domestic tourism industry. The amount spent on digital marketing will be about 9% of total revenue ($90 million). Assuming the industry has not tapped on machine learning as a cost-reduction solution, the potential for cost savings stands at $45 million, since doubling the performance of marketing halves the resources required. Table 5 illustrates the cost savings made possible by the machine learning models created. Shopping is one of the top activities of tourists. We can create a machine learning model to classify visitors to shopping malls. In this segment, we attempt to help malls target customers with higher spending scores through classification modeling. Figure 14 displays the revenue generated through tourism receipts as the orange line and revenues from shopping as the blue line. For the shopping sector, the revenue fell to about $2.5 billion. The graph in Fig. 15 shows how much shopping has contributed to tourism receipts. The contribution is at its lowest in 2020, standing at approximately 15% of total tourism receipts.

Table 5 Summary of marketing outcomes

Baseline performance (Random targeting)               50%
Machine learning performance                          >90%
Current digital marketing spending (industry wide)    SGD 90 million
Potential cost saving (industry wide)                 SGD 45 million (approx. 50%)
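The figures in Table 5 follow from a short chain of multiplications, retraced below with the inputs quoted in the chapter. Treating the improvement from 50% to above 90% as a factor of two is the chapter's own simplification; the variable names are illustrative. The first lines also multiply the quoted household count and average spend, which gives a little under the S$5.67 billion market size stated earlier; the small gap may come from rounding of the quoted inputs.

```python
# Inputs quoted in the text (all figures in Singapore dollars).
households = 1.37e6              # approximate number of households
spend_per_household = 4080       # average yearly vacation spend per household
market_from_inputs = households * spend_per_household

domestic_market = 5e9            # "5-billion-dollar" figure used in the text
accommodation_share = 0.20       # accommodation share of the domestic market
digital_marketing_share = 0.09   # digital marketing as a share of revenue

accommodation_revenue = domestic_market * accommodation_share
digital_marketing_spend = accommodation_revenue * digital_marketing_share

# Doubling targeting performance (50% -> above 90%, treated here as 2x)
# halves the spend needed for the same number of clicks.
performance_multiplier = 2
potential_saving = digital_marketing_spend * (1 - 1 / performance_multiplier)

print(f"market from rounded inputs: S${market_from_inputs / 1e9:.2f} billion")
print(f"accommodation revenue:      S${accommodation_revenue / 1e6:.0f} million")
print(f"digital marketing spend:    S${digital_marketing_spend / 1e6:.0f} million")
print(f"potential saving:           S${potential_saving / 1e6:.0f} million")
```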

Fig. 14 Revenues from shopping (blue line) and tourism receipts (orange line)

Fig. 15 Contribution of shopping in tourism receipts


There are a few shopping areas that contribute most toward the tourism receipts, and these locations include the following:

· Jewel Changi;
· Marina Bay Sands;
· Orchard Road;
· Chinatown;
· Little India;
· Tampines;
· Jurong East.

In order to build models that better target customers for malls, we use a dataset on mall customers [13]. The data consists of the following variables:

· Gender;
· Age;
· Annual Income (k$); and
· Spending Score (1–100).

We adopted K-means clustering to categorize different kinds of customers and determined the right number of clusters using the elbow method. According to the results from the elbow method, five and six clusters are recommended. We then create five clusters for annual income and six clusters for age. We extract the centroids, which represent the center of gravity of each cluster, and plot them to visualize the cluster outcomes. The first segmentation performs clustering by annual income, as shown in Fig. 16. Five groups of customers are formed following this segmentation, and we can see that customers with high annual income fall into both high and low spending scores. Similarly, customers with low annual income also fall into both high and low spending scores. This makes annual income an unreliable feature for classification into useful clusters. We further segment customers by age, and the clusters formed are shown in Fig. 17. We observe that the younger demographic, aged 20 to 40, tends to have a higher spending score. Even though the 110 people in the 20–40 age group make up 55% of the total population of 200, the sum of their spending scores (6,676) is 66.5% of the sum of spending scores for the entire population (10,040). Thus, we conclude that by focusing on the younger segment, we can improve the efficiency of marketing resources by about 21% ((66.5% − 55%)/55%). We summarize the savings through the use of this customer segmentation in Table 6. This translates to cost savings of about 17% from targeting the younger age group, which accounts for 66.5% of the sum of spending scores. Following the above work using machine learning, we recommend the marketing strategies depicted in Fig. 18. The potential customer base is one that may not be engaged in domestic tourism, but businesses stand to benefit from the customer traffic generated by initiatives currently being promoted, such as the Rediscover Singapore tourism campaign. After that, the segments which hold the


Fig. 16 Customer segmentation by annual income

greatest potential to be converted can be targeted using machine learning models that were built. Intuitively speaking, it will not be beneficial to market a service to a disinterested crowd. Thus, the identification of key segments comes first.
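The elbow method used for the customer segmentation above can be sketched as follows. This is a minimal plain-Python version of K-means on a single feature, run on synthetic ages rather than the Kaggle mall data; the helper name `kmeans_1d`, the seeds, and the three simulated age groups are assumptions made for illustration. The elbow is read off by looking for the k after which the within-cluster sum of squares (inertia) stops dropping sharply.

```python
import random

def kmeans_1d(values, k, iters=50, seed=0):
    """Lloyd's algorithm on one feature; returns (sorted centroids, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)
    for _ in range(iters):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to its cluster mean
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    inertia = sum(min((v - c) ** 2 for c in centroids) for v in values)
    return sorted(centroids), inertia

# Synthetic ages (hypothetical): three groups centred on 25, 45, and 65.
rng = random.Random(42)
ages = [rng.gauss(mu, 3) for mu in (25, 45, 65) for _ in range(40)]

# Inertia for k = 1..7; plotting these values against k gives the elbow curve.
inertias = {k: kmeans_1d(ages, k)[1] for k in range(1, 8)}
for k, inertia in inertias.items():
    print(k, round(inertia, 1))
```

The centroids returned by `kmeans_1d` play the role of the cluster centers plotted in Figs. 16 and 17.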

Conclusion

Analytics supports businesses in reaching out to the right audience at a lower cost. This work contributes by using machine learning to provide insights and to look for potential improvements in performance at minimal cost using publicly available tools. However, there are challenges in its implementation, the first barrier being the awareness and adoption of data analytics. In addition, there is the challenge of integration with current business processes. An agile setup will be more receptive to such solutions than businesses entrenched in rigid SOPs. As such, our solutions do not stand on their own and will require further work in their execution.


Fig. 17 Customer segmentation by age

Table 6 Summary of cost savings from customer segmentation

Average spending score (baseline performance without ML)                                 55%
Average spending score of target segment 20–40 years old (with the application of ML)    66.5%
Increase in performance                                                                  Approx. 20%
Cost savings                                                                             Approx. 17%
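The two percentages in Table 6 can be retraced from the counts given in the text. The sketch below uses only those four numbers; the variable names and the interpretation in the comments (lift as extra spending score reached per customer targeted, cost saving as the smaller audience needed for the same spending score) are one reading of the chapter's calculation, not a statement of the authors' exact method.

```python
population = 200        # customers in the dataset
segment_size = 110      # customers aged 20-40
total_score = 10040     # sum of spending scores, all customers
segment_score = 6676    # sum of spending scores, ages 20-40

population_share = segment_size / population   # share of customers targeted
score_share = segment_score / total_score      # share of spending score reached

# Spending score reached per customer targeted, relative to untargeted marketing.
lift = score_share / population_share - 1
# Audience (and hence cost) needed to reach the same spending score, relative to baseline.
cost_saving = 1 - population_share / score_share

print(f"population share: {population_share:.1%}")
print(f"score share:      {score_share:.1%}")
print(f"lift:             {lift:.1%}")
print(f"cost saving:      {cost_saving:.1%}")
```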

Fig. 18 Recommended marketing strategy


References

1. Singapore Tourism Board (2021, January 1) About STB: overview. Retrieved from Singapore Tourism Board Website: https://www.stb.gov.sg/content/stb/en/about-stb/overview.html
2. Ministry of Social and Family Development (2021, January 1) COVID-19 recovery grant. Retrieved from Ministry of Social and Family Development: https://www.msf.gov.sg/assistance/Pages/covid-recovery-grant.aspx
3. CNA (2020, July 22) S$45 million tourism campaign launched urging locals to explore Singapore. Retrieved from Channel News Asia: https://www.channelnewsasia.com/news/singapore/singaporediscovers-45-million-tourism-campaignstb-singapoliday-12952932
4. Fortune (2020, December 15) Singapore is desperate to revive tourism and business travel. Here are all the ways it has tried. Retrieved from Fortune: https://fortune.com/2020/12/15/singapore-tourism-business-travel-bubble/
5. Changi Airport Group (2020, November 21) Air travel bubble. Retrieved from Changi Airport Group: https://www.changiairport.com/en/airport-guide/Covid-19/air-travel-bubble.html
6. Data.gov.sg Tourism Sectors Revenue (2007–2020) https://data.gov.sg/dataset/annual-tourismreceipts
7. Singapore Tourism Board (2019 Q4–2020 Q2)
8. Wikipedia (2021, January 1) Marina Bay Sands. Retrieved from Wikipedia: https://en.wikipedia.org/wiki/Marina_Bay_Sands
9. 80 Days (2020, October 14) What is the average hotel website conversion rate? Retrieved from 80 Days: https://www.eighty-days.com/2020/what-is-the-average-hotel-website-conversion-rate/
10. CMO Survey (2020, June 1) COVID-19 and the state of marketing. Retrieved from CMO Survey: https://cmosurvey.org/wp-content/uploads/2020/06/The_CMO_Survey-Highlights-and_Insights_Report-June-2020.pdf
11. Marketing Budgets Vary by Industry (2017, January 24) Marketing budgets vary by industry. Retrieved from Deloitte: https://deloitte.wsj.com/cmo/2017/01/24/who-has-the-biggest-marketing-budgets/
12. Kaggle (2018) Practice data analysis and logistic regression prediction. Retrieved from Kaggle: https://www.kaggle.com/fayomi/advertising
13. Kaggle (2019) Mall customer segmentation data. Retrieved from Kaggle: https://www.kaggle.com/vjchoudhary7/customer-segmentation-tutorial-in-python

Machine Learning for Tourism

Chang Chai, Yanbo Chen, Taiying Kuang, Chun-Yu Lai, Jingyi Li, and Jian Zhang

In this work, we utilize three datasets for analysis. First, to view the changes in international inbound visitors, we collect monthly visitor arrivals using Singapore as the context. Second, to understand the extent to which the hospitality industry was affected, we use the Hotel Statistics published by the Singapore Tourism Board. It contains the monthly statistics of the gazetted hotels, which are the hotels that have been declared as tourist hotels under the Singapore Tourism Board. We collected data from January 2008 to November 2020. The dataset provides information such as room type, maximum rooms, available rooms, gross lettings, standard average occupancy, and revenue per available room. Third, to discover the patterns and features of the Singapore tourism industry in normal periods, we utilized Tourism Data from 2014 to 2015 to run both descriptive and prescriptive data analysis. This is an open-source dataset from Kaggle and includes tourism-relevant features such as city_of_residence, travel_type, and shopping_amount.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_10.

C. Chai (B) · Y. Chen · T. Kuang · C.-Y. Lai · J. Li · J. Zhang
Nanyang Business School, Nanyang Technological University, Singapore 639798, Singapore

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_10

Visualization-Based Analysis

After four years of consecutive growth, Singapore tourism suffered its worst year in 2020, with international arrivals dropping sharply in February 2020 and reaching the bottom in April. Strikingly, compared with previous years, the number of visitor arrivals decreased by around 99% year on year. In 2019, Singapore received approximately 19.12 million international tourists, more than three times the population of the country. But in 2020 the number was down to 2.74 million, about 14% of the previous year’s figure, due to an unprecedented fall in demand and countries imposing travel restrictions. Figure 1 depicts the monthly arrival data from 2019 to 2020. The sharp decline started in February, and recovery was slow.

Fig. 1 International arrivals in 2019 and 2020

Figure 2 shows the relationship between the number of visitor arrivals and their average length of stay. The size of each circle represents visitor flow. We can see that when fewer people are allowed entry to the country, those who enter choose to stay longer. When there is no restriction on border crossing, most people prefer to stay less than 4 days. We further look at the countries of origin of inbound tourists. The pandemic outbreak has significantly impacted visitor arrivals from around the globe. This is especially so for tourists from China, who account for more than 20% of international visitors. In 2020, this figure dropped by 89.5% year on year due to travel restrictions on both sides. As shown in Fig. 3, tourists from Southeast Asia, China, and Europe were the three largest groups of tourists coming to Singapore in 2019. The numbers in 2020 are far below those of the year before. We next look at the number of confirmed cases and the arrival rates, as shown in Fig. 4. We can see a strong correlation between monthly confirmed cases and the number of visitor arrivals month on month. A lagged response was observed in the arrivals data. The more confirmed cases there are, the fewer tourists come in due to travel restrictions. However, the estimated decrease in visitor arrivals could change depending on how long the tightening of controls on countries of origin lasts. The longer the control

Machine Learning for Tourism

Fig. 2 Average length of stay and arrivals

Fig. 3 Tourists’ country of origins

159

160

C. Chai et al.

Fig. 4 Monthly confirmed cases versus arrivals

measures, the longer it will take for travelers’ demand to return. In the fourth quarter of 2020, with the Covid outbreak under control in Singapore and the decreasing in the number of daily confirmed cases, it signals the starting point with increasing number of international visitors to return gradually.
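As a quick check on the headline figures quoted above, the year-on-year change can be computed directly from the two annual totals given in the text. This is a minimal sketch: the helper name `pct_change` is an assumption made for illustration, and only the 19.12 million and 2.74 million totals come from the chapter.

```python
def pct_change(current, previous):
    """Percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100.0

# Annual international arrivals in millions, as quoted in the text.
arrivals_2019_m = 19.12
arrivals_2020_m = 2.74

yoy = pct_change(arrivals_2020_m, arrivals_2019_m)   # about -85.7
share = arrivals_2020_m / arrivals_2019_m * 100.0    # about 14.3

print(f"Year-on-year change: {yoy:.1f}%")
print(f"2020 arrivals as a share of 2019: {share:.1f}%")
```

The same helper applies to monthly series, where comparing a month with the same month of the previous year removes most of the seasonal effect.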

Time Series Analysis

To gauge the extent of the damage done by the pandemic to the tourism industry, we first perform a time series analysis on visitor arrival data to predict what the arrival volume would have been without the pandemic. We collected data on monthly tourist arrivals in Singapore from January 1978 to November 2020 and first look at the data set as a whole. The resulting charts are shown in Fig. 5: the left graph shows the original tourist arrivals and the right graph plots the time series components of the arrival data. Besides the unusual pattern in 2020 caused by Covid-19, the monthly visitor arrival data show an upward trend as well as seasonality. The seasonality is clearly visible in the third subplot of the decomposition on the right.
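The decomposition into trend, seasonal and residual components can be sketched with a classical additive scheme: a centered moving average estimates the trend, and the average detrended value per calendar position estimates the seasonal pattern. The chapter does not say which routine produced Fig. 5 (a library function such as `seasonal_decompose` in statsmodels would be a common choice), so the function names below are assumptions and the series is synthetic, not the Singapore arrivals data.

```python
import math

def centered_moving_average(series, period=12):
    """Trend estimate: 2 x period centered moving average (for an even period)."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        # The two end points get half weight so the window covers exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend

def additive_decompose(series, period=12):
    """Split a series into trend + seasonal + residual (additive model)."""
    trend = centered_moving_average(series, period)
    detrended = [x - tr if tr is not None else None for x, tr in zip(series, trend)]
    # Average the detrended values for each position in the seasonal cycle.
    seasonal_means = []
    for pos in range(period):
        values = [d for i, d in enumerate(detrended) if d is not None and i % period == pos]
        seasonal_means.append(sum(values) / len(values))
    # Center the seasonal pattern so that it sums to zero over one cycle.
    offset = sum(seasonal_means) / period
    seasonal = [seasonal_means[i % period] - offset for i in range(len(series))]
    residual = [x - tr - s if tr is not None else None
                for x, tr, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual

# Synthetic monthly series (NOT the Singapore data): linear trend plus a yearly cycle.
series = [100 + 2 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(72)]
trend, seasonal, residual = additive_decompose(series)
```

On this synthetic input the trend and the yearly cycle are recovered almost exactly and the residual is essentially zero; on real arrivals data the residual would carry shocks such as the 2020 collapse.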

Machine Learning for Tourism


Fig. 5 Decomposing monthly visitor arrivals

Fig. 6 Test for stationarity with original data

A stationary time series can be modelled by regression, so we first performed the Dickey–Fuller test to check whether the data is stationary. We begin by drawing the autocorrelation chart, which shows that the autocorrelation coefficient is always greater than zero, indicating that the time series has strong autocorrelation. This suggests that the data is not stationary. The Dickey–Fuller test agrees: the test statistic is −2.3 with a p-value of 0.17, greater than the significance level of 0.05. Figure 6 shows the original data with rolling mean and standard deviation on the left, and the autocorrelation plot and the Dickey–Fuller test results on the right. We take the first-order difference to eliminate the overall trend of the data and obtain the results shown in Fig. 7. The test result is similar and the data is still not stationary; the p-value of 0.51 is in fact higher than for the original data. We then use the seasonal difference to eliminate the seasonality of the series. This step alone still does not remove the component that makes the series non-stationary; the corresponding Dickey–Fuller test results are shown in Fig. 8, with a p-value of 0.73. We therefore apply the seasonal difference on top of the first-order difference. This finally yields a stationary series, with a p-value below the significance level, as shown in Fig. 9. We then draw the autocorrelation (ACF) and partial autocorrelation (PACF) plots of the seasonal difference of the first-order difference to find the optimal parameters, as shown in Fig. 10. Since the ACF of the differenced series is negative at lag 12 (after one year), we should include a seasonal moving average (SMA) term in the model.
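The two differencing steps described above can be sketched in a few lines. The helper name `difference` and the synthetic series are assumptions for illustration; in practice each differenced series would then be passed to a stationarity test such as `adfuller` from statsmodels, which is not reproduced here.

```python
import math

def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both terms exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Synthetic monthly series (NOT the arrivals data): linear trend plus a 12-month cycle.
series = [50 + 3 * t + 20 * math.sin(2 * math.pi * t / 12) for t in range(60)]

first_diff = difference(series, lag=1)             # removes the linear trend
seasonal_diff = difference(series, lag=12)         # removes the 12-month cycle
seasonal_of_first = difference(first_diff, lag=12) # removes both

print(len(series), len(first_diff), len(seasonal_of_first))
```

Each differencing step shortens the series by its lag, so the combined transform loses 13 observations. On this idealized input the doubly differenced series is zero up to rounding error; real data would leave a noise-like remainder, which is what the stationarity test examines.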

Fig. 7 Test for stationarity with first order differencing

Fig. 8 Test for stationarity with seasonal differencing

Fig. 9 Test for stationarity with first order seasonal differencing

Fig. 10 Autocorrelation and partial autocorrelation plot of first order seasonal differencing

The above analysis allows us to proceed with modelling and prediction. We configure the relevant model and obtain the results shown in Table 1. Based on these figures, we set the model parameters to (0,1,0) × (1,1,1,12) and build the model, then continue to fine-tune from this setting. We make predictions for 2020 with and without the impact of Covid-19. Figure 11 shows the forecast in the absence of the pandemic, with the forecast drawn as the orange line; Fig. 12 shows the forecast under the pandemic. To validate the forecasting results, we first make predictions for 2018–2020 and use the previous years’ data to check the forecasting accuracy.

To quantify the impact of the pandemic, we combine the time series forecast of visitor arrivals in the absence of the pandemic with tourism revenue data, and estimate how much revenue was lost because these tourists did not arrive. We obtained quarterly data on Singapore’s tourism revenue starting from 2007 and summed the monthly visitor arrivals by quarter from 2007 onwards to align them with the revenue data. We drop the data for 2020 to remove the impact of the pandemic. The resulting scatter chart is shown in Fig. 13, with visitor arrivals (VA) in millions on the X-axis and tourism receipts (TR) on the Y-axis. A positive correlation is visible in the plot.
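Aligning monthly arrivals with quarterly receipts, as described above, amounts to summing each group of three months. The sketch below assumes the monthly data is held as a dictionary keyed by (year, month); the function name `to_quarterly` and the placeholder values are illustrative and not taken from the chapter.

```python
from collections import defaultdict

def to_quarterly(monthly):
    """Sum {(year, month): value} into {(year, quarter): value}."""
    quarterly = defaultdict(int)
    for (year, month), value in monthly.items():
        quarter = (month - 1) // 3 + 1   # months 1-3 -> Q1, 4-6 -> Q2, ...
        quarterly[(year, quarter)] += value
    return dict(quarterly)

# Placeholder monthly values (made up), one year only.
monthly_arrivals = {(2019, m): 1000 + 10 * m for m in range(1, 13)}
quarterly_arrivals = to_quarterly(monthly_arrivals)

for key in sorted(quarterly_arrivals):
    print(key, quarterly_arrivals[key])
```

Once both series share the same quarterly index, each quarter contributes one (arrivals, receipts) point to a scatter plot like Fig. 13.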


Table 1 Results of SARIMA model

Fig. 11 Forecasting in the absence of pandemic

An initial step is to determine the variables with significant influence on tourism industry income in order to provide a closer quantitative estimate. Given the strong positive correlation and the observed linear relationship between visitor arrivals and receipts, we use simple linear regression as the prediction model. The input is the visitor arrival volume for 2020 in the absence of the pandemic, as predicted by the preceding time series analysis: in this section, we directly use the aggregated 2020 forecast to estimate the receipts. Table 2 shows the predicted visitor arrivals (VA) from the processed data together with the predicted receipts. The loss columns show that the pandemic caused the tourism industry to suffer huge losses.

Fig. 12 Forecasting under pandemic

Fig. 13 Scatter plot on visitor’s arrival and tourism receipt

The hotel industry is closely linked to international tourism, and the pandemic caused the number of international tourists to drop dramatically in 2020. We therefore proceed to evaluate the impact of COVID-19 on the hospitality industry.


Table 2 Prediction of visitor’s arrivals

Year_Q   Actual VA   Actual receipts   Predicted VA   Predicted receipts   VA loss   Receipts loss
2020Q1   2661065     4029              5022655        6654                 2361590   2625
2020Q2   3801        160               4957304        6586                 4953503   6426
2020Q3   25255                         5277908        6923                 5252653
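The loss columns of Table 2 are the predicted value minus the actual value, which the sketch below recomputes from the table. It also includes a small ordinary least squares helper of the kind used for simple linear regression. The chapter’s regression was fitted on quarterly data before 2020 that is not reproduced here, so the helper is only run on the three predicted (VA, receipts) pairs to give a rough look at the slope implied by the table. The names `fit_line` and `table2` are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Rows of Table 2: (quarter, actual VA, actual receipts, predicted VA, predicted receipts).
# None marks the cell that is blank in the table.
table2 = [
    ("2020Q1", 2661065, 4029, 5022655, 6654),
    ("2020Q2", 3801, 160, 4957304, 6586),
    ("2020Q3", 25255, None, 5277908, 6923),
]

losses = {}
for quarter, actual_va, actual_tr, pred_va, pred_tr in table2:
    va_loss = pred_va - actual_va
    tr_loss = pred_tr - actual_tr if actual_tr is not None else None
    losses[quarter] = (va_loss, tr_loss)

# Illustration only: a line through the table's predicted pairs, not the authors' fit.
intercept, slope = fit_line([row[3] for row in table2], [row[4] for row in table2])
print(losses)
print(f"implied receipts per additional visitor (table units): {slope:.6f}")
```

The recomputed losses match the VA loss and Receipts loss columns for every quarter in which both inputs are available.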



Fig. 14 Hotel’s revenue versus number of covid cases

First, we look at the overall performance of the hotel sector in 2020. The left graph in Fig. 14 shows the monthly revenue generated by the hotel industry in 2020 alongside the number of new Covid cases in Singapore, which is plotted on the right axis and also encoded in the size of the bubbles. From February 2020, Covid-19 began to spread in Singapore, and the situation worsened in April and May, with monthly new confirmed cases rising dramatically to approximately fifteen thousand in April and eighteen thousand in May. The right graph displays the monthly data for 2019, the year before the pandemic. As the spread of Covid-19 made international travel to Singapore close to impossible during that period, the hotel sector lost its main source of revenue, international tourists. From January to February, the hotel sector experienced a sudden and large drop in monthly revenue, from $369 million to $190 million. Following an overall downward trend, monthly revenue continued to decrease until July. As the 2019 graph shows, monthly revenue in that year remained relatively stable at a high level (above $300 million). Without the negative effect of Covid-19, we would expect a similar pattern in 2020. Instead, actual monthly revenue in 2020 was significantly lower than in the previous year: from May to July, revenue was below $50 million. Although we see a gradual overall recovery from August onwards, revenue did not return to its level before Covid-19.

We further examine the variation of hotel demand and supply in 2020, starting with demand in Fig. 15. Gross lettings are defined as the number of occupied rooms per month, and available room-nights are the number of rooms available for occupancy. To eliminate the seasonality effect, we plot the Year-Over-Year (YOY) comparison on the right axis. After a sudden and significant decrease, hotel demand in 2020 was significantly lower than in 2019. The YOY comparison shows that demand dropped by more than 40% from February onwards, and the most severe contraction happened in July (YOY: −74%). Hotel supply, shown in Fig. 16, follows a similar downward trend with some lag. Although Covid-19 hit Singapore in February, hotel supply was not much affected at that time. It was not until May that hotel supply began to decrease sharply (YOY: −40%).

Fig. 15 Hotel’s demand in gross lettings and year-on-year

Fig. 16 Hotel’s supply in available room nights and year-on-year

As the whole hotel industry was hit by Covid-19, we further examine the extent to which hotels in different tiers were affected. Figure 17 shows the Average Occupancy Rate (AOR) of different hotel tiers in 2020. In January, hotels in all tiers had a similar AOR of around 80%. When Covid-19 hit Singapore in February, hotels in all tiers suffered, but to different degrees. The AOR of luxury hotels dropped below 20% in April and May. Economy hotels saw their lowest AOR in March, at around 50%. During the second half of 2020, Covid-19 was gradually brought under control in Singapore and the hotel industry began to recover. In November 2020, the AOR of economy hotels was back to 60%, but the AOR of luxury and upscale hotels was still below 50%. We can therefore infer that luxury hotels were the most vulnerable group during the Covid period while economy hotels were the least affected, in terms of both the drop in AOR and the speed of recovery.

Fig. 17 Average occupancy rate in year 2020

We next perform time series analysis on gross lettings, the monthly number of occupied rooms. We build a time series model on the historical data before 2019 to predict gross lettings for 2019 and 2020, and then compare the model’s predictions with the actual figures to evaluate the impact of COVID-19. We start with data exploration: Fig. 18 shows the historical values of gross lettings. Before the outbreak of COVID-19, gross lettings show an upward trend and yearly seasonality, followed by a sudden plunge in 2020 due to COVID-19.

Fig. 18 Historical values on gross lettings from year 2008 to 2020

Before building a time series model, we need to assess whether the data is stationary; otherwise, an irregular mean and variance would make the predictions more variable. We therefore decompose the data into three components, trend, seasonality, and residual, to check what the series is made up of. Figure 19 shows the original data and its rolling values, and Fig. 20 shows an upward trend and yearly seasonality in the original data. The rolling statistics in Fig. 19 likewise show an increasing rolling mean and an irregular rolling standard deviation. The augmented Dickey–Fuller test confirms that this is a non-stationary time series.

Fig. 19 Prediction on gross lettings

Fig. 20 Decomposed gross lettings

To transform the non-stationary series into a stationary one, we apply differencing. We first take the difference between consecutive observations, after which the rolling mean and standard deviation become more stable. The p-value of the Dickey–Fuller test is 0.029, which is below 5% but still above 1%. Because of the seasonality, we then apply seasonal differencing on top of the first differencing. The series becomes more stable, with a p-value of 0.000246 in the Dickey–Fuller test, so we can treat it as stationary and proceed.

To obtain the parameters of the time series model, we create the autocorrelation (ACF) and partial autocorrelation (PACF) plots of the series after first-order and seasonal differencing, as shown in Fig. 21. We adopt a SARIMA model because the series has a seasonal component. In the plots in Fig. 21, correlation is still visible at lag 12. The plots also suggest p = 2 and q = 2 for the AR and MA parts. We therefore try the values within this range in the SARIMA models and choose those with the minimum AIC (Akaike Information Criterion) as the best model parameters. The results are shown in Fig. 22: the model SARIMAX (1,1,0) × (0,1,1,12) shows the best performance. We then use this model to predict gross lettings and compare the predictions with the actual numbers in 2020 to evaluate the impact of COVID-19. The prediction results are shown in Figs. 23 and 24. Figure 23 shows the full data from 2008 up to the 2020 prediction, while Fig. 24 shows a close-up of predicted and actual data from 2019 to 2020.

Fig. 21 Autocorrelation and partial autocorrelation plots

Fig. 22 Results of SARIMA using the optimized setting

We now turn to the estimation of revenue loss. The influence of COVID-19 has been felt since February. July and August would normally be the peak season for tourism, so the impact reached its maximum in this period. We assume $210.35 as the revenue per room, which is the average standard room rate in 2019, and multiply the difference between actual and predicted gross lettings by this room rate to estimate the loss in the hospitality industry due to the pandemic. Table 3 tabulates the revenue loss based on the predictions of the SARIMA model above. Up to November, the hospitality industry had lost around $2.3 billion, which is more than 50% of 2019’s total revenue ($4.21 billion).
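For reference, the SARIMA (1,1,0) × (0,1,1,12) specification selected by minimum AIC can be written out with the backshift operator $B$, where $B y_t = y_{t-1}$. This is the standard textbook form of such a model, with $y_t$ standing for monthly gross lettings. The symbols $\phi_1$, $\Theta_1$, $\varepsilon_t$, $k$ and $\hat{L}$ are notation introduced here rather than taken from the chapter, and the fitted coefficient values are not reproduced.

```latex
% One non-seasonal AR term, one regular difference, one seasonal difference
% at lag 12, and one seasonal MA term at lag 12.
(1 - \phi_1 B)\,(1 - B)\,(1 - B^{12})\, y_t \;=\; (1 + \Theta_1 B^{12})\, \varepsilon_t

% Akaike Information Criterion: k estimated parameters, \hat{L} the maximized likelihood.
\mathrm{AIC} \;=\; 2k - 2\ln \hat{L}
```

The factors $(1 - B)$ and $(1 - B^{12})$ are exactly the first-order and seasonal differences applied before the ACF and PACF plots were drawn, and the AIC penalizes each extra parameter, which is why it can be used to compare candidate orders within the p ≤ 2, q ≤ 2 range.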


Fig. 23 Prediction on gross lettings using SARIMA model with train data

Fig. 24 Comparison of prediction on gross lettings using SARIMA model with actual data

We next look at one of the largest groups of tourists, those from Indonesia. Figure 25 shows the share of these visitors who are visiting Singapore for the first time. About 85% of the Indonesian visitors had visited Singapore at least once before. This group of visitors are likely to be Long-Term Pass holders, and they made up the majority of inbound visitors between 2014 and 2015.


Table 3 Estimation on Revenue Loss from Gross Lettings Prediction

Month    Prediction   Actual     Difference    Revenue in Difference (K)
Jan-20   1,596.24     1,609.69   13.44         $2,827.79
Feb-20   1,551.50     855.76     −695.74       −$146,346.69
Mar-20   1,668.89     712.77     −956.12       −$201,116.81
Apr-20   1,622.50     635.37     −987.13       −$207,639.54
May-20   1,627.50     576.03     −1,051.47     −$221,171.99
Jun-20   1,612.51     515.48     −1,097.04     −$230,757.70
Jul-20   1,794.68     440.92     −1,353.75     −$284,756.37
Aug-20   1,783.69     512.41     −1,271.28     −$267,408.86
Sep-20   1,648.87     503.05     −1,145.82     −$241,018.57
Oct-20   1,709.57     503.69     −1,205.88     −$253,651.27
Nov-20   1,656.40     468.44     −1,187.96     −$249,882.17
Total                            −10,938.75    −$2,300,922.19
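The revenue column of Table 3 can be reproduced, up to rounding of the published inputs, by multiplying the gap between actual and predicted gross lettings by the assumed room rate of $210.35. The sketch below uses the Prediction and Actual columns exactly as printed; the names `table3` and `revenue_difference` are assumptions for illustration, and units follow the table (the revenue column is labelled in thousands).

```python
import math

ROOM_RATE = 210.35  # average standard room rate in 2019, as assumed in the text

# (month, predicted gross lettings, actual gross lettings) from Table 3
table3 = [
    ("Jan-20", 1596.24, 1609.69),
    ("Feb-20", 1551.50, 855.76),
    ("Mar-20", 1668.89, 712.77),
    ("Apr-20", 1622.50, 635.37),
    ("May-20", 1627.50, 576.03),
    ("Jun-20", 1612.51, 515.48),
    ("Jul-20", 1794.68, 440.92),
    ("Aug-20", 1783.69, 512.41),
    ("Sep-20", 1648.87, 503.05),
    ("Oct-20", 1709.57, 503.69),
    ("Nov-20", 1656.40, 468.44),
]

def revenue_difference(predicted, actual, rate=ROOM_RATE):
    """Revenue gap implied by the lettings gap (negative means a loss)."""
    return (actual - predicted) * rate

monthly = {month: revenue_difference(pred, act) for month, pred, act in table3}
total = sum(monthly.values())

for month, value in monthly.items():
    print(f"{month}: {value:,.2f}")
print(f"Total: {total:,.2f}")
```

The recomputed total agrees with the table’s −$2,300,922.19 to within a few hundredths of a percent; the small gap comes from the table’s inputs being rounded to two decimals.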

Fig. 25 Proportion of first visits from Indonesia

We further drill down to the cities of origin, shown in Fig. 26. Breaking down the inbound visitors by their city of residence, we found that visitors from Jakarta and Batam made up nearly three quarters of the whole group. We generated a word cloud of the purposes of their visits, shown in Fig. 27. Most of the visitors come to Singapore for holidays; some come to visit relatives or for business purposes.


Fig. 26 Drill down of city of origins on Indonesian inbound visitors

Fig. 27 Purpose of visits

We gather data on duration of stay and choice of accommodation and plot them in Figs. 28 and 29. Figure 28 ranks the durations of stay as bars: most visitors chose to stay in Singapore for 2 days, and most of them stayed in hotels. The top 10 hotels for Indonesian visitors are listed in Fig. 29. The top hotel of choice is Mandarin Orchard, a popular choice for tourists on shopping trips.

Fig. 28 Duration of stays

Fig. 29 Choice of accommodation

Shopping expenditure is another key economic contribution from inbound visitors. We plot the correlation heatmap shown in Fig. 30. The correlation matrix shows that the product category most correlated with shopping expenditure is fashion, followed by well-being products and jewelry. These are therefore also the sectors most affected during the pandemic. Moving sales online until conditions improve can help these businesses ride out the storm.
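Each cell of a correlation heatmap such as the one described above is a Pearson correlation coefficient between two columns. The sketch below computes it from scratch and ranks product categories by their correlation with total shopping spend. The column names and all numbers are made up for illustration and are not taken from the Kaggle dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical per-visitor spending (made-up values and column names).
spending = {
    "total_shopping": [120, 300, 80, 450, 200, 60],
    "fashion": [60, 150, 40, 230, 100, 25],
    "wellbeing": [20, 70, 30, 90, 40, 10],
    "jewelry": [0, 40, 0, 100, 10, 30],
}

correlations = {
    name: pearson(values, spending["total_shopping"])
    for name, values in spending.items()
    if name != "total_shopping"
}
ranked = sorted(correlations, key=correlations.get, reverse=True)
for name in ranked:
    print(f"{name}: {correlations[name]:.3f}")
```

A high coefficient only says that spend in a category moves together with total shopping spend across visitors; it does not by itself show which categories drive the total.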

Machine Learning Analysis

We use a dataset with a total of 8887 rows and 48 columns to train the machine learning models. The number of features is large, and the previous data visualization suggested that the relationships among features might be complex. Exploring the relationship of tourism expenditure with features such as gender, occupation, travel purpose and travel companion can yield useful insights. We therefore build machine learning models to predict High or Low travel expenditure. Several models were constructed for this classification problem, including AdaBoost, GradientBoost, XGBoost, Decision Tree, Random Forest and Neural Network, and we compare and contrast their performance.

Fig. 30 Correlation heat map on shopping expenditure

First, we look at the whole data set: its size, rows, columns, and features. We then calculate the total_shop, total_other and total_expenditure values for later prediction. We perform sanity checks such as validating the unique values in the data set, followed by data transformation and cleaning. As we focus on predicting total expenditure, we eliminate all unnecessary columns before converting to dummy variables. Another important cleaning step is handling missing values; only a few records have missing values, so we drop them. We then categorize the total_shop, total_oth and total_exp variables into binary classes for the classification task, create dummies for all the categorical variables, and convert the transformed y variable into binary. The features are then ready for partitioning and classification. The cleaned data set is split into a train set and a test set using a ratio of 70% for training and 30% for testing. As a result, we have 6198 records to train the models and 2656 to test them.

Because of the large number of features, we carry out an additional step to reduce the dimension. After partitioning and dropping the unnecessary columns, the training set still has a total of 69 features. This many features makes training extremely slow, and the higher the dimensionality of the training set, the higher the risk of overfitting. To avoid building an unnecessarily complex model, we use Principal Component Analysis (PCA) for dimensionality reduction. PCA selects the principal components (PCs) that preserve the maximum amount of explained variance, where the explained variance is the proportion of the dataset’s variance that lies along each principal component. We choose the number of dimensions whose explained variance adds up to 98% of the total variance. This speeds up training and to some extent guards against overfitting. After dimensionality reduction, we preserve 98% of the total variance with only 41 features instead of 69, a reduction of about 40% that greatly decreases the model’s complexity.

The first machine learning model is AdaBoost, which belongs to the family of boosting classifiers. A boosting classifier processes the data iteratively and combines weak classifiers to improve performance. AdaBoost usually starts with a weak learner, typically a shallow tree with a single split. It then adaptively modifies the sample weights over successive rounds to improve prediction accuracy. AdaBoost can be used for both classification and regression. Table 4 shows the classification report of AdaBoost. It achieved 80% accuracy and an f1-score of 82%, which indicates the good predictive ability of this model.

The second classifier we trained is GradientBoost. It minimizes the loss function by following its gradient. At first, a small decision tree is constructed with a single split.
Gradually, trees are added one at a time; after each addition the loss and its gradient are computed, and each subsequent tree is fitted so as to reduce the loss along the direction of the negative gradient. GradientBoost achieved 81% accuracy and an f1-score of 83%, as shown in Table 5, a better performance than AdaBoost. Another member of the boosting family is XGBoost. It is a decision tree-based ensemble machine learning algorithm that uses a gradient boosting framework. The algorithm trains quickly and efficiently, with optimizations that give high accuracy with fewer computing resources. The trained XGBoost model achieved 80% accuracy and an f1-score of 82%, as shown in Table 6.

Table 4 Performance of AdaBoost classifier
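The partitioning and dimensionality-reduction steps described earlier can be sketched without any external library. Two assumptions are made here and should be read as such: the split size is obtained by rounding 70% of the row count, chosen because it reproduces the reported 6198/2656 split, and the eigenvalues passed to the component-selection helper are made up. The function names are hypothetical; the chapter does not state which library calls were used.

```python
import random

def train_test_split_indices(n_rows, train_fraction=0.7, seed=42):
    """Shuffle row indices and cut them into train and test parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_train = round(n_rows * train_fraction)  # rounding rule is an assumption
    return indices[:n_train], indices[n_train:]

def components_for_variance(eigenvalues, target=0.98):
    """Smallest number of leading principal components whose cumulative
    explained-variance ratio reaches `target`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value / total
        if cumulative >= target:
            return k
    return len(ordered)

# 6198 + 2656 = 8854 cleaned records, as reported in the text.
train_idx, test_idx = train_test_split_indices(6198 + 2656)
print(len(train_idx), len(test_idx))

# Made-up eigenvalues of a covariance matrix, for illustration only.
toy_eigenvalues = [5.0, 3.0, 1.5, 0.4, 0.1]
print(components_for_variance(toy_eigenvalues, target=0.98))
```

In the chapter’s setting the same selection rule, applied to the 69 features of the training set, retains 41 components. The PCA projection should be fitted on the training rows only and then applied to the test rows, so that no information from the test set leaks into the model.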


Table 5 Performance of GradientBoost classifier

Table 6 Performance of XGBoost classifier

Table 7 Performance of decision tree classifier
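To make the boosting idea discussed above concrete, here is a minimal gradient boosting sketch that uses single-split trees (stumps) on one feature. It is a simplification under stated assumptions: it uses squared-error loss on 0/1 labels and thresholds the output at 0.5, whereas library classifiers typically optimize a log-loss, and the toy data is made up. None of the function names come from the chapter.

```python
def fit_stump(xs, targets):
    """Least-squares regression stump (one split) on a single feature."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def gradient_boost(xs, ys, n_rounds=100, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the residual y - f(x).
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)

# Toy data (made up): the positive class sits in the middle of the range,
# so no single split can separate it.
xs = list(range(1, 11))
ys = [1.0 if 4 <= x <= 7 else 0.0 for x in xs]
model = gradient_boost(xs, ys)
labels = [1 if predict(model, x) >= 0.5 else 0 for x in xs]
print(labels)
```

A single stump cannot represent the middle band, but the sum of many small corrections can, which is the sense in which each added tree reduces the remaining error of the ensemble.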

The fourth classifier we train is the decision tree. A decision tree is a supervised machine learning algorithm that can be used for both classification and regression problems. It is simply a series of sequential decisions made to reach a specific result. The decision tree is a distribution-free, non-parametric method that does not depend on probability distribution assumptions, and it can handle high dimensional data with good accuracy. The decision tree does not perform as well as the other models: it achieved only 74% accuracy and an f1-score of 77%, as shown in Table 7.

Having seen the performance of boosting classifiers on this prediction problem, we now turn to bagging. A bagging classifier trains its base models on bootstrap resamples of the training data and aggregates their predictions to improve classification performance. We use Random Forest, a forest of randomly created decision trees whose individual outputs are combined to generate the final output. Random Forest leverages the power of multiple decision trees, which improves predictive accuracy and controls over-fitting. The results are shown in Table 8. Random Forest performs better than the decision tree, achieving 80% accuracy and an f1-score of 82%.

Table 8 Performance of random forest classifier

The last classifier is a Neural Network, specifically a Multi-Layer Perceptron. A Neural Network is a computational learning system that uses a network of functions to understand and translate a data input of one form into a desired output. Its architecture consists of three main kinds of layers: input layer, hidden layer, and output layer. The performance of the Neural Network is shown in Table 9. It achieved 77% accuracy and an f1-score of 80%, ahead of the decision tree but behind the ensemble models.

Table 9 Performance of neural network classifier

The performance of all trained classifiers is summarized in Table 10. Among them, GradientBoost has the highest accuracy (81%), the highest recall (83%) and the highest F1 score (83%), so it is the recommended model for this problem.

Table 10 Performance summary of classifiers

Classifier       Accuracy (%)   Recall (%)   F1 Score (%)
AdaBoost         80             82           82
GradientBoost    81             83           83
XGboost          80             82           82
Decision Tree    74             76           77
Random Forest    80             81           82
Neural Network   77             78           80

The key advantages of GradientBoost are:

· GradientBoost is a gradient descent type of algorithm. In each round, it takes the current ensemble and computes the gradient, that is, the direction in which the model can improve. It then trains a tree to predict this direction and adds the tree to the ensemble. Each additional tree therefore moves the model closer to the target and reduces the bias of the model rather than the variance.
· GradientBoost can also make better splits. The splitting of a node is stopped when a split with negative loss is encountered. GradientBoost splits until the maximum depth is reached and then goes back to prune the tree, removing any split that does not result in a positive gain.
· Other models such as Decision Tree and Random Forest tend to overfit the data, while Neural Networks generally require larger data sets to train a well-performing model. In this case, GradientBoost becomes our best choice.

Features are another important aspect of machine learning; informative features help a machine learning model to predict well. Table 11 summarizes the feature importance ranking based on the best-performing model. The first column gives the name of the feature and the second column shows the weighted importance of the corresponding variable. Some features are of significant importance in predicting the total expenditure of tourists, such as the age of children and travelling partners such as business associates. These top 15 features have the highest feature importance and hence the greatest explanatory power.
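The accuracy, recall and F1 figures used to compare the classifiers are all functions of the confusion-matrix counts. The sketch below shows the standard definitions for a binary High/Low target; the function name and the toy labels are made up for illustration and do not reproduce any of the reported scores.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy labels (made up): 1 = High expenditure, 0 = Low expenditure.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because F1 is the harmonic mean of precision and recall, it can exceed accuracy when the positive class is predicted well, which is consistent with the F1 scores in Table 10 sitting slightly above the accuracies.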

Table 11 Important features on travel spending

Recommendation

Domestic tourism is an alternative in these trying times. Residents should be encouraged to “travel” domestically, and funding can be provided for vouchers. In June, as Singapore was loosening measures such as stay-at-home orders and other COVID-19 restrictions, the Singapore Tourism Board started to allow popular tourist attractions to reopen in stages and began accepting applications from domestic tour operators to resume business. In the following month, the board launched SingapoRediscovers, a $45 million campaign to boost local tourism, offering residents “staycation” packages, tours, and discounted attractions. The campaign provided a $100 voucher for every adult Singaporean to use on hotels and attractions. On December 1, the first day the vouchers were redeemable, the designated booking platforms received $1.4 million in sales. Through online ticketing, timed booking, code scanning on entry and similar measures, scenic spots can implement contactless entry and advance booking to manage traffic and control crowds.

Another initiative is the flight to nowhere: passengers board a plane that takes off from Singapore’s Changi Airport, flies around, and then lands at Changi again. This idea did not take off; the airline opted instead for a grounded experience that offered customers dinner and a movie on a double-decker A380 plane for $37–440. The airline also began offering tours of its training center and flight experiences in a flight simulator.

The opening of green lanes provides an option for restricted travel. This scheme allows visitors, including leisure tourists, to apply to travel to Singapore without the need to undergo a 14-day quarantine. Since its launch in September, Singapore has expanded the list of eligible regions to include Australia, China, Taiwan, and Vietnam. The scheme is unilateral: local residents who want to travel to those places are still subject to existing coronavirus travel restrictions at the destination. Singapore also runs reciprocal “green lane” programs with China, South Korea, Indonesia, Malaysia, Japan, Brunei, and Germany that allow certain corporate and diplomatic travelers to bypass the required quarantine on arrival with a negative COVID-19 test result.

Considering the relationship between the number of visitor arrivals and their average length of stay, demand for long-term rentals will increase. Hotels can offer more pre-arranged services and promote new long-term rental concessions to boost sales.

The tourism industry plays an important role in the Singapore economy, contributing more than 4% of GDP. In this work, we presented visualizations and analysis of the sudden drop in the number of international visitors due to the pandemic. All related industries, from aviation and hospitality to food and beverage, have been severely impacted, to the point of resetting the whole industry. We hope this work sheds some light on the quantitative impact of the pandemic and on how machine learning can be embedded in this industry for faster recovery.

Data Visualization on Tourism
Hanlin Xiao, Jie Cheng, Yunfan Lyu, Yuqing Ma, Dongxu Sun, and Qian Wu

Data Sources

We obtained data from the following sources:

· OECD (The Organization for Economic Co-operation and Development) database: The OECD is an international organization that works to build better policies for better lives. The data extracted from the OECD database are summarized in Table 1.
· Stan (Singapore Tourism Analytics Network) database: Stan is a data analytics platform for viewing visualizations and performing analysis on tourism-related data aggregated from STB and the industry. These data help to derive actionable insights about Singapore’s visitors; the data obtained from this database are shown in Table 2.

Table 1 Data in use on tourism revenue and expenditure

Tourism revenue and expenditure:
· Tourism revenue by country
· Tourism revenue by category
· Tourism expenditure by country
· Tourism expenditure by category

Hotel supply and demand:
· Maximum Room/Night
· Available Room/Night
· Gross Lettings Rooms/Night
· Paid Lettings Rooms/Night
· Standard Average Occupancy Rate (%)
· Standard Average Room Rate (S$)
· Revenue Per Available Room (S$)

Table 2 Data in use on visitors’ arrivals and hotel statistics

Visitors arrivals:
· Visitors arrivals by year
· Visitors arrivals by geography
· Visitors arrivals by demography
· Visitors arrivals by behavior

Hotel statistics:
· Gazetted Hotel Statistics
· Gazetted Hotel Statistics By Hotel Size
· Gazetted Hotel Statistics By Hotel Tier
· Room Revenue of Gazetted Hotels
· Supply of Hotels and Hotel Rooms
· Room Stock of Gazetted Hotels

H. Xiao (B) · J. Cheng · Y. Lyu · Y. Ma · D. Sun · Q. Wu
Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, Singapore 639798, Singapore
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_11

Data Visualization and Analysis

In 2020, Singapore tightened the restrictions on border entry. Measures such as a Stay-at-Home Notice (SHN) of 7 or 14 days were required for travelers from most countries. This lengthened stays and raised costs for tourists significantly. Figure 1 displays the overall trend since 2010. It records the influence of COVID-19 on Singapore’s tourism in terms of arrivals, visitor days, and receipts. Although these data show seasonal fluctuations, the number of arrivals, the total number of visitor days, and tourism receipts were on an increasing trend throughout the 2010s. The numbers shrank dramatically in 2020 due to border restrictions and the public’s concerns about COVID-19; arrivals and total visitor days in particular plunged compared with earlier years. Singapore took active measures to contain the spread of COVID-19, and tourism started to recover from May. Although the recovery was slow, it was showing progress. Another important trend is the increase in the average length of stay, as depicted in Fig. 2. This trend can be observed by comparing 2019 and 2020. Because of the Stay-Home Notice requirement, visitors had to stay longer after arriving even if they had planned a short-term visit.
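The three series discussed above are linked by a simple ratio. The sketch below assumes that the average length of stay is defined as total visitor days divided by arrivals (the chapter does not spell out the definition), and the monthly figures are made up purely for illustration; they are not taken from the Stan data.

```python
def average_length_of_stay(visitor_days, arrivals):
    """Average length of stay in days, assumed to be visitor days / arrivals."""
    if arrivals <= 0:
        raise ValueError("arrivals must be positive")
    return visitor_days / arrivals


# Hypothetical months: (label, arrivals, total visitor days).
hypothetical_months = [
    ("pre-pandemic month", 1_500_000, 5_250_000),
    ("pandemic month", 10_000, 400_000),
]

for label, arrivals, visitor_days in hypothetical_months:
    stay = average_length_of_stay(visitor_days, arrivals)
    print(f"{label}: {stay:.1f} days per visitor")
```

Under this definition, a collapse in arrivals combined with quarantine-lengthened visits raises the average stay even while total visitor days fall, which is the pattern the text describes.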


Fig. 1 General trends over time on arrivals and tourism receipts

Fig. 2 Detailed trends over time on arrivals and length of stays

The influence of the pandemic on tourism is significant not only over time but also at the country level, and it affects the mode of arrival. Figure 3 compares the visitor arrivals, visitor days, and average length of stay of tourists from different regions in November 2019 and November 2020, that is, without and with the pandemic, respectively. Inbound tourists arriving in Singapore mostly come from China and Southeast Asia. Unlike tourists from China, tourists from Southeast Asia prefer a shorter duration of visit, mainly because the shorter distance between Southeast Asia and Singapore makes a short-term visit easier to arrange. There are also some other interesting facts. The growth rate of the average length of stay is highest for tourists from China, partly due to concerns about restrictions on travel abroad and strict quarantine regulations. A large number of tourists canceled their plans to travel to other countries with stricter restrictions. Meanwhile, the average length of stay of tourists from Europe did not increase as fast as that of tourists from other regions.

Fig. 3 Visitor arrivals, visitor days, and average length of stay trends by region

The change in countries of origin can be seen in Fig. 4. Indonesia used to be the largest source of tourists arriving in Singapore. However, the pandemic situation in Indonesia was not promising at that point in time, and stricter SHN measures were imposed on Indonesian tourists. China surpassed Indonesia as the largest source of travelers to Singapore. Staying preferences differ by country of origin, as shown in Fig. 5. Tourists from the two largest source countries, China and Indonesia, mostly come for shorter visits. This is in sharp contrast to countries with fewer travelers, such as India and the Philippines, whose visitors typically stay longer. Workers or students may make up part of the group from India, which would explain their longer visits. Figure 6 shows how the mode of transportation changed during the pandemic. In the pre-COVID period, the total number of arrivals was increasing by around 10% yearly, driven mainly by the increase in arrivals by air. Arrivals by air make up the largest proportion, while sea and land account for much smaller proportions. The strict travel ban led to the dramatic downturn shown in Fig. 6 in the first quarter of 2020.


Fig. 4 Visitor arrivals, visitor days, and average length of stay trends by origins

Fig. 5 Visitor arrivals by average length of stay and visitor days

Figure 7 illustrates the distribution of mode of arrival during the pandemic, which can be seen as a reflection of travel recovery during COVID. The number of travelers in all three modes fell to near zero in April and May and started to show signs of recovery in June at a moderate speed. Air travel, the mode with the largest proportion before COVID, remains the largest contributor to total arrivals. Arrivals by sea recovered faster than arrivals by land, which suggests that stricter regulations were imposed on land travelers.


Fig. 6 Distribution of mode of arrival from the year 2008 to 2020

Fig. 7 Distribution of mode of arrival during the pandemic

We next analyze the impact from a demographic perspective using data visualizations. Figure 8 records the distribution of visitor arrivals across age groups from 2008 to 2020. The increase in total visitor arrivals comes mainly from travelers aged 25 and above. Figure 9 shows the distribution of visitor arrivals starting from April 2020, the starting point of the recovery in international travel. Travelers aged 20 to 44 make up the largest group, while arrivals of elderly travelers aged above 55 recover slowly. The elderly group is less likely to travel because of the high COVID fatality rate in this age group. The gender distribution does not change significantly before and during the pandemic.

Fig. 8 Visitor arrival by age groups from the year 2008 to 2020

Fig. 9 Visitor arrival by age groups during the pandemic

We now further analyze the change in the distribution of the length of stay. Figure 10 shows the stacked distribution of different lengths of stay. As shown in the figure, stays of less than 3 days make up the majority of travel and contribute the most to the increase; more than 70% of trips last less than 4 days and around 40% last only 1 day. Figure 11 shows the distribution of length of stay starting from March 2020, the starting point of the recovery in international travel. The distribution of length of stay changed dramatically. Stays of more than 15 days make up the majority of all travel. In months like May, June, and July, more than 80% of travel lasts for more than 60 days. This drastic change illustrates the effect of the SHN mentioned previously, which lasted 14 days for all international travelers from May to July. The average length of stay began to fall in July, but travel with less than 10 days of stay remains at a similar proportion as a result of changing SHN requirements for visitors from several countries and regions. Also, the increase in stays of 15 to 29 days reflects the recovery of leisure travel.

This section shows the statistics of standard average occupancy rate (AOR), standard average room rates (ARR), and revenue per available room (RevPAR) by hotel tier. The hotel tiering system is a reference system developed by the Singapore Tourism Board (STB) to categorize different hotels in Singapore into tiers based on a combination of factors that include average room rates, location, and product


Fig. 10 Length of stay from the year 2008 to 2020

Fig. 11 Length of stay during COVID

characteristics. The tiers can generally be categorized into Luxury, Upper class, Midtier, and Economy segments. First comes the luxury segment. Figure 12 shows indicators such as occupancy rate, room rate, and revenue per available room for luxury hotels in 2008, 2012, 2016, and 2020. Looking at the Standard Average Occupancy Rate in the top graph, we can see that the occupancy rate stays relatively stable, with slight fluctuation across the whole year. In the next two graphs, for Standard Average Room Rate and Revenue Per Available Room, both price and revenue remain at a high level across the year. A significant increase can be observed starting from August, reaching a peak in September and then slowly tapering off until December. The period from August to September is the peak season for tourism, as most students are on vacation and families have free time to travel. The increase in customer demand raises the average room rate, which is accompanied by an increase in revenue per available room. However, due to COVID-19, visitors from many countries were restricted from entering Singapore, and even local residents were limited in many activities. The hotel industry is also affected by the interplay of viral spread, government policy, and social behavior. The impact of the pandemic is marked in the 2020 trendlines across all indicators. What is gratifying is that, under the aggressive epidemic prevention measures taken by the government, the hotel industry has improved since June and


Fig. 12 Luxury hotels indicators across years 2008, 2012, 2016, and 2020

has shown a steady upward trend. However, due to limitations such as social distancing and entry restrictions for foreign tourists, both the occupancy rate and revenue per available room remain at a relatively low level compared with previous years. The upper-class and midtier segments show similar trends in the three indicators mentioned above, with recovery starting slightly earlier than in the Luxury segment. We focus instead on the Economy tier, whose performance differs significantly from that of the other tiers. The monthly trend for Singapore economy hotels in 2008, 2012, 2016, and 2020 is shown in Fig. 13. From Fig. 13, we can see that the economy tier fares better than the other tiers when constraints on social gathering are imposed. It responds to the restrictions more quickly than other tiers, since its occupancy rate climbs from April to August and remains high until October. One possible reason is that, while many luxury and upper-class hotels had to close temporarily, economy hotels could stay open at lower occupancy rates thanks to lower operating costs. Even though the standard average room rate and revenue per available room remain low, better demand and lower operating costs suggest that economy hotels will recover faster. That would be consistent with what was seen in past crises, such as the financial crisis of 2008. Hoteliers should build up resilience and spread their risks across different markets.


Fig. 13 Economy hotels indicators across years 2008, 2012, 2016, and 2020

The final analysis by hotel tier is an overview across the years 2011 to 2020, as shown in Fig. 14. Even though the number of gazetted hotels keeps rising, hotel revenue across tiers shows a huge decline in 2020. This was not the case in previous years; inferring from past trends, we would expect hotel revenue to keep climbing in the absence of social gathering constraints. Besides, the graph of revenue per available room shows that COVID-19 had the most significant impact on luxury hotels, where revenue per available room declined from $403.90 to $142.10. By comparison, upper-class and midtier hotels saw declines of only up to 50%. We next analyze hotel performance along another dimension: hotel size. The relevant indicators are the standard average occupancy rate (AOR), standard average room rates (ARR), and revenue per available room (RevPAR) by hotel size. The hotel sizing system is a reference system developed by the Singapore Tourism Board (STB) to categorize different hotels in Singapore into different sizes based on a combination of factors that include average room rates, location, and product characteristics. The sizes can generally be categorized into large, medium, and small segments.
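To put the luxury-tier figures quoted above on the same footing as the "up to 50%" decline cited for the other tiers, the drop in revenue per available room can be expressed as a percentage. The short sketch below uses only the two values given in the text; the function name is ours and purely illustrative.

```python
def percent_decline(before, after):
    """Decline from `before` to `after`, as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Luxury-tier revenue per available room, as quoted in the text.
luxury_decline = percent_decline(403.90, 142.10)
print(f"Luxury RevPAR decline: {luxury_decline:.1f}%")
```

The result is a little under two thirds, noticeably steeper than the decline reported for upper-class and midtier hotels.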


Fig. 14 Overview of hotel performance by tiers from the year 2011 to 2020

Large hotels can usually accommodate more travelers, but they also need a high average occupancy rate to cover higher operating costs. Figure 15 shows the monthly trend for Singapore’s large hotels in 2008, 2012, 2016, and 2020. Even though the average occupancy rate recovered quickly from 40% to almost 70%, because Singapore acted quickly and responded to changing conditions, the average room rate and revenue per available room kept falling. Under the test of the epidemic, it is difficult for large hotels to raise average occupancy while keeping existing room rates. Their usual strategy is to attract more customers by providing higher-quality services and lowering the unit price of hotel stays. Thus, there is no sign of an upward trend in revenue per available room. The picture is quite different for medium and small hotels. For instance, Fig. 16 shows that smaller hotels recover their average occupancy rate quite easily. Since smaller hotels do not need to focus as much on the average occupancy rate, they can spend more time on the quality of the services they provide and raise the unit price in line with the additional services. Thus, their revenue shows a pick-up. For the yearly trend of the standard average occupancy rate (AOR), standard average room rates (ARR) and revenue per available room (RevPAR) across hotel


Fig. 15 Large size hotels indicators across years 2008, 2012, 2016, and 2020

sizes from 2008 to 2020, we can conclude that the hospitality industry is sensitive to social responsibility in a pandemic; this is a classic example of a black swan event. Under this challenge, hotels of different tiers and sizes need to make timely strategic adjustments.
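The observation that large hotels regained occupancy without regaining revenue per available room follows from how the three indicators relate. The sketch below assumes the common definition RevPAR = AOR × ARR (the chapter does not state the formula), and the room rates are hypothetical values chosen only to illustrate the effect; only the 40% and 70% occupancy levels echo the text.

```python
def revpar(occupancy_rate, average_room_rate):
    """RevPAR as occupancy (a fraction between 0 and 1) times average room rate."""
    if not 0.0 <= occupancy_rate <= 1.0:
        raise ValueError("occupancy_rate must be a fraction between 0 and 1")
    return occupancy_rate * average_room_rate


# Hypothetical large hotel: occupancy rises from 40% to 70%,
# but the room rate is cut from S$200 to S$110 to attract guests.
before = revpar(0.40, 200.0)
after = revpar(0.70, 110.0)
print(f"RevPAR before: {before:.2f}, after: {after:.2f}")
```

With these illustrative numbers the occupancy gain is fully offset by the lower room rate, so RevPAR does not improve.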

Recommendation

The economic impact of COVID-19 has clearly battered the entire tourism industry. The pandemic has also accelerated the shift to online channels and experiences such as electronic payments and early check-ins. Changes in consumer preferences and behavior are likely to continue. Hence, businesses need to be agile in adapting to the new paradigm. One recommendation is to take measures to attract travelers for long-term visits. The average length of stay for tourists increased substantially in 2020, and a larger fraction of them came to Singapore for long-term visits. Compared with short-term visitors, long-term visitors generally do not have travel as their main purpose for staying in Singapore, but studying or working


Fig. 16 Small size hotels indicators across years 2008, 2012, 2016, and 2020

instead. This group of visitors typically visits attractions on weekends or public holidays and has a more spaced-out time frame for touring. Since this trend is likely to continue, tourism agencies can take measures to attract long-term visitors, for example holiday and weekend discounts on tickets, which fit the travel pattern of this group. Following the analysis of changes in demographic distribution and length of stay, we can see that although the length of stay may be influenced by the SHN period, it is still longer than before the pandemic even after deducting the SHN period. Previously, the majority of trips lasted less than 3 days, but this changed dramatically during COVID. Moreover, the analysis of the demographic distribution, especially the age distribution, shows that the average age of travelers is lower because there are fewer elderly travelers. It is therefore recommended that local travel agencies and hotels adjust their offerings for younger travelers and cut down on resources or packages planned for the older age group. Travel agencies and hotel operators should take these changes into account to better allocate resources and budgets. The next step is to move toward a more technology-enabled transformation. Artificial Intelligence (AI) solutions with machine learning algorithms connect to big


data and provide precise estimates across various relevant industry and risk management metrics, enabling businesses to significantly improve their decision-making capabilities. The adoption of AI is gaining traction in the hospitality industry, for example through chatbots, which aim to improve the hotel stay experience and address guests’ queries promptly. Hotel chatbots analyze data from a wide array of sources (interactions with guests in the hotel app, purchase history, food preferences, stored payment options, spa and amenity usage, etc.) to provide a personalized experience. The more data the chatbot’s algorithms have to learn from, the better the outcomes and suggestions. Furthermore, AI-driven chatbots have a quick response time: guests can receive answers to their queries almost immediately, as if they were speaking to staff face to face. Chatbots are positioned to alter the operational backbone of the hospitality industry, starting with business processes such as booking and streamlining workflows at call centers and other hotel support units. Machine Learning (ML) algorithms in chatbots can be trained on historical calls with customers and their booking behavior on a hotel website to offer them the most relevant booking options. Mobile apps will continue to be the backbone of the technology behind the next-generation hotel experience. A better hotel app allows for two-way communication between guests and the hotel operator: guests can access hotel services and other information anytime (for example, ordering room service dinner while they are still in the spa). The hotel can use the app to get in touch with guests at the right moment, for example by sending important notifications, updates, offers, and alerts. Some important features a hotel app can offer are listed below:

· Booking options
· Remote check-in/check-out
· Restaurant booking with in-app menus
· Chat with staff
· Guest services (in-room dining, laundry, etc.)
· Hotel map
· Other timely information (flight schedules, hotel entertainment)
· Room key functionality.

Today’s smart room apps represent only the first iteration on the way to next-generation hotel rooms. There is plenty of room for innovation to help shape the future of the tourism industry with fresh ideas. The other future trend is the adoption of AI for hotel pricing. The changing dynamics of the tourism industry raise the question of how to set appropriate prices for hotels of different tiers and sizes. One suggested way is to use Artificial Intelligence to identify leading indicators that signal and recommend pricing strategies as markets decline or recover. During this significant period of disruption, travelers are


certainly more price sensitive. Anxious hoteliers engaging in price-slashing wars may suppress pricing to reduce losses. Intelligent pricing is a channel-agnostic approach. Instead of focusing only on direct bookings, for example, smart pricing looks at every distribution channel’s relative value and assesses how much each channel can drive guest room demand and help to achieve the overarching goal. AI-powered revenue management is the next exciting opportunity in the tourism industry, and it is also about smart pricing. The aim is to use market estimates, cost sensitivities, competitor rates, and demand drivers such as special events, seasonality, and day-of-week variations to optimize room occupancy at the best possible price. It is also important to develop a marketing plan that ramps up with travel demand. A month-to-month marketing plan that progressively builds up with the expected rise in travel demand, as entry and exit policies are relaxed over time, will fuel the booking funnel and maximize revenue. For a hotel that is currently closed, the plan should start one month prior to reopening and continue to shift from upper-funnel to lower-funnel targeting month to month. Considerations for marketing plans are as follows:

· Month 1: Focus on upper-funnel targeting, mainly the local feeder market, with a 60–90 day booking window, and market to interested audiences, with 80% of the budget allocated to the upper-funnel plan. Exclude demographic segments that are unlikely to travel. Messaging should be time-sensitive and focus on building interest in the hotel opening and the attractiveness of destination content.
· Month 2: Turn to mid- to lower-funnel targeting, start to target viable fly-in markets, and implement website remarketing as the pool grows. The budget allocation should be roughly 60% upper funnel and 40% lower funnel. Messaging should focus on the hotel’s unique selling points and special offers and packages.
· Month 3: As travel demand continues to grow, hotels should continue to refine their travel-intent targeting and begin to increase ancillary income opportunities through scheduled spa treatments and meals. The budget allocation should be approximately 30% upper funnel and 70% lower funnel. Messaging should focus on ways to increase the length of stay, increase sales of ancillary products, and maximize average booking value.
· Month 4 and onwards: Prioritize lower-funnel plans and focus on locating potential guests based on real-time travel intent to the destination. The budget allocation should be approximately 20% upper funnel and 80% lower funnel. Hotel operators should continue to adjust their strategies to reflect changes in travel needs. As travel demand begins to stabilize, messaging should focus on meeting key business needs and typical seasonality.
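The month-by-month budget shift described in the plan above can be written down as a small lookup. This is only a sketch: the upper-funnel shares are the ones stated in the plan, the 20% lower-funnel share for Month 1 is inferred as the remainder, and the monthly budget of 10,000 is a hypothetical figure.

```python
# Upper-funnel share of the marketing budget per month, as stated in the plan.
# Month 4 applies to all later months; the lower-funnel share is the remainder.
UPPER_SHARE_BY_MONTH = {1: 0.80, 2: 0.60, 3: 0.30, 4: 0.20}


def split_budget(month, monthly_budget):
    """Split a monthly budget into upper-funnel and lower-funnel amounts."""
    if month < 1:
        raise ValueError("month must be 1 or greater")
    upper_share = UPPER_SHARE_BY_MONTH[min(month, 4)]
    upper = monthly_budget * upper_share
    return {"upper": upper, "lower": monthly_budget - upper}


# Hypothetical budget, for illustration only.
for month in range(1, 6):
    allocation = split_budget(month, 10_000)
    print(f"Month {month}: upper {allocation['upper']:.0f}, lower {allocation['lower']:.0f}")
```

Keeping the shares in one table makes it easy for an operator to adjust the schedule if travel demand recovers faster or slower than expected.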


Conclusion

Through the visualization and analysis of the impact of COVID on the tourism industry, we gain a deeper and more thorough understanding of this pandemic and the difference it has made in terms of macro factors, including the number of visitor arrivals, length of stay, age group distribution, and length of stay distribution, as well as micro factors such as revenue and occupancy rate in different room types. Suggestions were proposed for the tourism industry, including the deployment of new technologies such as dynamic pricing systems, more efficient boarding policies, adjustment of the target customer, and better marketing plans. The outbreak of the COVID pandemic was a sudden and damaging catastrophe for all industries, especially tourism, with profound and long-lasting effects. Prompt changes have to take place during this pandemic, and new strategies should be made to survive and thrive again.

References

1. Calderwood LU, Soshkin M (2019) The travel and tourism competitiveness report 2019, World Economic Forum 2019
2. Channel News Asia. The big read: A vital economic pillar, S’pore’s tourism sector faces a brutal test of mettle amid COVID-19 fallout. https://www.channelnewsasia.com/singapore/big-readsingapore-tourism-sector-covid-19-fallout-977476
3. OECD (2020) Mitigating the impact of COVID-19 on tourism and supporting recovery. https://doi.org/10.1787/47045bae-en

Sustaining Tourism Sector Through Domestic Tourism and Analytics
Dingming Chen, Pou Ing Gan, Hoi Ming Lee, Ziye Li, Vadlamudi Santosh Krishna, and Quanxin Wang

During the last few decades, increasing international tourism receipts have contributed to significant growth in global GDP [1]. Increased human interactions from tourism also bring second-order benefits to international trade, foreign direct investment, supply chain integration and jobs. However, tourism is often vulnerable to disruption in the destination and origin markets [2]. Tourism demand is susceptible to safety and health concerns [3], fluctuations in global political circumstances, natural calamities, and pandemics [4]. Prior to the COVID-19 pandemic, significant disruptions to international tourism included the Foot and Mouth disease outbreak in the UK (2001), the terrorist attack on Indonesia’s resort island of Bali (2002), the outbreak of Severe Acute Respiratory Syndrome (SARS) (2003), and the tsunami in South Asia (2004). The impacts of such disruptions on regional tourism are usually immediate [5], and the impact is most direct when health risks from infectious diseases are involved [6]. The time required for tourism to fully recover from a crisis depends on the nature of the disturbance and how exactly the tourism system has been affected [7]. Different approaches have been used to study tourism and its recovery from disruptions. We can take a qualitative, quantitative, or mixed-method approach [8]. Qualitative approaches are text-based [9]: written accounts of the tourism system and disruptive events are analyzed. A quantitative approach emphasizes objective measurements, the statistical or numerical analysis of data, and comparison of the data against concepts and frameworks. The mixed-method approach combines quantitative and qualitative data within a single investigation, which allows additional questions to be asked on top of the traditional methods to analyze the problems better [10]. A mixed-method approach is chosen for this work.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_12.

D. Chen · P. I. Gan · H. M. Lee (B) · Z. Li · V. S. Krishna · Q. Wang
Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, Singapore 639798, Singapore
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_12

Dataset and Analysis

COVID-19 has tremendously affected international tourism, with a decrease of approximately 900 million [1] international tourist arrivals in the first ten months of 2020 compared to the corresponding period in 2019. This translates into an approximate loss of US$ 935 billion in export earnings [1] from global tourism, more than ten times the loss in 2009 under the influence of the global financial crisis. Asia and the Pacific saw the largest decline in international arrivals (82%) from January to October 2020, while the declines in Europe and the Americas were the least drastic (68%). This may partially explain how COVID-19 has spread in different continents. The World Travel & Tourism Council (WTTC) [11] has estimated that reduced international arrivals will cause over 100 million jobs in the tourism sector to be lost, with an economic loss of up to US$ 2.7 trillion in 2020. Several datasets were used in this work to understand the different factors that affect tourism before and after the COVID-19 outbreak, as shown in Table 1. This work uses Singapore as a case study to infer what may be happening in other popular tourism destinations. Prior to the COVID-19 outbreak, Singapore’s international visitor arrivals had been on an upward trend, with seasonal peaks in July to August (summer holiday months in the northern hemisphere) and in December to January (Christmas and New Year holidays), as shown in Fig. 1. However, there was a sharp decline in international visitor arrivals starting from February 2020, when the COVID-19 pandemic began to affect tourism worldwide, as shown in Fig. 2. In the same month, Singapore started to close its borders, barring entry to visitors who had been to mainland China; the ban widened to cover more countries in the subsequent months [1]. Entry restrictions remained largely in place until the end of 2020.
In line with tourism arrivals, tourism spending had also been on an upward trend before COVID-19, albeit at a lower rate, as shown in Fig. 3. Unfortunately, all 5 types of tourism spending took a nosedive in Q2 2020 when international travel ground to a halt. Spending on Sightseeing, Entertainment & Gaming (SEG), which grew the fastest in the past decade, took the hardest hit in percentage terms. To rule out seasonal effects, we compared Quarter 2 across the years, as depicted in Fig. 4. Spending dropped to 160 million dollars in Q2 2020, about 2% of the level in prior years. The spending mix has also changed significantly. The less discretionary “Others” component (which includes expenditure on airfares on Singapore-based carriers, port taxes, local transportation, business, medical, education, and by transit/transfer visitors) has taken a dominant share of the spending mix. Accommodation, Shopping and SEG saw their share of total spend decrease significantly.
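The shift in spending mix described above is a statement about shares of the quarterly total rather than absolute amounts. A minimal sketch of that calculation follows; the component names are the four mentioned in this passage, and all amounts are hypothetical placeholders rather than values from the tourism receipts data.

```python
def spending_mix(spend_by_component):
    """Return each component's share of total spending as a fraction."""
    total = sum(spend_by_component.values())
    if total <= 0:
        raise ValueError("total spending must be positive")
    return {name: value / total for name, value in spend_by_component.items()}


# Hypothetical Q2 spending in millions of dollars, for illustration only.
hypothetical_q2 = {
    "Accommodation": 20.0,
    "Shopping": 15.0,
    "SEG": 5.0,
    "Others": 120.0,
}

for name, share in spending_mix(hypothetical_q2).items():
    print(f"{name}: {share:.1%}")
```

Comparing these shares for the same quarter across years, as Fig. 4 does, separates the change in composition from the overall collapse in spending.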


Table 1 Data used in this work

No. | Data | Source | Date Range | Description
1 | Total arrivals | SingStat [12] | 01.2011–11.2020 | Total number of international visitors’ arrival into Singapore
2 | Tourism receipts | SingStat [12] | Q1 2007–Q2 2020 | Tourism expenditures in Singapore sum up by each quarter
3 | Air transport | SingStat [12] | 01.2011–11.2020 | Total number of aircrafts arrivals and departures from Singapore
4 | Water transport | SingStat [12] | 01.2011–11.2020 | Total number of vessels arrivals into Singapore
5 | Hotel statistics | Stan [13] | 01.2019–11.2020 | Average room rate, room revenue etc. of hotel in Singapore
6 | International tourism expenditures and receipts | DataBank [14] | 2000–2020 | Foreign tourism expenditures over domestic’s receipts across different countries
7 | Hotel booking | Kaggle [15] | 07.2015–08.2017 | Booking information for a city hotel and a resort hotel

Fig. 1 Total International Visitor Arrivals in Singapore 2011–2020


D. Chen et al.

Fig. 2 Total international visitor arrivals in Singapore 2019–2020

Fig. 3 Tourism receipts in Singapore 2011–2020

In contrast with the drastic shift in spending mix, the contribution by country has remained stable. Singapore Tourism Board [16] shows that Mainland China, Indonesia, and India were the top three sources of tourism receipts for Singapore both before and after the onset of COVID-19, contributing more than a third of the total. Figure 5 depicts a drop in both room rate and occupancy rate in Singapore at the onset of the COVID-19 outbreak. While the occupancy rate partially recovered from May to July 2020, the situation worsened again towards the end of the year. A more in-depth analysis by room type in Fig. 6 shows that economy and mid-tier rooms almost returned to their pre-crisis occupancy rate by August, while occupancy for upscale and luxury rooms only recovered to half of the original levels. This may relate to the impact of COVID-19 on disposable income and business travel budgets, where priority is given to what is essential over the good-to-haves. The impact of lower occupancy on room revenue was mitigated by an increase in the average room rate, as shown in Fig. 7.

Fig. 4 Tourism receipts by major components—comparison of Quarter 2

Fig. 5 Room rate and occupancy rate in Singapore

Fig. 6 Average occupancy rate in Singapore

Fig. 7 Room revenue in Singapore

Despite the above phenomena, COVID-19 travel restrictions have not dampened people’s desire to travel. As of December 2020, global travel sentiment had risen by 24% year-on-year. Demand for solutions to fulfil these needs remains even in a pandemic, and is possibly pent up due to travel restrictions. Where the desire to travel cannot be met through overseas holidays, there is anecdotal evidence that people are looking domestically for a replacement. This might explain the recovery of domestic seat capacity alongside travel sentiment in the second half of 2020 while international travel continued to stagnate. This prompts us to explore the feasibility of using domestic tourism to help the tourism sector restore its lost revenue. We define domestic tourism as travel for leisure where departure and arrival occur in the same country. For any country, we assess the financial feasibility of using domestic tourism as a revenue replacement by taking the ratio of tourism expenditure by that country’s population overseas (the potential replacement revenue source) to the tourism receipts of that country (the revenue lost due to the absence of international travel). We call this the “replacement ratio”.

Figure 8 shows the replacement ratios of selected countries. A replacement ratio above 1, as seen for Singapore and Brazil, indicates that the population could have channeled the money it spends on overseas trips into exploring its own country and supporting the local tourism sector. This makes domestic tourism a financially viable solution. In contrast, domestic tourism may not be effective for countries with replacement ratios well below 1, such as Thailand, New Zealand, Cambodia and Egypt.
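As a minimal sketch of the replacement ratio defined above, the calculation can be written in a few lines of Python. The function names, the threshold argument, and the figures for the two placeholder countries are illustrative assumptions; they are not values from the datasets in Table 1.

```python
def replacement_ratio(outbound_expenditure, inbound_receipts):
    """Residents' overseas tourism spending divided by the country's tourism receipts."""
    if inbound_receipts <= 0:
        raise ValueError("inbound_receipts must be positive")
    return outbound_expenditure / inbound_receipts


def domestic_tourism_outlook(ratio, threshold=1.0):
    """Label a country by whether its ratio exceeds the threshold (1 in the text)."""
    return "financially viable" if ratio > threshold else "likely insufficient"


# Hypothetical figures in US$ billion, for illustration only.
countries = {
    "Country A": {"outbound": 30.0, "inbound": 20.0},
    "Country B": {"outbound": 5.0, "inbound": 25.0},
}

for name, flows in countries.items():
    ratio = replacement_ratio(flows["outbound"], flows["inbound"])
    print(f"{name}: ratio = {ratio:.2f} ({domestic_tourism_outlook(ratio)})")
```

A hard cut-off at 1 is a simplification: a ratio slightly above 1 only covers the lost receipts if residents redirect nearly all of their overseas spending to domestic trips.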


Fig. 8 Replacement ratio chart

Figure 9 further explores why some countries have a good or poor replacement ratio. The scatterplot shows each country’s tourism expenditure against its tourism receipts. Countries with a replacement ratio of 1 are colored white. Countries with ratios above and below 1 are colored green and red respectively, with the strength of the color showing how far a country’s ratio is from 1. On the left are small countries like Nauru, Kiribati, Burundi, Eswatini and Gabon that have not really developed their tourism industry (i.e., low receipts). These countries are not the focus of our study. On the bottom right are countries where tourism makes up a large part of the economy and is often the biggest source of foreign exchange. They include exotic island destinations like Samoa, Fiji and New Zealand, and those with rich history and culture like Thailand, Egypt and Cambodia. Many of these countries do not rank high in per capita GDP, and their population’s disposable income will take a hit when foreign tourists stop coming. These are the countries where domestic tourism may not work so well, and they will have to find ways to temporarily redeploy manpower to other sectors of the economy. In contrast, countries in the top right are in better shape. They tend to be either developed economies like Singapore, the UK and Germany with high disposable income, or countries like China and the US with a large land mass and population to sustain a robust domestic economy. These are the countries where domestic tourism is likely to work well.


Fig. 9 Replacement ratio scatterplot

Proposed Solution: Analytics-Enabled Domestic Tourism Model

In this section, we build on the earlier finding that domestic tourism can be a viable replacement for foreign tourist revenues, and discuss how countries may achieve a vibrant domestic tourism model using analytics. For countries such as China and Germany [17] with traditionally substantial domestic tourism, these suggestions will help sharpen their offerings to their local population. For countries that do not have established domestic tourism infrastructure, these suggestions should help them leverage the latest developments in analytics and hasten their transition to a digital economy. Fundamentally, the model needs to consider both the supply side and the demand side of the market. By supply side, we refer to companies and workers providing services or products to the tourism sector, including airlines, tour agencies, hotels, and restaurants. By demand side, we refer to people who are keen to travel and tour.


It begins with identifying exactly which components of the value chain are suffering a revenue shortfall and by how much. The descriptive analysis shown previously, on which type of tourism spending saw the largest dip, can be a starting point, but it needs to be much more granular to enable a more targeted design of recovery measures. The analysis should result in a high-level target for the replacement revenue to be created through domestic tourism, at a detailed tourism sub-sector level. For firms that structurally cannot be helped by domestic tourism, for example international airlines, the analysis should yield a target number of people to be redeployed into other sectors.

Recommendation systems are a category of Machine Learning [18], based on the principle of transforming data into information by utilizing both historical and current data to anticipate the future [19]. From Netflix to online tour agents like Expedia, automated suggestions based on consumer data serve well to boost business, upsell, and keep loyal customers coming back for more. In the tourism context, this means providing tourists with smart itinerary recommendations [20], personalized schedule recommendations [21], Point of Interest (POI) recommendations [22], etc. The earlier gap analysis can be used as a starting point, providing an inventory of POIs, accommodations, eateries, shopping malls, etc. that will form the building blocks of domestic tourism itineraries. Some initial itinerary designs are presented in a consolidated local tourism application. Gamification, such as getting users to “check in” when they reach a POI, and tracking how users click and scroll through the app’s recommendations are crucial. From the users’ perspective, tracking user preference helps to make future recommendations more relevant, resulting in a better user experience. This may take the form of unsupervised learning often seen in market segmentation exercises.
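To make the idea of preference-based POI recommendation concrete, the following is a minimal content-based sketch, assuming each POI in the inventory carries a handful of descriptive tags and that the app records check-ins. The POI identifiers, tags, and function names are hypothetical, and this is not the method of the systems cited in [20–22]; it only illustrates how tracked check-ins could feed back into rankings.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two tag-count vectors stored as Counters."""
    dot = sum(a[tag] * b[tag] for tag in a if tag in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(checkins, catalog, top_n=3):
    """Rank POIs the user has not visited by similarity to the tags of visited POIs."""
    profile = Counter(tag for poi in checkins for tag in catalog[poi])
    scores = {
        poi: cosine(profile, Counter(tags))
        for poi, tags in catalog.items()
        if poi not in checkins
    }
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda poi: (-scores[poi], poi))[:top_n]


# Hypothetical POI inventory: identifier -> descriptive tags.
catalog = {
    "poi_garden": ["nature", "outdoor"],
    "poi_reserve": ["nature", "outdoor", "wildlife"],
    "poi_mall": ["shopping", "indoor"],
    "poi_museum": ["culture", "indoor"],
}

print(recommend(["poi_garden"], catalog, top_n=2))
```

In practice the tag profile could be replaced by learned user segments, which is closer to the unsupervised approach mentioned in the text.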
From the app developer’s perspective, data collected for different application layouts can support “A/B testing” to see how the layout can be improved to nudge users towards the tourism service providers that need the most help. It will also inform improvements to itinerary design and give service providers feedback about what is popular among tourists, so that they can adjust their offerings.

In this time of crisis, every dollar of revenue gained and every reduction in wastage has a significant impact on the cash flow of a business. We first take the hotel industry as an example. After a hotel has deployed resources to prepare its rooms and restaurants in anticipation of a certain booking volume, its viability is hurt if it faces many no-shows. It would be useful to predict impending booking cancellations, so that the hotel can pre-emptively reach out to the guest to try to retain the booking, release unused rooms to other guests, and avoid wasting resources preparing rooms and food for a guest who is not going to arrive. We developed a prototype hotel room cancellation prediction model using a hotel booking dataset sourced from Kaggle [15]. It contains attributes describing each hotel booking and whether or not the guest eventually cancelled the booking. We frame cancellation prediction as a classification problem. We conducted exploratory data analysis, cleaned the dataset of missing values, made the necessary attribute transformations (e.g. encoding of categorical variables), split the data into training and testing sets, and trained eight different classifier types to find out which yields the best performance. Figure 10 shows the important features according to the classifier and Fig. 11 displays the results of the model output. For this dataset, booking lead time, the guest’s home country, and whether a deposit has been placed are the three most important features affecting the likelihood of cancellation. With that knowledge, a hotel may choose to follow up closely with guests from “higher-risk countries” or those who made their bookings long ago, or to demand a deposit from the guest. For this sample dataset, the Random Forest classifier performed best in terms of both accuracy and the area under the Receiver Operating Characteristic (ROC) curve (AUC).

Fig. 10 Feature importance of hotel prediction

Fig. 11 Summary of prediction model

Analytics also offers many opportunities to improve customer experience and make domestic travel fuss-free. According to the HubSpot research report [23], 71% of people are already using chatbots to get quick resolution of their queries. Chatbots use text classification and Natural Language Processing (NLP) techniques, trained with thousands of conversation logs, to understand what type of question requires what sort of answer [24]. Chatbots’ ability to operate 24/7 offers travelers great convenience and helps companies reduce staff costs. Once a trip is booked, automatic reminders can be set prior to arrival. Recommendation systems can be applied to promote nearby recreation venues and transportation facilities at the destination. Analytics can also be applied at hotel check-in. More specifically, by using computer vision with deep learning, a facial recognition machine can learn the customer’s ID photo during the check-in and check-out processes [25]. Tourists will not need to wait to be checked in by hotel staff, which in turn increases efficiency. This also minimizes contact between people and reduces the risk of pandemic spread, an appealing proposition to health-conscious travelers.

UNWTO’s [1] extended scenarios for 2021–2024 point to a rebound in international tourism by the second half of 2021. Nevertheless, recovery to 2019 levels in terms of international arrivals could take 2½ to 4 years [26]. Domestic travel, where feasible, is therefore important in preventing permanent damage to each country’s tourism capacity in the interim. Some success is already seen in markets like the United States and France. Markets such as China and Russia even saw domestic air travel mostly recover to pre-COVID levels.
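Returning to the cancellation classifiers compared in Fig. 11, the two selection criteria mentioned, accuracy and ROC AUC, can be illustrated with a small self-contained sketch. The labels and scores below are made-up toy values (1 = cancelled), the function names are our own, and the 0.5 decision threshold is an assumption; the chapter does not state how the metrics were computed.

```python
def roc_auc(labels, scores):
    """Chance that a random cancelled booking is scored above a random kept one.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Share of bookings classified correctly at the given score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy example: true outcomes and a model's predicted cancellation probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print("AUC:", roc_auc(labels, scores))
print("Accuracy:", accuracy(labels, scores))
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason to report both when classes are imbalanced.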

References

1. UNWTO (2021) Impact assessment of the COVID-19 outbreak on international tourism. [Online]. https://www.unwto.org/impact-assessment-of-the-covid-19-outbreak-on-international-tourism
2. Prideaux B, Laws E, Faulkner B (2003) Events in Indonesia: exploring the limits to formal tourism trends forecasting methods in complex crisis situations. Tour Manage 24:475–487
3. Blake A, Thea SM (2003) Tourism crisis management: US response to September 11. Ann Tour Res 30:813–832
4. Ioannides D, Apostolopoulos Y (1999) Political instability, war, and tourism in Cyprus: effects, management, and prospects for recovery. J Travel Res 38:51–56
5. Huang JH, Min JCH (2002) Earthquake devastation and recovery in tourism: the Taiwan case. Tour Manage 23:145–154
6. Cartwright R (2000) Reducing the health risks associated with travel. Tour Econ 6:159–167
7. Bar On RR (2001) Middle East and the Mediterranean to 1999, and preliminary worldwide estimates for 2000. Tour Econ 7(1):89–103
8. Saunders M, Lewis P, Thornhill A (2009) Research methods for business students, 5th edn. Pearson Education Limited, England
9. Kuckartz U (2014) Qualitative text analysis: a guide to methods, practice and using software. Sage, London
10. Thomas M (2003) Blending qualitative and quantitative research methods in theses and dissertations. Crown Press Inc., California
11. WTTC. WTTC Now Estimates over 100 Million Jobs Losses in the Travel & Tourism Sector and Alerts G20
12. Singapore Department of Statistics (2021) Singapore Department of Statistics. [Online]. https://www.tablebuilder.singstat.gov.sg/publicfacing/backToMainMenu.action
13. Stan (2021) Tourism Statistics. [Online]. https://stan.stb.gov.sg/content/stan/en/tourism-statistics.html
14. databank.worldbank.org (n.d.) World Development Indicators | DataBank. [Online]. https://databank.worldbank.org/reports.aspx?source=2&series=ST.INT.ARVL
15. Kaggle (2021) Hotel booking demand. [Online]. https://www.kaggle.com/jessemostipak/hotel-booking-demand
16. https://en.wikipedia.org/wiki/Timeline_of_the_COVID-19_pandemic_in_Singapore
17. UNWTO (2021) UNWTO Tourism Recovery Tracker. [Online]. https://www.unwto.org/unwto-tourism-recovery-tracker
18. SiteMinder (2021) Exploring the domestic tourism market post-Coronavirus. [Online]. https://www.siteminder.com/r/marketing/hotel-digital-marketing/coronavirus-covid-19-recovery-hotels-domestic-tourism/
19. Melville P, Sindhwani V (2010) Recommender systems. In: Encyclopedia of Machine Learning, vol 1, pp 829–838
20. Negre E (2015) Information and recommender systems. John Wiley & Sons
21. Yoon H, Zheng Y, Xie X, Woo W (2010, October) Smart itinerary recommendation based on user-generated GPS trajectories. In: International conference on ubiquitous intelligence and computing. Springer, Berlin, Heidelberg, pp 19–34
22. Chiang HS, Huang TC (2015) User-adapted travel planning system for personalized schedule recommendation. Inf Fusion 21:3–17
23. Baral R, Li T (2016, September) Maps: a multi aspect personalized POI recommender system. In: Proceedings of the 10th ACM conference on recommender systems, pp 281–284
24. HubSpot (2021) Why chatbots are the future of marketing: the battle of the bots. [Online]. https://www.hubspot.com/stories/chatbot-marketing-future
25. Maruti Techlabs (2021) What are the inner workings of a chatbot? Chatbots Magazine. [Online]. https://chatbotsmagazine.com/what-is-the-working-of-a-chatbot-e99e6996f51c
26. Machine Learning Mastery (2021) A gentle introduction to deep learning for face recognition. [Online]. https://machinelearningmastery.com/introduction-to-deep-learning-for-face-recognition/

Tourism Analytics with Price and Room Booking Simulation

Yile Cai, Ke Duan, Congcong Peng, Xiaodan Shao, Yichu Sun, Jiayi Wang, and Linghao Zeng

Since the start of the COVID-19 pandemic at the end of 2019, there have been over 100 million confirmed coronavirus cases and over 2 million deaths around the world. According to Haydon and Kumar [1], the industries most impacted by COVID-19 are the airline, restaurant, oil and gas, auto parts and equipment, and leisure facilities industries. The tourism sector, which is highly correlated with the airline, restaurant, and leisure facilities industries, is therefore one of the most impacted sectors during the pandemic. Tourism plays an important part in the development of Singapore, a popular business and leisure destination attracting visitors from all over the world. Prior to the coronavirus pandemic, more than 4% of Singapore’s gross domestic product (GDP) was derived from the tourism sector. However, with strict policies on visitor arrivals, the implementation of safe management measures, and other relevant factors, arrivals decreased rapidly and tourism receipts in the first half of 2020 declined by 68.2%. In addition, according to the Tourism Sector Performance 1H 2020 Report [2, 3], the hotel industry suffered a decline of 58.3% in revenue (2020). Although the government has released supportive policies and provided support packages to help enterprises’ operations and stabilize employment, the growth of the tourism sector is still limited.

In this work, we adopt time series techniques, data visualization and analytics, as well as Monte Carlo simulation to find out the factors that influence tourism sector performance. We explore how the airline and hotel industries were affected, how the government can resume tourism activities and help business owners continue their business, and how hotel operators can maximize their revenue.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_13.

Y. Cai · K. Duan · C. Peng · X. Shao (B) · Y. Sun · J. Wang · L. Zeng
Nanyang Business School, Nanyang Technological University, 52 Nanyang Avenue, Singapore 639798, Singapore
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_13


Y. Cai et al.

Singapore is a hub for international air and sea transportation and a destination for international travel. Its strong connections with other countries gave it an advantage in tourism before the pandemic. The confirmed COVID-19 cases in the first wave during late January 2020 were imported from Wuhan, China. The Singapore Government reacted quickly to COVID-19 and implemented strict policies on border control as well as safe management measures. At the same time, some countries also imposed travel restrictions on Singapore. The impact of COVID-19 lowered Singapore’s GDP in 2020 by 5.8% compared with the previous year. There was a 12.6% decrease in the service sector, including accommodation and food services, which declined due to weak tourism demand [4]. Enterprises in the tourism sector, especially local businesses, suffered from COVID-19 and had to cut positions or freeze hiring to survive, leading to higher unemployment.

The sharp decrease in the number of international tourists was mainly caused by stricter border controls. On January 29, 2020, Singapore started restricting visitors from Hubei, China from entering. On February 1, 2020, all visitors who had been to China in the past 14 days were barred from entering Singapore. From March 17, 2020 onwards, short-term visits to Singapore were denied. Apart from government policies, market forces also played an important role in the lower tourism receipts. The sharp decline in demand for travel and entertainment caused an oversupply of hotel rooms, which lowered room prices. As a result, the revenue of the hotel industry decreased due to both lower prices and fewer customers.

This work aims to understand the impacts of COVID-19 on Singapore tourism and provide hotel enterprises with analytics-based suggestions. We set out to answer the following questions:

· How did the Singapore tourism sector perform before the COVID-19 pandemic? What is the impact on tourism compared to past performance?
· What are the factors that influence the tourism sector? What insights can we draw from the impacts of COVID-19 on tourism?
· How did the hotel industry perform? What are the reasons for the change in revenue? Which type of room is more vulnerable? How can hotel enterprises maximize their profit?
· What long-term and short-term suggestions could we provide to hotel operators to help with their recovery and boost future revenues?

We apply time series techniques to the question of past performance, comparing the prediction based on past performance with the actual values to examine the impact of the event. Data visualization and analytics techniques are used to draw insights on the impacts of COVID-19 on tourism. We built a Monte Carlo simulation to find the ideal price points that maximize revenue; simulation also gives greater flexibility for scenario analysis.

The development of tourism has long been regarded as an essential topic for discussion in different countries, as it might have a significant impact on the formation of general local culture and economic power. Under this circumstance [5], thorough and constructive discussion on the application of technologies and national strategies in tourism needs to be conducted for sustainable tourism, cultural tourism, and related public policy [5, 6]. The tourism sector in Singapore achieved record highs in International Visitor Arrivals (IVA) and Tourism Receipts (TR) in recent years [2]. The strategies for the growth of Singapore tourism have been well discussed. Henderson [7] showed that destination branding in Singapore is growing in popularity among marketers and revealed the factors underlying this trend. The synergetic effects between modern hospitality and international tourism in Singapore were discussed by Lee [8], who found a long-run unidirectional causality from health care to international tourism. In addition, Savage et al. [9] confirmed the use of thematic zones in Singapore as a strategy to achieve its ambition of becoming a tourism capital of the world. Systematic and scientific tourism development policies earned Singapore admiration, but everything has changed since COVID-19. Farzanegan et al. [10] confirmed through regression that countries exposed to high flows of international tourism are more prone to cases and deaths caused by the COVID-19 outbreak. This trend has also been observed by Yang et al. [11] using a self-polished tourism monitoring index. Nearly 60,000 COVID-19 cases in Singapore and the consequent health-related policies have restricted the previously growing movement of tourists. Tourism receipts in the first half of 2020 declined by 68.2% compared to 2019 [12], a decline observed across all major components. Research on the relevant issues should therefore be prioritized.

Analytics Approach on Tourism

We first focus on the number of passengers at Changi Airport, as this reflects tourism in the region to a certain extent. Changi Airport is the only commercial airport in Singapore. In this section, we apply time series techniques to decompose and analyze the number of passengers at Changi Airport over the ten years up to 2020 [13]. We then use an ARIMA model to forecast the number of passengers for the five months starting from November 2019. The forecast represents the passenger trend assuming no outbreak of COVID-19, which we compare with the actual number of departures and arrivals (with COVID-19). We can then assess the impact of the epidemic on the number of people entering and leaving Singapore. Figure 1(a) shows the number of passengers at Changi Airport from 2010 to 2020 and (b) shows how this series is decomposed into time series components: trend, seasonal and residual. Passenger numbers rise steadily over the long term. According to data from the International Air Transport Association (IATA), the aviation industry grew by an average of 5% annually over the past 30 years. Air traffic is an important indicator of the flow of capital and people throughout the Asia Pacific region. Continued growth is expected as wealth grows, enabling airlines to serve more countries globally. The decomposition graphs in Fig. 1(b) show a strong upward trend and seasonality between 2010 and 2019, the period before COVID-19.

Next, we use the ARIMA model to predict monthly total passengers for the five months from November 2019 to March 2020, as shown in Fig. 2. The orange line represents the predicted values, which peak in December 2019 following the earlier trend. Table 1 compares predicted and actual passengers. We compute the difference between the two to understand how the predicted values deviate from the actual values. The difference widens as COVID-19 worsens. The data is plotted in Fig. 3, with the blue line showing the actual values with the epidemic and the orange line showing the predicted values without the epidemic. Comparing the two, actual passenger numbers fell progressively further below the prediction, reaching a shortfall of nearly 70% in March 2020. The impact on the total number of passengers in and out of Singapore is immense.

We next look at the overall effect on Singapore tourism based on the number of arrivals and departures during COVID-19. Figure 4 shows how the different policies responding to COVID-19 affected arrival and departure rates. At the end of January, the non-entry policy for China was implemented. Singapore [14] receives the greatest number of tourists from China, as shown in Fig. 5, so this caused a drastic decrease in the overall number of visitors to Singapore. As confirmed new cases grew in mid-March, additional policies were released; for instance, short-term visits were not allowed and only work permit holders from some specific regions could enter Singapore. This further cut the number of arrivals to an unprecedentedly low level. After new cases started to dwindle, the border was reopened to China, Australia, etc., and the number of visitors by air recovered slowly. Figure 6a shows the number of visitors by mode of transportation: air, land and sea. The yellow bars denote the number of COVID-19 cases in the respective period. The number of visitors by air, land, and sea went down sharply after the restrictive policies released in the first quarter and went back up starting from August 2020. This is consistent with the situation shown in Fig. 6b. Aviation accounts for the lion’s share of travel into this island country. Residents of Southeast Asian countries, who come in by sea or land, contributed to the gradual recovery of travel in Singapore.

We now look at the profile of travelers in the past periods. Figure 7 shows travelers’ place of origin in 2020. The top graph shows that Singapore receives the greatest number of visitors from China and Indonesia. The bottom plots show the gender and age distribution of travelers. Male visitors aged 25 to 44 from Southeast Asia form the largest group entering Singapore. We continue the study by examining the traveler profile before this period. Figure 8 shows the traveler profile in 2019, before the outbreak of COVID-19. We see more tourists from European countries like Germany, Italy, and the UK. This explains the sharp drop in the number of visitors to


Fig. 1 Number of passengers and decomposition: (a) number of passengers in Changi Airport from 2010–2020; (b) decomposition of past performance
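The decomposition into trend, seasonal and residual components shown in Fig. 1(b) can be sketched with a classical additive decomposition. This is a standard-library illustration only: the chapter does not state which decomposition routine was used, the function name is ours, and the monthly series below is synthetic rather than Changi Airport data. It also does not implement the ARIMA forecast.

```python
def decompose_additive(series, period=12):
    """Split a series into trend, seasonal and residual parts (additive model)."""
    if period % 2 != 0:
        raise ValueError("this sketch handles even periods only")
    n, half = len(series), period // 2
    if n < 2 * period:
        raise ValueError("need at least two full periods of data")

    # Trend: centered moving average; the two end points get half weight.
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        weighted = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = weighted / period

    # Seasonal: mean detrended value per position in the cycle, centered on zero.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period
    seasonal = [means[t % period] - offset for t in range(n)]

    # Residual: what is left after removing trend and seasonality.
    residual = [
        series[t] - trend[t] - seasonal[t] if trend[t] is not None else None
        for t in range(n)
    ]
    return trend, seasonal, residual


# Synthetic monthly series: linear growth plus a repeating zero-mean pattern.
pattern = [-3, -2, -1, 0, 1, 2, 3, 2, 1, 0, -1, -2]
synthetic = [100 + 0.5 * t + pattern[t % 12] for t in range(48)]
trend, seasonal, residual = decompose_additive(synthetic)
print(trend[6], seasonal[:12])
```

The trend is undefined for the first and last six months because the centered window does not fit there, which is why decomposition plots typically show a shortened trend line.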


Fig. 2 Prediction of next 5 months’ numbers of passengers

Table 1 Comparison table for prediction and actual passengers

Date (month) | Prediction (*1e6) | Actual passengers (*1e6) | Difference (*1e6)
-------------|-------------------|--------------------------|------------------
2019 Nov | 5.95 | 5.72 | −0.23
2019 Dec | 6.13 | 6.41 | 0.28
2020 Jan | 6.00 | 5.95 | −0.05
2020 Feb | 5.74 | 3.45 | −2.29
2020 Mar | 5.61 | 1.65 | −3.96

Fig. 3 Difference between actual and prediction data
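The gap between predicted and actual passengers in Table 1 can be reproduced directly from the tabulated values. The short sketch below recomputes the difference column and expresses it as a percentage of the prediction; the variable and function names are ours, and the percentage column is a derived quantity rather than one reported in the table.

```python
# (month, predicted passengers, actual passengers), in millions, from Table 1.
table_1 = [
    ("2019 Nov", 5.95, 5.72),
    ("2019 Dec", 6.13, 6.41),
    ("2020 Jan", 6.00, 5.95),
    ("2020 Feb", 5.74, 3.45),
    ("2020 Mar", 5.61, 1.65),
]


def shortfall(predicted, actual):
    """Return (actual - predicted) and the same gap as a percentage of predicted."""
    difference = actual - predicted
    return difference, 100.0 * difference / predicted


rows = {month: shortfall(pred, act) for month, pred, act in table_1}

for month, (difference, percent) in rows.items():
    print(f"{month}: {difference:+.2f} million ({percent:+.1f}%)")
```

Expressed this way, the March 2020 gap is roughly 70% of the predicted value, in line with the figure quoted in the text.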


Fig. 4 Arrival and departure trend in COVID-19 under various policies

Fig. 5 Numbers of visitors from different countries

Singapore, since those European countries were more severely affected by COVID-19 in the second half of 2020 and their residents were not allowed to enter Singapore. Travelers from Southeast Asia and China arrive more often for business or pre-employment visits. Residents of Malaysia, the Philippines and Indonesia are more likely to visit friends and relatives in Singapore. Visitors travel individually, with their spouse, or with their children. Their accommodation preference shifted from staying at the residence of friends or relatives to staying in hotels, which affects hotel revenue, an integral part of tourism.

Fig. 6 Number of visitors by air, land and sea: (a) visitors by different mode for first half of 2020; (b) visitors by different mode in second half of 2020

We next look at how the onset of COVID-19 affected the performance of the hotel industry. Figure 9 depicts the average occupancy rate and room rate in 2020. The outbreak of COVID-19 had a significant negative impact on the overall revenue of Singapore’s hotel industry. From February onwards, the revenue of the hotel industry began to decline sharply and reached its lowest level in May. After that, revenue rebounded to a certain extent, but the rate of recovery was limited, and overall income was still low compared with January. Room revenue is mainly determined by the unit price of the room and the occupancy rate, which are displayed as two lines in the same figure. We can observe that the decrease in income was mainly caused by the substantial drop in the average room price, as hotels cut prices to offer attractive room rates. This was later partly compensated by a higher occupancy rate, which picked up from April onwards.
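Because revenue per available room is the product of the average room rate and the occupancy rate, a change in revenue can be split into a price effect and an occupancy effect by taking logarithms. The sketch below illustrates this under stated assumptions: the function names are ours, and the room rates and occupancy figures are hypothetical, not readings from Fig. 9.

```python
import math


def revenue_per_available_room(avg_room_rate, occupancy):
    """Average room rate multiplied by the occupancy rate (a fraction in 0..1)."""
    return avg_room_rate * occupancy


def decompose_revenue_change(rate_before, occ_before, rate_after, occ_after):
    """Split the log change in revenue per room into price and occupancy effects."""
    before = revenue_per_available_room(rate_before, occ_before)
    after = revenue_per_available_room(rate_after, occ_after)
    return {
        "total": math.log(after / before),
        "price_effect": math.log(rate_after / rate_before),
        "occupancy_effect": math.log(occ_after / occ_before),
    }


# Hypothetical month-on-month change, for illustration only.
effects = decompose_revenue_change(
    rate_before=200.0, occ_before=0.80, rate_after=120.0, occ_after=0.60
)
for name, value in effects.items():
    print(f"{name}: {value:+.3f}")
```

The two effects add up to the total log change, so comparing their sizes indicates whether price or occupancy drove most of a decline.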


Fig. 7 Travelers’ profile 1 during COVID-19

Fig. 8 Travelers’ profile 2 before COVID-19

We further this discussion to examine other reasons behind the observation mentioned above. Figure 10 depicts the picture of hotel supply and demand versus the average room rate. We can see that the blue bar represents available rooms for letting and the orange bar denotes gross letting. From February onwards, the gross letting


dropped by almost a half. However, the number of available rooms stayed nearly unchanged, causing an imbalance between supply and demand and leading to a huge drop in the average room price. This shows another interesting phenomenon: a lagged price reaction. Although the supply and demand imbalance occurred in February, the price only began to change in March. In a nutshell, we can reasonably believe that the spread of COVID-19 resulted in a huge drop in demand in January, which directly led to a decrease in the average room price and finally reduced overall revenue for the hotel industry. We next perform a room size analysis to understand which type of room is more vulnerable to COVID-19. Figure 11 visualizes the revenue by room size. We can observe that the revenue per room for large rooms suffered the sharpest decline during COVID-19, and its rate of recovery was relatively low. Medium rooms show a similar pattern but recover slightly faster than large rooms. Small rooms have a smoother trend in this difficult time. We break down the large-room revenue by room price, as shown in Fig. 12. We can see from the figure that the drop in average room price was the main factor. Lower room rates result in lower total

Fig. 9 Average occupancy rate and average room price

Fig. 10 Hotel supply and demand vs average room price


revenue. The occupancy rate was not significantly impacted despite the pandemic. This might show that hoteliers reacted strongly to the pandemic, worrying about the impact on revenue and slashing prices aggressively to survive. Apart from room size, we also investigate rooms by tier. Figure 13 shows the revenue for different tiers of rooms, such as luxury, upscale, mid-tier, and economy rooms. We find from the figure that the revenue per room for luxury rooms dropped significantly during COVID-19. In January, the revenue per room for luxury rooms was $415.60. Several months into the pandemic, it had declined to an extremely low level. Even though it bounced back slightly from June, the revenue per room in November was only around one third of its January level. We further explore the breakdown of room price and occupancy rate for luxury rooms, as shown in Fig. 14. We find that the decrease in revenue per room for luxury rooms is mainly due to the reduction in average room price.

Fig. 11 Revenue by different room size

Fig. 12 Large room revenue break down


Fig. 13 Revenue by different room type

Fig. 14 Luxury room revenue break down

Price, Room Booking and Revenue Simulation

In this section, we simulate prices in order to find the ideal price point. Simulation gives us greater flexibility in dealing with uncertainty than approaches such as decision trees and scenario analysis. We follow the steps of the simulation algorithm by Aswath Damodaran [15], as listed below:

1. Determine the relevant probabilistic variables.
2. Define probability distributions for these variables: Damodaran lists three ways to define a probability distribution.
· Historical data: In our case, we believe the historical prices and historical bookings are relevant for our simulation study.
· Cross-sectional data.
· Statistical distribution and parameters: For most of the variables that we try to forecast, the historical and cross-sectional data will be insufficient or unreliable. In our case, we assume that all the distributions are close to normal but with fat tails (leptokurtic) and right skew. Yet, as Damodaran noted, “picking the right distribution and the parameters remain difficult”. First, few inputs that we see in practice meet the assumptions of a statistical distribution. Consequently, we assume that the statistical distributions are close enough to the real distributions. Secondly, some parameters still need to be estimated, and we estimate them following the assumed distribution. This adds another layer of complexity and creates more uncertainty in view of the different scenarios that can occur.
3. Check for correlation across variables: Before jumping into the simulation, it is imperative to check for correlations across the variables. Our simulation covers prices and the number of rooms booked, both of which are critical in determining the revenue. Yet it is also extremely likely that these two values are correlated with each other; for instance, a lower number of booked rooms may be accompanied by lower room prices. When there is a strong correlation, positive or negative, across inputs, we have two choices. One is to pick only one of the two inputs to vary: it makes sense to focus on the input that has a larger impact on value. The other is to build the correlation explicitly into the simulation. This is what we have designed for our simulation: we modeled supply and demand by implicitly modeling both price and number of rooms booked to be right skewed with a skewness of 0.9. The intent is that lower prices correspond to lower numbers of rooms booked.
4. Run the simulation: For the first simulation, we draw one outcome from each distribution and compute the values based upon those outcomes.

This process can be repeated as many times as desired, though the marginal contribution of each additional simulation drops off as the number of simulations increases. The number of simulations to run is determined by the following:

· Number of probabilistic inputs: The larger the number of inputs that have probability distributions attached, the greater the required number of simulations will be.
· Characteristics of probability distributions: The greater the diversity of distributions in the analysis, the larger the number of required simulations will be.
· Range of outcomes: The greater the potential range of outcomes on each input, the greater the number of simulations.

According to Damodaran [15], there have generally been two impediments to good simulations. The first is informational: estimating distributions of values for each input into a valuation is difficult. The second is computational: until the advent of personal computers, simulations tended to be too time and resource intensive for the typical analysis.
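The diminishing contribution of additional simulation runs can be illustrated with a minimal sketch. It assumes a single normally distributed input (a price with mean 300 and standard deviation 40, chosen only for illustration) and reports the standard error of the simulated mean, which shrinks roughly with the square root of the number of draws. The function name and the seed are hypothetical.

```python
import random
import statistics


def standard_error_of_mean(n_draws: int, mean: float, std: float, seed: int = 0) -> float:
    """Draw n_draws normal samples and return the standard error of their mean."""
    rng = random.Random(seed)
    draws = [rng.gauss(mean, std) for _ in range(n_draws)]
    return statistics.stdev(draws) / n_draws ** 0.5


# Each tenfold increase in draws cuts the standard error by only about a factor of 3.2.
errors = {
    n: standard_error_of_mean(n, mean=300.0, std=40.0)
    for n in (100, 1_000, 10_000, 100_000)
}
```

Going from 100 to 1,000 draws improves the precision far more in absolute terms than going from 10,000 to 100,000, which is why very large simulation counts bring limited extra benefit.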


Fig. 15 Steps in price simulation

We attempt to simulate two variables: price and number of rooms booked. We can then generate revenues based on these two simulated variables. We first look at the algorithm for generating the price. Figure 15 illustrates the steps for price simulation. The distribution of price is first assumed, and price samples are then drawn from it. We place some constraints on the price, and from the constrained samples we can then infer the prices. We apply several assumptions to generate prices:

1. The price follows a normal distribution with a mean close to the historical mean (the mean price of the past two years) and a standard deviation mimicking the historical standard deviation.
2. The price is right skewed, in the same way as the number of booked rooms. The reasoning is that, under a right-skewed distribution, a sample is more likely to fall below the mean than above it. Since the number of rooms booked follows a similar distribution, a lower price is taken to correspond to fewer rooms being booked. This approach allows us to model the correlation.
3. The price distribution is also leptokurtic, with a kurtosis of 3.5. This allows the distribution to approximate the actual scenario more closely.
4. We also apply a constraint that requires the price to be positive.

After applying the assumptions, we generate a number of prices. In our case, we set the sample size to 130,000. We now model the number of rooms booked. Figure 16 illustrates how the room booking can be simulated. As with the pricing simulation, the number of rooms booked follows a set of assumptions:

1. The number of rooms booked is normally distributed, with its mean closely resembling the historical mean and its standard deviation closely resembling the historical standard deviation.
2. The distribution is also right skewed, to simulate correlation with price.
3. The number of rooms booked must be positive and cannot exceed the hotel capacity, which is currently set to a maximum of 500.

We then generate 130,000 samples of the number of rooms booked.
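The sampling assumptions above can be sketched as follows. The chapter does not name the distribution family it used, so this sketch assumes a skew-normal distribution, which can match a target mean, standard deviation and skewness (up to about 0.995) but does not let the kurtosis be set separately; the kurtosis of 3.5 is therefore not reproduced. Constraints are applied by rejecting and redrawing samples, which is also an assumption, and the means and standard deviations are placeholder inputs borrowed from Scenario 1.

```python
import math
import random


def skew_normal_samples(mean, std, skewness, size, rng, low=0.0, high=math.inf):
    """Draw `size` skew-normal samples with the given mean, std and skewness.

    Samples outside (low, high] are rejected and redrawn (an assumed way of
    applying the constraints). Kurtosis is not controlled by this sketch.
    """
    if abs(skewness) >= 0.995:
        raise ValueError("a skew-normal distribution cannot reach this skewness")
    # Invert the skew-normal skewness formula to obtain the shape parameter delta.
    c = (4.0 - math.pi) / 2.0
    g = abs(skewness) ** (2.0 / 3.0)
    m2 = g / (g + c ** (2.0 / 3.0))            # squared mean of the standardised draw
    delta = math.copysign(math.sqrt(math.pi / 2.0 * m2), skewness)
    m = delta * math.sqrt(2.0 / math.pi)       # mean of the standardised draw
    scale = std / math.sqrt(1.0 - m * m)       # rescale so the result has the target std
    tail = math.sqrt(1.0 - delta * delta)
    samples = []
    while len(samples) < size:
        # Standard skew-normal draw: delta * |Z0| + sqrt(1 - delta^2) * Z1.
        z = delta * abs(rng.gauss(0.0, 1.0)) + tail * rng.gauss(0.0, 1.0)
        x = mean + scale * (z - m)
        if low < x <= high:
            samples.append(x)
    return samples


rng = random.Random(42)
SIZE = 130_000  # sample size used in the chapter
prices = skew_normal_samples(300.0, 40.0, 0.9, SIZE, rng)              # price must be positive
rooms = skew_normal_samples(250.0, 50.0, 0.9, SIZE, rng, high=500.0)   # capped at hotel capacity
```

Note that giving both variables the same skewness does not by itself link an individual price draw to an individual booking draw; any pairing rule between the two sample sets would be a further modeling choice.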


Fig. 16 Steps in room booking simulation

Fig. 17 Steps in revenue simulation

Lastly, we simulate the revenue using the steps depicted in Fig. 17. The revenue simulation is based on the two variables generated before: the total revenue is the price multiplied by the number of rooms booked. We can then choose the price that maximizes the revenue. We design three scenarios to understand how the simulation can be used to infer the best prices under different assumptions.
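The revenue step can be sketched in a few lines: each simulated price is multiplied by its paired number of rooms booked, and the pair with the highest revenue is kept. To stay short, this sketch draws from plain normal distributions and clamps values to the allowed range, both of which are simplifying assumptions rather than the chapter's procedure; the function name, seed and sample size are hypothetical.

```python
import random


def best_price_point(price_samples, room_samples):
    """Return (revenue, price, rooms) for the sample pair with the highest revenue."""
    return max(
        (price * rooms, price, rooms)
        for price, rooms in zip(price_samples, room_samples)
    )


rng = random.Random(7)
N = 10_000  # smaller than the chapter's 130,000 to keep the example quick

# Assumed inputs: price ~ N(300, 40) kept positive, rooms ~ N(250, 50) kept within 0..500.
prices = [max(1.0, rng.gauss(300.0, 40.0)) for _ in range(N)]
rooms = [min(500.0, max(0.0, rng.gauss(250.0, 50.0))) for _ in range(N)]

best_revenue, best_price, best_rooms = best_price_point(prices, rooms)
```

With several room tiers, the same idea applies after summing price times rooms across tiers for each simulated draw.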

Scenario 1

We assume that the hotel has two tiers of rooms available: standard rooms and luxury rooms. The two tiers differ in their prices and in the number of rooms available for booking. For the first scenario, we set the mean price of the standard room to 300 with a standard deviation of 40, and the mean price of the luxury room to 800 with a standard deviation of 80. We then generate 130,000 samples, assuming the price is normally distributed with a skewness of 0.9 and a kurtosis of 3.5, as shown in Fig. 18(a) for the standard room price and (b) for the luxury room price. For the simulation of room booking numbers, we assume the mean number for standard rooms to be 250 with a standard deviation of 50, and the mean number

Fig. 18 Simulated prices, room booking and revenue for scenario 1: (a) standard room price, (b) luxury room price, (c) revenue simulation, (d) standard room booking, (e) luxury room booking

of luxury rooms to be 100 with a standard deviation of 20. These can be observed in Fig. 18(d) and (e). We then generate the revenue using the algorithm described earlier; the revenue distribution histogram is shown in Fig. 18(c). From the generated samples, we find the ideal prices and numbers of rooms booked for this scenario. The maximum revenue is around $290,000. The optimized price is around $265 for standard rooms and around $730 for luxury rooms. The simulated revenue is based on optimized room bookings of 259 units for standard rooms and 104 units for luxury rooms.

Scenario 2

In the second scenario, we assume that there has been a significant drop in the number of rooms being booked. The mean price of the standard room is around 150 with a standard deviation of 25, and the mean price of the luxury room is 500 with a standard deviation of 25. The distributions for price, room booking, and revenue are shown in Fig. 19. Since the standard room booking has dropped considerably in this scenario, we assume that the mean number of standard rooms is 250 with a standard deviation of 50, and the mean number of luxury rooms is 100 with a standard deviation of 20. In this pessimistic scenario, we also apply some additional constraints:

1. The total number of rooms booked is less than 250, meaning the hotel’s booking rate is less than 50%.
2. The total number of standard room bookings is less than 200.
3. The total number of luxury room bookings is less than 20.

Fig. 19 Simulated prices, room booking and revenue for scenario 2: (a) standard room price, (b) luxury room price, (c) revenue simulation, (d) standard room booking, (e) luxury room booking

We then generate the revenues after all the constraints have been applied. From the generated revenue, we obtain the corresponding prices and numbers of rooms booked. The maximum revenue is as low as $84,126, much lower than in scenario 1. The price of the standard room is set at around $259.8, and the price of the luxury room at around $486. The simulated number of room bookings is 145 for standard rooms and 19 for luxury rooms.
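The three booking constraints of this scenario can be expressed as a simple filter over simulated draws, followed by a search for the highest revenue among the draws that remain. In the sketch below, the five draws are hand-written hypothetical values used only to show the mechanics; they are not simulation output from the chapter, and the function names are illustrative.

```python
def satisfies_scenario_2(standard_rooms: float, luxury_rooms: float) -> bool:
    """Apply the three booking constraints listed for scenario 2."""
    return (
        standard_rooms + luxury_rooms < 250   # total bookings below 50% of 500 rooms
        and standard_rooms < 200
        and luxury_rooms < 20
    )


def revenue(draw) -> float:
    """Total revenue of one draw: price x rooms, summed over both tiers."""
    standard_price, luxury_price, standard_rooms, luxury_rooms = draw
    return standard_price * standard_rooms + luxury_price * luxury_rooms


# Hypothetical draws: (standard price, luxury price, standard rooms, luxury rooms).
draws = [
    (150.0, 500.0, 180, 15),
    (160.0, 510.0, 210, 10),   # rejected: 210 standard rooms is not below 200
    (140.0, 490.0, 190, 25),   # rejected: 25 luxury rooms is not below 20
    (155.0, 505.0, 199, 19),
    (170.0, 520.0, 120, 18),
]

feasible = [d for d in draws if satisfies_scenario_2(d[2], d[3])]
best_draw = max(feasible, key=revenue)
best_revenue = revenue(best_draw)
```

In a full simulation the feasible set may be small, or even empty, when the constraints sit far from the assumed means, so checking how many draws survive the filter is a sensible first step.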

Scenario 3

The third scenario builds on the previous scenario but adds an additional type of room, called the quarantine room. This type of room is used by the government to quarantine international travelers. In our simulation, we set all the parameters to be the same as in scenario 2, except that we assume about 50% of the standard rooms are used for quarantine at a fixed price of $100 (assuming the hotel has little negotiating power with the government). We then find the maximum revenue and its corresponding parameters. In this scenario, the simulated revenue is $103,957, which is higher than that of the second scenario. The simulated price of the standard rooms is $456, and the price of the luxury rooms is $538. The number of room bookings is 348 units for standard rooms and 16 units for luxury rooms.


Conclusion

When we compare all the scenarios, it is clear that the booking limitations have an impact on revenue. Scenario 1 simulates the normal situation, whereas scenario 2 simulates the COVID-hit situation in which the hotel operator sticks to the original business model. Scenario 3 is the case in which the hotel operator adjusts to the COVID-hit context and restructures the revenue model to include quarantine rooms. When we compare hotels with quarantine arrangements and those without, it is also clear that those with quarantine arrangements receive more bookings than the others and therefore generate higher revenue.

Recommendation

The COVID-19 pandemic brought international travel to an abrupt halt, significantly affecting employment across the whole supply chain of the travel industry. The industry is labor-intensive in nature, and many people lost their jobs during this period. Some workers shifted to other sectors. The shortage of well-trained workers in the tourism sector further exacerbated the labor issue. This is an existing problem in the tourism industry and will also slow down the recovery of tourism. In order to mitigate the impacts during and after the COVID-19 pandemic, the government and enterprises should work hand in hand to help the industry recover by providing support such as financial relief and unemployment subsidies. As there is a considerable proportion of casual and self-employed workers in this industry, their livelihood is equally affected. The Singapore government has provided support of up to $700/month for 3 months for individuals on involuntary no-pay leave. Self-employed people can receive support of up to $500/month for 3 months. More support should be given for the sector to recover quickly. The prospects of the tourism industry remain uncertain. Complex travel restrictions and clearance procedures continue to evolve following virus outbreaks. Clear communication, well-designed information policies, and clarity of epidemiological standards are essential. Travelers may face prolonged travel restrictions, making it difficult to finalize travel schedules. Flexible cancellation and refund policies will play an important role in increasing the likelihood of travelers confirming a travel booking. As discussed earlier, in the short term, a series of strict shutdowns led to the rapid and sharp decline of international tourism in Singapore. Therefore, the most effective and direct remedy is the implementation of more flexible border restriction policies, especially for visitors from low-risk regions. Instead of withdrawing all border restrictions at once, this process should be attempted gradually, with mutual diplomatic understandings on entry permission and full coverage of COVID-19 swab tests for international tourists. Though international traffic in general suffered an unprecedented decline during COVID-19, the aviation industry generally enjoys higher revenue and higher fill-up rates


as compared to others. Hence, some preferential measures should be considered for international arrivals by air. In the short term, promotional bundles of a COVID-19 swab test and economy tickets from major airlines might be attractive, providing travelers with financial aid and the assurance of safe travel, with all travelers tested prior to boarding. Encouraging domestic tourism can be the first step to recovering the tourism industry. Support from both the government and attractions can help travel providers to discover potential tourist attractions and launch local tourism to meet the increasing demand arising from the potential lockdown. As local tourists are typically more sensitive to price, providing discounts is the most workable way to attract them and increase revenue. Tickets for renowned attractions (e.g., Sentosa Universal Studio) can be bundled with discounted nearby hotel rooms to make a local staycation package. An alternative revenue model would be to convert some existing rooms for quarantine to boost hotels’ revenue. This strategy should be adopted with precautions to dispel the worries of travelers who are on business or leisure trips. Firstly, hotels should physically separate quarantine rooms from normal rooms. Secondly, they should step up existing cleaning and hygiene work. Thirdly, they can adopt artificial intelligence techniques and simulation models to optimize the allocation of rooms between quarantine and normal use to maximize revenue. In order to increase room rates without compromising expected demand, hotels can provide add-on services such as massages and tours, which will generate other sources of income to enhance hotels’ total revenue. In the long run, the hotel industry should gear up its digitalization and information technology capacity to boost its efficiency in management and operation. COVID-19 has had a huge effect on people’s behavior, which tends to move online, especially in the tourism industry. This will be a trend in the near future, and the hotel industry should continue to refine its information technology infrastructure to meet consumers’ demands.

References

1. Haydon D, Kumar N (2020, September 21) Industries most and least impacted by COVID-19 from a probability of default perspective – September 2020 update. Retrieved from https://www.spglobal.com/marketintelligence/en/news-insights/blog/industries-most-and-least-impacted-by-covid19-from-a-probability-of-default-perspective-september-2020-update
2. Singapore Tourism Board (2020) Tourism sector performance Q4 2019 report
3. Singapore Tourism Board (2021) Tourism sector performance H1 2020 report
4. Ministry of Trade and Industry Singapore (2020) Singapore’s GDP contracted by 3.8 per cent in the fourth quarter of 2020. Singapore, pp 1–2
5. Weaver D, Oppermann M (2000) Tourism management. John Wiley and Sons
6. Hall CM, Jenkins JM (1995) Tourism and public policy. Routledge, London, pp 43–56
7. Henderson JC (2007) Uniquely Singapore? A case study in destination branding. J Vacat Mark 13(3):261–274
8. Lee CG (2010) Health care and tourism: evidence from Singapore. Tour Manage 31(4):486–488
9. Savage VR, Huang S, Chang TC (2004) The Singapore River thematic zone: sustainable tourism in an urban context. Geogr J 170(3):212–225
10. Farzanegan MR, Gholipour HF, Feizi M, Nunkoo R, Andargoli AE (2020) International tourism and outbreak of coronavirus (COVID-19): a cross-country analysis. J Travel Res 0047287520931593
11. Yang Y, Altschuler B, Liang Z, Li X (2020) Monitoring the global COVID-19 impact on tourism: the COVID19tourism index. Ann Tour Res 103120
12. Ministry of Health Singapore (2020) Covid-19 (temporary measures) act 2020: control orders. Retrieved from https://www.moh.gov.sg/policies-and-legislation/covid-19-(temporary-measures)-(control-order)-regulations
13. Data source. https://www.kaggle.com/gohsoonheng/air-passengers-sgp-changi-airport-forpast-10yrs
14. Data source. https://stan.stb.gov.sg/content/stan/en/tourism-statistics.html; https://github.com/owid/covid-19-data/tree/master/public/data
15. Damodaran A (2012) Investment valuation: tools and techniques for determining the value of any asset. John Wiley & Sons, New York

Tourism Arrival Prediction

Cao Wenfei, Gu Yichao, Wang Jingyi, Wang Yanan, Zhao Yifan, and Zhu Haoxiang

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-981-19-9369-5_14.

C. Wenfei · G. Yichao · W. Jingyi (B) · W. Yanan · Z. Yifan · Z. Haoxiang, Nanyang Business School, Nanyang Technological University, Singapore 639798, Singapore. e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Y. Nguwi (ed.), Tourism Analytics Before and After COVID-19, https://doi.org/10.1007/978-981-19-9369-5_14

The outbreak of Coronavirus Disease 2019 (COVID-19) in early 2020 has significantly impacted all countries. It started in China, then spread to Singapore in January 2020. Confirmed cases increased sharply from March to May, when the government ramped up its COVID-19 testing capacity. The pace of spread then slowed down in June. By February 3, there were 59,602 confirmed COVID-19 cases reported in Singapore in total. Compared with other countries in the world, Singapore is among those with the lowest confirmed case counts and death rates. In order to control the spread of the virus, the Singapore government made great efforts to implement strict measures, including lockdowns and border closures to limit the movement of people, which led to significant disruptions in economic activity. Looking back to where it started, Singapore went through several phases: a total lockdown, followed by phase 1, phase 2, and finally phase 3. Economic performance across nearly all industries took a sharp downturn in the lockdown period. In the second quarter of 2020, Singapore’s gross domestic product decreased by 17.64% compared with the year before. In the third quarter of 2020, the economic situation began to turn around as the government relaxed restrictions on social activities, but GDP was still down 10.36%. Travel promotes commodity trade, tourism, and related goods and services sectors. The pandemic has significantly impacted the aviation industry, as travel restrictions were imposed around the world. Asia and the Pacific was the first region to suffer the impact of the pandemic and the one with the highest level of travel restrictions. According to data from the Singapore Department of Statistics, in January 2020 the number of air arrivals and departures still showed 2% year-on-year growth. However, it dropped


C. Wenfei et al.

Fig. 1 International arrivals by country

around 88% to the bottom within three months. The number of arriving passengers dropped 99.8% in April, and the number of departing passengers dropped 99.36%. Overall, the number of air passengers dropped 82.79% in 2020, more than five times as severe as the impact of SARS in 2003, when the number dropped 15.46%. The period after April shows slow, insignificant growth as the government relaxed visa approval requirements and started to admit international workers and students. At the end of 2020, the year-on-year growth in the number of flights was up 10% compared with April, and that in the number of airline passengers was up 1%. From the strikingly downward trend, we can expect a long recovery time before tourism returns to its normal level. In the past few years, Indonesia, China, Malaysia, India, Australia, and Japan accounted for nearly 50% of international visitors to Singapore. The percentage of international visitors from each country had not changed significantly in past years. Due to the pandemic, the share of visitors from China fell by 5% of overall visitors. The share of visitors from Malaysia dropped 0.8%, while that from Indonesia increased 0.4%. More visitors from Australia and the U.K. entered Singapore, as shown in Fig. 1. We collected data on the Retail Sales Index from the Department of Statistics (DOS) and plotted it in Fig. 2, which covers the period from 2016 to 2020. It shows the change in retail sales in the main retail categories. We can see that sales of non-necessities, including wearing apparel and watches, are the most affected. The one exception is supermarkets and hypermarkets, whose index reaches its peak when all the other indices go down, as people stocked up on groceries to reduce the need to go out frequently. For the Food and Beverage Service Index, food caterers are badly affected by COVID-19, as shown in Fig. 3, which covers the period from 2016 to 2020. This is due to the loss of revenue from dine-in customers. However, fast food outlets, cafes, and food courts have recovered to almost the same level as before the pandemic. Turning to tourism receipts by type of service, we find that receipts in the accommodation, food and beverage, shopping, sightseeing, entertainment, and gaming sectors have declined since the first quarter of 2020. The lockdown in quarter 2 led to a further fall in all sectors, each of over 80%.


Fig. 2 Retail Sales Index

Fig. 3 Food and Beverage Service Index

Shopping fell by 87%, followed by sightseeing and entertainment and gaming, each down 85% year-on-year, with no sign of recovery. Half of the ten largest global hotel groups have set up their regional headquarters in Singapore, including InterContinental Hotels Group, Hilton Worldwide and Carlson Rezidor. Singapore’s attractive business environment and investment in entrepreneurship have also helped it maintain a stable tourism industry ecosystem. Travel agency giants such as TripAdvisor, Rakuten Travel and Skyscanner have also expanded into Singapore. However, the tourism sector recorded huge declines in International Visitor Arrivals (IVA) and Tourism Receipts (TR) in the third quarter of 2020 during the COVID-19 outbreak, according to the Quarter 3 2020 Tourism Sector Performance Report issued by Singapore Tourism Statistics [1]. From January to September 2020, IVA dropped 81.2% over the same period in 2019. Tourism Receipts sank to


Fig. 4 Decomposition of occupancy rate

4.4 billion Singapore dollars, a decline of 78.4% compared to the same period in 2019. Gazetted hotels also saw a 68.8% year-on-year decline in room revenue and a 31.4% year-on-year decline in average occupancy rate. As for the average occupancy rate (AOR), luxury hotels suffered the largest decline, 31.4 percentage points over the same period in 2019. The economy hotels’ AOR decreased by 21.2% to only 63% compared to the same period in 2019. We now look into the hotel sector in Singapore to analyze its performance before the pandemic and, especially, under the pandemic. First, we decompose the time series of occupancy rate, as shown in Fig. 4. We can see an increasing trend by year and strong seasonality. Specifically, July and August are the peak months of the year. We can also check the seasonal decomposition of revenue per room, as shown in Fig. 5. There is an increasing trend followed by a decreasing one. We also observe that the revenue per room time series features strong seasonality: the revenue per room rises from August to October. We next build an ARIMA model to predict what the average occupancy rate and revenue per room would have been had there been no pandemic. Figure 6 shows the occupancy rate before the pandemic, during it, and as predicted assuming no pandemic. The blue line is the actual occupancy rate before COVID-19, the green line is the predicted occupancy rate without COVID-19, and the orange line is the actual occupancy rate under COVID-19. The breakdown of numbers can be found in Fig. 7. We can see that when the pandemic broke out, the occupancy rate fell to nearly half of the level predicted without the pandemic. The hotel industry then revived a little during July and August 2020, but the occupancy rate resumed its decline after September. We develop another ARIMA model for revenue per room, as shown in Figs. 8 and 9. The picture for revenue per room is slightly different from that for occupancy rate. The predicted value is steadier than that for the occupancy rate, and the orange line demonstrates a recovery that continues until November 2020. When the pandemic first set in, the


Fig. 5 Decomposition of revenue per room

Fig. 6 Occupancy rate prediction

revenue per room even dropped to less than 30% of the predicted value, although the revenue turned around gradually after August 2020. The first COVID-19 case in Singapore was confirmed on 23 January 2020. The pandemic soon spread, and more cases were confirmed in the following few days. On 31 January, the government announced restrictions on all visitors from mainland China. With the spread of the virus in both Singapore and other countries, the travel bans imposed by governments tightened step by step and included more countries and regions, such as South Korea, Japan, and the European Union. On 7 April, parliament passed the COVID-19 (Temporary Measures) Act and began the circuit breaker to further reduce social events under most circumstances. The circuit breaker, which was planned to last for


Fig. 7 Occupancy rate prediction in numbers

Fig. 8 Revenue per room prediction

one month, was extended to June. Entertainment facilities were shut down or restricted during this period. These government policies contributed to the sharp decline after January 2020. The occupancy rate dropped by 52.36%, from 83.11% in January to 39.59% in March, as shown in Fig. 10. The revenue per room was in very bad shape: it was S$180.6 in January but only S$33.1 in April, a decrease of 81.67%. The entire industry was almost frozen during those one to two months.
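The two percentage decreases quoted above are relative changes, and they can be reproduced directly from the figures in the text. The short sketch below does so; the function name is illustrative. It also shows the occupancy change expressed in percentage points, which is a different quantity from the relative decrease.

```python
def percent_decrease(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Figures taken from the text.
occupancy_drop = percent_decrease(83.11, 39.59)   # occupancy rate, January to March
revenue_drop = percent_decrease(180.6, 33.1)      # revenue per room in S$, January to April

# The same occupancy change in percentage points (a simple difference).
occupancy_drop_points = 83.11 - 39.59

print(round(occupancy_drop, 2), round(revenue_drop, 2), round(occupancy_drop_points, 2))
```

The relative decrease in occupancy rounds to 52.36% and the decrease in revenue per room to 81.67%, matching the values reported, while the occupancy rate itself fell by 43.52 percentage points.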


Fig. 9 Revenue per room prediction

Fig. 10 Occupancy rate and revenue per room

Before the pandemic, the occupancy rate of all tiers of hotels was more or less the same. The revenue per room was ranked according to tier, as shown in Fig. 11, and luxury rooms had a much larger margin over the others. After the travel ban in January, all tiers faced a sharp decrease. However, the changes were not the same across tiers. The luxury tier suffered the most: its occupancy rate fell sharply to only around 15% after January and remained low when other tiers showed signs of recovery. In terms of revenue per room, luxury hotels were also hit the hardest. Their revenue, which was once extremely high, decreased to nearly S$20 per room in May, lower than any other category; even


Fig. 11 Occupancy rate and revenue per room by hotel tier

economy hotels had an average revenue of S$26.5. In contrast, the loss for lower tiers of hotels, although they faced the same difficulty, was relatively mild, especially for economy and mid-tier hotels. Signs of recovery started from March, when the government began to impose stricter orders on inbound arrivals, requiring compulsory quarantine for locals returning to Singapore. This helped economy tier hotels to make up for the loss of revenue, whereas fewer travelers chose to stay in luxury hotels for quarantine. Besides tiers, the pandemic’s influence on hotels of different sizes is noticeably different, as shown in Fig. 12. Broadly speaking, larger hotels with more rooms tend to have a higher occupancy rate than smaller ones. This may be due to the nature of tourist destinations: larger, more established hotels are more attractive to visitors, which drives up their revenue. However, when the tourism sector was abruptly interrupted in the first quarter, the larger hotels also suffered more severe losses because they had more rooms. The occupancy rate for large hotels dropped from 85% to 39%, and revenue fell similarly. The large and medium hotels saw the sharpest decreases. Nevertheless, compared with tiers, the differences among sizes were not that extreme. One possible explanation is that when people choose hotels and make reservations, they place more value on the tier than on the scale of the hotel, and these two attributes are not closely correlated. COVID-19 certainly hurt the whole hotel industry, with high-class hotels being the biggest victims. Since Singapore ended the circuit breaker period on 1 June, the economy has been reopening cautiously in three phases. Singapore’s borders reopened gradually to allow safe travel in limited numbers, and the hotel industry’s performance has therefore seen a slight recovery since then.

Tourism Arrival Prediction


Fig. 12 Occupancy rate and revenue per room by hotel size

In June 2020, although the number of visitors was more than twice that in May, the overall hotel occupancy rate increased by only 2 percentage points, from 52 to 54%. The considerably long stays of these visitors indicate that most visitors during this time were returning Singapore residents or Malaysian citizens rather than leisure travelers. Figure 13 shows the occupancy rate and revenue per room by hotel tier in the first half of 2020. While the average room rates and revenue per room fluctuated somewhat across different kinds of hotels, overall rates and revenue rose slightly after hitting a trough in May. Meanwhile, changes in the occupancy rate varied more among tiers and sizes of hotels. While the occupancy rate for economy hotels continued to increase steadily, luxury hotels and large hotels experienced a more noticeable increase. One possible explanation is that large and luxury hotels were selected for quarantine more often than smaller ones. During phase 2, from 1 July 2020, tourism businesses were permitted to resume operations in stages, starting with 13 attractions. The Singapore Tourism Board (STB) also allowed domestic tour operators to apply to resume operations. Hotels could also apply to reopen for staycation bookings, as well as to reopen recreation areas for children such as the Kids’ Club. As Fig. 14 shows, in July and August the luxury hotels benefited the most from local staycation tourism. The average occupancy rate, room rate, and revenue per room for luxury hotels all increased significantly, though they remained far from full recovery. The occupancy rate and room rate returned to around half of their pre-COVID-19 levels. Meanwhile, the revenue per room remained lower than one-third of


Fig. 13 Occupancy rate and revenue per room by hotel tier in January–June

the normal level, since fixed and maintenance costs are usually heavier for hotels. On the other hand, despite the higher occupancy rates in mid-tier and economy hotels, the room rates and revenue per room of these hotels did not seem to recover. As the higher-tier hotels lowered their prices, lower-tier hotels had to decrease their room rates to stay competitive. Considered by size, the average performance of all sizes of hotels improved considerably, as shown in Fig. 15, with medium-size hotels increasing the most in average room rates and revenue per room. From September to November, when travel restrictions were further lifted, the number of visitors continued to grow, but at a slower pace. In addition, the average length of stay decreased significantly compared with previous months. As a result, average occupancy rates declined for all tiers of hotels during this time, with the sharpest fall in economy hotels. Meanwhile, room rates and revenue recovered at a slow, steady pace. COVID-19 narrowed the price range among tiers of hotels and left no room for cheaper hotels to raise their prices at this stage. Nevertheless, the ratios of revenue between different sizes of hotels remained roughly the same as before COVID-19 (Fig. 16). On the other hand, the international epidemic situation remains unpredictable, and December saw travel restrictions retightened for countries with increasing COVID-19 cases, such as South Korea and the UK. While vaccines could help international tourism recover, returning Singapore tourism to normal by boosting local tourism is essentially a long process.

For this part, we try to predict the number of visitors for the first quarter of 2021. The correlation between the daily number of visitors and hotel revenue is 93.84%, and the chart in Fig. 17 also shows that the two are highly correlated. We can therefore predict hotel revenue once the daily number of visitors has been predicted accurately.
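The link between visitor numbers and revenue can be illustrated with a short calculation. The sketch below computes a Pearson correlation coefficient and a least-squares line that maps a visitor forecast to a revenue estimate. It is only an illustration: the chapter does not state how revenue was derived from visitor numbers, the function names are hypothetical, and the figures are made up rather than taken from the chapter's dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical figures, for illustration only (not the chapter's data).
visitors = [300, 450, 600, 800, 1000]
revenue = [31.0, 44.0, 62.0, 79.0, 101.0]

r = pearson(visitors, revenue)
slope, intercept = fit_line(visitors, revenue)

# Revenue estimate for an assumed forecast of 2,000 daily visitors.
predicted_revenue = slope * 2000 + intercept
```

A high correlation justifies this kind of mapping only within the range of the observed data; extrapolating to visitor numbers well above those observed assumes the linear relationship continues to hold.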


Fig. 14 Occupancy rate and revenue per room by hotel tier in June–August

Fig. 15 Occupancy rate and revenue per room by hotel size in June–August

We combine the “prophet” package and Random Forest to predict the number of visitors for the following 90 days. For time series prediction, we use the “fbprophet” package developed by Facebook. Prophet follows the sklearn model API, and its input is a simple data frame, so it is easy to use. For regression, we use Random Forest. The values predicted by Prophet differ from the true values. Because, during the COVID-19 period, the daily


Fig. 16 Occupancy rate and revenue per room by hotel tier in August–November

Fig. 17 Relationship between number of visitors and revenue

new cases in other countries strongly influence tourism, we use the daily new cases of the ten countries with the most entries to fit this difference. As Singapore’s entry policy has been consistent since April 2020, the number of daily visitors has been more predictable since then. Hence, we use the data after April 2020 to predict values for the first quarter of 2021. The prediction result of Prophet is shown in Fig. 18: the number of visitors will increase significantly in the first quarter of 2021. The predicted value at the end of 2020 is an underestimate. A conservative estimate is that the number of visitors will double in the first quarter, to nearly 2,000.
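The two-step idea described here (a time series baseline, with the remaining difference fitted from daily new cases) can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's implementation: to stay dependency-free it uses a least-squares trend line as a stand-in for Prophet and a one-variable least-squares fit as a stand-in for Random Forest. All function names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_two_stage(days, visitors, new_cases):
    """Fit a baseline on time, then fit the baseline's residuals on new cases."""
    # Stage 1: baseline forecast from time alone (stand-in for Prophet).
    trend_slope, trend_intercept = fit_line(days, visitors)

    def baseline(day):
        return trend_slope * day + trend_intercept

    # The difference between the true values and the baseline prediction.
    residuals = [v - baseline(d) for d, v in zip(days, visitors)]

    # Stage 2: explain the residuals with new cases (stand-in for Random Forest).
    res_slope, res_intercept = fit_line(new_cases, residuals)

    def corrected(day, cases):
        return baseline(day) + res_slope * cases + res_intercept

    return baseline, corrected


# Hypothetical daily data, for illustration only.
days = list(range(8))
new_cases = [120, 80, 150, 60, 200, 90, 40, 170]
visitors = [76, 94, 90, 118, 100, 132, 152, 136]

baseline, corrected = fit_two_stage(days, visitors, new_cases)

# Squared error on the training data, before and after the residual correction.
baseline_sse = sum((v - baseline(d)) ** 2 for d, v in zip(days, visitors))
corrected_sse = sum(
    (v - corrected(d, c)) ** 2 for d, v, c in zip(days, visitors, new_cases)
)

# A future forecast needs a forecast of new cases too (here an assumed value).
forecast_day_10 = corrected(10, 100)
```

With these least-squares stand-ins, the corrected model cannot have a larger squared error on the training data than the baseline alone, because the second fit could always choose a zero correction. Whether the correction helps on future dates depends on how well the new cases themselves are forecast.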


Fig. 18 Prediction on the number of visitors

Table 1 Cases prediction for top 10 partner countries

Rank  Country      Number of visitors  Daily new cases
1     China        3,627,120           95,963
2     Indonesia    3,110,626           743,198
3     India        1,417,993           10,266,674
4     Malaysia     1,220,730           113,010
5     Australia    1,143,336           28,425
6     Japan        884,308             235,811
7     Philippines  829,325             474,064
8     USA          729,409             20,032,035
9     South Korea  645,848             61,769
10    UK           607,821             2,496,231

The COVID-19 outbreak in Singapore and its travel partner countries can significantly affect the number of visitors arriving in Singapore. We use the top 10 partner countries in 2019 to predict the difference between the actual values and the values predicted by the Prophet package. The top 10 countries are listed in Table 1, which shows that the daily new cases differ considerably between countries. To predict the trend of COVID-19 accurately, we use Prophet to predict the number of cases for each country. The prediction details, including Singapore, are shown in Fig. 19. From these predictions, we can see that some countries have brought the epidemic under control, while in others the epidemic is going to get worse. For Singapore, the daily new cases are decreasing significantly. The influence of


Fig. 19 Prediction for top 10 partner countries of Singapore

Table 2 Performance in prediction

Metric                   Value
Mean Absolute Error      21.49
Mean Squared Error       715.29
Root Mean Squared Error  26.74

the epidemic is stable. With more people being vaccinated, tourism will improve significantly in the future, as predicted by the time series model above. We use the predicted daily new cases to fit the difference with a Random Forest model. Data from April 2020 to December 2020 were used as the training set. The performance on the training set is shown in Table 2.
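For reference, the three error measures reported in Table 2 can be computed as in the sketch below. The function name and the sample values are hypothetical and are not the chapter's data. The last lines check only that the reported root mean squared error is consistent with the reported mean squared error.

```python
from math import sqrt


def regression_errors(actual, predicted):
    """Return MAE, MSE and RMSE for paired actual and predicted values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"mae": mae, "mse": mse, "rmse": sqrt(mse)}


# Hypothetical values, for illustration only.
actual = [100, 150, 200, 250]
predicted = [110, 140, 230, 250]
metrics = regression_errors(actual, predicted)

# Consistency check on Table 2: RMSE should equal the square root of MSE.
reported_mse = 715.29
reported_rmse = 26.74
rmse_from_mse = sqrt(reported_mse)
```

Because these figures are measured on the training set, they describe how closely the model fits the data it was trained on, not how well it forecasts unseen dates.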

Proposed Solutions

The coronavirus pandemic has had a profound impact on the tourism industry as a result of border closures and nationwide lockdowns. According to an analysis by the World Travel and Tourism Council [2], COVID-19 led to a Travel and Tourism GDP loss of around 1475 billion USD in the Asia–Pacific region in 2020. Singapore has begun to adopt various measures to mitigate the effects of COVID-19 on this sector and accelerate tourism recovery. Based on our analysis in the previous sections, we believe the following four measures should be further implemented to support tourism during this difficult time.


Fiscal Stimulus

Given the current pandemic situation, we believe that both the fiscal stimulus and the monetary policies adopted to combat the crisis are crucial to alleviating the operational burden of tourism-related companies and should be maintained until the spread of the coronavirus is further controlled. Because travel restrictions have made income in the travel and tourism industry highly unstable, we suggest continuing to provide financial support to businesses in the following ways:

· Allow late payment of taxes and levies by companies in the hospitality industry.
· Design and arrange special loan schemes catering to companies in the hospitality industry.
· Enlarge the current pool of recipients of government subsidies.

Domestic Tourism

Promoting domestic tourism can be vital to boosting the sector’s recovery within a short time during the pandemic. With restrictions closing national borders, domestic travel is likely to restart tourism business much faster and earlier. Although being a relatively small country limits how fully Singapore can reap the benefits of domestic travel, policy makers can still turn to the domestic tourism market to make up part of the shortfall in tourism revenue by using creative methods. For example, to cater to the surge in demand for “staycations”, policy makers can partner with local hotels to offer residents “staycation” packages that include discounts on hotel stays and local attractions. At the same time, policy makers can work with key opinion leaders who can help promote domestic travel destinations. In fact, developing domestic tourism should not be regarded as only a short-term strategy, because an increasingly regulated and well-organized tourism system can further attract foreign visitors in the post-pandemic period. Therefore, we suggest further prioritizing domestic tourism at the moment and collaborating with local tourism businesses to promote domestic demand.

Travel Bubble

Besides promoting domestic tourism, Singapore can continue to establish two-way “travel bubbles” or one-way “travel corridors” with countries or cities that have successfully controlled the spread of COVID-19, to entice international travelers and kickstart tourism recovery. Under an agreement between two countries to open their borders for quarantine-free travel, the residents from the other country


are allowed to travel to Singapore much more freely, encouraging international arrivals and reviving the flagging tourism industry in a sustainable way. Therefore, we suggest that Singapore expand the travel bubble to include more countries with “comprehensive public health surveillance” and a low incidence of COVID-19 cases.

Reshape the Travel Activities

Last but not least, policy makers should work with the industry to shape the tourism and hospitality industry of the future by redesigning tourism products and accelerating digital transformation in the sector, giving businesses more flexibility to adapt to the changing economic environment and building a more resilient and sustainable tourism economy. For instance, hotels can allocate more resources to developing “safe MICE (meetings, incentives, conferences and exhibitions) events” services to organize big events without violating safe-distancing measures; big events can be broken down into smaller groups connected through technology. Redesigning tourism products and services also helps cater to changes in customer demand and the corresponding customer experience. To cope with the shift in consumption patterns in a timely manner, businesses in the tourism and hospitality sectors can enhance interaction with customers through diverse online platforms, such as TikTok or Facebook, to promote their products or services more effectively and create additional revenue. Therefore, we believe that the pandemic crisis offers an opportunity to reimagine the future of the tourism sector, and the relevant stakeholders should take action to transform the sector in a more innovative way.

Lastly, we note the limitations of this study. First, when constructing the predictive models, we mainly considered the number of visitors and confirmed COVID-19 cases as the factors affecting the performance of the Singapore hotel industry; however, the market is very dynamic, and there are certainly hidden factors that we failed to uncover. Second, the dataset used here is small, since we could only access monthly records. Small datasets run the risk of overfitting, which in turn leads to weak predictive power. Thus, one would expect a more promising prediction if daily records of various features were available.

References

1. Singapore Tourism Board, Tourism Sector Performance. https://www.stb.gov.sg/content/dam/stb/documents/statistics-marketing-insights/Quarterly-Tourism-Performance-Report/STB%20Q3%202020%20Tourism%20Sector%20Performance%20Report_Final.pdf
2. World Travel & Tourism Council, WTTC Economic Trends Report reveals COVID-19’s dramatic impact on Travel & Tourism around the world