Zheng Xiang Matthias Fuchs Ulrike Gretzel Wolfram Höpken Editors
Handbook of e-Tourism
Handbook of e-Tourism With 238 Figures and 112 Tables
Editors Zheng Xiang The Howard Feiertag Department of Hospitality and Tourism Management Pamplin College of Business Virginia Tech Blacksburg, VA, USA
Matthias Fuchs Department of Economics, Geography, Law and Tourism The European Tourism Research Institute Mid-Sweden University Östersund, Jämtland, Sweden
Ulrike Gretzel Annenberg School for Communication and Journalism University of Southern California Los Angeles, CA, USA
Wolfram Höpken Institute for Digital Transformation Ravensburg-Weingarten University of Applied Sciences Weingarten, Germany
ISBN 978-3-030-48651-8 ISBN 978-3-030-48652-5 (eBook) https://doi.org/10.1007/978-3-030-48652-5 © Springer Nature Switzerland AG 2022 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.
Foreword
The field of tourism information technology (IT), or e-tourism, began around 1994. I remember the first meeting in Innsbruck, Austria, where a handful of scholars fascinated by this fledgling field came together. There was much excitement as we tried to understand how computer technology and the Internet might change the tourism industry. Little did we realize then the huge disruption and innovation it would bring to one of the world’s largest industries. At that time, airline and hotel computer reservation systems and the disintermediation of the travel distribution channels were the topics that mostly occupied our attention. Since those early days, the field has grown exponentially in size and content. Now hundreds of researchers around the world (computer scientists, network engineers, scholars in management and marketing information systems, consumer behaviorists, industry practitioners, and many others) are engaged in researching its application and impacts. Their professional association, the International Federation for Technology in Tourism (IFITT), and its allied research publication the Journal of Information Technology and Tourism (JITT) continue to be at the forefront of e-tourism research. This Handbook of e-Tourism is a welcome and important addition to the field’s growth. Its four editors are to be congratulated for bringing together over 100 respected authors from all over the world writing on diverse topics. It lays out a comprehensive roadmap for the field including its historic and global development. Specific established areas of study such as IT as a tourism marketing tool, the design of websites, destination management systems, mobile technologies, location-based systems, tourist-tracking, and GPS and GIS are covered in depth. But more recent themes are also included. For example, in the last decade or so, recommender systems, big data, and data mining of traveler behavior have taken on greater importance and many authors address these topics.
Readers fascinated by social media will find new material addressing its thunderous impacts on travel patterns, communication strategies, and associated travel reviews. The topic of integrating more intelligence into systems is tackled by some authors. They investigate how artificial intelligence (AI), virtual reality (VR), augmented reality (AR), and robotics are being applied to change tourism in the future.
It is encouraging to see the softer side of IT also being addressed with topics such as technological mindfulness, trust, well-being, cultural and ethical issues, and education related to e-tourism. These are likely to become increasingly relevant in the years ahead. The handbook also charts relatively new waters by covering topics such as blockchain and tourism, the hive mind, opinion-mining, and postsmart tourism. Even though only one chapter tackles e-tourism and sustainability, this topic begs for more research as climate change, pandemics, and other threats loom in tourism’s future. This handbook is an important and authoritative resource for students, researchers, and industry professionals seeking to understand the immense impacts of technology on tourism. The editors and the authors are to be congratulated. The quality and breadth of the handbook is a testament to the continued growth and relevance of the field, and is a firm foundation for its future development. With such a rapidly growing field, I hope that the four editors will continue their excellent work and publish a subsequent volume in the not-too-distant future. University of Hawai’i, School of Travel Industry Management Hawaii, USA April 15, 2021
Pauline J. Sheldon
Preface
While it is difficult to trace the inception of the field of e-tourism, the inaugural ENTER Conference in 1994 held in Innsbruck, Austria, was widely recognized as the very first documented event that brought academicians and industry practitioners together to discuss the multifaceted impact of the Internet on the travel and tourism field, particularly the ramifications of the nascent “e-commerce” of tourism. In the nearly 30 years of its history, the loyal communities surrounding this conference have witnessed substantial growth in both theory and development in this multidisciplinary field called e-tourism. Directly affiliated with the ENTER Conference are its parent association, the International Federation of IT and Travel & Tourism (IFITT), and its flagship publication, the Journal of Information Technology & Tourism (JITT), both dedicated to promoting and supporting the development and dissemination of knowledge in the applied field of e-tourism. Today, e-tourism, in its various connotations, is no longer a niche topic; rather, it has tremendously grown to become a highly visible subject that permeates industry forums and mainstream tourism journals and conferences. And, travel and tourism continues to serve as an application domain that attracts researchers from various fields such as computer science, information systems, marketing, management, economics, and even moral philosophy, with keen interests in understanding the increasingly important role of information technology in society, in general, and in travel and tourism, in particular. The Handbook of e-Tourism comes of age to document the substantial body of knowledge in e-tourism research and its significant contribution to creating the intersection between the domains of information technology and tourism. 
Specifically, the handbook is designed to serve primarily two goals: first, it aims to document and synthesize the development and maturity of e-tourism as a research field in the last three decades. Secondly, it attempts to communicate cutting-edge ideas, recent developments, as well as general guidance and future research directions. As such, the handbook as a whole is intended to provide a comprehensive overview of the ontological and epistemological characteristics of e-tourism as a field of study for researchers, educators, and practitioners alike within and outside the broad domain of tourism.
The handbook project was spearheaded by JITT with its editorial team consisting of four editors and associate editors of the journal. A special workshop was conducted in late November 2017 in Vienna, Austria, with invited members from the JITT board and the IFITT community to discuss and outline the overall framework and structure of the handbook. Given the historical, cooperative relationship between the JITT/IFITT community and Springer, the latter was chosen as the exclusive publisher for the e-tourism handbook. Importantly, it was decided that the handbook would be published as a “live reference,” which means that, once a chapter manuscript has been accepted by the editorial team, it will be published online immediately while remaining open to ongoing updating in its digital format. Subsequent meetings were held in conjunction with the ENTER conference to give updates to as well as to solicit feedback from the community. In this process, a number of new chapter titles were added to the core structure to reflect the broader view of the field as well as the diverse expertise of the e-tourism community. Due to the COVID-19 pandemic, the project experienced considerable delay. As such, the final compilation of the handbook reflects the ongoing contribution and unwavering support of the global e-tourism community over the course of the last 3 years. During this process, the ontological nature of e-tourism was discussed, debated, and challenged in order to provide a more comprehensive and also forward-looking view of the domain. E-Tourism is conventionally defined as the analysis, design, implementation, and application of information technology-based solutions in the travel and tourism industry as well as the analysis of the respective technical and/or economic processes and market structures. 
While this definition reflects the overall output of research related to IT and tourism in the last three decades, it largely views IT as a “solution,” which represents a particular instrumental view of technology that serves tourism businesses. Also, it focuses on the micro- and meso-level perspectives while ignoring macro-level problems regarding the systems and governance approaches that promote, facilitate, and regulate specific kinds of technologies. In addition, this definition highlights that e-tourism research has been rooted in a conventional rather than a post-digital, humanist paradigm. For example, individuals are often defined and studied as consumers, users, or data sources rather than affective human beings embedded and embodied in physical and virtual communities and places. As such, the handbook also aims to address these limitations by delineating e-tourism as a more dynamic field that is, while increasingly more mature, constantly evolving to identify and explore the – positive and negative – impacts of technology on travel and tourism in various socio-cultural and socio-economic contexts. The handbook was conceptualized to include a series of themes organized into eight sections. They are: (1) Foundations of e-Tourism; (2) Technologies in e-Tourism; (3) e-Tourism Methods; (4) Individuals and Groups; (5) Organization and Enterprise; (6) Networks and Market; (7) Policy, Regulation, and Ethics; and (8) e-Tourism Future. Each of these sections consists of a varying number of chapters that discuss the key topics pertinent to the general theme. By drawing on many of the world’s recognized authors with a variety of expertise, we hope the Handbook of e-Tourism will inform and, most importantly, inspire the current and future
generations of researchers to continue developing our understanding of the exciting and challenging field of e-tourism. Our world will remain under the ongoing influence of the COVID-19 pandemic. Thus, although topics related to COVID-19 are scattered throughout some of the chapters, one potential “drawback” of the handbook in its current form is the lack of a focused, dedicated discussion that specifically assesses its impacts on e-tourism, and addresses the issues and challenges facing the tourism industry as a whole. Fortunately, as a live project, the handbook will continue to grow and to include these topics of emerging and ongoing significance to the future study of e-tourism. Blacksburg, USA Östersund, Sweden Los Angeles, USA Weingarten, Germany June 2022
Zheng Xiang Matthias Fuchs Ulrike Gretzel Wolfram Höpken
Acknowledgements
The editors would like to thank Professor Hannes Werthner of Vienna University of Technology, Austria, for his instrumental role in initiating the discussion on the development of the book project. Prof. Werthner led the planning workshop in late November 2017, which was held on the beautiful campus of MODUL University Vienna where the participants enjoyed the hospitality provided by President Karl Wöber and his staff. We would also like to thank a number of past and current JITT editors, associate editors, and board members including Francesco Ricci, Daniel Fesenmaier, Dimitrios Buhalis, Rodolfo Baggio, and Juho Pesonen, to name just a few, who were actively involved in some of the meetings and discussions that helped shape the handbook. We are indebted to Professor Emeritus Pauline Sheldon of the University of Hawaii, who has been a true pioneer of the e-tourism field and an avid supporter of this community, for writing the foreword to this handbook. We are extremely grateful for the enthusiasm and contribution from the communities surrounding JITT and beyond, without which the final publication of the handbook would not have been possible. We specifically wish to thank many of the authors for their support and patience throughout the process. Finally, we would like to thank the Springer team of Barbara Wolf and Johanna Klute for providing technical assistance, editorial support, and coordination for the project.
Contents

Volume 1

Part I Foundation of e-Tourism
1. e-Tourism: An Informatics Perspective (Hannes Werthner)
2. Development of Information and Communication Technology: From e-Tourism to Smart Tourism (Rosanna Leung)
3. Drivers of e-Tourism (Dimitrios Buhalis)
4. e-Tourism Research: A Review (Yulan Yuan, Yuen-Hsien Tseng, and Ching Li)
5. A Post-disciplinary Perspective on e-Tourism (Tim Coles, C. Michael Hall, and David Timothy Duval)
6. Consumer Behavior in e-Tourism (S. Volo and A. Irimiás)
7. Developments in German e-Tourism: An Industry Perspective (Claudia Brözel)
8. Digitalization and the Transformation of Tourism Economics (Luis Moreno-Izquierdo, Ana B. Ramón-Rodríguez, and Adrián Más-Ferrando)

Part II Technologies in e-Tourism
9. The Evolution of Online Booking Systems (Robert Goecke)
10. Advanced Web Technologies and E-Tourism Web Applications (Robert Goecke)
11. Web Information Retrieval and Search (Jürgen Dorn)
12. Mobile Applications for e-Tourism (Wolfgang Wörndl and Daniel Herzog)
13. Internet of Things and Ubiquitous Computing in the Tourism Domain (Elena Not, Dario Cavada, and Adriano Venturini)
14. Augmented, Virtual, and Mixed Reality in Tourism (Roman Egger and Larissa Neuburger)
15. Electronic Data Interchange and Standardization (Christian Huemer, Philipp Liegl, and Marco Zapletal)
16. Semantic Web Empowered E-Tourism (Kevin Angele, Dieter Fensel, Elwin Huaman, Elias Kärle, Oleksandra Panasiuk, Umutcan Şimşek, Ioan Toma, and Alexander Wahler)
17. Big Data Technologies (Constantine J. Aivalis)
18. Artificial Intelligence and Machine Learning (Luisa Mich)
19. Recommender Systems in Tourism (Francesco Ricci)
20. Blockchain and Tourism (Horst Treiblmaier)
21. Business Intelligence in Tourism (Wolfram Höpken and Matthias Fuchs)

Part III e-Tourism Methods
22. Data Mining and Predictive Analytics for E-tourism (Nuno Antonio, Ana de Almeida, and Luis Nunes)
23. Content Analysis of Online Travel Reviews (Estela Marine-Roig)
24. Network Science and e-Tourism (Julia Neidhardt)
25. Spatial Analytics and Data Visualization (Yang Yang)
26. The Hive Mind at Work: Crowdsourcing E-Tourism Research (Jing Ge-Stadnyk)
27. Tourism Design: Articulating Design Beyond Science (Mads Bødker)
28. Log File Analysis (Constantine J. Aivalis)
29. Eye-Tracking Technology for Measuring Banner Advertising Efficacy on E-Tourism Websites: A Methodological Proposal (Francisco Muñoz-Leiva)
30. User-Centered Design (Hilda Tellioğlu)
31. E-Tourism Research, Cultural Understanding, and Netnography (Robert V. Kozinets)
32. Mobile Ethnography in Tourism and Hospitality: Concept, Tools, and Applications (Elaine Yulan Zhang, Dan Wang, and Sut Ieng Lei)
33. Experimental Research in E-Tourism: A Critical Review (Lawrence Hoc Nang Fong, Erin Yirun Wang, Rob Law, and Shousheng Chai)
34. Website Evaluation Frameworks: A Review of the Hospitality and Tourism Field from 1996 to 2019 (Shanshan Qi)
35. User Modelling in E-Tourism: A Human-Computer Interaction Perspective (Martin Hitz and Gerhard Leitner)
36. Market Segmentation for e-Tourism (Sara Dolnicar)
37. Visual Methods and Visual Analysis in Tourism Research (Katharina Lobinger and Emanuele Mele)
38. Compositional Data Analysis in E-Tourism Research (Berta Ferrer-Rosell, Germà Coenders, and Eva Martin-Fuentes)

Volume 2

Part IV Individual & Groups
39. Travel Information Search (Zheng Xiang and Daniel R. Fesenmaier)
40. Group Decision-Making and Designing Group Recommender Systems (Amra Delić, Thuy Ngoc Nguyen, and Marko Tkalčič)
41. Acceptance and Adoption of eTourism Technologies (Shahab Pourfakhimi, Tara Duncan, Louise Ould, Katie Allan, and Willem Coetzee)
42. Tourists and Augmented and Virtual Reality Experiences (Jacques Bulchand-Gidumal and Edu William)
43. User Experience and Usability: The Case of Augmented Reality (Safak Korkut, Emanuele Mele, and Lorenzo Cantoni)
44. Trust in E-Tourism: Antecedents and Consequences of Trust in Travel-Related User-Generated Content (Kyung-Hyan Yoo and Jin-A Choi)
45. Smart Tourists and Intelligent Behavior (Philip L. Pearce)
46. Interactive and Context-Aware Systems in Tourism (Dietmar Jannach and Markus Zanker)

Part V Organization & Enterprise
47. Strategic Use of Information Technologies in Tourism: A Review and Critique (Matthias Fuchs and Marianna Sigala)
48. Management and Leadership for Digital Transformation in Tourism (Juho Pesonen)
49. E-Business Models in Tourism (Stephan Reinhold, Florian J. Zach, and Christian Laesser)
50. Service Management in the E-Tourism Era (Volo Serena and David D’Acunto)
51. Destination Management Organization’s Emotional Branding Communication: Challenges and Opportunities in Social Media (Assumpció Huertas and Lidija Lalicic)
52. Revenue Management and E-Tourism: The Past, Present, and Future (Lydia González-Serrano and Pilar Talón-Ballestero)
53. e-Supply Chain Management in Tourism Destinations (Xinyan Zhang and Pimtong Tavitiyaman)
54. Digital Marketing in Tourism (Christian Maurer)
55. Use of GIS and Remote Sensing in Tourism (James M. Magige, Charlynne Jepkosgei, and Simon M. Onywere)
56. Social Media Approaches and Communication Strategies in Tourism (Roberta Minazzi)
57. The Voice of Major E-Tourism Players: An Expedia Group Perspective (Jan Krasnodebski)
58. Information and Communication Technology in Event Management (Christine Van Winkle and Jill Bueddefeld)
59. Technology-Assisted Mindfulness in the Co-creation of Tourist Experiences (Uglješa Stankov and Viachaslau Filimonau)
60. E-Tools for Tourism Innovation Management: A New Typology (Anne-Mette Hjalager, Søren Graakjær Smed, and Jens F. Jensen)

Part VI Network & Market
61. The Diffusion of Information and Communication Technologies in the Tourism Sector (Miriam Scaglione)
62. Sharing and Platform Economy in Tourism: An Ecosystem Review of Actors and Future Research Agenda (Marianna Sigala)
63. Digital Ecosystems, Complexity, and Tourism Networks (Rodolfo Baggio)
64. Value Co-creation in Dynamic Networks and E-Tourism (Tuomas Pohjola, Arja Lemmetyinen, and Darko Dimitrovski)

Part VII Policy, Regulation & Ethics
65. Data Privacy and the Travel Sector (Peter O’Connor)
66. Cybersecurity in Travel and Tourism: A Risk-Based Approach (Alexandros Paraskevas)
67. e-Government and Tourism (Anna Picco-Schwendener, Nadzeya Kalbaska, Lea Hasenzahl, and Lorenzo Cantoni)
68. Information and Communication Technology for Sustainable Tourism Development (Alisha Ali)
69. e-Learning in Tourism Education (Nadzeya Kalbaska and Lorenzo Cantoni)
70. Technology-Enabled Learning (Ruiqi Deng and Pierre Benckendorff)
71. IT and Well-Being in Travel and Tourism (Delia Gabriela Moisa and Eleni Michopoulou)
72. E-Tourism Curriculum (Matthias Fuchs and Wolfram Höpken)
73. E-Tourism as a Tool for Socio-economic Development (Alessandro Inversini, Isabella Rega, and Siew Wei Gan)
74. Digital Divide in E-Tourism (Francesc González Reverte and Pablo Díaz Luque)
75. Social Media and Crisis Communication in Tourism and Hospitality (Danielle Barbe and Lori Pennington-Gray)
76. Biometrics in Tourism: Issues and Challenges (Han-Foon Neo and Chuan-Chin Teo)
77. Simulations in e-Tourism Learning and Management (G. Michael McGrath, Madelene Blaer, Faith Ong, Leonie Lockstone-Binney, Elisabeth Wilson-Evered, and Paul Whitelaw)

Part VIII e-Tourism Future
78. Robotics in Tourism and Hospitality (Stanislav Ivanov, Craig Webster, and Katerina Berezina)
79. Virtual Reality and the End of Tourism? A Substitution Acceptance Model (Daniel Guttentag)
80. A Futuristic Look at Tourism in the Era of the Internet Ecosystem (Sam Lanfranco)
81. Impact of Artificial Intelligence in Travel, Tourism and Hospitality (Jacques Bulchand-Gidumal)

Index
About the Editors
Zheng Xiang, PhD, is an associate professor in the Howard Feiertag Department of Hospitality and Tourism Management in the Pamplin College of Business at Virginia Tech, USA. His research focuses on the strategic implications of information technologies for the hospitality and tourism industry. He currently serves as president of the International Federation for IT and Travel & Tourism (IFITT) and editor-in-chief of the Journal of Information Technology & Tourism (SSCI; Scopus). He has published more than 90 peer-reviewed journal articles, conference papers, and book chapters. He has co-authored one textbook, Tourism Information Technology (third edition), and co-edited two books related to concepts and methods in data analytics in hospitality and tourism management. In May 2015, he received the “Emerging Scholar of Distinction” award from the International Academy for the Study of Tourism. Recently, he was designated by the Web of Science Group as a “Highly Cited Researcher” (0.1% among researchers around the world) 2 years in a row (2019–2020) based upon citations to his work in the last decade. Xiang holds a PhD in business administration from Temple University.
Matthias Fuchs, PhD, is a Full Professor of Tourism Studies at Mid Sweden University, Östersund, Sweden. His research interests include electronic tourism (e.g., business intelligence and data mining applications in tourism, online auctions), customer-based destination brand equity modeling, and socio-economic impact analysis. Matthias serves on the editorial boards of the Journal of Travel Research, the Annals of Tourism Research, the Journal of Hospitality & Tourism Management, and Tourism Analysis. Matthias is also an associate editor of the Journal of Information Technology & Tourism. A number of his co-authored articles have received the best paper award at international conferences. Matthias was the research track chair of the conference ENTER@Helsingborg, 2012, and the overall chair of the conference ENTER@Jönköping, 2018. He was a board member of the International Federation for IT and Travel & Tourism (IFITT) during the period 2014–2018. Matthias holds a PhD in business administration from Innsbruck University, Austria.
Dr. Ulrike Gretzel is a senior fellow at the Center of Public Relations, University of Southern California, and serves as director of research at Netnografica. She received her PhD in communications from the University of Illinois at Urbana-Champaign. She is a fellow of the International Academy for the Study of Tourism. Her research focuses on technology-mediated communication and persuasion in digital media. Her expertise spans the design and evaluation of intelligent systems, as well as the development and implications of artificial intelligence. Her work in tourism addresses ways in which tourists engage with each other and with tourism organizations through websites, mobile apps, and social media, and has analyzed how tourism experiences are represented and marketed online. She studies social media marketing, influencer marketing, and the emerging reputation economy. She has also researched smart tourism development, technology adoption and non-adoption in tourism organizations, and the quest for digital detox experiences. Dr. Gretzel has published more than 100 peer-reviewed journal articles and has co-edited two books. She is frequently acknowledged as one of the most cited authors in the fields of tourism and persuasion.
Dr. Wolfram Höpken is professor of Business Informatics and e-Business at the University of Applied Sciences Ravensburg-Weingarten and director of the Institute for Digital Transformation. His main research fields are data science and big data analytics as well as ICT systems in tourism. He has been involved in several research projects in the area of knowledge discovery and big data analytics within tourism destinations as well as semantic web and seamless data interchange in tourism (EU-funded projects Harmonise, HarmoTEN, Euromuse, HarmoSearch). Wolfram Höpken has been vice president, commercial director, and member of the management board of IFITT for more than 15 years. He has been research track chair of the ENTER conference 2009 and overall chair of ENTER 2014. He has chaired the CEN/ISSS workshop eTOUR dealing with harmonization in the field of tourism. Wolfram Höpken has published over 70 peer-reviewed articles and book chapters and is associate editor of the Journal of Information Technology & Tourism.
Contributors
Constantine J. Aivalis Hellenic Mediterranean University of Crete, Heraklion, Crete, Greece
Alisha Ali Sheffield Business School, Sheffield Hallam University, Sheffield, UK
Katie Allan USC Business School, University of The Sunshine Coast, Hervey Bay, QLD, Australia
Kevin Angele Semantic Technology Institute, University of Innsbruck, Innsbruck, Austria; Onlim GmbH, Telfs, Austria
Nuno Antonio Nova Information Management School (NOVA IMS), Universidade Nova de Lisboa, Lisboa, Portugal
Rodolfo Baggio Master in Economics and Tourism and Dondena Center for Research on Social Dynamics and Public Policy, Bocconi University, Milan, Italy
Danielle Barbe Department of Marketing, Operations and Systems, Northumbria University, Newcastle Upon Tyne, UK
Pierre Benckendorff UQ Business School, The University of Queensland, Brisbane, QLD, Australia
Katerina Berezina Department of Nutrition and Hospitality Management, University of Mississippi, University, MS, USA
Madelene Blaer Victoria University, Melbourne, VIC, Australia
Mads Bødker Department of Digitalization, Copenhagen Business School, Copenhagen, Denmark
Claudia Brözel Eberswalde University for Sustainable Development, University of Applied Sciences, Berlin, Germany
Jill Bueddefeld Faculty of Kinesiology, Sport, and Recreation, University of Alberta, Edmonton, AB, Canada
Dimitrios Buhalis Hong Kong Polytechnic University, Hong Kong, China; Bournemouth University, Poole, UK
Jacques Bulchand-Gidumal Institute for Sustainable Tourism and Economic Development (TIDES), University of Las Palmas de Gran Canaria, Las Palmas, Spain
Lorenzo Cantoni Faculty of Communication, Culture and Society, Institute of Digital Technologies for Communication, USI-Università della Svizzera italiana, Lugano, Switzerland
Dario Cavada Suggesto S.r.l., Trento, Italy
Shousheng Chai Department of Business Administration, Ocean University of China, Shandong, China
Jin-A Choi William Paterson University of New Jersey, Wayne, NJ, USA
Germà Coenders University of Girona, Girona, Spain
Willem Coetzee Otago Business School, University of Otago, Dunedin, New Zealand
Tim Coles University of Exeter Business School, Exeter, UK
David D’Acunto Faculty of Economics and Management, TOMTE, Free University of Bozen-Bolzano, Bolzano, Italy
Ana de Almeida Department of Information Science and Technology, ISCTE-IUL, Lisbon, Portugal; CISUC, Coimbra, Portugal; ISTAR-IUL, Lisbon, Portugal
Amra Delić Faculty of Informatics, EC Research Unit, TU Wien, Vienna, Austria
Ruiqi Deng Department of Educational Technology, School of Education, Hangzhou Normal University, Hangzhou, China
Darko Dimitrovski Faculty of Hotel Management and Tourism, University of Kragujevac, Kragujevac, Serbia
Sara Dolnicar UQ Business School, The University of Queensland, Brisbane, QLD, Australia
Jürgen Dorn Institute for Information Systems Engineering, Technische Universität Wien, Wien, Austria
Tara Duncan School of Technology and Business Studies, Högskolan Dalarna, Falun, Sweden
David Timothy Duval Faculty of Business and Economics, University of Winnipeg, Winnipeg, MB, Canada
Roman Egger Innovation and Management in Tourism, Salzburg University of Applied Sciences, Salzburg, Austria
Dieter Fensel Semantic Technology Institute, University of Innsbruck, Innsbruck, Austria
Berta Ferrer-Rosell University of Lleida, Lleida, Spain
Daniel R. Fesenmaier Modul University Vienna, Vienna, Austria
Viachaslau Filimonau Faculty of Management, Bournemouth University, Bournemouth, UK
Lawrence Hoc Nang Fong Faculty of Business Administration, Department of Integrated Resort and Tourism Management, University of Macau, Macau, S.A.R., China
Matthias Fuchs Department of Economics, Geography, Law and Tourism, The European Tourism Research Institute, Mid-Sweden University, Östersund, Jämtland, Sweden
Siew Wei Gan University of Nottingham Malaysia, Selangor, Malaysia
Jing Ge-Stadnyk University of California, Berkeley, Berkeley, CA, USA
Robert Goecke Munich University of Applied Sciences, Munich, Germany
Lydia González-Serrano Department of Business Economics, Rey Juan Carlos University, Madrid, Spain
Søren Graakjær Smed Department of Communication and Psychology, Aalborg University, Aalborg, Denmark
Daniel Guttentag Department of Hospitality and Tourism Management, College of Charleston, Charleston, SC, USA
C. Michael Hall Department of Management, Marketing and Entrepreneurship, University of Canterbury, Christchurch, New Zealand
Lea Hasenzahl USI – Università della Svizzera italiana, Lugano, Switzerland
Daniel Herzog Department of Informatics, Technical University of Munich, Garching bei München, Germany
Martin Hitz Institute for Informatics Systems, University of Klagenfurt, Klagenfurt, Austria
Anne-Mette Hjalager Department of Entrepreneurship and Relationship Management, University of Southern Denmark, Odense, Denmark
Wolfram Höpken Institute for Digital Transformation, Ravensburg-Weingarten University of Applied Sciences, Weingarten, Germany
Elwin Huaman Semantic Technology Institute, University of Innsbruck, Innsbruck, Austria
Christian Huemer Institute of Information Systems Engineering, TU Vienna, Vienna, Austria
Assumpció Huertas Departament d’Estudis de Comunicació, Universitat Rovira i Virgili, Tarragona, Spain
Alessandro Inversini Ecole hôtelière de Lausanne, HES-SO University of Applied Sciences and Arts Western Switzerland, Lausanne, Switzerland
A. Irimiás Faculty of Economics and Management, Free University of Bozen-Bolzano, Bruneck-Brunico, Italy
Stanislav Ivanov Varna University of Management, Varna, Bulgaria
Dietmar Jannach University of Klagenfurt, Klagenfurt, Austria
Jens F. Jensen Department of Communication and Psychology, Aalborg University, Aalborg, Denmark
Charlynne Jepkosgei Department of Geoinformation and Earth Observation, Technical University of Kenya, Nairobi, Kenya
Nadzeya Kalbaska Faculty of Communication, Culture and Society, Institute of Digital Technologies for Communication, USI-Università della Svizzera italiana, Lugano, Switzerland
Elias Kärle Semantic Technology Institute, University of Innsbruck, Innsbruck, Austria
Safak Korkut Faculty of Communication, Culture and Society, USI – Università della Svizzera italiana, Lugano, Switzerland; School of Business, Institute for Information Systems, FHNW – University of Applied Sciences and Arts Northwestern Switzerland, Basel, Switzerland
Robert V. Kozinets Jayne and Hans Hufschmid Professor of Strategic Public Relations and Business Communication, Annenberg School for Communication and Journalism and Marshall School of Business at the University of Southern California, Los Angeles, CA, USA
Jan Krasnodebski Expedia Group, Geneva, Switzerland
Christian Laesser IMP-HSG, University of St. Gallen, St. Gallen, Switzerland
Lidija Lalicic Department of Tourism and Service Management, MODUL University Vienna, Vienna, Austria
Sam Lanfranco York University, Toronto, ON, Canada
Rob Law Asia-Pacific Academy of Economics and Management, University of Macau, Macau, S.A.R., China; Faculty of Business Administration, Department of Integrated Resort and Tourism Management, University of Macau, Macau, S.A.R., China
Sut Ieng Lei Macao Institute for Tourism Studies, Macau, China
Gerhard Leitner Institute for Informatics Systems, University of Klagenfurt, Klagenfurt, Austria
Arja Lemmetyinen School of Economics, University of Turku, Pori, Finland
Rosanna Leung Department of International Tourism and Hospitality, I-Shou University, Kaohsiung, Taiwan
Ching Li Graduate Institute of Sport, Leisure, and Hospitality Management, National Taiwan Normal University, Taipei, Republic of China
Philipp Liegl ecosio GmbH, Vienna, Austria
Katharina Lobinger Faculty of Communication, Culture and Society, USI Università della Svizzera italiana, Lugano, Switzerland
Leonie Lockstone-Binney Griffith University, Gold Coast, QLD, Australia
Pablo Díaz Luque Faculty of Economics and Business, Universitat Oberta de Catalunya, Barcelona, Spain
James M. Magige Department of Environmental Planning and Management, Kenyatta University, Nairobi, Kenya
Estela Marine-Roig University of Lleida, Lleida, Catalonia, Spain
Eva Martin-Fuentes University of Lleida, Lleida, Spain
Adrián Más-Ferrando Economics of Innovation and AI Research Group, University of Alicante, Alicante, Spain
Christian Maurer IMC University of Applied Science Krems, Krems an der Donau, Austria
G. Michael McGrath Victoria University, Melbourne, VIC, Australia
Emanuele Mele Faculty of Communication, Culture and Society, USI Università della Svizzera italiana, Lugano, Switzerland
Luisa Mich Department of Industrial Engineering, University of Trento, Trento, Italy
Eleni Michopoulou University of Derby, Buxton, UK
Roberta Minazzi Department of Law, Economics and Cultures, University of Insubria, Como, Italy
Delia Gabriela Moisa University of Derby, Buxton, UK
Luis Moreno-Izquierdo Economics of Innovation and AI Research Group, University of Alicante, Alicante, Spain
Francisco Muñoz-Leiva Department of Marketing and Market Research, Universidad de Granada, Granada, Spain; Sport and Health University Research Institute (iMUDS), Universidad de Granada, Granada, Spain
Julia Neidhardt Research Unit E-Commerce, TU Wien, Vienna, Austria
Han-Foon Neo Faculty of Information Science and Technology, Multimedia University, Melaka, Malaysia
Larissa Neuburger Department of Tourism, Hospitality and Event Management, University of Florida, Gainesville, FL, USA
Thuy Ngoc Nguyen Free University of Bozen-Bolzano, Bolzano, Italy
Elena Not Intelligent Interfaces and Interaction Research Unit, Fondazione Bruno Kessler, Trento, Italy
Luis Nunes Department of Information Science and Technology, ISCTE-IUL, Lisbon, Portugal; ISTAR-IUL, Lisbon, Portugal; Instituto de Telecomunicações, Aveiro, Portugal
Peter O’Connor UniSA Business School, University of South Australia, Adelaide, Australia
Faith Ong University of Queensland, Brisbane, QLD, Australia
Simon M. Onywere Department of Environmental Planning and Management, Kenyatta University, Nairobi, Kenya
Louise Ould USC Business School, University of The Sunshine Coast, Hervey Bay, QLD, Australia
Oleksandra Panasiuk Semantic Technology Institute, University of Innsbruck, Innsbruck, Austria
Alexandros Paraskevas London Geller College of Hospitality and Tourism, University of West London, London, UK
Philip L. Pearce College of Business Law and Governance, James Cook University, Townsville, QLD, Australia
Lori Pennington-Gray Department of Tourism, Hospitality, and Event Management, Northumbria University, Gainesville, FL, USA
Juho Pesonen Business School, Centre for Tourism Studies, University of Eastern Finland, Joensuu, Finland
Anna Picco-Schwendener USI – Università della Svizzera italiana, Lugano, Switzerland
Tuomas Pohjola School of Economics, University of Turku, Pori, Finland
Shahab Pourfakhimi USC Business School, University of The Sunshine Coast, Hervey Bay, QLD, Australia
Shanshan Qi School of Tourism Management, Macao Institute for Tourism Studies, Macao, China
Ana B. Ramón-Rodríguez Department of Applied Economic Analysis, University of Alicante, Alicante, Spain
Isabella Rega Bournemouth University, Bournemouth, UK
Stephan Reinhold School of Business and Economics, Linnaeus University, Kalmar, Sweden
Francesc González Reverte Faculty of Economics and Business, Universitat Oberta de Catalunya, Barcelona, Spain
Francesco Ricci Faculty of Computer Science, Free University of Bozen-Bolzano, Bozen-Bolzano, Italy
Miriam Scaglione Institute Tourism, University of Applied Sciences and Arts Western Switzerland, Valais, Switzerland
Marianna Sigala Department of Business Administration, University of Piraeus, Athens, Greece
Umutcan Şimşek Semantic Technology Institute, University of Innsbruck, Innsbruck, Austria
Uglješa Stankov Faculty of Sciences, Department of Geography, Tourism and Hotel Management, University of Novi Sad, Novi Sad, Serbia
Pilar Talón-Ballestero Department of Business Economics, Rey Juan Carlos University, Madrid, Spain
Pimtong Tavitiyaman College of Professional and Continuing Education, The Hong Kong Polytechnic University, Hong Kong, Hong Kong SAR
Hilda Tellioğlu Faculty of Informatics, Institute of Visual Computing and Human-Centered Technology, Artifact-Based Computing and User Research (ACUR), Vienna University of Technology (TU Wien), Vienna, Austria
Chuan-Chin Teo Faculty of Information Science and Technology, Multimedia University, Melaka, Malaysia
Marko Tkalčič University of Primorska, Koper, Slovenia
Ioan Toma Onlim GmbH, Telfs, Austria
Horst Treiblmaier Modul University Vienna, Vienna, Austria
Yuen-Hsien Tseng Graduate Institute of Library and Information Studies, National Taiwan Normal University, Taipei, Republic of China
Christine Van Winkle Faculty of Kinesiology and Recreation Management, University of Manitoba, Winnipeg, MB, Canada
Adriano Venturini Suggesto S.r.l., Trento, Italy
S. Volo Faculty of Economics and Management, TOMTE, Free University of Bozen-Bolzano, Bruneck-Bolzano, Italy
Serena Volo Faculty of Economics and Management, TOMTE, Free University of Bozen-Bolzano, Bolzano, Italy
Alexander Wahler Onlim GmbH, Telfs, Austria
Dan Wang The Hong Kong Polytechnic University, Hong Kong, China
Erin Yirun Wang Faculty of Business Administration, Department of Integrated Resort and Tourism Management, University of Macau, Macau, S.A.R., China
Craig Webster Department of Management, Miller College of Business, Ball State University, Muncie, IN, USA
Hannes Werthner e-Commerce Research Unit, TU Wien, Vienna, Austria
Paul Whitelaw Southern Cross University, Melbourne, VIC, Australia
Edu William Institute for Sustainable Tourism and Economic Development (TIDES), University of Las Palmas de Gran Canaria, Las Palmas, Spain
Elisabeth Wilson-Evered Victoria University, Melbourne, VIC, Australia
Wolfgang Wörndl Department of Informatics, Technical University of Munich, Garching bei München, Germany
Zheng Xiang The Howard Feiertag Department of Hospitality and Tourism Management, Pamplin College of Business, Virginia Tech, Blacksburg, VA, USA
Yang Yang Department of Tourism and Hospitality Management, Temple University, Philadelphia, PA, USA
Kyung-Hyan Yoo William Paterson University of New Jersey, Wayne, NJ, USA
Yulan Yuan Department of Landscape Architecture, College of Fine Arts and Creative Design, Tunghai University, Taichung, Republic of China
Florian J. Zach Pamplin College of Business, Virginia Tech, Blacksburg, VA, USA
Markus Zanker Free University of Bozen-Bolzano, Bolzano, Italy
Marco Zapletal ecosio GmbH, Vienna, Austria
Elaine Yulan Zhang The Hong Kong Polytechnic University, Hong Kong, China
Xinyan Zhang College of Professional and Continuing Education, The Hong Kong Polytechnic University, Hong Kong, Hong Kong SAR
Part I Foundation of e-Tourism
1 e-Tourism: An Informatics Perspective
Hannes Werthner
Contents
Introduction
Short History of e-Tourism
Remarks on e-Tourism Research
On the Nature of Informatics
  Methodological Pillars
  The Importance of Informatics
Informatics: Good New World?
IT and Structural Change
Future Developments in e-Tourism
Conclusions
Cross-References
References
Abstract
More and more aspects of our life “move” to the Web. The Web and the Internet, as the underlying information infrastructure, cannot be considered merely as a mirror of the “real” physical world; it is increasingly hard to distinguish between the physical and the virtual. The Web, however, is not only reflecting and “operating” this world, it is obviously also transforming it. Information technology (IT) acts both as an enabler and as a driver of technical, economic, and societal developments. One can even speak of a coevolution of humans and machines. With recent achievements in areas such as computational models, the Internet of things, or artificial intelligence, we see the power of informatics. Moreover, from an ontological point of view, informatics influences how we perceive the world and how we think about it. This chapter starts by highlighting the historic importance of technology to growth and development. It then intertwines two lines of thought: as informatics forms the technical and methodological basis of e-tourism, the chapter reviews the nature of informatics as a scientific discipline, its methodological roots, and its importance, both for applications and for science. This is then related to the e-tourism domain, its history, development, and possible future. But most importantly: while informatics offers many positive prospects and possibilities, its development at the same time raises serious issues with respect to society, economy, politics, or the individual. Thus, the chapter concludes with a reference to our responsibility as scientists and to the Vienna Manifesto on Digital Humanism.
H. Werthner, e-Commerce Research Unit, TU Wien, Vienna, Austria
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_1
Keywords Informatics · Platform economy · Digital humanism · Network effect · Co-evolution · Informatization
Introduction
Edward A. Lee highlights that we experience the “co-evolution” of humans and machines, or technology in general (Lee 2020). Consider the flood of data, algorithms, and computational power, which is disrupting the very fabric of society by changing human interactions, societal institutions, economies, and political structures. This is also valid for all areas of scientific research. Moreover, this disruption simultaneously creates and threatens jobs, produces and destroys wealth, and improves and damages our ecology; it shifts power structures. It is a dialectic socioeconomic and technological process, in which informatics/computer science and its artefacts are one of the major driving forces. (In the following, for the sake of brevity, I do not distinguish between the two terms, being aware, however, that they have different semantics.)
This is an accelerating process. Hanson (1998) describes how each of the three “essential” technologies (hunting, agriculture, and industry) grew 100 times faster than its respective predecessor, leading to both growth and wealth. This is very well illustrated, almost as a revelation, by the increase of the world GDP per capita from 1 AD until today (Fig. 1). Consider a woman living around 1 AD and being reborn in the year 1600. Although there would be changes, she would still recognize the world she is born into. But now think of the difference between the year 1700 and the year 2000. She would be lost: cars, airplanes, computers, mobile devices, and Skype – what is this?
[Figure: line chart of world GDP per capita (1990$; vertical axis from 0 to 8,000) by year (horizontal axis from 1 to 2000 AD); series: the world]
Fig. 1 World GDP per capita, from 1 to 2008 AD. (Source: Angus Maddison, University of Groningen, www.rug.nl/ggdc/historicaldevelopment/maddison/releases/maddison-project-database-2018)
This acceleration and growth are based on innovation and technology and on the sequence of industrial revolutions since the seventeenth century (see Table 1, which highlights the technological revolutions and the related techno-economic paradigms; Perez 2002). The table relates the successive industrial revolutions (from the steam engine and the railroad, to steel production and electricity, to the age of oil and the automobile, and ultimately to electronics and information processing) to the associated techno-economic paradigms and innovation principles. New technologies diffuse and lead to innovation and growth. Machines are increasingly taking over the roles of humans, energy is readily available almost everywhere, and knowledge and science are becoming production factors. Geographically, this development started in the UK; later, the USA took over the central driving role, after which a geographical propagation all the way to globalization can be observed. (It is also interesting to note that, as Acemoglu and Robinson (2012) show, economic development is directly related to changes in political structures and institutions.) Ultimately, the current “information revolution” and “informatization” of our world is resulting in increased flexibility, virtualization, global real-time communication, almost unlimited mobility, structures resembling networks, as well as rising acceleration.
One can see the connection between the fourth period (following the classification of Table 1), with the automobile and the beginning of mass markets and mass production, and the beginning of tourism and later mass tourism. Similarly, the age of information and telecommunications, with its information intensity and microelectronics-based IT, laid the ground for e-tourism, the transformation of the travel and tourism business based on IT. Thus, in the following, I provide a short e-tourism history and then turn to the nature of informatics, its methodological pillars, and its importance, but also to some very critical issues.
Table 1 Industrial revolutions and techno-economic paradigms (Perez 2002)
Columns: Technological revolution (country of initial development); Techno-economic paradigm (innovation principles)

FIRST: The ‘Industrial revolution’ (Britain)
• Factory production
• Mechanization
• Productivity/time keeping and time saving
• Fluidity of movement (as ideal for machines with water power and for transport through canals and other waterways)
• Local networks

SECOND: Age of steam and railways (in Britain and spreading to the Continent and the USA)
• Economies of agglomeration/industrial cities/national markets
• Power centers with national networks
• Scale as progress
• Standard parts/machine-made machines
• Energy where needed (steam)
• Interdependent movement (of machines and of means of transport)

THIRD: Age of steel, electricity, and heavy engineering (the USA and Germany overtaking Britain)
• Giant structures (steel)
• Economies of scale of plant/vertical integration
• Distributed power for industry (electricity)
• Science as a productive force
• Worldwide networks and empires (including cartels)
• Universal standardization
• Cost accounting for control and efficiency
• Great scale for world market power/“small” is successful, if local

FOURTH: Age of oil, the automobile, and mass production (in the USA and spreading to Europe)
• Mass production/mass markets
• Economies of scale (product and market volume)/horizontal integration
• Standardization of products
• Energy intensity (oil based)
• Synthetic materials
• Functional specialization/hierarchical pyramids
• Centralization/metropolitan centers-suburbanization
• National powers, world agreements, and confrontations

FIFTH: Age of information and telecommunications (in the USA, spreading to Europe and Asia)
• Information intensity (microelectronics-based ICT)
• Decentralized integration/network structures
• Knowledge as capital/intangible value added
• Heterogeneity, diversity, adaptability
• Segmentation of markets/proliferation of niches
• Economies of scope and specialization combined with scale
• Globalization/interaction between the global and the local
• Inward and outward cooperation/clusters
• Instant contact and action/instant global communications
Short History of e-Tourism
Since its early diffusion, information technology has played an important role in tourism; in the 1960s, computerized reservation systems/global distribution systems (CRS/GDS) were among the first worldwide electronic networks. Moreover, since the beginning of the Web in the early 1990s, travel and tourism has been a major application domain for Web-based services. An interesting side remark is that CRS/GDS laid the basis for the “automation” of the air travel industry, with its massive efficiency gains through highly sophisticated optimization algorithms as well as worldwide ticketing and reservation systems. This industry, with its lobbying power, forced national governments to open borders and “systems” interfaces (even of physical systems). Just compare this with the still nationally “controlled” railway systems, where it is not possible, or at least not easy, to buy cross-border tickets. However, everything has a price; these efficiency gains in the airline industry are one of the sources of the climate problems of today.
Already in 1994, I stated at the first ENTER conference in Innsbruck, Austria, that tourism is an “information business” and that it would become electronic. And it would change radically, with new services, new structures, and new players. The tourism industry is a huge industry, with an estimated 1.6 billion international arrivals in 2020 (As this paper was written before COVID-19, these estimates are too high; but even so, one can see the economic importance of this industry.); it is networked and cooperating worldwide and has a worldwide demand side of nonfrequent users, acting in very different sociocultural contexts. A major feature is its fragmented structure, and most companies are SMEs; e.g., in 1997, 95% of the 1.3 million tourism enterprises in Europe were very small (i.e., 1–9 employees) (Werthner et al. 1997).
The product itself is complex; it is a bundle of basic products, in which one can identify over 30 different industry sectors. In essence, since a tourist travels to another part of the world and consumes there, nearly every service can become a tourism service. This is a rather tricky issue: the tourism product is defined by the tourist at the time of consumption. Thus, from a computer science point of view, domain models have to cover the “entire world.” Finally, it is an emotional and personality-based product associated with fun, driving inspiration and behavior. Thus, these nonrational components impose specific challenges for interfaces as well as for decision-making, recommender, and highly personalized persuasion systems. The product (simple or complex) cannot be tested in advance; it is produced when it is consumed, and tourists have to rely on the information provided. Thus, the product can be considered a credence good. As tourism is an information business that needs a lot of good information to be successful, the IT industry took tourism seriously from the very beginning, just as the tourism industry realized that IT is a strategic issue (Werthner and Klein 1999). At the core of tourism products is normally a destination-related product, e.g., a hotel or an event, which is either sold directly to a tourist or bundled into a more complex product. Thus, the basic experience is provided by local suppliers and related destinations (A destination can be seen as a geographically confined network
of suppliers.). This is important to remember when I come back to some critical e-tourism issues.
As foreseen already in 1994, IT has radically changed the industry; one might even say that IT created a new tourism industry. Today, we have a well-developed e-business landscape in tourism, in which we have moved from simple Websites to partnerships and to IT-based business networks, leading to a so-called informatization of the entire value chain. In addition, since e-commerce tends to favor consumers, we have moved from a “customer focused” to a “customer-driven” industry, which becomes obvious when looking at online communities or social media sites and the related user empowerment. Moreover, online services became commodities, with an ongoing deconstruction of the value chain. This means that nowadays you do not need to develop all your IT services yourself; you just use and integrate preexisting ones on the Web, which provides standard interfaces. In this complex mix of cooperation and competition, the emphasis moved (or should have moved) from process re-engineering to network engineering. This implies that, in complex network structures of companies and services, enterprises have to focus more on their value contribution (asking what their respective added value is) and not so much on efficiency. This might be a little provocative, but otherwise they run the risk of very efficiently producing things that nobody needs. At the same time, we see an enormous trend toward concentration, where the winner takes it all (see, e.g., the strong roles of platforms such as booking.com). In some sense, the development of the Web can be seen as an evolution between order (i.e., highly centralized) and disorder (i.e., permanent appearance of new services and technologies). A second general phenomenon worth mentioning might be characterized as the transparency paradox: more and more information leads to increasing and decreasing transparency at the same time.
On the one hand, all this information is available and can be easily accessed. On the other hand, the amount of available information eventually leads to information overload, where it might be hard for users to find what they are actually looking for. Search tools and recommender systems are available and help to address this problem, but users typically do not know exactly how these systems work and what data they exploit. All of this is increasingly leading to a feeling of insecurity and related problems of trust. Finally, innovation was driven by IT-based newcomers with their fast imitation of business models and technology, whereas traditional market players had problems with service innovation, business models, and technology development. These newcomers very quickly became the new, and now stronger, intermediaries (compared to pre-Web times). In the two-sided market, where suppliers sponsored consumers (consumers pay with their data), the network effect favored this development. Fragmented supplier markets, constituted mainly by SMEs, could not compete with the new players. In addition, these new companies primarily focused on the market level (transaction) and not on the product itself. They conquered the consumers very fast, providing the necessary information and trust functions in an information-asymmetric market. The initial hope of the early 1990s for direct distribution and a direct link from suppliers to consumers was not fulfilled (or
even worsened). It is interesting to note that destinations, although being the core of the tourism product and acting as early leaders in the electronic market, lost their central position (Scaglione et al. 2013). The situation can be summarized as follows:
• The field of e-tourism is mature, the industry has been radically changed, and users have adopted an ever-growing range of new information and communication technologies. Users even became drivers of this development.
• On the other hand, the field of IT and tourism is continuously opening new technical and business possibilities and challenges, leading to a rather complex situation where some guidance and consensus would be needed.
• Finally, specific problems arise since the issues are, very often, at the interface between scientific research and development and require inputs from multiple disciplines. Consequently, this interdisciplinary nature of research leads to a mix of different approaches and methods such as quantitative as well as qualitative behavioral research, constructive research, or formal methods.
This reads like a history of e-commerce, but it is one of e-tourism. The reason is that e-tourism was (and still is) at the forefront of the general development due to its inherent features (Werthner and Ricci 2004).
Remarks on e-Tourism Research

The high impact of IT on the tourism industry, together with the high interest of the IT industry, led to the fast emergence of a research community, journals, and numerous conferences, starting with the ENTER conference on IT and Tourism as early as 1994 in Austria. Soon, several international and national research programs followed. Both tourism and IT have a high innovation potential, which was recognized very quickly by political decision-makers. I define e-tourism as follows: e-tourism denotes the analysis, design, implementation and application of IT/e-commerce solutions in the travel and tourism industry, as well as the analysis (of the impact) of the respective technical/economic processes and market structures. This non-normative definition contains a constructive and an analytical part and thus requires a mixture of methods from engineering, mathematics, and social science. Interestingly, this shows some similarity to informatics, which also contains an analytical and an engineering part. In addition, e-tourism research is distributed across and gains from different areas such as computer science, information systems (IS), geography, sociology, or economics and management science. With respect to research topics, several reviews of the scientific contributions of the previous 10–20 years were published, such as Buhalis and Law (2008) or Wang et al. (2010), with similar results. From a historical perspective, particularly in the early days, research focused on technical topics (such as system integration and semantic Web, mobile systems, virtual reality, software engineering) and on showing how to automate processes through IT. However, this changed, and social
H. Werthner
sciences and marketing aspects gained more importance. Similarly, with respect to the methodological approach, the field moved away from engineering and system design, with their constructive approach, toward empirical studies, case studies, and, later, data-driven approaches. Interestingly, as early as 2006, I gave a keynote at the ENTER conference on network analysis and service science. With the online availability of data (mainly on user interactions), the number of quantitative papers rose. This is in accordance with Lazer et al. (2009), who note that the availability of data and tools has an impact on the way social sciences work. However, it is worth pointing out that research in the e-tourism field still mainly relies on data acquired by questionnaires and surveys rather than open data or large-scale Web-based data, as a survey of the data sources used in all publications in the ENTER 2017 proceedings suggests (Sertkan 2018). A critical issue is that rather few articles on tourism applications can be found in pure computer science journals. In those articles that exist and where tourism serves as an application domain, the specific features of the field, which are particularly challenging, or the potential of the domain, are typically not mentioned. Another observation indicates that the field moved from the macro level to the micro level, i.e., the focus is on user adoption and/or marketing for individual enterprises, which involves the danger of losing the big picture at the macro and strategic industry scale. This is especially true with respect to platform strategies, power relationships, and structural changes in a complex market, the latter having been a hot topic in the early days of e-tourism (Neidhardt and Werthner 2018). As a final remark, I note the lack of papers on the (critical?) relationship among IT, sustainability, and tourism – at least as far as I am aware. However, I assume that this will change in the coming years.
On the Nature of Informatics

I now turn back to informatics to explain why it plays such an important role, not only in tourism but in general. Informatics can be considered the science of today’s so-called information society. Its methods influence how we perceive the world and how we think about it. Its artifacts change the world. We are in the midst of the much discussed and described digital transformation. Since this process is as technological as it is socioeconomic, a plurality of disciplines is required to understand, analyze, and control this development. Nevertheless, the key to understanding and to designing those artifacts and systems is informatics. The computer can be referred to as a general-purpose automaton, which can (as the only automaton) control itself via software and be instantiated by software as any specific problem-solving machine: here it becomes a control device for a power plant, there a social media or a hotel booking tool. This general-purpose machine has the unique property of being able to independently change and control its own behavior based on external inputs and internal states – one may even say that it shows some form of self-reflexivity and thus demonstrates intelligent behavior.
Phenomena such as the World Wide Web, cyber-physical systems, or the Internet of Things (IoT) show us to what enormous extent informatics has developed in its short history. It is a journey from the stand-alone computer to the global operating system of our present society, and it is leading us into yet another industrial revolution: digitizing content and automating work and thinking. This global operating system integrates, links, and permeates everything: work, leisure, politics, the personal, the professional, and the private. At the same time, these devices become invisible and increasingly “disappear”: this is an almost dialectical process of being all-encompassing and disappearing simultaneously. And, everything touched by software becomes a computer! Systems consist of a stack of different hardware (not only computing machines, but also any other machine, be it traffic systems or individual vehicles), with tasks increasingly being delegated to the software. This leads to more and more virtualization. We are progressing from “IT supports business” to “IT runs the business” to ultimately “IT is the business.” Informatization and virtualization do not stop at individual machines or systems; they affect entire industries and societies with all their transformative power. Informatics is thus at the same time the basis and the driver of digital transformation. Against the background of this development, we use Kristen Nygaard’s comprehensive definition of informatics (interestingly, it dates back to the 1980s): “Informatics is the science that has as its domain information processes and related phenomena in artifacts, society and nature.” Informatics does not only – and not anymore – deal with a specific machine (i.e., the computer); it is also a discipline affecting other sciences (take biology as an example), either as a tool or as another ontological approach.
Methodological Pillars

Informatics rests on three methodological pillars, whose respective significance to informatics has varied in its short history: mathematics and logic (for the formal theoretical basis), science (the process of the formulation of hypotheses and their testing), and engineering (from formal statement to design and implementation). As a common denominator, the discipline of informatics is based on a “computational model,” i.e., the mapping of the inductive or deductive problem-solving steps onto an (abstract) machine and the conversion into a corresponding abstract and/or concrete algorithm in the form of written code. This methodological breadth shows very well that informatics is essentially of an interdisciplinary “nature.” It shows two inseparable faces:
• Informatics as subject, e.g., with research and development in areas such as algorithms, design, information presentation, programming languages, complexity and solvability issues, distribution aspects, predictability, software engineering, etc.
• Informatics in subject, as a tool and methodical approach to other sciences and fields of application, such as tourism.
These two faces also show that the distinction between fundamental and applied research (solving specific problems, such as in ecology or in tourism) and the use of informatics in other disciplines is difficult. At the same time, this means that fundamental research plays a significant role in this mix. This is also illustrated by the so-called tire track model (National Research Council 2012), which empirically shows that on average 20 to 30 years pass before a fundamental development creates a large multi-billion-dollar market. This is also valid for the current economic hype about artificial intelligence and deep learning, whose roots date back more than 50 years. However, the examples of machine learning and science, or the “data-driven computing paradigm,” clearly illustrate the previously mentioned features: (i) the combination of an empirical scientific approach (learning from data) with formal methods and algorithms as well as programming and system development; (ii) the intertwining of application and research – research in this area thrives on real application data and its understanding and interpretation.
The Importance of Informatics

We experienced a metamorphosis of the computer from a calculating machine as a tool in various forms to the pervasive worldwide media machine of today. It led to the transformation of economic processes, companies, and entire industries. It brought about changes at the macro-level of social and political systems and at the micro-level of individuals by psychological influence. The importance of informatics is also reflected in the ranking of companies with the highest stock market values, which are IT platform companies (see section “Informatics: Good New World?”): the value of these companies is essentially based on information (mostly not their own, but information about and from their users) and the network of their users. They process this data, draw conclusions from it, and sell this “refinement.” Their value and importance result from the processing of information, not from tangible, “real” products and goods. In addition, the importance of informatics can also be described in terms of content, thereby revealing its interdisciplinary nature and extensive connections with engineering, technical and social sciences, and humanities:
• Informatics has become a powerful tool for other disciplines and for science in general; it is versatile in scientific calculations and simulations; it has changed the practice in other disciplines.
• At a more conceptual level, informatics – see the definition of Nygaard – has promoted a new view of natural and human-made phenomena, and in a reciprocal process, other disciplines continue to develop their self-conception. Informatics thus provides an “info-computational” theory of science with new ontology, epistemology, and methods.
• In addition, informatics creates new things both virtual and real. It is the only (engineering) discipline that creates systems without being limited by physical constraints – it shows similarities with art (“Everything is possible”).
Informatics: Good New World?

The development of informatics and its worldwide networked systems, which include everything (computers, programs, content, applications, and users), offers many possibilities but also has major downsides. Tim Berners-Lee stated, “The system is failing,” (Tim Berners-Lee in Olivia Solon: “Tim Berners-Lee on the future of the web: ‘The system is failing’”. In: The Guardian, 16 November 2017) and well-known developers from the early days of the Web such as Jaron Lanier (virtual reality pioneer) apologized for their actions (Noah Kulwin: “The Internet Apologizes . . . ”. In: New York Magazine, 16 April 2018). The Web’s development was driven by a primary focus on business interests, in particular by an advertising-based business model. For the end users, the Web is free; they pay “only” with their data and their manifested interests and online behavior. This information (data about interests, behavior, etc.) then forms the basis of the advertising business. An interesting observation was published by Vardi (2019a), who links the Web business model with the antiestablishment cultural movement and its members known as “hippies.” The Internet with its utopian culture connects with the 1960s counterculture movement. In short, the argument is as follows: the Web originally constituted a “commons,” an unregulated public resource with information moving freely. However, Hardin (1968) already described the “Tragedy of the Commons,” referring to the phenomenon where individual users acting independently according to their own self-interest behave contrary to the common good – and “destroy” the common good (Elinor Ostrom, Nobel Prize winner in Economics, however, showed that there are counterexamples where members of a community cooperate or regulate themselves to exploit those resources without collapse (Ostrom 1990). So, not everything is lost.).
In the case of the Internet and the Web, Vardi states that “information freedom is, of course, an illusion.” Look at Google and Facebook, which make their money from advertising. And advertisers pass their costs on to the services and products they offer. In the end, we, the consumers, pay, also with our personal information. This in turn enables the platforms to streamline their business and plan their strategic moves – they not only know us, but they also know the business and market better than anyone else. At the same time, we experience an evolution into a worldwide mega-system with simultaneous massive monopolization phenomena, which means supremacy and control on a technical, economic, military, and political level. This power struggle takes place not only between companies but also between states and geopolitical power blocs. It is obvious: whoever is good at informatics leads the field. Informatics offers many opportunities to advance our society. However, there are several critical and questionable developments (This list is by no means exhaustive.):
• Following the dynamics of the so-called platform economy, a few online platforms dominate the market (Parker et al. 2016). Their contribution is the efficiency increase of the market by reducing transaction costs, in line with Williamson’s transaction cost theory (Williamson 1985). The production or the concrete product plays almost no role, because it is virtualized. Thus, informatics virtualizes not only products and companies but also entire markets. This role as an information and transaction hub leads to immense market values (which are also virtual?). In March 2011, only two IT companies were among the top ten companies with respect to market capitalization; in July 2020, there were seven out of ten (also occupying the first seven places) (This is based on a list of publicly traded companies of the Financial Times Global 500.). However, these market capitalizations have no corresponding employee numbers; compare the number of employees at Siemens and Facebook and their corresponding market capitalization. But what is their real added value, and what are they creating? What is their true value?
• The developments in the field of automated decision-making and artificial intelligence – trivially speaking, the computational mapping and automation of human thinking – lead to systems that decide at least partly autonomously (such as autonomous driving, soon autonomous tax consultants, or desk clerks). This is connected with serious legal and, above all, ethical questions (Larus et al. 2018). For instance, one might not want to conceive the impact of extreme examples such as autonomous weapons or “killer robots,” but provisions have to be made, and a global ban analogous to that on chemical weapons seems necessary. Or, as Berners-Lee once more notes, “The fact that power is concentrated among so few companies has made it possible to weaponise the Web at scale.” (Tim Berners-Lee: “The Web is under threat.
Join us and fight for it.” In: World Wide Web Foundation, 12 March 2018. [https://webfoundation.org/2018/03/webbirthday-29])
• The evolution of automation of thinking – along with the ability of software components to control all machines as part of the technology stack – will have a massive impact on work conditions and jobs, both qualitatively and quantitatively. Productivity will increase massively and produce enormous wealth (I do not discuss the highly political issue of how this wealth will be distributed.), jobs will be destroyed and created (most probably more destroyed than created), and we will develop a new understanding of work time and also leisure.
• The Web allows massive invasions of privacy, both by private companies and government agencies. The urgency of this development will continue to increase and therefore requires legal and technical control measures. At the same time, the global Web facilitates terrorist attacks, as it is itself already a means of military warfare. Cybersecurity and data protection will become permanent topics of our society.
• Of course, these developments are also noticeable in the political sphere through the effects of the deliberate production of fake news and the emergence of filter bubbles on the Web. The latter are also a result of recommender systems algorithms as well as the fact that the Web reflects in the data and in the
algorithms the prejudices of its users. While technical improvements in and solutions for algorithms and data selection appear feasible, dealing with fake news and the use of the Web for subversive political manipulation will lead to massive political disputes, especially with online political decision-making processes already on the horizon.
IT and Structural Change

As already described in the section on the history of e-tourism, IT and informatics had a direct impact on the travel and tourism industry. And it will continue to be a major playing field in the e-commerce domain (or IT, in general), with many technical innovations such as recommender systems, emotional computing, group decision-making and social choice, social media, or computational model. However, I consider the structural change at an industry level, i.e., new innovative market players and their IT-based platforms and related ecosystems, as probably the most important aspect of IT-induced change. This also seems to be true for the fast-emerging and growing sharing economy. It has to be highlighted that the first and major sharing economy examples are in tourism and mobility, namely, Airbnb and Uber. Also, in the tourism market, IT-based innovations were and are often connected with platform strategies, leading to the “Winners take it all” phenomenon. Such platforms offer technologies and services for a broad ecosystem of users and companies (Cusumano 2010). Notably, external innovations create these ecosystems around the platform (Cusumano 2008). The platform operator provides the basic functionality and opens the platform to enable external innovation. Further functionalities are often provided by partners or competitors, who also use these services at the same time. This presents a competitive advantage over so-called pure product solutions that have to constantly implement and integrate their own services and innovations. Successful platforms have to meet two conditions: first, there must be at least one open technical interface as “a system of use,” and second, it has to be easy to get and stay connected. Relatively high switching costs and bundling of services form a strategically important part of platforms; they retain users by making it technically difficult to move to another platform.
Obviously, the value of a platform increases with the number of participating companies and/or users. This is also known as Metcalfe’s law, which asserts that the effect of a communications network is proportional to the square of the number of connected users. In the tourism context, one can identify a so-called paradoxical spiral (This is similar to Kai-Fu Lee’s Virtuous Cycle: “More data begets more users and profit, which begets more usage and data.”), where the participation of suppliers such as hotels in these platforms increases their dependencies at the same time (Calatrava et al. 2015). In general, e-tourism is an excellent example of such innovation approaches. Take online travel agencies (OTAs) such as booking.com: they continue to grow, with a monopolistic tendency. Due to the mentioned positive network effects, the already strong platforms will become stronger with every
additional input (e.g., hotels, customers, reviews, etc.). The more hotels rely on the different platforms and distribute their rooms through them, the greater the market power of the latter. Owing to the competition and the importance of the OTAs, providers are forced onto such platforms. Comparison – based on only a few parameters – eases the cognitive load of customers and also lowers prices for the service providers. Similar considerations are valid for search engines or social media sites. The activities on these sites increase the expenses of the service providers and strengthen the network effects of the platforms. This creates a paradoxical spiral: the expenditures of the tourism service providers increase to the same extent as their dependence on the platforms, including higher (direct and indirect) costs. The more effort is made by suppliers, the weaker their position becomes with regard to these “central” organizations. Both Metcalfe’s law and the paradoxical spiral tend to turn such platform companies into natural monopolies (Vardi 2019b). It is an ironic fact that tourist destinations would nearly perfectly meet the prerequisites for an industry-wide service platform. They already have the required regional/national ecosystem (i.e., users, service providers, complementary services and products, content, advertisers, and channel partners) to form a successful platform. There is a kind of structural equivalence between a Web network/platform strategy and a destination’s cooperation strategy. This is also valid for the cooperation with other industries such as agriculture, which is especially important, since tourists look for a bundle of comprehensive tourism experiences, not only for hotels or for individual activities. Finally, such a platform would enable the bundling of expertise and the gathering of data for the analysis of consumer behavior or for product development. In addition, it would ease the participation in a wider innovation process.
However, this opportunity was not taken by the destinations (Calatrava et al. 2015), due to their specific limited business models and operational as well as political constraints. It is something like an irony of fate that a technology such as the Web, which had created utopian hopes on the tourism suppliers’ side of enabling and reinforcing direct relationships with tourists, led to even stronger intermediaries and dependencies than before in the history of tourism (The irony is even bigger, as already in the late 1990s organizations such as IFITT (International Federation for IT and Travel & Tourism) organized workshops and published papers on communities, social media (years before Facebook), and destination platforms.).
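To make the network effect discussed in this section more tangible, Metcalfe’s law as stated above can be written as V(n) = k · n², where n is the number of connected users. The following is only a minimal Python sketch under that assumption: the function names and the scaling constant k are hypothetical and chosen for illustration, and the numbers do not describe any real platform.

```python
def metcalfe_value(users: int, k: float = 1.0) -> float:
    """Network effect under Metcalfe's law: proportional to users squared.

    k is an assumed, purely illustrative scaling constant.
    """
    if users < 0:
        raise ValueError("users must be non-negative")
    return k * users ** 2


def pairwise_links(users: int) -> int:
    """Number of distinct user-to-user connections, n(n-1)/2.

    This count grows roughly like n squared, which is the usual
    intuition behind the quadratic form of the law.
    """
    return users * (users - 1) // 2


if __name__ == "__main__":
    # Doubling the user base quadruples the value under this model.
    for n in (1_000, 2_000, 4_000):
        print(n, metcalfe_value(n), pairwise_links(n))
```

Under this simple model, a platform with twice as many participants has four times the network effect, which is one way to read the statement that already strong platforms become stronger with every additional hotel, customer, or review.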
Future Developments in e-Tourism

In the following, I attempt an outlook on a possible e-tourism future. This is of course not “scientifically sound” and highly subjective (I am not a “futurologist.” In a marketing note for an innovation and technology conference, the organizers even wanted to provide a look beyond the future – we have no idea how such nonsense could be done!), but I take the freedom to highlight some future issues, which I consider to be important. It is important to remember that the tourism product and the tourism industry have special, inherent features, which emphasize the
important relationship between IT and tourism and which make it highly attractive for informatics applications and research. Consequently, travel and tourism will remain a critical test and application field for computer science, which, in turn, is enabling new services, products, and co-operations for the tourism industry and other domains. In this context, I also refer to the JITT research agenda (Werthner et al. 2015), which, with its integrative classification of open and most promising e-tourism research issues, would also provide a framework to identify future developments. It is based on the concept of a digital infrastructure with its five different layers, i.e., (i) individual, (ii) group/social, (iii) enterprise/corporation, (iv) networks/industry, and, finally, (v) governmental/policy. However, in this introductory chapter of the handbook, I limit myself to a simpler differentiation, distinguishing only between probable technological and market developments, both being highly interrelated. From a technical point of view, the following developments appear to be influential in the near future (For the sake of brevity, I only provide a list with rather limited descriptions. Detailed explanations of at least some of these developments will be given in the respective chapters of this handbook. I do not include rather “distant” developments such as quantum computing. And of course, this list is by no means exhaustive.). In some cases, such as the Internet of Things or mobile services, their tourism relevance is obvious; in other cases, such as software engineering or system architectures, the impact will be indirect via faster development cycles or system architectures. Anyhow, one never knows in advance.
• Mobile applications running on many different devices (or everywhere), not only on one, in addition to the Internet of Things (IoT), with permanent connectivity.
• IoT with its ubiquity will require new search and recommender approaches that move away from textual search using the terminology of the suppliers to emotional, implicit, and sensor-based approaches.
• Novel interaction paradigms between humans and computers such as new search and highly personalized recommendation approaches (emotional, implicit, sensor based, pro-active), probably leading to persuasion systems.
• In order to avoid permanent annoying accessibility and availability, we will see intelligent switch on/off services, e.g., also summarizing the most important missed interactions and content items.
• Data analytics or data-driven tools on all different levels – person, group, enterprise, sector specific – with applications using advanced computational model techniques.
• Related to the data-driven computing approach, an increased use of “AI” techniques. These data-driven learning approaches will be combined with top-down logic-based ones, providing semantics and “interpretation” to the current systems (A really interesting AI project would be a Turing test, tailored and cut down to the tourism domain, maybe combined with a publicly announced prize.).
• Lightweight software engineering and tools, enabling ubiquitous applications and leading to more prototype-based approaches as well as further increased
reuse of existing software from (public) repositories. In addition, interfacing will become easier, resulting in improved software and system integration, eventually in a global system.
• Decentralization vs. centralization: these two “computing” and system architecture paradigms will continue to co-exist.
• Rise of privacy-preserving approaches and tools, as well as more cybersecurity awareness due to an increasing number of attacks.
• Finally, I foresee the growth of content moderation tools as well as semiautomated truth-checking services in (social) online media or in applications such as hotel rating systems, using a mixture of text mining/AI tools and human intelligence (I do not list the huge number of possible applications of AI techniques and tools; this and a critical discussion of AI would be worth another article.).
On the market and service level, I see the following probable developments.
• With strong network effects and further market concentration, we will see increasing legal, economic, and political pressure and activities against further market concentration.
• In addition, I expect that – with time – at least some of the big platform companies will really enter the tourism market, creating tough competition for the now already “old” tourism platforms.
• Ongoing innovation with further waves of new services and ongoing commoditization of existing services, easing their reuse by others.
• Competition between different electronic players and more global systems (on the same technical infrastructure) will lead to blurring boundaries and easier switching between the different segments.
• This will lead to improved services and quality of search and recommendations (up to persuasion) with “better” contextualized content (for orientation, assurance, and trust).
• Further segmentation of consumers, leading to more personalized offerings, with an interesting dialectic of bundling vs.
unbundling of services.
• Total customer care services (across transactions and different customer life cycles) vs. “Do-it-yourself,” where consumers will switch between these two segments.
• Further peer-to-peer (P2P) markets, where I expect further strong intermediaries as well as the entrance of established players with product offerings and sharing (probably similar to information sharing).
• Data analytical approaches will play a crucial role in these developments, enabling faster reactions and shorter planning periods for providers, and there will be data analytics services and learning for consumers (e.g., ecological footprint services, personalized supplier ranking based on online data, optimization of leisure experience given preferences and budget).
• Given this plethora of technologies and services, a crucial issue in an SME-structured industry such as tourism is: who will have the know-how and
skills to provide such sophisticated services (see also Calatrava et al. (2015), for a related destination-centric approach)?
• This ongoing fast development as well as the networking nature of both this new world and the tourism business will call for more cooperation, not only between tourism enterprises but also with “know-how” providers (hopefully also universities).
On a more general level, in addition to new services and technologies, the tourism system faces the issue of resilience and sustainability (as an industry and with respect to environment and social acceptance). Permanent short-term efficiency optimization cannot be the long-term strategy.
Conclusions

The travel and tourism industry exemplifies the transformative power of IT. As such, this sector is only one, but a very prominent, example of the overall development driven by IT. This ranges from the individual level (as a user or as a service provider) up to the macro level, with platform innovation gradually leading to monopolistic market structures. One should not forget the “big picture” of fundamental change. Clearly, this development will continue, and IT does not stop. We will see further technological, and sometimes even disruptive, innovation waves with new services and market players. I conclude from the past that innovation – with a focus on platforms – will mainly come from outside the core industry (This is valid for all industry sectors, not only tourism. See, for example, the automobile sector, where companies like Tesla or Google can be considered as main innovators.). On a structural level, we will see an ongoing dialectic process of simplification and complexity, at a market level as well as regarding technology. The journey has not ended yet. However, we should not only observe but also contribute. This will require not only technical and/or tourism skills but also the mastering of a “multidimensional cube” of different disciplines. This will be a challenge. The evolution of the Web and IT offers many opportunities but also raises serious concerns regarding its impact on the entire society and on our future, in general (see section “Informatics: Good New World?”). We have to be aware of our responsibility as scientists (Popper 1969), putting human values and needs into the center, instead of allowing technologies to shape humans. In this context, I refer to the Vienna Manifesto on Digital Humanism (www.informatics.tuwien.ac.at/dighum. The Manifesto is the result of the first international workshop on Digital Humanism in Vienna, Austria, April 2019.
There were over 100 participants from academia, governmental organizations, industry, and civil society, representing disciplines such as political science, law, sociology, history, anthropology, philosophy, economics, and informatics.), with its approach that describes, analyzes, and, most importantly, tries to influence the complex interplay of technology and humankind, for a better society and life. The quest is for enlightenment and humanism, to bring together humanistic ideals with critical thoughts about technological progress. The Manifesto is not only a critical examination of the current situation but primarily
a call for action, both scientifically and practically. I briefly highlight some of the core principles of the Manifesto:
Democracy
• Digital technologies should be designed to promote democracy and inclusion
• Fairness, responsibility, and transparency of software programs and algorithms are essential
Regulation
• Need for regulation, given the ongoing monopolization of the Web
• Decisions affecting human rights must be made by humans
The role of research, science, and academia
• Universities have a particular responsibility. They are the place where knowledge is created and critical thought cultivated
• The connection of different scientific disciplines is essential; we need to break up scientific silos
Education
• Academic teaching needs to combine humanities, social sciences, and engineering
• Education on informatics and its societal impact must start as early as possible
These principles are also valid for tourism and e-tourism, which are part of our society. For example, the call for interdisciplinarity, the combination or integration of different disciplines, is something where e-tourism as an interdisciplinary endeavor could even provide good practices. Similarly, putting humans, society, and also nature at the center of IT-based sustainable developments is crucial for tourism; our resources are not endless. We are at a crossroads; the stakes are high! The outcome depends on us.
Cross-References
A Futuristic Look at Tourism in the Era of the Internet Ecosystem
Artificial Intelligence and Machine Learning
Digital Divide in E-Tourism
Digital Ecosystems, Complexity, and Tourism Networks
Drivers of e-Tourism
e-Tourism Research: A Review
Group Decision-Making and Designing Group Recommender Systems
Mobile Applications for e-Tourism
Network Science and e-Tourism
Semantic Web Empowered E-Tourism
Strategic Use of Information Technologies in Tourism: A Review and Critique
The Diffusion of Information and Communication Technologies in the Tourism Sector
The Evolution of Online Booking Systems
Virtual Reality and the End of Tourism? A Substitution Acceptance Model
References
Acemoglu D, Robinson JA (2012) Why nations fail: the origins of power, prosperity and poverty. Random House
Buhalis D, Law R (2008) Progress in information technology and tourism management: 20 years on and 10 years after the Internet – the state of e-tourism research. Tour Manag 29(4):609–623
Calatrava Moreno M, Hörhager G, Schuster R, Werthner H (2015) Strategic e-tourism alternatives for destinations. In: Information and communication technologies in tourism 2015. Springer, Cham/New York, pp 405–417
Cusumano M (2008) Technology strategy and management. The puzzle of Apple. Commun ACM 51(9):22–24
Cusumano M (2010) Staying power: six enduring principles for managing strategy and innovation in an uncertain world (lessons from Microsoft, Apple, Intel, Google, Toyota and more). Clarendon lectures in management studies. Oxford University Press, Oxford, pp 22–30
Hanson R (1998) Long term growth as a sequence of exponential modes. http://hanson.berkeley.edu/longgrow.html
Hardin G (1968) The tragedy of the commons. Science 162(3859):1243–1248
Larus J, Hankin C, Carson SG, Christen M, Crafa S, Grau O, Kirchner C, Knowles B, McGettrick V, Tamburri DA, Werthner H (2018) When computers decide: European recommendations on machine-learned automated decision making. Joint report Informatics Europe & EUACM. https://www.informatics-europe.org/publications
Lazer D, Pentland S, Adamic L, Aral S, Barabasi AL, Brewer D, Christakis N, Contractor N, Fowler J, Gutmann M, Jebara T, King G, Macy M, Roy D, Van Alstyne M (2009) Life in the network: the coming age of computational social science. Science (New York, NY) 323(5915):721
Lee EA (2020) The coevolution. The entwined futures of humans and machines. MIT Press, Cambridge, MA/London (preprint)
National Research Council (2012) Continuing innovation in information technology. The National Academies Press, Washington, DC
Neidhardt J, Werthner H (2018) IT and tourism: still a hot topic, but do not forget IT. JITT 20(1–4):1–7
Ostrom E (1990) Governing the commons: the evolution of institutions for collective action. Cambridge University Press, Cambridge
Parker GG, Van Alstyne MW, Choudary SP (2016) Platform revolution: how networked markets are transforming the economy and how to make them work for you. W. W. Norton & Company, NY
Perez C (2002) Technological revolutions and financial capital: the dynamics of bubbles and golden ages. Elgar, London
Popper K (1969) Moral responsibility of the scientist, encounter
Scaglione M, Schegg R, Trabichet JP (2013) Analysing the penetration of Web 2.0 in different tourism sectors from 2008 to 2012. In: Information and communication technologies in tourism 2013. Springer, Berlin/New York, pp 280–289
Sertkan M (2018) Classifying and mapping e-tourism data sets. Master’s Thesis, TU Wien
Vardi M (2019a) How the hippies destroyed the Internet/full text; CACM July 2018, 61/7
Vardi M (2019b) The Winner-Takes-All Tech Corporation CACM, Nov 2019, 62/11
Wang D, Fesenmaier DR, Werthner H, Wöber K (2010) The journal of information technology & tourism: a content analysis of the past 10 years. Inf Technol Tour 12(1):3–16
Werthner H, Klein S (1999) Information technology and tourism: a challenging relationship. Springer, Wien
Werthner H, Ricci F (2004) E-commerce and tourism. Commun ACM 47(12):101–105
Werthner H, Nachira F, Oreste S, Pollock A (1997) Information Society Technologies (IST) for Tourism. Report of the Strategic Advisory Group on the 5th Framework Program on Information Society applications for transport and associated services. “THINK TANK ON IST FOR TOURISM”. Report for the EU
Werthner H, Alzua-Sorzabal A, Cantoni L, Dickinger A, Gretzel U, Jannach D, Neidhardt J, Pröll B, Ricci F, Scaglione M, Stangl B, Stock O, Zanker M (2015) Future research issues in IT and tourism. Inf Technol Tour 15(1):1–15
Williamson O (1985) The economic institutions of capitalism. Macmillan, New York
2
Development of Information and Communication Technology: From e-Tourism to Smart Tourism
Rosanna Leung
Contents
Introduction
ICT Development from the Tourism Supplier’s Perspective
  Digitalization of Tourism Operations
  Data-Driven Management
  Digital Ecosystems
ICT Development from the Tourist Perspective
  Technology Changes Pre-trip Behavior
  ICT Co-create In-trip Experiences
  ICT-Empowered Real-Time and Post-trip Sharing
ICT Catalyze IT Skills Training and E-Learning
Expected Future Developments
Cross-References
References
Abstract This chapter reviews e-tourism development across three areas: suppliers, tourists, and educational needs. Regarding suppliers, the adoption of information and communication technology (ICT) begins with simplifying operational procedures, increasing employee productivity and enhancing information retrieval. Tourism managers rely on ICT and data to carry out management tasks such as promoting products online, making strategic decisions, monitoring customer satisfaction and product customization, and maintaining a sustainable business environment. Application systems within the tourism ecosystem should be interconnected and interoperable to form a smart network. An intelligent
R. Leung
Department of International Tourism and Hospitality, I-Shou University, Kaohsiung, Taiwan
e-mail: [email protected]
© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_2
environment enables smart networks with high-speed data exchange and autonomous services. From the tourist perspective, ICT significantly changes tourists’ trip decision-making process and behavior. Destination choice and travel product information are available online without time delays or any geographical or language barriers. Multimedia and 3D virtual images expand the information richness and facilitate co-created travel experiences. Social media allow tourists to share their trip experiences with friends and relatives in real time and express their feedback about travel products on easily accessible review sites. To address employers’ and future employees’ needs, universities must revise tourism education curricula to include a significant ICT component, ensuring that students are equipped with the necessary ICT skills in areas such as data analysis, scenario interpretation, robot management, and artificial intelligence (AI) applications.
Keywords e-Tourism · ICT · Suppliers · Tourists · Tourism curriculum · Tourism ecosystem
Introduction
People have travelled for centuries all around the globe for leisure and business. Travellers benefit from experiencing different cultures, interacting with locals, dining, shopping, and being able to escape dull daily routines (Greenblat and Gagnon 1983). Half a century ago, tourism suppliers such as airlines and chain hotels introduced affiliated reservation systems on a worldwide basis to enable barrier-free distribution of tourism products. However, travel product procurement processes depended strongly on intermediaries’ expertise in operating these systems. On the one hand, tourism suppliers relied on travel intermediaries to distribute their products; on the other hand, travel intermediaries needed to maintain good relationships with their suppliers to secure competitive rates, maximize their profits, and increase their attractiveness for consumers. This tourism ecosystem worked seamlessly for decades (Buhalis and Leung 2018). Travel agents assisted inexperienced tourists in arranging travel plans and purchasing travel products. Tourists followed the information provided in travel guidebooks and destination management office (DMO) promotional materials to plan and arrange their travel itineraries. Marketing activities of tourism organizations focused on advertisements in traditional media such as travel magazines and travel guidebooks and on participating in travel expos hosted by DMOs to expand into new market segments and explore business opportunities. When tourists were away from their home countries, they were wholly disconnected from friends because international communication was either not widely available or expensive.
This travel business model and these tourist behaviors persisted for decades, up to the dawn of the Internet era. However, information and communication technologies (ICT) and the Internet have revolutionized and disrupted the tourism ecosystem and tourist travel behavior. ICT simplifies communication, data storage, and computation; increases employee productivity, customized service, and travel products; and enables self-service capabilities for customers (Buhalis and O’Connor 2005). The rapid development of ICT, the ubiquitous use of smartphones and mobile networks, and high-speed Internet connections affect business operation and management style. ICT strengthens tourism organizations’ competitiveness, improves their strategic performance, and is inseparable from business operations, customer services, cost control, and strategic planning (Law et al. 2009). The Internet empowers both tourism enterprises and tourists. Technology-enabled tourism, commonly referred to as e-tourism, has drastically impacted the ways of doing business within the tourism industry. e-Tourism is broadly described as the practice of analyzing, designing, implementing, and applying ICT and Internet solutions in the tourism, travel, and hospitality industry (Chuang et al. 2017). To support public and private sector operations and management and to enable tourists to stay connected with family and friends, e-tourism infrastructure requires integrated software, networked hardware, and diverse information sources. Tourism is a complex product, and ICT further enhances its experiential and complexity levels (Racherla et al. 2008). Affordable cellular networks allow tourism suppliers to stay seamlessly connected with tourists. Nowadays, electronic word-of-mouth (eWOM) and user-generated content (UGC) have replaced traditional marketing channels, changed customer information search behaviors, and impacted purchase decision processes (Bruhn et al. 2012).
Travellers share status updates via online social network platforms during their trips (Rouhani et al. 2013). With the growth of smart city networks and intelligent technologies, hospitality and tourism suppliers have started providing humanless and automated services. Furthermore, AI assists forecasting and decision-making and catalyzes the tourism ecosystem toward smart tourism. The following sections will highlight the development of e-tourism from two key stakeholder perspectives: suppliers and tourists. Then, the resulting education and training needs related to IT skills will be identified. The last section of the chapter will forecast future developments in smart tourism.
ICT Development from the Tourism Supplier’s Perspective
ICT digitized the entire tourism ecosystem, including transportation, accommodations, dining, intermediaries, public sectors, and non-profit organizations. ICT has also disrupted the traditional ways of doing business. Developments in ICT have had profound effects on the determination of competitive advantages within the tourism industry. ICT has produced outcomes, introduced innovations, and facilitated the shift to a knowledge-based economy (Papanis and Kitrinou 2011).
ICT has also been the catalyst for the formation of an integrated tourism ecosystem through interconnected and interoperable application systems. Most importantly, ICT has provided added value to stakeholders such as cost reductions, value-added marketing, and being able to participate in a virtual community (Mahajan et al. 2015). ICT adoption started as a tool to enhance service quality, increase the efficiency of business operations (Richardson and Marshall 1999), and provide cost savings (Oyewole et al. 2008). Organizations that implement e-business can be categorized into five types of adopters: leaders, technology experts, fast adopters, beginners, and late adopters (Vlachos 2013). Leaders and technology experts are always the pioneers in their industry and role models in their discipline. They are willing to take risks, invest in ICT, and reengineer traditional operating procedures. Late adopters, on the other hand, are more conservative regarding changes and reluctant to invest in ICT. Their investments mainly focus on areas that can generate revenue; in intangible areas such as employee efficiency and productivity, brand image, and customer perceptions, where the return on investment cannot be easily quantified, late adopters hesitate to adopt new ICT (Sigala 2003b). ICT adoption in the tourism industry can be classified on three levels. The first level is digitalization of operations to enhance data processing efficiency and transaction handling accuracy (Leung and Law 2013). The second level involves data-driven management where ICT assists managers with scenario analysis, decision-making, and revenue management. The third level is the adoption of tourism ecosystem applications with dynamic interconnectivity and interoperability via smart networks. In this last stage, ICT allows interoperable applications within the tourism ecosystem without human intervention.
Digitalization of Tourism Operations
ICT adoption marks the beginning of a revolution in tourism operations. Tourism organizations typically start adopting ICT to reengineer operations and procedures so that they run more efficiently and effectively and to handle transactions more accurately (Alcántara-Pilar et al. 2017; Leung and Law 2013). Initial ICT adoption can improve company efficiency and productivity (Sedmiak et al. 2016), enhance communication among operations and departments, increase employee productivity and efficiency (Davidson et al. 2002), and reduce manpower needs (Thorn and Chen 2005). The hotel and airline industries were the pioneers in adopting ICT for key operations (Buhalis 2004; Sheldon 1983). The potential of ICT development, ICT adoption, and ICT acceptance has also been researched for decades. Buhalis and Deimezi (2004) found that tourism business sectors lag behind other industries in a rapidly changing high-tech environment. Small- and medium-sized enterprises continue to find it difficult to cope with the ever more complex and rapidly changing online environment in which they have to operate (Romero and Tejada 2020).
Front Office Operations
ICT was first introduced to the tourism and hospitality industries to replace expensive human labor with technology, enhance operational efficiency, and improve the customer experience. Both customers and businesses can benefit from improved communication, reservations, and guest service systems. The aviation industry adopted a wide range of ICT applications for both operation management and customer services. The first tourism-related ICT platform was an airline reservation system which was launched in 1946 by American Airlines, and the first business-to-business (B2B) platform, the global distribution system (GDS), began operation in 1963 to keep track of flight schedules, availability, and prices (Sheldon 1997). Travel agent staff used the system to search for and sell tourism products to customers. However, operating the system required intensive training because the platform was command-based without any navigation assistance. Moreover, GDS required a dedicated terminal and communication line, and therefore the cost of using GDS was relatively high. The Internet era saw many GDS enhancements. Travel agencies were soon able to use their existing computers and data lines to access the system, and staff no longer needed to use text commands since operations could be handled via mouse clicks. Reservations of all kinds of travel products became the norm, including hotels, cruises, car rentals, and theme parks. Beyond reservation systems, aviation operations also rely heavily on ICT for seamless operation.
A marketing and customer relationship management (CRM) system provides loyalty programs and personalization services to frequent travellers; a decision support system handles fleet management and schedule optimization; a baggage and cargo handling system reads barcode and radio-frequency identification (RFID) tags to seamlessly transport checked baggage to and from aeroplanes; and an in-flight entertainment system allows passengers to watch on-demand video programs on personal television (Benckendorff et al. 2019). Compared to the aviation industry, hotels were relatively conservative and slow to utilize ICT (Ayeh 2006). The first electronic front desk operation system, Reservation, was launched in 1958 and handled guest reservations and inter-departmental communications for Sheraton Hotel (Sheraton 2019). Later, it was modified into a property management system (PMS) for handling hotel operations. Before the Internet era, operation-oriented applications such as magnetic card key systems, CRM, and guest-operated in-room devices, such as call accounting systems, minibar systems, and in-room entertainment systems, played a key role in hotel service. These systems increased staff productivity and enhanced inter-departmental communication efficiency. However, interconnecting different applications that ran on different IT platforms required costly custom-made proprietary interfaces in which many hotels were not willing or able to invest. As a result, manual operations were required for an extended period, which only increased human error and decreased efficiency. With the growing popularity of the Internet, an increasing number of tourism start-ups emerged that work as online travel intermediaries or online travel agencies (OTAs) providing travel products and services. Because of the consumers’
successful adoption of these new intermediaries, hotels needed to handle an increasing number of online channels related to rate management and product availability. Various types of online channels and rate management applications were introduced and connected with PMS to reduce workload and assist hotel management (RateTiger 2020). International hotel chains developed their central reservation systems (CRS) to enable cross-country hotel reservations. However, these traditional and proprietary systems limited development and collaboration with other online channels. As a result, hotel chains started phasing out their proprietary CRS (IHG 2015) and collaborating with GDS companies, such as Pegasus, Sabre SynXis, and Amadeus iHotelier, on cloud-based reservation systems in order to optimize their channel management via a web-based platform.
Back Office Operations
Computers can reliably process enormous amounts of information faster and cheaper than humans, simultaneously analyzing and comparing the data, making information more easily distributable and accessible to users (Sheldon 1983). Therefore, tourism organizations started adopting ICT decades ago for back-of-the-house operations and management such as financial and human resources management (Alonso-Almeida and Llach 2013), procurement (Au et al. 2014), and customer relations and marketing (Fuchs et al. 2010). Back office operations typically focus on resources management such as procurement and inventory control, cost control, financial management, and document management. ICT also assists human resources management by handling employee payroll, leave, duty rosters, and various scheduling matters. CRM was first introduced to profile customers in the tourism industry (Álvarez et al. 2007; Taga et al. 2011) and to gather customer preferences and activity usage that can be examined and later used to plan and implement direct marketing. These systems support frontline operations by keeping track of a large amount of daily business-related data so that the operation staff can easily retrieve historical information. Furthermore, these data can be used as supplementary data for management decision-making. ICT also assists DMOs in managing and promoting tourist attractions. Destination management systems (DMS) have shifted from simple information databases to complex, web-based platforms that support and facilitate communication and interactions among stakeholders including tourists, OTAs, suppliers, and government. Small tourism enterprises have budget limitations when it comes to ICT adoption, in which case governments should provide support and encourage investment in ICT to enhance competitiveness (Mwita 2014). Governments typically play a leading role in advocating for the development of e-tourism, which can stimulate demand for adoption.
Whenever there is a newly launched technology, the government must initiate changes and prepare relevant rules and regulations in order to protect the rights and obligations among the numerous stakeholders (Kalbaska et al. 2017). Document digitization, for example, cannot be successful without government involvement. Government legislators and policy makers recognize digitized contracts with a digital signature as official documents. Moreover, electronic visa services not only stimulate tourist travel desire and revisit intention but can also improve destination image (Çakar et al. 2018). Therefore, the government should consider digitizing visa
application processes and automated immigration services which can additionally expedite tourist arrivals and departures.
Data-Driven Management
In the second stage of ICT adoption, tourism managers utilize ICT for decision-making. One of the main characteristics of a tourism product is perishability. No product can be stored indefinitely for sale at a later time. Therefore, tourism managers must closely manage their inventory to minimize lost capacity and adjust the prices of their products either to stimulate purchase intention during low seasons or to maximize profits during high seasons. In addition, Internet and social media marketing strategies can assist tourism organizations with brand image formation and customer engagement. Prior studies have examined the role of ICT in strategic planning, online marketing strategies (Ruiz-Molina et al. 2014; Yayli and Bayram 2010), the implementation of distribution strategies (Zare and Chukwunonso 2015), profit maximization (Xu and Li 2017), and customer value co-creation (Neuhofer and Buhalis 2014). Previous research has also proposed knowledge management-driven decision support systems for DMOs to provide up-to-date and precise information for scenario analysis and strategic planning (García-Crespo et al. 2010). Knowledge-based destinations have the potential to enhance and sustain competitiveness in both urban and rural communities (Racherla et al. 2008). E-services such as web information, an interactive map with virtual tours, virtual games, and journey planners could improve the sustainability of tourism management (Chiabai et al. 2013). Moreover, external political, economic, social, and technological data must be considered and made publicly available so that tourism organizations can prepare better strategic plans and make more accurate management decisions (Buhalis and Leung 2018). The following subsections illustrate how ICT and big data influence online marketing strategies, channel management, service personalization, and environmental sustainability.
Online Marketing
Online marketing starts with the development of a website. Small- and medium-sized organizations adopted the use of websites to maintain long-term relationships with their customers (Çetin et al. 2004) as well as to increase productivity (Sigala 2003a). Many marketing activities have moved to online platforms since the first commercial websites were launched in the early 1990s because this advertising medium seemed likely to increase global visibility and enable precise targeting of the most desired audiences (Giannopoulos and Mavragani 2011). In the mid-1990s, web surfers needed web portals because obtaining a website address (URL) on the Internet was not easy, especially for novice users (Klausegger 2005). Web surfers were not able to locate and retrieve information about a business without knowing the specific URL. Search engines changed information search behavior and nowadays play an important role in online marketing. Websites that allow quick navigation are likely to attract travellers, and distinct designs can even lead to subsequent revisit intentions (Ku and Chen 2015).
Establishing trust is essential for increasing e-booking website usage intention (Jeng 2019); trust greatly helps to reduce any perceived risks of online purchases. OTAs provide one-stop service with a wide range of products as well as product/service reviews that can assist customers with their purchase decisions. Therefore, their role in creating distribution channels is seen as important (Law et al. 2015; Novak and Schwabe 2009). The metasearch engine is another intermediary but does not directly offer any travel products; instead, metasearch engines offer a valuable service by referring or linking customers directly to key sources or businesses (Christodoulidou et al. 2010). Travel suppliers strongly depend on these new networks to help broadcast and globally distribute their products (Buhalis and Kaldis 2008; Novak and Schwabe 2009). Websites have become quintessential information sources and distribution channels for all facets of the tourism industry. Tourism industry practitioners need to know how to design and maintain an attractive website that stands out from their competitors. Moreover, tourism websites need to be user-friendly and accessible. First impressions are long-lasting, and the acceptance and trustworthiness of a tourism website are essential, or else users will move on to a competitor (Besbes et al. 2016; Sahli and Legohérel 2015). Ert (2014) argued that a simplistic, minimalist website design could generate a surprisingly positive effect on customers. The semantic web can personalize information and improve satisfaction with e-tourism applications (Siricharoen 2010). Semantic web rules can also be used in an expert system to detect fraud in tourism e-commerce transactions (Roldán-García et al. 2017). Additionally, the emotions of a user can be predicted by using sentiment analysis based on their communication threads (Neidhardt et al. 2017).
A website evaluation is an examination process in which the quality of a website, including its features and performance, is assessed (Qi et al. 2020). Various evaluation models have been created to examine the functionality and usability of tourism-hospitality websites, and such models serve to define industry benchmarks against which performance can be evaluated (Ali 2016; Law et al. 2010; Zafiropoulos and Vrana 2006). Mobile applications also need evaluation of their performance and efficacy due to the increasing information accessibility and purchase capability via smartphones (Groth and Haslwanter 2016; Hoadjli et al. 2017). Online platforms have greatly empowered and enhanced old-fashioned “word-of-mouth.” Social media and user review websites contain rich textual information on customers’ opinions and attitudes about a whole array of tourism organizations and businesses. e-Tourism marketing strategies can now focus on customer interaction and engagement via social networks and mobile platforms. Nowadays, customers post negative reviews to express their dissatisfaction, vent their emotions, and share information with other prospective customers. Therefore, managers should respond properly and quickly, especially to any negative comments, suggestions, or feedback (Fernandes and Fernandes 2018). Destination promotion has recently faced challenges, especially due to unofficial information sources which capture tourists’ attention. Proactively learning about and better understanding users’ perceptions in the context of social media could help managers maintain high user engagement and fortify a positive brand image (Mich and Baggio 2015).
2 Development of Information and Communication Technology: From e-Tourism . . .
Recommender systems, for example, must include social data in any recommendation algorithm. Facebook has many sophisticated users whose behavioral data can be tracked, collected, and analyzed to understand an organization’s online brand ranking (Capatina et al. 2018). Destination managers must manage their business’s online reputation by listening closely and responding appropriately to social media comments that can impact their reputation and patronage (Inversini and Cantoni 2011). Data mining and sentiment analysis on social network platforms allow tourism marketers to measure tourist preferences and then aggregate that information for tourism promotion and management (Sun et al. 2017). Big data stored within Facebook and Tripadvisor are treasure troves for marketing departments. “A picture is worth a thousand words” is a commonly used expression easily understood across virtually all cultures and languages. By analyzing the geoposition indicators of photos on image-focused social media, tourists’ intention to visit a particular destination can be discovered (Latorre-Martínez et al. 2014). However, analyzing big data is a challenge for frontline tourism operators and managers. Researchers have designed and tested tools that assist managers with data collection, analysis, reviews, and subsequent action planning. Other than websites, near field communication (NFC) tags on promotion posters draw tourists’ attention and provide supplementary information that improves destination service quality, branding, and marketing (Pesonen and Horster 2012). Digitization also helps form destination image (Mannas et al. 2013) because destination branding is easily communicated via social media (de Rosa et al. 2019). Technology can also customize and co-create new experiences by adopting a virtual environment to deliver multisensory simulations that engage the five senses; it has proven a useful tool for promoting wine tourism (Martins et al. 2017).
Virtual reality (VR) generates realistic 3D images, sounds, and sensory simulations that stimulate interest; it is also a good marketing tool for destination promotion (Martins et al. 2017). However, when designing smartphone-based virtual guiding services, trustworthiness is an essential factor because tourists need to be assured that the multimedia representations are authentic (Koukopoulos and Styliaras 2013).
Channel Management
Websites and online platforms affect the relationships between suppliers and travel intermediaries. Two decades ago, many studies forecasted disintermediation as customers would choose to directly purchase from suppliers via online platforms (Garkavenko et al. 2003; Standing and Vasudavan 2000; Tse 2003). Buhalis and Licata (2002) proposed a revised e-tourism intermediary framework to showcase the new era of travel distribution channels. To successfully address this challenge, traditional travel agents have altered their business models by adding value to customers by offering online access channels (Zare and Chukwunonso 2015). As an example, recommender systems (RS) enable travel agencies to provide dynamic case-based recommendations to their potential customers (Büyüközkan and Ergün 2011). Management decision support applications such as revenue management and artificial intelligence (AI)-based decision support systems assist managers in
R. Leung
making appropriate decisions derived from historical and external big data. Revenue management applications for forecasting future trends start with an analysis of customers’ historical booking patterns such as booking pace, price, and duration. Later versions of revenue management applications incorporate decision support systems that analyze external data, such as competitor data and tourism arrival statistics, to generate various revenue strategy scenarios from which managers are then better enabled to pick the best model for their organization. Small and medium travel organizations could form alliances to compete with big OTAs with their powerful interconnectivity and interoperability of application systems. A decade ago, many managers hesitated when they conducted risk assessments of implementing technology-related knowledge alliances (Pansiri and Courvisanos 2010). Most notably, the performance and efficacy of supply chains affect the overall business performance of tourism organizations (Paskaleva et al. 2011).
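The booking-pace analysis mentioned above can be illustrated with a small sketch. The code below is not taken from any revenue management product; it assumes a simple additive "pickup" forecast, in which expected final bookings equal the bookings already on hand plus the average number of bookings that past arrival dates gained over the same lead time. The function names and the sample figures are hypothetical.

```python
from statistics import mean


def average_pickup(history, days_out):
    """Mean number of bookings gained between `days_out` and the arrival day.

    Each entry in `history` maps days-before-arrival to rooms booked at that
    point, so curve[0] is the final count for that past arrival date.
    """
    return mean(curve[0] - curve[days_out] for curve in history)


def forecast_final_bookings(on_hand, history, days_out):
    """Additive pickup forecast: bookings on hand plus average past pickup."""
    return on_hand + average_pickup(history, days_out)


# Hypothetical booking curves for three comparable past arrival dates.
history = [
    {0: 90, 7: 70, 14: 50},
    {0: 100, 7: 75, 14: 60},
    {0: 95, 7: 80, 14: 55},
]

# 65 rooms are booked seven days before the arrival date being forecast.
forecast = forecast_final_bookings(on_hand=65, history=history, days_out=7)
print(forecast)
```

A decision support system of the kind described would extend such a baseline with external inputs such as competitor data and arrival statistics; the sketch covers only the historical booking-pattern step.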
Service Personalization
ICT is a strategic e-business tool for customizing products, personalizing mobile services, sharing data, and supporting UGC sharing (Stiakakis and Georgiadis 2011). An intelligent mobile tour guide keeps track of user behavior and refines the user model to provide flexible travel information (Cena et al. 2006). Recommender systems (RS) contain databases with travel schedules and a service-oriented architecture that can act as a virtual tourist hub (Smirnov et al. 2017) and recommend tourist activities by ranking the geo-location, spatial coordinates of the events, tourists’ location, price of the products, time slots, user profiles/personality, and diversity to provide personalized destination recommendations in real time (Gretzel et al. 2004; Montejo-Ráez et al. 2011). A personalized RS that includes Google Translate could resolve language issues and help cater to different user interaction needs (Håkansson et al. 2010). Graphical 3D maps, as an example, give users visual assistance for customizing the results generated by RS (Noguera et al. 2012). Personalized recommendations can further be enhanced by including Global Positioning System (GPS) data and semantic web mining from user-generated content review sites or OTAs (Logesh et al. 2018). Geotagged social data identify the popularity of tourist spots, so RS can provide real-time suggestions via mobile applications that enable tourists to decide the best time to visit any specific tourist spot (Komninos et al. 2017). Collaborative filtering, which includes machine learning analysis of customer online reviews, provides precise recommendations on Tripadvisor (Nilashi et al. 2018). However, trust is one of the main factors when evaluating the reliability of RS. Therefore, tourism organizations must ensure they build and maintain a trust-enabled database (Pettenati et al. 2008).
This can be achieved to a significant degree by collecting a traveller’s personal network data from social media, such as the profiles of colleagues and schoolmates and their travel behavior, and providing recommendations according to the traveller’s style and preferences (Frikha et al. 2017; He et al. 2016). ICT can also enhance tourism product loyalty. If the customers are satisfied with an online platform, their loyalty will increase (Kim et al. 2011). Playfulness
is a main factor that affects satisfaction with online services (Vladimirov 2012). Gamification on destination websites can contribute to a more rewarding interaction for web users and elicit higher levels of satisfaction, increased brand awareness, and loyalty (Xu et al. 2017). Prior studies have proposed multiuser online role-playing games for promoting tourism (Berger et al. 2007), but gamification applications are still not widely adopted by tourism organizations. Customer feedback is perhaps the most ubiquitous of ICT applications. Such apps help business entities evaluate their service quality, gauge their customer satisfaction levels, and then identify areas in need of improvement. Collecting paper-based feedback from customers in the past was expensive, and response rates were extremely low. Online feedback forms on business websites allow customers to provide feedback anytime and anywhere. With social networks and review sites, customers share their opinions with their friends and the public at large. A negative experience posted on an Internet blog or website could be spread to millions of people overnight, and the supplier has no control whatsoever. Therefore, tourism managers should closely monitor social network platforms and promptly respond to user comments to prevent negative comments from spreading quickly and widely (Ekiz et al. 2012).
Environmental Sustainability
ICT has ushered in an era of innovative ways for sustainable tourism development. Energy management systems monitor hotel room occupancy to automatically adjust in-room temperature settings. Carbon calculators are used to determine carbon emissions and the amount of energy consumed by business activities. Computer simulation applications simulate real-world settings for direct observation, manipulation, and analysis of the most effective management strategies. Destination management systems consolidate and distribute tourist information via online channels and hence reduce paper reports and brochure printing. Environment management systems keep track of waste and emission data for cost-benefit assessment and better decision-making. Websites assist cultural tourism promotional campaigns by creating destination awareness through the use of digital images and virtual tours and can thereby protect fragile artifacts and archaeological sites (Valčić and Domšić 2012). GPS, geographic information systems (GIS), and carbon calculators can assist destination managers who are dedicated to sustainable tourism (Ali and Frew 2014). Local cities can adopt location-based services via smartphones for visitors and e-parking services via mobile communication to increase the utilization of urban green spaces (Karagiannis et al. 2014). Cultural heritage concerns can adopt ICT to convey and distribute diverse multimedia information and contribute to their own long-term competitiveness and sustainability (Paskaleva and Azorín 2010). Gamified ICT applications for promoting sustainable tourism can have a positive impact on communication and social interaction with tourists and residents, who will be better informed, more skilled, and more positively disposed toward tourism and sustainability (Negrușa et al. 2015).
Heritage sites, museum exhibits, and historical artifacts are fragile; however, it is almost impossible to stop tourist visits since these are primary tourism attractions. AR allows tourists to interact with
virtual exhibits which can increase their enjoyment while also protecting fragile artifacts with displays of accurate digitized 3D objects (Webb et al. 2016). Overtourism is a hot topic that received intense attention and discussion in the past decade. Reino et al. (2014) proposed a benchmarking framework so that managers could quickly evaluate their current carrying capability against the benchmark and have a better understanding of their own capability. Social media is now an established factor that stimulates travel demand and leads to overtourism (Alonso-Almeida et al. 2019). Therefore using social media to disperse tourists away from the main pressure points is the best way to combat overtourism (Gretzel 2019). DMOs could manage a destination’s carrying capacity by careful planning and managing tourism which concurrently respects the well-being of the permanent residents at tourism destinations (Wall 2020). By adopting ICT analytic tools and sensors, tourist movement and travel patterns could be monitored and thereby lessen the tension of overtourism.
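As a rough illustration of how sensor counts could be compared with a carrying-capacity benchmark, consider the sketch below. It is not the framework of Reino et al. (2014); the zone names, capacities, visitor counts, and the 90% alert threshold are all assumptions made for the example.

```python
def utilization(visitor_counts, carrying_capacity):
    """Ratio of current visitors to benchmark capacity for every zone."""
    return {
        zone: visitor_counts.get(zone, 0) / capacity
        for zone, capacity in carrying_capacity.items()
    }


def pressure_points(visitor_counts, carrying_capacity, threshold=0.9):
    """Zones whose utilization reaches the (assumed) alert threshold."""
    ratios = utilization(visitor_counts, carrying_capacity)
    return sorted(zone for zone, ratio in ratios.items() if ratio >= threshold)


# Hypothetical benchmark capacities and live sensor counts per zone.
carrying_capacity = {"old_town": 2000, "harbour": 1500, "hill_park": 800}
visitor_counts = {"old_town": 1900, "harbour": 600, "hill_park": 820}

crowded = pressure_points(visitor_counts, carrying_capacity)
print(crowded)
```

A destination manager could use the resulting list to decide where dispersal messages are most needed, while zones with low utilization, such as the harbour in this example, are candidates for redirecting visitors.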
Digital Ecosystems
In the current Internet era, it is necessary to have interconnectivity and interoperable application software to handle comprehensive business-to-business integrations (Fodor and Werthner 2004; Perks and Riihela 2004). According to Chuang et al. (2017), important technologies for e-tourism include computers, the Internet, mobile computing, social networks, cloud computing, big data, and IoT. Buhalis and O’Connor (2005) pointed out that ambience and intelligence should be the focal point of technology developments in tourism. This requires an emphasis on sensor technology, embedded systems, ubiquitous communications, media management and handling, natural interaction, contextual awareness, and emotional computing. A prior study predicted that the future of e-tourism would focus on consumer-centric technologies that would enable business entities to carry out innovative interactions and reengineer their communication strategies (Buhalis and Law 2008). From an organization’s perspective, smartness refers to the integration of a network of organizations and smart features that engage in interoperable and interconnected systems to simplify and automate daily activities. Doing so adds value throughout the entire ecosystem for all stakeholders (Buhalis and Amaranggana 2015; Buhalis and Leung 2018). Definitions of smart tourism have been discussed among scholars. Gretzel et al. (2015b) presented seven main differences between e-tourism and smart tourism: Core technologies of smart tourism are sensors and smartphones. Data is the lifeblood of smart tourism, which means it requires a move from discrete information to big data. Smart tourism also involves shifting from a value chain mindset to an ecosystem perspective that not only serves tourists pre- and post-travel but also includes time and activities during a trip. Rather than transactional information exchanges, smart tourism values stakeholder collaborations and partnerships.
And finally, smart tourism is not only about digitization but helps actors within the smart tourism ecosystem bridge digital and physical environments.
Gretzel et al. (2015a, p. 181) defined smart tourism as: Tourism supported by integrated efforts at a destination to collect and aggregate/harness data derived from physical infrastructure, social connections, government/organizational sources and human bodies/minds in combination with the use of advanced technologies to transform that data into on-site experiences and business value-propositions with a clear focus on efficiency, sustainability and experience enrichment.
Liburd et al. (2017) proposed that smart tourism should be considered in terms of three aspects: (1) embracing the fluid and emergent nature of intelligent, appreciative, and complementary understanding that is more empathic, flexible, humble, and sustainable, (2) co-design of values, and (3) collaborative design with trust.
Intelligent Environment
The move toward a smart tourism ecosystem necessitates a re-engineering of operations and procedures with interconnectivity and interoperable application systems. Governments can play a leading role in moving an entire country through a smart transformation by investing in DMOs, ICT capital, human capital, and values (Vargas-Sánchez 2016). For example, the government of Russia constructed an information highway and associated e-government integrations with numerous digitalization programs (Kolarova et al. 2006). The Taiwan government was the first to provide free Wi-Fi access to both residents and tourists (iTaiwan 2011). Citizens and visitors in public areas within the European Union have been provided free Wi-Fi connectivity, primarily through funding made available by municipalities for hardware and maintenance. The vast mobile network will continue to play a crucial role in the tourism industry and must adjust and constantly adapt as the functions and durability of smartphones are continually enhanced. Smart hospitality networks connect all stakeholders within the overall smart ecosystem and sub-ecosystems and consolidate and aggregate historical and external environment contextual observations to form big data, while all data can be maintained anonymously and confidentially (Buhalis and Leung 2018). Unfortunately, many stakeholders in the tourism industry have not yet recognized the importance and benefits of mobile technology (Dorcic et al. 2019). Many smart applications such as AI-enabled business intelligence systems are available, but ICT adoption in the hospitality and tourism industry is still considered rather limited. Most business applications remain mainly text- and image-based, while 3D and real-time applications are not widespread. The involvement of government can be significant.
Some local city governments have installed an impressive array of sensors to measure external environmental conditions such as water and air quality which, in turn, can ensure travel quality and attractiveness (Karagiannis et al. 2014). Business organizations make use of these sensors and beacons for their marketing activities. Edge computing allows the information collected by these sensors to be processed at the edge of the network, away from data centers, so as to reduce data traffic (Taleb et al. 2017). Subsequently, data can be sent to the cloud for comprehensive data analysis by combining it into big data and using AI to scrutinize it.
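The traffic-saving idea behind edge processing can be sketched as follows: rather than forwarding every raw reading, an edge node reduces a batch of readings to one compact summary record before anything is sent to the cloud. The record layout, field names, and sensor identifier below are hypothetical and stand in for whatever a real deployment would use.

```python
from statistics import mean


def summarize_at_edge(readings):
    """Reduce a batch of raw readings from one sensor to a summary record."""
    values = [reading["value"] for reading in readings]
    return {
        "sensor_id": readings[0]["sensor_id"],
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


# Hypothetical air-quality readings collected by one roadside sensor.
raw_readings = [
    {"sensor_id": "aq-17", "value": 40},
    {"sensor_id": "aq-17", "value": 42},
    {"sensor_id": "aq-17", "value": 47},
]

# Only this single record would be forwarded for cloud-side analysis.
summary = summarize_at_edge(raw_readings)
print(summary)
```

In this toy case three records shrink to one; with readings taken every few seconds across a city, the same reduction is what keeps data traffic to the data center manageable.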
Even though smart tourism has recently become a hot topic, there remains a series of challenges to overcome before full realization. Smart tourism experiences require extensive technologies and derivative services; therefore, the IT skill level of the tourists, especially with handheld devices, will impact smart service delivery and tourist satisfaction (Gretzel et al. 2015a). Moreover, some tourism stakeholders do not clearly understand the definition of “smart tourism” and its critical functions such as interconnectivity and interoperability along the supply chain points, applications, and linkages to external big data. For example, accurate revenue forecasting, social media monitoring, and the use of AI and robots are fundamentally beneficial to any manager or leadership team (Leung 2019). Interconnectivity and interoperability of devices installed inside a hotel for use by both employees and hotel guests are being realized with the implementation of high-speed 5G networks and the Internet of Things (IoT). Ambient intelligence brings current information to tourism ecosystems and helps make those environments more sensitive, flexible, and adaptive to the needs of stakeholders (Buhalis 2019). For example, hotel and restaurant lighting and temperature can be dynamically adjusted to provide a comfortable environment for customers and guests. Operations can also be automated via an IoT network. For example, inventory can be monitored by sensors installed inside a warehouse or refrigerators that identify the inventory on hand and the expiry date of the items.
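The inventory example can be made concrete with a short sketch. It assumes that shelf or refrigerator sensors report, for each item, the quantity on hand and the expiry date; the `StockReading` structure, the thresholds, and the sample items are illustrative assumptions rather than part of any particular IoT platform.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class StockReading:
    """One sensor report for an item (hypothetical structure)."""
    item: str
    quantity: int
    expiry: date


def items_needing_attention(readings, today, min_quantity=5, expiry_window_days=3):
    """Map each flagged item to the reasons it was flagged."""
    cutoff = today + timedelta(days=expiry_window_days)
    alerts = {}
    for reading in readings:
        reasons = []
        if reading.quantity < min_quantity:
            reasons.append("low stock")
        if reading.expiry <= cutoff:
            reasons.append("expiring soon")
        if reasons:
            alerts[reading.item] = reasons
    return alerts


readings = [
    StockReading("milk", 12, date(2022, 1, 12)),
    StockReading("butter", 2, date(2022, 2, 1)),
    StockReading("eggs", 30, date(2022, 1, 25)),
]

alerts = items_needing_attention(readings, today=date(2022, 1, 10))
print(alerts)
```

Here the milk is flagged because it expires within the three-day window and the butter because fewer than five units remain, while the eggs need no action.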
Autonomous Services
Self-service options can reduce contact with human staff members for mundane and/or simple assistance. Therefore, implementing a customer relationship management system and providing personalized services could maintain or bolster brand loyalty (Stockdale 2007). Self-service kiosks, for instance, provide a preprogrammed environment in which novice users can complete simple tasks such as ticket purchasing, check-in/checkout, food ordering, and information retrieval that previously were completed by service staff. Autonomous devices, self-learning algorithms, and IoT are at the forefront of the movement of tourism ecosystems from e-tourism toward smart tourism. However, this ecosystem cannot be truly ubiquitous without the establishment of the necessary technological and regulatory foundations (Gretzel et al. 2015c). Public sector policy makers should define the technical standards, policies, and regulations for implementing any new tools in the tourism industry. As an illustration, robots have been increasingly replacing humans to carry out repetitive, boring, and stressful tasks at high speed and efficiency (Engelberger 2012). The first robot hotel, Henn na Hotel, was opened in Nagasaki, Japan, in 2015. The hotel management placed emphasis on humanless service. Reception, concierge services, luggage handling, and in-room ambient controls were all handled by robots. However, these robots were not AI-embedded, so they could not interact with hotel guests according to any specific, in-the-moment requests. Because of this inflexibility and lack of intelligence, Henn na Hotel Group decided to lay off half of their in-room robots after 4 years because they found that some robots were not advanced enough to perform many of the tasks they needed to do and quite
unexpectedly created much extra work for the human staff (Forbes 2019). Also, Hilton collaborated with IBM to design the first Watson-enabled robot, Connie, to assist with guest requests, personalize guest experiences, and empower travellers with additional information for trip planning (Hilton 2016). It is important to extend research on human-robot interactions given the increasing number of users interacting with robots every day (Tung and Law 2017). Service robots in restaurants not only serve customers but can also cook and prepare meals. These range from low-level operational robots that flip hamburgers and make cocktails to sophisticated robots with learning ability that can simulate a celebrity chef’s movements to assist with cooking at home; both are useful as well as popular (Moley Robotics 2015). Autonomous devices can save labor costs and increase service efficiency. Drones can be utilized for remote service delivery in areas where no infrastructure is available, and hotels or event venues can adopt autonomous furniture that can automatically relocate to a desired location for any specific occasion. Virtual agents (VA) or chatbots (e.g., Google Assistant, Apple’s Siri, Amazon’s Alexa, and Microsoft’s Cortana) are currently the most developed forms of AI that support customer services (Syam and Sharma 2018). Tourists can use voice commands in their native language to have VAs accomplish tasks such as tour and restaurant reservations, in-room ambient control, and numerous other functions from inside their room without facing language barriers.
ICT Development from the Tourist Perspective
One of the reasons why people travel is to escape from their daily routine; however, because of the increasing digitalization of life and work, people are having difficulties due to their inability to let go of network connections when they travel (Egger et al. 2020). On the positive side, ICT has changed tourist travel patterns and enabled co-creation of experiences. This transformation has revolutionized the travel planning process and ushered in a new era of co-created tourist travel experiences on three levels: technology-assisted experience, technology-enhanced experience, and technology-empowered experience (Neuhofer et al. 2014). At the technology-assisted experience level, ICT acts as a mediator for information retrieval and communication. At the technology-enhanced level, tourists participate in and interact with tourism organizations’ activities via interactive technologies such as social media platforms (e.g., Facebook, Instagram, Twitter, and YouTube). Finally, at the technology-empowered experience level, tourists incorporate high-level technologies to co-create and optimize their own experiences. These enhanced travel experiences appear in all three stages of the tourist life cycle (pre-trip, in-trip, and post-trip) (Joseph and Anandkumar 2016). With ICT, tourists can gather and analyze travel information to plan for the trip, adopt ICT to enhance the travel experience, and share their travel memories online. Figure 1 illustrates the ICT applications for tourists in the three travel stages.
Fig. 1 ICT applications for tourists in the three stages of travel
Technology Changes Pre-trip Behavior
Traditionally, travel inspiration came from DMO promotional materials, tourist guidebooks, tourism promotional videos, recommendations from travel agencies, and friends and relatives’ word-of-mouth regarding their past experiences. Tourists went to a travel agency seeking travel information and products, and they usually would end up using the travel agency’s services. Just a few decades ago, there were extremely few channels for the ordinary customer to buy airline tickets or book hotel rooms directly from suppliers. In those times, communication costs (e.g., IDD calls) were very high, and language differences posed major barriers whenever communicating with overseas companies. Besides, operating a GDS required special training, and printing of airline tickets also needed specific licensing from the airline company. As a result, travel agencies were the main channels for tourists to buy travel products. In the current Internet era, tourist travel behaviors have changed dramatically. At the pre-trip stage, tourists’ travel intentions can be inferred from clicks and typed keywords related to banners and/or online advertisements. Information search behavior can be identified from the products searched via e-distribution intermediaries. Visits to travel review sites can help in understanding tourists’ evaluation processes. Online transactions inside shopping carts can indicate tourist buying behaviors, and membership profiles record customer preferences and help form target marketing (Louvieris and Driver 2004). With the popularity of smartphones and the rapid development of broadband cellular network technology, mobile applications have gradually replaced desktop applications. Tourists no longer depend on outdated paper-based travel materials for
travel ideas (Umlauft et al. 2002). Especially for younger generations, their travel desire is influenced by blogs, vlogs, and social networks (Bajpai and Lee 2015). Business travellers’ ICT needs include reliable website reservation systems and in-room Internet access; these have become the most critical ICT applications for hotel choice (Yeh et al. 2005). Multimedia, 3D images, and web-based GIS applications such as Google Maps have greatly expanded the ways travel information is being accessed. Practical information, such as transportation, accommodation, and location-specific information, has become a primary information source for travel destination choice and planning (Chang and Caneday 2011). In addition, tourists can obtain VR materials prepared by tourism organizations, which offer an innovative source of inspiration when engaging in travel planning. Furthermore, users can immerse themselves in a computer-generated environment such as Second Life, where interactions with the virtual environment and co-created travel experiences can occur with avatars (Guttentag 2010). Traditional travel planning mainly depended on tour guidebooks and information brochures from DMOs. The information provided from these two channels was very generic. Consequently, the more experienced tourists were oftentimes not satisfied. Tourists could only explore their destination upon arrival, so no advance planning could be carried out. ICT, however, assists tourists when planning their trips. The earliest intelligent travel planning began with collecting travel-related information by WebBots and providing solutions to resolve specific user problems (Camacho et al. 2001). However, since the results were obtained from existing databases, personalization was essentially unavailable. In contrast, chat-based recommender systems (RS) allow users to make decisions among sets of alternatives (Nguyen and Ricci 2018).
Destination information is an important piece of information for any travel planner. E-tourism tools examine the spatial and temporal data collected from social media channels and can then forecast the most likely tourist travel behavior and assist tourists with their vacation planning (Pichler et al. 2014). RS assist with automatically arranging travel affairs (Camacho et al. 2006), so they must be able to provide various itinerary recommendations based on the tourist’s specific preferences, demographic characteristics, and places visited on former trips. In this manner, personalized travel information can be offered that best matches the user’s taste (Sebastia et al. 2009). Advanced RS use social network data and travel patterns to provide customized recommendations (Chiang and Huang 2015; Frikha et al. 2017; He et al. 2016). Later on, RS included machine-learning algorithms to analyze customer online reviews and assist tourist travel decision-making (Nilashi et al. 2018). Travel products are high-involvement products because customers cannot try them beforehand. Therefore, travellers need extra information to support their purchase decisions. Review comments posted on social media and UGC sites play a highly valued role in travel product choices (Papathanassis and Knolle 2011). Facebook is one of the key information sources for travel planning, a practical and efficient place for sharing judgments and complaints, as well as enhancing tourist knowledge management, coordination, and even claim reduction (Pantano
and Pietro 2013); however, users understandably trust information that comes from close friends (Stiakakis and Vlachopoulou 2017). Since social media and UGC comments play a crucial role in travel planning, it is important to ensure the online review data are reliable data sources (Xiang et al. 2018). Tourists seek comments/reviews from the virtual community before purchasing travel products. Sources such as Tripadvisor should formulate their strategies for significant adoption and implementation of consumer-generated media which will attract tourist attention (Burgess et al. 2015). Even though tourists can purchase products directly from a supplier’s website, many of them use metasearch engines to compare prices. Metasearch engines have become a new form of travel intermediaries (Hikkerova 2010). All in all, understanding e-tourists’ digital behavior is essential and significant for enhancing the success of online channels.
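The preference-based recommendations discussed in this section can be illustrated with a deliberately simple content-based ranking. It is a minimal sketch under stated assumptions, not the method of any cited system: attractions and tourist preferences are described by weighted tags, places visited on former trips are excluded, and the remaining attractions are ranked by a weighted sum. All names, tags, and weights are hypothetical.

```python
def score(item_features, preferences):
    """Weighted match between an attraction's tags and a tourist's tastes."""
    return sum(
        preferences.get(tag, 0.0) * weight
        for tag, weight in item_features.items()
    )


def recommend(items, preferences, visited=(), top_n=2):
    """Rank attractions not yet visited by descending preference score."""
    candidates = {name: feats for name, feats in items.items() if name not in visited}
    ranked = sorted(
        candidates,
        key=lambda name: score(candidates[name], preferences),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical attraction profiles and one tourist's preference weights.
items = {
    "museum": {"culture": 1.0, "indoor": 1.0},
    "beach": {"nature": 1.0, "outdoor": 1.0},
    "old_town_walk": {"culture": 0.8, "outdoor": 0.6},
}
preferences = {"culture": 0.9, "outdoor": 0.5, "nature": 0.2}

suggestions = recommend(items, preferences, visited={"museum"})
print(suggestions)
```

Systems described in the literature add further signals, such as social network data, location, and mined review content, but the basic step of scoring candidates against a user profile is the same.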
ICT Co-create In-trip Experiences
The durability, battery life, and high speed of mobile devices, along with low data communication costs and cellular network coverage, have greatly catalyzed the development of tourist in-trip applications. Context-aware mobile multimedia guides that integrate emerging technologies and new communication standards provide both appealing and ubiquitous solutions to tourists (Raposo et al. 2012). Geographical location, as an example, is one of the most critical pieces of information for tourists during their trip. Tourists can easily explore the city with mobile applications such as route planning and real-time map services (e.g., Google Maps). Web-based self-guided applications integrated with online maps and GPS allow users to edit their travel plans, browse an itinerary via mobile devices, and share their itinerary with friends (Lin et al. 2014). With geotagged social data, a social context crowdsourcing mobile application identifies the popularity of tourist spots and showcases a heatmap of social “buzz” that can help tourists decide the best place to go and the best time (Komninos et al. 2017). RS can provide in-trip recommendations to enhance travel satisfaction by combining the GPS location of the tourists with review site data (Logesh et al. 2018). Multimedia displays, especially 3D images from augmented reality (AR) and VR, present new experiences to tourists in museums (Jung et al. 2016). ICT co-creates a tourist’s real-time experiences in various ways, for instance, by delivering location-based promotional messages via smartphone; providing external environment conditions such as weather and road traffic via sensors and beacons installed around the city; retrieving instant attraction details through scanned QR codes; and locating empty parking spaces via a geofencing network (Karagiannis et al. 2014). Mobile technology spurs virtual experiences.
AR smartphone applications support tourists who like roaming around unfamiliar environments (Yovcheva et al. 2012). During a trip, a tourist’s experiences can be enhanced by virtual tours with an interactive map, virtual games, and journey planners (Chiabai et al. 2013). AR provides supplemental information such as development or construction/repairs in progress, the original look of onsite architecture, and
2 Development of Information and Communication Technology: From e-Tourism . . .
even personalized site details of interest to the tourist, delivered via mobile devices (Tscheu and Buhalis 2016). On the one hand, tourists can explore and interact with an accurate digitized 3D exhibit, which increases the enjoyment of learning; on the other hand, the real, fragile artifacts are protected (Webb et al. 2016).
One of the major barriers to travelling is having to negotiate a foreign language. AI-enabled language recognition systems assist tourists in this regard. Cross-lingual information retrieval systems that can automatically translate an English phrase into Chinese characters were developed over a decade ago (Li and Law 2007). Subsequently, Mexican entrepreneurs implemented a mobile application that combined Spanish translations, the Google Maps API, and a GPRS cellular network to display heritage site information (Zacarias et al. 2015). More recently, translation devices have provided instant multilingual translation for tourists, greatly facilitating interaction with locals worldwide.
Any successful e-tourism system nowadays must also handle a wide range of online payment options through efficient and user-friendly interfaces (Mohamed and Moradi 2011). Besides traditional payment methods such as credit cards, stored-value prepaid cards, and electronic wallets, cryptocurrency has demonstrated the potential to become an indispensable future currency for tourists since it can avoid currency exchange and exchange rate fluctuation.
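The in-trip recommendation idea described earlier in this section, combining a tourist's GPS position with review site ratings (Logesh et al. 2018), can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Place` record, the `recommend` function, the 2 km cutoff, and the rating-over-distance score are hypothetical and are not the method of the cited study.

```python
import math
from dataclasses import dataclass


@dataclass
class Place:
    name: str
    lat: float     # latitude in decimal degrees
    lon: float     # longitude in decimal degrees
    rating: float  # mean review score on a 1-5 scale (hypothetical review-site data)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two GPS coordinates."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def recommend(places, lat, lon, max_km=2.0, top_n=3):
    """Rank nearby places: a higher rating helps, a longer walk hurts.

    The score rating / (1 + distance) is an illustrative assumption only.
    """
    scored = []
    for place in places:
        distance = haversine_km(lat, lon, place.lat, place.lon)
        if distance <= max_km:  # ignore places outside the cutoff radius
            scored.append((place.rating / (1.0 + distance), place.name, distance))
    scored.sort(reverse=True)
    return [(name, round(distance, 2)) for _, name, distance in scored[:top_n]]


if __name__ == "__main__":
    # Hypothetical attractions near a tourist standing at (0.0, 0.0).
    attractions = [
        Place("Museum", 0.0, 0.001, 4.0),
        Place("Old Town", 0.0, 0.01, 5.0),
        Place("Far Castle", 0.0, 0.1, 5.0),
    ]
    print(recommend(attractions, 0.0, 0.0))
```

A production recommender would also weigh opening hours, user preferences, and crowding; the sketch only shows how location and review data can be joined into a single ranking.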
ICT-Empowered Real-Time and Post-trip Sharing
Travelling not only creates short-lived personal experiences but also provides long-term memories. Images are a quintessential type of souvenir that provides authentic reminders of a particular place and/or experience (Gordon 1986). In past generations, tourists would send postcards to friends to share their happiness and travel experiences. Web 1.0 mainly supported information retrieval and business activities, so users were not able to share their experiences. Web 2.0, however, enables social networking: tourists can easily contribute online by writing blogs (e.g., TravelBlog) and microblogs (e.g., Twitter) and by using social network sites (e.g., Facebook) and multimedia-sharing sites (e.g., YouTube, Instagram) to share their travel diary with friends and family. Furthermore, tourists can share reviews of their experiences with the public via user-generated content (UGC) review sites (e.g., Tripadvisor) or OTA websites. Photographs or short videos sent from mobile phones are now the de facto postcard for sharing among friends. Sharing, of course, covers not only knowledge-related aspects such as product quality, prices, and weather conditions but may also include the tourist’s personal emotions, imaginations, and fantasies about the trip. Tourists differ widely in their motivations for contributing to social networks. One simple reason is the enjoyment of showing their experience to the particular virtual community to which they belong. Through their contribution they might feel a sense of belonging, solidarity, and identification as a unique member of their community. Moreover, they can support other members and help everyone feel needed and appreciated. Electronic word-of-mouth (eWOM)
R. Leung
comprises the personal evaluations and opinions about products and businesses that active users post in virtual communities across the web. Travelling, for instance, is a unique visual experience, and vacation photos are important components of one’s travels. Travel photos contain information relating to the interests and activities of tourists during their trips. “The art of much tourist photography is to place one’s ‘loved ones’ within an ‘attraction’ in such a way that both are represented aesthetically” (Urry and Larsen 2011, p. 179). Not surprisingly, people believe that personal information on review sites is more trustworthy than the marketing materials provided by tourism organizations (Yoo and Gretzel 2008). The Internet has become an ideal channel for customers to voice their dissatisfaction to a service provider: it is fast and direct; minimizes barriers of time, place, or prescribed process for registering a complaint; and eliminates face-to-face embarrassment (Ekiz et al. 2012). As a result, customers are more likely to post negative reviews online than to submit paper-based complaints. Customers log their comments not only to express their dissatisfaction but sometimes also in the hope that their feedback will help an organization improve quickly and appropriately. Tourists therefore expect managers to respond to their reviews; otherwise they will most likely feel even more disappointed. “Attention is retention,” as the adage goes.
ICT Catalyze IT Skills Training and E-Learning
Over the last few decades, the development of ICT has introduced new technologies such as the Internet, social media, NFC, VR and AR, ubiquitous computing, AI, IoT, and cloud computing. Applications of these technologies, such as reputation management, revenue management systems, decision support systems, service robots, and autonomous devices, have revolutionized not only tourism ecosystem operations and management but also customer travel behavior (Boes et al. 2015). To cope with the rapidly changing tourism industry environment, colleges and universities have to ensure that the IT skills the industry needs from graduates are covered in the curriculum (Bilgihan et al. 2014). Elliot and Joppe (2009) identified several IT-related competencies, including electronic information sharing, IT knowledge and e-business, e-marketing skills, computer skills, and general use of IT. ICT knowledge and skills have become a core set of competencies for students, and employers expect graduates to be well equipped with them. At the most fundamental level, tourism curriculum design should include computer skills such as using office tools for word processing (Microsoft Word) and presentations (PowerPoint) (Sigala 2002). With the rapid expansion of ICT throughout almost every aspect of the tourism industry, updating the tourism curriculum is critical. For example, big data analysis, online marketing, database management, GDS/GIS operations, and word processing are indispensable. However, ICT-related educational components have been relatively weak in tourism program curriculum design. Future management decision-making strongly depends on extrapolating meaningful results generated from big data and problem-solving
applications. Future tourism employees must be able to understand the concepts of data mining, analysis, and interpretation for decision-making. A gap currently exists between students’ expectations/needs and actual curriculum design (Femenia-Serra 2018). ICT and the Internet form the learning media platform in tourism (Cantoni et al. 2009). Industry practitioners and undergraduate students hold positive perceptions toward e-learning courses (Eraqi et al. 2011; Kalbaska et al. 2013). Integrating technology into instructional practices also helps develop students’ capabilities for wisely using and managing technological trends within their working environment (Sigala 2007). Generation Z grew up with the Internet, and smartphones are a commonplace, critical tool they use daily. Shifting lectures and tutorials onto a mobile platform is therefore an essential next step (Fermoso et al. 2015). Goh and Sigala (2020) proposed that tourism education must teach the necessary ICT usage skills by integrating ICT into course curriculum content, and that instructors must adopt ICT and innovative teaching methods for interacting with students. Innovative teaching materials have been shown to increase students’ learning motivation and outcomes. A study on using wikis and blogs as teaching aids indicated that students’ attitudes were positive but not overwhelmingly favorable (Lillo-Bañuls et al. 2016). Online learning platforms such as Massive Open Online Courses (MOOCs) use a counselling learning approach; combined with social media, they can increase interaction among instructors and learners and help build a sense of interpersonal relationship. This relationship continues even after a course is completed (Marchiori and Cantoni 2018). Organizing and taking a large group of students to a hotel for a field trip may be a challenging task for most instructors.
However, a web-based visit (a virtual field trip) with a 360-degree view could motivate students toward problem-solving and generate more anticipation (Patiar et al. 2017), which indicates that VR could serve as a valuable instructional platform.
Expected Future Developments
Even though ICT has become an essential tourism tool, tourism organization managers and technology experts have overlooked its potential in a few key areas. For instance, accessible tourism offers substantial economic potential, but web-based information has not always been easy for disabled persons to access (Pühretmair 2004). Tourists with different forms of impairment, as well as elderly people who require different levels of assistance when making decisions about travel and destinations, are just two segments that ICT can assist. In turn, this could increase the competitiveness of travel businesses through specially focused destination marketing activities (Buhalis and Michopoulou 2011). Elderly consumers are inaccurately characterized and widely ignored as users of e-tourism (Szopiński and Staniewski 2016). Tourism organizations should avoid inequalities in access, use, and engagement with ICT between tourism markets and destinations (Minghetti and Buhalis 2010).
The Internet of Things (IoT) and 5G networks are fuelling the rapid evolution and growth of e-tourism toward smart tourism and are further disrupting the current ecosystem, business models, and practices. Sensors and beacons installed around a city and within organizations can monitor the external and internal environment and provide real-time data that application systems can access for big data analysis. Open data applications suit continuous business operations and data sharing among all tourism business entities (Yu 2016). However, open databases might not be easily implemented without government involvement. Blockchain technology is still in its infancy in practice (Yuan and Wang 2016), but it can assist smart tourism developments in various ways. It creates new opportunities for the sharing economy, altering conventional e-commerce operations through effective and efficient reallocation of resources (Pazaitis et al. 2017). The emergence of sharing economy services such as Uber and Airbnb changes how people plan their trips. Nevertheless, the sharing economy cannot reach its full potential without mutual trust between service providers and users. Blockchain can serve as an intermediary for business contracts between travel suppliers and customers: because contract information cannot be altered once it has been established, it can deal with the trust problem effectively (Rashideh 2020). It can stream data from various sources and platforms to maximize capacity without overbooking (Irannezhad and Mahadevan 2020). Furthermore, the decentralization of blockchain, that is, distributed ledgers, can further alleviate the problem of trust engendered by peer-to-peer business models or by hotels in their relationship with online travel intermediaries (Flecha-Barrio et al. 2020). A smart tourism network interconnects the tourism ecosystem. The whole network is supported by IoT, sensors and beacons, network infrastructure, and the cloud.
Each organization’s internal databases are exchangeable across the supply chain. External databases from the public and private sectors and environmental data collected from sensors are consolidated in the cloud. AI-enabled applications support both organizational management and customer service. Decision support systems consolidate external data with operations and management data for strategic decisions. Robots and autonomous devices react automatically according to the customer’s context. Intelligent buildings adjust their ambience: on the one hand, they provide a comfortable environment for customers and, on the other hand, they protect the environment and save energy costs. The smart tourism network and its applications also support the tourist’s three travel stages. Smart applications not only allow tourists to enhance their travel experience but also improve interaction with tourism practitioners. Figure 2 illustrates a proposed future smart network for the tourism ecosystem. Smart networks focus not only on technology adoption and implementation but also on architectural design. The layout and structure of an intelligent building with a responsive and adaptive design can be changed according to the user’s requirements (Urquhart et al. 2019), which allows tourism managers more flexibility in venue management. Buildings covered with smart materials can adjust to outside stimuli such as weather conditions to reduce energy consumption and environmental impact (Alobeidi and Alsarraf 2018). The future hotel should operate as an intelligent building with adaptive architecture. Including smart technology in basic building design and architectural planning can (1) dynamically adjust the internal ambience, including lighting, temperature, and humidity, according to the external environment and (2) adjust the living dimensions of a hotel room and functional areas according to guest size and layout needs (Leung 2020). Planners will also analyze real-time consolidated data on hotel occupancy, restaurant, and event bookings and can then convert guest rooms into an event venue, or vice versa, according to demand.
Fig. 2 Proposed comprehensive smart tourism network
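The claim earlier in this section that contract information on a blockchain cannot be altered once established (Rashideh 2020) rests on hash chaining: each block stores the hash of its predecessor, so a change to an old record invalidates every later link. The following is a minimal sketch of that one property, assuming hypothetical booking contracts; the function names `append_block` and `is_valid` and the record fields are illustrative only. It runs on a single machine and deliberately omits consensus, digital signatures, and the distributed ledger that a real blockchain needs.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, contract, prev_hash):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "contract": contract, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, contract):
    """Append a contract record that is linked to the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "contract": contract,
        "prev": prev_hash,
        "hash": block_hash(index, contract, prev_hash),
    })


def is_valid(chain):
    """Recompute every hash; any edited record or broken link fails the check."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, block["contract"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    ledger = []
    # Hypothetical booking contracts between a supplier and a guest.
    append_block(ledger, {"guest": "G-001", "room": "Deluxe", "price": 180})
    append_block(ledger, {"guest": "G-002", "room": "Suite", "price": 320})
    print(is_valid(ledger))            # chain is intact
    ledger[0]["contract"]["price"] = 90  # tamper with an established contract
    print(is_valid(ledger))            # tampering is detected
```

Strictly speaking, the sketch shows that tampering is detectable rather than impossible; in a decentralized deployment, other participants holding copies of the ledger would reject the altered version.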
Cross-References
Developments in German e-Tourism: An Industry Perspective
Digital Marketing in Tourism
Drivers of E-Tourism
Electronic Data Interchange and Standardization
E-Tourism Curriculum
Impact of Artificial Intelligence in Travel, Tourism, and Hospitality
Internet of Things and Ubiquitous Computing in the Tourism Domain
Recommender Systems in Tourism
Robotics in Tourism and Hospitality
Smart Tourists and Intelligent Behavior
Strategic Use of Information Technologies in Tourism: A Review and Critique
Technology-Assisted Mindfulness in the Co-creation of Tourist Experiences
The Evolution of Online Booking Systems
Virtual Reality and the End of Tourism? A Substitution Acceptance Model
Website Evaluation Frameworks: A Review of the Hospitality and Tourism Field from 1996 to 2019
References
Alcántara-Pilar JM, del Barrio-García S, Crespo-Almendros E, Porcu L (2017) Toward an understanding of online information processing in e-tourism: does national culture matter? J Travel Tour Mark 34(8):1128–1142. https://doi.org/10.1080/10548408.2017.1326363
Ali F (2016) Hotel website quality, perceived flow, customer satisfaction and purchase intention. J Hosp Tour Technol 7(2):213–228. https://doi.org/10.1108/JHTT-02-2016-0010
Ali A, Frew AJ (2014) Technology innovation and applications in sustainable destination development. Inf Technol Tour 14(4):265–290. https://doi.org/10.1007/s40558-014-0015-7
Alobeidi MM, Alsarraf AA (2018) The impact of the use of smart materials on the facades of contemporary buildings. Int J Eng Technol 7(4.19):744–750
Alonso-Almeida M-M, Llach J (2013) Adoption and use of technology in small business environments. Serv Ind J 33(15–16):1456–1472. https://doi.org/10.1080/02642069.2011.634904
Alonso-Almeida M-M, Borrajo-Millán F, Yi L (2019) Are social media data pushing overtourism? The case of Barcelona and Chinese tourists. Sustainability 11(12):3356. https://doi.org/10.3390/su11123356
Álvarez LS, Martín AM D, Casielles RV (2007) Relationship marketing and information and communication technologies: analysis of retail travel agencies. J Travel Res 45(4):453–463. https://doi.org/10.1177/0047287507299593
Au N, Ho GKC, Law R (2014) Towards an understanding of e-procurement adoption: a case study of six hotels in Hong Kong. Tour Recreat Res 39(1):19–38. https://doi.org/10.1080/02508281.2014.11081324
Ayeh JK (2006) Determinants of Internet usage in Ghanaian hotels: the case of the greater accra region (GAR). J Hosp Leis Mark 15(3):87–109
Bajpai A, Lee C-W (2015) Consumer behavior in e-tourism services: a case of Taiwan. Tour Hosp Manag 21(1):1–17
Benckendorff PJ, Xiang Z, Sheldon PJ (2019) Tourism information technology, 3rd edn. CABI, Wallingford
Berger H, Dittenbach M, Merkl D, Bogdanovych A, Simoff S, Sierra C (2007) Opening new dimensions for e-tourism. Virtual Reality 11(2–3):75–87. https://doi.org/10.1007/s10055-006-0057-z
Besbes A, Legohérel P, Kucukusta D, Law R (2016) A cross-cultural validation of the tourism web acceptance model (T-WAM) in different cultural contexts. J Int Consum Mark 28(3):211–226. https://doi.org/10.1080/08961530.2016.1152524
Bilgihan A, Berezina K, Cobanoglu C, Okumus F (2014) The information technology (IT) skills of hospitality school graduates as perceived by hospitality professionals. J Teach Travel Tour 14(4):321–342. https://doi.org/10.1080/15313220.2014.955303
Boes K, Buhalis D, Inversini A (2015) Conceptualising smart tourism destination dimensions. In: Tussyadiah I, Inversini A (eds) Information and communication technologies in tourism 2015. Springer International Publishing, pp 391–403. https://doi.org/10.1007/978-3-319-14343-9_29
Bruhn M, Schoenmueller V, Schäfer DB (2012) Are social media replacing traditional media in terms of brand equity creation? Manag Res Rev MRN Patrington 35(9):770–790. https://doi.org/10.1108/01409171211255948
Buhalis D (2004) eAirlines: strategic and tactical use of ICTs in the airline industry. Inf Manag 41(7):805–825. https://doi.org/10.1016/j.im.2003.08.015
Buhalis D (2019) Technology in tourism-from information communication technologies to etourism and smart tourism towards ambient intelligence tourism: a perspective article. Tour Rev 75(1):267–272. https://doi.org/10.1108/TR-06-2019-0258
Buhalis D, Amaranggana A (2015) Smart tourism destinations enhancing tourism experience through personalisation of services. In: Tussyadiah I, Inversini A (eds) Information and communication technologies in tourism 2015. Springer International Publishing, Cham, pp 377–389
Buhalis D, Deimezi O (2004) E-tourism developments in Greece: information communication technologies adoption for the strategic management of the Greek tourism industry. Tour Hosp Res 5(2):103–130. https://doi.org/10.1057/palgrave.thr.6040011
Buhalis D, Kaldis K (2008) eEnabled internet distribution for small and medium sized hotels: the case of Athens. Tour Recreat Res 33(1):67–81. https://doi.org/10.1080/02508281.2008.11081291
Buhalis D, Law R (2008) Progress in information technology and tourism management: 20 years on and 10 years after the Internet-The state of e-tourism research. Tour Manag 29(4):609–623. https://doi.org/10.1016/j.tourman.2008.01.005
Buhalis D, Leung R (2018) Smart hospitality – interconnectivity and interoperability towards an ecosystem. Int J Hosp Manag 71:41–50. https://doi.org/10.1016/j.ijhm.2017.11.011
Buhalis D, Licata MC (2002) The future e-tourism intermediaries. Tour Manag 23(3):207–220. https://doi.org/10.1016/S0261-5177(01)00085-1
Buhalis D, Michopoulou E (2011) Information-enabled tourism destination marketing: addressing the accessibility market. Current Issues Tour 14(2):145–168. https://doi.org/10.1080/13683501003653361
Buhalis D, O’Connor P (2005) Information communication technology revolutionizing tourism. Tour Recreat Res 30(3):7–16. https://doi.org/10.1080/02508281.2005.11081482
Burgess S, Sellitto C, Cox C, Buultjens J (2015) Strategies for adopting consumer-generated media in small-sized to medium-sized tourism enterprises. Int J Tour Res 17(5):432–441. https://doi.org/10.1002/jtr.2008
Büyüközkan G, Ergün B (2011) Intelligent system applications in electronic tourism. Expert Syst Appl 38(6):6586–6598. https://doi.org/10.1016/j.eswa.2010.11.080
Camacho D, Borrajo D, Molina JM (2001) Intelligent travel planning: a multiAgent planning system to solve web problems in the e-tourism domain. Auton Agents Multi-Agent Syst 4(4):387–392. https://doi.org/10.1023/A:1012767210241
Camacho D, Aler R, Borrajo D, Molina JM (2006) Multi-agent plan based information gathering. Appl Intell 25(1):59–71. https://doi.org/10.1007/s10489-006-8866-z
Cantoni L, Kalbaska N, Inversini A (2009) E-Learning in tourism and hospitality: a map. J Hosp Leis Sport Tour Educ (Oxford Brookes University) 8(2):148–156. https://doi.org/10.3794/johlste.82.263
Capatina A, Micu A, Micu AE, Bouzaabia R, Bouzaabia O (2018) Country-based comparison of accommodation brands in social media: an fsQCA approach. J Bus Res 89:235–242. https://doi.org/10.1016/j.jbusres.2017.11.017
Çakar K, Kalbaska N, Inanir A, Şahin Ören T (2018) eVisa’s impacts on travel and tourism: the case of Turkey. J Hosp Tour Technol 9(1):13–31. https://doi.org/10.1108/JHTT-02-2017-0019
Cena F, Console L, Gena C, Goy A, Levi G, Modeo S, Torre I (2006) Integrating heterogeneous adaptation techniques to build a flexible and usable mobile tourist guide. AI Communications 19(4):369–384
Çetin B, Akpinar A, Ozsayin D (2004) The use of information and communication technologies as a critical success factor for marketing in Turkish agri-food companies. Food Rev Int 20(3):221–228. https://doi.org/10.1081/FRI-200029420
Chang G, Caneday L (2011) Web-based GIS in tourism information search: perceptions, tasks, and trip attributes. Tour Manag 32(6):1435–1437. https://doi.org/10.1016/j.tourman.2011.01.006
Chiabai A, Paskaleva K, Lombardi P (2013) e-Participation model for sustainable cultural tourism management: a bottom-up approach. Int J Tour Res 15(1):35–51. https://doi.org/10.1002/jtr.871
Chiang H-S, Huang T-C (2015) User-adapted travel planning system for personalized schedule recommendation. Inf Fusion 21(1):3–17. https://doi.org/10.1016/j.inffus.2013.05.011
Christodoulidou N, Connolly DJ, Brewer P (2010) An examination of the transactional relationship between online travel agencies, travel meta sites, and suppliers. Int J Contemp Hosp Manag Bradford 22(7):1048–1062. https://doi.org/10.1108/09596111011066671
Chuang TC, Liu JS, Lu LYY, Tseng F-M, Lee Y, Chang C-T (2017) The main paths of e-tourism: trends of managing tourism through Internet. Asia Pac J Tour Res 22(2):213–231. https://doi.org/10.1080/10941665.2016.1220963
Davidson R, Alford P, Seaton T (2002) The use of information and communications technology by the European meetings, incentives, conferences, and exhibitions (MICE) sectors. J Conv Exhib Manag 4(2):17. https://doi.org/10.1300/J143v04n02_03
de Rosa AS, Bocci E, Dryjanska L (2019) Social representations of the European capitals and destination e-branding via multi-channel web communication. J Destin Mark Manag 11:150–165. https://doi.org/10.1016/j.jdmm.2017.05.004
Dorcic J, Komsic J, Markovic S (2019) Mobile technologies and applications towards smart tourism – state of the art. Tour Rev. https://doi.org/10.1108/TR-07-2017-0121
Egger I, Lei SI, Wassler P (2020) Digital free tourism – an exploratory study of tourist motivations. Tour Manag 79:104098. https://doi.org/10.1016/j.tourman.2020.104098
Ekiz E, Khoo-Lattimore C, Memarzadeh F (2012) Air the anger: investigating online complaints on luxury hotels. J Hosp Tour Technol 3(2):96–106. https://doi.org/10.1108/17579881211248817
Elliot S, Joppe M (2009) A case study and analysis of e-tourism curriculum development. J Teach Travel Tour 9(3/4):230–247. https://doi.org/10.1080/15313220903379299
Engelberger JF (2012) Robotics in practice: management and applications of industrial robots. Springer Science & Business Media, Boston
Eraqi MI, Abou-Alam W, Belal M, Fahmi T (2011) Attitudes of undergraduate students toward e-learning in tourism: the case of Egypt. J Teach Travel Tour 11(4):325–348. https://doi.org/10.1080/15313220.2011.624397
Ert E (2014) Nontrivial behavioral implications of trivial design choices in travel websites. Adv Cult Tour Hosp Res 8:53–59. https://doi.org/10.1108/S1871-317320140000008002
Femenia-Serra F (2018) Smart tourism destinations and higher tourism education in Spain. Are we ready for this new management approach? In: Stangl B, Pesonen J (eds) Information and communication technologies in tourism 2018. Springer International Publishing, pp 437–449. https://doi.org/10.1007/978-3-319-72923-7_33
Fermoso AM, Mateos M, Beato ME, Berjón R (2015) Open linked data and mobile devices as e-tourism tools. A practical approach to collaborative e-learning. Comput Hum Behav 51:618–626. https://doi.org/10.1016/j.chb.2015.02.032
Fernandes T, Fernandes F (2018) Sharing dissatisfaction online: analyzing the nature and predictors of hotel guests negative reviews. J Hosp Mark Manag 27(2):127–150. https://doi.org/10.1080/19368623.2017.1337540
Flecha-Barrio MD, Palomo J, Figueroa-Domecq C, Segovia-Perez M (2020) Blockchain implementation in hotel management. In: Neidhardt J, Wörndl W (eds) Information and communication technologies in tourism 2020. Springer International Publishing, pp 255–266. https://doi.org/10.1007/978-3-030-36737-4_21
Fodor O, Werthner H (2004) Harmonise: a step toward an interoperable e-tourism marketplace. Int J Electron Commerce 9(2):11–39
Forbes (2019) World’s first robot hotel fires half of its robots. Forbes. https://www.forbes.com/sites/samshead/2019/01/16/worlds-first-robot-hotel-fires-half-of-its-robots/
Frikha M, Mhiri MBA, Gargouri F (2017) Social Trust based semantic tourism recommender system: a case of medical tourism in Tunisia. Eur J Tour Res 17:59–82
Fuchs M, Höpken W, Föger A, Kunz M (2010) E-business readiness, intensity, and impact: an Austrian destination management organization study. J Travel Res 49(2):165–178. https://doi.org/10.1177/0047287509336469
García-Crespo A, Colomo-Palacios R, Gómez-Berbís JM, Chamizo J, Rivera I (2010) Intelligent decision-support systems for e-tourism: using SPETA II as a knowledge management platform for DMOs and e-tourism service providers. Int J Decis Support Syst Technol 2(1):36–48. https://doi.org/10.4018/jdsst.2010101603
Garkavenko V, Bremner H, Milne S (2003) Travel agents in the “information age”: New Zealand’s experiences of disintermediation. In: Information and communication technologies in tourism 2003: proceedings of the international conference in Helsinki, 2003, pp 467–476
Giannopoulos AA, Mavragani EP (2011) Traveling through the web: a first step toward a comparative analysis of European national tourism websites. J Hosp Mark Manag 20(7):718–739. https://doi.org/10.1080/19368623.2011.577706
Goh E, Sigala M (2020) Integrating information & communication technologies (ICT) into classroom instruction: teaching tips for hospitality educators from a diffusion of innovation approach. J Teach Travel Tour 20(2):156–165. https://doi.org/10.1080/15313220.2020.1740636
Gordon B (1986) The Souvenir: messenger of the extraordinary. J Popular Cult 20(3):135–146
Greenblat CS, Gagnon JH (1983) Temporary strangers: travel and tourism from a sociological perspective. Sociol Perspect 26(1):89–110. https://doi.org/10.2307/1389161
Gretzel U (2019) The role of social media in creating and addressing overtourism. In: Overtourism. De Gruyter Oldenbourg, pp 62–75. https://doi.org/10.1515/9783110607369-005
Gretzel U, Mitsche N, Hwang Y-H, Fesenmaier DR (2004) Tell me who you are and I will tell you where to go: use of travel personalities in destination recommendation systems. Inf Technol Tour 7(1):3–12
Gretzel U, Reino S, Kopera S, Koo C (2015a) Smart tourism challenges. J Tour 16(1):41–47
Gretzel U, Sigala M, Xiang Z, Koo C (2015b) Smart tourism: foundations and developments. Electron Mark 25(3):179–188. https://doi.org/10.1007/s12525-015-0196-8
Gretzel U, Werthner H, Koo C, Lamsfus C (2015c) Conceptual foundations for understanding smart tourism ecosystems. Comput Hum Behav 50:558–563. https://doi.org/10.1016/j.chb.2015.03.043
Groth A, Haslwanter D (2016) Efficiency, effectiveness, and satisfaction of responsive mobile tourism websites: a mobile usability study. Inf Technol Tour 16(2):201–228. https://doi.org/10.1007/s40558-015-0041-0
Guttentag DA (2010) Virtual reality: applications and implications for tourism. Tour Manag 31(5):637–651. https://doi.org/10.1016/j.tourman.2009.07.003
Håkansson A, Hartung R, Jung JJ (2010) Using multi-agent system for business applications in multilingual ontologies. Smart Innov Syst Technol 5:157–166. https://doi.org/10.1007/978-3-642-14594-0_16
He J, Liu H, Xiong H (2016) SocoTraveler: travel-package recommendations leveraging social influence of different relationship types. Inf Manag 53(8):934–950. https://doi.org/10.1016/j.im.2016.04.003
Hikkerova L (2010) E-tourism: players and customer behavior. Probl Perspect Manag 8(4):45–51
Hilton (2016) Hilton And IBM Pilot “Connie,” The World’s First Watson-Enabled Hotel Concierge. Hilton Press Center. https://newsroom.hilton.com/corporate/news/hilton-and-ibmpilot-connie-the-worlds-first-watsonenabled-hotel-concierge
Hoadjli A, Kazar O, Rezeg K (2017) A layered design approach for mobile tourism, pp 110–115. https://doi.org/10.1109/ICITECH.2017.8079986
IHG (2015) IHG and amadeus to revolutionise the technological foundations of the global hospitality industry. @ihgdevelopment. https://www.ihgplc.com:443/en/news-and-media/newsreleases/2015/ihg-and-amadeus-to-revolutionise-the-technological-foundations-of-the-globalhospitality-industry
Inversini A, Cantoni L (2011) Towards online content classification in understanding tourism destinations’ information competition and reputation. Int J Internet Mark Advert 6(3):282–299. https://doi.org/10.1504/IJIMA.2011.038240
Irannezhad E, Mahadevan R (2020) Is blockchain tourism’s new hope? J Hosp Tour Technol 12(1):85–96. https://doi.org/10.1108/JHTT-02-2019-0039
iTaiwan (2011) ITaiwan Wi-Fi. Government indoor public area free WiFi access. https://itaiwan.gov.tw/en/
Jeng C-R (2019) The role of trust in explaining tourists’ behavioral intention to use e-booking services in Taiwan. J China Tour Res. Scopus. https://doi.org/10.1080/19388160.2018.1561584
Joseph AI, Anandkumar SV (2016) Demographic influences on social media use across tourist lifecycle phases. JOHAR New Delhi 11(2):49–73
Jung T, tom Dieck MC, Lee H, Chung N (2016) Effects of virtual reality and augmented reality on visitor experiences in museum. In: Inversini A, Schegg R (eds) Information and communication technologies in tourism 2016. Springer International Publishing, pp 621–635. https://doi.org/10.1007/978-3-319-28231-2_45
Kalbaska N, Lee HA, Cantoni L, Law R (2013) UK travel agents’ evaluation of e-learning courses offered by destinations: an exploratory study. J Hosp Leis Sport Tour Educ 12(1):7–14. Scopus. https://doi.org/10.1016/j.jhlste.2012.09.001
Kalbaska N, Janowski T, Estevez E, Cantoni L (2017) When digital government matters for tourism: a stakeholder analysis. Inf Technol Tour 17(3):315–333. Scopus. https://doi.org/10.1007/s40558-017-0087-2
Karagiannis S, Anthopoulos L, Aspridis G, Sdrolias L, Polykarpidis A (2014) Green urban space utilization for mild ICT-based touristic activities: the case of Pafsilipo Park in Greece. JETA: J Environ Tour Anal 2(1):83–96
Kim M-J, Chung N, Lee C-K (2011) The effect of perceived trust on electronic commerce: shopping online for tourism products and services in South Korea. Tour Manag 32(2):256–265. https://doi.org/10.1016/j.tourman.2010.01.011
Klausegger C (2005) Evaluating Internet portals – an empirical study of acceptance measurement based on the Austrian national tourist office’s service portal. J Qual Assur Hosp Tour 6(3/4):163–183
Kolarova D, Samaganova A, Samson I, Ternaux P (2006) Spatial aspects of ICT development in Russia. Serv Ind J 26(8):873–888. https://doi.org/10.1080/02642060601011673
Komninos A, Besharat J, Ferreira D, Garofalakis J, Kostakos V (2017) Where’s everybody? Comparing the use of heatmaps to uncover cities’ tacit social context in smartphones and pervasive displays. Inf Technol Tour 17(4):399–427. https://doi.org/10.1007/s40558-017-0092-5
Koukopoulos D, Styliaras G (2013) Design of trustworthy smartphone-based multimedia services in cultural environments. Electron Commer Res 13(2):129–150. Scopus. https://doi.org/10.1007/s10660-013-9112-5
Ku ECS, Chen C-D (2015) Cultivating travellers’ revisit intention to e-tourism service: the moderating effect of website interactivity. Behav Inf Technol 34(5):465–478. Scopus. https://doi.org/10.1080/0144929X.2014.978376
Latorre-Martínez MP, Iñíguez-Berrozpe T, Plumed-Lasarte M (2014) Image-focused social media for a market analysis of tourism consumption. Int J Technol Manag 64(1):17–30. Scopus. https://doi.org/10.1504/IJTM.2014.059234
Law R, Leung R, Buhalis D (2009) Information technology applications in hospitality and tourism: a review of publications from 2005 to 2007. J Travel Tour Mark 26(5–6):599–623. https://doi.org/10.1080/10548400903163160
Law R, Qi S, Buhalis D (2010) Progress in tourism management: a review of website evaluation in tourism research. Tour Manag 31(3):297–313. https://doi.org/10.1016/j.tourman.2009.11.007
Law R, Leung R, Lo A, Leung D, Fong LHN (2015) Distribution channel in hospitality and tourism. Int J Contemp Hosp Manag. https://doi.org/10.1108/IJCHM-11-2013-0498
Leung R (2019) Smart hospitality: Taiwan hotel stakeholder perspectives. Tour Rev. https://doi.org/10.1108/TR-09-2017-0149
Leung R (2020) Hospitality technology progress towards intelligent buildings: a perspective article. Tour Rev (ahead-of-print). https://doi.org/10.1108/TR-05-2019-0173
Leung R, Law R (2013) Evaluation of hotel information technologies and EDI adoption: the perspective of hotel IT managers in Hong Kong. Cornell Hosp Q 54(1):25–37. https://doi.org/10.1177/1938965512454594
Li KW, Law R (2007) A novel English/Chinese information retrieval approach in hotel website searching. Tour Manag 28(3):777–787. Scopus. https://doi.org/10.1016/j.tourman.2006.05.017
Liburd JJ, Nielsen TK, Heape C (2017) Co-designing smart tourism. Eur J Tour Res 17:28–42
Lillo-Bañuls A, Perles-Ribes JF, Fuentes R (2016) Wiki and blog as teaching tools in tourism higher education. J Teach Travel Tour 16(2):81–100. https://doi.org/10.1080/15313220.2015.1118367
Lin J-W, Chang C-H, Hsieh C-Y, Cheng Y-T, Huang X-T (2014) IEBSR: an integrated e-tourism service for self-guided travel. J Comput (Taiwan) 24(4):12–21. Scopus
Logesh R, Subramaniyaswamy V, Vijayakumar V (2018) A personalised travel recommender system utilising social network profile and accurate GPS data. Electron Gov 14(1):90–113. https://doi.org/10.1504/EG.2018.089538
Louvieris P, Driver J (2004) Avoiding buyer behaviour Myopia in hotel eCommerce. J Hosp Leis Mark 11(1):65–84. https://doi.org/10.1300/J150v11n01_05
Mahajan KB, Patil AS, Gupta RH, Pawar BV (2015) A new ICT based business model for tourism industry for the Maharashtra and Goa States of India. Int J Hosp Tour Syst 8(1):64–69
Mannas PS, Kour P, Bhagat A (2013) Linking E-tourism and cultural digitalization: a sustainable marketing approach towards silk route image. J Tour 14(2):23–33
Marchiori E, Cantoni L (2018) Applying the counseling-learning approach to a tourism-related massive open online course. J Teach Travel Tour 18(1):58–74. Scopus. https://doi.org/10.1080/15313220.2018.1404697
Martins J, Gonçalves R, Branco F, Barbosa L, Melo M, Bessa M (2017) A multisensory virtual experience model for thematic tourism: a Port wine tourism application proposal. J Destin Mark Manag 6(2):103–109. Scopus. https://doi.org/10.1016/j.jdmm.2017.02.002
Mich L, Baggio R (2015) Evaluating Facebook pages for small hotels: a systematic approach. Inf Technol Tour 15(3):209–231. https://doi.org/10.1007/s40558-015-0031-2
Minghetti V, Buhalis D (2010) Digital divide in tourism. J Travel Res 49(3):267–281. https://doi.org/10.1177/0047287509346843
Mohamed I, Moradi L (2011) A model of e-tourism satisfaction factors for foreign tourists. Aust J Basic Appl Sci 5(12):877–883. Scopus
Moley Robotics (2015) Moley – the world’s first robotic kitchen. https://www.moley.com/
Montejo-Ráez A, Perea-Ortega JM, García-Cumbreras MA, Martínez-Santiago F (2011) Otiûm: a web based planner for tourism and leisure. Expert Syst Appl 38(8):10085–10093. Scopus. https://doi.org/10.1016/j.eswa.2011.02.005
Mwita M (2014) Opportunities and challenges in ICT adoption in Tanzania’s tourism industry: case study of tour operators. E-Rev Tour Res 11(1/2):18–25
Negrușa AL, Toader V, Sofică A, Tutunea MF, Rus RV (2015) Exploring gamification techniques and applications for sustainable tourism. Sustainability 7(8):11160–11189. https://doi.org/10.3390/su70811160
Neidhardt J, Rümmele N, Werthner H (2017) Predicting happiness: user interactions and sentiment analysis in an online travel forum. Inf Technol Tour 17(1):101–119. https://doi.org/10.1007/s40558-017-0079-2
Neuhofer B, Buhalis D (2014) Experience, co-creation and technology – issues, challenges and trends for technology enhanced tourism experiences. In: McCabe S (ed) The Routledge handbook of tourism marketing. Routledge, New York
Neuhofer B, Buhalis D, Ladkin A (2014) A typology of technology-enhanced tourism experiences. Int J Tour Res 16(4):340–350. https://doi.org/10.1002/jtr.1958
Nguyen TN, Ricci F (2018) A chat-based group recommender system for tourism. Inf Technol Tour 18(1):5–28. https://doi.org/10.1007/s40558-017-0099-y
Nilashi M, Ibrahim O, Yadegaridehkordi E, Samad S, Akbari E, Alizadeh A (2018) Travelers decision making using online review in social network sites: a case on TripAdvisor. J Comput Sci 28:168–179. Scopus. https://doi.org/10.1016/j.jocs.2018.09.006
Noguera JM, Barranco MJ, Segura RJ, Martínez L (2012) A mobile 3D-GIS hybrid recommender system for tourism. Inf Sci 215:37–52. Scopus. https://doi.org/10.1016/j.ins.2012.05.010
Novak J, Schwabe G (2009) Designing for reintermediation in the brick-and-mortar world: towards the travel agency of the future. Electron Mark 19(1):15–29. Scopus. https://doi.org/10.1007/s12525-009-0003-5
Oyewole P, Sankaran M, Choudhury P (2008) Information communication technology and the marketing of airline services in Malaysia: a survey of market participants in the airline industry. Serv Mark Q 29(4):85–103. https://doi.org/10.1080/15332960802218802
Pansiri J, Courvisanos J (2010) Attitude to risk in technology-based strategic alliances for tourism. Int J Hosp Tour Adm 11(3):275–302. https://doi.org/10.1080/15256480.2010.498283
Pantano E, Pietro LD (2013) From e-tourism to f-tourism: emerging issues from negative tourists’ online reviews. J Hosp Tour Technol 4(3):211–227. Scopus. https://doi.org/10.1108/JHTT-02-2013-0005
Papanis E, Kitrinou E (2011) The role of alternative types of tourism and ICT-strategy for the tourism industry of lesvos. Tourismos 6(2):313–331
Papathanassis A, Knolle F (2011) Exploring the adoption and processing of online holiday reviews: a grounded theory approach. Tour Manag 32(2):215–224. Scopus. https://doi.org/10.1016/j.tourman.2009.12.005
Paskaleva K, Azorín JA (2010) Developing integrated e-tourism services for cultural heritage destinations. Int J Serv Technol Manag 13(3–4):247–262. Scopus
Paskaleva K, Cooper I, Azorín JA (2011) Soft factors in integrating innovation in advanced e-services. Int J Serv Technol Manag 15(3–4):161–177. Scopus. https://doi.org/10.1504/IJSTM.2011.040374
Patiar A, Kensbock S, Ma E, Cox R (2017) Information and communication technology–enabled innovation: application of the virtual field trip in hospitality education. J Hosp Tour Educ 29(3):129–140. https://doi.org/10.1080/10963758.2017.1336096
Pazaitis A, De Filippi P, Kostakis V (2017) Blockchain and value systems in the sharing economy: The illustrative case of Backfeed. Technological Forecasting and Social Change, 125, 105–115. https://doi.org/10.1016/j.techfore.2017.05.025
Perks H, Riihela N (2004) An exploration of inter-functional integration in the new service development process. Serv Ind J 24(6):37–63. https://doi.org/10.1080/0264206042000299176
Pesonen J, Horster E (2012) Near field communication technology in tourism. Tour Manag Perspect 4:11–18. Scopus. https://doi.org/10.1016/j.tmp.2012.04.001
Pettenati MC, Bussotti P, Parlanti D, Giuli D (2008) Trust-enabling decision support system for e-tourism intermediation. Int J Netw Virtual Organ 5(3–4):275–299. Scopus. https://doi.org/10.1504/IJNVO.2008.018824
Pichler M, Rutzinger M, Neiss H (2014) insightTourism – towards understanding online behavior of tourists. E-Rev Tour Res 11(3/4):50–53
Pühretmair F (2004) It’s time to make e-tourism accessible. Lect Notes Comput Sci (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 3118:272–279. Scopus
Qi S, Ip C, Leung R, Law R (2020) A new framework on website evaluation. In: 2010 international conference on E-business and E-government, pp 78–81. https://doi.org/10.1109/ICEE.2010.27
Racherla P, Hu C, Hyun MY (2008) Exploring the role of innovative technologies in building a knowledge-based destination. Current Issues Tour 11(5):407–428. https://doi.org/10.1080/13683500802316022
Raposo R, Beça P, Figueiredo C, Santos H (2012) Mesh-t: an on-going project on ubiquitous and context-aware technologies in tourism. E-Rev Tour Res 10(2):72–75
Rashideh W (2020) Blockchain technology framework: current and future perspectives for the tourism industry. Tour Manage 80:104125. https://doi.org/10.1016/j.tourman.2020.104125
RateTiger (2020) #1 Hotel Distribution System, Channel Manager & Connectivity Solutions | Hotel Channel Manager | eRevMax. https://www.erevmax.com/travel-solutions/hotel-channelmanager.html
Reino S, Frew AJ, Mitsche N (2014) A benchmarking framework for e-tourism capability of destinations’ industries. J Hosp Tour Technol 5(2):126–142. Scopus. https://doi.org/10.1108/JHTT-05-2013-0015
Richardson R, Marshall JN (1999) Teleservices, call centres and urban and regional development. Serv Ind J 19(1):96–116. https://doi.org/10.1080/02642069900000006
Roldán-García MDM, García-Nieto J, Aldana-Montes JF (2017) Enhancing semantic consistency in anti-fraud rule-based expert systems. Expert Syst Appl 90:332–343. Scopus. https://doi.org/10.1016/j.eswa.2017.08.036
Romero I, Tejada P (2020) Tourism intermediaries and innovation in the hotel industry. Curr Issues Tour 23(5):641–653. https://doi.org/10.1080/13683500.2019.1572717
Rouhani S, Ravasan AZ, Hamidi H, Vosough S (2013) Identification and classification of affecting factors on E-tourism in Iran. Middle East J Sci Res 16(10):1361–1368. Scopus. https://doi.org/10.5829/idosi.mejsr.2013.16.10.1330
Ruiz-Molina M-E, Gil-Saura I, Berenguer-Contrí G (2014) Information and communication technology as a differentiation tool in restaurants. J Foodserv Bus Res 17(5):410–428. https://doi.org/10.1080/15378020.2014.967639
Sahli AB, Legohérel P (2015) The tourism Web acceptance model: a study of intention to book tourism products online. J Vacat Mark 22(2):179–194. Scopus. https://doi.org/10.1177/1356766715607589
Sebastia L, Garcia I, Onaindia E, Guzman C (2009) E-Tourism: a tourist recommendation and planning application. Int J Artif Intel Tools 18(5):717–738. https://doi.org/10.1142/S0218213009000378
Sedmiak G, Planinc T, Kociper T, Planinc S (2016) Managers’ perceptions of the role of ICT in rural tourism firms efficiency: the case of Slovenia. Tourism (13327461) 64(3):339–345
Sheldon PJ (1983) The impact of technology on the hotel industry. Tour Manag 4(4):269–278. https://doi.org/10.1016/0261-5177(83)90005-5
Sheldon PJ (1997) Tourism information technology, 1st edn. CABI, Wallingford
Sheraton (2019) About us – our history. Sheraton Hotels & Resorts. https://sheraton.marriott.com/about-us/
Sigala M (2002) The evolution of Internet pedagogy: benefits for tourism and hospitality education. J Hosp Leis Sport Tour 1(2):29–45. https://doi.org/10.3794/johlste.12.4
Sigala M (2003a) Integrating and exploiting information and communication technologies (ICT) in restaurant operations: implications for restaurant productivity. J Foodserv Bus Res 6(3):55–76. https://doi.org/10.1300/J369v06n03_05
Sigala M (2003b) The information and communication technologies productivity impact on the UK hotel sector. Int J Oper Prod Manag 23:1224–1245. https://doi.org/10.1108/01443570310496643
Siricharoen WV (2010) Enhancing semantic web and ontologies for E-tourism. Int J Intel Inf Database Syst 4(4):355–372. Scopus. https://doi.org/10.1504/IJIIDS.2010.035581
Smirnov A, Shilov N, Kashevnik A, Ponomarev A (2017) Cyber-physical infomobility for tourism application. Int J Inf Technol Manag 16(1):31–52. Scopus. https://doi.org/10.1504/IJITM.2017.080949
Standing C, Vasudavan T (2000) The impact of Internet on travel industry in Australia. Tour Recreat Res 25(3):45–54. https://doi.org/10.1080/02508281.2000.11014924
Stiakakis E, Georgiadis CK (2011) Drivers of a tourism e-business strategy: the impact of information and communication technologies. Oper Res 11(2):149–169. Scopus. https://doi.org/10.1007/s12351-009-0046-6
Stiakakis E, Vlachopoulou M (2017) The impact of social media on travelers 2.0. Tourismos 12(3):48–74. Scopus
Stockdale R (2007) Managing customer relationships in the self-service environment of e-tourism. J Vacat Mark 13(3):205–219. Scopus. https://doi.org/10.1177/1356766707077688
Sun Y, Ma H, Chan EHW (2017) A model to measure tourist preference toward scenic spots based on social media data: a case of Dapeng in China. Sustainability (Switzerland) 10(1). Scopus. https://doi.org/10.3390/su10010043
Syam N, Sharma A (2018) Waiting for a sales renaissance in the fourth industrial revolution: machine learning and artificial intelligence in sales research and practice. Ind Mark Manag 69:135–146. https://doi.org/10.1016/j.indmarman.2017.12.019
Szopiński T, Staniewski MW (2016) Socio-economic factors determining the way e-tourism is used in European Union member states. Internet Res 26(1):2–21. Scopus. https://doi.org/10.1108/IntR-03-2014-0065
Taga H, Gaspari A, Vukaj H (2011) Implementation of customer relationship management in Albania travel industry: its overall impact on performance. J Mark Manag 2(1):51–60
Taleb T, Dutta S, Ksentini A, Iqbal M, Flinck H (2017) Mobile edge computing potential in making cities smarter. IEEE Commun Mag 55(3):38–43. https://doi.org/10.1109/MCOM.2017.1600249CM
Thorn K, Chen H-C (2005) E-business in the New Zealand tourism industry: an examination of implementation and usage. Current Issues Tour 8(1):39–61. https://doi.org/10.1080/13683500508668204
Tscheu F, Buhalis D (2016) Augmented reality at cultural heritage sites. In: Inversini A, Schegg R (eds) Information and communication technologies in tourism 2016. Springer International Publishing, pp 607–619. https://doi.org/10.1007/978-3-319-28231-2_44
Tse AC (2003) Disintermediation of travel agents in the hotel industry. Int J Hosp Manag 22(4):453–460. https://doi.org/10.1016/S0278-4319(03)00049-5
Tung VWS, Law R (2017) The potential for tourism and hospitality experience research in human-robot interactions. Int J Contemp Hosp Manag Bradford 29(10):2498–2513. https://doi.org/10.1108/IJCHM-09-2016-0520
Umlauft M, Pospischil G, Niklfeld G, Michlmayr E (2002) LoL@, A mobile tourist guide for UMTS. Inf Technol Tour 5(3):151–164
Urquhart L, Schnädelbach H, Jäger N (2019) Adaptive architecture: regulating human building interaction. Int Rev Law Comput Technol 33(1):3–33. https://doi.org/10.1080/13600869.2019.1562605
Urry J, Larsen J (2011) The Tourist Gaze 3.0, 3rd edn. Sage Publishing. https://uk.sagepub.com/en-gb/eur/the-tourist-gaze-30/book234297
Valčić M, Domšić L (2012) Information technology for management and promotion of sustainable cultural tourism. Informatica (Slovenia) 36(2):131–136. Scopus
Vargas-Sánchez A (2016) Exploring the concept of smart tourist destination. Enlightening Tour 6(2):178–196
Vlachos IP (2013) Investigating E-business practices in tourism: a comparative analysis of three countries. Tourismos 8(1):179–197
Vladimirov Z (2012) Customer satisfaction with the Bulgarian tour operators and tour agencies’ websites. Tour Manag Perspect 4:176–184. Scopus. https://doi.org/10.1016/j.tmp.2012.07.003
Wall G (2020) From carrying capacity to overtourism: a perspective article. Tour Rev 75(1):212–215. https://doi.org/10.1108/TR-08-2019-0356
Webb T, Wagstaff CRD, Rayner M, Thelwell R (2016) Leading elite association football referees: challenges in the cross-cultural organization of a geographically dispersed group. Manag Sport Leis 21(3):105–123. https://doi.org/10.1080/23750472.2016.1209978
Xiang Z, Du Q, Ma Y, Fan W (2018) Assessing reliability of social media data: lessons from mining TripAdvisor hotel reviews. Inf Technol Tour 18(1):43–59. https://doi.org/10.1007/s40558-017-0098-z
Xu X, Li Y (2017) Maximising hotel profits with pricing and room allocation strategies. Int J Serv Oper Manag 28(1):46–63. Scopus. https://doi.org/10.1504/IJSOM.2017.085904
Xu F, Buhalis D, Weber J (2017) Serious games and the gamification of tourism. Tour Manag 60:244–256. https://doi.org/10.1016/j.tourman.2016.11.020
Yayli A, Bayram M (2010) Web-based destination marketing: do official city culture and tourism websites’ in Turkey consider international guidelines? Tourism (13327461) 58(1):51–60
Yeh RJ, Leong JK, Blecher L, Lai HHS (2005) Analysis of hoteliers’ e-commerce and information technology applications: business travelers’ perceptions and needs. Int J Hosp Tour Adm 6(2):29–62. https://doi.org/10.1300/J149v06n02_02
Yoo KH, Gretzel U (2008) What motivates consumers to write online travel reviews? Inf Technol Tour 10(4):283–295
Yovcheva Z, Buhalis D, Gatzidis C (2012) Overview of smartphone augmented reality applications for tourism. E-Rev Tour Res 10(2):63–66
Yu C-C (2016) A value-centric business model framework for managing open data applications. J Organ Comput Electron Commerce 26(1–2):80–115. Scopus. https://doi.org/10.1080/10919392.2015.1125175
Yuan Y, Wang FY (2016) Towards blockchain-based intelligent transportation systems. IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), 2663–2668. https://doi.org/10.1109/ITSC.2016.7795984
Zacarias F, Cuapa R, De Ita G, Torres D (2015) Smart tourism in 1-click. Proc Comput Sci 56:447–452. https://doi.org/10.1016/j.procs.2015.07.234
Zafiropoulos C, Vrana V (2006) A framework for the evaluation of hotel websites: the case of Greece. Inf Technol Tour 8(3–4):239–254
Zare S, Chukwunonso F (2015) How travel agencies can differentiate themselves to compete with online travel agencies in the Malaysian context. E-Rev Tour Res 12(5/6):226–240
3 Drivers of e-Tourism
Dimitrios Buhalis
Contents
Introduction: The Evolution of e-Tourism
Proprietary Systems and Automation Era (1960 to 1995)
Internet and Web 1.0 Era (1995–2010)
WEB 2.0 and the Social Media (2005–2015)
Ambient Intelligence (AmI) Tourism (2020–Future)
Conclusions: e-Tourism Drivers
References
D. Buhalis, Hong Kong Polytechnic University, Hong Kong, China; Bournemouth University, Poole, UK
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_6

Abstract
e-Tourism takes advantage of ICT innovations to improve internal efficiency, establish efficient communication and distribution links with various intermediaries, and engage in conversation and service cocreation with customers. Technology-empowered tourism experiences have been supporting travellers to cocreate value throughout all stages of travel: before, during, and after the trip. The first proprietary information systems supported tourism and hospitality organizations to centralize and manage their inventory as well as manage their internal processes. The rapid development of the Internet since 1995 revolutionized technological solutions and information provision. Organizations developed their Web 1.0 presence as a window to the world and their websites as e-commerce shops. Web 2.0 and social media revolutionized interactivity between users and also between users and organizations. Smart tourism, Web 3.0, or the semantic web bring a range of opportunities that optimize the entire network and support the tourism ecosystem. Smart mobile devices rapidly emerged as a new agile, flexible network and challenged desktop computing. Finally, ambient intelligence (AmI) tourism takes advantage of smart systems and brings ambient intelligence across tourism ecosystems. The Internet of Everything supports the development of sensitive, flexible, and adaptive ecosystems. Ambient intelligence connects all stakeholders and supports the constant formation of networks to bring value to all stakeholders.
Keywords
Information communication technologies · Technologies · e-Tourism · Smart cities · Smart tourism · Ambient intelligence · Tourism · Marketing
Introduction: The Evolution of e-Tourism
ICT innovations have been driving developments and competitiveness in tourism since the 1980s (Sheldon 1997). As early as 1993, Auliana Poon (1993, p. 16) predicted that “a whole system of ITs is being rapidly diffused throughout the tourism industry and no player will escape its impacts.” This chapter explores the drivers of e-tourism and the transformational nature of technology for tourism.
Information and communication technologies (ICTs) have been revolutionizing the tourism and hospitality industries, driving a range of innovations (Buhalis 2003). ICTs revolutionize service development and delivery by empowering the management of resources. They also support dynamic communications between all stakeholders, information dissemination to consumers, and coordination of the industry at a very low cost (Buhalis 2000). ICTs have introduced radical changes to hospitality and tourism management and marketing. e-Tourism drivers have developed from simple technologies supporting innovation to integrated e-tourism systems, smart ecosystems, and gradually ambient intelligence (Buhalis 2020). Evidently, technology determines the strategy and competitiveness of tourism organizations and destinations (Buhalis 1998). Tourism and hospitality organizations around the globe invest heavily in technology to transform their organizational structures and reengineer their strategic management and marketing. They redesign their operational practices to benefit from the paradigm shifts they experience. Developing and maintaining sustainable competitive advantage and cocreating value for all stakeholders has become a function of the ability of organizations to use technology strategically.
Buhalis (2003, p. 20) defined e-tourism as “the digitisation of all processes and value chains in the tourism, travel, hospitality and catering industries.” At the tactical level, e-tourism includes eCommerce and applies ICTs for maximizing the efficiency and effectiveness of tourism organizations. At the strategic level, e-tourism revolutionizes all business processes, the entire value chain, as well as the strategic relationships of tourism organizations with all their stakeholders. Naturally, e-tourism determines the competitiveness of organizations by taking advantage of intranets for reorganizing internal processes, extranets for developing transactions with trusted partners, and the Internet for interacting with all stakeholders. The e-tourism concept includes all business functions (eCommerce and eMarketing, eFinance and eAccounting, eHRM, eProcurement, eR&D, and eProduction) as well as eStrategy, ePlanning, and eManagement for all sectors of the tourism industry, including tourism, travel, transport, leisure, hospitality, principals, intermediaries, and public sector organizations. Through the different e-tourism eras, as indicated in Table 1, research has identified several drivers that often overlap as they build on each other.

Table 1 Drivers of e-tourism eras
Time | Era | Technological innovation | Driver
1960–1980 | Proprietary systems | Mainframe computing/GDS | Automation and inventory management
1980–1995 | Proprietary systems | PC/Fax/GDS | Automation and efficiency
1995–2000 | Internet | Internet | Networking/Network efficiency
2000–2010 | Internet/Web 1.0 | Internet/Web 1.0 | eCommerce
2005–2015 | Social media/Web 2.0 | Social media/Web 2.0 | Networking/advocacy/cocreation
2015–2020 | Networks and smart systems | Smart systems, sensors/Beacons | Central management of resources
2020– | Ambient intelligence | Ambient intelligence, autonomous devices, wearables | Agility and ecosystem competitiveness
Proprietary Systems and Automation Era (1960 to 1995)
The first proprietary information systems (1960 to 1995) helped tourism and hospitality organizations centralize and manage their inventory and their internal processes (Buhalis 1996a, 2000, 2003; Werthner and Klein 1999). Airlines’ computer reservation systems (CRSs) led the innovation, with American Airlines adopting the first CRS (SABRE) in order to manage its inventory. Efficiency and inventory control were the first drivers of this technology (Buhalis 2004). CRSs reduced the time required for manual operations and increased productivity and efficiency dramatically (Sheldon 1997; Buhalis 1993). ICTs transformed best operational practices and provided opportunities for geographical and operational expansion for airlines. Similarly, hotel property management systems improved the operations of hospitality properties and the management of hotel chains (O’Connor 1995, 1999; Peacock 1995; Collins and Cobanoglu 2013). Small and medium tourism and hospitality enterprises took advantage of technology and expanded their virtual size through digital engagement (Buhalis 1999; Buhalis and Main
1998; Peters and Buhalis 2004; Peters et al. 2009, 2018). Travel agency and tour operator systems also emerged to support intermediaries to develop, manage, and distribute tour packages (Inkpen 1998). ICTs empowered a range of applications including marketing research and planning; customer relationship management and personalized service; capacity and inventory management; distribution efficiency and productivity; inventory control and sales; and yield and revenue management (Buhalis and Crotts 2013; Benckendorff et al. 2019). Many of these systems enabled tourism organizations to replace manual, time-consuming processes with computerized ones. Without technological solutions, collaboration among different departments and across firms could only be conducted manually through telephone and fax, making operations labor intensive, inefficient, and subject to error (Buhalis and O’Connor 2005).
Following the success of SABRE, several global distribution systems (GDSs) were launched in the 1980s as travel supermarkets. GDSs connected the individual CRSs of different tourism firms, so that travel agencies could access a centralized system and book different services via a unified platform (Buhalis 2000; Law et al. 2014; Inkpen 1998).
Tourism destinations also required systems to coordinate all tourism providers and to systematize information provision to target markets (Buhalis 1993, 1996a,b; Buhalis and Spada 2000). Governments and destinations had to coordinate local tourist enterprises and provide new tools for destination marketing and promotion (O’Connor and Rafferty 1997; Buhalis and Deimezi 2003; Buhalis and Michopoulou 2013). ICTs enabled them to collect, organize, and disseminate information about their tourism products.
Destination management organizations (DMOs) developed destination management systems (DMSs) to organize information on attractions and facilities to support the marketing capability of destinations (Sheldon 1993; O’Connor and Rafferty 1997). DMSs emerged as interfaces between destination tourism enterprises (including principals, attractions, transportation, and intermediaries) and consumers and distribution channels (including tour operators, travel agencies, and ultimately consumers) (Buhalis et al. 2011; Mistilis et al. 2014; Brás et al. 2010).
The main drivers during that era were economic necessity and a need to maximize efficiency in order to address the price/performance equation (Minghetti and Buhalis 2010). Global competition necessitated improvements in ICT price/performance ratios and required better productivity for capital employed in ICTs (Werthner and Klein 1999). ICT penetration in tourism provided strategic and operational tools for improving inventory efficiency and managerial control. Centralizing decision-making procedures improved control and performance, supporting efficiency and profitability (Buhalis 2003; Werthner and Klein 1999). Meeting customer expectations and managing customer relationships on a global basis was a major challenge for large tourism and hospitality organizations (Buhalis and Leung 2018). ICTs support customer relationship management (CRM) because they enable organizations to interact with customers to improve service and loyalty (Sigala 2011; Neuhofer et al. 2012).
The dramatic growth of ICT engagement had profound implications for the whole tourism industry. ICTs incorporate not only software, hardware, and netware
but also information, management, and telecommunication systems that support big data processing (Stylos et al. 2021). Designing the tourism information flow within and between organizations is critical for improving efficiency. Coordinating expertise, managing equipment utilized for the production of commodities and the provision of services, and developing sufficient intellectual capacity are critical for technology adoption. Technology and tourism are synergetic entities as technology drives tourism and tourism drives technology! ICTs provide the tools and enable the evolution of tourism demand and supply by facilitating existing needs and business prospects (Buhalis 2020). In turn, developments in ICTs offer further tools and greater potential, which are then matched by the requirements of the industry. This is a stepwise process in which technology propels tourism growth and tourism growth requires more technology, each motivating the other to move forward at a fast pace.
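The GDS arrangement described in this section, in which one platform sits in front of many supplier CRSs so that an agency can search and book through a single entry point, can be illustrated with a minimal sketch. Everything named here (`SupplierCRS`, `GlobalDistributionSystem`, the supplier names, and the product codes) is hypothetical and chosen only for illustration; it does not represent SABRE or any real GDS interface.

```python
from dataclasses import dataclass, field


@dataclass
class SupplierCRS:
    """Hypothetical single-supplier reservation system with its own inventory."""
    supplier: str
    inventory: dict[str, int] = field(default_factory=dict)  # product code -> units left

    def available(self, product: str) -> int:
        # Unknown products simply have zero availability.
        return self.inventory.get(product, 0)

    def book(self, product: str, units: int = 1) -> bool:
        # Refuse the booking if the supplier cannot cover the request.
        if self.available(product) < units:
            return False
        self.inventory[product] -= units
        return True


class GlobalDistributionSystem:
    """Hypothetical aggregator: one entry point in front of many supplier CRSs."""

    def __init__(self, systems: list[SupplierCRS]):
        self._systems = {crs.supplier: crs for crs in systems}

    def search(self, product: str) -> dict[str, int]:
        # Ask every connected CRS and keep only suppliers with stock left.
        return {
            name: crs.available(product)
            for name, crs in self._systems.items()
            if crs.available(product) > 0
        }

    def book(self, supplier: str, product: str, units: int = 1) -> bool:
        # Route the booking to the chosen supplier's own CRS.
        crs = self._systems.get(supplier)
        return crs.book(product, units) if crs else False


airline = SupplierCRS("ExampleAir", {"LHR-JFK": 2})
hotel = SupplierCRS("ExampleHotel", {"double-room": 1})
gds = GlobalDistributionSystem([airline, hotel])

print(gds.search("LHR-JFK"))                  # {'ExampleAir': 2}
print(gds.book("ExampleAir", "LHR-JFK", 2))   # True
print(gds.search("LHR-JFK"))                  # {}
```

The design choice worth noting is that inventory stays with each supplier; the aggregator only routes searches and bookings, which mirrors the idea of a unified platform over individual CRSs rather than a single shared database.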
Internet and Web 1.0 Era (1995–2010)
The rapid development of the Internet since 1995 revolutionized technological solutions and information provision (Egger and Buhalis 2008). Organizations developed their Web 1.0 presence as a window to the world and their websites as eCommerce shops. They aimed to develop their direct distribution channel, to reach their customers with offers and product propositions, and to conduct online marketing and sales (Buhalis 2003; Buhalis and Law 2008; Law et al. 2010, 2014; Qi et al. 2008). The Internet infrastructure also enabled application service providers (ASPs), cloud, and edge computing to host big data and perform key functions regardless of location (Paraskevas and Buhalis 2002). This had strategic implications for tourism and hospitality, revolutionized the competitive forces in the industry (Buhalis 1998), and affected the productivity of tourism and hospitality (Sigala 2003). eCommerce and online shopping often replaced physical high street outlets and dramatically changed the sources of strategic competitive advantage (Law et al. 2014). Website quality, online support, and customer satisfaction emerged as critical attributes that determined the competitiveness of organizations, challenging legacy systems and organizations (Law et al. 2010; Qi et al. 2011a,b, 2014a,b; Buhalis and Licata 2002). Small businesses were also empowered to develop their online presence and compete with larger entities as they were now discoverable through their websites (Buhalis and Main 1998; Buhalis and Molinaroli 2003; Buhalis and Deimezi 2003; Buhalis 1996b; Collins et al. 2003).
The Internet also changed how destinations managed and promoted themselves online. They often developed clusters of products, services, and experiences through tourism stakeholder networks (Buhalis 1997; Buhalis and Spada 2000; Mistilis et al. 2014).
Destination management and marketing organizations took advantage of the web and developed DMO platforms to support the search and booking processes of potential and active visitors. DMOs developed destination websites and promoted tourism regions and facilities online
D. Buhalis
(Buhalis et al. 2011; Hsu et al. 2016). Qi et al. (2010, 2017) compared websites of China-based luxury hotels and international luxury hotels, as eCommerce dramatically changed the tourism sector and developed the competitiveness of DMOs (Li and Buhalis 2006; Ma et al. 2003). The Internet enabled travellers to access an incredible range of information about tourism products without any time or geographical constraints (Fan et al. 2019). Searching for information on Google as a search engine and Yahoo as a web portal revolutionized online information search (Xiang et al. 2008; Vogt and Fesenmaier 1998; Paraskevas et al. 2011) and provided information about all tourism products in seconds. Pan and Fesenmaier (2006) explored how online information search influenced the vacation planning process. The importance of search engine marketing and search engine optimization for discoverability and distribution channel management also became apparent (Paraskevas et al. 2011). This dramatically changed the purchase and consumer behavior of travellers (Fotis et al. 2011; Fan et al. 2019). Travellers could make more informed purchase decisions (Bai et al. 2008; Law et al. 2010; Qi et al. 2008). Web 1.0 rapidly propelled a new paradigm for tourism and electronic commerce in the marketplace. The Internet also raised consumer expectations, as travellers became used to advanced products and services, compared tourism to other eCommerce companies like Amazon, and expected communication with organizations to be interactive (Werthner and Ricci 2004). This created a requirement for instant gratification and forced organizations to run their communication and eCommerce operations 24/7. Technology revolutionized and reengineered the entire travel distribution channel by empowering direct communications and transactions between principals and consumers (disintermediation). 
The emergence of a plethora of new intermediaries (reintermediation) also changed the distribution channel dramatically (Buhalis and Licata 2002; Egger and Buhalis 2008).
e-Tourism drivers during the Internet and Web 1.0 era included a wide range of strategic objectives, namely, the promotion of products and services directly to consumers; the improvement of direct distribution and loyalty; the widening of target market access; the attraction of direct markets toward the disintermediation of distribution channels; the maximization of market reach through the use of a wide range of intermediaries; and the management of distressed inventory through promotions and time-based campaigns. Search engine optimization, for both organic and sponsored campaigns, aimed at gaining significant competitive advantage through preferential placing of products and services on search engines (Paraskevas et al. 2011). This era also included web analytics to better understand market segments based on online behavior and to facilitate effective digital marketing through targeting. Managing online reputation through review sites such as Tripadvisor, encouraging electronic word-of-mouth (eWOM), and managing the public image of organizations were also among the most critical drivers of this era (Au et al. 2014; Williams et al. 2015, 2017). Increasingly, online reputation is linked to performance and profitability (Viglia et al. 2016a; Anagnostopoulou et al. 2019).
3 Drivers of e-Tourism
Web 2.0 and the Social Media (2005–2015)
The Web 2.0 and social media era revolutionized interactivity between users and also between users and organizations. Tourism is an “information-intensive industry” as consumers interact dynamically with various stakeholders to plan their trip, look for information, and make informed decisions about destinations, accommodation, restaurants, tours, and attractions (Chung and Buhalis 2008; Xiang and Gretzel 2010). This is particularly the case for people who have special needs and require accessible facilities and interaction (Buhalis and Michopoulou 2011, 2013). Social media revolutionized tourism marketing and became critical in the travel planning process (Xiang and Gretzel 2010; Wang et al. 2012). They encourage interaction, stimulate conversation, and support marketing strategies between consumers and tourism organizations (Leung et al. 2013; Chung and Buhalis 2008; Fotis et al. 2011; Fan et al. 2019; Hays et al. 2012). However, some travellers decide to disconnect while travelling and experience a digital detox (Tanti and Buhalis 2017). Fan et al. (2019) explore the tourist typologies of online and face-to-face social contact toward destination immersion and tourism encapsulation/decapsulation. Social media can support customer engagement through interactive communication and cocreation of experiences (Buhalis and Foerste 2015; Neuhofer et al. 2012). They also enable influencers to promote tourism and hospitality products and services and to support consumer decisions when planning trips. Consumers consult a wide range of social media websites and information sources when planning holidays (Buhalis and Law 2008; Schegg et al. 2008). Social media can develop the awareness of tourism destinations and services by supporting the planning phase of travel (Molinillo et al. 2018). Tourism experiences are primarily based on intangible services. 
Therefore, personal recommendations, reputation, and electronic word-of-mouth (eWOM) are very influential and extremely important for tourism experiences (Buhalis 1998; Inversini and Buhalis 2009). Chung and Buhalis (2008) suggest that online communities such as Tripadvisor and Yelp provide “trustworthy reviews” as they are provided by fellow travellers. Consumer-generated ratings are positively associated with online popularity, which demonstrates the positive impact of electronic word-of-mouth (Zhang et al. 2010). Blogs and other social media initially introduced the Web 2.0 era around 2005 and facilitated interaction among all users. Social media empowered many-to-many communications (Buhalis and Law 2008; Egger and Buhalis 2008) and dramatically changed the communication strategies of all organizations. A range of dynamic online travel communities and social networks revolutionized communication. For example, the HK Quarantine Support Group on Facebook (https://www.facebook.com/groups/2788738214495345) had 52k members who supported each other with advice on how to travel during COVID and how to understand the various restrictions. Such communities facilitated a range of different interactions between consumers, producers, intermediaries, residents, and a range of other stakeholders (Buhalis 2003; Fotis et al. 2011; Hays et al. 2012; Brás et al. 2010). Social media also
propelled review sites, such as Tripadvisor and Yelp. This supported consumers to express their views online, influencing the electronic word-of-mouth (eWOM), reputation, branding, and business performance of tourism organizations (Inversini and Buhalis 2009; Ye et al. 2009a,b; Viglia et al. 2016b). In addition, Web 2.0 had major implications for human resources functions, including recruitment, training, employee management, and satisfaction (Ladkin and Buhalis 2016; Li et al. 2013; Stamolampros et al. 2019a,b).
e-Tourism drivers during the social media and Web 2.0 era were primarily focused on establishing direct communication lines with consumers, supporting discoverability, loyalty, and direct sales (Leung et al. 2013). Social media support marketers to engage and involve consumers, enhance their eWOM and reputation, and enhance the image of organizations (Leung et al. 2019). More advanced tourism organizations engage social media in all their marketing functions (Tiago et al. 2018). As a result, consumers are empowered to contribute to product development and service cocreation, personalizing their experiences (Neuhofer et al. 2015; Niininen et al. 2007; Kallmuenzer et al. 2019). Xiang and Gretzel (2010) illustrated the trip planning process and declared that if tourism marketers ignored social media, they would be in jeopardy of becoming irrelevant. Social media also support generating interest, increasing awareness, and ultimately achieving sales and loyalty. Social media engagement also built image, reputation, and word-of-mouth (WOM) marketing (Williams et al. 2015, 2017). Web 2.0 offered destination management and marketing organizations the opportunity to develop interactive social media platforms and engage with potential and active visitors in real time (Buhalis and Sinarta 2019; Zhang et al. 2018). User-generated content (UGC) gave them the opportunity to attract real content from real travellers and influence many prospective travellers (Thomaz et al. 
2017; Ye et al. 2011). Destinations and organizations used social media to increase their chances of capturing the attention of Internet users (Mistilis et al. 2014; Inversini and Buhalis 2009; Inversini et al. 2010). Finally, Web 2.0 supported the development of the sharing economy, with Airbnb and Uber as the leading tourism applications (Yao et al. 2019).
Smart tourism, Web 3.0, and the semantic web (2015–) bring a range of opportunities that optimize the entire network and support the tourism ecosystem (Buhalis and Amaranggana 2014, 2015). Smart mobile devices rapidly emerged as a new agile, flexible network and challenged desktop computing. Mobile devices provide flexibility, portability, and convenience. Powered by 4G and now 5G, they provide users with interconnectivity and interoperability with different resources and stakeholders, fuelling smartness. Mobile applications empower travellers to personalize their experiences and engage dynamically with their context. Travellers interact dynamically with tourism organizations and perform various tasks, including search, booking, and check-in, and they remain able to interact with organizations throughout the trip. Formatting data in a way that is understood by software agents supports computer-to-computer interoperability and network interconnectivity. Big data, from a range of data sets, collected on the CLOUD or on EDGE facilitate real-time data management, support
interoperability, stimulate creativity and innovation, and encourage collaboration in the social web (Werthner and Ricci 2004; Buhalis and Sinarta 2019; Buhalis et al. 2019). Smartphones and mobile devices facilitate Web 3.0 and support dynamic interaction (Kim and Law 2015), mediating the touristic experience (Wang et al. 2012). Smartness takes advantage of the interconnectivity and interoperability of integrated technologies to reengineer processes and data in order to produce innovative services, products, and procedures toward maximizing value for all stakeholders (Buhalis 2000). Smartness is not about technologies but about using technology to network all stakeholders dynamically and optimize the performance of the entire network. As such, it is an optimization methodology rather than an application of technology. Smart tourism provides the infostructure for value cocreation (Buhalis and Amaranggana 2015; Boes et al. 2016; Gretzel et al. 2015; Mehraliyev et al. 2020). Within the smart tourism ecosystem, suppliers, intermediaries, the public sector, and consumers cocreate tourism experience and value for all stakeholders. All destination actors should be networked, dynamically co-producing value for everybody interconnected in the ecosystem. Smart systems employ a comprehensive range of different technologies to cocreate value for all. Smartness can also support organizations to develop their inclusiveness and accessibility for all, by supporting tourists with mobility, visual, auditory, and cognitive impairments to overcome physical and service barriers (Buhalis and Michopoulou 2011, 2013; Michopoulou and Buhalis 2008; Michopoulou et al. 2016). Providing accessibility information and networking the accessible ecosystem develop value cocreation for this special market. Smart tourism also takes advantage of gamification, as it can support rewarding interactions and higher levels of satisfaction and engagement (Xu et al. 2016, 2017). 
Smartness means that interoperability and ubiquitous computing make everybody interconnected. Integrating processes across different stakeholders brings the entire ecosystem together, toward generating value, through dynamic cocreation, sustainable resources, and dynamic personalization and adaptation to context (Buhalis 2020). Hitherto, no tourism destination has taken full advantage of smartness. Examples are more readily found in the safety and security domains, where particular indicators are monitored and trigger specific actions. This was most evident during the COVID pandemic, when smart technologies allowed authorities to locate individuals using applications and mobile devices. Tracking and tracing these individuals and correlating their location and behavior with those of other individuals enabled authorities to estimate the epidemiological danger and intervene accordingly. Smart technologies also support demand forecasting methods using Google Trends data (Volchek et al. 2018). Adoption of smart technologies propels smart tourism at the destination level (Buhalis and Amaranggana 2015; Boes et al. 2016). Destinations such as Palma and Malaga in Spain; Copenhagen, Denmark; Dublin, Ireland; Ljubljana, Slovenia; and Milano and Venice in Italy have regularly been identified as smart tourism destinations (Xiang et al. 2015).
Smart tourism drivers aim to optimize the entire tourism ecosystem by ensuring that value is cocreated for all stakeholders while all negative impacts are
minimized. All stakeholders are interconnected and exchange value constantly. Dynamic discussions through social media enable consumers to engage with tourism and hospitality organizations to cocreate their experiences. For example, the Hong Kong Quarantine Support Group on Facebook (https://www.facebook.com/groups/2788738214495345) dynamically discusses the travel situation in Hong Kong during the COVID period, the availability of hotel rooms, the air travel disruptions according to changes in regulations, as well as the service provided in the 36 government-appointed quarantine hotels. Smart technologies support travellers to personalize their experience and adjust facilities and services to their preferences and requirements. Location- and context-based services take advantage of contextual information to maximize value (Buhalis and Foerste 2015). Contextual information enables cocreation of services in real time and propels nowness (Buhalis and Sinarta 2019). Smart technologies enable organizations to develop network benefits, improve collective operational processes, reduce labor cost across the network, and enhance customer interactivity within the ecosystem.
Ambient Intelligence (AmI) Tourism (2020–Future)
Ambient intelligence (AmI) tourism takes advantage of smart systems and brings ambient intelligence across tourism ecosystems. The Internet of Everything supports the development of sensitive, flexible, and adaptive ecosystems. Ambient intelligence connects everybody and supports the constant formation of networks to bring value to all stakeholders at the destination. For example, interconnecting meteorological predictions to understand what the weather will be like in the next few days supports the dynamic cocreation of suitable products and services. If heavy rain is predicted, then indoor activities such as museum visits or interactions with local cultural experiences such as cooking classes would be more appropriate. When the weather improves, excursions to local beaches or hikes may add more value for travellers. Synthesizing innovations through the interoperability of systems and the interconnectivity of business functions allows these technologies to support autonomous devices, robots, and virtual and augmented reality that enhance the customer experience. Augmented reality supports the projection of information in the field of view, enabling a range of services such as interpretation, guiding, gaming, and experience enhancement (Yovcheva et al. 2012, 2013, 2014). AmI environments will support the emerging service environments of self-driving cars and other autonomous vehicles, drones, and service robots (Tussyadiah et al. 2017; Ivanov and Webster 2019; Ivanov et al. 2019). Increasingly, smartness will bring major implications for the tourism ecosystem as AmI supports real-time service, empowering the cocreation of value for all stakeholders across multiple platforms (Assiouras et al. 2022). Ambient intelligence revolutionizes the personalization and contextualization of tourism by using social media context-based mobile (SoCoMo) marketing and real-time services (Buhalis 2020). 
SoCoMo uses both internal and external contextual
information to optimize customer experience (Buhalis and Foerste 2015). Social media can be used both as a data source and as a dissemination medium for the most appropriate action. For example, hashtags on Twitter can be used to collect information on a festival, and a range of social media can be used to disseminate possible actions. When the element of time is added to context, real-time cocreation can take place. Using big data (Mariani 2020), interactions take place in real time, at the exact moment when consumers are willing to engage with brands, using the platforms that they prefer. “Nowness” reflects the agility of brand performance toward real-time, cocreated, data-driven, and consumer-centric experience enhancement (Buhalis and Sinarta 2019). This reengineering enables the shaping of products, actions, processes, and services in real time, by engaging different stakeholders simultaneously to optimize collective performance and competitiveness and to generate agile solutions and value for all involved in the ecosystem.
Ambient intelligence (AmI) tourism (2020–future) is driven by a range of disruptive technologies: the Internet of Things; the Internet of Everything; fifth generation mobile network (5G); radio-frequency identification (RFID); mobile devices, smartphones, and wearables; 3D printing; apps along with APIs; cryptocurrency and blockchain; sensor and beacon networks; pervasive computing; and gamification, as well as enhanced analytical capabilities supported by artificial intelligence (AI) and machine learning (ML) (Buhalis et al. 2019; Tussyadiah et al. 2018). Ivanov et al. (2019) identified a myriad of application areas for robots across various tourism and hospitality sectors affecting the servicescape. Increasingly, robotics and artificial intelligence will reengineer the whole range of tourism functions, from destination selection and reservations before travel to optimal routing while travelling and dynamic scheduling of itineraries (Buhalis et al. 
2019). Tourism organizations will need to reengineer all processes to ensure that robots and employees can augment the service experiences (Pizam et al. 2022).
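The weather-driven example given earlier in this section (indoor activities when heavy rain is predicted, beaches or hikes when the weather improves) can be read as a simple context rule. The following is a minimal sketch of that rule only, not a description of any deployed AmI system. The `Forecast` structure, the activity lists, and the 0.6 rain-probability threshold (used here as a stand-in for "heavy rain is predicted") are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Forecast:
    """One day's weather prediction (hypothetical structure)."""
    day: str
    rain_probability: float  # between 0.0 and 1.0


# Hypothetical activity catalogue, based on the examples named in the text.
INDOOR_ACTIVITIES = ["museum visit", "cooking class"]
OUTDOOR_ACTIVITIES = ["beach excursion", "hike"]

# Assumed cut-off standing in for "heavy rain is predicted".
RAIN_THRESHOLD = 0.6


def suggest_activities(forecast: Forecast,
                       threshold: float = RAIN_THRESHOLD) -> List[str]:
    """Return indoor activities when rain is likely, outdoor ones otherwise."""
    if not 0.0 <= forecast.rain_probability <= 1.0:
        raise ValueError("rain_probability must be between 0.0 and 1.0")
    if forecast.rain_probability >= threshold:
        return list(INDOOR_ACTIVITIES)
    return list(OUTDOOR_ACTIVITIES)


def plan_itinerary(forecasts: List[Forecast]) -> Dict[str, List[str]]:
    """Map each forecast day to its suggested activities."""
    return {f.day: suggest_activities(f) for f in forecasts}


if __name__ == "__main__":
    week = [Forecast("Monday", 0.85), Forecast("Tuesday", 0.10)]
    for day, activities in plan_itinerary(week).items():
        print(day, "->", ", ".join(activities))
```

In an actual ambient intelligence setting, the forecast would arrive from an interconnected meteorological service and the catalogue from destination stakeholders; both are hard-coded here purely to keep the sketch self-contained.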
Conclusions: e-Tourism Drivers
In the last 40 years, e-tourism has progressed through various stages to take advantage of ICT innovations. The key drivers have always been a desire to improve internal efficiency, establish efficient communication and distribution links with various intermediaries, and engage in conversation and service cocreation with customers. Technology-empowered tourism experiences have been supporting travellers to cocreate value throughout all stages of travel, before–during–after travel (Neuhofer et al. 2014; Fotis et al. 2011). As technological tools facilitated this interaction, tourism organizations and destinations increasingly had the opportunity to engage in deep conversation and provide personalized services. Smart cities and smart tourism have provided unprecedented opportunities for network optimization. Systems use artificial intelligence to collect, analyze, and explore big data. This effectively empowers e-tourism and brings the ability to optimize the value for all stakeholders. e-Tourism primarily focused on individual organizations by reengineering their internal functions and their communications with existing
and potential consumers. Smart tourism takes this one step forward by optimizing the value cocreation of the entire network and by supporting the entire tourism ecosystem to operate as a network. For tourism organizations and businesses to optimize their performance, they need to bring together and coordinate the entire range of stakeholders in tourism service ecosystems. Although it is recognized that individual companies do compete, it is increasingly evident that unless they collaborate, they will not be able to develop comprehensive product propositions and cocreate services with consumers. Smart environments develop digital ecosystems to engage with multiple stakeholders in real time and cocreate value. Inevitably, they disrupt and transform industry structures, processes, and practices. Innovations in service, production, management, and marketing strategy affect the competitiveness of individual organizations as well as everybody involved in the service cocreation (Viglia et al. 2016a). Spencer et al. (2012) demonstrated that leadership is the most significant driver for technology adoption in tourism. It is agility and leadership that determine the technological competitiveness of tourism organizations and their ability to develop sustainably in the future.
References
Anagnostopoulou S, Buhalis D, Kountouri I, Manousakis E, Tsekrekos A (2019) The impact of online reputation on hotel profitability. Int J Contemp Hosp Manag. https://doi.org/10.1108/IJCHM-03-2019-0247
Assiouras I, Skourtis G, Giannopoulos A, Buhalis D, Karaosmanoglu E (2022) Testing the relationship between value co-creation, perceived justice and guests’ enjoyment. Curr Issues Tour. https://doi.org/10.1080/13683500.2022.2030680
Au N, Buhalis D, Law R (2014) Online complaining behavior for mainland China hotels: the perception of Chinese and non-Chinese customers. Int J Hosp Tour Adm 15:248–274
Bai B, Law R, Wen I (2008) The impact of website quality on customer satisfaction and purchase intentions: evidence from Chinese online visitors. Int J Hosp Manag 27(3):391–402
Benckendorff P, Xiang Z, Sheldon PJ (2019) Tourism information technology, 3rd edn. CABI, Boston
Boes K, Buhalis D, Inversini A (2016) Smart tourism destinations: ecosystems for tourism destination competitiveness. Int J Tour Cities 2(2):108–124. https://doi.org/10.1108/IJTC-12-2015-0032
Brás JM, Costa C, Buhalis D (2010) Networks analysis and wine routes: the case of the Bairrada wine route. Serv Ind J 30(10):1–21
Buhalis D (1993) Regional Integrated Computer Information Reservation Management Systems (RICIRMS) as a strategic tool for the small and medium tourism enterprises. Tour Manag 14(5):366–378
Buhalis D (1996a) Information technology as a strategic tool for tourism. Tour Rev 17(2):34–36
Buhalis D (1996b) Enhancing the competitiveness of small and medium sized tourism enterprises at the destination level by using information technology. Electron Mark 6(1):1–6
Buhalis D (1997) Information and telecommunication technology as a strategic tool for economic, social and environmental benefits enhancement of tourism at destination regions. Progress Tour Hosp Res 3(1):71–93
Buhalis D (1998) Strategic use of information technologies in the tourism industry. Tour Manag 19(5):409–421
Buhalis D (1999) Information technology for small and medium sized tourism enterprises: adaptation and benefits. Inf Technol Tour 2(2):79–95
Buhalis D (2000) Information technology in tourism: past, present and future. Tour Recreat Res 25(1):41–58
Buhalis D (2003) e-Tourism: information technology for strategic tourism management. Pearson (Financial Times/Prentice Hall), London
Buhalis D (2004) eAirlines: strategic and tactical use of ICTS in the Airline Industry. Inf Manag 41(7):805–825
Buhalis D (2020) Technology in tourism-from information communication technologies to etourism and smart tourism towards ambient intelligence tourism: a perspective article. Tour Rev 75(1). https://doi.org/10.1108/TR-06-2019-0258
Buhalis D, Amaranggana A (2014) Smart tourism destinations. In: Xiang Z, Tussyadiah I (eds) Information and communication technologies in tourism 2014. Springer, Heidelberg, pp 553–564
Buhalis D, Amaranggana A (2015) Smart tourism destinations enhancing tourism experience through personalisation of services. In: Information and communication technologies in tourism 2015. Springer, Cham, pp 377–389
Buhalis D, Crotts J (2013) Global alliances in tourism and hospitality management. Routledge, New York
Buhalis D, Deimezi R (2003) Information technology penetration and eCommerce developments in Greece. Electron Mark 13(4):309–324
Buhalis D, Foerste M (2015) SoCoMo marketing for travel and tourism: empowering co-creation of value. J Destin Mark Manag 4(3):151–161
Buhalis D, Law R (2008) Progress in tourism management: twenty years on and 10 years after the internet: the state of e-tourism research. Tour Manag 29(4):609–623
Buhalis D, Leung R (2018) Smart hospitality – interconnectivity and interoperability towards an ecosystem. Int J Hosp Manag 71:41–50
Buhalis D, Licata C (2002) The future of e-tourism intermediaries. Tour Manag 23(3):207–220
Buhalis D, Main H (1998) Information technology in small and medium hospitality enterprises: strategic analysis and critical factors. Int J Contemp Hosp Manag 10(5):198–202
Buhalis D, Michopoulou E (2011) Information-enabled tourism destination marketing: addressing the accessibility market. Curr Issues Tour 14(2):145–168
Buhalis D, Michopoulou E (2013) Information provision for challenging markets: the case of the accessibility requiring market in the context of tourism. Inf Manag 50:229–239
Buhalis D, Molinaroli E (2003) Entrepreneurial networks in the Italian e-tourism. Inf Technol Tour 5(3):175–184
Buhalis D, O’Connor P (2005) Information communication technology – revolutionising tourism. Tour Recreat Res 30(3):7–16
Buhalis D, Sinarta Y (2019) Real-time co-creation and nowness service: lessons from tourism and hospitality. J Travel Tour Mark 36(5):563–582
Buhalis D, Spada A (2000) Destination management systems: criteria for success. Inf Technol Tour 3(1):41–58
Buhalis D, Leung D, Law R (2011) e-Tourism: critical information and communication technologies for tourism destinations. In: Wang Y, Pizam A (eds) Destination marketing and management: theories and applications. CABI, pp 205–224
Buhalis D, Harwood T, Bogicevic V, Viglia G, Beldona S, Hofacker C (2019) Technological disruptions in services: lessons from tourism and hospitality. J Serv Manag 30(4):484–506
Chung J, Buhalis D (2008) Information needs in online social networks. Inf Technol Tour 10(4):267–282
Collins GR, Cobanoglu C (2013) Hospitality information technology: learning how to use it. Kendall/Hunt Publishing Co, Dubuque
Collins C, Buhalis D, Peters M (2003) Enhancing SMTEs business performance through e-learning platforms. J Educ Train 45(8/9):483–494
Egger R, Buhalis D (2008) e-Tourism case studies: management & marketing issues in e-tourism. Butterworth Heinemann, Oxford
Fan D, Buhalis D, Lin B (2019) A tourist typology of online and face-to-face social contact: destination immersion and tourism encapsulation/decapsulation. Ann Tour Res 78:102757
Fotis J, Buhalis D, Rossides N (2011) Social media impact on holiday travel planning: the case of the Russian and the FSU markets. Int J Online Mark (IJOM) 1(4):1–19
Gretzel U, Sigala M, Xiang Z, Koo C (2015) Smart tourism: foundations and developments. Electron Mark 25(3):179–188
Hays S, Page S, Buhalis D (2012) Social media as a destination marketing tool: an exploratory study of the use of social media among National Tourism Organisations. Curr Issues Tour 16(3):211–239. https://doi.org/10.1080/13683500.2012.662215
Hsu A, King B, Wang D, Buhalis D (2016) In-destination tour products and the disrupted tourism industry: progress and prospects. Inf Technol Tour 16(4):413–433
Inkpen G (1998) Information technology for travel and tourism, 2nd edn. Addison Wesley Longman, London
Inversini A, Buhalis D (2009) Information convergence in the long tail: the case of tourism destination information. In: Information and communication technologies in tourism 2009, pp 381–392
Inversini A, Cantoni L, Buhalis D (2010) Destinations information competitors and web reputation. Inf Technol Tour 11:221–234
Ivanov S, Webster C (2019) Robots, artificial intelligence and service automation in travel, tourism and hospitality. Emerald, Bingley
Ivanov S, Gretzel U, Berezina K, Sigala M, Webster C (2019) Progress on robotics in hospitality and tourism: a review of the literature. J Hosp Tour Technol 10(4):489–521
Kallmuenzer A, Peters M, Buhalis D (2019) Host-guest value co-creation in hospitality family firms. Curr Issues Tour 22(16):2014–2033
Kim HH, Law R (2015) Smartphones in tourism and hospitality marketing: a literature review. J Travel Tour Mark 32(6):692–711
Ladkin A, Buhalis D (2016) Online & social media recruitment: hospitality employer and prospective employee considerations. Int J Contemp Hosp Manag 28(2):327–345
Law R, Qi S, Buhalis D (2010) Progress in tourism management: a review of website evaluation in tourism research. Tour Manag 31(3):297–313
Law R, Buhalis D, Cobanoglu C (2014) Progress on information and communication technologies in hospitality and tourism. Int J Contemp Hosp Manag 26(5):727–750
Leung D, Law R, Van Hoof H, Buhalis D (2013) Social media in tourism and hospitality: a literature review. J Travel Tour Mark 30(1–2):3–22
Leung XY, Sun J, Bai B (2019) Thematic framework of social media research: state of the art. Tour Rev 75, emerald.com
Li L, Buhalis D (2006) eCommerce in China: the case of travel. Int J Inf Manag 26(2):153–166. https://doi.org/10.1016/j.ijinfomgt.2005.11.007
Li L, Lockwood A, Gray D, Buhalis D (2013) Learning about managing the business in the hospitality industry. Hum Res Dev Q 24(4):525–559
Ma J, Buhalis D, Song H (2003) The adoption of ICTs & internet in China and impact to tourism industry structure. Int J Inf Manag 23(6):451–467
Mariani M (2020) Big Data and analytics in tourism and hospitality: a perspective article. Tour Rev 75(1). https://doi.org/10.1108/TR-06-2019-0259
Mehraliyev F, Cheng I, Choi Y, Koseoglu M, Law R (2020) A state-of-the-art review of smart tourism research. J Travel Tour Mark 37(1):78–91
Michopoulou E, Buhalis D (2008) Performance measures of net-enabled hypercompetitive industries: the case of tourism. Int J Inf Manag 28(3):168–180
Michopoulou E, Darcy S, Ambrose I, Buhalis D (2016) Accessible tourism futures: the world we dream to live in and the opportunities we hope to have. J Tour Futur 1(3):179–188
Minghetti V, Buhalis D (2010) Digital divide and tourism: bridging the gap between markets and destinations. J Travel Res XX(X):1–15
Mistilis N, Buhalis D, Gretzel U (2014) Future eDestination marketing: perspective of an Australian tourism stakeholder network. J Travel Res 53(6):778–790
Molinillo F, Liébana-Cabanillas F, Anaya-Sánchez R, Buhalis D (2018) DMO online platforms: image and intention to visit. Tour Manag 65:116–130
Neuhofer B, Buhalis D, Ladkin A (2012) Conceptualising technology enhanced destination experiences. J Destin Mark Manag 1:36–46
Neuhofer B, Buhalis D, Ladkin A (2014) A typology of technology enhanced experiences. Int J Tour Res 16(4):340–350
Neuhofer B, Buhalis D, Ladkin A (2015) Smart technologies for personalised experiences. A case from the Hospitality Industry. Electron Mark 25(3):243–254
Niininen O, Buhalis D, March R (2007) Customer empowerment in tourism through Consumer Centric Marketing (CCM). Qual Mark Res 10(3):265–282
O’Connor P (1995) Using computers in hospitality. Cassell, London
O’Connor P (1999) Tourism and hospitality electronic distribution and information technology. CAB International, Oxford
O’Connor P, Rafferty J (1997) Gulliver: distributing Irish tourism electronically. Electron Mark 7(2):40–45
Pan B, Fesenmaier DR (2006) Online information search: vacation planning process. Ann Tour Res 33(3):809–832
Paraskevas A, Buhalis D (2002) Information communication technologies decision-making: the ASP outsourcing model from the small hotel owner/manager perspective. Cornell Hotel Restaur Admin Q 43(2):27–39
Paraskevas A, Katsogridakis I, Law R, Buhalis D (2011) Search engine marketing: transforming search engines into hotel distribution channels. Cornell Hosp Q 52(2):200–208
Peacock M (1995) Information technology in hospitality. Cassell, London
Peters M, Buhalis D (2004) Small family hotel businesses: the need for education and training. J Educ Train 46(8/9):406–416. https://doi.org/10.1108/00400910410569524
Peters M, Frehse J, Buhalis D (2009) The importance of lifestyle entrepreneurship: a conceptual study of the tourism industry. PASOS 7(3):393–405. https://tinyurl.com/y8ebw5hs
Peters M, Kallmuenzer A, Buhalis D (2018) Hospitality entrepreneurs managing quality of life and business growth. Curr Issues Tour. https://doi.org/10.1080/13683500.2018.1437122
Pizam A, Ozturk AB, Balderas-Cejudo A, Buhalis D, Fuchs G, Hara T, Meira J, Revillae M, Sethi D, Sheng Y, State O, Hacikaraa A, Chaulagain S (2022) Factors affecting hotel managers’ intentions to adopt robotic technologies: a global study. Int J Hosp Manag 102:103139. https://doi.org/10.1016/j.ijhm.2022.103139
Poon A (1993) Tourism, technology and competitive strategies. CAB International, Oxford
Qi S, Law R, Buhalis D (2008) Usability of Chinese destination management organization websites. J Travel Tour Mark 25(2):182–198
Qi S, Law R, Buhalis D (2010) A comparison of Chinese and international online user perceptions of the usefulness of hotel websites. J Inf Technol Tour 11(4):329–340
Qi S, Law R, Buhalis D (2011a) Motivations for visiting hotel websites: Chinese versus international consumers. J Travel Tour Res 29(1):136–147
Qi S, Leung R, Law R, Buhalis D (2011b) A longitudinal study of consumer perceptions of travel website success factors in Hong Kong. FIU Hosp Rev 29(1):48–63
Qi S, Law R, Buhalis D (2014a) Who booked five-star hotels in Macau? A study of hotel guests’ online booking attention. J Hosp Tour Manag 20:76–83
Qi S, Law R, Buhalis D (2014b) A modified fuzzy hierarchical TOPSIS model for hotel website evaluation. Int J Fuzzy Syst Appl 3(3):82–101
Qi S, Law R, Buhalis D (2017) Comparative evaluation study of websites of China-based luxury hotels and international luxury hotels. J China Tour Res 13(1):1–25
Schegg R, Liebrich A, Scaglione M, Ahmad SFS (2008) An exploratory field study of web 2.0 in tourism. In: O’Connor P, Höpken W, Gretzel U (eds) Information and communication technologies in tourism 2008. Springer, Wien, pp 152–163
Sheldon P (1993) Destination information systems. Ann Tour Res 20(4):633–649
D. Buhalis
Sheldon P (1997) Information technologies for tourism. CAB, Oxford Sigala M (2003) The information and communication technologies productivity impact on the UK hotel sector. Int J Oper Prod Manag 23(10):1224–1245 Sigala M (2011) eCRM 2.0 applications and trends: the use and perceptions of Greek tourism firms of social networks. Comput Hum Behav 27:655–661 Spencer AJ, Buhalis D, Moital M (2012) A hierarchical model of technology adoption for small owner-managed travel firms: an organizational decision-making and leadership perspective. Tour Manag 33(5):1195–1208 Stamolampros P, Korfiatis N, Chalvatzis N, Buhalis D (2019a) Job satisfaction and employee turnover determinants in high contact services: insights from employees’ online reviews. Tour Manag 75:130–147 Stamolampros P, Korfiatis N, Chalvatzis K, Buhalis D (2019b) Harnessing the “wisdom of employees” from online reviews. Ann Tour Res. https://doi.org/10.1016/j.annals.2019.02.012 Stylos N, Zwiegelaar J, Buhalis D (2021) Big data empowered agility for dynamic, volatile, and time-sensitive service industries: the case of tourism sector. Int J Contem Hosp Manag 33(3):1015–1036. https://doi.org/10.1108/IJCHM-07-2020-0644 Tanti A, Buhalis D (2017) The influences and consequences of being digitally connected and/or disconnected to travellers. Inf Technol Tour 17(1):121–141 Thomaz G, Biz A, Bettoni E, Mendes-Filho L, Buhalis D (2017) Content mining framework in social media: a FIFA world cup 2014 case analysis. Inf Manag 54:786–801 Tiago F, Couto J, Faria S, Borges-Tiago T (2018) Cruise tourism: social media content and network structures. Tour Rev 75, emerald.com Tussyadiah IP, Zach FJ, Wang J (2017) Attitudes toward autonomous on demand mobility system: the case of self-driving taxi. In: Information and communication technologies in tourism 2017. Springer, Cham, pp 755–766 Tussyadiah IP, Jung TH, tom Dieck MC (2018) Embodiment of wearable augmented reality technology in tourism experiences. 
J Travel Res 57(5):597–611 Viglia G, Minazzi R, Buhalis D (2016a) The influence of e-word-of-mouth on hotel occupancy rate. Int J Contemp Hosp Manag 28(9):2035–2051 Viglia G, Werthner H, Buhalis D (2016b) Disruptive innovations. Inf Technol Tour 16:327–329 Vogt CA, Fesenmaier DR (1998) Expanding the functional information search model. Ann Tour Res 25(3):551–578 Volchek E, Song H, Liu A, Law R, Buhalis D (2018) Forecasting the demand for London museum visitation using Google trend data. Tour Econ 25(3):425–447 Wang D, Park S, Fesenmaier DR (2012) The role of smartphones in mediating the touristic experience. J Travel Res 51(4):371–387 Werthner H, Klein S (1999) Information technology and tourism: a challenging relationship. Springer, Wien Werthner H, Ricci F (2004) E-commerce and tourism. Commun ACM 47(12):101–105 Williams N, Ferdinand N, Inversini A, Buhalis D (2015) Community Crosstalk: an exploratory analysis of destination and festival eWOM on Twitter. J Mark Manag 31(9–10):1113–1140 Williams N, Inversini A, Buhalis D, Ferdinand N (2017) Destination eWOM A macro and meso network approach? Ann Tour Res 64:87–101 Xiang Z, Gretzel U (2010) Role of social media in online travel information search. Tour Manag 31(2):179–188 Xiang Z, Woeber K, Fesenmaier DR (2008) Representation of the online tourism domain in search engines. J Travel Res 47(2):137–150 Xiang Z, Tussyadiah I, Buhalis D (2015) Smart destinations: foundations, analytics, and applications. J Destin Mark Manag 4(3):143–144 Xu F, Tian F, Buhalis D, Weber J, Zhang H (2016) Tourists as mobile gamers – gamification for tourism marketing. J Travel Tour Mark 33(8):1124–1142 Xu F, Buhalis D, Weber J (2017) Serious games and the gamification of tourism. Tour Manag 60:244–256
Yao B, Qiu R, Fan D, Liu A, Buhalis D (2019) Standing out from the crowd – an exploration of signal attributes of Airbnb listings. Int J Contemp Hosp Manag 31(12):4520–4542 Ye Q, Law R, Gu B (2009a) The impact of online user reviews on hotel room sales. Int J Hosp Manag 28(1):180–182 Ye Q, Zhang Z, Law R (2009b) Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. Expert Syst Appl 36(3):6527–6535 Ye Q, Law R, Gu B, Chen W (2011) The influence of user-generated content on traveler behavior: an empirical investigation on the effects of e-word-of-mouth to hotel online bookings. Comput Hum Behav 27(2):634–639 Yovcheva Z, Buhalis D, Gatzidis C (2012) Smartphone augmented reality applications for tourism. e-Rev Tour Res (eRTR) 10(2):63–66 Yovcheva Z, Buhalis D, Gatzidis C (2013) Engineering augmented tourism experiences. In: Information and communication technologies in tourism 2013. Springer, Berlin/Heidelberg, pp 24–35 Yovcheva Z, Buhalis D, Gatzidis C, van Elzakker C (2014) Empirical evaluation of smartphone augmented reality browsers in an urban tourism destination context. Int J Mob Hum Comput Interact (IJMHCI) 6(2):10–31 Zhang Z, Ye Q, Law R, Li Y (2010) The impact of e-word-of-mouth on the online popularity of restaurants: a comparison of consumer reviews and editor reviews. Int J Hosp Manag 29(4):694– 700 Zhang H, Gordon S, Buhalis D, Ding X (2018) Experience value cocreation on destination online platforms. J Travel Res 57(8):1093–1107
4 e-Tourism Research: A Review
Yulan Yuan, Yuen-Hsien Tseng, and Ching Li
Contents
Introduction
Data and Methods
Overview of e-Tourism Research Status
Main e-Tourism Research Topics
Discussion and Future Research
Conclusion
Cross-References
References
Abstract
The growing importance of information and communication technology (ICT) in tourism is widely acknowledged, resulting in a growing body of research on e-Tourism. This chapter presents a comprehensive review of the status of e-Tourism research and its topics based on 752 articles collected from 14 tourism and hospitality journals indexed in the Web of Science (WoS) database. Publication trends regarding e-Tourism, contributing countries/territories, and changes in the knowledge basis of scientific disciplines are presented as an overview of e-Tourism research. This is followed by a clustering approach based on bibliographic coupling (BC) similarity between articles to reveal salient topics (and subtopics) as well as their relatedness. Interpretations based on quantitative results and qualitative insights are provided. Complemented with domain knowledge and literature evidence, insights are presented to show possible future research directions, which would benefit both novice and experienced researchers and those who are considering e-Tourism as a research focus.
Y. Yuan
Department of Landscape Architecture, College of Fine Arts and Creative Design, Tunghai University, Taichung, Republic of China
e-mail: [email protected]
Y.-H. Tseng
Graduate Institute of Library and Information Studies, National Taiwan Normal University, Taipei, Republic of China
e-mail: [email protected]
C. Li
Graduate Institute of Sport, Leisure, and Hospitality Management, National Taiwan Normal University, Taipei, Republic of China
e-mail: [email protected]
© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_12
Keywords
e-Tourism research topics · Bibliometric approach · Research trend · Content analysis
Introduction
The growing importance of information and communication technology (ICT) in tourism is widely acknowledged, since it affects not only the information searches and travel decisions of individuals but also organizational competitiveness and service provision. Scholars have taken an active interest in examining the influence of ICT on tourism and hospitality phenomena. Auliana Poon, a pioneer in the study of tourism and technology, has been widely recognized for her work, “Tourism and Information Technologies,” published in Annals of Tourism Research in 1988. Subsequently, several leading tourism scholars, including Pauline Sheldon (Tourism Information Technology, 1997), Gary Inkpen (Information Technology for Travel and Tourism, 1998), and Hannes Werthner and Stefan Klein (Information Technology and Tourism: A Challenging Relationship, 1999), authored significant books in the early phases of the development of the field of e-Tourism. The e-Tourism research literature has continued to grow and evolve through research published in academic journals. Furthermore, as Yuan et al. (2019) noted, e-Tourism has become an increasingly popular research subject in the tourism field, with a substantial increase in journal publications in recent years.
Scholarly research begins with a review of the literature to understand the background knowledge associated with the research area. That is, making sense of prior research is fundamental to academic endeavors. As noted by Shafique (2013), “Studying the kind and content of the knowledge produced by a field can inform about the justification and contribution of the field as well as its evolution and future prospects” (p. 34). Academic journals reflect academic endeavor and provide a medium to disseminate and exchange scholarly knowledge among professionals. They have become an essential platform for scholars and practitioners to learn the known and to identify the unknown. As stated by Van Doren and Heit (1973), “Academic journals mirror the direction of a discipline’s research” (p. 67). Thus,
examining articles published in these academic journals can help identify the influential literature and research frontiers. Moreover, such introspection can reveal the evolution of a field and its important structural properties (Xiao and Smith 2005).
Scholars and practitioners need to learn about the core knowledge of e-Tourism, as well as the research trends, to quickly understand newly emerging e-Tourism topics. However, the increasing breadth and number of available papers make reviewing the literature a significant challenge when time is limited. Conventional retrospective studies are limited to general literature reviews (Frew 2000). Some more comprehensive reviews, such as those by Buhalis and Law (2008), Wang et al. (2010), and Buhalis et al. (2011), examine the contents of e-Tourism-related articles and classify them into topic areas, relying mostly on expert judgment, which requires a focus on a manageable set of papers. Reviewing a large corpus of literature and organizing it into meaningful content, however, is a task that can be achieved with new analytical tools now available to e-Tourism scholars.
This review builds on and extends the previous work of Yuan et al. (2019) to investigate the current status and trends of e-Tourism research. It applies automatic content analysis methods, using scientometric freeware, to investigate the status of e-Tourism research and its central topics based on 752 articles collected from 14 tourism and hospitality journals indexed in the Web of Science (WoS) database. It takes a tourism rather than a computer science perspective, and a mainstream rather than a specialist perspective that would have considered only e-Tourism-focused journals. Further, this review conducts an overview analysis as well as a drill-down analysis of research topics and research trends. Publication trends, contributing countries/territories, and insights regarding changes in the knowledge basis of e-Tourism anchored in scientific disciplines are provided. Interpretations based on quantitative results and qualitative insights are presented to show possible future research directions.
Data and Methods
Information, technology, people, and organization and society are the four fundamental components of studying the use of technology (DeSanctis and Fulk 1999; DeSanctis and Poole 1994; Zhang and Benjamin 2007). Information is recognized as the lifeblood of tourism (Benckendorff et al. 2014, 2019), and the use of information, whether at different travel stages or by a given organization, is often the object of interest in a given tourism context. From this perspective, ICT refers to the hardware, software, infrastructure, technical tools, applications (Gretzel et al. 2015; Yuan et al. 2006), and services that employ technology for serving, mediating, processing, managing, storing, manipulating, and analyzing information (Wang and Zhang 2012). Additionally, ICT facilitates travel-related content generated by tourists, which is used to assist people's travel-related decision-making across different travel stages (Sigala et al. 2012; Xie et al. 2011). The dynamic interactions and integrations of the four fundamental
components represent the multidisciplinary characteristics of e-Tourism research. Though e-Tourism is defined in various ways, it always includes ICT and the Internet (Buhalis and Law 2008). The term e-Tourism, therefore, refers to electronic tools, software, and machines that tourists or industry practitioners use to help them resolve problems. e-Tourism also covers the end products (e.g., data, strategy, and operational activities) generated in the process of using these tools. Tourism IT includes the Internet and wireless networks that connect user equipment to central service systems.
The selection of keywords to search the WoS database for article review was based on these definitions and comprised the following steps. First, previous literature (Buhalis and Law 2008; Leung and Law 2007; Frew 2000; Wang et al. 2010; Yuan et al. 2014) was reviewed to compile a list of e-Tourism keywords, including 26 innovations (tools/machines/software) and 14 industry and business functions. Second, three scholars specialized in e-Tourism research and two scholars with a computer science background were invited to review and modify the list of keywords. This process added 14 keywords, resulting in 54 keywords to identify records in the WoS database. Table 1 presents the final list of 54 keywords that were used to filter e-Tourism-related journal articles in WoS. A total of 752 journal articles from 14 tourism and hospitality journals published during the period 1999–2018 were identified, and their WoS records were downloaded on 10 June 2019.
To analyze these data, the Content Analysis Toolkit for Academic Research (CATAR) software package at https://github.com/SamTseng/CATAR/ was used to identify subject areas and citation patterns based on bibliographic coupling (BC) and to investigate the journal articles quantitatively and qualitatively. CATAR is freeware developed by Tseng (2010) that supports multistage clustering (MSC, for building hierarchical classifications) to group research issues into concepts and, in turn, concepts into topics, based on the similarity of articles’ references. Three functions of CATAR were performed: (1) overview analysis for revealing overall research status, (2) bibliographic coupling analysis for clustering major research topics, and (3) automatic content analysis to detect keywords and key concepts in the content of articles.
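The idea behind bibliographic coupling can be illustrated with a minimal Python sketch. It is not CATAR's implementation: the chapter does not state which similarity measure or linkage rule CATAR applies, so the Jaccard overlap, the single-link grouping, the threshold of 0.2, and the toy records below are all assumptions made for illustration only.

```python
from itertools import combinations


def bc_similarity(refs_a, refs_b):
    """Jaccard overlap of two reference lists (an assumed BC similarity measure)."""
    a, b = set(refs_a), set(refs_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_coupling(articles, threshold=0.2):
    """Group articles whose BC similarity reaches the threshold (single-link).

    `articles` maps an article id to its list of cited references.
    Returns (clusters, outliers); an outlier is an article that is not
    coupled strongly enough with any other article.
    """
    parent = {aid: aid for aid in articles}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(articles, 2):
        if bc_similarity(articles[a], articles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for aid in articles:
        groups.setdefault(find(aid), []).append(aid)
    clusters = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    outliers = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, outliers


# Hypothetical toy records, not data from the chapter.
toy = {
    "A1": ["r1", "r2", "r3"],
    "A2": ["r2", "r3", "r4"],
    "A3": ["r7", "r8"],
    "A4": ["r8", "r9"],
    "A5": ["r10"],
}
clusters, outliers = cluster_by_coupling(toy)
```

In this toy run, A1 and A2 share two of four distinct references and are grouped together, as are A3 and A4, while A5 shares no reference with any other article and is reported as an outlier. A multistage variant would repeat the same step on the resulting groups, treating each group's pooled references as one unit.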
Overview of e-Tourism Research Status
A fundamental reason for the success of e-Tourism is the wide adoption of the Internet and Web. Figure 1 illustrates the number of e-Tourism publications per year over the past 29 years, showing that it rose significantly from 2007. The influence of e-Tourism is indicated by the number of times e-Tourism studies are cited, shown as the black trend line in Fig. 1. The first ENTER conference was held in 1994. The International Federation for IT and Travel & Tourism (IFITT) was then established in 1997 (IFITT 2019) to raise awareness of e-Tourism among scholars and practitioners. The influence of journal publications increased in 2007, after 10 years of efforts to promote the importance of the Internet and Web. Various Internet-based information and communication technologies were
Table 1 The keywords used in identifying e-Tourism research
Category: Technology (applications, infrastructure, software/hardware, technical resources, tools)
Key terms used in search: APP; Ambient intelligence; Artificial intelligent; Augmented reality; Blog; Browser; Computer; Cryptocurrency; Email; Extranet; Facebook; Gamification; Geographic Information System (GIS); Information and Communication Technology/ICT; Google Analytics/Google trend; Internet; Netnography; Online; Podcast; Search engine; Self-service technology; Smartphone/mobile phone; Social media; Tweeter; Virtual community; Virtual reality; Web/website; YouTube; 5G
Category: Organization and Society (culture, management, operations, process, strategy)
Key terms used in search: eBusiness; eCommerce; eComplaint; eCRS; eGovernment/mGovernment; eGuide/mGuide; eMarketing; eWoM (electronic word-of-mouth); e-Tourism; eTrust
Category: Information (analysis, classification, representation)
Key terms used in search: Big data; Data mining; Open data; Text mining; User-generated content
introduced, for example, mobile phones with touchscreens, which revolutionized the information search and decision behaviors of tourists. Subsequently, e-Tourism became one of the most important and fastest-growing fields of tourism research (Yuan et al. 2019).
Figures 2, 3, and 4 depict the publication characteristics of e-Tourism research. The contribution of different continents and regions was estimated from the location of each author. A total of 1,375 authors from 57 countries have contributed to e-Tourism research. As shown in Fig. 2, the top three contributing continents and regions are Asia (305 articles, 30.1%), North America (289 articles, 28.4%), and Europe (207 articles, 20.3%). A previous study by Yuan et al. (2019) revealed a similar pattern, suggesting a rise in knowledge power and academic institutions in Asia. The United States contributed the most articles and is also the most influential country/territory (Fig. 3). It is followed by Hong Kong, the United Kingdom, and China. The top 15 countries account for 86.1% of all e-Tourism publications considered in the analysis. The United States remains the most productive and influential country.
Fig. 1 The number of cited times of e-Tourism articles published in the period of 1990 to 2018 (horizontal axis: publication year, 1990 to 2018; vertical axes: 0 to 110 and 0 to 3,000)
Fig. 2 The contribution of different continents and regions (values shown in the chart: Asia 33.3%; Europe 29.6%; North America 28.4%; Australia/Oceania 7.4%; Africa 0.9%; South America 0.5%)
However, China shows strong potential to contribute to future e-Tourism research.
The Journal Citation Reports (JCR) database classified a total of 263 scientific disciplines in 2018. Each article used in this study was assigned to one discipline based on this JCR classification. Figure 4 shows the accumulated number of disciplines represented by the articles per year. As shown in Fig. 4, e-Tourism knowledge is mainly derived from five disciplines, with Social Sciences, Business & Economics, and Environmental Sciences & Ecology as the top three subject categories. Figure 4 demonstrates that e-Tourism research is multidisciplinary in nature and suggests that there are many possible approaches to studying e-Tourism.
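The kind of tally that underlies a chart of accumulated articles per discipline can be sketched as follows. This is only an illustration under stated assumptions: each article is taken to carry one publication year and one discipline label, "accumulated" is read as a running total over the years, and the function name and toy records are hypothetical rather than taken from the chapter or from CATAR.

```python
from collections import Counter
from itertools import accumulate


def cumulative_by_discipline(records, years):
    """Return {discipline: [cumulative article count for each year in `years`]}.

    `records` is an iterable of (publication_year, discipline) pairs,
    one pair per article.
    """
    per_year = Counter(records)  # (year, discipline) -> article count
    disciplines = sorted({d for _, d in per_year})
    return {
        d: list(accumulate(per_year[(y, d)] for y in years))
        for d in disciplines
    }


# Hypothetical toy records, not data from the chapter.
toy_records = [
    (2002, "Social Sciences"),
    (2003, "Social Sciences"),
    (2003, "Business & Economics"),
    (2005, "Social Sciences"),
    (2005, "Business & Economics"),
    (2005, "Business & Economics"),
]
series = cumulative_by_discipline(toy_records, years=range(2002, 2006))
```

For the toy input, the running totals for 2002 to 2005 are 1, 2, 2, 3 for Social Sciences and 0, 1, 1, 3 for Business & Economics; years without new articles simply repeat the previous total.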
Fig. 3 The top 15 contributing countries/territories and their influence. NC: number of published articles (left axis); TC: cited times of articles (right axis). Countries/territories shown: NZ, TR, CH, AT, PT, IT, CA, TW, KR, AU, ES, CN, UK, HK, US
Fig. 4 The changes in knowledge basis of scientific disciplines (horizontal axis: year, 2002 to 2018; vertical axis: number of articles; legend: Social Sciences; Business & Economics; Environmental Sciences & Ecology; Sociology; Education Science & Technology)
Main e-Tourism Research Topics
Multistage clustering (MSC) based on bibliographic coupling (BC) similarity was applied to the articles published in the 14 tourism and hospitality journals from 1999 to 2018 and identified ten categories of e-Tourism research topics. Table 2 presents the ten topic categories and their subtopics. The ten research topics are (1) online tourist behavior (OTB), (2) e-Tourism experience (ETE), (3) eCommerce in hospitality (EIH), (4) data analysis and applications (DAA), (5) ICT and destination competitiveness (IDC), (6) internet research (IR), (7) new service development (NSD), (8) self-service technology
Fig. 5 e-Tourism research topics revealed via the multistage clustering and multidimensional scaling (graphic is rendered by using VOSviewer)
(SST), (9) tourists’ experiential responses (TER), and (10) place branding and image co-creation (PBIC). Articles that shared no common citations with other articles were treated as outliers and were removed in the multistage clustering (MSC) process during the bibliometric analysis. Thus, the analysis included 609 of the original 752 articles.
Figure 5 shows the relatedness and popularity of the ten topics. Each circle denotes a research topic, where the size of a circle reflects the number of articles clustered in the corresponding topic and the closeness of circle centers reflects the relatedness of the corresponding topics. As shown in Fig. 5, OTB is the most popular e-Tourism topic with 414 articles; ETE is closely related to OTB and EIH, while IDC and IR are two closely related topics and are also relatively different from other e-Tourism topics. Figure 6 presents the development trends of the ten topics in e-Tourism research from 1999 to 2018.
Table 2 lists the scholarly contributions to each topic. OTB, which examines the changes in user behaviors, the adoption and evolution of e-Tourism, and the interactions among different user groups, including individuals and private and public organizations, is the topic that attracted the most scholarly contribution, at 68.0% of 609 articles. OTB comprises ten subtopics, namely, (1) online marketing and social media, (2) intermediaries and e-Tourism impacts, (3) eTrust and virtual community, (4) online information searching behavior and channel usage, (5) effects and performance of websites, (6) website quality evaluation, (7) online shopping and marketing, (8) the use and adoption of mobile technology, (9) new sources of data and young tourists, and (10) social connections and mobile technology. As indicated in Fig. 5, OTB is the most popular e-Tourism research topic and is likely to remain so in the future. Many studies of OTB in the early 2000s examined e-Tourism from the individual perspective. At the same time, some studies took the organizational perspective to investigate how ICT was used
Fig. 6 Development trends of the major ten topics in e-Tourism research from 1999 to 2018 (periods: 1999–2002, 2003–2006, 2007–2010, 2011–2014, 2015–2018; series: ETE, EIH, DAA, IDC, TER, IR, SST, PBIC, NSD, OTB; data labels shown in the chart: 2, 7, 45, 128, 232). OTB is plotted against the right-hand axis
by various tourism organizations to provide services to tourists and capitalize on online tourist behaviors. This line of research mainly examined various dimensions of customer needs and behaviors to help tourism organizations identify the online behaviors of their customers. The focus has gradually shifted to understanding the psychological and behavioral effects of ICTs. Scholars have made substantive contributions to online marketing and social media, the OTB subtopic with the largest number of articles. Studies in this subtopic investigated the profiles and posting motivations regarding eWOM (Bronner and De Hoog 2011), the effects on corporate reputation (Dijkmans et al. 2015), and the counter strategies in responding to negative eWOM (Sparks et al. 2016). User-generated content (UGC) has recently been recognized as a text data source and has been widely studied to explore its potential to enrich the understanding of tourist opinions, needs, and destination images.
ETE contained 56 articles and was the second most popular topic. This topic mainly focuses on how ICT mediates the travel experience and affects the image formation of a destination, recognizing various cultural phenomena caused by ICT use. For instance, taking travel selfies has become a common behavior when people travel. People are also accustomed to sharing these photographs of themselves on various social media platforms. The interactive nature of social media helps attract more potential tourists (Lyu 2016; Xiang and Gretzel 2010). This topic has four subtopics, namely, (1) formation and evaluation of virtual destination image, (2) online culture and digital behaviors, (3) the influence
of ICT on travel experience, and (4) website experiences. Subtopic 3 focuses specifically on the use of social media (Munar and Jacobsen 2014) and how social media mediates experience as well as the creation of destination images (Neuhofer et al. 2012). How to optimize website design to create engaging user experiences is a question that concerns all tourism industry players. Studies in subtopic 4 address how to satisfy the information needs of tourists most effectively.
EIH gained attention in the period 2007–2010, as shown in Fig. 6. Articles in this topic examined the rating, booking, and pricing of tourist services and products over the Internet, specifically in the hotel context. The EIH topic has four subtopics: (1) online pricing strategy, (2) online booking and pricing, (3) online rating and mining, and (4) guest reactions to online reviews. As revealed in this topic, Internet-based technology significantly affects the decision-making of tourists, who use the Internet not only to search for and book hotel rooms but also to find the best deals (Toh et al. 2011). As more people are willing to share their travel experiences and to rate travel products online, online reviews have great influence on the choice of destinations and travel products. How to package travel products and how to employ effective pricing strategies have thus caught scholars’ attention.
DAA studies focus specifically on applying data collected from new information sources to gain insight into tourist behaviors. This research topic emerged in 2009–2010 and did not gain much attention until 2015–2018. Its trajectory shows growing interest from scholars. Data collected from these new sources, such as social media, global positioning system (GPS) devices, and records of mobile phone activities, provide spatial and temporal information, enabling scholars to track the spatial movement patterns of tourists (Chua et al. 2016; Shoval and Isaacson 2007). For example, data obtained from Twitter has been used to map tourist flows (Chua et al. 2016), and data obtained from Flickr can be employed to track digital footprints (Önder et al. 2016). Research related to this topic thus focuses on the precision and validity of different sources of data. A single data source is not sufficient to represent complex spatial-temporal patterns of tourists. Therefore, studies comparing data sources evaluated the usefulness of those data (Pan et al. 2012; Salas-Olmedo et al. 2018). A comprehensive review of different types of data in tourism can be found in the work of Li et al. (2018).
IDC comprises subtopics related to Internet branding and eDMO capability. The use of ICT has enabled destination management organizations (DMOs) to perform old tasks in new ways. However, how to use various types of ICT to manage new tasks effectively and efficiently remains a challenge. To address this issue, Yuan et al. (2006) evaluated the capability of DMOs to adopt and implement ICT. From the DMO perspective, ICT refers to an integrated information system rather than a single technology. A follow-up study by Wang (2008), which specifically assessed factors for managing and implementing Internet-based information systems, provided implementation guidelines and helped formulate effective strategies for the use of such systems. A crucial task for DMOs is to utilize tourism-related data to evaluate market performance and make
Table 2 Ten research topics in e-Tourism research (number of articles in parentheses)
Topic 1 (414): Online Tourist Behavior (OTB)
  1. Online marketing and social media (102)
  2. Intermediaries and e-Tourism impacts (90)
  3. eTrust and virtual community (75)
  4. Online information searching behavior and channel usage (48)
  5. Effects and performance of website (45)
  6. Website quality evaluation (21)
  7. Online shopping and marketing (15)
  8. The use and adoption of mobile technology (12)
  9. New sources of data and young tourists (8)
  10. Social connections and mobile technology (8)
Topic 2 (56): e-Tourism experience (ETE)
  1. Formation and evaluation of virtual destination image (24)
  2. Online culture and digital behaviors (12)
  3. The influence of ICT on travel experience (10)
  4. Website experience (10)
Topic 3 (45): eCommerce in hospitality (EIH)
  1. Online booking (24)
  2. Online pricing strategy (7)
  3. Online rating and mining (9)
  4. Guest reactions to online review (5)
Topic 4 (36): Data analysis and applications (DAA)
  1. Customer profile (7)
  2. Spatial-temporal movement analytics (16)
  3. Digital analytics (13)
Topic 5 (12): ICT and destination competitiveness (IDC)
  1. Internet branding (6)
  2. eDMO capability (6)
Topic 6 (12): Internet research (IR)
  1. Online surveys (8)
  2. Alternative methods (4)
Topic 7 (11): New service development (NSD)
  1. New service development (6)
  2. Social media marketing (5)
Topic 8 (18): Self-service technology (SST)
  1. Self-service technology (9)
  2. Performance and competitiveness of hospitality (9)
Topic 9 (11): Tourists’ experiential responses (TER)
  1. Online decision-making (7)
  2. Effectiveness of online portraits and representations (4)
Topic 10 (9): Place branding and image co-creation (PBIC)
  1. Place branding and media content (5)
  2. Channel comparison and selection (4)
more objective evaluations of the development of tourism demand. To support data exchange and collection, collective data repositories such as TourMIS have been established for tourism research and practice. Data from TourMIS related to urban sustainability were used to demonstrate how such repositories can yield valuable information to support destination management (Önder et al. 2017).
IR shows that Internet-based data collection methods are attractive to tourism researchers because of their low cost, wide coverage, and fast responses. However, the validity and reliability of survey data collected online were soon questioned. Studies examined the strengths and potential weaknesses of online surveys, compared online surveys with other survey formats (Dolnicar et al. 2009), and tried to understand the factors affecting sample representativeness (Pan et al. 2018). Many intriguing questions related to this topic remain unanswered. A probability sample of respondents from an open population is difficult to obtain. Therefore, researchers need to consider how to generate sampling frames of populations when employing different online social media applications to collect data and how to combine data collected from mobile devices or online with those from conventional survey techniques, such as postal mail surveys.
NSD & SST (Topics 7 and 8) are closely related. Topic 7 concerns new service development, with specific attention paid to social media (e.g., Facebook and TripAdvisor) and a focus on the hotel industry. Social media was recognized as a channel to learn about customers (Chan and Guillet 2011) and as a new channel to reach international travelers (Hsu 2012). Studies in NSD address the advantages of social media use and the factors that limit it. Studies in Topic 8 (SST) take a service delivery perspective to examine technology-facilitated transactions and interactions between
service providers and guests, with research conducted mostly in the contexts of hotels and restaurants, such as automated hotel checkout and ticketing services over the Internet. Studies focus on the factors affecting the adoption and implementation of SST, users’ attitudes and perceptions of SST (Kucukusta et al. 2014), and the benefits and competitive advantage resulting from the use of SST (Wei et al. 2016). User interactions with technology-based self-service delivery options in response to different tourist needs form an important research topic.
TER & PBIC (Topics 9 and 10) recognize the importance of persuasive service design; thus, studies focus on shared online travel information and marketing content and how these shape destination image formation and online purchasing behaviors for tourism-related products and services. Studies in these two topics aimed to identify the cognitive effects of personal stories (Tussyadiah and Fesenmaier 2010), of pictures (Boley et al. 2013), and of sounds and other sensory features (Lee and Gretzel 2012). Implications from these studies focus on how to deliver persuasive communication effectively. It is worth noting that these two topics already attracted substantive attention from scholars in the previous Subtopics 1, 2, 5, and 7. Results of the bibliographic coupling analysis suggest that these two independent topics are based on a different set of academic knowledge from that used in the previous subtopics. They represent more communication-/persuasion-related theories and approaches. Thus, they demonstrate that e-Tourism as a study field attracts scholarly contributions from a wide range of disciplines. Table 3 lists the top three most cited references for each topic to help verify the epistemological characteristics of each topic. 
Scholars who are interested in understanding more about these topics can use this list as a starting point when familiarizing themselves with a topic and trying to understand its knowledge base.
Discussion and Future Research
This review of the e-Tourism literature published in top tourism and hospitality journals indicates that tourism researchers have contributed significantly to the expansive body of e-Tourism literature. As can be seen in Fig. 5, OTB, EIH, and ETE are the topics that received considerable research attention over the last 20 years. Studies focus on evaluating the relationships between technology and users and how organizations can apply ICT to provide new services and to effectively market products online. Researchers have tried to understand the connection between online activities and physical experiences in order to bridge the online and offline worlds. Most articles in these topics took individual tourists as research subjects. Online tourist behavior is a global phenomenon, but an international, macro-level perspective seems to have been overlooked in previous studies. Moreover, almost no research has considered the influence of global IT companies on ICT adoption and management of tourism organizations. The same ICTs, such as Uber ridesharing, are adopted in various countries. Research looking at topics such as social conflicts, impacts on intermediaries, labor policies, and legal compliance arising from new applications and powerful global platforms would be welcome.
Table 3 The top three most cited references for each topic from 1990 to 2018
Topic: Sources of citation
Topic 1: Online Tourist Behavior
  Buhalis (2008) TM, Vol. 29, p. 609
  Fornell (1981) JMR, Vol. 18, p. 39
  Xiang (2010) TM, Vol. 31, p. 179
Topic 2: e-Tourism experience
  Choi (2007) TM, Vol. 28, p. 118
  Gallarza (2002) ATR, Vol. 29, p. 56
  Baloglu (1999) ATR, Vol. 26, p. 868
Topic 3: eCommerce in hospitality
  Tso (2005) IJHM, Vol.
  Toh (2011) CHQ, V52, p. 181
  O’Connor (2003) CHRAQ,
Topic 4: Data analysis and applications
  Shoval (2011) ATR, Vol. 38, p. 1594
  McKercher (2012) TG, Vol. 14, p. 147
  Thornton (1997) EP, Vol. 29, p. 184
Topic 5: ICT and destination competitiveness
  Dwyer (2003) CIT, Vol. 6, p. 369
  Crouch (2011) JTR, Vol. 50, p. 27
Topic 6: Internet research
  Dillman (2007) MIS
  Buhalis (2008) TM, Vol. 29, p. 609
  Cobanoglu (2001) IJMR, Vol. 43, p. 441
Topic 7: New service development
  Boyd (2007) JCMC, Vol. 13, p. 210
  Wang (2002) TM, Vol. 23, p. 407
Topic 8: Self-service Technology
  Law (2005) IJCHM, Vol. 17, p. 170
  Siguaw (2000) JTR, Vol. 39, p. 192
  Kokkinou (2013) IJHM, Vol. 33, p. 435
Topic 9: Tourist experiential responses
  Adaval (1998) JCP, Vol. 7, p. 207
  Sirgy (1982) JCR, Vol. 9, p. 278
  Padgett (1997) JA, Vol. 26, p. 49
Topic 10: Place branding and image co-creation
  Leung (2013) JTTM, Vol. 30, p. 3
  Connell (2006) TM, Vol. 27, p. 1093
Note: ATR Annals of Tourism Research, CIT Current Issues in Tourism, CHQ Cornell Hospitality Quarterly, CHRAQ Cornell Hotel and Restaurant Administration Quarterly, EP Environment and Planning A, IJHM International Journal of Hospitality Management, IJCHM International Journal of Contemporary Hospitality Management, IJMR International Journal of Market Research, JA Journal of Advertising, JCMC Journal of Computer-Mediated Communication, JCP Journal of Consumer Psychology, JCR Journal of Consumer Research, JMR Journal of Marketing Research, JTTM Journal of Travel and Tourism Marketing, JTR Journal of Travel Research, MIS Mail and Internet Surveys, TG Tourism Geographies, TM Tourism Management
Some topics have only recently emerged and therefore have fewer than three frequently cited works listed
Previous studies of e-Tourism emphasize the interactions among people, technology, and organizations. Indeed, the extension of Internet connectivity into physical devices and everyday objects has introduced a promising paradigm, the Internet of Things (IoT), which has enabled a new wave of data generation and collection. However, the majority of e-Tourism research perceives ICT as tools to provide services, rather than as sensors and data sources. Although few studies in e-Tourism have focused on this research direction, our analysis and observation of past publications suggest that IoT potentially enables the collection of a huge quantity and variety of data. Thus, the enabling technology of big data, the information capability of organizations, and the application of artificial intelligence are expected to draw more attention from researchers. More DAA-related studies can be expected
in the future. However, it seems only a very small part of “big” data is currently available to tourism scholars.
ICT enables people to join the creation of the products and services of destinations and shape their own travel experience. The co-creation of travel experiences is expected to flourish with new forms of ICT emerging over the coming years. The massive and undeniable benefits of digital access to information and tourism products have been acknowledged. However, the creation of travel experiences is highly mediated by ICT (Tussyadiah and Fesenmaier 2009). Few e-Tourism studies address issues related to the dark side of e-Tourism. When ICT is used to create new experiences and mediate experiences, the question is to what extent tourists are really experiencing the destination. Would the destination rendered through technologies prevent people from discovering the true characteristics of the place? Additionally, new service development is built upon the adoption of ICT, meaning that e-Tourism is extremely technology-dependent. Although ICT is becoming increasingly powerful, this does not mean ICT is equally accessible to all users. When people lag behind or even refuse to adopt new technology, tourism organizations might see no use in incorporating the new technology into their service. The readiness to use ICT and the accessibility of the Internet and related technology vary among countries/territories (Dutta et al. 2015). Moreover, there are worries about artificial intelligence, privacy, and the technological displacement of labor. This could be a reason why the concept of smart tourism has only been rudimentarily developed and deployed (Gretzel et al. 2015). All these are topics that require further exploration.
Tourists access online sources of information via the Internet in three stages: before, during, and after the trip (post-trip). Studies in TER and PBIC are mainly concerned with the before-trip context. 
The review indicates that little attention has been given to information needs during the trip and after the trip. While current research examines ICT use in the pre-trip stage, it fails to address the varying information needs in different stages, which need to be considered for information provision. Moreover, travel information search is a recurring and ongoing process and does not always end with any given trip. Thus, e-Tourism scholars have an opportunity to improve the understanding of how to modify the content of communication to reflect the information needs across different stages of the trip.
Conclusion
In conclusion, the topics presented here are not meant to be the definitive classification of e-Tourism studies but to serve as input for exploration and reflection. Analyzing past research efforts is essential for informing future research. The use of BC enables this review to group a large number of articles with a similar epistemological basis and thus provides insights into research trends and topics of e-Tourism research. The widespread use of Internet-based technology has inevitably brought rapid changes for the tourism industry and for tourists. It can be expected that the number of e-Tourism publications will grow quickly in the near future.
Nevertheless, the large majority of e-Tourism articles will still be produced by scholars in the United States; China has had the fastest e-Tourism research growth rate over the last 20 years. It was noted that contributions to understanding e-Tourism are uneven. Research efforts came mainly from 15 countries/territories, of which those in Asia and North America account for the largest portion. Given the trend shown in Fig. 6, the diffusion of e-Tourism knowledge has only just begun. IFITT, as the leading international ICT and tourism association, ought to expand its influence and facilitate e-Tourism knowledge dissemination across the continents. Efforts to encourage research in other regions are strongly called for.
The limitations of the research must be considered. The subject areas identified from the given data set present the current research efforts of the e-Tourism field from an epistemological perspective. However, such research only reflects the primary interests of current researchers whose work is published in the SSCI-listed tourism journals. The BC analysis compares the references of the articles being analyzed and therefore requires citation sources of high data quality. WoS is the database that currently fits this requirement. However, the emergence of new citation tools will enable scholars to pull data from other databases (Google Scholar or Scopus), clean the data, and filter unwanted data points. Incorporating data from those databases will reveal a more comprehensive picture of e-Tourism research trends. Further, the identified subject areas are based on shared citations, which means that the study excludes works by researchers who did not follow tourism citation conventions or who built their research on different literature. As a result of this focus on shared citations, similar subtopics might also appear repeatedly. Examples are the subtopics website quality evaluation in Topic 1 and website experience in Topic 2. 
Works in the former subtopic evaluate how the website might influence users’ attitudes and decisions, while works in the latter subtopic evaluate the features that allow website providers to create engaging user experiences. Consequently, the website is the common research object for those subtopics, while the approaches and views taken are quite different. Therefore, different sets of citations were used to create new knowledge in these two topic areas. In addition, efforts to further review the corpuses of these subtopics and make comparisons between subtopics are strongly recommended as an avenue for future research.
As e-Tourism research has many areas to explore and to cover, more works are continually being published. By the time this work was finished, COVID-19 had shaken the tourism industry more than ever before, and people had realized that the Internet and mobile technologies are indispensable in our daily lives and for tourism. The Internet has essentially become a utility. Research responses to the COVID-19 pandemic are appearing quickly, such as “e-Tourism beyond COVID-19: a call for transformative research” (Gretzel et al. 2020). As a result, new e-Tourism-related research topics have already emerged that are not reflected in the current study. e-Tourism as a research field is also constantly under scrutiny, and reflections on the state of the field, such as “Knowledge Creation in Information Technology and Tourism: A Critical Reflection and an Outlook for the Future” (Xiang et al. 2020), can provide new directions. In addition, the Information Technology and Tourism
journal has finally been included in the Social Sciences Citation Index. The journal covers many important works regarding e-Tourism as a field, for example, “Future research issues in IT and tourism” (Werthner et al. 2015). Research issues and challenges identified in these works are likely to chart new courses for e-Tourism research. It is therefore important to continue reviewing developments in the field.
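The bibliographic coupling (BC) comparison discussed in this conclusion, in which articles are related through the references they share, can be illustrated with a minimal sketch. The Python below is an illustration under stated assumptions rather than the procedure used in this review: the article identifiers, reference labels, and function names are hypothetical, and coupling strength is taken simply as the number of references two articles have in common.

```python
from itertools import combinations


def coupling_strength(refs_a, refs_b):
    """Return the number of cited references shared by two articles."""
    return len(set(refs_a) & set(refs_b))


def coupling_pairs(articles):
    """Map each unordered pair of article ids to its coupling strength.

    `articles` is a dict: article id -> list of cited-reference labels.
    """
    return {
        (a, b): coupling_strength(articles[a], articles[b])
        for a, b in combinations(sorted(articles), 2)
    }


# Hypothetical toy data; the labels are placeholders, not real records.
articles = {
    "article_1": ["ref_A", "ref_B", "ref_C"],
    "article_2": ["ref_A", "ref_B", "ref_D"],
    "article_3": ["ref_E", "ref_F"],
}

pairs = coupling_pairs(articles)
for (a, b), strength in pairs.items():
    print(a, b, strength)
```

In this toy example the first two articles share two references and would be candidates for the same topic, while the third shares none with either. Grouping real articles into topics would additionally require a clustering step and clean citation data, neither of which is shown here.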
Cross-References
e-Tourism: An Informatics Perspective
Robotics in Tourism and Hospitality
Acknowledgments
The authors would like to thank the Ministry of Science and Technology of the Republic of China, Taiwan, for financially supporting this research under Contract No. MOST 106-2410-H-029-070 -. Special thanks to Chaang-Iuan Ho for her helpful comments and assistance on previous drafts of this manuscript.
References
Benckendorff PJ, Sheldon PJ, Fesenmaier DR (2014) Tourism information technology. CABI
Benckendorff PJ, Xiang Z, Sheldon PJ (2019) Tourism information technology. CABI
Boley BB, Magnini VP, Tuten TL (2013) Social media picture posting and souvenir purchasing behavior: some initial findings. Tour Manag 37:27–30
Bronner F, De Hoog R (2011) Vacationers and eWOM: who posts, and why, where, and what? J Travel Res 50(1):15–26
Buhalis D, Law R (2008) Progress in information technology and tourism management: 20 years on and 10 years after the Internet – the state of eTourism research. Tour Manag 29(4):609–623
Buhalis D, Leung D, Law R (2011) eTourism: critical information and communication technologies for tourism destinations. In: Destination marketing and management: theories and applications, pp 205–224
Chan NL, Guillet BD (2011) Investigation of social media marketing: how does the hotel industry in Hong Kong perform in marketing on social media websites? J Travel Tour Mark 28(4):345–368
Chua A, Servillo L, Marcheggiani E, Moere AV (2016) Mapping Cilento: Using geotagged social media data to characterize tourist flows in southern Italy. Tour Manag 57:295–310
DeSanctis G, Fulk J (1999) Shaping organization form: communication, connection, and community. Sage, Thousand Oaks
DeSanctis G, Poole MS (1994) Capturing the complexity in advanced technology use: adaptive structuration theory. Organ Sci 5(2):121–147
Dijkmans C, Kerkhof P, Beukeboom CJ (2015) A stage to engage: social media use and corporate reputation. Tour Manag 47:58–67
Dolnicar S, Laesser C, Matus K (2009) Online versus paper: format effects in tourism surveys. J Travel Res 47(3):295–316
Dutta S, Geiger T, Lanvin B (2015) The networked readiness index 2015: taking the pulse of the ICT revolution. The Global Information Technology Report 2015, 3
Frew AJ (2000) Information and communications technology research in the travel and tourism domain: perspective and direction. J Travel Res 39(2):136–145
Gretzel U, Sigala M, Xiang Z, Koo C (2015) Smart tourism: foundations and developments. Electron Mark 25(3):179–188
Gretzel U, Fuchs M, Baggio R, Hoepken W, Law R, Neidhardt J, Xiang Z (2020) e-Tourism beyond COVID-19: a call for transformative research. Inf Technol Tour 22:187–203
Hsu YL (2012) Facebook as international eMarketing strategy of Taiwan hotels. Int J Hosp Manag 31(3):972–980
IFITT (2019) Outline and Mission, Date of Access. https://ifitt.org/wp-content/uploads/2014/04/ifitt_articles_of_association_2011-01-26.pdf
Kucukusta D, Heung VC, Hui S (2014) Deploying self-service technology in luxury hotel brands: perceptions of business travelers. J Travel Tour Mark 31(1):55–70
Lee W, Gretzel U (2012) Designing persuasive destination websites: a mental imagery processing perspective. Tour Manag 33(5):1270–1280
Li J, Xu L, Tang L, Wang S, Li L (2018) Big data in tourism research: a literature review. Tour Manag 68:301–323
Lyu SO (2016) Travel selfies on social media as objectified self-presentation. Tour Manag 54:185–195
Munar AM, Jacobsen JKS (2014) Motivations for sharing tourism experiences through social media. Tour Manag 43:46–54
Neuhofer B, Buhalis D, Ladkin A (2012) Conceptualising technology enhanced destination experiences. J Destination Mark Manag 1(1–2):36–46. https://doi.org/10.1016/j.jdmm.2012.08.001
Önder I, Koerbitz W, Hubmann-Haidvogel A (2016) Tracing tourists by their digital footprints: the case of Austria. J Travel Res 55(5):566–573
Önder I, Wöber K, Zekan B (2017) Towards a sustainable urban tourism development in Europe: the role of benchmarking and tourism management information systems–a partial model of destination competitiveness. Tour Econ 23(2):243–259
Pan B, Chenguang Wu D, Song H (2012) Forecasting hotel room demand using search engine data. J Hosp Tour Technol 3(3):196–210
Pan B, Smith WW, Litvin SW (2018) Online travel survey response rates and researcher ethnicity. Int J Tour Res 20(6):779–781
Salas-Olmedo MH, Moya-Gómez B, García-Palomares JC, Gutiérrez J (2018) Tourists’ digital footprint in cities: comparing Big Data sources. Tour Manag 66:13–25
Shafique M (2013) Thinking inside the box? Intellectual structure of the knowledge base of innovation research (1988–2008). Strateg Manag J 34(1):62–93
Shoval N, Isaacson M (2007) Tracking tourists in the digital age. Ann Tour Res 34(1):141–159
Sigala M, Christou E, Gretzel U (2012) Social media in travel, tourism and hospitality: theory, practice and cases. Ashgate Publishing, Ltd
Sparks BA, So KKF, Bradley GL (2016) Responding to negative online reviews: the effects of hotel responses on customer inferences of trust and concern. Tour Manag 53:74–85
Toh RS, DeKay CF, Raven P (2011) Travel planning: searching for and booking hotels on the internet. Cornell Hosp Q 52(4):388–398
Tseng Y (2010) Content analysis toolkit for academic research (CATAR). Retrieved April. Retrieved from http://web.ntnu.edu.tw/~samtseng/CATAR/
Tussyadiah IP, Fesenmaier DR (2009) Mediating tourist experiences: access to places via shared videos. Ann Tour Res 36(1):24–40
Tussyadiah IP, Fesenmaier DR (2010) Marketing places through first-person stories – an analysis of Pennsylvania roadtripper blog. J Travel Tour Mark 25(3–4):299–311
Van Doren CS, Heit MJ (1973) Where it’s at: a content analysis and appraisal of the Journal of Leisure Research. J Leis Res 5(1):67–73
Wang C, Zhang P (2012) The evolution of social commerce: the people, management, technology, and information dimensions. Commun Assoc Inf Syst 31(1):5
Wang D, Fesenmaier DR, Werthner H, Wöber K (2010) The journal of information technology & tourism: a content analysis of the past 10 years. Inf Technol Tour 12(1):3–16
Wang Y (2008) Web-based destination marketing systems: assessing the critical factors for management and implementation. Int J Tour Res 10(1):55–70
Wei W, Torres E, Hua N (2016) Improving consumer commitment through the integration of self-service technologies: a transcendent consumer experience perspective. Int J Hosp Manag 59:105–115
Werthner H, Alzua-Sorzabal A, Cantoni L, Dickinger A, Gretzel U, Jannach D, . . . & Zanker M (2015) Future research issues in IT and tourism. Inf Technol Tour 15(1):1–15
Xiang Z, Gretzel U (2010) Role of social media in online travel information search. Tour Manag 31(2):179–188
Xiang Z, Fesenmaier DR, Werthner H (2020) Knowledge creation in information technology and tourism: a critical reflection and an outlook for the future. J Travel Res 60(6):1371–1376
Xiao H, Smith SL (2005) Source knowledge for tourism research. Ann Tour Res 32(1):272–275
Xie HJ, Miao L, Kuo P-J, Lee B-Y (2011) Consumers’ responses to ambivalent online hotel reviews: the role of perceived source credibility and pre-decisional disposition. Int J Hosp Manag 30(1):178–183
Yuan Y, Gretzel U, Fesenmaier DR (2006) The role of information technology use in American convention and visitors bureaus. Tour Manag 27(2):326–341
Yuan Y, Tseng Y-H, Ho C-I (2019) Tourism information technology research trends: 1990–2016. Tour Rev 74(1):5–19
Zhang P, Benjamin RI (2007) Understanding information related fields: a conceptual framework. J Am Soc Inf Sci Technol 58(13):1934–1947
5 A Post-disciplinary Perspective on e-Tourism
Tim Coles, C. Michael Hall, and David Timothy Duval
Contents
Introduction
Disciplines, Knowledge Production, and Enquiry
Post-disciplinary Enquiry and Tourism
e-Tourism as a Post-disciplinary Field of Study
Concluding Remarks: On the Future of Post-disciplinary e-Tourism
Cross-References
References
T. Coles: University of Exeter Business School, Exeter, UK
C. M. Hall: Department of Management, Marketing and Entrepreneurship, University of Canterbury, Christchurch, New Zealand
D. T. Duval: Faculty of Business and Economics, University of Winnipeg, Winnipeg, MB, Canada
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_10
Abstract
Disciplines have traditionally been the building blocks of knowledge production, especially in higher education. In recent times though, more flexible approaches to production of knowledge beyond disciplines, in the context of application, and with the subject or topic as the starting point have become more popular and no less impactful. Such post-disciplinary approaches to tourism studies have been advocated and in evidence for over a decade. Set against this backdrop, this chapter argues that e-tourism is a field of study that has emerged from, and is best understood in its own right as, post-disciplinary enquiry. The differences between inter-, multi-, and post-disciplinary approaches are explained in the chapter, which also examines three ways in which post-disciplinary approaches may be recognized in, and contribute to, the e-tourism body of knowledge. Far from just another esoteric concept, viewing e-tourism in this manner suggests that its emergence is a story of synthesis and eschewing disciplines, that it cannot and will not advance as far and as quickly if more restrictive approaches are taken, and that e-tourism is one of the few fields in tourism studies to turn towards the physical sciences for new knowledge production.
Keywords e-Tourism · Post-disciplinary · Knowledge production · Knowledge exchange
Introduction
For many years, academics have been thinking about the ways in which innovations and inventions occur, from whom, why, and through which settings and conditions. Recurring themes have been the nature and roles of higher education institutions, including universities; those of non-academic partners, not only in the private sector but also in public and voluntary sector organizations; and their interrelationships (Godin and Gingras 2000). An early subtext was the extent to which universities – as recipients of public money – were able to drive innovation and economic development for a perceived public good, either alone or in partnerships with business and enterprise (Etzkowitz 2008). Over three decades interest in the “entrepreneurial university” and the triple helix of university-industry-government has endured (Etzkowitz 2015), exacerbated by pressure on state budgets in a number of advanced economies and questions of return on investment (Vanino et al. 2019). Of course, some disciplines, such as those related to science, technology, engineering, and mathematics (STEM), have historically attracted higher levels of research funding compared to the arts, humanities, and social sciences, and so too have those institutions specializing more heavily in STEM subjects (Huang and Huang 2018).
In a widely cited and relatively early contribution, Gibbons et al. (1994) distinguished between the production of knowledge through the traditional structure of disciplines within universities and higher education (Mode I) and that produced in the context of application (Mode II), that is to say, new knowledge produced within or through thematic foci rather than from disciplinary starting points. They recognized that, in order to address them appropriately and adequately, many subjects were beyond individual disciplines, or they were inherently post-disciplinary in nature. 
Instead, they required more flexible approaches to knowledge production, exchange, and application (Smith 1998; Hellström et al. 2003; Painter 2003; Goodwin 2004). Examples of Mode II include advances in high-level computing that have been driven by the gaming industry rather than by academic
departments dealing with computer science. In a similar vein, some of the major advances in project management (Garel 2013) and supply chain management (Lummus and Vokurka 1999), which are now deeply embedded fields of study in business and management, have been traced to the military and the imperative for expediency and innovation in conflicts.
Despite the significance of Mode II-type influences on innovation and paradigm development, disciplines may have been, and in many cases continue to be, considered the traditional building blocks of knowledge (Coles et al. 2006; Munar et al. 2016). Although the veracity of Mode II has been questioned (Hessels and van Lente 2008), from these and other examples it is clear that useful new high-level knowledge production has frequently taken place outside higher education and beyond the boundaries of traditional disciplines with their particular policing practices, paradigms, and canons of thought (Tribe 2004; Coles et al. 2006). Furthermore, the grander challenges facing humankind across the globe will not be addressed by work from single disciplines alone. In the so-called Fourth Age of Research, international collaborations among (elite) groups of researchers will be required to address the most pressing issues facing society, economy, environment, and culture in the twenty-first century (Adams 2013).
In many respects tourism may be considered as a post-disciplinary subject area in its own right (Coles et al. 2005, 2006, 2009, 2016; Munar et al. 2016). As a locus for, or as a distinctive form of, human activity, tourism is highly complex and comprises a wide range of behaviors by both human and nonhuman actors and agents that is only more fully understood when knowledge(s) produced across the social sciences, arts, humanities, and, increasingly, the physical sciences are brought together (Holden 2005; Coles et al. 2006, 2016; Belhassen and Caton 2011; Fullagar and Wilson 2012; Munar et al. 2016). 
There are limits to tourism enquiry associated with single disciplines which other apparently more flexible (multi- and inter-disciplinary) approaches are also unable to overcome fully, from either a philosophical or practical perspective (Coles et al. 2016; Darbellay and Stock 2012; Darbellay 2016; Munar et al. 2016).
This chapter argues that e-tourism is one field of study that has emerged from, and is best understood in its own right as, a form of post-disciplinary enquiry. Within the study of e-tourism, it is the themes, the topics, and the content that are the primary concern. They continue to drive the development of the research agenda rather than disciplinary imperatives. More flexible forms of enquiry in this field of study are better able to deliver greater, more resonant contributions to knowledge that push back the frontiers of understanding and application. Next, the chapter explains how post-disciplinary approaches differ from single, multi-, and inter-disciplinary approaches. This is followed by a brief examination of the multiple facets of e-tourism, the way in which this field of study may be thought of as post-disciplinary, and how particular topics in e-tourism have emerged and may continue to develop in the future.
T. Coles et al.
Disciplines, Knowledge Production, and Enquiry

The proliferation of tourism research over the past five decades has been accompanied by numerous attempts to make sense of how academic enquiry of the subject has developed. Several scholars have charted the course of tourism research among particular academic disciplines, especially in the social sciences (Holden 2005; Gibson 2008; Hall and Page 2010; Cohen and Cohen 2012; Tang 2014; Müller 2019), but also increasingly in the arts and humanities (Belhassen and Caton 2011; Fullagar and Wilson 2012; Munar et al. 2016; Coles et al. 2016). In a period characterized by the increased use of metrics to judge scholarship, there has been a fetish among some to try to identify the most influential scholars as well as the institutions and spaces these “thought leaders” inhabit (Hall 2005a, 2011; Wickham et al. 2012; Benckendorff and Zehrer 2013). For some, such studies represent entirely arid exercises in taxonomy and attribution unconnected to the more urgent issues of tourism epistemology and ontology. However, they continue to remind us that a great many scholars interested in, and engaged with, tourism and tourism-related research most commonly make sense of the academic world and the scholarly environments they inhabit through broad disciplinary ascriptions that define their intellectual affiliations and professional homes. Many would describe themselves, for instance, as “tourism geographers” or as interested in the “sociology of tourism,” and there are several disciplinary associations and even peer-reviewed journals that continue to promote, support, and advocate the study of tourism rooted in, or inspired by, distinctive disciplinary positions on scholarship, enquiry, and knowledge production. Orientation, in this instance, matters. 
First and foremost, scholars tend to identify themselves by “traditional” disciplines (e.g., economics, geography, psychology, sociology) and then by their subject or thematic interest (i.e., tourism). Administrative conventions and reward systems encourage this form of attribution. Promotions within higher education are routinely made on the basis of an academic’s contribution to, and standing in, a discipline (Butowski 2016). The problem is that, as a subject area, tourism presents research problems that defy adequate (i.e., meaningful) responses from scholars exclusively inhabiting single disciplines (Weiler et al. 2012; Wardle and Buckley 2014). For example, the relationship between tourism and climate change is undoubtedly complex (Scott et al. 2012). Without perspectives on personal travel rooted in geography, sociology, and psychology, our understanding of the contribution of personal travel preferences and mobility patterns in achieving emissions reduction targets would be much poorer, arguably even deficient (Barr et al. 2011; Higham et al. 2013). Conversely, without understanding of engineering (energy systems, emissions) and management studies (organizational behavior, entrepreneurship), supply-side responses and mitigation as crucial components would not be properly understood (Gössling 2011, 2013; Coles et al. 2014). But it is not only in the area of tourism and climate change or tourism and the (natural) environment where this is the case. Numerous other examples abound. As scholarship in tourism marketing (Troung and Hall 2013; Dolnicar and
5 A Post-disciplinary Perspective on e-Tourism
Ring 2014) and tourism policy (Ambrosie 2010) reveals, tourism produces many complex and “wicked” problems where the solutions are to be found only at the interface of disciplines (Brennan 2004). This is precisely the same for e-tourism. For instance, more agile, evidence-based, smart forms of tourism management are made possible by specialists in data science, artificial intelligence, and data analytics and by advances in systems engineering (Boes et al. 2016; Buhalis and Leung 2018). Yet the implementation of smart tourism principles, practices, and technologies for destination management requires multiple stakeholders and benefits from an understanding of the pragmatics of commerce, policy, and politics, especially at the local level (Ivars-Baidal et al. 2019; Cavalheiro et al. 2020; Graziano and Privitera 2020). Recent reviews of “new realities” and “mixed reality” apps have indicated that tourism research on these forms of technology is still at an early stage (Yung and Khoo-Lattimore 2019; Liang and Eliot 2021). Arguably attention so far has mostly fallen on the intrinsic features, attributes, and characteristics of augmented reality (AR) and virtual reality (VR) apps, including issues such as their use, acceptance, adoption, and, ultimately, visitor satisfaction (Liang and Eliot 2021). Even the most apparent future challenges have the same focus: awareness of the technologies; the willingness to substitute virtual for corporeal experiences; usability; and the demands to produce such alternatives (Yung and Khoo-Lattimore 2019). Yet some of the critical questions raised by AR and VR apps and their utilization relate to authenticity (Dueholm and Smed 2014), representation (Bec et al. 2021), and inclusion, which are nontechnical, highly nuanced issues requiring entirely different academic gazes (Nevola et al. 2021), including perspectives from sociology, cultural studies, or even public history. 
Moreover, without a consideration of law and the politics of privacy and pragmatics of data protection (Coles 2022), it is also remarkable that in such data-rich times so few tourism studies make use of the plethora of analytics and empirical data generated by visitors’ use of such apps (Liang and Eliot 2021). While others may have recognized the complexity and wickedness of the subject area and its research problems some time ago (Buhalis and Law 2008), this point is arguably even more pressing now in light of the pace and nature of technological change. Without wishing to reopen the Tribe-Leiper debate (Tribe 1997, 2000; Leiper 2000), it is also further evidence, for those who still doubt it, that there is not a distinctive, unified discipline focused on tourism (Munar et al. 2016, 343–344). As several contributions have argued (cf. Coles et al. 2016; Darbellay 2016), there is a need for precision in the choice of nomenclature and vocabulary regarding disciplines and their arrangements in the production of knowledge (about tourism). Routinely, the terms “inter-disciplinary” and “multi-disciplinary” are used in discussions of research design, knowledge production, and team composition. Quite commonly though, they continue to be (incorrectly) conflated and confused in academic discourse, although there are important differences between the two. This is reminiscent of a similar problem regarding multiple methods and mixed methods research (Johnson et al. 2007). In simple and abbreviated terms, the former uses particular methods in a siloed manner and relies on the analyst(s) to blend
findings from ring-fenced exercises to deliver sensible, complementary conclusions; the latter seeks to use methods, and the data and findings they generate, in a more mutually reinforcing manner, such that the application of, and analysis from, one method may influence the design and execution of subsequent method(s) and rounds of data collection and analysis. “Multi-disciplinary projects” or “multi-disciplinary teams” are precisely that; they are endeavors comprising scholars from multiple disciplines employing their knowledge, expertise, and skills to generate diverse perspectives that add to the body of knowledge dealing with a particular research problem. In a wide-ranging examination of what he calls “shifting concepts,” Darbellay (2016, 365) unpacks the nature of multi-disciplinarity further, arguing that it is “a sequential process, in which researchers from different disciplines work from their perspective[s] on a more or less shared research topic, and in a linear and independent way that does not involve any real interaction between them.” In other words, “a multi-disciplinary approach recognizes and incorporates information derived in other disciplinary areas without scholars stepping beyond their own boundaries” (Coles et al. 2009, 83). In contrast, in “inter-disciplinary” approaches, the sum of collaboration is more than the individual parts, and knowledge production benefits both from greater flexibility and from the blending of insights and perspectives. Drawing on Sayer’s (1999) definitive contribution, Coles et al. 
(2016, 376) note that “inter-disciplinary enquiry is not about permanently abandoning one’s disciplinary home so much as temporary or tactical transgression into a different terrain for the purpose of discovery and insight.” Darbellay (2016, 365) articulates a similar sentiment such that interdisciplinarity requires researchers to “work together based on – and between – their disciplinary perspectives on a shared research topic and in a co-ordinated and interactive fashion.” For example, the topic of travel behaviors and their environmental impacts has continued to generate attention in studies of tourism for over a decade now (Barr et al. 2011; Higham et al. 2013). A closer inspection reveals that this burgeoning body of knowledge is informed by, benefits greatly from, and has contributed to debate about environmental behaviors among citizen consumers and sites of practice more generally (Barr et al. 2011). This particular intellectual terrain is also a largely inter-disciplinary space that has profited from distinctive, highly positioned and contested contributions from psychology (Whitmarsh et al. 2011) and sociology (Shove 2011). Quite different perspectives on the same issue have helped to define the composite body of knowledge and drive trajectories towards further collective understanding.
Post-disciplinary Enquiry and Tourism

While inter-disciplinary work may be more pragmatic and flexible, transgressions should be temporary. Further reflection on the nature of inter-disciplinary enquiry raises the existential question of whether it is possible for a researcher or a team of researchers to be in a permanent inter-disciplinary state or even a long-term, semipermanent situation. Put another way, if the inter-disciplinary coalition is created and perpetuated because the subject or focus continues to resonate, then is it not actually the topic or theme that is the more meaningful focus for organizing academic endeavor, rather than the disciplines from which the team originated? Are the members of the team not, in fact, already working beyond disciplines? Perhaps, in a post-disciplinary state or times, after disciplines cease to be relevant for them? As Munar et al. (2016, 344) put it, “post-disciplinarity brings into question both the belief that all scientific knowledge creation originates in disciplinary compartments and the belief that tourism epistemology has to progress only as an inter- or multidisciplinary endeavour.” In early discussion of post-disciplinarity in studies of tourism, it was argued that ideas, not (disciplinary) conventions and institutions, should drive future enquiry about tourism (Coles et al. 2006, 2009). An approach of this nature would enable greater creativity, agility, and responsiveness to contemporary challenges that are frequently messy, complex, and wicked (Brennan 2004; Law 2004). Without more responsiveness, there was a greater probability that attempts to address contemporary subjects in tourism would be blighted by outdated institutional arrangements and restrictions within higher education that were no longer fit for purpose. In a vicious spiral, forward progress was further jeopardized by the slow pace of change in higher education (Coles et al. 2016). Post-disciplinary approaches would enable more edgy, responsive, and transformative approaches to be taken and extend new knowledge production to endpoints that could not have been previously anticipated by more conservative approaches. Greater ambition, scope, and imagination would be necessary to solve contemporary problems and future challenges. Research problems should be selected for their relevance, not for their conformity to disciplinary or paradigmatic dogma (Coles et al. 
2016, 378). In overcoming the unreasonableness associated with disciplines (Toulmin 2001), hybrid forms of knowledge would be produced (Hellström et al. 2001). As studies of innovation demonstrate, there is great value in disruption, and the time was right to develop “alternative circuits of knowledge that are free, or at least relatively free, from rationalising assumptions of dominant methods and paradigms” but that “may usefully augment the rich heritage of knowledge derived from single, multi- or interdisciplinary sources” (Coles et al. 2006, 295). The intention of post-disciplinary approaches, then, was not (and is not) to replace other modes of enquiry but to augment the knowledge (legitimately) produced by them. Indeed, Jessop and Sum (2001) long ago argued that the need for, and relevance of, post-disciplinary approaches is only revealed through knowledge of, and critical reflection on, the adequacies of pre-disciplinary and disciplinary-based approaches. To put it more explicitly, because of the unfortunate connotations of the prefix “post-” (Darbellay 2016), the goal of post-disciplinary thinking is not the (catastrophic) destruction of disciplines resulting in a form of scholastic anarchy (Coles et al. 2006, 2016; Darbellay 2016). This is an unwise, unrealistic, and – of itself – anti-intellectual goal. Rather, post-disciplinary approaches set out to challenge the established power relations and politics of knowledge production. In a more recent contribution, Hollinshead (2016, 350) points out that post-disciplinary enquiry is not trivial, it should not be trivialized (for instance, as acting “fast
and loose” with academic conventions), and, while it places certain demands on them, it empowers researchers wishing to adopt this approach. As he puts it, post-disciplinarity “requires thinkers/researchers/activities to identify – and work conceptually and operationally within – the extensive range of ways of knowing that hold sway with and across the settings they investigate where these settings are known to be, or suspected of being, pluri-dimensional.” In demarcating the nature of post-disciplinary enquiry more generally, four components were originally identified as desiderata: “[shared] interests; competencies, worldviews; and outlook, or the assumptions of what should be involved in the field, not least conceptually and methodologically” (Coles et al. 2006: 305, based on Hellström et al. 2003; Törnebohm 1983, 1985). Calls for greater post-disciplinary enquiry existed alongside, and were informed by, discussion of other broad trends associated with the nature of knowledge production in, through, and with higher education. In response to the proposition of a Fourth Age of Research (Adams 2013), these included a shift from the traditional practices of “research, publish, read, and use” associated with Mode I knowledge production (Gibbons et al. 2004) towards a greater prevalence of “engage, develop, and share” associated with Mode II (Smith and Adams 2014, 10). In a digital age, “Science 2.0” was characterized by placing ideas on the web, co-production and co-development, open source, user participation, pooling resources, modification, and eventual publication in a formal sense. This was a strong departure from its predecessor, “Science 1.0,” which was characterized by a more conventional approach of research leaders “seeking grants, running teams, publishing and disseminating their outputs” (Smith and Adams 2014, 10). 
Early advocates in international political economy (IPE) argued that there is no single form of post-disciplinary enquiry; rather, they identified three especially promising orientations (Hay and Marsh 1999): it could breathe life into old (research) problems requiring new approaches; new problems requiring old approaches; and new times requiring new approaches. In the case of the third trajectory, contemporary challenges required fresh, innovative approaches to solving (research) problems that could not have been imagined previously under usual disciplinary arrangements and conditions (Coles et al. 2006, 2016). The other two trajectories pointed to the possibilities for revisiting knowledge produced under previous conditions and reinspecting it through new lenses of flexibility. In other words, one contribution of post-disciplinarity can be to drive forward knowledge by also taking a peek in the rearview mirror, as it were. The first (old problems, new approaches) suggests that former or enduring problems, which may have been abandoned as disciplines or inter-disciplinary enquiry progressed, may be resumed or reconsidered. The prospects of fruitful new knowledge production are enhanced because of the possibilities of revisiting subjects with the latest data, techniques, concepts, theorizations, and so on. In tourism studies, the quintessential example of this may be the reopening, in the digital age, of the time-space geographies of the 1970s and 1980s. These were largely abandoned in the analogue era because of their considerable demands on data collection, processing, and analysis that old tech could not satisfactorily overcome (Hall 2005b). The second (new problems,
old approaches) suggests that older modes of enquiry, ways of thinking, and/or established data sets remain analytically and methodologically valid and valuable, and they may be applied to contemporary topics or subjects that in some cases substantially post-date them. Solutions to new problems may be in theory, concepts, methods, or techniques that have been long used but which perhaps were abandoned as disciplines, scholarship, and imperatives shifted (Coles et al. 2016). To this point, the central threads of logic have been greater flexibility, openness, agility, and plurality in knowledge production culminating in post-disciplinary enquiry. Inter-disciplinary approaches are more flexible than multi-disciplinary approaches which, in turn, overcome the rigidities, exclusivities, and exclusions associated with mono-disciplinary approaches. Darbellay (2016) has sought to add further clarification to what he terms the “crisis” facing disciplines (see also Darbellay and Stock 2012). Drawing on Schlanger (1992, 292), who argued that for every discipline the limits were known and accepted, he observes a progressive dedisciplining of tourism studies where there is a gradual “decompartmentalization” based on a continuum of openness, interaction, and integration from disciplinarity at one end of the spectrum to “transdisciplinarity” at the other (Darbellay 2016, 366). Of course, a device of this nature is conceptually useful if analytically impossible to apply in any precise or meaningful sense. Transdisciplinarity is, for him, the condition in which “researchers work to develop a conceptual and methodological framework that transcends disciplinary boundaries with the aim of resolving a concrete problem between science and society” (Darbellay 2016, 365). 
The essence of transdisciplinarity is furthermore characterized by the key words “problem-solving, implementation, relationship between science and society,” which distinguish it from inter-disciplinarity, which is concerned with “interaction, interface, exchange, shared research topic[s], interdependence” (Darbellay 2016, 365). Importantly, this definition of transdisciplinarity does not advocate “a return to some kind of unit of knowledge. . . ” rather it “. . . refers to the ‘highest level of integrated study, that which proposes the unity of intellectual frameworks beyond the disciplinary perspectives and points toward our potential to think in terms of frameworks, concepts, techniques and vocabulary that we have not yet imagined’ (Buckler 2004, 2)” (Darbellay 2016, 365). Post-disciplinarity may be considered as an epistemological and ontological position – a statement on how and why knowledge is produced – as well as a relativity, as a sort of temporality, a moment in time or period after the hegemony of disciplines started to dissipate. In the case of the latter, taking a transdisciplinary approach is also to be acting in a post-disciplinary way. Arguably in a strict sense, so too is engaging in multi- or inter-disciplinary research. Be this as it may, there are obvious similarities between Darbellay’s characterization of transdisciplinarity and representations of “post-disciplinarity” in other contributions on tourism and as articulated above. While Darbellay (2016) correctly cautions against contrived definitions of “post-disciplinarity” that distinguish the term artificially from “transdisciplinarity” (and other labels) either ontologically or epistemologically, for some “post-disciplinarity” may refer to an even greater sense of flexibility, permissiveness, and creativity than his somewhat instrumental, goal-oriented depiction of
transdisciplinarity may suggest (cf. Pernecky et al. 2016; Barry 2016; Bødker 2016). Articulated in this particular way, notions of problem-solving, implementation, and useful knowledge all inherently imply a sense of expectation, conformity, and even quasi-disciplining, as do aspirations to unity and the highest level of integration (i.e., who or what defines these?). This discussion demonstrates that considerable time can be taken discussing the relative merits and subtleties of a number of other connected terms. Concepts like “pluri-disciplinarity,” “anti-disciplinarity” (Ito 2017), and so on all require careful unpacking and close comparison. Nevertheless, they add to the sense that the role and status of disciplines are under ever more scrutiny (Darbellay 2016). The obvious question therefore arises: to what extent does research characterized by the hallmarks of post-disciplinarity drive the production of knowledge in studies of tourism? This is not an especially easy question to answer precisely or definitively, largely for two reasons. First, many scholars working in tourism research do not reflect on their philosophical positions in their publications, nor do they discuss the extent to which their decisions (for instance, about ontology, epistemology, and methodology) affect the nature of the research they conduct and the outputs they produce. They very rarely consider how their disciplinary origins, or their professional homings with the attendant baggage they bring, affect the knowledge they produce about their subjects. These may seem somewhat nebulous, almost irrelevant points for many scholars. 
However, if tourism as a form of human behavior is of interest to scholars in both geography and sociology (as it is also for psychologists and economists), how many scholars are able to articulate clearly the differences in approach that a “geographer” or a “sociologist” might take to the study of tourism (Gibson 2008; Hall and Page 2010; Cohen and Cohen 2012)? In other words, in what ways might geographers or sociologists or social scientists from any other disciplinary home contribute distinctively to interdisciplinary studies of, or projects on, tourism? Second and connected, approaches to knowledge production and research philosophy very rarely feature in the standard indexing material for most publications: that is, the title, keywords, and abstracts. Thematic and subject-specific words more often than not describe publications, as scholars attempt to attract others to their work and establish de facto communities of common interest and practice. Using key words is a crude device that likely under-measures the extent to which post-disciplinary approaches are being or have to be taken in tourism studies (Coles et al. 2016). Instead, to establish any degree of precision or accuracy, a more labor-intensive, qualitative inspection would be required on a project-by-project or publication-by-publication basis. In fact, alternative evidence points to post-disciplinary ways of thinking as gathering some traction. Three international conferences arranged in Switzerland (2013), Denmark (2015), and New Zealand (2018) have explored the potentials and practices of post-disciplinarity in tourism studies (Munar et al. 2016; http://www.postdisciplinary.net/). Prima facie, the programs for these meetings demonstrate the considerable opportunities of post-disciplinary enquiry, in particular in the spaces occupied by scholars identifying with the arts and humanities and the social sciences (Pernecky et al. 2016). Using the threefold categorization proposed for IPE,
Coles et al. (2006, 2016) roughly mapped out the terrains within tourism studies which would benefit (and to some extent have benefitted) from post-disciplinary enquiry in the years ahead. Of necessity, these are broad-ranging and, among the “new times, new approaches,” they identified the increasing generation and use of “big data” (Coles et al. 2016). A term like this deserves unpacking further, not only because it is multifaceted in nature (Miah et al. 2017; Li et al. 2018a; Mariani et al. 2018) but also because it implies greater connection with mathematics, computer science, data science (i.e., analytics), and engineering. This is significant: physical sciences such as these have not previously been naturally or immediately associated with the study of tourism. Notwithstanding, the digital revolution has generated all sorts of new data, dealing with both the demand and supply sides of tourism, that could not be imagined when studies of tourism first started. Large data sets allow greater generalization with higher levels of certainty than in the analogue era, while the emergence of new data sets and types of data sources demands the development of new skills and approaches for data management, processing, and analysis not previously widely prevalent in the tourism academy (Coles et al. 2016, 383). Clearly, there is far more to the study of e-tourism than employing big data; the potential and practice of post-disciplinary approaches in e-tourism are greater than might at first seem to be the case, and it is to this we turn in the next section.
e-Tourism as a Post-disciplinary Field of Study

As the preceding discussion makes clear, the precise and clear application of language and terminology matters. The same is also true for e-tourism. There are probably as many separate definitions of the term as there are researchers working in the field. Within a chapter of this nature, there is neither the space nor the scope to enter into a much fuller discussion of the definitions and scope of e-tourism. For the sake of simplicity – and albeit arbitrarily – in this chapter Buhalis’ (2003: xxiv) early view will coarsely delimit the boundaries of the term and the field. For him, e-tourism is concerned with the “digitization of all the processes and value chains in the tourism, travel, hospitality and catering industries that enable organizations to maximize their efficiency and effectiveness.” A definition of this nature covers a full spectrum of interests from the now arguably established, everyday and banal – such as the study of electronic point of sales (EPOS) data or online booking systems – to some of the more exciting, most current advances in personal wearable technologies, psycho-physiological measures of visitors, artificial intelligence, and automation. Annual calls for papers from the International Federation for Information Technology and Tourism (IFITT) add to the sense that e-tourism covers a wide array of interests (Neihardt 2019), and the current research agenda is not set by particular conceptual, theoretical, or methodological canons. Consulting conference programs, tables of contents of dedicated peer-reviewed journals, and even simple key word searches of standard bibliographical databases (i.e., Scopus and Web of Science) reveals the wide range of authors engaged with e-tourism. These are too numerous
to cite here, but the contributors’ professional homes include, inter alia, computer science, analytics, (applied) linguistics, management studies, and geography, as well as units, departments, or institutes dedicated to the study of tourism, either in a broad or more specific guise. Publication teams regularly comprise scholars spanning institutional and organizational divides and, encouragingly in the Fourth Age of Research (Adams 2013), reach over international boundaries beyond national education systems. The essential aspect of their research, and what unites them in a common endeavor, appears to be the subject – for instance, the solution, the method, the application, the invention, the innovation, and the incremental improvement – and the more flexible approaches that are taken to knowledge production in the context of particular (research) problems, not particular disciplinary-based origin(s) or approach(es) to producing new knowledge. Of course, not all knowledge production in e-tourism has to be – nor automatically should be – considered as post-disciplinary in nature, and some may be produced in other modes. Conversely, just one contribution in e-tourism overtly identifies itself as post-disciplinary in nature. Bødker (2016) attempts to provoke further discussion of the types of fieldwork and representations that would emphasize embodiment in the design of tourism technologies. His work stresses the importance of technology as a lived experience, its affective nature, and the importance of the full range of senses in future consideration of technology in digital tourism. Be this as it may, Table 1 indicates that many of the topics currently engaging the international e-tourism research community are inherently post-disciplinary in the sense that their emergence post-dates the start of the progressive erosion of disciplines (Darbellay 2016). We would contend that this is also largely the case ontologically and epistemologically. 
Although by no means intended to be exhaustive, Table 1 maps many of the most urgent topics identified by IFITT recently against the three orientations for post-disciplinary enquiry proposed in IPE (n.b. a similar but more extensive exercise might have been conducted from chapter titles in this handbook). Very many of these have emerged from the context of application, from “doing,” “managing,” and “practicing.” Typological exercises like this are typically criticized because items may not neatly sit in just one category, classifications are subjective, and allocations are sometimes the result of fine, even debatable judgment calls. They also depend on the precision of language (i.e., in how the labels or items are defined and/or interpreted). This is also the case here. For some, the topic of psycho-physiological measures of visitors might just as easily have been placed in the category “old problems, new approaches” as in that of “new times, new approaches.” In short, this type of research attempts to use the ever widening array of different physiological measures associated with psychological processes (e.g., eye-tracking, electro-dermal activity) to produce enhanced understandings of contemporary visitors and their experiences. In other words, it may be considered as one of the next stages in the longstanding fascination tourism scholars have had with marketing, extending into the digital age (Dolnicar and Ring 2014). Furthermore, at its most basic, psycho-physiological research is concerned with the application of method(s) to particular current research questions, and much of the current research on experience and
5 A Post-disciplinary Perspective on e-Tourism
Table 1 Selected indicative topics for post-disciplinary enquiry in e-tourism. (Sources: authors, adapted from Coles et al. 2006, 2016 and Neihardt 2019)

Orientation: Old problems, new approaches
Potential topics:
    Augmented experiences, AR, VR
    Data mining, analytics, and measurement
    Data standards and data integration
    Digital marketing and social media strategies
    Digital distribution and social selling
    Gaming and gamification
    GPS and geospatial tracking
    Human computer interactions
    Recommender systems and personalization
    Text and concept mining, sentiment analysis
    Travel information, search, and retrieval
    User modeling and decision-making
    Website design and evaluation

Orientation: New problems, old approaches
Potential topics:
    Advanced distribution systems, strategies, and dynamic packaging
    Data protection, privacy, security, ethics, and legality
    Digital divide and socioeconomic development
    Digital nomads
    e-Government and public policy in tourism
    Emotions and personality-based systems
    ICT and the tourism experience
    ICT adoption and value creation
    ICT for innovation and service design
    ICT for regional development and sustainability
    ICT enabled partnerships and collaborations
    Platform economy
    Responsible ICT in Tourism
    Smart destinations/visitor management
    Social networking, social media, and social inspiration
    Social network analysis

Orientation: New times, new approaches
Potential topics:
    Artificial intelligence, machine learning, deep learning
    Big data and large-scale systems
    Blockchains
    e-Learning and MOOCs
    Fairness, transparency, and responsibility in algorithms
    Internet of Things
    Location-based services and context-aware systems
    Mobile services and wearables
    Neuro-tourism
    Psycho-physiological measures
    Robotics and automation
    Semantic web, tourism ontologies, and linked open data
    Travel chatbots
emotion is framed by established concept and theory (Kim and Fesenmaier 2015) which, in some cases, substantially predates the emergence of these forms of technology in the social sciences (from the analogue and Web 1.0 eras). On this basis, psycho-physiological research may be providing a new approach to an
T. Coles et al.
old, enduring challenge in tourism studies, namely, understanding visitors, their experiences, and especially their emotions (Hosany et al. 2015; Li et al. 2015). Nevertheless, as Table 1 intimates, in our view this form of enquiry is actually far better regarded as a new approach for new times. It post-dates the halcyon days of disciplines; it also requires considerable background knowledge of ideas from, inter alia, psychology, physiology, physics, and sometimes medicine, a combination that is quite new and unfamiliar to tourism studies and many tourism scholars. A greater degree of complexity is involved, therefore demanding greater methodological dexterity to combine techniques that have rarely been used together before. Several authors have called for even these newer, more advanced methods to be used in multiple-methods research designs – combining both old and new, analogue and digital, psycho-physiological and self-report – as a more reliable and comprehensive means of understanding visitors (Li et al. 2018b; Stadler et al. 2018). For example, Marchiori et al. (2018) employ analysis of heart rate data and self-reported perceptions to understand the effectiveness of virtual reality experiences for visitors, while Babakhani et al. (2017) connect eye-tracking and electro-dermal data to measure the appeal of carbon-offsetting in online purchasing. Others have pointed to the continued prevalence of such studies within highly controlled laboratory environments rather than the “natural settings” where routine activities, including those of visitors, take place (Kiefer et al. 2016; Baldwin et al. 2020). Within eye-tracking, for instance, there are subtle but important differences in response to stimuli – text, images, and iconography – when viewed in natural settings. It may be relatively straightforward to take the technology into homes, offices, and workspaces or even to simulate them (Baldwin et al. 
2020); taking it outside into the natural environment such as the countryside or coast, where light varies and/or infrared light levels may be high, can create significant challenges (Kiefer et al. 2016; Scott et al. 2017). Added to the complexity and radically different nature of such enquiry, such methods have the potential to shift thinking. Such work is very labor- and resource-intensive with the consequence that sample sizes have been limited to date (Scott et al. 2017); however, few if any authors have yet asked to what extent variance in visitor experience is revealed or accounted for by psycho-physiological measures, how this relates to traditional self-reported psychographic measures, how much the two account for together, and which accounts for more. Furthermore, do traditional segmentations and groups of visitors based on psychographic variables continue to be valid, and do psycho-physiological variables alone or in combination with psychographic measures form a stronger basis for future analysis and managerial interventions? Pursuing many of the topics in the “new times, new approaches” category to their logical termini requires knowledge, skills, and expertise that have not previously been within the scope of tourism studies and may be pushing back the frontiers of the (social sciences) disciplines that contribute to the body of knowledge. For instance, Moyle et al. (2019) suggest that studying the brain and its responses represents the next frontier in tourism emotion research, just as it may also push back the frontiers of destination marketing (Bastiaansen et al. 2020). Building on neuroeconomics, neuro-politics, and neuro-marketing, the potential exists to measure
emotions as “the result of appraisals of perceptions. . . in the cortex of the brain” through “use of EEG [electroencephalography] which records the electrical activity of the brain” (Moyle et al. 2019, 1394). With more complexity, the costs of conducting cutting-edge research increase, and sample sizes currently remain low. While we would agree with Moyle et al. (2019, 1393) that further research of this nature “should focus on the efficacy of utilising self-report measures with cutting-edge psychophysiological techniques,” approaches like EEG that intimately measure the human being raise all sorts of relatively new questions for research ethics, privacy, and data protection which tend to get lost in the excitement about new analytical and methodological possibilities. Beyond the more usual social science moorings, other topics in the “new times, new approaches” category suggest studies of e-tourism have to, and will, increasingly take a turn towards the physical sciences. While it is possible for those with a training in the social sciences or arts and humanities – as traditional foundation stones of tourism studies – to understand the principles of, and ideas behind, many of these topics, subjects like blockchains, wearable tech, or the operation of travel chatbots require the detailed skills and knowledge of those trained in programming, analytics, engineering, and so on to advance in a practical or analytical sense. And the reverse is also going to be the case in so far as new intellectual symbioses are going to be necessary. While physical science may drive invention and innovation in this space, the implementation and appraisal of such technologies take place in particular contexts that demand other specialist knowledge and insight from those better versed in the humanities and social sciences. 
Audiences and visitors of the future are going to expect increasing levels of technological enhancement and augmentation in the delivery of their experiences; witness, for instance, the rush for airlines and airports to produce mobile apps designed to augment and enhance the customer service experience, not to mention to close the distribution gap when providing such services. Far from science fiction, the design of anthropomorphic automation is already part of the discourse over robotic service design (Murphy et al. 2019) but cannot and should not be disconnected from discussion about the future of the labor force, especially where tourism comprises a significant proportion of employment and/or contributes significantly to citizens’ livelihoods (Bowen and Morosan 2018). Work on robotics, automation, and artificial intelligence casts e-tourism as a largely “path free” form of intellectual endeavor; that is, the progress and development of topics in this category are largely independent of precedent because there was little, if anything, that went before it. It may be attractive to consider e-tourism as being at the vanguard of tourism studies because it is dealing with the most current technologies, innovations, ideas, and thinking. Items in the “new problems, old approaches” category temper this view somewhat. Advances in digitization and digital engagement have produced several research problems and challenges distinctive to contemporary times, like the growth of platform economies and their alternative business models, the emergence of digital nomads and their increasingly peripatetic lifestyles, and the proliferation of social media and social networks. Thematically though, these topics represent extensions of, and they are
usefully informed by, earlier contributions. Items in this category benefit by drawing on the “institutional memory” of tourism studies or by referring to antecedents or analogues in other disciplines and their attendant subject areas. In the case of the former, current tourism scholarship may benefit from earlier advances that have otherwise been forgotten. For instance, the emergence of digital nomads effectively represents the next stage in the gradual blurring of home and away and the enmeshing of work and leisure, observations that in part drove the “mobilities turn” in the social sciences and tourism studies (Hall 2005a; Coles et al. 2006; Cook 2020). The platform economy is perhaps the topic de nos jours in so far as some argue that web-based transactions through online booking sites and agencies comprise new business models in the tourism sector. As McKee (2017) notes, operators like Airbnb and Uber exploit the ambiguities between, on the one hand, acting as a private economic actor and, on the other, as provider of technological infrastructure for markets. For some critics, this has resulted in unsustainable outcomes for local destinations and businesses (Gössling and Lane 2015), not least by considerable offshoring of proceeds and profits into locations that some consider tax-efficient, others unfair, immoral, and exploitative. A more powerful critique is that, in retrospect, the body of knowledge on business models in management studies, which is increasingly being deployed in tourism studies (Reinhold et al. 2017), may cast doubt over whether the platform economy is quite as new or radical an idea as its advocates suggest. 
Central to the operation of the platform economy is the connection of consumers and opportunities by online technologies; for platforms, we may read “market places” and the operators of platforms as “agents” or “agencies.” Language of this nature recalls an altogether different era of travel agencies and holiday (apartment/homes/cottage/second homes) letting agencies, which is the basic tenet of Airbnb. Calls for policy-makers and politicians to regulate the effects of offshoring are nothing new: as far back as the 1980s, Stephen Britton observed this was one of the unfortunate consequences of the globalization of tourism, the choices that consumers make, and the effects these have on local communities (Britton 1991). As Mosedale’s (2006) work from over a decade ago demonstrates, the challenge is to be able to map financial flows and value (or commodity) chains precisely. Conversely, most of the items in the “old problems, new approaches” category are related to methods and techniques, and the nature of this form of post-disciplinary e-tourism is actually to progress some of the more enduring issues in tourism studies by applying the latest advances. Augmented reality (AR) and virtual reality (VR) may increasingly be the domain of many, especially younger audiences (Han et al. 2019; Yung and Khoo-Lattimore 2019; Liang and Eliot 2021); however, augmentation per se is nothing new. There is a long history of enhancing (or at least attempting to enhance) visitor experiences, not least through tour guides, guidebooks, in situ interpretation, or even more recently the audio-guide, all of which endure today (Hanna et al. 2019). AR and VR simply represent the next level of technological sophistication (Han et al. 2019), as do other technologies such as QR codes to drive interpretation (Solima and Izzo 2018). 
Moreover, as contemporary experiences demonstrate, there are still challenges of curation, authenticity, data ownership, presentation, and performance that require scholars and practitioners
with backgrounds in the arts and humanities to work in combination with developers and marketers to deliver content to “customers” (Dueholm and Smed 2014; Bec et al. 2021). Digital platforms do however offer greater opportunities to deliver multiple views of history and reinforce recent social trends towards the erosion of grand narratives (Bohlin and Brandt 2014). GPS and new forms of smartphone-enabled tracking are producing new insights into visitor behaviors (Hallo et al. 2012; Raun et al. 2016; Hardy et al. 2017), especially when combined with other forms of psycho-physiological measures (Shoval et al. 2018a,b) or even analogue data and approaches (East et al. 2017). However, the notion of tracking visitor behaviors through time and space has been around since the 1970s; it has been attempted through diaries and self-report (Hall 2005b), and the principal development lies in the technological advances that have enabled this to become more efficient and effective. Similar comments may be made about text analytics (i.e., text mining) or sentiment analysis (Ma et al. 2018). In the case of the former, the search for high-quality data about tourists (and tourism) is well-established, especially in tourism studies of marketing; the principal difference appears to be the ability of high-powered computing and the wider availability of accessible software to increase the scale, scope, and speed of the research. Processing is also a hallmark of the latter: sentiment analysis uses computation to find and categorize the users’ opinions through their texts to determine the authors’ views on a particular subject. Reduced to its most basic elements, it is arguably little more than (textual) content analysis or discourse analysis (Hannam and Knox 2005). Arguably discourse analysis and semiotic analysis as anthropocentric methods may reveal some of the finer nuances that sentiment analysis may not (Qian et al. 2018). 
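To make concrete the claim that sentiment analysis, reduced to its basics, resembles automated content analysis, the following minimal sketch counts words from a small positive and a small negative word list and labels each text accordingly. The word lists, the function names, and the sample reviews are hypothetical and purely illustrative; they are not drawn from any of the studies cited here, and published work typically relies on validated lexicons or trained models rather than a toy list of this kind.

```python
import re
from collections import Counter

# Hypothetical toy lexicons, assumed for illustration only.
POSITIVE = {"great", "lovely", "friendly", "clean", "amazing"}
NEGATIVE = {"dirty", "rude", "noisy", "crowded", "awful"}


def classify_sentiment(text: str) -> str:
    """Label a text by comparing counts of positive and negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarise(reviews):
    """Count how many reviews fall into each sentiment category."""
    return Counter(classify_sentiment(r) for r in reviews)


# Invented example reviews, not real data.
reviews = [
    "Lovely hotel, friendly staff and a great view.",
    "The room was dirty and the street was noisy.",
    "We arrived on Tuesday.",
]

if __name__ == "__main__":
    for review in reviews:
        print(classify_sentiment(review), "|", review)
    print(summarise(reviews))
```

A word count of this kind is fast and scalable, but it cannot register irony, context, or the finer discursive nuances that a human analyst would be expected to pick up.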
Perhaps the contemporary contradistinction is that the direction of recent innovations and initiatives also sets the stage for potentially complex legal issues with respect to data acquisition, data protection, tracking, mining, privacy, and ownership. The online activities of pre-visitors generate corporate interest in terms of shaping the experience. The data produced by visitors in situ equally present significant opportunities for everyone throughout the supply chain in terms of enhancing understandings of movements, purchase habits, dwell times, and so on. With its timeline tracking feature, which is already old technology by today’s standards, Google can arguably put together meaningful pictures of individual and collective mobilities. This raises the issue of what rights tourists might have to travel anonymously in the future. Is a digital footprint of one’s mobility purely one’s own?
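As a rough illustration of how little data is needed to derive measures such as dwell times, the sketch below sums the time a single visitor spends in each zone from a time-ordered list of location fixes. The data structure (pairs of a timestamp in seconds and a zone label), the function name, and the sample track are assumptions made for this example; they do not describe how any particular platform or study processes location data.

```python
from collections import defaultdict


def dwell_times(fixes):
    """Sum seconds spent per zone from time-ordered (timestamp, zone) fixes.

    Only intervals whose start and end fixes lie in the same zone are
    counted; time spent moving between zones is ignored.
    """
    totals = defaultdict(float)
    for (t0, z0), (t1, z1) in zip(fixes, fixes[1:]):
        if z0 == z1:
            totals[z0] += t1 - t0
    return dict(totals)


# Invented track for one visitor: (seconds since start, zone label).
track = [
    (0, "museum"),
    (600, "museum"),
    (1200, "cafe"),
    (1500, "cafe"),
    (1800, "museum"),
]

if __name__ == "__main__":
    for zone, seconds in dwell_times(track).items():
        print(f"{zone}: {seconds / 60:.0f} min")
```

Even this crude calculation yields a pattern of where an individual lingered and for how long, which is precisely why questions of ownership and anonymity arise.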
Concluding Remarks: On the Future of Post-disciplinary e-Tourism
What then of the prospects for future post-disciplinary enquiry in e-tourism? Part of the purpose of producing a table of this nature is to challenge current assumptions and to provoke a wider discussion about the nature of tourism enquiry. As we noted above, there may be disagreements as to which category a particular theme may be
allocated, but it is clear that subjects and topics are driving the e-tourism research agenda rather than established theoretical or conceptual traditions and orthodoxies more reminiscent of disciplines. In fact, whatever one’s critical reflections on the composition of Table 1, it demonstrates the wide range of topics in e-tourism that have emerged, that they have done so as the “crisis of disciplines” has deepened (Darbellay 2016), and that they all benefit from more flexible, responsive approaches where the subject is the starting point for enquiry, not the discipline or, in the case of tourism, the field of studies (Tribe 1997, 2000). A closer inspection of some of the earlier texts on the topic suggests that the emergence of the body of knowledge on e-tourism is itself a story of synthesis and eschewing disciplines in favor of pragmatism and progress (Buhalis 2003). Put another way, no discipline could lay claim to the topics that interest e-tourism scholars. Post-disciplinary studies of e-tourism are here to stay. It is hard to imagine knowledge advancing in the areas set out in Table 1 if more restrictive and restricting modes of knowledge creation are employed. Indeed, it is otherwise difficult to conceive of the sort of transformative pathways for future e-tourism research that Gretzel et al. (2020) identify as a necessary response to the COVID-19 pandemic. In many respects, of and by themselves, the topics contained in Table 1 already represent an exciting basis for an intellectually valuable, future-facing research agenda. Yet, the space and opportunity afforded by the pandemic for scholars to reappraise their roles and the purpose of their work point to the further significance that these topics (and many others) can have as a force for change. 
Science has long faced crisis, not least because, as Saltelli and Funtowicz (2017, 5) observe, its role has shifted “from emancipation and betterment of mankind [sic] to instrument of profit and growth.” Rather than revert to modes of e-tourism enquiry with these hallmarks, a new space has opened for, as Gretzel et al. (2020, 198) put it, “transformative e-tourism research as being critical of these assumptions but also constructive by building the necessary foundations for change.” Specifically, COVID-19 has forced “the need to better understand but also challenge, responsibly manage, and proactively create IT as both a short-term response and a long-term means for the renewal of e-tourism” (Gretzel et al. 2020, 198). More flexible forms of enquiry in this field of study are better able to deliver greater, more resonant knowledge of this nature to push back the frontiers of understanding and application, both during the pandemic and afterwards. Moreover, the evidence assembled here points to a different orientation to post-disciplinary studies of tourism where e-tourism is concerned. While most previous discussion has focused on knowledge and insight on tourism developing among communities, groups, and teams spanning the arts, humanities, and social sciences (Coles et al. 2006, 2016), e-tourism by necessity requires greater engagement with physical science. Among the new problems requiring new approaches, the unit of analysis may drive the nature and type of involvements in integrated knowledge production, with scholars from the physical sciences focused more on issues of technology, hardware, and software and those from other disciplines in the humanities and social sciences providing subject-specific insight into the tourists and tourism, experiences, and meanings involved in the act of consuming what they afford.
Some may push back on the view that e-tourism enquiry is, or will continue to be, post-disciplinary in nature, but there are more robust arguments in the opposite direction. e-tourism is certainly not a discipline in its own right, rather a field of studies with self-supporting institutions. Potential critics may argue that, instead, e-tourism is actually better described as inter-disciplinary in nature. Some may even argue that discussions of this nature are incidental and lack relevance. There are three reasons why such views are short-sighted and misdirected. First, within institutional settings, national higher education systems, and among international bodies, the need for, and value of, more flexible, plural, integrative forms of knowledge production is more widely recognized and rewarded these days (Adams 2013; Smith and Adams 2014). Second, consideration of many of the topics in e-tourism demands a much deeper immersion and sustained engagement by those involved, such that tactical, periodical retreating back into disciplinary homes is unrealistic or counterproductive if it happens. Finally, as Munar et al. (2016, 344) note, the nature and composition of the tourism academy, widely written, is constantly evolving. Moving towards post-disciplinarity would appear to be the next logical step in a process that started with establishing the legitimacy of tourism as a field of study. For Filep et al. (2013, 7), this is manifest in the emergence of “Generation Tourism,” a new cohort of tourism scholars who are “equipped to deal with the complex issue of developing tourism knowledge across a diverse field of study” (Filep et al. 2013, 10). While Generation Tourism may be perceived in some quarters as lacking “the advantages of a discipline-focused education with its strong theoretical and methodological foundations” (Filep et al. 
2013, 1), it comprises precisely the type of scholars who may deploy advanced methodological tools, adopt disaggregated research approaches, employ high-resolution analytics, and avoid stereotypical depictions of destinations and tourists – in other words, the types of scholars who are necessary to delivering renewed intellectual impetus to the study of e-tourism (Shoval and Birenboim 2019).
Cross-References
A Futuristic Look at Tourism in the Era of the Internet Ecosystem
Developments in German e-Tourism: An Industry Perspective
Drivers of e-Tourism
e-Tourism: An Informatics Perspective
References
Adams J (2013) Collaborations: the fourth age of research. Nature 497:557–560
Ambrosie L (2010) Tourism policy research: avenues for the future. Int J Tour Policy 3(1):33–50
Babakhani N, Ritchie BW, Dolnicar S (2017) Improving carbon offsetting appeals in online airplane ticket purchasing: testing new messages, and using new test methods. J Sustain Tour 25(7):955–969
Baldwin J, Haven-Tang C, Gill S, Morgan N, Pritchard A (2020) Using the Perceptual Experience Laboratory (PEL) to simulate tourism environments for hedonic wellbeing. Inf Technol Tour. https://doi.org/10.1007/s40558-020-00179-x
Barr S, Shaw G, Coles T (2011) Times for (un)sustainability? Challenges and opportunities for developing behaviour change policy. A case-study of consumers at home and away. Glob Environ Chang 21:1234–1244
Barry K (2016) Packing as practice: creative knowledges through material interactions. Tour Anal 21(4):403–416
Bastiaansen M, Straatman S, Mitas O, Stekelnberg J, Jansen S (2020) Emotion measurement in tourism destination marketing: a comparative electroencephalographic and behavioral study. J Trav Res. https://doi.org/10.1177/0047287520981149
Bec A, Moyle B, Schaffer V, Timms K (2021) Virtual reality and mixed reality for second chance tourism. Tour Manag 83
Belhassen Y, Caton K (2011) On the need for critical pedagogy in tourism education. Tour Manag 32:1389–1396
Benckendorff P, Zehrer A (2013) A network analysis of tourism research. Ann Tour Res 43:121–149
Bødker M (2016) Getting lost in the field. Tour Anal 21(4):417–430
Boes K, Buhalis D, Inversini A (2016) Smart tourism destinations: ecosystems for tourism destination competitiveness. Int J Tour Cities 2(2):108–124
Bohlin M, Brandt D (2014) Creating tourist experiences by interpreting places using digital guides. J Herit Tour 9(1):1–17
Bowen J, Morosan C (2018) Beware hospitality industry: the robots are coming. Worldwide Hosp Tour Themes 10(6):726–733
Brennan A (2004) Biodiversity and agricultural landscapes: can the wicked policy problems be solved? Pac Conserv Biol 10(2):124–142
Britton S (1991) Tourism, capital and place: towards a critical geography of tourism. Environ Plan D 9(4):451–478
Buckler JA (2004) Towards a new model of general education at Harvard College. In Essays on general education in Harvard College. Retrieved from http://isites.harvard.edu/fs/docs/icb.topic733185.files/Buckler.pdf
Buhalis D (2003) e-Tourism. Information technology for strategic tourism management. FT Prentice Hall, Harlow
Buhalis D, Law R (2008) Progress in information technology and tourism management: 20 years on and 10 years after the Internet – the state of e-tourism research. Tour Manag 29(4):609–623
Buhalis D, Leung R (2018) Smart hospitality – interconnectivity and interoperability towards an ecosystem. Int J Hosp Manag 71:41–50
Butowski L (2016) The issue of disciplinarity and non-disciplinarity of tourism studies. Téoros [Online] 35(1). http://journals.openedition.org/teoros/2899
Cavalheiro M, Joia L, do Canto Cavalheiro G (2020) Towards a smart tourism destination development model: promoting environmental, economic, socio-cultural and political values. Tour Plan Dev 17(3):237–259
Cohen EH, Cohen SA (2012) Current sociological theories and issues in tourism. Ann Tour Res 39(4):2177–2202
Coles TE (2022) Hidden in plain sight? AR apps and the sustainable management of urban heritage tourism. In: Nevola F, Rosenthal D (eds) Hidden cities: urban space, geolocated apps and public history in early modern Europe. Routledge, Abingdon. Forthcoming
Coles TE, Hall CM, Duval DT (2005) Mobilising tourism: a post-disciplinary critique. Tour Recreat Res 30(2):53–63
Coles TE, Hall CM, Duval DT (2006) Tourism and post-disciplinary enquiry. Current Issues Tour 9:293–319
Coles TE, Hall CM, Duval DT (2009) Post-disciplinary tourism. In: Tribe J (ed) Philosophical issues in tourism. Channel View, Clevedon, pp 80–100
Coles TE, Dinan C, Warren N (2014) Energy practices among small- and medium-sized tourism enterprises: a case of misdirected effort? J Clean Prod 111(B):399–408
Coles TE, Hall CM, Duval DT (2016) Tourism and post-disciplinarity: back to the future? Tour Anal 21(4):373–388
Cook D (2020) The freedom trap: digital nomads and the use of disciplining practices to manage work/leisure boundaries. Inf Technol Tour 22:355–390
Darbellay F (2016) From disciplinarity to post-disciplinarity: tourism studies dedisciplined. Tour Anal 21(4):363–372
Darbellay F, Stock M (2012) Tourism as a complex interdisciplinary research object. Ann Tour Res 39(1):441–458
Dolnicar S, Ring A (2014) Tourism marketing research: past, present and future. Ann Tour Res 47:31–47
Dueholm J, Smed K (2014) Heritage authenticities – a case study of authenticity perceptions at a Danish heritage site. J Herit Tour 9(4):285–298
East D, Osborne P, Kemp S, Woodfine T (2017) Combining GPS & survey data improves understanding of visitor behaviour. Tour Manag 61:307–320
Etzkowitz H (2008) The triple helix: university-industry-government innovation in action. Routledge, London
Etzkowitz H (2015) Special introduction: the entrepreneurial university wave. In: Technology financing and commercialization. Palgrave Macmillan, London
Filep S, Hughes M, Mostafanezhad M, Wheeler F (2013) Generation tourism: towards a common identity. Current Issues Tour 18(6):511–523
Fullagar S, Wilson E (2012) Critical pedagogies: a reflexive approach to knowledge creation in tourism and hospitality studies. J Hosp Tour Manag 19(1):1–6
Garel G (2013) A history of project management models: from pre-models to the standard models. Int J Proj Manag 31(5):663–669
Gibbons M, Limoges C, Nowotny H, Schwartzmann C, Scott P, Trow M (1994) The new production of knowledge: the dynamics of science and research in contemporary society. Sage, London
Gibson C (2008) Locating geographies of tourism. Prog Hum Geogr 32:407–422
Godin B, Gingras Y (2000) The place of universities in the system of knowledge production. Res Policy 29:273–278
Goodwin M (2004) Recovering the future: a post-disciplinary perspective on geography and political economy. In: Cloke P, Goodwin M, Crang P (eds) Envisioning human geography. Arnold, London, pp 65–80
Gössling S (2011) Carbon management in tourism. Mitigating the impacts on climate change. Routledge, London
Gössling S (2013) National emissions from tourism: an overlooked policy challenge? Energy Policy 59:433–442
Gössling S, Lane B (2015) Rural tourism and the development of Internet-based accommodation booking platforms: a study in the advantages, dangers and implications of innovation. J Sustain Tour 23(8–9):1386–1403
Graziano T, Privitera D (2020) Cultural heritage, tourist attractiveness and augmented reality: insights from Italy. J Herit Tour 15(6):666–679
Gretzel U, Fuchs M, Baggio R, Hoepken W, Law R, Neidhardt J, Pesonen J, Zanker M, Xiang Z (2020) e-Tourism beyond COVID-19: a call for transformative research. Inf Technol Tour 22:187–203
Hall CM (2005a) Systems of surveillance and control: commentary on ‘An analysis of institutional contributors to three major academic tourism journals: 1992–2001’. Tour Manag 26(5):653–656
Hall CM (2005b) Tourism: rethinking the social science of mobility. Pearson Education, Harlow
Hall CM (2011) Publish and perish? Bibliometric analysis, journal ranking and the assessment of research quality in tourism. Tour Manag 32(1):16–27
Hall CM, Page SJ (2010) Progress in tourism management: from the geography of tourism to geographies of tourism – a review. Tour Manag 30:3–16
Hallo J, Beeco J, Goetcheus C, McGee J, McGehee N, Norman W (2012) GPS as a method for assessing spatial and temporal use distributions of nature-based tourists. J Travel Res 51(5):591–606
Han D-I, tom Dieck M, Jung T (2019) User experience model for augmented reality applications in urban heritage tourism. J Herit Tour 13(1):46–61
Hanna S, Carter P, Potter A, Forbes Bright C, Alderman D, Modlin A, Butler D (2019) Following the story: narrative mapping as a mobile method for tracking and interrogating spatial narratives. J Herit Tour 14(1):49–67
Hannam K, Knox D (2005) Discourse analysis in tourism research: a critical perspective. Tour Recreat Res 30(2):23–30
Hardy A, Hyslop S, Booth K, Robards B, Aryal J, Gretzel U, Eccleston R (2017) Tracking tourists’ travel with smartphone-based GPS technology: a methodological discussion. Inf Technol Tour 17(3):255–74
Hay C, Marsh D (1999) Introduction: towards a new (international) political economy. New Polit Econ 4(1):5–22
Hellström T, Jacob M, Wenneberg S (2003) The ‘discipline’ of post-academic science: reconstructing paradigmatic foundations of a virtual research institute. Sci Public Policy 30(4):251–260
Hessels L, van Lente H (2008) Rethinking new knowledge production: a literature review and a research agenda. Res Policy 37:740–760
Higham J, Cohen S, Peeters P, Gossling S (2013) Psychological and behavioural approaches to understanding and governing sustainable mobility. J Sustain Tour 21(7):949–967
Holden A (2005) Tourism studies and the social sciences. Routledge, London
Hollinshead K (2016) Postdisciplinarity and the rise of intellectual openness: the necessity of ‘plural knowability’ in tourism studies. Tour Anal 21(4):349–362
Hosany S, Prayag G, Deesilatham S, Causevic S, Odeh K (2015) Measuring tourists’ emotional experiences. J Travel Res 54(4):482–495
Huang MH, Huang MJ (2018) An analysis of global research funding from subject field and funding agencies perspectives in the G9 countries. Scientometrics 115:833–847
Ito J (2017) The antidisciplinary approach. Res-Technol Manag 60(6):22–28
Ivars-Baidal J, Celdrán-Bernabeu M, Mazón JN, Perles-Ivars A (2019) Smart destinations and the evolution of ICTs: a new scenario for destination management? Current Issues Tour 22(13):1581–1600
Jessop B, Sum N-L (2001) Pre-disciplinary and post-disciplinary perspectives. New Polit Econ 6(1):89–101
Johnson R, Onwuegbuzie A, Turner L (2007) Toward a definition of mixed methods research. J Mixed Methods Res 1(2):112–133
Kiefer P, Giannopoulos I, Kremer D, Schlieder C, Raubal M (2016) Starting to get bored: an outdoor eye tracking study of tourists exploring a city panorama. In: Proceedings of the symposium on eye tracking research and applications (ETRA 2014). Safety Harbor (Fl), pp 315–318
Kim J, Fesenmaier D (2015) Measuring emotions in real time: implications for tourism experience design. J Travel Res 54(4):419–429
Law J (2004) After method. Mess in social science research. Routledge, London
Leiper N (2000) An emerging discipline. Ann Tour Res 27(3):805–809
Li S, Scott N, Walters G (2015) Current and potential methods for measuring emotion in tourism experiences: a review. Current Issues Tour 18(9):805–827
Li J, Xu L, Tang L, Wang S, Li W (2018a) Big data in tourism research. A literature review. Tour Manag 68:301–323
Li S, Walters G, Packer J, Scott N (2018b) Using skin conductance and facial electromyography to measure emotional response to tourism advertising. Current Issues Tour 21(15):1761–1783
Liang L, Eliot S (2021) A systematic review of augmented reality tourism research: what is now and what is next? Tour Hosp Res 21(1):15–30
Lummus R, Vokurka R (1999) Defining supply chain management: a historical perspective and practical guidelines. Ind Manag Data Syst 99(1):11–17
5 A Post-disciplinary Perspective on e-Tourism
117
Ma E, Cheng M, Hsiao A (2018) Sentiment analysis: a review and agenda for future research in hospitality contexts. Int J Contemp Hosp Manag 30(11):3287–3308 Marchiori E, Niforatos E, Preto L (2018) Analysis of users’ heart rate data and self-reported perceptions to understand effective virtual reality characteristics. Inf Technol Tour 18:133–155 Mariani M, Baggio R, Fuchs M, Höpken W (2018) Business intelligence and big data in hospitality and tourism: a systematic literature review. Int J Contemp Hosp Manag 30(12):3514–3554 McKee D (2017) The platform economy: natural, neutral, consensual and efficient? Transnational Legal Theory 8(4):455–495 Miah S, Vu H, Gammack J, McGrath M (2017) A big data analytics method for tourist behaviour analysis. Inf Manag 54:771–785 Mosedale J (2006) Tourism commodity chains: market entry and its effects on St Lucia. Current Issues Tour 9(4&5):436–458 Moyle B, Moyle C-L, Bec A, Scott N (2019) The next frontier in tourism emotion research. Current Issues Tour. 22(12):1393–1399. https://doi.org/10.1080/13683500.2017.1388770 Müller D (2019) A research agenda for tourism geographies. Cheltenham, Elgar Munar A, Pernecky T, Feighery W (2016) An introduction to tourism postdisicplinarity. Tour Anal 21(4):343–347. https://doi.org/10.3727/108354216X14600320851578 Murphy J, Gretzel U, Pesonen J (2019) Marketing robot services in hospitality and tourism: the role of anthropomorphism. J Travel Tour Mark. https://doi.org/10.1080/10548408.2019.1571983 Neihardt J (2019) ENTER 2020. Third Call for Papers. Online communication, sent to members of [email protected]. 11 June 2019 Nevola F, Coles TE, Mosconi C (2021) A city revealed? Critical insights from the implementation of a heritage tourism app for the World Heritage City of Florence. Exeter: University of Exeter Business School. Unpublished manuscript, available from the authors Painter J (2003) Towards a post-disciplinary political geography. 
Polit Geogr 22:637–639 Pernecky T, Munar A, Wheeller B (2016) Existential postdisciplinarity: personal journeys into tourism, art and freedom. Cognizant Communication Corporation. Tour Anal 21(4):389–401. https://doi.org/10.3727/108354216X14600320851730 Qian J, Wei J, Law R (2018) Review of critical discourse analysis in tourism studies. Int J Tour Res 20:526–537 Raun J, Ahas R, Tiru M (2016) Measuring tourism destinations using mobile tracking data. Tour Manag 57:202–212 Reinhold S, Zach F, Krizaj D (2017) Business models in tourism: a review and research agenda. Tour Rev 74(4):462–482 Saltelli A, Funtowicz S (2017) What is science’s crisis really about? Futures 91:5–11 Sayer A (1999) Long live postdisciplinary studies! Sociology and the curse of disciplinary parochialism/imperialism. Department of Sociology, Lancaster University. Retrieved from: http://www.comp.lancs.ac.uk/sociology/papers/Sayer-Long-Live-Post-disciplinary-Studies.pdf Schlanger J (1992) Fondation, nouveaute, limites, memoire. Communiations 54:289–298 Scott D, Hall CM, Gössling S (2012) Tourism and climate change: impacts, adaptation and mitigation. Routledge, Abingdon Scott N, Zhang R, Le D, Moyle B (2017) A review of eye-tracking research in tourism. Current Issues Tour 22(10):1244–1261 Shoval N, Birenboim A (2019) Customization and augmentation of experience through mobile technologies: a paradigm shift in the analysis of destination competitiveness. Tour Econ 25(5):1–9. https://doi.org/10.1177%2F1354816618806428 Shoval N, Schvimer Y, Tamir M (2018a) Tracking technologies and urban analysis: adding the emotional dimension. Cities 72:34–42 Shoval N, Schvimer Y, Tamir M (2018b) Real-time measurement of tourists’ objective and subjective emotions in time and space. J Travel Res 57(1):3–16 Shove E (2011) On the difference between chalk and cheese – a response to Whitmarsh et al’s Comments on ‘beyond the ABC: climate change policy and theories of social change’. 
Environ Plan A 43:262–264 Smith MJ (1998) Social science in question. Towards a post-disciplinary framework. Sage, London
118
T. Coles et al.
Smith S, Adams J (2014) The fourth age of research’: implications and actions for global universities. Online document. Available from: https://www.britishcouncil.jp/sites/default/files/ pro-he-international_ collaboration_ and_ research_ strength-presentation_ mr_jonathan_adamsfeb17.pdf. Last accessed: 21 June 2019 Solima L, Izzo F (2018) QR codes in cultural heritage tourism: new communications technologies and future prospects in Naples and Warsaw. J Herit Tour 13(2):115–127 Stadler R, Jepson A, Wood E (2018) Electrodermal activity measurement within a qualitative methodology: exploring emtion in leisure experiences. Int J Contemp Hosp Manag 30(11):3363–3385 Tang L (2014) The application of social psychology theories and concepts in hospitality and tourism studies: a review and research agenda. Int J Hosp Manag 36:188–196 Törnebohm H (1983) Studies in knowledge development. Doxa, Uppsala Törnebohm H (1985) What is theory of science? Report 145, Department of Theory of Science and Research, Gothenburg University, Sweden Toulmin S (2001) Return to reason. Harvard University Press, Cambridge Tribe J (1997) The indiscipline of tourism. Ann Tour Res 24(3):638–657 Tribe J (2000) Indisciplined and unsubstantiated. Ann Tour Res 27(3):809–813 Tribe J (2004) Knowing about tourism: Epistemological issues. In J Phillimore, L Goodson (eds) Qualitative research in tourism: Ontologies, epistemologies, and methodologies. Routledge, London, pp 46–62 Troung VD, Hall CM (2013) Social marketing and tourism: what is the evidence? Soc Mark Q 19(2):110–135 Vanino E, Roper S, Becker B (2019) Knowledge to money: assessing the business performance effects of publically-funded R&D grants. Res Policy 48(7):1714–1737 Wardle C, Buckley R (2014) Tourism citations in other disciplines. Ann Tour Res 46:163–184 Weiler B, Moyle B, McLennan C (2012) Disciplines that influence tourism doctoral research. The United States, Canada, Australia and New Zealand. 
Ann Tour Res 39(3):1425–1445 Whitmarsh L, O’Neill S, Lorenzoni I (2011) Climate change or social change? Debate within, amongst, and beyond disciplines. Environ Plan A 43:258–261 Wickham M, Dunn A, Sweeney S (2012) Analysis of the leading tourism journals, 1999–2008. Ann Tour Res 39:1683–1724 Yung R, Khoo-Lattimore C (2019) New realities: a systematic literature review on virtual reality and augmented reality in tourism research. Current Issues Tour 22(17):2056–2081
6 Consumer Behavior in e-Tourism
S. Volo and A. Irimiás
Contents
Introduction
State-of-the-Art Research in Consumer Behavior in the E-Tourism Era
  Pre-trip Phase
  On-site Tourism Experience
  Post-trip Evaluation Stage
Conclusion
Cross-References
References
S. Volo
Faculty of Economics and Management, TOMTE, Free University of Bozen-Bolzano, Bruneck-Bolzano, Italy
A. Irimiás
Faculty of Economics and Management, Free University of Bozen-Bolzano, Bruneck-Brunico, Italy
© Springer Nature Switzerland AG 2022 Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_8

Abstract Tourism scholars have extensively investigated tourists’ behavior; from motivations to actual choices and consumption patterns, the way tourists behave has relevant implications for theory and practice. In e-Tourism, consumer behavior encompasses the wide range of tourists’ behaviors supported by technologies and happens at different stages: prior to undertaking a vacation, during the experience itself, and after it, when tourists are engaged in post-vacation assessments. Research on these aspects is vast, encompassing both the supply and demand side, but it remains scattered. This chapter provides an informed overview of consumer behavior in the e-Tourism era. The core of the chapter focuses on three phases of consumer behavior that have been significantly reshaped by e-Tourism: pre-trip stage, on-site experience, and post-trip evaluation. These three areas are herein analyzed, and, considering both tourists’ and providers’ perspectives, the most relevant changes enabled by the e-Tourism era are presented. The conclusion section discusses the relevance of behavioral changes induced by digitally mediated experiences, outlines advances, and presents future perspectives for tourism and hospitality.
Keywords Consumer behavior · Decision-making · Social media · Tourism experiences · Online travel reviews
Introduction
Scholars define e-Tourism as the wide range of travel- and tourism-related behavior enabled, facilitated, shaped, and enhanced by technologies, digital services, and platforms, together with consumers’ embrace of and reaction to such technologies and digital services (Xiang and Gretzel 2010; Xiang et al. 2015). The emergence of e-Tourism from the digitalization of processes in the value chain (Buhalis and Deimezi 2004) has affected both consumers and producers throughout all stages of production and consumption, with evident changes on both the supply side (e.g., Buhalis and Law 2008; Zhang et al. 2010) and the demand side (e.g., Gretzel et al. 2006; Tussyadiah and Fesenmaier 2009). Tourism consumers have benefited from new information and communication technologies (ICTs) and the use of the Internet for tourism purposes (Garín-Muñoz et al. 2020). Indeed, the last three decades have seen growth in tourists’ interest in, usage of, and engagement with technologies and with all offerings of the e-Tourism realm, leading to a range of novel behavioral practices. The ease of access to and use of new ICTs has contributed to their pervasiveness and success in tourism, travel, and hospitality (Sigala et al. 2012). Undeniably, ICTs are present at every stage of the tourism experience, from expectation formation to post-vacation evaluation. More recently, web-based technologies referred to as “tools of mass collaboration” (Sigala 2012:7) have empowered consumers, “armed” with their smartphones, to create and share information with a multitude of people and businesses, anytime, anywhere. Technological advances and tools are influencing, and to some extent driving, consumer behavior in e-Tourism. They are affecting tourists’ decision-making processes, their experiences, and their engagement with providers and with other actual or potential tourists.
E-Tourism has also enabled tourists to be more knowledgeable about the offerings in the marketplace, more independent in the choice and timing of their decisions, and more involved, with the possibility to engage in and co-create experiences. In recent decades, web-based services, social media, and mobile information systems in tourism have become ubiquitous and have
radically affected customer-provider contacts, offering opportunities for interaction and creating novel communication channels (Volo and D’Acunto 2020). Indeed, e-Tourism allows tourists to connect with providers and with fellow tourists in social media environments before, during, and after the vacation, thus creating faster and more dynamic interactions (provider-to-tourist and tourist-to-tourist) that happen directly via social media channels (Fotis et al. 2011; Sigala et al. 2012; Fesenmaier and Xiang 2017), specifically in online travel communities (Buhalis and Law 2008; Irimiás and Volo 2018; Wang et al. 2002) and on review platforms (Inversini et al. 2009; Filieri and McLeay 2014; Viglia et al. 2016). Social media and digital technologies have facilitated the generation and dissemination of user-generated content and have empowered tourists, creating a more personal interaction with the destination and its tourism providers. In turn, this has affected the different stages of the decision-making process, the experience itself, and the post-vacation evaluation, with effects on all related aspects of consumer behavior. Since the advent of e-Tourism, several scholars have investigated the challenges and opportunities it brings to both providers and users and the shift in the consumer behavior of tourists at various stages of their experience (Gretzel et al. 2006; Xiang et al. 2015; Sigala 2018). This chapter provides an informed review of consumer behavior in the e-Tourism era. In the next sections, the growth and maturity of the topic are outlined, and cutting-edge ideas and recent developments within this field of research are presented. The core of the chapter focuses on three dimensions typically investigated in our field that have been significantly reshaped by e-Tourism, namely, the pre-trip stage, the on-site tourism experience, and the post-vacation evaluation.
Finally, the chapter identifies paths for future research and provides general guidance and directions for scholars and graduate students wanting to engage in future research in this area.
State-of-the-Art Research in Consumer Behavior in the E-Tourism Era
Tourism researchers have investigated consumer behavior from different viewpoints, trying to answer fundamental questions related to the motivations to travel, the expectations of tourists, the factors that influence tourists’ various decisions, the mechanisms of tourism decision-making, and the outcomes of tourism experiences (e.g., Correia et al. 2013, 2015; Dimanche and Havitz 1995; Mansfeld 2012; Pizam et al. 1999; Volo 2021). In particular, in the tourism behavior literature, most authors agree on identifying three to five stages of the tourist’s decision-making process (e.g., Correia and Crouch 2004; Demir et al. 2014; Gretzel et al. 2006). Indeed, the most commonly used approach to investigating the behavior of tourists is based on a temporal/spatial dimension that sees tourists as consumers going through the following temporal stages: pre-consumption, consumption, and post-consumption. Other authors refer to these stages as information search or pre-decision, decision stage, on-site experience (which emphasizes the spatial dimension), and post-vacation evaluation stage. These relevant areas around which to frame consumer
behavior are herein used to discuss recent developments in tourists’ behavior in e-Tourism. In an early conceptualization, Gretzel et al. (2000) forecast that the development of information technology would have an enormous impact on tourism and on consumer behavior; there is therefore a need to trace the evolution of the last two decades of research in this field. This chapter is based on an overview of selected studies with the aim of illustrating the diversity of technological impacts on tourists’ behavior.
Pre-trip Phase
Consumer behavior at the pre-trip phase has received wide scholarly attention, and, in this section, some key dimensions and practices of tourists’ information search and travel planning are discussed. Tourists make decisions based on the information collected and integrated before the act of traveling to a destination (Vogt and Fesenmaier 1998). Early research on ICT highlighted how computer-assisted travel counseling helps to better respond to consumers’ needs and how technologies can help provide tourists with alternatives, customized services, and fast problem-solving (Hruschka and Mazanec 1990; Loban 1997). People seek travel information for leisure, recreation-based purposes or for social, aesthetic, and creative reasons based on their needs. However, research points out that not everyone who searches for travel information will actually become a tourist (Vogt and Fesenmaier 1998). The relevance of high-quality digital services in meeting consumers’ needs was already recognized in earlier works (e.g., Connell and Reynolds 1999). Gretzel et al. (2004) explored tourists’ preferences in destination recommendation systems and built on travel personality categories to provide improved e-Tourism services. Jun et al. (2007) investigated tourists’ online and offline behavior, finding that tourists’ prior travel experiences and knowledge are rather influential in the decision-making process. The authors argued that vacation planning theory, compared to decision-making theory, better describes the multiple and interrelated reasons behind the act of booking a holiday. This is because decision-making theory is goal-centered, while vacation planning theory is process-centered. In the pre-trip phase, decision-making is a complex and multifaceted process, and this peculiarity needs to be considered by tourism marketers and service providers to understand consumer behavior in e-Tourism.
Over the last decades, e-Tourism confirmed its role, on the supply side, in fostering the development of a culture of constant innovation and, on the demand side, in offering tourists the possibility of personalized services and greater control over the process. Communication and interaction between tourists and destinations occurring on different online and social media platforms are considered relevant in motivating tourists to travel (Buhalis and Law 2008; Gretzel and Yoo 2008; Kim and Fesenmaier 2017) and affect the vacation planning process (Gretzel et al. 2006; Sigala et al. 2012). Gursoy and McCleary (2004) elaborated a framework to integrate the psychological/motivational, economic, and processing approaches to explore consumers’ information search behavior. Familiarity with tourism products
and expertise in tourism influence tourists’ use of external and internal information sources. The stages of the online vacation planning process were investigated by Pan and Fesenmaier (2006); the authors developed a conceptual model of online search highlighting that tourists’ cognitive states are constantly changing during the planning process. This change is led by consumers’ knowledge and perceptions about a destination but also by the visuals and language used by tourism marketers. The authors conclude by stating the relevance of design, languages, and subjective and experiential online narratives. The nature, characteristics, and models of online travel information search are also discussed by Xiang and Fesenmaier (2020). The process of tourists’ online information search has been significantly transformed by social media and the ubiquitous use of smartphones. The authors argue that “travel now takes place in technological bubbles” (Xiang and Fesenmaier 2020:10) in which the constant interaction between tourists, the tourism domain, and the home environment allows tourists to embark on unplanned travels. Unplanned behavior is facilitated by the search engines embedded in social media networks that provide access to an immense amount of data about destinations, tourism services, and the tourists themselves. Indeed, the advent of social media platforms has enabled wide-scale participatory content generation and sharing. Social media platforms have played an increasing role in consumers’ decision-making processes and in travel planning (Gretzel and Yoo 2008; Sigala et al. 2012). Travel-related social media content mediates and shapes tourists’ perceptions and influences tourists’ interactions (Gretzel 2010). Tussyadiah and Fesenmaier (2009) applied netnography to examine YouTube videos featuring touristic aspects of New York City and how such videos influenced consumer behavior.
By evaluating the role of online communication media, the authors showed that short videos stimulated fantasies in the pre-trip stage. The authors argue that watching short tourism-related videos benefits tourists in the planning process since videos serve as means of transportation to the holiday realm. Emotionally engaging stories consumed in the pre-trip stage encourage individuals to travel and to shape their experiences (Moscardo 2010; Moin et al. 2020). With respect to the actual planning stage, Xiang et al. (2015) found that online travel agencies (OTAs) still dominate the travel planning landscape, but there are significant generational differences between older and younger tourists. Younger segments of tourists prefer using social media, and the use of smartphones, social media, and location-based apps, such as Google Maps, allows tourists to postpone decisions that used to be taken in the pre-trip stage. In conclusion, the factors influencing travel information search, planning, and decision-making behavior are various: tourists’ personalities, age, and prior knowledge about the destination; the platforms and search engines used for information gathering; and personally meaningful narratives about the destinations and tourism offerings (Xiang and Fesenmaier 2020). Höpken et al. (2021) argue that tourists’ online search behavior is strongly connected to subsequent holiday booking. Online search queries (e.g., web search traffic, e-reviews, interaction on social media) reflect tourists’ preferences and needs. More recently the literature has focused on how tourism expectations are shaped and influenced by different sources: DMOs’ communication (Moin et al. 2020),
electronic word of mouth (Pourfakhimi et al. 2020), and social media content (Gretzel 2021), among others. Tourists find inspiration in opinion leaders who have the power to affect the decisions of others (Xiang and Fesenmaier 2020). Opinion leaders in social media are generally referred to as “influencers” (Gretzel 2018). Travel influencers’ judgments, tastes, and preferences are considered reliable and valuable by their audiences. Influencers’ travel content has also been leveraged by tourism marketers and planners since consumers consider influencers’ recommendations trendy and trustworthy (Gretzel 2018). Photos shared on Instagram were found to create multidimensional experience values for content creators and audiences alike (Conti and Lexhagen 2020; Volo and Irimiás 2020). Sharing impressive content on social media can turn out to be the sole travel motivation, as highlighted by Woods and Shee (2021). The authors argued that the wish to gain visibility on Instagram was one of the reasons Singaporean volunteer tourists engaged in humanitarian practices. Travel visuals on Pinterest were explored by Gretzel (2021) through a qualitative analysis of Pinterest travel boards to see how users, predominantly female, capture and craft their travel dreams. Findings show that Pinterest visuals are not only inspirational but also provide material for future planning. Early conceptualizations of online booking focused on how the Internet made it possible to purchase travel products online instead of relying on travel agencies, which facilitated the process and saved time for tourists (Morrison et al. 2001). In parallel, several constraints that limited consumers’ online travel purchasing in the past have been overcome (Buhalis and Law 2008). Issues such as lack of experience, lack of trust, and security concerns were resolved. In fact, digital technologies have also changed tourists’ booking behavior by introducing mobile travel booking services (Gretzel et al. 2006).
In addition, the ubiquity of computing, the role of smartphones, and mobile applications have accelerated innovations in hotels’ digital services (Buhalis and Sinarta 2019). Immediacy, convenience, and money savings are among the reasons behind tourists’ transformed online browsing and booking behavior and changed value perceptions (Lee 2020). Tourists have gained so much confidence in purchasing travel products online that they are no longer merely passive consumers but are able to shape hotels’ pricing strategies as well (Masiero et al. 2020). These authors, employing a discrete choice experiment, showed that tourists rate the utility of free cancellation in hotels and accommodation services higher not only because they expect a price drop but also because they trust the availability of automatic rebooking services. A synopsis of the studies herein discussed is presented in Table 1.
Table 1 Overview of selected studies focusing on the pre-trip stage dimensions
Dimensions | Focus | Insights | Author(s)/year
Expectation formation | Travel videos | Role of YouTube travel videos in stimulating expectations about tourism experiences | Tussyadiah and Fesenmaier (2009)
Expectation formation | Images | Pinterest travel visuals shared and consumed enhance tourists’ wish to travel | Gretzel (2021)
Expectation formation | Images | Instagram visuals offer multidimensional values of tourism experiences and influence decision-making | Conti and Lexhagen (2020)
Expectation formation | Images | Instagram is a visual media platform of tourism experiences where tourists window-shop for destinations | Volo and Irimiás (2020)
Expectation formation | Influencers | Tourists consider opinion leaders on social media as trusted, authentic, and fun individuals, making influencer marketing key for destinations | Gretzel (2018)
Motivation | Visibility on social media | Representation of the self on Instagram is the primary motivation to participate as a volunteer tourist in humanitarian projects | Woods and Shee (2021)
Information search | Tourists’ needs | Functional, hedonic, innovation, aesthetic, and sign are the information needs influencing tourists’ decision-making processes online | Vogt and Fesenmaier (1998)
Information search | Tourists’ needs | Changing information needs of American travelers | Choe et al. (2017)
Information search | Online behavior | Tourists’ pre-purchase information-seeking behavior is influenced by internal and external costs and by tourists’ familiarity and expertise with a destination | Gursoy and McCleary (2004)
Planning | Browsing behavior | Online vacation planning is an information-intensive task, and tourists use different semantic mental models to navigate. Online tourist behavior can be deconstructed into episodes and chapters related to specific issues | Pan and Fesenmaier (2006)
Planning | Channels used | Different channels are used for different purposes by different age cohorts. Gen Y use social media and smartphones, while older generations use traditional online channels | Xiang et al. (2015)
Purchasing | Online/offline behavior | Relationship between travel information search and purchase behavior is varied: information search occurs online, while travel product purchase occurs offline | Jun et al. (2007)
Purchasing | Predictive demand | Tourists using Google for planning seriously intend to visit the destination, as shown by tourists’ web search traffic and actual visits | Höpken et al. (2021)
Purchasing | Strategic consumer behavior | Free cancellation rates and the possibility to rebook an accommodation influence e-tourism consumer behavior | Masiero et al. (2020)
Purchasing | Online search, booking conversion | Online browsing behavior (goal-oriented/experience-oriented consumers) is predictive for airline booking | Lee (2020)

On-site Tourism Experience
The changing nature of on-site tourism experiences has also received wide academic interest (Lamsfus et al. 2015; Volo 2021; Wei et al. 2019). Here the focus is on understanding tourist behavior patterns shaped by technologies and digital services. Tourism behavior at destinations is now increasingly influenced, shaped, and supported by numerous ICT devices, tools, and apps across different platforms, times, and spaces. Indeed, on-site travel experiences are more and more digitally mediated by smartphones and wearable devices. Tourists, according to their attitudes and personality, use on-site technologies for different reasons: solving problems effectively, enhancing the interpretation of attractions, feeling safe and in control, avoiding boredom while waiting, and keeping social contacts. In addition, tourists use online services for price convenience, immediacy of feedback, and reducing
waiting time, among others. The excessive use of technology, however, can also be disruptive for the tourism experience; thus the literature recommends a balance in its use (Fan et al. 2019). Mobile technologies greatly influence travelers’ behavioral patterns and provide on-the-go tourists with new possibilities to support their needs (Lamsfus et al. 2015). Consumer preferences and needs are various and range from avoiding getting lost in a city to living once-in-a-lifetime experiences. Cruise passengers’ behavior seems to be highly technology-driven (Paananen and Minoia 2019). For example, tourism-related apps, location-based social networks, and Google Maps prevent tech-savvy cruise passengers, once onshore, from losing time or getting lost in the city. Evolving digital technologies support tailor-made tourism services, and tourists using technology-enabled services often recall enhanced and meaningful experiences. Consumers visiting national parks and gardens reported deeper experiences and detailed knowledge about the attractions when using digital services such as podcast tours (Kang and Gretzel 2012) or iPhone apps (Mann 2012). In art galleries, the use of wearable augmented reality devices such as Google Glass was found to facilitate, to some extent, the learning experience, although it must be noted that users of innovative technologies often pay attention to the device itself and might ignore the surrounding environment (tom Dieck et al. 2018). Researchers also found that virtual reality technology and interactive engagement can strengthen positive consumer experiences in theme parks, where consumer entertainment is the core business (Wei et al. 2019). Technology can also drive tourists to otherwise less popular destinations, can enhance the experience at mass tourism attractions, and could be used to reduce overtourism (Shoval 2018). Gamification such as geocaching, for example, can function as a visitor attraction in rural destinations (Skinner et al. 2020). Furthermore, tourists’ digital footprint provides behavioral data allowing technology-enabled services to improve their accuracy and to provide customized experiences (Sigala 2018; Volo 2018). Consumer behavior in urban tourism has also been influenced by e-Tourism services. In most urban destinations, the tourist activity centers are modified by the use of location-based social networks. Tourists use different social network platforms on the go, especially when sightseeing, searching for a place to eat, shopping, or seeking entertainment (Martí et al. 2021). It is important to note that, building on such data, human-centered rather than technology-centered tourism services could create a competitive advantage (Stankov and Gretzel 2020). Digitalization and ICTs allow tourists to adopt a more flexible decision-making process which does not necessarily follow a traditional hierarchical structure (pre-trip planning, on-site consuming, post-trip recommendation) but is rather dynamic and fluid. In fact, tourists relying on ICTs, network technologies, and smartphones are able to construct, shape, and enrich their travel experiences on-site. Mobile travel guides and smartphone applications offer tourists the possibility to gather information while en route. Zach and Gretzel (2012) studied tourist-activated networks of trips through Northern Indiana, USA. Findings showed that short- and long-trip tourists alike were flexible and open to suggestions to visit
spots that were not previously planned. Mobile technology had a decisive role in on-site decision-making, and the network structure derived from such data provides a practical basis for dynamically bundling products and crafting the tourism experience on-site.
Mobile technology and free wireless Internet enable tourists to access and share content while on the move and influence actual behavior (Magasic and Gretzel 2020). Location-based social network applications (Foursquare, Gowalla, etc.) on mobile phones influence tourist behavior in several ways. The use of these applications influences tourists’ spatiotemporal mobility and shapes their overall patterns of consumption experiences (Tussyadiah 2012). Further, the role of on-site push recommendations generated by smartphones, along with the agency of mobile technology in influencing tourists’ tendency to follow such recommendations, is relevant to understanding consumer behavior. Tussyadiah and Wang (2016) employed projective techniques to show that tourists consider smartphones as travel companions, guides that assist them in making their experiences enjoyable and safe. In some cases, rejection of use was due to participants’ fear that being too reliant on smartphones would deter them from having a meaningful tourism experience. Extant literature also predicts that, in the future, the development of mobile technology to support tourists on the move will be further enhanced (Buhalis and Sinarta 2019).
Mobile technology has significantly shaped consumer behavior in accommodation services as well. Issues such as tourists’ intentions to adopt self-service technology in hotels, the use of hotel-branded applications, and artificial intelligence technology services in hospitality have been investigated (Law et al. 2020). For most tourists, it is common practice to use mobile hospitality services for hotel reservations, check-in/out, and online communication (Reinhold and Dolnicar 2017; Rita et al. 2018). Research on smartness in hospitality and the diffusion of smart applications in hotels – smart TVs, remote control of lighting, speech recognition, and digital signage – explored the ways hotels can provide their guests with technology-enhanced accommodation services (Stylos et al. 2021). The positive role of ICT in enriching tourists’ experiences has already been recognized in the hospitality literature (Sigala 2005; Stylos et al. 2021); however, some negative aspects should also be mentioned. Smartphones can be disruptive in the tourism experience since apps and social media strongly hold tourists’ attention, potentially depriving them of the possibility to immerse themselves in the experience. Tourists’ technology overload, techno-stress, or unfamiliarity with technological devices/apps offered by hospitality services can co-destruct tourists’ experiences. To overcome tourists’ attention constraints, Stankov et al. (2019) introduced the concept of calm ICT design based on ambience awareness and explored its applicability in hotels. Tourists can be guided toward customized services without being forced to pay attention to apps.
With respect to the on-site experience, researchers have explored the opportunities technologies offer to track tourists and understand their itineraries (e.g., Shoval and Isaacson 2007; Shoval 2018). Shoval et al. (2018) used, with the consent of participants, tourists’ smartphones and tracked their movements by GPS
to have real-time information. Kim and Fesenmaier (2015) collected measurements of emotions with tracking devices. Indeed, novel technologies and smart devices that measure consumers’ physical activity, heart rate, and skin temperature can provide direct observational data on tourists’ actual behavior and physical state (e.g., Scott et al. 2019; Volo 2020). Social media data such as geo-located Instagram visuals have also been used in academic studies to investigate patterns in tourists’ movements and behavior (e.g., Ma et al. 2020). Big data generated by tourists who share their tourism experiences on social media can help measure tourism flows in large-scale tourism regions (e.g., the Danube Region in Europe in Kádár and Gede 2021). Tracing tourists’ behavior at the destination can also provide relevant information that can be used by recommender systems (e.g., Ricci 2020) to improve the on-site experience.
Finally, e-Tourism allows for being constantly and synchronously connected with social contacts (Magasic and Gretzel 2020), which is an extremely relevant characteristic during the on-site tourism experience, allowing tourists to extend the boundaries of their destination to include friends and relatives and even to connect with strangers. Modern sociality is mobile and digitally connected as the patterns of online connection and disconnection between tourists and homestayers evolve. Such modern sociality includes virtual mooring, following someone in the online realm, or staying (dis)connected (Germann Molz and Paris 2015). Synchronous connectivity has shaped the ways tourists represent destinations, attractions, hospitality services, themselves, and other tourists while on the move. The constant on-the-go sharing that tourists engage in is traceable throughout social media and has been widely explored with text analytics (Liang et al. 2019; Ye et al. 2009). Synchronous experience sharing has also evolved on a more visual level (Conti and Lexhagen 2020), and Instagram provides particularly rich visual content to explore with innovative and creative methods (Volo and Irimiás 2020). The tourist gaze is more and more self-directed, and the practice of selfie-taking has influenced consumer behavior (Dinhopl and Gretzel 2016). Traditional tourism landscapes and attractions seem to be less important than the tourist-self. The immediacy of sharing also allows those who stayed home to be included in the tourism experience (Woods and Shee 2021).
Certainly, new communication technologies revolutionized the intersubjective construction of space and location, the meanings of “home” and being “away” in relation to the concepts of “copresence” and “virtual presence” (White and White 2007). Pearce and Gretzel (2012) coined the term “digital elasticity,” which refers to tourists’ tendency to stay digitally connected to their home while traveling. Digital elasticity has some downsides as well. Tourists have to handle emotionally disruptive interactions and unpleasant reminders of work duties. The idea of escape as a motivation for travel is challenged by the fact that tourists maintain symbolic proximity with home through digital technology. Backpackers, for example, who stay in continuous touch with friends, family, and other travelers while on the move because of their extensive access to technology were described as “flashpacking” tourists (Germann Molz and Paris 2015) (Table 2).
Table 2 Overview of selected studies focusing on the on-site stage dimensions
| Dimensions | Focus | Insights | Author(s)/year |
| Mobile technology use | On-the-go decision-making | Tourists use digital technologies for navigation, short-term decision-making, and on-site purchases; implications for developing new offerings to support on-the-go tourist needs | Lamsfus et al. (2015) |
| | Urban tourist activities and location-based social networks | Tourists using apps such as Instagram, Foursquare, Google Places, Twitter, and Airbnb redesign urban tourist activity centers related to sightseeing, shopping, eating, and nightlife | Martí et al. (2021) |
| | Rural tourism and gamification | Gamification and geocaching can enhance tourism experiences in rural settings | Skinner et al. (2020) |
| | Cruise tourism and apps | Tourists rely on map-based applications and wayfinding tools under time constraints while on shore excursions | Paananen and Minoia (2019) |
| | National parks, botanic gardens | Tourists using podcasts had more meaningful experiences | Kang and Gretzel (2012) |
| | | Visitors using an iPhone app with a customizable map in Kew Gardens reported deepened and expanded knowledge, thus an enhanced consumer experience | Mann (2012) |
| | Themed parks and virtual reality | Tourists using virtual reality applications felt in control, present, and immersed in a virtual environment, which enhanced their experience | Wei et al. (2019) |
(continued)
Post-trip Evaluation Stage
E-Tourism has also reshaped the way tourists manage their post-vacation experience; in this section, the most relevant findings on post-trip behavior are discussed. Most of the literature focuses on the role of post-evaluation assessment at two
Table 2 (continued)
| Dimensions | Focus | Insights | Author(s)/year |
| Behavioral triggers | Seeking recommendations | Tourists relying on online travel recommendations on-the-move are prompted to visit lesser-known places in a destination | Zach and Gretzel (2012) |
| | | Tourists eager to be rewarded in competition-based and connection-based on-site activities modify their itineraries and visit new venues, attractions, and businesses | Tussyadiah (2012) |
| | Accepting push recommendations | Smartphones are perceived as trusted travel companions/guides; tourists readily accept push recommendations while on-the-go | Tussyadiah and Wang (2016) |
| Tracking and itineraries | Tracking tourist mobility | Objective socio-temporal data on tourist spatial movements and emotional reactions in a destination (with tourists’ consent) | Shoval et al. (2018) |
| | | Experiments on tourist mobility using land-based tracking, GPS, and hybrid systems | Shoval and Isaacson (2007) |
| | | Review on eye-tracking research | Scott et al. (2019) |
| | | Geotagging and other metadata shared on social media describe tourists’ on-site movements and behavior patterns (with data scraping) | Ma et al. (2020) |
| Synchronous connectivity | Fluidity of home/holiday concepts | Smartphones and new communication services blurred the division between home and being away | White and White (2007) |
| | Selfie-taking | Tourists’ relation to smartphones, social media, and virtual audiences influences tourist experience | Dinhopl and Gretzel (2016) |
| | Backpackers | Backpacker tourists often use their smartphones to maintain social contacts. Dichotomy between connection/disconnection is explored | Germann Molz and Paris (2015) |
different levels of sharing the experience, namely, sharing the assessment of the vacation with others in a way that can be useful from a practical or business perspective, and sharing as a way to enhance the affective and personal vacation sphere or to preserve the emotional experience. Sharing consumer experiences – with tourists, friends and relatives, and tourism providers – on different social media platforms has become an integral part of tourism. Tourism experiences are expected to be extraordinary in order to be shared through personal and public networks, so that tourists can construct their social image and identity (Sigala 2018). Tourists’ electronic word-of-mouth has had a tremendous impact on the behavior of those who read reviews to inform their travel planning (Filieri and McLeay 2014). As a consequence, trust and trustworthiness of online travel reviews are issues strongly influencing consumer behavior (Gretzel et al. 2007; Gregori et al. 2014; Filieri 2016). Filieri (2016) showed that tourists assess reviews’ trustworthiness and untrustworthiness based on the review content, length, style, and integrity. The recent advances in textual data analytics have supported researchers in exploring several aspects of the online narratives, such as the differences between institutional and user-generated communication of tourism offerings (Irimiás and Volo 2018).
From the e-evaluation perspective, tourism online review platforms provide consumers and businesses a dynamic way to engage, interact, and contribute to value creation (Sigala 2018). Yoo and Gretzel (2008) shed light on the reasons which drive consumers to write online travel reviews to be shared with a wide audience. Reasons for engaging in traditional word-of-mouth (WOM) related to tourism experiences have been widely studied (Dichter 1966; Murphy et al. 2007), and such motivations overlap in part with those of online travel review writing. Positive reviews are motivated by enjoyment in expressing positive feelings, altruism, helping service providers, and concerns for other consumers. Negative reviews are motivated by the intent to warn other consumers, venting negative feelings, vengeance, and exertion of collective power over companies (Gonçalves et al. 2018; Yoo and Gretzel 2008). Filieri’s (2016) study also revealed that the type of digital platform – commercial or independent – influenced whether consumers trusted fellow reviewers’ comments or cross-checked the reviews and scores from different websites. The managerial implications of the study suggest that tourism businesses develop and invest in reputation defense mechanisms to identify untrustworthy reviews and to provide an accurate response to each review published on digital platforms.
Looking into the post-trip evaluation stage, the affective and personal aspects of sharing are also significantly affected by e-Tourism. Social media platforms provide tourists with the tools and space to narrate and co-create post-trip experiences. The act of sharing tourism experiences on online travel platforms does influence the evaluation of the holiday, as Kim and Fesenmaier (2017) showed. The authors found that even when tourists had bad travel experiences, sharing their experience on social media improved post-trip evaluation. Wu and Pearce (2016) explored why Chinese tourists who travel independently craft informed and aesthetically designed travel blogs to share their personal travel experiences. Conti and Lexhagen (2020) investigated the experiential values linked to sharing travel photos within tourists’
social networks and how the meaning of the tourism experience is shaped through the visuals. Yu et al. (2020) assessed the moderation effect of travel experience sharing on holiday motivation fulfillment and tourists’ life satisfaction. It was shown that consumer behavior such as uploading tourism visuals while still traveling and post-trip experience sharing magnified the effects of motivation fulfillment and influenced not only tourists’ momentary positive mood but also their subjective well-being. Tourism marketers are advised to stimulate tourists to capture and share tourism experiences on-site and to “develop special programs and events that can facilitate experience sharing after the trip” (p. 11). With reference to these aspects, Li et al. (2021) assessed how the act of sharing consumer experiences online affects tourists’ evaluation of their tourism experiences. It is not surprising that writing about positive tourism experiences increases tourists’ positive mood. The authors argue that not only positive experiences but also the acts of sharing a story about that experience and interacting with the audience on social media enhance positive post-trip evaluation (Table 3).
Table 3 Overview of selected studies focusing on the post-trip evaluation dimensions
| Dimensions | Focus | Insights | Author(s)/year |
| Social media and post-trip evaluations (e-WOM, OTRs, SM) | Factors of influence | Product ranking, information trustworthiness, relevance, and timeliness influence tourists’ information adoption | Filieri and McLeay (2014) |
| | Motivations to write online reviews | Tripadvisor users write reviews to help travel businesses, fellow travelers, for self-enhancement or venting negative feelings | Yoo and Gretzel (2008) |
| | Co-creation | Social media audience feedback as an important factor in co-creating tourism experiences | Li et al. (2021) |
| Affective and personal aspects of sharing | Social media and post-trip evaluation | Sharing either positive or negative tourism experiences enhances positive evaluations of the same | Kim and Fesenmaier (2017) |
| | Social values | Tourists using Instagram to share nature-based experiences report enhanced affective and experience values | Conti and Lexhagen (2020) |
| | Enjoyment of blogging | Self-enhancement through sharing personally significant travel stories online motivates Chinese independent travelers to write “little Lonely Planets” | Wu and Pearce (2016) |
| | Sharing and well-being | Amplified effects of online sharing of travel experiences on subjective well-being | Yu et al. (2020) |
Conclusion
Consumer behavior has radically changed in the last two decades. Indeed, with e-Tourism opportunities, tourists’ behavior has been evolving and changing across all trip stages. Content shared by peers or influencers, such as trip evaluations on Tripadvisor or photos on Instagram, is able to drive tourist behavior at the pre-trip stage and during the tourism experience. This aspect is expected to have a major influence on tourist behavior as generational differences become less sharp.
This chapter illustrated how advancements in e-Tourism can have an impact on a range of consumer behaviors. Location-based technological advancements provide tourists with the possibility to decide what to do, where to do it, and how to do it while on-the-go. In the tourism context, algorithms can drive tourists’ mobility to lesser-known places. The “unknown” is no longer tamed by in-person tour guides or travel brochures; smartphones and their apps are recognized as trusted travel companions (Tussyadiah and Wang 2016). Tailor-made push recommendations guiding tourists on-site are based on tracking systems. The more tracking systems are used, the more sophisticated they become, leading both to enhanced vacations for tourists and to better offers from providers and destinations. Thus, all service providers need to consider that tracking systems do influence tourists’ behavior. Tourists, in an era obsessed with “nowness” (Buhalis and Sinarta 2019), have been empowered like never before and act instantly to search for information, plan on-the-go, co-create experiences, and evaluate tourism providers. Consumers in the e-Tourism era are getting used to immediate, precise, and often free online services. Such habits have consequences for offline behavior as well; for example, tourists are less tolerant when immediacy cannot be guaranteed by service providers for any reason.
The ubiquity of digital technologies and mobile devices influences, shapes, and drives consumer behavior. Smart tourism products and services have changed the ways tourists perceive and engage in identifying, choosing, and experiencing tourism (Sigala 2018). Youngsters’ travel and purchase behavior is remarkably influenced by messaging, consuming, and sharing content on social media platforms, and overall, technologically mediated experiences have a powerful impact on human behavior (Sigala 2018). While tourists develop new identities in the online realm, authors also call for a more human-centered perspective on tourism experiences mediated by emerging Tourism 4.0 technologies (Stankov and Gretzel 2020) and argue that smart tourists and intelligent tourist behavior should be shaped in line with sustainability and local needs, that is, an intelligent use of ICTs (Pearce 2021).
Several social media platforms offer space for e-evaluation of tourism services and destinations. The reasons behind writing a review on Tripadvisor or Booking.com can be diverse, and a review reflects both the quality of services and experiences and the reviewer’s state of mind. In the post-trip stage, affective and personal aspects affect consumer behavior in the use of social media. The act of sharing tourism experiences with different audiences (family, friends, unknown followers) on social media platforms shapes emotions, experience evaluation, and memories (Volo 2020). As this overview shows, consumer behavior in e-Tourism
has gained broad academic attention. Research in this field aimed at understanding and contextualizing the rapid changes affecting tourist behavior and at building academic knowledge on the behavioral aspects of e-Tourism. The fast pace at which consumer behavior changes in relation to e-Tourism poses both challenges and opportunities to tourism providers and destinations. Future research should explore digital and e-Tourism consumer behavior, acknowledging the specific social, economic, political, and institutional context in which consumers act and interact. Additionally, the technological advancements and modalities enhancing online consumer experiences should be considered in a historical perspective, in order to offer an enriched understanding of current and future behavioral trends. More effort is required in understanding consumers’ privacy concerns and ethical issues surrounding e-Tourism (D’Acunto et al. 2021; Tussyadiah et al. 2019), and more research is needed to evaluate the effect of service providers’ data protection rules in managing tourists’ traces (Volo 2018). Last but not least, there is a need to consider the long-lasting effects of the pandemic on tourist behavior, including technology enhancement (e.g., Assaf et al. 2021). Future studies should investigate the actual effects of COVID-19 on tourists’ behavior and on the role that e-Tourism will play in the post-pandemic traveling era.
Cross-References
Big Data Technologies
Business Intelligence in Tourism
Content Analysis of Online Travel Reviews
E-Business Models in Tourism
e-Supply Chain Management in Tourism Destinations
E-Tourism Curriculum
Impact of Artificial Intelligence in Travel, Tourism, and Hospitality
Recommender Systems in Tourism
Service Management in the E-Tourism Era
Strategic Use of Information Technologies in Tourism: A Review and Critique
Travel Information Search
References
Assaf AG, Kock F, Tsionas M (2021) Tourism during and after COVID-19: an expert-informed agenda for future research. J Travel Res. https://doi.org/10.1177/00472875211017237
Buhalis D, Deimezi O (2004) E-tourism developments in Greece: information communication technologies adoption for the strategic management of the Greek tourism industry. Tour Hosp Res 5(2):103–130
Buhalis D, Law R (2008) Progress in information technology and tourism management: 20 years on and 10 years after the internet – the state of etourism research. Tour Manag 29:609–623
Buhalis D, Sinarta Y (2019) Real-time co-creation and nowness service: lessons from tourism and hospitality. J Travel Tour Mark 36:563–582
Choe Y, Fesenmaier DR, Vogt C (2017) Twenty-five years past Vogt: assessing the changing information needs of American travellers. In: Information and communication technologies in tourism 2017. Springer, Cham, pp 489–502
Connell J, Reynolds P (1999) The implications of technological developments on Tourist Information Centres. Tour Manag 20(4):501–509
Conti E, Lexhagen M (2020) Instagramming nature-based tourism experiences: a netnographic study of online photography and value creation. Tour Manag Persp 34:100650
Correia A, Crouch GI (2004) A study of tourist decision processes: Algarve, Portugal. Cons Psycho Tour Hosp Leis 3:121–134
Correia A, Kozak M, Ferradeira J (2013) From tourist motivations to tourist satisfaction. Int J Cult Tour Hosp Res 7(4):411–424
Correia A, Zins AH, Silva F (2015) Why do tourists persist in visiting the same destination? Tour Econ 21(1):205–221
D’Acunto D, Volo S, Filieri R (2021) “Most Americans like their privacy.” Exploring privacy concerns through US guests’ reviews. Int J Contemp Hosp Manag. https://doi.org/10.1108/IJCHM-11-2020-1329
Demir SS, Kozak M, Correia A (2014) Modelling consumer behavior: an essay with domestic tourists in Turkey. J Travel Tour Mark 31(3):303–312
Dichter E (1966) How word-of-mouth advertising works. Harv Bus Rev 44:147–166
Dimanche F, Havitz ME (1995) Consumer behavior and tourism: review and extension of four study areas. J Travel Tour Mark 3(3):37–57
Dinhopl A, Gretzel U (2016) Selfie-taking as touristic looking. Ann Tour Res 57:126–139
Fan DXF, Buhalis D, Lin B (2019) A tourist typology of online and face-to-face social contact: destination immersion and tourism encapsulation/decapsulation. Ann Tour Res 78:102757
Fesenmaier DR, Xiang Z (2017) Introduction to tourism design and design science in tourism. In: Fesenmaier D, Xiang Z (eds) Design science in tourism. Tourism on the Verge. Springer, Cham, pp 3–16
Filieri R (2016) What makes an online consumer review trustworthy? Ann Tour Res 58:46–64
Filieri R, McLeay F (2014) E-WOM and accommodation: an analysis of the factors that influence travelers’ adoption of information from online reviews. J Travel Res 53(1):44–57
Fotis J, Buhalis D, Rossides N (2011) Social media impact on holiday travel planning: the case of the Russian and the FSU markets. Int J Online Mark 1(4):1–19
Garín-Muñoz T, Pérez-Amaral T, López R (2020) Consumer engagement in e-Tourism: micropanel data models for the case of Spain. Tour Econ 26(6):853–872
Germann Molz J, Paris CP (2015) The social affordances of flashpacking: exploring the mobility nexus of travel and communication. Mobilities 10(2):173–192
Gonçalves HM, Silva GM, Martins TG (2018) Motivations for posting online reviews in the hotel industry. Psychol Mark 35(11):807–817
Gregori N, Daniele R, Altinay L (2014) Affiliate marketing in tourism: determinants of consumer trust. J Travel Res 53(2):196–210
Gretzel U (2010) Travel in the network: redirected gazes, ubiquitous connections and new frontiers. In: Levina M, Kien G (eds) Post-global network and everyday life. Peter Lang, New York, pp 41–58
Gretzel U (2018) Influencer marketing in travel and tourism. In: Sigala M, Gretzel U (eds) Advances in social media for travel, tourism and hospitality: new perspectives, practice and cases. Routledge, New York, pp 147–156
Gretzel U (2021) Dreaming about travel: a Pinterest netnography. In: Wörndl W, Koo C, Stienmetz JL (eds) Information and communication technologies in tourism 2021. Springer, Cham. 10.1007/978-3-030-65785-7_23
Gretzel U, Yoo KH (2008) Use and impact of online travel reviews. In: Information and communication technologies in tourism 2008. Springer, Vienna, pp 35–46
Gretzel U, Yuan YL, Fesenmaier DR (2000) Preparing for the new economy: advertising strategies and changes in destination marketing organizations. J Travel Res 39(2):146–156
Gretzel U, Mitsche N, Hwang YH, Fesenmaier D (2004) Tell me who you are and I will tell you where to go – use of travel personalities in destination recommendation systems. Inf Technol Tour 7(1):3–12
Gretzel U, Fesenmaier DR, O’Leary JT (2006) The transformation of consumer behaviour. In: Buhalis D, Costa C (eds) Tourism business frontiers: consumers, products and industry. Elsevier, Oxford, pp 9–18
Gretzel U, Yoo KH, Purifoy M (2007) Trip Advisor online travel review study: the role and impacts of online travel review for trip planning. Laboratory for Intelligent Systems in Tourism, College Station
Gursoy D, McCleary KW (2004) An integrative model of tourists’ information search behavior. Ann Tour Res 31(2):353–373
Höpken W, Eberle T, Fuchs M, Lexhagen M (2021) Improving tourist arrival prediction: a big data and artificial neural network approach. J Travel Res 60(5):998–1017
Hruschka H, Mazanec J (1990) Computer-assisted travel counseling. Ann Tour Res 17(2):208–227
Inversini A, Cantoni L, Buhalis D (2009) Destinations’ information competition and web reputation. Inf Technol Tour 11(3):221–234
Irimiás A, Volo S (2018) A netnography of war heritage sites’ online narratives: user-generated content and destination marketing organizations communication at comparison. Int J Cult Tour Hosp Res 12(1):159–172
Jun SH, Vogt CA, MacKay KJ (2007) Relationships between travel information search and travel product purchase in pretrip contexts. J Travel Res 45(3):266–274
Kádár B, Gede M (2021) Tourism flows in large-scale destination systems. Ann Tour Res 87:103113
Kang M, Gretzel U (2012) Effects of podcast tours on tourist experiences in a national park. Tour Manag 33(2):440–55
Kim J, Fesenmaier DR (2015) Measuring emotions in real time: implications for tourism experience design. J Travel Res 54(4):419–429
Kim J, Fesenmaier DR (2017) Sharing tourism experiences. The posttrip experience. J Travel Res 56(1):28–40
Lamsfus C, Wang D, Alzua-Sorzabal A, Xiang Z (2015) Going mobile: defining context for on-the-go travellers. J Travel Res 54(6):691–701
Law R, Leung D, Chan ICC (2020) Progression and development of information and communication research in hospitality and tourism: a state-of-the-art review. Int J Contemp Hosp Manag 32(2):511–534
Lee M (2020) Will this search end up with booking? Modeling airline booking conversion of anonymous visitors. J Tour Anal 27(2):237–250
Liang S, Schuckert M, Law R (2019) How to improve the stated helpfulness of hotel reviews? A multilevel approach. Int J Contemp Hosp Manag 31(2):953–977
Li H, Meng F, Zhang X (2021) Are you happy for me? How sharing positive tourism experiences through social media affects posttrip evaluations. J Travel Res. https://doi.org/10.1177/0047287521995253
Loban SR (1997) A framework for computer-assisted travel counseling. Ann Tour Res 24(4):813–834
Ma S, Kirilenko AP, Stepchenkova S (2020) Special interest tourism is not so special after all: big data evidence from the 2017 Great American Solar Eclipse. Tour Manag 77:104021
Magasic M, Gretzel U (2020) Travel connectivity. Tour Stud 20(1):3–26
Mann C (2012) A study of the iPhone app at Kew Gardens: improving the visitor experience. In: Proceedings of the electronic visualisation and the arts conference, London, 10–12 July 2012
Mansfeld Y (2012) Consumer behavior in travel and tourism. Routledge, London
Martí P, García-Mayor C, Serrano-Estrada L (2021) Taking the urban tourist activity pulse through digital footprints. Curr Issues Tour 24(2):157–176
Masiero L, Viglia G, Nieto-Garcia M (2020) Strategic consumer behavior in online hotel booking. Ann Tour Res 83:102947
Moin SMA, Hosany S, O’Brien J (2020) Storytelling in destination brands’ promotional videos. Tour Manag Persp 34:100639
Morrison AM, Jing S, O’Leary JT, Lipping AC (2001) Predicting usage of the Internet for travel bookings: an exploratory study. Inf Technol Tour 4(1):15–30
Moscardo G (2010) The shaping of tourist experience. In: Morgan M, Lugosi P, Ritchie J (eds) The tourism and leisure experience. Channel View, Bristol, pp 43–58
Murphy L, Moscardo G, Benckendorff P (2007) Exploring word-of-mouth influences on travel decisions: friends and relatives vs. other travellers. Int J Cons Stud 31(5):517–527
Paananen K, Minoia P (2019) Cruisers in the City of Helsinki: staging the mobility of cruise passengers. Tour Geogr 21(5):801–821
Pan B, Fesenmaier DR (2006) Online information search: vacation planning process. Ann Tour Res 33(3):809–832
Pearce PL (2021) Smart tourists and intelligent behavior. In: Xiang Z et al (eds) Handbook of e-tourism. https://doi.org/10.1007/978-3-030-05324-6_66-2
Pearce PL, Gretzel U (2012) Tourism in technology dead zones: documenting experiential dimensions. Int J Tour Scien 12(2):1–20
Pizam A, Mansfeld Y, Chon KS (1999) Consumer behavior in travel and tourism. The Haworth Hospitality Press, New York
Pourfakhimi S, Duncan T, Coetzee WJ (2020) Electronic word of mouth in tourism and hospitality consumer behaviour: state of the art. Tour Rev 75(4):637–661
Reinhold S, Dolnicar S (2017) How Airbnb creates value. In: Dolnicar S (ed) Peer-to-peer accommodation networks: pushing the boundaries. Goodfellow Publishers, Oxford, pp 39–53
Ricci F (2020) Recommender systems in tourism. In: Xiang Z et al (eds) Handbook of e-tourism. https://doi.org/10.1007/978-3-030-05324-6_26-1
Rita P, Oliveira T, Estorninho A, Moro S (2018) Mobile services adoption in a hospitality consumer context. Int J Cult Tour Hosp Res 12(1):143–158
Scott N, Zhang R, Le D, Moyle B (2019) A review of eye-tracking research in tourism. Curr Issues Tour 22(10):1244–1261
Shoval N (2018) Sensing tourists: geoinformatics and the future of tourism geography research. Tour Geogr 20(5):910–912
Shoval N, Isaacson M (2007) Tracking tourists in the digital age. Ann Tour Res 34(1):141–159
Shoval N, Schvimer Y, Tamir M (2018) Real-time measurement of tourists’ objective and subjective emotions in time and space. J Travel Res 57(1):3–16
Sigala M (2005) Integrating customer relationship management in hotel operations: managerial and operational implications. Int J Hosp Manag 24(3):391–413
Sigala M (2012) Introduction to Chapter 1. In: Sigala M, Christou E, Gretzel U (eds) Social media in travel, tourism and hospitality. Ashgate, Farnham, pp 7–10
Sigala M (2018) New technologies in tourism: from multidisciplinary to anti-disciplinary advances and trajectories. Tour Manag Persp 25:151–155
Sigala M, Christou E, Gretzel U (eds) (2012) Social media in travel, tourism and hospitality. Ashgate, Farnham
Skinner H, Sarpong D, White GRT (2020) Meeting the needs of the Millennials and Generation Z: gamification in tourism through geocaching. J Tour Futur 4(1):93–104
Stankov U, Gretzel U (2020) Tourism 4.0 technologies and tourist experiences: a human-centered design perspective. Inf Technol Tour 22:477–488
Stankov U, Filimonau V, Slivar I (2019) Calm ICT design in hotels: a critical review of applications and implications. Int J Hosp Manag 82:298–307
Stylos N, Fotiadis AK, Shin D, Huan TC (2021) Beyond smart systems adoption: enabling diffusion and assimilation of smartness in hospitality. Int J Hosp Manag 98:103042
tom Dieck C, Jung TH, tom Dieck D (2018) Enhancing art gallery visitors’ learning experience using wearable augmented reality: generic learning outcomes perspective. Curr Issues Tour 21(17):2014–2034
Tussyadiah IP (2012) A concept of location-based social network marketing. J Travel Tour Mark 29:205–220
6 Consumer Behavior in e-Tourism
139
Tussyadiah IP, Fesenmaier DR (2009) Mediating tourist experiences: access to places via shared videos. Ann Tour Res 36(1):24–40 Tussyadiah IP, Wang D (2016) Tourists’ attitudes towards proactive smartphone systems. J Travel Res 55(4):493–508 Tussyadiah IP, Li S, Miller G (2019) Privacy protection in tourism: where we are and where we should be heading for. In: Information and communication technologies in tourism 2019. Springer, Cham, pp 278–290 Viglia G, Minazzi R, Buhalis D (2016) The influence of e-word-of-mouth on hotel occupancy rate. Int J Contemp Hosp Manag 28(9):2035–2051 Vogt C, Fesenmaier DR (1998) Expanding the functional information search model. Ann Tour Res 25(3):551–578 Volo S (2018) Tourism data sources: from official statistics to big data. In: Cooper C, Volo S, Gartner WC (eds) The Sage handbook of tourism management. SAGE, London, pp 193–201 Volo S (2020) The experience of emotion: directions for tourism design. Ann Tour Res 86. https:// doi.org/10.1016/j.annals.2020.103097 Volo S (2021) Tourist experience: a marketing perspective. In: Sharpley R (ed) Routledge handbook of the tourist experience. Routledge, London, pp 549–563 Volo S, D’Acunto D (2020) Service management in the E-tourism era. In: Xiang et al (eds) Handbook of E-tourism. Springer https://doi.org/10.1007/978-3-030-05324-6_73-1 Volo S, Irimiás A (2021) Instagram: visual methods in tourism research. Ann Tour Res 91. https:// doi.org/10.1016/j.annals.2020.103098 Wang Y, Yu Q, Fesenmaier DR (2002) Defining the virtual tourist community: implications for tourism marketing. Tour Manag 23:407–417 Wei W, Qi R, Zhang L (2019) Effects of virtual reality on theme park visitors’ experience and behaviors: a presence perspective. Tour Manag 71:282–293 White NR, White PB (2007) Home and away: tourists in a connected world. Ann Tour Res 34(1):88–104 Woods O, Shee SY (2021) “Doing it for the ‘gram?” The representational politics of popular humanitarianism. 
Ann Tour Res 87:103107 Wu MY, Pearce PL (2016) Tourism blogging motivations: why do Chinese tourists create little “Lonely Planets”? J Travel Res 55(4):537–549 Xiang Z, Fesenmaier DR (2020) Travel information search. In: Handbook of e-Tourism. Springer, pp 1–20. https://doi.org/10.1007/978-3-030-05324-6_55-1 Xiang Z, Gretzel U (2010) Role of social media in online travel information search. Tour Manag 31(2):179–188 Xiang Z, Magnini VP, Fesenmaier DR (2015) Information technology and consumer behavior in travel and tourism: insights from travel planning using the Internet. J Ret Consum Stud 22:244– 249 Ye Q, Law R, Gu B (2009) The impact of online user reviews on hotel room sales. Int J Hosp Manag 28(1):180–182 Yoo KH, Gretzel U (2008) What motivates consumers to write online travel reviews? Inf Technol Tour 10(4):283–295 Yu GB, Sirgy MJ, Bosnjak M (2020) The effects of holiday leisure travel on subjective well-being: the moderating role of experience sharing. J Travel Res. https://doi.org/10.1177/ 0047287520966381 Zach F, Gretzel U (2012) Tourist-activated networks: implications for dynamic bundling and EN route recommendations. Inf Technol Tour 13(3):239–257 Zhang Z, Ye Q, Law R, Li Y (2010) The impact of e-word-of-mouth on the online popularity of restaurants: a comparison of consumer reviews and editor reviews. Int J Hosp Manag 29: 694–700
7
Developments in German e-Tourism: An Industry Perspective Claudia Brözel
Contents
Introduction
Historical Background
  German Innovators Seek Holidays for All
  Early Development Shaped by Cultural Differences
  American Airlines Begins Electronic Data Processing
  From Idea to System Provider in Germany
  START Becomes Amadeus, a European GDS
The Situation Today
  Global Distribution Systems: A Blessing and a Curse
  Uniform Data Formats Are Missing
  New Players on the Market: Competition Becomes Global
  Packaged Tours Become Dynamic
The Future
  NDC: A New Standard
  Google Travel
  Linked Data: Knowledge Graphs and Open Data
  Tourism 4.0: Robotics, Smart Destinations, Education, and Sustainable Effects
  Between Digital Detox, Slow Travel, Resonance Tourism, and VR
Conclusion and Outlook
Interview Partners
Cross-References
References
C. Brözel, Eberswalde University for Sustainable Development, University of Applied Sciences, Berlin, Germany; e-mail: [email protected]
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_11
Abstract The tourism industry is driven by data. Digitization means efficiency and thus cost savings. The tourism industry, as a highly fragmented and complex worldwide industry, has a strong need for data coordination in order to produce and distribute bundles of travel services to customers. Due to their cost structures and the complex process of organizing worldwide transportation in real time, airlines were the first main drivers of digitalization in the travel industry. In Germany, the leisure market was initiated and first driven by state-dominated companies, but subsequently, a significant change in actors and global dependencies can be seen over the course of time. Until today, the recurring themes in the different stages of digitalization are communication formats and the power of various market actors. This has led to a standardization of data exchange formats between various data systems that allows data to be distributed efficiently worldwide and business processes to be completed in real time. This article includes the perspectives of three interview partners who helped shape the development of the German leisure market. It will be shown how the players and drivers of digitization have changed over time but that the key drivers have stayed the same: cost efficiency, data quality, and technical development.
Keywords German historical development · Transaction costs · GDS · Data standards · NDC · Open data
Introduction
From a business perspective, digitization leads to efficiency, cost reduction, information quality, and thus competitiveness. From the customer’s perspective, digitization contributes to greater flexibility, improved depth and breadth of information, and better usability. This levels the playing field between customers and the travel industry in terms of access to relevant information. This chapter outlines the development of digitization in the leisure tourism market – and therefore in the package tour sector – using the examples of the German and Central European markets. Germany is still one of the most important outbound travel markets in Europe and even worldwide. In fact, prior to the liberalization of the Chinese travel market in 2003, German citizens had the highest number of travel days of any country. According to the 50th “Reiseanalyse” (Travel Analysis) conducted by the Forschungsgemeinschaft Urlaub und Reisen e.V. (FUR; Vacation and Travel Research Association) in 2020, around 74% of all German vacation trips in 2019 were abroad (Forschungsgemeinschaft Urlaub und Reisen (FUR) 2021).1 At the heart of the analysis in this chapter lies the leisure market, with a particular focus on package tours. It is in this sector in particular that the strong influence of digitization interacts with the characteristics of tourism products and services:
immateriality, abstractness, perishability, the uno actu principle, consumption at the place of service production, and service bundles of at least three partial services that can be combined in different ways (Freyer 2008; Chehimi 2014). Before the customer’s arrival in the service context, the service is merely prepared in the form of various data that is communicated for distribution. This means that even when a service is booked, the only thing that is purchased is a right to disposal. Therefore, the creation of a service bundle in the travel industry is strongly influenced by transaction costs. Coase (1937), the founder of transaction cost theory, speaks of costs that develop in relation to the “use of the market.” The use of the market is defined by the exchange of “rights of disposal” (purchase, sales, rent), also known as property rights. “Transactions are thus in principle the explicit and implicit (contractual) negotiations about goods and (services) between at least two actors” (Picot and Dietl 1990). Building on the list of characteristics of tourism products, we can add that every tourism service is a complex network product that involves multifaceted interactions between the service providers involved. The number of possible combinations of service components in the form of data units – as well as the speed, availability, and adaptation to different languages and end devices – highlights how essential digital systems are for the creation of travel products.

Tourism is an information-based business, the product is a “confidence good,” and an a priori comprehensive assessment of its qualities is impossible. Tourists must leave their daily environment to consume the product. At the moment of decision making, only an abstract model of the product is available, based on information acquired through multiple channels [...]. Tourism products require information gathering on both the consumer and supply sides – and thus entail high information search costs. Such informational market imperfections lead to the establishment of comparably long information and value chains (Werthner and Ricci 2004, p. 102).
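The point about the number of possible combinations of service components can be illustrated with a small sketch. It assumes a bundle consists of exactly one flight, one hotel, and one transfer; the component names and the helper `enumerate_bundles` are hypothetical and serve only to show how quickly the number of candidate bundles grows.

```python
from itertools import product

# Hypothetical component catalogs (illustrative names, not taken from the chapter).
flights = ["FRA-PMI morning", "FRA-PMI evening", "MUC-PMI morning"]
hotels = ["Hotel A", "Hotel B", "Hotel C", "Hotel D"]
transfers = ["shuttle bus", "private car"]


def enumerate_bundles(*component_lists):
    """Return every bundle that takes exactly one option per component type."""
    return list(product(*component_lists))


bundles = enumerate_bundles(flights, hotels, transfers)

# The bundle count is the product of the catalog sizes: 3 * 4 * 2.
print(len(bundles))
print(bundles[0])
```

Even with these tiny catalogs there are 24 candidate bundles. Each additional component type multiplies the count again, which is one way to see why manual handling of such data does not scale.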
These complex value chains start with various service providers on one end and the customer on the other (Yılmaz and Bititci 2006, pp. 341–349). Therefore, not only the service provided at the vacation destination but also transport to the destination and the bundling of all services into a total product are important. In fact, transport to the destination is the key to opening up a destination for tourists. For this reason, airlines played a leading role in the development of tourism products and bundles, especially in the boom phase of tourism from the mid-1960s onwards (Krippendorf et al. 1989). As IATA explains in its Economics Briefing No. 10, the roughly 30 years between the 1980s and early 2010s saw an enormous increase in flights, especially direct city connections. From 1980 to 2012, IATA recorded a 150% increase in unique city pair services worldwide. However, IATA states that, adjusted for general price inflation, the price of air travel more than halved during this time. This led to enormous cost and efficiency pressures on airlines (IATA 2013). As Brutzel (2017) points out, while airlines are an essential part of the tourism value chain, their revenues do not cover the resources used. Therefore, the pressure on airlines to innovate is high, and digitization offers many opportunities, whether in communication, product creation,
or processing. Nevertheless, the divergence of airlines’ revenues and cost structures has continued. A collaboration of IATA with Harvard professor Michael Porter on the issue of the continued poor profitability of airlines found that while one cause was in the value chain, other factors included the inefficient design of government regulations, the complex structure of the tourism industry, and the marketability and dissemination of airline products (IATA 2013). For these reasons, airlines have always tried to reduce costs, especially transaction and distribution costs, as much as possible. Therefore, the need to design more efficient business processes and take steps towards digitalization has always lain with the airlines. The question of data standards – a standardized language between systems and interfaces within the tourism industry – has been the biggest issue since digitalization began. While end customers value being able to compare different tourist options, many providers see their greatest market opportunities in the diversity and variation of services and products. However, comparing offers is only possible with a uniform data standard and an agreement on common attributes as well as successful communication between different systems across all interfaces. Methodologically this chapter is built around qualitative interviews with three leading figures in the German tourism industry; these interlocutors are briefly presented at the end of the chapter. All three have accompanied and shaped the developments (especially digitalization) within the German tourism industry through their work in various positions. The information gleaned through the interviews has been supplemented with further information taken from secondary literature and other sources.
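The argument above, that offers can only be compared once providers agree on a uniform data standard and common attributes, can be sketched as follows. Everything here is a hypothetical illustration: the two provider formats, the field names, and the `Offer` schema are assumptions made for the example and do not correspond to any real industry standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Common schema (hypothetical) that both provider formats are mapped onto."""
    provider: str
    destination: str
    nights: int
    price_eur: float


def from_provider_a(raw: dict) -> Offer:
    # Assumed format A: duration in nights, price in euro cents.
    return Offer("A", raw["dest"], raw["nights"], raw["price_cents"] / 100)


def from_provider_b(raw: dict) -> Offer:
    # Assumed format B: duration in days (nights = days - 1), price as a euro string.
    return Offer("B", raw["destination_name"], raw["days"] - 1, float(raw["total_eur"]))


def cheapest(offers, destination, nights):
    """Return the cheapest offer matching destination and nights, or None."""
    matching = [o for o in offers if o.destination == destination and o.nights == nights]
    return min(matching, key=lambda o: o.price_eur) if matching else None


offers = [
    from_provider_a({"dest": "Mallorca", "nights": 7, "price_cents": 89900}),
    from_provider_b({"destination_name": "Mallorca", "days": 8, "total_eur": "849.00"}),
]
best = cheapest(offers, "Mallorca", 7)
print(best)
```

Without the two mapping functions, the raw records differ in field names, units, and duration semantics and cannot be compared directly. A shared standard would make such per-provider adapters unnecessary.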
Historical Background
German Innovators Seek Holidays for All
In Germany, the concept of the packaged tour began with Dr. Carl Degener, who in 1939 – before the Second World War – had already brought 13,000 guests to Ruhpolding (Schneider and Sülberg 2013, pp. 77–82). From early on, Degener had been interested in the idea of making holiday travel accessible to people from lower-income backgrounds. In 1930, the Deutsche Reisespar GmbH (German Travel Save Co.) was founded in Berlin, and Degener assumed the role of director. However, the “travel save” idea couldn’t be implemented for long, as interest was low due to the political situation of the time. Degener therefore founded the “Reisebüro Dr. Carl Degener” travel agency and organized holidays to the Alps; his chartered trains traveled mainly to the area around Salzburg. In the 1930s, Degener’s business was the second largest travel company in Germany, second only to the politically infused and state-operated “Strength Through Joy” (Kraft durch Freude; KdF) organization. Later, Degener founded Touropa and became the first German travel organizer to promote inbound travel to Germany from other countries after the war. His motivation was to avoid leaving the market only to foreign tour operators like American Express and Thomas
Cook, who had as early as 1949 advertised their services with such messages as “Travel with us to Germany, it’s your last chance to visit the ruins [...].” At the same time, the Official Bavarian Travel Agency (amtliche bayerische Reisebüro, abr) and the German Travel Agency (Das Deutsche Reisebüro, DER) came into being (Schneider and Sülberg 2013, pp. 84–94). As the following section shows, the abr, as well as DER and Touropa, were all involved in the early digitalization of the travel industry in Germany. Werner Sülberg points out that, differently from most other countries, the structures of the commercial travel industry in Germany developed under the significant influence of the travel agent industry. Furthermore, the emergence of the large tour operator companies was shaped by the interests and sales strategies of German travel agents (Sülberg 2008). Up until the year 2000, the unique position of travel agents was largely due to developments after World War II. In the mid-twentieth century, several large travel centers and chains cooperated with each other in order to minimize capacity utilization risks and organized tours themselves. Then, in the midst of the mass tourism boom of the 1960s and 1970s, tour operator companies were founded by many German travel agents. The power structures only changed in the 1990s, when the founding generation fused their travel agency chains and afterwards sold them to the tour operator conglomerates that they had partially founded. The tour operator companies then integrated the regular travel agencies as controllable sales units (Sülberg 2008). According to Sülberg (2008), the development of the travel industry can be divided into four phases, which are listed below with additional information as to the changes that precipitated each phase (Table 1).

Table 1 Phases of development of the travel sector (Sülberg 2008, p. 36; additions by the author)

Phase | Time period | Characteristics | Causal factors
1 | Ancient World/Middle Ages to mid-nineteenth century | Travels of individuals mainly motivated by political, commercial, or ethnic factors | Wars, economic necessity, religious beliefs
2 | Before World War II | Commercial tourism industry structures | Political and social motivation to offer travel services
3 | Second half of the twentieth century | Development of mass tourism and with this also the commercial and organizational development (IT, logistics) of the tourism industry | State funding, monopolization, barriers to market entry; in the 1990s also German reunification, the opening of the EU market, market liberalization, and legislation concerning package tours
4 | Since the beginning of the twenty-first century | Internationalization of travel market structures and the concurrent demerging of business models and sales channels due to digitalization and the availability of internet technologies | Globalization, internet technology, new market players, global structures, and changes to business models as a result of Information and Communication Technologies (ICT)
Early Development Shaped by Cultural Differences
Developments in Germany were always heavily dominated by the tourism companies themselves, who focused strongly on their own expertise. Conversely, the tourism sector in the USA was influenced by technology companies or media corporations, which had a more problem-oriented approach. While in Germany development was dominated by approaches and understandings specific to the tourism sector, in the USA problems were recognized and attended to by technology companies that then presented solutions to the industry. (Interview with Alexander von Koslowski 2019)
In the early days of digitalization, a cultural difference that still shapes capacity for innovation today began to manifest itself: the digital development of one of the important European outbound tourism markets was determined by companies that wanted to assert a German position in the fragmented and competitive tourism market. One of the reasons for the tourism-focused direction was the strong presence of state-run carriers on the supply side of the early tourism market in Germany. Prior to their restructuring in the 1990s, the two most important German passenger carriers, Deutsche Bahn and Lufthansa, were state-owned companies and therefore not embedded in competitive structures in the same way as other market operators throughout the world. Their monopolistic structure in the hands of the state also shaped the way they participated in the market; this can be clearly seen in the protection offered by barriers to market entry. Together with the finance sector (Deutsche Bank, Commerzbank, Dresdner Bank, and Westdeutsche Landesbank), which largely co-determined the direction of developments through the shares they held in various tour operators and travel agencies, Deutsche Bahn and Lufthansa held respective monopolies on the supply side of the German tourism market. Vertical and horizontal consolidations between service providers, tour operators, and travel agencies took place until the mid-1990s, and there existed a strong interdependence between banks, carriers, travel agencies, and tour operators. The 1990s saw key changes and a restructuring of the market conditions. The supply side of the market was changed by the reunification of Germany in 1990 and the liberalization of the EU market in 1993. Vertical and horizontal expansion processes of the largest operators followed, and new constellations of interests emerged as a result of changes in ownership structures (Sülberg 2008).
Around the same time, two legislative changes also influenced the development of travel agencies in Germany. The first was that the limitation of rights of distribution (German: Vertriebsbindung) ended in summer 1995. Before the abrogation of Vertriebsbindung, the tour operator, as the merchant, had usually forbidden the sales representative – the travel agency – to sell products other than those of the merchant. Once this changed, however, a travel agency belonging to TUI, for example, could also offer tours by Neckermann and so on. This legislative change led to a major liberalization of sales. This phenomenon has been much discussed in terms of various permutations of principal-agent theory (on this theory, see Schreyögg 2003). The second major change occurred in the realm of the strong German tourism law,2 which had already been anchored in paragraph 651 (and subsequent paragraphs)
of the Bürgerliches Gesetzbuch (BGB) in 1976. After 1990, the EU directive on package travel and consumer rights (Directive EU 2015/2302), which is strongly based on German travel law, was amended and became more adapted to online sales. In summary, significant changes took place in the 1990s, not only because of sociopolitical changes in Germany itself and the reunification of Germany but also due to economic changes, including the opening of the European market. These changes were supported by the abrogation of Vertriebsbindung and the coming into effect of the new European directive on package travel and consumer rights. These geopolitical and legal changes were also accompanied by the development of the Internet in the late 1990s.
American Airlines Begins Electronic Data Processing
In the USA, and later also in Europe, digital development was driven largely by the airlines. The complex execution of the sale of available seats on particular flights requires all information to be available in real time. American Airlines (AA) was already using a half-automated system for the reservation and booking of flights in 1960. With this system, however, many steps in the process still had to be done manually: telephone calls, teleprinter messages, and lots of paperwork. The error rate was 8%, and while this may be considered to be quite high, it was the best error rate that any airline could achieve at the time. On a transatlantic flight, the then CEO of American Airlines, Cyrus Rowlett Smith, was seated next to an IBM employee, and following a conversation with him, he initiated SABRE (Semi-Automatic Business Research Environment), the first computer reservation system for the electronic processing of flight bookings. SABRE was put into operation in March of 1964. As discussed in a CNN article in 1999, the investment of $150 million into the new system by the CEO of American Airlines was ridiculed at first: “Instead of investing the money in its core business, it was putting it into a lot of mysterious boxes that would sit in a room somewhere” (Goff 1999). The initial SABRE system had two IBM 7090 mainframe computers that were connected to 1,500 terminals across the USA and Canada. By 1964, American Airlines had connected over a thousand participants in 60 cities across the USA to SABRE via the telephone network. With that, they had become the largest private data processor in the world. American Airlines became a kind of prototype to follow, and other airlines also upgraded their reservation systems (SABRE 2017). The business model of the new reservation systems stipulates a connection fee as well as licensing fees for particular branches of the industry.
To tie sales outlets (predominantly travel agencies) closely to the Global Reservation System (later Global Distribution Systems, GDSs), so-called kickbacks are paid as part of a system of bonuses for bookings made via the system. The example of SABRE shows the profits that could be achieved through this pricing strategy in the early days of the reservation system:
The flight schedules of all major airlines worldwide are stored in the SABRE system. The system was sold to about 12,000 US-American travel agencies, which covers about half of the relevant market. The travel agencies pay $1.75 for each flight booking that is completed via SABRE, insofar as this is not a direct booking via American Airlines. The revenue is estimated at $338 million, the profit at $170 million. SABRE therefore operates with a return on sales of about 50%. For the airline, this return is considerably higher than that gained through aviation. This relatively high significance is also expressed in a statement by the former marketing director of AA: “We are now in the data processing as well as in the airline business.” (Mertens 1985, p. 10, in Berger et al. 1990, p. 99)
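The figures quoted above can be checked with a few lines of arithmetic. The sketch below only uses the numbers from the quotation; the implied number of bookings rests on the additional assumption, not stated in the source, that the entire estimated revenue came from the $1.75 booking fee.

```python
# Figures taken from the quotation (Mertens 1985, in Berger et al. 1990).
FEE_PER_BOOKING_USD = 1.75
REVENUE_USD = 338_000_000
PROFIT_USD = 170_000_000

# Return on sales = profit / revenue.
return_on_sales = PROFIT_USD / REVENUE_USD

# Assumption for illustration only: all revenue stems from per-booking fees.
implied_bookings = REVENUE_USD / FEE_PER_BOOKING_USD

print(f"Return on sales: {return_on_sales:.1%}")
print(f"Implied paid bookings: {implied_bookings:,.0f}")
```

The ratio comes out at roughly 50%, which matches the "about 50%" in the quotation. Under the stated assumption, the revenue would correspond to roughly 193 million fee-bearing bookings.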
From Idea to System Provider in Germany
During the growth phase of mass tourism in the 1970s, it became clear in Germany that the information burden associated with organized tours could no longer be handled by telephones, paper, and pens, especially in light of the competition from abroad, which was steadily increasing. Developments in other countries, especially in the USA, worried the European airlines and German tourism providers greatly. For this reason, actors from the transportation and tourism industries cooperated to form the START Gesellschaft, initially with research support given by the Federal Ministry of Research and Technology (Bundesministerium für Forschung und Technologie, BMFT). All the way to the 1990s, the three largest representatives of the German tourism industry (Lufthansa, Deutsche Bahn, and TUI) were the shareholders and drivers of the development of START and later Amadeus. (Interview with Jürgen Büchy 2019)
START, the “Studiengesellschaft zur Automatisierung in Reise und Touristik” (Research Association for Automation in Travel and Tourism), was founded in 1971. Later, its name was changed to “START Datentechnik für Reise und Touristik GmbH” (START Data Systems Technology for Travel and Tourism Ltd) (Fig. 1).

Fig. 1 START ownership shares in the founding period (based on Berger et al. 1990): Amtliches Bayrisches Reisebüro (abr) 8.5%; Hapag Lloyd Reisebüro 8.5%; Deutsches Reisebüro (DER) 8.5%; Touristik Union International (TUI) 25.0%; Deutsche Bahn (German Railway) 25.0%; Deutsche Lufthansa (German Airline) 25.0%

According to Berger et al., the goal of the START Gesellschaft was to establish the START system and to “counteract the establishment of company-specific reservation systems and avoid the danger of incompatible hardware and software systems. A standardized information system for all travel agencies that didn’t only complete all necessary functions associated with the sale of travel services but could also control market entry was to be established” (Berger et al. 1990, p. 92). The START Gesellschaft delivered not only the necessary software but also the hardware to travel agencies. Likewise, the new connection allowed for mid and back office functionalities as well as evaluations that enabled a new dimension of strategic planning and organization for travel agencies and organizers connected to the system. In contrast to the UK, in Germany START meant that an electronic distribution system was placed on the market that showed all offers irrespective of the organizer. From the outset, the system offered access to all participants, showed the offers of everyone participating in the system, and was accessible at all sales offices (travel agencies) – a system, therefore, in which all fellow competitors could be found at the same time. In the beginning, the information about the operators was static, meaning that the participants in the system (the sales offices) received an update once a week that refreshed their own database. Today that sounds somewhat incredible, but at a time when people were still working with paper, telephone calls, faxes, and telexes, and always in different systems, that in itself was already something of a revolution. (Interview with Alexander von Koslowski 2019)
The founding companies soon recognized the market power made possible for them by this new system. Nevertheless, there was also quite a lot of criticism. In 1979, the Frankfurter Allgemeine Zeitung (FAZ; Frankfurt General Newspaper) ran an article with the headline “Foreign airlines feel disadvantaged by this booking system.” The concentration of information, as well as the potential disadvantages and barriers to market entry that START meant for many market operators, was widely discussed. One of the key criticisms was that a customer receiving an offer in a travel agency equipped with the START system would inevitably only receive the offers of companies connected to the system (Berger et al. 1990). Hermann Weihe (1979) strongly criticized the process of concentration and the lack of a wide range of offers provided by the START system. He argued that the comfort and advantages of the system interfered with the diversity of offers and that travel agencies and customers tended to only make use of the offers within the START system; tour operators who did not work with START were excluded (Weihe 1979, in Berger et al. 1990). Even in its early days, the START system was both a service provider and a system provider. The START travel agency mode included special services designed by START for travel agencies. These were functions that provided useful support for the sales process. Following the completion of a booking, for example, the system created not only the tickets but also a security certificate as well as receipts,
C. Brözel
insurance documents, and invoices. Customer data was stored at START and (if required by the travel agency) stored in the service data center. START therefore had all customer and booking data at its disposal. From this, extended data analyses were then created over time, which in turn supported travel agencies in their marketing activities (Brehm 1986, in Berger et al. 1990). Service data centers took on a special role in the START system. Since 1985, DERDATA-RZ has made online bookkeeping possible via START terminals. This has meant that START is not only a data collection point and intermediary but also a data analyst (Brehm 1986, in Berger et al. 1990).
START Becomes Amadeus, a European GDS
The founding phase of START (1970s–1987) was oriented strongly towards the German market, but the European Single Market and strong competition from the US-American market made an enhancement necessary. In 1987 Amadeus was created with the purpose of capturing the market for passenger transport in Europe. The founding companies of Amadeus were Lufthansa, Iberia, Air France, and SAS, and the responsibilities were divided between the four respective countries. Because of this, the headquarters of Amadeus is in Madrid, the development center in France, and the largest private data processing center in Erding, Germany. In the year that Amadeus was founded, the position of CEO went to the Scandinavians. (Interview with Uta Martens 2019)
The direct competitors of Amadeus in Europe are the GALILEO system (British Airways, Swissair, and KLM) and, in the USA, the APOLLO (United Airlines) and SABRE (American Airlines) systems. ABACUS, which was initially only used in the Asian market (Cathay Pacific, Singapore Airlines, China Airlines, and others), began cooperating with SABRE in 1997. These information systems highlight the goal of the airlines to optimize passenger transport and structure markets geographically. However, as early as the 1990s, hotels, rental car agencies, and a large number of airlines (roughly 2000) had already been connected via the US-American systems. Global Distribution Systems (GDSs) can be seen as the origin of digital marketplaces or platforms, platforms being “interaction managers” between suppliers and demanders (Schmidt n.d.). It can be stated that the platform is the central business model of the digital economy (examples are Amazon and Airbnb, but also Apple, Microsoft, and Facebook). The formerly national systems began showing a strong tendency towards global development, which had immense consequences for the tourism market (Berger et al. 1990, pp. 97–99; Freyer 2015, pp. 354–360). Critical discussions of the information and reservation systems have raised largely the same points today as in the past:
• Costs: Information and reservation systems are expensive because they charge a range of usage fees.
• Market power: The fact that the systems favor providers within the systems creates a change in the market. Those providers who are not hooked up to
7 Developments in German e-Tourism: An Industry Perspective
the system will not receive bookings simply because using the system is easier (Berger et al. 1990).
• Mass distribution: The use of the system affects the work of travel agency employees in that only those service providers who are present in the system appear as offers. This then also increases the sliding scale commission from the tour operators to the travel agencies. Because only the products of providers within the system are sold, the customer has only limited choices and no real overview of the market as a whole. The number of trips sold by the suppliers listed in the system increases because the transaction costs (see above) are particularly low for the travel agencies and because commissions increase as more packages are sold.
• Data access: Access to customer and competitor profiles stays with the system operator. As system operators, SABRE, Amadeus, and APOLLO have access to PNR (passenger name record) databases, from which strategically important customer profiles or competitor analyses can be created (Berger et al. 1990, pp. 97–99; Freyer 2015, pp. 354–360).
The Situation Today
Global Distribution Systems: A Blessing and a Curse
Today, everyone is talking about platforms – from my perspective, Amadeus is the mother of all platforms. Already at the beginning of automation, it offered all market operators a technology that allowed them an efficient way of being present in the market. (Interview with Uta Martens 2019)
System providers make accessing information, communication, and settlement systems possible for all branches of the industry. The most important tourist products offered via the systems are the rights of disposal (or rather the availability) of seats on airplanes, trains, and buses, as well as spots in package tours or at events. It is therefore crucial that the systems are dynamic, meaning that an automatic update occurs in the systems of the connected tour operator or carriers whenever changes occur (Freyer 2015). The core area of passenger transport is therefore a constitutive starting point for the technological development of the travel industry. The sales and marketing via GDSs have led to codeshare agreements and airline alliances. Of course, the founding members of the GDSs were always interested in seeing their own offer listed first, because a route or offer that is listed at the top is chosen most often. Even today, in the era of Google, the order of search results can heavily influence the selection of offers and therefore also the booking. Due to the strong market power of the GDSs for sales and marketing by airlines, regulatory bodies quickly introduced a code of conduct to reduce the distortive effects of benefits and price reductions on the competition. The code stipulated that the presentation of offers and services had to be done neutrally. Today, this style of listing (which displays direct flights first and then those with a stopover)
still determines the order in which connections are presented in online portals. But in the GDSs, the only connections that are allowed to be listed are those that have a binding commitment to the connecting flight, including the transporting of baggage. The code of conduct is still valid outside the USA and has significantly contributed to the formation of airline alliances like STAR Alliance, OpenSky, and others. Alliances, but also Code-Sharing (in which two or more airlines share a scheduled flight and each airline shows it under their own code), have contributed to the successful handling of air traffic worldwide. GDSs not only give travel agencies a market overview in relation to the availability of flight seats and for other service providers but also – as described above – provide a system for managing customer master data and processing (travel) data from reservation to settlement (payment) and accounting. This represents an extreme increase in efficiency compared to processes lacking a GDS and offers a lot of potential for improving offers and customer service. GDSs therefore quickly became the most important information and operating systems for core processes. GDSs have therefore very quickly become the most important part of the value chain – what I refer to as “the main artery of tourism data.” Nevertheless, the costs of such systems are consistently criticized. The business model of GDSs is profitable, as they charge airlines a booking fee for each transaction per segment. But this booking fee is quickly becoming a major cost for the airlines in their distribution channels. Travel agencies connect the GDSs to their systems via a user subscription that includes both hardware and software. A lock-in effect3 arises because travel agencies do not want to use alternative platforms. In addition, GDSs incentivize travel agents with a “kickback” when they make a booking. 
This turns the airlines’ transaction costs into a win not only for the GDSs but also for the travel agents. While on the one hand the GDSs have contributed to an extreme increase in the efficiency of airline distribution networks, on the other hand they have also developed a monopoly and have thus become a powerful part of the value chain, “locking” all market participants into their systems. Usability, flexibility, and high-quality access to data are the functionalities that generate high distribution costs for service providers and additional revenue for travel agencies through the incentives from the bookings. Since their inception, the GDSs have been interested in maintaining this position as intermediaries and data providers, while the airlines and tour operators, from a transaction cost perspective, have tried to develop direct forms of distribution (without using the GDSs) with every new technical development (Brutzel 2016).
Buhalis (1998) studied the success of the GDSs and describes the great advantage of GDSs and computer reservation systems (CRSs) for customers: providing comprehensive access to offers and thus a certain degree of transparency that allows comparisons to be made. A large range of package tours, but also individual services such as flights and hotels as well as their prices and availability, can be displayed on online travel agencies’ websites via a GDS. On the supplier side, Buhalis (1998) describes how GDSs have become the “backbone” of the industry through the
7 Developments in German e-Tourism: An Industry Perspective
153
adoption of communication standards and the “new tourism electronic distribution channel” that all market participants want to use because of its high efficiency (Buhalis 1998, p. 412). Suppliers try to find a way of reaching customers, and customers try to find a suitable offer. It’s actually quite simple: we provide a marketspace that brings supply and demand together. (Interview with Uta Martens 2019)
When one considers the different business models of the suppliers, the goal of marketing the diverse range of products efficiently is actually not easy to achieve. In the early days of the GDSs as system providers, GDSs provided the software and hardware and were therefore able to dictate a single “language and logic” regarding data exchange. In the last 30 years, however, the question of the language of data exchange and data processing has repeatedly become an important question for developments in the context of technological advancement. Therefore, tour operators and airlines view a lot of the data strategically and enter it into their own proprietary systems, which are the cornerstone of their business models, as their entire bargaining power lies in their respective products and pricing models. For this reason, these data sets cannot be completely shared, and they require a “translator” (either an intermediary or a system provider) that further processes the data as well as the proprietary rules that have been set. In his overview (Fig. 2), Buhalis sketches the strategic importance of the information and communications technologies according to three aspects: inner organization (intra), a focus on the end customer (consumer), and B2B organization (inter). These aspects must then only be supplemented by the topic of data formats (language). Auliana Poon (2003) notes that the international tourism industry has undergone a radical and extremely rapid evolution. While the 1950s, 1960s, and 1970s were characterized by standardization, mass production, and rigidly prefabricated packetized trips, the “new” tourism industry is characterized by flexibility, segmentation, and a strong customer orientation. Poon defines four core elements of this new tourism industry, all of which build on the capabilities of data exchange and system providers. At the heart of the considerations are flexibility and the combinability of data into individualized offerings. 
Similar developments are also taking place with each further push towards innovation in e-Tourism, which has led not only to more flexible organization and thus also flexible offer production and distribution but also a high degree of choice and thus flexibility in payment and booking options (e.g., via various Internet providers). This has increased the possibilities for customers to find suitable individual offers (Poon 2003). The higher degree of flexibility means high transaction costs for the users of the systems, especially the service providers. However, the market performance for all market participants (both providers and customers) is clearly better than before the automatization and development of the system providers. The system providers – especially the GDSs – are developing rapidly and offer their communication services and management systems not only to airlines but also to all service providers and tour operators. Due to the rising costs, however, the efforts of market
154
C. Brözel
Fig. 2 Tourism and information technologies: strategic and systemic framework (based on Buhalis 1998, p. 417)
participants will in the future go in the direction of being able to bypass the system providers and build up direct possibilities of access from supply to demand (see also section “NDC: A New Standard” on the NDC standard).
Uniform Data Formats Are Missing
The high flexibility desired by all market participants for the production and distribution of travel products is severely limited by inconsistent data. This means that the data formats with which the highly fragmented tourism industry communicates can and do limit the flow of information. The idea of disintermediation (cutting out the middleman) has therefore been discussed repeatedly. However, data formats have grown structurally in the industry, and their use is complex and varied. In the 1990s, the Internet and its widespread use developed rapidly, and the debate about digital distribution channels also continued into the early 2000s. Early on, travel agencies could not even imagine a system of digital distribution, because this would remove their core service – consultation. But new players entered the market in the form of online travel agencies that rely on search algorithms, appealing websites, and especially online marketing. In this regard, Alexis (2017)
7 Developments in German e-Tourism: An Industry Perspective
155
notes the “disintermediation effect,” which has been the subject of much debate. The disintermediation effect is the basic possibility of lowering transaction costs through the use of Information and Communication Technologies (ICT) and a direct connection between supply and demand. The emergence of online marketplaces has brought new market participants into the market and risks displacing stationary sales by travel agencies. Clearly, without uniform data standards, no comparability is possible. This applies especially to online systems in B2C contexts. While many service providers are concerned that a uniform data standard would reduce the variety of the product range, proponents argue that a uniform data standard could make the system providers or the intermediaries superfluous. Several organizations are working on the unification of data standards in the tourism industry on both a national and an international level. On an international level, the OpenTravel Alliance describes the following goal for itself: “The OpenTravel Alliance provides a community where companies in the electronic distribution supply chain work together to create an accepted structure for electronic messages, enabling suppliers and distributors to speak the same interoperability language, trading partner to trading partner” (OpenTravel Alliance n.d.). In Germany, the OpenTravel Data Standard (OTDS) was founded as an association in 2012. From the beginning, its members were various internet booking engine operators as well as tour operators and travel agency networks – and thus many parts of the tourism value chain. The goal of the OTDS is to establish an open data standard that makes “free, compact, precise, comprehensive and compatible” data usable for all participants (OTDS n.d.). The standard itself, according to the website, can deliver information to all central cache systems and supports different attribution systems for product description (e.g., Giata Facts, DRV Global Types, etc.) 
(OTDS n.d.). These examples show how the travel industry was forced to develop new solutions due to the complex fragmentation and high transaction costs of the system providers. This is how centralized “cache systems” have evolved. A cache is a buffer memory that makes it possible for data queries not to be run on the live system in order to save transaction costs. Using the example of the GDS Amadeus with various other system providers such as GIATA (images and descriptions) or TravelTainment (packaging of content), Fig. 3 shows how complex searching, booking, and responding can be. The cache forms an important and much-discussed element in the leisure sector that cushions the transaction load. Here offers are saved and searched in order to not burden the real systems downstream, to allow fast responses, and to avoid unnecessary transaction costs. (Interview with Uta Martens 2019)
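The buffering role of a cache described above can be illustrated with a minimal sketch. It assumes a hypothetical `query_live_system` function that stands in for a costly request to a downstream supplier system, a made-up offer key and price, and a simple time-to-live rule for freshness. None of this reflects how Amadeus or any other provider actually implements its cache.

```python
import time
from typing import Callable, Dict, Tuple


class OfferCache:
    """Time-limited buffer in front of a costly live system (illustrative only)."""

    def __init__(self, fetch: Callable[[str], float], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch        # hypothetical call to the live system
        self._ttl = ttl_seconds    # how long a cached price counts as fresh
        self._clock = clock        # injectable clock, useful for testing expiry
        self._store: Dict[str, Tuple[float, float]] = {}
        self.live_calls = 0        # requests that actually reached the live system

    def price(self, offer_key: str) -> float:
        now = self._clock()
        entry = self._store.get(offer_key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # answered from the cache, no live transaction
        self.live_calls += 1
        value = self._fetch(offer_key)
        self._store[offer_key] = (now, value)
        return value


def query_live_system(offer_key: str) -> float:
    """Stand-in for a supplier system; returns a fixed, made-up price."""
    return 499.0


cache = OfferCache(query_live_system, ttl_seconds=300.0)
prices = [cache.price("FRA-PMI|hotel-123") for _ in range(1000)]
print(cache.live_calls)  # 1: only the first search reached the live system
```

The trade-off is the one the text points to: a longer time-to-live saves more live transactions but raises the chance that a cached price or availability is already out of date when the customer tries to book.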
Figure 3 shows the process of a booking request and response at Amadeus Leisure IT. The process involves an interconnected exchange between the different systems of airlines, tour operator systems, content providers, and others. Data is delivered to the IBE (Internet Booking Engine) at the front end for online and offline users by using different data formats (language and logic). Packages are bundled in the IBE according to the rules that are sent by the travel organizers and airlines.
[Figure 3 diagram: “Data routes from request to booking (2011), example IBE.” Annotations: the distribution system generates more than 50 million requests per day; the cache reduces transaction load costs. Elements shown: content sources (pictures, product and hotel descriptions, videos, e.g., GIATA; reviews); sales channels (online, offline, mobile); product types (package tour, hotel only, car, flight, flight only); cache; tour operator systems (TUI, TC, FTI, ITS, etc.); GDS cache (connections, pricing/rules); GDS (availabilities); airline hosting system; request and booking flows.]
Fig. 3 Data route of a request and booking response from the perspective of a system provider example (Brözel 2012)
A package tour or a flight-only product that is bookable (available) is then sent back to the distribution front end. As Fig. 3 shows, very different systems with different data formats are involved. This causes high transaction costs for the service providers but delivers excellent results in terms of product presentation and sales on the customer end. In contrast, Fig. 4 shows the system infrastructure as provided by a service provider offering a unified solution with differently connected external systems, including different delivery options (online travel agency, white label content, travel agency). Different containers with a direct connection to external flight systems, rental cars, holiday houses, insurance packages, and rail tickets can be seen. Next to this is the tour operator’s own (in-house) tour operator system, as well as the mid office and back office areas that handle the payment and the complete tracking and analysis of the websites. The diagram shows that content from the different data sources can be relayed and packaged dynamically according to the logic in the center of the diagram: the “DynaPack” bridge. In this system, content is relayed not only as white label but is also presented on the operator’s own website. The different containers show from which systems the respective products are relayed. The example shows a completely independent solution of an online travel agency and tour operator. This solution allows a partial elimination of the transaction costs of the GDSs and the development of an own itinerary and thus an individual offer.
Fig. 4 Representation of the system infrastructure of an online travel agency: the example of ZNT Richter (with kind permission of ZNT Richter n.d.)
New Players on the Market: Competition Becomes Global
According to the German Travel Association (Deutscher Reiseverband, DRV) and FVW Medien GmbH, there were 11,116 travel agencies and other sales offices in Germany in 2017. The majority of the German travel agencies are so-called Touristikbüros that possess at least two tour operator licenses (FVW Medien 2018). Germany is still a market that is strongly geared towards distribution offices and focused on outbound tourism. However, for several years now, German citizens no longer occupy the top spot on the list of “world travel champions.” The World Tourism Organization (UNWTO) has said that “the large majority of international travel takes place within travelers’ own regions (intraregional tourism). Four of five tourists travel within their own region” (2018). In this regard, it is interesting to note that in Germany a large part of the organized tourism market is still shaped by outbound travel.4 Among other factors, this might be the result of inconvenient bookability and the structures of travel sales that influence the domestic market. Furthermore, the predominantly middle-sized travel industry is also fragmented as a result of German political structures (see principle of subsidiarity), so that each commune, region, and state tries to build up its own branding, marketing, and distribution on the basis of completely different ICT systems. Domestic
holiday travel still takes place in a largely self-organized manner and is therefore generally only interesting for accommodation providers. Domestic packaged tours are increasing but still not offered to the same degree as travels abroad. In her overview of the tourism distribution market, Moller (2008) describes competition from other industries with traditionally stationary tourism sales outlets as nontraditional outlets (NTOs). She defines these NTOs as service providers who “use their knowledge and their experience as well as their reputation in online sales in order to also offer travel products” (Moller 2008, p. 33). It is often noted that the history of the Internet as a sales channel in Germany began in the year 2000. Expedia, which was founded as a Microsoft spin-off in 1995, is connected with this date. Expedia launched its first German website in 1999, and this is often considered the beginning of internet sales in the German tourism industry. At the same time, in the USA, Priceline (founded in 1997) was launched as a pure product portal (hotel portal) and bought the Dutch booking.com in 2005, which until then had only functioned as a meta-search engine and is now the leading website in the B2C market worldwide. Somewhat later, but still with American roots, Airbnb – which introduced the sharing economy to the tourism industry with home-sharing, a new concept, and a sophisticated system platform – was founded in 2008. Until 2016, the largest German online intermediary had been the Unister group in Leipzig, which owned portals like ab-in-den-urlaub.de, fluege.de, urlaubstours.de, reisen.de, and many other generic URLs. Originally developed by students as an online swap meet in 2002, it has been very successful on the market since 2005. 
The second largest and very successful German portal is holidaycheck.de, which was started as a review portal in 2003; in 2006 the portal was purchased by Burda Media Group and converted into a very successful Online Travel Agency. A similar model was followed in 2013 by the Munich media company ProSiebenSat1, which through its subsidiary 7travel has several touristic portals like weg.de (founded in 2005 by Ströer Medien; portals like ferien.de, weg.at and payback-reisen.de also belong to the portfolio of Ströer Medien). In addition to weg.de, the 7travel media company also represents wetter.de, mydays, and tropo, a dynamic packaging tour operator (Dörnberg et al. 2017). In the fast-paced online market, it should be noted that the dynamic packaging tour operator tropo was purchased by dnata in 2019 and that weg.de now belongs to lastminute.com.
Packaged Tours Become Dynamic
The packaging of tourism services based on customer preferences in a dynamic way has been made possible by digitalization. Figure 4 shows an example of the infrastructure of a service provider that enables dynamic packaging in real time. Dynamic packaging tour operators (like tropo or LMX) have changed the model of package tours dramatically through their reliance on real-time information. Dynamic packaging tour operators (German: X-Veranstalter) have no fixed (i.e., prepurchased) contingents and therefore no own catalog as such but have the ability to access the databases of supply components to search for customer requests and
packages in real time. Only when a customer makes a request does the electronic booking system begin the process of searching for services and products. Of course, there is a pre-searching of all sources to list all possibilities. But when the customer enters the booking path on the provider’s website, the various systems are queried in the background (hence familiar messages like “We are currently querying 500 airlines for current fares and the availability of your request”), and the availability of the different services (flight, hotel, etc.) is checked. In this case, a multitude of booking possibilities is queried and compared. For example, flights might be queried directly at the airlines, but the offers of rival tour operators might also be checked. The hotel offers with the best value for money are also researched in the same way. Once the electronic offers for all the desired services are available, the system bundles them together into a package tour. Services and prices are therefore not negotiated in advance but rather updated daily via the “electronic marketplace.” It is precisely these dynamics in the discovery and pricing of offers that the term “dynamic packaging” is intended to underscore. The advantage of dynamic packaging lies in the up-to-date prices and the usually more diverse range of offers. However, strong price fluctuations can also arise within minutes, and an offer may also be booked out, sometimes even in between the time it takes to search for and finalize the booking. Changes and reservations are usually not possible, and the cancellation fees are generally higher than for package tours that have been put together in advance. Nowadays, almost all traditional tour operators also have an “X” subsidiary – illustrative examples are X-TUI, X-Neckermann, X1-2-FLY, X-FTI, X-DERTOUR, as well as ITS Indi and Tjaereborg Indi. Purely dynamic packaging tour operators like Glauch, V-Tours, and LMX also exist. As we have seen in the example in Fig. 
4, the core of the infrastructure is the “DynaPack bridge” in the middle of the system. This shows that dynamic packaging has now become a central element in the business model of online travel agencies. This allows two key market advantages to be realized: the reduction of transaction costs and an infinite number of possible combinations of product components (and therefore greater individualization). Today, dynamic tour operators are no longer an innovation but rather an everyday phenomenon. This also includes the highly varied models and possibilities of packaging transportation options with accommodation. Apart from the traditional airline companies, low-cost carriers and no-frills airlines have entered the market with highly competitive prices and have been able to capture large sections of the market through price-driven business models. Aircraft leasing (ACMI: Aircraft, Crew, Maintenance, and Insurance) is used not only to overcome aircraft shortages but can also lead to “virtual airlines” that have outsourced the complete risk of value creation to an operating company and themselves only deal with sales and marketing (for a definition, see Luftfahrt Bundesamt 2016). Conducting sales with low transaction costs makes it necessary to sell directly to the end customer via one’s own website. This model has been mainly used by the Irish low-cost carrier Ryanair, which has grown to become the second largest airline in Europe. Airlines worldwide are heavily dependent on the GDSs, as they have built their complete operation and sales processes using these systems. A repeated goal of the airlines
has been to develop sales via their own distribution channels more cheaply and independently and without intermediaries.
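The dynamic packaging process described in this section can be sketched in a few lines. The sketch assumes made-up supplier names and prices, and it simplifies “best value for money” to “cheapest available offer”; a real booking engine would apply the rules of tour operators and airlines and query live systems rather than a fixed list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Offer:
    source: str      # hypothetical supplier or system name
    kind: str        # "flight" or "hotel"
    price: float     # price reported at query time
    available: bool  # availability reported at query time


def cheapest(offers: List[Offer], kind: str) -> Optional[Offer]:
    """Return the cheapest available offer of one kind, or None if none is bookable."""
    candidates = [o for o in offers if o.kind == kind and o.available]
    return min(candidates, key=lambda o: o.price) if candidates else None


def build_package(offers: List[Offer]) -> Optional[dict]:
    """Bundle one flight and one hotel into a package at the moment of the request."""
    flight = cheapest(offers, "flight")
    hotel = cheapest(offers, "hotel")
    if flight is None or hotel is None:
        return None  # a component is booked out, so no package can be offered
    return {"flight": flight.source, "hotel": hotel.source,
            "total": flight.price + hotel.price}


# Made-up query results standing in for live responses from several systems.
live_offers = [
    Offer("airline_a", "flight", 180.0, True),
    Offer("airline_b", "flight", 150.0, False),  # cheaper, but booked out
    Offer("bed_bank_x", "hotel", 420.0, True),
    Offer("operator_y", "hotel", 390.0, True),
]
package = build_package(live_offers)
print(package)
```

Because nothing is prepurchased, the same request a few minutes later may return a different package or none at all, which corresponds to the price fluctuations and booked-out offers mentioned above.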
The Future
We live in an ever more connected world. A search initiated on a smartphone is easily continued on a desktop. Devices communicate with each other wirelessly when just recently cables were still necessary. On the basis of new data standards, the data-driven travel industry is concerned with the always recurring issues of disintermediation and direct sales. However, recent developments (robotics, AI, and big data) go far beyond the present issues of the industry and will play a decisive role in the future.
NDC: A New Standard
NDC (New Distribution Capability) is a project that was started by IATA (International Air Transport Association) in 2012 with the hope of completely redesigning airline retailing. “NDC will enable the travel industry to transform the way air products are retailed to corporations, leisure and business travelers, by addressing the industry’s current distribution limitations: product differentiation and time-to-market, access to full and rich air content and finally, transparent shopping experience” (IATA n.d.). The project is being widely discussed in the industry, as it seeks to achieve “the development and market adoption of a new, XML-based data transmission standard (NDC Standard)” (IATA n.d.). This involves developing a completely new standard based on XML (Extensible Markup Language) that will offer new integration possibilities for all actors and will expand the way that data and information can be displayed. Historically, the direct connectivity of all actors involved increased with every phase of digitalization (see Fig. 5). For the travel industry, this has meant that each phase of digitalization has represented new forms of process design and distribution. In IATA’s view, NDC is a way of meeting the current challenges of complex connectivity and data distribution. NDC is a data standard that includes all actors (from content suppliers to hotels, flights, car rental, Google Maps, reviews, tickets) and is able to distribute data to B2C devices via a mobile solution. “NDC offers an XML-compatible schema (XSD) for searching and booking travel services” (Bingemer 2019). NDC is built up over different levels, and currently 66 airlines are already certified. However, the full capability of the interface can only be developed using Web 4.0, which only a small number of airlines have implemented.
NDC makes language recognition, neural interfaces, a completely connected travel chain across different providers, a “make me a better offer” button, and different pricing options and bundles possible and much more. Currently, various providers have begun developing processes based on this standard by IATA. Whereas the workflow in travel marketing was until now mainly shaped by GDS functionality, NDC uses a
Fig. 5 Development of communication relationships in development phases of automation and digitalization (based on Bingemer 2019)
Direct Connect that allows actors to communicate directly with one another to shape the workflow. Figure 5 shows the different levels of connectivity; currently we are in Web 3.0, on the brink of Web 4.0.
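To make the idea of an XML-based offer message more concrete, the following minimal sketch parses a small XML document with Python’s standard library. The element and attribute names (`OfferResponse`, `Offer`, `Carrier`, `Price`, `Extra`), the carrier codes, and the prices are invented for illustration and are not taken from the IATA NDC schema. The sketch only shows how structured, machine-readable offers with differentiated bundles could be read by a distribution partner.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Invented message layout for illustration; this is NOT the IATA NDC schema.
MESSAGE = """
<OfferResponse>
  <Offer id="o1">
    <Carrier>XX</Carrier>
    <Price currency="EUR">129.00</Price>
    <Extra>seat selection</Extra>
  </Offer>
  <Offer id="o2">
    <Carrier>YY</Carrier>
    <Price currency="EUR">149.00</Price>
    <Extra>checked bag</Extra>
    <Extra>lounge access</Extra>
  </Offer>
</OfferResponse>
"""


def parse_offers(xml_text: str) -> List[Dict]:
    """Turn the hypothetical XML message into a list of plain dictionaries."""
    root = ET.fromstring(xml_text.strip())
    offers = []
    for offer in root.findall("Offer"):
        price = offer.find("Price")
        offers.append({
            "id": offer.get("id"),
            "carrier": offer.findtext("Carrier"),
            "price": float(price.text),
            "currency": price.get("currency"),
            "extras": [extra.text for extra in offer.findall("Extra")],
        })
    return offers


offers = parse_offers(MESSAGE)
for item in offers:
    print(item["id"], item["carrier"], item["price"], item["extras"])
```

The point of a shared schema is that every partner can rely on the same element names and structure, so that the airline’s own bundles and ancillary services arrive intact instead of being flattened into a fare and a schedule.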
Google Travel
Google is considered the starting point for most online travel bookings. The platform has brought together all services in the area of travel and, after many initiatives over the last few years, even went one step further towards an interconnected booking experience for the travel industry in May 2019 with hotel and flight searches. Google sees itself as a platform and has launched a new meta-project for its entry to travel searches and bookings with Google Travel. “Google has taken a huge next step, putting all the pieces together, by including flights, hotels, packages, and trip-planning tools on a dedicated website and in Google Search and Google Maps. Google’s foothold in travel just got even larger” (Schaal 2019). Richard Holden, Vice President of Product Management at Google Travel, has said that “Travel planning is complicated. The number of tools and amount of information you need to sift through when deciding where to go, where to stay, and what flight to take can be time-consuming and overwhelming. That’s why today, we’re simplifying
the way we help travelers plan trips with Google across devices” (Holden 2019). On this platform, Google is bringing the different parts of the value chain together (hotels, flights, trips), which keeps the search results accessible on all devices. On top of this, the information is accessible in Google Search and Google Maps: “one place for all of your trip details” (Holden 2019). With this, Google allows users to begin planning their trip in Google Maps or Google Search and “to pick it up where they left off when planning. . . ” (Schaal 2019). In this way, Google continues to connect travel products in its function as a metasearcher and increases usability for customers further. Google itself will not become a travel company. “For those waiting for Google to start doing transactions, beyond the limited number it does for hotel advertisers today, Google did not take that step. Instead, Google Travel continues with its metasearch model, referring users to advertisers for bookings, but now it has tied it all together in one place” (Schaal 2019).
Linked Data: Knowledge Graphs and Open Data In 2006, Tim Berners-Lee, Director of the World Wide Web Consortium (W3C), presented a new project that he called Semantic Web. In a TED Talk in 2009 (Berners-Lee 2009), he explained his concept of linked datasets and raised the way that data is presented online up to a whole new level. He explained that the Semantic Web isn’t just about putting data on the web but is about making links so that a person or machine can explore the web of data. The so-called linked data concept creates relations between data that allow greater flexibility in dealing with data for more complex answers. An example is a georeference (e.g., Berlin) that can be flexibly linked to different data: historical (Berlin was founded in the thirteenth century), political (Berlin is the capital of Germany), touristic (visit the Brandenburg Gate), and personal (I live in Berlin). Like the web of hypertext, the web of data is constructed with documents on the web. However, unlike the web of hypertext, where links are relationships anchors in hypertext documents written in HTML, for data they link [sic] between arbitrary things described by RDF (Resource Description Framework). The URIs identify any kind of object or concept. But for HTML or RDF, the same expectations apply to make the web grow. Linked data is essential to actually connect the semantic web. It is quite easy to do with a little thought, and becomes second nature. (Berners-Lee 2006)
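The Berlin example above can be sketched as a handful of subject-predicate-object statements, the basic shape of data described by RDF. This is a minimal illustration in plain Python rather than a real RDF toolkit, and every URI below is hypothetical (placed under the reserved example.org domain); the function names describe and follow are likewise introduced only for this sketch.

```python
# Each fact is a (subject, predicate, object) triple, as in RDF.
# All URIs use the reserved example.org domain and are hypothetical.
BERLIN = "http://example.org/place/Berlin"
FOUNDED_IN = "http://example.org/prop/foundedIn"
CAPITAL_OF = "http://example.org/prop/capitalOf"
HAS_ATTRACTION = "http://example.org/prop/hasAttraction"
LIVES_IN = "http://example.org/prop/livesIn"

TRIPLES = [
    (BERLIN, FOUNDED_IN, "13th century"),                                   # historical
    (BERLIN, CAPITAL_OF, "http://example.org/place/Germany"),               # political
    (BERLIN, HAS_ATTRACTION, "http://example.org/poi/BrandenburgGate"),     # touristic
    ("http://example.org/person/me", LIVES_IN, BERLIN),                     # personal
]


def describe(resource, triples):
    """Return every triple in which the resource is subject or object."""
    return [t for t in triples if t[0] == resource or t[2] == resource]


def follow(resource, predicate, triples):
    """Follow one kind of link outward from a resource."""
    return [o for s, p, o in triples if s == resource and p == predicate]


if __name__ == "__main__":
    for subject, predicate, obj in describe(BERLIN, TRIPLES):
        print(subject, predicate, obj)
    print(follow(BERLIN, HAS_ATTRACTION, TRIPLES))
```

Because the same identifier for Berlin appears in all four statements, a person or a machine can start from any one of them and reach the others, which is what exploring the web of data means in this context.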
Google developed its Knowledge Graph based on this new data structure that Berners-Lee presented in 2006. Searchmetrics describes Google’s Knowledge Graph as part of a Universal and Extended Search and therefore as an expansion of organic Google search results. These are displayed in a box on the SERPs (search engine results pages), in which results from different sources are combined graphically to new entities and thereby expand the range of search results (Searchmetrics n.d.). The Knowledge Graph is therefore a semantic database. Here the “entities” (objects or units of information) are placed in relation to each other, and the goal of the connection of the individual units is to deliver the perfect range of search
results. Google’s Knowledge Graph plays a big role in the context of search engine optimization (SEO) for providers. It is therefore interesting that the German National Tourist Board (Deutsche Zentrale für Tourismus, DZT) is pushing forwards the development of a tourism Knowledge Graph and the rapid implementation of ideally comprehensive content from different states and regions (destinet 2019). The goal of the DZT’s Open Data Strategy is to define an approach that would allow the free exchange of data. The current phase is still strongly shaped by “owned content” that can only be used for each company’s or destination’s own purposes, but in the future, the linked data, knowledge graph development, and Open Data projects would change exactly this: because more recent developments in the field of digital transformation, like augmented reality, live content, the Internet of Things, language assistance systems, and artificial intelligence require an unfettered data flow. Different digital assistants must be able to find, automatically read out, interpret and use sovereign content and third-party content. If that is not the case, if the systems ignore the content, then destinations and tourism providers lose competitivity in the areas of visibility and connectivity. (DestiNet 2019)
Here, the outdoor app Outdooractive serves as an example of the ideas and goals behind the current developments. Hartmut Wimmer (founder of Outdooractive) says that the purpose of Linked (and Open, i.e., available for everyone) Data is to create a data structure that links all objects within tourism. These objects could come from the most dissimilar of sources. The system puts all of the content from the different sources together and relays the best possible information back to the user. In Fig. 6, Wimmer (2018) gives an example of how this data could be combined on the basis of a Knowledge Graph structure in one destination. While proprietary data formed part of the business models in the “old” tourism industry until the end of the 2010s, in recent years more and more suppliers have realized that only data that is cooperatively used and combined can ensure the survival of all market participants. As a result, ICT is not only making data available to users in more interesting ways but also changing the business models of the providers. A lock-in effect is achieved only at the system level (e.g., using a particular system or app) but no longer through the data formats.
Fig. 6 Linked Open Data (with kind permission of Outdooractive, Wimmer 2018)
Tourism 4.0: Robotics, Smart Destinations, Education, and Sustainable Effects As in the first phases of automation and digitization, the debate about “disintermediation” and “re-intermediation” has been rekindled in the 2020s due to the use of artificial intelligence and robotics (Alexis 2017). Especially in the tourism industry, a service “by people for people” is often enthusiastically praised. The efficiency and cost-reduction potential of digital technologies is undisputed, yet there are many critics on both the provider and demand side of the dehumanization of the industry. It is interesting to note, however, that many users do not recognize AI as such (e.g., an “assistant” in a chat on a website) and may be very satisfied with the support option. Many services are therefore already digitally optimized without users perceiving the kind of dehumanization that is postulated by critics. In addition to the interesting cost and efficiency effects, the World Tourism Organization (2018) advocates that the digital transformation of the travel industry – through the use of technologies such as the “Internet of Things,” location-based services, artificial intelligence, augmented and virtual reality, and blockchain technology – will lead to a travel supply that is more attractive, efficient, inclusive, and economically, socially, and environmentally sustainable. At the same time, challenges such as seasonality or overcrowding can be overcome in a technically feasible way. From the UNWTO’s perspective, the digitization of the industry contributes to a positive environmental impact that can be further enhanced through innovative technologies, efficient resource use and management, and smart destination applications. In the future, a major impact on the tourism sector might be seen in smart technology development throughout the whole value chain. As the UNWTO, World Economic Forum, and OECD have pointed out, this transformation is leading to an extreme change in the form of organization of companies and skills required to work within the tourism industry, a change that has not yet been adequately implemented in education and training. Market-relevant companies of the twenty-first century are developing a “digital culture” that, when implemented purposefully, drives sustainable action and high value for all stakeholders (World Tourism Organization (UNWTO) n.d.; World Economic Forum (WEF) 2021; Schleicher n.d.). At the same time, there are significant implications
for the skill requirements of jobs in tourism as well as cultural implications in terms of global cooperation. According to Andreas Schleicher (n.d.) of the OECD Education Directorate, the education policy of the twenty-first century should focus on critical thinking, creativity, collaboration, and communication, all of which need to be integrated with systems-related skills in tourism education and training. One of the most widely discussed futuristic projects is the “FlyZoo Hotel” of Alibaba Group Holding Limited (BABA-N). The Alibaba Group opened the 290-room hotel in 2019 as a pilot project and as a platform from which to offer new technologies to the hotel industry. Beyond this, customer acceptance is also to be tested here. Cadell (2019) reports on the project in a Reuters article, quoting Andy Wang (CEO, Alibaba Future Hotel Management), who explains in detail that the robots are designed to reduce the enormous labor costs in the hotel industry and that this “eliminates the need for guests to interact with other people.” Alibaba sees the advantage of the robots in their efficiency and consistency in always providing the same service without being subject to mood swings. On the one hand, the FlyZoo Hotel serves as a platform for developing AI and other high-tech knowledge, with which Alibaba wants to open further business areas. On the other hand, the hotel is also a research project that seeks to understand how guests and robots interact. Alibaba states that it will build further hotels but that these will primarily serve group employees on their travels and will therefore be built in Beijing and Shanghai (Cadell 2019). This seems to confirm the statement by Papathanassis Alexis that the future is already here today and that extreme cost savings are already being implemented through the use of Service Robots. A change in the type of market players in recent years is also becoming evident.
Large companies – especially in China, like Trip.com and Alibaba – are influencing the direction of developments in the coming years with their futuristic projects.
Between Digital Detox, Slow Travel, Resonance Tourism, and VR Mass tourism has become an ecological and societal problem. While the tourism industry is continually able to declare astounding growth rates worldwide, problems like climate catastrophes, trash, child prostitution, overtourism, exploitation, and the dangers posed by terrorism are ever increasing. Many people are looking for new forms of traveling and are deciding not to travel by airplane, to go offline, to travel slowly and long term, and to seek real experiences. Especially in 2019, the discussion surrounding these phenomena was brought center stage worldwide by Greta Thunberg. In addition to avoiding flight shame, digital detox and enjoying one’s holidays without the Internet (a large campaign was started in 2015 by Switzerland) are becoming more and more relevant. Movements like “Slow Travel,” “de-touristification,” and “hyperlocal” are being presented as tourism trends by the German Zukunftsinstitut (Future Institute) (ZukunftsInstitut n.d.). “Resonance Tourism” is designed to appeal to traditional values like hospitality and quality of life, and habitat management has gone so far
as to stop speaking of tourists but rather of people who use a space in a city or other area together at a particular time. On top of this, Interactive Traveling can be seen either as travel preparation or as an experience in itself that can be had without leaving one’s own couch. In our ever more connected and mobile world, in which people are also tourists in their day-to-day lives, the focus will increasingly be on spaces of arriving rather than fleeting experiences.
Conclusion and Outlook The success story of the package tour in Germany began even before World War II, with the idea of making vacation travel possible for low-income groups. Immediately after the war, the industrialization of travel began in the buildup phase with developments that can be seen in all industries at the time: standardization, assembly, and series production (Enzensberger 1964). State transportation monopolies played a key role in the industrialization of travel. Additionally, the German travel market has been an outgoing market from the beginning and was thus very much driven by the airline industry. Airlines, however, have always had strong cost and efficiency pressures due to their high production costs, leading to a high propensity to innovate. As Weiber (2008) points out, the fundamental changes in the economy brought about by ICT were initially made possible by the basic innovations that Soviet economist Nikolai Kondratiev describes in his theory of long waves of economic development (see also Nefiodow and Nefiodow 2014; Note 5). The second, third, and fourth “Kondratiev cycles” show the beginnings of mass tourism, including railroad developments, electric power, cars, and air travel. The fifth Kondratiev cycle gets its impetus from computer-based information technology and is based on information and communication technology (Brözel 2012). These technical developments were taken up by the airlines at the beginning of automation, as they allowed them to work efficiently and with the highest possible cost-saving potential. Under the leadership of Lufthansa, Deutsche Bahn, and several travel providers, the first digital reservation systems in Germany were created. Driven by the potential reduction of transaction costs (which are particularly high in the complex travel industry), many kinds of system providers developed along the tourism value chain.
The possibilities of using ICT to offer products and services efficiently and flexibly across all parts of the value chain and to distribute and process them directly to customers were further developed. For a very long time, cost pressures and the possibility of reducing transaction costs through ICT have been driving the digitization of the travel industry. However, with the development of the market from a seller’s to a buyer’s market and an increased focus on sustainability and thus climate protection, factors like social compatibility, justice, individualization, and flexibility will play an increasingly important role in the twenty-first century. Blockchain technology
promises new institutional solutions for the problems of market coordination, for which transaction costs are usually incurred (Mukkamal et al. 2018). The issue of coordination – especially in the globally active tourism industry, which has a value chain that involves many service providers – repeatedly raises the question of how to increase efficiency and lower transaction costs and the question of whether this is best and most securely handled directly or via an intermediary. The question of the advantages and disadvantages of intermediation has never lost its relevance despite the digitalization of many processes in the travel industry.
Interview Partners
Alexander von Koslowski Alexander von Koslowski was one of the founders of the first specialized travel agents for globetrotters, Travel Overland. He is a specialist for individual travelers and especially for air travel. Travel Overland has been online with its own booking platform since 1996 and is therefore one of the pioneers in Germany. Koslowski has assumed diverse management positions within the European travel industry, with a specialization in sales and marketing. From 2001 to 2005, he was Senior Vice President Supplier Relations at Travelocity Europe, where he was responsible for Travelocity Europe, Travel Overland, Travelchannel.de, Flug.de, and Travelocity.co.uk. He was also managing director of Travelchannel.de, Flug.de, and Traveloverland.de until 2008. In 2009 he transferred to the tourism industry and became Vice-President of DER Touristik Germany (dertour.de), Meiers Weltreisen (meiers-weltreisen.de), ADAC-Reisen.de, ITS.de, and Jahn Reisen (jahnreisen.de). As the Director of Distribution Tour Operator, he has been responsible for several digitalization projects for what would later become the REWE Group. Koslowski has been a member of the Technical Working Group ECTAA since 2001 and its chairman since 2018. Interview: 28 June 2019
Jürgen Büchy After finishing high school, Büchy completed vocational training to become an aviation salesman with Lufthansa and worked for the company for 25 years. His last position at Lufthansa was Senior Vice-President Germany for Domestic Sales and Marketing of the airline. This was followed by five years as CEO of START Amadeus, the forerunner of Amadeus Germany. After his switch to Deutsche Bahn at the end of 2000, Büchy was CEO of DB Vertrieb GmbH, where he was responsible for sales and revenue accounting of DB passenger services. After being elected President of the German Travel Association (Deutscher Reiseverband, DRV) in 2010, he left Deutsche Bahn and founded the Travel Consulting Group together with several colleagues in the industry. Interview: 19 August 2019
Uta Martens Since May 2018, Uta Martens has been Sales Director Retail Global Accounts at Amadeus Germany, where she has also been Managing Director since 1 January 2017. Prior to this, she headed the Amadeus Sales and Marketing Team in Germany. After training as a travel agent, Uta Martens began her career at Amadeus in 1995 in the sales team for Hamburg and Schleswig-Holstein. In January 2005 she became regional manager for the northeast, with two regional offices in Hamburg and Berlin. Four years later she also became the German Sales Manager for Travel Agency Sales. She completed her vocational training as a travel agent between 1989 and 1992 at EuroLloyd Reisebüro GmbH in Cologne, where
she subsequently worked in the corporate services division as well as at Cologne Airport. Interview: 20 August 2019
Notes
1. The year 2019 saw the highest percentage of foreign travel by Germans. In March 2020, German travel behavior changed fundamentally due to the global pandemic. Unsurprisingly, the 51st Travel Analysis in 2021 showed a foreign travel share of only 55% and thus a sharp drop in relation to previous years (Forschungsgemeinschaft Urlaub und Reisen (FUR) 2021). At the time of writing, it is not yet possible to make long-term predictions about the consequences of the pandemic.
2. The German travel law, which was incorporated into the Civil Code in 1979, contains extensive obligations on tour operators in relation to package tour agreements with customers. Not only are promised product attributes enforceable, but a right to return from a given destination under certain circumstances as well as the on-site liability of service providers give package tour travelers comprehensive protections.
3. One characteristic of digital networks is the so-called lock-in effect. This effect occurs when users are prepared to remain in their current system (perhaps even at a higher cost) because the benefits of switching are considered to be lower, for example, because they may have to learn a new user interface (Hinterholzer and Jooss 2013).
4. This chapter was written in 2019. Until 2019, the statement that the German tourism market was strongly characterized by outgoing travel was certainly correct. However, at the beginning of 2020, the world experienced the coronavirus pandemic, which has led to a strong change in travel behavior. Complete lockdowns (i.e., no travel movements) and a strong increase in domestic travel due to the uncertainties of the pandemic situation can only be briefly mentioned here as a factor that has significantly impacted the tourism industry.
5. The Soviet economist Nikolai Kondratiev developed a theory of cyclical economic development – the theory of long waves. The starting point for these long waves is paradigm shifts and the associated investments in innovations.
Cross-References
Acceptance and Adoption of eTourism Technologies
E-Business Models in Tourism
e-Tourism: An Informatics Perspective
Strategic Use of Information Technologies in Tourism: A Review and Critique
References Alexis P (2017) R-Tourism: introducing the potential impact of robotics and service automation in tourism. Ovidius Univ Ann Ser Econ Sci 17(1):211–216 Berger P et al (1990) Technikleitbilder und Büroarbeit. Zwischen Werkzeugsperspektive und globalen Vernetzungen (Technique models and office work: Between toolkit perspectives and global connections). Westdeutscher Verlag, Wiesbaden Berners-Lee T (2006) Linked data. https://www.w3.org/DesignIssues/LinkedData.html. Accessed 8 Sept 2019
Berners-Lee T (2009) The next web. Presentation at ted.com, Feb 2019. https://www.ted.com/talks/ tim_berners_lee_on_the_next_web?language=en. Accessed 7 Sept 2019 Bingemer S (2019) Wie NDC den Verkauf von Flugtickets verändert (How NDC is changing flight ticket sales). Presentation at DRV Distribution Day, St. Martin Tower, Frankfurt am Main, 18 June 2019 Brehm G (1986) Verwaltungsvereinfachung mit START und DERDATA (Simplifying administration with START and DERDATA). Presentation at START Conference, 1986 Brözel C (2012) Perspektiven von Transaktionen in der Internet-Ökonomie am Beispiel der Reisebranche (Perspectives on transactions in the internet economy: the example of the travel industry). Dissertation, TU Dresden. Available via https://nbn-resolving.org/urn:nbn:de:bsz:14qucosa-86522. Accessed 03 Aug 2021 Brützel C (2016) GDS-Systeme sind Fluch und Segen zugleich (GDS systems are a blessing and a curse). Airliners.de, 3 Feb 2016. http://www.airliners.de/gds-systeme-fluch-segen-aviationmanagement/37753. Accessed 29 Aug 2019 Brützel C (2017) Profitabilität in der Wertschöpfungskette Luftverkehr (Profitability in the aviation value chain). Airliners.de, 29 Mar 2017. https://www.airliners.de/profitabilitaetwertschoepfungskette-luftverkehr-aviation-management/41052. Accessed 4 Sept 2019 Buhalis D (1998) Strategic use of information technologies in the tourism industry. Tour Manag 19(5):409–421 Cadell C (2019) At Alibaba’s futuristic hotel, robots deliver towels and mix cocktails. Reuters.com, 22 Jan 2019. https://www.reuters.com/article/us-alibaba-hotels-robots/at-alibabas-futuristichotel-robots-deliver-towels-and-mix-cocktails-idUSKCN1PG21W. Accessed 8 Sept 2019 Coase R (1937) The nature of the firm. Economica 4(16):386–405 Chehimi N (2014) The social web in the hotel industry: the impact of the social web on the information process of German hotel guests. 
Springer Gabler, Wiesbaden destinet (2019) Was die DZT mit dem Open Data Projekt erreichen will (What DZT is seeking to achieve with the Open Data Project). destinet.de, 24 July 2019. https://www.destinet.de/meldungen/menschenmanagement/etourismus-online-marketing/7324-was-die-dzt-mit-dem-open-data-projekt-erreichen-will. Accessed 8 Sept 2019 Directive (EU) 2015/2302 – package travel and linked travel arrangements. Available via EUR-Lex. https://eur-lex.europa.eu/legal-content/EN/LSU/?uri=uriserv:OJ.L_.2015.326.01.0001.01.DEU. Accessed 20 Aug 2019 Dörnberg A et al (2017) Reiseveranstalter- und Reisevertriebs-Management: Funktionen – Strukturen – Prozesse (Tour operator and travel sales management: functions, structures, processes), 2nd edn. De Gruyter Oldenbourg, Berlin Enzensberger HM (1964) Einzelheiten I: Bewußtseins-Industrie. Suhrkamp, Frankfurt Goff L (1999) SABRE takes off. CNN.com, 29 June 1999. http://edition.cnn.com/TECH/computing/9906/29/1960.idg/. Accessed 20 Aug 2019 Freyer W (2008) Tourismus. Einführung in die Fremdenverkehrsökonomie (Tourism. An introduction to the tourism economy), 8th edn. Oldenbourg Wissenschaftsverlag, München Freyer W (2015) Tourismus. Einführung in die Fremdenverkehrsökonomie (Tourism. An introduction to the tourism economy), 11th edn. Walter de Gruyter, Berlin Forschungsgemeinschaft Urlaub und Reisen (FUR) (2021) Erste ausgewählte Ergebnisse der 51. Reiseanalyse 2021 (First selected results of the 51st travel analysis). FUR, https://reiseanalyse.de/erste-ergebnisse/. Accessed 6 Aug 2021 FVW Medien GmbH (2018) Anzahl der Reisebüros in Deutschland von 2002 bis 2017 (Number of travel agencies in Germany between 2002 and 2017). Statista, https://de.statista.com/statistik/daten/studie/252715/umfrage/anzahl-der-deutschen-reisebueros/. Accessed 20 Aug 2019 Hinterholzer T, Jooss M (2013) Social Media Marketing und -Management im Tourismus.
Springer, Heidelberg Holden R (2019) There’s an easier way to plan and organize your trips – here’s how. Google Keyword, 14 May 2019. https://www.blog.google/products/flights-hotels/planning-trip-googlecan-help/. Accessed 7 Sept 2019
IATA (n.d.) New Distribution Capability. IATA.org, https://www.iata.org/whatwedo/airlinedistribution/ndc/Pages/default.aspx. Accessed 29 Aug 2019 IATA (2013) Profitability and the air transport value chain. IATA economics briefing no. 10. https://www.iata.org/en/iata-repository/publications/economic-reports/profitabilityand-the-air-transport-value-chain/. Accessed 6 Aug 2021 Krippendorf J, Kramer B, Müller H (1989) Freizeit und Tourismus – Eine Einführung in Theorie und Politik. Berner Studien zum Fremdenverkehr, Nr. 22, Wittwer-Verlag, Bern Luftfahrt Bundesamt (2016) Merkblatt des Luftfahrt-Bundesamtes zu Leasing und Code-Share (Bulletin of the Federal Aviation Office on leasing and code sharing), v. 1.5, 30 Aug 2016. http:// www.lba.de/SharedDocs/Downloads/DE/Formulare/B1/B11_Genehmigungen/Merkblaetter_ Info/Merkblatt_Leasing.pdf.pdf?__blob=publicationFile&v=1. Accessed 7 Sept 2019 Mertens P (1985) Zwischenbetriebliche Integration der EDV (Inter-company integration of IT). Information Spektrum 8(2):81–90 Möller C (2008) Reisevertriebsmarkt: Begriffe und Strukturen (The tourism sales market: concepts and structures). In: Freyer W, Pompl W (eds) Reisebüro-Management: Gestaltung der Vertriebsstrukturen (Travel agency management: sales structure design). Oldenbourg, Munich, pp 3–34 Mukkamal RR, Pradeep R, Vatrapu R (2018) Blockchain for social business: principles and applications. IEEE Eng Manag Rev https://doi.org/10.1109/EMR.2018.2881149 Nefiodow L, Nefiodow S (2014) Kondratieff cycles (Trans: O’Meara E) https://www.kondratieff. net/kondratieffcycles. Accessed 10 Aug 2021 OpenTravel Alliance (n.d.) About OpenTravel. https://opentravel.org. Accessed 29 Aug 2019 OTDS (n.d.) Open travel data standard: https://www.otds.de/en. Accessed 29 Aug 2019 Picot A, Dietl H (1990) Transaktionskostentheorie. Wirtschaftswissenschaftliches Studium 19(4):178–184 Poon A (2003) Competitive strategies for a ‘new tourism.’ In: Cooper C (ed) Classic reviews in tourism. 
Channel View Publications, Clevedon, pp 130–142 SABRE (2017) The Sabre story. A chance meeting on an airline flight that turned into the technology leader for the travel industry. https://www.sabre.com/files/Sabre-History-rev2017. pdf. Accessed 26 Aug 2019 Schaal D (2019) Google Travel is now one step closer to one-stop-shopping. Skift, 14 May 2019. https://skift.com/2019/05/14/google-travel-looks-more-like-an-online-travelagency-by-putting-all-the-pieces-together/. Accessed 7 Sept 2019 Schmidt H (n.d.) Plattform-Ökonomie. https://www.netzoekonom.de/plattform-oekonomie/. Accessed 17 Aug 2021 Schleicher A (n.d.) The case for 21st-century learning. https://www.oecd.org/general/ thecasefor21st-centurylearning.htm. Accessed 17 Aug 2021 Schneider O, Sülberg W (2013) Die Ferien-Macher: Eine Branche macht Urlaub (The holidaymakers: an industry goes on holiday). Frankfurter Allgemeine Buch, Frankfurt am Main Schreyögg G (2003) Organisation: Grundlagen moderner Organisationsgestaltung (Organization: fundamentals of modern organizational design), 4th edn. Gabler, Wiesbaden Searchmetrics (n.d.) Google knowledge graph. https://www.searchmetrics.com/de/glossar/ knowledge-graph/. Accessed 7 Sept 2019 Sülberg W (2008) Entwicklungsgeschichte und Marktstrukturen des Reisebürovertriebs in Deutschland (Historical development and market structures of travel agency sales in Germany). In: Freyer W, Pompl W (eds) Reisebüro-Management: Gestaltung der Vertriebsstrukturen (Travel agency management: sales structure design). Oldenbourg, Munich, pp 35–80 Weihe H (1979) Wettbewerbspolitische Konsequenzen der Massierung von Datenspeicher- und Datenverarbeitungskapazitäten – Dargestellt am Beispiel des Reservierungs- und Informationssystems für Reisebüros START (Consequences of the concentration of data storage and processing capacities on competition policies: the example of the reservation and information system for travel agencies START). 
In: Hansen H et al (eds) Mensch und Computer (Humans and computers). Oldenbourg, Munich, pp 207–223
Weiber R (2008) Empirische Gesetze der Netzwerkökonomie: Auswirkungen von IT-Innovationen auf den ökonomischen Handlungsrahmen. Nachlese für Teilnehmer am Hochschulforum PHW 2.10.2008. https://www.uni-trier.de/fileadmin/forschung/CEB/ceb_Start/aktuelles/Nachrichtenarchiv/RingvorlesungWEIBER.pdf. Accessed 10 Aug 2021 Werthner H, Ricci F (2004) E-commerce and tourism. Commun ACM Blogosphere 47(12):101–105. https://doi.org/10.1145/1035134.1035141. Accessed 27 Aug 2019 Wimmer H (2018) Auf dem Weg zum Graphen (On the way to the graph). Blog entry, 16 Aug 2018. https://corporate.outdooractive.com/oa-blog/auf-dem-weg-zum-graphen/. Accessed 7 Sept 2019 World Tourism Organization (UNWTO) (2018) Tourism highlights, 2018 edition. UNWTO, Madrid. https://doi.org/10.18111/9789284419876. Accessed 27 Aug 2019 World Tourism Organization (UNWTO) (n.d.) Digital transformation. https://www.unwto.org/digital-transformation. Accessed 17 Aug 2021 World Economic Forum (WEF) (2021) Digital culture: the driving force of digital transformation. http://www3.weforum.org/docs/WEF_Digital_Culture_Guidebook_2021.pdf. Accessed 17 Aug 2021 Yılmaz Y, Bititci U (2006) Performance measurement in tourism: a value chain model. Int J Contemp Hosp Manag 18(4):341–349 ZNT Richter (n.d.) Dynapack. https://www.znt-richter.com/en/products/Booking-systemsinterfaces-for-the-tourist-industry/dynapack. Accessed 28 Aug 2019 ZukunftsInstitut (n.d.) Dossier Tourismus. https://www.zukunftsinstitut.de/dossier/dossiertourismus/. Accessed 29 Aug 2019
8
Digitalization and the Transformation of Tourism Economics Luis Moreno-Izquierdo, Ana B. Ramón-Rodríguez, and Adrián Más-Ferrando
Contents
Introduction
Research on the Tourism Economy and Its Evolution in Face of E-Tourism
A New Tourist Demand Based on Information
Digitalization for Competitive Improvement of the Tourism Offer
Implications of the Digital Economy in the Relationship Between Tourism Supply and Demand
The Tourism Labor Market in the Face of the Challenge of Automation
Destinations and the Need for Open Data: Intelligence, Accessibility, and Sustainability
Conclusions
Cross-References
References
Abstract In the current technological paradigm, the functioning of the tourism sector cannot be explained without the process of digitalization. This process takes on special relevance in tourism economics for two reasons: on the one hand, tourism is a sector that is highly dependent on and sensitive to innovation, especially on the demand side; on the other, it is a sector with little innovative capacity due to its business structure. This changing reality requires the reconsideration and even the reinterpretation of some principles of tourism economics. In this chapter, starting from a comprehensive literature review, five of these principles are analyzed: the development of a new demand, changes in the value chain of tourist supply, pricing with almost perfect information, the labor market in the tourism sector and the challenge of automation, and finally the evolution of destinations in terms of intelligence, sustainability, and accessibility. This is intended to shed light on the transformation that the tourism sector is currently experiencing in digital terms and to anticipate the change that new disruptive technologies, especially artificial intelligence, will continue to create.
L. Moreno-Izquierdo · A. Más-Ferrando Economics of Innovation and AI Research Group, University of Alicante, Alicante, Spain
A. B. Ramón-Rodríguez Department of Applied Economic Analysis, University of Alicante, Alicante, Spain
© Springer Nature Switzerland AG 2022 Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_139
Keywords Tourism economics · Digital economy · Digitalization · Supply · Demand
Introduction
The global economic relevance of the tourism sector is unquestionable. With more than 1.3 billion international trips per year and a unique potential to promote international direct investment, tourism has become the most value-generating service industry and one of the five most important commercial activities, counting both goods and services. In recent decades, the growth of the tourism sector has been supported by and has benefited from digital development (Hojeghan and Esfangareh 2011). The rapid adoption of digital technologies by companies, users, and destinations has changed the structure and configuration of the tourism market, and also the way in which it is analyzed. The digital economy, as in other industries, has modified established patterns of the tourism economy in terms of supply, demand, the role of governments, employment, sustainability, and accessibility, among many other areas of study. In the digital age, every sector is compelled to renew and innovate in order to promote knowledge as the main factor of competitiveness and productivity. In the case of tourism, this can be seen in e-commerce, changes in demand preferences since the 1990s, or the use of artificial intelligence (AI) algorithms for user segmentation or pricing, among many other issues. According to Baggio and Del Chiappa (2013), the incorporation of the digital economy in destinations has led to a differentiation between physical or real components (companies and tourist assets of destinations) and virtual components (a digital representation or extension of the physical ones). However, both must work in an integrated way to give coherence to the tourist offer and achieve higher levels of competitiveness. Generally, this integration is based on the processing of new sources of information and data that did not exist until now.
These changes in the composition of the sector also give rise to new lines of research of the tourism economy related to both the importance of data and the rapid advance of technologies. This phenomenon, which also occurs in many other industries (Agrawal et al. 2019a), may even generate a renewal in foundations of the tourism economy, based on issues such as access to almost perfect information on supply and demand, or a new law of nondecreasing returns based on knowledge and innovation.
At a more applied level, the adoption of the digital economy is a unique challenge in the tourism sector for two reasons: firstly, the sector has less innovative capacity than others, as Cooper (2006) points out, due to its business structure and the preparation of its human resources; and secondly, the tourism industry has shown extreme sensitivity to the incorporation of new technologies, quickly adapting the industry's value chain to new trends (Sigala 2018). For Gretzel et al. (2015), all these changes are giving rise to intelligent tourism and a new stage of e-tourism that is supported entirely by data and their processing (collection, exchange, processing, and analysis). Therefore, in this new information age, algorithm management (user segmentation, demand capture, price management, etc.) is more important for achieving competitive advantages than decades of experience in the sector. The emergence of the platform economy is a clear example of how technology-based companies have the greatest capacity to adapt and to reinvent the tourism sector itself. In short, economic sectors, but also the principles of economic theory and its traditional methods of analysis, are being altered by the impact of the digital economy. This occurs fundamentally through the development of AI and the introduction of data as one more input in the production function (Varian 2019), with effects on economic growth, employment and inequality, economic regulation, and their micro- and macroeconomic consequences. The aim of this chapter is to analyze this effect not only on the tourism sector but also on the evolution of research into the economics of tourism, including the issues dealt with, the models applied, and even the way information is obtained.
To this end, changes caused by the so-called disruptive digital technologies (Moreno-Izquierdo and Pedreño-Muñoz 2020) will be taken into consideration, which include advances as relevant as AI, the Internet of things, blockchain, big data, quantum computing, robotics, virtual reality, and 3D printing. This chapter begins with an analysis of the evolution of the tourism economy from the point of view of applied economic analysis, indicating its tendency and need to deepen the analysis of the sector in light of this technological disruption faced by tourism. Next, there will be a reflection and evaluation based on the development of the digital economy of the tourist demand, the changes in the value chain, the new generation of jobs in the sector, and, finally, the emergence of new tourist destinations.
Research on the Tourism Economy and Its Evolution in Face of E-Tourism
Tourism is a multidisciplinary field of study (Leiper 1981), and the economics of tourism emerged within it as a specific focus of study, with Gray (1966) and Gerakis (1965) as pioneers. There has always been a debate about the nature of its object, the role that the tourism economy plays in the study of tourism, and the impact of research into tourism economics on the development of the economic discipline itself (Tisdell 2000).
In short, we can say that the tourism economy is responsible for the analysis of the economic side of the tourism system, the functioning of the tourism market, and in general the economic problems associated with those activities that supply tourism demand. Thus, Capó et al. (2006) point out that the tourism economy deals with the application of economic principles and techniques of economic analysis to the tourism industry, which is considered as a set of activities whose main objective is the satisfaction of tourism demand. As pointed out in Song et al. (2012), the tourism economy has made substantial developments and has contributed significantly to the creation of knowledge in the broad field of tourism. We can distinguish in this half-century of research a certain evolution in the economic analysis of tourism. In a first stage, economic analyses applied to tourism were considered to suffer from a weak conceptual framework owing to a lack of appropriate research (Sessa 1984) and to be methodologically unsophisticated (Pearce and Butler 1993), which had negative consequences for management and policy development. Several authors (Aislabie et al. 1988; Stabler and Sinclair 1997; Tisdell 2000; Witt and Moutinho 1994, among others) agreed in highlighting the lack of definition and validity of tourism statistics as the most notable reason that prevented a more rigorous and in-depth economic contribution to the study of the tourism sector. However, Stabler and Sinclair (1997) go a step further and demonstrate the ability of tourism to reinforce theoretical foundations and economic principles to explain and predict tourism phenomena. A second stage includes more recent studies, for which Dwyer et al. (2010) or Tribe and Xiao (2011) highlight the maturity of the tourism economy as a field of study comprising large bodies of knowledge and theoretical foundations.
The tourism economy is developing rapidly, and its relationship with other disciplines is becoming more relevant (Wanhill 2011). As tourism activity becomes global and more countries and destinations become part of the tourism market, on both the supply and the demand side, research in tourism, and more specifically in tourism economics, is also experiencing exponential growth. In this second phase, tourism economics as a discipline has advanced, particularly in Asia, and all economic concepts and forms of economic analysis have been applied (Stabler et al. 2010). However, despite the fact that the application of IT to tourism has accelerated enormously since the 1990s, in 2009 the authors themselves still considered the economic study of e-tourism, or the process of digitalization of the tourism system from an economics point of view, to be incipient. There is currently a third phase in the evolution of economic analysis applied to tourism, in which a large number of studies analyze the digitalization of the tourism sector from an economics point of view. This is because aspects such as tourism demand and supply, the industrial organization of the sector, the role of governments, the evaluation of the economic impact of tourism, or issues related to environmental economics (Stabler and Sinclair 1997) were disruptively transformed, first with the emergence of ICT and the Internet and then with their evolution to AI and the big data economy. Digitalization processes have also changed the value chain of both the tourism sector and the public sector itself, and technology has affected the entire commercial evolution of tourism, from CRSs to yield management and now machine learning, giving way to new pricing strategies. Even concepts as well established as the sustainability principles of our destinations are today affected by the digital economy and its impact on the sector. For instance, new data sources and the sensorization of cities allow us to observe the problem of massification of some tourist destinations or to generate accurate predictions of desertification, causing many authors to rethink how to model and define "tourism competitiveness" (Moreno-Izquierdo et al. 2018). In the new definitions, the well-being of residents takes on considerable relevance, but so does the digital commitment of destinations. This is due to the great dependence of tourism on information: data, especially real-time data, are crucial for decisions regarding tourism supply and demand (Sigala 2018), for ensuring the reconversion of the sector, and for making knowledge-based decisions. In sum, digitalization and its possibilities in terms of data generation and processing are causing an evolution in the tourism sector, and this translates into a renewal of the theory of tourism economics and the emergence of new lines of research.
A New Tourist Demand Based on Information
Within tourism economics, demand research has the longest tradition. During the last decades, there have been notable developments in terms of diversity of tourist interests, depth of theoretical foundations, and advances in research methods (Li et al. 2005). The digital economy has greatly influenced this research development, as technology has led to major changes in both the preferences of tourism demand and its volume, in practice producing a redistribution of tourism flows on a global scale (WTO 2018). The impact of the digital economy on tourism demand became apparent in the 1990s and the early 2000s. The "new" tourist had preferences far removed from those of traditional tourism and determined mostly by the explosive development of ICT, according to the definitions given by Poon (1993) and Mills and Law (2004). The greater access to information thanks to the Internet modifies behavior patterns in the choice of destination, the seeking of experiences, and the integration of the tourist in the community. But the greatest changes occur before the trip, with new ways of choosing a destination, accessing information, and proceeding to purchase. According to Shanker (2008), the emergence of the Internet has completely transformed the structure of the tourism industry, due to direct user access to information and to open tools for contracting services that for decades were only available to professionals. Access to these new sources of information differentiates the current stage from previous ones at a theoretical level, since, in the new demand models, we can assume almost perfect access to information on the part of buyers. Moreover, the digital divide in Internet access, according to Minghetti and Buhalis (2010), explains relevant differences in the choice and decision-making capacity of tourists. Tourists are no longer passive subjects but active participants in the decisions of other users (see Fig. 1), generating content that reflects previous experiences.
Fig. 1 Evolution in access to and creation of information. [Figure: three dimensions traced across successive stages. Access to information: unidirectional information (web to tourist); two-way information (social networks) and bidirectional information (online reputation); automated information processes. Tourist performance: patient/reader; voluntary interaction after service; automatic and real-time interaction. Tourist verification: without verification; personal verification (user/pass); blockchain and artificial intelligence.]
In this sense, different authors have pointed to changes in tourism decision-making based on social networks and access to images of destinations (Llodra-Riera et al. 2015), online reputation (Perles-Ribes et al. 2019), or comments and votes between users on online platforms (Teubner et al. 2017). On an applied basis, Schuckert et al. (2015) calculated an increase of up to 10% in users' willingness to pay for hotels and restaurants with a good review from other users. In the case of Airbnb apartments, reviews by previous tenants could make a difference of up to 30%, according to the study by Moreno-Izquierdo et al. (2019) on different destinations on the Mediterranean coast. In addition to information and comments from other users, cloud-based technologies, big data, and the Internet of things will increase the reliability and authenticity of the information available to demand (Calvaresi et al. 2019), leading tourism to a new degree of disintermediation between buyers and sellers (Sun et al. 2016). All this would move us toward a future scenario where optimal decision-making could even be automated by machine learning algorithms, with an almost perfect prediction of individual demand preferences based on previous decisions. At the academic level, these millions of user-generated data points can make a difference in future studies on demand in the tourism economy. For the first time, there is a massive amount of information that can either solve problems or support assumptions about tourists' preferences and decision-making in an applied way.
In addition, the implementation of new analysis techniques based on AI methods for better forecasting and prediction of demand, such as artificial neural networks (ANNs) (Dewangan and Chatterjee 2018), could indicate a trend toward the creation of new and relevant studies in tourism economics. This is why, from the point of view of academic research, the ownership of these data and their open access are issues that urgently need to be resolved in order to guarantee the advancement of science and studies on tourism demand. This access to information could lead to a greater knowledge of the tourist, especially in relation to the experience economy (Pine and Gilmore 1999), with real implications on
issues of sustainability, welfare, and competitiveness, from which demand, supply, and governments alike would benefit.
Digitalization for Competitive Improvement of the Tourism Offer
The digital transformation of the tourism offer has had an impact at three different levels: at the business level, at the industry level, and at the destination level. Access to more information and better analysis tools is enabling the optimization of value chain processes in almost any traditional sector. This translates into new strategies for communication between different agents (demand, companies, and policymakers), investment, costs, and distribution or pricing, among many others. According to Zsarnoczky (2017), these benefits of the digital revolution are even more evident in the tourism sector, as it is based on cooperation between a wide range of services and products. But the digital economy has not only led to improvements in processes or in the value chain of business and industry. It has also stimulated a new offer that has reinvented the sector, especially through the emergence of new technological start-ups. This competitive difference based on technology has major impacts on the tourism economy. As with demand, there is also a technological gap in supply, and adaptation to the digital economy is the factor that explains why some companies can stand out in such competitive environments. Technology firms and large platforms are increasingly distancing themselves competitively from traditional tourism firms and SMEs (Cheng and Foley 2018). This can be seen above all in outstanding cases such as Booking, Airbnb, or Uber, which dominate their markets more through the management of information than through their experience in the sector. This adaptation of the tourist offer to the digital economy has obvious implications. For example, the emergence of e-commerce has removed barriers in the interaction between tourists, companies, and destinations, and currently more than 70% of direct travel sales are made online (Schuckert et al. 2015).
The way of communicating has also changed, with new marketing strategies adapted to the information obtained by companies and destinations (Gursoy 2018). But if there is one element in which this evolution has been particularly noteworthy, it is pricing strategies, which have been the subject of studies for decades. The digitalization of the tourism sector has allowed great advances in the strategies of tourism companies and destinations, which use information such as the seasonality of demand to carry out yield management. Works such as Smith et al. (1992) highlight the benefits obtained by airlines from the rapid adoption of ICTs, with improvements in the management of reservations and the allocation of discounted fares. Over time and thanks to more information, these strategies evolved into revenue management. Such strategies, based on the segmentation of the market and the maximization of consumers' willingness to pay, represent a substantial improvement in the profits of tourism businesses (Vinod 2004). But we are currently undergoing technological changes that point to new pricing strategies. Machine learning algorithms are already enabling companies to study the behavior patterns of their users, thereby optimizing their pricing decisions (Webb et al. 2001).
The abundance of data on the behavior of tourists decreases the cost of prediction in the value chain of businesses and the tourism industry, with the application of AI already observed in tourism businesses to reduce costs, reduce risks, and improve decision-making (Agrawal et al. 2019b). Therefore, the use of these data, which would include both those collected by companies and those extracted from social networks, will be of paramount importance in the profitability and competitiveness of the tourism offer of the twenty-first century. Starting from this logic, legal disputes regarding data ownership and limits in the exploitation of the data provided by users are understandable, especially in the European Union whose regulatory framework is stricter.
Implications of the Digital Economy in the Relationship Between Tourism Supply and Demand
Greater access to information on the part of demand, more efficient data management thanks to AI, and the technologies mentioned above, such as blockchain, robotics, or quantum computing, will undoubtedly have an impact equal to or greater than that of the tourism sector's own adaptation to the Internet in the 1990s. It should come as no surprise if, as recent studies such as Makridakis (2017) or Agrawal et al. (2017) indicate, we are at the gates of a fourth industrial revolution driven by the disruptive technologies of our time. Undoubtedly, the adaptation of tourism companies, tourists, and destinations to the new digital wave will open a new field of study for academics in tourism economics. On a theoretical level, digitalization has very relevant implications. Perhaps the most important one is the change in consumer preferences toward nonstandardized products and the micro-segmentation of companies' strategies, which can be seen in personalized marketing solutions or dynamic pricing, leading to market equilibria in the short term. In other words, the traditional idea of a long-term supply-demand equilibrium point is abandoned in favor of a constantly changing relationship, with prices and products that are adjusted several times a day depending on the available information and enhanced by AI algorithms. This phenomenon, which started with low-cost airlines (Moreno-Izquierdo et al. 2016), is now also evident in the accommodation and inter-city transport sectors (Gibbs et al. 2018; Wu and Yang 2021). Another theoretical effect of digitalization is the aforementioned access to more information on both the demand and the supply side. It could be said that in the digital era, both consumers and sellers/producers make their decisions based on more knowledge than ever before, and even the "perfect information" used in some economic models could be assumed.
However, there is a gap in access to information that needs to be considered, also in the tourism industry. Not all suppliers have access to the same amount of information, and at the same time, it is impossible for consumers to have real-time data (tourists must wait for the experiences that make up a supplier's online reputation). The risk of technological monopolies is increasingly evident in the tourism sector (Coveri et al. 2021), so studies will be needed on their effects and on the need to enhance open data and technologies that contribute to the decentralization of information, such as blockchain.
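The idea of prices that are readjusted several times a day as new information arrives can be illustrated with a minimal sketch. It is not a pricing model taken from the literature cited here: the `MarketSignal` fields, the multipliers, and the function name `dynamic_price` are all hypothetical choices made only to show how a seller might map current information to a short-term price.

```python
from dataclasses import dataclass


@dataclass
class MarketSignal:
    """Hypothetical snapshot of what a seller observes at one moment."""
    occupancy: float      # share of inventory already sold, 0.0 to 1.0
    days_to_arrival: int  # lead time of the booking request
    is_weekend: bool      # whether the stay falls on a weekend


def dynamic_price(base_price: float, signal: MarketSignal) -> float:
    """Return a price adjusted to current information (illustrative rule only)."""
    price = base_price
    # Scarcity: raise the price linearly as inventory sells out (up to +50%).
    price *= 1.0 + 0.5 * signal.occupancy
    # Late demand: bookings within a week are assumed to be less price-sensitive.
    if signal.days_to_arrival <= 7:
        price *= 1.15
    # Assumed uplift for weekend stays.
    if signal.is_weekend:
        price *= 1.10
    return round(price, 2)


# The same room priced twice in one day as the available information changes.
morning = MarketSignal(occupancy=0.40, days_to_arrival=30, is_weekend=False)
evening = MarketSignal(occupancy=0.80, days_to_arrival=5, is_weekend=True)
print(dynamic_price(100.0, morning))
print(dynamic_price(100.0, evening))
```

In practice the fixed multipliers would be replaced by parameters estimated from booking data, which is where the AI algorithms mentioned above come in.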
Related to the previous point, it is also very relevant to understand how digitalization is generating competitive polarization. From a macroeconomic perspective, the digitalization of companies can bring the marginal costs of international expansion close to zero, if we take into account their scalability. Traditional theories of internationalization and global expansion were already revised years ago, with the emergence of the so-called born globals (Rennie 1993), companies that, from their origins and even at a small scale, are able to compete globally. Today, these types of rapidly expanding companies are a reality in the tourism sector, with digital companies fully established in destinations, offering decentralized services for activities that were previously only provided by local agents. From a microeconomic perspective, the digitalization of markets implies that the least competitive suppliers may not be taken into account in users' decisions. This situation is discussed in Moreno-Izquierdo et al. (2019) in a study of nine Spanish coastal destinations, which observes that an excess of supply does not have an impact on the prices paid by tourists on the Airbnb platform. Specifically, the study found destinations where more than 50% of the accommodation had never been rented, without this excess supply affecting dynamic pricing strategies, with increases during weekends and holiday periods (Fig. 2). This phenomenon occurs because the free entry of goods into the market, in addition to the almost perfect information available to demand, leads to two types of supply: gross supply (all goods available on the market) and net supply (only those that are desirable to demand). Buyers' knowledge of the market (both from product descriptions and from the experiences of other tourists) causes prices to vary only as a function of changes in net supply.
Apartments that are not rented do not affect the equilibrium, as tourists will prefer to visit another city or choose another type of accommodation rather than stay in them. This situation can be seen graphically in Fig. 3, where the equilibrium point at each moment t depends on the relationship between net supply (S') and demand (D). The gross supply (S) does not intervene in the configuration of the price, since the almost perfect information possessed by demand automatically discards it. This effect can also be seen inversely in the second part of the graph, using apartment occupancy rather than total supply. Again, the equilibrium at each point in time t is formed from the relationship between the temporary demand (Dt) and the occupancy rate of net supply (S'). The occupancy rate of the gross market supply (S) has no impact on the final price. This phenomenon of discarding also occurs on the supply side. The dynamic pricing strategy discussed above, which aims to optimize firms' performance, is a good illustration of this. Supply-side information may result in prices being increased as buyer movements are detected (e.g., during weekends or special events), leaving tourists with less purchasing power without access to certain products. In short, the digital economy is generating very relevant changes in the competitiveness of tourism companies and their capacity for global expansion, in access to information, in decision-making, in consumer preferences, and in business strategies, among many others. It is very revealing to observe how the most disruptive companies in the tourism sector in recent decades, such as Booking, Airbnb, or even Ryanair, base their success more on data management and prediction than on experience in the sector in which they operate. As we have seen, this phenomenon has led to new theoretical and applied research and will continue to do so for decades to come. One of the most obvious examples is found in recent studies on how AI-based solutions are going to change the tourism industry (Bulchand-Gidumal 2020) or how they could help to predict or mitigate the effects of COVID-19 on the sector (Perić and Vitezić 2021; Kontogianni et al. 2022).
Fig. 2 Behavior of the rental price in relation to the supply of apartments in Airbnb and total bookings. An example for the city of Valencia (June 2017). (Source: Own elaboration based on data from Moreno-Izquierdo et al. 2019)
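The distinction between gross and net supply discussed in this section can be made concrete with a toy calculation. The sketch below is only an illustration under strong assumptions that are not taken from the cited study: supply is fixed in the short run, inverse demand is linear (P = a - b·Q), and a listing belongs to net supply if it has ever been booked. The parameter values and the helper names `clearing_price` and `net_supply` are hypothetical.

```python
def clearing_price(intercept: float, slope: float, units_offered: int) -> float:
    """Price at which linear inverse demand P = a - b*Q absorbs a fixed quantity Q."""
    return max(intercept - slope * units_offered, 0.0)


def net_supply(listings: list[dict]) -> int:
    """Count only the listings that demand actually considers (here: ever booked)."""
    return sum(1 for listing in listings if listing["ever_booked"])


# Hypothetical market: 10 listings, 4 of which tourists are willing to book.
listings = [{"id": i, "ever_booked": i < 4} for i in range(10)]
A, B = 200.0, 10.0  # assumed demand parameters, not estimated from data

price_net_before = clearing_price(A, B, net_supply(listings))
price_gross_before = clearing_price(A, B, len(listings))

# Five more undesirable listings enter the market: gross supply grows,
# while net supply (and hence the observed price) does not change.
listings += [{"id": 10 + i, "ever_booked": False} for i in range(5)]

price_net_after = clearing_price(A, B, net_supply(listings))
price_gross_after = clearing_price(A, B, len(listings))

print(price_net_before, price_net_after)      # price implied by net supply
print(price_gross_before, price_gross_after)  # price a gross-supply model would predict
```

Under these assumptions, a model based on gross supply would predict a falling price as listings accumulate, whereas the price implied by net supply stays constant, which is the pattern the text describes for platforms with a large share of never-rented accommodation.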
Fig. 3 Differences between gross and net supply and their implications for equilibrium in markets with near-perfect competition (platform economy). (Source: Own elaboration)
The Tourism Labor Market in the Face of the Challenge of Automation
The emergence of new technologies in the tourism sector has also affected the productive factor that undoubtedly supports tourism activities and provides added value: human capital (Sigala 2018). In fact, we could identify a transformation in two stages: the era that encompasses the entire digitalization process until 2006 and the era of acceleration promoted by AI (Xiang 2018), which is the latest push in the automation process that the global economy is undergoing. Although transformations of the labor market have been weathered successfully in the past (Stevenson 2018), in this latest technological wave, the speed with which changes are implemented will be the key factor.
Research such as that of Manyika et al. (2017) points to the destruction of 73 million jobs by 2030. This possible reduction would especially affect those jobs that require a lower level of education or a specialization far removed from technological evolution itself, as is the case for a large number of jobs in the tourism sector. Automation is related to an increase in average productivity, improved product quality, business profitability, standard of living, and consumer satisfaction in any sector (Malihah and Setiyorini 2019). But it is also a great generator of employment, especially in job profiles related to software engineering (Melián and Bulchand 2015). The digital economy has created millions of new jobs in a multitude of tourist activities such as adapting companies to the Internet, analyzing tourist data, creating mobile applications, developing algorithms for booking hotel rooms, cloud computing, and even research in areas such as robotics or autonomous vehicles (Alexis 2017), among many others. This fact points to a very important need for structural change in the labor market of the entire economy, not just the tourism sector (Korinek and Stiglitz 2017). The workforce requires a new set of digital (data analysis, programming, user experience, etc.) and non-digital (entrepreneurship, adaptability to continuous change, working with technical equipment, etc.) knowledge and training (Spitz-Oener 2006). This change must be promoted in the training phase in order to reduce the gap between labor supply and demand (Zehrer and Mössenlechner 2008). Although works such as Ndou et al. (2018) value positively the initiatives developed in universities to promote new skills in tourism students, other authors such as Hsu (2018) or Kalbaska and Cantoni (2021) also bet on the personalization of learning, based on on-demand Massive Open Online Courses (MOOCs) that can adapt quickly to the changing needs of the market.
In short, all these facts should promote the study of the adaptation of the tourist labor market to the new digital scenario over the next few years. The impact of the digital economy on the tourism sector, including robots that can provide services, will generate the automation of intermediation tasks, telematic assistance, registration of activities, or even cleaning and personal service in restaurants and hotels (Ivanov et al. 2017, 2020). Although today we are at an incipient stage in the research and application of automation and robotics to the tourism industry (Borràs et al. 2014; Murphy et al. 2017), their inevitable development will lead to a review and renewal of policies relating to the labor market sector. This fact should encourage research on what effects automation will have on classic axioms of employment in tourism: the quality and structure of jobs, salaries, income inequalities, or education. It is true that in previous technological revolutions and after a short period of time, the labor market has tended to compensate, generating more jobs than it has destroyed. But the magnitude and speed of changes that are happening to us must make us think about whether or not there will be a net loss of employment, as well as possible solutions to be undertaken to guarantee the well-being of the population.
8 Digitalization and the Transformation of Tourism Economics
Destinations and the Need for Open Data: Intelligence, Accessibility, and Sustainability

Changes brought about by the digitalization of supply, demand, and employment will bring about a new way of understanding tourist destinations, moving toward structures that allow for the management and adoption of innovation. This fact is extremely relevant, as it places innovation at the center of competitiveness factors of the tourism industry, whereas in previous models such as Porter’s increased rivalry (2008), it appeared as an exogenous factor. According to Buhalis and Amaranggana (2013), the era of ICTs opened a multitude of opportunities in tourist cities, having consolidated the term “smart destination” as the great reference point for tourism innovation and a symbol of good government management. Nowadays, AI, cloud computing, or the Internet of Things (see Aivalis 2021) have replaced ICTs as disruptive technologies, but the idea of providing destinations with innovation as a competitive engine remains. In relation to the tourism economy and technological innovation in the sector, it is also important to consider the role of the public sector, which, as manager of the destination, will have to encourage and enhance the development of the digital economy at all levels (residents, tourists, and businesses). According to Goldfarb et al. (2015), this situation leads us to rethink the role of the public sector in the digital age, while opening lines of research and analysis on incentives for the development of public goods and their dissemination in society. Indeed, policymakers must understand digital impacts on destinations and take advantage of the current trend to provide destinations with digital tourist services and information, whose marginal cost of distribution and communication tends to zero. This makes it possible to implement far-reaching policies cheaply, supported by the large volume of information collected.
For its part, the definition of smart cities/destinations arises from the idea of interconnectivity between stakeholders (people, companies, and even machines and things) to improve the experience of inhabitants and visitors and to facilitate access to information for both companies and users. In fact, there is growing awareness of the close relationship between new technologies and tourism competitiveness at all levels, with contributions such as those of Hjalager (2002), Buhalis and Law (2008), Hojeghan and Esfangareh (2011), or Moreno-Izquierdo et al. (2018) stressing the importance of information in tourism supply and demand decision-making. But beyond the theoretical question of competitiveness and the role of governments, the current technological wave brings with it new questions of special relevance to academia. The boom in tourism in recent decades has generated common problems at a global level, such as the saturation of some cities or neighborhoods, abusive prices of some attractions, pollution, or lack of accessibility. Preferences of users for irreplaceable characteristics of some destinations (from the climate and beaches to world heritage sites) mean that although the use of technologies is now an essential part of destination management, their exploitation is insufficient to solve
L. Moreno-Izquierdo et al.
these problems, as shown in Perles-Ribes and Ivars-Baidal (2018), Bouchon and Rauscher (2019), Bratec et al. (2021), or Rucci et al. (2021), among many others. Academic studies could help improve tourism destinations through the use of technological solutions. For example, Skeli and Schmid (2019) propose solutions that range from managing saturated zones with mobile applications to establishing a destination’s carrying capacity limit based on tourist access control. Another option could be automated demand management based on blockchain and AI (see Treiblmaier 2020), allowing real-time, geolocated, and verified interaction between users and the destination itself. New technologies will also be key to mitigating the impact of pollution or attending to tourists with special needs (Ali 2021). In this sense, governments should encourage the development of open data systems at destinations to evaluate their decisions and technology investments. However, conflicts over data collection and exploitation are to be expected and deserve attention, since they can considerably limit the development of the digital economy of destinations. While the public sector and academia work with their eyes closed, companies such as Airbnb, Ryanair, Booking, or TripAdvisor have enough information to predict demand, prices, or the saturation of tourist spaces. The public sector, in collaboration with academia, must take advantage of digitalization, using existing information or generating its own open data lakes in order to solve the problems affecting tourist destinations. It is the most effective way to move toward smart development.
Conclusions

Economic sectors, but also principles of economic theory and its traditional methods of analysis, are being altered by the impact of the digital economy and the relevance of data. The current technological paradigm unevenly affects different productive sectors of the economy, with tourism being one of the most affected. Tourism is an information-intensive sector, because of the high number of agents involved, its wide geographical dispersion, and its volume of online transactions or e-commerce. This impact stems fundamentally from the development of AI and the introduction of data as another input in the production function. But the relevance of data in tourism is also affected by the emergence of new sources of information that are able to approximate the behavior of tourism supply and demand more accurately and in real time. This article synthesizes some of the most relevant impacts of digitalization on the tourism sector and on the functioning of the market, as well as the implications it has for research related to tourism economics. How economic principles of the tourism market are affected by issues related to bidirectionality in information, the reduction in transaction costs, the disappearance of perfect information clauses, the unstable equilibrium in the market, the reduction in prediction costs and their impact on market knowledge, or the role of the public sector in the face of many
cases of the free offer of tourism goods and services by online platforms opens relevant research paths in this new configuration of the tourism sector. However, many challenges and obstacles remain to this day, both for industry and practitioners and for researchers. In today’s disruptive and competitive environment, both risks and opportunities present themselves, and the future of tourist destinations and businesses will depend on their adaptation to change. In this scenario, the sector’s workforce or small businesses will have to face much more competitive environments, bearing the cost of adapting to new market rules in training and data collection. All this is because, as always happens with the digital economy, its entry into traditional industries means both an improvement in existing companies and the creation of new groups of companies that tend to dominate markets, such as Booking, Airbnb, or Uber among many others in the tourism sector. Increasing digitalization is bringing with it new disruptive technologies that will completely shake the value chains of businesses, industry, and tourist destinations. But studying the effect of the digital economy on the tourism economy is not straightforward. There is currently a barrier to data access, which must be overcome for research to advance at the same speed as the market does. This is no small issue; research into tourism economics can be key for governments, businesses, and society in their adaptation to the digital economy. Problems cited in this text, such as overtourism, the need for sustainable development, or lack of accessibility, could be tackled with innovative research techniques if access to data were made easier for researchers. It is, therefore, necessary to continue to impress on governments that there is no better policy or better investment than open data and the support of technological development that can foster research.
Cross-References

A Futuristic Look at Tourism in the Era of the Internet Ecosystem
Advanced Web Technologies and e-Tourism Web Applications
Development of Information and Communication Technology: From e-Tourism to Smart Tourism
Drivers of e-Tourism
E-Business Models in Tourism
e-Tourism: An Informatics Perspective
Internet of Things and Ubiquitous Computing in the Tourism Domain
Strategic Use of Information Technologies in Tourism: A Review and Critique
References

Agrawal A, Gans J, Goldfarb A (2017) What to expect from artificial intelligence. MIT Sloan Management Review, 58311
Agrawal A, Gans J, Goldfarb A (eds) (2019a) The economics of artificial intelligence: an agenda. University of Chicago Press, Chicago
Agrawal A, Gans J, Goldfarb A (2019b) Economic policy for artificial intelligence. Innov Policy Econ 19(1):139–159. https://doi.org/10.1086/699935
Aislabie C, Tisdell C, Staton PJ (eds) (1988) Tourism economics. Institute of Industrial Economics, University of Newcastle, Newcastle
Aivalis CJ (2021) Big data technologies. In: Xiang Z, Fuchs M, Gretzel U, Höpken W (eds) Handbook of e-tourism. Springer, Cham. https://doi.org/10.1007/978-3-030-05324-6_23-1
Alexis P (2017) R-tourism: introducing the potential impact of robotics and service automation in tourism. Ovidius Univ Ann Ser Econ Sci 17(1):211–216
Ali A (2021) Information and communication technology for sustainable tourism development. In: Xiang Z, Fuchs M, Gretzel U, Höpken W (eds) Handbook of e-tourism. Springer, Cham. https://doi.org/10.1007/978-3-030-05324-6_103-1
Baggio R, Del Chiappa G (2013) Tourism destinations as digital business ecosystems. In: Cantoni L, Xiang Z (eds) Information and communication technologies in tourism 2013. Springer, Heidelberg/Berlin, pp 331–342. https://doi.org/10.1007/978-3-642-36309-2_16
Borràs J, Moreno A, Valls A (2014) Intelligent tourism recommender systems: a survey. Expert Syst Appl 41(16):7370–7389. https://doi.org/10.1016/j.eswa.2014.06.007
Bouchon F, Rauscher M (2019) Cities and tourism, a love and hate story; towards a conceptual framework for urban overtourism management. Int J Tour Cities 5(4):598–619. https://doi.org/10.1108/IJTC-06-2019-0080
Bratec M, Bernabeu MA, Krizaj D, Ivars-Baidal J, Diaz AB, Kopic P, Rogelja T (2021) The dual aspect of technology in tourism: social contradictions surrounding the sharing economy and smart destination development. University of South Florida M3 Center Publishing, 17(9781732127593), 8
Buhalis D, Amaranggana A (2013) Smart tourism destinations. In: Xiang Z, Tussyadiah I (eds) Information and communication technologies in tourism 2014. Springer, Cham, pp 553–564. https://doi.org/10.1007/978-3-319-03973-2_40
Buhalis D, Law R (2008) Progress in information technology and tourism management: 20 years on and 10 years after the Internet – the state of eTourism research. Tour Manag 29(4):609–623. https://doi.org/10.1016/j.tourman.2008.01.005
Bulchand-Gidumal J (2020) Impact of artificial intelligence in travel, tourism, and hospitality. In: Xiang Z, Fuchs M, Gretzel U, Höpken W (eds) Handbook of e-tourism. Springer, Cham. https://doi.org/10.1007/978-3-030-05324-6_110-1
Calvaresi D, Leis M, Dubovitskaya A, Schegg R, Schumacher M (2019) Trust in tourism via blockchain technology: results from a systematic review. In: Pesonen J, Neidhardt J (eds) Information and communication technologies in tourism 2019. Springer, Cham, pp 304–317. https://doi.org/10.1007/978-3-030-05940-8_24
Capó J, Riera A, Roselló J (2006) Una revisión del análisis económico del turismo. Principios: estudios de economía política 5:5–23
Cheng M, Foley C (2018) The sharing economy and digital discrimination: the case of Airbnb. Int J Hosp Manag 70:95–98. https://doi.org/10.1016/j.ijhm.2017.11.002
Cooper C (2006) Knowledge management and tourism. Ann Tour Res 33(1):47–64. https://doi.org/10.1108/16605371111175320
Coveri A, Cozza C, Guarascio D (2021) Monopoly capitalism in the digital era (No. 2021/33). LEM Working Paper Series
Dewangan A, Chatterjee R (2018) Tourism recommendation using machine learning approach. In: Saeed K, Chaki N, Pati B, Bakshi S, Mohapatra D (eds) Progress in advanced computing and intelligent engineering. Advances in intelligent systems and computing, vol 564. Springer, Singapore, pp 447–458
Dwyer L, Forsyth P, Dwyer W (2010) Tourism economics and policy, vol 3. Channel View Publications, Bristol. https://doi.org/10.21832/9781845411534
Gerakis AS (1965) Effects of exchange-rate devaluations and revaluations on receipts from tourism. Int Monetary Fund Staff Pap 12:365–384. https://doi.org/10.2307/3866335
Gibbs C, Guttentag D, Gretzel U, Yao L, Morton J (2018) Use of dynamic pricing strategies by Airbnb hosts. Int J Contemp Hosp Manag 30(1):2–20. https://doi.org/10.1108/IJCHM-09-2016-0540
Goldfarb A, Greenstein SM, Tucker CE (2015) Economic analysis of the digital economy. University of Chicago Press
Gray HP (1966) The demand for international travel by the United States and Canada. Int Econ Rev 7(1):83–92. https://doi.org/10.2307/2525372
Gretzel U, Sigala M, Xiang Z, Koo C (2015) Smart tourism: foundations and developments. Electron Mark 25(3):179–188. https://doi.org/10.1007/s12525-015-0196-8
Gursoy D (2018) Future of hospitality marketing and management research. Tour Manag Perspect 25:185–188. https://doi.org/10.1016/j.tmp.2017.11.008
Hjalager A (2002) Repairing innovation defectiveness in tourism. Tour Manag 23:465–474. https://doi.org/10.1016/S0261-5177(02)00013-4
Hojeghan SB, Esfangareh AN (2011) Digital economy and tourism impacts, influences and challenges. Proc-Soc Behav Sci 19:308–316. https://doi.org/10.1016/j.sbspro.2011.05.136
Hsu CH (2018) Tourism education on and beyond the horizon. Tour Manag Perspect 25:181–183. https://doi.org/10.1016/j.tmp.2017.11.022
Ivanov SH, Webster C, Berezina K (2017) Adoption of robots and service automation by tourism and hospitality companies. Revista Turismo & Desenvolvimento 27/28(1):1501–1517
Ivanov S, Webster C, Berezina K (2020) Robotics in tourism and hospitality. In: Xiang Z, Fuchs M, Gretzel U, Höpken W (eds) Handbook of e-tourism. Springer, Cham. https://doi.org/10.1007/978-3-030-05324-6_112-1
Kalbaska N, Cantoni L (2021) e-Learning in tourism education. In: Xiang Z, Fuchs M, Gretzel U, Höpken W (eds) Handbook of e-tourism. Springer, Cham. https://doi.org/10.1007/978-3-030-05324-6_104-1
Kontogianni A, Alepis E, Patsakis C (2022) Smart tourism and artificial intelligence: paving the way to the post-COVID-19 era. In: Advances in artificial intelligence-based technologies. Springer, Cham, pp 93–109. https://doi.org/10.1007/978-3-030-80571-5_7
Korinek A, Stiglitz JE (2017) Artificial intelligence and its implications for income distribution and unemployment (No. w24174). National Bureau of Economic Research, Cambridge
Leiper N (1981) Towards a cohesive curriculum in tourism. The case for a distinct discipline. Ann Tour Res 8(1):69–84. https://doi.org/10.1016/0160-7383(81)90068-2
Li G, Song H, Witt SF (2005) Recent developments in econometric modeling and forecasting. J Travel Res 44(1):82–99. https://doi.org/10.1177/0047287505276594
Llodra-Riera I, Martínez-Ruiz MP, Jiménez-Zarco AI, Izquierdo-Yusta A (2015) Assessing the influence of social media on tourists’ motivations and image formation of a destination. Int J Qual Serv Sci 7(4):458–482. https://doi.org/10.1108/IJQSS-03-2014-0022
Makridakis S (2017) The forthcoming Artificial Intelligence (AI) revolution: its impact on society and firms. Futures 90:46–60. https://doi.org/10.1016/j.futures.2017.03.006
Malihah E, Setiyorini HPD (2019) Industry revolution 4.0: the challenge for secondary education on tourism and hospitality in Indonesia. In: 5th UPI international conference on technical and vocational education and training (ICTVET 2018). Atlantis Press
Manyika J, Lund S, Chui M, Bughin J, Woetzel J, Batra P, Ko R, Sanghvi S (2017) Jobs lost, jobs gained: workforce transitions in a time of automation. McKinsey Global Institute, San Francisco
Melián G, Bulchand G (2015) Competences that the new work in tourism requires. Investigaciones Turísticas 10:76–89
Mills J, Law R (2004) Handbook of consumer behaviour, tourism and the internet. Haworth Hospitality Press, New York
Minghetti V, Buhalis D (2010) Digital divide in tourism. J Travel Res 49(3):267–281. https://doi.org/10.1177/0047287509346843
Moreno-Izquierdo L, Pedreño-Muñoz A (2020) Europa frente a EEUU y China: Prevenir el declive en la era de la inteligencia artificial. KDP Publishing
Moreno-Izquierdo L, Ramón-Rodríguez AB, Perles-Ribes JF (2016) Pricing strategies of the European low-cost carriers explained using Porter’s Five Forces Model. Tourism Economics 22(2):293–310. https://doi.org/10.5367/te.2016.0551
Moreno-Izquierdo L, Ramón-Rodríguez AB, Such-Devesa MJ (2018) The challenge of long-term tourism competitiveness in the age of innovation: Spain as a case study. Investigaciones Regionales 42:13–34
Moreno-Izquierdo L, Ramón-Rodríguez AB, Such-Devesa MJ, Perles-Ribes JF (2019) Tourist environment and online reputation as a generator of added value in the sharing economy: the case of Airbnb in urban and sun-and-beach holiday destinations. J Destin Mark Manag 11:53–66. https://doi.org/10.1016/j.jdmm.2018.11.004
Murphy J, Hofacker C, Gretzel U (2017) Dawning of the age of robots in hospitality and tourism: challenges for teaching and research. Eur J Tour Res 15:104–111. https://doi.org/10.1016/j.jhlste.2018.10.003
Ndou V, Mele G, Del Vecchio P (2018) Entrepreneurship education in tourism: an investigation among European Universities. J Hosp Leis Sport Tour Educ. In press. https://doi.org/10.1016/j.jhlste.2018.10.003
Pearce DG, Butler RW (eds) (1993) Tourism research: critiques and challenges. Routledge, London
Perić M, Vitezić V (2021) Tourism getting back to life after COVID-19: can artificial intelligence help? Societies 11(4):115. https://doi.org/10.3390/soc11040115
Perles-Ribes JF, Ivars-Baidal J (2018) Smart sustainability: a new perspective in the sustainable tourism debate. Investigaciones Regionales 42:151–170
Perles-Ribes JF, Ramón-Rodríguez AB, Moreno-Izquierdo L, Such-Devesa MJ (2019) Online reputation and destination competitiveness: the case of Spain. Tour Anal 24(2):161–176. https://doi.org/10.3727/108354219X15525055915518
Pine J, Gilmore JH (1999) The experience economy: work is theater and every business a stage. Harvard Business School Press, Boston
Poon A (1993) Tourism, technology and competitive strategies. CAB International, Wallingford
Porter ME (2008) On competition – updated and expanded edition. Harvard Business Review, Boston
Rennie MW (1993) Born global. McKinsey Q (4):45–53
Rucci AC, Moreno-Izquierdo L, Perles-Ribes JF, Porto N (2021) Smart or partly smart? Accessibility and innovation policies to assess smartness and competitiveness of destinations. Current Issues Tour 1–19. https://doi.org/10.1080/13683500.2021.1914005
Schuckert M, Liu X, Law R (2015) Hospitality and tourism online reviews: recent trends and future directions. J Travel Tour Mark 32(5):608–621. https://doi.org/10.1080/10548408.2014.933154
Sessa A (1984) Comments on Peter Gray’s contribution of economics to tourism. Ann Tour Res 11(2):283–286
Shanker D (2008) ICT and tourism: challenges and opportunities, conference on tourism in India – challenges ahead. Indian Institute of Management Kozhikode, Kerala, pp 50–58
Sigala M (2018) New technologies in tourism: from multi-disciplinary to anti-disciplinary advances and trajectories. Tour Manag Perspect 2018(25):151–155. https://doi.org/10.1016/j.tmp.2017.12.003
Skeli S, Schmid M (2019) Mitigating overtourism with the help of smart technology solutions – a situation analysis of European city destinations. In: ISCONTOUR 2019 tourism research perspectives: proceedings of the international student conference in tourism research
Smith BC, Leimkuhler JF, Darrow RM (1992) Yield management at American airlines. Interfaces 22(1):8–31. https://doi.org/10.1287/inte.22.1.8
Song H, Dwyer L, Li G, Cao Z (2012) Tourism economics research: a review and assessment. Ann Tour Res 39(3):1653–1682. https://doi.org/10.1016/j.annals.2012.05.023
Spitz-Oener A (2006) Technical change, job tasks, rising educational demands: looking outside the wage structure. J Labor Econ 24(2):235–270. https://doi.org/10.1086/499972
Stabler M, Sinclair MT (eds) (1997) The economics of tourism. Routledge, London
Stabler MJ, Papatheodorou A, Sinclair MT (eds) (2010) The economics of tourism, 2nd edn. Routledge, London
Stevenson B (2018) AI, income, employment, meaning. In: The economics of artificial intelligence: an Agenda. National Bureau of Economic Research, Inc. University of Chicago Press, Chicago
Sun J, Yan J, Zhang KZ (2016) Blockchain-based sharing services: what blockchain technology can contribute to smart cities. Financ Innov 2(1):26. https://doi.org/10.1186/s40854-016-0040-y
Teubner T, Hawlitschek F, Dann D (2017) Price determinants on Airbnb: how reputation pays off in the sharing economy. J Self-Gov Manag Econ 5(4):53–80
Tisdell C (ed) (2000) The economy of tourism (Volume I and II) The international Library of critical writings in economics. Edward Elgar, Cheltenham
Treiblmaier H (2020) Blockchain and tourism. In: Xiang Z, Fuchs M, Gretzel U, Höpken W (eds) Handbook of e-tourism. Springer, Cham. https://doi.org/10.1007/978-3-030-05324-6_28-2
Tribe J, Xiao H (2011) Developments in tourism social science. Ann Tour Res 38(1):7–26. https://doi.org/10.1016/j.annals.2010.11.012
Varian H (2019) Artificial intelligence, economics, and industrial organization. In: Agrawal A, Gans J, Goldfarb A (eds) The economics of artificial intelligence: an Agenda. University of Chicago Press, Chicago, pp 399–419. https://doi.org/10.7208/9780226613475-018
Vinod B (2004) Unlocking the value of revenue management in the hotel industry. J Revenue Pric Manag 3(2):178–190. https://doi.org/10.1057/palgrave.rpm.5170105
Wanhill S (2011) What tourism economists do. Their contribution to understanding tourism. Estudios de Economía Aplicada 29(3):679–692
Webb GI, Pazzani MJ, Billsus D (2001) Machine learning for user modeling. User modeling and user-adapted interaction 11(1–2):19–29. https://doi.org/10.1023/A:1011117102175
Witt SF, Moutinho L (eds) (1994) Tourism management and marketing handbook, 2nd edn. Prentice Hall, Hemel Hempstead
World Tourism Organization (2018) Panorama OMT del turismo internacional, Edición 2018, UNWTO, Madrid
Wu C, Yang X (2021) Factors impact on dynamic pricing in sharing economy: a study on uber & lyft. Front Econ Manag 2(1):164–177. https://doi.org/10.6981/FEM.202101_2(1).0023
Xiang Z (2018) From digitization to the age of acceleration: on information technology and tourism. Tour Manag Perspect 25:147–150. https://doi.org/10.1016/j.tmp.2017.11.023
Zehrer A, Mössenlechner C (2008) Industry relations and curricula design in Austrian tourism master programs: a comparative analysis. J Teach Travel Tour 8(1):73–95. https://doi.org/10.1080/15313220802441992
Zsarnoczky M (2017) How does artificial intelligence affect the tourism industry? VADYBA 31(2):85–90
Part II Technologies in e-Tourism
9 The Evolution of Online Booking Systems
Robert Goecke
Contents

Introduction
From Computer Reservation Systems to Global Distribution Systems
The Internet as a Global Virtual Data Network
The World Wide Web as a Distributed Internet Multimedia Information System
HTML as Description Language for the Multimedia Hypertext Documents of the WWW
JavaScript as Browser Programming Language and Browser Plug-Ins
HTML Forms, Server-Side Scripts, and Web-Enabled Applications
Internet Booking Engines as Self-Service Enablers in Tourism
Integration of Legacy Booking Systems with Web Front Ends
Touristic IBEs, Package Tour Comparison Systems, and Specialized Online Travel Agent WBEs
Evolution from Mainframe via Distributed Web Services to Cloud Computing
Synopsis: Automation, Customer-Oriented Self-service, and the Two Faces of Virtualization
Cross-References
References
R. Goecke, Munich University of Applied Sciences, Munich, Germany
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_27

Abstract

E-Tourism started with the first automated airline computer reservation systems (CRS) implemented on mainframes. With the global spread of data transmission networks, they evolved into global distribution systems (GDS), which serve as B2B touristic distribution backbones to this day. The seamless integration of heterogeneous data networks into the Internet and the invention of the World Wide Web as a distributed multimedia application platform led to the development of comfortable Internet/web booking engines. They enabled easy-to-use browser-based online booking as a self-service for customers and new forms of B2C travel distribution. Because the Internet and the WWW became the base technologies of all E-Tourism applications, their architecture and components are explained in detail as key enablers for online booking and CRS web front ends.
Keywords Computer reservation system · Global distribution system · Internet · World Wide Web · HTML/CSS · IBE/WBE
Introduction

Before the introduction of automated reservation systems in tourism, the exclusive allocation of rooms or seats with limited availability to guests or passengers was done manually with pencil and eraser in booking calendars, index cards, or rolodex systems. Call center operators of airlines or hotel chains administered the available inventories centrally, and travel agents booked via phone or telex on behalf of their customers. Flight schedules and tariffs were published regularly in printed manuals like railway timetables with price lists, while room rates and flight tariffs could also be enquired via phone. To understand the digital transformation process of the tourism industry, we start with an architecture analysis of the first computer reservation systems (CRS). They were implemented on centralized mainframe computers, where many principles of virtualization were invented to enable multiuser and multi-application hosting on a single computer (Deitel 1990; Schulz et al. 1996). Both analogue telecommunication networks and digital computer networks existed long before Vinton G. Cerf and Robert E. Kahn introduced the Internet Protocols in the early 1970s, and they were used to provide travel agents with direct “online” CRS access via terminals (Tanenbaum 1988; Crol 1992; Comer 2018). The stepwise interconnection of those first-generation online booking systems was a main driver for establishing global distribution systems (GDS). After Tim Berners-Lee had invented the World Wide Web (WWW) in 1989 at Europe’s CERN research labs, web-based Internet Booking Engines (IBEs) enabled browser-based multimedia online booking services for end customers and led to the modernization of most existing CRS and GDS with web front ends (Schulz et al. 2010, 2014; Schulz 2014; Goecke and Landvogt 2017; Benckendorff et al. 2019). Now, web-based architectures are state of the art for all modern online booking systems and fundamental for advanced web technologies and E-Tourism apps.
From Computer Reservation Systems to Global Distribution Systems

Originally, American Airlines had developed the first airline computer reservation system (CRS) together with IBM only to automate call center booking processes (Hopper 1990; Schulz et al. 1996; Buhalis 2003; Schulz 2014): An old rolodex reservation system based on paper index cards was replaced by a mainframe reservation database to support internal phone-call-center booking agents. Its duty was to collect, store, and process all the incoming booking requests from travel agencies. All flight schedules, inventory data, and bookings were kept in a central host/mainframe computer, whose storage and processors filled iron wardrobes (Hopper 1990). Its database system is a program specialized in storing all kinds of structured information in electronic files as indexed tables (flight offers, reservations, passengers, etc.) containing attributed records, like structured cards in card boxes. The records’ attribute structure, their data types, and their meaning in the tables are called the database schema and specify the information model of the database as meta-data (data about data). Airline CRS were among the first industry-specific database applications for professional users, i.e., flight inventory managers and call center agents. CRS offer functions to insert and select flight offerings and seat availabilities as well as to reserve or release seats for individual passengers. Passenger booking information is stored as PNR (passenger name record) in flight-specific passenger lists. The CRS programs control the reservation process by managing all the dialogues between its users and the database.

While internationally standardized and hierarchically organized analogue telephone networks enabled automatically connected voice calls between humans all over the world, the first digital data networks were introduced by competing computer companies with proprietary, incompatible data communication standards.
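The reservation database described above (a schema for flight offers and passenger name records, plus functions to insert flights, query availability, and reserve or release seats) can be illustrated with a small sketch. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than any historical mainframe database, and all table names, column names, and function names are hypothetical and not taken from a real CRS.

```python
import sqlite3

# Hypothetical schema (meta-data): names and types are illustrative only.
SCHEMA = """
CREATE TABLE flights (
    flight_no TEXT NOT NULL,
    dep_date  TEXT NOT NULL,
    origin    TEXT NOT NULL,
    dest      TEXT NOT NULL,
    seats     INTEGER NOT NULL,
    PRIMARY KEY (flight_no, dep_date)
);
CREATE TABLE pnr (
    pnr_id    INTEGER PRIMARY KEY,
    flight_no TEXT NOT NULL,
    dep_date  TEXT NOT NULL,
    passenger TEXT NOT NULL
);
"""


def open_crs():
    """Create an empty in-memory reservation database."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def add_flight(db, flight_no, dep_date, origin, dest, seats):
    """Inventory function: insert a flight offer with its seat capacity."""
    with db:  # commits on success, rolls back on error
        db.execute("INSERT INTO flights VALUES (?, ?, ?, ?, ?)",
                   (flight_no, dep_date, origin, dest, seats))


def availability(db, flight_no, dep_date):
    """Free seats = capacity minus the number of stored PNRs for the flight."""
    row = db.execute(
        "SELECT f.seats - (SELECT COUNT(*) FROM pnr p"
        " WHERE p.flight_no = f.flight_no AND p.dep_date = f.dep_date)"
        " FROM flights f WHERE f.flight_no = ? AND f.dep_date = ?",
        (flight_no, dep_date)).fetchone()
    if row is None:
        raise KeyError("unknown flight")
    return row[0]


def reserve(db, flight_no, dep_date, passenger):
    """Reserve one seat; return the new PNR id, or None if the flight is full."""
    with db:
        if availability(db, flight_no, dep_date) <= 0:
            return None
        cur = db.execute(
            "INSERT INTO pnr (flight_no, dep_date, passenger) VALUES (?, ?, ?)",
            (flight_no, dep_date, passenger))
        return cur.lastrowid


def release(db, pnr_id):
    """Release a seat by deleting its PNR; True if a record was removed."""
    with db:
        cur = db.execute("DELETE FROM pnr WHERE pnr_id = ?", (pnr_id,))
        return cur.rowcount == 1


def passenger_list(db, flight_no, dep_date):
    """Flight-specific passenger list in booking order."""
    rows = db.execute(
        "SELECT passenger FROM pnr WHERE flight_no = ? AND dep_date = ?"
        " ORDER BY pnr_id", (flight_no, dep_date)).fetchall()
    return [r[0] for r in rows]
```

In this sketch, a call such as `reserve(db, "XX100", "1970-01-01", "DOE/JANE")` on a flight previously created with `add_flight` stores one passenger name record and reduces the value returned by `availability` by one, while `release` reverses the booking. A production system would additionally need concurrency control for many simultaneous agent sessions, which this single-connection example deliberately leaves out.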
However, it was a clever combination of those data and phone networks which allowed American Airlines to transform its first internal CRS into a networked airline distribution system by providing remote regional travel agencies with “dumb” alphanumeric computer terminals connected directly to its CRS (see Fig. 1). The first digital Wide Area Networks (WANs) were organized either by the multinational computer companies or by national public telecommunication service providers. They enabled the transmission of alphanumeric data of booking masks, user input characters, and the central distribution system’s output between the remote travel agent’s terminal and the CRS host computer. If no access to one of those proprietary data networks existed, so-called acoustic couplers or modems were used to encode (modulate) binary signals into noisy sound (Comer 2018; Amadeus 1990; Wikipedia 2019). The airline distribution system operated “modem banks” to handle all parallel dialogues with different travel agents’ terminals independently. Dedicated communication controllers and front-end processors managed the individual dialogue sessions between each agent terminal and the central
Fig. 1 Mainframe-based airline distribution system connected via data and phone networks with travel agencies and internal CRS of cooperating supplier airlines
distribution system’s main processor. The sessions had to be properly separated from one another so that every agent got the impression of using the reservation system exclusively instead of sharing it with hundreds of other agents at the same time (Tanenbaum 1988). From then on, booking clerks in travel agencies could search for flights directly within the CRS and book them “online.” When other airlines also started to introduce their own internal CRS for faster booking processing, American Airlines shared its travel agent terminal network with other CRS-equipped airlines (Fig. 1). In this way, even the internal CRS of other suppliers could connect to the American Airlines CRS and distribution system to list their flights and make them bookable by travel agents for usage fees smaller than the investments necessary to build up their own terminal networks. After the success of its airline distribution system, American Airlines decided to also host computer reservation applications for globally operating hotel chains and rental car companies, which made its airline distribution system the “mother” of all global distribution systems (GDS). A GDS (see Fig. 2) enables travel agents to search worldwide for flights, connecting flights, or complementary travel segments of a journey, to compare competing offers, and to book them online for their customers using one terminal of one integrated CRS (Buhalis 2003; Benckendorff et al. 2014, 2019; Nyheim 2019). GDS pioneered what we call today an electronic market long before B2B electronic stock exchange trading systems or B2C Internet markets, like Amazon or Alibaba, were invented. Over the years, the American Airlines GDS became independent of American Airlines, and we know it today as Sabre
Fig. 2 Global distribution system (GDS) database architecture
Travel Network. European and Asian airline consortia started their own competing GDS projects in the 1980s, which led to more than 10 GDS in the early 1990s and a somewhat fragmented GDS landscape: Amadeus, Galileo, System One, Apollo, Worldspan, Abacus, and TravelSky were all founded by airlines in cooperation with leading computer companies, like IBM, Unisys, Fujitsu Siemens, etc. (Schulz 2014; O’Connor 1999; Schulz et al. 1996; Amadeus 1990; Hopper 1990). Over the years, the agents’ dumb terminals were completely replaced by personal computers (PCs) supporting additional mid- and back-office processes of travel agencies after a booking was made. All GDS mainframes/hosts used complex time-sharing multiuser/multitasking operating system software to protect the shared storage and processor resources of different users and applications effectively from one another. To prevent booking conflicts between different GDS users, the underlying inventory database used special transaction processing monitors like IBM’s famous TPF to guarantee exclusive user access to a data block of, e.g., a seat vacancy, whenever a reservation is made (Deitel 1990). All those “mainframe” technologies were designed to share the superfast processor and storage resources of one extremely expensive central computer between as many users and applications as possible, so that every travel agent has the virtual impression of using the whole application exclusively. Using idle processing cycles between the interactions of a single user to serve tasks of other users’ sessions and sharing expensive main storage in a virtual way unnoticed by users were big inventions. They made mainframe-based CRS and distribution systems profitable; these systems were even able to simulate different computers as virtual machines for different users at the same time, when low-
cost PCs or mobile computers were not available on every desktop or not even invented yet. It was the beginning of virtualization as a leading design principle to share scarce resources and to reduce complexity, either by hiding information about complex IT realities or by creating simplified user interfaces that simulate real-world objects like “files,” “folders,” “recycle bins,” etc. In countries with big railway or ferry networks and strong markets for railway trips, ferry trips, and prepackaged tours, special Touristic Distribution Networks (TDN, or Tour DNs) were developed to connect the internal rail CRS, ferry CRS, and tour operator CRS for prepackaged trips with travel agencies as well. Sabre Merlin, Amadeus TOMA/Leisure, and Galileo CETS are examples of the integration of such national Touristic Distribution Networks into global distribution system suites (Hopper 1990; Amadeus 1990; Schulz et al. 1996). National Touristic Distribution Networks enable the travel agent to book railway tickets, ferries, and complex packaged tours directly online in the CRS of the railway, ferry, or tour operator (see Fig. 3). A GDS acts as a single shared CRS with flight, hotel, and rental car databases synchronized with the internal inventory systems of the participating suppliers. It thereby enables a direct search and comparison of matching offerings from competing suppliers to find and book the best-fitting travel segments for a travel itinerary. A national Tour DN, in contrast, does not serve as one integrating database CRS platform. A Touristic Distribution Network is only a connecting switch providing the travel agent with a common touristic mask to start direct dialogue sessions with any single internal CRS of the connected suppliers to look and book at this specific supplier but without any function to compare competing
Fig. 3 CRS, GDS, and national Touristic Distribution Networks before the advent of the Internet. Figure note: growth competition of separated digital and analogue networks, partially interconnected via proprietary network adapters with limited end-to-end connectivity.
offerings of different suppliers from their connected internal CRS within a single search display (Schulz 2014). However, a special benefit of a direct connection to a tour operator CRS (Tour Op CRS in Fig. 3) is that travel agents can assemble complex, individualized, flexible package tours according to the rules described in the tour operator’s menu-like flexible packaging catalog (Goecke and Weithöner 2014). The complex business logic of proprietary flexible packaging products is configured directly within the tour operator CRS. GDS and Touristic Distribution Networks contributed a lot to the globalization of both business and leisure travel, as well as to the worldwide travel industry boom in the 1970s, 1980s, and 1990s, which supported global commerce and integration. During the 1980s and 1990s, most “dumb” terminals were replaced by personal computers (PCs) with local processing, data storage, and graphic display capabilities. While PCs still served as terminals for the central mainframes of host-based CRS and GDS, they quickly became a low-cost platform for desktop reservation systems and administrative mid- and back-office systems in small businesses like hotels, small tour operators, tourist offices, or travel agencies. To enable data exchange between desktop PCs, they were connected via corporate digital local area networks (LANs), and the client-server architecture was introduced for distributed PC applications: Multiuser applications like reservation systems were split into a central server component running on a single centralized PC server and many client software packages running on local desktop user PCs. The central component contains the reservation system’s database and transaction processing logic. It acts as a server that processes the service requests of the many client components, which provide graphical user interfaces and printer control on the local desktop PCs.
PC-/LAN-based client-server reservation systems spread very fast as property management systems (PMS) in hotels (Goecke 2014b) and became key enablers for tour operator CRS implementing interactive menu-driven flexible packaging reservations. Like airports, commercial banks, or the US military, GDS and Tour DNs also suffered from the many different data link standards or missing digital communication networks in different nations and continents. They had to connect their CRS not only with thousands of small travel agencies across the globe but also with hundreds of internal airline inventories as well as with central reservation systems of hotel chains, hotel consortia, and rental car providers to synchronize inventory and bookings. For this purpose, special switch companies and mediators programmed and operated dedicated gateway computers (switches) to transform and exchange application data like vacancies and bookings. Some, like Official Airline Guide (OAG, for global flight schedule distribution), Airline Tariff Publishing Company (ATPCO, for global fare distribution), or Pegasus (hotel switch), still exist today (O’Connor 1999; Werthner and Klein 1999; Buhalis and Laws 2001). The national Touristic DNs served as dialogue interfaces between thousands of travel agent terminals and hundreds of proprietary tour operator CRS programmed as client-server solutions by different IT vendors. A simple and robust standardized way to exchange binary data over different types of digital and analogue networks between the computers of all those actors was needed “end to end,” i.e., without
the necessity for applications and their programmers to know any details of the underlying networks.
The Internet as a Global Virtual Data Network

Before the development of TCP/IP (Transmission Control Protocol/Internet Protocol) and UDP (User Datagram Protocol), sponsored by the US Department of Defense, it was only possible to interconnect networks directly via modems or network adapters (Tanenbaum 1988). The problem with those direct network adapters was that connecting one network with n other networks required the development of n different network adapters. The full interconnection of n networks thus would have required n*(n-1)/2 adapter developments, and every programmer of a distributed application needed to know in advance all the details of all adapters and networks along a routing path between two computers (Comer 2018). The innovative idea of the Internet by Cerf and Kahn was to introduce TCP/IP and later UDP as new end-to-end protocols for the exchange of digital data packages via existing analogue and digital telecommunication networks for all computer applications in a layered architecture (Crol 1992). The new Internet Protocols served both as a new lingua franca for inter-application data exchange and as an intermediary language for package exchange between subnetworks via routers. In this way, global inter-networking could be achieved by developing only one UDP-TCP/IP adapter type for every proprietary or innovative subnetwork (Figs. 4 and 5). Routers containing several adapters can connect many networks via routes chosen in a fully automated way unnoticed by applications, application users, and programmers, which simplified both application usage and development. The layered Internet architecture (see middle layers in Fig. 6) delivers a homogeneous virtual inter-network hiding the complex details of inter-networking between heterogeneous real subnetworks. Over the years, most innovative distributed applications adopted the Internet Protocols for inter-application communication.
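The adapter arithmetic above can be checked with a short calculation. The following sketch simply evaluates the two counts from the text, n*(n-1)/2 pairwise adapters versus one Internet adapter type per subnetwork; the function names are hypothetical and chosen for illustration only.

```python
def pairwise_adapters(n: int) -> int:
    """Adapter developments for a full direct interconnection of n networks."""
    return n * (n - 1) // 2


def internet_adapters(n: int) -> int:
    """Adapter developments when every subnetwork needs only one IP adapter type."""
    return n


# Compare both approaches for a growing number of networks.
for n in (2, 5, 10, 50):
    print(f"{n:3d} networks: {pairwise_adapters(n):5d} pairwise adapters, "
          f"{internet_adapters(n):3d} Internet adapters")
```

For 10 networks the pairwise approach already needs 45 adapter developments instead of 10, and for 50 networks it needs 1225 instead of 50. The gap grows quadratically, which is why a single intermediary protocol scales so much better.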
The Internet technology was also useful to restructure existing data networks of legacy applications, like the established computer reservation systems and global distribution systems, for more simplicity, connectivity, flexibility, and performance (Fig. 5). Layered virtual Internet Protocols are enablers for the easy integration not only of legacy networks but also of innovative future networks, because each requires only the development of one new Internet adapter (Fig. 6). The flexibility and migration friendliness of the Internet architecture supported the growth of virtual global airport and airline data network providers like SITA (Société Internationale de Télécommunication Aéronautique) and ARINC (Aeronautical Radio Incorporated – today Rockwell). The Internet also supported consolidation and internal innovation processes in the GDS industry. Sabre/Abacus, Amadeus, Travelport (Galileo and Worldspan), and China’s TravelSky survived as major global distribution systems with interconnected national CRS networks for rail, ferry, tour, and cruise operators and form the backbone for many booking channels between suppliers and travel
Fig. 4 The Internet as inter-networking technology via standardized Internet Protocols (IPr, i.e., UDP and TCP/IP)
Fig. 5 The INTERNET as virtual homogeneous end-to-end data package network. Figure note: seen from an application developer’s perspective, the Internet is a global, virtually homogeneous network that transfers packages of binary data end to end between computing devices identified by IP addresses according to standardized Internet Protocols (IPr, i.e., UDP and TCP/IP). Neither routes, speed, reliable delivery, nor data security are ensured.
agencies until today (Schulz 2014; Benckendorff et al. 2019). Beyond this, the Internet architecture became even a platform for Internet standard applications, like e-mail, file transfer, and the World Wide Web (WWW-Fig. 6), which transformed E-Tourism also from an end user and customer perspective.
Fig. 6 The layered Internet architecture as basis for the World Wide Web. Layers from top to bottom:
Internet application protocols: DNS (Domain Name Service); WWW (HTML, HTTP(S)); e-mail (MIME, IMAP, POP, SMTP); file transfer (FTP); voice over IP; video over IP
End-to-end data transport protocols: UDP (User Datagram Protocol), connectionless, i.e., message-like package transfer; TCP (Transmission Control Protocol), structure of data packages and rules for their end-to-end connection-oriented transport and sequencing between a sending and a receiving application via different routes across different Internet subnetworks
Inter-networking protocol: IP (Internet Protocol), structure and allocation of IP addresses to computers as well as routing rules for the data package exchange between computers within and across different subnetworks
Internet adapters: IP/modem, IP/xDSL, IP/LAN, IP/WLAN, IP/3G, IP/4G, IP/5G
(Real) physical networks (subnetwork protocols): analogue telephone networks; digital subscriber lines; local area networks (IEEE 802.3); wireless LANs (IEEE 802.11); mobile networks (enhanced LTE/4G, 5G)
The World Wide Web as a Distributed Internet Multimedia Information System

While Internet Protocols and adapters offer benefits for programmers and network managers, the highest layer of the Internet architecture is designed to offer distributed applications for end users via the virtual global Internet. Internally, the Internet Protocol identifies sending and receiving computers by numerical Internet Protocol addresses (IP addresses). IP addresses are assigned to the Internet’s subnets and their computers by the Internet’s participating organizations in such a way that different computers never share the same IP address (Tanenbaum 1988; Crol 1992; Deitel et al. 2016; Comer 2018). The automatic mapping between virtually assigned IP addresses and real subnet addresses, e.g., LAN MAC addresses, phone numbers, etc., is the task of the network adapters and routers. Because IP addresses have become longer and longer (IPv4 to IPv6), no human being can remember them easily. The Domain Name System (DNS) is a basic Internet service for all other user-oriented Internet applications: It maps memorable strings like names of countries, organizations, companies, etc. as domain names to IP addresses (Fig. 7). The World Wide Web (WWW) was specified as a distributed multimedia information service for CERN’s particle collider researchers, who collaborate across numerous universities. It provides remote access to multimedia research documents stored on web servers with a web client software called a web browser (Berners-Lee 1989). Every page of a multimedia web document is referenced by its unique
Fig. 7 World Wide Web with seven steps of HTTP communication between web browser and web server:
1. The user types the URL (Uniform Resource Locator) http://www.mtourism.de/index.htm into the browser or clicks a web link with this URL.
2. The browser looks up the domain name at the domain name server (the figure’s example entries for .de are mtourism: 160.250.4.30 and mtourismus: 271.560.3.19).
3. The domain name server returns the IP address of the domain name.
4. The browser sends an HTTP request (HyperText Transfer Protocol) with the URL for index.htm to the web server.
5. The web server looks for the requested HTML page in its file system.
6. The web server returns the index.htm file in an HTTP response to the IP address of the browser computer.
7. The web/HTML page is saved temporarily in the browser cache; its HyperText Markup Language tags (instructions) are interpreted and rendered by the browser and presented as an interactive multimedia document in the browser window.
Uniform Resource Locator (URL). It is built from the web server’s domain name together with the directory path and name of the specific file describing the page in Hypertext Markup Language (HTML). Hypertext was invented earlier and means that an author may augment a text with reference links (hyperlinks), enabling every reader to navigate with a simple mouse click to any other page of the same document or even to other documents (Weithöner 2014a). The linear, one-dimensional sequence of pages of classic texts and books is overcome by web authors, who enable their readers to browse from hyperlink to hyperlink across a web of interlinked documents forming a multidimensional document hyperspace. The WWW’s innovation was to use the hypertext concept for a web of multimedia documents distributed across different web servers connected to the Internet, which can be accessed by any user with a web client on an Internet-connected PC. The Hypertext Transfer Protocol (HTTP) defines how web browser and web server communicate via the Internet whenever a user types the URL of a web page into the browser’s address field to get the corresponding HTML file delivered from the web server. The DNS (Domain Name System) helps the browser to find the correct IP address of the web server specified by the URL’s domain name (see Fig. 7). Because in the Internet architecture all data transfer is visible in every network on the path, HTTP Secure (HTTPS) was introduced, in which all HTTP data transfer between browser and web server is encrypted end to end. It is the duty of the web browser to interpret the HTML file, to fetch all multimedia content, and to render text and multimedia data for presentation on
a graphics display with sound output devices of the desktop multimedia personal computers (PCs) developed in the 1980s (Crol 1992; Goecke 2014a). The WWW thus introduced multimedia user interfaces for distributed network applications, which for too long had been the domain of alphanumeric terminal interfaces. It had a major impact on E-Tourism applications, where the established CRS and GDS had not been able to present any hotel pictures or other graphics to travel agents and their customers.
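The interplay of URL, DNS lookup, and HTTP request and response described for Fig. 7 can be sketched without any network access. The example below is a simulation under stated assumptions: the name table, the in-memory "file system", and all function names are hypothetical, and the IP address is the example value from Fig. 7 rather than a real server.

```python
from urllib.parse import urlsplit

# Hypothetical name table standing in for the Domain Name System.
DNS_TABLE = {"www.mtourism.de": "160.250.4.30"}

# Hypothetical in-memory "file system" of the web server.
SERVER_FILES = {"/index.htm": "<html><body><h1>Homepage</h1></body></html>"}


def resolve(host: str) -> str:
    """Steps 2 and 3: map a domain name to an IP address."""
    return DNS_TABLE[host]


def build_request(url: str):
    """Step 4: split the URL and build the text of an HTTP GET request."""
    parts = urlsplit(url)
    path = parts.path or "/"
    request = f"GET {path} HTTP/1.1\r\nHost: {parts.hostname}\r\n\r\n"
    return parts.hostname, request


def serve(request: str) -> str:
    """Steps 5 and 6: look up the requested file and build the HTTP response."""
    path = request.split(" ")[1]
    body = SERVER_FILES.get(path)
    if body is None:
        return "HTTP/1.1 404 Not Found\r\n\r\n"
    return f"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n{body}"


def browse(url: str):
    """Run steps 1 to 6 and return (server IP, status line, HTML body)."""
    host, request = build_request(url)
    ip = resolve(host)
    response = serve(request)
    head, _, body = response.partition("\r\n\r\n")
    return ip, head.split("\r\n")[0], body


ip, status, body = browse("http://www.mtourism.de/index.htm")
print(ip, status, body)
```

Step 7, the rendering of the returned HTML in the browser window, is left out here. A real browser would also open a TCP connection to the resolved address instead of calling the server function directly.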
HTML as Description Language for the Multimedia Hypertext Documents of the WWW

Hypertext Markup Language (HTML) consists of tags (instructions in angle brackets), which describe the structure and formatting of a multimedia document (web page) assembled from text, image, sound, or video content (Fig. 8). HTML pages include text tagged with file descriptors or URLs referring to the website’s non-text content resources for immediate download and rendering by the HTML-interpreting browser. Tagged hypertext reference links (as URLs) enable
Fig. 8 HTML file with CSS and JavaScript code describing a web page with mouse-over-photo. The figure shows the base structure of an HTML page with CSS and embedded JavaScript code for a DHTML web page (“Web page of Robert’s Internet Cafe”) together with these annotations:
Tags are HTML elements surrounding text as markup.
The “head” contains metadata about the web page (title, author, keywords, ...), which are inspected by search engines, as well as an inline CSS formatting rule and a link to a cascading style sheet with further format rules.
JavaScript programs may be included from JavaScript files or, as in the figure, embedded directly into the HTML code. The JavaScript function in the figure looks up the right image and changes the photo as a dynamic HTML roll-over effect whenever the specified mouse event occurs.
The “body” contains the main text, image, and multimedia content of the web page to be seen by users in the browser window: a level-1 header, a paragraph, a break, a clickable hypertext reference link to a web page of the same or another website, the file descriptor of a GIF graphics file that is automatically downloaded by the browser and presented on the web page, and the file descriptor of a JPG photo file.
the user to navigate to related HTML documents or Internet resources with a mouse click. Every HTML page contains internationally coded Unicode text and tags only. It must be rendered by a web browser to present the web page with all its multimedia content loaded from all resource files mentioned in its HTML code to the human user (Robbins 2019; w3schools.com/html 2019). A website (or web site) consists of all web pages interlinked within the same domain name under the responsibility of the same person or organization and normally having the same or a very similar layout (Goecke 2014a). The first (index) page of a website with the pure domain name as URL is also called the home page of a website, person, or organization. The word home page may also be used for a whole website as well as for the start page users can configure in their browser. HTML became a worldwide standard and has evolved over many years from HTML 1.0 now to HTML 5.2 under recommendation by the World Wide Web Consortium (W3C 2019).
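How a browser discovers the hyperlinks and the non-text resources mentioned in an HTML page can be sketched with Python’s standard HTML parser. The page below is a hypothetical reconstruction loosely modeled on the cafe example of Fig. 8; the file names, the link target, and the class name are invented for illustration and do not reproduce the original figure code.

```python
from html.parser import HTMLParser

# Hypothetical page loosely modeled on the cafe example of Fig. 8.
PAGE = """<!DOCTYPE html>
<html>
<head><title>Web page of Robert's Internet Cafe</title></head>
<body>
<h1>Welcome to Robert's Internet Cafe!</h1>
<p>Eat best home made cookies while surfing!<br>
<a href="http://www.example.org/students.html">Link</a></p>
<img src="logo.gif" alt="Our logo">
<img src="photo1.jpg" alt="Photo">
</body>
</html>"""


class ResourceCollector(HTMLParser):
    """Collect the title, hyperlink targets, and image files of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []      # URLs the user can navigate to with a click
        self.resources = []  # files the browser must download for rendering
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "img" and "src" in attrs:
            self.resources.append(attrs["src"])
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


collector = ResourceCollector()
collector.feed(PAGE)
print(collector.title)
print(collector.links)
print(collector.resources)
```

The sketch only lists what a browser would have to fetch. Actual rendering, layout, and the download of the listed files are outside its scope.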
Cascading Style Sheets as Additional Rule-Based Formatting Language for HTML

The formatting and presentation of the elements of a web page are defined by rules in Cascading Style Sheets (CSS) embedded or referenced in its HTML description (http://www.w3schools.com/css 2019; Robbins 2019). In a style sheet, a style rule with format instructions enclosed in curly braces { } can be defined for every specific HTML tag (Fig. 8). The format instructions tell the browser how content within the specific tag has to be presented with respect to color, text font, size, weight, underlining, word spacing, frame color and size, position, etc. In Fig. 8, the background color for the body, i.e., the visible part of the web page, is set to red. With style sheets, it is even possible to specify how links are colored when the user hovers the mouse over them or has already visited them. Another great benefit of style sheets is that all format instructions for many web pages can be held within the same CSS file. All web pages having a similar “corporate” or “branded” design may reference the same CSS file describing the corporate design rules. Every change of just one tag-formatting rule in the CSS file immediately changes the style attributes of all corresponding tags in all web pages referencing that particular CSS file, without having to alter local style code in all the affected web pages (http://www.w3schools.com/css 2019). It is even possible to refer to different CSS files within the same web page depending on the type of web browser or screen size to adapt the style of the same HTML page to different user devices.
JavaScript as Browser Programming Language and Browser Plug-Ins

HTML and CSS are not algorithmic programming languages because they have no vocabulary to specify calculations, mathematical functions, or iterations of commands. Algorithmic programming is necessary for animations or dynamic user interactions like, e.g., browser-side price calculations for selected shopping basket items or dynamic effects like mouse-over buttons, drop-down menus, etc.
To give authors of web documents the possibility to write their own programs, which monitor all user interaction events in the browser and manipulate the content of the browser window dynamically, JavaScript was added to the HTML/CSS technology. Despite its name, JavaScript is an independent scripting language that borrows only parts of its syntax from the popular object-oriented and machine-independent Java programming language; its code may be included from separate files or embedded directly into HTML code (Fig. 8). JavaScript code is not rendered by the browser’s HTML interpreter but passed for immediate processing to the browser’s JavaScript interpreter, which runs in a “sandbox” environment to prevent insecure, harmful calls to a computer’s operating system routines. HTML pages with interactive JavaScript features are also called Dynamic HTML (DHTML) pages (http://www.w3schools.com/js 2019; Robbins 2019; Goecke 2014a). Real-time changes that JavaScript code makes to HTML elements in the web browser have no effect on the original HTML pages stored on the web server. Because JavaScript is a full-blown algorithmic programming language, it has the capability to send information, e.g., about certain user actions on the web page, back to the web server via HTTP. Therefore, some users disable the JavaScript interpreter in their browser settings, with the effect that they can no longer see or use fancy DHTML user interfaces. Website providers need to keep in mind that the most important user interactions of every web page should remain accessible even without JavaScript support. Another way to enable web browsers to render special dynamic, interactive, or non-HTML/CSS content is the browser plug-in interface. With the user’s active permission, a browser may load and install executable binary code modules called browser plug-ins into the sandbox (Deitel et al. 2016). A loaded and installed plug-in may be started with special HTML or JavaScript commands from a web page, or it starts whenever a corresponding non-HTML document is loaded into the browser.
Famous examples of browser plug-ins are Adobe’s PDF Reader, Microsoft’s Office Reader, RealPlayer, Apple’s QuickTime Player, or Macromedia/Adobe’s Flash Player. They enable web browsers to display documents in a fixed printer layout, Word/Excel/PowerPoint Office files, radio/TV media streams, or interactive multimedia animations including sound and videos, as well as machine-independent Java applications (applets). Special plug-ins for interactive geographic maps or 3D sceneries are also available and support virtual-reality applications with avatars like the famous Second Life virtual world from 2003.
HTML Forms, Server-Side Scripts, and Web-Enabled Applications

Many E-Tourism websites of the first years were more or less dynamic virtual brochures showing text with photos or videos of destinations, attractions, hotels, and restaurants. Users could communicate with the website owners via active e-mail address links on the web page, which on a click started the local PC’s e-mail client with the e-mail address prefilled in the receiver header. For more structured and even semiautomated dialogues between web page users and website providers, HTML offers special form tags.
Fig. 9 Web application with browser and server-side scripts for form-based dialogue sessions. The figure shows a web browser with browser window, HTML interpreter and renderer, JavaScript interpreter, and a cache to store temporary files (HTML, CSS, JavaScript, media, cookies), displaying the HTML form at https://www.mtourism.de/email-request-form.html; a web server with session management and server-side script interpreters with access to databases and applications and to the website files (HTML, CSS, JavaScript, media, server-side scripts, log files); and the HTTP(S) connection between them via the Internet. The example response page thanks the user for the request and adds last-minute weekend offers (weekend suite 90 € incl. VAT, single bed room 60 € incl. VAT).
HTML form tags (w3schools.com/html 2019) define the structure of a form and its input fields, which may contain text or predefined items selectable in menus, as well as selection buttons and a send-form button. When the user clicks the send button, the browser calls the URL of a so-called server-side script on the web server and sends all user entries of the form’s input fields to the server-side script. The server-side script processes the form’s input data and generates an HTML response, which is sent back to the user’s browser to display a feedback message. Web servers have a session management (Fig. 9) to control multi-step user dialogues involving sequences of forms and response pages (http://www.w3schools.com/php 2019): They assign unique session IDs and session variables to every ongoing user dialogue until a session timeout happens. Every file access is logged in the web server’s log file. The preprocessing of embedded server-side script languages like PHP (PHP: Hypertext Preprocessor), JSP (Java Server Pages from Sun/Oracle), or ASP (Active Server Pages from Microsoft) creates dynamically generated web pages with timely and personalized output from the web server to the end user in response to the user’s data and requests in HTML web forms (Deitel et al. 2016). In our example (Fig. 9), an automated call to a database or other applications from a server-side script could retrieve recent last-minute offers and insert them as sales promotion hints into the response web page sent back to the user.
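The form-based dialogue described above can be sketched as a single server-side function. This is a minimal illustration in Python rather than PHP, JSP, or ASP; the session store, the offer list (which echoes the prices shown in Fig. 9), the form field name, and all function names are hypothetical assumptions.

```python
import html
import uuid
from typing import Optional
from urllib.parse import parse_qs

# Hypothetical in-memory session store: session id -> session variables.
SESSIONS = {}


def last_minute_offers():
    """Hypothetical stand-in for a database query (values echo Fig. 9)."""
    return [("Weekend suite", 90), ("Single bed room", 60)]


def handle_form(body: str, session_id: Optional[str] = None):
    """Process URL-encoded form data and return (session_id, html_response)."""
    fields = parse_qs(body)
    name = fields.get("name", ["guest"])[0]

    # Session management: create a new session or continue an existing one.
    if session_id not in SESSIONS:
        session_id = uuid.uuid4().hex
        SESSIONS[session_id] = {"requests": 0}
    SESSIONS[session_id]["requests"] += 1

    # Dynamically generated response page; user input is escaped before
    # it is inserted into the HTML output.
    offers = "".join(
        f"<li>{html.escape(title)} {price} EUR incl. VAT</li>"
        for title, price in last_minute_offers())
    page = (f"<html><body><p>Thank you {html.escape(name)} for your request.</p>"
            f"<ul>{offers}</ul></body></html>")
    return session_id, page


sid, page = handle_form("name=Robert+Goecke&message=Hello")
print(page)
```

A second call with the returned session id continues the same dialogue, which is how multi-step sequences of forms and response pages can share session variables. Session timeouts and logging, which the text also mentions, are omitted from this sketch.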
Form-based dialogue sessions between web users and servers via active server-side script programming turned static brochure-like websites into web-enabled applications. They may include further data communication with databases or other applications, especially so-called legacy applications, which, like CRS or GDS, existed long before the invention of the WWW (Fig. 9). With HTML forms and server-side scripting, the WWW evolved from a distributed multimedia document network into a global platform for distributed web applications implementing Internet multimedia information systems. A special benefit of a web application in comparison to a classic PC application is that every user with an Internet connection and a web browser can use it immediately without installing any application-specific client software on the user's device. Web applications thus removed the main barriers to safe global application access for consumers.
Internet Booking Engines as Self-Service Enablers in Tourism
The most important web application for E-Tourism is the Internet Booking Engine (IBE), which from the technology standpoint of this article is more correctly called a web booking engine (WBE) (Werthner and Klein 1999; Buhalis 2003; Egger 2005; Weithöner 2007, 2014a,b; Benckendorff et al. 2014, 2019). An old-fashioned computer reservation system (CRS) with an alphanumeric terminal user interface communicating via the Internet with the CRS host may be called an online booking system, but today nobody would call it an Internet Booking Engine, although it certainly uses the Internet. Before the invention of the WWW, other proprietary online booking systems existed even for consumers, for example, as part of the German Bildschirmtext (BTX) service, France's Minitel, or the CompuServe and America Online services. What most vendors and users mean today when they speak of an Internet Booking Engine (IBE) is a special web application in which users can search for tourism offerings and book them in a web browser. Therefore, in this chapter the more appropriate term Web Booking Engine (WBE) is used for all web-based Internet Booking Engines (IBEs) in the market. Famous web booking engines were introduced by Expedia, a website developed by Microsoft as a showcase for the benefits of the Internet and the World Wide Web for end users, by the Sabre GDS with its Travelocity website, by HRS, a hotel reservation system in Germany, and by Tiscover and Gulliver, two destination marketing websites of Tyrol and Ireland (Werthner and Klein 1999; Buhalis 2003; Egger and Buhalis 2007). A simplified architecture of a web booking engine for flights is shown in Fig. 10. The inventory of all bookable flights is maintained in the flights table of a relational database. Every record in the flights table describes the number of vacant seats in a specific booking class of a flight.
Whenever a flight is booked, the number of vacant seats is reduced by one in the corresponding flight record. For every booking, a record with the name and address of the passenger as well as the flight and class/seat is created in the table of passenger name records (PNR). An airline as supplier of flights has a password-protected HTTPS browser login to the web booking engine via server-side scripts. The web booking engine's server-side scripts control the form-based dialogues used by the supplier to maintain the flight records table and to filter a passenger name record list for every flight from the PNR table. For access to the database, the server-side scripts send standardized Structured Query Language (SQL) queries via its SQL interface to the database, which returns text lists with the query results (Deitel et al. 2016; Elmasri and Shamkant 2017; http://www.w3schools.com/SQL 2019). The query results are augmented by the server-side scripts with HTML/CSS tags and sent back via HTTP(S) to the supplier's browser at the end of each dialogue step.

[Figure 10, diagram: customer's browser (1. flight search mask, 3. flight list, 4. reservation, 7. pay, 10. confirmation) and supplier's browser (A. HTML form to create and edit flight records, B. PNR list edit form) connected via HTTP(S) over the Internet to the web server of the Web Booking Engine (WBE). WBE files: flight search form, booking form, pictures of airline logos, etc. WBE server-side scripts: 2. flight selection, 5. seat & PNR, 6. payment, 9. booking & confirmation. Step 8 is the payment web application of an Internet payment provider such as PayPal. The scripts send SQL (Structured Query Language) commands to a database management system with filter/insert/update/delete record transaction commands, which holds the table of available flight records (flightID, Src, Dest, Dep-Arrival, Class, Avail, Fare) and the table of passenger name records, PNR (PassengerID, Name, eMail, flightID, Class/Seat, Date).]
Fig. 10 Architecture of Internet/web booking engine for airline flights with ten booking steps

The website of the airline flight booking engine may be used by consumers or by travel agencies. Typical dialogue steps between a user's browser and the web booking engine are shown in Fig. 10. They are numbered sequentially from one to ten. Three examples of simplified SQL-like queries sent by a web booking engine's server-side scripts to the database can be seen in Fig. 11. The database executes the SQL queries, manipulates the data tables, and returns result lists as text to the calling server-side script, which inserts HTML/CSS tags, etc. and returns the formatted HTML result list to the calling user's web browser.

[Figure 11, diagram: web server (Select & Book: flight search, book & pay; Airline-Inventory Supplier Admin: login, flight records, passenger lists) sending SQL commands to a database with flight_table and passenger_table, which are linked by flightID. The three example queries are:]

SELECT * FROM flight_table WHERE Src='MUC' AND Dest='FRA' AND avail>0 AND Dep='12.12.2019'
Delivers a list with all flight records from the flight table with available seats from MUC to FRA departing 12.12.2019.

BEGIN TRANSACTION;
INSERT INTO passenger_table VALUES (ID, 'Robert Goecke', eMail&Address, 1430, 'eco', …);
UPDATE flight_table SET avail=avail-1 WHERE flightID=1430 AND class='eco' AND avail>0;
END
Inserts a passenger record into passenger_table for flightID 1430 and reserves one economy seat (avail reduced by one seat) as an atomic all-or-nothing transaction, i.e., if the update command fails (e.g., avail = 0), all commands are rolled back (e.g., the previous insert command is undone).

SELECT PAX.Name, PAX.eMail&Address, PAX.flightID, Flights.Dest, Flights.Dep FROM flight_table AS Flights, passenger_table AS PAX WHERE PAX.Name='Robert Goecke' AND PAX.PID=ID AND Flights.flightID=PAX.flightID
SQL join query that delivers a list with name, eMail&Address, flightID, destination, and departure of all passenger booking records by "Robert Goecke". Mapping of flight to passenger is possible by joining all flight_table records and passenger_table records with the same key attribute flightID.
Fig. 11 Examples of SQL-like queries sent from server-side scripts of a WBE to a database

An important step handled outside a web booking engine is the payment sub-process. Payment service providers (Goecke 2014c) offer specialized web applications, where customers can authorize payments like credit card payments as part of e-commerce transactions invoked by web booking engines or other web shopping sites. The web booking engine redirects the user to the payment service website before the booking is confirmed. A secured HTTPS payment authorization dialogue takes place between the user and the payment service provider's website. After a successful authorization the user is redirected back to the web booking engine in a secured way, and the booking is confirmed. Otherwise the booking transaction is rolled back, and the end user gets no booking confirmation because of payment failure. A pioneer for such a secure decoupling of a booking or shopping process from the payment process is PayPal. Without this decoupling of payment services from the shopping or booking process, a website would have to handle the payment process by itself, with all security risks for both the website owner and the user. Large and well-known e-commerce or travel websites implemented their own payment services, which may establish a closer customer relationship, increase customer retention, and enable convenient one-click shopping. Some e-commerce platforms, e.g., Alibaba in China, also opened their well-established payment service Alipay for third-party payments of their customers, which is easier than traditional banking transactions and offers deeper insights into consumer spending habits and demands.
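The flight search, the all-or-nothing booking transaction, and the join query of Fig. 11, together with the payment-dependent confirmation, can be sketched with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the table layout is simplified, the function names and the `authorize_payment` callback are hypothetical, and the sample data is made up. Note that in SQL an UPDATE that matches no row is not an error by itself, so the sketch checks the affected row count explicitly before committing.

```python
import sqlite3


def create_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one flight (illustrative data only)."""
    # isolation_level=None: BEGIN/COMMIT/ROLLBACK are issued explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE flight_table (flightID INTEGER, src TEXT, dest TEXT, "
        "dep TEXT, class TEXT, avail INTEGER, fare REAL, "
        "PRIMARY KEY (flightID, class))"
    )
    conn.execute(
        "CREATE TABLE passenger_table (pid INTEGER PRIMARY KEY, name TEXT, "
        "email TEXT, flightID INTEGER, class TEXT)"
    )
    conn.execute(
        "INSERT INTO flight_table "
        "VALUES (1430, 'MUC', 'FRA', '2019-12-12', 'eco', 1, 99.0)"
    )
    return conn


def search_flights(conn, src, dest, dep):
    """First query of Fig. 11: flights with vacant seats on a given date."""
    return conn.execute(
        "SELECT flightID, class, avail, fare FROM flight_table "
        "WHERE src = ? AND dest = ? AND dep = ? AND avail > 0",
        (src, dest, dep),
    ).fetchall()


def book_flight(conn, flight_id, cls, name, email, authorize_payment) -> bool:
    """Insert a PNR and reserve one seat atomically; commit only if payment succeeds."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "INSERT INTO passenger_table (name, email, flightID, class) "
            "VALUES (?, ?, ?, ?)",
            (name, email, flight_id, cls),
        )
        cur = conn.execute(
            "UPDATE flight_table SET avail = avail - 1 "
            "WHERE flightID = ? AND class = ? AND avail > 0",
            (flight_id, cls),
        )
        # No seat left (no row updated) or payment refused: undo the insert too.
        if cur.rowcount != 1 or not authorize_payment(name):
            conn.execute("ROLLBACK")
            return False
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise


def bookings_of(conn, name):
    """Third query of Fig. 11: join PNRs with flights on flightID."""
    return conn.execute(
        "SELECT p.name, p.email, f.flightID, f.dest, f.dep "
        "FROM passenger_table AS p JOIN flight_table AS f "
        "ON f.flightID = p.flightID AND f.class = p.class "
        "WHERE p.name = ?",
        (name,),
    ).fetchall()
```

The callback stands in for the redirect to the payment service provider. A production booking engine would probably not keep a database transaction open during that dialogue but hold the seat with a time-limited reservation instead; the sketch only illustrates the rollback behaviour described in the text.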
Integration of Legacy Booking Systems with Web Front Ends
A big benefit of web application technology is the simple integration of existing legacy applications like mainframe-based or proprietary CRS and GDS with easy-to-use web front ends (WFE). Figure 12 shows how server-side scripts may call existing computer reservation systems via remote procedure calls to a Web-CRS interface. Instead of reprogramming the complex business logic on top of a new database, established legacy systems with their proven business logic and efficient data management can be reused by just implementing a new Web-CRS interface. GDS providers, many airlines, rail operators, hotel chains, leading rental car companies, and tour operators have managed to develop innovative multimedia web front ends for their legacy systems, which enabled consumers to book standard tourism products with simplified self-service processes (Werthner and Klein 1999; Buhalis 2003; Zhou 2004; Egger and Buhalis 2007).

[Figure 12, diagram: web browser for B2B & B2C commerce (multimedia presentation and reservation of travel segments, package tours, tickets) connected via HTTP(S) over the Internet to the web server. Web front end files: dialogue forms, booking forms, hotel, room, and destination media. WFE server-side scripts: segment selection, reservation, booking in database, and payment via the payment web application of an Internet payment provider such as PayPal. The scripts use Web-CRS procedure calls to an interface of the CRS or GDS host, whose CRS programs contain the logic for dialogue and booking control and access the internal CRS database. Variants: GDS web front ends for travel agents; CRS web front ends for websites of airlines, hotel chains, tour operators, rail/cruise/car rental operators, etc.]
Fig. 12 Web booking engine as web front end (WFE) of legacy CRS and GDS

Even for stationary travel agents, the GDS redesigned their terminal-oriented user interfaces and migrated them to convenient browser-based sales suites as hybrids of locally installed business applications and multimedia web clients to access photos, videos, and maps describing destinations, hotels, cars, etc. Just as the Internet reused existing proprietary networks with adapters for a new virtual network, web front ends together with Web-CRS interfaces allowed a very fast innovation of existing legacy applications with convenient multimedia facades, giving users the virtual impression of a completely new application. The WWW actually prolonged the life cycle of many well-established tourism applications and protected the system owners' investments.
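The idea of a Web-CRS interface as a thin facade over an unchanged legacy system can be illustrated as follows. Everything in this sketch is hypothetical: the `LegacyCRS` class, its terminal-style command syntax (`AV`, `BK`), and the flight code `XX100` are invented for illustration, and the in-process method call stands in for the remote procedure call mentioned in the text.

```python
from dataclasses import dataclass


class LegacyCRS:
    """Stand-in for a terminal-oriented CRS host (hypothetical command syntax)."""

    def __init__(self) -> None:
        self._seats = {("XX100", "Y"): 2}

    def process(self, command: str) -> str:
        # Commands look like "AV XX100 Y" (availability) or "BK XX100 Y GOECKE".
        parts = command.split()
        key = (parts[1], parts[2])
        if parts[0] == "AV":
            return f"OK {self._seats.get(key, 0)}"
        if parts[0] == "BK" and self._seats.get(key, 0) > 0:
            self._seats[key] -= 1
            return f"OK {parts[3]}"
        return "ERR"


@dataclass
class WebCRSInterface:
    """Translates web form fields into CRS commands and parses the replies."""

    crs: LegacyCRS

    def availability(self, flight: str, booking_class: str) -> int:
        reply = self.crs.process(f"AV {flight} {booking_class}")
        return int(reply.split()[1])

    def book(self, flight: str, booking_class: str, surname: str) -> bool:
        reply = self.crs.process(f"BK {flight} {booking_class} {surname.upper()}")
        return reply.startswith("OK")
```

A server-side script of the web front end would only call `availability` and `book`; the booking logic and the seat inventory stay inside the legacy system, which is the reuse argument made above.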
Touristic IBEs, Package Tour Comparison Systems, and Specialized Online Travel Agent WBEs
While many web booking engines started as web front ends of legacy systems, web booking engines for prepackaged tours described in print catalogues, which are popular in the UK and Germany, had to be invented completely from scratch (Buhalis 2003). National CRS networks (Tour DN in Fig. 3) as affiliates of the GDS had already connected travel agents directly with the tour operator systems to book those catalogue-specified packages. New ways of collecting all vacant tour package offerings from the tour operator systems, so that competing offers from different tour operators could be compared electronically, had to be combined with innovative searchable eCatalogues. Important pioneers were GIATA for eCatalogues, TravelTainment and IFF as developers of a package tour web booking engine for Expedia Germany based on national CRS networks, as well as Traffics, Travel IT, and Bewotec, who integrated innovative package tour web booking engines with their own tour operator CRS connection platforms (Goecke and Weithöner 2014; Goecke et al. 2010). All those package tour WBEs, which are also called touristic IBEs, enabled both end consumers at home and travel agency staff to compare competing tour operators' package tour offerings in one comparison list before booking them either via WBE or via legacy Tour DN and CRS. Completely new web booking engines and web-based CRS were also developed by leading online hotel booking platforms, e.g., HRS and Booking.com, exclusively for their portals, or by HitchHiker, Ypsilon.Net, and other travel technology providers for consolidators. Further innovations have been IBEs/WBEs for apartments, cruises, events, and restaurant tables, the IBE/WBE developments of leading travel portals in China and India, as well as the famous sharing service web booking engines like CouchSurfing or Airbnb. All those self-service web booking engines and web shops for consumers started a new dimension of virtualization: New intermediaries like automated or semiautomated virtual shops and virtual travel agencies (called online travel agencies, OTAs), available 24 hours a day and 7 days a week, started to substitute (disintermediate) real bricks-and-mortar businesses. Many shop employees, travel agents, and even call center agents lost their jobs (Statista 2019).
On the other hand, the self-service paradigm of web booking engines created jobs in the Internet economy and increased the workload for smaller hotel and accommodation suppliers, who have to maintain their vacancies manually via browser in a growing number of electronic distribution channels (Buhalis and Laws 2001). Subsequent developments of more advanced web technologies and E-Tourism web applications (see Goecke 2020) led to further transformations of business models and production processes like dynamic packaging and to innovative cloud-based CRS and GDS architectures and mobile reservation apps.
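The comparison list at the heart of a package tour WBE can be pictured as a merge of the vacancy feeds of several tour operators over a shared hotel key, as an eCatalogue might provide. The following is only a sketch: the `Offer` fields, the function name `comparison_list`, and the matching criteria are assumptions chosen for illustration, not the data model of any of the systems named above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Offer:
    operator: str      # tour operator name (made up in any usage example)
    hotel_id: str      # shared hotel key, e.g. taken from an eCatalogue
    nights: int
    price_eur: float


def comparison_list(
    feeds: Iterable[Iterable[Offer]], hotel_id: str, nights: int
) -> list[Offer]:
    """Merge the vacancy feeds of several tour operators into one price-sorted list."""
    matches = [
        offer
        for feed in feeds
        for offer in feed
        if offer.hotel_id == hotel_id and offer.nights == nights
    ]
    return sorted(matches, key=lambda o: o.price_eur)
```

The shared `hotel_id` is what makes offers from different operators comparable at all, which hints at why searchable eCatalogues were a precondition for such comparison systems.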
Evolution from Mainframe via Distributed Web Services to Cloud Computing
The WWW and its networking technology moved many corporate applications from centralized company-owned data centers to automated large-scale Internet service providers (ISPs), providing e-mail and web hosting, and to application service providers (ASPs), hosting special web applications and web services. With special multi-tenant virtual web server software, they can organize the efficient sharing of one single web server hardware by dozens or hundreds of less frequented websites as a cheap alternative to the classic hosting of web servers and applications on dedicated hardware servers. When users and applications of different organizations cooperate via the Internet, it is necessary to segment the Internet technically according to three organizational security levels (Comer 2018): Open to the public Internet are all functions of a web application that are accessible to every user without any registration or after a simple e-mail-verified self-registration. The extranet of an organization grants limited application access to trusted external users or applications from identified business partners via authenticated Internet login or encrypted virtual private network (VPN) links. Internal users or applications of an organization belong to the organization's intranet, which is either a physical private network using Internet technology without any direct physical connection to the public Internet or is partly connected to the public Internet by virtual private network links and via firewall routers, which block all data packets from unauthorized senders. Over the years, both proprietary local area network protocols and proprietary client-server applications of the PC era were replaced by web-based intranet applications. On the application level of the WWW, all communication between users and applications as well as between applications should use secure HTTPS protocols to comply with today's privacy and data protection laws. Because highly frequented web applications like web booking engines or search engines run into performance problems when too many users search in parallel, Google, IBM, Amazon, Microsoft, Oracle, and the global content delivery network Akamai developed special virtualization methods for global content management systems.
They replicate the function of a web server, spread content copies globally, or distribute a search index and databases to thousands of physical multiprocessor servers located in one or more server farms (Marinescu 2018; Wiktorski 2019). Some providers of touristic IBEs used the low prices of memory chips to co-invent in-memory databases granting instant access to all vacancy data without the need for time-consuming file loads from hard disks. NoSQL or NOSQL databases (meaning "No" at first and "Not Only" later) made massively distributed parallel queries more efficient than SQL databases. They support stream queries for media, click, and sensor data streams as well as knowledge graph queries of advanced semantic web applications (Marinescu 2018; Goecke 2020). While centralized systems are vulnerable to system failures, the CAP theorem established a general tradeoff between data consistency, instant service availability, and tolerance to partial system failures for distributed systems (Kemper and Eickler 2015). Besides such new economies of scale and scope, the shortage of qualified IT experts was a main reason why many tourism suppliers, tour operators, and travel agents outsourced their IT systems and IT departments. Because of antitrust rulings in the USA and the spread of the WWW, most founding airlines of the GDS sold their capital shares, which made the GDS airline-independent and eligible to provide IT outsourcing for all kinds of tourism suppliers (Egger and Buhalis 2007): Many chose GDS as hosters for their internal reservation systems and as providers for E-Tourism applications. The leading GDS acquired E-Tourism technology start-ups to integrate the newest web applications into their travel solution portfolios. They even founded their own OTAs, like Sabre with Travelocity, while Galileo owned Orbitz and Amadeus held a major stake in Opodo for a long time. However, around 2010 all GDS faced competition in their core business from global new entrants (GNEs), who tried to substitute mainframe-based GDS with innovative distributed GDS software running on hundreds of PC servers in server farms. Google bought one of those GNE start-ups as a foundation for its flight search engine. The GDS learned: Amadeus migrated its mainframe technology to a distributed server cloud (Campbell 2017), Sabre announced a cloud migration, and Sabre and Travelport acquired the remaining GNEs. China's TravelSky GDS also grew dynamically from a state company to a Hong Kong Stock Exchange-traded company, which acquired several travel technology WBE platform pioneers like OpenJaw and started a cooperation with SITA. Today's data center virtualization technologies are able to create, allocate, reallocate, and delete more than a thousand task-specific virtual web or application services on real servers within seconds to offer a highly performing cloud service dispersed anywhere in the Internet cloud. Google, Amazon, Microsoft, Facebook, and IBM offer web services and applications distributed dynamically according to required processing loads, network bandwidth, or distance requirements within their global clouds of interconnected data centers (Baun et al. 2009; Marinescu 2018). At the same time, IBM introduced a new version of its high-performance mainframe systems in 2018; mainframes are not only still in use by some GDS but are still irreplaceable in industries like banking and insurance. The strength of the WWW and its subsequent innovations is its capability to integrate centralized and decentralized technologies in a seamless way unnoticed by the users.
The underlying information hiding principle of virtualization reduced complexity and increased flexibility, reuse, sharing, and user interface convenience for tourists and suppliers in an interconnected world.
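How an in-memory vacancy index might be distributed over many servers, and how a query hides that distribution from the caller, can be sketched as follows. This is an illustration under simplifying assumptions, not the design of any IBE named in this chapter: each "server" is just a Python dictionary, the class name `ShardedVacancyStore` and its methods are hypothetical, and replication and failure handling (the concerns behind the CAP tradeoff) are left out.

```python
import hashlib
from collections import defaultdict


class ShardedVacancyStore:
    """Keeps vacancy records in memory, partitioned across N simulated servers."""

    def __init__(self, shard_count: int) -> None:
        # Each dict stands in for the RAM of one physical server.
        self._shards = [defaultdict(list) for _ in range(shard_count)]

    def _shard_for(self, hotel_id: str):
        # A stable hash (unlike built-in hash()) picks the same shard in every process.
        digest = hashlib.sha256(hotel_id.encode("utf-8")).digest()
        return self._shards[int.from_bytes(digest[:4], "big") % len(self._shards)]

    def add(self, hotel_id: str, date: str, rooms: int) -> None:
        self._shard_for(hotel_id)[hotel_id].append((date, rooms))

    def lookup(self, hotel_id: str):
        """A key lookup touches exactly one shard."""
        return list(self._shard_for(hotel_id).get(hotel_id, []))

    def search(self, date: str):
        """Scatter-gather: ask every shard, then merge the partial results."""
        hits = []
        for shard in self._shards:  # a real system would query the shards in parallel
            for hotel_id, records in shard.items():
                hits.extend((hotel_id, r) for d, r in records if d == date and r > 0)
        return sorted(hits)
```

The caller of `search` sees one logical store, regardless of how many shards exist behind it, which is a small-scale instance of the information hiding that the text attributes to virtualization.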
Synopsis: Automation, Customer-Oriented Self-service, and the Two Faces of Virtualization
As we have seen, online booking systems automated the internal reservation and booking processes of touristic suppliers and enabled self-service booking for travel agents and end customers on a global scale. Global distribution systems, tour operator CRS, Internet Booking Engines, and online travel agents transformed travel distribution chains and introduced more customer-oriented flexible packaging as a precursor of dynamic packaging, which relies on a combination of many distributed CRS of different suppliers by advanced web technologies (Goecke 2020 in this Handbook of E-Tourism). From a technology perspective, online booking systems are a good example of how virtual facades supported an evolutionary innovation process based on the reuse and coexistence of both established and disruptive new IT technologies: Instead of successive revolutionary and expensive total replacements of legacy systems and networks by next-generation technologies, today's Internet-based online booking systems may include mainframe- or cloud-based GDS, PC-/web server-based PMS, and tour operator CRS as well as web- or cloud-based IBEs usable via browsers and mobile apps across all kinds of fixed or mobile networks. From a social interaction perspective, online booking systems introduced screen-based booking experiences as a virtual substitute for personal agent-assisted booking. This virtualization led to a substitution of human travel agents and bricks-and-mortar travel agencies: Some jobs have been reallocated to remote customer care centers, and new job profiles have been created at online travel agents and travel technology suppliers. While virtualization supported a smooth technology evolution, its effects on the customer journey, touristic business models and processes, and job profiles in touristic sales and marketing have been revolutionary.
Cross-References
Advanced Web Technologies and E-Tourism Web Applications
Web Information Retrieval and Search
References
Amadeus (1990). Das Rechenzentrum Amadeus – eine Dokumentation. München: Data Processing & Co GmbH.
Baun, Ch., Kunze, M., Nimis, J., Tai, St. (2009). Cloud Computing – Web-Based Dynamic IT Services. Berlin: Springer.
Benckendorff, P.J., Sheldon, P.J., Fesenmaier, D.R. (2014). Tourism Information Technology. 2nd Edition. Wallingford and Boston: Cabi.
Benckendorff, P.J., Xiang, Z., Sheldon, P. (2019). Tourism Information Technology. 3rd Edition. Wallingford and Boston: Cabi.
Berners-Lee, T. (1989 and 2019). https://www.w3.org/blog/2019/03/30-years-ago-the-worldchanged-forever/ (Accessed 15. December 2019).
Buhalis, D., Laws, E. (2001). Tourism Distribution Channels: Patterns, Practices and Challenges. London: Thomson.
Buhalis, D. (2003). eTourism – Information technology for strategic tourism management. Harlow: Pearson Education.
Campbell, J. (2017). Amadeus To Retire 'Workhorse' Mainframes. In: The Company Dime. 22. March 2017. https://www.thecompanydime.com/mainframe/ (Accessed 12. March 2019).
Comer, D.E. (2018). The Internet Book: Everything You Need to Know about Computer Networking and How the Internet Works. 5th Edition. Boca Raton FL: CRC Press.
Crol, E. (1992). The Whole INTERNET – User's Guide & Catalogue. Sebastopol CA: O'Reilly.
Deitel, H.M. (1990). Operating Systems. 2nd Edition. Reading (Mass): Addison Wesley.
Deitel, P.J., Deitel, H.M., Deitel, A. (2016). Internet and World Wide Web How to Program. 5th International Edition. Boston et al.: Pearson Education.
Egger, R. (2005). Grundlagen des eTourism. Aachen: Shaker.
Egger, R., Buhalis, D. (Eds.) (2007). eTourism Case Studies. Amsterdam: Butterworth Heinemann.
Elmasri, R., Shamkant N. (2017). Fundamentals of Database Systems. 7th Edition. Harlow: Pearson.
Goecke, R., Eberhard, T., Roth, J. (2010). Neue Wege zur Navigation durch die Datenflut der Reiseangebote – auf der Suche nach neuer Beratungsqualität im digitalen Zeitalter. Arbeitsbericht der Fakultät für Tourismus, Hochschule München. https://w3-mediapool.hm.edu/mediapool/media/fk14/fk14_lokal/diefakultt_1/forschungundprojekte/it/Arbeitsberichtl-Goecke_Eberhard_Roth_Internet.pdf (Accessed 20. September 2019).
Goecke, R. (2014a). Systemarchitekturen touristischer IT-Applikationen. In: Schulz, A., Weithöner, U., Egger, R., Goecke, R. (Eds.). eTourismus – Prozesse und Systeme. 2nd Edition. München and Berlin: DeGruyter, 13–24.
Goecke, R. (2014b). Informationsmanagement in Hotel- und Gastronomiebetrieben. In: Schulz, A., Weithöner, U., Egger, R., Goecke, R. (Eds.). eTourismus – Prozesse und Systeme. 2nd Edition. München and Berlin: DeGruyter, 371–405.
Goecke, R. (2014c). Elektronische Zahlungs- und Kartensysteme. In: Schulz, A., Weithöner, U., Egger, R., Goecke, R. (Eds.). eTourismus – Prozesse und Systeme. 2nd Edition. München and Berlin: DeGruyter, 516–535.
Goecke, R. (2020). Advanced Web Technologies & E-Tourism Web Applications. In: Xiang, Zh., Fuchs, M., Gretzel, U., Höpken, W. (2020). Handbook of E-Tourism. Springer.
Goecke, R., Weithöner, U. (2014). IT-Systeme und Prozesse bei Reiseveranstaltern. In: Schulz, A., Weithöner, U., Egger, R., Goecke, R. (Eds.). eTourismus – Prozesse und Systeme. 2nd Edition. München and Berlin: DeGruyter, 442–472.
Goecke, R., Landvogt, M. (2017–2019). Digitaler Tourismus – Technologien, Systeme, Geschäftsmodelle. vhb Virtuelle Hochschule Bayern. https://kurse.vhb.org/VHBPORTAL/kursprogramm/kursprogramm.jsp?kDetail=true&COURSEID=10729,68,1145,2 (Accessed 9. December 2019).
Hopper, M.D. (1990). Rattling SABRE – New Ways to Compete on Information. Harvard Business Review, May-June 1990, 118–125.
Kemper, A., Eickler, A. (2015). Datenbanksysteme – Eine Einführung. 10th Edition. Berlin: DeGruyter.
Marinescu, D.C. (2018). Cloud Computing – Theory and Practice. 2nd Edition. Cambridge MA: Elsevier.
Nyheim, P.D. (2019). Technology Strategies for the Hospitality Industry. 3rd Edition. Upper Saddle River, NJ: Prentice Hall.
O'Connor, P. (1999). Electronic Information Distribution in Tourism and Hospitality. Wallingford: CAB International.
Robbins, J.N. (2019). Learning Web Design: A Beginner's Guide to HTML, CSS, JavaScript, and Web Graphics. 5th Edition. Beijing, Boston: O'Reilly.
Schulz, A., Frank, K., Seitz, E. (1996). Tourismus und EDV. München: Vahlen.
Schulz, A., Weithöner, U., Goecke, R. (Eds.) (2010). Informationsmanagement im Tourismus – Prozesse und Systeme. München: Oldenbourg Verlag.
Schulz, A., Weithöner, U., Egger, R., Goecke, R. (Eds.) (2014). eTourismus – Prozesse und Systeme. 2nd Edition. München and Berlin: DeGruyter.
Schulz, A. (2014). Globale Distributionssysteme. In: Schulz, A., Weithöner, U., Egger, R., Goecke, R. (Eds.). eTourismus – Prozesse und Systeme. 2nd Edition. München and Berlin: DeGruyter, 213–239.
Statista (2019). Anzahl der Reisebüros in Deutschland von 2002 bis 2017. https://de.statista.com/statistik/daten/studie/252715/umfrage/anzahl-der-deutschen-reisebueros/ (Accessed 12. September 2019).
Tanenbaum, A.S. (1988). Computer Networks. 2nd Edition. Englewood Cliffs, NJ: Prentice Hall.
W3C (2019). World Wide Web Consortium. https://www.w3.org/ (Accessed 22. December 2019).
w3schools.com (2019). w3schools.com - THE WORLD'S LARGEST WEB DEVELOPER SITE. https://www.w3schools.com (Accessed 30. December 2019).
Weithöner, U. (2007). Electronic Tourism – kleines Lexikon zu informationstechnologischen Systemen in der Tourismuswirtschaft. Hamburg (Deutschland): WiWi-Online.de. http://www.odww.net/artikel.php?id=359
Weithöner, U. (2014a). eMarketing und eCommerce – Internet-Basis, Voraussetzungen und Potentiale. In: Schulz, A., Weithöner, U., Egger, R., Goecke, R. (Eds.). eTourismus – Prozesse und Systeme. 2nd Edition. München and Berlin: DeGruyter, 65–93.
Weithöner, U. (2014b). Web-Portale und Internet Booking Engines. In: Schulz, A., Weithöner, U., Egger, R., Goecke, R. (Eds.). eTourismus – Prozesse und Systeme. 2nd Edition. München and Berlin: DeGruyter, 314–324.
Werthner, H., Klein, S. (1999). Information Technology and Tourism: A Challenging Relation. Vienna: Springer.
Wikipedia (2019). The Free Encyclopedia from Wikimedia Foundation Inc. Sabre (computer system). https://en.wikipedia.org/wiki/Sabre_(computer_system) (Accessed 28. December 2019).
Wiktorski, T. (2019). Data-intensive Systems – Principles and Fundamentals using Hadoop and Spark. Cham (Switzerland): Springer Nature.
Zhou, Z.Q. (2004). E-Commerce and Information Technology in Hospitality and Tourism. Clifton Park NY: Delmar Learning.
Advanced Web Technologies and E-Tourism Web Applications
10
Robert Goecke
Contents
Introduction
Dynamic Packaging Engines as Enablers of Mass Customization in Tourism
Web Content Management Systems, Content Syndication, and Web 2.0
XML as a Base Technology for Standardized Data Exchange Interfaces
XML as Inter-Application Interface Technology: Web Services and Mash-Ups
AJAX as Enabler for Asynchronous Data Interchange Between Browser and Web Server
The Semantic Web for Distributed Knowledge Representation
Web Portals as User-Centered Integration of Web Applications for Hybrid Business Models
Online Booking Engines and Corporate Travel Management Systems
Web Search Engines and Travel Meta-Search Engines
Mobile Web, Apps, and Augmented Reality
From Destination Management Systems and Portals to Smart Destination Service Platforms
Further Impacts and the Future of Advanced E-Tourism Web Applications
References
R. Goecke, Munich University of Applied Sciences, Munich, Germany
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_15

Abstract
Since its invention in 1989, the World Wide Web was extended with many advanced web technologies like XML, Web services, AJAX, JSON, HTML5, etc. They are enablers for innovations like mash-ups, responsive web design, web-enabled mobile apps, or augmented reality apps, which are building blocks for complex E-Tourism applications with many special use cases: Dynamic packaging engines are platforms for virtual/online tour operators. Content management systems evolved as core technology for travel communities, blogs, online advertising, and social media. Travel portals with Internet booking engines serve direct sales of travel suppliers and online travel agents. Online booking engines introduced self-service even for corporate/business travel management. Travel search engines offer direct price and product comparisons and search engine marketing. Destination management systems support service bundling, while mobile web apps innovated tourist guidance.
Keywords Dynamic packaging engine · Content management system · Travel search engine · HTML5 · XML · Semantic web · Destination management systems
Introduction
Online booking systems (Goecke 2020 in this Handbook of E-Tourism) evolved from mainframe-based computer reservation systems (CRS) to web-based Internet booking engines (IBEs/WBEs), whose architecture is based on SQL databases, the Internet, and basic web technologies like URLs, HTTP(S), HTML/CSS, JavaScript, and web-server-side scripts (Berners-Lee 1989/2019). Many E-Tourism websites started as a brochure-like collection of HTML pages, globally accessible via a web browser under a single domain name, which sometimes offered simple online booking functions via one or more IBEs. This chapter first analyzes the extension of such classic IBEs and their combination with complementary CRS into dynamic packaging engines as a driver for mass customization in tourism. It then introduces advanced web technologies for websites like web content management systems, XML interface technologies, asynchronous web interaction (AJAX), and the semantic web. They enabled advanced web applications like social media communities, web portals of touristic suppliers and online travel agents, as well as corporate travel management systems, travel search engines, and mobile travel apps. Finally, we investigate the evolution of web-based destination portals to smart destination ecosystems empowered by the web of things and discuss the roles of cloud services and blockchains for future web applications.
10 Advanced Web Technologies and E-Tourism Web Applications

Dynamic Packaging Engines as Enablers of Mass Customization in Tourism
The production of package tours was reinvented completely with the introduction of dynamic packaging (Weithöner 2007; Schulz et al. 2010, 2014; Goecke and Weithöner 2014; Benckendorff et al. 2019). Classic prepackaged tours have been known since Thomas Cook as bundled accommodation and transport services based on fixed allotment contracts between a tour operator and its suppliers. They are offered for fixed inclusive prices, are described in catalogues, and can be booked in travel agencies. The introduction of computer reservation systems and national CRS networks in the 1970s and 1980s gave travel agents direct access to the tour operator’s production systems and led to the invention of flexible packaging: Tour operators define menus of combinable tour segments in special “flexible packaging” catalogues. Travel agents use them to assemble individualized package tour variants for their customers in menu dialogues with the tour operator’s CRS. The tour operator system manages the inventory of all available fixed and optional allotments bought from suppliers at the beginning of the season. Flexible packaging is controlled by a rule-based business logic describing the choice menus, combination restrictions, and calculation rules, which are defined by product managers.

With the development of the first web booking engines and travel websites, dynamic packaging was invented as a new production process for package tours in the USA, UK, and Germany (Goecke and Weithöner 2014). Figure 1 shows a web server-based dynamic packaging engine realized with HTML forms, media files, and server-side scripts. Although classic and flexible tour packages have never been very popular in the USA, web booking engines required end users to bundle flight, hotel, and rental car offerings for their journeys as a self-service. Combining tour segments booked separately in IBEs for flights, hotels, and rental cars can become a nightmare for end customers: If a single journey segment with low availability turns out to be sold out in the middle of a booking sequence, all previous bookings have to be cancelled. The invention of web shopping baskets for material products or services led to dynamic bundling: All journey segments are collected in a web shopping basket and booked together, not as a single-priced package tour of a tour operator but as a bundle of individually priced segments from different suppliers. Dynamic bundling simplified booking transaction control and is still very popular with travel agents because it is very similar to the classic GDS booking process, where all journey segments are collected in a travel itinerary and segment reservations can be made before the final booking. For inexperienced consumers, dynamic bundling carries the risk of buying incompatible or inconvenient itineraries because the sound package combination rules of a responsible tour operator are missing.

Fig. 1 Simplified architecture and nine business process steps of a dynamic packaging engine. [Figure: a web browser communicates via HTTP(S) with the dynamic packaging engine web server, which issues distributed service/database calls over the Internet to supplier systems (GDS for flights, hotels, and cars; bed bank for hotels; rental car CRS; airline consolidator CRS; hotel chain CRS; airline CRS). Steps: 1. customer preference mask; 2. flight/hotel/car segment collection; 3. selection of all matching flight, room, and car offers; 4. rule-based packaging and calculation; 5. list of customized package tours; 6. booking of the selected package tour; 7. payment via the web application of an Internet payment provider, e.g., PayPal; 8. distributed booking transaction; 9. confirmed booking of car, flight, and room segments for the customer-selected package tour.]

Dynamic packaging automates the packaging process further: Whenever a customer requests a tour in a web-based dynamic packaging engine, the engine collects vacant segments matching the customer’s preferences for the specified journey directly from connected supplier databases and CRS. The returned segment offers are packaged “just in time” in a fully automated way, according to the responsible tour operator’s packaging rules, which specify compatible combinations and price calculation schemes. A result list of packaged tours with fixed price tags is presented to the user, who selects the preferred journey variant in the browser for booking. After successful payment, the dynamic packaging engine books all journey segments from the involved suppliers in a distributed booking transaction that may roll back all segment bookings automatically if even one segment booking fails. All nine steps of the distributed dynamic packaging process are shown in Fig. 1. Dynamic packaging introduced “just in time” travel production by virtual online tour operators (VTOs), which offer highly customized fixed-price package tours without the need to make risky allotment contracts with suppliers in advance.
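The packaging and booking steps described above can be illustrated with a minimal Python sketch. It does not reproduce the architecture of any real engine: the `Offer` class, the sample suppliers, the flat 10% markup, the "all segments must be vacant" rule, and the `book_all` helper are hypothetical, chosen only to show rule-based just-in-time packaging and a booking sequence that rolls back when one segment fails.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Offer:
    supplier: str      # hypothetical supplier system, e.g. "AirlineCRS"
    kind: str          # "flight", "hotel", or "car"
    price: float       # net price quoted by the supplier
    available: bool    # vacancy flag as last reported by the supplier


def package_tours(flights, hotels, cars, markup=0.10):
    """Combine vacant segment offers into fixed-price packages."""
    packages = []
    for combo in product(flights, hotels, cars):
        # Packaging rule (assumed): every segment must be vacant.
        if not all(o.available for o in combo):
            continue
        # Calculation rule (assumed): net prices plus a flat markup.
        total = round(sum(o.price for o in combo) * (1 + markup), 2)
        packages.append((total, combo))
    return sorted(packages, key=lambda p: p[0])


def book_all(segments, book, cancel):
    """Book every segment; cancel earlier bookings if one fails."""
    done = []
    for seg in segments:
        if book(seg):
            done.append(seg)
        else:
            for booked in reversed(done):   # roll back in reverse order
                cancel(booked)
            return False
    return True


flights = [Offer("AirlineCRS", "flight", 200.0, True)]
hotels = [Offer("BedBank", "hotel", 300.0, True),
          Offer("HotelChainCRS", "hotel", 250.0, False)]
cars = [Offer("RentalCarCRS", "car", 100.0, True)]
packages = package_tours(flights, hotels, cars)
```

In a real distributed setting, `book` and `cancel` would be calls to supplier systems, and cancellation itself could fail or incur fees, so this compensation loop is only a rough stand-in for a proper distributed transaction protocol.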
Web Content Management Systems, Content Syndication, and Web 2.0
From the mid-1990s, the number of websites and their web pages grew exponentially. More and more website providers struggled to keep the content of hundreds or thousands of web pages up to date and consistent, especially when many authors, often with insufficient knowledge of HTML and CSS, tried to alter web pages simultaneously in an uncoordinated way. Because web browsers were designed only for viewing and browsing web pages or their source code, most browsers provided no web editing functions. Soon, commercial web editor PC software was developed to design web pages with a Word-like user interface. Only little knowledge of HTML/CSS was necessary because the page design was machine translated into HTML/CSS and JavaScript code and uploaded automatically to the web server. The problem with this approach was keeping pages and files from different authors of the same website separated from one another and distinguishing simple content changes from more sophisticated layout changes.

Web Content Management Systems (CMS) solved that problem with the proven technology of database-driven newspaper content management systems (Goecke 2014a; Weithöner 2014a; Barker 2016). Access to the HTML/CSS layout and JavaScript programming of a website is strictly restricted to a small number of web or screen designers with professional HTML/CSS and JavaScript programming skills. They are responsible for programming DHTML/CSS layout templates with named or numbered variables for text, image, photo, or video content. Web content authors may create or edit web pages in their web browser without any HTML/CSS knowledge by simply filling out the input fields of the predefined DHTML/CSS template forms with text, URLs, and file descriptors of images, audio files, and videos to be uploaded as web page content to the web server (Fig. 2). For every page call, the Web CMS checks the user’s access rights, and only pages with matching read or write access rights are shown to the user. Through website registration procedures, a CMS can distinguish anonymous website visitors from differentiated groups of registered users. With fine-grained user rights management, a Web CMS can even show personalized and individually editable content to every single user, which enabled personalized marketing and customer relationship management. For users who log in as web authors, web pages are presented embedded in templates to change their content. Users with programming rights may even create and change HTML/CSS and JavaScript files directly. High-end content management systems may also have a workflow system, where content or template changes are sent to a reviewer who is authorized to publish the changes. With the separation of content from layout by CMS, it became possible to use templates tailored to the user’s browser type, which is encoded in the browser’s HTTP request.
Fig. 2 Architecture of web content management system and six steps of web page assembly process. [Figure: a web browser communicates via HTTP(S) with a web server hosting the Web CMS files (HTML/CSS/JavaScript templates, images, videos) and the CMS server-side scripts, which access the CMS database via SQL (select/insert/update/delete). The database holds a table with user access rights (elementId, userID, read/write_access_rights) and a table with page content elements (pageId, elementId, templateURL, textContent, mediaLink). Steps: 1. user types or clicks the URL of a web page in the CMS; 2. CMS checks the user’s access rights and fills the template with matching content; 3. CMS sends the assembled HTML page to the user’s browser; 4a. users with read permission see the dynamically assembled web page; 4b. authors/users with write permission see the web page with HTML form elements to insert, upload, or change content without HTML skills; 5. CMS loads media files and updates content from authorized users in the database; 6. authors see the new/updated web page to confirm content changes.]
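The access check and template filling of the page assembly process can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the two dictionaries stand in for the database tables of Fig. 2, and the user names, element IDs, templates, and the `render` function are hypothetical rather than part of any particular CMS.

```python
import html
from string import Template

# Hypothetical stand-ins for the two CMS database tables of Fig. 2.
ACCESS_RIGHTS = {            # (element_id, user_id) -> "read" or "write"
    ("intro", "guest"): "read",
    ("intro", "author1"): "write",
}
PAGE_CONTENT = {             # element_id -> content record
    "intro": {"textContent": "Welcome to Vienna", "mediaLink": "/img/vienna.jpg"},
}

# Layout templates are maintained by designers; authors never edit them.
PAGE_TEMPLATE = Template('<h1>$textContent</h1><img src="$mediaLink">')
EDIT_TEMPLATE = Template(
    '<form method="post"><input name="textContent" value="$textContent">'
    '<input name="mediaLink" value="$mediaLink"></form>'
)


def render(element_id, user_id):
    """Return the page for readers, an edit form for authors, else None."""
    right = ACCESS_RIGHTS.get((element_id, user_id))
    if right is None:
        return None                       # no access: page is not shown
    record = PAGE_CONTENT[element_id]
    # Escape stored content so it cannot break out of the template markup.
    safe = {key: html.escape(value, quote=True) for key, value in record.items()}
    template = EDIT_TEMPLATE if right == "write" else PAGE_TEMPLATE
    return template.substitute(safe)
```

Because content and layout live in separate structures, swapping `PAGE_TEMPLATE` for a differently branded template would deliver the same record in another layout without touching the stored content.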
New business opportunities arose for websites with valuable content like travel news, travel guides, restaurant tips, etc. The same text, image, or photo content can be customized or rebranded with different templates to deliver it in different layouts to many websites. The commercial selling of the same valuable content to many websites is a business model called content syndication. It opened new revenue streams and e-commerce opportunities for established travel guide book publishers or innovative e-catalogue content aggregators, e.g., MairDumont or Giata, and lowered content acquisition costs for websites without their own research staff. With content filtering and layout customization, foreign content may be presented on a website seamlessly and unnoticed by the website’s users, which led to new forms of content virtualization.

Another special form of content to be managed by Web CMS is advertising banners. Websites with large audiences may program their web templates to include changing advertising banners for their own offerings or even to promote products of paying third-party advertisers, named affiliate partners. As more and more of a website’s revenues are generated with its content or advertising, the website business model becomes a so-called new media business model.

Web content management systems became the base technology for Web 2.0 and social media: Because every user can easily insert texts, images, photos, or videos into a web content management system without HTML/CSS knowledge, not only professional web journalists but everybody is enabled to publish their own content on the WWW. This movement from one-to-many (1:n) broadcasting websites of the Web 1.0 phase to more participatory many-to-many (m:n) conversational websites was named Web 2.0 (O’Reilly 2005).
Travel community websites, e.g., Lonely Planet, and recommender sites like TripAdvisor or HolidayCheck use specialized content management systems to enable all tourists to share their travel experiences. Content creation by masses of volunteers was later named crowdsourcing. Blog systems (from “weblog”), including WordPress or Twitter, are also nothing other than multiuser content management systems, where bloggers can choose a template and go live with their personal diary, which was the beginning of influencer marketing. From a technology standpoint, even Facebook is a highly sophisticated self-service multimedia content management system with a special user rights management based on “friendships,” forming the famous social graph as an enabler for friends-oriented message propagation. Because Facebook collects a lot of sociodemographic, geographic, and interest data from its users, it is able to show closely targeted banners or to influence its users with the mass propagation of “likes” and “comments.” Facebook banners may be paid per view (page impression), per click, or per induced business transaction, for example, a booking. Facebook’s data collection and especially the evaluation of social graphs enable all kinds of innovative social, political, and marketing research. However, they also bear new risks for privacy, data protection, and political misuse. CMS transformed the WWW into the basic platform for social media in tourism (Amersdorffer et al. 2010; Hinterholzer and Jooss 2013).
XML as a Base Technology for Standardized Data Exchange Interfaces
As we have seen, many web applications need to exchange data with other applications across organizational boundaries: Web booking engines and dynamic packaging engines need supplier data about their offerings and return passenger name record (PNR) booking data with guest names, addresses, etc. Content management systems offer pure (unformatted) content data including both text and multimedia files to websites, where the HTML-free content is reformatted into HTML code using website-specific HTML templates. With the tremendous success of HTML as a universal standardized markup language to describe multimedia documents for a global exchange of information between humans, the idea arose to use standardized markup languages for the interorganizational exchange of machine-interpretable data between applications as well. In 1998 the Extensible Markup Language (XML) was defined by the W3C as a meta-language to specify the document structure and semantics of markup tags to enable the industry-specific automated exchange of machine-interpretable content data between applications (Werthner and Klein 1999; Goecke 2014a; Deitel et al. 2016; Comer 2018). A big advantage of XML documents as a data interchange format is their structural similarity to HTML documents. Most tools and programming paradigms developed for HTML can easily be adapted to process XML documents and vice versa. Additionally, the transformation of XML content into HTML is easy, and XML tags may be formatted with CSS for a browser-individual presentation. As a technology, XML is accompanied by machine-readable specification languages like XML Document Type Definitions (DTD) or XML Schema. They specify the syntax and structure of XML markup together with the data types allowed for an industry-specific document, for example, an airline flight information response (Fig. 3).
Every program receiving a standardized XML document can retrieve the published schemas via links and check whether the document complies with the referenced schemas (i.e., whether it is “valid,” which goes beyond being merely “well formed,” i.e., syntactically correct XML). Human programmers can read the referenced documentation to understand the exact meaning of an XML document according to the surrounding XML markup (http://www.w3schools.com/xml 2019).
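The difference between a syntax check and a structural check can be sketched with Python's standard library. The sketch rests on several assumptions: the element and attribute names are invented for illustration and are not taken from Open Travel XML, and `meets_structure` is only a crude stand-in for real DTD or XML Schema validation, which the standard library does not provide and which would need a third-party validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical flight message; names are NOT taken from Open Travel XML.
FLIGHT_XML = """<Flight carrier="UA" number="116">
  <Departure airport="SFO" date="2019-04-15"/>
  <Arrival airport="AER" date="2019-04-17"/>
</Flight>"""

REQUIRED_CHILDREN = ("Departure", "Arrival")   # stand-in for a schema rule


def is_well_formed(text):
    """True if the text parses as XML at all (syntax only)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def meets_structure(text):
    """Rough substitute for validation: root tag and required children."""
    root = ET.fromstring(text)
    return root.tag == "Flight" and all(
        root.find(child) is not None for child in REQUIRED_CHILDREN
    )
```

A document such as `<Flight><Departure/></Flight>` passes the first check but fails the second, which is exactly the gap that a published schema is meant to close.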
XML as Inter-Application Interface Technology: Web Services and Mash-Ups
While any service offered by an HTTP server may be called a web service (lowercase), a Web service (capitalized) is a special application offering self-describing XML content via HTTP(S) to other applications as an automated service (Fensel et al. 2011). In addition to SQL database interfaces for highly structured record data complying with proprietary database schemas, Web services give web programmers the opportunity to exchange complex nested self-describing XML documents via Web service interfaces described by machine-readable XML DTDs and schemas. XML documents may therefore include both highly structured tagged record data and semi-structured tagged text. With standardized Web service interfaces, it is much easier for web programmers to directly combine services and content from different applications. Web applications that use Web services of other web applications, such as websites integrating map services from Google or Bing, are called mash-ups. Although Web services are easy to combine, complex copyright law compliance checks are required in advance. Web services make interface development cheaper and in theory may even reduce the demand for centralized data exchange hubs like GDS in favor of direct data exchange between the applications of, e.g., suppliers, tour operators, and travel agencies.

Fig. 3 Open Travel XML flight description (extract like Open Travel 2003). [Figure: XML extract of a flight information document containing the values UA, 116, SFO San Francisco, 2019-04-15, 16:45-0800, AER Сочи, 2019-04-17, and 23:45+0300, together with references to XML schema definitions that describe the structure, data types, etc., of the document in a machine-interpretable way according to the structure and semantics agreed by the Open Travel Alliance.]

The Open Travel Alliance (Open Travel) specified Open Travel XML in 2004 as a global standard for the automated exchange of travel-related content and business documents between all kinds of travel applications within the travel industry (OTA 2003, see Fig. 3). Standardized protocols exist for the exchange of Open Travel XML messages for e-business processes and e-commerce transactions like bookings or payments. Additional sector-specific XML-based data exchange standards are IATA’s New Distribution Capability (NDC) standard for data exchange between airlines, travel agencies, and third parties, as well as the Hospitality Technology Next Generation (HTNG) XML standards for data exchange between hotel property management systems, distribution systems, and other systems (Nyheim 2019). The DRV Data Standard of the German Travel Association (DRV) specifies more than 300,000 global types defining the semantics of travel offer attributes accessible as XML Schema definitions with a Web service XML interface. Precise offer attributes like room size, beach distance, etc. are most important to describe and compare the quality of competing package tours in travel search engines and Internet booking engines.

Almost all industries have developed their own XML data interchange languages, and even MS Office documents now exist in the Microsoft docx, pptx, or xlsx XML formats. Today most databases and CMS are able to deliver their content in different XML formats. Special XML databases optimized for the retrieval and manipulation of thousands of XML documents are available, and every element of an XML document can be queried with standardized XPath, XQuery, and XUpdate expressions. To handle XML documents in JavaScript or with server-side script languages, XML DOM was developed as an object-oriented Document Object Model for XML. The automated exchange of business documents in industry-standardized XML formats between applications triggered the automation of many intra- and interorganizational business processes, not only in E-Tourism. With e-business XML (ebXML), even a language to describe such e-business processes is available. UDDI offers a Universal Description, Discovery, and Integration service for the automated just-in-time discovery and integration of Web services: As its last step, a dynamic packaging process may, for example, use UDDI to search for alternative last-minute rental car booking Web services for a destination not covered by the pre-connected rental car IBE.
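How offer attributes such as room size or beach distance can be queried from an XML document is sketched below with the limited XPath subset of Python's `xml.etree.ElementTree`. The offer list, the tag name `Hotel`, and the attribute names `beachDistanceM` and `roomSizeM2` are assumptions made for illustration; they are not taken from the DRV Data Standard or from Open Travel XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical offer list; tag and attribute names are illustrative only.
OFFERS_XML = """<Offers>
  <Hotel name="Seaside Inn" beachDistanceM="150" roomSizeM2="24"/>
  <Hotel name="City Lodge" beachDistanceM="2200" roomSizeM2="18"/>
  <Hotel name="Dune Resort" beachDistanceM="80" roomSizeM2="30"/>
</Offers>"""

root = ET.fromstring(OFFERS_XML)

# XPath-style path: all Hotel elements anywhere below the root.
hotels = root.findall(".//Hotel")

# ElementTree predicates cannot compare numbers, so filter in Python.
near_beach = sorted(
    h.get("name") for h in hotels if int(h.get("beachDistanceM")) <= 200
)

# Attribute-equality predicates are supported directly.
city_lodge = root.find(".//Hotel[@name='City Lodge']")
```

A full XPath or XQuery processor would express the numeric filter inside the query itself; the standard library supports only a subset, which is why the comparison is done in Python here.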
AJAX as Enabler for Asynchronous Data Interchange Between Browser and Web Server
Another benefit of XML is the realization of asynchronous communication between browser and web server via Asynchronous JavaScript and XML (AJAX). Classic browser/web server dialogues exchange no information between a web browser and a web server unless the user clicks on a page link or a form’s send button; this is called a synchronous user dialogue over HTTP (Comer 2018). The key idea of AJAX is that JavaScript code is sent embedded within an HTML page to a browser and may then report information about certain user events in an application-specific XML format via HTTP(S) calls to the web server asynchronously, without any explicit command or notice of the browser’s user (Fig. 4). Further speed improvements of this asynchronous communication were gained by the exchange of web page content between web server and browser directly in HTML/CSS, which had been redefined to XHTML to make HTML fully XML compliant. The latest implementations of asynchronous browser-server communication exchange objects in JavaScript-interpretable JSON (JavaScript Object Notation, https://www.w3schools.com/js/js_ajax_intro.asp 2019) format instead of XML to save parsing time. Web programming frameworks (i.e., ready-to-use software libraries) like jQuery (https://www.w3schools.com/jquery 2019) helped web designers to use AJAX without too much extra coding effort, but are replaced more and more by native features of newer JavaScript versions. For the end user, AJAX brought convenient interface interaction services like auto-fills (Fig. 4) of best fitting words or phrases whenever a form is filled out, dynamically changing web page sections, or the emulation of complex graphical user interfaces previously known only from locally installed multimedia PC clients.

Fig. 4 Asynchronous JavaScript and XML (AJAX) communication between browser and web server. [Figure: In synchronous communication, the browser sends an HTTP request (URL entered or link/button clicked) and the web server answers with an HTTP response containing an HTML/CSS page, JavaScript, and media files. In asynchronous communication, the AJAX engine (JavaScript) in the browser reacts to user interface calls, sends asynchronous XML HTTP requests, and receives XML, HTML/CSS, or JSON. The web server offers temporary session management, server-side script interpreters with access to databases and applications (e.g., CRM or website analytics), and the website files. Example: typing “Mal” into a destination field yields the suggestions Malaga, Maldives, and Mallorca.]

The benefits of AJAX for the website provider may also cause problems for the users: AJAX can monitor all user interactions on the website. Together with cookies (i.e., small files with a unique user code stored by the browser), a website owner gets lots of information on a user’s behavior without the user noticing. It can be used to improve the user interface and the customer journey on a website as well as to detect hidden interests, which may be used to redesign product offerings in a customer-oriented way. But the systematic evaluation of user behavior (Schneider et al. 2014) may also infringe on privacy, especially that of frequent website users, who may be profiled, scored, and even discriminated against according to their behavior or because their behavior is similar to that of other users whose profiles are used for recommendations, purchasing power predictions, or customer-centered pricing.
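The auto-fill interaction can be made concrete by looking at the server side of such an asynchronous call. The following Python sketch assumes that browser-side JavaScript sends the typed prefix to the server on each keystroke and receives a JSON array back; the destination list, the `suggest` function, and its `limit` parameter are hypothetical, and the HTTP handling itself is left out.

```python
import json

# Hypothetical destination list; a real site would query a database.
DESTINATIONS = ["Madrid", "Malaga", "Maldives", "Mallorca", "Malta", "Munich"]


def suggest(prefix, limit=3):
    """Return a JSON array of destinations starting with the typed prefix."""
    needle = prefix.strip().lower()
    if not needle:
        return json.dumps([])           # nothing typed yet: no suggestions
    hits = [d for d in DESTINATIONS if d.lower().startswith(needle)]
    return json.dumps(hits[:limit])     # cap the list shown under the field
```

Returning JSON rather than XML lets the browser turn the response into JavaScript objects directly, which is the parsing-time saving mentioned in the text.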
The Semantic Web for Distributed Knowledge Representation
With the availability of industry-specific XML tags, which describe the structure and meaning of all elements within electronic documents in a machine-interpretable way, the idea was born to use web technologies like XML for the distributed representation of more complex knowledge as well. HTML websites all over the world and especially web encyclopedias like Wikipedia contain a lot of valuable knowledge, which is interpretable only by humans. A special problem for the machine interpretation of text, even in XML documents, is that many words are ambiguous, like “Paris,” which may mean a specific city in Europe or in Texas or the name of a female person. Similar to airport codes identifying airports or URLs identifying the location of a web resource, it is necessary to identify every physical or abstract thing or entity in the world of interest with its unique Uniform Resource Name (URN) (Fensel et al. 2011). URNs as used here have the same path structure as URLs, beginning with a domain name referring to the organization responsible for a unique name ID and its definition, which may be accessed via HTTP at a website with the same domain name. This website serves as a knowledge repository for one or more names of that namespace. Because URLs and URNs share the same concepts to uniquely identify web resources or knowledge entities, the W3C subsumes them as Uniform Resource Identifiers (URIs). The semantic web initiative of the W3C invented the Resource Description Framework (RDF) to express knowledge with URI triples arranged as (subject . predicate . object). An example is

(http://cities.org/Paris01 . http://geo.org/spatial_reference/in . http://eu-rep.org/members/France)

It expresses what a human means with the sentence “Paris is in France.” The knowledge is distributed across three repositories keeping the definitions that “Paris01” means “the French capital,” “in” means “located geographically within a region,” and “France” means “the country France as EU member.” Every repository may also deliver further attributes of the specific referred entity, whose structure and data types again may be self-described by machine-readable XML Schema definitions. RDF triples may be either used as part of repositories or embedded into the text of HTML web pages to make their content machine readable (Fensel et al. 2011; Sikos 2015), which is one of the key ideas of Web 3.0.
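The triple structure lends itself to a very small illustration. The Python sketch below stores triples as plain tuples of URI strings and answers pattern queries with wildcards; it is not an RDF library, the `match` function is hypothetical, and the second triple about Lyon is an assumption added only so that the query returns more than one result.

```python
# Triples as plain (subject, predicate, object) tuples of URI strings.
IN = "http://geo.org/spatial_reference/in"
FRANCE = "http://eu-rep.org/members/France"

TRIPLES = {
    ("http://cities.org/Paris01", IN, FRANCE),   # example from the text
    ("http://cities.org/Lyon01", IN, FRANCE),    # assumed for illustration
}


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# "Which entities are located in France?"
in_france = [t[0] for t in match(TRIPLES, p=IN, o=FRANCE)]
```

Query languages for RDF follow the same idea of matching triple patterns with unbound positions, only across many distributed repositories instead of one in-memory set.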
Contrary to a specialized XML business document compliant with an XML schema, like an invoice, HTML web pages mix a lot of information about many interrelated topics from different knowledge domains, e.g., about persons, places, hotels, restaurants, seasons, etc. Introducing standardized XML meta-tags and RDF triples into tags of the HTML body has no effect on the rendering of the web page in the browser, but provides machine-readable meta-data (HTML Microdata) about the meaning of parts of a web page in a much more detailed way than general meta-tags in the HTML header of a web page. A web application called a semantic web agent could read the web page and interpret the meaning of the meta-tags by following the XML Schema links and the RDF links. A hotel address on a web page could be annotated with meta-data of a published industry XML schema describing the elements of a postal address (Hotel Name, Street, House#, City Code, City, Country). The semantic web agent of a geo map service could read the hotel website via HTTP and introduce the hotel as a point of interest (POI) into its map without the help of a human interpreter. If the hotelier also augments the website text about past visits from celebrities with RDF triples of the form (celebrity_URI . visitor_URI . Hotel_URI), a local recommender website’s bot could inspect those URIs to learn valuable associations between those celebrities and the hotel. More details about the celebrities might be found by following the celebrity URI to the celebrity repository. Every RDF triple can be visualized as two nodes for subject and object connected by an edge for the predicate.
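How a semantic web agent might read such annotations can be sketched with Python's built-in HTML parser. The page fragment, the fictional address, and the `MicrodataReader` class are assumptions for illustration; the attribute names `itemscope`, `itemtype`, and `itemprop` follow the HTML Microdata convention, and the type and property names follow schema.org. Nested items are deliberately ignored to keep the sketch short.

```python
from html.parser import HTMLParser

# Hypothetical page fragment annotated with HTML Microdata attributes.
PAGE = """<div itemscope itemtype="https://schema.org/PostalAddress">
  <span itemprop="streetAddress">Example Street 1</span>,
  <span itemprop="addressLocality">Vienna</span>
</div>"""


class MicrodataReader(HTMLParser):
    """Collect itemprop name/text pairs; ignores nesting for brevity."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current = None      # itemprop of the tag being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs:
            self.item_type = attrs["itemtype"]
        self._current = attrs.get("itemprop")

    def handle_data(self, data):
        if self._current and data.strip():
            self.properties[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


reader = MicrodataReader()
reader.feed(PAGE)
reader.close()
```

A browser renders this fragment as ordinary text, while the reader recovers which part is the street and which the city, which is the information a map service would need to place the hotel as a point of interest.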
Entity triples together form a graph of distributed knowledge named “linked open data,” “web of data,” or “knowledge graph.” RDF was enhanced with the SPARQL Protocol and RDF Query Language to offer graph-based investigation of the web of data and with the Web Ontology Language (OWL) to support logical inference machines of artificial intelligence Web 3.0 applications (Szeredi et al. 2014). Figure 5 gives an example of a web page describing the famous Hotel Sacher in Vienna with a short text, its geo-position, and its star rating. The web page contains semantic web tags with RDF references to existing schema and knowledge repositories of schema.org (maintained by Google, Microsoft, and others) and wikidata.org (55 million entries describing useful pieces of Wikipedia.org knowledge), which both support SPARQL queries (Wikidata.org 2019). Figure 6 shows the resulting piece of the corresponding “web of data” graph with Hotel Sacher as central item, entity, or thing (Wikidata.org 2019; Schema.org 2019; Hepp 2019; Fensel et al. 2011). The knowledge graph shows all machine-readable attributes of the Hotel Sacher item as well as its URI references, including the unique IDs representing the items (things) Hotel Sacher (Q279260) and Vienna (Q1741) in the open Wikidata.org knowledge repository. Via RDF-URI references, it is possible for a semantic web agent program to collect further information with SPARQL queries from all referenced items and even their references, etc. As an alternative to RDF/XML representations, simpler RDF/JSON-LD (JavaScript Object Notation for Linked Data) representations have been introduced (W3C 2019). The semantic web is very complex, and many visions, especially about its usefulness for complex logical resolution by artificial intelligence applications, have only been realized partially by academic research projects. The annotation of websites with semantic Microdata spread faster and can be supported by content management systems, which insert many semantic meta-tags automatically with information derived from their templates or database schemas or requested from their users through mandatory fields of their content entry forms. Semantic web information is also generated by open data projects like Wikipedia, by industry consortia, or by commercial content aggregators. Its main users are search engines, web shops, and IBEs enriching their results lists with links to query-related information or with recommendations for similar products or cross-selling proposals.

Fig. 5 HTML extract with embedded semantic web meta-data about Hotel Sacher in Vienna. [Figure: HTML extract showing the name “Hotel Sacher,” the description “Famous hotel in Vienna!”, the coordinates (latitude 48 deg 12 min 14 sec N, longitude 16 deg 22 min 10 sec E), and a star rating of *****.]

Fig. 6 Semantic web graph of the machine-interpretable content of the Hotel Sacher HTML extract. [Figure: graph with the item “Hotel Sacher” at the center, linked to the schema.org type Hotel (access to the definition of a hotel, including the structure and meaning of typical hotel properties), to the Hotel Sacher Wikidata item (access to all Wikipedia properties about Hotel Sacher), to an image (//…/*.jpg), to the star rating 5, to the description “Famous Hotel in Vienna,” to GeoCoordinates (latitude 48.203889° N, longitude 16.369444° E; access to all maps and GIS information), and to the Vienna Wikidata item (access to all Wikipedia properties about Vienna). Legend: subject item/entity/thing; predicate item/entity/property; object item/entity/thing or value.]
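A JSON-LD description of the Hotel Sacher item can be sketched as follows. This is an illustrative assumption about how the content of Figs. 5 and 6 might be expressed, not the markup actually used in the figures: the type and property names (`Hotel`, `geo`, `GeoCoordinates`, `starRating`, `Rating`, `sameAs`) are schema.org vocabulary, the Wikidata ID Q279260 is the one given in the text, and `dms_to_decimal` is a hypothetical helper that converts the degree/minute/second values of Fig. 5 into the decimal degrees of Fig. 6.

```python
import json


def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees/minutes/seconds to decimal degrees (6 places)."""
    return round(degrees + minutes / 60 + seconds / 3600, 6)


hotel = {
    "@context": "https://schema.org",
    "@type": "Hotel",
    "name": "Hotel Sacher",
    "description": "Famous hotel in Vienna!",
    "geo": {
        "@type": "GeoCoordinates",
        "latitude": dms_to_decimal(48, 12, 14),    # north, so positive
        "longitude": dms_to_decimal(16, 22, 10),   # east, so positive
    },
    "starRating": {"@type": "Rating", "ratingValue": "5"},
    # Link to the Wikidata item named in the text (Q279260).
    "sameAs": "http://www.wikidata.org/entity/Q279260",
}

# Serialized form that a CMS could embed into the page.
json_ld = json.dumps(hotel, indent=2)
```

The converted coordinates agree with the decimal values shown in Fig. 6, and the `sameAs` link is what would allow an agent to continue from the page to the Wikidata repository.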
Web Portals as User-Centered Integration of Web Applications for Hybrid Business Models
Many early websites, especially those of travel agencies, were mere collections of information about the travel agency, special travel tips, and links to the web booking engines of suppliers. The suppliers referred to granted “non-traditional outlet” commissions for every booking to the referring travel agency, whose code was embedded in the referring link’s URL. Content management systems and XML technologies led to the development of white label web applications like web booking engines (WBE/IBE) or dynamic packaging engines. White label web engines can be branded individually with templates or by using their XML Web service interfaces for self-designed web dialogues (Goecke and Landvogt 2017–2019). A web portal is a website integrating different (white label) applications with a content management system in a way that end users get all applications and
dialogues presented in the same (branded) layout with a seamless look and feel and with only one user log-in (i.e., SSO for single sign-on) required for e-commerce transactions (Goecke 2014a; Weithöner 2014b). Web portals (see Fig. 7) offer users a single entry point and easy usage of interwoven heterogeneous web applications homogeneously integrated under the control, domain, and trusted brand of one website provider. Web portals enable website providers to add value for their customers by combining complementary information and e-commerce services without the need to develop all the web applications by themselves. Web portals therefore are the technology of choice for Online Travel Agents (OTA) to seamlessly combine own IBEs/WBEs with those of third-party providers of flight, hotel, rental car, and touristic booking engines or dynamic packaging engines to aggregate bookable offers of many market suppliers (Weithöner 2007). In our example (Fig. 7), a typical OTA focused on hotel sales operates a self-developed hotel WBE, where contracted hotel partners maintain their vacancies directly via browser. The OTA’s web portal integrates its own hotel WBE with an inspiring hotel and travel guide; three WBE/IBEs for flights, rental cars, and prepackaged tours; and a dynamic packaging engine from third-party providers. Portal users may either book individual hotels, flights, and cars even in combination with one another, or they may use the dynamic packaging engine to get the combination of choice as a packaged tour for a fixed price under a tour operator’s responsibility. Moreover the touristic IBE/WBE gives users the possibility to book classic prepackaged holiday tours. Some touristic IBE/WBEs even offer integrated white label dynamic packaging services. Portal providers and virtual/online tour operators may deliver hotel
Fig. 7 Structure and components of a web portal of an online travel agent (OTA). Components shown: web browser with search form (Flights, Hotels, Cars, Flight+Hotel+Car, Package Tours, Login), travel guide, teaser ad/banner, and call center assistance; web CMS with login, labeled templates, and banners; own hotel IBE with own content; flight IBE, car IBE, dynamic packaging engine, and touristic IBE; GDS (flights and cars for dynamic packaging); tour operators; hotel and accommodation suppliers; GIS; payment provider; photos, videos, and recommendations; call center and eFulfillment, guest reviews and eCRM, and e-mail campaigns; connections via HTTP(S) over the Internet, Extranet, and Intranet
vacancies which are dynamically packaged with flight and car vacancies available in connected GDS and consolidator databases whenever a user searches for tour package offerings in the touristic IBE. The dynamically packaged tours are listed for the user together with the matching classic tour package offerings for direct product or price comparison. Suppliers of a web portal’s booking engines may define complex pricing schemes with seasonal prices, discount rates for early or last-minute bookings, and special offerings (e.g., “buy now and get an extra night for free”) in their browser interfaces. With special Web services, it might even be possible to enable the suppliers’ revenue management systems to change their prices in a WBE dynamically (Goecke et al. 2008). The US company priceline.com invented a special booking service called “name your own price” on its travel portals. Since then, reverse pricing services have enabled customers to specify a trip by origin, destination, time window, and quality level in a flight, hotel, or package tour web form together with a binding personal price bid. These customer price bids are compared with matching opaque offerings, which suppliers have posted into the portal IBE’s databases with undisclosed minimum prices. Depending on the reverse pricing model, the customer must buy one of the fitting offers either for the named price bid or for a price between the bid and a lower minimum price for the offer; the purchase is neither refundable nor available for resale. If no matching offer is available, the customer’s bid may be forwarded to suppliers with matching offers but higher prices, who may decide whether they are willing to give a rebate within the period specified by the customer. This type of opaque pricing is attractive for travel suppliers who do not want their high rebates for distressed inventory to be published.
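The matching of a binding price bid against opaque offers can be sketched as follows. This is a minimal illustration under stated assumptions, not priceline.com's actual logic: the classes `OpaqueOffer` and `Bid`, the tie-break rule, and the sample data are hypothetical, the time window is omitted, and only the variant in which the customer pays the named bid is shown.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class OpaqueOffer:
    offer_id: str
    origin: str
    destination: str
    quality: int       # e.g., hotel stars or service class
    min_price: float   # undisclosed minimum price posted by the supplier


@dataclass(frozen=True)
class Bid:
    origin: str
    destination: str
    min_quality: int
    price: float       # binding price bid of the customer


def fits(offer: OpaqueOffer, bid: Bid) -> bool:
    """True if the offer matches the trip specification, ignoring the price."""
    return (offer.origin == bid.origin
            and offer.destination == bid.destination
            and offer.quality >= bid.min_quality)


def match_bid(bid: Bid, offers: List[OpaqueOffer]) -> Optional[Tuple[OpaqueOffer, float]]:
    """Return (offer, price to pay) for an acceptable offer, or None."""
    acceptable = [o for o in offers if fits(o, bid) and o.min_price <= bid.price]
    if not acceptable:
        return None
    # Assumed tie-break: best quality first, then the lowest hidden minimum price.
    best = max(acceptable, key=lambda o: (o.quality, -o.min_price))
    return best, bid.price   # variant in which the customer pays the named bid


def forward_candidates(bid: Bid, offers: List[OpaqueOffer]) -> List[OpaqueOffer]:
    """Offers that fit the trip but are too expensive; their suppliers may grant a rebate."""
    return [o for o in offers if fits(o, bid) and o.min_price > bid.price]


offers = [
    OpaqueOffer("A", "MUC", "AGP", 3, 120.0),
    OpaqueOffer("B", "MUC", "AGP", 4, 150.0),
    OpaqueOffer("C", "MUC", "AGP", 4, 200.0),
]
accepted = match_bid(Bid("MUC", "AGP", 4, 160.0), offers)   # offer B at the bid price
rejected = match_bid(Bid("MUC", "AGP", 4, 100.0), offers)   # no acceptable offer
```

When `match_bid` returns `None`, `forward_candidates` yields the offers whose suppliers could be asked whether they are willing to grant a rebate.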
Other reverse pricing methods supported on the WWW by specialized web portals are eBay’s voucher auctions or Groupon’s platform for group coupon-rebate offerings. For user assistance, travel portals offer telephone call center support. To help effectively, call center agents need access to much of the web portal’s functionality and, via call center web user interfaces, a view of the current dialogue status of a calling portal user. At the latest when a user wants to book travel components, a log-in or registration is necessary. With every booking, an after-sales (e)fulfillment process starts in which the progress of the booking, its ticket processing, supplier notifications, payment and commission flows, and user questions and complaints have to be monitored and supported with web interfaces to specialized mid- and back-office systems. All data about a user, her transactions, and her subsequent interactions with the portal or other channels are stored in a special application called an electronic Customer Relationship Management (eCRM) system (Berchtenbreiter 2014). Modern eCRM systems also provide functions to collect customer feedback and valuable guest reviews to be displayed anonymously on the web portal. Reputation management systems support service help desks through a systematic analysis of written user feedback on the provider’s own and third-party websites and help to measure and improve a brand’s customer appreciation. eCRM systems are also core systems for managing personalized e-mail marketing campaigns and for controlling the customization of user dialogues in the web portal by the portal’s CMS. The CMS
may also be used to display relevant banner ads on all web pages of the web portal. Many eCRM systems have added partner relationship management functionality to support cooperation with suppliers and affiliates as well. This is especially important for the two-/multi-sided market business logic of online travel agents, tour operators, and other intermediary platform businesses. Today, most websites of airlines, hotel chains and consortia, tour operators, rental car suppliers, rail operators, cruise operators, and tourism destinations use web portal technologies not only to integrate many applications under a single log-in but also for the recommendation and cross-selling of complementary touristic services. As intermediaries, OTAs had to integrate internal and external web applications across the organizational boundaries of suppliers and customers. Therefore, the technological separation of the public Internet from specially protected Intranet and Extranet segments (Benckendorff et al. 2014) became necessary: Access to the Intranet is granted only to members of the corporation itself by corporate Internet routers, which block all Internet packets from unauthorized IP addresses or with wrong encryption (Fig. 7). Extranet access is granted to authorized business partners and their applications, usually via encrypted Internet packet exchange in so-called virtual private networks or via https-secured links to authenticated CMS-registered users only. Services for self-registered customers or unregistered users are provided under the control of the CMS rights management via https-secured or unsecured http links over the public Internet.
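The separation into Intranet, Extranet, and public Internet can be illustrated with a small classification sketch. All address ranges and the function `classify_zone` are hypothetical: 10.0.0.0/8 stands in for a corporate network, and 203.0.113.0/24 (a range reserved for documentation) stands in for a partner VPN. Real routers and CMS rights management apply far richer rules.

```python
import ipaddress

# Hypothetical address ranges, chosen only for illustration.
INTRANET_NETS = [ipaddress.ip_network("10.0.0.0/8")]
PARTNER_VPN_NETS = [ipaddress.ip_network("203.0.113.0/24")]


def classify_zone(source_ip: str, encrypted: bool, partner_login: bool = False) -> str:
    """Return the network zone whose services a request may reach."""
    ip = ipaddress.ip_address(source_ip)
    # Intranet: authorized corporate address and correct encryption.
    if encrypted and any(ip in net for net in INTRANET_NETS):
        return "intranet"
    # Extranet: partner VPN range, or an https link with an authenticated partner login.
    if encrypted and (partner_login or any(ip in net for net in PARTNER_VPN_NETS)):
        return "extranet"
    # Everything else is served as public Internet traffic (https or plain http).
    return "internet"


print(classify_zone("10.1.2.3", encrypted=True))
print(classify_zone("203.0.113.7", encrypted=True))
print(classify_zone("198.51.100.9", encrypted=True, partner_login=True))
print(classify_zone("198.51.100.9", encrypted=False))
```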
Online Booking Engines and Corporate Travel Management Systems
For business travelers, the business processes differ from those for leisure travel. Employers pay for the business trips of their employees, who have to apply for trip approval by their supervisors and get their travel expenses refunded according to organization-specific travel regulations. All processes have to be auditable and compliant with taxation laws, because travel expenses reduce a company’s profits. Large companies or public employers with high purchasing volumes request confidential corporate discount rates from their travel suppliers. They even prefer to pay their travel agents negotiated service fees with incentives to reduce travel costs, to prevent the agents from taking supplier commissions behind their backs. Because of these specific requirements, specialized business travel agency chains or travel management companies (TMCs) evolved, which offer companies the handling of business trips in travel offices located on the company’s premises (firm implants) and/or in business travel portals, i.e., web portals similar to OTA web portals but focused entirely on corporate travel management, booking, and fulfillment (Mahnicke 2013; Fischer 2014; Unger 2016). Business travel portals or corporate travel portals (see Fig. 8), whose development started at the turn of this century, may only be used by employees of registered corporate customers to book their business trips according to the corporate rules and rates defined by the companies’ corporate travel, mobility, purchasing, or controlling
Fig. 8 Corporate Travel Management Portal with portal, travel agency support, and MICE platform. Components shown: corporate travel portal in the browser (search form; Flights, Rail, Hotels, Cars, MICE, Policies, Login) with the process steps (1) travel application and approval, (2) booking of travel segments, (3) assistance and credit card payment, (4) travel expense reimbursement; business travel management system with BTM portal and CMS, portal templates, travel policy and workflow control, flight OBE (negotiated fares), rail OBE, hotel OBE/OTA (corporate rates), and car OBE (corporate rates); GDS (flights, hotels, rail, cars); event management platform (bids) and event suppliers; credit card payment; eFulfillment and assistance of the business travel agency; BTM/controlling and invoicing module of the enterprise resource planning system; connections via HTTP(S) over the Intranet and Extranet
departments. Corporate travel WBEs/IBEs have to enforce complex rule-based corporate travel policies, including the application of negotiated corporate rates with preferred suppliers. For this, they need interfaces to workflow management systems and to corporate credit card processing. They even exchange data with the purchasing and controlling modules of enterprise resource planning (ERP) systems. To distinguish corporate booking engines from leisure IBEs/WBEs, they are called Online Booking Engines (OBEs) in corporate travel management (Kwoka 2010). OBEs are sold as part of integrated software suites called business travel management (BTM) or corporate travel management (CTM) systems (Weithöner 2007; Fischer 2014; Benckendorff et al. 2014). Their development started earlier, in the 1980s, as ERP software modules supporting business hotel and GDS online booking. Today’s web-based BTMs and their OBEs are integrated into the Intranets and Extranets of client corporations to restrict access to their employees. BTM systems and portals are offered by ERP vendors, GDS providers, specialized BTM software vendors, leading business travel agency chains, or OTAs with a corporate travel focus. Organizing trade fairs, corporate meetings, incentive tours, and events (MICE) poses extra requirements on corporate business travel management. Web-based event planning and eProcurement platforms connect corporate event management departments with travel suppliers via the platforms’ web user interfaces or Web services (Nyheim 2019). Competing travel suppliers are invited to bid, and those with the best offerings are selected in an interactive process which is supported
by the event platform. It may even include project management and workflow functions for fulfillment processes.
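The rule-based enforcement of corporate travel policies by an OBE, described earlier in this section, can be sketched as a simple check of a trip request against a policy. The rules, thresholds, and the names `TravelPolicy`, `TripRequest`, and `check_policy` are hypothetical examples; real policies are company-specific and typically drive an approval workflow.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TravelPolicy:
    preferred_hotel_chains: frozenset
    max_hotel_rate: float            # per night
    business_class_min_hours: float  # shorter flights must be booked in economy
    approval_limit: float            # trips above this total need extra approval


@dataclass(frozen=True)
class TripRequest:
    hotel_chain: str
    hotel_rate: float
    cabin_class: str                 # "economy" or "business"
    flight_hours: float
    total_cost: float


def check_policy(trip: TripRequest, policy: TravelPolicy) -> List[str]:
    """Return the list of policy violations; an empty list means compliant."""
    violations = []
    if trip.hotel_chain not in policy.preferred_hotel_chains:
        violations.append("hotel is not a preferred supplier")
    if trip.hotel_rate > policy.max_hotel_rate:
        violations.append("hotel rate exceeds the policy limit")
    if trip.cabin_class == "business" and trip.flight_hours < policy.business_class_min_hours:
        violations.append("business class not allowed on short flights")
    if trip.total_cost > policy.approval_limit:
        violations.append("total cost requires additional approval")
    return violations


policy = TravelPolicy(frozenset({"ChainA", "ChainB"}), 180.0, 6.0, 2000.0)
ok_trip = TripRequest("ChainA", 150.0, "economy", 2.5, 900.0)
bad_trip = TripRequest("ChainX", 220.0, "business", 2.5, 2400.0)
print(check_policy(ok_trip, policy))
print(check_policy(bad_trip, policy))
```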
Web Search Engines and Travel Meta-Search Engines
The exponential growth of the WWW’s websites and web pages required web-based services that allow end users to find websites and web pages containing the content they are looking for. Web search engines are the most important platforms for end users to search for websites and web pages on the public Internet. To understand their huge impact on website promotion, especially in E-Tourism, an overview of the main functional components of general search engines with enhanced semantic search capabilities is given in Fig. 9 (Lewandowski 2008, 2011, 2013; Goecke 2014a). URLs of all kinds of websites can be registered at a general search engine. Its server-side component, called web crawler, spider, or bot, systematically visits the registered web pages, lists them, and then follows all links to find new web pages which are not yet registered or listed, storing their newly discovered URLs for further visits. With this endless iterative process, the web of all web pages interlinked with the registered websites can be listed systematically. The navigation links of a website lead to all web pages belonging to the website, while external links lead to the discovery and listing of previously unknown websites. With the visit of a web page, its HTML meta-tags with keywords, author,
Fig. 9 Architecture and components of a web search engine with semantic search (simplified). Components shown: web crawler, spider, or bot (fetches pages of known URLs: page-text extraction, image/video archiving, adding new links to the URL list, deleting URLs “not found”); list of known and registered URLs; search engine archive (tagged text snippets, images, videos); indexer and semantic interpreter (linguistic and semantic analysis of meta-tags, text, RDF/Microdata); semantic knowledge graph; search engine index (inverted index mapping keywords or phrases such as “Holiday” to ranked URLs); ranking engine (counts external referrals and clicks, calculates page ranks); searcher web server (queries the search engine index and archive for keywords or phrases; queries the semantic graph, news bases, map geoservices, shops, IBEs, TSEs, and recommendations; inserts fitting ads; presents results and ads); advertising and web analytics server (ads and links, keywords, bids and campaign budget, ad positioning, click counting and billing, analytic services); web browser of the advertiser (website registry, campaign manager, reports and statistics); web browser of the search engine user (search form; search result with ads, sponsored links, result links, and text extracts); promoted website; connections via HTTP(S) over the Internet
description, etc. are extracted together with its text and multimedia content, which might be stored in the search engine archive. An indexer and semantic interpreter module tries to determine the relevant keywords which best describe the content of a web page. Because the keywords in the meta-tags (Goecke 2020, Handbook of E-Tourism) often do not describe the real content of a website correctly, statistical and linguistic text analysis as well as semantic HTML Microdata interpretation is performed to identify the real topics of the web page. This analysis is necessary to build a semantic knowledge graph and the search index, which is an inverted index: For every keyword/topic entry, the web pages with the most relevant content are listed in descending order of their relevance, and links to characteristic web page text snippets and multimedia content from the archive may be added. The searcher of a search engine offers users a web search form in which to type the keywords, topics, or even phrases that characterize the web pages they are looking for. A query of the search engine index by the searcher returns a list of the best matching entries together with the URLs of the most relevant web pages in order of descending relevance. This result list may be augmented with text snippets or image thumbnails from the search archive, from georeferenced map services, or from semantic queries of the knowledge graph or, e.g., Wikidata. Even specialized product or service comparison search engines (see TSEs in the next section) may be queried and deliver so-called “vertical” search results. Reformatted with HTML tags in the layout of the search engine, the final search result list is presented in the user’s browser. In 2000 Google became famous with its page rank algorithm (Brin and Page 1998). For every listed web page, a ranking engine (Fig. 9) counts the number of topic-related referral links from other listed pages.
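The idea of ranking pages by referrals, with each referral weighted by the rank of the referring page, can be illustrated with a small power-iteration sketch in the spirit of Brin and Page (1998). The three-page link graph is hypothetical, the damping factor 0.85 is the value commonly quoted for the original algorithm, and the sketch ignores topic relevance, click frequencies, and all other ranking factors of a production search engine.

```python
from typing import Dict, List

# Hypothetical link graph: page -> pages it links to.
# Every link target must itself appear as a key.
LINKS: Dict[str, List[str]] = {
    "a.example": ["b.example"],
    "b.example": ["c.example"],
    "c.example": ["b.example"],
}


def page_rank(links: Dict[str, List[str]], damping: float = 0.85,
              iterations: int = 100) -> Dict[str, float]:
    """Approximate page ranks by repeated redistribution of rank along links."""
    pages = list(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly over all pages.
        dangling = sum(ranks[p] for p in pages if not links[p])
        new_ranks = {page: (1.0 - damping) / n + damping * dangling / n
                     for page in pages}
        for page, targets in links.items():
            for target in targets:
                # Each referral is weighted with the rank of the referring page,
                # divided by the number of links on that page.
                new_ranks[target] += damping * ranks[page] / len(targets)
        ranks = new_ranks
    return ranks


ranks = page_rank(LINKS)
for page, rank in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {rank:.4f}")
```

In this graph, `b.example` receives referrals from two pages and ends up ranked highest, while `a.example`, which nobody links to, keeps only the base share.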
The web pages with the most referrals for a topic are candidates for the best rank. To prevent great numbers of irrelevant websites from promoting one another via link partnerships, referrals to a web page are additionally weighted with the rank of the referring page. This delivers a weight-adjusted ranking. Other factors affecting the final ranking of the results are the frequency with which a web page is actually visited by users whenever it occurs in a result list as well as further information from linguistic and semantic analysis of the meta-tags and text content of the web page. With search engine optimization (SEO), website owners such as tourism portals try to design the structure and content of their web pages so as to achieve the highest ranks for the topics they want to address. Google also introduced a highly targeted advertising model which is very important for the promotion of all kinds of websites, including tourism websites, via search engine marketing (SEM) (Lewandowski 2018): Instead of pay-per-view banners, which advertisers bought on popular websites and search engines for a specified number of banner views, Google invented small pay-per-click text ads called adWords. Once a particular website has been listed at the advertising and web analytics services (Fig. 9), adWord advertising campaigns can be defined for that website. An adWord text is defined with a (deep) link to one of the website’s web pages (i.e., the landing page) together with a keyword bid specifying the maximum amount of money the advertiser is willing to spend for every user visit and an advertising budget limit. An advertising server collects for every keyword all bids with
remaining click budget and uses them to calculate the position where the adWords are displayed as sponsored links above or beside the search result list whenever a user searches for the specified keyword. Only if a user clicks on an adWord and is directed to the promoted web page (e.g., with an adWord-fitting tourism offer) does the sponsor/advertiser have to pay for the adWord; the budget is then reduced by an amount typically less than the maximum bid, because Google follows a Vickrey auction scheme to stay attractive for advertisers (Varian 2007). To ensure both end user relevance of advertising and advertising revenues, adWords which are clicked more frequently by end users may earn a better position in the sponsored link list than adWords with fewer clicks but higher bids per click. With extra web analytics services, website owners may embed Google’s server- and browser-side scripting code snippets into the HTML code of their web pages to monitor visitor activities on their website. Those scripts call the web analytics server whenever specified server-side or browser-side JavaScript events occur. The web analytics services collect all monitoring events, evaluate them statistically, and provide many scores and statistical diagrams for the website owners. With those search engine and website statistics, leading search engines are able to collect sensitive information about both the website owner’s business and its users, which may be a risk for business data protection and personal privacy. Besides Google, Microsoft’s Bing, Yahoo/Oath, Baidu in China, and Yandex in Russia are well-known general web search engines. In E-Tourism, web search engines have found an even more specialized usage as travel search engines (TSE) (Goecke 2014a; Benckendorff et al. 2019).
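The positioning and pricing of sponsored links described above can be sketched with a simplified second-price position auction in which ads are ordered by bid times click rate. This is an illustrative assumption in the spirit of Varian (2007), not Google's actual implementation; the class `Ad`, the budget rule, the reserve price, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Ad:
    advertiser: str
    max_bid: float      # maximum amount the advertiser pays per click
    click_rate: float   # observed share of impressions that were clicked
    budget: float       # remaining campaign budget


def rank_ads(ads: List[Ad], slots: int, reserve_price: float = 0.01) -> List[Tuple[str, float]]:
    """Return (advertiser, price per click) for the sponsored link positions."""
    # Assumed budget rule: an ad takes part only if it can still afford one click at its bid.
    eligible = [ad for ad in ads if ad.budget >= ad.max_bid]
    # Frequently clicked ads can outrank ads with higher bids but fewer clicks.
    ranked = sorted(eligible, key=lambda ad: ad.max_bid * ad.click_rate, reverse=True)
    result = []
    for position, ad in enumerate(ranked[:slots]):
        if position + 1 < len(ranked):
            runner_up = ranked[position + 1]
            # Smallest price per click that still beats the next ad's score.
            price = runner_up.max_bid * runner_up.click_rate / ad.click_rate
        else:
            price = reserve_price
        result.append((ad.advertiser, round(min(price, ad.max_bid), 2)))
    return result


ads = [
    Ad("A", max_bid=2.00, click_rate=0.02, budget=50.0),
    Ad("B", max_bid=1.00, click_rate=0.05, budget=50.0),
    Ad("C", max_bid=3.00, click_rate=0.01, budget=50.0),
    Ad("D", max_bid=5.00, click_rate=0.04, budget=1.0),   # budget exhausted
]
positions = rank_ads(ads, slots=2)
print(positions)
```

In this example advertiser B wins the top position with the lowest bid because its ad is clicked most often, and both winners pay less than their maximum bid.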
While GDS had aggregated most globally distributed flight, hotel, and rental car offerings, many flight offers, especially from low-cost carriers, offers from smaller hotels or rental car providers, and discount offers from suppliers, tour operators, and consolidators are not bookable via GDS. Instead, those non-global offers are presented and can be booked only on the suppliers’ websites, on web portals of online travel agents, or on websites and portals of the tourism offices of destinations. Travel search pioneers used travel search technology to collect those non-GDS offers from suppliers’ and online travel agents’ websites and to aggregate them in a database. A search engine database with its structured tables makes all those offers from many IBEs/WBEs searchable and comparable with one another like a meta-search result. The only difference between an IBE/WBE and a travel meta-search engine is that, for reservation and booking, the user is offered a link to the source website where the specific offer was found and where it might be bookable. Similar to IBEs/WBEs, travel search engines are technically specialized for flights, hotels, rental cars, etc. They are often jointly integrated into travel search portals and even combined with reviews and travel tips from a travel community, as is the case with, e.g., TripAdvisor. Figure 10 shows the simplified architecture and components of a flight search engine, which acquires flight offers either by scanning airline websites and online travel agents’ portals for sales offers or by querying GDS or IBE databases of affiliated advertising partners directly via Web service APIs. Its crawler regularly visits the web pages of unaffiliated suppliers and online travel agents, requests offerings like an unregistered normal user, and extracts the details of every listed offer from the result list’s HTML code in a process called web
Fig. 10 Architecture of a flight search engine (simplified). Components shown: web flight search engine with flight searcher (search mask, scraping of websites and querying of Web services for unsponsored flight offers, querying the index database, showing results), flight search index (collected flight information and deep links sorted by destination, date, and price), search engine media archive (logos, photos, banners, videos), and advertising and sponsored link server (collects offers and deep links from sponsors and bills them); airline websites 1 … k and OTA portals n … l, connected by scraping or Web services; HTML and Web services over HTTP(S) via the Internet. Example dialogue: (1) the web browser of the flight search engine with the search form (From: MUC, Dest: AGP, Date: 10.3., One Way) and a search result listing sponsored links, ads, and flights 1 … l with price and deep link; (2) the deep link to “flight l” leading into the airline’s or OTA portal’s IBE booking form (Flight l, price, From: MUC, To: AGP, Dep: 10.3.2019 9.00, Arrival: 10.3.2019 12.00, passenger name, address, credit card, BOOK)
scraping. All scraped offers are stored in the search engine’s index database as searchable records together with a deep link URL of the web page where the offer was found. The end user may now query the flight searcher with a web search mask similar to that of a WBE/IBE for all matching flight offers, which are listed in the same way as WBE/IBE output. Instead of booking buttons, deep links into the websites where the offer may be bookable are presented. The pioneering travel search engines earned revenues only from banner advertising and sponsored links. Before a website’s offers can be scraped, a detailed analysis of the website’s dialogue steps and page layouts by the crawler’s programmers is necessary. Another problem of scraping results is that they quickly become outdated and may infringe the copyrights and business rules of the scraped websites, especially when images and other valuable content are archived. While many website owners are happy when their offerings are visible to the many visitors of today’s established travel search engines like Kayak, Skyscanner, Trivago, etc., other suppliers shy away from the direct comparison of their offerings or fear customer complaints if scraped prices or offer descriptions on search engines are misleading. Therefore, travel search engines developed a new business model in which affiliated travel suppliers and OTAs can either deliver their correct offerings directly via XML Web service interfaces to the search engine’s database or respond just in time to a user’s search engine query. Further offers can also be delivered by Web services of cooperating WBE/IBEs or even GDS, who offer their supplier and tour operator clients extra promotion in leading travel search engines. While the pure link-less listing of an offer as a search result normally is free of charge, leading search engines
introduced pay-per-click pricing for offers listed with a deep link for booking. An even higher pay-per-booking fee is charged if a customer actually books an offer, which can be detected by a TSE’s monitoring script on the referenced booking engines. The advertising and sponsored link server (Fig. 10) of a TSE controls the business logic of those models. This TSE business model is so attractive that leading OTAs like Expedia, Priceline.com, or Ctrip acquired TSEs, and even the general search engine giants started their own travel search services, e.g., Google Flights. They embed matching travel offers within a “vertical search results” box into their generic result lists together with deep links to their own white label booking engine services. These offer the customer a booking page branded by the supplier, with all content and process steps under the supplier’s responsibility and liability. This is an opportunity for suppliers without their own booking infrastructure, but they risk a growing dependency and higher advertising costs from TSE listings. Even GDS and destination portals benefit from travel search engines because of their meta-search capabilities. Travel agents in travel agencies’ offices use GDS search engines which aggregate the results of a search within the GDS with non-GDS content from numerous TSE searches of connected web booking engines and web portals. As a result, the travel agent gets a better view of all offers in the market and may either book the best offer directly in the GDS or follow a deep link to the best supplier’s web booking engine, where the agent books for the customer in agency mode. Even big tourism destinations with many tourism regions, cities, villages, and attractions use meta-search engines to integrate the diverse web booking facilities of their members: Customers who search for offerings across a specific destination use the search mask of the destination portal’s travel search engine. It starts a meta-search across all websites and WBEs of the destination’s member organizations and delivers deep links to fitting offers.
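The two acquisition paths of a travel search engine described in this section, web scraping and delivery via Web services, can be sketched together as a small meta-search. The HTML markup, the attribute names `data-flight` and `data-price`, the hosts `airline.example` and `ota.example`, and all offers are hypothetical; as noted above, a real scraper requires a prior analysis of each website's dialogue steps and page layouts, and no network access is performed here.

```python
from html.parser import HTMLParser
from typing import Dict, List

# Hypothetical result list as an unaffiliated airline website might render it.
AIRLINE_HTML = """
<ul>
  <li class="offer" data-flight="XY123" data-price="89.00"><a href="/book?f=XY123">Book</a></li>
  <li class="offer" data-flight="XY456" data-price="129.50"><a href="/book?f=XY456">Book</a></li>
</ul>
"""

# Hypothetical offers delivered by an affiliated partner via a Web service.
PARTNER_OFFERS = [
    {"flight": "ZZ9", "price": 75.0, "source": "https://ota.example",
     "deep_link": "https://ota.example/book?f=ZZ9"},
]


class OfferScraper(HTMLParser):
    """Extracts offers and their deep links from the assumed result list markup."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.offers: List[dict] = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "li" and attributes.get("class") == "offer":
            self._current = {
                "flight": attributes.get("data-flight"),
                "price": float(attributes.get("data-price", "nan")),
                "source": self.base_url,
            }
        elif tag == "a" and self._current is not None and "href" in attributes:
            # The deep link leads back to the page where the offer may be bookable.
            self._current["deep_link"] = self.base_url + attributes["href"]

    def handle_endtag(self, tag):
        if tag == "li" and self._current is not None:
            self.offers.append(self._current)
            self._current = None


def scrape(base_url: str, html: str) -> List[dict]:
    parser = OfferScraper(base_url)
    parser.feed(html)
    parser.close()
    return parser.offers


def meta_search(scraped_pages: Dict[str, str], service_offers: List[dict]) -> List[dict]:
    """Merge scraped and delivered offers into one result list sorted by price."""
    offers = list(service_offers)
    for base_url, html in scraped_pages.items():
        offers.extend(scrape(base_url, html))
    return sorted(offers, key=lambda offer: offer["price"])


results = meta_search({"https://airline.example": AIRLINE_HTML}, PARTNER_OFFERS)
for offer in results:
    print(offer["flight"], offer["price"], offer["deep_link"])
```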
Mobile Web, Apps, and Augmented Reality
When smartphones with touch screens spread in 2007, they could use existing mobile Internet connections via WLAN or 3G cellular mobile networks that had already been used by mobile PCs, web pads, or outdoor and in-car navigation systems. Smartphone web browsers implemented new device-specific gesture recognition interfaces and additional JavaScript functions to read the current device position and orientation from embedded GPS (Global Positioning System), gyro, and compass modules (Egger and Jooss 2010; Lester 2011). Operators of cellular mobile networks and WLAN base stations may provide cell position information to locate web users for location-based services. A mobility barrier of classic HTML 4.0 browsers was that their temporary browser caches offered no persistent storage for application-specific content. This is a problem for applications like mobile travel guides if a user has no permanent access to the mobile Internet and needs to find a route in a map that was not preinstalled or downloaded completely. Another issue was that some browser plug-ins for highly comfortable animated screen interfaces were not optimized for the low-energy and low-processor/storage environments of mobile devices.
Therefore, Apple used mobile apps as mobility-optimized applications which can be installed with their runtime binaries and app-specific content on its mobile device operating systems (Fig. 11). To prevent the problems of the PC era with malware and installation conflicts between PC applications from different vendors, mobile apps can be loaded only via Apple’s app store download service. Vendors may distribute their apps via Apple’s app store only after an app inspection by Apple. For apps and content that are not free of charge for end users, Apple charges commissions from commercial app and content providers, which led to new sustainable revenue streams for Apple and a more secure but completely closed mobile app world for Apple users and vendors. Microsoft, Google, etc. offer competing mobile operating systems for mobile devices from other smartphone manufacturers, which led to a segmented and incompatible landscape of mobile app systems. Therefore, mobile apps need to be programmed in different programming languages with device-specific optimizations for Apple’s, Google’s, or Microsoft’s operating system ecosystems (Van Drongelen et al. 2017). Mobile app technology enabled comfortable apps, e.g., for mobile payments, or location-based apps for car sharing, public transport ticketing, and tourist guidance on personal GPS smartphones with mobile Internet access (Goecke 2014c,d; Egger and Jooss 2010). Like classic web applications, mobile apps may also call app-specific Web services (App 1 . . . n; Fig. 11). These respond with XML content to be rendered and presented in device-optimized, highly interactive designs, which create new user experiences, especially for gaming and augmented reality (Nyheim 2019; Egger and Jooss 2010; Pease et al. 2007). A clever way to present existing web content to tourists in new ways is offered by mobile apps called augmented reality browsers (Lester 2011). They retrieve the position and orientation of their mobile device and call a
Fig. 11 Architecture of mobile web applications and mobile apps (simplified). Components shown: smartphone/web pad with hardware (GPS chip, gyro chip, etc.), operating system (GSM/WLAN cell localization), browser, and apps 1 … n, each with UI/logic and local data (e.g., mobile travel guide, city mobility app); application store (AppStore); app servers 1 … n and web servers 1 … k offering Web services; map and navigation service; eTicketing service; attraction web cam; mobile payment provider; connections via HTTP(S) over the mobile Internet
mobile mash-up Web service. Such services collect location-related information from web servers or Web services offering geospatial queries to retrieve georeferenced data (e.g., Wikipedia, Google/Bing web maps, or travel and hotel guides) and mix them for a uniform app presentation. Augmented reality apps use the embedded video camera of a mobile device to superimpose data about nearby points of interest (POIs) on the live view of the camera display. In real time, they blend in only the POI data whose positions match the current camera orientation (calculated from GPS, gyro, and compass data) as a simulated 3D presentation in the camera view. Users get a mixed reality experience on their mobile device: They see the real-life video scenery augmented with POI symbols, text, and either links to browse to more detailed POI information or deep links to the source websites where the POI data originated. It is even possible to include live video streams from web cams to view a ski resort’s weather conditions or visitor queue lengths, etc. Mixed reality gaming apps like Ingress pioneered tourism gamification (Horster and Kreilkamp 2017). While standard web applications developed with HTML 4, device-specific CSS, and HTML templates were universally accessible for free on every web browser and every device, they suffered from a mediocre user experience in comparison with device-optimized “native” mobile apps. To regain the advantages of universally accessible web applications without their problems on mobile devices, the completely redesigned HTML 5 standard was published in 2014. It enables all web browsers to download persistent content and offers new ways to integrate animated graphics and multimedia streams. A new CSS version adapts web content more easily to different device displays. The new art of programming HTML 5 web pages and templates that adapt themselves in an optimal way to any display type or output medium (incl. print) is called responsive web design.
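The orientation-based POI filtering described above for augmented reality browsers can be illustrated with a minimal sketch. It assumes that the device position comes from GPS and the heading from the compass; the function names, the 60 degree field of view, and the sample coordinates are hypothetical choices for illustration and are not taken from any particular AR browser.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def visible_pois(device_lat, device_lon, heading_deg, pois, fov_deg=60.0):
    """Return the names of POIs whose bearing lies inside the camera's field of view.

    pois is a list of (name, lat, lon) tuples; fov_deg is an assumed
    horizontal field of view of the camera.
    """
    half = fov_deg / 2.0
    result = []
    for name, lat, lon in pois:
        b = bearing_deg(device_lat, device_lon, lat, lon)
        # Signed angular difference in the range [-180, 180)
        diff = (b - heading_deg + 180.0) % 360.0 - 180.0
        if abs(diff) <= half:
            result.append(name)
    return result


# Hypothetical sample data: three POIs around a device at (0, 0)
sample_pois = [("north", 1.0, 0.0), ("east", 0.0, 1.0), ("south", -1.0, 0.0)]
looking_north = visible_pois(0.0, 0.0, 0.0, sample_pois)
looking_east = visible_pois(0.0, 0.0, 90.0, sample_pois)
```

A real AR browser would additionally limit the distance to POIs and use gyro data for the vertical tilt; both are omitted here to keep the sketch short.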
Although the expensive programming of operating system- and device-specific native mobile apps still offers some advantages in user interface design, HTML 5 offers many features to mark the content of HTML 5 web pages with semantic tags. Those semantic annotations like HTML-Microdata support the indexing of a website in a more detailed way and promote both an open, barrier-free access and a better ranking by web search engines. Another new feature is the JavaScript Web Speech API as enabler for natural speech recognition and processing services used by interactive voice browsers. Smart watches are the smallest wearable devices used as platforms for mobile apps, e.g., for touristic outdoor activities, while virtual reality glasses are still proprietary niche products to realize innovative indoor attractions.
From Destination Management Systems and Portals to Smart Destination Service Platforms
Destination Management Systems (DMS) are IT systems supporting both the sales and administration processes of tourism destinations and their tourist info offices (Werthner and Klein 1999; Buhalis 2003; Egger and Buhalis 2007; Weithöner and Raab 2014; Landvogt et al. 2017; Benckendorff et al. 2019).
10 Advanced Web Technologies and E-Tourism Web Applications
The first DMS emerged shortly after the success of CRS/GDS as part of municipal host-based CRS, which were later migrated to client/server PC systems. As mentioned before, the first destination websites spread early in the 1990s. Tiscover, Gulliver, and others pioneered web portals for shared browser-based self-service usage by suppliers and guests as the most important stakeholder groups of their destinations (i.e., Tyrol and Ireland). Those destination portals were powered by the first touristic web content management systems and integrated WBE/IBEs to sell regional hotels, accommodation tickets, and package tours (Fig. 12). A destination portal focuses on content aggregation and syndication, SEO (search engine optimization) and SEM (search engine marketing) activities, and banner campaigns with affiliates for a tourism region. SEO means optimizing the structure and content of web pages for a better listing in search engines, while SEM means organizing search engine advertising campaigns. A centralized coordination of these activities by a destination management organization (DMO) often has more impact than uncoordinated individual web marketing efforts of single destination member organizations. Even the individual websites of regional tourism suppliers are supported if they reuse the destination portal’s CMS and its white label web booking engine on their own websites and have them listed for referral by the higher-ranked destination portal. Some destination web booking engines even provide interfaces to the booking engines of affiliated OTAs, which are connected via channel management systems. Hotel channel managers are browser-based web applications in which a hotel supplier can configure which and how many of its vacancies may be distributed under which conditions via which connected OTA, tour operator, or GDS channel. Channel management services are offered not only by destination management
Fig. 12 Architecture and components of a smart destination management system (simplified). The figure shows the smart DMS with web portal CMS and SEO templates, GIS-CMS, hotel/B&B IBE, channel manager, event and eTicket IBE, meta search, guest card and visitor’s tax, eCRM and eMarketing, and a data warehouse with AI and data analytics. It is accessed by attractions, hotel/B&B suppliers, the municipality, guest card resellers, tourist info offices, and call centers via intranet and extranet, and by guests via the Internet through the destination portal, info kiosks, the destination app, supplier sites, affiliates, and OTA portals. Guest card readers and writers, IoT sensors, and payment providers are attached via Web services.
systems but also by specialized application service providers. They even provide seamless interfaces to hotel property management systems (PMS) for automated inventory and booking synchronization (Goecke 2014b). Other destination portals use a meta-search engine to collect offerings of different local booking engines or cooperating OTAs on request by portal visitors and direct the portal visitors via deep link to the source booking engine (see Fig. 12). Some destination management organizations chose to outsource or sell their web booking engines or web sales to leading OTAs, who often may have a more global reach especially to customers preferring different destinations for each holiday. Sometimes a whole destination portal is licensed to third-party providers under the specific usage conditions of a public-private partnership for some years before new providers may apply or an insourcing is reconsidered. Providers of legacy destination management systems had the advantage to integrate even fulfillment processes like brochure management or municipal tourism registration forms as well as tourism tax collection. They developed web front ends to integrate destination management mid- and back-office systems with destination portals. Other software providers introduced web-based clients for outdoor tourist info screens and kiosk systems as well as ticket vending machines. A very important business process involving cooperative marketing, sales, billing, transaction clearing, and settlement is the management of a destination’s guest card system (Pechlaner and Zehrer 2005). Guest, tourist, or destination cards provide a platform for product bundling, mobile payment, tax collection, rebating, and revenue sharing of all businesses, tourism attractions, and public transport providers in a destination. 
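The meta-search approach described above, in which a portal collects offers from several booking engines on request and forwards the visitor via deep link, can be sketched as follows. This is a minimal illustration under stated assumptions: the two engine functions are hypothetical stubs returning made-up offers, and real booking engines would be queried through their own Web service interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    hotel: str
    price_eur: float
    source: str      # name of the booking engine that produced the offer
    deep_link: str   # link back to the offer in the source booking engine


def meta_search(engines, destination):
    """Query each engine, keep the cheapest offer per hotel, and sort by price."""
    best = {}
    for engine in engines:
        for offer in engine(destination):
            current = best.get(offer.hotel)
            if current is None or offer.price_eur < current.price_eur:
                best[offer.hotel] = offer
    return sorted(best.values(), key=lambda o: o.price_eur)


# Hypothetical stand-ins for a local booking engine and a cooperating OTA
def local_ibe(destination):
    return [Offer("Hotel Alpenblick", 95.0, "local-ibe",
                  "https://ibe.example/alpenblick")]


def partner_ota(destination):
    return [Offer("Hotel Alpenblick", 89.0, "partner-ota",
                  "https://ota.example/h/1"),
            Offer("Pension Sonne", 60.0, "partner-ota",
                  "https://ota.example/h/2")]


results = meta_search([local_ibe, partner_ota], "Tyrol")
```

Each result keeps its deep link, so the portal can direct the visitor to the source booking engine instead of handling the booking itself.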
A web-based guest card server is connected to all card readers and writers of both authorized guest card issuers and card acceptance points, including many automated door or barrier openers, etc. (Goecke 2014d). Touchless communication between card readers/writers and tourist cards is possible by radio frequency identification (RFID). Even biometric data may be exchanged between smart passport cards and cameras or scanners. Those applications may use classic wired or wireless Internet connections as well as new low-energy wireless Internet of Things (IoT) protocols for sensors embedded into all kinds of things used or only passed by the guest. The IoT enables sensor and actuator applications in touristic facilities (smart hotel rooms) and outdoor environments (visitor monitoring and guidance in parks, caves, trade fairs, etc.), while the web of things (Guinard and Trifa 2016; W3C 2017) allows users to control smart things via browsers or apps. The collection of all tourist card transactions is necessary for transparent revenue sharing, tax collection, and tourist statistics. Anonymized booking and card transaction data accumulated in a data warehouse is the basis for future destination-oriented data mining (Höpken et al. 2015). Big data analytics requires masses of data for the analysis of visitor streams, preferences, and behavior with advanced statistical methods and innovative artificial intelligence (AI) tools like artificial neural networks (Werthner and Klein 1999; Fuchs and Höpken 2014). DMOs as well as GDS, WBEs, TSEs, etc. may be able to accumulate the
critical mass and expertise for such projects as a service for their members and partners. Because many destinations and hotels cannot afford the development and maintenance of native tourist apps for different mobile operating systems, providers of destination management portals developed standardized white label tourism apps and hotel apps. A white label destination app is programmed and maintained like a native app only once (Van Drongelen et al. 2017; Goecke 2014a). Then, different destinations or hotels can reuse the same skeleton app with different logos, layouts, menus, and functions customized individually for each destination or hotel. The content for every destination’s individualized app is delivered by a Web service of the destination’s app server, which is directly connected to the destination portal’s single content management system and GIS database with maps, routes, attractions, regional POIs, or even hotel-specific content. Increasingly, those GIS-enabled CMS evolve into single sources of data about georeferenced, destination-oriented public attractions as well as route and public transport guides for both tourists and local residents. Another innovative use of mobile apps is the simulation of tourist, guest, or destination cards with virtual smart cards. Smartphones with trusted embedded security modules and RFID or Bluetooth antennas may emulate the signals of a tourist RFID card or an eTicket whenever they are held against a wireless card reader (Egger and Jooss 2010; Goecke 2014d). For human or optical readers, the smartphone display presents a tourist card picture with a 2D bar code. Because smartphone apps may also track users between their tourist card transactions, they are useful for real-time monitoring and for the smart control of visitor flows by sending flow-dependent guidance tips.
Radio beacons attached at known points, e.g., in trade fairs, museums, airports, railway stations, or hotels, are a new way to locate the position of visitor smartphones very precisely in buildings and rooms. They may be used for both in-house visitor-guiding apps and the monitoring and control of in-house visitor streams. Innovative voice processing services will be helpful extensions for destination portals to support voice interaction and even language translation with mobile tourism apps and web browsers. Destination management systems and destination portals will be important components of future smart destinations for smart cities and smart regions to serve guests, citizens, and innovative e-Government processes (Buhalis and Amaranggana 2013; Höpken et al. 2015).
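As an illustration of the beacon-based indoor positioning mentioned above, the following sketch maps a visitor's smartphone to the location of the beacon with the strongest received signal. It is a deliberate simplification under stated assumptions: the beacon identifiers, location labels, and signal strength values are hypothetical, and real systems may combine several beacons or use more elaborate positioning methods.

```python
def locate_by_beacons(readings, beacon_positions):
    """Return the known location of the beacon with the strongest signal.

    readings maps beacon IDs to received signal strength in dBm (values
    closer to zero mean a stronger signal); beacon_positions maps beacon
    IDs to location labels. Unknown beacons are ignored.
    """
    known = {bid: rssi for bid, rssi in readings.items()
             if bid in beacon_positions}
    if not known:
        return None
    strongest = max(known, key=known.get)
    return beacon_positions[strongest]


# Hypothetical beacons installed at known points of a trade fair
beacon_positions = {"b1": "Hall A entrance", "b2": "Hall B stand 12"}
position = locate_by_beacons({"b1": -80, "b2": -55, "unknown": -40},
                             beacon_positions)
```

A visitor-guiding app could use the returned label to select the content for the current room, while aggregated labels over time would feed the monitoring of in-house visitor streams.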
Further Impacts and the Future of Advanced E-Tourism Web Applications
Web applications enabled new business models like pay-per-use, on-demand services, sharing, and crowdfunding with substantial impacts for open tourism innovation processes (Egger et al. 2016). Open source projects started to collaborate via WWW and to create content and software for free. This eroded established
business models of many copyright-driven media (e.g., travel guide books) but introduced new opportunities for CMS-driven content aggregation and syndication, which led to discussions on the special role of tourism organizations as open data and mobile travel guide providers (Sommer 2018). Because highly frequented web applications like web booking engines or search engines run into performance problems when too many users search in parallel, Google, IBM, Amazon, Microsoft, Oracle, and the global content delivery network Akamai developed special distribution methods for global content management. They replicate the function of a web server, spread content copies globally, or distribute a search index and databases to thousands of physical multiprocessor servers located in one or more server farms of the Internet cloud (Marinescu 2018). NoSQL or NOSQL databases (meaning “No” at first and “Not Only” later) support semantic knowledge graph queries and execute massively distributed parallel queries more efficiently than SQL databases (Sikos 2015). Distributed web application clouds (Baun et al. 2009) removed entry barriers for innovators to implement innovative E-Tourism services by clever mash-ups of existing Web services, using highly scalable network and global server infrastructures “on demand” without prior investments. Extreme economies of scale and scope caused by e-commerce network effects led to the oligopolistic dominance of GAFA (Google, Apple, Facebook, and Amazon) over essential parts of the WWW and its Web services, content, knowledge, and online marketing (The Economist 2018). Even some GDS, OTAs, and TSEs achieved leadership in selected travel distribution and E-Tourism marketing sectors.
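The idea of distributing a search index over many servers, as described above, can be sketched in a few lines. This is a minimal in-memory simulation under stated assumptions: each "server" is just a Python dictionary, terms are assigned to shards by a hash of the term, and the class and function names are hypothetical; it does not describe how any of the named providers implement their infrastructure.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def shard_for(term, n_shards):
    """Map a term to a shard number using a stable hash of the term."""
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards


class ShardedIndex:
    """Term-partitioned inverted index; each shard stands in for one server."""

    def __init__(self, n_shards):
        self.shards = [dict() for _ in range(n_shards)]

    def add(self, doc_id, text):
        for term in set(text.lower().split()):
            shard = self.shards[shard_for(term, len(self.shards))]
            shard.setdefault(term, set()).add(doc_id)

    def search(self, query):
        """Return the IDs of documents containing all query terms."""
        terms = query.lower().split()
        if not terms:
            return set()

        def lookup(term):
            shard = self.shards[shard_for(term, len(self.shards))]
            return shard.get(term, set())

        # Look up the terms in parallel, as separate servers would
        with ThreadPoolExecutor() as pool:
            postings = list(pool.map(lookup, terms))
        return set.intersection(*postings)


index = ShardedIndex(n_shards=4)
index.add(1, "Hotel in Vienna near the opera")
index.add(2, "Flight Vienna to New York")
index.add(3, "Hotel in Salzburg")
hits = index.search("hotel vienna")
```

Because every term lives on exactly one shard, a query only contacts the shards holding its terms, which is one simple way to spread the load of many parallel searches.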
The new capabilities of mass personalization, mass customization, and mass data collection by search engines, social media, shopping sites, or e-Government services have critical implications: Like most web users, tourists also receive content and offers filtered by their own preferences, which creates filter bubbles and may amplify misinformation. Artificial intelligence (AI) algorithms that decide on the eligibility of citizens and guests for services and prices based on statistical predictions about their creditworthiness, customer lifetime value, or social score bear serious discrimination risks, which should be avoided by proper data protection and privacy standards. The possibility to encrypt Internet and WWW communication, to dynamically change the routes of data packets, and to share Web services as intermediaries for the repackaging and forwarding of encrypted messages (onion routing) is an enabler for anonymous distributed peer-to-peer networks. While they promise their users anonymous web use and confidential private information exchange, they are also misused as global platforms for illegal and criminal activities, which are subsumed under the term “the dark web” (Snow 2017). Decentralized and unregulated anonymous blockchain currencies like bitcoin or ether may support legal and illegal payments. At the same time, blockchain platforms like Ethereum deliver distributed ledgers as trustworthy frameworks for secured distributed business transactions (Werbach 2018). Blockchain-enabled bookings, payments, insurance contracts, tax declarations, and financial accounting may be fulfilled by embedded smart contract programs, which might be a next disruptive innovation for E-Tourism and for web of things transactions. Some authors name them Web 3.0
or 4.0 depending on their classification of the semantic web in their web technology road maps (Ragnedda and Destefanis 2020; Kollmann 2018). Another open question is how sustainable tourism can be achieved with energy-sensitive tourist guidance and supply chain management apps (Ali and Frew 2013) as well as with more energy-efficient web devices and cloud services of the future.
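The notion of a distributed ledger as a trustworthy record of transactions, discussed above, rests on chaining records by cryptographic hashes. The following sketch shows only this chaining for hypothetical booking and payment records; it is not how Ethereum or any other platform is implemented and omits consensus, digital signatures, and smart contracts entirely.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the predecessor of the first block


def block_hash(index, prev_hash, payload):
    """Hash a block's content together with the hash of its predecessor."""
    record = json.dumps({"index": index, "prev": prev_hash, "payload": payload},
                        sort_keys=True)
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a new record that is linked to the current end of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({"index": index, "prev": prev_hash, "payload": payload,
                  "hash": block_hash(index, prev_hash, payload)})


def is_valid(chain):
    """Check that no block was altered and that all links are intact."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical booking and payment records
ledger = []
append_block(ledger, {"type": "booking", "hotel": "Pension Sonne", "nights": 3})
append_block(ledger, {"type": "payment", "amount_eur": 180})
valid_before = is_valid(ledger)

ledger[0]["payload"]["nights"] = 1  # tampering with an earlier record
valid_after = is_valid(ledger)
```

Changing an earlier record invalidates its stored hash, so the manipulation is detected when the chain is checked, which is the property that makes such ledgers attractive for shared transaction records.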
Cross-References
Augmented, Virtual, and Mixed Reality in Tourism
Electronic Data Interchange and Standardization
Semantic Web Empowered E-Tourism
The Evolution of Online Booking Systems
Web Information Retrieval and Search
References Ali A, Frew A (2013) Information and communication technologies for sustainable tourism. Routledge, London/New York Amersdorffer DA, Bauhuber F, Egger R, Oellrich J (eds) (2010) Social web im tourismus – Strategien – Konzepte – Einsatzfelder. Springer, Berlin Barker D (2016) Web content management: systems, features, and best practices. O’Reilly, Sebastopol Baun Ch, Kunze M, Nimis J, Tai St (2009) Cloud computing – web-based dynamic IT services. Springer, Berlin Benckendorff PJ, Sheldon PJ, Fesenmaier DR (2014) Tourism information technology, 2nd edn. Cabi, Wallingford/Boston Benckendorff PJ, Xiang Z, Sheldon P (2019) Tourism information technology, 3rd edn. Cabi, Wallingford/Boston Berchtenbreiter R (2014) IT-gestütztes Kundenbeziehungsmanagement. In: Schulz A, Weithöner U, Egger R, Goecke R (eds) eTourismus – Prozesse und Systeme, 2nd edn. DeGruyter, München/Berlin, pp 536–562 Berners-Lee T (1989/2019) https://www.w3.org/blog/2019/03/30-years-ago-the-world-changedforever/. Accessed 15 Dec 2019 Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. http://infolab. stanford.edu/~backrub/google.html. Accessed 20 Dec 2019 Buhalis D (2003) eTourism – information technology for strategic tourism management. Pearson Education, Harlow Buhalis D, Amaranggana A (2013) Smart tourism destinations. In: Xiang Z, Tussyadiah I (eds) Information and communication technologies in tourism 2014. Springer, Cham Comer DE (2018) The internet book: everything you need to know about computer networking and how the internet works, 5th edn. CRC Press, Boca Raton Deitel PJ, Deitel HM, Deitel A (2016) Internet and world wide web how to program. 5th international edition. Pearson Education, Boston et al Egger R, Buhalis D (eds) (2007) eTourism case studies. Butterworth Heinemann, Amsterdam, pp 310–324 Egger R, Jooss M (eds) (2010) mTourism – Mobile Dienste im Tourismus. 
Gabler Springer Wissenschaft, Wiesbaden Egger R, Gula I, Walcher D (eds) (2016) Open tourism – Open innovation, crowdsourcing and co-creation challenging the tourism industry. Springer, Heidelberg
Fensel D, Facca FM, Simperl E, Toma J (2011) Semantic web services. Springer, Berlin Fischer K (2014) Geschäftsreisemanagement und IT-Systeme. In: Schulz A, Weithöner U, Egger R, Goecke R (eds) eTourismus – Prozesse und Systeme, 2nd edn. DeGruyter, München/Berlin, pp 278–300 Fuchs M, Höpken W (2014) Data mining im tourismus. HMD Praxis der Wirtschaftsinformatik. 46:73-80 Goecke R (2014a) Systemarchitekturen touristischer IT-Applikationen. In: Schulz A, Weithöner U, Egger R, Goecke R (eds) eTourismus – Prozesse und Systeme, 2nd edn. DeGruyter, München/Berlin, pp 13–24 Goecke R (2014b) Informationsmanagement in Hotel- und Gastronomiebetrieben. In: Schulz A, Weithöner U, Egger R, Goecke R (eds) eTourismus – Prozesse und Systeme, 2nd edn. DeGruyter, München/Berlin, pp 371–405 Goecke R (2014c) Informationsmanagement bei Autovermietern. In: Schulz A, Weithöner U, Egger R, Goecke R (eds) eTourismus – Prozesse und Systeme, 2nd edn. DeGruyter, München/Berlin, pp 427–472 Goecke R (2014d) Elektronische Zahlungs- und Kartensysteme. In: Schulz A, Weithöner U, Egger R, Goecke R (eds) eTourismus – Prozesse und Systeme, 2nd edn. DeGruyter, München/Berlin, pp 516–535 Goecke R (2020) The evolution of online booking systems. In: Zh Xiang, M Fuchs, U Gretzel, W Höpken (eds) Hanbook of e-tourism. Springer Goecke R, Weithöner U (2014) IT-Systeme und Prozesse bei Reiseveranstaltern. In: Schulz A, Weithöner U, Egger R, Goecke R (eds) eTourismus – Prozesse und Systeme, 2nd edn. DeGruyter, München/Berlin, pp 442–472 Goecke R, Heichele H, Westermann D (2008) Lufthansa Systems: dynamic pricing. In: Egger R, Buhalis D (eds.). eTourism Case Studies. Amsterdam: Butterworth Heinemann. 310–324 Goecke R, Landvogt M (2017–2019) Digitaler Tourismus – Technologien, Systeme, Geschäftsmodelle. vhb Virtuelle Hochschule Bayern https://kurse.vhb.org/VHBPORTAL/ kursprogramm/kursprogramm.jsp?Period=71&School=3&Section=228. (Accessed 9 Dec 2019) Guinard D, Trifa V (2016) Building the web of things. 
Manning Publishing, Shelter Island Höpken W, Fuchs M, Keil D, Lexhagen M (2015) Business intelligence for cross-process knowledge extraction at tourism destinations. Inf Technol Tour 15:101–130 Hepp M (2019) Markup for hotels. https://schema.org/docs/hotels.html. (Accessed 20 Dec 2019) Hinterholzer St, Jooss M (2013) Social Media Markeing und –Management im Tourismus. Springer-Gabler, Berlin/Heidelberg Horster E, Kreilkamp E(2017) Gamification im Tourismus. In: Landvogt M, Brysch AA, Gardini MA (eds) Tourismus, E-Tourismus – M-Tourismus – Herausforderungen und Trends der Digitalisierung im Tourismus. Erich Schmidt Verlag, Berlin Kollmann T (2018) Grundlagen des Web 1.0, Web 2.0, Web 3.0 und Web 4.0. In: Kollmann T (eds) Handbuch Digitale Wirtschaft. Springer Gabler, Wiesbaden Kwoka S (2010) Geschäftsreise-Management mit IT Systemen. Schulz A, Weithöner U, Goecke R (eds) Informationsmanagement im Tourismus – Prozesse und Systeme. Oldenbourg Verlag, München, pp 310–331 Landvogt M, Brysch AA, Gardini MA (eds) (2017) Tourismus, E-Tourismus – M-Tourismus – Herausforderungen und Trends der Digitalisierung im Tourismus. Erich Schmidt Verlag, Berlin Lester M (2011) Professional augmented reality browsers for smartphones: programming for Junaio, Layar and Wikitude (Kindle Edition) Lewandowski D (2018) Suchmaschinen verstehen, 2nd edn. Springer Vieweg, Berlin/Heidelberg Lewandowski D (2008, 2011, 2013) Handbuch Internet Suchmaschinen 1, 2 und 3. AKA Akademische Verlagsgesellschaft, Heidelberg Mahnicke R (2013) Business travel management. Springer Gabler, Berlin Marinescu DC (2018) Cloud computing – theory and practice, 2nd edn. Elsevier, Cambridge, MA Nyheim PD (2019) Technology strategies for the hospitality industry, 3rd edn. Prentice Hall, Upper Saddle River
O’Reilly T (2005) What is web 2.0 – design patterns and business models for the Next Generation of Software http://www.oreilly.com/pub/a/web2/archive/what-is-web-20.html (Accessed 27 Dec 2019) OTA – Open Travel Alliance (2003) Open travel alliance schema descriptions and examples. Version 2.0 30 May 2003. http://xml.coverpages.org/OTA-2003ADescriptionsAndExamples. pdf. (Accessed 22 Dec 2019) Pease W, Rowe M, Copper M (eds) (2007) Information and communication technologies in support of the tourism industry. IDEA Group Publishing, Herschey Pechlaner H, Zehrer A (2005) Destination-Card-Systeme: Entwicklung – Management – Kundenbindung. Linde Verlag, Wien Ragnedda M, Destefanis J (2020) Blockchain and web 3.0: social, economic, and technological challenges. Routledge, London, New York Schema.org (2019) Schema.org. http://www.schema.org (Accessed 27 Dec 2019) Schneider M, Enzmann M, Stopczinski M (2014) Fraunhofer SIT Web Tracking Report 2014. Fraunhofer Institut für Sichere Informationstechnologie SIT, Darmstadt Schulz A, Weithöner U, Goecke R (eds) (2010) Informationsmanagement im Tourismus – Prozesse und Systeme. Oldenbourg Verlag, München Schulz A, Weithöner U, Egger R, Goecke R (eds) (2014) eTourismus – Prozesse und Systeme, 2nd edn. DeGruyter, München/Berlin Sikos LF (2015) Mastering structured data on the semantic web – From HTML5 microdata to linked open data. Apress, New York Snow AW (2017) The deep web. Amazon Kindle Edition Sommer G (2018) Herausforderungen und Chancen einer offenen, digitalen Dateninfrastruktur im Tourismus. Ergebnisse des ersten Think Tanks zum Thema „Open Data im Tourismus“ sowie aktuelle Entwicklungen. https://okfn.de/files/blog/2018/08/ThinkTank2017_ Whitepaper_formatiert_Final.pdf. (Accessed 29 Dec 2019) Szeredi P, Lukaszy G, Benkö T (eds) (2014) The semantic web explained – the technology and mathematics behind web 3.0. Cambridge University Press, Cambridge, UK The Economist (2018) Special report fixing the internet. 
https://www.economist.com/specialreport/2018/06/28/how-to-fix-what-has-gone-wrong-with-the-internet. (Accessed Dec 2018) Unger Cl (2016) Corporate travel: hiding in plain sight. CreateSpace Independent Publishing Platform Van Drongelen M, Dennis A, Garabedian R, Gonzalez A, Krishnaswamy A (2017) Learn mobile App development – develop lean iOS and android apps using industry standard techniques and lean development practices. Packt Publishing, Birmingham/Mumbai Varian HR (2007) Position auctions. Int J Ind Organ 25(6):1163–1178 W3C (2017) Web of things (WoT) architecture – W3C first public working draft 14 Sept 2017 http://www.w3.org/TR/2017/WD-wot-architecture-20170914/. (Accessed 28 Dec 2019) W3C (2019) World wide web consortium. https://www.w3.org/. (Accessed 22 Dec 2019) w3schools.com (2019) w3schools.com – the world’s largest web developer site. https://www. w3schools.com. (Accessed 30 Dec 2019) Weithöner U (2007) Electronic Tourism – kleines Lexikon zu informationstechnologischen Systemen in der Tourismuswirtschaft. Hamburg (Deutschland): WiWi-Online.de. http://www. odww.net/artikel.php?id=359 Weithöner U (2014a) eMarketing und eCommerce – Internet-Basis, Voraussetzungen und Potentiale. In: Schulz A, Weithöner U, Egger R, Goecke R (eds) eTourismus – Prozesse und Systeme, 2nd edn. DeGruyter, München/Berlin, pp 65–93 Weithöner U (2014b) Web-Portale und Internet Booking Engines. In: Schulz A, Weithöner U, Egger R, Goecke R (eds) eTourismus – Prozesse und Systeme, 2nd edn. DeGruyter, München/Berlin, pp 314–324 Weithöner U, Raab U (2014) Destinationsmanagement-Systeme und Portale. In: Schulz A, Weithöner U, Egger R, Goecke R (eds) eTourismus – Prozesse und Systeme, 2nd edn. DeGruyter, München/Berlin, 301–313 Werbach K (2018) The blockchain and the new architecture of trust. MIT Press, Cambridge, MA
Werthner H, Klein S (1999) Information technology and tourism: a challenging relation. Springer, Vienna Wikidata.org (2019) https://www.wikidata.org/wiki/Wikidata:Main_Page. (Accessed 18 March 2019) Wikimedia Foundation Inc (2019) Wikipedia The free encyclopedia. https://en.wikipedia.org/wiki/ Main_Page. (Accessed 28 Dec 2019)
Web Information Retrieval and Search
11
Jürgen Dorn
Contents
Introduction
Information Retrieval
Retrieval Models
Queries
Indexing
Retrieval Functions
Weighting of Terms
Evaluation of Information Retrieval Systems
Information Retrieval in the Web
Crawling
Automated Indexing
The Relevance of Search Results
User Profiling
Metasearch
Operation of a Metasearch Engine
Search Engine Interfaces
Steps in Metasearch
Web Site Optimization
Search Engine Optimization
Search Engine Marketing
Bias in Web Search
History of Information Retrieval in Tourism
Expected Future Developments
Cross-References
References
J. Dorn
Institute for Information Systems Engineering, Technische Universität Wien, Wien, Austria
© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_16
J. Dorn
Abstract
Web information retrieval and search deals with information provisioning and information search on the World Wide Web. It is used by tourists to find information about, e.g., touristic services worldwide. The chapter starts with an introduction to information retrieval (IR) and the specifics of IR in the tourism domain, where it is argued that tourists search not only for information but also for real touristic services, which requires higher trust in the services and information found. Further, the basics of IR in local information systems such as a generic retrieval model, indexing, retrieval functions, the relevance of found objects, and the evaluation of IR systems are described. For a large distributed information system such as the Web with mostly unstructured or semi-structured content, specialized techniques such as crawling, automated indexing, text mining, computation of the relevance of objects, the ranking of results, and an extended architecture are described. Additionally, user profiling to achieve better matches for seekers is addressed. Web user behavior, Web site optimization, and keyword advertisement as technologies for marketing are included, as well as location-based services. Further, approaches to extract and transform semi-structured data into structured data, the deep Web, as well as metasearch, i.e., searching over distributed structured information resources and aggregating search results, are described. Finally, crowd-based approaches such as social tagging, by which users evaluate the quality and relevance of Web resources, are presented, and an outlook on trends is given.
Keywords Indexing · Information retrieval models · Metasearch · User profiling · Web crawling · Web site optimization
Introduction

Today many tourists use the Web to search for information about destinations and available resources such as transport and hospitality vacancies. Moreover, the Web is used to compare the service quality and price of touristic offers. In general, a tourist may search for different media types and services (text, audio, video, or even touristic objects such as flights, hotels, events, or whole travel packages). Since touristic services cannot be checked before consumption and have to be paid for in advance, trust in the found information and the expected service is especially important. Consequently, destination marketers increasingly design their Web sites to be persuasive, showing pictures and videos of their offers as a means of influencing travelers’ decision-making process (Werthner and Klein 1999) and their trust in such services.
11 Web Information Retrieval and Search
Two types of search engine interfaces can be distinguished: (1) in a focused search, a user knows a certain site (e.g., an airline portal) where she has to enter certain arguments (e.g., preferred date and destination) into a form to find a certain service, and (2) in a general search, a tourist looks for services and information by entering search terms into a general search engine (e.g., flight Vienna–New York). In the first case, one or more domain experts have designed the user interface according to typical interests. In the second case, the matching between search terms and different media objects is based on a textual comparison, where the media objects or services are indexed with keywords that describe them appropriately. Often, a two-phase approach is also taken: first we search for sites where flights are offered, and then we enter certain attributes of the desired service. The general search is also called horizontal search, in contrast to a vertical search, where the search is focused on a certain segment of content. Vertical search (e.g., search for hotels) often provides better results because certain concepts are defined. A special case of such a focused search is a Web site that manages content related to this domain locally.

A provider of touristic services has to consider carefully how its services are offered on the Internet. A service description can be submitted to a specialized Web site that manages such services, or the service can be characterized by appropriate terms so that prospective customers will also find the offers with a general search engine. Special mechanisms can be applied to promote such offerings.

In the following, we first consider the case that touristic information is searched for at one specific site (e.g., visiteurope.com). Later we extend this to a search on the Web, where distributed information has to be considered and the amount of information and the number of sites are typically unknown. Here, the concepts for modeling users’ explicit and implicit preferences become more important. Then we look briefly at the service provider side to show how providers can ensure that their offers are found by potential customers.
Information Retrieval

Information retrieval is the scientific discipline that deals with the representation, organization, storage, and maintenance of information objects, in particular textual objects (Jensen and Rieh 2010). Often, the process of retrieving or searching for information is also called information retrieval. The representation and organization of the information items should give the user easy access to the relevant information and satisfy the user’s various information needs. Today, information retrieval focuses on electronically stored objects and usually on Web-accessible objects. In principle, however, information retrieval is a much older discipline that has been applied in traditional library systems since ancient times (Encyclopædia Britannica Online 2019). The first proposals to organize information by computers and to support the search for information in computer memory were published by Maron and Kuhns (1960).
An information retrieval system (IRS) is a software system that implements one or more retrieval functions. A search engine is a specific type of information retrieval system. Classified directories are another type, through which a user can navigate and browse to find relevant information. Furthermore, a social bookmarking system can be seen as an information retrieval system as well: a tourist stores interesting bookmarks, which she can revisit later or share with like-minded people.

In a simple information retrieval system, all information might be stored locally in a database system, and the individual items are typically records in such a database. Information could also be stored in files, but for structured information, database management systems provide faster access. The collection of all available information items is usually called the corpus. If the system is locally managed, e.g., by a tourism service provider, the responsibility for the correctness of the information is clearly defined. For example, a hotel reservation system may contain millions of hotels with their services, addresses, and other attributes, but the owner of this database will ensure data quality to avoid incorrect search results. In this setting, the number of stored objects and their attributes are known, and a function can be defined to evaluate how well a result matches a customer’s expectation. Thus, the best solution can be found in a reasonable time. Furthermore, a single responsible body decides on the structure of the available information and the functionality of the system.
Retrieval Models

Based on the retrieval function, the main retrieval models are the Boolean, the vector space, the probabilistic, and the network model. Existing retrieval systems often offer a mixture of these models to provide full expressiveness and efficiency. Concerning the relevance of items, we may distinguish binary decisions (whether an object belongs to the solution or not) from gradual decisions, where we evaluate how well an item fits a solution. Thus, the quality of retrieval results is an estimate of how well the proposed object matches the customer’s expectations.

Information retrieval systems are also distinguished by how the information items are stored and how queries are posed to the system. The items may be represented as raw numbers (number of rooms in the hotel), text (a description of the environment of the hotel), binary stored objects such as pictures and videos, or classified terms. For example, hotels are typically classified according to their services, often expressed with stars, or into certain hotel types like “wellness hotel.” Such classifications are called a controlled vocabulary.
Queries

Most user interfaces of general search engines provide a simple text search box in which a tourist enters terms that specify the characteristics of the searched object. Typically, an object (e.g., hotel), a location (e.g., Vienna), and some other restrictions or characterizations are given in the search query. Search engines partially support complex logical queries similar to those of database management systems. Moreover, additional operators based on typical Web resources are often provided. Hardwick (2018) explains 42 advanced operators that can be applied in Google search (e.g., a $ sign to find prices, “” for exact matches, map:Vienna to find a map for a region). However, users usually apply only simple queries. Search queries contain on average three words, and less than 5% of queries contain logical operators (Jansen et al. 2005).

Search engine user interfaces often support an autocomplete (also called autosuggest) function, which provides users with suggested search terms as they type their query into the search box. This feature relies on algorithms that match the input against previous search terms of other users, e.g., to handle misspellings. Autocomplete can have an adverse effect on result quality when negative search terms are suggested during a search. Autocomplete has now become a part of reputation management, as companies linked to negative search terms try to alter the results. Google, in particular, offers to eliminate certain negative associations with a person’s or company’s name.

Some search engines also support natural language input. Natural language search is carried out in everyday language, with questions phrased as in a human conversation. These queries can be typed into a search engine, spoken aloud with voice search, or posed as a question to a digital assistant, and are then transformed into logical expressions. To enable such a search, the semantics of certain concepts must be modeled in information retrieval systems.
Indexing

If stored objects are large texts or even pictures or videos, textual descriptors are required to enable a search for these objects. In the case of texts, these index terms speed up the search because the whole text no longer has to be scanned. For other objects such as pictures or touristic services, the index terms are necessary because the user queries with textual expressions. In a simple local retrieval system, we expect that a domain expert typically uses a controlled vocabulary to describe stored information objects.

Simple indexing approaches store, for each information object, all its index terms in tabular form. If a new object is stored, the insertion is trivial and fast. However, the search for objects indexed with a certain term would require a search through the whole index. Thus, in larger retrieval systems, an inverted index is used, where for each index term a link to every information object indexed with it is stored (Zobel and Moffat 2006). For example, if we index a hotel as “wellness hotel”, the term “wellness hotel” is stored in a row with entries for all hotels indexed with this term.

Indexing can also be performed automatically. In that case, an index is created for terms that are characteristic of a text. Today, most operating systems analyze all documents (including emails) on a desktop computer to enable fast access to documents containing certain terms. In general, statistical analysis of the text objects is used to find out which terms are relevant.
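The difference between a per-object term table and an inverted index can be sketched in a few lines of Python. This is a minimal illustration, not a production index: the toy corpus, the object identifiers, and the function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy corpus: object id -> index terms from a controlled vocabulary.
FORWARD_INDEX = {
    "hotel_a": ["wellness hotel", "vienna"],
    "hotel_b": ["city hotel", "vienna"],
    "hotel_c": ["wellness hotel", "salzburg"],
}


def build_inverted_index(forward_index):
    """Map every index term to the sorted list of objects indexed with it."""
    inverted = defaultdict(set)
    for object_id, terms in forward_index.items():
        for term in terms:
            inverted[term].add(object_id)
    return {term: sorted(ids) for term, ids in inverted.items()}


def lookup_forward(forward_index, term):
    """Lookup without an inverted index: scans the term list of every object."""
    return sorted(oid for oid, terms in forward_index.items() if term in terms)


def lookup_inverted(inverted_index, term):
    """Lookup with an inverted index: a single dictionary access."""
    return inverted_index.get(term, [])


INVERTED_INDEX = build_inverted_index(FORWARD_INDEX)
```

Both lookups return the same objects; the inverted variant simply avoids touching objects that are not indexed with the requested term, which is what matters once the corpus becomes large.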
Retrieval Functions

The simplest retrieval function is Boolean retrieval, where we simply ask whether a certain object is available in the corpus, i.e., whether an exact match between query and object exists. Typically, the user specifies certain characteristics of a searched object, and the system returns all objects matching these characteristics. We could, for example, ask for a certain hotel in a hotel database, and the retrieval system would return either true (together with all matching hotels) or false if no matching hotels are found. Usually, we would ask for hotels having certain attributes, for example, hotels in Vienna (where the attribute location is Vienna). In this case, we expect the retrieval system to return all matching Viennese hotels.

With the Boolean model, we can use logic operators such as AND, OR, NOT, and NOR to specify what kind of object we are interested in, for example, “hotels with the location in Vienna that are NOT in the first district OR having a cheap price.” If all logical operators are supported, very complex queries are possible, but most users would be unable to cope with such a user interface (Greene et al. 1990). Complex logical queries are standard for relational databases, but typical Web users are given simple forms with interactive filters and sorting capabilities instead. Moreover, in the Boolean model, all retrieved objects have the same relevance, and a user may not be interested in all results matching the logical query expression. Query evaluation is NP-complete even for simple queries that consist only of conjunctive expressions over existential clauses (Chandra and Merlin 1977). Although some experts prefer Boolean retrieval, ordinary users of the Web prefer other models where results are ranked by relevance, described next.

In the vector space model (Salton et al. 1975), the query and every text object of the corpus are vectors in an n-dimensional space with the terms as independent dimensions. Query and text objects are compared, e.g., by the cosine similarity. As a result of the vector space method, the k vectors most similar to the query vector and, thus, the k best matching objects are returned.

In the probabilistic model (Robertson 1977), the probability of the relevance of a document regarding a query is calculated. Basically, for each term of the query, the probability that the object satisfies the query term is calculated, and all probabilities are aggregated. The computed probability value can then be used to rank the results. Okapi weighting of terms, as described in the next section, is the basis of one possible probabilistic model.

In the inference network model (Turtle and Croft 1990), the relevance values of objects are computed in a network of probabilistic relations between objects and queries.
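To make the contrast between Boolean retrieval and the vector space model concrete, the following sketch implements both over a tiny corpus. It is only an illustration under simplifying assumptions: the documents, the whitespace tokenizer, and the use of raw term counts as vector components are choices made for the example, not part of the models as cited.

```python
import math
from collections import Counter

# Hypothetical toy corpus: document id -> text.
DOCS = {
    "d1": "cheap hotel in vienna near the opera",
    "d2": "luxury hotel in salzburg",
    "d3": "vienna city tour and museum tickets",
}


def tokens(text):
    """Very simple tokenizer: lower-case and split on whitespace."""
    return text.lower().split()


def boolean_and_not(docs, required, excluded=()):
    """Boolean retrieval: all required terms present AND no excluded term present."""
    hits = []
    for doc_id, text in docs.items():
        terms = set(tokens(text))
        if all(t in terms for t in required) and not any(t in terms for t in excluded):
            hits.append(doc_id)
    return sorted(hits)


def cosine(a, b):
    """Cosine similarity of two sparse term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_cosine(docs, query, k=2):
    """Vector space retrieval: return the k documents most similar to the query."""
    q = Counter(tokens(query))
    scored = [(cosine(q, Counter(tokens(text))), doc_id) for doc_id, text in docs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:k] if score > 0]
```

The Boolean function returns an unordered set of exact matches, whereas the cosine ranking also returns partial matches such as a hotel outside Vienna, ordered by similarity.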
Weighting of Terms

We assume that every object in the corpus is represented by a set of characteristic terms. If we also represent the user query by a set of terms, we can compare the set of query terms with each object set in the corpus. These sets can be modeled by binary vectors, where each dimension of the vector is a certain term and its value is 1 if the term occurs in the object. If we search for a hotel, the query vector has a 1 in the dimension “hotel,” and we would find all objects that also have a 1 in the dimension “hotel.” If we search for “hotel in Vienna,” this query may be represented by a binary vector with a 1 for the dimensions “hotel” and “Vienna.” We have to weight how important the two characteristic terms “hotel” and “Vienna” are for documents in the corpus as well as for the query. We analyze terms for every document in the corpus without knowing which terms will be queried later.

Term frequency (tf) is a measure counting how many times a term occurs in a document. If one of the terms occurs often (e.g., hotel), the document seems to be more relevant for the search than a document where this term occurs only once. However, terms that occur in many documents of the corpus carry little information for deciding which document is relevant. For example, if we have a corpus where many hotels are described, the term “hotel” may occur in every document, and, consequently, this term is irrelevant. Therefore, weighting based on the inverse document frequency (idf) is sometimes proposed, which for a term t and a corpus D of documents d is:

$$\mathrm{idf}(t, D) = \log \frac{N}{\sum_{d \in D :\, t \in d} 1}$$
where t is the term, N is the number of documents in the corpus, and the denominator is the number of documents in which the term t appears. The weighting measure term frequency–inverse document frequency (tf–idf) reflects how important a term in a document is in relation to the whole corpus. The value increases proportionally to the number of times a word appears in the document and is offset by the number of documents in the corpus that contain the word, which helps to adjust for the fact that some words appear more frequently in general:

$$\mathrm{tf\text{-}idf}(t, d, D) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t, D)$$

Furthermore, we may have queries with different terms that have different relevance for the query. In our example, the word “in” has a lower relevance than the word “Vienna.” Thus, in general, there are three components of a weighting approach for which researchers have developed different functions. Okapi weighting (Robertson et al. 1999) is given here as an exemplary formula, where N is the number of documents, tf is a term’s frequency in the document, qtf is a term’s frequency in the query, df is the number of documents that contain the term, dl is the document length in bytes, and avdl is the average length of a document in the corpus. k1 (between 1.0 and 2.0), k3 (between 0 and 1000), and b (usually 0.75) are free parameters used to fine-tune results:

$$\sum_{t \in Q, D} \ln\left(\frac{N - df + 0.5}{df + 0.5}\right) \cdot \frac{(k_1 + 1)\, tf}{k_1 \left((1 - b) + b\, \frac{dl}{avdl}\right) + tf} \cdot \frac{(k_3 + 1)\, qtf}{k_3 + qtf}$$
Many different functions exist that try to find relevant documents for searchers by considering the different aspects mentioned above. To identify the best weighting functions, we have to evaluate and compare different information retrieval systems.
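The tf–idf and Okapi weightings described in this section can be tried out on a toy corpus. The sketch below is illustrative only: the corpus and function names are invented, document length is counted in tokens rather than bytes, and the default parameter values are simply picked from the ranges given in the text.

```python
import math

# Hypothetical toy corpus: document id -> space-separated terms.
CORPUS = {
    "d1": "hotel vienna hotel spa",
    "d2": "hotel salzburg",
    "d3": "vienna museum",
    "d4": "beach resort",
    "d5": "mountain hiking trail",
}


def document_frequency(term, corpus):
    """Number of documents in which the term appears."""
    return sum(1 for text in corpus.values() if term in text.split())


def idf(term, corpus):
    """log(N / df); returns 0.0 for a term that occurs in no document."""
    df = document_frequency(term, corpus)
    return math.log(len(corpus) / df) if df else 0.0


def tf_idf(term, doc_id, corpus):
    """Term frequency in the document multiplied by the inverse document frequency."""
    tf = corpus[doc_id].split().count(term)
    return tf * idf(term, corpus)


def okapi_score(query, doc_id, corpus, k1=1.2, k3=1000.0, b=0.75):
    """Okapi weighting summed over all query terms that occur in the document."""
    doc = corpus[doc_id].split()
    q_terms = query.split()
    n_docs = len(corpus)
    # Assumption: lengths are measured in tokens here, not in bytes.
    avdl = sum(len(text.split()) for text in corpus.values()) / n_docs
    dl = len(doc)
    score = 0.0
    for term in set(q_terms):
        tf = doc.count(term)
        if tf == 0:
            continue  # only terms occurring in both query and document contribute
        df = document_frequency(term, corpus)
        qtf = q_terms.count(term)
        rarity = math.log((n_docs - df + 0.5) / (df + 0.5))
        doc_part = ((k1 + 1) * tf) / (k1 * ((1 - b) + b * dl / avdl) + tf)
        query_part = ((k3 + 1) * qtf) / (k3 + qtf)
        score += rarity * doc_part * query_part
    return score
```

Note that the logarithmic factor of the Okapi formula becomes zero or negative for terms that occur in half or more of the documents, so very common terms contribute nothing useful to the score in this sketch.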
Evaluation of Information Retrieval Systems

When retrieval systems are compared, two measures are typically taken into account: recall and precision. For tourists looking for information, the response time and the user-friendliness of the retrieval system are of course also important criteria. If a tourist searches for objects by specifying certain terms, all objects that satisfy his or her need are called relevant. In a binary retrieval function, relevant means that the matching between a search term and the found object is evaluated as true. For other retrieval functions, the relevance is mapped to a real value, and relevance means that the value is above a certain threshold.

Recall is the ratio of the number of retrieved relevant objects to the total number of relevant objects, i.e., it quantifies which fraction of all relevant objects is found. In Boolean retrieval on a local database, the recall should be 1.0. This is of course only possible if the corpus of all objects is finite and known. However, the retrieval system may also return objects that are not relevant. Precision is the fraction of returned objects that are relevant. Again, for a Boolean retrieval function with a restricted number of objects, the precision should be 1.0. Retrieved objects that are not relevant are called false positives, and relevant objects that are not retrieved are called false negatives.

In information retrieval, a perfect precision score of 1.0 means that every result retrieved is relevant (but says nothing about whether all relevant objects were retrieved), whereas a perfect recall score of 1.0 means that all relevant objects were retrieved by the search (but says nothing about how many irrelevant objects were also retrieved). In statistical analysis, when we classify objects into relevant and not relevant, the F-measure tests the accuracy of the classification. It considers both the precision p and the recall r.
A simple measure is F1, which is calculated as follows:

$$F_1 = 2 \cdot \frac{p \cdot r}{p + r}$$
For large corpora, this simple measure is often rejected, and measures such as F2 or F0.5 or, in general, Fβ are proposed, where either the recall (if β is greater than 1) or the precision (if β is between 0 and 1) is of higher importance. The general formula is as follows:

$$F_\beta = \left(1 + \beta^2\right) \cdot \frac{p \cdot r}{\beta^2 \cdot p + r}$$

We could now use these measures to compare different available Web search engines; however, we do not know the number of available Web pages, and therefore we also do not know the number of relevant Web pages. Consequently, for the World Wide Web, only rough estimates of the recall are possible.
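As a small worked example of the evaluation measures, the following sketch computes precision, recall, and the F-measure for one hypothetical search result. The function names and the hotel identifiers are invented for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set and a known relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


def f_beta(precision, recall, beta=1.0):
    """General F-measure; beta > 1 favors recall, 0 < beta < 1 favors precision."""
    denominator = beta**2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / denominator


# Hypothetical result: four hotels returned, three hotels are actually relevant.
RETRIEVED = ["h1", "h2", "h3", "h4"]
RELEVANT = ["h1", "h2", "h5"]
P, R = precision_recall(RETRIEVED, RELEVANT)
```

Here two of the four returned hotels are relevant (precision 0.5) and two of the three relevant hotels were found (recall 2/3). Because recall is higher than precision in this example, F2 is larger than F1, which in turn is larger than F0.5.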
Information Retrieval in the Web

Information retrieval in the World Wide Web differs from simple retrieval in local databases because the sources for the retrieval are distributed and not controlled by a single authority. A Web information retrieval architecture consists of a Web crawler, a document repository, an indexer, a document index, and a ranking algorithm that sorts the results. The ranking is typically applied only to a subset of the retrieved results, because it is based on heuristics that could not be applied to the whole result set for performance reasons. Additionally, a user’s query is optimized to make the search for results in the document index more efficient. Fig. 1 shows the typical architecture of a Web information retrieval system.
Fig. 1 Architecture of a Web information retrieval system. [Figure not reproduced; labeled elements: User, Query Optimization, Retrieval System, Result Ranker, Web, Web Crawler, Document Repository, Indexer; labeled flows: query, optimized query, documents, ranked list of documents.]
Crawling

One of the characteristics of the Web is that sources (Web sites) change continuously and the number of pages is usually growing. New Web sites are created, and existing Web sites offer additional pages, content, and services. Web sites may also disappear. Today it is estimated that Google, for example, has indexed over 5 billion Web pages. Since there is no central authority that a search service could ask for new Web sites and pages, one required function for Web search is a crawler (explained below). Web content that is not found by any Web search engine is called the deep Web, in contrast to the surface Web. Reasons why certain content is not found include password-protected sites and sites with dynamic links generated from database content.

A Web crawler or spider is a program that tries to find as many relevant Web sites and pages as possible and puts the links as well as the pages into a document repository (the corpus of subsequent search tasks). For general Web search, almost all Web sites must be classified as relevant, since we do not know which search queries will eventually be posed. A crawler starts with a seed, which is a collection of links to strategically well-selected Web sites. When the crawler analyzes the main pages of these sites, it identifies all the hyperlinks in the pages and adds them to a list called the crawl frontier. Depending on the applied search strategy, links from the frontier are recursively analyzed. A simple strategy is a breadth-first search: links entered first are processed first. A knowledge-based strategy is to evaluate each link or page with regard to certain criteria and then to rank the links by these criteria. The most recent version of each page is then stored as a file in the document repository.

The large number of Web pages implies that only a limited number of pages can be downloaded within a given time. Consequently, many different policies were developed to rank the items in the frontier list. Aspects considered here are the freshness of a Web page, the frequency with which a page is updated and/or searched for, the content, the accessibility (for mobile users and users with disabilities), the structure of a page, and many more. A crawler will also try to avoid duplicated content. In general, well-structured HTML pages can be processed by a crawler quite simply, but if the content is dynamically generated depending on certain arguments of a URL, or if other languages or media are referenced, crawlers are limited in their ability to recognize duplicated content.

Regular crawling by different crawlers can also generate considerable traffic on a Web site. If the owner of a Web site does not wish the site to be analyzed by a crawler, a robots.txt file can be stored on the site, in which rules define how crawlers should behave on the site.
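The breadth-first strategy with a crawl frontier and a robots.txt check can be sketched as follows. To keep the example self-contained and runnable without network access, the "Web" is a hypothetical in-memory link graph and the robots.txt content is an invented string; only `RobotFileParser` from the Python standard library is a real API here.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory "Web": page URL -> outgoing links (no network access).
LINK_GRAPH = {
    "https://example.org/": ["https://example.org/hotels", "https://example.org/flights"],
    "https://example.org/hotels": ["https://example.org/hotels/vienna", "https://example.org/"],
    "https://example.org/flights": ["https://example.org/private/admin"],
    "https://example.org/hotels/vienna": [],
    "https://example.org/private/admin": [],
}

# Invented robots.txt rules of the site owner.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""


def crawl_breadth_first(seed, link_graph, robots_txt, max_pages=10):
    """Visit pages in first-in, first-out order, skipping disallowed URLs."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    frontier = deque(seed)   # crawl frontier: links waiting to be processed
    seen = set(seed)         # avoids queueing the same link twice
    repository = []          # stands in for the document repository
    while frontier and len(repository) < max_pages:
        url = frontier.popleft()
        if not robots.can_fetch("*", url):
            continue  # respect the rules of the site owner
        repository.append(url)
        for link in link_graph.get(url, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return repository
```

Replacing the `deque` with a priority queue ordered by a page score would turn this into the knowledge-based strategy mentioned above; the scoring function itself would be a design decision of the crawler operator.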
Automated Indexing

When a Web information retrieval system stores billions of Web pages in a repository, the search for pages that are relevant for a given query must be very efficient. The indexer is responsible for enabling fast access to documents in the repository. Thus, the indexer analyzes each stored document for characteristic terms, which are stored in the document index with a link to the Web page. A Web site manager can specify certain keywords in an HTML document. These terms could be used as index terms. However, some Web page administrators tried to make their documents more relevant by inserting thousands of such keywords. Therefore, Google, for example, no longer uses these keywords. Special HTML elements such as title, description, or headings may also be used by an indexer because these should typically describe the content of the document, but, again, these can become irrelevant for the same reason.

For finding relevant terms in text documents, text mining (or knowledge discovery in texts) is used (Feldman and Dagan 1995). Different techniques and procedures were developed in the area of text mining. Text mining analyzes a text document and assigns terms to the document that characterize it best. Typical first steps are to eliminate HTML code (in the case of Web pages) and stop words (terms having no specific meaning, such as articles) and to reduce words to their word stems. Often, an attempt is also made to identify phrases that represent certain concepts. For example, we may index a text with the content “This was a very luxury hotel” simply with the terms “luxury” and “hotel” and/or with the concept “luxury hotel.” Perhaps the concept “very luxurious hotel” would also make sense. The co-occurrence of such words is usually identified by statistical means. The weighting of terms as described for non-Boolean retrieval functions can finally be used to identify characteristic terms and their relevance for a document.
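The first text mining steps named above (removing HTML code, removing stop words, and reducing words to stems) can be sketched with the Python standard library. This is a deliberately crude illustration: the stop-word list is a tiny invented sample, the regular expression handles only simple markup, and `naive_stem` is a stand-in for a real stemming algorithm, not an implementation of one.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; real systems use much larger ones.
STOP_WORDS = {"a", "an", "the", "this", "was", "is", "very", "and", "of", "in"}


def strip_html(html):
    """Remove tags with a simple regular expression (adequate only for toy input)."""
    return re.sub(r"<[^>]+>", " ", html)


def naive_stem(word):
    """Crude suffix stripping as a placeholder for a real stemmer."""
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def index_terms(html, top_n=3):
    """Return the top_n most frequent stems after HTML and stop-word removal."""
    words = re.findall(r"[a-z]+", strip_html(html).lower())
    stems = [naive_stem(w) for w in words if w not in STOP_WORDS]
    return [term for term, _count in Counter(stems).most_common(top_n)]


PAGE = "<html><body><h1>Hotels in Vienna</h1><p>This was a very luxury hotel.</p></body></html>"
TERMS = index_terms(PAGE)
```

In this sketch plain frequency selects the index terms; a weighting such as tf–idf over the whole repository would be the more realistic choice, and phrase detection (e.g., "luxury hotel") would require an additional co-occurrence step.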
The Relevance of Search Results

If we use a search engine for touristic information, we usually obtain a long list of links to pages that should satisfy the query. For every result, most search engines deliver a link to the found resource, a short abstract, the publication date, and the author or owner of the resource, as far as available. This list is called the search engine results page (SERP) and may contain millions of results, even if we search for hotels in a smaller city. These are typically Web pages that contain the search terms or terms similar to those given in the search query. Searchers typically prefer the first entries of the SERP. Experiments have shown that the top 5 links on Google’s SERP account for 76% of clicks. Consequently, the ranking of results has a great impact on search quality. Therefore, search engines rank the results on the basis of a calculated relevance of the Web pages. Typically, the relevance is computed from different criteria:

1. The degree of match between the search terms and the index terms of the Web pages. If a Web page is short and has few index terms, an exact match is more relevant than for a Web page with many index terms.
2. The search engine may have additional information about the user, for example, from the search history, the language used, and/or the region from which the search query was issued (based on the IP address of the user).
3. The importance of Web pages is a criterion independent of the individual user and the specific query terms. This value is calculated in relation to other Web pages.
4. The structure of a Web page, the correct usage of HTML, and the support for accessibility for persons with disabilities are further soft criteria for improving the ranking.
5. The authority of the owner of a Web site is also used. Consequently, a Web site of a well-known organization will be ranked higher than a private Web site.

Regarding the importance of a page: if many Web pages link to a certain page, this page is more important than other pages. If few Web pages link to a certain page, the page may be irrelevant. Moreover, the links to the Web page may be weighted by the importance of the originating Web page. Since the incoming weights may change when the importance of the originating pages changes, several iterations are required in the algorithm to obtain a value that is close to the theoretical value. Many algorithms exist to compute the importance of a Web page based on these ideas. The best-known algorithm is the PageRank algorithm (Page et al. 1998) developed by the founders of Google. Today, Google uses improved versions of this algorithm. An entire field, social network analysis, studies linked structures and computes characteristic values for nodes in such a network. With similar concepts, we may analyze the flow of information through a network as well as the impact of market participants on other participants (Easley and Kleinberg 2010).

In the past, the Google search engine showed the PageRank value of found Web pages in its search results. This led to investigations on how to optimize the PageRank value of a page. Today, the value is no longer shown, in order to reduce attempts to increase it artificially.

Such score-based approaches to rank search results by different criteria are typically secrets of the search engine developers and are often based on extensive experiments with human experts. Sophisticated approaches use techniques from machine learning to re-rank the upper part of the search results. For example, Burges et al. (2007) proposed LambdaRank, which uses a supervised neural network to learn a re-ranking of search results by looking at results pairwise. Wang et al. (2011) apply a similar approach to recognize semantic similarity between query and result. These approaches are computationally expensive and can therefore only be applied to a subset of the found results.
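The iterative computation of page importance from incoming links can be illustrated with a short power-iteration sketch in the spirit of PageRank. It is not the algorithm as used by any search engine: the link graph is invented, the damping factor of 0.85 is the conventionally quoted value and an assumption here, and the treatment of pages without outgoing links is one of several possible choices.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Iteratively estimate page importance from the link structure.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with a uniform distribution
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its importance on in equal shares to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page without outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: pages a, b, and d link to c; c links back to a.
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
RANKS = page_rank(LINKS)
```

In this graph, page c receives the highest value because three pages link to it, and page a ranks second because its single incoming link originates from the important page c, which illustrates the weighting of links by the importance of the originating page.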
User Profiling

If two persons search with a general search engine, they often get different results, because Web search is becoming more and more individualized. A user may enter explicit preferences in the search engine, or related programs may exchange data with the search engine (e.g., Google has several apps and sites such as Maps, Gmail, and YouTube that exchange data with the search engine if a Web user registers for these applications). However, implicit preferences are becoming even more important. Commercial search engines do not publish their individualization strategies, but different parts of these strategies can be assumed (Slawski 2006, 2019). For example, if proximity plays a role in the search phrase, Google ranks Web pages that offer services near the searcher’s location higher. Thus, the location of the searcher is used as an implicit preference. Searching for a hotel will provide us with links to local hotels regardless of whether we use a mobile device or a desktop computer. The location of the device can be determined by using GPS coordinates, the IP address, and/or the accessed Wi-Fi network.

Another implicit preference is the user’s language. Based on preferences set in the Web browser and partially on the region, a certain default language can be chosen. The Web-based search engine can also use information about the Internet browser and the operating system the searcher is using. Other preferences are derived from the search history and the Web sites visited by the user. Similar to e-commerce, where similar products are recommended, a search engine can rank Web pages higher that seem to be similar to pages the user has liked before.
Metasearch

Metasearch is an approach to combine results from different search engines. A user specifies in a query what she is searching for, and this query is then transformed into search queries for several ordinary search engines. The metasearch engine does not need an index and does not have to store the documents. This concept can be used for general as well as focused search. Dogpile is a general metasearch engine that fetches results from Google, Bing, Yandex, Yahoo!, and other popular search engines. One advantage of metasearch is the possibility to search anonymously, because the metasearch engine does not use the identity of the searcher. However, general metasearch engines often do not find many more results than large search engines.

A focused metasearch for tourism, by contrast, seems to be more successful than general metasearch. Since there are many online travel agencies and search engines for tourism whose information overlaps only partially, metasearch engines in the tourism domain are very successful in searching for accommodation, flights, rental cars, and more. Depending on the query, the metasearch engine may also restrict the number of queried search engines (e.g., if a certain search engine provides only information about services in a certain region). Metasearch can also be used to dynamically construct holiday packages from different sources. A specialized application is the use of metasearch in destination management, where a destination provides search capabilities for regional resources without developing its own database or search engine.
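How a metasearch engine might merge ranked lists without an index of its own can be sketched as follows. The chapter does not prescribe a merging method; the example uses reciprocal rank fusion as one simple, commonly used technique, and the two "engines" are hypothetical stand-in functions returning invented result lists.

```python
from collections import defaultdict


# Hypothetical stand-ins for underlying search engines; each returns a ranked list.
def engine_a(query):
    return ["hotel-a.example", "hotel-b.example", "pension-c.example"]


def engine_b(query):
    return ["hotel-b.example", "hostel-d.example", "hotel-a.example"]


def metasearch(query, engines, k=60):
    """Forward the query to every engine and merge the results by reciprocal rank fusion.

    Each result earns 1 / (k + position) per engine that returns it; k = 60 is the
    value commonly used for this technique and an assumption here.
    """
    scores = defaultdict(float)
    for engine in engines:
        for position, url in enumerate(engine(query), start=1):
            scores[url] += 1.0 / (k + position)
    # Sort by descending fused score; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


MERGED = metasearch("hotel vienna", [engine_a, engine_b])
```

Results returned by both engines are deduplicated and rise to the top, while results known to only one engine are still included, which reflects the benefit of metasearch when the underlying sources overlap only partially.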
Operation of a Metasearch Engine When a metasearch engine uses different underlying search engines, different kinds of cooperation are possible. The search engine provider may give the metasearch engine provider full access to its database and index file. For the search engine provider, this may require adaptations of its interfaces and additional maintenance work, and the search engine provider’s business model may be affected by the metasearch. Metasearch results in higher traffic on other search engines without higher income. Thus, certain benefits must be offered to the search engine provider. Search engines may also be noncooperative: they do not deny access but offer no interface. In this case the metasearch provider has to extract data by simulating ordinary Web searchers. This results in higher maintenance effort on the side of the metasearch engine provider, because the search engines may change their Web interface without notification. More critical processes such as reservation or booking of a tourism service can be realized only in a cooperative environment. A search engine provider may also try to refuse access by legal means. Existing commercial metasearch engines most likely build on close cooperation with search engine providers to avoid legal and technological problems. A detailed technical description of how the integration can be realized can be found in Dorn et al. (2009).
Search Engine Interfaces Search engines have different interfaces. They can be distinguished by the communication protocol, the required process to obtain data, the logic used for expressions, and the semantics. A simple protocol is HTTP, which is typically used by Web browsers. Another option is a REST Web service interface that exchanges data in XML, RDF, or JSON format. If the results of a search engine are not presented in a structured way, methods for extracting semi-structured data can also be applied (Carme et al. 2006). Search engines may also differ in the process required to obtain data, which is characterized by steps such as authorization, specification of interests, or search and booking of services.
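As a minimal sketch of what different data formats mean for a metasearch engine, the following two functions map a JSON response and an XML response onto one common record structure. Both payload layouts (`hotels`/`name`/`price` and `offers`/`offer`/`title`/`rate`) are invented for this example and do not describe any real search engine interface.

```python
import json
import xml.etree.ElementTree as ET
from typing import Dict, List


def parse_json_results(payload: str) -> List[Dict]:
    """Parse a hypothetical JSON response: {"hotels": [{"name": ..., "price": ...}]}."""
    data = json.loads(payload)
    return [{"name": h["name"], "price": float(h["price"])} for h in data["hotels"]]


def parse_xml_results(payload: str) -> List[Dict]:
    """Parse a hypothetical XML response: <offers><offer title="..." rate="..."/></offers>."""
    root = ET.fromstring(payload)
    return [
        {"name": offer.get("title"), "price": float(offer.get("rate"))}
        for offer in root.iter("offer")
    ]
```

One adapter per source keeps the format-specific code isolated, so a change in one engine's interface affects only its own adapter.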
Steps in Metasearch In a focused metasearch, where the engine tries to extract information from structured sources, the sources (search engines) may use different names for similar concepts (e.g., accommodation or hotel), concepts may have varying attributes with different value ranges (an outdoor pool may be an attribute or a value of a general attribute), or the results may be presented with different concepts (e.g., prices may be presented in different currencies and/or for different periods). Thus, the user query, as well as the results, has to be translated individually for every search engine. Semantic Web approaches, based on a central ontology, may support the translation process (Missikoff et al. 2003). Finally, methods of data cleansing have to be applied to the search results. Duplicate entries have to be detected and inconsistencies between duplicates eliminated, and all results have to be ranked according to user preferences. If the metasearch is applied to touristic services, the prices of offers from different search engines have to be considered, too. Figure 2 shows a simplified metasearch architecture.
Fig. 2 Metasearch architecture (Web interface, mobile client interface, and programming interface on top of a metasearch engine with user profiles, query engine, query translation, filter, result ranker, and logger; the engine sends queries to and receives results from search engines 1 to n on the World Wide Web)
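The steps described above (translating concepts, harmonizing prices, detecting duplicates, and ranking) can be sketched in a few lines. This is a simplified illustration, not the method of Dorn et al. (2009): the concept map, the exchange rate, the record fields, and the rule "keep the cheapest duplicate" are all assumptions made for the example.

```python
# Hypothetical mapping from source-specific concept names to a shared vocabulary.
CONCEPT_MAP = {"hotel": "accommodation", "lodging": "accommodation", "accommodation": "accommodation"}

# Made-up exchange rates to EUR, for illustration only.
RATES_TO_EUR = {"EUR": 1.0, "USD": 0.9}


def normalize(raw):
    """Translate one raw result into the shared concept, currency, and period."""
    nights = raw.get("nights", 1)
    price_eur = raw["price"] * RATES_TO_EUR[raw["currency"]] / nights
    return {
        "concept": CONCEPT_MAP[raw["type"].lower()],
        "name": raw["name"].strip(),
        "price_per_night_eur": round(price_eur, 2),
        "source": raw["source"],
    }


def deduplicate(offers):
    """Keep the cheapest offer for each (concept, name) pair."""
    best = {}
    for offer in offers:
        key = (offer["concept"], offer["name"].lower())
        if key not in best or offer["price_per_night_eur"] < best[key]["price_per_night_eur"]:
            best[key] = offer
    return list(best.values())


def rank(offers, max_price=None):
    """Filter by an optional price limit and sort by price, cheapest first."""
    if max_price is not None:
        offers = [o for o in offers if o["price_per_night_eur"] <= max_price]
    return sorted(offers, key=lambda o: o["price_per_night_eur"])


def merge(raw_results, max_price=None):
    """Full pipeline: normalize, remove duplicates, then rank."""
    return rank(deduplicate([normalize(r) for r in raw_results]), max_price)
```

Matching duplicates by name alone is a deliberate simplification; real data cleansing would usually also compare addresses or coordinates.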
Web Site Optimization From the perspective of a touristic service provider, methods can be applied to make it easier for potential customers to find resources such as information and newly available services. Web site optimization is an approach used to rank certain offers higher than a standard ranking would. There are two possibilities. With search engine optimization, an information or service provider tries to improve the Web site to achieve better ranks in a SERP. This ranked list is often called the organic search results. With keyword advertisement (i.e., search engine marketing), a provider pays for getting better ranks. These paid results are typically shown in a second list.
Search Engine Optimization Search engine optimization (SEO) is the attempt to increase the quality and quantity of Web site traffic by increasing the visibility of a page or a whole site to users of a search engine. To optimize a site, it is necessary to understand how search engines crawl, how they index resources, and how they rank results. The strategies change over time, and the exact details of these steps are usually trade secrets of search engines, but the big search engines publish recommendations for Web site providers. In principle, the recommendations help search engines to evaluate whether a certain site matches a query and a user’s interests. Google, for example, works with external search quality evaluators worldwide. These evaluators assess Web sites for the so-called E-A-T (Expertise, Authoritativeness, Trustworthiness) factors using the Search Quality Rater Guidelines (Google 2019). Google does not use these ratings for ranking directly, but for testing its algorithms. Web site providers have often identified certain strategies of search engines and misused these insights to improve their ranking in SERPs. For example, companies made extensive use of popular keywords to improve their ranking. Google bombing refers to the practice of causing a Web site to rank highly in search engine results for irrelevant, unrelated, or off-topic search terms. General recommendations for Web site providers are to use only a few keywords, to offer mainly original content (no copied content), to use correct HTML syntax, and to use data formats that cannot be indexed only sparingly (e.g., pictures should not be used too extensively). Further tactics to promote a Web site are to increase the number of backlinks (inbound links), especially in order to increase Google’s PageRank, and to offer a responsive design that enables use on mobile devices or by users with impairments.
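Why backlinks matter can be illustrated with a simplified version of the PageRank computation described by Page et al. (1998). The sketch below is a plain power iteration over a tiny link graph; it is not Google's production ranking, and the damping factor of 0.85 and the fixed number of iterations are common textbook choices assumed here.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores; links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small base share (random jump).
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on in equal parts to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page without outgoing links: spread its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank
```

In such a graph, a page without inbound links only receives the base share, and every additional backlink from another page raises its score.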
Search Engine Marketing In search engine marketing or keyword advertising, an information or service provider pays to have an advertisement appear in the results listing, similar to other search results, when a searcher uses a certain search phrase. The particular phrase is composed of one or more key terms that are linked to an advertisement. Usually, keyword advertisement is distinguished by the payment method: pay-per-click (PPC), cost per action (CPA), or cost per mille (CPM). AdWords (renamed Google Ads in 2018) is the well-known keyword advertising platform of Google. Service offerings or products that match search terms are displayed on the results page above the organic results. The ranking in this list is based on a bidding strategy. An advertiser offers a certain amount for ranking its advertisements as high as possible. Payment is based on pay-per-click.
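The three payment methods and the bid-based ordering can be made concrete with a small calculation. The function names and the example figures are assumptions for illustration, and ordering by bid alone is a simplification; real advertising platforms typically take further factors into account.

```python
def campaign_cost(model, price, impressions, clicks, actions):
    """Cost of a campaign under one payment model.

    price is the amount per click (PPC), per action (CPA),
    or per 1000 impressions (CPM).
    """
    if model == "PPC":
        return price * clicks
    if model == "CPA":
        return price * actions
    if model == "CPM":
        return price * impressions / 1000
    raise ValueError(f"unknown payment model: {model}")


def rank_ads(bids):
    """Order advertisers by their bid, highest first (simplified: bid only)."""
    ordered = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _bid in ordered]
```

For a campaign with 10,000 impressions, 200 clicks, and 10 bookings, a PPC price of 0.50 costs 100, a CPA price of 8 costs 80, and a CPM price of 4 costs 40, which shows how strongly the payment method shapes the cost.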
Bias in Web Search Several studies indicate various political, economic, and social biases in the results that search engines provide. With increasing personalization of search results, two users will typically obtain different results for the same query. These differences can be explained by the hardware and software used, the user’s search history, and other attributes. Today we speak of a filter bubble, meaning that Web users are led to Web sites that reflect their own beliefs and interests; users thus get less exposure to conflicting viewpoints and are isolated intellectually in their own informational bubble. Tavani (2016) states three reasons for search engine bias:
• Search engine technology is not neutral but has features embedded in its design that favor some values over others (e.g., pages in certain languages may be preferred).
• Major search engines systematically favor some sites (and some kinds of sites) over others in the lists of results they return in response to user search queries (e.g., the authority of an American official administration may be rated higher than that of a smaller country).
• Search algorithms do not use objective criteria in generating their lists of results for search queries.
History of Information Retrieval in Tourism Local computer-based information retrieval systems were first developed in the 1960s. By the 1970s different retrieval techniques for small text document collections were introduced, and in the late 1970s, large-scale retrieval systems came into use. Since 1992, the Text Retrieval Conference (TREC) has been organized to support the information retrieval community by supplying the infrastructure needed for the evaluation of text retrieval methodologies on very large text collections. This catalyzed research on methods that scale to huge corpora. The introduction of Web search engines has boosted the need for very large-scale retrieval systems even further. Until 1992 the Web was a small number of servers whose resources were handled manually by Tim Berners-Lee at CERN. In the following years, several search engines with very restricted capabilities were developed. In 1995 Yahoo! and AltaVista launched directories that users could browse interactively to find resources on the Web. One of the first widely used Web browsers was Mosaic, which, similar to today’s browsers, was able to show different types of media. In 1996 Netscape started a browser with a search engine. In 1995 Larry Page and Sergey Brin started their work on what was later called Google. The company Google was founded in 1998 and today offers the most successful general search engine. The market share of Google’s search engine in 2018 was about 86%. The second biggest player with about 4.6% is Bing from Microsoft Corporation (Statista 2019). However, in China Baidu is the most used search engine, and in Russia it is Yandex. GoTo.com launched one of the first keyword advertising models that was commercially successful, with a patent on the concept issued in 1998.
HRS (Hotel Reservation Service) Group was one of the first companies offering a hotel search in Europe on the Web. In 1991 Tiscover was founded as a tourism search portal in Austria; today it is part of HRS. In 1994 Venere.com was founded in Italy as a hotel search platform and later became part of Expedia. In 1996 Expedia was founded by Microsoft Corporation. In 1996 Booking.com was founded as one of the first hotel metasearch engines on the market. Today, Booking.com has the greatest share in hotel bookings. In 2000 TripAdvisor was founded. In 2004 Kayak was founded as a metasearch platform, which today has seven brands such as checkfelix.com and swoodoo.com, and was acquired in 2013 by the Priceline Group (now Booking Holdings), the parent company of Booking.com. Trivago was founded in 2005 as a metasearch platform for tourism and was later partially acquired by Expedia. In 2016 Google launched Google Trips as a mobile app for travel information search. This app was shut down in 2019, but most of its features are now integrated into Google Maps.
With Google Hotel Ads, Google offers hotel owners a program similar to Google AdWords but specialized for hotel search, in which a commission is paid to Google if a tourist follows a Google link to the hotel and makes a booking. A similar program is offered to airlines, which can offer dynamic prices for searched flights.
Expected Future Developments Although we already see many applications of semantic reasoning in Web information retrieval, it will become more important in the future. The Semantic Web as formulated by Berners-Lee et al. (2001) demands content descriptions that enable the integration of content and systems. Today general search engines take over some of the tasks of the vertical search engines because they “understand” concepts such as hotel, flight, and costs. However, these concepts are more or less hardcoded, because the developers of such search engines know what kind of queries are often posed by users. Search engines such as WolframAlpha and Ask.com try to apply semantic reasoning on a more general level, but still, the full potential is not available because they reason only on selected sources. What we would expect is a touristic search engine that can reason about the preferences of even a group of tourists and then combine different offers on the Web into one new dynamically composed offer (i.e., dynamic packaging), where reasoning about prices, locations, time, and more is applied.
Cross-References
• Advanced Web Technologies and E-Tourism Web Applications
• Artificial Intelligence and Machine Learning
• Big Data Technologies
• Business Intelligence in Tourism
• Eye-Tracking Technology for Measuring Banner Advertising Efficacy on E-Tourism Websites: A Methodological Proposal
• Recommender Systems in Tourism
• Semantic Web Empowered E-Tourism
• The Evolution of Online Booking Systems
• Travel Information Search
• Website Evaluation Frameworks: A Review of the Hospitality and Tourism Field from 1996 to 2019
References
Berners-Lee T, Hendler J, Lassila O (2001) The semantic web. Scientific American. https://www.scientificamerican.com/article/the-semantic-web/
Burges CJC, Ragno R, Le QV (2007) Learning to rank with non-smooth cost functions. Advances in neural information processing systems, vol 19. MIT Press, Cambridge, MA
Carme J, Ceresna M, Frölich O, Gottlob G, Hassan T, Herzog M, Holzinger W, Krüpl B (2006) The Lixto project: exploring new frontiers of web data extraction, British national conference on databases, pp 1–15
Chandra AK, Merlin PM (1977) Optimal implementation of conjunctive queries in relational data bases. In: Proceedings of STOCS. 18. pp 77–90
Dorn J, Hrastnik P, Rainer A, Starzacher P (2009) Web service based meta-search for accommodations. Inf Technol Tour 10(2):147–159
Easley D, Kleinberg J (2010) Networks, crowds, and markets: reasoning about a highly connected world. Cambridge University Press, New York
Encyclopædia Britannica Online (2019) The history of libraries. https://www.britannica.com/topic/library/The-history-of-libraries. Accessed 19 July 2019
Feldman R, Dagan I (1995) KDT – knowledge discovery in texts. In: Proceedings of the first international conference on knowledge discovery (KDD), pp 112–117
Google (2019) Search quality tests. https://www.google.com/search/howsearchworks/mission/users/. Accessed 29 Jan 2020
Greene SL, Devlin SJ, Cannata PE, Gomez LM (1990) No IFs, ANDs, or ORs: a study of database querying. Int J Man Mach Stud 32(3):303–326
Hardwick J (2018) Google search operators: the complete list (42 advanced operators). https://ahrefs.com/blog/google-advanced-search-operators/. Accessed 19 July 2019
Jansen BJ, Spink A, Koshman S (2005) Web searcher interaction with the Dogpile.com metasearch engine. J Am Soc Inf Sci Technol 58(5):744–755
Maron ME, Kuhns JL (1960) On relevance, probabilistic indexing and information retrieval. J ACM 7:216–244
Missikoff M, Werthner H, Höpken W, Dell’Erba M, Fodor O, Formica A, Taglino F (2003) Harmonise – towards interoperability in the tourism domain, ENTER conference Helsinki
Page L, Brin S, Motwani R, Winograd T (1998) The PageRank citation ranking: bringing order to the web. In: Proceedings of the 7th international world wide web conference, Brisbane, pp 161–172. http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf
Robertson SE (1977) The probabilistic ranking principle in IR. J Doc 33:294–304
Robertson SE, Walker S, Beaulieu M (1999) Okapi at TREC-7: automatic ad hoc, filtering, VLC and filtering tracks. In: Proceedings of the 7th text retrieval conference, pp 253–264
Salton G, Wong A, Yang CS (1975) A vector space model for automatic indexing. Commun ACM 18(11):613–620. https://dl.acm.org/citation.cfm?id=361220
Slawski B (2006) 20 ways search engines may rerank search results. http://www.seobythesea.com/2006/10/20-ways-search-engines-may-rerank-search-results/
Slawski B (2019) When Google SERPs may undergo a sea change. Search Engine J. https://www.searchenginejournal.com/when-google-serps-undergo-sea-change/
Statista (2019) Worldwide desktop market share of leading search engines from January 2010 to April 2019. https://www.statista.com/statistics/216573/worldwide-market-share-of-search-engines/
Tavani H (2016) Search engines and ethics. In: Zalta EN (ed) The Stanford encyclopedia of philosophy (Fall 2016 edn). https://plato.stanford.edu/archives/fall2016/entries/ethics-search
Turtle H, Croft WB (1990) Inference networks for document retrieval. In: SIGIR’90 proceedings of the 13th annual international ACM SIGIR conference on research and development in information retrieval, pp 1–24
Wang R, Jiang S, Zhang Y, Wang M (2011) Re-ranking search results using semantic similarity. In: Eighth international conference on fuzzy systems and knowledge discovery (FSKD), Shanghai, pp 1047–1051
Werthner H, Klein S (1999) Information technology and tourism – a challenging relationship. Springer, Wien
Zobel J, Moffat A (2006) Inverted files for text search engines. ACM Comput Surv 38(2):43–56
12 Mobile Applications for e-Tourism
Wolfgang Wörndl and Daniel Herzog
Contents
Introduction
Foundations of Mobile e-Tourism Applications
Historical Overview
Development of Mobile Applications
Context, Localization, and Sensors
User Modeling, Personalization, and Recommendation
Mobile User Interfaces
A Taxonomy of Mobile e-Tourism Applications
Information and Search
Booking
On-Trip Services
Sharing and Interaction
Conclusion
Cross-References
References
W. Wörndl · D. Herzog, Department of Informatics, Technical University of Munich, Garching bei München, Germany
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_17
Abstract More and more people are using smartphones and other mobile devices as their main means for information access. This is especially true for travelers, and mobile applications supporting them have become very popular in recent years. This chapter first introduces basic concepts and technologies that are important for mobile applications for e-Tourism. After a brief historical overview, we then discuss issues regarding the development of mobile applications, such as determining the context of users with sensors. We also explain basic principles of user modeling and personalization and mobile user interfaces. The second main part of the chapter classifies and outlines existing mobile applications for travel and tourism. We introduce a taxonomy that reflects a traveler’s journey from vacation planning until concluding the trip. This includes applications that assist users in finding destinations; searching for and booking hotels, events, and activities; and identifying other travel-related items. A more specific scenario is the search for points of interest (POIs) that a user can visit during a trip. Modern e-Tourism applications allow browsing and filtering relevant POIs and can combine them into interesting and practical itineraries. In addition, we present applications for social networking, entertainment in e-Tourism, and others that are frequently used by tourists.
Keywords Mobile application · Traveler · Context · User modeling · Taxonomy · User interface
Introduction More and more people are using smartphones and other mobile devices as their main means for information access. Recent advances in computing power on mobile devices, networking bandwidth, and interaction capabilities have made it possible to use a mobile device not only as a supplementary device but as the preferred option to manage people’s needs. The trend for mobile information access is especially striking for traveling users, and mobile applications supporting tourists have become very popular. Today, people use mobile technologies in all stages of travel. This includes planning and booking but most importantly supporting the travel experience while on the move. Furthermore, people can review their trips and share their experiences after traveling. Mobile applications can provide a pleasant user experience to travelers and assist them with personalized services that are tailored toward their needs. In addition, they help service providers in tourism to understand their customers and adapt their offers accordingly. Mobile applications pose some challenges for tourists. The screen sizes of mobile devices are usually significantly smaller than the ones of desktop computers (Ricci 2010). The interaction is hence limited and becomes more difficult when a physical keyboard is missing. Furthermore, mobile applications heavily depend on the availability of mobile Internet and a suitable data plan which may be an issue in rural areas or when traveling in another country (Borràs et al. 2014). Nevertheless, the biggest advantage of mobile applications compared to traditional desktop- and web-based systems is that users can solve their travel tasks even when already on the move and thereby update their plans spontaneously. This chapter extends and updates existing overviews on mobile applications for tourism, e.g., Kennedy-Eden and Gretzel (2012), Wang and Xiang (2012), or Dickinson et al. (2014). Benckendorff et al. (2019) gave a broader overview on
the mobility paradigm and mobile technologies for tourism and also discussed signaling technologies and devices, such as digital cameras. Dorcic et al. (2018) surveyed academic research related to mobile technologies in literature, such as the ENTER Conference series organized by the International Federation for Information Technology and Travel & Tourism (IFITT). The chapter focuses on foundations and state of the art of mobile applications for e-Tourism in a stricter sense. A mobile application (or “app”) is a computer program running on a mobile device, such as a smartphone, tablet, watch, or any other device designed to be used in a mobile context. Thus, apps are intended to be used on the move, and the user is not tied to a particular location, such as a desktop computer in an office. The first part of this chapter discusses foundations and building blocks that are important for mobile applications for e-Tourism. The second part then classifies and covers existing applications along with different functions for mobile e-Tourism. The chapter concludes with a brief summary and short discussion of future developments.
Foundations of Mobile e-Tourism Applications This section first provides an overview of the historical development of mobile apps for e-Tourism and then discusses basic concepts and technologies. This includes the development of web-based, native, and cross-platform applications for mobile devices. An essential part of mobile applications is the use of sensors to determine the context of users, such as the traveler’s current location. This information can then be used together with a more static user model for personalization and recommendation which is particularly valuable in mobile scenarios. Last but not least, mobile user interfaces play an important role in tourism applications, which is why we explain basic principles in this domain as well.
Historical Overview While most of today’s mobile applications for e-Tourism run on smartphones and related devices, such as tablets and smartwatches, there have been approaches to supporting travelers already before the invention of smartphones. Most of these systems have been deployed in more specialized domains and controlled environments, such as mobile guides for museums. Other domains for early adaptive mobile guides related to tourism have been navigation systems or shopping assistants. Krüger et al. (2007) provided an overview on pre-smartphone solutions. These systems typically had some kind of representation of users and situations, especially the location within an exhibition, and tried to adapt the multimedia content that they show to visitors (Krüger et al. 2007). An example for an early museum guide was PEACH (Personal Experience with Active Cultural Heritage) (Stock et al. 2007). PEACH was a framework to support museum visits on personal digital assistants (PDAs), the predecessors of today’s smartphones. In addition to the
mobile device, the approach also integrated larger stationary screens. PEACH used virtual characters (avatars) for guidance through the museum with different focal points and was also able to generate a post-visit summary for the visitor to take home. Early mobile applications were often proprietary systems for the intended target device. Noteworthy milestones in standardizing mobile information access were the Wireless Application Protocol (WAP) and the Java 2 Micro Edition (J2ME) (Ahson and Ilyas 2010). WAP was a collection of techniques and protocols to bring simple web pages to cell phones. WAP was optimized for slow transfer rates, small displays with limited resolution, and simple input methods based on 3×4 numeric keypads. Intrigue (Ardissono et al. 2003) was an example of a tourist information system about a city (Torino) and its surroundings that used WAP in addition to a web-based interface for desktop browsers. Users were able to browse sights, specify geographic queries, and receive recommendations for destinations and itineraries, both for individual users and for tourist groups with heterogeneous preferences. The J2ME platform allowed the development of more powerful interactive applications on mobile devices. For example, Kenteris et al. (2009) presented a research prototype that facilitates the implementation of content-rich and personalized mobile tourist applications using J2ME. Users could download the dynamically created applications and execute them on their device without the need for a network connection while on the move. Later systems made use of improvements in the available mobile technology, including continuous network coverage with increased bandwidth.
Development of Mobile Applications With the continuing development of mobile devices, such as smartphones, not only did mobile web pages become more accessible to end users, but the concept of mobile apps also received widespread attention. Users can expect uniform interaction patterns, and it has become easier for developers to implement and deploy their apps. This includes the concept of app stores and also the recommendation of apps to customers. In general, we can distinguish between (El-Kassas et al. 2017):
• Mobile web pages – web servers delivering web pages customized to the end user device using technologies such as HTML5, JavaScript, and CSS
• Native mobile apps – developed for one particular platform, such as Android or iOS
• Cross-platform apps – developed for different platforms, using tools and frameworks such as Ionic (https://ionicframework.com)
One advantage of mobile web pages compared to native apps is that existing content on the World Wide Web can be presented to end users without any additional development effort. Furthermore, the end users do not need to install an app on their device but can access all mobile web pages using a web browser. However, mobile web pages can lead to a worse user experience if the web content cannot be adapted to the end user’s device by providing a responsive design, for example. Native mobile apps promise to overcome this limitation, but they can lead to a higher development and maintenance effort due to the heterogeneity of mobile operating systems and devices. Cross-platform apps try to close the gap between mobile web pages and native apps. Mobile app frameworks, such as Ionic, allow using web technologies to develop apps that can be transformed to native applications for different operating systems and distributed via the native app stores. El-Kassas et al. (2017) and Biørn-Hansen et al. (2018) discussed the state of the art of (cross-platform) mobile app development in more detail.
A typical architecture of mobile applications is composed of the following components (Fig. 1):
• Frontend (e.g., native smartphone app), with or without local storage (e.g., for user preferences) and functionality to collect and analyze sensor output
• Backend implementing business logic, often with a database system (e.g., to store points of interest)
• Connection to external data sources and application programming interfaces (APIs), e.g., booking systems or social media
Fig. 1 Basic architecture for mobile applications
TourRec is an example of a mobile app for Android that implements these components (Herzog et al. 2018). TourRec is a context-aware tourist trip recommender system that can recommend routes between any two locations in a city and is also able to make recommendations for groups of users. The app manages user preferences for POI categories (Figs. 2 and 3) on the device, while the POIs are retrieved in real time from external services, such as Foursquare. The route recommendation algorithm runs on the backend, which sends the generated trip back to the frontend. The backend is also used to store and analyze the users’ feedback on the recommendations.
Fig. 2 User interaction in the TourRec Android app. (a) Users can enter preferences about POI categories and (b) provide start and destination points, as well as starting time and duration for their desired route
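The division of responsibilities between the three components can be sketched in code. The following is a strongly simplified illustration and not the TourRec implementation: all class names, the sample POIs, and the greedy selection are assumptions, and travel times between POIs are ignored.

```python
from dataclasses import dataclass


@dataclass
class Poi:
    name: str
    category: str
    visit_minutes: int


class StaticPoiSource:
    """Stands in for an external POI service; returns fixed sample data."""

    def fetch(self, city):
        return [
            Poi("City Museum", "museum", 90),
            Poi("Old Town Cafe", "food", 45),
            Poi("Castle Park", "park", 60),
            Poi("Modern Art Gallery", "museum", 75),
        ]


class Backend:
    """Business logic: picks POIs that match the preferences and fit the time budget."""

    def __init__(self, poi_source):
        self.poi_source = poi_source
        self.feedback = []  # stands in for a database table

    def recommend(self, city, preferences, budget_minutes):
        candidates = self.poi_source.fetch(city)
        # Drop categories the user rated 0 or did not rate, then prefer high ratings.
        candidates = [p for p in candidates if preferences.get(p.category, 0) > 0]
        candidates.sort(key=lambda p: preferences[p.category], reverse=True)
        trip, used = [], 0
        for poi in candidates:
            if used + poi.visit_minutes <= budget_minutes:
                trip.append(poi)
                used += poi.visit_minutes
        return trip

    def store_feedback(self, rating):
        self.feedback.append(rating)


class Frontend:
    """Client side: keeps the preferences locally and asks the backend for a trip."""

    def __init__(self, backend):
        self.backend = backend
        self.preferences = {}  # kept in local storage on the device

    def set_preference(self, category, rating):
        self.preferences[category] = rating

    def request_trip(self, city, budget_minutes):
        trip = self.backend.recommend(city, self.preferences, budget_minutes)
        return [poi.name for poi in trip]
```

Because the POI source is passed into the backend, the fixed sample data could be replaced by a client for a real external service without changing the frontend.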
Context, Localization, and Sensors Mobile applications typically collect contextual information about space, time, and other objects in the user’s environment (Benckendorff et al. 2019). Context can be defined as “any information that can be used to characterize the situation of an entity. An entity is a person, place, or object that is considered relevant to the interaction between a user and an application, including the user and applications themselves” (Dey 2001). Learning about the environment makes it possible to understand the situation travelers are in and to support their decision-making. Hence, context modeling must reflect the whole travel experience and capture the dynamic process of travel decision-making (Lamsfus et al. 2013).
12 Mobile Applications for e-Tourism
Fig. 3 Recommendation of a sequence of POIs along a route in the TourRec app. The app presents the recommendation (a) as a list with additional information, such as arrival times, and (b) on a map.
The most important context feature for mobile applications is arguably the location of the user. Outdoor localization mostly relies on the Global Positioning System (GPS). GPS is a satellite-based global navigation system that provides location and time information to a GPS receiver. Most mobile devices have a built-in GPS receiver and can thus rely on accurate positioning for location-based services. Features of smartphones and sensors that are often present in a mobile device and can be utilized to support travelers are the following (Benckendorff et al. 2019):
• Gyroscopes, magnetometers, and accelerometers: these sensors allow the detection of device movements, which can be utilized, for example, in navigational systems to determine when users will arrive at their intended destinations.
• Networking and signaling technologies for access to networks: mobile devices usually support a range of networking standards, such as WLAN/WiFi, cellular networks, Bluetooth/BLE (Bluetooth Low Energy), or NFC (Near-Field Communication). These technologies connect devices not only to the Internet and local networks but also to nearby objects and other devices and users. This allows a deeper understanding of the user context if the user is, for instance, interacting and exchanging data with another person or smart device.
• Microphones and cameras allow for voice input and recording images and video. The latter can be used to detect user faces or gestures, but cameras can also scan features in the environment, such as nearby quick response (QR) codes (see below).
• Temperature, humidity, and pressure sensors can be used to learn more about the users’ surroundings.
• Others, such as light sensors, are used to adjust display brightness, which increases the accessibility of mobile applications while walking outdoors, for example.
Since satellite reception is usually poor inside buildings, GPS cannot be used reliably indoors, and different technologies are employed. Various methods utilizing the mentioned sensors in mobile devices have been proposed for indoor localization. These include positioning based on infrared beacons, radio, or ultrasonic signals, cameras and microphones, and infrared or radio-frequency identification (RFID) technology (Krüger et al. 2007). A QR code is an example of utilizing the camera on smartphones to allow mobile applications to better understand the users’ location and surroundings. The QR code is a two-dimensional machine-readable bar code that contains a locator, for instance, the URL of a web page, an identifier, or other information about the item to which it is attached. Users can scan these QR codes with their mobile devices and receive information associated with the item or place.
Mobile applications can be classified according to the applied context technologies. For example, Emmanouilidis et al. (2013) reviewed mobile guides according to a taxonomy based on client-side components but also system characteristics, such as localization methods and context awareness support level. Höpken et al. (2010) listed the following context dimensions which are relevant for the adaptation of mobile tourism applications:
• Time context, e.g., date, time, and season
• Device context, e.g., hardware, interface, and network
• User context, e.g., demographic data and preferences (see also section “User Modeling, Personalization, and Recommendation”)
• Spatial context, e.g., location, weather, and environment
• Travel context, e.g., purpose, itinerary, and logistics
Höpken et al. (2010) also presented a framework to utilize the context and dynamically adapt tourism applications to the current situation of users. Jannach and Zanker (2020) summarize interactive and context-aware systems in tourism.
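The five context dimensions listed above can be pictured as a simple data structure that an application inspects before rendering. The following is a minimal sketch only: the field names and the adaptation rules in `adapt_presentation` are illustrative assumptions and are not taken from the framework of Höpken et al. (2010).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Context:
    # One or more fields per context dimension listed above (names are hypothetical).
    timestamp: datetime      # time context
    device: str              # device context, e.g. "smartphone" or "smartwatch"
    network: str             # device context, e.g. "wifi" or "cellular"
    user_preferences: dict   # user context
    location: tuple          # spatial context: (latitude, longitude)
    weather: str             # spatial context, e.g. "rain"
    travel_purpose: str      # travel context, e.g. "leisure"


def adapt_presentation(ctx):
    """Return illustrative presentation settings derived from the context."""
    settings = {"view": "map", "images": True, "outdoor_tips": True}
    if ctx.device == "smartwatch":
        settings["view"] = "list"       # small display: prefer a compact list
    if ctx.network == "cellular":
        settings["images"] = False      # save bandwidth on mobile networks
    if ctx.weather == "rain":
        settings["outdoor_tips"] = False
    # Switch to a dark theme at night (time context).
    night = ctx.timestamp.hour >= 22 or ctx.timestamp.hour < 6
    settings["theme"] = "dark" if night else "light"
    return settings
```

Collecting all dimensions in one object keeps the adaptation logic in a single place, so new rules can be added without touching the code that gathers sensor data.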
User Modeling, Personalization, and Recommendation
In addition to context, mobile applications utilize information about the user in a user profile or user model. The general user modeling process can be summarized
in three steps as follows (Brusilovsky and Maybury 2002): (1) collecting data about the user, (2) analyzing the data to build a user model, and (3) applying the user model to adapt services. Both user profile and context information are critical for mobile e-Tourism systems. The difference is that the user profile is rather static and longer lasting, for example, preferences for restaurants or event types the user is interested in. This information can be implicitly derived from user behavior but is often explicitly entered by the user. On the other hand, context in a stricter sense is highly dynamic and transient and usually observed by sensors. For example, in the TourRec app, the system uses user preferences about different POI categories that the user can provide on a scale ranging from 0 to 5 (Fig. 2a). For context, the app uses the current location as default starting point and the current time as default starting time (Fig. 2b). In addition, weather and other context factors are used to select the recommended items in TourRec (Herzog et al. 2018).
Today, mobile applications use both contextualization (e.g., show places in the tourists’ surroundings on a map) and personalization (e.g., filter and rank restaurants according to the tourists’ preferences for cuisine type or price level). A recommender system is a subclass of information filtering system that aims at predicting how much a user would like an item based on the user profile and context and suggests items to buy or consume to the user. Chaudhari and Thakkar (2019) and Ricci (2020) review recommender systems for travel and tourism in general, while Delić et al. (2020) focus on group recommender systems and decision-making. The increasing information overload may lead to the need for even more filtering, recommendation, and personalization in future mobile e-Tourism services.
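The combination of contextualization and personalization described above can be illustrated with a short sketch: POIs are first filtered by distance from the current location (context) and then ranked by the user's category ratings on the 0 to 5 scale (profile). The function names, the dictionary layout of a POI, and the default radius are assumptions made for this example; the great-circle distance uses the standard haversine formula. This is not the method of any particular app discussed in this chapter.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def recommend(pois, user_location, preferences, radius_km=2.0, top_n=3):
    """Filter POIs by distance (context), then rank by category rating (profile)."""
    lat, lon = user_location
    nearby = [
        p for p in pois
        if haversine_km(lat, lon, p["lat"], p["lon"]) <= radius_km
    ]
    # Unrated categories default to 0, the lowest value on the 0-5 scale.
    ranked = sorted(nearby, key=lambda p: preferences.get(p["category"], 0), reverse=True)
    return [p["name"] for p in ranked[:top_n]]
```

A call such as `recommend(pois, (48.137, 11.575), {"park": 5, "museum": 2})` would return the names of up to three POIs within 2 km, with parks listed before museums.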
Mobile User Interfaces
Design requirements for mobile user interfaces differ significantly from those for desktop computers. Mobile devices vary a lot in output (e.g., display sizes) and input capabilities (e.g., soft or hard keyboard). The user interface needs to be tailored to the device characteristics and also the context of use and should not only be a smaller version of a user interface intended for larger displays. The focus in mobile interface design needs to be on usability and consistency. User interface design patterns are common solutions to typical interface design challenges, for example, utilizing screen space, navigating through content, or providing input options for users (Nilsson 2009). To deal with different screen sizes, mobile web pages are often responsive, and the size and arrangement of screen elements are dynamically adapted to the current interaction environment. Groth and Haslwanter (2016) studied the efficiency, effectiveness, and satisfaction when searching for tourism information on a smartphone. They ran an experiment with participants interacting with two mobile websites for information retrieval tasks and found that the responsive website performed better in terms of satisfaction and perceived ease of use.
With regard to user input, mobile devices often lack physical keyboards but feature touchscreens, which are intuitive to use for most users. Touchscreens facilitate gesture-based interaction, for example, a “two-finger pinch” gesture is commonly used to change the size of objects or content such as zooming in and out of map views. An increasing trend is the exploitation of multi-touch interfaces, enabling more complex user interaction with the mobile device (Emmanouilidis et al. 2013). In addition, audio-based interfaces make spoken human interaction with computing systems possible, utilizing speech recognition to control, for example, the entertainment system in a car.
For output, mobile tourism applications often deal with data that can be associated with a location, such as POIs. Therefore, map-based views are often used in addition to or as a substitute for list-based visualization of data. For example, the TourRec app provides a list view with details about the POIs and walking times between places (Fig. 3a) but also shows the recommended sequence of POIs on a map (Fig. 3b) (Herzog et al. 2018).
Another issue of mobile applications is notifications. Apps can send push messages to users of mobile devices using various visual, acoustic, or haptic means. This can be used to notify mobile tourists about interesting POIs or relevant events in the vicinity, for example. However, mobile applications also need to take into account cognitive load and limited attention spans of users while moving.
A variety of mobile and interactive devices are common today and have to be considered for and integrated into current and future mobile tourism applications. Smartwatches are an alternative to larger smartphones for frequent but short interactions, for example, notifications about incoming messages or quick navigational instructions. Public spaces are often equipped with large displays showing information such as timetables or maps. More and more of these public displays are becoming interactive and can provide personalized information. In addition, virtual, mixed, and augmented reality (AR) technologies promise to expand the perception of reality. AR is an interactive experience of a real environment, where objects in the real world are enhanced by computer-generated information. For instance, museums can use AR technology to show the context of an artifact or the development of a place in history. Chung et al. (2015) presented a study on the role of AR for tourists’ intentions to visit heritage destinations. We present more examples later in this chapter.
Finally, it is important to evaluate the usability and user acceptance of mobile applications and their interfaces. The goal is to understand how users interact with an app and make appropriate changes to the design. Because of the mentioned requirements of mobile user interfaces, good usability is key for the success of applications. The work of Rasinger et al. (2009) is an example of a user-centered approach to study the early phases of mobile application design in the tourism domain. Fuchs et al. (2011) investigated the tourists’ behavioral intentions when using an existing mobile application. The article proposes a technology acceptance model which is especially designed for mobile information access in the tourism domain. The use case is a mobile tourist guide for a large ski area. Their results show
Fig. 4 Taxonomy of mobile apps for e-Tourism. The figure divides mobile e-Tourism into four top-level categories (Information & Search, Booking, On-Trip Services, Sharing & Interaction) with the subcategory labels Destinations, Points of Interest, Activities, Hotels, Transportation, Help & Support, Navigation & Transport, Safety, Social, Virtual Tourism, and Others.
that hedonic quality and social influence are main factors for tourists’ intentions to use these technologies.
A Taxonomy of Mobile e-Tourism Applications
Tourists can cover all aspects of travel planning and realization using mobile applications: (1) choosing a destination; (2) finding interesting attractions and activities; (3) booking transportation, accommodations, and activities; (4) receiving on-site support during the trip; and (5) sharing experiences. In this section, we present a taxonomy of mobile applications in tourism (Fig. 4). Our findings are based on published literature in the field of e-Tourism, previous surveys of mobile applications in tourism (Borràs et al. 2014; Gavalas et al. 2014; Wang and Xiang 2012), and our own survey of published mobile applications in the Google Play Store² and Apple’s App Store.³ Compared to the previously published taxonomy of Kennedy-Eden and Gretzel (2012), we classify mobile e-Tourism applications in a way that reflects a traveler’s journey from planning until concluding the trip. We introduce all of the five aforementioned travel aspects and present examples of mobile applications for each of these aspects. In addition, we present a few other mobile applications that do not fit into one of these categories but support users at different stages of travel planning and execution.
² https://play.google.com/
³ https://www.apple.com/ios/app-store/
Fig. 5 Different types of travel information in mobile applications: (a) countries in the Skyscanner Android app and (b) restaurants in the city of Munich in the Yelp Android app
Information and Search
The first task when planning a trip is to find a suitable destination. Destinations can be defined on different levels: the Explore feature of the Skyscanner⁴ Android app suggests countries and cities (Fig. 5a), while apps of local tourism organizations provide an overview of their specific areas. The decision where to travel can be based on many factors. LaMondia et al. (2010) identified the most important criteria for choosing a destination: scenery and nature, climate, history and culture, visiting friends and relatives, and entertainment. The choice of a destination can give a rough indication of the planned activities and POIs to be visited. For instance, history enthusiasts who choose a destination may already have in mind some important museums or historical buildings that they want to visit. The survey of LaMondia et al. (2010) revealed that the most preferred activities of tourists are cultural activities, such as examining architecture and visiting museums or exhibitions. The respondents stated that they spend the most money on food, local craft products, and clothing. However, travelers decide on at least some of their activities only after choosing a destination or even only after starting the trip. For example, a tourist who visited a lot of museums may spontaneously decide to have dinner at a nearby restaurant. For this scenario, a wide range of mobile applications exist that contain information about POIs and activities and support users in choosing the right ones, even when already on the move.
Popular touristic websites, such as TripAdvisor,⁵ Foursquare,⁶ and Yelp⁷ (Fig. 5b), offer mobile applications that allow users to search for POIs that satisfy their needs. Users can specify queries and filter results with regard to category, location, and price range, for example. Furthermore, these applications use the mobile device’s GPS sensor to highlight POIs in the user’s vicinity, as explained in section “Context, Localization, and Sensors.” Applications for finding POIs often come with many social functions. People can rate POIs, write reviews, and upload photos. The applications also provide recommendations, for example, highly rated restaurants in the vicinity or recommendations from local experts.
Applications for identifying POIs have been a popular research topic for years, especially in the field of recommender systems. Existing applications consider the user’s preferences and are often context-aware. Consequently, the recommendations can consider several aspects that have an impact on the quality of a recommendation, such as the user’s location, mood, and the current weather. A user who likes going to parks will not expect a recommendation for an outdoor activity on a rainy day, for example. Gavalas et al. (2014) presented a survey on mobile recommender systems. Examples of mobile and context-aware POI recommender systems for travel and tourism have been introduced by Cheverst et al. (2000), Ricci and Nguyen (2006), Baltrunas et al. (2012), and Braunhofer et al. (2013). As tourists often travel in groups, recent works focus on recommending POIs to groups of users (Guzzi et al. 2011; Nguyen and Ricci 2017).
When users explore a city, they often want to visit a set of POIs along a route. In most cases, it is impossible to visit all POIs of a city during a single- or multi-day trip because of several constraints, such as time and money. The problem of finding a route containing the most attractive POIs along an enjoyable and feasible route without violating the given constraints is called the Tourist Trip Design Problem (TTDP). A few mobile applications have been developed in the last few years to solve the TTDP. The mobile application eCOMPASS, introduced by Gavalas et al. (2015), can integrate public transport into the recommended trip. Another solution by Gavalas et al. (2016) extended the TTDP by incorporating scenic walking routes between POIs into the recommendations. The previously presented TourRec application (Herzog et al. 2018) is also an example of a mobile app that recommends POIs, focusing on generating a sequence of interesting places for a walking tour in a city (Figs. 2 and 3).
⁴ https://www.skyscanner.com/
⁵ http://tripadvisor.com/
⁶ http://foursquare.com/
⁷ https://www.yelp.com/
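To make the Tourist Trip Design Problem described above more concrete, the following sketch shows a simple greedy heuristic: starting from the origin, it repeatedly adds the POI with the best score per minute, as long as the destination can still be reached within the time budget. This is an illustration only and not the algorithm used by eCOMPASS or TourRec. The planar coordinates in kilometres, the walking speed of 5 km/h, and all names are assumptions, and a greedy choice does not guarantee an optimal trip.

```python
import math


def walk_minutes(a, b, speed_kmh=5.0):
    """Walking time in minutes between two (x, y) points given in kilometres."""
    return math.dist(a, b) / speed_kmh * 60


def greedy_trip(start, end, pois, budget_min):
    """Greedy heuristic: repeatedly add the POI with the best score per minute.

    pois maps a name to (position, score, visit_minutes), with visit_minutes > 0.
    A POI is feasible only if `end` can still be reached within the time budget.
    Returns the list of visited POI names and the total trip time in minutes.
    """
    route, position, used = [], start, 0.0
    remaining = dict(pois)
    while remaining:
        best_name, best_ratio, best_cost = None, 0.0, 0.0
        for name, (pos, score, visit) in remaining.items():
            cost = walk_minutes(position, pos) + visit
            if used + cost + walk_minutes(pos, end) > budget_min:
                continue  # would violate the time constraint
            ratio = score / cost
            if ratio > best_ratio:
                best_name, best_ratio, best_cost = name, ratio, cost
        if best_name is None:
            break  # no feasible POI left
        route.append(best_name)
        used += best_cost
        position = remaining.pop(best_name)[0]
    total = used + walk_minutes(position, end)
    return route, total
```

Published approaches typically add further constraints, such as opening hours or public transport, which this sketch leaves out.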
Existing mobile applications in tourism, such as TripAdvisor, often recommend activities besides POIs. Cultural events, such as theater plays and concerts, are one example of an activity that is popular among locals and tourists. Event applications, such as Bandsintown,⁸ allow searching for events. Similar to POI applications, event applications often come with personalized recommendations. Event recommendations are also an ongoing research topic. Solutions for context-aware event recommendations have been presented by De Pessemier et al. (2013) and Herzog and Wörndl (2017).
Booking
After choosing a destination and having a rough idea of the planned activities, users can decide on how to travel to the location. Airlines and railway operators allow booking flights and train connections via mobile applications. Meta-search engines, such as Skyscanner, allow users to compare different operators before buying a ticket.
Another important decision when planning a trip is choosing the right type of accommodation. Tourists can stay at hotels, hostels, vacation homes, and private apartments, for example. Many meta-search engines, such as the mobile applications of Booking.com⁹ (Fig. 6) and trivago,¹⁰ allow finding available rooms and the cheapest prices. Even though most POI recommender systems also consider hotels as one type of POI, a few applications have been developed that focus on suggesting the best hotels, e.g., by Raubal and Rinner (2004). Other applications, however, focus more on the social aspects of accommodation. Couchsurfing¹¹ is one example of an application that goes beyond traditional booking applications. It allows travelers to get in touch with residents who offer a couch or room for free. Lin et al. (2015) described a research example of a mobile app for personalized hotel recommendation. The approach first builds a user interest profile using text mining techniques on aspects of hotel review content the user might prefer. The interesting review parts were identified by observing the users’ behavior while reading the text, recording zooming in and out and other gestures. Then, the profiles were used to recommend hotels, achieving positive results in an experimental study.
Booking is not limited to transport and accommodation. At any stage of the trip planning and execution, travelers can buy tickets for all kinds of activities. For instance, many of the aforementioned event recommender systems allow purchasing tickets within the mobile application.
⁸ https://www.bandsintown.com/
⁹ https://www.booking.com/
¹⁰ https://www.trivago.com/
¹¹ https://www.couchsurfing.com/
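The idea of recommending hotels from a user interest profile, as in the research example described above, can be sketched in a strongly simplified form. The sketch assumes that reading behavior has already been mapped to a list of review aspects the user lingered on; the text mining and gesture analysis of Lin et al. (2015) are not reproduced here, and the function names, aspect labels, and scoring rule are hypothetical.

```python
from collections import Counter


def build_profile(observed_aspects):
    """Turn a list of aspects the user lingered on into normalised weights."""
    counts = Counter(observed_aspects)
    total = sum(counts.values())
    return {aspect: n / total for aspect, n in counts.items()}


def rank_hotels(hotels, profile):
    """Rank hotels by the profile-weighted sum of their aspect scores.

    hotels maps a hotel name to a dict of aspect scores; missing aspects count as 0.
    """
    def score(item):
        _, aspects = item
        return sum(w * aspects.get(a, 0.0) for a, w in profile.items())

    return [name for name, _ in sorted(hotels.items(), key=score, reverse=True)]
```

For a user who mostly lingered on passages about location, `build_profile` gives that aspect the highest weight, so hotels with strong location scores move to the top of the ranking.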
Fig. 6 Hotels in Los Angeles suggested by the Booking.com Android app and ordered by popularity (a). Users can filter the search results with regard to several criteria (b)
On-Trip Services
We summarize all kinds of support and help that allow travelers to move around at their destinations under the umbrella term On-Trip Services. For instance, many of the previously presented applications for attractions and activities also provide some form of support, e.g., by showing maps with all POIs and navigation to each location.
Local tourist information is increasingly available within mobile applications. These applications are often provided by the corresponding travel destination, for instance, travel regions or national parks. These applications contain general information relevant for visitors, such as how to travel to the destination, detailed maps of the area, and emergency numbers. Some of them even provide attraction and activity recommendations tailored to the respective area (see section “Information and Search”) and an option to book some of these recommendations directly in the app (see section “Booking”). For example, the US National Park Service offers mobile apps for some of the parks and other places it manages. The apps provide information about hiking trails, eating and sleeping options, but also local events and road closures in the area.
Another type of tourist service is information about local transport. Existing mobile applications help tourists to find the fastest or shortest connections between locations. Today, many cities or public transport companies offer such applications to support their customers. For instance, the Munich transport company MVG¹² offers a mobile application that integrates several mobility offers, such as carsharing, bikesharing, and charging stations for electric vehicles. In the last few years, research has been done to recommend so-called multi-modal routes that combine different transportation modes in one itinerary, such as private transport, public transport, carsharing, and walking. Tumas and Ricci (2009) presented PECITAS, a knowledge-based recommender system for multi-modal routes in the city of Bolzano. Codina et al. (2015) developed a context-aware route planner that bases the recommendations on different personal and environmental factors, such as companionship and weather. Herzog et al. (2017) presented RouteMe, a collaborative recommender system for multi-modal routes that recommends routes based on the user’s personal preferences and the wisdom of the crowd but can also highlight popular routes.
A few mobile applications focus on the safety of locals and tourists on the move. Life360¹³ is a mobile application that allows family members to track the other members, share their own locations, and notify emergency contacts whenever necessary. Wang and Xiang (2012) introduced the WalkSafe app, which aids pedestrians who use their mobile phones while walking. WalkSafe uses the mobile phone’s camera and accelerometer sensors to alert users when a vehicle is approaching.
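Finding the fastest connection across several transportation modes, as discussed above for multi-modal routes, can be illustrated with Dijkstra's shortest-path algorithm on a graph whose edges carry a travel time and a mode label. This is a minimal sketch under simplifying assumptions (fixed travel times, no waiting or transfer penalties, hypothetical node names) and does not represent how PECITAS, RouteMe, or any transport operator's app computes routes.

```python
import heapq


def fastest_route(graph, source, target):
    """Dijkstra's algorithm over edges labelled with a transport mode.

    graph maps a node to a list of (neighbour, minutes, mode) tuples.
    Returns (total_minutes, [(from, to, mode), ...]) or (None, []) if unreachable.
    """
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        minutes, node = heapq.heappop(queue)
        if node == target:
            break
        if minutes > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, cost, mode in graph.get(node, []):
            candidate = minutes + cost
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = (node, mode)
                heapq.heappush(queue, (candidate, neighbour))
    if target not in best:
        return None, []
    # Walk back from the target to reconstruct the legs of the itinerary.
    legs, node = [], target
    while node != source:
        parent, mode = previous[node]
        legs.append((parent, node, mode))
        node = parent
    return best[target], legs[::-1]
```

Because parallel edges with different modes are allowed between the same two nodes, the result naturally mixes modes, for example walking to a station, taking the subway, and finishing by bike.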
Sharing and Interaction
There are a lot of other mobile apps that were not specifically designed for tourism but are often used by tourists. Examples include mobile translator apps, different types of applications for entertainment and education, and applications for creating and sharing content, such as Instagram¹⁴ for (travel) pictures.
The social aspect of traveling is an important factor for many tourists today. People want to share their experiences on social networks and get inspired by the shared content of others. Many of the aforementioned POI applications allow users to rate locations and write reviews containing personal experiences. Some mobile applications focus more on these social aspects. Foursquare originally allowed users to check in to locations, showing other people where the user currently is. Users were able to collect awards when actively using the check-in functionality. The check-in feature was later migrated to the new mobile application Swarm.¹⁵
¹² https://www.mvg.de/en.html
¹³ https://www.life360.com/
¹⁴ https://www.instagram.com/
¹⁵ https://www.swarmapp.com/
Various other applications cannot be assigned to one of the previously presented categories but are nevertheless frequently used by travelers. One popular example that is used by many locals and travelers is the mobile game Pokémon GO,¹⁶ which combines location-based content with AR (see section “Mobile User Interfaces”). BBC’s Civilisations AR¹⁷ is another example of applying augmented reality related to tourism. The app allows exploring art from across the world and from different cultural periods. There are further applications allowing users to virtually explore places. For example, Google Arts & Culture¹⁸ enables virtual tours of a variety of cultural venues, such as museums and exhibitions, and also other places, such as national parks. For museum exhibits, high-resolution photographs and detailed information can be called up. The application is web-based with a responsive design suitable for mobile devices, but not necessarily intended to be used on-site. In addition, a lot of museums offer individual mobile applications that support tourists not only in planning their visit but also in exploring the collections.
Conclusion
In this chapter, we have provided an overview of mobile applications for e-Tourism. In the first part, we have discussed important foundations of mobile technologies. The ongoing transition from desktop computers to mobile devices and ubiquitous environments is very relevant for e-Tourism. On the one hand, the exploitation of context data is important for mobile e-Tourism applications. There is a need for advanced user modeling and personalization to better adapt applications to changing contexts. On the other hand, mobile technologies are already omnipresent and embedded into our surroundings. These include not only personal devices such as smartphones, smartwatches, and other wearable devices but also applications integrated into the environment of users, including devices in the Internet of Things (IoT). Not et al. (2020) discuss IoT in the tourism domain in more detail. Besides wearable devices, kiosk systems are often available in public spaces, such as touristic areas and airports, to support individuals and groups of travelers in finding recommendations, for example. Herzog and Wörndl (2019) deployed their TourRec app on such an interactive screen and compared the user experience of this solution to a smartphone-only and a hybrid variant in a user study. In the future, distributed and migratory applications and user interfaces could be more prevalent. Scenarios include tourists downloading local information or interacting with multiple screens.
In the second part of this chapter, we have reviewed existing mobile applications from the perspective of different travel planning and execution phases. We classified
¹⁶ https://pokemongolive.com/
¹⁷ https://www.bbc.co.uk/taster/pilots/civilisations-ar
¹⁸ https://artsandculture.google.com
apps in a way that reflects a traveler’s journey from planning until concluding a trip. Consequently, the presented taxonomy distinguishes between apps for information and search, booking, on-trip services, and solutions to entertain users or allow them to share their experiences and feedback.
Benckendorff et al. (2019) envision a world where many tourism-related tasks are conveniently processed and presented on mobile devices in a user-friendly way. However, challenges in terms of connectivity, interoperability, and cross-platform compatibility of data sources and systems need to be overcome. Additional issues include necessary advances in analytic systems to utilize the potential of big data and the adoption of artificial intelligence (AI) and machine learning technologies. Speech recognition and chatbots are already used to support the interaction with mobile devices. Furthermore, travel is an inherently complex domain for intelligent applications, such as recommender systems.
Finally, mobile applications pose problems and questions with regard to privacy and user control. e-Tourism apps collect data about the users, such as their current location, and can infer information about them that may be unwanted. This leads to an increased demand for a more rigorous treatment of privacy preservation and security (Emmanouilidis et al. 2013) and appropriate solutions to mitigate privacy concerns for future mobile e-Tourism applications.
Cross-References
• Group Decision-Making and Designing Group Recommender Systems
• Interactive and Context-Aware Systems in Tourism
• Internet of Things and Ubiquitous Computing in the Tourism Domain
• Recommender Systems in Tourism
References
Ahson SA, Ilyas M (2010) Mobile Web 2.0: developing and delivering services to mobile devices. CRC Press
Ardissono L, Goy A, Petrone G, Segnan M, Torasso P (2003) Intrigue: personalized recommendation of tourist attractions for desktop and hand held devices. Appl Artif Intell 17(8–9):687–714
Baltrunas L, Ludwig B, Peer S, Ricci F (2012) Context relevance assessment and exploitation in mobile recommender systems. Pers Ubiquit Comput 16(5):507–526. https://doi.org/10.1007/s00779-011-0417-x
Benckendorff P, Xiang Z, Sheldon P (2019) Tourism information technology, 3rd edn. CABI, London
Biørn-Hansen A, Grønli TM, Ghinea G (2018) A survey and taxonomy of core concepts and research challenges in cross-platform mobile development. ACM Comput Surv (CSUR) 51(5):108
Borràs J, Moreno A, Valls A (2014) Intelligent tourism recommender systems: a survey. Expert Syst Appl 41(16):7370–7389. https://doi.org/10.1016/j.eswa.2014.06.007
Braunhofer M, Elahi M, Ge M, Ricci F (2013) STS: design of weather-aware mobile recommender systems in tourism. In: In Proceedings of the 1st workshop on AI*HCI: intelligent user interfaces (AI*HCI 2013) Brusilovsky P, Maybury MT (2002) From adaptive hypermedia to the adaptive web. Commun ACM 45(5):30–33. https://doi.org/10.1145/506218.506239 Chaudhari K, Thakkar A (2019) A comprehensive survey on travel recommender systems. Arch Comput Methods Eng 1–27 Cheverst K, Davies N, Mitchell K, Friday A, Efstratiou C (2000) Developing a context-aware electronic tourist guide: some issues and experiences. In: Proceedings of the SIGCHI conference on human factors in computing systems, CHI ’00. ACM, New York, pp 17–24. https://doi.org/ 10.1145/332040.332047 Chung N, Han H, Joun Y (2015) Tourists intention to visit a destination: the role of augmented reality (ar) application for a heritage site. Comput Hum Behav 50:588–599 Codina V, Mena J, Oliva L (2015) Context-aware user modeling strategies for journey plan recommendation. In: Ricci F, Bontcheva K, Conlan O, Lawless S (eds) User modeling, adaptation and personalization: 23rd international conference, UMAP 2015, Dublin, June 29 – July 3, 2015. Proceedings, Springer International Publishing, Cham, pp 68–79. https://doi.org/ 10.1007/978-3-319-20267-9_6 De Pessemier T, Minnaert J, Vanhecke K, Dooms S, Martens L (2013) Social recommendations for events. In: CEUR workshop proceedings, vol 1066, p 4 Deli´c A, Nguyen TN, Tkalˇciˇc M (2020) Group decision-making and designing group recommender systems. Springer International Publishing, Cham, pp 1–23. https://doi.org/10.1007/978-3-03005324-6_57-1 Dey AK (2001) Understanding and using context. Pers Ubiquit Comput 5(1):4–7 Dickinson JE, Ghali K, Cherrett T, Speed C, Davies N, Norgate S (2014) Tourism and the smartphone app: capabilities, emerging practice and scope in the travel domain. Curr Issues Tour 17(1):84–101. 
https://doi.org/10.1080/13683500.2012.718323 Dorcic J, Musanovic J, Suzana M (2018) Mobile technologies and applications towards smart tourism – state of the art. Tour Rev https://doi.org/10.1108/TR-07-2017-0121 El-Kassas WS, Abdullah BA, Yousef AH, Wahba AM (2017) Taxonomy of cross-platform mobile applications development approaches. Ain Shams Eng J 8(2):163–190 Emmanouilidis C, Koutsiamanis RA, Tasidou A (2013) Review: mobile guides: taxonomy of architectures, context awareness, technologies and applications. J Netw Comput Appl 36(1):103–125. https://doi.org/10.1016/j.jnca.2012.04.007 Fuchs M, Höpken W, Rasinger J (2011) Behavioral intention to use mobile information services in tourism: the case of the tourist guide dolomitisuperski.mobi. Inf Technol Tour 13. https://doi. org/10.3727/109830512X13364362859858 Gavalas D, Konstantopoulos C, Mastakas K, Pantziou G (2014) Mobile recommender systems in tourism. J Netw Comput Appl 39:319–333 Gavalas D, Kasapakis V, Konstantopoulos C, Pantziou G, Vathis N, Zaroliagis C (2015) The ecompass multimodal tourist tour planner. Expert Syst Appl 42(21):7303–7316. https://doi.org/ 10.1016/j.eswa.2015.05.046 Gavalas D, Kasapakis V, Konstantopoulos C, Pantziou G, Vathis N (2016) Scenic route planning for tourists. Pers Ubiquit Comput 1–19. https://doi.org/10.1007/s00779-016-0971-3 Groth A, Haslwanter D (2016) Efficiency, effectiveness, and satisfaction of responsive mobile tourism websites: a mobile usability study. Inf Technol Tour 16(2):201–228 Guzzi F, Ricci F, Burke R (2011) Interactive multi-party critiquing for group recommendation. In: Proceedings of the fifth ACM conference on recommender systems, RecSys ’11. ACM, New York, pp 265–268. https://doi.org/10.1145/2043932.2043980 Herzog D, Wörndl W (2017) Mobile and context-aware event recommender systems. In: Monfort V, Krempels KH, Majchrzak TA, Traverso P (eds) Web information systems and technologies. 
Springer International Publishing, Cham, pp 142–163
Herzog D, Wörndl W (2019) A user study on groups interacting with tourist trip recommender systems in public spaces. In: Proceedings of the 27th ACM conference on user modeling,
adaptation and personalization, UMAP ’19. ACM, New York, pp 130–138. https://doi.org/10.1145/3320435.3320449
Herzog D, Massoud H, Wörndl W (2017) Routeme: a mobile recommender system for personalized, multi-modal route planning. In: Proceedings of the 25th conference on user modeling, adaptation and personalization, UMAP ’17. ACM, New York, pp 67–75. https://doi.org/10.1145/3079628.3079680
Herzog D, Lass C, Wörndl W (2018) TourRec: a tourist trip recommender system for individuals and groups. In: Proceedings of the 12th ACM conference on recommender systems, RecSys ’18. ACM, New York, pp 496–497. https://doi.org/10.1145/3240323.3241612
Höpken W, Fuchs M, Zanker M, Beer T (2010) Context-based adaptation of mobile applications in tourism. Inf Technol Tour 12(2):175–195
Jannach D, Zanker M (2020) Interactive and context-aware systems in tourism. Springer International Publishing, Cham, pp 1–22. https://doi.org/10.1007/978-3-030-05324-6_125-1
Kennedy-Eden H, Gretzel U (2012) A taxonomy of mobile applications in tourism. e-Rev Tour Res (eRTR) 10:47–50
Kenteris M, Gavalas D, Economou D (2009) An innovative mobile electronic tourist guide application. Pers Ubiquit Comput 13(2):103–118. https://doi.org/10.1007/s00779-007-0191-y
Krüger A, Baus J, Heckmann D, Kruppa M, Wasinger R (2007) Adaptive mobile guides. In: Brusilovsky P, Kobsa A, Nejdl W (eds) The adaptive web. Springer, Berlin/Heidelberg, pp 521–549. http://dl.acm.org/citation.cfm?id=1768197.1768217
LaMondia J, Snell T, Bhat CR (2010) Traveler behavior and values analysis in the context of vacation destination and travel mode choices: European union case study. Transp Res Rec 2156(1):140–149. https://doi.org/10.3141/2156-16
Lamsfus C, Xiang Z, Alzua-Sorzabal A, Martín D (2013) Conceptualizing context in an intelligent mobile environment in travel and tourism. In: Cantoni L, Xiang ZP (eds) Information and communication technologies in tourism 2013.
Springer, Berlin, Heidelberg, pp 1–11
Lin KP, Lai CY, Chen PC, Hwang SY (2015) Personalized hotel recommendation using text mining and mobile browsing tracking. In: 2015 IEEE international conference on systems, man, and cybernetics. IEEE, pp 191–196
Nguyen TN, Ricci F (2017) A chat-based group recommender system for tourism. In: Schegg R, Stangl B (eds) Information and communication technologies in tourism 2017. Springer International Publishing, Cham, pp 17–30
Nilsson EG (2009) Design patterns for user interface for mobile applications. Adv Eng Softw 40(12):1318–1328
Not E, Cavada D, Venturini A (2020) Internet of things and ubiquitous computing in the tourism domain. Springer International Publishing, Cham, pp 1–22. https://doi.org/10.1007/978-3-030-05324-6_18-1
Rasinger J, Fuchs M, Beer T, Höpken W (2009) Building a mobile tourist guide based on tourists’ on-site information needs. Tour Anal 14:483–502. https://doi.org/10.3727/108354209X12596287114255
Raubal M, Rinner C (2004) Multi-criteria decision analysis for location based services. In: Proceedings of the 12th international conference on geoinformatics, pp 47–53
Ricci F (2010) Mobile recommender systems. Inf Technol Tour (3):205–231. https://doi.org/10.3727/109830511X12978702284390
Ricci F (2020) Recommender systems in tourism. Springer International Publishing, Cham, pp 1–18. https://doi.org/10.1007/978-3-030-05324-6_26-1
Ricci F, Nguyen QN (2006) Mobyrek: A conversational recommender system for on-the-move travelers. In: Fesenmaier DR, Wöber KW, Werthner H (eds) Destination recommendation systems: behavioural foundations and applications, CABI, pp 281–294. https://doi.org/10.1079/9780851990231.0281
Stock O, Zancanaro M, Busetta P, Callaway C, Krüger A, Kruppa M, Kuflik T, Not E, Rocchi C (2007) Adaptive, intelligent presentation of information for the museum visitor in peach. User Model User-Adap Inter 17(3):257–304
Tumas G, Ricci F (2009) Personalized mobile city transport advisory system. In: Höpken W, Gretzel U, Law R (eds) Information and communication technologies in tourism 2009: proceedings of the international conference in Amsterdam, 2009. Springer Vienna, Vienna, pp 173–183. https://doi.org/10.1007/978-3-211-93971-0_15
Wang D, Xiang Z (2012) The new landscape of travel: a comprehensive analysis of smartphone apps, pp 308–319. https://doi.org/10.1007/978-3-7091-1142-0_27
Internet of Things and Ubiquitous Computing in the Tourism Domain
13
Elena Not, Dario Cavada, and Adriano Venturini
Contents
Introduction
From Mobile Services to IoT-Enabled Ubiquitous Computing
The Enabling Technologies
  Identification
  Location Detection
  Action and Environment Sensing
The Design Choices
Novel Scenarios for the Tourism Domain
  Information and Services Requested via Physical Actions
  Beyond the Personal Screen: Adaptive Information Kiosks, Intelligent Shop Windows, and Soundscapes
  Interaction with Smart Objects
  Tangible Output
  Exploiting IoT for Personalization
  Exploiting IoT for Extended Data Analytics Capabilities
Expected Future Developments
Cross-References
References
E. Not
Intelligent Interfaces and Interaction Research Unit, Fondazione Bruno Kessler, Trento, Italy
D. Cavada · A. Venturini
Suggesto S.r.l., Trento, Italy
© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_18
Abstract
The introduction of mobile information services to the tourism domain has represented a radical change in the way tourists plan, enjoy, and reflect on their travel experience. The Internet of Things (IoT) promises to be the next big advancement. It makes it possible to distribute small pieces of interconnected technology in the environment and within objects, both for a more pervasive monitoring and personalization of how the tourism experience is consumed and for the creation of an extended interaction interface that supports a more direct engagement with the destination and its facilities, products, and services. For the tourism sector, these advancements open up novel scenarios of ubiquitous computing, for example, smart shop windows, digitally augmented showcases for handicrafts display, augmented itineraries that engage visitors through means other than smartphones and tablets for a more immersive experience, personalized souvenirs and smart gadgets, stationary information kiosks that automatically identify their users, and mobile applications that are aware of which products and places the tourist has already been in contact with. This chapter provides an overview of the enlarged ubiquitous computing capabilities enabled by IoT technologies and illustrates possible applications in the tourism domain.
Keywords Ubiquitous computing · Internet of Things · Tourism services · Distributed intelligence · Tangible interaction · Embodied interaction
Introduction
The introduction of mobile information services to the tourism domain has represented a radical change in the way tourists plan, enjoy, and reflect on their travel experience (Grün et al. 2008). In the early stages of this revolution, four main factors were recognized as determinants for the success of mobile services (Clarke 2001): the possibility of accessing services anywhere, regardless of location (ubiquity); the availability of services at all times, so that users can access them at the point of need (convenience); the tailoring of service contents to the user location (localization); and the additional customization according to other contextual or personal variables (personalization). More recently, a fifth ingredient has revealed its importance: the timely sharing of tourist experiences and opinions with social networks (Leung et al. 2013). One major limitation of traditional mobile services is the reliance on personal devices to convey information to users and to collect their interaction input. This reliance is limiting in several ways, ranging from technical aspects (e.g., explicit authorization is required for location tracking) to data quality aspects (e.g., mobile devices log just a few aspects of the onsite visit) and user experience aspects (e.g., the user-system interaction is focused on the device screen). The Internet of Things (IoT) promises to represent the next big advancement, with the possibility
of distributing small pieces of interconnected technology in the environment and within objects for a more pervasive monitoring and personalization of how the tourism experience is consumed (distributed intelligence) and for the creation of an extended interaction interface that supports a more direct engagement with the destination and its facilities, products, and services. This can be achieved, for example, via the use of sensors and smart objects to sense physical actions and activate contextual digital information delivery or personalize the consumption of services (tangible and embodied interaction), with the presentation of information in the environment (context-aware public displays and soundscapes) or the crafting of personalized physical objects (tangible output). For the tourism sector, these advancements open up novel scenarios of ubiquitous computing, for example, smart shop windows, digitally augmented showcases for handicrafts display, augmented itineraries that engage visitors through means other than smartphones and tablets for a more immersive experience, personalized souvenirs and smart gadgets for securing customer loyalty, information kiosks that automatically identify their users, and mobile applications that are aware of which products and places the tourist has already been in contact with. The ultimate, ambitious goal is to implement a technological ecosystem that seamlessly integrates multiple points of encounter between the tourist and the touristic context and provides more effective information services.
From Mobile Services to IoT-Enabled Ubiquitous Computing
By exploiting the powerful factors of ubiquity, convenience, localization, and personalization, mobile services for the tourism domain may have varied goals: support specific market transaction phases (pre-sales agreement, booking/ticketing and payment, and after-sales assistance), improve the quality of the relationship between final customers and providers of tourist products (e.g., by offering information, recommendation, and other functionalities that increase the enjoyment and the perceived quality of products and places), or represent final services with clear economic value and price (e.g., electronic travel guides, maps). Mobile services may be used during different stages of the travel lifecycle (pre-trip, onsite, after-trip), each stage having its own constraints and opportunities. One of the most significant benefits is that mobile services can be personalized by taking into account contextual factors, such as the user profile and interaction behavior, location, time, social context, and environmental conditions. The personalization may affect different aspects of the service, for example, the content (e.g., which specific products are suggested to users), the information presentation (e.g., the graphical rendering, the language, and phrasing used), or the interaction mechanisms (e.g., which browsing options or activities are proposed to users at a certain point in time). The service delivery may be triggered by specific users’ requests (pull mode) or may be initiated automatically by the system according to contextual factors and appropriateness strategies (push mode). To make it more convenient for users to access the services, web sites and applications are implemented to guarantee
a responsive visualization on different types of devices (laptop screen, tablet, smartphone), and multichannel communication strategies are used to improve the bidirectionality of the dialogue between the user and the service providers (through web sites, mobile applications, email, social media, web chat, and voice). In the traditional scenario of mobile services, the computational nodes involved in the communication are the personal devices of the user and one or more remote servers collaborating to deliver the required information. With ubiquitous computing technology, other computational nodes are introduced in the network; these are locally distributed in the environment or carried by the user, and they share part of the computational load (Weiser 1993). This enriches the possibilities to acquire information about the context of the user and the status of the environment, so as to improve the visitors’ digital and physical experience. This vision has been generalized and extended by the new computational paradigm of the Internet of Things (IoT), which envisages networks of interconnected devices and objects embedded with electronics that exchange data and cooperate toward a common goal, taking advantage of the progress made in Internet communication and cloud computing (Atzori et al. 2010; Gubbi et al. 2013). With the new technological infrastructure, sensors distributed indoors and outdoors can send to collecting gateways data about environmental parameters like air quality, temperature, and light; they can sense the presence and movement of individuals, crowds, vehicles, and goods; they can control the state of machinery. Sensors attached to objects are able to detect when the latter are moved, touched, stretched, or pressed. Wearable sensors can additionally monitor people’s vital signs and the execution of actions.
Although multisensor monitoring of the environment with signal fusion, multimodal user interaction, tangible interaction, and energy-efficient and robust data transmission are not new areas of investigation, the IoT paradigm has the merit of pooling the technological advances made in these different fields by exploiting the concept of advanced networked architectures.
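The pull and push delivery modes introduced earlier in this section can be illustrated with a minimal sketch. The `Context` fields, the `pull` and `push` function names, the catalogue entries, and the rain rule are all hypothetical and chosen only for illustration; they are not taken from any system discussed in this chapter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Context:
    """Hypothetical snapshot of the contextual factors for one user."""
    location: str
    raining: bool


# Illustrative catalogue: location -> (outdoor suggestion, indoor suggestion).
SUGGESTIONS = {
    "old_town": ("walking tour of the city walls", "civic museum"),
    "lakeside": ("boat trip", "aquarium"),
}


def pull(ctx: Context) -> str:
    """Pull mode: the user explicitly asks for a suggestion."""
    outdoor, indoor = SUGGESTIONS[ctx.location]
    return indoor if ctx.raining else outdoor


def push(previous: Context, current: Context) -> Optional[str]:
    """Push mode: the system decides on its own whether to notify.

    A notification is produced only when the context changes in a way
    that makes the earlier suggestion inappropriate (here: rain starts).
    """
    if current.raining and not previous.raining:
        return f"It started raining: try the {pull(current)} instead."
    return None
```

In this sketch the push rule plays the role of an "appropriateness strategy": most context updates produce no notification at all, which avoids interrupting the user unnecessarily.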
The Enabling Technologies
The logical architecture of a system based on the Internet of Things is divided into four main layers (Guo et al. 2014) (Fig. 1). The lowest layer is populated with hardware devices, sensors, and actuators. These can be directly accessible to the user (e.g., personal devices, wearable smart devices, smart cards, and other physical objects augmented with sensing capabilities) or can be distributed in the environment, where local microcontrollers manage their functioning and the collection and transmission of the signals. For the tourism domain, the sensor technologies that have found widespread application so far mainly relate to issues of identification, location detection, action sensing, and environment sensing. A second layer, regulated by standard message exchange protocols, provides a wired or wireless data exchange infrastructure (e.g., via Bluetooth, Wi-Fi, ZigBee, low-power wide-area networks (LPWAN)), with local gateways and networked computing nodes that provide data preprocessing, filtering, and abstraction (e.g., by recognizing relevant events, like crowding or traffic congestion, from low-level image capture by cameras). A third layer guarantees that the filtered data and service requests are dispatched to remote servers for further computation and service provision: this is where technologies for internet-based communication, mobile communication, and cloud computing need to seamlessly guarantee the timely collaboration among heterogeneous computing nodes. At the top of the logical architecture sits the application layer, where the services based on the collected data are administered and delivered to the involved users. For the tourism domain, these may consist of information and transaction services offered online (e.g., recommendations for the next point of interest to visit based on the past user journey traced via Bluetooth beacons) or of augmentations of the physical experience onsite (e.g., the purchase of spa services simply by accessing the hotel wellness area with a personal smart card). The application layer also includes software for the storage of the interaction logs generated by tourists interacting with the services and the provision of analytics to tourism service providers. A fifth fundamental ingredient of the logical IoT architecture permeates all four layers and is represented by mechanisms and protocols aimed at solving issues of trust and integrity of the collected and exchanged data, security of transmission, and compliance with ethical and privacy regulations (Atzori et al. 2010). In the next subsections, we describe in more detail the technologies at the sensor layer that have been tried so far in tourism for uniquely identifying visitors, tracking their location, recognizing physical actions, and monitoring the environment.
Fig. 1 The logical layers of an IoT architecture
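The flow of data through the four layers can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `Reading` and `Event` types, the sensor names, the area names, and the crowd threshold are hypothetical, and each layer is reduced to a single function, whereas in a real deployment the layers run on different devices and communicate over a network.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Reading:
    """Sensor layer: one raw measurement (hypothetical format)."""
    sensor_id: str
    people_count: int


@dataclass
class Event:
    """Gateway layer output: an abstracted, higher-level event."""
    area: str
    kind: str


# Assumed mapping from sensors to the areas they observe.
SENSOR_AREA = {"cam-1": "main_square", "cam-2": "castle_gate"}
CROWD_THRESHOLD = 50  # illustrative value, not taken from the chapter


def gateway(readings: Iterable[Reading]) -> List[Event]:
    """Layer 2: filter raw readings and abstract them into events."""
    return [
        Event(SENSOR_AREA[r.sensor_id], "crowded")
        for r in readings
        if r.sensor_id in SENSOR_AREA and r.people_count >= CROWD_THRESHOLD
    ]


def dispatch(events: List[Event]) -> List[dict]:
    """Layer 3: serialize events for transmission to a remote server."""
    return [{"area": e.area, "kind": e.kind} for e in events]


def application(messages: List[dict]) -> List[str]:
    """Layer 4: turn received events into a service for the tourist."""
    return [
        f"{m['area']} is busy right now; consider visiting later."
        for m in messages
        if m["kind"] == "crowded"
    ]
```

For example, `application(dispatch(gateway(readings)))` turns a list of raw counts into user-facing advice. Readings from unknown sensors or below the threshold are dropped at the gateway, so they never reach the remote server.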
Identification
In certain application scenarios of the tourism domain, it is important to uniquely identify users, objects, or places. For example, the identification of the user is important during market transaction phases (pre-sales agreement, booking/ticketing and payment, and after-sales assistance) or to build a user model that collects preferences, needs, and products chosen at different instants of the users’ journey to provide personalized services and recommendations (Ricci 2002). Web cookies and unique identifiers associated with downloaded mobile applications are typically used to automatically trace which users are accessing the services with their personal devices at different points in time, without requiring an explicit account login. However, this approach is limited to tracing interactions that are made via web services and applications. New opportunities are offered by technologies like RFID (radio-frequency identification) and NFC (near-field communication) (Egger 2013). An RFID system consists of a reader of radio waves connected to an antenna that sends interrogating signals to tags. When in reach, tags respond with their unique identifier. Passive tags are small and cheap and are activated by the electromagnetic energy transmitted by the RFID reader; they have a range of activation from near contact to 25 m. Active tags, instead, contain a power source that allows them to broadcast their signal up to a range of 100 m. A badge or a bracelet containing an RFID tag can be given to individuals to identify them. NFC is a specific evolution of RFID technology that is more convenient for secure communication, as it requires close proximity for activation and is capable of two-way communication: an NFC device can function as a reader as well as a tag, and two NFC devices can exchange data through a peer-to-peer interaction.
These distinguishing features have made NFC technology particularly appealing to the mobile telephony industry as an enabler for a new set of services (like mobile payments and ticketing, e.g., Google Pay; Apple Pay), and today, most smartphones incorporate NFC devices, thus providing an alternative way to identify users. In addition to identifying users, the same RFID and NFC technology can be used to uniquely identify objects and points of interest by attaching tags to them: when a tagged object is placed near a tag reader or an NFC tag is scanned with a smartphone, the signal of the object/place recognition can be sent to a server for further processing (Riekki et al. 2006). This method for object identification offers an alternative to QR (quick response) code and bar code recognition or augmented reality (AR).
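Whatever the tag technology, the server-side step is essentially a lookup from the tag's unique identifier to the user, object, or place it stands for. The following is a minimal sketch of that step. The registry contents, the identifier format, and the function names are hypothetical, and reading the tag itself (which depends on the reader hardware) is deliberately left out.

```python
from typing import Dict, Optional, Tuple

# Hypothetical registry: tag identifier -> (kind, name).
TAG_REGISTRY: Dict[str, Tuple[str, str]] = {
    "04:A2:19:7F": ("user", "guest-1042"),
    "04:B7:33:0C": ("poi", "bus stop A"),
}


def resolve_tag(uid: str) -> Optional[Tuple[str, str]]:
    """Return (kind, name) for a known tag, or None for an unknown one."""
    return TAG_REGISTRY.get(uid.strip().upper())


def handle_scan(uid: str) -> str:
    """Decide what a scan means, depending on what the tag identifies."""
    resolved = resolve_tag(uid)
    if resolved is None:
        return "unknown tag"
    kind, name = resolved
    if kind == "user":
        # A badge or bracelet was read: the visitor is identified.
        return f"identified visitor {name}"
    # A tag attached to an object or place was read with a personal device.
    return f"send information about {name}"
```

The same lookup serves both uses described above: identifying a person who carries a tag and identifying an object or place that a person scans.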
Location Detection
In existing mobile applications, the user location is usually gathered via the internal sensing capabilities of the personal devices carried by users. GPS is frequently used, but geographic coordinates may also be estimated using information about in-range
cellular networks and wireless networks (Roxin et al. 2007). As mentioned above, the user position can also be indirectly determined when users scan NFC tags or QR codes associated with specific objects or places with their mobile devices, thus allowing the implementation of location-aware mobile indoor guides and city tours that do not require GPS (Canadi et al. 2010). Augmented reality techniques also make it possible to recognize visitors’ location according to what is in their line of sight (Han et al. 2013). IoT facilitates the distribution of additional types of location sensors in the environment. For example, infrared technology (IR) has been used in various applications in museums to determine more precisely the proximity of visitors to exhibit objects and to activate presentations, e.g., Stock et al. (2007). Its popularity is now being surpassed by the new generation of low-energy Bluetooth emitters (commonly known as beacons) thanks to their very compact dimensions, parsimonious battery consumption, and low cost. Beacons work with a short range between 0 and 100 m and transmit a unique identifier which can be associated with a physical location or a meaningful object. The transmitted data can be sensed by the Bluetooth receiver embedded in most tablets and smartphones and, with the aid of background processes running on the devices, can trigger the generation of location-aware notifications and information (Ng et al. 2017). Beacons are particularly convenient for marking hotspots for indoor and outdoor itineraries, for example, to generate instructions on personal devices on how to reach the gate at the airport or to activate stories about historical buildings in a city (Gast 2014). Beacons can also be used the other way round: visitors can bring a beacon along, and computing nodes distributed at hotspots can sense when visitors approach to activate relevant multimedia presentations on public displays.
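The beacon-triggered notifications described above can be sketched as a background listener that reacts when the user enters the range of a known hotspot. This is a simplified illustration: the beacon identifiers, the hotspot contents, and the signal-strength threshold used to decide that a beacon is "near" are assumptions, and the Bluetooth scanning itself is replaced by a list of `(beacon_id, rssi)` pairs passed in by the caller.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical mapping from beacon identifiers to hotspot content.
HOTSPOTS: Dict[str, str] = {
    "beacon-17": "Story of the clock tower",
    "beacon-23": "Directions to the departure gate",
}
NEAR_RSSI_DBM = -70  # assumed threshold: a stronger signal means closer


class BeaconListener:
    """Fires a notification once when the user enters a hotspot's range."""

    def __init__(self) -> None:
        self._in_range: Set[str] = set()

    def on_scan(self, sightings: Iterable[Tuple[str, int]]) -> List[str]:
        """Process one scan result and return the notifications to show."""
        near = {
            beacon_id
            for beacon_id, rssi in sightings
            if beacon_id in HOTSPOTS and rssi >= NEAR_RSSI_DBM
        }
        # Notify only for hotspots that were not already in range,
        # so the same story is not repeated on every scan.
        entered = sorted(near - self._in_range)
        self._in_range = near
        return [HOTSPOTS[beacon_id] for beacon_id in entered]
```

Keeping the set of beacons currently in range is a deliberate design choice in this sketch: without it, a visitor standing next to a beacon would receive the same notification at every scan. Leaving and re-entering the range triggers the notification again.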
Action and Environment Sensing
The types of sensors, actuators, and microcontrollers that can be integrated in an IoT setting are varied and may enable networks of collaborating nodes at different levels of complexity and scale (Kubitza et al. 2013). In addition to the sensors for location detection discussed above, there are sensors for sensing activities occurring on augmented artifacts (i.e., when an object is picked up, moved, opened, shaken, spoken to): accelerometer, gyroscope, compass, touch sensor, microphone, conductive fabric, and strain gauge. The state of the environment can additionally be monitored with sensors that measure light, temperature, air quality, moisture, and noise level. Actuators can be of different types, like visual (screens, projectors), acoustic (beeper, speaker), haptic (vibration, linear motor, mini-fan), related to heat (Peltier elements) and smell (odor releasers, material heaters), or for creating new artifacts (printers and 3D printers). Different types of microcontrollers can be used to connect the sensors and actuators in a customized way, for example, mBed, ESP, nRF5x, Arduino, Intel Edison, and Raspberry Pi (Kubitza et al. 2013). But preconfigured commercial sensor kits with associated managing platforms are also becoming popular, especially to augment private spaces, e.g., Philips HUE for smart
home lighting, Sonoff smart switches, Ikea wifi smarthome, Apple HomeKit, Google Home. Even a smartphone or a tablet can be used in an innovative way as an off-the-shelf cluster of sensors and actuators and be programmed to exchange data with a wider IoT network. Deployments may range from very small solutions aimed at capturing simple actions (like the detection of the lifting of an exhibit object in a museum to activate a projection) to sophisticated sensor networks in urban areas to determine complex events like crowding in public places (Ganti et al. 2011) or to monitor traffic (Gubbi et al. 2013). Urban IoT has raised interest particularly in the smart cities field, where the most advanced communication technologies are used to support added-value services for the administration of the city and for the citizens (Zanella et al. 2014). Innovative applications seamlessly extend to the travel and tourism domain: for example, the same data used by a municipality to monitor crowding and traffic for quality of life and safety reasons can be exploited by recommender systems to adjust the suggestion of the next points of interest to visit or the parking areas to use, guiding the tourists’ flow so as to improve the city experience and promote less popular areas (Massimo and Ricci 2018).
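To make the last example concrete, the sketch below shows one simple way a recommender could take crowding data into account: each point of interest keeps its relevance score, and a penalty proportional to the measured crowding is subtracted before ranking. This linear penalty is an assumption made for illustration and is not the method of Massimo and Ricci (2018); the function name, the score ranges, and the example data are likewise hypothetical.

```python
from typing import Dict, List


def rerank(
    scores: Dict[str, float],
    crowding: Dict[str, float],
    penalty: float = 0.5,
) -> List[str]:
    """Rank points of interest by relevance minus a crowding penalty.

    scores:   relevance of each point of interest, assumed in [0, 1].
    crowding: current occupancy level from urban sensors, assumed in
              [0, 1]; places without sensor data count as uncrowded.
    penalty:  weight of crowding relative to relevance (illustrative).
    """
    adjusted = {
        poi: score - penalty * crowding.get(poi, 0.0)
        for poi, score in scores.items()
    }
    # Highest adjusted score first; ties are broken alphabetically.
    return sorted(adjusted, key=lambda poi: (-adjusted[poi], poi))
```

With relevance scores of 0.9 for a cathedral, 0.7 for a botanic garden, and 0.6 for a city museum, a crowded cathedral (crowding 0.8) drops from first to last place, which is the kind of flow redistribution toward less popular areas that the text describes.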
The Design Choices
The opportunities opened to the travel industry by ubiquitous computing and the IoT have already started to emerge and to revolutionize the way tourists experience the tourist context. However, when departing from the well-known development of web services and applications and turning to build and orchestrate more pervasive technological components, the complexity of the task increases significantly. The design phase that precedes the actual system development, and in particular the practice of co-design in strict collaboration with the stakeholders of the sector, becomes crucial. Not only is it necessary to understand end users’ and stakeholders’ requirements for new types of services and the feasibility and the cost of the technological deployment, but also important decisions must be taken on how the human-system interaction unfolds. Interaction design for ubiquitous computing and the IoT is indeed a multifaceted process aimed at delivering a coherent user experience that seamlessly spans across different moments of encounter between the user and the technological elements, which may occur at different times and places, either by overtly using a range of devices or via embodied interaction where the technology remains in the background and hidden to the user. To narrow down the range of design choices and make focused decisions, designers and developers need to address several questions.
1. Why to augment. First of all, we need to identify what objective the technological augmentation is addressing. For the tourism domain, the general aim to support the visitor experience can be broken down into more specific goals, each implying different types of design decisions regarding the interaction and the user engagement. For example, when developing services to facilitate practical tasks (e.g., restricted area access, ticketing, payments), the main focus is on
supporting new procedures, gestures, and objects that speed up the process, by providing easy identification of the user and the transaction while preserving the mandatory privacy and security constraints (improved convenience of action). When, instead, the objective is to increase the emotional engagement of the tourist with a destination or visited site, the design focus is on how memorable information can be presented in new, surprising, or more convenient ways, where the user has an active role in customizing and sharing the experience with significant others (increased engagement). As a third example, if the objective is to provide effective guidance (e.g., directions and recommendations on what to do next), the focus is on developing the infrastructure capturing a more complete picture of the user context and her interactions in order to implement more effective algorithms (better system performance). The technological augmentation based on IoT may also serve other goals than those of the end user (i.e., the tourist), as in the case of destination management organizations (DMOs), event organizers, or museum curators wishing to trace visitors’ flow to understand the audience/customers. Valuable information relates to how visitors prepare their travel, where they come from, when they start looking into practical details, what they do and purchase onsite, and how crowds gather. By processing data provided by a multitude of complementary sensors, useful statistics and reports can be distilled.
2. Augmentation for whom. Mobile services and applications are typically designed to be used by single individuals, notable exceptions being mobile guides explicitly fostering group interaction (Callaway et al. 2014) and group recommenders (Garcia et al. 2011).
However, travel is generally undertaken in self-organized groups (family, group of friends, couple, school group) or in casual groups (guided visits or in-place group workshops), and even when visiting alone, individuals move in a shared space and compete for the same tourist resources. Previous work has demonstrated that social interaction and active participation around interactive installations make the experience more meaningful and memorable (Heath et al. 2002): experiencing together physical objects, like an exhibit object or a handicraft product, offers a new opportunity to do things together; choosing together what information and emotional content to receive favors conversation and understanding. Ubiquitous computing and IoT make it possible to envisage scenarios where the interaction is purposefully conceived for groups and can be personalized to accommodate the different motivations, mood, and expectations of visitors, e.g., to support lively treasure hunts through the city for families with young children vs. to guide groups of friends through a wine-tasting tour.
3. What and how to augment. Only after answering the previous questions can we turn to more specific decisions on what the human-computer interaction will look like. At this point, decisions are taken about (i) when the tourist and the technological elements of the ubiquitous computing ecosystem encounter each other (e.g., when entering a specific door, making a payment, choosing a product, reaching a hotspot, or using a mobile application) and (ii) the actions and the events that need to be modeled (e.g., the unlocking of the door, the identification of the user and her location, the movement of objects, service requests from the mobile
E. Not et al.
app). Technology elements are then selected and arranged into an infrastructure supporting those actions and events: some technology pieces may be carried/worn by the user, others may be embedded into artifacts, and additional components may be distributed in the environment. Some decisions are driven by technical constraints. For example, objects or buildings may be fragile, valuable, or of historical significance, so that no invasive technology can be employed – e.g., no cabling, no power supply, and no direct touch by users. Or the area to cover may be wide and not suitable for expensive high-precision localization technology, or an Internet connection may not be available. Other decisions are instead driven by the desired effect on the user/group engagement, for example, when replacing a mobile tourist app with a more engaging soundscape.
Novel Scenarios for the Tourism Domain

The specific tourism subdomain of city tours and visits to cultural heritage sites (e.g., museums, archaeological sites, gardens, historical buildings, itineraries) has been one of the earliest and most active fields of research and experimentation on the introduction of ubiquitous computing and IoT, with installations that range from distributed technology to detect users’ presence and location, local microcontrollers and servers to collect and preprocess the data generated by the sensors, and output devices distributed in the environment (projectors, public displays, ambient sound) to fully fledged IoT settings that support tangible, embedded, and embodied interaction (Marshall 2018). However, IoT-based solutions have now started to permeate the whole tourism sector by supporting new interaction modalities, modernizing the traditional means of information delivery, and enhancing system intelligence.
Information and Services Requested via Physical Actions

The use of sensors for action and environment monitoring makes it possible to recognize users’ physical actions as forms of input to the information system. The scan of a badge or of a bracelet containing an RFID tag can grant access to specific areas, for example, a parking lot, a hotel room, or attractions at an amusement park (see Note 8). Instead, the use of the personal mobile device to read an NFC tag or to frame a visual code placed in the environment can be interpreted as a request to receive information about the point of interest the marker is associated with (O’Neill et al. 2007), like the timetable related to a bus stop, today’s menu at a restaurant, or the program at the local cinema. A generalization of this approach is represented by NFC-based smart posters (Boes et al. 2015), where multiple magnetic tags are used to augment traditional, paper-based information and advertising posters so that users can transfer digital information onto their smartphone with very simple interaction gestures. The advantage
13 Internet of Things and Ubiquitous Computing in the Tourism Domain
of introducing this intuitive form of interaction between public, static sources of information and personal devices is the possibility to personalize the transmission according to the user profile or her current position, for example, by including directions to reach the point of interest advertised in the poster (Borrego-Jaraba et al. 2011). More generally, the convenience of physical actions, coupled with security protocols that protect the data exchange, contributes to the spread of transaction services supporting users on the move, for example, mobile ticketing and mobile payments based on the pairing of a smartphone with local terminals (Atzori et al. 2010). Several factors have been shown to affect the acceptance of mobile payment services, such as compatibility with lifestyle and habits, individual mobility, perceived security, perceived ease of use, and usefulness (Schierz et al. 2010). IoT solutions, such as the pairing of a phone with a terminal based on NFC protocols, certainly increase the ease of use, thereby contributing to the overall user acceptance and popularity of the service.
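The smart-poster scenario can be made concrete with a minimal sketch of how a tag scan might be resolved into a personalized response. The tag identifier, the `POINTS_OF_INTEREST` table, the `UserProfile` fields, and the straight-line distance estimate are all hypothetical and chosen only for illustration; they are not taken from the systems cited above.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class UserProfile:
    language: str  # preferred output language (assumed profile field)
    lat: float     # current position of the user
    lon: float


# Hypothetical catalogue: tag id -> point of interest advertised on the poster
POINTS_OF_INTEREST = {
    "tag-001": {
        "name": {"en": "City Museum", "it": "Museo Civico"},
        "lat": 46.07,
        "lon": 11.12,
    },
}


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def handle_scan(tag_id, user):
    """Interpret a tag scan as an information request and personalize the reply."""
    poi = POINTS_OF_INTEREST.get(tag_id)
    if poi is None:
        return None  # unknown tag: nothing to deliver
    # Personalize by language, falling back to English
    title = poi["name"].get(user.language, poi["name"]["en"])
    # Personalize by position: how far the advertised place is from the user
    dist = distance_km(user.lat, user.lon, poi["lat"], poi["lon"])
    return {"title": title, "distance_km": round(dist, 2)}
```

In a real deployment the distance would more plausibly be replaced by walking directions from a routing service; the straight-line estimate only keeps the sketch self-contained.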
Beyond the Personal Screen: Adaptive Information Kiosks, Intelligent Shop Windows, and Soundscapes

Travel information kiosks started to appear more than 20 years ago, before the mobile revolution (Fesenmaier and Kingsley 1995). They are still used at destination welcome centers, stations and airports, hotel lobbies, and tourist attractions to provide updated information, promote services and activities, and support specific transactions (e.g., make reservations, dispense travel coupons or maps with directions, buy entrance tickets). However, they have gradually lost their initial appeal because the displayed information is usually generic, i.e., not automatically personalized to the specific user needs. Indeed, several factors may affect the successful adoption of an information kiosk (Slack and Rowley 2002): its design and location; how well its functions address the needs of its prospective audience; its internal information categorization and structure; the interface design; whether the system adapts to users’ interaction; and whether it includes e-commerce functionalities. New opportunities for innovating information kiosks are offered by IoT technologies that help network kiosks within a larger information ecosystem and put them in dialogue with personal devices. Ojala et al. (2012), for example, describe how an interactive public display can be paired with the personal devices of users to create a distributed/hybrid user interface: the mobile phone’s user interface can be coupled with the kiosk’s public interface to allow content download, or the phone can be used to remotely control what is shown on the public display. In their experiment, the authors tested NFC/RFID tags, QR codes, Bluetooth, and SMS to enable mobile phone and portal service pairing. 
A similar case study is presented by Hardy and Rukzio (2008), where multiple NFC/RFID tags are used to partition a public display in multiple areas that can be selected and controlled separately via a mobile phone for the implementation of an interactive tourist information application.
The possibility of pairing public displays with personal mobile devices or other identifiers carried by the users extends to scenarios of intelligent shop windows, where the information that is displayed on public screens may be tailored to users’ personal/group features (e.g., the suggestion of a menu at special rates for families) or to previously experienced attractions or activities (e.g., targeted advertising of gastronomic products to tourists who have enrolled in a guided tour to wineries). Intelligent shop windows introduce a new type of communication channel between a pervasive information system and the tourist: information and recommendations can be delivered to users without requiring them to explicitly search and interact with their mobile device, thus extending the opportunities of personalization technology. In addition, indirect forms of communication can be adopted as well, like the intelligent control of the shop lighting system (van Doorn et al. 2008). Another scenario that is enabled by IoT and that provides an alternative to traditional mobile guides and mobile tourist information systems (i.e., an alternative to the personal screen) is represented by soundscapes, i.e., technological deployments where a network of loudspeakers is distributed along an itinerary to engage visitors with suggestive audio-based narrations and sounds. Marshall et al. (2016b), for example, describe a system where several Bluetooth loudspeakers are distributed in an archaeological site and automatically play ambient sounds and stories depending on a combination of features such as location of the point of interest, visitors’ proximity (measured via Bluetooth signal strength), and thematic preferences expressed by manipulating an augmented book or an augmented belt. 
An evaluation study carried out with small groups of visitors revealed that this type of immersive experience takes visitors beyond the traditional view of heritage as a source of information toward a sensorial experience of feeling the past, with high levels of visitors’ appreciation and sharing between group members. This is an example of how IoT (when properly coupled with a design phase aimed at selecting the values, the content, and the type of interaction to support) offers new opportunities to the tourist industry for creating novel products and packages.
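The playback logic of such a soundscape can be sketched as follows. This is only an illustration under stated assumptions: the `Speaker` structure, the RSSI threshold, the audio file names, and the `"ambient"` fallback key are hypothetical and do not describe the implementation of the cited system.

```python
from dataclasses import dataclass


@dataclass
class Speaker:
    speaker_id: str
    stories: dict  # theme -> audio file name (hypothetical content)


RSSI_THRESHOLD_DBM = -70  # assumed cut-off for "visitor is close enough"


def choose_playback(speakers, rssi_by_speaker, theme):
    """Return (speaker_id, audio) for the closest speaker in range, else None.

    speakers:        dict speaker_id -> Speaker
    rssi_by_speaker: dict speaker_id -> measured signal strength in dBm
                     (a higher, i.e. less negative, value means closer)
    theme:           thematic preference expressed by the visitors
    """
    in_range = [
        (rssi, sid)
        for sid, rssi in rssi_by_speaker.items()
        if sid in speakers and rssi >= RSSI_THRESHOLD_DBM
    ]
    if not in_range:
        return None  # nobody is near a point of interest: stay silent
    _, nearest = max(in_range)  # strongest signal wins
    stories = speakers[nearest].stories
    # Prefer the story matching the chosen theme, else fall back to ambient sound
    audio = stories.get(theme, stories.get("ambient"))
    return (nearest, audio) if audio else None
```

Signal strength is a noisy proxy for distance, so a deployed system would probably smooth the readings over time before deciding which loudspeaker to activate.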
Interaction with Smart Objects

Tangible interaction has been demonstrated to stimulate engagement and sharing (Shaer and Hornecker 2010). The possibility to augment material objects with sensors that detect when they are moved and manipulated enables scenarios where descriptive information about objects and places is presented to tourists at the very moment they are experiencing them. We can have, for example, projections showing multimedia content when an exhibit object in a museum is lifted – with object movement detected via an RFID sticker attached to the artifact (Not et al. 2019) – or an interactive plinth that projects information around objects that are put on top of it, with user presence and proximity detected through infrared sensors (Wolf et al. 2015). This type of interactive experience that does not involve direct use
of digital devices (smartphones, tablets, touch screens) has already been evaluated successfully (Petrelli and O’Brien 2018). Although most of the case studies developed so far have concentrated on applications to cultural heritage explorations, other aspects of the tourism experience can be augmented, like the discovery of local handcrafted products or food and wine products. For example, Cavada et al. (2018) experimented with IoT-based information kiosks in a commercial setting where different challenges emerge: the need to align the installations with the company brand strategy, the requirement to attract and engage visitors to foster subsequent interactions with the salespersons, and the utility of collecting product popularity statistics to initiate further marketing campaigns. In their case study, in the context of a wine fair, bottles of wine were available to visitors for a closer inspection of the packaging and for activating multimedia descriptions when bottles were placed onto wooden boxes of the selling winery, which were augmented with an RFID reader and smart buttons. Touch areas on the box surface were available to select the preferred output language and the type of information to display (information about the winery, the land where the grapes are grown, and the properties of the wine). Although very specific, this scenario can be generalized to many similar situations in which there is a collection (catalogue) of objects with related information; stakeholders (e.g., retailers, exhibitors in fairs or markets, museum curators) are interested in conveying detailed information about the objects (e.g., technical features, organoleptic properties, manufacture techniques, or their history); end users are interested in learning about the objects; the physical engagement with the objects might improve the user experience; and it would be difficult for the organization to provide all the details personally to individual users.
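The generalized scenario (a catalogue of tagged objects, a reader, and a few touch areas) can be sketched as a small event-driven model. The class and method names below are hypothetical, as is the structure of the catalogue; the sketch does not reproduce the software used in the cited case study.

```python
from collections import Counter


class InteractiveStation:
    """Toy model of a box with an RFID reader and touch areas (hypothetical API)."""

    def __init__(self, catalogue, language="en", topic="wine"):
        self.catalogue = catalogue   # tag id -> {topic: {language: text}}
        self.language = language
        self.topic = topic
        self.current_tag = None
        self.popularity = Counter()  # how often each known object was placed

    def on_object_placed(self, tag_id):
        """Called when the reader detects a tagged object on the box."""
        if tag_id in self.catalogue:
            self.current_tag = tag_id
            self.popularity[tag_id] += 1
        else:
            self.current_tag = None  # unknown object: show nothing
        return self.render()

    def on_object_removed(self):
        """Called when the object is lifted from the box."""
        self.current_tag = None
        return self.render()

    def on_touch(self, setting, value):
        """Touch areas select the output language or the type of information."""
        if setting == "language":
            self.language = value
        elif setting == "topic":
            self.topic = value
        return self.render()

    def render(self):
        """Text to display, or None when there is nothing suitable to show."""
        if self.current_tag is None:
            return None
        texts = self.catalogue[self.current_tag].get(self.topic, {})
        return texts.get(self.language)
```

The `popularity` counter illustrates how the same events that drive the presentation can also feed the product popularity statistics mentioned above.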
Tangible Output

Despite the convenience of digital information services, in the tourism domain, digital content does not completely replace paper-based material distributed onsite: information brochures, booklets, city maps, guides for thematic itineraries, and calendars with lists of events are an important communication channel maintained by DMOs and touristic product providers, as they significantly contribute to satisfying tourists’ needs and to generating a positive product image. IoT and personalization technologies open new opportunities for innovating how paper-based information material is created to have a deeper impact on tourists’ decision making and emotional experience. An information kiosk can, for example, be developed to print on the fly automatically generated maps that reflect the visualization and content preferences expressed by tourists (Grabler et al. 2008) or to generate brochures advertising what to see next by taking into account what the visitor has already seen (e.g., by importing the visit history after the visitor has paired her phone with the kiosk via NFC) and containing route information (Birsak et al. 2014). The end of the visit is another excellent opportunity to build upon the visit experience and create a bond – that takes a material form – between tourists and
the touristic context, as demonstrated by the success of souvenir shops offering objects that capture the highlights of the visit. A souvenir holds an emotional value for visitors; it offers the background for remembering and retelling the visit experience to others and can be a stimulus to get in further touch with the destination/touristic product. The technological infrastructure of IoT enables the logging of visitors’ interaction at multiple points of their journey, with the possibility of automatically composing travel summaries and of embedding them into souvenir objects. Previous work has investigated the feasibility of this scenario. Callaway and colleagues (2007) exploited the logs of the users’ interaction with an audio guide and the semantic representation of the described contents (frescos from the fourteenth century) to generate textual summaries of what captured visitors’ attention, enriched with pictures of fresco details and suggestions for related heritage sites to visit. The illustrated summaries were printed out for the visitors to take away. More recent work has investigated the generation of personalized souvenir postcards that help visitors remember what they saw and experienced during an itinerary augmented with smart objects and digital presentations. For the crafting of the personalized message, both natural language generation techniques (Not et al. 2017) and the composition of graphical elements over a preprinted postcard template (Petrelli et al. 2017) have been explored. The postcard is also an ideal means to share a unique key identifier that grants each visitor access to a personal online space related to their visit (Petrelli et al. 2017). These opportunities can be further exploited with the use of 3D printers for engaging tourists in the creation of their own handcrafted souvenir, for example, as investigated by Anastasiadou and Vettese (2019).
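As a rough illustration of how interaction logs could be turned into a printable summary, the sketch below ranks exhibits by total time spent and fills a fixed sentence template. This template-filling approach is a deliberate simplification and not the natural language generation used in the cited work; the log format, the function name, and the wording of the message are assumptions.

```python
def visit_summary(log, titles, top_n=2):
    """Build a short souvenir text from (exhibit_id, seconds) log entries.

    log:    iterable of (exhibit_id, seconds) pairs recorded during the visit
    titles: dict exhibit_id -> human-readable name (ids are used if missing)
    top_n:  how many highlights to mention
    """
    # Accumulate the total time spent at each exhibit
    totals = {}
    for exhibit_id, seconds in log:
        totals[exhibit_id] = totals.get(exhibit_id, 0) + seconds

    # Longest total time first; ties are broken alphabetically by id
    ranked = sorted(totals, key=lambda e: (-totals[e], e))[:top_n]
    names = [titles.get(e, e) for e in ranked]

    if not names:
        return "We hope you enjoyed your visit."
    if len(names) == 1:
        highlights = names[0]
    else:
        highlights = ", ".join(names[:-1]) + " and " + names[-1]
    return f"During your visit you spent most time with {highlights}."
```

Time spent is only one possible indicator of interest; explicit actions such as bookmarking or replaying a presentation could be weighted in the same way.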
Exploiting IoT for Personalization

When IoT installations supporting extended interaction with objects/handicrafts and places, such as those illustrated in the previous sections, are deployed in a tourism environment, they become an additional source of information about users’ choices, preferences, and performed activities. This makes it possible to improve the personalization of the delivered information and services and to recommend what to do next. “Personalization” is a broad term that encompasses three types of system behavior: adaptability (also called customization) offers users a number of options to set up the system the way they like it; context-awareness is the ability of the system to sense the current state of the environment and to respond accordingly; and adaptivity implies the system maintains a dynamic model of the ongoing interaction and dynamically changes its own behavior to adapt to the changing situation. Tangible and embodied interaction enabled by the IoT extends the way users enter into a dialogue with the system, affecting all three forms of personalization (Not and Petrelli 2018). For example, by choosing one among different types of gadgets distributed at a destination welcome office (incorporating an RFID tag or a beacon), tourists might
be able to explicitly choose a certain type of thematic itinerary. By monitoring what object was chosen, the system can be adapted accordingly throughout the visit. As a result, a very intuitive action that does not require visitors to interact with digital devices becomes a new way to deliberately control the system (system customization) (Marshall et al. 2016a). In traditional mobile services for the tourism scenario, the context considered for personalization typically includes features like the user’s location, time, day of week, social context, and weather forecast (Baltrunas et al. 2012). With IoT, context-awareness expands to take into account the state of smart objects (e.g., a switch is off), the performance of physical actions (e.g., the scan of an RFID tag), and more sophisticated environment sensing (e.g., noise level), and these factors affect how the system behaves, as exemplified by many of the scenarios mentioned in the previous sections. Adaptivity occurs when the system dynamically improves the relevance of the provided information and adjusts the appropriate level of system proactiveness (i.e., the system’s ability to initiate the interaction, e.g., through notifications) based on a model of the users’ profile, the flow of their interaction, and the multifaceted context in which the interaction occurs. With IoT elements integrated in the system infrastructure, the modeling algorithms need to reason over an extended set of logs that includes the events generated by the sensors. Massimo et al. (2017), for example, have explored the challenging problem of learning the preferences of users (what item is chosen next and why) from low-level behavior data by modeling and learning these preferences as a sequential decision-making problem and using a machine learning technique called inverse reinforcement learning (Ng and Russell 2000). 
They have considered the scenario of a network of interactive stations for artifact description based on tangible interaction and have studied the issue of learning the user model for a recommender to suggest which media items to consume next by reasoning on the logs of IoT interactions. The same authors are exploring the generalization of this approach to predict visitors’ next actions and movements to recommend optimal sequences of points of interest to visit in a city (Massimo and Ricci 2018). The major challenge is to develop recommendation and personalization algorithms that are able to reason over the extended set of logs generated by interacting both with traditional web and mobile services and with an IoT infrastructure. A typical scenario is when the tourist, during the same trip, uses a mobile guide, receives notifications generated by beacons, uses smart cards, interacts with information kiosks, experiences augmented itineraries, and discovers the features of handicrafts via interactive stations (Cavada et al. 2018). The scale-up to a comprehensive scenario of interconnected physical and digital information services certainly poses technical and research issues related to (i) the heterogeneity of the data collected at the different touch points of the visitor with the system, which are representative of different information goals and contexts of use, and (ii) the greater sparsity of the data (both with respect to time and space), which increases the complexity of learning models of users’ preferences and behavior.
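The three forms of personalization discussed in this section can be contrasted in a compact sketch. All names, the gadget-to-theme mapping, the noise threshold, and the scoring rule are hypothetical. In particular, the interest model is a plain frequency count used for illustration; it is not the inverse reinforcement learning approach of Massimo et al. (2017).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class VisitorModel:
    theme: str = "general"                               # explicit choice
    interests: Counter = field(default_factory=Counter)  # learned from events


# Hypothetical mapping from the gadget picked at the welcome office to a theme
GADGET_THEMES = {"tag-soldier": "military", "tag-nurse": "civilian"}


def customize(model, gadget_tag):
    """Adaptability: choosing a gadget is an explicit choice of itinerary theme."""
    model.theme = GADGET_THEMES.get(gadget_tag, model.theme)


def output_channel(noise_db):
    """Context-awareness: react to the sensed state of the environment."""
    return "text" if noise_db > 70 else "audio"  # the threshold is an assumption


def record_interaction(model, topic):
    """Adaptivity: keep a running model of what the visitor engages with."""
    model.interests[topic] += 1


def next_item(model, items):
    """Suggest the unseen item that best matches theme and observed interests."""
    def score(item):
        theme_bonus = 2 if item["theme"] == model.theme else 0
        return theme_bonus + model.interests[item["topic"]]

    candidates = [item for item in items if not item.get("seen")]
    return max(candidates, key=score)["id"] if candidates else None
```

The sketch keeps the three mechanisms separate on purpose: the first changes only on a deliberate user action, the second depends only on sensor readings, and the third evolves with the interaction history.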
Exploiting IoT for Extended Data Analytics Capabilities

The logs collected by the IoT infrastructure can additionally be exploited to compute statistics on system usage and visitors’ preferences that may be of interest to tourism organizations for market segmentation (Dolnicar 2008), for tuning marketing strategies, or for understanding the impact of technological augmentations on the visitors’ experience. Similar to web analytics of online information services and data mining of mobile app usage (Not 2019), IoT analytics reveal, for example, which beacons distributed around a city or other public spaces have been detected most frequently (e.g., useful for crowd monitoring and for developing new guided itineraries), which handicrafts have been explored most and are therefore more attractive to tourists (e.g., to determine product popularity and update display strategies), and the days and time slots during which physical premises with regulated access have been used most (e.g., to monitor access to wellness areas and plan new service packages). By cross-checking these data with other logs collected by mobile applications integrated in the technological ecosystem, the IoT analytics can be further segmented by tourists’ provenance, their information-seeking style, and how far in advance they start planning their travel. However, caution must be used when automatically analyzing the logs of IoT installations, as not all the recorded interactions correspond to meaningful user behavior, and the extraction of user interests and preferences from raw interaction logs via unsupervised data mining might not be reliable. 
Several issues need to be carefully taken into account: irrelevant data may simply correspond to user attempts to understand how interactive installations work or what type of information is available; usability problems might cause mistakes or repeated actions; and the context where the IoT installation is placed might influence how it is used, e.g., crowding might urge users to free an interactive installation earlier than actually desired (Cavada et al. 2018). In the case of IoT installations, the “end of theory” postulate assuming that big data analysis is sufficient to discover significant phenomena without the need of theoretical models or qualitative research (Anderson 2008) is misplaced. Interviews, questionnaires, and observations may be crucial to overcome bias in the raw data and to fully understand issues of usability, motivation of use, appropriation, and long-term adoption of the technology.
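A minimal sketch of the kind of usage statistics described in this section, with a crude filter for interactions that are too short to be meaningful. The event format, the `MIN_SECONDS` threshold, and the function name are assumptions made for illustration; as argued above, such a filter cannot replace interviews and observations when interpreting the data.

```python
from collections import Counter
from datetime import datetime

MIN_SECONDS = 5  # assumed: shorter interactions are treated as accidental


def usage_report(events):
    """Summarize IoT interaction logs.

    events: iterable of dicts with keys
        'source'  - id of the beacon, kiosk, or smart object
        'start'   - ISO 8601 timestamp of the interaction
        'seconds' - duration of the interaction
    """
    per_source = Counter()  # how often each installation was used
    per_hour = Counter()    # usage per hour of the day
    discarded = 0           # interactions filtered out as too short
    for event in events:
        if event["seconds"] < MIN_SECONDS:
            discarded += 1
            continue
        per_source[event["source"]] += 1
        per_hour[datetime.fromisoformat(event["start"]).hour] += 1
    return {
        "per_source": per_source.most_common(),
        "per_hour": dict(per_hour),
        "discarded": discarded,
    }
```

Reporting the number of discarded events alongside the counts makes the effect of the filter visible, which helps when deciding whether the threshold is hiding usability problems rather than noise.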
Expected Future Developments

Going back to consider the determinant factors for the success of mobile services in the tourism domain, we can see that ubiquitous computing and IoT significantly expand the meaning and amplitude of those factors:

• With proper sensor and actuator selection and varied data transmission methods, ubiquity can be guaranteed also in places where phone connection or power supply is not available or where the usage of personal devices is not suitable.
• Convenience is improved through new, easy ways of activating services and performing practical tasks, with actions and smart objects that speed up processes, support identification, and communicate choices. Convenience is complemented by higher levels of engagement, as the user-system interaction is not focused solely on the personal devices but extends to include the experience of the senses and the body.
• Localization, i.e., the awareness of the user location to tailor the services, extends to include the awareness of objects and the state of the environment to build a richer model of the context.
• Personalization can reach better results thanks to the possibility to log and reason on varied aspects of the onsite visit, i.e., different types of interaction between the user and the tourist context.
• The material aspects of the interaction introduce new forms of experience sharing among members of a travel group: being immersed in a common augmented space and the sharing of physical objects (observing what others are doing, taking turns, handing over, conversing about the received information) offer a new opportunity to do things together and create shared memories.

The field of research for developing large-scale information systems that integrate in a unique ecosystem online services, mobile applications, and physical-digital services powered by ubiquitous computing and IoT is still in its infancy. The main challenges are represented by (i) the need for an overarching strategy to design a coherent user experience that seamlessly spans different services and interaction modalities; (ii) the availability of an efficient content management system that optimizes the distribution of content to the various communication channels; and (iii) the building of models of the tourist experience that reason on all the (heterogeneous) points of encounter between the user and the digitally augmented environment for effective personalization. 
Another emerging area of investigation relates to the design and implementation of development platforms that facilitate the assembling of IoT-based installations by easing the interconnection of heterogeneous smart devices and offering physical debugging of sensor events during testing (Kubitza and Schmidt 2017). Graphical approaches are also being investigated to develop user-friendly interfaces that support the end-user programming of smart environments in general (Desolda et al. 2017) and of specific application domains, like the deployment of interactive installations in museums (Ardito et al. 2018). For an application domain like tourism, where the timely publication of inspirational content is essential to communicate correct information and an engaging product image, the simple and rapid update of contents by stakeholders is an essential additional requisite. A user-centered approach to the design of facilities for an easy upload of contents in an IoT setup is, for example, described by Not and Petrelli (2019) for the scenario of cultural heritage visits. More generally, there is an urgent need for authoring platforms centered on the requirements of tourism stakeholders, which include functionalities for self-provisioning, i.e., for choosing from a cloud repository the services that fit their tourism activities (e.g., templates for web pages, mobile applications, information
kiosks, intelligent shop window setups, interactive installations with smart objects, beacon-based itineraries), for configuring them and filling them with proper content, for publishing them online or deploying onsite, for monitoring their usage, and for periodically updating their contents. A last reflection concerns how technological infrastructures based on ubiquitous computing and the IoT may produce a shared benefit for different stakeholders operating in the same area, like in the case of destination smart cards, smart gadgets, beacons pervasively distributed in town, and information kiosks. These elements of tangible tourism, coupled with a proper interaction design, may strengthen in tourists a holistic perception of the tourism experience, even when products and services are offered by different providers. This is an additional opportunity for DMOs for stimulating collaboration among local stakeholders and for the development of joint initiatives with a destination-wide scope. To embrace this destination management innovation, the self-provisioning platforms envisioned above should include functionalities to support collaborative decision-making processes and the sharing of resources among a community of stakeholders.
Cross-References

Augmented, Virtual, and Mixed Reality in Tourism
Interactive and Context-Aware Systems in Tourism
Log File Analysis
Market Segmentation for e-Tourism
Mobile Applications for e-Tourism
Recommender Systems in Tourism
Notes

1 Google Pay. https://pay.google.com/about/. Accessed 5 June 2019.
2 Apple Pay. https://www.apple.com/apple-pay/. Accessed 5 June 2019.
3 https://www2.meethue.com/. Accessed 5 June 2019.
4 http://sonoff.itead.cc/en/. Accessed 5 June 2019.
5 https://www.ikea.com/gb/en/cat/smart-lighting-kits-36815/. Accessed 5 June 2019.
6 https://www.apple.com/shop/accessories/all-accessories/homekit?. Accessed 5 June 2019.
7 https://store.google.com/gb/product/google_home. Accessed 5 June 2019.
8 e.g., the Magic Band used at Walt Disney World (https://disneyworld.disney.go.com/en-eu/plan/my-disney-experience/bands-cards/. Accessed 5 June 2019).
References

Anastasiadou C, Vettese S (2019) “From souvenirs to 3D printed souvenirs”. Exploring the capabilities of additive manufacturing technologies in (re)-framing tourist souvenirs. Tour Manag 71:428–442
Anderson C (2008) The end of theory: the data deluge makes the scientific method obsolete. Wired Mag 16(7). https://www.wired.com/2008/06/pb-theory/
Ardito C, Buono P, Desolda G, Matera M (2018) From smart objects to smart experiences: an end-user development approach. Int J Hum-Comput Stud 114:51–68
Atzori L, Iera A, Morabito G (2010) The internet of things: a survey. Comput Netw 54(15):2787–2805
Baltrunas L, Ludwig B, Peer S, Ricci F (2012) Context relevance assessment and exploitation in mobile recommender systems. Pers Ubiquit Comput 16(5):507–526
Birsak M, Musialski P, Wonka P, Wimmer M (2014) Automatic generation of tourist brochures. Comput Graphics Forum 33:449–458
Boes K, Borde L, Egger R (2015) The acceptance of NFC smart posters in tourism. In: Tussyadiah I, Inversini A (eds) Information and communication technologies in tourism 2015. Springer, Cham, pp 435–447
Borrego-Jaraba F, Luque Ruiz I, Gómez-Nieto MA (2011) A NFC-based pervasive solution for city touristic surfing. Pers Ubiquit Comput 15(7):73–742
Callaway C, Not E, Stock O (2007) Report generation for post-visit summaries in museum environments. In: PEACH – intelligent interfaces for museum visits. Springer, Berlin, pp 71–92
Callaway C, Stock O, Dekoven E (2014) Experiments with mobile drama in an instrumented museum for inducing conversation in small groups. ACM Trans Interactive Intell Syst 4(1):1–39
Canadi M, Höpken W, Fuchs M (2010) Application of QR codes in online travel distribution. In: Gretzel U, Law R, Fuchs M (eds) Information and communication technologies in tourism 2010. Springer, Vienna, pp 137–148
Cavada D, Elahi M, Massimo D, Maule S, Not E, Ricci F, Venturini A (2018) Tangible tourism with the internet of things. In: Stangl B, Pesonen J (eds) Information and communication technologies in tourism 2018. Springer, Cham, pp 349–361
Clarke I III (2001) Emerging value propositions for M-commerce. J Bus Strateg 18(2):133–148
Desolda G, Ardito C, Matera M (2017) Empowering end users to customize their smart environments: model, composition paradigms, and domain-specific tools. 
ACM Trans Comput-Hum Interact 24(2):Article 12
Dolnicar S (2008) Market segmentation in tourism. In: Woodside AG, Martin D (eds) Tourism management: analysis, behaviour and strategy. CAB International, Cambridge, pp 129–150
Egger R (2013) The impact of near field communication on tourism. J Hosp Tour Technol 4(2):119–133
Fesenmaier DR, Kingsley I (1995) Travel information kiosks: an emerging communications channel for the tourism industry. J Travel Tour Mark 4(1):57–70
Ganti RK, Ye F, Lei H (2011) Mobile crowdsensing: current state and future challenges. IEEE Communications Magazine 49(11):32–39
Garcia I, Sebastia L, Onaindia E (2011) On the design of individual and group recommender systems for tourism. Expert Syst Appl 38(6):7683–7692
Gast MS (2014) Building applications with iBeacon: proximity and location services with bluetooth low energy. O’Reilly Media, Sebastopol, CA
Grün C, Werthner H, Pröll B, Retschitzegger W, Schwinger W (2008) Assisting tourists on the move – an evaluation of mobile tourist guides. In: 2008 7th international conference on mobile business, pp 171–180
Grabler F, Agrawala M, Sumner RW, Pauly M (2008) Automatic generation of tourist maps. ACM Trans Graph 27(3):Article 100
Gubbi J, Buyya R, Marusic S, Palaniswami M (2013) Internet of things (IoT): a vision, architectural elements, and future directions. Futur Gener Comput Syst 29(7):164–1660
Guo Y, Liu H, Chai Y (2014) The embedding convergence of smart cities and tourism internet of things in China: an advance perspective. Adv Hosp Tour Res (AHTR) 2(1):54–69
Han DI, Jung T, Gibson A (2013) Dublin AR: implementing augmented reality in tourism. In: Xiang Z, Tussyadiah I (eds) Information and communication technologies in tourism 2014. Springer, Cham, pp 511–523
314
E. Not et al.
Hardy R, Rukzio E (2008) Touch & interact: touch-based interaction of mobile phones with displays. In: Proceedings of the 10th international conference on Human computer interaction with mobile devices and services (MobileHCI ’08). ACM, pp 245–254 Heath C, Luff P, Vom Lehn D, Hindmarsh J, Cleverly J (2002) Crafting participation: designing ecologies, configuring experiences. Vis Commun 1(1):9–33 Kubitza T, Schmidt A (2017) meSchup: a platform for programming interconnected smart things. IEEE Comput 55(11):38–49 Kubitza T, Pohl N, Dingler T, Schneegaß S, Weichel C, Schmidt A (2013) Ingredients for a new wave of Ubicomp products. IEEE Pervasive Comput Mag 12(3):5–8 Leung D, Law R, van Hoof H, Buhalis D (2013) Social media in tourism and hospitality: a literature review. J Travel Tour Mark 30(1–2):3–22 Marshall MT (2018) Interacting with heritage: on the use and potential of IoT within the cultural heritage sector. In: Proceedings of 2018 Fifth international conference on internet of things: systems, management and security, pp 15–22 Marshall MT, Dulake N, Ciolfi L, Duranti D, Kockelkorn H, Petrelli D (2016a) Using tangible smart replicas as controls for an interactive museum exhibition. In: Proceedings of the TEI ’16: tenth international conference on tangible, embedded, and embodied interaction (TEI ’16). ACM, pp 159–167 Marshall MT, Petrelli D, Dulake N, Not E, Marchesoni M, Trenti E, Pisetti A (2016b) Audiobased narratives for the trenches of World War I: Intertwining stories, places and interaction for an evocative experience. Int J Hum-Comput Stud 85:27–39 Massimo D, Ricci F (2018) Harnessing a generalised user behaviour model for next-POI recommendation. In: Proceedings of the 12th ACM conference on recommender systems (RecSys ’18). ACM, pp 402–406 Massimo D, Elahi M, Ricci F (2017) Learning user preferences by observing user-items interactions in an IoT augmented space. 
In: Tkalcic M, Thakker D, Germanakos P, Yacef K, Paris C, Santos O (eds) Adjunct publication of the 25th conference on user modeling, adaptation and personalization (UMAP ’17), pp 35–40. ACM Ng A, Russel S (2000) Algorithms for inverse reinforcement learning. In: Proceedings of the seventeenth international conference on machine learning (ICML ’00), Morgan Kaufmann Publishers Inc. San Francisco, CA, 663–670 Ng PC, She J, Park S (2017) Notify-and-interact: a beacon-smartphone interaction for user engagement in galleries. In: Proceedings of 2017 IEEE international conference on multimedia and expo (ICME), pp 1069–1074 Not E (2019) Studying the information seeking preferences of participants to a large event. In: Proceedings of the 13th biannual conference of the Italian SIGCHI chapter: Designing the next interaction (CHItaly ’19). ACM, New York Not E, Petrelli D (2018) Blending customisation, context-awareness and adaptivity for personalised tangible interaction in cultural heritage. Int J Hum Comput Stud 114:3–19 Not E, Petrelli D (2019) Empowering cultural heritage professionals with tools for authoring and deploying personalised visitor experiences. User Model User-Adap Inter 29(1):67–120 Not E, Cavada D, Maule S, Pisetti A, Venturini A (2019) Digital augmentation of historical objects through tangible interaction. ACM J Comput Cult Herit 12(3):Article 18 Not E, Zancanaro M, Marshall MT, Petrelli D, and Pisetti A (2017) Writing postcards from the museum: composing personalised tangible souvenirs. In: Proceedings of the 12th biannual conference on Italian SIGCHI Chapter (CHItaly ’17), Article 5. ACM, New York, pp 1–9 O’Neill E, Thompson P, Garzonis S, Warr A (2007) Reach out and touch: using NFC and 2D barcodes for service discovery and interaction with mobile devices. In: LaMarca A, Langheinrich M, Truong KN (eds) Pervasive computing. Pervasive 2007. Lecture notes in computer science, vol 4480. 
Springer, Berlin/Heidelberg Ojala T, Kostakos V, Kukka H, Heikkinen T, Linden T, Jurmu M, Hosio S, Kruger F, Zanni D (2012) Multipurpose interactive public displays in the wild: three years later. Computer 45(5):42–49
13 Internet of Things and Ubiquitous Computing in the Tourism Domain
315
Petrelli D, O’Brien S (2018) Phone vs. tangible in museums: a comparative study. In: Proceedings of the 2018 CHI conference on human factors in computing systems (CHI ’18). ACM Petrelli D, Marshall MT, O’Brien S, McEntaggart P, Gwilt I (2017) Tangible data souvenirs as a bridge between a physical museum visit and online digital experience. Pers Ubiquit Comput 21(2):281–295 Ricci F (2002) Travel recommender systems. IEEE Intelligent Systems 17(6):55–57 Riekki J, Salminen T, Alakarppa I (2006) Requesting pervasive services by touching RFID tags. IEEE Pervasive Comput 5(1):40–46 Roxin A-M, Gaber J, Wack M, Nait-Sidi-Moh A (2007) Survey of wireless geolocation techniques. In: 2007 IEEE Globecom Workshops, Washington, DC Schierz PG, Schilke O, Wirtz BW (2010) Understanding consumer acceptance of mobile payment services: an empirical analysis. Electron Commer Res Appl 9(3):209-216 Shaer O, Hornecker E (2010) Tangible user interfaces: past, present, and future directions. Found Trends Hum–Comput Interact 3(1–2):1–137 (2010) Slack F, Rowley J (2002) Kiosks 21: a new role for information kiosks? Int J Inf Manag 22(1): 67–83 Stock O, Zancanaro M, Busetta P, Callaway C, Krueger A, Kruppa M, Kuflik T, Not E, Rocchi C (2007) Adaptive, intelligent presentation of information for the museum visitor in PEACH. User Model User-Adap Inter 17(3):257–304 van Doorn M, van Loenen E, de Vries AP (2008) Deconstructing ambient intelligence into ambient narratives: the intelligent shop window. In: Proceedings of the 1st international conference on Ambient media and systems (Ambi-Sys ’08), Article 8 Weiser M (1993) Ubiquitous computing. Computer 26(10):71–72 Wolf K, Abdelhady E, Abdelrahman Y, Kubitza T, Schmidt A (2015) MeSch: Tools for interactive exhibitions. In: Proceedings of the conference on electronic visualisation and the Arts (EVA ’15). BCS Learning & Development Ltd., Swindon, 261–269 Zanella A, Bui N, Castellani A, Vangelista L, Zorzi M (2014) Internet of things for smart cities. 
IEEE Internet Things J 1(1):22–32
Augmented, Virtual, and Mixed Reality in Tourism
14
Roman Egger and Larissa Neuburger
Contents
Introduction
Augmented Reality
Technological Background
Application of AR
  Pre-travel Phase
  On Trip Phase
  Post-travel Phase
Virtual Reality
  Immersion and Presence
Technological Background
Application of VR
  Pre-travel Phase
  On-Trip Phase
  Post-travel Phase
Mixed Reality
Conclusion
Cross-References
References
R. Egger
Innovation and Management in Tourism, Salzburg University of Applied Sciences, Salzburg, Austria

L. Neuburger
Department of Tourism, Hospitality and Event Management, University of Florida, Gainesville, FL, USA

© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_19
Abstract
The perception of touristic space is constantly changing, and today physical and virtual spaces are interwoven as a result of new technological developments. Virtual Reality, Augmented Reality, and Mixed Reality are opening a new paradigm and are receiving increasing interest from marketers and consumers alike. These technologies offer considerable opportunities to enhance customer engagement in a highly interactive way, which leads to an enhanced customer experience and to new marketing approaches for innovative players in the tourism industry. The terms, definitions, and boundaries between these technologies are, however, blurred and not well defined. This chapter therefore seeks to clarify key concepts by unraveling the subtleties behind the different terms, which are often used interchangeably. After reading this chapter, the reader will have a clearer understanding of the different concepts, their technological background and characteristics, and possible fields of application within the tourism industry. Several best practice scenarios illustrate the potential, opportunities, and risks of these immersive technologies. The chapter ends by discussing further prospects for technological developments and alternative application fields.
Keywords Augmented reality · Virtual reality · Mixed reality · Immersive technologies · Immersion · Presence
Introduction
Advancements in Information and Communication Technologies (ICTs) such as Augmented Reality (AR) and Virtual Reality (VR) have the power to transform the dynamics of the tourism industry as well as the way tourism is experienced (Buhalis and Law 2008; Guttentag 2010; Wei et al. 2019). The perception of touristic space is constantly changing, and today physical and virtual spaces are interwoven due to new technological developments. With the technological advancements of AR and VR, immersive technologies have the ability to change the perception of touristic space by manipulating different layers of reality. While VR can immerse the tourist in a different dimension of reality (Beck et al. 2019), AR augments the real environment with virtual elements (Azuma 1997). Technologies such as AR and VR have the potential to change the way people travel and to alter the perception of touristic spaces by blurring the boundaries between the physical and the digital environment and combining them in a “phygital” realm (Neuburger et al. 2018).

AR and VR can be seen as part of the overall concept of Mixed Reality (MR), classified in the Milgram continuum shown in Fig. 1 (Milgram and Kishino 1994). This framework places the real environment and the Virtual Environment (VE) at the two poles, with AR and Augmented Virtuality (AV) in the middle of the MR continuum. Thereby, MR encompasses the concepts of physical reality, AR, AV, and VR. “Mixed reality refers to space which consists of real and virtual elements that interact with each other” (Noh et al. 2009, p. 51). While AR adds a layer of virtual elements to the real space, AV integrates elements of reality into a VE such as a virtual museum. In the VE the user interacts only with virtual elements in a computer-simulated environment, and the real environment is replaced (Billinghurst et al. 2014; Milgram and Kishino 1994; Noh et al. 2009). MR in this continuum is defined by its level of virtuality (Tönnis 2010). Furthermore, when distinguishing between AR and VR, it is also important to consider the terms augmentation and simulation. While augmentation describes technologies that add features to already existing systems (AR), simulation describes technologies that have the ability to model realities (VR) (Billinghurst et al. 2014).

Fig. 1 Milgram’s reality-virtuality continuum. (Adapted from Milgram and Kishino 1994)

VR, AR, and MR are opening a new paradigm and are receiving increasing interest from marketers and consumers alike. These technologies provide opportunities to enhance customer engagement and the tourism experience in a highly interactive way, integrating immersive technologies in the tourism industry while opening new marketing approaches for innovative destinations and businesses. These next-generation technologies are projected to grow their market value from $6.1 billion in 2016 to $160 billion in 2023 (IDC 2019). In terms of mainstream adoption, the Gartner Hype Cycle for emerging technologies 2018 predicts that AR and MR will reach the “plateau of productivity” in 5 to 10 years. While VR has already reached the stage of mass adoption, AR only reached this level in 2019 after remaining in the stage of “disillusionment” for several years (Gartner 2019).

To identify research topics that relate to AR and VR, a network analysis was conducted. For this purpose, keywords such as “VR + Tourism,” “Virtual Reality + Tourism,” “Augmented Reality + Tourism,” etc. were used to identify relevant articles in Web of Science.
VOSviewer was then used to build a network based on the titles and abstracts of these articles, highlighting that AR, VR, and MR are intertwined with one another when it comes to research, as visible in Figs. 2 and 3.

The bibliographic topic analysis of AR shows that research topics focus on technology adoption and acceptance of AR, especially in terms of usability, mobile AR, and location-based services (LBS). Furthermore, AR is applied in different fields within tourism research: smart tourism and smart cities, archaeology (e.g., 3D reconstruction, photogrammetry), cultural heritage (e.g., museums), as well as services within the tourism and hospitality industry. Tourism researchers use AR as a solution for issues in the areas of sustainability (e.g., environment), tourism experience (e.g., personalization, authenticity), co-creation, education, marketing and promotion (e.g., social media), and management (e.g., satisfaction, loyalty, memory).

Fig. 2 Topic analysis: AR research

Similar to AR research, research on VR focuses on technology acceptance and usability and is closely connected to AR and MR. VR research mainly addresses use cases in destination marketing, especially in relation to destination image. As in AR research, topics regarding VR in tourism include visitor experience, innovation, co-creation, and education (e.g., knowledge). Here, VEs are used as tools to create immersion and the feeling of presence (both explained later in this chapter) and to enhance interactivity through the simulation of senses. Moreover, VR is used in the areas of hospitality, cultural heritage tourism (e.g., museums, visualization, and photogrammetry), social media, and sustainable tourism. In addition, VR research examines motivations, behaviors (e.g., word of mouth), and satisfaction.

Fig. 3 Topic analysis: VR research

The terms, definitions, and boundaries between these technologies are, however, blurred and not well defined. This chapter therefore seeks to clarify key concepts by unpacking the different terms and providing an overview of AR, VR, and MR. It is the hope of the authors that the reader will gain a clear understanding of the different concepts, their technological background and characteristics, and their fields of application within the tourism industry. Several best practice scenarios illustrate the potential, opportunities, and risks of these technologies. The customer journey (CJ) will be used to better demonstrate the different types of immersive technologies. The CJ distinguishes three main phases when travelling: (1) the pre-travel phase, where the tourists get inspired to travel, plan their trip, and go through the decision-making process; (2) the on-trip phase, where the tourists travel to the destination and accumulate travel experiences; and (3) the post-travel phase, which takes place when the tourists travel back home and reminisce about the trip. In this phase they can already get inspired and dream about their next trip. The chapter ends with prospects, discussing further technological developments and expanded fields of application.
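The keyword network analysis described earlier in this introduction can be illustrated with a minimal sketch. The Python fragment below simply counts how often two terms occur in the same title or abstract, which is the kind of co-occurrence information on which such term maps are built. It is not VOSviewer's implementation; the sample texts, the term list, and the function name are assumptions introduced purely for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus standing in for titles/abstracts from a database search.
DOCUMENTS = [
    "augmented reality adoption in museum tourism",
    "virtual reality destination marketing and presence",
    "mobile augmented reality acceptance in cultural heritage tourism",
    "virtual reality and augmented reality in tourism experience",
]

# Hypothetical list of terms to look for (an assumption, not the chapter's term set).
TERMS = [
    "augmented reality",
    "virtual reality",
    "tourism",
    "museum",
    "cultural heritage",
    "presence",
]


def cooccurrence(documents, terms):
    """Count how often each pair of terms appears in the same document."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        # Sort the matched terms so every pair has one canonical ordering.
        present = sorted(term for term in terms if term in lowered)
        for first, second in combinations(present, 2):
            counts[(first, second)] += 1
    return counts


edges = cooccurrence(DOCUMENTS, TERMS)
for (first, second), weight in edges.most_common(3):
    print(f"{first} -- {second}: {weight}")
```

Each counted pair corresponds to a weighted edge in a term network; a dedicated tool would additionally normalize the weights and cluster the terms before drawing the map.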
Augmented Reality
The idea of creating virtual representations of real environments dates back a few centuries: as early as the seventeenth century, theaters and museums made their first efforts to trick visitors into seeing virtual ghosts created with reflections of objects (Billinghurst et al. 2014). The first attempts at visualizing 3D graphics were made in the 1960s, when the researcher Ivan Sutherland developed the first interactive graphics application “Sketchpad” and his first AR prototype. Although the research field was already established in the 1990s, AR is still fairly new in the field of tourism (Azuma et al. 2001; Billinghurst et al. 2014). Azuma (1997) described AR as a variation of virtual environments (VEs) and set it in contrast to VR and its fully immersive character. He explains that “[...] AR allows the user to see the real world, with virtual objects superimposed upon or composited with the real world” (Azuma 1997, p. 356). Rather than replacing or blocking the real environment, AR adds virtual computer-generated 2D images or 3D objects to the real world in order to provide additional information to the user (Azuma 1997; Azuma et al. 2001; Carmigniani and Furht 2011; Danado et al. 2005). Azuma et al. (2001) defined three characteristics of an AR system: (1) the combination of reality and virtuality, (2) interactivity in real time, and (3) the alignment of both real and virtual objects. In this way “AR enhances the user’s perception of and interaction with the real world” (Carmigniani and Furht 2011, p. 3).

However, AR is more than a tool for visual augmentation. AR can enhance the perception of all senses and even help in substituting a missing sense (Carmigniani and Furht 2011). The sense of hearing can be stimulated by different sounds, music, or narratives. The sense of touch can also be activated so that the user can directly interact with the virtual objects on the screen. The example of an AR prototype for the Dommuseum in Salzburg, Austria (Neuburger and Egger 2017), in Fig. 4 shows that visitors can interact with the digital objects on the screen, which are used to augment and provide more information about a particular object or painting (Azuma 1997; Neuburger and Egger 2017). The haptic sense of the user can be additionally stimulated by wearable devices like special gloves, haptic skin, or exoskeletons. These systems can be used to stimulate the neural cells of the user in order to convey heat or the feeling of structure.
The olfactory as well as the gustatory sense can also be stimulated by wearable devices or devices in the room and can be used to enhance the experience the user already has with the AR application (Tönnis 2010; Yu et al. 2019). AR therefore offers a combination of haptic and digital experience (hap.dig) in which the haptic element provides added value (Mehler-Bicher et al. 2011).

Fig. 4 AR user interaction

Furthermore, the mass adoption of smartphones and other mobile devices has led to changes in communication behavior, with the discovery of AR for the end-consumer market (Mehler-Bicher et al. 2011). Recent technical developments and advancements in mobile processing, storage capacity, Wi-Fi ubiquity, and the integration of embedded features like cameras, microphones, and sensors have paved the way for progress in AR technology and the growth of AR adoption (Kipper and Rampolla 2012; Li et al. 2018). Rapid developments like Google’s relaunch of its smart glasses “Google Glass 2” (for a price below $1,000) (Robertson 2019), integrated AR browsers in mobile devices (Statt 2018), or the success of AR games like Pokémon GO (Wingfield and Isaac 2016) can help AR follow the same trajectory as VR, namely, to be adopted by the mass consumer market. “The potential of AR has just begun to be tapped and there is more opportunity than ever before to create compelling AR experiences” (Billinghurst et al. 2014, p. 3).
Technological Background
AR works by letting the user view augmented objects through a visor. The visor consists of a display screen and a small camera, which captures the real world around the user and sends those pictures to the computer, which tracks the position, the elements, and the rotation of the camera. The virtual components of the AR application are then sent back to the display screen. In this way the user has the illusion of looking through the augmented content into the real world. Tracking refers to the recognition and tracing of the real environment and objects (the trigger) in order to augment them with virtual content. The perfect illusion is achieved when the virtual objects are integrated as precisely as possible into the real world (Mehler-Bicher et al. 2011).

Tracking can be differentiated into marker-less and marker-based tracking. Marker-less tracking can work using, among other things, image recognition, GPS, ultrasound, optoelectronic sensors, or inertial sensors. However, tracking in AR is highly dependent on light conditions and on clear, constant features of the tracked objects. Marker-based tracking, by contrast, involves installing patterns (e.g., tags, QR codes) on the object, which are then recognized by the image processing system. The performance of the whole AR system therefore depends on the recognition speed of the tracker (Madden 2011; Mehler-Bicher et al. 2011). Despite rapid technological advancements, outdoor tracking represents one of the biggest challenges of AR and shows that immersive technologies are still at an early stage of development. While visual tracking was initially limited to high-contrast 2D pictures, advancements in the field of image recognition now make it possible to track even moving 3D objects. However, tracking objects with low contrast (e.g., cars), changing bodies (e.g., trees changing through the seasons), or irregular shapes (e.g., food, flowers) is still a challenge when using AR without any supportive tracking systems such as GPS or accelerometers.

The projected virtual objects (2D images, 3D animations, videos or animated 2D images, or sound in the form of background music or explanation/narration) can be accessed either through AR browsers (e.g., Layar, Junaio, Wikitude) or customized AR applications (Mehler-Bicher et al. 2011; Scholz and Smith 2016; Woods et al. 2004). The major advantage of virtual objects, namely that they are not limited by costs or physical constraints, makes AR a practical and powerful tool. Virtual objects can be animated and transformed, can interact with the users in response to their actions, and are not constrained by time, space, or costs (Woods et al. 2004).

Essentially, there are three different types of AR displays:

1. Head-mounted displays (HMDs) are mostly see-through AR glasses like Google Glass, where the goggles serve as conventional (optical) glasses and at the same time as the projection area for the display of any added virtual information (Rauschnabel et al. 2015). With AR glasses, the users can capture the real world around them by naturally moving their head in any direction. The virtual objects are projected on a display in front of the eyes, where the users see the real world enhanced with virtual objects. With HMDs the users can move freely inside a room. Furthermore, markers are not necessary for this interface, and the users can capture any number of objects to receive additional information. However, the user has to wear a device on their body. Another AR interface is the heads-up display, which is not attached to the body and was originally developed for military aviation, where additional information is projected on the windscreen of the airplane. With heads-up displays the users do not have to activate the augmented information with any kind of activity and can look around freely without losing the displayed information. A relatively new development in the sector of interfaces is contact lenses with a wireless data connection. However, integrating all required technical components on a thin, flexible polymer without affecting the eye is still a challenge (Mehler-Bicher et al. 2011).
2. Handheld devices or any kind of wearables use the camera of the device to record an image of the real environment around the user and supplement it with virtual objects shown on the screen of the device. Most mobile devices are already equipped with built-in cameras, accelerometers, microphones, or GPS and combine all components that are necessary for AR in one small device. Therefore, AR is mostly used on mobile devices (Kipper and Rampolla 2012; Yovcheva et al. 2012).
3. Other forms of displays that are capable of showcasing AR are emerging through new developments. In particular, AR mirrors are a new and popular form of display, as the users do not depend on any kind of device or application to experience AR. While the environment in front of the mirror is recorded in “selfie mode,” the users can see themselves interacting with the virtual objects on the screen. AR mirrors are often placed in open spaces (e.g., bus stops), museums, or shopping centers (Azuma et al. 2001; Scholz and Smith 2016).
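The registration step described in this section, in which the tracked camera position and rotation are used to place virtual content precisely in the camera image, can be sketched with a simple pinhole camera model. The following Python fragment is a minimal illustration under strong assumptions: the camera rotates only about the vertical axis, lens distortion is ignored, and the function name, the focal length, and the image center are hypothetical values chosen for the example rather than parameters of any particular AR system.

```python
import math


def project_anchor(point, cam_pos, cam_yaw_rad, focal_px, center_px):
    """Project a world-space anchor point into pixel coordinates.

    Assumes a pinhole camera that is only rotated about the vertical (y) axis.
    At yaw 0 the camera looks along +z, with +x to the right and +y up.
    Returns None when the anchor lies behind the camera.
    """
    # Vector from the camera to the anchor in world coordinates.
    dx = point[0] - cam_pos[0]
    dy = point[1] - cam_pos[1]
    dz = point[2] - cam_pos[2]

    # Express that vector in the camera's own axes (right, up, forward).
    sin_yaw, cos_yaw = math.sin(cam_yaw_rad), math.cos(cam_yaw_rad)
    x_cam = dx * cos_yaw - dz * sin_yaw
    y_cam = dy
    z_cam = dx * sin_yaw + dz * cos_yaw

    if z_cam <= 0:
        return None  # behind the camera, nothing to draw

    # Perspective division; image rows grow downward, hence the minus sign.
    u = center_px[0] + focal_px * x_cam / z_cam
    v = center_px[1] - focal_px * y_cam / z_cam
    return (u, v)


# Hypothetical example: a virtual label anchored 2 m in front of the camera
# and 0.5 m to the right, with an assumed 640x480 image.
print(project_anchor((0.5, 0.0, 2.0), (0.0, 0.0, 0.0), 0.0, 800.0, (320.0, 240.0)))
```

In such a sketch, any error in the tracked position or rotation shifts the projected pixel directly, which is one way to see why the quality of the illusion depends so strongly on tracking accuracy.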
Application of AR
AR is already widely used across different fields. In education (see the overview in Bacca et al. 2014), AR books can enhance learning experiences (Billinghurst et al. 2001). Other examples show how AR can be used in the classroom as a learning activity to enhance students’ learning experience, motivation, and effectiveness as well as their creative skills (Bower et al. 2014; Huang et al. 2016a; Tzima et al. 2019). Entertainment and gaming is one of the most popular fields of AR (Bielli and Harris 2015), as is evident from the success of the AR smartphone game Pokémon GO, which made AR more accessible to consumers (Wingfield and Isaac 2016). AR in retail has been successfully implemented in various applications for customers, e.g., envisioning the living room with new furniture or using “Magic Mirrors” to show clothes, shoes, or makeup on the actual body or face of the customer (Scholz and Smith 2016). In the realm of advertising, AR can also raise awareness of products or services. There are numerous examples of interactive advertising (e.g., IKEA app, Volkswagen billboard), AR packaging, or marketing campaigns with so-called Bogus Windows, where the AR content is projected on a glass screen that simulates a window (e.g., Pepsi Max bus shelter, Walking Dead Campaign in Vienna) (Scholz and Smith 2016). In training or manufacturing, AR can help workers to accomplish their tasks by viewing detailed digital information through smart glasses or HMDs while working on a specific task without distraction (Carmigniani and Furht 2011). Furthermore, medicine is an important field of application for AR. Doctors and surgeons can see useful information projected on smart glasses or directly on the body of the patient (Billinghurst et al. 2014; Carmigniani and Furht 2011). In addition, AR applications like “see-through skin” exist, where the user can see muscles or bone structures projected on a screen while holding the device in front of a person’s body. This can be used to study human anatomy and combines medicine with education (Bichlmeier et al. 2007).

In the tourism industry AR is mostly used on mobile devices, smartphones in particular. As over 70% of tourists (based on a study focused on US travelers) use their smartphones while travelling to search for POIs, activities, or directions (Think with Google 2018), mobile AR is the most popular technology to invest in at the current state of development. Mobile AR is also a good fit for tourists due to its intuitive usability and the natural gesture of holding up the camera for AR tracking, which is similar to taking pictures (Sherman 2011). In research, previous studies of AR in tourism mostly focused on the adoption behavior and acceptance of AR (Jung et al. 2018; Kounavis et al. 2012; Kourouthanassis et al. 2015; tom Dieck et al. 2016). Further research topics in this field are the satisfaction of tourists with AR applications, their content and functionality (Han et al. 2018), and the impact of AR on the intention to visit the destination (Chung et al. 2015; Jung et al. 2015, 2018). In addition, several tourism studies have focused on the tourism experience with AR (He et al. 2018; Neuburger and Egger 2017; Tussyadiah et al. 2018). A further comprehensive review of studies about AR (and VR) in tourism can be found in Wei (2019) and Yung and Khoo-Lattimore (2017).

However, the tourism industry is still at an early stage of adopting AR, due to technological barriers such as the lack of integrated AR browsers in mobile devices and the challenge of outdoor tracking. When designing and developing AR particularly for tourism, it is important to place the tourists and their tourism experience at the center of the development process. The main focus of AR experiences should therefore be on adding value to an already existing touristic experience (Hawkinson 2018). The following sections demonstrate how AR can add value to different touchpoints along the CJ.
Pre-travel Phase
As the pre-travel phase is primarily associated with inspiration, information search, planning, and booking, AR can serve as a marketing tool to enhance any kind of 2D promotional material, such as flyers, brochures, menus, booklets, business cards, or websites, by adding virtual interactive content in 3D (Buhalis and Yovcheva 2013; Shang et al. 2016). Adding interactive elements of gamification to traditional promotional materials can evoke curiosity and joyful anticipation of travelling to a destination (Xu et al. 2016). At the same time, these AR marketing and advertising possibilities can strengthen the effect of narratives and storytelling for a brand or destination by enabling consumers to experience a product or service beforehand (Scholz and Smith 2016). When planning and preparing for a trip, AR can be used to add virtual content to flight tickets or booking confirmations, e.g., by showing real-time weather forecasts for the destination (Augment 2016). Furthermore, tourists can choose their seats on the airplane in AR, compare their luggage with the size stipulated by the airline (Garcia 2017), or track flights by pointing a mobile device at the sky to gather information about planes flying by in real time (Lee 2010).
On-Trip Phase At the destination, AR can be used to enhance the touristic experience, provide more information about places in an interactive way, or support the visitor by facilitating customized real-time navigation. One of the first successful AR applications was “Google Word Lens AR,” which translates written texts and signs in a foreign language when the user simply looks at them through the camera of the application (Zibreg 2017). Azuma (1997) already proposed an AR navigation system with POI suggestions or an information overlay on mountains as a future application in tourism destinations. As of today, Apple Maps has integrated an AR flyover mode (Stein 2017), and Google has proposed its first prototype of an AR navigation feature, “Live View,” integrated in Google Maps (Ellis 2019). Yelp’s AR application augments the real world around the users by suggesting POIs around them and at the same time showing the location of nearby friends (Buhalis and Yovcheva 2013; Scholz and Smith 2016). “Visit Orlando” was one of the first DMOs to integrate AR functionality in its own destination app to explore POIs in the area (May 2016). The example of an AR guide in an aquarium, where virtual penguins lead the visitor to the next POI, shows the combination of interactive elements with navigation (Scholz and Smith 2016).
14 Augmented, Virtual, and Mixed Reality in Tourism
Many apps exist to gather information about destinations (Hamburger 2011), mountains, or stars around the user (Bogomolov 2019; Carmigniani and Furht 2011). Furthermore, hotels have also started to implement AR, such as the “Hub Hotel by Premier Inn,” which integrated a map of the city on the wall of the hotel rooms that can be activated with an AR application to provide guests with additional information on POIs (Bogomolov 2019). The tourism experience can also be connected with AR through the concept of gamification. Studies show that location-based AR games, similar to geocaching (Hawkinson 2018), can increase place attachment, especially through satisfaction and positive emotions provoked by the game (Oleksy and Wnuk 2017). AR thus offers destinations or theme parks the potential to provide relevant content to tourists while providing the entertainment aspect of play (Jung et al. 2015). Moreover, travelers often want to leave their footprints at the destination by, e.g., writing their name or a message on a wall. To avoid such graffiti-related problems, travelers can create a virtual message or artwork with AR and share its location in the destination with friends and family. When scanning the same place with a mobile device, they are able to see the virtual message (Deans 2017). In the context of cultural heritage sites or museums, AR can be used as an interactive mobile guide that enhances the information provided through interactive virtual contextualization (Choudary et al. 2009; Chung et al. 2018; Neuburger and Egger 2017). The mobile AR application can serve as a tour guide through a place or a museum, supporting people with navigation while providing useful content (Kourouthanassis et al. 2015). Furthermore, AR can reanimate historical heritage sites and make them tangible for visitors by reconstructing historical building structures or simulating scenes to show how life was in the past (tom Dieck and Jung 2018; Vlahakis et al. 2001).
Mobile AR tour guides can be used not only in the specific case of heritage sites but also in cities or national parks to provide tourists with additional interactive information on specific trails (Han et al. 2013). The augmentation of the surroundings of visitors can also be a useful tool for impaired tourists. By tracking the environment and delivering necessary information to impaired tourists, AR not only ensures the safety of visitors in a destination but also enhances their experience by delivering customized information in real time (Carmigniani and Furht 2011). Moreover, AR is a good tool to enhance interactivity in museums. The example in Fig. 5 shows how AR can be used to enhance the experience in museums, to facilitate the visitor’s interpretation of exhibitions, and to make artifacts more accessible (Neuburger and Egger 2017). In addition, AR can increase educational benefits in museums, science centers, or libraries (Woods et al. 2004) and connect the visitor to the history of heritage (Azuma 1997).
Post-travel Phase After returning home from travelling, visitors primarily want to share more of their experiences with friends and family through social media (ubiquitous Wi-Fi enables
Fig. 5 AR in the museum
sharing experiences during the trip itself), so as to memorialize past experiences. As a consequence, AR is becoming more popular and has been adopted by social media channels such as Snapchat, Facebook, or Instagram, which offer AR filters combined with face recognition to add effects to or improve travel pictures or selfies (Bullock 2018). With wearables and mobile devices becoming ubiquitous in everyday life, AR can help travelers to preserve and revive travel memories. Going through travel pictures or videos has therefore become more interactive. AR can bring travel photos or photo books to life by using printed photos as a trigger to enable the projection of videos or photo storylines (Stam 2016; Wong 2017). Postcards or souvenirs can also be augmented with AR content that was recorded at the destination or with informational videos that were created by the destination or attraction itself (Henze and Boll 2011; Shang et al. 2016). In that way, AR can help tourists to stay connected with the destination and get inspired for their next trip. A project in Namibia, where tourists can buy magnets augmented with videos that were produced together with indigenous people from the Donkerbos San community, shows the potential of AR to be used for social good as part of inclusive, collaborative projects (ICTech Hub 2019).
Virtual Reality Academics often cite VR as a technology that can profoundly alter the tourism industry (Beck et al. 2019; Guttentag 2010; Tussyadiah et al. 2017). Over the last 10 years, this technology has undergone a dramatic technical evolution, and its application in the realm of tourism is finally gaining momentum. In contrast to AR, which puts a virtual layer on top of the real-world view, VR is often described as a virtual computer-simulated world (Desai et al. 2014). A multitude of attempts try to
define the term virtual reality, showing that a notable discrepancy exists regarding its definition (Najafipour et al. 2014). Beck et al. (2019) argue that most tourism studies dealing with VR provide a general definition and therefore neglect the tourism context. Their comprehensive definition seems to be helpful for a subject-specific discussion and will therefore be adopted for this paper: Virtual Reality (VR), in a tourism context, creates a virtual environment (VE) by the provision of synthetic or 360-degree real life captured content with a capable non-, semi-, or fully-immersive VR system, enabling virtual touristic experiences that stimulate the visual sense and potentially additional other senses of the user for the purpose of planning, management, marketing, information exchange, entertainment, education, accessibility or heritage preservation, either prior to, during or after travel (Beck et al. 2019).
The development of VR solutions is preceded by a long history, based on the desire of people to be able to leave their real space at will (Riva et al. 2003). This aspiration is in part comparable with travel motives, the desire to be able to leave the familiar environment temporarily. As early as the mid-1990s, the potential of VR to revolutionize the tourism industry was pointed out (Hobson and Williams 1995; Williams and Hobson 1995). Guttentag’s contribution (2010) can be considered a pivotal study in a decade-long academic discussion on the application, use, and implications of VR in tourism. Since then, numerous contributions have attempted to shed light on individual facets of the topic both conceptually and empirically. Beck et al. (2019) identified 60 papers in their state-of-the-art review on VR in tourism, distinguishing between non-, semi-, and fully immersive systems while analyzing them. In those papers, it becomes clear what current VR articles in the tourism sector are about. For example, Marasco et al. (2018) did research on “behavioral intentions to visit/revisit cultural heritage sites” as well as “perceived visual appeal”; Beck and Egger (2018) on the “intensity of triggered emotions”; Hopf et al. (2020) on the impact of multisensory VR; Tussyadiah et al. (2018) on the “sense of presence, post-VR attitudes toward the destination, and enjoyment of VR experiences”; Disztinger et al. (2017) on the “behavioral intention to use VR for travel planning”; Marchiori et al. (2017) on the “different sense of the VR experience”; and Jung et al. (2016) on the “four realms of experience economy,” just to single out a few research findings. There is also a literature review by Wei (2019) on the “Research Progress on Virtual Reality and Augmented Reality in Tourism and Hospitality.” This meta-analysis shows the main research developments of VR and AR and provides a good overview of the current state of research. In addition, it poses several worthwhile questions for the future.
Immersion and Presence Two concepts that are directly associated with VR technology – unfortunately often not considered separately – are immersion and presence. Both constructs are VR-specific, thus distinguishing VR from other media (Slater and Sanchez-Vives 2016). Immersion can be considered as a measurable variable consisting of four components that characterize aspects of display technology. (1) Inclusiveness describes the
degree to which the physical reality is omitted, (2) surrounding describes the extent to which the display allows a panoramic view, (3) extensiveness describes the variety of sensory modalities accommodated, and (4) vividness refers to aspects such as fidelity and resolution (Slater and Wilbur 1997). Immersion can thus be seen as an objective construct, explaining the physical configuration. The use of an HMD therefore always results in a fully immersive experience, as it completely secludes users from their environment. In the case of semi- or non-immersive systems, as with AR, the user is at least partially in contact with the environment (Azuma et al. 2001). Unlike immersion, presence is a subjectively perceived construct (Gutiérrez et al. 2008; Slater and Sanchez-Vives 2016), which expresses the degree of “actually being there” (Ijsselsteijn and Riva 2003). Dörner et al. (2013) show that the degree of immersion, as well as the intensity of presence, increases with more sophisticated VR technologies. The goal is for the user to dive fully into the VE to experience a maximum of VR. Scholl et al. (2019) explain that additional sensory measures can further optimize the VR experience. They examined the effects of odor and haptic influences on presence. Scents, wind, heat, humidity, and hot air were used as a fourth dimension to immerse the user in a virtual world that feels and looks as realistic as possible. The possibility of adding extensive sensory information to the VR experience is of particular interest to the tourism sector (Guttentag 2010), as it offers the opportunity to make a positive impact on the user’s travel decision-making process (Scholl et al. 2019). Both the level of involvement and interactivity as well as the degree of immersion significantly influence the construct of presence (Diemer et al. 2015; Kim 2005).
Technological Background Based on the abovementioned definition, some technical aspects can be considered systematically. For example, in their definition, Beck et al. (2019) explicitly point out the difference between virtual, computer-generated content and 360° videos. Slater and Sanchez-Vives (2016) also emphasize that both synthetically generated and real-captured images can create VEs, whereas the reproduction of reality is not necessarily the goal of VR. The history of VR technology dates back to the 1960s, and for a long time, it was exclusively computer-generated virtual worlds that allowed the user to freely navigate within them. However, practice has shown that the production of virtual worlds is complex and expensive, and so far, only a limited degree of verisimilitude has been achieved. For some years, VR experiences have been primarily produced with 360° images and videos (La Valle 2016). Until recently, the production process was labor-intensive due to the simultaneous use of multiple cameras and the subsequent spherical stitching of the individual image sections (the directions of view recorded by different cameras had to be combined into one single 360° image). Now the first 360° cameras with resolutions greater than 5K are affordable in the consumer segment. This allows anyone to create and share VR experiences without the need for technical knowledge.
It must be assumed, however, that synthetic virtual worlds will become more important again in the future because new methods, such as photogrammetry and 3D scans, allow a photorealistic reproduction of real spaces. While it is not possible for the user to leave the path of the cameraman when viewing 360° videos, synthetically produced spaces enable the user to navigate autonomously in a virtual copy of the real space, which is helpful for the feeling of “presence,” a term explained in the section on immersion and presence. Besides the differentiation between synthetic and 360° real-life captured content, Beck et al. (2019) also distinguish between non-, semi-, and fully immersive VR systems in their definition. Non-immersive systems are desktop-based solutions that allow one to view 360° videos on a screen. The user interacts with a mouse or keyboard. Non-immersive systems are thus the simplest and most widely used VR variant to date. Platforms such as Facebook or YouTube support the playback of 360° videos and make a non-immersive experience accessible to the masses without the need for additional hardware or installations. In contrast, semi-immersive systems project one or more screens onto the walls and floor of a room, allowing for a multi-user experience. In most cases, these projections are supplemented by 3D sound (Dörner et al. 2013; Gutiérrez et al. 2008). Finally, fully immersive systems are systems in which the user wears an HMD. These solutions emerged in parallel with the evolution of smartphones, as many HMDs use the mobile device as a playback device. These include “Google Cardboard” – the simplest HMD, folded from cardboard – where the smartphone can be inserted at the front end. “Samsung Gear” and similar devices are compact headsets with remote controls, where the screen of the smartphone is also used to display the VR content.
Currently, more and more complex high-resolution HMDs are available on the market, which are primarily developed for gaming and often provide controllers for user-object interaction. HMD solutions such as the HTC Vive, Oculus Rift, Oculus Go, or Oculus Quest support both synthetic and real-captured 360° worlds, enabling head tracking and other interaction capabilities, ultimately leading to an enhanced user experience (Disztinger et al. 2017; Marchiori et al. 2017; Munster et al. 2015). A consequence common to all fully immersive systems is the complete isolation of the user from the real environment. Such systems provide a sense of embodiment, since the users experience the feeling that the VR device is part of their body (Flavián et al. 2019). Therefore, the level of embodiment can be linked to user experience (Tussyadiah et al. 2017) and plays a critical role in the creation of immersive experiences (Dawley and Dede 2014). In order to create the most realistic VR experience possible, HMDs must meet some technical requirements. Each VR headset is equipped with either one screen or two screens, one for each eye. To accommodate the eyes’ movements and position, two autofocus lenses are placed between the screens and the eyes. The images shown are rendered either by a computer or a mobile phone, depending on the HMD. For a fully immersive VR experience at least 60 frames per second (fps), a corresponding refresh rate of less than 20 milliseconds and a field of view (FOV) of
at least 100 degrees (ideally 180 degrees) are required. The frame rate indicates the number of images the GPU can render per second. This is between 90 and 120 Hz for the most common models. The screen refresh rate, in turn, indicates how often the screen can display a new image. The FOV describes the extent of the observable environment. The larger the FOV, the stronger the sense of presence in the VR experience. HTC Vive and Oculus Rift offer a 110-degree field of view, Google Cardboard offers 90 degrees, and the Pimax 8K HMD offers a 200-degree FOV (Mirabite 2019). These technical specifications are the prerequisites for users to have an optimal VR experience. For this to happen, all of the factors mentioned must interact in the right ratio. If, for example, the frame rate does not match the screen refresh rate, the risk is higher that the user will experience motion sickness or cybersickness, which can manifest as headaches, dizziness, and nausea. In order to provide the user with the desired FOV, eye and head tracking is necessary. This is usually done with LED lights, laser pointers, or sensors. Accelerometers can detect movement due to the increase or decrease of speed, gyroscopes are used for position control, and magnetometers measure the direction, strength, or change of a magnetic field and are therefore able to identify the relative position to the earth. Using a 6DoF (six degrees of freedom) system, head movements are decomposed into translations along and rotations about the X, Y, and Z axes. Thus, all directions of view as well as forward and backward movements can be detected, and the corresponding correct image can be displayed.
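The thresholds mentioned above can be made concrete with a little arithmetic. The following Python sketch is only an illustration and rests on several assumptions: it reads the 20-millisecond figure as an upper bound on the time available per frame, the names `Headset`, `frame_interval_ms`, and `meets_thresholds` are hypothetical, and the refresh rates assigned to the three devices are illustrative values that are not taken from the sources cited in this chapter. Only the FOV values come from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text above.
MIN_FPS = 60.0        # minimum frames per second
MAX_FRAME_MS = 20.0   # ASSUMPTION: read as the maximum time per frame in ms
MIN_FOV_DEG = 100.0   # minimum field of view in degrees


@dataclass(frozen=True)
class Headset:
    name: str
    fov_deg: float     # field of view as quoted in the text
    refresh_hz: float  # ASSUMPTION: illustrative value, not from the text


def frame_interval_ms(rate_hz: float) -> float:
    """Time available per frame, in milliseconds, at the given rate."""
    return 1000.0 / rate_hz


def meets_thresholds(h: Headset) -> bool:
    """Check a headset against the three thresholds defined above."""
    return (
        h.refresh_hz >= MIN_FPS
        and frame_interval_ms(h.refresh_hz) < MAX_FRAME_MS
        and h.fov_deg >= MIN_FOV_DEG
    )


headsets = [
    Headset("HTC Vive", 110.0, 90.0),
    Headset("Google Cardboard", 90.0, 60.0),
    Headset("Pimax 8K", 200.0, 80.0),
]

for h in headsets:
    print(
        f"{h.name}: {frame_interval_ms(h.refresh_hz):.1f} ms per frame, "
        f"FOV {h.fov_deg:.0f} degrees, "
        f"meets thresholds: {meets_thresholds(h)}"
    )
```

Under this reading, 60 fps corresponds to roughly 16.7 milliseconds per frame, which is why the two figures are compatible. In the sketch, the cardboard viewer fails only the FOV criterion.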
Application of VR The uses of VR in tourism are diverse, ranging from information, entertainment, education, planning, and management (Wiltshier and Clarke 2017) to first attempts to map transactions via VR. However, as recent studies highlight, VR is not just another marketing tool (Huang et al. 2016b), since it can enhance the processes along the CJ (Disztinger et al. 2017; Neuhofer et al. 2014; Tussyadiah et al. 2016). From a customer-centric view, it therefore makes sense to discuss the touchpoints along the CJ, namely, the pre-trip, on-trip, and post-trip phases, to highlight the potential of VR to enrich the customer experience and to create additional value.
Pre-travel Phase VR has a high potential to influence travelers – especially in the inspirational pre-travel phase. Tourist products are usually expensive and difficult to explain and cannot be tested in advance (Cho et al. 2002; Egger 2015). VR gives the opportunity to preview essential parts of the journey beforehand. Users can engage with an enriched kind of information (Marchiori et al. 2017), which results in an enhanced decision-making process, as uncertainties regarding the product in the
sense of a “try before you buy” experience can be massively decreased (Huang et al. 2016b; Tussyadiah et al. 2017). Guttentag (2010) points to an increased destination awareness through the use of VR, combined with the desire to visit the place in reality. Studies show that prospective travelers readily accept VR as an information medium and recognize its added value over traditional media. VR can thus be seen as a new marketing tool (Marchiori et al. 2018), which provides customers with realistic impressions of destinations, attractions, and other tourist services in advance. Meanwhile, all types of service providers are represented with VR experiences. Hotels present their rooms, restaurants, and wellness areas, and airlines like Etihad advertise their luxury departments with 360° films. Thomas Cook integrates VR into travel agencies to support sales pitches, and destinations worldwide give a pre-trip experience of their landscapes, cultural monuments, and sightseeing spots as part of their marketing campaigns (Beck and Egger 2018). With the advent of 360° cameras with correspondingly high resolutions, tourists will be able to share their impressions through a virtual experience with other users on YouTube, Facebook, or other social media platforms. These VR word-of-mouth advertisements in turn encourage the peer group to produce content. To date, examples of AR and VR applications during the booking phase, in particular for fully immersive VR, are rare. Most 360-degree videos are showcases of hotel facilities and are limited in their possible interactions.
On-Trip Phase The use of HMDs allows virtual walkthroughs of real locations, which raises the question of what contribution VR can make when tourists are already at the destination. However, many examples show that in certain situations or for certain groups of people, VR can decisively support the experience at the destination. For heritage or tourism sites that are not accessible due to protection or danger, or that no longer exist, VR offers a way to make inaccessible sites accessible (Beck and Egger 2018; Jung et al. 2016). The “Grotte de Lascaux,” which houses well-known cave paintings near the French town of Montignac, remains closed to visitors in order to prevent further damage. However, a VR experience of the cave is available to visitors at the Centre International de l’Art Pariétal in Montignac. Another example is Project Mosul. In 2015 the Islamic State destroyed antiquities in and around the city of Mosul in northern Iraq. Crowd-sourced images of artifacts were used to reconstruct the antiquities via photogrammetry and to make the museum accessible via VR (Fig. 6). Inaccessible parts of Chernobyl as well as volcanic craters are also accessible via VR projects. VR also allows elderly tourists or people with disabilities to visit places they otherwise would not be able to. But destinations also apply VR in order to enhance existing experiences on site. Theme parks create new attractions or rejuvenate existing roller coasters with VR applications like “The Great LEGO Race” in Legoland, Florida (Neuburger et al. 2018), or the Dare Devil Dive VR Roller Coaster. People wear HMDs while being on the roller coaster ride, simultaneously transported into a
Fig. 6 Project Mosul
virtual world. In Germany, the thermal spring “Erding” offers the world’s first VR snorkeling experience. Bathers can borrow a VR HMD and then explore the water by swimming through coral reefs, shipwrecks, and underwater caves. In the middle of Bavaria, guests swim with turtles, colorful fish, hammerhead sharks, and whales (Therme Erding 2019).
Post-travel Phase To repeat the travel experience at a later time, the same applications as in the pre-trip phase can be used again at this stage. Tourists will mentally review their journey and share their travel experiences on social media channels. Numerous social media platforms like Facebook and YouTube nowadays make it possible to publish consumer-generated 360° videos. For example, the platform Facebook 360 is a hub that is explicitly intended for VR content. Action cams such as the “Insta360 ONE” enable the production of VR content for everyone. With the hardware and software now available and social media platforms optimized for VR content, all prerequisites for producing and sharing travel-related VR content are in place. People want to share their positive experiences with their friends and families back home. At the same time, this shared content can be an inspiration for other people during the pre-trip phase when looking for information and inspiration online (Beck et al. 2019).
Mixed Reality In Milgram’s reality-virtuality continuum (Fig. 1), the left side defines the environment that only consists of real objects and contains all aspects, which appear when watching a real scene in person or through a display, whereas the right side describes surroundings only consisting of virtual objects. Hence, MR in this framework is defined as an environment where real and virtual objects are combined (Milgram and Kishino 1994).
AR and VR are thus subsumed under the term MR. However, while AR limits the user’s interaction possibilities to the screen area of the device, VR blocks out the user’s perception of the real world. A combination of both systems can therefore overcome these limitations – calling for MR (Danado et al. 2005). MR allows users to share their virtual experience with each other, interact in both the virtual and the real world, and enjoy shared “phygital” experiences. Noh et al. (2009, p. 51) note that “mixed reality refers to space which consists of real and virtual elements that interact with each other.” A clear conceptual distinction between AR, VR, and MR is often not made, so the terminology blurs, which leads to confusion (Jeon and Choi 2009). Flavián et al. (2019) indicate that, given recent industry launches labeled as MR, MR should no longer be considered as the broad section along the continuum, but rather as an independent domain. They suggest an independent dimension called “pure mixed reality” (PMR) where virtual content “is not superimposed on the physical environment (as in AR)” or “virtual objects are rendered so that they are indistinguishable from the physical world” (Flavián et al. 2019, p. 549). In a PMR environment, users can simultaneously interact with the real and virtual worlds. Currently the Microsoft HoloLens and the Magic Leap solutions are technologies that support this understanding of MR. The user therefore sees virtual objects that fit in real space. For example, virtual objects can be seen on a table, but not when you look under the table. Physical reality and VR merge in real time, with physical space dictating how virtual objects should behave. Primarily due to the lack of technical solutions, PMR is limited to prototypical projects, such as “Holo Tour,” an app that allows users to walk around and explore panoramic settings of Rome or Machu Picchu in a PMR mode.
Conclusion With the advent of AR, VR, and MR, the concept of tourist space has been redefined. Physical and virtual spaces coexist and intermingle, resulting in new forms of action and interaction along the entire CJ. The examples given in the chapter demonstrate the paradigm shift that is currently taking place. New hardware and software available on the market become outdated quickly, and while quality and performance are improving rapidly, the price of new devices is steadily declining in the consumer segment. However, these technologies are still mainly used in the gaming segment, and one hopes that they will be more widely used in other areas such as the tourism industry in the future. Customers need to further embrace the technology, as any lack of acceptance will probably inhibit investment projects on the part of the tourism industry. Applications of virtual technologies are available at almost all customer touchpoints along the CJ and show their potential to enrich the customer experience on the information level. At the transactional level, there are still barely any presentable examples, which is mainly due to the lack of interaction between users and the systems. However, the numerous innovative best practice examples available demonstrate how the consumer experience can be enhanced by virtual technologies.
In the future, additional senses will enrich the virtual experience, leading to an enhanced experience that is increasingly similar to the real one. At the same time, it will lead to an expansion and optimization of reality that is unparalleled. Furthermore, there will be shared collaborative experiences of AR, VR, and MR. In the future, the physical reality will be virtualized with the help of 3D scans in order to transfer the recorded point clouds to the end user in real time. There the point clouds will be rendered and transformed into a photorealistic VR experience. In this sense, point clouds or AR/VR clouds will provide digital versions of spatial units where agents can be located and appropriate mapping and positioning will enable a realistic interaction with digital objects in the “merged space” (Bucko 2019). AR, VR, and MR have the potential to alter the entire tourism industry, compelling everyone involved to rethink their traditional ways of thinking and acting. The future will show whether the visitation of real places will be substituted by the use of new technologies, or whether the reverse will triumph, where it might become increasingly important for tourists to experience real places to undergo a detoxification of their digital or even virtual everyday lives.
Cross-References
Acceptance and Adoption of eTourism Technologies
Consumer Behavior in e-Tourism
Tourists and Augmented and Virtual Reality Experiences
Virtual Reality and the End of Tourism? A Substitution Acceptance Model
References
Augment (2016) Augmented reality applications in the tourism industry. Retrieved from https://www.augment.com/blog/augmented-reality-in-tourism/
Azuma R (1997) A survey of augmented reality. Presence Teleop Virt Environ 6(4):355–385. https://doi.org/10.1.1.30.4999
Azuma R, Baillot Y, Behringer R, Feiner S, Julier S, MacIntyre B (2001) Recent advances in Augmented Reality. IEEE Comput Graph Appl 21(6):34–47. https://doi.org/10.3949/ccjm.47.2.73
Bacca J, Baldiris S, Fabregat R, Graf S, Kinshuk (2014) A systematic review of research and applications. Educ Technol 17(4):133–149. ISSN: 1436–4522
Beck J, Egger R (2018) Emotionalise me: self-reporting and arousal measurements in virtual tourism environments. In: Stangl B, Pesonen J (eds) Information and Communication Technologies in Tourism 2018. Springer, Cham, pp 3–15
Beck J, Rainoldi M, Egger R (2019) Virtual reality in tourism: a state-of-the-art review. Tour Rev 74:586–612. https://doi.org/10.1108/TR-03-2017-0049
Bichlmeier C, Wimmer F, Heining SM, Navab N (2007) Contextual anatomic mimesis: hybrid in-situ visualization method for improving multi-sensory depth perception in medical augmented reality. In: 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, ISMAR, pp 129–138. https://doi.org/10.1109/ISMAR.2007.4538837
Bielli S, Harris CG (2015) A mobile augmented reality system to enhance live sporting events. In: Proceedings of the 6th Augmented Human International Conference. ACM, pp 141–144. https://doi.org/10.1145/2735711.2735836
Billinghurst M, Clark A, Lee G (2014) A survey of augmented reality. Found Trends Human-Comput Interact 8(2–3):73–272. Retrieved from http://www.nowpublishers.com/article/Details/HCI-049
Billinghurst M, Kato H, Poupyrev I (2001) The MagicBook: a transitional AR interface. Comput Graph (Pergamon) 25(5):745–753. https://doi.org/10.1016/S0097-8493(01)00117-0
Bogomolov V (2019) Top 5 ideas how to use AR in tourism. Retrieved from https://www.hospitalitynet.org/opinion/4092421.html
Bower M, Howe C, McCredie N, Robinson A, Grover D (2014) Augmented Reality in education – cases, places and potentials. Educ Media Int 51(1):1–15. https://doi.org/10.1080/09523987.2014.889400
Bucko M (2019) Augmented reality is the operating system of the future. AR cloud is how we get there. Retrieved from https://www.forbes.com/sites/johnkoetsier/2019/02/21/augmentedreality-is-the-operating-system-of-the-future-ar-cloud-is-how-we-get-there/#b487eee25fb3
Buhalis D, Law R (2008) Progress in information technology and tourism management: 20 years on and 10 years after the Internet-The state of eTourism research. Tour Manag 29(4):609–623. https://doi.org/10.1016/j.tourman.2008.01.005
Buhalis D, Yovcheva Z (2013) Augmented reality in tourism: 10 unique applications explained. Retrieved from http://thinkdigital.travel/wp-content/uploads/2013/04/10-AR-Best-Practicesin-Tourism.pdf
Bullock L (2018) AR and social media: is augmented reality the future of social media? Retrieved from https://www.forbes.com/sites/lilachbullock/2018/11/16/ar-and-social-mediais-augmented-reality-the-future-of-social-media/#5a0fc2ede141
Carmigniani J, Furht B (2011) Augmented reality: an overview. In: Handbook of augmented reality. Springer, New York, NY, pp 3–46
Cho Y, Wang Y, Fesenmaier DR (2002) Searching for experiences: the web-based virtual tour in tourism marketing. J Travel Tour Mark 12(4):1–17. https://doi.org/10.1300/J073v12n04_01
Choudary O, Charvillat V, Grigoras R, Gurdjos P (2009) MARCH: Mobile Augmented Reality for Cultural Heritage. In: Proceedings of the 17th ACM International Conference on Multimedia. ACM, pp 1023–1024. Retrieved from https://dl.acm.org/citation.cfm?id=1631500
Chung N, Han H, Joun Y (2015) Tourists’ intention to visit a destination: the role of augmented reality (AR) application for a heritage site. Comput Hum Behav 50:588–599. https://doi.org/10.1016/j.chb.2015.02.068
Chung N, Lee H, Kim JY, Koo C (2018) The role of augmented reality for experience-influenced environments: the case of cultural heritage tourism in Korea. J Travel Res 57(5):627–643. https://doi.org/10.1177/0047287517708255
Danado J, Dias E, Romão T, Correia N, Trabuco A, Santos C, ... Câmara A (2005) Mobile Environmental Visualization. Cartogr J 42(1):61–68. https://doi.org/10.1179/000870405x57293
Dawley L, Dede C (2014) Situated learning in virtual worlds and immersive simulations. In: Handbook of research on educational communications and technology. Springer, New York, pp 723–734
Deans R (2017) Graffiti in augmented reality could be the next avenue for street art. Retrieved from https://qz.com/1072528/the-next-trend-in-street-art-will-be-graffiti-in-augmented-reality/
Desai PR, Desai PN, Ajmera KD, Mehta K (2014) A review paper on oculus rift. a virtual reality headset. Int J Eng Trends Technol 13(4):175–179
Disztinger P, Schlögl S, Groth A (2017) Technology acceptance of virtual reality for travel planning. In: Schegg R, Stangl B (eds) Information and Communication Technologies in Tourism 2017. Springer, Cham, pp 255–268
Diemer J, Alpers GW, Peperkorn HM, Shiban Y, Mühlberger A (2015) The impact of perception and presence on emotional reactions: a review of research in virtual reality. Front Psychol 6:26.
Dörner R, Jung B, Grimm P, Broll W, Göbel M (2013) Einleitung. In: Dörner R, Broll W, Grimm P, Jung B (eds) Virtual und Augmented Reality (VR/AR). Springer, Berlin, pp 1–32
Egger R (2015) Die Welt wird phygital. Metamorphosen touristischer Räume. In: Egger R, Luger K (eds) Tourismus und mobile Freizeit: Lebensformen, Trends, Herausforderungen Books on Demand
338
R. Egger and L. Neuburger
Ellis C (2019) Google Maps adds augmented reality navigation, beating Apple to the punch. Retrieved from https://www.techradar.com/news/google-maps-augmented-reality-apple-maps Flavián C, Ibáñez-Sánchez S, Orús C (2019) The impact of virtual, augmented and mixed reality technologies on the customer experience. J Bus Res 100:547–560. https://doi.org/10.1016/j. jbusres.2018.10.050 Garcia M (2017) This App uses augmented reality to show you your seat before you even step on the plane. Retrieved from https://www.travelandleisure.com/travel-tips/mobile-apps/app-inthe-air-vr-flight-booking Gartner (2018) 5 Trends emerge in the gartner hype cycle for emerging technologies, 2018. Retrieved from https://www.gartner.com/smarterwithgartner/5-trends-emerge-in-gartner-hypecycle-for-emerging-technologies-2018/ Gartner (2019) 5 Trends Appear on the Gartner Hype Cycle for Emerging Technologies, 2019. Retrieved from https://www.gartner.com/smarterwithgartner/5-trends-appear-on-the-gartnerhype-cycle-for-emerging-technologies-2019/ Gutiérrez MAA, Vexo F, Thalmann D (2008) Stepping into virtual reality. Springer, London Guttentag DA (2010) Virtual reality: applications and implications for tourism. Tour Manag 31(5):637–651. https://doi.org/10.1016/j.tourman.2009.07.003 Hamburger E (2011) Best Augmented Reality Apps For iPhone and iOS. Retrieved from https:// www.businessinsider.com/best-augmented-reality-apps-for-iphone-and-ios-2011-3 Han DI, Jung T, Gibson A (2013) Dublin AR: implementing augmented reality in tourism. In: Xiang Z, Tussyadiah I (eds) Information and Communication Technologies in Tourism 2014. Springer, Cham, pp 511–523 https://doi.org/10.1007/978-3-319-03973-2_37 Han DI, tom Dieck MC, Jung T (2018) User experience model for augmented reality applications in urban heritage tourism. J Herit Tour 13(1), 46–61. https://doi.org/10.1080/1743873X.2016. 1251931 Hawkinson E (2018) Augmented tourism: definitions and design principles. 
Invent J Res Technol Eng Manag 2(9):33–39 He Z, Wu L, Li X (2018) When art meets tech: the role of augmented reality in enhancing museum experiences and purchase intentions. Tour Manag 68:127–139. https://doi.org/10.1016/ j.tourman.2018.03.003 Henze N, Boll S (2011) Who’s that girl? Handheld augmented reality for printed photo books. In: Proceedings from IFIP Conference on Human-Computer Interaction. Springer, Berlin, pp 134– 151. https://doi.org/10.1007/978-3-642-23765-2_10 Hobson JSP, Williams AP (1995) Virtual reality: a new horizon for the tourism industry. J Vacat Mark 1(2):125–135. https://doi.org/10.1177/135676679500100202 Hopf J, Scholl M, Neuhofer B, Egger R (2020) Exploring the impact of multisensory VR on travel recommendation: a presence perspective. Inf Commun Technol Tourism 2020. Springer, pp 169–180 Huang TC, Chen CC, Chou YW (2016a) Animating eco-education: To see, feel, and discover in an augmented reality-based experiential learning environment. Comput Educ 96:72–82. https:// doi.org/10.1016/j.compedu.2016.02.008 Huang YC, Backman KF, Backman SJ, Chang LL (2016b) Exploring the implications of virtual reality technology in tourism marketing: an integrated research framework. Int J Tour Res 18:116–128. https://doi.org/10.1002/jtr.2038 ICTech Hub (2019) Augmented reality (AR) souvenirs. Retrieved from https://ictechhub.com/sanaugmented-reality-bracelets/ IDC (2019) Global augmented/virtual reality market size 2016–2023. Retrieved from https://www. statista.com/statistics/591181/global-augmented-virtual-reality-market-size/ Ijsselsteijn WA, Riva G (2003) Being there: the experience of presence in mediated environments In: Riva G, Davide F, Ijsselsteijn WA (eds) Being there: concepts, effects and measurement of user presence in synthetic environments. Ios Press, Amsterdam, pp 4–16 Jeon S, Choi S (2009) Haptic augmented reality: taxonomy and ana example of stiffness modulation. Presence Teleop Virt Environ 18(5):387–408. 
https://doi.org/10.1162/pres.18.5.387
14 Augmented, Virtual, and Mixed Reality in Tourism
339
Jung T, Chung N, Leue MC (2015) The determinants of recommendations to use augmented reality technologies: The case of a Korean theme park. Tour Manag 49:75–86. https://doi.org/10.1016/ j.tourman.2015.02.013 Jung TH, Lee H, Chung N, tom Dieck MC (2018) Cross-cultural differences in adopting mobile augmented reality at cultural heritage tourism sites. Int J Contemp Hosp Manag 30(3):1621– 1645. https://doi.org/10.1108/IJCHM-02-2017-0084 Jung T, tom Dieck MC, Lee H, Chung N (2016) Effects of virtual reality and augmented reality on visitor experiences in museum. In: Inversini A, Schegg R (eds) Information and Communication Technologies in Tourism 2016. Springer, Cham, pp 621–635 Kipper G, Rampolla J (2012) Augmented reality: an emerging technologies guide to AR. Elsevier, Waltham Kim GJ (2005) Designing virtual reality systems: the structured approach. Springer, London Kounavis CD, Kasimati AE, Zamani ED (2012) Enhancing the tourism experience through mobile augmented reality: Challenges and prospects. Int J Eng Business Manag 4(1):1–6. https://doi. org/10.5772/51644 Kourouthanassis P, Boletsis C, Bardaki C, Chasanidou D (2015) Tourists responses to mobile augmented reality travel guides: the role of emotions on adoption behavior. Pervasive Mob Comput 18:71–87. https://doi.org/10.1016/j.pmcj.2014.08.009 La Valle SM (2016) Virtual reality. Cambridge University Press, Cambridge Lee K (2010) Plane finder AR tracks planes in the sky. Retrieved from https://www.laptopmag. com/articles/plane-finder-ar-tracks-planes-in-the-sky Li H, Gupta A, Zhang J, Flor N (2018) Who will use augmented reality? An integrated approach based on text analytics and field survey. Eur J Oper Res. https://doi.org/10.1016/j. ejor.2018.10.019 Madden L (2011) Professional augmented reality browsers for smartphones: programming for Junaio, Layar and Wikitude. 
John Wiley & Sons, Chichester Marasco A, Buonincontri P, van Niekerk M, Orlowski M, Okumus F (2018) Exploring the role of next-generation virtual technologies in destination marketing. J Destin Mark Manag 9:138–148. https://doi.org/10.1016/j.jdmm.2017.12.002 Marchiori E, Niforatos E, Preto L (2018). Analysis of users’ heart rate data and self-reported perceptions to understand effective virtual reality characteristics. Inf Technol Tourism 18(1– 4):133–155 Marchiori E, Niforatos E, Preto L (2017) Measuring the media effects of a tourism-related virtual reality experience using biophysical data. In: Schegg R, Stangl B (eds) Information and Communication Technologies in Tourism 2017. Springer, Cham, pp 203–215 May K (2016) Visit Orlando claims app first – artificial intelligence and augmented reality in one Retrieved from https://www.phocuswire.com/Visit-Orlando-claims-app-first-artificialintelligence-and-augmented-reality-in-one Mehler-Bicher A, Reiss M, Steiger L (2011) Augmented reality: theorie und praxis. Oldenbourg, Munich Milgram P, Kishino F (1994) Taxonomy of mixed reality visual displays. IEICE Trans Inf Syst 77(12):1321–1329. Retrieved from https://search.ieice.org/bin/summary.php?id=e77-d_ 12_1321 Mirabite M Jr (2019) New, Emergent, and Interactive Media. Manny Mirabite, Jr. Munster G, Jakel T, Clinton D, Murphy E (2015) Next mega tech theme is virtual reality. Retrieved from https://piper2.bluematrix.com/sellside/EmailDocViewer?encrypt=052665f6-3484-40b7b972-bf9f38a57149&mime=pdf&co=Piper&id=reseqonly@pjc.com&source=mail Najafipour AA, Heidari M, Foroozanfar MH (2014) Describing the virtual reality and virtual tourist community. applications and implications for tourism industry. Arab J Bus Manag Rev 3(12a):12–23. https://doi.org/10.12816/0018842 Neuburger L, Beck J, Egger R (2018) The ‘Phygital’ tourist experience: the use of augmented and virtual reality in destination marketing. In: Camilleri MA (ed) Tourism planning and destination marketing. 
Emerald Publishing, Bingley, pp 183–202
340
R. Egger and L. Neuburger
Neuburger L, Egger R (2017) An afternoon at the museum: through the lens of augmented reality. In: Schegg R, Stangl B (eds) Information and Communication Technologies in Tourism 2017, pp 241–254. Springer, Cham Neuhofer B, Buhalis D, Ladkin A (2014) A typology of technology-enhanced tourism experiences. Int J Tour Res 16(4):340–350. https://doi.org/10.1002/jtr.1958 Noh Z, Sunar M, Pan Z (2009) A review on augmented reality for virtual heritage system. In: Proceedings from 4th International Conference on E-Learning and Games: Learning by Playing. Gamebased Education System Design and Development, pp 50–61. https://doi.org/10.1007/ 978-3-642-03364-3 Oleksy T, Wnuk A (2017) Catch them all and increase your place attachment! The role of locationbased augmented reality games in changing people – place relations. Comput Hum Behav 76:3–8. https://doi.org/10.1016/j.chb.2017.06.008 Rauschnabel PA, Brem A, Ivens BS (2015) Who will buy smart glasses? Empirical results of two pre-market-entry studies on the role of personality in individual awareness and intended adoption of Google Glass wearables. Comput Hum Behav 49:635–647. https://doi.org/10.1016/ J.CHB.2015.03.003 Robertson A (2019) Google announces a new $999 Glass augmented reality headset. Retrieved from https://www.theverge.com/2019/5/20/18632689/google-glass-enterpriseedition-2-augmented-reality-headset-pricing Riva G, Davide F, IJsselsteijn WA (2003) Being there: the experience of presence in mediated environments. In: Being there: concepts, effects and measurement of user presence in synthetic environments. IOS Press, Amsterdam Scholl M, Hopf J, Lulay S, Gautam M (2019) Investigating the effect of presence in multisensory VR on travel recommendation. In: ISCONTOUR 2019 Tourism Research Perspectives: Proceedings of the International Student Conference in Tourism Research. BoD–Books on Demand,p 73 Scholz J, Smith AN (2016) Augmented reality: designing immersive experiences that maximize consumer engagement. 
Bus Horiz 59(2):149–161. https://doi.org/10.1016/j.bushor.2015. 10.003 Shang LW, Zakaria MH, Ahmad I (2016) Mobile phone augmented reality postcard. J Telecommun Electron Comput Eng 8(2):135–139. Retrieved from https://www.semanticscholar.org/paper/ Mobile-Phone-Augmented-Reality-Postcard-Shang-Zakaria/41b0020d0d91daccda69e60be98c5 328ec4c6769 Sherman A (2011) How tech is changing the museum experience. Retrieved from https://mashable. com/2011/09/14/high-tech-museums/ Slater M, Sanchez-Vives MV (2016) Enhancing our lives with immersive virtual reality. Front Robot AI 3:74. https://doi.org/10.3389/frobt.2016.00074 Slater M, Wilbur S (1997) A framework for immersive virtual environments (FIVE): Speculations on the role of presence in virtual environments. Presence Teleop Virt Environ 6(6):603–616. https://doi.org/10.1162/pres.1997.6.6.603 Stam A (2016) The world’s first augmented reality photobook. Retrieved from https://www. huckmag.com/art-and-culture/photography-2/worlds-first-augmented-reality-photobook/ Statt N (2018) Google is working on bringing AR to Chrome with downloadable 3D objects. Retrieved from https://www.theverge.com/2018/1/23/16925574/google-ar-webchrome-augmented-reality-downloadable-objects Stein S (2017) Apple Maps has a hidden VR/AR trick in iOS 11. Retrieved from https://www.cnet. com/news/apple-maps-has-a-hidden-ar-trick-in-ios-11/ Therme Erding (2019) Virtual reality Schnorcheln. Retrieved from https://www.therme-erding.de/ therme-erlebnisbad/wellenbad/virtual-reality-schnorcheln/ Think with Google (2018) How smartphone usage is shaping travel decisions. Retrieved from https://www.thinkwithgoogle.com/consumer-insights/consumer-travel-smartphone-usage/ Tönnis M (2010) Augmented reality. Einblicke in die erweiterte Realität. Springer, Berlin tom Dieck MC, Jung T (2018) A theoretical model of mobile augmented reality acceptance in urban heritage tourism. Curr Issues Tour 21(2):154–174. https://doi.org/10.1080/13683500. 2015.1070801
14 Augmented, Virtual, and Mixed Reality in Tourism
341
tom Dieck MC, Jung T, Han DI (2016) Mapping requirements for the wearable smart glasses augmented reality museum application. J Hosp Tour Technol 7(3):230–253. https://doi.org/10. 1108/JHTT-09-2015-0036 Tönnis M (2010) Augmented reality. Einblicke in die erweiterte Realität. Springer, Heidelberg Tussyadiah I, Wang D, Jia CH (2016) Exploring the persuasive power of virtual reality imagery for destination marketing. “Leading Tourism Research Innovation for Today and Tomorrow” Tussyadiah IP, Jung TH, tom Dieck MC (2018) Embodiment of Wearable augmented reality technology in tourism experiences. J Travel Res 57(5):597–611. https://doi.org/10.1177/ 0047287517709090 Tussyadiah IP, Wang D, Jia CH (2017) Virtual reality and attitudes toward tourism destinations In: Schegg R, Stangl B (eds) Information and Communication Technologies in Tourism 2017. Springer, Cham, pp 229–239 Tussyadiah IP, Wang D, Jung TH, tom Dieck MC (2018) Virtual reality, presence, and attitude change: Empirical evidence from tourism. Tour Manag 66:140–154. https://doi.org/10.1016/j. tourman.2017.12.003 Tzima S, Styliaras G, Bassounas A (2019) Augmented reality applications in education: teachers point of view. Educ Sci 9(2):99. https://doi.org/10.3390/educsci9020099 Vlahakis V, Karigiannis J, Tsotros M, Gounaris M, Almeida L, Stricker D, . . . Ioannidis N (2001) Archeoguide: first results of an augmented reality, mobile computing system in cultural heritage sites. In: Proceedings of the 2001 Conference on Virtual Reality, Archeology, and Cultural Heritage, pp 131–140. https://doi.org/10.1145/584993.585015 Wei W (2019) Research progress on virtual reality (VR) and augmented reality (AR) in tourism and hospitality: a critical review of publications from 2000 to 2018. J Hosp Tour Technol https:// doi.org/10.1108/JHTT-04-2018-0030 Wei W, Qi R, Zhang L (2019) Effects of virtual reality on theme park visitors’ experience and behaviors: a presence perspective. Tour Manag 71:282–293. 
https://doi.org/10.1016/j.tourman. 2018.10.024 Williams P, Hobson JP (1995) Virtual reality and tourism: fact or fantasy? Tour Manag 16(6):423– 427. https://doi.org/10.1016/0261-5177(95)00050-X Wiltshier P, Clarke A (2017) Virtual cultural tourism: six pillars of VCT using co-creation, value exchange and exchange value. Tour Hosp Res 17(4):372–383. Wingfield N, Isaac M (2016) Pokémon go brings augmented reality to a mass audience. Retrieved from https://www.nytimes.com/2016/07/12/technology/pokemon-go-bringsaugmented-reality-to-a-mass-audience.html Wong R (2017) Lifeprint’s ‘Harry Potter’-like moving photos are now big enough to hang up. Retrieved from https://mashable.com/2017/12/05/lifeprint-3x4-augmented-reality-instantphoto-printer/ Woods E, Billinghurst M, Looser J, Aldridge G, Brown D, Garrie B, Nelles C (2004) Augmenting the science centre and museum experience. In: Proceedings from 2nd International Conference on Computer Graphics and Interactive Techniques in Australasia and South East Asia, pp 230– 236. ACM. https://doi.org/10.1145/988834.988873 Xu F, Tian F, Buhalis D, Weber J, Zhang H (2016) Tourists as mobile gamers: gamification for tourism marketing. J Travel Tour Mark 33(8):1124–1142. https://doi.org/10.1080/10548408. 2015.1093999 Yovcheva Z, Buhalis D, Gatzidis C (2012) Overview of smartphone augmented reality applications for tourism. e-Rev Tour Res 10(2):63–66 Yu X, Xie Z, Yu Y, Lee J, Vazquez-Guardado A, Luan H, . . . Ji B (2019) Skin-integrated wireless haptic interfaces for virtual and augmented reality. Nature 575(7783):473–479. https://doi.org/ 10.1038/s41586-019-1687-0 Yung R, Khoo-Lattimore C (2017) New realities: a systematic literature review on virtual reality and augmented reality in tourism research. Curr Issues Tour 22(17):2056–2081. https://doi.org/ 10.1080/13683500.2017.1417359 Zibreg C (2017) Google Translate’s augmented reality feature, Word Lens, now works with Japanese. 
Retrieved from https://www.idownloadblog.com/2017/01/26/google-translatesaugmented-reality-feature-word-lens-now-works-with-japanese/
15 Electronic Data Interchange and Standardization
Christian Huemer, Philipp Liegl, and Marco Zapletal
Contents
Introduction
UN/EDIFACT
Open Travel Alliance
Message Transport
  EDIFACT Messaging
  OTA Messaging
Parties Involved in the Data Exchange
The Role of the Service Provider
Classic EDI and WebEDI
  Classic EDI
  WebEDI
  A Comparison
EDI Projects
The Core Issues and Future Trends
Cross-References
References
C. Huemer
Institute of Information Systems Engineering, TU Vienna, Vienna, Austria

P. Liegl · M. Zapletal
ecosio GmbH, Vienna, Austria

© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_21

Abstract
Electronic data interchange (EDI) is the application-to-application exchange of business-related data between different organizations based on a structured machine-readable format. In this chapter, we outline the concepts of EDI and its relevance for tourism. We first introduce the UN/EDIFACT standards, which were the first to be used for EDI. Since this book targets IT in tourism, we next describe the standards of the Open Travel Alliance (OTA), which are most commonly used to realize EDI exchanges in the tourism domain today. To realize an EDI partnership between organizations, it is also critical to define how messages are delivered from the message sender to the message receiver. Since service providers are commonly involved in EDI message exchanges, we take a closer look at their services and explain how these services facilitate EDI partnerships. In particular, we discuss WebEDI, which specifically targets small and medium enterprises (SMEs). Finally, we provide a list of critical issues that have to be addressed in any EDI project.
Keywords
Electronic data interchange (EDI) · UN/EDIFACT · Open Travel Alliance (OTA) · Business-to-business communications · Inter-organizational systems · Application integration
Introduction
Whenever business partners engage in a business transaction, they exchange at least an invoice. In most cases, however, they exchange many more business documents in the course of the transaction, such as availability requests, quotes, and reservations. Traditionally, these document types have been exchanged in paper form. In this case, documents need to be printed, sent by snail mail, and, most importantly, manually re-entered into the receiver’s system. Because paper-based document exchange leaves the business partners’ information systems unintegrated, the partners are not able to realize the full economic potential of inter-organizational business processes. Instead of paper-based and error-prone processes, direct data exchange must take place between the partners’ systems. It therefore makes sense to exchange the data of a business document electronically and in a structured form. This is referred to as electronic data interchange (EDI) (Hill and Ferguson 1989).

EDI is not a specific technology, but an interplay of electronic processes, exchange protocols, and business document standards with one central goal: “Automated communication of business information directly between two IT systems, without any human intervention.” Such seamless communication between information systems enables a high degree of process automation. The data is transferred directly from one system to another without manual intervention, in other words, without “media discontinuity.” The automation of business processes through EDI results in a number of advantages (Colberg et al. 1995), for example:
• Shorter transaction times
• Lower transaction costs
• Reduction of repeated input
• Reduction of data entry errors
• Increased accuracy of information
• Reduction of paper-bound document flow
• Cost savings in data handling activities
• Avoidance of duplicated work
• Reduction of throughput times
• Time reduction of processes
The use of EDI also brings a broad spectrum of advantages for company management, as business-relevant information is available faster and, above all, in a structured manner. Evaluating this data with appropriate business intelligence methods enables improved process control through constantly available and up-to-date data. In the areas of planning, decision-making, and control, these advantages include, for example:
• Simpler target/actual comparisons
• Deviation analyses
• New forecasting techniques
• Newly acquired statistical knowledge
• More accurate information
• Simulations
• Productivity analyses
• Improved cash management
• Better stock overview
To realize the benefits mentioned above, it is necessary to exchange business data between information systems without human intervention. The idea of electronically exchanging business documents has existed for several decades. Accordingly, there have been quite a number of initiatives to standardize the underlying business document types, as described in Liegl et al. (2010). In this chapter, we present two business document standards. First, we introduce UN/EDIFACT, which was the first global and industry-independent standardization approach. Second, we illustrate the standardization approach of the Open Travel Alliance (OTA), which is the predominant EDI standard in tourism. It should be noted that there have been several other attempts towards structured information exchange in the area of tourism, such as ANSI ASC X12 TG08, CEN/TC 329, HITIS (Hospitality Industry Technology Integration Standards), IATA AIDX (International Air Transport Association: Aviation Information Data Exchange), OTDS (Offener Touristischer Datenstandard), and TTI (Travel Technology Initiative) – A2BTRANSFERS.COM XML.

After a brief outlook on future trends in standardizing business documents, we concentrate on EDI messaging, which concerns the transport of messages between the business partners. We then discuss how service providers may help SMEs get involved in EDI, as well as WebEDI, an alternative approach for SMEs that cannot afford a full-fledged EDI solution.
UN/EDIFACT
The work on UN/EDIFACT started in 1985, when the United Nations decided to develop a global EDI standard that is independent of industry domains. The EDIFACT syntax was accepted as international standard ISO 9735 in 1987 and was last revised in 2014 (ISO 2014). Since 1990, the United Nations Centre for Trade Facilitation and e-Business (UN/CEFACT) has released UN/EDIFACT message directories, which include all standardized message types, twice a year. Note that, in the beginning, EDIFACT (without the preceding UN) referred to the syntax, whereas UN/EDIFACT referred to the directories published by the UN. However, many people are not aware of this distinction, so the two abbreviations are often used synonymously in the literature.

UN/EDIFACT is based on the following key concepts: messages, segments, composite and simple data elements, and codes. Standardized codes are used for the representation of business terms. Data elements are the smallest indivisible pieces of data. Furthermore, UN/EDIFACT uses the concept of composite data elements, which are sequences of simple data elements that together describe one logical unit. Segments are groups of related data elements. A message is a sequence of segments and segment groups representing a specific business transaction.

Since UN/EDIFACT is designed as an industry-domain-independent standard, a message represents a certain business function that may be used in many industry domains, and all these industry domains share the same message type design. However, not all business functions and, thus, messages apply to every domain. The reuse across different industry domains becomes even more evident on the level of segments and data elements. Even if a business function is so specific that it is required by only one domain, the corresponding message is built from segments and data elements that also appear in other messages for different business functions.
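The containment hierarchy of these key concepts can be pictured with a small data model. The following Python sketch is only an illustration under our own assumptions: the class and function names are invented for this example, segment groups are left out for brevity, and the tags and values (XXX, YYY, REQUEST, RESPONSE) are placeholders rather than entries from a UN/EDIFACT directory.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class SimpleDataElement:
    """Smallest indivisible piece of data; holds a plain value or a code."""
    value: str


@dataclass
class CompositeDataElement:
    """Sequence of simple data elements that together form one logical unit."""
    components: List[SimpleDataElement]


DataElement = Union[SimpleDataElement, CompositeDataElement]


@dataclass
class Segment:
    """Group of related data elements, identified by a tag."""
    tag: str
    elements: List[DataElement] = field(default_factory=list)


@dataclass
class Message:
    """Sequence of segments representing one business transaction."""
    message_type: str
    segments: List[Segment] = field(default_factory=list)


def count_simple_elements(message: Message) -> int:
    """Count the simple data elements of a message, looking inside composites."""
    total = 0
    for segment in message.segments:
        for element in segment.elements:
            if isinstance(element, CompositeDataElement):
                total += len(element.components)
            else:
                total += 1
    return total


# Hypothetical content: one segment is reused by two different message types.
shared = Segment("XXX", [
    SimpleDataElement("A"),
    CompositeDataElement([SimpleDataElement("B"), SimpleDataElement("C")]),
])
request = Message("REQUEST", [shared])
response = Message("RESPONSE", [shared, Segment("YYY", [SimpleDataElement("D")])])
```

Because both hypothetical messages refer to the same segment object, the sketch also mirrors the reuse described above: the two message types share one segment design instead of defining their own.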
This reuse of segments and data elements also has a cascading effect on message type versions. Even if the segment structure of a message design does not change between two directory versions, the message may implicitly change because another business function requires a change in an underlying segment or data element.

Before we take a detailed look at the UN/EDIFACT message design, we must understand that there is a difference between batch and interactive EDI. In the beginning, the design of UN/EDIFACT messages was heavily influenced by the exchange of classical business documents in long-running business transactions, as is common, e.g., in the consumer goods industry. This later became known as Batch EDI (Berge 1994). In the mid-1990s, it became evident that there is also a need for EDI that better suits more interactive conversations within sessions, as is common, e.g., in tourism transactions. This type of EDI is now known as Interactive EDI (Barrett 1995). At first sight, the hierarchical structure of batch and interactive EDI is more or less the same. Only the identifiers of the service segments differ, as we outline below. The major difference is that batch and interactive EDI messages are built from different segments and data elements. This means, for example, that all batch EDI messages share the same segment to specify time-related information, but in interactive EDI, a different segment exists for this purpose. Thus, each directory release covers a directory for batch EDI and another one for interactive EDI. Today, the vast majority of exchanges are based on batch EDI. However, all the messages that are relevant for the tourism domain are part of the interactive EDI directories.

The hierarchical structure of a UN/EDIFACT interchange is depicted in Fig. 1 (ISO 2014). It should be noted that the UN/EDIFACT service segments (depicted in black boxes) differ for batch and interactive EDI. In these cases, we give the batch EDI name first, followed by the interactive EDI name. An EDI interchange starts with an optional Service String Advice (UNA). The UNA is rarely used and specifies all special characters to be used in the interchange. UNA is used only if these characters differ from the default characters, which is not recommended anyway. All messages are placed between an Interchange Header (UNB/UIB) and an Interchange Trailer (UNZ/UIZ). Only batch EDI supports the concept of groups of messages with similar functionality; such a group is started by a Group Header (UNG) and ended by a Group Trailer (UNE). Each message, whether in a group or not, starts with a Message Header (UNH/UIH) and ends with a Message Trailer (UNT/UIT). The message body is built from segments that may be recursively aggregated into segment groups (each started by a trigger segment). A segment is built from a sequence of data elements, either simple or composite. The latter are themselves built from simple data elements. Note that the latest syntax version also supports the concept of repeating data elements, where the same data element is repeated multiple times. However, this concept is hardly used in any EDIFACT messages, and, thus, we do not consider it in the following explanations.

Fig. 1 UN/EDIFACT Interchange Structure

Taking a closer look at Fig. 1, it is easy to see that UN/EDIFACT uses a delimiter-based syntax: The data values are separated from each other by special characters (“:”, “+”, etc.), which allows for flexible lengths of data values. Simple data elements within a composite element are separated by a colon (:). Data elements that are directly included in a segment (no matter whether simple or composite) are separated by a plus (+). Each segment starts with a three-character tag identifying the segment, followed by all included data elements separated by a plus. Each segment is concluded by a single quote (') before the next one starts with its three-character tag. Segments may be aggregated into segment groups, which may be instantiated multiple times in a message. However, there are no explicit identifiers for segment groups within an EDIFACT message. Here the concept of so-called trigger segments plays a crucial role. This means that each segment group starts with a mandatory segment, which is a regular segment but serves as trigger segment.
Each message design guarantees that the trigger segments enable the identification of the logical position of each segment in a message. This means that a segment may appear multiple times in a message, but not on the same nesting level within a segment group. Accordingly, one always knows which trigger segment was instantiated last and can thereby identify the position of a segment within the message structure.

By using a delimiter-based syntax, UN/EDIFACT relies on implicit data identification. In contrast to XML, the meta information (the meaning of a data value) is not explicitly stated in a message, except for segments, which are identified by their tag. The semantics of a data value are given by its position in the message. Consequently, simple data elements within a composite data element, data elements within a segment, as well as segments in segment groups and messages must follow a predefined order. This order is defined in the message type definitions. To ensure a common understanding of transmitted data values, both business partners must be aware of the corresponding message type definition. The message type definitions are maintained by UN/CEFACT, and new versions are published twice a year in UN/EDIFACT directories.

The following list shows all interactive UN/EDIFACT messages that are of particular interest to the tourism domain:
• AVLREQ Availability request
• AVLRSP Availability response
15 Electronic Data Interchange and Standardization
• PASREQ Travel, tourism and leisure product application status request
• PASRSP Travel, tourism and leisure product application status response
• RESREQ Reservation request
• RESRSP Reservation response
• SKDREQ Schedule request
• SKDUPD Schedule update
• TIQREQ Travel, tourism and leisure information inquiry request
• TIQRSP Travel, tourism and leisure information inquiry response
• TSDUPD Timetable static data update
• TUPREQ Travel, tourism and leisure data update request
• TUPRSP Travel, tourism and leisure data update response
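How a trigger segment resolves the position of a repeated tag can be sketched as follows. The sketch is a simplification under stated assumptions: it covers only a small fragment of the RESREQ structure (the Address segment ADS at position 00180 on the top level and at position 00210 inside segment group 1, which is triggered by the Name segment NME at position 00200), and it does not check segment order or the closing of groups. The names `TOP_LEVEL`, `GROUPS`, and `locate` are hypothetical.

```python
# Fragment of the RESREQ structure around segment group 1 (positions as in Fig. 2).
TOP_LEVEL = {"ADS": "00180"}
GROUPS = {
    # trigger tag -> (group name, positions of the segments inside the group)
    "NME": ("SG1", {"NME": "00200", "ADS": "00210"}),
}


def locate(tags):
    """Assign each incoming tag its position, based on the last trigger seen."""
    current = None  # segments of the group instance that is currently open
    result = []
    for tag in tags:
        if tag in GROUPS:
            _, current = GROUPS[tag]  # a trigger segment opens a new group instance
        if current is not None and tag in current:
            result.append((tag, current[tag]))
        elif tag in TOP_LEVEL:
            result.append((tag, TOP_LEVEL[tag]))
        else:
            raise ValueError(f"unexpected segment {tag}")
    return result


positions = locate(["ADS", "NME", "ADS", "NME", "ADS"])
```

The first ADS is placed at 00180 because no NME has been seen yet; every ADS after an NME is placed at 00210, and each further NME starts a new instance of the group.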
In order to illustrate the design of UN/EDIFACT messages, we take the Reservation Request (RESREQ) message as an example. The segment structure of this message is depicted in the two top columns of Fig. 2. RESREQ is built from 108 segments, some of which are aggregated into 20 (nested) segment groups. The maximum nesting level is three (e.g., segment group 13 is included in segment group 12, which in turn is included in segment group 7). A segment group in an interchange is identified by the non-repeatable and mandatory trigger segment that starts each instance of the group. For example, an Address (ADS) segment appears at positions 180 and 210. The parser is able to determine the correct position of an ADS instance by checking whether or not an instance of the mandatory trigger segment Name (NME) was detected before it.

The structure of each of the 108 segments is defined in the segment directory. As an example, we depict the structure of the segment Originator of Request Details (ORG) in the lower left of Fig. 2. This segment is built from seven directly included data elements. Three of them are simple data elements, namely 3036 Party Name (position 040), 3457 Originator Type Code (position 050), and 3503 Access Authorisation Identifier (position 070); the other four are composite data elements. The latter are themselves built from simple data elements, e.g., the composite data element E973 Delivering System Details (position 010) includes the simple data elements 3036 Party Name, 3225 Location Identifier, and 3224 Location Name. The simple data elements are either alphabetic, alphanumeric, or numeric (a/an/n) and have a maximum length, e.g., 3036 Party Name is alphanumeric with up to 70 characters (an..70). All data elements appear in the data element directory. In case of a coded simple data element, all the valid codes are part of the simple data element definition. For example, the coded simple data element 3457 Originator Type Code, which is depicted on the lower right of Fig. 2, comes with three code values for travel agent (1), reservation agent (2), and seller (3).

To further illustrate how UN/EDIFACT works, we show an instance of the Reservation Request (RESREQ) message in Fig. 3. Providing a real-world message that uses many segments of the message structure would go beyond the scope of this chapter. Accordingly, we decided to go for a cancellation of a reservation made in a previous dialog. It should be noted that no matter whether it is a new reservation, an update, or a cancellation of a reservation, the message type definition is basically
Fig. 2 UN/EDIFACT Reservation Request Message Structure + sample segment and data element structure

[Top: segment table of the RESREQ message with the columns Pos, Tag, Name, S, and R, running from position 00010 (UIH, Interactive message header) to position 01280 (UIT, Interactive message trailer) and including the header lines of segment groups 1 to 20.]

Lower left: ORG, ORIGINATOR OF REQUEST DETAILS. Function: To provide the originator of request details.

Pos  Tag   Name                                 S  R  Repr
010  E973  DELIVERING SYSTEM DETAILS            C  1
     3036  Party name                           C     an..70
     3225  Location identifier                  C     an..35
     3224  Location name                        C     an..256
020  E974  ORIGINATOR IDENTIFICATION DETAILS    C  1
     3197  Agent identifier                     C     an..9
     3465  In-house identifier                  C     an..9
     3197  Agent identifier                     C     an..9
     3036  Party name                           C     an..70
030  E975  LOCATION                             C  1
     3225  Location identifier                  C     an..35
     3224  Location name                        C     an..256
     3207  Country identifier                   C     an..3
     3227  Location function code qualifier     C     an..3
040  3036  PARTY NAME                           C  1  an..70
050  3457  ORIGINATOR TYPE CODE                 C  1  an..3
060  E976  ORIGINATOR DETAILS                   C  1
     3207  Country identifier                   C     an..3
     6345  Currency identification code         C     an..3
     3453  Language name code                   C     an..3
070  3503  ACCESS AUTHORISATION IDENTIFIER      C  1  an..9

Lower right: 3457 Originator type code. Desc: Code specifying the type of originator. Repr: an..3. Code Values:
1  Travel agent: The originator is a travel agent.
2  Reservation agent: The originator is a reservation agent.
3  Seller: The originator is the seller.
Fig. 3 UN/EDIFACT RESREQ Example Message

UIH+RESREQ:D:18A:AM:UN+mr98765+id123456'
    Message Header. Message type: RESREQ; message version number: D; message release number: 18A; message type sub-function identification: AM (cancel previous dialogue reservation); controlling agency: UN; message reference number: mr98765; initiator control reference: id123456.
MSD+AAP:1+AB'
    Message Action Details. Business function code: AAP (Air); message function code: 1 (Cancellation); response type code: AB (Acknowledgment required).
ORG+the_System+TUW:HUH_TUW++TU Wien+1+AT:EUR:EN'
    Originator of request details. Delivering system details / party name: the_System; originator identification details / agent identifier: TUW; originator identification details / in-house identifier: HUH_TUW; party name: TU Wien; originator type code: 1 (Travel agent); originator details / country identifier + currency identifier + language code: AT:EUR:EN.
RCI+TU Wien:B_2018019123456:4:20191001:1105000'
    Reservation Control Information. Party name: TU Wien; reservation identifier: B_2018019123456; reservation identifier code qualifier: 4 (Booking); date: 20191001; millisecond time: 1105000.
UIT+mr98765+5'
    Message Trailer. Message reference number: mr98765; number of segments in a message: 5.
always the same. In our example, we reduce the message body to the information about the originator of this message and the reservation that is going to be canceled. The resulting message in Fig. 3 consists of five segments. Each segment is shown on a new line for better readability, but in reality, there is no line feed after a segment. The next segment starts immediately after the terminating quote (') of the previous one. Since UN/EDIFACT does not use any markup, we have added the meta information as comments to Fig. 3.

The RESREQ message starts with the Message Header segment, which is UIH for any interactive EDIFACT message. The first data element of this service segment is a composite data element identifying the type of message. This composite data element states the message type (RESREQ), the message version number (D) and message release number (18A) referencing the corresponding directory version, the message type sub-function with a coded value (AM – cancel a reservation made in a previous dialog), and the controlling agency of the message (UN – United
Nations). The values are separated by colons (:) within the composite data element. Afterwards, the message reference number (mr98765) is stated in a simple data element, and finally a control reference (id123456) as issued by the initiator is specified. The data elements within the segment are separated by a plus (+). Since none of the other data elements of this segment are used, the UIH segment is terminated by a quote (') immediately after the control reference.

The second segment, which is the first segment of the message body, specifies the message action details (MSD). The first data element is a composite that lists the message processing details. It includes a business function code (AAP) that indicates the industry segment, which is Air in our example. The second simple data element within this composite is another coded element that codifies the message function (1). The value 1 refers to a cancellation. Afterwards, the segment includes a simple data element for the response type code (AB), which indicates that an acknowledgment of this message is required. Again, the quote (') terminates the segment since no other data elements are used.

The third segment provides details about the originator of the request (ORG). The first data element, on the details of the delivering system, is a composite data element, even if it does not look like one because it contains no colons (:). Since only the first simple data element for the party name (the_System) is used and all other simple data elements within the composite are omitted, the composite data element ends with the plus sign directly after the first simple data element. The second composite data element specifies the originator identification details and uses the agent identifier (TUW) and the in-house identifier (HUH_TUW) for this purpose. Since the third data element for specifying locations is not used at all, two plus signs (+) appear in a row. The fourth data element is a simple one and specifies the party name (TU Wien). Another simple data element follows to indicate the type of originator (1). The code 1 refers to a travel agent in the code list shown in Fig. 2. Afterwards, some more details of the originator are listed in a composite data element: the country identifier (AT), the currency identifier (EUR), and the language (EN). The segment is terminated directly afterwards by a quote ('), since the last data element for the access authorization identifier is not used.

The fourth segment carries the reservation control information (RCI) to identify the existing reservation that is going to be canceled. This segment consists of only one composite data element with exactly the same name. The first simple data element states the name of the party (TU Wien) that made the reservation, followed by the reservation identifier (B_2018019123456). The next value qualifies the type of reservation (4). The value 4 indicates a booking in the corresponding code list. The last two elements are used for the date (20191001) and time in milliseconds (1105000) of the booking.

Finally, the message trailer (UIT) is the last segment, as in any other interactive EDIFACT message. The message trailer repeats the message reference number (mr98765) that is also stated in the message header. The second data element states the number of segments in the message including the header and the trailer, which is 5 in our example.
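The positional reading of the example can be reproduced with a short Python sketch. It decodes the ORG segment of the Fig. 3 message by position and checks the two values carried in the trailer. The field labels in `ORG_LAYOUT` are our own hypothetical names for the data elements listed in Fig. 2, and the sketch again assumes default delimiters without escaping.

```python
MESSAGE = (
    "UIH+RESREQ:D:18A:AM:UN+mr98765+id123456'"
    "MSD+AAP:1+AB'"
    "ORG+the_System+TUW:HUH_TUW++TU Wien+1+AT:EUR:EN'"
    "RCI+TU Wien:B_2018019123456:4:20191001:1105000'"
    "UIT+mr98765+5'"
)

# Data elements of ORG in directory order (positions 010 to 070 in Fig. 2).
# A list marks a composite data element, None marks a simple one.
ORG_LAYOUT = [
    ("delivering_system", ["party_name", "location_identifier", "location_name"]),
    ("originator_identification",
     ["agent_identifier", "in_house_identifier", "agent_identifier_2", "party_name"]),
    ("location", ["location_identifier", "location_name",
                  "country_identifier", "location_function_code_qualifier"]),
    ("party_name", None),
    ("originator_type_code", None),
    ("originator_details",
     ["country_identifier", "currency_identification_code", "language_name_code"]),
    ("access_authorisation_identifier", None),
]


def split_segments(text):
    """Return each segment as a list: tag first, then the data elements."""
    return [s.split("+") for s in text.split("'") if s]


def decode(elements, layout):
    """Name the data elements of one segment by their position."""
    out = {}
    for value, (name, parts) in zip(elements, layout):
        if not value:
            continue  # omitted data element, e.g. the empty slot between "++"
        if parts is None:
            out[name] = value
        else:
            out[name] = {p: v for p, v in zip(parts, value.split(":")) if v}
    return out


def trailer_is_consistent(segments):
    """UIT must repeat the UIH reference number and state the segment count."""
    header, trailer = segments[0], segments[-1]
    return header[2] == trailer[1] and int(trailer[2]) == len(segments)


segments = split_segments(MESSAGE)
org = decode(next(s for s in segments if s[0] == "ORG")[1:], ORG_LAYOUT)
```

Trailing data elements that are not used, such as the access authorisation identifier, simply do not appear in `org`, because the segment ends before their position is reached.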
Open Travel Alliance

In the late 1990s, the Extensible Markup Language (XML) and in particular XML Schema had a big impact on the standardization of EDI business document types. Given the success of XML and its tool support, almost all new initiatives on standardizing business document types have been based on XML, which uses markup to identify the data in the business documents. In particular, the specifications of the Open Travel Alliance (OTA) became the first choice for implementing data exchanges between information systems in the tourism domain.

The Open Travel Alliance (OTA) was founded in 1999 as a not-for-profit association by major airlines, hoteliers, car rental companies, travel agencies, global distribution systems, etc. The aim of the OTA is to facilitate the inter-organizational collaboration of parties in the tourism sector. According to their own words on opentravel.org, OTA provides a community where companies in the electronic distribution supply chain work together to create an accepted structure for electronic messages, enabling suppliers and distributors to speak the same interoperability language, trading partner to trading partner. For OTA, the goal of creating a structure for electronic messages has for many years been synonymous with developing and maintaining a library of XML schemas for the travel industry. Accordingly, the main output of OTA has been standardized XML schemas that cover the most important industry segments of the tourism sector. To be precise, these XML schemas define message structures for air, car rental, hotel, rail, ground transportation, cruise, golf, and package tours, but also for destination activity, loyalty, profile, and travel insurance.

It is essential that the different message schemas are not developed in isolation from each other. For the sake of increased interoperability, as much reuse of XML content among the message schemas as possible is strongly desired. This means reusing complex types, simple types, attribute groups, and code lists across different message schemas. It follows that an XML message sent between two business partners is not based on a self-contained XML schema; rather, it is based on an XML schema that includes several other schemas. For example, an availability request for a hotel room is based on the XML schema “OTA_HotelAvailRQ,” and this XML schema includes several other schemas. These other schemas provide many definitions of elements, attributes, and simple/complex types that build the availability request for a hotel booking, but also appear in some other OTA messages.

It follows that the overall OTA specification is built up from a large collection of XML schemas. This requires an efficient method for maintaining, understanding, and implementing the XML schemas within this large collection. In other words, an overall architecture and a common XML design technique are needed to guarantee a consistent specification. In line with best practices from other industries, OTA opted for a hierarchical organization of the XML schema files (see OTA 2017). The resulting hierarchy of message level schemas, function-specific schemas, industry common types, OTA common types, and OTA simple types is depicted in Fig. 4.
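The idea that a message schema is not self-contained but pulls in schemas from lower layers can be sketched as a small dependency graph. The file names and, in particular, the include edges below are illustrative assumptions chosen to mirror the five layers; they are not taken from the published OTA schema files.

```python
# Hypothetical include graph: schema file -> schemas it includes directly.
INCLUDES = {
    "OTA_HotelAvailRQ.xsd": ["OTA_HotelCommonTypes.xsd", "OTA_CommonTypes.xsd"],
    "OTA_HotelResRQ.xsd": ["OTA_HotelReservation.xsd"],
    "OTA_HotelReservation.xsd": ["OTA_HotelCommonTypes.xsd"],
    "OTA_HotelCommonTypes.xsd": ["OTA_CommonTypes.xsd"],
    "OTA_CommonTypes.xsd": ["OTA_SimpleTypes.xsd"],
    "OTA_SimpleTypes.xsd": [],
}


def all_includes(schema, includes=INCLUDES):
    """Collect every schema that is included directly or indirectly."""
    seen = []
    stack = [schema]
    while stack:
        current = stack.pop()
        for dep in includes[current]:
            if dep not in seen:
                seen.append(dep)
                stack.append(dep)
    return sorted(seen)


needed = all_includes("OTA_HotelResRQ.xsd")
```

Under these assumed edges, validating a single reservation request requires four further schema files, which is why a consistent organization of the collection matters.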
Fig. 4 OTA XML Schema Architecture
[Five layers from top to bottom: Message Level Schemas, Function Specific Schemas, Industry Common Types, OTA Common Types, OTA Simple Types. Arrows labeled “includes” point from higher layers to lower layers.]
XML message level schemas are at the top of this architecture. A message level schema corresponds to a certain business message that is exchanged between two business partners in a business transaction. As mentioned above, OTA provides message structures for all the different segments of the travel and tourism industry. Thus, it should be noted that similar business messages exist for each segment, and each XML message level schema represents a business message in the specific context of the respective industry segment. As outlined in Fig. 4, each message level XML schema includes other XML schemas that, in turn, include others. The message level schema always contains the root element, which indicates the business message within a certain industry segment and appears as the root node of a corresponding XML instance exchanged between business partners (e.g., OTA_CancelRQ, which is used in the example of Code Listing 1). Whatever is contained below the root element may be defined either in the message level schema itself or in any of the directly or indirectly included schemas further down the hierarchy.

The second level of the hierarchy comprises function-specific schemas. An XML schema on this level represents a container for reusable data structures that are shared across messages with a similar business function. As an example, consider the reservation of a hotel room. The OTA specification includes four different request/response pairs, i.e., eight XML message level schemas, for the reservation of a hotel room and its modification. Evidently, these eight message level schemas have
a lot of structures in common. Instead of duplicating (or rather multiplying) and scattering these structures across the message level schemas, it is preferable to aggregate them in a single function-specific schema. In our example, the function-specific schema OTA_HotelReservation increases interoperability between the eight message level schemas, which all provide related functionality and, thus, import this function-specific schema.

XML schemas for industry common types represent the third layer. OTA standardizes numerous XML schemas for each industry segment. It is not surprising that all schemas of a certain industry segment share a lot of common data types independent of their underlying business function. For this purpose, complex types that are relevant to the whole industry segment are aggregated in an XML schema for this segment (e.g., guest room type within the hotel common types). The resulting schema is included in both message level schemas and function-specific schemas. The concept of industry common types boosts consistent reuse of structural elements throughout the OTA specification. In the context of our example, this means that it is not only the eight hotel reservation messages that share common structures; rather, all of the various hotel messages share the industry common types for hotels. Again, instead of defining these types separately for each message, OTA_HotelCommonTypes delivers a single definition for all types that are used in the context of hotels.

The two bottom layers of the hierarchy define types that are shared by messages of all industry segments. The fourth layer covers the OTA common types (e.g., address type). The types contained herein may be referenced from any XML schema of the levels above. Any schema of a higher level may define elements that reference complex types, simple types, and attribute groups of the OTA common types.
The fifth layer provides the OTA simple types, which contain global simple types that are available to all other schemas. Typically, the OTA simple types provide the basic data structures that are used to build larger structures, such as String Length 1 to 32, Numeric 1 to 3, ISO3166 (i.e., two character ISO country code), DateOrDateTimeType, and YesNoType (Boolean).

In addition to this 5-level architectural design of XML schemas, the design of each of the schemas must be aligned. In general, XML Schema gives schema designers a high degree of freedom. However, without restricting the options that XML Schema allows, one ends up in a nightmare that hinders interoperability. Thus, OTA has released schema design best practices (OTA 2010), which provide guidance on a number of XML schema design topics: Tag Naming Conventions, Root Element, Message and File Naming Conventions, Elements and Attributes, Use of XML Schema, Global vs. Local Element Types, Namespaces, Versioning XML Schemas, Schema Markup and Annotations, Enumerations vs. Code Lists, and Code Lists.

A key characteristic of the OTA XML schemas is the fact that each message follows one of two so-called message exchange patterns. Message exchange patterns refer to the types of trading partner interactions, i.e., whether information is requested or delivered without request. The first pattern is the request/response pattern. In this case, one business partner requests information from the other
business partner. Accordingly, the request/response pattern is realized by a pair of messages: the requestor sends a request message (RQ) indicating what data are requested, and the responder processes the request and returns the relevant data in a response message (RS). It follows that the interaction corresponds to a “pull” of data. Most of the OTA XML schemas follow this pattern, and, thus, the OTA library is full of RQ/RS pairs of messages, such as HotelAvailRQ and HotelAvailRS, which are used to request the availability of a hotel room.

The second pattern is the notification or, in short, “notif” pattern. In this case, a business partner sends data to the other business partner without any request to do so. This corresponds to a “push” of data to another business partner. At first sight, one may assume that the notif pattern requires only one message, but it again involves two messages (RQ/RS), since the receiver, after processing the data, replies with a business signal to confirm that the payload was received. XML schemas that follow the notif pattern include the substring “Notif,” such as HotelAvailNotifRQ and HotelAvailNotifRS, which are used to push information about available hotel rooms.

The publication process of OTA is reminiscent of UN/EDIFACT: a new version of the OTA library is released twice a year. The name of the library release is likewise built by concatenating the year and A or B. In Fig. 5 we give an overview of all the OTA 2017B messages (OTA 2017).
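The naming conventions just described lend themselves to a small helper. The following Python sketch derives the exchange pattern, the role, and the expected counterpart from a message name, and builds a release label. The function names are hypothetical, and the rules are only those stated in the text (the substring “Notif” and the suffixes RQ/RS).

```python
def describe(name):
    """Return (pattern, role, counterpart) for a name such as OTA_HotelAvailRQ."""
    if not name.startswith("OTA_") or name[-2:] not in ("RQ", "RS"):
        raise ValueError(f"not an OTA message name: {name}")
    pattern = "notification" if "Notif" in name else "request/response"
    role = "request" if name.endswith("RQ") else "response"
    counterpart = name[:-2] + ("RS" if role == "request" else "RQ")
    return pattern, role, counterpart


def release_name(year, half):
    """Release label: the year followed by A or B."""
    if half not in ("A", "B"):
        raise ValueError("half must be 'A' or 'B'")
    return f"{year}{half}"


push = describe("OTA_HotelAvailNotifRQ")
pull = describe("OTA_HotelAvailRS")
label = release_name(2017, "B")
```

The simple suffix swap is a heuristic only: some pairs use a differently named response, for example the generic OTA_ReadRQ, which is answered by OTA_ResRetrieveRS.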
Fig. 5 OTA 2017B Messages

Generic Messages: OTA_AuthorizationRQ/RS, OTA_CancelRQ/RS, OTA_DeleteRQ/RS, OTA_ErrorRQ/RS, OTA_FileAttachmentNotifRQ/RS, OTA_NotifReportRQ/RS, OTA_PingRQ/RS, OTA_PurchaseItemRQ/RS, OTA_ReadRQ/OTA_ResRetrieveRS, OTA_ScreenTextRQ/RS

Air Messages: OTA_AirAvailRQ/RS, OTA_AirBaggageRQ/RS, OTA_AirBookRQ/RS, OTA_AirCheckInRQ/RS, OTA_AirDemandTicketRQ/RS, OTA_AirDisplayQueueRQ/RS, OTA_AirFareDisplayRQ/RS, OTA_AirFlifoRQ/RS, OTA_AirDetailsRQ/RS, OTA_AirLowFareSearchRQ/RS, OTA_AirPriceRQ/RS, OTA_AirRulesRQ/RS, OTA_AirSchedulesRQ/RS, OTA_AirSeatMapRQ/RS, OTA_AirBookModifyRQ/OTA_AirBookRS, OTA_AirGetOfferRQ/RS

Car Messages: OTA_VehicleCommonTypes, OTA_VehAvailRateRQ/RS, OTA_VehCancelRQ/RS, OTA_VehCheckOutRQ/RS, OTA_VehCheckInRQ/RS, OTA_VehLocDetailsNotifRQ/RS, OTA_VehExchangeRQ/RS, OTA_VehLocDetailRQ/RS, OTA_VehLocSearchRQ/RS, OTA_VehModifyRQ/RS, OTA_VehRateNotifRQ/RS, OTA_VehRateRuleNotifRQ/RS, OTA_VehRateRuleRQ/RS, OTA_VehResRQ/RS, OTA_VehResStatusNotifRQ/RS, OTA_VehRetResRQ/RS, OTA_VehResNotifRQ/RS

Cruise Messages: OTA_CruiseCommonTypes, OTA_CruiseBookingDocumentRQ/RS, OTA_ReadRQ/OTA_CruiseBookingHistoryRS, OTA_CruiseCabinAvailRQ/RS, OTA_CruiseCabinHoldRQ/RS, OTA_CruiseCabinUnholdRQ/RS, OTA_CruiseCancellationPricingRQ/RS, OTA_CruiseCategoryAvailRQ/RS, OTA_CruiseBookRQ/RS, OTA_CruiseDiningAvailRQ/RS, OTA_CruiseFareAvailRQ/RS, OTA_CruiseFastSellRQ, OTA_CruiseInfoRQ/RS, OTA_CruiseItineraryDescRQ/RS, OTA_CruisePaymentRQ/RS, OTA_CruisePkgAvailRQ/RS, OTA_CruisePriceBookingRQ/RS, OTA_CruisePNR_UpdateNotifRQ/RS, OTA_CruiseSailAvailRQ/RS, OTA_CruiseShorexAvailRQ/RS, OTA_CruiseSpecialServicesRQ/RS

Ground Transportation Messages: OTA_GroundCommonTypes.xsd, OTA_GroundAvailRQ/RS, OTA_GroundBookRQ/RS, OTA_GroundResRetrieveRQ/RS, OTA_GroundCancelRQ/RS, OTA_GroundModifyRQ, OTA_GroundResRetrieveRQ/RS, OTA_GroundResNotifRQ/RS

Loyalty Messages: OTA_LoyaltyCommonTypes.xsd, OTA_LoyaltyAccountCreateRQ/OTA_LoyaltyAccountRS, OTA_LoyaltyCertificateCreateRQ/RS, OTA_LoyaltyCertificateCreateNotifRQ/RS, OTA_LoyaltyCertificateRedemptionRQ/RS, OTA_LoyaltyAccountRS

Package Tour Messages: OTA_PkgCommonTypes.xsd, OTA_PkgAvailRQ/RS, OTA_PkgBookRQ/RS, OTA_PkgExtrasInfoRQ/RS, OTA_PkgCostRQ/RS, OTA_PkgReservation

Hotel Messages: OTA_HotelCommonTypes.xsd, OTA_HotelAvailRQ/RS, OTA_HotelAvailGetRQ/RS, OTA_HotelAvailNotifRQ/RS, OTA_HotelBookingRuleRQ/RS, OTA_HotelBookingRuleNotifRQ/RS, OTA_CacheChangeRQ/RS, OTA_CacheChangeNotifRQ/RS, OTA_HotelCommNotifRQ/RS, OTA_HotelDescriptiveContentNotifyRQ/RS, OTA_HotelDescriptiveInfoRQ/RS, OTA_HotelEventRQ/RS, OTA_HotelGetMsgRQ/RS, OTA_HotelInvAdjustRQ/RS, OTA_HotelInvBlockRQ/RS, OTA_HotelInvBlockNotifRQ/RS, OTA_HotelInvCountRQ/RS, OTA_HotelInvCountNotifRQ/RS, OTA_HotelInvNotifRQ/RS, OTA_HotelInvSyncRQ/RS, OTA_HotelPostEventReportRQ/RS, OTA_HotelPostEventNotifRQ/RS, OTA_HotelRateAmountNotifRQ/RS, OTA_HotelRatePlanRQ/RS, OTA_HotelRatePlanNotifRQ/RS, OTA_HotelResModifyRQ/RS, OTA_HotelResModifyNotifRQ/RS, OTA_HotelResNotifRQ/RS, OTA_HotelResRQ/RS, OTA_HotelRFP_MeetingRQ/RS, OTA_HotelRFP_MeetingNotifRQ/RS, OTA_HotelRFP_TransientRQ/RS, OTA_HotelRFP_TransientNotifRQ/RS, OTA_HotelRoomingListRQ/RS, OTA_HotelSearchRQ/RS, OTA_HotelStatsRQ/RS, OTA_HotelStatsNotifRQ/RS, OTA_HotelStayInfoNotifRQ/RS, OTA_HotelSummaryNotifRQ/RS

Profile Messages: OTA_Profile.xsd, OTA_ProfileCreateRQ/RS, OTA_ProfileMergeRQ/RS, OTA_ProfileModifyRQ/RS, OTA_ReadRQ/OTA_ProfileReadRS

Destination Activity Messages: OTA_DestinationActivity.xsd, OTA_DestActivityCapabilitiesRQ/RS, OTA_DestActivityResRQ/RS

Dynamic Package Messages: OTA_DynamicPkgCommonTypes.xsd, OTA_DynamicPkgAvailRQ/RS, OTA_DynamicPkgBookRQ/RS, OTA_ReadRQ/OTA_DynamicPkgBookRS, OTA_DynamicPkgModifyRQ/OTA_DynamicPkgBookRS, OTA_CancelRQ/OTA_DynamicPkgBookRS, OTA_DynamicPkgModifyNotifRQ/RS

Rail Messages: OTA_RailCommonTypes.xsd, OTA_RailAvailRQ/RS, OTA_RailBookRQ/RS, OTA_RailConfirmBookingRQ/RS, OTA_RailFareQuoteRQ/RS, OTA_RailIgnoreBookingRQ/RS, OTA_RailPaymentRQ/RS, OTA_RailPriceRQ/RS, OTA_ReadRQ/OTA_RailResRetrieveSummaryRS, OTA_ReadRQ/OTA_RailResRetrieveDetailRS, OTA_RailScheduleRQ/RS, OTA_RailShopRQ/RS

Golf Messages: OTA_GolfCommonTypes.xsd, OTA_GolfCourseAvailRQ/RS, OTA_GolfFacilityInfoRQ/RS, OTA_GolfRateRQ/RS, OTA_GolfCourseResRQ/RS, OTA_GolfCourseResModifyRQ/RS, OTA_GolfCourseSearchRQ/RS

Tour / Activity Messages: OTA_TourActivityAvailRQ/RS, OTA_TourActivityBookRQ/RS, OTA_TourActivityCancelRQ/RS, OTA_TourActivityCommonTypes, OTA_TourActivityModifyRQ, OTA_TourActivityResRetrieveRQ, OTA_TourActivitySearchRQ/RS

Travel Insurance Messages: OTA_InsuranceCommonTypes.xsd, OTA_InsurancePlanSearchRQ/RS, OTA_InsuranceQuoteRQ/RS, OTA_InsuranceBookRQ/RS

Travel Itinerary Messages: OTA_TravelItineraryReadRQ/OTA_TravelItineraryRS
Obviously, it is impossible to discuss all these messages in detail. Even a comprehensive discussion of a single message would go beyond the boundaries of a book chapter, since most messages comprise several hundreds of elements and attributes. So, depicting the schema of the message and providing an illustrative example would each require quite a number of pages without even trying to explain them. Nevertheless, we want to provide a bit of a look and feel of an OTA message. Thus, we have opted for a rather small, but still very common OTA message: the CancelRQ message. The XML schema of this message is depicted in Figs. 6 and 7. The CancelRQ message is part of the generic messages that are not customized to a specific industry segment and, thus, are used by all industry segments. Accordingly, the CancelRQ message is used to request the cancellation of all or portions of a previous reservation. The root element is Cancel_RQ. Its attributes are a timestamp for when the message was created (timeStamp) and the OpenTravel message version indicated by a decimal value (version). Due to space restrictions, we do not further detail the element extensions used for an OTA payload regarding data policy, encryption, target system, and transaction. 
The element called PayloadStdAttributes is used for a standard set of attributes that appear on all OpenTravel messages: a reference for additional message identification, assigned by the requesting host system, that must be referenced in a response message (echoToken); a timestamp for when the payload was created (timestamp); a test or production target system indicator (target); the name of the target system (targetName); the OpenTravel message version indicated by a decimal value (version); a unique identifier to relate all messages within a transaction (transactionIdentifier); a sequence number indicating the retry to send a message (sequenceNmbr); an indicator of where this message falls within a sequence of messages (transactionStatusCode); a mechanism to allow end-to-end correlation of log messages (correlationID); the ISO code of the primary language of the request (primaryLangID); the ISO code of the alternative language of the request (altLangID); and a flag indicating the retransmission of a message (retransmissionIndicatorInd).

The element Point of Sale (POS), which is depicted in Fig. 7, is used to provide information about the party that sends the request. The nested subelement source carries details about the requestor and may be repeated to also reflect the delivery systems. The source comes with the following attributes: identification of the party within the requesting entity (agentSine); an identification code assigned to an office or agency by a reservation system; the ISO country code (iSOCountry) and the ISO currency code (iSOCurrency); an authority code assigned to the requestor (agentDutyCode); the IATA-assigned airline code (airlineVendorID) and the IATA-assigned airport code (airportCode); the point of the first departure in a trip (firstDepartPoint); the identifier of the individual using the electronic reservation service provider (eRSP_UserID); and the electronic address of the device from which information is entered (terminalID).
C. Huemer et al.
Fig. 6 OTA Cancel RQ XML Schema (Part 1)
The first two elements of source are the RequestorIDSubGrp that may be used to identify the requesting entity and RequestorId that may also be used to identify the requestor. Since these two options are not used in our XML instance example, we skip the detailed explanation of these two elements.
Fig. 7 OTA Cancel RQ XML Schema (Part 2)
CancelRQ timeStamp="2019-10-31T08:20:00+01:00" version="2" xmlns="http://www.opentravel.org/OTM/reservation/v2" xmlns:ota2-0400="http://www.opentravel.org/OTM/Common/v4" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opentravel.org/OTM/reservation/v2 ../Reservation_2_0_0_Trim.xsd">
RefOrg12345
TU Wien
Internet
RefOrg98765 B_2018019123456
Mr. Hugo Heuschreck
[email protected] 4315880818808 9876
Code Listing 1 Example of a Cancel RQ Message
In our XML instance example of Code Listing 1, the identification of the requestor is realized by the third element RequestorIDDetail. The attributes of this element are a password that the receiver may use to validate the authority of the sender (messagePassword) and the name of the person making the request
(personName). The subelement Identifier is used for the identification of the requestor. In addition to the identifier itself, this element comes with attributes for the URI of the creating system (uri), the name of the creating system (system), and the name of the company that manages the reference system (organization). We do not describe the next element, Position, which provides attributes for latitude, longitude, and altitude; it is not used in our example. However, we do use the next element, BookingChannel, which specifies the booking channel type. Its attribute primaryInd indicates whether it is the primary means of connectivity of the source or not. BookingChannel includes as a first element CompanyName, which identifies the company that is associated with the booking channel. This is done via its attributes: the division name or ID with which the contact is associated (division); the department name or ID with which the contact is associated (department); a company common name (shortName); identification of a company by the company code (code); the source authority responsible for the code (codeContext); and a description of the code (description). The second element of BookingChannel specifies the type of booking channel, which refers to an OTA-maintained code list.

The next top-level child of CancelRQ is ReservationIdentifier, which represents the core of the cancellation request. Its attribute objID defines a unique identifier within this message for this object. The first element, Identifier, provides the identification of the reservation that is being canceled by this message. Like any other identifier, it comes with the attributes uri, system, and organization (cf. the identifier of RequestorIDDetail described before). In addition, the second element of ReservationIdentifier, ReservationRef, may be used to refer to an objID within this document; this is not used in the example.
Finally, CancelRQ includes the element Verification providing information to check the correctness of the cancelled reservation. It covers the element PersonNameSummary with the subelements Prefix, Given, Middle, and Surname. The elements Email_Simple and Telephone_Simple provide contact details of this person, and PaymentCard_Masked is used to control access to the reservation.
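To make the structure described above more tangible, the following Python sketch assembles a simplified CancelRQ tree from the values that appear in Code Listing 1. It is only an illustration under several assumptions: the nesting follows the prose description rather than the published schema, XML namespaces and attributes such as objID are omitted, the element name Type for the booking channel type is assumed, and the function build_cancel_rq is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_cancel_rq(reservation_id, requestor_id, company, channel_type,
                    prefix, given, surname):
    """Assemble a simplified CancelRQ tree (no namespaces, no schema validation)."""
    root = ET.Element(
        "CancelRQ",
        {"timeStamp": "2019-10-31T08:20:00+01:00", "version": "2"},
    )

    # POS/Source: who sends the request and through which booking channel.
    source = ET.SubElement(ET.SubElement(root, "POS"), "Source")
    detail = ET.SubElement(source, "RequestorIDDetail")
    ET.SubElement(detail, "Identifier").text = requestor_id
    channel = ET.SubElement(source, "BookingChannel")
    ET.SubElement(channel, "CompanyName").text = company
    ET.SubElement(channel, "Type").text = channel_type  # element name is assumed

    # ReservationIdentifier: the reservation that is to be canceled.
    reservation = ET.SubElement(root, "ReservationIdentifier")
    ET.SubElement(reservation, "Identifier").text = reservation_id

    # Verification: data used to check that the right reservation is canceled.
    verification = ET.SubElement(root, "Verification")
    person = ET.SubElement(verification, "PersonNameSummary")
    ET.SubElement(person, "Prefix").text = prefix
    ET.SubElement(person, "Given").text = given
    ET.SubElement(person, "Surname").text = surname
    return root


cancel_rq = build_cancel_rq(
    reservation_id="B_2018019123456",
    requestor_id="RefOrg12345",
    company="TU Wien",
    channel_type="Internet",
    prefix="Mr.",
    given="Hugo",
    surname="Heuschreck",
)
print(ET.tostring(cancel_rq, encoding="unicode"))
```

A production implementation would instead generate or validate the message against the official OpenTravel schema files, which define the exact element order, namespaces, and attribute names.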
Message Transport

Another crucial point in the exchange of business documents is the transport of the messages from the sender to the receiver. Whereas SMEs may consider it appropriate to exchange documents as e-mail attachments, e-mail is a no-go for larger companies, since the exchange and processing of messages must be automated and as failure-proof as possible. The latter means, for example, that the transport protocol must guarantee the non-repudiation of receipt of a message.
EDIFACT Messaging

UN/EDIFACT messages are purely text-based. Thus, many transport protocols may be used for the message transfer. In practice, transmission via X.400 or AS2 is most common for EDIFACT messaging. The former is a dedicated mailbox network that has existed since the 1980s and is operated in parallel to the Internet. In the X.400 network, message traffic is still billed per kilobyte today and is therefore considered expensive. AS2 is an HTTP-based transport protocol for EDI messages between two trading partners. Due to high maintenance costs, AS2 connections are usually only set up for large message volumes. This is where an EDI service provider may help SMEs in particular: it collects the message from the SME’s system via SFTP or via its own client module (see Fig. 2) and delivers the message, after conversion, to the trading partner via the desired transport protocol. On the receiver side, the service provider receives the messages, converts them, and delivers them in the in-house format to the SME so that they can be automatically imported into its information system.
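Because UN/EDIFACT messages are plain text, they can be taken apart with very little code once they have been transported. The following Python sketch splits a text into segments, data elements, and component data elements. It assumes the default service characters of the EDIFACT syntax (segment terminator ', data element separator +, component separator :, release character ?), ignores a UNA service string advice that could override them, and uses a short hypothetical sample that is not a complete, valid interchange.

```python
def parse_edifact(text, seg_term="'", elem_sep="+", comp_sep=":", release="?"):
    """Split EDIFACT text into segments -> data elements -> components.

    Returns a list of segments; each segment is a list of data elements;
    each data element is a list of component strings.
    """
    segments = []
    elements = [[""]]  # the segment currently being read
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            # The release character makes the next character literal data.
            elements[-1][-1] += text[i + 1]
            i += 2
            continue
        if ch == comp_sep:
            elements[-1].append("")      # start the next component
        elif ch == elem_sep:
            elements.append([""])        # start the next data element
        elif ch == seg_term:
            segments.append(elements)    # the segment is complete
            elements = [[""]]
        elif ch not in "\r\n":           # line breaks between segments are skipped
            elements[-1][-1] += ch
        i += 1
    return segments


# Hypothetical two-segment sample; "?+" is an escaped plus sign inside a name.
sample = "UNH+1+ORDERS:D:96A:UN'\nNAD+BY+HOTEL?+SPA::9'"
for segment in parse_edifact(sample):
    print(segment[0][0], segment[1:])
```

The sketch covers only the syntax level. Interpreting the segments, that is, knowing what the third data element of a NAD segment means for a particular partner, is the hard part of EDI and is not addressed here.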
OTA Messaging

In the beginning, the standardization work of OTA did not address messaging protocols. However, with the increasing use of the OTA XML schemas in a Web Services-based environment, the need for appropriate guidelines for the message transport became evident. Naively, one may think that if two parties agree on using SOAP over HTTP for message exchanges, interoperability on the messaging level will be guaranteed. Unfortunately, this is not the case. Owing to the dual purpose of SOAP (RPC vs. messaging) and the flexible and variable SOAP structure, very different SOAP structures may be used to exchange OTA messages. Accordingly, the OTA specifications today come with guidelines on how to use HTTP, SOAP, and WSDL. These guidelines help to implement OTA messaging between different parties in a consistent manner. The goal is that any parties that follow the guidelines of the OTA specification should be able to interoperate without implementing any changes on the message or messaging protocol level. Consequently, the OTA specification attempts to deliver the simplest method of transferring OTA messages over SOAP. The most basic rule specifies that the payload (i.e., the message content) is a conforming OTA message and that it is the only, immediate child of the SOAP Body element. For a detailed discussion of all the guidelines regarding SOAP-based messages, the interested reader is referred to the section on Open Travel Message Transport of the Message Users Guide (OTA 2017).
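The basic rule that the OTA message is the only, immediate child of the SOAP Body can be illustrated with a short Python sketch. It assumes the SOAP 1.1 envelope namespace and reuses the reservation namespace from Code Listing 1; the helper names wrap_in_soap and payload_is_sole_body_child are hypothetical, and the sketch does not cover the HTTP and WSDL guidelines of the OTA specification.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 (assumed version)
OTA_NS = "http://www.opentravel.org/OTM/reservation/v2"  # from Code Listing 1


def wrap_in_soap(payload):
    """Place the OTA message directly inside the SOAP Body, with no wrapper element."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


def payload_is_sole_body_child(envelope):
    """Check the rule: exactly one child in Body, namely the OTA message."""
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    return body is not None and len(body) == 1


cancel_rq = ET.Element(f"{{{OTA_NS}}}CancelRQ", {"version": "2"})
envelope = wrap_in_soap(cancel_rq)
print(payload_is_sole_body_child(envelope))
print(ET.tostring(envelope, encoding="unicode"))
```

An RPC-style binding would typically insert an operation element between Body and the payload, which is exactly the kind of structural variation the OTA guidelines aim to rule out.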
Parties Involved in the Data Exchange

As described above, EDI service providers may be involved to transport the messages, but they do not play a mandatory role. Accordingly, the number of parties involved varies and different interaction patterns result. Before two companies start exchanging data electronically, it is important to find a suitable interaction pattern in which electronic messages are exchanged as easily as possible.
The simplest model of interaction is the direct connection between two companies (two-corner model). Common protocols for such a connection are SOAP/HTTP, REST/JSON, SFTP, AS2, SMTP/IMAP, etc. In the two-corner model, no service provider participates in the transfer, and the data is exchanged directly between the companies. In practice, this method is only preferred by large companies that have their own EDI departments with local solutions for conversion and message routing. For small and medium-sized companies, this method is not suitable because expensive EDI solutions have to be purchased and the entire EDI administration has to be handled by the company itself. The two-corner model also scales poorly, since a separate connection must be set up for each additional company that is to be connected.

A further development of the two-corner model is the three-corner model, in which a service provider is used. The companies have their own connection to the service provider (e.g., via SFTP, AS2) and exchange EDI data with other partners via this connection. The use of a service provider allows synergies to be exploited: there is no need to maintain a separate EDI connection to each individual partner, because a single line to the service provider is sufficient to send and receive data. In addition, the service provider can perform data conversions, archive data, and apply signatures. In telephone-network terms, the three-corner model resembles the large national telephone providers of the past, which handled all fixed lines. From an economic point of view, the three-corner model is problematic insofar as the provider can develop a monopoly position.

In contrast to the three-corner model, the four-corner model uses at least two service providers. Every company has a connection to a service provider and exchanges messages via this provider.
If data is exchanged with another company that is not with the same service provider, the sender’s service provider guarantees the correct delivery to the service provider of the other company. This process is known as roaming. Seamless communication is possible through appropriate roaming agreements between the providers. Because the individual providers compete with each other, prices are determined by the market. In the end, companies benefit from lower prices under the four-corner model for two reasons: On the one hand, competition between providers leads to a price reduction. On the other hand, providers try to keep customers in their own networks and offer additional price incentives for sending and receiving messages to and from partners in their own network.
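The scaling argument behind the three models can be made concrete with a small calculation. The Python sketch below counts the connections that have to be maintained under two simplifying assumptions that are not part of the text: every company trades with every other company, and every pair of providers has a roaming agreement. The function names are hypothetical.

```python
def two_corner_links(companies):
    """Direct connections: one per pair of trading companies."""
    return companies * (companies - 1) // 2


def three_corner_links(companies):
    """One connection per company to the single service provider."""
    return companies


def four_corner_links(companies, providers):
    """One connection per company plus one roaming link per pair of providers."""
    return companies + providers * (providers - 1) // 2


n, p = 50, 3
print(two_corner_links(n))      # 1225
print(three_corner_links(n))    # 50
print(four_corner_links(n, p))  # 53
```

Under these assumptions, the number of direct connections grows quadratically with the number of companies, whereas the provider-based models grow linearly. The four-corner model adds only a handful of roaming links while avoiding dependence on a single provider.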
The Role of the Service Provider

The EDI service provider supports message transport from A to B as explained in the previous section. However, an EDI service provider offers a much broader range of services. Therefore, the typical activities of a service provider are presented below.
EDI service providers operate networks that are used to transmit structured electronic documents such as UN/EDIFACT, CSV, and XML. The most important services are typically:

• Routing: This refers to the delivery of electronic messages from a sender to a recipient.
• Roaming: If the sender and the receiver do not have the same EDI provider, the messages are forwarded according to the four-corner model by routing across different providers.
• Conversion: During a conversion, a source format (e.g., CSV) is translated into a target format (e.g., OTA). This is required when sender and receiver do not support the same document formats or document standards.
• Signature: Even if there is no legal requirement in most countries, some recipients still insist that an electronic message is electronically signed. This is a technical guarantee of the authenticity of the sender and the integrity of the content.
• Archiving: Evidently, electronic documents must be stored and accessible later at any time. This is due to both external (e.g., due to tax regulations) and internal (e.g., to enable later evaluations of the exchanged documents) regulations.

In addition, the EDI service provider offers consulting services. The service provider helps in the analysis of the status quo. The primary objective here is to evaluate which suppliers/customers are to be reached with EDI and, if so, which EDI standards they use. Furthermore, the EDI service provider checks the EDI capability of the software currently used in the company. EDI data is typically generated and processed by the in-house IT system. Therefore, it is essential that this software is able to send and receive structured documents. Without this basic prerequisite, the use of EDI is not possible. Another issue concerns conversion.
The service provider evaluates which option is more cost-effective: purchasing local conversion software that translates between the format of the in-house system and the required EDI format of the business partners, or having the conversion carried out by the EDI service provider. The purchase of local converter software typically only pays for itself with a large volume of documents and a large number of (connected) business partners. The service provider also facilitates the development of a company’s strategy for internal master data management. Successful EDI projects rely on the exchange of information using commonly agreed codes with well-defined business semantics and unique identifiers. Thus, the acquisition of industry-wide identifications for companies, products, services, etc. is strongly recommended. Furthermore, the EDI service provider contacts the responsible EDI contacts of the customers and suppliers and carries out the so-called EDI onboarding. Onboarding is the process of coordinating the document standards used by companies and setting up appropriate conversions where necessary. Parallel to the coordination
of the document formats, the technical connections to the suppliers and customers are established. Once the technical infrastructure has been set up and the coordination on the document standards has been completed, dedicated test messages are exchanged for the first time. Tests are performed to check whether the data sent can be processed correctly by the recipient. As part of the test operations, paper documents and electronic documents are often exchanged in parallel: in addition to paper documents, electronic messages are exchanged for a certain period of time. The recipient then compares the paper document with the electronically received data in his/her system. If deviations are detected, the document mapping, for example, can still be improved. If no problems occurred during the test phase, a switch to productive operation takes place. Here, the benefits of EDI now become fully evident: Electronic documents are exchanged between companies fully automatically and without human interaction.

Another essential service typically offered by an EDI service provider is error handling and troubleshooting. Typical error situations include the following: the system of the business partner is unavailable due to a hardware failure or an internal corporate network failure; electronic messages cannot be delivered or received; messages are exported incorrectly by the sender and are therefore not accepted by the receiver; and messages are sent with incorrect identifications. In these cases, the EDI service provider offers proactive support, reports the cause of the error, and assists in troubleshooting.
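The conversion service mentioned above, translating a source format such as CSV into an XML target format, can be sketched in a few lines of Python. The example is deliberately simplified: the target structure (a Reservations root with one Reservation element per row) is a hypothetical in-house XML format rather than an OTA schema, the column names are assumed to be valid XML element names, and the function csv_to_xml is not taken from any EDI product.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="Reservations", row_tag="Reservation"):
    """Translate CSV rows into XML elements, one child element per column."""
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = ET.SubElement(root, row_tag)
        for column, value in row.items():
            # Assumption: each CSV header is a valid XML element name.
            ET.SubElement(item, column).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical export of an in-house reservation system.
source = "id,guest,nights\nB_1,Hugo,2\n"
print(csv_to_xml(source))
```

Real converters additionally map codes and identifiers between the partners, which is why agreed master data is a prerequisite for a working conversion.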
Classic EDI and WebEDI

Another offer of an EDI service provider that is relevant for many SMEs is a WebEDI solution. In the following, we introduce the concept of WebEDI and how it differs from classic EDI.
Classic EDI

Today, large enterprises prefer to execute business transactions electronically, based on data in a structured form so that it can be processed by IT systems automatically without human interaction. The advantages for (large) companies in comparison to paper-based document exchange are obvious, as described before. If the other company is also a large enterprise, classic electronic data exchange is nowadays the common solution in most cases. The operation of an EDI solution requires an in-house IT system that is able to read and generate structured data. The second essential component is the transformation from the internal data of the company to the exchanged data format and vice versa. Although small and medium enterprises (SMEs) often have an
in-house IT system, they are not able to perform the data transformation and, thus, cannot connect to a classic EDI system. Therefore, these companies are unable to send and receive structured data in the format/standard required by large enterprises. The result is a stalemate. The large company would like the data to be in a structured form, but the SME cannot deliver the data in the desired format because it does not have the necessary infrastructure.
WebEDI

In order to enable SMEs to receive and send EDI data without having to purchase expensive EDI converters or to expand their in-house IT systems, a web-based environment is an alternative solution. This is often referred to as WebEDI (Beck et al. 2003). The basic principle is very simple. The SME logs into a WebEDI system in a web browser and can send and receive structured files via this system. No special EDI know-how is required for sending or receiving data. When sending a document, the data is entered using a form in the WebEDI application. The WebEDI system checks the data for completeness, converts it into the required EDI format, and delivers it to the business partner. On receipt of a document, the data received in EDI format is translated into a human-readable presentation format and displayed to the SME in the WebEDI system. Depending on the application scenario, different EDI document types are used in a WebEDI solution. The exact specification of which document types are mandatory and which can be used optionally depends on the respective business partner with which the SME maintains a WebEDI relationship.

An essential feature of a WebEDI application is the so-called turnaround process. This means that the WebEDI application maintains the state of the business transaction and prefills some of the form fields of the next transactional step; some of them may even be unmodifiable. This guarantees better data quality for the business partner and relieves the SME of the burden of completing the same form fields over and over again. We illustrate the turnaround process with a simple example. A large company reserves a room at a small hotel and sends a reservation message. Since the SME does not necessarily log into the WebEDI system on a daily basis (unlike an e-mail inbox, which is often checked daily), the SME is notified either by e-mail or by SMS when a new message is received from the large company.
In the next step, the SME may confirm the reservation by means of a reservation response message. In this case, the data is taken from the reservation message and copied into the reservation response message. The SME therefore does not have to copy or manually enter the data of the requested room, but only inserts the additional data required for the booking confirmation. In most cases, this is only the confirmation identifier. Evidently, the data is not only copied in the case of request and response pairs, but is maintained over the whole transaction, for example, to guarantee that an invoice can only include items for which a valid reservation exists.
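The turnaround process described above can be sketched as follows in Python. The data model is hypothetical: the field names, the classes ReservationRequest and ReservationResponse, and the helpers confirm and may_invoice are assumptions made for illustration and do not correspond to a particular WebEDI product or OTA message.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReservationRequest:
    reservation_id: str
    guest_name: str
    room_type: str
    arrival: str
    nights: int


@dataclass(frozen=True)
class ReservationResponse:
    # The first five fields are prefilled from the request and cannot be edited.
    reservation_id: str
    guest_name: str
    room_type: str
    arrival: str
    nights: int
    confirmation_id: str  # the only value the SME has to enter


def confirm(request, confirmation_id):
    """Turn a request into a response by copying its data and adding the confirmation."""
    if not confirmation_id:
        raise ValueError("a confirmation identifier is required")
    return ReservationResponse(**asdict(request), confirmation_id=confirmation_id)


def may_invoice(reservation_id, confirmed_responses):
    """An invoice item is only allowed for a reservation that has been confirmed."""
    return any(r.reservation_id == reservation_id for r in confirmed_responses)


request = ReservationRequest("B_1", "Hugo Heuschreck", "single", "2019-11-04", 2)
response = confirm(request, "CONF-0001")
print(response)
print(may_invoice("B_1", [response]), may_invoice("B_2", [response]))
```

Marking both classes as frozen mirrors the unmodifiable form fields: once copied, the request data cannot be altered in the response, which is what protects the data quality for the business partner.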
A Comparison

In principle, a small or medium-sized supplier is always free to use a WebEDI solution when executing business transactions with a larger company, in order to save costs when introducing EDI. As long as the larger trading partner receives the data in the correct EDI format (which is ensured by the WebEDI solution), no problems will arise when using WebEDI. With WebEDI, however, seamless integration is not possible, since two systems coexist independently of each other on the SME’s side: the in-house IT system (e.g., the property management or reservation system) and the WebEDI system. A WebEDI solution thus introduces a second document management solution that is operated decoupled from the in-house solution. Since automatic document transfer from the WebEDI system to the in-house system and vice versa is not possible, the documents must be manually reconciled between the systems by a user. Thus, the use of WebEDI becomes uneconomical in the case of a very high volume of documents, and a classic EDI solution should be preferred. In general, only a classic EDI solution can realize the full potential of electronic data exchange. However, there are situations in which the use of WebEDI pays off, namely, when only very few documents are to be exchanged with a few business partners via EDI and all other processes continue to be operated unchanged on a paper or PDF basis. Furthermore, if the in-house solution does not provide any import or export interfaces, which are prerequisites for classic EDI, a WebEDI solution must be considered.
EDI Projects

On the basis of the previous sections, we summarize the considerations that should be made when introducing EDI in a company. Careful planning of such EDI deployment projects is the first key to success. EDI projects are often subject to delays because they are not sufficiently planned and technical details are therefore forgotten. EDI is usually only introduced once in a company, and the large number of partners and technologies involved makes a new EDI system a challenge. Whereas an internal company project requires coordination across departments, an EDI project also requires coordination across different companies. When planning an EDI project, the following five points must be considered:

Project management: It may sound trivial, but an EDI project needs project management with the appropriate technical skills. In many companies, especially in smaller ones, EDI processing is often transferred to the sales department or the secretariat, which is certainly not appropriate. In order to ensure a smooth
execution of the project, project management should be entrusted to someone who is sufficiently familiar with the technology. If nobody is available in the company itself, an external EDI service provider may act as the technical contact for the larger trading partner and coordinate the project.

Master data: Before EDI data can be exchanged, the master data of the companies involved must be exchanged. This applies at least to the identification of the companies and their branches and, in particular, to the identification of the products/services exchanged. Unfortunately, companies often do not pay enough attention to master data management, which quite often delays the start of testing of the EDI system.

Data import and export: The first step is to clarify whether the software used in the company is capable of importing and exporting structured files (e.g., in the form of TXT, XML, CSV). This data can then be processed by an EDI service provider and translated into the desired target formats of the trading partners and vice versa. If data can be imported and exported by the software, then classic EDI can be used, in which the data exchange between the systems takes place fully automatically. If the software used is not able to import and export structured data, either new software must be purchased or a WebEDI solution must be used.

Exchange protocol and exchange standard: In order for EDI data to get from A to B, a technical connection must be established between the partners involved. The choice of protocol usually depends on which protocols are supported by the larger trading partner. The coordination with the larger trading partner as well as the test of the connection is commonly carried out by the service provider. The format in which the data is exchanged and the subset of the corresponding standard are usually not bilaterally agreed, but specified by the larger trading partner.
The challenge for a smaller business is to cope with different requirements from larger business partners. In the case of classic EDI, a suitable EDI converter is used to translate the internal company format into the target format of the recipient (and vice versa). This converter is operated by the company itself, or in many cases the conversion is done by the EDI service provider. In the case of a web-based solution, no data conversion is required, but a second system, i.e., the WebEDI application, is in place. To test the solution, whichever variant is chosen, the trading partner(s) usually provide test messages on request, on the basis of which the response messages are then created. These response messages must be accepted by the trading partner(s) to guarantee a smooth EDI process.

Implementation of the process: Once the technical details are clarified and the test messages have been accepted, the productive transmission of EDI messages begins. In most cases, the start is in the form of a parallel operation, i.e., paper documents are still exchanged as a kind of backup in addition to EDI messages. This enables verification and control of the correct processing of EDI messages. If no more errors are detected, the paper documents can be omitted. During operations,
the ongoing exchange of data must be consistently monitored so that errors can be quickly counteracted.
The Core Issues and Future Trends

When XML appeared in the late 1990s, it attracted a lot of attention in the business document standardization community. The XML evangelists proclaimed that it would be the solution to all existing problems in electronic data interchange and that it would provide low-cost EDI solutions for SMEs. However, XML is just a syntax and, thus, unable to solve semantic problems. One of its biggest advantages is that there is much more tool support for XML than for UN/EDIFACT and that XML know-how is easier to find on the labor market than UN/EDIFACT know-how. The fact that XML is easier to read than UN/EDIFACT is not really an advantage, because these documents are simply not meant to be read by humans. When reading Sections 2 and 3, one may find OTA as an XML-based standard easier to understand than UN/EDIFACT, but still it is not rocket science to develop, e.g., a parser for UN/EDIFACT.

The most significant problem in EDI is the variability in the interpretations of the standard (Huemer et al. 2013). In other words, if company A and company B both support the same EDI standard, this does not mean that they are ready to exchange their business documents. The smaller issue is that they have to establish a connection between them as described in Section 4. More important is the fact that the design of any message is so extensive that no single company would be able to process all data that may appear in an EDI message. The business document standards are more or less designed as reference documents that have to be customized for partner-specific implementations. The “standard” covers the union of all elements that may be required by any enterprise for the corresponding business function. This reference model of the “standard” is then restricted by a so-called message implementation guide (MIG) to the specific needs of a particular partnership. This subset usually covers only a small percentage of the “standard” reference model.
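The relationship between a “standard” and a MIG can be pictured as a subset relation over the elements a message may contain. The Python sketch below uses a deliberately tiny, hypothetical element set (the names are invented for illustration and do not come from UN/EDIFACT or OTA) to show why a message that is valid with respect to the standard may still be rejected by a partner.

```python
# Hypothetical "standard": the union of everything any partner might need.
STANDARD = {"reservationId", "guestName", "arrival", "nights", "roomType",
            "loyaltyNumber", "mealPlan", "costCenter", "vatNumber"}

# Hypothetical MIG of one large partner: a small, partly mandatory subset.
MIG_PARTNER_A = {
    "required": {"reservationId", "guestName", "arrival", "nights"},
    "optional": {"costCenter"},
}


def check_against_mig(message, mig):
    """Return a list of violations of the partner-specific MIG."""
    present = set(message)
    allowed = mig["required"] | mig["optional"]
    errors = [f"missing required element: {name}"
              for name in sorted(mig["required"] - present)]
    errors += [f"element not permitted by MIG: {name}"
               for name in sorted(present - allowed)]
    return errors


# The MIG must itself be a subset of the standard.
assert MIG_PARTNER_A["required"] | MIG_PARTNER_A["optional"] <= STANDARD

message = {"reservationId": "B_1", "guestName": "Hugo Heuschreck",
           "arrival": "2019-11-04", "mealPlan": "breakfast"}
for problem in check_against_mig(message, MIG_PARTNER_A):
    print(problem)
```

The sample message uses only elements of the standard, yet it violates this MIG twice: nights is missing and mealPlan is not permitted. A supplier trading with several large partners has to pass a different check of this kind for each of them.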
Large corporations use their economic power to enforce a particular MIG on their smaller partners. This leads to the paradoxical situation that larger corporations can usually stick to a single MIG implementation, whereas smaller ones trading with multiple larger corporations have to bear the costs of implementing multiple MIGs. A solution for seamless data exchange that does not require implementing different dialects of the same standard is still an open research issue. The appearance of the Semantic Web triggered some corresponding effort (Dell’Erba et al. 2005; Kaukal et al. 2000; Höpken 2000). Although Semantic Web approaches are certainly a step in the right direction, their success has been limited by the rigid structures of legacy in-house IT systems. Another issue in EDI is the fact that existing EDI standards were developed for a specific syntax only. This means UN/EDIFACT messages were designed for the EDIFACT syntax and OTA messages were defined as XML schemas. The UN/CEFACT community realized the disadvantage of this approach as soon as XML
appeared, because it meant restarting the standardization of the document types for another syntax. Thus, during the ebXML initiative (Hofreiter et al. 2002), UN/CEFACT recommended developing transfer-syntax-neutral document standards and binding the resulting models to different transfer syntaxes. The UN/CEFACT core components were started with this assumption in mind. However, most XML-based standardization efforts did not worry much about this problem (Huemer 2000), because for a while there was no real alternative to XML as an exchange format. When JSON became a well-accepted format for REST-based interfaces, the need for transfer-syntax-neutral business document standards became an issue again. OTA also changed its approach in OTA 2.0 (OTA 2018) and is now focusing on the standardization of object models that may be expressed in XML and JSON. So even if major work in EDI was done some years ago, there are still open issues to be solved. If you think that EDI is already somewhat outdated, read the article EDI Is Cool Again on Forbes (Banker 2019).
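The idea of a transfer-syntax-neutral model bound to several syntaxes can be illustrated with a minimal Python sketch. The Cancellation class and its fields are hypothetical and far simpler than any OTA 2.0 object model; the point is only that the same object is serialized once as JSON and once as XML.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class Cancellation:
    """Syntax-neutral model: it says what is exchanged, not how it is encoded."""
    reservation_id: str
    surname: str


def to_json(obj):
    """Bind the model to JSON."""
    return json.dumps(asdict(obj), sort_keys=True)


def to_xml(obj):
    """Bind the same model to XML, one child element per field."""
    root = ET.Element(type(obj).__name__)
    for name, value in asdict(obj).items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


cancellation = Cancellation("B_2018019123456", "Heuschreck")
print(to_json(cancellation))
print(to_xml(cancellation))
```

If a further transfer syntax becomes popular, only another binding function has to be added, while the model and its agreed semantics remain untouched.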
Cross-References

Advanced Web Technologies and E-Tourism Web Applications
Semantic Web Empowered E-Tourism
Web Information Retrieval and Search
References Banker S (2019) Edi is cool again. Forbes. https://www.forbes.com/sites/stevebanker/2019/08/07/ edi-is-cool-again/?sh=c6566bc66207 Barrett AP (1995) Interactive EDI – IT and commerce in the 21st century. In: Fifth IEEE conference on telecommunications 1995, pp 164–169. https://doi.org/10.1049/cp:19950134 Beck R, Weitzel T, König W (2003) The Myth of WebEDI. In: Monteiro JL, Swatman PMC, Tavares LV (eds) Towards the Knowledge Society. IFIP – The International Federation for Information Processing, vol 105. Springer, Boston, MA, pp 585–599. https://doi.org/10.1007/ 978-0-387-35617-4_38 Berge J (1994) The EDIFACT standards. Blackwell Publishers, Inc., Oxford Colberg TP, Gardner NW, Horan K, McGinnis D, McLauchlin P, So YH (1995) The Price Waterhouse EDI handbook. Wiley, New York Dell’Erba M, Fodor O, Höpken W, Werthner H (2005) Exploiting semantic web technologies for harmonizing E-markets. Inf Technol Tour 7(3/4):201–220 Hill NC, Ferguson DM (1989) Electronic data interchange: a definition and perspective. EDI Forum J Electron Data Interchange 1:5–12 Hofreiter B, Huemer C, Klas W (2002) ebXML: status, research issues, and obstacles. In: 12th international workshop on research issues in data engineering: engineering E-commerce/ E-business systems, RIDE’02, San Jose, 24–25 Feb 2002, pp 7–16. https://doi.org/10.1109/ RIDE.2002.995093 Höpken W (2000) Reference model of an electronic tourism market. In: Fesenmaier DR, Klein S, Buhalis D (eds) Information and communication technologies in tourism 2000. Springer, Wien, pp 265–274 Huemer C (2000) XML vs. UN/EDIFACT or Flexibility vs. Standardization. In: 13th Bled electronic commerce conference, Bled
Huemer C, Zapletal M, Liegl P (2013) Crossing the boundaries: e-invoicing/e-procurement as native ERP features. In: Novel methods and technologies for enterprise information systems, ERP Future 2013 conference, Vienna, Nov 2013, Revised Papers, pp 9–17. https://doi.org/10.1007/978-3-319-07055-1_2
International Organization for Standardization (2014) ISO 9735, Electronic data interchange for administration, commerce and transport (EDIFACT) – Application level syntax rules (Syntax version number: 4, Syntax release number: 2). Standard, Geneva
Kaukal M, Höpken W, Werthner H (2000) An approach to enable interoperability in electronic tourism markets. In: Hansen HR, Bichler M, Mahrer H (eds) ECIS 2000 A cyberspace Odyssey – proceedings of the 8th European conference on information systems. Wirtschaftsuniversität Wien, pp 1104–1111
Liegl P, Zapletal M, Pichler C, Strommer M (2010) State-of-the-art in business document standards. In: 2010 8th IEEE international conference on industrial informatics, pp 234–241. https://doi.org/10.1109/INDIN.2010.5549423
OpenTravel Alliance (2010) XML Schema Design Best Practices, Version 3.08
OpenTravel Alliance (2017) Message Users Guide 2017B, Version 1
OpenTravel Alliance (2018) 2018A XML Object Suite Schema Publication Release Notes
Semantic Web Empowered E-Tourism
16
Kevin Angele, Dieter Fensel, Elwin Huaman, Elias Kärle, Oleksandra Panasiuk, Umutcan Şimşek, Ioan Toma, and Alexander Wahler
Contents
Introduction
Semantic Web in a Nutshell
Semantic Web of Content
  Schema Definition Languages
  Annotation Languages
  Ontologies
Semantic Web of Data: Linked Open Data
Semantic Web of Services: Semantic Web Services
  Heavyweight Semantic Web Services
  Lightweight Semantic Web Services
  Illustration: A Hotel Booking Chatbot Based on Schema.org Actions
Knowledge Graph Technology
  Knowledge Creation
  Knowledge Hosting
  Knowledge Curation
  Knowledge Deployment
Use Cases in E-Tourism
  Touristic Chatbots and Intelligent Personal Assistants
  Open Touristic Knowledge Graph
K. Angele
Semantic Technology Institute, University of Innsbruck, Innsbruck, Austria
Onlim GmbH, Telfs, Austria

D. Fensel · E. Huaman · E. Kärle · O. Panasiuk · U. Şimşek
Semantic Technology Institute, University of Innsbruck, Innsbruck, Austria

I. Toma · A. Wahler
Onlim GmbH, Telfs, Austria

© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_22
K. Angele et al.
Conclusions
Expected Future Developments
References
Abstract
Smart speakers such as Alexa and, later, Google Home have introduced Artificial Intelligence (AI) into millions, soon to be billions, of households, making AI an everyday experience. These new communication channels present a new challenge for successful e-Marketing and e-Commerce providers. Data, content, and services are becoming semantically annotated, allowing software agents, so-called bots, to search through the web and understand its content. Nowadays, users typically consult their bot to find, aggregate, and personalize information and to reserve, book, or buy products and services. As a consequence, it is becoming increasingly important for touristic providers of information, products, and services to be prominently visible in these new online channels to ensure their future economic maturity. In our chapter, we survey the methods and tools helping to achieve these goals. The core aim is the development and application of machine-processable (semantic) annotations of content, data, and services, as well as their aggregation in large Knowledge Graphs. It is only through these methods that bots are able to answer a question in a knowledgeable way and organize a useful dialogue. (A significantly extended and generalized version of this article will appear as D. Fensel, K. Angele, E. Huaman, E. Kärle, O. Panasiuk, U. Şimşek, I. Toma, J. Umbrich, and A. Wahler: Knowledge Graphs: Methodology, Tools and Selected Use Cases. Springer Nature, 2020.)
Keywords Smart speakers · Artificial Intelligence (AI) · e-Marketing · e-Commerce · Knowledge graphs · Semantic web · Semantic technologies
Introduction
In Berners-Lee et al. (2001), the authors envisaged a web that is accessed no longer by humans but by bots, which support humans in the fulfilment of their tasks. Content, data, and services must be enriched with machine-processable semantics to be accessible by these bots. We are now on the brink of seeing this vision become reality. Meanwhile, a large body of work has been done to provide rich frameworks for the semantic description of content, data, and services in large, heterogeneous, dynamic, and distributed environments. Furthermore, industrial de-facto standards such as schema.org provide the necessary impact for such approaches. Just as it has become a must for an e-Marketing and e-Commerce provider to have a web site, it is now state of the art to add semantic descriptions to web presences.
As a result of billions of semantic statements on the web, a new area of research has been evolving: so-called knowledge graphs; see Bonatti et al. (2019). They integrate these statements with other data and information sources and build knowledge-based systems using billions and soon trillions of facts. This introduces new requirements for the scalability of traditional knowledge-based technologies and provides significant new opportunities for online interaction with potential customers. Meanwhile, all core players in e-Tourism are using this technology to enrich their service offers with customized information on points of interest, events, and further information of potential interest for their customers. In the end, a customer may not simply want to book a hotel room but is instead searching for an exciting holiday experience with a variety of entertaining and informative aspects and therefore may also look for a room to sleep in. This significantly increases the requirements for scalable and dynamic information integration on a worldwide scale. Ignoring this may lead to a provider being invisible to potential guests. In our chapter, we provide a survey of the available methods and technologies to semantically enable e-Marketing and e-Commerce, with a focus on e-Tourism. We introduce the core and essence of semantic web technology in section “Semantic Web in a Nutshell.” Sections “Semantic Web of Content,” “Semantic Web of Data: Linked Open Data,” and “Semantic Web of Services: Semantic Web Services” elaborate on the different means for annotating content, data, and services, while section “Knowledge Graph Technology” focuses on Knowledge Graphs and discusses their essence, creation, hosting, curation, and deployment. Section “Use Cases in E-Tourism” describes use cases and pilots in the area of e-Tourism, and final conclusions are provided in section “Conclusions.”
Semantic Web in a Nutshell
We are currently at the beginning of a major paradigm shift in accessing and sharing information on the Internet. In fact, this is not the first time that the Internet has drastically changed the way we cooperate and communicate. The Internet began in the 1960s as a local network of four computers in the USA and evolved over the next 20 years into a worldwide computer network. An early paradigm shift for human communication based on it was email, which has provided an instant-based messaging service to a fast-growing number of people. A complementary interaction paradigm began in 1989 based on the work of Tim Berners-Lee. Instead of messaging, the World Wide Web (hereafter simply referred to as the web) is based on publishing information to a large number of potential readers. The web is an information space where documents and other web resources are described by hypertext markup, interlinked by hypertext links, identified by URIs, and can be accessed via the Internet. This combination of hypertext with the Internet was Berners-Lee’s actual innovation. Soon this information space began growing dramatically and crowded out all the competing approaches.

Research on the Semantic Web began in 1996 for two reasons. First, the aim has been to support the web in its nearly infinite scale. As more information is added, more machine support is needed to access relevant information pieces. In Fensel et al. (1997, 2000), we described a semantic web system based on a schema (Ontology), annotations of content (based on an annotation language called HTML-a), and reasoning engines and crawlers to access and process the available information (see also Fig. 1). The second goal was to solve the knowledge acquisition bottleneck and create a brain for humanity (cf. Fensel and Musen 2001). Billions of humans put data, information, and knowledge on this global network for free. Through this, the web mirrors large fractions of human knowledge, and a new brain of humanity based on the knowledge of mankind is generated. Empowered by semantics, computers can access and understand this knowledge. This vision of the Semantic Web has been to build a brain of and for humankind.

Concept Hierarchy
    Object[].
    Person :: Object.
    Employee :: Person.
    AcademicStaff :: Employee.
    Researcher :: AcademicStaff.
    Publication :: Object.

Attribute Definitions
    Person[
      firstName =>> STRING;
      lastName =>> STRING;
      eMail =>> STRING;
      ...
      publication =>> Publication].
    Employee[
      affiliation =>> Organization:
      ...].

Rules
    FORALL Person1, Publication1
      Publication1:Publication[author ->> Person1] <->
      Person1:Person[publication ->> Publication1].

    <html> <body>
    <h2> Welcome on my homepage
    My name is Richard Benjamins.

Fig. 1 An early example of a schema and annotation language

Unfortunately, in the period around the millennium, web search engines arose that chose a different approach toward information access on the web. They based their operations on syntax and statistical analysis, and, in fact, some of them performed quite amazingly in retrieving a suitable list of links for a given keyword. Evidently, the statistical analysis of web resources is enough to provide a fast and excellent index system for the web. The situation changed drastically around 2011 when search engines such as Google tried to reach their next level of service. Originally, the business model was quite simple. Ads on the Google site brought revenue because increasing numbers of users used it as the starting point for their web surfing. After they found an interesting link, they left the Google site and manually extracted information from the websites they visited. This search engine business model was extremely successful but ultimately limited. Users left the Google site as quickly as they had entered it.
Therefore, step by step, Google is aiming to turn from a search engine into a query answering engine (see Guha et al. 2003; Harth et al. 2007). Why point the visitor to other websites? Why not provide the answer to his query directly at the Google website, thus keeping him there and opening new opportunities with him for commercial cooperation? However, this requires more intelligence on the Google side. It must be able to extract exact information from a website based on machine-processable semantics of content and data. In fact, achieving this goal requires more elaborate approaches. As a consequence, around 2011, a coalition of leading search engines began the schema.org initiative, which allows for the injection of semantic annotations in HTML code based on JSON-LD, Microdata, and RDFa syntax. Meanwhile, a mature corpus of types, properties, range restrictions, and enumeration values has been developed, and the uptake is significant. Most important websites are using it. In complementary fashion, Google has developed its Google Knowledge Graph, a knowledge base already containing around 100 billion facts about more than 5 billion entities. These figures offer substantial proof that the knowledge acquisition bottleneck is being bypassed. Furthermore, Google is certainly not the only player in this game. The current hype is focused around chatbots and Intelligent Personal Assistants, which are targeting this new access layer on top of the web. Alexa, Bixby, Cortana, Facebook Messenger, Google Assistant, Siri, and others provide personalized and (spoken) message-based access to information. Clearly this generates new challenges for entities that need to make their content, data, and services visible to potential customers. Just as it was a must 20 years ago to communicate via email and be visible on the web, it is now key for economic success to be firmly present in this new era of dialog-based information access. In the following sections, we will introduce some core technologies underlying these efforts and illustrate their usage in e-Tourism.
Semantic Web of Content
This section introduces the schema definition languages, annotation languages, and predefined vocabularies (Ontologies) used in the Semantic Web. Figure 2 presents the language stack of semantic web languages. On top of the stack is the application, and at the bottom of the stack is the Semantic Web of Linked Data. In the first part of this section, we describe the languages used in the Dictionaries layer. The second part describes the annotation languages themselves, which are part of the Document Types layer. The third part is again focused on the Dictionaries layer, but in contrast to the first part, we describe predefined Ontologies, i.e., vocabularies.
Schema Definition Languages
In this section, we present schema definition languages. We start with RDF and continue with RDFS, a schema definition language for RDF. Subsequently, we present OWL2, SKOS, and RIF.
Fig. 2 Language stack of the Semantic Web. Layers as labeled in the figure: Smart (Cognitive) Applications & Services; Trust; Proof; Unifying Logic (First-Order Logic, FOL); Rules (SWRL, SPIN, R2RML, SHACL); Query (SPARQL, SPASQL); Dictionaries (Ontologies) (RDF, RDFS, OWL, SKOS, Schema.org); Transmission Security (Crypto); Abstract Language (RDF Subject->Predicate->Object Sentences); Sentence Part Identifiers (HTTP IRIs & URIs); Document Types (RDF-NTriples, RDF-Turtle, RDF-XML, RDF-JSON, JSON-LD, others); Semantic Web of Linked Data. (Cited from Language stack of semantic-web https://cdn-images-1.medium.com/max/1600/1*YQ04iyBrbq-VrQwkzCMkkA.png, accessed: 02 Jun 2019)

Resource Description Framework (RDF)
RDF (Manola et al. 2004) is a standard language to interchange semantic data on the web. RDF is a foundation for describing metadata and can be used in many application areas. RDF defines triples with the following structure: a subject (URI or blank node), a property (predicate), and an object (URI, blank node, or literal). An example of such a triple is as follows:

Listing 1 Example of an RDF triple
    Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila
    Subject: "http://www.w3.org/Home/Lassila"
    Predicate: "creator"
    Object: "Ora Lassila"
Because the object of one triple can itself be a URI that appears as the subject of other triples, RDF data forms a graph. The normative XML syntax of RDF is not very easy to use. Therefore, several alternative syntaxes exist for RDF, e.g., Turtle (Beckett et al. 2014), which make it easier to use RDF as an interchange format.
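The triple structure described above can be sketched in a few lines of Python. This is only an illustration and not part of any RDF library: the helper names `iri`, `literal`, and `to_ntriples` are hypothetical, and the Dublin Core IRI chosen for the predicate "creator" is an assumption, since Listing 1 does not give a full predicate URI.

```python
# A minimal sketch: RDF triples as Python tuples, written out in N-Triples syntax.
# Assumption: "creator" is mapped to the Dublin Core element of the same name.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# The statement from Listing 1: Ora Lassila is the creator of the resource.
triples = [
    (iri("http://www.w3.org/Home/Lassila"), iri(DC_CREATOR), literal("Ora Lassila")),
]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


ntriples = to_ntriples(triples)
print(ntriples)
```

Each line of the output is one edge of the graph, which is why line-oriented syntaxes of this kind are convenient for exchanging RDF data between systems.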
RDF-Schema
RDF-Schema (Brickley et al. 2014) is an extension of the basic RDF vocabulary. It is used as a data-modeling vocabulary to represent RDF data. It provides mechanisms for describing groups of related resources and the relationships between resources. These definitions are themselves written in RDF. The example in Fig. 3 shows a definition of two types and some instances. It shows the special RDFS terms rdfs:subClassOf and rdfs:range. RDFS defines a simple Knowledge Representation Formalism for the Semantic Web.

Fig. 3 RDFS example definition with the resources ex:dog1, ex:cat1, ex:cat2, ex:zoo1, ex:cat, and ex:animal, the property zoo:host, the RDF special term rdf:type, and the RDFS special terms rdfs:subClassOf and rdfs:range. (Cited from https://en.wikipedia.org/wiki/RDF_Schema (CC BY-SA 3.0))
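To make the role of rdfs:subClassOf and rdfs:range more concrete, the following Python sketch applies two of the standard RDFS entailment rules to a small set of triples. The resource names are those that appear in Fig. 3, but the exact edges between them are an assumption made for illustration, and the function `rdfs_closure` is a hypothetical helper, not an API of any reasoner.

```python
# A minimal sketch of RDFS inference over a set of (subject, predicate, object) tuples.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
RANGE = "rdfs:range"

# Assumed edges, using the names shown in Fig. 3.
triples = {
    ("ex:cat", SUBCLASS_OF, "ex:animal"),
    ("zoo:host", RANGE, "ex:animal"),
    ("ex:cat1", RDF_TYPE, "ex:cat"),
    ("ex:zoo1", "zoo:host", "ex:cat2"),
}


def rdfs_closure(triples):
    """Apply two RDFS entailment rules until no new triple appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p == RDF_TYPE:
                # Subclass rule: an instance of a class is also an instance
                # of every superclass of that class.
                for cls, q, parent in inferred:
                    if q == SUBCLASS_OF and cls == o:
                        new.add((s, RDF_TYPE, parent))
            # Range rule: the object of a property is an instance of the
            # class declared as the range of that property.
            for prop, q, cls in inferred:
                if q == RANGE and prop == p:
                    new.add((o, RDF_TYPE, cls))
        if new <= inferred:
            return inferred
        inferred |= new


closure = rdfs_closure(triples)
for triple in sorted(closure - triples):
    print(triple)
```

Under these assumptions, the sketch derives that both ex:cat1 and ex:cat2 are of type ex:animal, although neither fact is stated explicitly.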
Web Ontology Language (OWL2)
OWL2 (Beckett et al. 2014) is the recommended language for describing ontologies on the web. The ontologies are exchanged as RDF documents. OWL2 is an extension of OWL, which was published by the W3C Web Ontology Working Group in 2004. The goal of OWL2 is, as with OWL, to make web content more accessible and understandable for machines by applying and customizing Description Logic (Baader et al. 2003). OWL2 is a very expressive language and is therefore difficult to implement and to work with. Furthermore, OWL2 defines different profiles (DL, Full, EL, QL), which makes it hard to support all of them. In particular, the syntax of OWL2 is very complex.

Simple Knowledge Organization System (SKOS)
SKOS (Miles and Bechhofer 2009) is a recommendation used for data sharing: an existing knowledge organization system can be expressed in a machine-readable way. The idea is to exchange data between different computer applications and also to publish the description on the web. The data model of SKOS is defined as an OWL Full ontology. SKOS represents the concept schemes of knowledge organization systems. The following example presents SKOS data expressed as RDF triples. It describes some of the terms from a thesaurus:

Listing 2 SKOS data expressed as RDF triples
    <A> rdf:type skos:Concept ;
      skos:prefLabel "love"@en ;
      skos:altLabel "adoration"@en ;
      skos:broader <B> ;
      skos:inScheme <S> .
    <B> rdf:type skos:Concept ;
      skos:prefLabel "emotion"@en ;
      skos:altLabel "feeling"@en ;
      skos:topConceptOf <S> .
    <S> rdf:type skos:ConceptScheme ;
      dct:title "My First Thesaurus" ;
      skos:hasTopConcept <B> .
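As a rough illustration of how an application might use such thesaurus data, the Python sketch below looks up a concept by any of its labels and follows skos:broader links upward. The identifiers "A", "B", and "S" are placeholders assumed here for the two concepts and the concept scheme, language tags are omitted, and the helper functions are hypothetical.

```python
# A minimal sketch: the thesaurus from Listing 2 as plain tuples.
# Assumption: "A", "B", and "S" stand in for the IRIs of the two concepts
# and of the concept scheme.
TRIPLES = [
    ("A", "rdf:type", "skos:Concept"),
    ("A", "skos:prefLabel", "love"),
    ("A", "skos:altLabel", "adoration"),
    ("A", "skos:broader", "B"),
    ("A", "skos:inScheme", "S"),
    ("B", "rdf:type", "skos:Concept"),
    ("B", "skos:prefLabel", "emotion"),
    ("B", "skos:altLabel", "feeling"),
    ("B", "skos:topConceptOf", "S"),
    ("S", "rdf:type", "skos:ConceptScheme"),
]


def objects(subject, predicate):
    """Return all objects of the triples with the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def find_concept(label):
    """Find a concept by its preferred or alternative label."""
    for s, p, o in TRIPLES:
        if p in ("skos:prefLabel", "skos:altLabel") and o == label:
            return s
    return None


def broader_labels(concept):
    """Collect the preferred labels of all concepts reachable via skos:broader."""
    labels, seen = [], {concept}
    frontier = objects(concept, "skos:broader")
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue  # guard against cycles in the hierarchy
        seen.add(current)
        labels.extend(objects(current, "skos:prefLabel"))
        frontier.extend(objects(current, "skos:broader"))
    return labels


concept = find_concept("adoration")
print(concept, broader_labels(concept))
```

A search for the alternative label "adoration" thus leads to the concept labeled "love" and, via its broader concept, to "emotion", which is the kind of query expansion a thesaurus is meant to support.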
Annotation Languages
In this section, we present the languages used for annotating information on the web. Annotating means describing the available information in such a way that machines can read and interpret it.
Microformats
“Humans first, machines second” is the slogan of microformats (Khare and Çelik 2006). The principles of microformats are that they should solve a specific problem, start as simply as possible, reuse building blocks from existing standards, and be modular and embeddable. Microformats take advantage of existing XHTML facilities to reuse existing web pages for new services and applications. The following example shows a calendar event represented in microformat XHTML:

Listing 3 Example calendar event represented in microformat XHTML. (Cited from Khare and Çelik 2006)
Microformats: What the Hell Are They and Why Should I Care? Ryan King will explain why microformats are important and how you can markup specific kinds of content in ways that make it easier for the right people to find your stuff.
September 25th, 2005, 5
6PM
in the Balder Room
Limitations include the fact that not all things can be represented using HTML tags. For example, the address tag is used for the author of the web site; therefore it cannot be used to represent the location of an event. In that case, you need to use other mechanisms such as vCard to represent an event. In general, this means that you have to combine different standards to describe your information correctly. The latest version is Microformats2, which combines the lessons learned from Microdata and RDFa. It was published in 2012.
Microdata
Microdata is a specification for new HTML attributes that make it possible to embed machine-readable data in HTML documents. The approach is very similar to RDFa, but Microdata is not as expressive. Furthermore, RDFa and Microdata do not offer the same level of internationalization. Essentially, Microdata adds labels to content in a document, so that the content can be interpreted as name-value pairs. The following example shows a definition of the property name:

Listing 4 Microdata definition of name in HTML document. (Cited from Microdata https://www.w3.org/TR/microdata/, accessed: 14 Jun 2019)
    <div itemscope>
      <p>My name is <span itemprop="name">Elizabeth</span>.</p>
    </div>
With itemscope, you can start a new item, and itemprop then defines the name of the property. Limitations arise if you need internationalization to support different languages: the language information itself has to be encoded as Microdata. The downside is that this encoding does not follow an established standard, so it may not be understood by users of the information.
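How a consumer might read such name-value pairs can be sketched with Python's standard html.parser module. The markup string below follows the itemscope/itemprop pattern described above and is an assumption for illustration; the class `MicrodataParser` is a hypothetical helper that ignores nested items and void elements, so it is far from a complete Microdata processor.

```python
# A minimal sketch: collecting itemprop name-value pairs from an HTML fragment.
from html.parser import HTMLParser

# Assumed markup, following the itemscope/itemprop pattern described in the text.
HTML = '<div itemscope><p>My name is <span itemprop="name">Elizabeth</span>.</p></div>'


class MicrodataParser(HTMLParser):
    """Collects the text content of every element that carries an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.items = {}
        self._prop = None   # name of the property currently being read
        self._depth = 0     # nesting depth inside the property element
        self._text = []

    def handle_starttag(self, tag, attrs):
        if self._prop is not None:
            self._depth += 1
            return
        prop = dict(attrs).get("itemprop")
        if prop:
            self._prop, self._depth, self._text = prop, 0, []

    def handle_endtag(self, tag):
        if self._prop is None:
            return
        if self._depth == 0:
            # The property element is closed: store its accumulated text.
            self.items[self._prop] = "".join(self._text).strip()
            self._prop = None
        else:
            self._depth -= 1

    def handle_data(self, data):
        if self._prop is not None:
            self._text.append(data)


parser = MicrodataParser()
parser.feed(HTML)
print(parser.items)
```

The surrounding sentence "My name is ..." is ignored by the parser; only the labeled part of the content becomes data.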
Resource Description Framework in Attributes (RDFa)
RDFa adds structured data to HTML pages directly. It provides a set of markup attributes for machine-readable hints. RDFa 1.0 was specified only for XHTML; version 1.1 is specified for both XHTML and HTML5. The idea is to reuse some existing HTML attributes and to add a few new ones, so that machines that crawl web pages can understand the website. The RDFa design choices replicate an early Semantic Web proposal called HTML-a (Fensel et al. 2000). The following example shows an HTML page with annotations made in RDFa. The property attributes are the RDFa annotations.

Listing 5 HTML page with annotations made in RDFa. (Cited from RDFa https://www.w3.org/TR/xhtml-rdfa-primer/, accessed: 14 Jun 2019)
...
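    <h2 property="http://purl.org/dc/terms/title">The Trouble with Bob</h2>
    <p>Date: <span property="http://purl.org/dc/terms/created">2011-09-10</span></p>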
...
This example annotates the title and the date when it was created so that machines can extract this information.
JavaScript Object Notation in Linked Data (JSON-LD)
JSON-LD is a format to serialize Linked Data based on JSON. With minimal changes, existing JSON can be extended to be interpreted as Linked Data. The design goals of JSON-LD are simplicity, compatibility, expressiveness, and as few edits as possible to enrich normal JSON. With JSON-LD, it is possible to embed links to other pieces of information saved on different sites across the web. As JSON-LD is 100% compatible with JSON, it is well supported. Nowadays, nearly every programming language has a JSON parser; therefore JSON-LD can also be parsed by nearly all of them. The following example shows a definition of a person with name, url, and image. The @id keyword means that the given value should be interpreted as an IRI (Internationalized Resource Identifier).

Listing 6 Example definition of person with name, url and image
    {
      "http://schema.org/name": "Manu Sporny",
      "http://schema.org/url": { "@id": "http://manu.sporny.org/" },
      "http://schema.org/image": { "@id": "http://manu.sporny.org/images/manu.png" }
    }

The benefit of JSON-LD is that it is not only machine-readable and machine-understandable; humans can also easily read JSON-LD definitions. Currently, JSON-LD 1.1 is in development, and the latest editor’s draft of the new version is published by the W3C.
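Because JSON-LD is plain JSON, the example above can be read with any JSON parser. The Python sketch below does so with the standard json module and turns the document into simple triples. It is a deliberately reduced illustration, not the JSON-LD processing algorithm: the function `to_triples` is hypothetical, contexts, nested nodes, and typed values are not handled, and the blank node label `_:b0` is an arbitrary choice.

```python
# A minimal sketch: reading the person example as JSON and listing its triples.
import json

DOCUMENT = """
{
  "http://schema.org/name": "Manu Sporny",
  "http://schema.org/url": { "@id": "http://manu.sporny.org/" },
  "http://schema.org/image": { "@id": "http://manu.sporny.org/images/manu.png" }
}
"""


def to_triples(node):
    """Turn one flat JSON-LD node into (subject, predicate, object) tuples."""
    # Without an "@id" of its own, the node is an anonymous (blank) node.
    subject = node.get("@id", "_:b0")
    triples = []
    for prop, value in node.items():
        if prop.startswith("@"):
            continue  # keywords such as @id are not properties
        if isinstance(value, dict) and "@id" in value:
            # A value of the form {"@id": ...} is a link to another resource.
            triples.append((subject, prop, f"<{value['@id']}>"))
        else:
            # Everything else is treated as a literal value.
            triples.append((subject, prop, json.dumps(value)))
    return triples


triples = to_triples(json.loads(DOCUMENT))
for s, p, o in triples:
    print(s, f"<{p}>", o, ".")
```

The distinction made by @id is visible in the output: the name is a quoted literal, whereas url and image are IRIs that a client could follow to find further data.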
Ontologies In the previous sections, we defined schema and annotation languages. This section is about a predefined set of vocabularies. These vocabularies can be used when annotating information on the web. The benefit of predefined vocabularies is that they provide a common understanding. This means that everyone who uses them has the same understanding of what the annotation means.
Dublin Core
Dublin Core is a lightweight RDFS vocabulary for describing generic metadata. A set of attributes is provided to define a term. A term can consist of a name, label, URI, definition, and type, and additional attributes can be used to describe metadata. As an example, we give the definition of the term format, taken from the DCMI Metadata Terms:
– URI: http://purl.org/dc/elements/1.1/format
– Label: Format
– Definition: The file format, physical medium, or dimensions of the resource.
– Comment: Examples of dimensions include size and duration. Recommended best practice is to use a controlled vocabulary such as the list of Internet Media Types [MIME].
– References: [MIME] http://www.iana.org/assignments/media-types/
– Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
– Note: A second property with the same name as this property has been declared in the dcterms: namespace (http://purl.org/dc/terms/). See the introduction to the document “DCMI Metadata Terms” (/specifications/dublin-core/dcmi-terms/) for an explanation.
Friend of a Friend (FOAF)
FOAF is an ontology for linking people and information. It integrates social networks, representational networks, and information networks. FOAF descriptions are published on the web. We present here an example of a person described with FOAF, taken from the FOAF documentation:

Listing 7 Friend of a friend example of a person
Dan Brickley
This example defines the name, homepage, the openid, and an image for the person.
GoodRelations
GoodRelations presents a data structure for e-Commerce that is industry-neutral, valid across different stages of the value chain, and syntax-neutral. Four entities represent the e-Commerce scenarios: an agent (e.g., a person), an object (e.g., a camcorder), a promise (an offer), and a location (e.g., a store). GoodRelations is a fully fledged Microdata vocabulary. In GoodRelations, only the local part of a property identifier is used, in contrast to Microdata and RDFa. See the following example, taken from the GoodRelations documentation:

Listing 8 GoodRelations example of a Product
Weight: 50 kg
Schema.org
Schema.org provides schemas for structured data on the Internet. They can be used in many different encodings such as RDFa, Microdata, and JSON-LD. The vocabularies cover entities, relationships between entities, and actions.
Fig. 4 Screenshot from schema.org (Schema.org Airport https://schema.org/Airport, accessed: 14 Jun 2019.)
Schema.org provides a mechanism to extend the given set of schemas with one’s own schemas. The screenshot in Fig. 4 shows the schema for an airport. The schema defines the possible properties and their ranges. Schema.org provides schemas for many domains and topics. Given its industrial support, it seems to have become a de-facto standard, but it is also very complex, provides numerous alternatives to be used, and is still fairly incomplete for the specifics of most domains. This makes the usage of schema.org reasonably difficult and is the reason why we have developed Domain Specifications (Şimşek et al. 2018b) as a means to restrict, constrain, and extend schema.org in a domain-specific way; see section “Knowledge Graph Technology” for more details.
Further Types
In the tourism area, there are no globally accepted ontologies. Nevertheless, there are some initiatives trying to establish an ontology to describe touristic information. One initiative is the Harmonise project (Dell’Erba et al. 2003), which proposes an ontology-based mediation and harmonization tool that allows touristic organizations to exchange data while keeping their own data format. Another initiative in this direction is the DACH-KG, which is currently working on a common schema to exchange tourism information in the German-speaking area (Germany, Austria, Switzerland, and South Tyrol (Italy)). The current state of the schema.org extensions is available on GitHub. Many more ontologies are available on the web as part of the Linked Open Data cloud (see the following section). Rather than there being too few of them, there are actually too many, comparable to the human language zoo of more than 6000 languages, which makes interaction and cooperation beyond cultural boundaries so cumbersome.
Semantic Web of Data: Linked Open Data
Linked Open Data, sometimes called the “semantic web of data,” is Linked Data which is published as Open Data. Tim Berners-Lee coined the term Linked Data and defined Linked Open Data as follows: “Linked Open Data (LOD) is Linked Data, which is released under an open license, which does not impede its reuse for free” (Berners-Lee 2006). As the name suggests, the term Linked Open Data combines the two terms Linked Data and Open Data. Before describing what Linked Open Data is and means, we will introduce the concepts of Open Data and Linked Data. Open Data describes all kinds of data which are published under an open license. The Open Data Handbook describes Open Data as “. . . data that can be freely used, re-used and redistributed by anyone – subject only, at most, to the requirement to attribute and share alike” (Dietrich et al. 2009). A common license applied to Open Data is CC-BY-SA, which is the Creative Commons license with the attributes BY, which defines that the author of the data has to be mentioned, and SA, which stands for Share Alike and defines that the license also has to be attached when the data is reused. Linked Data was defined by Berners-Lee in 2006 and is a common way to share data on the web in a machine-readable and machine-understandable way: “With linked data, when you have some of it, you can find other, related, data” (Berners-Lee 2006). Whereas on the web, documents are linked to each other with hyperlinks, in the web of data, data sets are linked to each other. On the web, those links are described in hypertext, whereas in Linked Data, those links are described in RDF. To publish data as Linked Data, the publication must follow four principles:
1. Use URIs as names for things: this principle ensures that things in data sets are identified uniquely.
2. Use HTTP URIs so that people can look up those names: this principle ensures that the information about the thing can be dereferenced on a web page.
K. Angele et al.
3. When someone looks up a URI, provide useful information using the standards (RDF*, SPARQL): this principle ensures that the information provided is also readable and interpretable by machines, since RDF allows/requires the semantic description of the things.
4. Include links to other URIs so that they can discover more things: the last principle puts the “Linked” into Linked Data. Since these links, according to principle 3, are described in RDF, machines can follow them autonomously to find more data on their own.
The step from Linked Data to Linked Open Data does not only imply that the data is open and linked. It goes one step further and defines firstly that the data must be open and then covers the way data is accessed, preferably in nonproprietary formats. Only the last requirement then brings in the linking aspect. The quality of Linked Open Data is depicted in a star rating from one to five stars. The rating criteria build on each other, which means that having four stars implies that the data also matches the three-star criteria and so on:
∗ One star is awarded if the data is provided under an open license. This is a very nontechnical requirement, so the data could, for example, be available as a PDF or an image.
∗∗ Two stars are awarded if the data is available as structured data. This requirement is satisfied if it is machine-readable and has a certain structure, for example, in an Excel sheet.
∗∗∗ Three stars are awarded if the data is also available in a nonproprietary format. Since Excel is a proprietary format, this criterion requires the data to be available as, for example, CSV, JSON, or XML.
∗∗∗∗ Four stars are awarded if URIs are used so that the data can be referenced, as already defined in the Linked Data principles one and two.
∗∗∗∗∗ Five stars are awarded if the data set is linked to other data sets that can provide context (according to Berners-Lee 2015).
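The cumulative nature of the star rating can be illustrated with a minimal Python sketch. The class `DatasetInfo` and its boolean fields are hypothetical names introduced only for this illustration; in practice, deciding whether a criterion is met requires inspecting the license and the published files.

```python
from dataclasses import dataclass


@dataclass
class DatasetInfo:
    """Hypothetical description of a published data set (illustrative fields)."""
    open_license: bool       # e.g., published under CC-BY-SA
    structured: bool         # machine-readable structure, e.g., a spreadsheet
    nonproprietary: bool     # e.g., CSV, JSON, or XML instead of Excel
    uses_uris: bool          # things are identified by (HTTP) URIs
    linked_to_others: bool   # contains links to other data sets


def star_rating(info: DatasetInfo) -> int:
    """Return 0 to 5 stars; a star only counts if all lower stars are met."""
    criteria = [
        info.open_license,
        info.structured,
        info.nonproprietary,
        info.uses_uris,
        info.linked_to_others,
    ]
    stars = 0
    for satisfied in criteria:
        if not satisfied:
            break  # the criteria build on each other
        stars += 1
    return stars


pdf_report = DatasetInfo(True, False, False, False, False)
excel_sheet = DatasetInfo(True, True, False, False, False)
linked_rdf = DatasetInfo(True, True, True, True, True)
print(star_rating(pdf_report), star_rating(excel_sheet), star_rating(linked_rdf))
```

Note that a data set without an open license receives no stars in this sketch, regardless of how well it is structured or linked.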
If a data set satisfies all five stars, it can become part of the “Linked Open Data Cloud” (Bizer et al. 2008). The LOD-cloud (see Fig. 5) is a collection of data sets which are all published according to the five-star criteria. As of March 2019, there are 1,239 data sets in the cloud with 16,147 links. The LOD-cloud project started in 2007, and the first version contained only 12 data sets. The central data set was then, and still is, DBpedia (Auer et al. 2007). The DBpedia project is dedicated to making the data of Wikipedia machine processable by publishing the content of Wikipedia as Linked Open Data. Other important data sets in the LOD-cloud are GeoNames (Maltese and Farazi 2013), a LOD representation of a database containing more than 25 million geographical names; MusicBrainz (Swartz 2002), a comprehensive LOD representation of knowledge about music-related topics; and SNOMED Clinical Terms (Stearns et al. 2001), a large database of health-care terminologies. From the tourism perspective, the TourMISLOD data set contains the linked data encoding of European tourism statistics (Sabou et al. 2013). The
16 Semantic Web Empowered E-Tourism
Fig. 5 The LOD-cloud, as of March 2019, with its 1239 data sets (Linked Open Data Cloud https://www.lod-cloud.net, accessed: 14 Jun 2019)
City Service Development Kit (CitySDK) is a system that collects open data of governments to develop scalable Smart City services (Pereira et al. 2015).
Semantic Web of Services: Semantic Web Services The web contains three main elements, namely, content, data, and services. The Semantic Web that envisions the accommodation of intelligent agents for completing tasks in an automated fashion cannot succeed without the semantic description of these three elements. Therefore, the research need for semantic web services has been identified by the community in the early stages of the Semantic Web (Ankolekar et al. 2002; Fensel and Bussler 2002). Semantic web services can have
interesting application areas in e-Tourism, such as the automated composition of different services (e.g., flight, hotel, tour) in order to book travel arrangements or helping the development of dialogue systems by explicitly providing the behavior of a service that can guide a dialogue to complete a certain task. This section gives an overview of the approaches for semantic web services. The first part of the section describes heavyweight approaches that mainly target services using the Simple Object Access Protocol (SOAP)15 as a messaging protocol and that are mostly used in the internal and external B2B communication of large organizations. The second part reviews the lightweight approaches that mainly target RESTful services,16 i.e., service offers on the web mostly targeting B2C. Finally, we provide a short illustration for semantic web services and their functionalities.
Heavyweight Semantic Web Services The early efforts on semantic web services targeted services that use the Web Service Description Language WSDL17 as a description language and SOAP as a protocol. These approaches mostly offered advanced mechanisms to describe web services in order to enable automated agents to complete web service tasks such as discovery, composition, and invocation automatically through logical reasoning. We consider them heavyweight in terms of supported web service protocol and advanced mechanisms for describing web services semantically. In this section, we introduce these heavyweight approaches. OWL-S uses Web Ontology Language (OWL) to describe web services (Martin et al. 2007). It consists of three main components, namely, the Service Profile, Process Model, and Service Grounding. The Service Profile ontology enables service providers to create a description of some functional (e.g., capabilities) and nonfunctional (e.g., service category, QoS) properties. The Process Model describes how the service will be provided (i.e., behavioral properties) and the conditions and invocation steps to obtain certain outcomes. The third component, Service Grounding, describes the details of how services can be invoked concretely, for instance, in terms of connecting with WSDL components. Semantic Web Services Framework (SWSF)18 is built on top of the experience gained from OWL-S. It aims to create a more expressive framework by using first-order logic (FOL) instead of description logic (DL) for process modeling, which is, however, undecidable. The framework consists of two main elements, namely, The Semantic Web Services Ontology (SWSO) for modeling web services conceptually and The Semantic Web Services Language (SWSL) to express SWSO. In addition, SWSF uses an extended version of the Process Specification Language (PSL) (Gruninger and Menzel 2003) to define the behavioral aspects of the web service. 
In general, it appears to be a valuable academic exercise.
METEOR-S (Patil et al. 2004) extends the existing web and web service technologies with semantics. The core technology it adopts is SAWSDL (Semantic Annotations for WSDL and XML Schema) (Kopecký et al. 2007). The distinguishing feature of METEOR-S is that it supports the full life cycle of web services and provides tools for their design, discovery, composition, and execution.
The Web Service Modeling Framework (WSMF) (Fensel and Bussler 2002) offers a comprehensive decoupled way of automating the whole life cycle of web service consumption. It comprises a conceptual model, a modeling ontology WSMO (Fensel et al. 2006), a structured set of languages WSML (Fensel et al. 2006), and an execution environment WSMX (Fensel et al. 2008). WSMF has four main components for describing web services: (1) ontologies that provide the means for describing the domain of discourse; (2) goals that represent the user’s perspective for the consumption of web services; (3) web service descriptions for describing functional, behavioral, and nonfunctional aspects of web services; and (4) mediators that solve various interoperability problems. The most prominent feature of the WSMF approach is that it provides explicit mechanisms for mediation and for goals, which are kept distinct from web service capabilities.
The Internet Reasoning Service (IRS-II) (Motta et al. 2003) is a framework and infrastructure for the publication, storage, composition, and execution of heterogeneous web services and their semantic descriptions. IRS-II applies UPML (Fensel et al. 1999), a framework for increasing the reusability of knowledge-based systems through modularization, to semantic web services. The IRS-II framework abstracts services as problem-solving methods and matches them with tasks. In that sense, it follows the same lines as the WSMF approach. The most distinctive feature of IRS-II is that it has advanced publication and registry mechanisms.
Lightweight Semantic Web Services In this section, we describe the approaches that mainly target RESTful web services with simpler mechanisms to semantically describe web services. A comprehensive survey of such approaches can be already found in Verborgh et al. (2014). We briefly summarize some prominent approaches in the literature while focusing on the most recent ones. WSMO-Lite (Roman et al. 2015) is a conceptual model for describing the functionality of RESTful web services in a lightweight, bottom-up manner. Unlike the approaches for SOAP services (e.g., OWL-S, WSMO), it does not follow a semantics-first policy, but is connected directly to the syntax of a web service documentation (i.e., HTML file) through a microformat called MicroWSMO. Although it has limitations in terms of expressiveness and description of the behavioral aspects of web services, it provides a minimal model for web service
descriptions to foster interoperability. In fact, WSMO-Lite is also the conceptual model for SAWSDL, a bottom-up approach for annotating WSDL files. RESTDesc (Verborgh et al. 2013) offers a mechanism for describing functional aspects of RESTful web services based on N3Logic (Berners-Lee et al. 2008). The functionality over a resource can be described in terms of pre- and post-conditions. An OPTIONS call to a resource would return the possible actions that can be taken on that resource along with their expected outcomes. A client can then complete certain tasks with a follow-your-nose approach. Hydra (Lanthaler and Gütl 2013) is a RESTful web service documentation approach mainly focused on hypermedia. The main principle of Hydra is that every RESTful web service should be its own machine-readable documentation, meaning that a client can consume a RESTful web service with minimal a priori knowledge (i.e., only an entry point and media types). A Hydra annotated web service operates with JSON-LD format only and benefits from hypermedia over JSON-LD for the description of behavioral aspects, whereas the functional aspects are only limited to annotation of input and output of an operation on a resource. SmartAPI (Zaveri et al. 2017) is a lightweight semantic extension to the OpenAPI Specification.19 It provides a mechanism for the annotation of input and output of an operation on a resource of RESTful web services. Although the semantics are very limited, SmartAPI benefits from the popularity and vast tooling support of the OpenAPI specification. Schema.org actions20 allow RESTful web service publishers to annotate their APIs with semantic annotations. A high-level operation over a resource can be represented with an action (e.g., SearchAction, BuyAction). These operations are then mapped to an HTTP method, and the resource itself is described as the object of the action. 
The relationship between input and output can be provided by connecting the values of object and result properties. An example action is shown in Fig. 6. The SearchAction represents the description of a search operation on a LodgingReservation. The operation is expected to return an offer. The behavioral aspects are represented through potential actions on responses. Given the fact that Bing, Google, Yahoo!, and Yandex are supporting this approach for service description, it may become the de facto standard for describing web services semantically.
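How a client might consume such an action annotation can be sketched in a few lines of Python. The annotation below is a hypothetical example written for this illustration (it is not a copy of Fig. 6): the endpoint URL, the selected properties, and the helper functions `required_inputs` and `build_request` are assumptions. The `-input` suffix and the `EntryPoint` type follow the schema.org actions conventions; the URL template handling is deliberately simplified to plain placeholder substitution.

```python
from urllib.parse import quote

# Hypothetical SearchAction annotation of a booking engine (illustrative only).
search_action = {
    "@context": "https://schema.org",
    "@type": "SearchAction",
    "target": {
        "@type": "EntryPoint",
        "httpMethod": "GET",
        "urlTemplate": "https://booking.example.org/search"
                       "?checkin={checkinTime}&checkout={checkoutTime}",
    },
    "object": {
        "@type": "LodgingReservation",
        "checkinTime-input": "required name=checkinTime",
        "checkoutTime-input": "required name=checkoutTime",
    },
    "result": {"@type": "Offer"},
}


def required_inputs(action):
    """Collect the names of inputs marked as required via '-input' properties."""
    names = []
    for key, spec in action["object"].items():
        if key.endswith("-input") and "required" in spec.split():
            name = key[: -len("-input")]  # default: the property name itself
            for token in spec.split():
                if token.startswith("name="):
                    name = token[len("name="):]
            names.append(name)
    return names


def build_request(action, values):
    """Return (HTTP method, URL) for the action, or fail on missing inputs."""
    missing = [n for n in required_inputs(action) if n not in values]
    if missing:
        raise ValueError(f"missing required inputs: {missing}")
    url = action["target"]["urlTemplate"]
    for name, value in values.items():
        url = url.replace("{" + name + "}", quote(str(value), safe=""))
    return action["target"]["httpMethod"], url


method, url = build_request(
    search_action, {"checkinTime": "2024-07-01", "checkoutTime": "2024-07-03"}
)
print(method, url)
```

The list returned by `required_inputs` is also the kind of information a dialogue system could use to decide which parameters it still has to ask the user for.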
Illustration: A Hotel Booking Chatbot Based on Schema.org Actions The semantic descriptions of RESTful web services provide an opportunity for intelligent agents to be decoupled from the web services they interact with. This even allows for the semiautomated generation of a dialogue system based on the semantic descriptions. The dialogue in Fig. 7 is an interaction between a user and
Fig. 6 A partial annotation of a booking engine cited from Şimşek et al. (2018a)
a semiautomatically generated hotel booking bot based on the action annotation in Fig. 6. A generated “search” intent is triggered when the user reveals her desire to search for a specific hotel. After the required parameters are elicited from the user, the dialogue is dynamically guided by the potential action attached to the response of the first request sent by the bot to the booking service. The action annotations are also used to help conversational agent developers to create natural language sentences for the training of the natural language understanding unit.
Knowledge Graph Technology A “graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense related.” – discrete mathematics definition.21 Strictly speaking, we need to slightly extend this definition to multi-sets since the same object can syntactically and/or semantically appear several times in our graph. This simple definition can be extended in various directions, and we end up with an entire zoo of graph types: simple graphs, undirected versus directed graphs, oriented graphs, mixed graphs, multigraphs, quivers, weighted graphs, half-edge and loose-edge graphs, finite versus infinite graphs, etc. In the semantic web community, the consensus is to use RDF as the formalism for representing a Knowledge Graph. The concept of Knowledge is somewhat hazier. If we return to what Newell et al. (1982) called the knowledge level, then, based on the assumption that an
Fig. 7 An example dialogue between a user and a generated hotel booking dialogue system cited from Şimşek et al. (2018a)
agent follows the principle of rationality (later refined as the concept of bounded rationality (Simon 1957), including the costs for “optimal” decision making), we ascribe knowledge to the agent by perceiving the actions it takes to achieve certain goals. In this sense, knowledge is externally assigned to this agent by an observer. Internally, the “knowledge” is coded at the symbol level. “Beneath the knowledge level resides the symbol level. Whereas the knowledge level is world oriented, namely, that it concerns the environment in which the agent operates, the symbol level is system oriented, in that it includes the mechanisms the agent has available to operate. The knowledge level rationalizes the agent’s behavior, while the symbol level mechanizes the agent’s behavior” (Newell et al. 1982). Obviously, we could interpret the Knowledge Graph in a similar way. An agent has/generates knowledge by interpreting a graph, i.e., by relating its elements to so-called real-world objects and actions. A graph is a specific encoding formalism. If we refine this further, we may want to place the graph on the logical or epistemological level rather than on the implementational level (Brachman 1979). At the implementation level, we have means such as graph-based databases, etc. In fact, Google coined the term Knowledge Graph in 2012 (Singhal 2012) as a means to build a model of the world. Meanwhile, it has become a hype term in the product and service industry. Tourism is one of the most important economic verticals on a worldwide scale, accounting for around 10% of global GDP and total employment in 2017 (Council 2018), although it is not necessarily the most innovative area in general. In tourism, every major player already has a knowledge graph, and thousands of players (such as Destination Management Organizations) need or want one.
The drive for this stems from how increasingly important successful e-Marketing and e-Commerce providers have become in terms of the value distribution in tourism and other areas.
Summarizing the discussion, we can state that Knowledge Graphs are very large semantic nets that integrate various and heterogeneous information sources to represent knowledge about certain domains of discourse. According to Gomez-Perez et al. (2017), Knowledge Graph Technologies in a nutshell consist of:
• “knowledge representation and reasoning (languages, schema, and standard vocabularies),
• knowledge storage (graph databases and repositories),
• knowledge engineering (methodologies, editors, and design patterns),
• (automatic) knowledge learning including schema learning and population.”
Knowledge Graph methods and techniques must additionally reflect the specific focus on very large amounts of instances beyond any traditional knowledge base; see Schultz et al. (2012). We identify the following major steps for a process model (see Fig. 8):
• A traditional knowledge acquisition phase (perhaps better referred to as knowledge engineering) that establishes the core data for a Knowledge Graph (see section “Knowledge Creation”).
• The process to implement this knowledge in a proper storage system, such as a document or graph-based repository (see section “Knowledge Hosting”).
Fig. 8 A process model for Knowledge Graphs
• The knowledge curation process (cf. Paulheim 2017) to establish large Knowledge Graphs of significant coverage and quality. We identify the following activities as substeps of this curation process: Knowledge Evaluation, Cleaning, and Enrichment (see section “Knowledge Curation”).
• Finally, we need to deploy and apply such a Knowledge Graph (see section “Knowledge Deployment”).
Similar process models can be found in Gawriljuk et al. (2016) and Villazon-Terrazas et al. (2017b). Each of the mentioned steps is discussed in detail in the following subsections.
Knowledge Creation The knowledge creation (also referred to as knowledge acquisition) process represents the process of extracting knowledge from domain experts and available data sources, and structuring it and managing established knowledge (Schreiber et al. 2000). The knowledge acquisition process was viewed as one of the most critical aspects in the knowledge engineering process (Studer et al. 1998). Nowadays, the acquisition process is influenced by the web and has become an intense area of research (Gil 2011; Schreiber 2013). There exist different approaches for knowledge acquisition from the web. In Sánchez and Moreno (2006), the authors described the domain-independent learning methodology modeled over multi-agent systems that crawl the web to semiautomatically build an ontology for a given domain according to the user’s interests. In Tandon et al. (2014), the method for automatically constructing a large common sense knowledge base from web contents is described. An overview of existing methods, tools, and techniques for knowledge elicitation, as a sub-process of knowledge acquisition, is given in Shadbolt et al. (2015). The paper describes the problem of knowledge elicitation for knowledge-intensive systems from conventional expert systems through to intelligent tutoring systems, adaptive interfaces, and workflow support tools. The authors discuss the knowledge elicitation and modeling from the perspective of knowledge engineering and in the context of the Semantic Web. Elsewhere, special attention has been paid to information extraction and natural language processing (NLP) technologies (Cambria and White 2014), as well as data mining and machine learning techniques (Silwattananusarn and Tuamsuk 2012; Nickel et al. 2015). There is also interest in the use of ontology learning techniques to create initial ontological structures and develop automatic methods for knowledge extraction for a specific domain (Drumond and Girardi 2008). 
As an example, we present our methodology for semantic annotations; see Fig. 9. The methodology consists of three main parts: (i) the bottom-up part, which describes the steps of the annotation process; (ii) the domain specification modeling; and (iii) the top-down part, which applies the constructed models.
Fig. 9 Methodology for semantic annotation cited from Panasiuk et al. (2018c)
The bottom-up part of the methodology helps to define the domain area, analyze domain entities, detect the format and type of data, select the ontology to represent collected information, map data to schema.org vocabulary, provide and deploy semantic annotations, and evaluate the results. The domain specification modeling focuses on developing domain specification patterns called domain specifications (DSs). A domain specification is an extended subset of types, properties, and ranges (Şimşek et al. 2018b) of schema.org. The goal of a DS is to provide a model of how a concrete domain should be represented in a semantically structured way. The top-down part of the methodology describes how to map newly incoming data to the developed DSs and to develop annotations according to these domain specifications, with no need to carry out the steps of the bottom-up part again. The semantic annotation process requires tool support for manual and semiautomatic editing, for automatic annotation, and for the mapping of external schemas.
Manual editing The annotation process of web content can be done manually via the semantify.it Annotation Editor22 (Kärle et al. 2017). The interface is automatically generated based on the domain specification. To start a new manual annotation, the user selects the domain specification on which the annotation of the document will be based and obtains an appropriate editing interface. Once all the necessary fields are filled out, the user can get the source code of the annotation in JSON-LD format. This source code can either be copied or saved on the semantify.it platform for further use. The annotation editor helps users to annotate their web content and makes the semantic annotation process easier, more complete, and more consistent.
Semi-automatic editing The fields in the editor are filled in by extracting information from the given URL or source file.
If a source file is semi-structured, then the editor will suggest the mapping to JSON-LD by using the mappings
as an approximation based on the training data. If the content is unstructured, approaches for extracting information from a web page can be applied. The information can be extracted from the source web page by tracking the appropriate HTML tags. Some ontology discovery techniques for the tourism domain are discussed in Karoui et al. (2004), and a tree-based technique operating on the document object model (DOM)23 of a web page is described in Gupta et al. (2003). In addition, semantic types and properties can be inferred automatically by a model that has been trained to recognize them (Gupta et al. 2012).
Automatic annotation tools retrieve data from the web using natural language processing (NLP) and machine learning (ML). There are several approaches to extracting knowledge from text and web pages, such as named entity recognition (Mohit 2014), information extraction (Chang et al. 2006), concept mining (Shehata et al. 2009), text mining (Inzalkar and Sharma 2015), etc. Many tools and libraries are available, such as GATE24 for text analysis and language processing; OpenNLP,25 which supports the most common NLP tasks; and RapidMiner26 for data preparation, machine learning, deep learning, text mining, and predictive analytics; see Villazon-Terrazas et al. (2017a). The typical tasks of NLP are described in Moschitti et al. (2017).
Mapping For large and fast-changing data sets to be generated effectively and continuously, however, other methods are required. The data are often provided by different institutions and might come in different formats and use different schemas. To make this data accessible in the Knowledge Graph, we need to transfer it into the format and schema of our knowledge representation formalism (Villazon-Terrazas et al. 2017a). The XLWrap approach generates graph triples from specific cells of a spreadsheet (Langegger and Wöß 2009).
Mapping Master (M2) is a mapping language for converting spreadsheets to OWL (O’Connor et al. 2010). A generic XMLtoRDF tool uses a mapping document (itself an XML document) that links an XML schema to an OWL ontology (Van Deursen et al. 2008). Tripliser27 is a Java library and command-line tool for creating triple graphs from XML. In addition, GRDDL28 translates XML data into RDF. Virtuoso Sponger29 generates Linked Data from a variety of data sources and supports a wide variety of data representation and serialization formats. R2RML30 specifies how to translate relational data into the RDF format. RDF Mapping Language (RML) (Dimou et al. 2014) extends R2RML’s applicability to define mappings of data in other formats. With RML, rules can be expressed that map data with different structures and serializations (e.g., databases, XML, or JSON data sources) to the domain-specific schema.org data model (cf. Şimşek et al. 2019).
The knowledge acquisition process plays an important role in the development of knowledge graphs. It defines the methods and techniques necessary for the semantic annotation of data from various resources and for the extraction of important information from textual descriptions. The tools above help to create and test semantic annotations at the level available to the user.
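The general idea behind such declarative mappings can be sketched independently of any of the tools above. The following Python example is a simplified illustration, not RML or R2RML syntax: the rule format in `MAPPING`, the sample CSV data, and the function `map_rows` are assumptions made for this sketch. It maps the columns of a tabular source to schema.org properties and emits one JSON-LD object per row.

```python
import csv
import io
import json

# Hypothetical source data, as it might be exported by a tourism organization.
SOURCE_CSV = """id,hotel_name,phone,web
h1,Hotel Alpenblick,+43 512 000000,https://alpenblick.example.org
h2,Pension Sonne,,https://sonne.example.org
"""

# Illustrative mapping rules (not RML syntax): CSV column -> schema.org property.
MAPPING = {
    "type": "Hotel",
    "id_template": "https://example.org/hotel/{id}",
    "properties": {
        "hotel_name": "name",
        "phone": "telephone",
        "web": "url",
    },
}


def map_rows(csv_text, mapping):
    """Apply the mapping rules to every CSV row and return JSON-LD objects."""
    annotations = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        node = {
            "@context": "https://schema.org",
            "@type": mapping["type"],
            "@id": mapping["id_template"].format(**row),
        }
        for column, prop in mapping["properties"].items():
            if row.get(column):  # skip empty cells instead of emitting ""
                node[prop] = row[column]
        annotations.append(node)
    return annotations


annotations = map_rows(SOURCE_CSV, MAPPING)
print(json.dumps(annotations[0], indent=2))
```

Keeping the rules in a data structure rather than in code is what allows the same mapping engine to be reused for sources with a different structure; only the rule set changes.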
Knowledge Hosting Knowledge can be represented in different ways. In the context of our survey, we focus on knowledge which is present in RDF format. This means the data is represented in the form of subject-predicate-object triples, using an ontology to describe things semantically. For example, the information Fritz Phantom lives in Innsbruck and is born in 1984 contains three facts and can be expressed in a triple form in the following way:
Listing 9 RDF example Fritz Phantom
    "Fritz Phantom"    is a          Person
    "Fritz Phantom"    lives in      Innsbruck
    "Fritz Phantom"    is born in    1984
If we want to describe those three facts with the ontology of schema.org, it would look like this:
Listing 10 Schema.org example Fritz Phantom
    http://fritz.phantom.com    rdf:type               schema:Person
    http://fritz.phantom.com    schema:homeLocation    http://innsbruck.gv.at
    http://fritz.phantom.com    schema:birthDate       "1984"
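To make the triple structure concrete, the three schema.org statements about Fritz Phantom can be written as plain Python tuples and queried with a simple pattern match. This is a minimal sketch using only the standard library; the expanded namespace URIs and the `match` helper are assumptions for illustration, and a real system would use an RDF library or triple store instead.

```python
# Expanded forms of the prefixes (assumed namespace URIs).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA = "https://schema.org/"
FRITZ = "http://fritz.phantom.com"

# Each fact is one (subject, predicate, object) tuple.
# The literal keeps its quotes to distinguish it from URIs.
triples = {
    (FRITZ, RDF_TYPE, SCHEMA + "Person"),
    (FRITZ, SCHEMA + "homeLocation", "http://innsbruck.gv.at"),
    (FRITZ, SCHEMA + "birthDate", '"1984"'),
}


def match(store, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Where does Fritz live?
print(match(triples, s=FRITZ, p=SCHEMA + "homeLocation"))
```

The wildcard pattern in `match` mirrors, in a very reduced form, the triple patterns that query languages for RDF are built on.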
There are two popular ways to store this data: either serialized as JSON-LD in a NoSQL database, specifically a document database, or natively as RDF in a graph database. JSON-LD stands for JSON for Linked Data and is a backward-compatible extension of JSON, the JavaScript Object Notation (Sporny et al. 2014). NoSQL databases such as MongoDB31 store JSON natively; hence using a document store to store knowledge is very effective and at the same time very convenient, due to the good support of document stores by web frameworks. Even though storage and retrieval work seamlessly, querying over JSON-LD files is not supported natively. To circumvent this limitation, manual implementation work is required.
The second way is to store knowledge in its native RDF format in a graph database supporting RDF. A typical example of such a database is GraphDB.32 To query the data, such databases typically offer a SPARQL interface through which all read-write operations and extensive reasoning requests can be performed. The RDF results of SPARQL requests (i.e., of CONSTRUCT and DESCRIBE queries) are typically returned in Turtle format and can be transformed into any other RDF serialization. Compared to document stores, graph databases are much better for querying and reasoning, but on the other hand, they are also much more expensive to purchase and maintain and have, de facto, no integration with any popular web framework.
Both methods have their preferred scenario of usage. NoSQL is more suitable in the context of web site annotation, while graph databases are more suitable in the context of Linked (Open) Data publication (see section “Semantic Web of Data: Linked Open Data”). More information on RDF storage technology can be found in Ma et al. (2004), Angles and Gutierrez (2005), Stegmaier et al. (2009), and Faye et al. (2012).
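The kind of manual implementation work needed to query JSON-LD held in a document store can be illustrated as follows. The document below is one possible JSON-LD serialization of the Fritz Phantom facts; the `name` of the city and its type are added for illustration only. The functions `to_triples` and `values_of` are hypothetical helpers that handle only a small subset of JSON-LD (a single default vocabulary, nodes that all carry an `@id`); a real application would rely on a full JSON-LD processor.

```python
import json

# Illustrative JSON-LD document, as it might be stored in a document database.
DOC = json.loads("""
{
  "@context": "https://schema.org",
  "@id": "http://fritz.phantom.com",
  "@type": "Person",
  "birthDate": "1984",
  "homeLocation": {
    "@id": "http://innsbruck.gv.at",
    "@type": "City",
    "name": "Innsbruck"
  }
}
""")


def to_triples(node, vocab="https://schema.org/"):
    """Flatten a simplified JSON-LD node into (subject, predicate, object) tuples.

    Assumes every node has an "@id" and all terms come from one vocabulary.
    """
    subject = node["@id"]
    triples = []
    for key, value in node.items():
        if key in ("@context", "@id"):
            continue
        predicate = "rdf:type" if key == "@type" else vocab + key
        for v in value if isinstance(value, list) else [value]:
            if isinstance(v, dict):      # nested node: link to it, then recurse
                triples.append((subject, predicate, v["@id"]))
                triples.extend(to_triples(v, vocab))
            elif key == "@type":         # type names are vocabulary terms
                triples.append((subject, predicate, vocab + v))
            else:                        # plain literal value
                triples.append((subject, predicate, v))
    return triples


def values_of(doc, subject, prop, vocab="https://schema.org/"):
    """Answer a simple 'subject, property -> values' query over the document."""
    return [o for s, p, o in to_triples(doc, vocab)
            if s == subject and p == vocab + prop]


print(values_of(DOC, "http://fritz.phantom.com", "homeLocation"))
print(values_of(DOC, "http://innsbruck.gv.at", "name"))
```

Even this small query requires traversing nested objects by hand, which is the effort a graph database with a SPARQL interface takes over.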
Knowledge Curation Knowledge is an important asset in all enterprises. For instance, Gutiérrez-Cuellar and Gómez-Pérez (2014) propose the HAVAS Knowledge Graph, which aims to collect information from start-ups, innovators, and tech companies; Amato et al. (2017) present KIRA, which gathers data from user, social media, and multimedia data sources; Quimbaya et al. (2014) propose EXEMED, a knowledge base for structuring clinical guidelines; Achichi et al. (2018) propose DOREMUS, a knowledge graph of music works and events; and an Amazon product knowledge graph is presented by Dong (2018). Knowledge is being continuously gathered and maintained in order to serve several purposes, from providing a common unified view on all data resources of an enterprise to powering its applications. For instance, the large technology companies, including Microsoft, Facebook, Google, and many more, have knowledge graphs and have invested in their curation with the purpose of making all their web-scale services better (Pan et al. 2017). In the previous sections, we examined different methods and tools with which knowledge can be modeled and with which knowledge graphs can be built and hosted. Building and hosting a knowledge graph is one thing. Turning it into a useful resource for problem-solving requires additional effort. In this context, knowledge curation plays a key role. In short, knowledge curation is about (1) assessing the quality of the knowledge graph, i.e., knowledge assessment; (2) improving the correctness of the knowledge graph, i.e., knowledge cleaning; and (3) improving the completeness of the knowledge graph, i.e., knowledge enrichment.
Knowledge Assessment Knowledge Assessment describes and defines the process of assessing the quality of a Knowledge Graph. The goal is to measure the usefulness of a Knowledge Graph considering two major quality dimensions, namely, its correctness and completeness. Knowledge Graph assessment can differ along different quality dimensions (Batini et al. 2009; Zaveri et al. 2013). For example, Paulheim and Bizer (2013) tackle the identification of missing instance assertions, Fürber and Hepp (2010a) identify wrong and missing property value assertions, and Lertvittayakumjorn et al. (2017) address the identification of wrong property value assertions. The approach presented by Mendes et al. (2012) defines additional quality assessment methods.
Knowledge Cleaning Knowledge Cleaning is about identifying and correcting the wrong assertions in a Knowledge Graph by deleting or modifying them. The goal of Knowledge Cleaning is to improve the correctness of a knowledge graph. The following tasks are relevant for Knowledge Cleaning:
– Detection and correction of wrong instance assertions
– Detection and correction of wrong property value assertions
– Detection and correction of wrong equality assertions
The achievement of those tasks heavily relies on the employed tools. For example, SDType (Paulheim and Bizer 2013) detects wrong instance assertions, SPIN (Fürber and Hepp 2010b) identifies functional dependency violations, LOD Laundromat (Beek et al. 2014) allows for the detection and correction of syntax errors, SDValidate (Paulheim and Bizer 2014) partially identifies wrong property value assertions, KATARA (Chu et al. 2015) identifies and corrects wrong property names, and HoloClean (Rekatsinas et al. 2017) can be used for detecting and correcting wrong property value assertions.
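As a simple illustration of the second cleaning task, wrong property value assertions can be detected by checking each value against a constraint defined for its property. The sketch below is a deliberately naive, rule-based approach and does not reflect how the statistical tools mentioned above work; the constraint table, the sample assertions, and the function names are assumptions made for this example.

```python
import re

# Hypothetical constraints: property -> check that a valid value must pass.
CONSTRAINTS = {
    "schema:birthDate": lambda v: re.fullmatch(r"\d{4}(-\d{2}-\d{2})?", v) is not None,
    "schema:telephone": lambda v: re.fullmatch(r"\+?[0-9 /()-]{5,}", v) is not None,
}

# Sample assertions; the telephone value is wrong on purpose.
assertions = [
    ("http://fritz.phantom.com", "schema:birthDate", "1984"),
    ("http://fritz.phantom.com", "schema:telephone", "Innsbruck"),
    ("http://fritz.phantom.com", "schema:name", "Fritz Phantom"),
]


def find_violations(triples, constraints):
    """Detection: return assertions whose value fails its property's check."""
    return [
        (s, p, o) for s, p, o in triples
        if p in constraints and not constraints[p](o)
    ]


def clean(triples, constraints):
    """Correction by deletion: keep only assertions that are not violations."""
    bad = set(find_violations(triples, constraints))
    return [t for t in triples if t not in bad]


print(find_violations(assertions, CONSTRAINTS))
print(len(clean(assertions, CONSTRAINTS)))
```

Properties without a constraint, such as `schema:name` here, are left untouched; deleting the offending assertion is only one possible correction, the other being to modify its value.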
Knowledge Enrichment
Knowledge Enrichment identifies and adds new assertions to a Knowledge Graph. The goal of knowledge enrichment is to improve the completeness of a knowledge graph. The following tasks are relevant for Knowledge Enrichment:
– Identifying and resolving duplicates by adding missing instance assertions and missing equality assertions
– Resolving conflicting property value assertions by adding or deleting property value assertions
Several methods and tools have been developed to address entity resolution and the related problem of conflicting property value assertions. For identifying duplicates, some authors use methods based on string similarity measures (Winkler 2006), association rule mining (Hipp et al. 2000), topic modeling (Sleeman et al. 2015), support vector machines (Sleeman and Finin 2013), property-based matching (Hogan et al. 2007), crowd-sourced data (Getoor and Machanavajjhala 2013), and graph-oriented techniques (Korula and Lattanzi 2014). For resolving duplicates, there are various tools such as Silk (Volz et al. 2009), SERIMI (Araújo et al. 2011), Legato (Amato et al. 2017), or Duke (Garshol and Borge 2013). The resolution of conflicting property value assertions can be tackled using Sieve (Mendes et al. 2012), FAGI (Giannopoulos et al. 2014), or ODCleanStore (Michelfeit et al. 2012).
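The string-similarity route to duplicate identification mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the similarity ratio from Python's standard `difflib` module rather than any of the measures surveyed by Winkler (2006), and the instance identifiers, labels, and the threshold of 0.75 are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical instance labels; the ex: identifiers are made up for this sketch.
LABELS = {
    "ex:r1": "Hofbräu Bierhaus NYC",
    "ex:r2": "Hofbraeu Bierhaus New York",
    "ex:r3": "Gasthof Goldener Adler",
}


def similarity(a, b):
    """Case-insensitive similarity in [0, 1] based on matching subsequences."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_duplicates(labels, threshold=0.75):
    """Return pairs of instance ids whose labels are at least `threshold` similar.

    The threshold is an assumption; in practice it has to be tuned per dataset.
    """
    pairs = []
    for (id_a, label_a), (id_b, label_b) in combinations(sorted(labels.items()), 2):
        if similarity(label_a, label_b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


duplicates = candidate_duplicates(LABELS)
print(duplicates)
```

Each reported pair is only a candidate for an equality assertion. A real pipeline would usually compare further properties (address, geo-coordinates) before adding the assertion to the graph.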
Knowledge Deployment
One of the earliest applications of knowledge graphs was provided by Google, which began in 2012 to develop the so-called Google Knowledge Graph (Singhal 2012), intended to contain significant aspects of human knowledge found semantically annotated on the web or in other data sources. Since then, a multitude of knowledge graphs have been developed (cf. Paulheim 2017 and others), including Airbnb Knowledge Graph,33 Bing Knowledge Graph34 (previously called Microsoft’s Satori35), Cyc/OpenCyc36 (cf. Lenat 1995; Lenat and Guha 1989), datacommons.org,37 DBpedia38 extracted from Wikipedia (cf. Auer et al. 2007; Lehmann et al. 2015), Facebook’s Entities Graph,39 Freebase40 (see Bollacker et al. 2008), which was bought by Google, has meanwhile been closed, and was incrementally included in Wikidata, Google’s Knowledge Graph,41 kbpedia42 (see Bergman 2018), Knowledge Vault43 (see Dong et al. 2014), NELL44,45 (Carlson et al. 2010), Wikidata46 (Vrandečić and Krötzsch 2014), YAGO47 (Suchanek et al. 2007; Hoffart et al. 2013) extracted from Wikipedia plus WordNet,48 and Yahoo!’s Knowledge Graph49 (see Blanco et al. 2013). Table 1 provides a survey of the size of some of the previously mentioned Knowledge Graphs where this information is made publicly available.

Table 1 Numerical overview of some Knowledge Graphs, taken from Paulheim (2017)
Name                        Instances      Facts             Types      Relations
DBpedia (English)           4,806,150      176,043,129       735        2,813
YAGO                        4,595,906      25,946,870        488,469    77
Freebase                    49,947,845     3,041,722,635     26,507     37,781
Wikidata                    15,602,060     65,993,797        23,157     1,673
NELL                        2,006,896      432,845           285        425
OpenCyc                     118,499        2,413,894         45,153     18,526
Google’s Knowledge Graph    570,000,000    18,000,000,000    1,500      35,000
Google’s Knowledge Vault    45,000,000     271,000,000       1,100      4,469
Yahoo! Knowledge Graph      3,443,743      1,391,054,990     250        800

These Knowledge Graphs, especially when based on schema.org, play an increasingly important role for web-based information search. Search is in fact one core application of knowledge graphs. It has evolved over time as the web has changed. From a system based on publishing information intended for human consumption (the classical web) to a system where machines can understand and consume the content (Semantic Web), the web is nowadays changing into a web for bots. Search and search engines have evolved along the same path, from being very effective indexes of the web based on syntax and statistical analysis to becoming query-answering engines (see Guha et al. 2003; Harth et al. 2007). This was only achieved through semantic annotations of website content, data, and services using de facto standards such as schema.org.
Chatbots and Intelligent Personal Assistants have become very popular in the last couple of years. They require structured, machine-processable data, content, and services. Alexa, Bixby, Cortana, Facebook Messenger, Google Assistant, Siri, and others provide personalized and (spoken) message-based access to information. Knowledge Graphs can be used to improve such dialogue systems in two major ways: (1) to power the language understanding part of the dialogue system and (2) to react to the conversations and provide additional interactions, information, and recommendations to the user engaged in conversations with the dialogue system.
When it comes to supporting the language understanding part of the dialogue system, the goal is to use the Knowledge Graph to provide training data for the Natural Language Understanding service. We can automatically extract training data for the entity recognition task from the Knowledge Graph (e.g., Vienna is a City) and provide (semi-)automatically generated intents and example questions. Based on the Knowledge Graph structure, we can generate, on the one hand, entities and synonyms and, on the other hand, the intents needed in the Natural Language Understanding service, derived from the entities and specifically from the relations between these entities in the Knowledge Graph. Furthermore, we can use ontology2text approaches to generate example questions that can be used to train the Natural Language Understanding service.
The second direction in which Knowledge Graphs can be used to improve dialogue systems is to react to conversations and provide additional interactions, information, and recommendations to the user. Using the knowledge from the Knowledge Graph, the dialogue system can elaborate on the topic of discussion and provide additional interesting facts. A Knowledge Graph can also be used to improve the handling of the conversation context. Finally, a Knowledge Graph can be used to refine the search for products or services in a dialogue system. In case the dialogue system cannot answer the given question, the Knowledge Graph can be used to obtain more information.
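The generation of entities, synonyms, and example questions from a Knowledge Graph described above can be sketched as follows. This is a simplified illustration, not the authors' ontology2text pipeline: the triples, the `ex:` identifiers, the use of `schema:alternateName` for synonyms, and the question templates are all assumptions.

```python
from collections import defaultdict

# Hypothetical toy Knowledge Graph as (subject, property, object) triples.
TRIPLES = [
    ("ex:Vienna", "rdf:type", "schema:City"),
    ("ex:Vienna", "schema:name", "Vienna"),
    ("ex:Vienna", "schema:alternateName", "Wien"),
    ("ex:Innsbruck", "rdf:type", "schema:City"),
    ("ex:Innsbruck", "schema:name", "Innsbruck"),
    ("ex:GoldenerAdler", "rdf:type", "schema:Restaurant"),
    ("ex:GoldenerAdler", "schema:name", "Goldener Adler"),
]

# Assumed question templates per entity type (a stand-in for ontology2text).
QUESTION_TEMPLATES = {
    "schema:City": ["What can I do in {name}?", "Tell me about {name}."],
    "schema:Restaurant": ["Is {name} open today?", "Book a table at {name}."],
}


def entity_catalog(triples):
    """Group entity names and their synonyms by type: {type: {name: [synonyms]}}."""
    types, names, synonyms = {}, {}, defaultdict(list)
    for subject, prop, obj in triples:
        if prop == "rdf:type":
            types[subject] = obj
        elif prop == "schema:name":
            names[subject] = obj
        elif prop == "schema:alternateName":
            synonyms[subject].append(obj)
    catalog = defaultdict(dict)
    for subject, type_ in types.items():
        if subject in names:
            catalog[type_][names[subject]] = synonyms.get(subject, [])
    return dict(catalog)


def example_questions(catalog, templates):
    """Fill the templates with entity names: [(question, entity type, entity name)]."""
    examples = []
    for type_, entities in catalog.items():
        for name in entities:
            for template in templates.get(type_, []):
                examples.append((template.format(name=name), type_, name))
    return examples


catalog = entity_catalog(TRIPLES)
examples = example_questions(catalog, QUESTION_TEMPLATES)
for question, type_, name in examples:
    print(f"{question}  [{type_}: {name}]")
```

The catalog corresponds to the entity recognition training data (e.g., Vienna is a City, with the synonym Wien), and the filled templates correspond to the example questions fed to the Natural Language Understanding service.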
Use Cases in E-Tourism
We can distinguish between proprietary and public Knowledge Graphs. For example, the Google Knowledge Graph is an internal resource of Google that improves its answering quality. Alternatively, a public Knowledge Graph can be the basis of an ecosystem of bots that search it for products and services. Obviously, many variations and combinations of these two principles are possible. In the following, we discuss pilots for developing both internal Knowledge Graphs and open and public ones in the tourism domain.
Touristic Chatbots and Intelligent Personal Assistants
The Gartner hype cycle50 for emerging technologies (August 2018) places both knowledge graphs and conversational AI in the innovation trigger phase. MarketsandMarkets51 forecasts the global conversational AI market to grow from USD 4.2 billion in 2019 to USD 15.7 billion by 2024, at a compound annual growth rate (CAGR) of 30.2% during the forecast period (2019–2024). The major growth drivers for the market include the increasing demand for AI-powered customer support services, omni-channel deployment, and reduced chatbot development costs. As we describe in the following use case, Knowledge Graphs are a complementary technology for conversational platforms that helps to scale the automation of conversations of chatbots and voice assistants at reduced costs. The growth of conversational AI is due to the evolving usage of chatbots for content marketing activities, such as digital marketing and advertising. Technological capabilities, individuality, and customization are the main features accelerating market growth. Chatbots are there to assist, interact, and engage with customers, and they offer personalized marketing capabilities.52
Voice-to-text understanding has recently achieved a very high accuracy and continues to improve. Nevertheless, use cases of current chatbots and voice assistants are still basic and focus on simple question and answer solutions. A dialogue with an Alexa Box or Google Home quite often ends in a “Sorry, I don’t know” due to the lack of knowledge these devices have. The reason for this is that the natural language solutions of such devices lack knowledge of entities, e.g., restaurant and roast pork (as demonstrated in the example in Fig. 10), and therefore cannot achieve the goals behind the questions.

Fig. 10 Typical dialogue with current chatbots and voice assistants

To support the chatbot and voice assistant type of scenarios introduced before, we need to design, implement, and deploy a knowledge-centered solution that will enable conversational interfaces to engage in human-like dialogues. Figure 11 depicts the inner process of such a solution for chatbots and voice assistants. At first, the natural language input of a user, in written or spoken form, undergoes a natural language understanding step (understand 1.), in which the user intent, together with parameters, is identified. The intent then needs to be resolved to an action that typically translates into a number of queries (map 2.) that can then be executed (query 3.) against the integrated large volumes of heterogeneous, distributed, dynamic, and potentially (i.e., almost certainly) inconsistent statements in order to identify the relevant knowledge parts necessary to generate the user answer in natural language (NLG – natural language generation 4.) as text or voice.

Fig. 11 The inner process of knowledge-centered chatbots and voice assistants

Let us revisit our use case and see how Knowledge Graphs can enable chatbots and voice assistants to understand the goal the human users express in natural language requests. Figure 12 illustrates the different steps of the process, from understanding the user request to generating and executing the query against the Knowledge Graph to generating the answer for the user.

Fig. 12 Using Knowledge Graphs to make chatbots and voice assistants (e.g., Alexa) smarter

With a Touristic Knowledge Graph in place that includes touristic entities such as restaurants and offers from these restaurants (e.g., roast pork), as well as actions related to these entities that can be performed (e.g., booking a table), intents and parameters can be derived. For example, an intent TableReservation for entities of type Restaurant can be generated. Restaurants, and in general organizations, can be connected in the Knowledge Graph to other entities of the type Offer (e.g., roast pork offers). Furthermore, the Knowledge Graph can be used to improve natural language understanding (NLU) by pushing entities from the Knowledge Graph (e.g., Hofbräu Bierhaus NYC) to the NLU or by generating example questions for the intents. The Knowledge Graph can also be used to generate the rules that restrict the view of or access to the Knowledge Graph depending on the use case. Such rules, together with the intent and parameters extracted by the NLU, are used to generate the queries to be executed against the Knowledge Graph. Last but not least, the Knowledge Graph can be used to generate templates for the answers, the textual answers, or follow-up questions to run the dialogues.
Chatbots and voice assistants have started to play an increasing role in customer communication for many businesses in various verticals.
Especially in tourism, they are proving to offer an increasing number of benefits in terms of convenience, availability, and fast access to information delivery and customer support through the entire customer journey.53 In the dreaming and planning phase, hotels and Destination Management Organizations (DMOs) can provide information through chatbots and voice assistants about the hotel and/or the region, the surroundings,
and weather conditions to potential guests. In the booking phase, from booking the hotel and transport to buying connected services, e.g., ski tickets, the entire process becomes much simpler and more efficient by using natural language. Finally, in the experience phase, chatbots and voice assistants can also announce special offers or events. All requested information and processes are instantly available 24/7/365. For hotel guests in particular, the stay experience can be enriched by providing them with access to hotel services and beyond. Recently, Amazon launched a program for hotel operators54 that allows guests to request room service, ask for housekeeping, configure the temperature and lights in the hotel room, set wake-up calls, and even connect their accounts to listen to their own music and audiobooks. Last but not least, customer support questions regarding rooms, equipment, additional services, and more are answered in a fully automated way. It can be argued that similar functionalities are available in mobile apps, but the major drawback of these apps is that each one of them focuses on different aspects, and time is required to learn how each app works. Chatbots and voice assistants, in contrast, provide easier means to access the same functionalities by using the most natural way in which humans interact, i.e., natural language (as voice or written text). Touristic chatbots and voice assistants are thus expected to answer questions and satisfy commands of varied nature, for example, “What’s the most popular attraction in the city?”, “What events are happening in the coming weekend?”, “What’s the snow height?”, “Book me a table tonight for 2 people in a Tyrolean restaurant”, “I’m looking for a bike ride that is difficult and offers huts on the way”, etc.
To properly answer all these types of questions and perform tasks such as booking, chatbots and voice assistants need machine-processable (semantic) annotations of content, data, and services. They need structures that encode the knowledge about the tourism domain, in terms of entities and relations between them, in a machine-processable form. Knowledge Graphs are such structures, providing the technical means to integrate various heterogeneous touristic information sources about accommodations, points of interest, events, sports activity locations, etc. With the help of Knowledge Graphs, not only simple question-answering tasks but also rather complex conversations and dialogues can be supported.
Applying the principles, methods, and tools introduced in the previous sections, we have built a Knowledge Graph for tourism that integrates multiple sources of content, data, and services from various providers, both
– closed: Feratel,55 Outdooractive,56 Intermaps,57 General Solutions,58 Verkehrsauskunft Österreich59
– open: Wikidata,60 DBpedia,38 OpenStreetMap,61 and GeoNames62
The resulting Touristic Knowledge Graph powers several chatbots and voice assistants of touristic regions in Tyrol, Austria.
1. The Seefeld pilot63 focuses on integrating only closed data sources, namely, Feratel, Intermaps, Outdooractive, and General Solutions. The use case is for the tourist region Olympiaregion Seefeld. For this use case, we also focus on question-answering for more advanced (compound) questions, for instance, “Where can I have traditional Tyrolean food when going cross country skiing?”.
2. The Serfaus-Fiss-Ladis pilot64 focuses on integrating both closed data sources, namely, Feratel, Intermaps, Outdooractive, General Solutions, and Verkehrsauskunft Österreich, and open data sources, namely, Wikidata and DBpedia. The Serfaus-Fiss-Ladis tourist region envisions that users can not only chat about the specific tourism data but also inquire about common knowledge of the region. The conversational interface is able to handle questions which combine the closed and open datasets, for instance, “How many inhabitants does Serfaus have?” or “Traffic information from Serfaus to Via Claudia Augusta?”
Common to these pilots and use cases is the need to integrate data from multiple heterogeneous static and dynamic sources, for which we need to track provenance (e.g., data owner, temporal validity, or the integration process) and maintain one common evolving schema. Using knowledge cleaning and enrichment, we also ensure a certain level of quality of the touristic knowledge. The ultimate aim is to optimize conversational interfaces based on Knowledge Graphs by providing rich intent and entity management (e.g., automated NLU training), question-answering over the Knowledge Graph, and support for advanced dialogues such as guiding a user through actions, recommendations, or follow-up conversations. These pilots have been implemented and used to test and validate the usage of Knowledge Graphs to enable better understanding of natural language dialogues and knowledge access for touristic chatbots and voice assistants.
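The four-step inner process of a knowledge-centered chatbot described in this section (understand, map, query, generate) can be sketched end to end on a toy graph. Everything in this sketch is an assumption made for illustration: the keyword-based language understanding is a stand-in for a trained NLU service, the intent name `FindRestaurantByOffer` and the `ex:offers` property are hypothetical, and the triple list replaces a real Knowledge Graph with a query endpoint.

```python
# Hypothetical toy Knowledge Graph as (subject, property, object) triples.
TRIPLES = [
    ("ex:bierhaus", "rdf:type", "schema:Restaurant"),
    ("ex:bierhaus", "schema:name", "Hofbräu Bierhaus NYC"),
    ("ex:bierhaus", "ex:offers", "roast pork"),
    ("ex:adler", "rdf:type", "schema:Restaurant"),
    ("ex:adler", "schema:name", "Goldener Adler"),
    ("ex:adler", "ex:offers", "cheese dumplings"),
]

# Entities pushed from the Knowledge Graph to the (toy) language understanding.
KNOWN_DISHES = sorted({obj for _, prop, obj in TRIPLES if prop == "ex:offers"})


def understand(utterance):
    """Step 1: identify the intent and its parameter by simple keyword lookup."""
    text = utterance.lower()
    for dish in KNOWN_DISHES:
        if dish in text:
            return {"intent": "FindRestaurantByOffer", "dish": dish}
    return {"intent": "Unknown"}


def map_to_query(parsed):
    """Step 2: resolve the intent to a query, here a (property, value) pattern."""
    if parsed["intent"] == "FindRestaurantByOffer":
        return ("ex:offers", parsed["dish"])
    return None


def run_query(query, triples):
    """Step 3: execute the query and return the names of matching entities."""
    if query is None:
        return []
    prop, value = query
    subjects = {s for s, p, o in triples if p == prop and o == value}
    return sorted(o for s, p, o in triples if s in subjects and p == "schema:name")


def generate_answer(parsed, names):
    """Step 4: template-based natural language generation."""
    if not names:
        return "Sorry, I don't know."
    return f"You can get {parsed['dish']} at {', '.join(names)}."


def answer(utterance):
    parsed = understand(utterance)
    names = run_query(map_to_query(parsed), TRIPLES)
    return generate_answer(parsed, names)


print(answer("Where can I eat roast pork?"))
print(answer("What's the snow height?"))
```

The second question falls back to "Sorry, I don't know." because the toy graph contains no matching knowledge, which mirrors the limitation of current assistants discussed above.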
Open Touristic Knowledge Graph
We have built the Tirol Knowledge Graph (TKG) as a five-star linked open data set published in a graph database providing a SPARQL endpoint (Kärle et al. 2018), for the provision of touristic data of Tyrol, Austria. The TKG currently contains data about touristic infrastructure, such as accommodation businesses, restaurants, points of interest, events, and recipes. The data of the TKG fall into three categories. Static data is information that rarely changes, such as addresses of hotels and descriptions of points of interest. Dynamic data is fast-changing information, such as availabilities and prices. Active data describes actions that can be executed, for example, the description of a purchase or reservation Web API that can be accessed through the TKG GraphDB platform.65 The data is collected either through the crawling of websites or through mappings from proprietary data sources into the Knowledge Graph, which uses schema.org as its ontology. Therefore, only websites containing schema.org annotated data are considered, and data sources are always mapped to schema.org before being stored in the TKG. The crawler, called broker.semantify.it, is implemented inside the semantify.it annotation platform (Kärle et al. 2017). Based on a list of URLs of touristic websites, the data is collected periodically and then stored in the graph.
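As an illustration of what collecting schema.org annotated data from a website can involve, the following sketch extracts JSON-LD annotations from an HTML page using only the Python standard library. It is not the broker.semantify.it implementation: the class and function names and the sample page are hypothetical, and only JSON-LD (not Microdata or RDFa) is handled.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed annotations instead of aborting the crawl


def schema_org_items(html):
    """Return the JSON-LD objects of a page whose @context refers to schema.org."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [
        item
        for item in parser.items
        if isinstance(item, dict) and "schema.org" in str(item.get("@context", ""))
    ]


# Hypothetical page of a hotel website carrying a schema.org annotation.
PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Hotel", "name": "Hotel Alpenblick",
 "address": {"@type": "PostalAddress", "addressLocality": "Seefeld"}}
</script>
</head><body><p>Welcome</p></body></html>"""

items = schema_org_items(PAGE)
print(items[0]["@type"], items[0]["name"])
```

Pages without schema.org annotations yield an empty list and would therefore be skipped, in line with the restriction stated above.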
Table 2 Top 10 entities used in the TKG
Entity                               Count
schema:Thing                         453,841,147
schema:CreativeWork                  175,787,490
schema:MediaObject                   175,746,110
http://purl.org/dc/dcmitype/Image    175,735,868
schema:ImageObject                   175,735,868
schema:Intangible                    172,124,244
schema:StructuredValue               155,482,666
schema:Place                         60,996,190
schema:ContactPoint                  53,155,166
schema:PostalAddress                 51,706,023
The mapping is provided for different data sources such as Feratel,55 General Solutions,58 Infomax,66 Tomas,67 etc. (Panasiuk et al. 2018a,b). The data are mostly retrieved through SOAP or REST APIs and are originally provided in XML or JSON format. For fetching these data, translating them to schema.org, and storing them in the Knowledge Graph, wrappers are implemented inside semantify.it that are executed periodically. The mapping is either implemented programmatically in NodeJS or done through the mapping language RML (Dimou et al. 2014). As of May 4, 2019, the TKG contained around 7.5 billion statements, of which 55% are explicit and 45% are inferred. Every day, the Knowledge Graph grows by around 8 million statements. The data is held in around 2000 subgraphs, where every subgraph represents one import process per data source (see section “Annotation Languages”, data provenance). The TKG contains more than 200 entity types; the most frequently used ones are shown in Table 2. To demonstrate the possibilities of the TKG and to evaluate its usability, we built several pilots.
1. Chatbot-driven room booking: among the crawled websites are many customers of the Internet booking engine development company Easybooking.68 The features in the source code that identify a website as an Easybooking customer are known. The booking API structure of Easybooking is also known. Therefore, we decided to develop an Alexa skill that enables voice-driven booking of Easybooking hotels through the TKG. If the showcase skill is asked for a certain hotel, it sends a request to a webhook. The result, a list of available hotel offers, is sent back to the skill and read to the user. The list also contains annotated API descriptions for the booking API, so if the user decides on an offer, a booking can be executed through a voice command.
2. Showcase dialogue system: as described in Şimşek and Fensel (2018) and Panasiuk et al. (2018c), we built two dialogue systems that fetch their data from the graph. One generically answers questions on touristic topics such as hiking or opening hours. The other (Şimşek and Fensel 2018) goes one step further and conducts generic dialogues solely based on data taken from the Knowledge Graph.
3. Time series analysis of prices in touristic regions: since all the prices of offers, if available, are stored permanently, a time series analysis can be conducted. We compare the price development of two touristic regions over a period of time. Time series analysis fits well with Knowledge Graphs and is a promising application of them in tourism.
Besides the TKG, there are other initiatives regarding touristic Knowledge Graphs worth mentioning in this context. One of these initiatives is the DACH-KG. DACH is an acronym for Germany (D), Austria (A), and Switzerland (CH); the initiative covers these German-speaking regions together with the region of South Tyrol, whereas KG stands for Knowledge Graph. Key representatives of the touristic domain and academics from said countries and from Italy are working together toward a shared touristic Knowledge Graph. To achieve this, they are working not only on aligning data sources technically but also on extending the expressivity of the ontology of their choice, schema.org. The achievements and progress of this working group are described in a living paper69 and in two blog articles.70,71 A similar goal is pursued by another initiative, the French DATAtourisme.72 Here, a union of French DMOs and regional tourism boards is working on a 5* Linked Open Data set (see section “Semantic Web of Data: Linked Open Data”) to serve as the official source for structured touristic data in France. Data from more than 40 systems are aggregated, described by a custom-made ontology, and provided over the project’s website and the corresponding government website.73 An academic initiative around touristic knowledge graphs is the TourismKG workshop series.74
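The wrapper-based mapping of proprietary data to schema.org described for the TKG can be illustrated with a small programmatic mapping. The TKG wrappers are implemented in NodeJS or RML; the Python sketch below only shows the general shape of such a mapping. The source record, its field names, and the `https://example.org/hotel/` base IRI are hypothetical and do not reflect the format of any of the named providers.

```python
import json

# Hypothetical record as it might arrive from a proprietary REST API.
SOURCE_RECORD = {
    "id": "4711",
    "title": "Hotel Alpenblick",
    "street": "Dorfplatz 1",
    "zip": "6100",
    "town": "Seefeld in Tirol",
    "price_from": 89.0,
    "currency": "EUR",
}


def to_schema_org(record, base_iri="https://example.org/hotel/"):
    """Map one source record to a schema.org Hotel description in JSON-LD."""
    return {
        "@context": "https://schema.org",
        "@type": "Hotel",
        "@id": base_iri + record["id"],
        # Static data: changes rarely.
        "name": record["title"],
        "address": {
            "@type": "PostalAddress",
            "streetAddress": record["street"],
            "postalCode": record["zip"],
            "addressLocality": record["town"],
        },
        # Dynamic data: prices change often and are re-imported periodically.
        "makesOffer": {
            "@type": "Offer",
            "price": record["price_from"],
            "priceCurrency": record["currency"],
        },
    }


document = to_schema_org(SOURCE_RECORD)
print(json.dumps(document, indent=2, ensure_ascii=False))
```

Because each periodic import produces such documents anew, storing every import in its own subgraph, as described above, keeps the provenance of each statement traceable.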
Conclusions
The Semantic Web began more than 20 years ago as a means to add machine-processable semantics to the web. In the meantime, it has become a fairly mature area of research and in the past 5 years has begun to have a significant impact on how information is presented on the web. Search engines were fairly slow in taking up and exploiting its potential for providing intelligent access to web resources beyond simple search result listings. However, it is increasingly becoming a must-have for content, data, and service providers to enrich their online resources with semantic markup such as schema.org. Schema.org is a rather simple and limited approach, but we expect richer approaches to be adopted soon (given the usual delay in take-up by big industry). The mission is thus accomplished, except that this success is opening up an even more challenging and demanding research area: Knowledge Graphs. In the context of the evolving Knowledge Graph technology, we aimed to provide answers to three important questions: What are Knowledge Graphs, how are they built, and in what sense are they important? We provided a number of approaches for constructing, hosting, curating, and deploying Knowledge Graphs and showed their potential usage for dialogue-based information access on the Internet, usage that may revolutionize humans’ information access. We described applications in the area of e-Tourism as a cornerstone for future e-Marketing and e-Commerce.
In the future, we expect Knowledge Graphs to rapidly grow to trillions of facts and beyond. This places harsh requirements on the methods that are able to handle them. Even in the optimistic case, Paulheim (2018a) estimates the related costs at billions of dollars. Keeping scale without a cost explosion is an obvious requirement for the success of the Knowledge Graph approach. This may require the return of more traditional AI techniques that attempt to capture large amounts of facts through the elegance of simple rules and axioms (like a picture that can express more than a thousand words). Therefore, building up meaningful TBoxes on top of existing Knowledge Graphs may be an interesting avenue to investigate (see Töpper et al. 2012; Socher et al. 2013; Galárraga et al. 2013, 2015; Paulheim 2018b).
Expected Future Developments
To tackle the huge size of future knowledge graphs, we expect to operate not on the knowledge graph as a single unit but on smaller subgraphs. These subgraphs can then be processed efficiently and effectively (by reasoning over them and carrying out Knowledge Curation). Not only is the size of the knowledge graph a problem, but consumers having different points of view also present challenges. Therefore, it must be possible to apply different constraints for different consumers, as a single set of constraints is infeasible. For example, suppose a knowledge graph storing information about people and their relations to each other contains a person with multiple spouses. Depending on the point of view, different conclusions can be drawn. A point of view that allows polygamy may not constrain the cardinality of the spouse property, but in another point of view that information may be identified as an error. By extracting subgraphs from the huge knowledge graph and directly operating on them, it is possible to define and apply different constraints for different consumers.
We also expect that in the future knowledge graphs will be used together with, or in support of, machine learning algorithms. A very prominent and important example is autonomous driving. In May 2016, Joshua Brown was killed by his car because its autopilot mixed up a very long truck (18-wheeler) with a traffic sign. If the car had been connected to a knowledge graph containing traffic data, his life could have been saved. The autopilot could have used the information from the knowledge graph to detect that there is no traffic sign and stop the car. Not only can knowledge graphs be used to support machine learning algorithms, but machine learning algorithms can also be used to create, correct, and enrich knowledge graphs. For the creation of knowledge graphs, heuristics are frequently used. Those heuristics could be implemented using machine learning methods. In addition, adding missing type information, for example, can be considered a hierarchical multi-label classification problem. For error detection, we can use machine learning algorithms that take the relations in a knowledge graph as positive training examples and create negative examples in order to detect wrong statements.
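The idea of using the statements of a knowledge graph as positive training examples and creating negative examples from them can be sketched as follows. The corruption strategy shown here (replacing the object of a statement with another entity) is one common choice and an assumption of this sketch, not a method prescribed in the text; the triples and `ex:` identifiers are hypothetical.

```python
import random

# Hypothetical statements of a knowledge graph, used as positive examples.
POSITIVE = [
    ("ex:Seefeld", "ex:locatedIn", "ex:Tyrol"),
    ("ex:Serfaus", "ex:locatedIn", "ex:Tyrol"),
    ("ex:Tyrol", "ex:locatedIn", "ex:Austria"),
    ("ex:Vienna", "ex:capitalOf", "ex:Austria"),
]


def corrupt(triples, seed=0):
    """Create one negative example per statement by swapping in a different
    object, making sure the result is not itself a known statement."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    known = set(triples)
    entities = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    negatives = []
    for subject, prop, obj in triples:
        candidates = [
            e for e in entities
            if e != obj and e != subject and (subject, prop, e) not in known
        ]
        if candidates:
            negatives.append((subject, prop, rng.choice(candidates)))
    return negatives


def training_set(triples, seed=0):
    """Label known statements with 1 and corrupted statements with 0."""
    return [(t, 1) for t in triples] + [(t, 0) for t in corrupt(triples, seed)]


negatives = corrupt(POSITIVE)
dataset = training_set(POSITIVE)
for triple, label in dataset:
    print(label, triple)
```

A caveat of this approach is that a generated negative may in fact be a true but missing statement, since knowledge graphs are incomplete. A classifier trained on such data should therefore be treated as producing candidates for review rather than definitive errors.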
Cross-References
– Advanced Web Technologies and E-Tourism Web Applications
– Big Data Technologies
– Data Mining and Predictive Analytics for E-Tourism
– Data Privacy and the Travel Sector
– Drivers of E-Tourism
– Electronic Data Interchange and Standardization
– e-Tourism Research: A Review
– e-Tourism: An Informatics Perspective
– Impact of Artificial Intelligence in Travel, Tourism, and Hospitality
– Information and Communication Technology in Event Management
– Interactive and Context-Aware Systems in Tourism
– Recommender Systems in Tourism
– Strategic Use of Information Technologies in Tourism: A Review and Critique
– Travel Information Search
– Web Information Retrieval and Search
Notes
1 Schema.org https://www.schema.org, accessed: 02 Jun 2019.
2 Thesaurus https://www.thesaurus.com, accessed: 05 Jun 2019.
3 Microformats2 http://microformats.org/wiki/microformats2, accessed: 14 Jun 2019.
4 Microdata https://www.w3.org/TR/microdata/, accessed: 14 Jun 2019.
5 RDFa https://www.w3.org/TR/xhtml-rdfa-primer/, accessed: 14 Jun 2019.
6 JSON-LD https://www.w3.org/TR/json-ld/, accessed: 14 Jun 2019.
7 JSON https://www.json.org/, accessed: 14 Jun 2019.
8 JSON-LD example https://www.w3.org/TR/json-ld/#basic-concepts, accessed: 11 Jul 2019.
9 JSON-LD Syntax https://w3c.github.io/json-ld-syntax/, accessed: 14 Jun 2019.
10 Dublin Core http://dublincore.org/specifications/dublin-core/dcmi-terms/, accessed: 14 Jun 2019.
11 Friend of a friend http://xmlns.com/foaf/spec/, accessed: 14 Jun 2019.
12 GoodRelations http://www.heppnetz.de/ontologies/goodrelations/v1, accessed: 14 Jun 2019.
13 Tourismus der Zukunft https://www.tourismuszukunft.de, accessed: 28 Nov 2019.
14 DACH-KG Schema.org extension https://github.com/STIInnsbruck/dachkg-schema, accessed: 02 Jun 2019.
15 Simple Object Access Protocol https://en.wikipedia.org/wiki/SOAP, accessed: 14 Jun 2019.
16 Representational state transfer https://en.wikipedia.org/wiki/Representational_state_transfer, accessed: 14 Jun 2019.
17 Web Service Description Language https://en.wikipedia.org/wiki/Web_Services_Description_Language, accessed: 14 Jun 2019.
18 Semantic Web Services Framework https://www.w3.org/Submission/SWSF/, accessed: 14 Jun 2019.
19 OpenAPI Specification https://github.com/SmartAPI/smartAPI-Specification/blob/OpenAPI.next/versions/3.0.0.md, accessed: 14 Jun 2019.
20 Schema.org Actions https://schema.org/docs/actions.html, accessed: 14 Jun 2019.
21 Graph (discrete mathematics) https://en.wikipedia.org/wiki/Graph_(discrete_mathematics), accessed: 14 Jun 2019.
22 Semantify.it https://semantify.it/, accessed: 14 Jun 2019.
23 Document Object Model https://www.w3.org/DOM/, accessed: 14 Jun 2019.
24 GATE https://gate.ac.uk/, accessed: 14 Jun 2019.
25 Apache Open NLP https://opennlp.apache.org/, accessed: 14 Jun 2019.
26 RapidMiner https://rapidminer.com/, accessed: 14 Jun 2019.
27 Tripliser http://daverog.github.io/tripliser/, accessed: 14 Jun 2019.
28 GRDDL https://www.w3.org/TR/grddl/, accessed: 14 Jun 2019.
29 Virtuoso Sponger http://vos.openlinksw.com/owiki/wiki/VOS/VirtSponger, accessed: 14 Jun 2019.
30 R2RML https://www.w3.org/TR/r2rml/, accessed: 14 Jun 2019.
31 MongoDB https://www.mongodb.com/de, accessed: 14 Jun 2019.
32 GraphDB http://graphdb.ontotext.com/, accessed: 14 Jun 2019.
33 Airbnb Knowledge Graph https://medium.com/airbnb-engineering/scaling-knowledgeaccess-and-retrieval-at-airbnb-665b6ba21e95, accessed: 14 Jun 2019.
34 Bing Knowledge Graph https://blogs.bing.com/search-quality-insights/2017-07/bring-richknowledge-of-people-places-things-and-local-businesses-to-your-apps, accessed: 14 Jun 2019.
35 Microsoft’s Satori http://blogs.bing.com/search/2013/03/21/understand-your-world-withbing/, accessed: 14 Jun 2019.
36 Cyc/OpenCyc http://www.cyc.com/, accessed: 14 Jun 2019.
37 datacommons.org http://datacommons.org/, accessed: 14 Jun 2019.
38 DBpedia http://dbpedia.org/, accessed: 14 Jun 2019.
39 Facebook’s Entities Graph http://www.facebook.com/notes/facebook-engineering/underthe-hood-the-entities-graph/10151490531588920, accessed: 14 Jun 2019.
40 FreeBase http://www.freebase.com/, accessed: 14 Jun 2019.
41 Google’s Knowledge Graph https://developers.google.com/knowledge-graph/, accessed: 14 Jun 2019.
42 kbpedia http://www.kbpedia.org/, accessed: 14 Jun 2019.
43 Knowledge Vault https://ai.google/research/pubs/pub45634, accessed: 14 Jun 2019.
44 NELL http://rtw.ml.cmu.edu/rtw/resources, accessed: 14 Jun 2019.
45 NELL http://rtw.ml.cmu.edu/rtw/kbbrowser/, accessed: 14 Jun 2019.
46 Wikidata Main Page https://www.wikidata.org/wiki/Wikidata:Main_Page, accessed: 14 Jun 2019.
47 YAGO https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yagonaga/yago/downloads/, accessed: 14 Jun 2019.
48 WordNet https://wordnet.princeton.edu/, accessed: 14 Jun 2019.
49 Yahoo’s Knowledge Graph https://www.slideshare.net/NicolasTorzec/the-yahoo-knowledgegraph, accessed: 14 Jun 2019.
50 Gartner Hype Cycle https://gartner.com, accessed: 14 Jun 2019.
51 MarketsandMarkets https://bit.ly/2IAiHqw, accessed: 14 Jun 2019.
52 Personalized marketing capabilities https://www.sdcexec.com/software-technology/news/21011880/chatbot-market-to-grow-at-31-percent-cagr-from-2018-to-2024, accessed: 14 Jun 2019.
53 Customer journey https://tourismeschool.com/customer-journey-mapping-tourism-brands/, accessed: 14 Jun 2019.
54 Amazon Hotel System https://techcrunch.com/2018/06/19/amazon-launches-an-alexasystem-for-hotels/, accessed: 14 Jun 2019.
55 Feratel http://www.feratel.at/en/, accessed: 14 Jun 2019.
56 Outdooractive https://www.outdooractive.com/, accessed: 14 Jun 2019.
57 Intermaps https://www.intermaps.com/en/, accessed: 14 Jun 2019.
58 General Solutions https://general-solutions.eu/, accessed: 14 Jun 2019.
59 Verkehrsauskunft Österreich https://verkehrsauskunft.at/, accessed: 14 Jun 2019.
60 Wikidata https://www.wikidata.org/, accessed: 14 Jun 2019.
61 Open Streetmap https://www.openstreetmap.org, accessed: 14 Jun 2019.
62 Geonames https://www.geonames.org/, accessed: 14 Jun 2019.
63 Pilot Seefeld https://www.seefeld.com/en/, accessed: 14 Jun 2019.
64 Serfaus-Fiss-Ladis pilot https://www.serfaus-fiss-ladis.at/en/, accessed: 14 Jun 2019.
65 Tirol Knowledge Graph http://graphdb.sti2.at:8080/, accessed: 14 Jun 2019.
66 Infomax https://www.infomax-online.de/, accessed: 14 Jun 2019.
67 Tomas https://www.tomas.travel/, accessed: 14 Jun 2019.
68 Easybooking https://www.easybooking.eu/de/, accessed: 14 Jun 2019.
69 Knowledge Graph DACH V3 https://www.tourismuszukunft.de/wp-content/uploads/2019/04/Knowledge-Graph-DACH-V3.pdf, accessed: 14 Jun 2019.
70 Dach KG Touristischen Knowledge Graph https://www.tourismuszukunft.de/2018/11/dachkg-auf-dem-weg-zum-touristischen-knowledge-graph/, accessed: 14 Jun 2019.
71 Dach KG Nächste Schritte https://www.tourismuszukunft.de/2019/05/dach-kg-neueergebnisse-naechste-schritte-beim-thema-open-data/, accessed: 14 Jun 2019.
72 Data Tourisme https://www.datatourisme.fr/, accessed: 14 Jun 2019.
73 Data Tourisme Government Website https://www.datatourisme.gouv.fr/, accessed: 14 Jun 2019.
74 Tourism KG https://tourismkg.github.io/2019/, accessed: 14 Jun 2019.
17 Big Data Technologies
Constantine J. Aivalis
Contents
Introduction
Definition of Big Data
Hadoop and MapReduce Expedite Volume
NoSQL Databases Promote Variety
Hadoop and Velocity
The Hadoop Ecosystem
Big Data Infrastructure Vendors
Conclusions
Cross-References
References
Abstract
Big Data has emerged as a new technological paradigm during the last few years because of the need to master the exponential growth of data. Big Data technologies offer toolboxes in the form of frameworks that deal with the data explosion created by the ever-growing number of applications, mobile devices, sensors, and the Internet of Things (IoT), in conjunction with the wish to gain a better overview, receive answers to questions, and measure the behavior and operational complexity of today’s systems. Big Data refers to large datasets and dataflows whose processing lies beyond the capabilities of traditional information systems and databases. Information like log files, images, messages, transaction records from remote or local application databases, composite distributed data structures, sensor data from remote devices, data from public databases, and IoT devices can be used selectively to enrich existing data to provide clear operational insight and support the recognition of trends and tendencies. Very often, data generated in social media applications are used to measure or extend the impact of specific Internet campaigns for products and services. Big Data is the toolkit that targets the well-known “three Vs” of data, its three basic characteristics: volume, velocity, and variety. This chapter explains the basic definitions of Big Data components, provides a list of the technologies used and vendors involved, and shows how the three Vs can be applied, using application examples from tourism.
C. J. Aivalis, Hellenic Mediterranean University of Crete, Heraklion, Crete, Greece
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_23
Keywords Big Data · Hadoop · HDFS · MapReduce · RDBMS · NoSQL · Spark
Introduction
Big Data issues have been around since the very early years of the informatics era, although programmers have preferred to avoid dealing with them systematically, because the technologies that allow processing them conveniently have only become available during the last 10 years. As a first example, the first Universal Product Code (UPC) was read by a retail cash register in 1974. Since then, supply chains have also been using radio-frequency identification (RFID) to automate the identification and tracking of objects by attaching tags to them. Scanners have since produced huge quantities of data that have been exploited only partially and always in a structured manner, according to the standards and rules of traditional information systems designed in the 1970s, which basically aimed to generate typical invoices and printed statistical batch reports.
Contemporary software applications manage and generate very large quantities of data. Web-based components and applications support online transactions, generate log files, and feed analytics applications. Social media applications also produce petabytes of data daily. Portions of this data concerning companies and organizations are publicly accessible and can be analyzed in order to target specific groups of customers, automatically generate customized advertisements for products and services, and measure their impact by evaluating the feedback and acceptance they receive. Retailers can make smart offers and recommend specific products to target customers better. This way any data, structured or unstructured, concerning any company or organization can be evaluated, and this can, for example, greatly enhance insight into the customer base and reveal how customers shop and how they react when they see ads for specific products.
The vast storage capacity of Big Data file systems allows also historical operational data, backups, and old log files generated by information systems and web servers over periods spanning many years to become directly accessible online instead of being stored in external magnetic or optical media. The drastic cost reduction of storage and commodity hardware, in conjunction with the free
software revolution and the wide acceptance of open-source software applications and operating systems, has made it possible to establish reliable, simple-to-maintain, low-cost Big Data platforms. The deep penetration of the Internet into all industries and its geographical spread due to today’s networking advances, in conjunction with mobile telephony, smartphones, mobile devices, and sensors, form the foundation for Big Data. As Sun Microsystems’ Scott McNealy pointed out in the mid-1980s: “The network is the computer.”
Tourism depends heavily upon information and computer networking. Since Big Data can break the size and performance barriers of data processing, it can provide more information, faster, to its users. Big Data related to tourism fall into three categories: user-generated data, data generated by devices, and transaction data generated by operations (Li 2018). All three categories benefit from Big Data technologies. Tour operators, agencies, hotels, cruise lines, car rentals, and service providers can combine operational data like reservations and prices with external data like event information, transportation schedules, and competitors’ offers for decision support and analytics without restrictions. Big Data can also be used for sentiment analysis, recommender systems, brand image monitoring, and competitive intelligence and to identify trends.
Big Data are not solely a technology issue, but rather a socio-technical issue. Hence, if firms want to make full use of Big Data, they need to adopt new management mindsets, new organizational structures, and cultures (e.g., cross-functional teams, corporate-wide and open communication, cooperation with third parties and online platforms) as well as new work practices such as a data-driven and analytical culture (Sigala et al. 2019). Big Data makes the fusion of any data type from any source easier and more feasible.
Data collected from information management systems, the Internet of Things, devices, and various sensors like RFID and NFC devices can be used in order to provide better insight into the operational parameters of any corporation.
Definition of Big Data
There is no unique definition of Big Data; the term means different things to different people. There is broad agreement, though, that Big Data starts where traditional online transaction processing (OLTP) or online analytical processing (OLAP) systems become too slow or even start to fail due to data volume, data velocity, and data variety. Simply replacing a slow computer system with a more powerful one, known as vertical scaling, most often will not solve the problem, and if it does, then only temporarily, until the next bottleneck appears and calls for a new, even more powerful system. Grace Hopper, the legendary US Navy rear admiral and designer of the COBOL language, described scaling many years ago like this: “If one ox could not do the job, they did not try to grow a bigger ox, but used two oxen. When we need greater computer power, the answer is not to get a bigger computer, but ... to build systems of computers and operate them in parallel” (Schieber 1987). This approach
is known as horizontal scaling. Big Data is based on the parallelization and distribution of processing and file storage. In parallel computing, several processors share the same memory system, while in distributed computing, groups of computers share the same goal, each processor operates on its own memory, and information must be exchanged through message traffic between the nodes. The concept of distributed computing has been around since the introduction of Ethernet in the 1970s, when multiple network architectures, like peer-to-peer, n-tier, and client-server, and coupling models were implemented. Over the years, algorithms have been developed, improved, and established that guarantee fault-tolerant and efficient cooperation, simple installation, and easy setup and expansion.
The concept of using distributed and parallel systems is based on splitting a large problem into smaller ones and letting each processor carry out its part of the job individually. A big dataset must be divided into small partitions, the appropriate algorithms run on the individual machines, and the outputs finally combined into one. Done by hand, this is a complicated procedure: the developer must organize the partitioning scheme of the data manually and also take care of the required communications. Hadoop, for example, handles all the nontrivial logistics of interprocess communication, making a complicated distributed file system that spans large clusters of cooperating computers look like one huge local file system to the developer.
While volume deals with data size, velocity deals with the speed of data generation, streaming, and data collection techniques that allow processing and visualization in near real time and sometimes even in real time. Variety is the component that deals with the mapping of different classes and structures of data into concrete objects, so they can function together (Laney 2001).
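The partition, process, and combine procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the worker threads of one machine stand in for the nodes of a cluster, and the function names and the sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, n_parts):
    """Split data into n_parts contiguous slices of near-equal size."""
    size, extra = divmod(len(data), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts


def partial_sum(chunk):
    """Work carried out independently by one worker on its own partition."""
    return sum(chunk)


def distributed_sum(data, n_workers=4):
    """Partition the data, process the parts concurrently, combine the results."""
    parts = partition(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, parts))
    return sum(partials)


nightly_rates = list(range(1, 1001))  # hypothetical sample data
total = distributed_sum(nightly_rates)
```

In a real cluster, the partitioning and the collection of partial results would additionally involve network communication and failure handling, which is the part that frameworks such as Hadoop take over.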
Four additional V-buzzwords complete the Big Data component landscape. These stand for veracity, variability, visualization, and value. Veracity is established by creating mechanisms that check the accuracy of data and ensure the overall correctness of the data collection. Variability refers to possible inconsistencies of data that must be sorted out by interpreting their exact context. Visualization tools allow presenting and summarizing selected portions of data in either static or interactive mode, leading to insights and views that allow understanding the nature of problems and the status of the operations generating value. Although the last four V words are often used for Big Data, along with even more V words like viability, according to Doug Laney from Gartner Inc., they are not definitional specifically for Big Data, and they apply to all types of data and not just to Big Data (Grimes 2013).
Hadoop and MapReduce Expedite Volume
Big Data technologies rely on distributed computing. The tool used as the foundation for most Big Data applications today is Apache Hadoop. Hadoop is not a single tool, but an entire ecosystem of applications. The base of this ecosystem is the Hadoop Distributed File System (HDFS), which allows clusters of commodity
computers to act as a single storage component. Hadoop was designed in 2006 by Douglas Cutting for Yahoo! and became an Apache project in 2008. HDFS is scalable and can grow on demand to hundreds of petabytes, simply by adding new DataNodes to the network. Very large companies operate local area networks supporting HDFS that consist of more than a thousand computers acting as DataNodes. Data in HDFS is organized in large blocks. The typical HDFS block size is 64 MB, or 128 MB in more recent versions. This is huge compared to the block sizes of 512 bytes, 4 KB, or at most 32 KB usually used on the hard drives of regular local file systems. Whatever data is stored in HDFS is automatically distributed to the DataNodes, and the NameNode knows where the data blocks are physically located. Additionally, replication of the data blocks takes place transparently, with the replicas spread among the physical DataNodes. This way not only huge capacity but also a high level of data integrity is guaranteed. HDFS became popular because the file system automatically takes care of all the complexities of distributed processing, which leads to a simple programming model.
MapReduce (MR) is a programming model that has traditionally been used for processing data stored in HDFS. MapReduce was introduced in 2004 by Dean and Ghemawat at Google, where it originally ran on the Google File System (GFS). The Hadoop implementation is written in Java and became an Apache project along with Hadoop. MapReduce is an algorithm design pattern that originated in the functional programming world. The typical example found in most introductory books and tutorials describes how to apply MR to the simple problem of counting the occurrences of words in a document. Although the MR approach is more complex than writing a simple counting class or algorithm, the fact that MR can run on HDFS renders it capable of dealing with virtually unlimited input volume. MR with HDFS can easily deal with petabytes of data.
MR can easily be applied to various applications, including measuring customer loyalty and satisfaction, analyzing log files, merging data from various sources, and providing measurements and statistics. The procedure consists of three steps. First, it requires a mapper function or script that goes through the input data and outputs a series of keys and values to use in calculating the results. The keys are used to cluster together bits of data that will be needed to calculate a single output result. The unordered list of keys and values is then put through a sort step that ensures that all the fragments that have the same key are next to one another in the file. The reducer stage then goes through the sorted output and receives all values that have the same key in a contiguous block (Warden 2011). Programs written in this functional style can be automatically parallelized and executed by a cluster of commodity machines. These steps can be easily implemented and adapted to the needs of any search application. MR was designed by Google to count words, URL access frequencies, and user counts (Ghemawat and Dean 2020). The advantage of MR running on HDFS DataNodes lies in the fact that parallelism greatly increases the speed of computation in comparison with monolithic architectures when searching large datasets.
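The three steps described above (map, sort, reduce) can be sketched for the classic word count example. The following Python code runs on a single machine and only illustrates the pattern; it is not the Hadoop API, and the function names and sample review texts are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield (word, 1)


def reducer(key, values):
    """Reduce step: combine all values that share one key."""
    return (key, sum(values))


def map_reduce(lines):
    # Map: produce an unordered list of key-value pairs.
    pairs = [pair for line in lines for pair in mapper(line)]
    # Sort: bring pairs with the same key next to one another.
    pairs.sort(key=itemgetter(0))
    # Reduce: the values of each key arrive as one contiguous group.
    return dict(
        reducer(key, (value for _, value in group))
        for key, group in groupby(pairs, key=itemgetter(0))
    )


reviews = ["great hotel great view", "hotel near the beach"]  # hypothetical input
counts = map_reduce(reviews)
```

Because each mapper call looks at one line only and each reducer call at one key only, both steps could in principle be spread over many machines, which is what a MapReduce framework automates.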
As mentioned above, HDFS is a scalable file system, consisting of clusters of commodity computer systems (nodes) that operate as a single unit. This file system automatically supports replication for integrity reasons, making Redundant Arrays of Independent Disks (RAID) unnecessary for the DataNodes. RAID is still necessary for the NameNode (Holmes 2014). HDFS uses such a large block size (64 or 128 MB) in order to keep the workload of the NameNode low. Hadoop is designed mainly for MapReduce applications. It is not particularly well suited to either small files or frequent updates of data. Hadoop is constantly enriched with new projects and is always undergoing improvements. It supports distributed file storage and provides mechanisms for segmentation, partitioning, and collection of results that can be engaged easily, with no need for manual intervention.
The Hadoop 2.0 ecosystem was enhanced with Apache YARN (Yet Another Resource Negotiator). YARN allows better job scheduling and performance, supports horizontal scaling to thousands of nodes, and enhances security by enabling auditing. YARN also allows HDFS to support programming models beyond MapReduce and eliminates restrictions (Murthy et al. 2014). YARN provides one resource manager (RM) per cluster, consisting of a scheduler and an application manager. The resource manager knows the location of the DataNodes and their resources and manages the node managers of the cluster, which offer resources to the RM.
Hadoop is designed for batch processing, and MapReduce provides a simple approach for the user to generate metrics and explore huge amounts of data without needing to care much about the exact physical location of the data. HDFS acts as a single file system and takes care of all the complexity involved in a distributed and parallel system.
Traditionally, Hadoop has been used for mining large quantities of historical data or for log file analysis in batch mode. In batch mode, users start an MR program and receive its results a few minutes later.
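The block sizes discussed in this section can be made concrete with a little arithmetic. The following Python sketch counts the blocks needed for a 1 GB file with 64 MB HDFS blocks and with 4 KB local blocks, and then spreads block replicas over DataNodes. The replication factor of 3 (the HDFS default), the DataNode names, and the round-robin placement are illustrative assumptions; the placement policy of a real HDFS cluster is rack-aware and more elaborate.

```python
MB = 1024 * 1024


def block_count(file_size_bytes, block_size_bytes):
    """Number of blocks needed for a file (the last block may be partly filled)."""
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


def place_replicas(n_blocks, data_nodes, replication=3):
    """Toy round-robin placement: each block goes to `replication` distinct nodes."""
    if replication > len(data_nodes):
        raise ValueError("not enough DataNodes for the requested replication")
    return {
        block: [data_nodes[(block + r) % len(data_nodes)] for r in range(replication)]
        for block in range(n_blocks)
    }


file_size = 1024 * MB                            # a 1 GB file
hdfs_blocks = block_count(file_size, 64 * MB)    # blocks the NameNode must track
local_blocks = block_count(file_size, 4 * 1024)  # blocks with a 4 KB block size
nodes = ["dn1", "dn2", "dn3", "dn4"]             # hypothetical DataNode names
block_map = place_replicas(hdfs_blocks, nodes)
```

With these numbers, the file occupies 16 blocks in HDFS but 262,144 blocks at a 4 KB block size, which illustrates why large blocks keep the bookkeeping load of the NameNode low.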
NoSQL Databases Promote Variety
The traditional model for database management systems, used for many decades, has been the relational model. The S in the abbreviation SQL stands for structured. Relational databases are called structured databases today because they deal with structured data. In the relational model, data is stored in tables. Tables have predefined columns, known as attributes, and each row holds one record, a collection of attribute values for the columns predefined in the data definition. Null values can substitute for nonexistent values, but every row always consists of all attributes predefined by the designer during the creation of the table. The relational model is very close to the way people tend to think about data; tables have long been a familiar way of organizing information. Relational database management systems offer indexing capabilities and allow customized views of the database, tailored to the user; they support normalization
in order to reduce redundancy and offer ways of defining constraints and attributes that reference additional tables. Relational database systems offer atomicity, consistency, isolation, and durability and are thus called ACID compliant. This means that they allow valid transactional processing even in the event of failures and errors. Atomicity means that any transaction consisting of multiple steps is considered successful only if all steps succeed. Consistency means that only data respecting all predefined rules are accepted. Isolation is achieved when transactions do not interfere with each other’s data. Durability means that no committed data is lost even if a power failure occurs during processing. Very fast data insertions, selections, deletions, and updates are guaranteed when operating within the limitations defined by each RDBMS vendor. Size can be a serious problem, especially when relational databases are used with Big Data, since structured databases mainly scale vertically. There are even relational database engines that allow storing database data straight into dedicated raw disk partitions for better performance, like Sybase, for example, which later became part of SAP. This approach first appeared in the early 1980s and is very fast, since it avoids the overhead of the file system. Still, the issue with RDBMSs is volume.
Hadoop has lifted the restriction of vertical scalability for file systems. In order to lift this restriction for databases as well, schema-less NoSQL database systems have been developed. NoSQL stands for “not only SQL.” NoSQL databases have neither a table structure nor predefined fields and are not relational. They scale horizontally and can operate well on HDFS. By scaling horizontally, NoSQL databases support volume. The reason they also address the variety issue lies in their nature.
Since they are not restricted to supporting predefined structures and data formats, they are excellent tools for storing and retrieving semi-structured and unstructured data. In MongoDB, for example, the document substitutes for the row of a structured database table. The document has a JSON-like (JavaScript Object Notation) key-value format. Documents are grouped in collections. Each document inside a collection can have its own set of keys and values, and the format can become very complex if necessary. Different types of NoSQL databases exist:
• Wide column store databases (HBase, Cassandra)
• Cache systems (MemcachedDB, Redis)
• Key-value stores (Couchbase, Redis)
• Document store databases (CouchDB, MongoDB)
• Graph databases (Neo4J)
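The idea of schema-less documents grouped in a collection can be illustrated without any database at all. In the following Python sketch, a collection is simply modeled as a list of dictionaries; the `find` helper, the field names, and the sample hotel reviews are hypothetical and do not represent the API of MongoDB or any other product.

```python
import json

# A "collection" modeled as a plain list of documents (dictionaries).
# Every document may carry its own set of keys, including nested values.
hotel_reviews = [
    {"_id": 1, "hotel": "Sea View", "rating": 5},
    {"_id": 2, "hotel": "Sea View", "text": "Lovely pool", "tags": ["pool", "family"]},
    {"_id": 3, "hotel": "Old Town Inn", "rating": 3,
     "geo": {"lat": 35.34, "lon": 25.13}},
]


def find(collection, **criteria):
    """Return the documents whose fields equal all criteria; missing keys never match."""
    return [
        doc for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


sea_view = find(hotel_reviews, hotel="Sea View")
rated = [doc for doc in hotel_reviews if "rating" in doc]
as_json = json.dumps(hotel_reviews[2], sort_keys=True)  # JSON-like key-value format
```

No table definition had to be changed to add the `tags` or `geo` fields, which is the property that makes such stores convenient for semi-structured tourism data such as reviews.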
NoSQL databases are highly scalable, fault tolerant, and distributed but generally offer no ACID compliance. A newer category of database management systems, known as NewSQL, consists of hybrid systems: they fully support SQL and the relational data model and, at the same time, can run on HDFS, offering the speed, flexibility, and scalability of NoSQL. They even support online transaction processing (OLTP). NewSQL databases often use sharding, based on a middleware layer that
splits huge relational database tables into manageable-sized pieces (shards) that are stored on different nodes of HDFS. This makes horizontal scaling feasible automatically, without any additional programming, and forms a transparent, consistent distributed system for the database programmer and user. A shard map manager (SMM) keeps track of the location of the shards and is responsible for coordinating all communications with the clients. NoSQL database management systems are gaining in importance, and there are many systems to choose from.¹
In tourism, the input data types include, beyond structured information, also free text, like textual evaluations of hotels, places, and services, as well as images and graphs, videos, and various machine-generated data collected from the interaction between sensors and mobile or wearable devices. Processing and combining high volumes of unstructured data allow neural networks and machine learning algorithms to provide more accurate results and generate better predictions. Unstructured or semi-structured data require novel technologies to analyze them in order to develop new or improved products and services (Xiang and Fesenmaier 2017). Big Data ecosystems have great potential to process such varied data.
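The role of a shard map, as described at the beginning of this passage, can be sketched as follows. The example assumes hash-based sharding, which is only one of several possible strategies and is not prescribed by the text; the class name, the node names, and the booking records are hypothetical.

```python
import hashlib


class ShardMap:
    """Toy shard map: decides which node holds the row for a given key."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.shards = {node: {} for node in self.nodes}  # one key-row store per node

    def node_for(self, key):
        # A stable hash, so the same key always maps to the same node.
        digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key, row):
        self.shards[self.node_for(key)][key] = row

    def get(self, key):
        return self.shards[self.node_for(key)].get(key)


shard_map = ShardMap(["node-a", "node-b", "node-c"])  # hypothetical node names
for booking_id in range(1, 101):
    shard_map.put(booking_id, {"booking_id": booking_id, "nights": booking_id % 7 + 1})
```

The client only calls `put` and `get`; where a row physically lives stays hidden. A simple modulo scheme like this one would force many rows to move whenever a node is added, which is one reason why production systems tend to use more refined mappings.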
Hadoop and Velocity
A Hadoop file system can serve petabytes of data. According to Forbes,² “the data we produce at the current pace every day are 2.5 quintillion bytes. This number is growing exponentially. 90% of the existing data was generated during the past two years.” Mobile phone apps use the various sensors of the device and generate useful data that is streamed continuously to data centers hosting specialized applications for storing, processing, summarizing, analyzing, and visualizing it. Back in 2013, for example, a Danish app called “Endomondo” managed to build a global sports community of around 20 million users engaging in sports activities, growing to over 25 million users by 2015.³ Every route of a biker or a jogger who grants the app GPS access is registered in near real time, and the central application provides statistics about speed, distance, and slope and records the activity on the map. The application gives calorie advice to each athlete individually, acting as a personal trainer. Historical data, statistics, and details can be retrieved for every registered activity. The activities can be shared on Facebook along with camera pictures and details of the route. Endomondo provides rich data in various formats and performs well even with low-speed mobile data connections. Velocity is one of the key factors in the success of this application.
1 http://nosql-database.org/
2 https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read
3 https://nordicgrowthhackers.com/from-zero-to-25m-users-know-your-purpose-and-grow-with-it/
The Internet of Things (IoT), where large numbers of sensors send and receive data, is another huge and fast-growing source of streaming data. On social media, some 2.7 billion active Facebook users upload enormous amounts of pictures, text, and messages every day. Other social media platforms, like WhatsApp, Tumblr, Twitter, Quora, and LinkedIn, operate applications that support heavy communication among vast numbers of members, use fast peer-to-peer messaging, allow storing and streaming of music and videos, and offer advanced application programming interfaces that are embedded in many websites and used in various contexts.
Velocity deals not only with the speed of data acquisition but also with the speed of data processing, in order to allow using and visualizing the information. Data streaming into an application should be consumed without creating congestion and long queues. When critical events generate data, the event description should be propagated as soon as possible to alert the business so that it can take the appropriate actions. Data velocity applies to all kinds of industries and operational workflows. A website operating an e-commerce application needs analytics in order to measure performance, see the load in terms of distinct sessions, and measure turnover and sales in near real time. To monitor the sales figures in this example, a microservice can be used that acts as a data producer. It may be triggered by the database management system after a sale has been concluded; it collects all necessary additional information from the tables involved and generates a message. This message is serialized and pushed into a queue, and the communication middleware makes it available to the consumer for further processing and near-real-time visualization.
If the information arrives on time, management can use it to view the current status, assess options, take action, and eventually correct bad decisions. Velocity means high-speed data acquisition, storage, and processing, with low latency and high responsiveness whenever data retrieval and visualization of results are considered critical. Several frameworks are available in the Hadoop ecosystem for controlled data streaming in order to support velocity.
Today many traditional Hadoop or Big Data applications are based on the MapReduce algorithm. Hadoop is conquering the world as a distributed, parallel batch-processing platform, using MapReduce as its “standard” engine for running compute-intensive algorithms over clusters. Although almost 15 years have passed since the first Hadoop-MapReduce applications were launched, this software combination is still largely in operation. This is not only because Hadoop managed to break the barrier of storage space, using a simple model for the user, but also because of the huge software ecosystem developed on top of it, which deals with almost every aspect of the Big Data needs of the corporate world. Hadoop/MR developers have so far managed to deal well with the security, software complexity, installation, and maintenance issues of the software. However, MR responds relatively slowly to queries and is not suited to dealing with small files because of the huge HDFS block size.
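The producer and consumer arrangement described for the e-commerce example can be sketched with the Python standard library. In this minimal illustration, an in-process queue stands in for the communication middleware and a thread for the consuming service; the function names, the message format, and the sample sales are assumptions made for the example.

```python
import json
import queue
import threading

sales_queue = queue.Queue()  # stands in for the communication middleware
SENTINEL = None              # signals that no more messages will arrive


def producer(sales):
    """Serialize each concluded sale and push it into the queue."""
    for sale in sales:
        sales_queue.put(json.dumps(sale))
    sales_queue.put(SENTINEL)


def consumer(totals):
    """Consume messages as they arrive and keep a running turnover per product."""
    while True:
        message = sales_queue.get()
        if message is SENTINEL:
            break
        sale = json.loads(message)
        totals[sale["product"]] = totals.get(sale["product"], 0) + sale["amount"]


sales = [  # hypothetical sales events
    {"product": "city-tour", "amount": 40},
    {"product": "car-rental", "amount": 120},
    {"product": "city-tour", "amount": 40},
]
turnover = {}
worker = threading.Thread(target=consumer, args=(turnover,))
worker.start()
producer(sales)
worker.join()
```

The consumer updates the turnover figures while the producer is still sending, so the totals are available shortly after each sale instead of after a nightly batch run.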
Apache Spark offers capabilities similar to those of MapReduce but shows lower latency and higher throughput. Spark usually runs on a Hadoop HDFS cluster with YARN support, like traditional Hadoop/YARN/MR solutions, but it can also run on single machines. Apache Spark started as a research project at UC Berkeley in 2009. It was open sourced in 2010 and moved to the Apache Software Foundation in 2013 (Spark Apache History 2019). Spark is a unified analytics engine for large-scale data processing.⁴ The Spark ecosystem includes four additional packages:
• Spark SQL, a native SQL platform for direct or programmable data access and querying
• Spark Streaming for working with real-time streaming data arriving from message queues, sensors, log files, and other sources
• MLlib for running machine learning algorithms and models
• GraphX for graph computations and data analysis of properties attached to vertices and edges
These four frameworks on top of Spark together form an extremely versatile and very coherent environment with a high level of abstraction. The code of the entire Spark ecosystem has been developed in Scala, a very expressive and flexible functional and object-oriented language that is well suited to heavy processing. All the framework packages extend the code base designed for Spark, share the same design philosophy, and are built on the same foundations as Spark. One of the basic reasons Spark is much faster than MR is that Spark’s nodes use so-called Resilient Distributed Datasets (RDDs), which are immutable units of data residing in memory rather than on persistent storage such as the much slower hard disk drives. Spark is also easier to set up and operate and less complicated to use. It is possible to run Spark jobs on Hadoop YARN clusters, on Apache Mesos, or using a stand-alone scheduler for light applications and for learning purposes.
Spark can be deployed when an existing Hadoop/MR application needs to process streaming data, perform graph processing, or simply add SQL support. The speed of Spark allows real-time processing of Big Data and brings Big Data one step closer to nonanalytic applications that require low-latency responses. Spark is not a substitute for Hadoop but should be viewed as a fast add-on for performing searches, since it is memory bound; the Hadoop file system is used for persisting the data. With volume handled by the horizontal extension of the Hadoop file system and variety addressed by the various NoSQL and NewSQL databases, Spark hands a velocity advantage to the Big Data programmer and technologist. In tourism applications, fast data collection and processing and the rapid production of results and visualizations are crucial.
4 https://spark.apache.org/
The Hadoop Ecosystem
The innovation of the Hadoop approach lies in the fact that the software acts as an abstraction layer that simplifies most issues and problems of distributed processing and behaves like a single huge disk to the user. Hadoop offers fault tolerance and scaling capabilities. It also supports redundancy through automated replication of files. Replication improves speed as well, because when multiple clients request the same resource, they can be served by multiple nodes. Apache Hadoop has an HDFS NFS Gateway that supports NFSv3 and allows HDFS to be mounted as part of the client’s local file system.⁵ The logistics and complexities of storing and accessing large amounts of data on clusters of machines are hidden from the developer. Although the Hadoop/MR tool combination is a rigid approach, it can solve many types of problems that involve querying huge data resources with reasonable response times. There are many applications built around Hadoop and used as add-ons. These applications form a suite of services called the Hadoop Ecosystem. Some of the main projects of the Hadoop Ecosystem are:
• Coordination service: Apache Zookeeper
• NoSQL databases: Apache HBase, Cassandra, MongoDB
• RDBMS import and export: Sqoop
• Data processing tools: Apache Hive, Apache Pig, Impala
• Data analysis tools: Drill, Lucene
• Data serialization components: Avro and Thrift
• Log file integration component: Flume
• Machine learning and data mining tool: Mahout
5 https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html
Apache Zookeeper
Zookeeper is a distributed coordination service. It maintains configuration information and knows all names and nodes. It supports synchronization and acts as a name registry for all resources, guaranteeing consistency, reliability, and timeliness of all updates.
Apache HBase
HBase is an open-source, reliable, and very fast NoSQL database management system that allows real-time reading and writing of Big Data. HBase supports processing of very large tables with millions of columns and billions of rows stored on top of Hadoop. HBase is coordinated by Zookeeper. It supports random reads and writes, and its integration with MapReduce allows applications to scale gracefully.
Cassandra
Apache Cassandra is a fast, distributed NoSQL database system that is highly fault tolerant as well as scalable. It provides high availability and linear scalability, twin goals that traditional relational databases cannot satisfy when handling very large datasets (Alapati 2018).
Sqoop
Apache Sqoop allows transferring structured data between relational databases and Hadoop HDFS (Ting and Cecho 2013).
Apache Hive
Hive is an SQL query engine that converts SQL queries to MapReduce jobs. It thus simplifies using Hadoop as a data warehouse and supports queries. Tables correspond to HDFS directories. Hive is used for analytic jobs, unlike HBase, which is a true DBMS. Hive can easily integrate with traditional data center technologies through the familiar JDBC interface. It is used as a bridge to integrate applications that use tabular data with Hadoop. The Hive Query Language (HQL) has semantics and functions similar to those of standard SQL for relational databases, so that experienced database analysts can pick it up easily. Hive’s query language can run on different computing engines, such as MapReduce, Tez, and Spark (Du 2018).
Pig: Pig Latin⁶
Apache Pig is a platform for analyzing large datasets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large datasets. At the present time, Pig’s infrastructure layer consists of a compiler that produces sequences of MapReduce programs, for which large-scale parallel implementations already exist. Pig’s language layer currently consists of a textual language called Pig Latin.
Mahout
Mahout is a platform that supports executing machine learning algorithms. It allows finding meaningful patterns in data stored in HDFS and supports classification and clustering (Giacomelli 2013). It supports Scala and Apache Spark.
It is a welladjusted and complicated black box solution that allows further customization and building of new models, rules, and features (Lyubimov and Palumbo 2016). Drill Drill is a tool that supports direct ANSI SQL queries against any data stored in HDFS. Drill does not require schemas and allows processing complex records easily and fast.
6 https://pig.apache.org/
Apache Flume
Apache Flume collects, aggregates, and moves large amounts of data from various sources, such as log files, social media, and web applications, into HDFS.

This application and project list is by no means exhaustive. The number of applications around Hadoop is already large, and new applications are added constantly.
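To make the statement that Hive converts SQL queries to MapReduce jobs more concrete, the following minimal sketch expresses a GROUP BY aggregation as map, shuffle, and reduce phases in plain Python. It is only a conceptual illustration and not Hive's actual query planner; the `bookings` rows, the query, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical booking rows; in Hive these would live in a table
# backed by files in an HDFS directory.
bookings = [
    {"destination": "Crete", "nights": 7},
    {"destination": "Tyrol", "nights": 3},
    {"destination": "Crete", "nights": 4},
    {"destination": "Lapland", "nights": 5},
    {"destination": "Tyrol", "nights": 2},
]

# Conceptual equivalent of the (hypothetical) query:
#   SELECT destination, COUNT(*), SUM(nights)
#   FROM bookings GROUP BY destination;


def map_phase(row):
    """Map step: emit one (key, value) pair per input row."""
    yield row["destination"], (1, row["nights"])


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: aggregate all values of one key into one output row."""
    bookings_count = sum(count for count, _ in values)
    total_nights = sum(nights for _, nights in values)
    return key, bookings_count, total_nights


def run_query(rows):
    pairs = (pair for row in rows for pair in map_phase(row))
    grouped = shuffle(pairs)
    return sorted(reduce_phase(key, values) for key, values in grouped.items())


result = run_query(bookings)
for row in result:
    print(row)
```

In a real cluster the map and reduce steps would run in parallel on different nodes; here they run sequentially in one process, which is enough to show how the grouping key of the query becomes the key of the shuffle.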
Big Data Infrastructure Vendors
The Apache Software Foundation (ASF) is a popular open-source software vendor. It has been an open-source technology incubator for 20 years, and the software products it has made available to the public at no cost have an estimated value of more than 20 billion dollars. Three hundred different top-level projects, ranging from Java libraries and Application Programming Interfaces for various tasks, web servers, and Database Management Systems to Big Data platforms and frameworks, are available for immediate use. Almost all the Big Data frameworks mentioned above are Apache projects. This means they are all open source and available for free under Apache licenses, which allow free personal or commercial use, modification, distribution, and even selling. Both Hadoop and Spark are ASF projects. In theory, a technically knowledgeable administration team can set up and run Hadoop and Spark with low installation costs. Still, the total cost of operation and ownership includes hardware and installation costs and possibly commercial software and maintenance costs. Spark clusters usually consist of commodity machines with a lot of RAM, which are usually more expensive than the ones used for Hadoop.

Although anybody can download a binary executable or even the full source code of any project and use it under the license conditions described here, there are commercial companies that sell enterprise-ready distributions, packaging Hadoop specifically for enterprises. They also provide support and training courses. The success of this market is due to the lack of expertise and skilled resources, along with the high complexity of deployment and the difficulty of choosing the right combination of tools for each application among the options offered in the ecosystems. The main Big Data software and Cloud vendors today are:
• Cloudera – Hortonworks
• Amazon Web Services Elastic MapReduce Hadoop Distribution
• Microsoft
• MapR
• IBM InfoSphere® Insights
• DataBricks
Conclusions
As seen, there are many different ways to enter the world of Big Data. The Apache software packages can easily be installed on stand-alone Linux or Windows computers for testing, educational, or production purposes, and a few simple commodity machines can serve as test beds for installing the Hadoop file system on a cluster and learning how to work with YARN, Zookeeper, and Spark and their entire ecosystems. It is advisable even for small and medium enterprises to start considering this form of infrastructure. Organizations and companies involved with tourism have always been among the first to adapt to technological changes, mainly because of the large volume and the very competitive nature of their ventures. Big Data technologies allow practically unlimited data quantities, fast processing, and diverse data sources and file formats. Intelligent machine learning algorithms and neural networks can expand their input data range far beyond the company’s Intranet. They thus become capable of producing more complete analytics results, comparative measurements, and predictions, based on correlations and information available from multiple sources such as public databases, social media, reservation systems, intermediaries, and destination management organization web services. Today’s tourists are more knowledgeable and better equipped with mobile devices, and the Internet gives them access to all sources of any required services 24/7. They can reserve, book, buy, and combine airline tickets with hotel reservations and train or bus schedules, and rent vehicles or book cruises from home, without the support of any tour operator or intermediary (Gursoy et al. 2015).
Intelligent systems that take advantage of Big Data can support the traveler during the booking phase and provide invaluable customized information about any interesting destination, restaurant, location, point of cultural interest, and option during the stay, along with multimedia and rich information about the locations. Thus, the three Vs of Big Data find direct utilization in all phases and aspects of the travel industry by extending the data volume, the types of data that can be processed, and the speed of response. The latest technology shift is toward Cloud solutions: the Cloud can scale dynamically and provide computing resources fitted to the exact needs of the corporation, and it frees the company from the need to maintain and support physical storage and hardware. In peak periods, the Cloud provides the resources needed to avoid bottlenecks, so the company does not have to own and support infrastructure that is needed only to serve requests at peak times and is underutilized during low-request periods.
Cross-References
Advanced Web Technologies and E-Tourism Web Applications
Business Intelligence in Tourism
Log File Analysis
References
Alapati SR (2018) Expert apache cassandra administration. Apress, Flower Mound. ISBN:978-1-4842-3125-8
Du D (2018) Apache hive essentials, 2nd edn. Packt Publishing Ltd., Birmingham. ISBN:978-1-78899-509-2
Ghemawat S, Dean J (2020) [Online], 10 Feb 2020. https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
Giacomelli P (2013) Apache Mahout cookbook. Packt Publishing, Birmingham. ISBN:978-1-84951-802-4
Grimes S (2013) Big data: avoid ’Wanna V’ confusion [Online]. InformationWeek, 7 Aug 2013–01 May 2020. https://www.informationweek.com/big-data/big-data-analytics/big-data-avoid-wanna-v-confusion/d/d-id/1111077?
Gursoy D, Saayman M, Sotiriadis M (2015) Collaboration in tourism businesses and destinations, a handbook. Emerald Group, Bingley. ISBN:978-1-78350-811-2
Holmes A (2014) Hadoop in practice, 2nd edn. Manning, Shelter Island. ISBN:9781617292224
Laney D (2001) 3D data management: controlling data volume, velocity, and variety. [Online] Application delivery strategies. Gartner Inc. META Group Inc, 6 Feb 2001–1 May 2020. https://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf
Li J et al (2018) Big data in tourism research: a literature review. Tour Manag 68:301–323
Lyubimov D, Palumbo A (2016) Apache Mahout: beyond MapReduce. CreateSpace Independent Publishing Platform. ISBN:1523775785
Murthy AC et al (2014) Apache hadoop YARN: moving beyond MapReduce and batch processing with Apache Hadoop 2. Addison-Wesley. ISBN:978-0-321-93450-5
Schieber P (1987) The wit and wisdom of Grace Hopper [Online]. Yale University, Apr 1987–Jan 2020. http://www.cs.yale.edu/homes/tap/Files/hopper-wit.html
Sigala M, Rahimi R, Thelwall M (2019) Big data and innovation in tourism, travel and hospitality. Springer Nature, Singapore. ISBN:978-981-13-6338-2
Spark Apache History (2019) [Online]. 15 June 2019. https://spark.apache.org/history.html
Ting K, Cecho JJ (2013) Apache Sqoop cookbook. O’Reilly Media, Inc., Sebastopol. ISBN:978-1-449-36462-5
Warden P (2011) Big data glossary. O’Reilly Media Inc., Sebastopol. ISBN:978-1-449-31459-0
Xiang Z, Fesenmaier DR (2017) Analytics in smart tourism design, concepts and methods. Springer International Publishing AG, Switzerland. ISBN:978-3-319-44262-4
Artificial Intelligence and Machine Learning
18
Luisa Mich
Contents
Introduction
Artificial Intelligence Definitions and Characteristics
Artificial Intelligence and Machine Learning
Machine Learning Algorithms
Machine Learning Models
Machine Learning Limitations
Artificial Intelligence Tools and Applications
Artificial Intelligence in Tourism
Artificial Intelligence Adoption
Future Trends
Artificial Intelligence Risks
Cross-References
References
L. Mich, Department of Industrial Engineering, University of Trento, Trento, Italy
© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_25

Abstract
Thanks to more powerful hardware and a new generation of learning algorithms, artificial intelligence is supporting the automation of a number of tasks and activities that are changing the job landscape as much as they have impacted on our everyday life. The first part of the chapter introduces artificial intelligence from its origins: its definition and its main research and application areas. The nature and the importance of machine learning for artificial intelligence applications are presented in the second part of the chapter. Existing approaches to machine learning are also classified and illustrated. The third part describes artificial intelligence tools and solutions by supported functionalities and automated tasks. Cases of applications in tourism are provided, from the best known and widely adopted, e.g., personal assistants, to the most challenging, i.e., semantic systems. Future trends and risks related to the applications of artificial intelligence are considered in the last part of the chapter.
Keywords Artificial intelligence · Learning algorithms · Machine learning · Training · Data · Explainability
Introduction
The goals of this chapter are to introduce the most important concepts needed to understand the scope and the complexity of artificial intelligence (AI) and to give insights, some of which involve tourism, into the variety of frantically evolving applications. As in any other area of investment, AI-based solutions have to be planned and designed considering the state of the art of the involved technologies and the related risks. The nature of the AI field and the increasing research and commercial interest go along with challenging decision-making scenarios. Following a critical thinking approach, the chapter provides a general framework and information on the most authoritative sources for staying up to date on AI developments.
Artificial Intelligence Definitions and Characteristics
AI is an umbrella term used to indicate a variety of diverse “things,” ranging from a discipline to tools implementing any form of intelligent behavior (as, e.g., in the expression “How to build an AI”), and unlikely gadgets that exhibit smart functionalities. Defining AI is difficult also because defining intelligence is difficult (Searle 1980); this ambiguity is possibly the main root of such a babel. AI is often used as a synonym of “automation,” to underline that a process or a procedure is performed without human assistance or to refer to a technology or a system, e.g., a (chat)bot, that can “understand,” “reason,” “learn,” and “interact” in a “natural way,” even if this is not always the case. As a matter of fact, no intelligence at all is involved sometimes. The first challenge is therefore to define AI:

I first heard the term (AI) more than 50 years ago and have yet to hear a scientific definition. (David 2017)
First of all, AI is a discipline with many research areas and applications. AI roots are in cybernetics and computer science. The term “Artificial Intelligence” was officially coined by John McCarthy in 1955 as the name of a field investigating “thinking machine”:
(AI) “is the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable.” (McCarthy 2007)
The goal of AI was: to study the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. (Anthes 2017)
Not surprisingly, AI foundations are related to a long list of disciplines. Besides computer science and computer engineering, control theory, and cybernetics, they involve robotics, mathematics, physics, statistics, economics, neuroscience, biology, psychology, linguistics, philosophy, and, more recently, ethics and law. The evolution of AI has known many rises and falls. Periods of excessive expectations and optimism, with high investments, were followed by limited results and funds: the so-called AI winters (Nilsson 2009; Lim 2018). Professionals and companies should therefore be cautious and always adopt a critical approach to AI projects. AI problem domains, and the core courses for a degree in AI, are related to a variety of research areas, with different scopes and goals. Classical AI focused on problem-solving, search and optimization, intelligent agents, logic, knowledge representation, uncertainty, reasoning, planning and decision-making, learning models, natural language processing, computer vision, and robotics (Russell and Norvig 2020; Elsevier 2019). Traditional classifications distinguish between two types of AI, corresponding to very different visions: strong AI (or general AI), whose goal is to imitate the hardware (human mind and body) in order to build a machine with “consciousness, sentience and mind”; and weak AI, also called narrow AI, whose goal is to mimic intelligent behaviors. While strong AI is far from reaching its final goal, narrow AI has produced some spectacular results that have increasingly fuelled interest and improvements (Tegmark 2017). Multiple taxonomies and models can be used to represent the AI toolbox, but all of them can be traced back to capabilities which characterize human intelligence.
The following list supports a first classification of AI subfields and “application” areas:
• Communication, i.e., the ability to understand language and communicate: natural language processing, natural language understanding, question-answering, semantic annotation, and machine translation.
• Perception, the ability to transform raw sensory inputs (images, sounds, etc.) into usable information: computer vision, image recognition, speech recognition, affective computing, and ambient intelligence.
• Knowledge, the ability to represent and understand the world: expert systems, knowledge management systems, business intelligence systems, and semantic networks.
• Planning, the capability of setting and achieving goals: route planning, robotics, autonomous systems, and board and video games.
• Reasoning, the capability to generate conclusions from available knowledge: intelligent decision support systems, soft computing, and theorem proving.
The list is not (and cannot be) exhaustive, also because of the high rate of innovation in AI methods and technologies. More importantly, AI areas are not mutually exclusive; for example, knowledge representation techniques are required in almost all the areas; dealing with input or output in natural language (NL) is needed in a variety of AI applications; computer vision is useful for robotics; and so on.
AI solutions can also be classified according to the approach used to solve AI problems. There are two main approaches: symbolic and sub-symbolic. The symbolic approach assumes that intelligent systems can be implemented by manipulating symbols (Newell and Simon 1972). Expert systems are the traditional and most successful example of this approach, exploiting logic programming languages and explicit, human-readable rules representing professional knowledge (Russell and Norvig 2020). Difficulties in dealing with the “knowledge acquisition bottleneck” (i.e., eliciting experts’ knowledge and translating it into rules) and the increasing size of rule-based systems, which more and more frequently led to contradictory conclusions, pushed research toward sub-symbolic approaches. Their origin dates back to connectionism and the Perceptron, a software program simulating a neural network (NN) (Rosenblatt 1958). The first applications of NNs were unsatisfactory and severely criticized. Only the availability of higher computational resources, technologies to retrieve, store, and process huge datasets (or big data), and new algorithms, in particular machine learning algorithms, has triggered the rebirth of (sub-symbolic) AI (Anthes 2017).
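The contrast between the two approaches can be sketched in a few lines of Python. In the symbolic version the knowledge is written down as an explicit rule; in the sub-symbolic version a single perceptron learns an equivalent decision from labelled examples. The "warm and affordable" rule, the binary features, and the function names are hypothetical and chosen only for illustration; the update rule is the classic perceptron learning rule, not a reconstruction of any specific system cited in this chapter.

```python
# Symbolic approach: the knowledge is an explicit, human-readable rule
# (hypothetical rule: suggest a beach holiday only if it is warm AND affordable).
def rule_based(warm, affordable):
    return 1 if (warm == 1 and affordable == 1) else 0


# Sub-symbolic approach: a perceptron adjusts numeric weights from examples.
def train_perceptron(samples, epochs=20, learning_rate=1.0):
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            activation = weights[0] * x1 + weights[1] * x2 + bias
            predicted = 1 if activation > 0 else 0
            error = target - predicted  # -1, 0, or +1
            # Perceptron learning rule: move the weights toward the target.
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error
    return weights, bias


def perceptron_predict(weights, bias, x1, x2):
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0


# Labelled examples generated from the rule, standing in for expert-tagged data.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
samples = [((x1, x2), rule_based(x1, x2)) for x1, x2 in inputs]

weights, bias = train_perceptron(samples)
for x1, x2 in inputs:
    print((x1, x2), rule_based(x1, x2), perceptron_predict(weights, bias, x1, x2))
```

The rule can be read and audited directly, whereas the perceptron's knowledge is held only in its weights; this works here because the target function is linearly separable, which a single perceptron requires.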
Artificial Intelligence and Machine Learning
Many of the recent successful AI applications leverage machine learning (ML) algorithms, so that the two terms, ML and AI, are often used interchangeably. Learning is an important feature of intelligence, but ML is actually a subfield of AI. The term ML was introduced in 1959 by Arthur Samuel (Samuel 1959). ML investigates and implements automatic processes, moving away from traditional procedural software programs toward problem-solving approaches based on learning models. ML exploits mathematical and statistical models and techniques, and its evolution and achievements are strongly related to computational statistics and data mining techniques (Neapolitan and Jiang 2018). In computer science, programs encode algorithms, defined in terms of steps explicitly describing how to process (defined) input to produce output as the solution of (a class of) given problems (Knuth 1997). In AI, symbolic approaches redefined the concept of program in two different ways: (a) starting from a set of rules (the knowledge base) to be activated by an “engine,” using logic programming languages (e.g., Prolog), or (b) evaluating
functions using functional languages (e.g., Lisp). In all those programming “styles,” a program can be defined as a “set of messages to the computer on how to operate to get a solution.” ML drastically changed the concept of program. Its main assumption is that a computer can learn from data: (large sets of) data are used as input for software models (e.g., neural networks) to “configure” them in order to produce an output. ML algorithms process data to make decisions (e.g., classifying input) and predictions. The workflow, i.e., the process, needed for an ML application or project depends on the learning algorithm. In general terms, it is necessary to start with (a) an analysis and (b) pre-processing of the available data and then (c) to define the input, (d) validate the results, usually on a subset of the input data, and (e) apply the ML algorithm. In the next two sections, a classification of ML algorithms and models is given, with a short description. Both mathematics and statistics are needed to gain a broader understanding. For each of them, many examples are available online, and some applications are listed in the third part of the chapter. A variety of ML courses can also be found online, some of which are reviewed in Venturi (2017). The goal of this section is to give a general idea of the main topics related to ML, a complex and dynamic research and application area.
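The workflow steps listed above can be sketched end to end on a toy dataset. Everything in the example is an assumption made for illustration: the guest records, the choice of a nearest-centroid classifier as the "ML algorithm," the decision to hold out every third record for validation, and the fitting step placed between (c) and (d).

```python
# Hypothetical guest records: (nights_stayed, spend_per_night_eur, returned).
# "returned" is the label to predict; None marks a missing value.
raw = [
    (1, 80, 0), (2, 95, 0), (2, None, 0), (3, 110, 0),
    (5, 150, 1), (6, 170, 1), (7, 160, 1), (1, 70, 0),
    (8, 210, 1), (4, 120, 0), (6, 190, 1), (9, 230, 1),
]

# (a) Analysis: a first look at data quality.
n_missing = sum(1 for record in raw if None in record)

# (b) Pre-processing: drop incomplete records and rescale features to [0, 1].
clean = [record for record in raw if None not in record]


def fit_range(values):
    return min(values), max(values)


def rescale(value, bounds):
    low, high = bounds
    return (value - low) / (high - low)


# For brevity the ranges are computed on all clean records.
nights_range = fit_range([r[0] for r in clean])
spend_range = fit_range([r[1] for r in clean])

# (c) Define the input: feature vectors X and labels y.
X = [(rescale(r[0], nights_range), rescale(r[1], spend_range)) for r in clean]
y = [r[2] for r in clean]

# Hold out every third record for validation; fit on the rest.
train_idx = [i for i in range(len(X)) if i % 3 != 2]
valid_idx = [i for i in range(len(X)) if i % 3 == 2]


def centroid(points):
    return tuple(sum(p[d] for p in points) / len(points) for d in range(2))


# The "model" is one centroid per class, computed from the training records.
centroids = {
    label: centroid([X[i] for i in train_idx if y[i] == label])
    for label in (0, 1)
}


def predict(x):
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(x, centroids[label]))


# (d) Validate the results on the held-out subset.
correct = sum(1 for i in valid_idx if predict(X[i]) == y[i])
accuracy = correct / len(valid_idx)

# (e) Apply the fitted model to a new, unseen guest.
new_guest = (rescale(7, nights_range), rescale(180, spend_range))
prediction = predict(new_guest)

print(n_missing, accuracy, prediction)
```

A real project would replace each step with far more careful work (exploratory statistics, feature engineering, cross-validation), but the order of the steps stays the same.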
Machine Learning Algorithms
There are three main types of learning algorithms:
• Supervised learning: systems are trained in advance with labelled examples.
• Unsupervised learning: input data are not tagged in advance, and the goal of the algorithm is to identify common elements, such as patterns or clusters.
• Reinforcement learning: systems are rewarded when they get the right answers.
In systems implementing supervised learning algorithms, a number of input data (for example, tagged pictures of cats and dogs) are processed together with the right output (a classification of the pictures into the “cat,” “dog,” or “other” category). This phase is called “training.” The trained system is then used to classify new input (unseen pictures of cats, dogs, or others). The performance of supervised learning algorithms depends on the input data: large sets of labelled data are needed, and most of them are used in the training step; as a rule of thumb, 2/3 for training and 1/3 for evaluation. In unsupervised learning, algorithms look for common characteristics or features in input data in order to identify clusters or patterns by applying similarity metrics. Data are not labelled in advance as in supervised learning: an unsupervised algorithm learns to “classify” new input data by measuring how close their features are to those of other data. Reinforcement learning algorithms are based on the idea of rewarding actions that have a positive outcome, usually with the goal of maximizing a cumulative
reward function. Reinforcement learning algorithms are typically applied in multi-agent systems, videogames, and in general when defining the environment in terms of a mathematical function is not possible. Some classifications also include semi-supervised learning algorithms, in which the input data are partially labelled. Other algorithms try to improve specific tasks in the ML workflow. For example, feature learning algorithms pre-process the input data in search of a better representation.
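The reinforcement learning idea described above, rewarding actions with a positive outcome in order to maximize a cumulative reward, can be sketched with tabular Q-learning on a tiny environment. The corridor environment, the reward of 1 at the goal, and the parameter values are hypothetical. To keep the example deterministic, the agent sweeps systematically over all state-action pairs instead of exploring at random, which is a simplification of how such algorithms are normally run.

```python
# Hypothetical environment: a corridor of four cells, 0..3; cell 3 is the goal.
GOAL = 3
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Move within the corridor; reaching the goal yields a reward of 1."""
    next_state = max(0, min(GOAL, state + action))
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


alpha = 0.5   # learning rate (assumed value)
gamma = 0.9   # discount factor for future rewards (assumed value)

# Q-table: estimated cumulative reward for each (state, action) pair.
q = {(s, a): 0.0 for s in range(GOAL) for a in ACTIONS}

for _ in range(200):
    for s in range(GOAL):
        for a in ACTIONS:
            next_state, reward = step(s, a)
            if next_state == GOAL:
                best_next = 0.0  # the episode ends at the goal
            else:
                best_next = max(q[(next_state, b)] for b in ACTIONS)
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])

# Greedy policy: in each state, pick the action with the highest estimate.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

The learned policy moves right in every cell, because that is the only direction from which the reward can be collected; no one told the agent this rule, it emerged from the reward signal.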
Machine Learning Models
Machine learning algorithms are implemented using different models or “techniques.” The most important are (artificial) neural networks, decision trees, Bayesian networks, and support vector machines. However, their variety (Burkov 2019; Géron 2019; Neapolitan and Jiang 2018; Theobald 2018) is such that one of the most critical steps in an AI project is to identify the most suitable model.
Neural networks are loosely modelled after the neuronal structure of the human brain and can be used for all classes of learning algorithms, from supervised to reinforcement learning. Their success has recently been pushed by methods called deep learning, which work on multilayer NNs. Each layer processes the data in order to extract features at a different level of abstraction. Deep learning is also implemented using other methods, for example, belief networks. Their applications usually require defined objectives, e.g., in games. A specific type of deep learning algorithm exploits generative adversarial networks: two NNs play against each other using game strategy, usually in the form of a zero-sum game.
Decision trees represent questions on a given item or issue as nodes in a tree; the answers determine the path to be chosen at each branching point, and the leaves hold the resulting outcomes. Decision trees are visualized upside down, with the root at the top. They can be used both for classification (categorical or discrete output) and regression (numerical or continuous output) tasks.
Bayesian networks, also referred to as belief networks (Neapolitan 2019), are represented as directed acyclic graphs, i.e., graphs with directed links (edges) not containing directed cycles. Bayesian networks are based on probabilities derived from data or expert opinions: edges correspond to conditional dependencies and nodes to variables.
Support vector machines (SVM) include supervised learning methods used for classification and regression.
The goal of SVMs is to classify inputs characterized by a number of features (the dimensions of the classification space) into different zones of that space (half-planes for two-dimensional input; half-spaces separated by a hyperplane in n-dimensional spaces). The problem is that there are usually many ways to divide the input data into subsets characterized by similar feature values, so optimization functions are applied to identify the hyperplane maximizing the “distance” between these subsets (Steinwart and Christmann 2008).
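The optimization mentioned above can be written compactly for the simplest case. The formulation below is the standard hard-margin SVM, which assumes the two classes are linearly separable; the notation (training pairs $(x_i, y_i)$ with labels $y_i \in \{-1, +1\}$, weight vector $w$, offset $b$) is introduced here for illustration and does not appear in the original text.

```latex
% Assumed notation: m training pairs (x_i, y_i), labels y_i in {-1, +1},
% separating hyperplane w^T x + b = 0.
\begin{aligned}
\min_{w,\,b}\quad & \tfrac{1}{2}\,\lVert w \rVert^{2} \\
\text{subject to}\quad & y_i\,(w^{\top} x_i + b) \ge 1, \qquad i = 1,\dots,m .
\end{aligned}
```

The constraints force every training point onto the correct side of the hyperplane, and since the width of the margin between the two classes is $2 / \lVert w \rVert$, minimizing $\lVert w \rVert$ maximizes the “distance” between the subsets.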
Machine Learning Limitations
Machine learning offers AI a number of approaches and tools. Some critical issues, however, must be taken into consideration for a successful ML application. Some of these issues are related to specific learning algorithms, e.g., supervised algorithms. Others are more general and shared by a number of ML solutions. The most relevant are the following:
• Lack of data for training and the high cost of creating an adequate amount of labelled data.
• Over-fitting and under-fitting, which occur when, in an inductive learning process, the function mapping input data to output reproduces the latter too tightly or, respectively, too loosely.
• Limited ability to deal with unknown “circumstances” that the algorithms have not been trained for, known in computer science as a robustness problem.
New learning models are being investigated to overcome these limitations. Among them, predictive learning would be an important step toward human intelligence, as it is the main form used by humans and animals. Being able to reproduce transfer learning, i.e., the capability to apply acquired knowledge to new tasks (Yamada et al. 2018), would also be very important. Another advanced ML model is lifelong learning, which emulates the human capability to learn continuously by accumulating past knowledge (Anthes 2019; Chen and Liu 2018). To implement ML solutions, developers can choose among a few platforms supporting the entire process, for example, Scikit-Learn or TensorFlow (Géron 2019). Robotics is one of the areas in which the most innovative learning algorithms and hardware solutions are applied. Some projects that focus on designing and developing “learning” machines can be mentioned.
The iCub project at the Istituto Italiano di Tecnologia (IIT) develops a “learning humanoid robot.”1 Researchers at the Creative Machines Lab at Columbia University develop machines that ‘can design and make other machines’.2 One of the project goals at MIT (Massachusetts Institute of Technology) is to create robots that learn language like children do.3
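The over-fitting and under-fitting issue listed among the limitations above can be illustrated with three toy models fitted to the same noisy data. The data, the two deliberately flipped labels, and the three models are all hypothetical and chosen so that the effect is easy to see; the sketch uses plain Python rather than any of the platforms mentioned in the text.

```python
# Hypothetical 1-D data: the true label is 1 when x >= 6, else 0.
# Two training labels (at x = 2 and x = 9) are deliberately flipped as noise.
train_x = list(range(12))
train_y = [0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1]

# Unseen test points with clean labels.
test_x = [x + 0.4 for x in train_x]
test_y = [1 if x >= 6 else 0 for x in test_x]


def memorizer(x):
    """Over-fitted model: copy the label of the nearest training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def constant(x):
    """Under-fitted model: ignore the input and always answer 0."""
    return 0


def fit_threshold():
    """Pick the cut-off value that makes the fewest training errors."""
    def errors(t):
        return sum((1 if x >= t else 0) != label
                   for x, label in zip(train_x, train_y))
    return min(train_x, key=errors)


threshold = fit_threshold()


def threshold_model(x):
    """Balanced model: a single learned cut-off."""
    return 1 if x >= threshold else 0


def accuracy(model, xs, ys):
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)


for name, model in [("memorizer", memorizer),
                    ("constant", constant),
                    ("threshold", threshold_model)]:
    print(name,
          round(accuracy(model, train_x, train_y), 3),
          round(accuracy(model, test_x, test_y), 3))
```

The memorizer reproduces the training labels perfectly, noise included, and therefore does worse on unseen data than on the training set; the constant model is too loose to capture anything; the single threshold tolerates the two noisy labels and generalizes best.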
1 https://www.iit.it/research/lines/icub
2 https://www.creativemachineslab.com
3 http://news.mit.edu/2018/machines-learn-language-human-interaction-1031

Artificial Intelligence Tools and Applications
Citing some of the systems that were landmarks in the history of AI is a helpful way to begin introducing the large variety of AI tools and technologies. One of the most famous systems, often mentioned as the turning point for AI, was Deep Blue, a chess-playing machine which defeated a human champion in 1997 (Newborn 2003). Another important system was IBM Watson, which in 2011 outperformed humans playing Jeopardy, a quiz game (Ferrucci et al. 2010). More recently, AlphaGo did the same in 2017 playing Go,4 an ancient Chinese board game. These projects share two interesting characteristics, which are useful to introduce AI applications:
• Their tasks and goals are well defined, as all these systems were designed to play games whose rules and scores can be clearly stated and verified. This means that they provided a good way to measure and communicate AI progress. In other words, they were successful also thanks to a comprehensible and sound research “goal,” allowing good scientific communication and marketing and helping to revive investments in AI.
• All these projects, beyond their specific tasks and goals, were backed by large investments and research groups working on a variety of different tasks. The scope and the impact of the projects were much bigger than the media reported.
IBM Watson, whose core task was answering questions posed in NL, is now described as a system that can “‘see,’ ‘hear,’ ‘read,’ ‘talk,’ ‘taste,’ ‘interpret,’ ‘learn,’ and ‘recommend’”5 (Gondek 2011). With all these abilities, it is not surprising that Watson has been applied in a number of innovative projects, though unfortunately not all of them have been successful, perhaps because of overly ambitious goals (Strickland 2019). The general problem is that, for some applications, the limitations of current AI solutions are still there (Searle 2011). As for AlphaGo, recent studies made it possible to design a new version, called AlphaGo Zero. The biggest difference lies in the learning algorithm, as Zero was not trained on thousands of games but started from random play and then played against itself. AlphaGo Zero defeated AlphaGo by 100 games to 0 (Silver et al. 2017). The resulting algorithms are much more important than their specific applications.
4 https://deepmind.com/research/alphago
5 https://www.codexploitcybersecurity.com/2018/04/ibm-watson-artificial-intelligence.html

The number of tools and applications offered by AI is overwhelming, also due to the continuous and dramatic evolution of the field. As a consequence, providing a comprehensive report of the AI toolbox is impossible. Nonetheless, a classification and a brief description of some of the existing AI applications follow, and a few systems that exemplify the disruptive impact of AI are introduced. A general classification distinguishes horizontal from vertical applications.
Horizontal applications support processes and tasks (mostly) independently of sectors or domains. These applications are strictly related to the list of AI areas introduced in the first part of the chapter. In terms of functionalities, some AI tools support communication in different ways, i.e., they use different types of media (text, image, audio, video) as well as different interfaces and devices. The most widespread tools are those implementing forms of conversation, question-answering, language translation, and information filtering and ranking, all related to results in the area of NLP (Natural Language Processing). These tools can therefore be used to support intelligent search and optimization tasks (e.g., virtual assistants or companions), as in recommendation systems or customized advertising. Other NLP applications are useful to implement the semantic Web (Berners-Lee et al. 2001), or Web 3.0, in order to perform web sentiment analysis and web reputation monitoring (see Chap. 10, “Advanced Web Technologies and E-Tourism Web Applications”). More recent AI applications in the area of affective computing deploy emotion recognition (Algorithmia 2018) and even “thought hearing” (Kapur et al. 2018). For companies, AI solutions can be used in a wide range of tasks including prediction, anomaly detection, diagnostics (e.g., image analysis), and decision-making under uncertainty. As a matter of fact, a company can integrate or introduce AI solutions into any of its information system modules, and one of the most important cases is the implementation of successful cyber-physical systems (Alur 2015). Chatbots, or just bots, for example, can support customer relationship management (CRM) in an innovative way. Applying AI to adapt services and products to the single user, also personalizing the communication style (Debjani 2019), is a goal shared by many applications. Innovative marketing solutions aim to improve the so-called “customer experience” (CX), an evolution of CRM concerning the relationships between companies and their customers. AI allows companies to create faster, more personalized experiences and, more importantly, gain insights that help in leading the customer experience (Morgan 2018). Companies can use AI techniques to give employees new analytical tools to improve CX, discovering new opportunities for products or services but also new business models.
An example of an advanced AI tool for CX, aimed at retailers and manufacturers, is AlgoFace, where AI algorithms, computer vision, and augmented reality (AR) technologies are applied in order to analyze the human face for an immersive and personalized CX (Hussain 2019). Vertical applications are developed to support activities related to a specific domain, or sector. Among them, content creation and summarization tools, along with social networking platforms, are changing journalism; machine-created music, poetry, and painting are challenging the art industry. AI tools can be applied to domotics (or home automation), also exploiting the Internet of Things (IoT). A strong impact of AI is found in financial and insurance services; robotics is changing organizational and productive structures (Corea 2019). The case of robotics is interesting because an intelligent robot can draw on almost all the results of the other AI areas. To interact with a complex environment, as well as with human and artificial “subjects” and a variety of devices, intelligent robots (in self-driving cars, for example) have to listen, talk, see, move, feel, and perceive, abilities in which new AI algorithms and technologies allow continuous progress. Finally, it should not be forgotten that AI techniques are increasingly used to analyze big data in several research fields: astrophysics, particle physics, biology, social sciences, and many more.
L. Mich
Artificial Intelligence in Tourism

Since the inception of the first information systems for supporting flight reservation, tourism has been a sector where information technologies have completely changed the scenarios. The adoption of AI applications followed a similar pattern, redesigning both the offer and the demand side of the market. Companies and organizations in the tourism sector are taking advantage of AI technologies in all their information systems solutions. There are AI solutions for all the stakeholders: travellers, tourists, operators, destination managers, OTAs (online travel agencies), etc. By deploying AI technologies, businesses can further exploit information systems not only to save time and money and improve the quality of their processes and services but also to change their business models, by including customers in their strategies. The best-known digital systems for tourism operators are the Global Distribution Systems (GDSs), an evolution of Computer Reservation Systems (CRSs) and the legacy of systems supporting a variety of services based on information technology. An important change in those systems happened when the Internet and the Web opened access to digital services to the users too, eliminating or modifying the roles of intermediaries and all other operators in the tourism offer. AI pushed that process even further, offering new ways to interact and support online strategies. Natural goals of AI applications in tourism are to improve personalisation and tailor recommendations, something that applies to the new generation of GDSs too. Thanks to the huge amount of data related to a variety of online interactions and transactions, AI algorithms enable GDSs to automate new analytics processes, gathering insights to support operational, management, and decision activities.
Companies like Amadeus, Sabre, and Travelport implement AI solutions to support e-commerce and e-marketing, to forecast demand, identify tourist profiles, detect fraud, monitor internal systems, automate agency tasks, and so on (Sorrells 2018). AI techniques are used to predict ticket costs or find the best alternative route on a disrupted journey (e.g., extracting information for travellers from rail operators’ tweets). By sharing data, travel operators and other companies can offer new services, also available to mobile apps. For example, travel companies could cooperate with weather forecast organizations to handle congestion and bad weather conditions and use AI tools to deliver real-time weather alerts. One of the main advantages of AI techniques in e-commerce is the ability to support personalisation of services, truly improving search results and interactions and, in turn, the tourist experience. It might be possible, for example, to select the flight most suitable for a specific traveller among the hundreds that are available between two destinations. GDSs could meet the challenge of providing relevant, targeted options, a more difficult task for travel companies than for big e-commerce companies (e.g., Amazon has an average of 20 transactions per customer per year, while in travel systems 80% of people travel once a year or less6). GDSs could
6 https://www.phocuswire.com/AI-series-part-1-GDSs
apply AI techniques to their transaction data7 in order to segment tourists using personas. By doing so, they could identify the type of trip (e.g., a business trip, a short trip for a weekend, etc.) and give recommendations accordingly, helping travel and tourism businesses with their conversions and engagement. A further step toward personalization could be taken by adding available information on the single user to the persona, as in the “Preference driven air shopping system.”8 The ultimate goal is to be able to advise the customer as a travel agent would do, providing truly personalized recommendations. Some of the most significant ways in which AI technologies are currently being deployed are related to the need to support a variety of digital services (e.g., services for hotels and other businesses in the tourism industry, intended to assist customers online) based on conversation-like interactions in NL. This explains the success of what is, probably, the best-known general-purpose AI application: the chatbot (or chatterbot, or simply bot), created to simulate conversations supporting man-machine interactions. According to a recent survey, 40% of Millennials use chatbot technologies on a daily basis. Chatbots come in different forms: textual chatbots, usually working in instant messaging services or on social media platforms; voice chatbots, such as those implemented in virtual assistants (e.g., Alexa, Cortana, Google Assistant, or Siri); and robot chatbots, which are much more complex systems supporting face-to-face interactions. All these kinds of chatbots include some form of linguistic techniques and tools; not all of them, however, actually use full NLP systems. In their simplest form, chatbots work as frequently asked questions systems, which make use of pattern recognition functionalities to compare new questions to those stored in a knowledge base: in other words, there is no real language understanding here.
This may be the reason why many consumers are not satisfied with chatbot-based services. According to a recent survey, 71% of consumers said that the chatbot they interacted with could neither answer their questions nor help them.9 Such limitations also affect popular chatbots, like Amazon Alexa. Faced with a request from a teenager about what to do with annoying parents, Alexa replied “murder them,” because it had found a match to the input question on a “for-laughs” site. Though it deals with speech input (i.e., NL) in millions of homes all the time, so that the availability of data is not a problem, the system has no model of what it is doing or of what the requests and answers mean. The lesson is that chatbots have to become more intelligent, a goal that can be achieved only if the state of the art of NL understanding improves. In practice, many hotels and OTAs employ chatbots to filter questions through existing AI tools first, before redirecting them to human agents. The result is an improved process, saving time and human resources.
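The FAQ-style chatbot described above, together with the practice of redirecting unanswered questions to human agents, can be sketched in a few lines. This is only an illustration under stated assumptions: the knowledge base, the `HANDOFF` marker, and the similarity cutoff are hypothetical, and surface string similarity from Python's standard `difflib` module stands in for whatever pattern recognition a production system would use.

```python
import difflib

# Hypothetical knowledge base: stored questions mapped to canned answers.
FAQ = {
    "What time is check-in?": "Check-in starts at 3 pm.",
    "Is breakfast included?": "Breakfast is included in all room rates.",
    "Do you have an airport shuttle?": "A shuttle runs to the airport every hour.",
}

HANDOFF = "HUMAN_AGENT"  # hypothetical marker: route the question to a person


def normalize(text: str) -> str:
    """Lowercase and drop punctuation; matching works on surface form only."""
    kept = (ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join("".join(kept).split())


# Index the stored questions by their normalized form.
INDEX = {normalize(question): reply for question, reply in FAQ.items()}


def answer(question: str, cutoff: float = 0.6) -> str:
    """Return the reply of the most similar stored question, or hand off."""
    matches = difflib.get_close_matches(
        normalize(question), list(INDEX), n=1, cutoff=cutoff
    )
    return INDEX[matches[0]] if matches else HANDOFF


if __name__ == "__main__":
    print(answer("What time is check-in?"))
    print(answer("zzzz"))
```

The sketch has no model of what a question means: it only measures how similar two strings are, which is why a system of this kind can return a confident but inappropriate match, and why a fallback to a human agent is useful.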
7 Sabre has about 150 to 200 TB of data going in each day from shopping and booking transactions, https://www.phocuswire.com/AI-series-part-1-GDSs
8 A Sabre project; https://www.sabre.com/insights/innovation-hub/prototypes/preference-drivenair-shopping
9 http://go.3cinteractive.com/l/13622/2017-07-13/38bb6b
The performance of the speech recognition functionalities used by voice-based systems, like virtual assistants, depends on the language and the speaker’s accent, but the state of the art is very good here (Ren et al. 2019). Recent advances in emotion recognition will make it possible to support even more personalized interactions, adding empathy. For example, a Japanese rail line is using IBM Watson to analyze the tone, personality, and emotion of customer inquiries regarding fares, schedules, and seat availability.10 Aiming to increase engagement, companies are investing in personal assistant projects to specialize smart home devices for hotel rooms and other hospitality settings. “Alexa for Hospitality,” for example, is integrating assistants with Internet of Things (IoT) solutions, in order to support voice instructions in a number of functionalities. Similarly, “Actions on Google” makes it possible to program actions that can run on Google Assistant. Interactions and sales in face-to-face customer services can be automated thanks to robot concierges, which may be employed in hotels but also at conferences, airports, shopping malls, and the like. In this way, companies could cut queues at information or reception desks and improve overall efficiency. A robot relies on speech recognition and NLP to talk with customers (this is the case of Connie, deployed by Hilton11) and on its robotics functionalities to help customers in more physical activities (e.g., in carrying their luggage). Another kind of artificial concierge is Claire, realized as a holographic life-size character you can walk up to. It has been trained to answer questions about weather, flights, hotels, rental cars, restaurants, and also general questions, exploiting Wikipedia as a knowledge base. Claire implements state-of-the-art 3D graphics and animation; in terms of AI technologies, it uses speech recognition and speech-to-text, voice synthesis (text-to-speech), dialog, and an NL classifier.
An example of an unsuccessful project is provided by Churi, a doll-shaped assistant which was available in each room and at the reception of the Japanese Henn-na Hotels starting in 2015. These robots were fired because they could not answer questions that state-of-the-art virtual assistants could answer (Gale and Mochizuki 2019). Chatbots are the front-end of administrative and customer service processes, but AI applications provide other benefits. A combination of AI analytics tools, linguistic technologies, and big data is also used in targeted advertising (or ads), mainly by social networking platforms, which store and analyze all the interactions. As a matter of fact, targeted ads are an important source of income and a structural component in the business model of many online companies. Facebook, for example, makes use of DeepText12 in order to extract information from users’ conversations and to automatically select ads and identify targets that marketers might consider.
10 https://www.ibm.com/blogs/client-voices/ai-personalizes-japan-airlines-travel-experience
11 https://www.ibm.com/blogs/watson/2016/03/watson-connie
12 DeepText
Another area where NLP technologies are applied is the analysis of web sentiment (positive or negative opinions toward a company or organization – e.g., a tourism destination, or a hotel – or its products and services) and web reputation (analysis of the web presence of a company, including topic analysis and the related sentiment) (Mich 2012). AI “listening” and analysis can be deployed when dealing with user-generated content (UGC), emails, phone calls, surveys, and reviews across a variety of input devices and platforms, from mobile apps to websites. All the available input can be used, for example, by a hotel to run CRM, as well as to support decision-making and strategic management. For DMOs, this could be a way to create highly targeted ads with reduced effort. Less known, but very important for companies and organizations, is the application of AI tools to automated asset management. Used with cloud computing, for example, by DMOs, AI-based solutions are changing the process of storing, sorting, retrieving, annotating, modifying, and sharing their assets. Besides, AI techniques for annotation support the tagging of multimedia documents, images, and videos. AI-based systems are also helpful for maintenance, by enabling, for instance, more efficient management of travel operators’ maintenance activities, with a positive impact on travellers’ experience and safety, as well as a reduction of maintenance costs and time. For this purpose, a Korean airline is using Watson to support root cause analysis and part failure prediction. The system has almost halved the maintenance lead times, reducing delays and cancellations and increasing customer satisfaction. In another project, Watson is deployed to improve air cargo compliance procedures with respect to IATA and other regulations, automating the detection of irregularities and their resolution.
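To make the web sentiment analysis mentioned at the start of this passage more concrete, the following is a minimal lexicon-based sketch. It is not the method of any system cited in this chapter: the two word lists are hypothetical and tiny, and counting opinion words is only the simplest possible baseline.

```python
import re

# Hypothetical opinion lexicon; real systems rely on far larger resources.
POSITIVE = {"great", "clean", "friendly", "excellent", "comfortable"}
NEGATIVE = {"dirty", "rude", "noisy", "terrible", "slow"}


def sentiment_score(review: str) -> int:
    """Count positive words minus negative words in a review."""
    tokens = re.findall(r"[a-z']+", review.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


def label(review: str) -> str:
    """Map the score to a coarse sentiment label."""
    score = sentiment_score(review)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ("Great location and friendly staff",
                 "The room was dirty and the service slow"):
        print(label(text), "-", text)
```

A word-counting approach of this kind ignores negation and context (“not clean” still counts as positive), which illustrates why reliable sentiment and reputation monitoring calls for richer NLP techniques.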
Among the most recent and innovative applications of AI in the tourism sector are those based on Virtual Reality (VR) and Augmented Reality (AR). VR typically involves the use of a headset to immerse users in a digital environment they can move in and interact with. AR, instead, does not replace the real-world environment, but augments it by overlaying digital components. AI algorithms and big data are applied in both VR and AR to address rendering and optimization problems. Starting from a few input pictures, patterns can be generated which contribute to the development of new environments at reduced costs (Barfield and Blitz 2018). VR applications are created to support CX, for instance, in visiting a museum or a destination, but also to explore rooms and facilities in a hotel as a part of the booking process. AR is a digital technology which changes a person’s perception of the physical surroundings by using a suitable device. A popular AR application is Pokémon Go. Beyond the gaming world, however, AR can be used in a variety of tourism experiences. For example, pointing their phone at a restaurant, users could “see” reviews, menus, opening times, or its history. Other tourist activities could be enhanced by providing on-the-go information, sending push notifications (like special offers or discount vouchers), and enabling functions when entering a specific location.
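Enabling a function or sending a notification when a tourist enters a specific location is often implemented as a geofence check, which can be sketched as follows. The great-circle distance uses the standard haversine formula; the points of interest, their coordinates, radii, and offer texts are hypothetical values chosen only for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical points of interest: (name, lat, lon, radius in metres, offer).
POIS = [
    ("Museum", 46.0700, 11.1200, 100.0, "10% off the guided tour"),
    ("Restaurant", 46.0750, 11.1300, 50.0, "Free dessert with lunch"),
]


def offers_for(position: tuple[float, float]) -> list[str]:
    """Return the offers of every point of interest whose geofence contains the user."""
    lat, lon = position
    return [
        f"{name}: {offer}"
        for name, poi_lat, poi_lon, radius, offer in POIS
        if haversine_m(lat, lon, poi_lat, poi_lon) <= radius
    ]


if __name__ == "__main__":
    print(offers_for((46.0700, 11.1200)))
```

In a real app the position would come from the device's location service and the selected offers would be handed to a push notification service; both are outside the scope of this sketch.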
Finally, some big AI projects are very important for tourism even if they were not created for tourism. Because of tourists’ need to communicate in a large variety of languages, one of the most beneficial among these projects is Google Translate, which is reportedly able to translate 345 source languages into 345 target ones.13 The core of the system turned out to be able to translate between language pairs it had not been trained on, as well as to turn speech into text in a different language without transcribing the input (Reynolds 2017).
Artificial Intelligence Adoption

AI provides innovative solutions to businesses of any size, in any sector and industry. Across all areas of business, humans and intelligent machines are working together and are changing the ways in which companies operate (Daugherty and Wilson 2018; Yao et al. 2018). On the one hand, the contribution of AI can hardly be overestimated, and the most-cited business benefits that corporate leaders see from AI projects include the following (Forbes Insights and Intel AI 2018):
• Increasing productivity (40%)
• Reducing operating costs (28%)
• Improving speed to market (21%)
• Transforming the business and operating model (20%)
• Improving bottom-line growth (19%)
• Improving customer engagement (18%)
On the other hand, the fragmentation of AI into a large number of subjects, the variety of technologies involved, and commercially oriented communication are all factors that contribute to complicating the management context. A prerequisite for a successful adoption of AI technologies is the definition of a plan which is consistent with the company’s information system strategy. First of all, planning and designing effective systems that exploit AI in a company require considering that:
• There are many AI areas, applications, and technologies, which need the involvement of experts, in particular domain and AI experts.
• AI innovation is fast. Today’s buzzwords may be quickly superseded by new models and technologies, posing a tough decision-making problem.
• AI solutions are at different maturity levels, so that a careful evaluation of risks is recommended.
• Different tools and technologies (chatbots, sensors, databases, robots, etc.) need to work in synchronization, within a systemic vision.
13 https://blog.google/products/translate
Furthermore, according to a problem-solving approach to information systems (Laudon and Laudon 2019), any AI project should include the following activities:
• An analysis of the challenges facing the company.
• The definition of management strategies to address them, including a re-design of the business processes if necessary; sometimes the company’s business model itself has to be reinvented.
• A definition of the internal organization structure, identifying external companies and roles to be involved to implement the AI system and keep it working.
• An investigation to identify the digital and AI methods and tools available to implement the AI-based system, address the business challenges, and support the management strategies.
• The realization of the AI system, integrating it with the technologies already in place in the company information system.
• Finally, the establishment of measures to evaluate the impact of the implementation of the AI solutions.
Therefore, a critical success factor for the application of AI in a company is the presence of experts in the specific areas of AI (see the first part of the chapter) which have been identified for the project. Another critical factor is the availability of data for the AI algorithms and models. Besides, thanks to AI tools, a company can add hybrid “human+machine” roles to support innovative business processes (Daugherty and Wilson 2018). Most of these issues are not new. What is new is the fact that the new AI-based systems can have a very big impact, on a large scale, and in a very short time (Marr and Ward 2019). As in any other project, there are also risks and obstacles to AI adoption, for which many companies seem to be ill equipped. In particular, few organizations have adopted core practices (including those described above) to exploit the potential of AI in depth.
Among the barriers to the adoption of AI solutions, the most frequently cited are the lack of a clear strategy for AI, the lack of talent with the appropriate skills, and limitations in technologies and data (McKinsey 2018). Another frequently cited problem is that some AI projects are not really AI. For one thing, not all the available tools and technologies promoted as smart or intelligent are actually implementing AI methods or tools. This is a critical issue if the problems to be solved do require intelligence. Besides, answering the question of whether a given project is “real” AI – a central and old question – is not easy, both because it is connected with the definition of AI and because of the so-called “AI effect” (McCorduck 2004). The AI effect is a phenomenon whereby, as soon as an AI system solves a problem, its solution is no longer considered to require “intelligence,” and systems that were deemed intelligent lose the AI label. It happens again and again and is one of the paradoxes of AI. The solution would be to have a test for AI. The traditional way to check if a machine can “think,” in its form called “the imitation game,” is to apply the Turing
test (Turing 1950). This test is passed by a system if a user cannot tell whether answers to questions in NL are given by a human being or by the system. The validity of the Turing test has been debated since it was first proposed, and the question is still open, confirming the difficulties in defining a test for AI. To illustrate the “AI effect,” consider that a human conversation can be “simulated” using a small number of rules and patterns. This was the case of Eliza (Weizenbaum 1966), a progenitor of chatbots and one of the first AI projects. The system, designed to mimic a Rogerian psychotherapist and implemented in 1964, gained such popularity that people asked to use it even knowing that it was only a computer program. According to the Turing test, Eliza should be considered intelligent, but it does not “understand” the meaning of the exchanged sentences, which is still the most important limitation of AI in the NLP area. In fact, Eliza can easily be fooled, also because of its limited vocabulary and domain knowledge. A new test for AI would be very beneficial. Recently, the Allen Institute for AI designed a test based on a set of multiple-choice questions to check if AI machines could pass an eighth-grade science exam, but they could not. Even state-of-the-art AI systems for NL are unable to go beyond surface text (there are many ways to say the same thing), understand the meaning underlying a question, and, more importantly, use reasoning to find the appropriate answer. For example, answering a question like “How many chromosomes does the human body cell contain?”, by choosing among the alternatives (A) 23, (B) 32, (C) 46, (D) 64, requires fact lookup and can be done using a search engine or an online knowledge base like Wikipedia.
By contrast, answering a question like “City administrators can encourage energy conservation by (A) lowering parking fees, (B) building larger parking lots, (C) decreasing the cost of gasoline, (D) lowering the cost of bus and subway fares” requires knowledge, but also reasoning, an ability in which AI is still far from human-level performance (Schoenick et al. 2017).
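The earlier remark that a conversation can be “simulated” using a small number of rules and patterns can be illustrated with a minimal Eliza-style sketch. The rules below are hypothetical and are not taken from Weizenbaum's original script; they only show the mechanism of matching a pattern and reflecting part of the input back to the user.

```python
import re

# Hypothetical rule set (pattern, response template); not the original Eliza script.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

DEFAULT = "Please go on."  # used when no rule matches


def respond(utterance: str) -> str:
    """Apply the first matching rule, echoing the captured text back to the user."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT


if __name__ == "__main__":
    for line in ("I need a holiday.", "I am tired", "The weather is nice"):
        print(">", line)
        print(respond(line))
```

Nothing in these rules represents what a holiday or tiredness is; the program only rearranges the user's own words, which is why such a system is easily fooled as soon as the input falls outside its patterns.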
Future Trends

Artificial intelligence has made substantial progress in recent years. Research and applications are steadily growing. Companies and institutions lament a shortage of skills. It is difficult to find signs of decline or slowing down.14 The complexity of technological and application scenarios, however, urges governments, industries, and citizens – all involved in an unprecedented evolution – to be extremely cautious and determined. Two recommendations can be made: keeping up to date with developments is necessary, and so is training experts in the diverse AI fields in order to rely on authoritative sources of information.
14 There are authors saying that there are signs of another AI winter; however, they focus on specific technologies or methods (e.g., the hype for deep learning) or on specific accidents affecting, e.g., the autonomous driving sector. All of these are challenges to be addressed; they will possibly cause harm, but not stop AI progress.
A comprehensive description of future trends in AI is not the aim of this section, nor is it an achievable goal. Trends in AI are surveyed by large companies like Gartner15 or Forbes16 on a regular basis, and also by research centers, like the Cambridge Innovation Institute, publishing the “AI trends. The business and technology of enterprise AI” magazine17; other magazines are published online on the Medium platform: OneZero,18 Becoming Human,19 and many more. Large IT (information technology) and AI companies announce progress on their AI-related projects as posts and white papers but also as scientific papers; among them, Intel AI, MIT-IBM Watson AI Lab,20 the Machine Learning Department (MLD) at Carnegie Mellon University,21 and the Stanford AI Laboratory.22 Other authoritative information sources are the AI societies, which are present in many countries,23 organize conferences and courses, and publish specialized journals. Finally, there are many publications and websites dedicated to specific application areas, like AI in healthcare, finance, law, marketing, manufacturing, and so on. To name some of the most relevant future trends, AI methods and technologies related to the so-called narrow intelligence are probably going to be incrementally developed, improving existing tools and solutions. The success of many projects depends on the results of projects to design new hardware, for example, AI-powered chips to implement processors that speed up AI applications requiring large computation resources. This is the case of many ML algorithms, as they are often based on complex statistical models and big data analysis (for instance, in facial recognition or computer vision). More innovations are expected from the integration of AI with other IT technologies. An example is the adoption of AI for IoT or blockchain.
ML algorithms are going to be refined too, and experience in their application will allow the definition of guidelines and the identification of best practices, supporting companies that look for the most suitable solutions for their business strategies. In speech recognition, for example, data on the voice interactions can be used to improve virtual assistants, optimizing them by taking into account the way in which people use them. For some AI areas, disruptive innovations are possible only thanks to new theories and models. This, for example, is the case of NLP systems: existing ML approaches cannot really understand the meaning of an NL input; even if big improvements have been introduced, large amounts of data cannot solve the problem; on the other hand, simulating the way in which a child learns language is far from being achieved.
15 https://www.gartner.com
16 https://www.forbes.com
17 https://www.aitrends.com
18 https://onezero.medium.com
19 https://becominghuman.ai
20 https://www.research.ibm.com/artificial-intelligence
21 https://www.ml.cmu.edu
22 http://ai.stanford.edu
23 http://www.aiinternational.org
Artificial Intelligence Risks

It is no exaggeration to say that AI is going to change companies, industries, countries, and the whole world. But these changes involve risks concerning trust, security, privacy, safety, and other ethical issues. Though governments, institutions, companies, and research centers have started to invest in order to address such concerns, their efforts are hardly coping with the speed of the change. Being so pervasive and often invisible to the users, AI systems should be designed and implemented to comply with rights and laws, satisfy usability criteria, improve our lives, help to solve problems, and not add complexity. This is not always the case. More importantly, these objectives have to be explicitly pursued, and not be taken for granted. As regards trust, a common problem is that users have to be informed if their interaction on, e.g., a chat is not with a real person but with an AI tool, and vice versa. Another problem is that some companies exploit human intervention to support translation or search to overcome the limitations of the AI tools without disclosing it (Lance 2018). Untrustworthy systems may be dangerous if their output is taken for granted without knowing how they work (the black box problem), or if the input data are biased. The first step to address such risks is being able to explain why a given solution, suggestion, or behavior has been chosen, providing information on the data and knowledge used, and their processing (Parnas 2017), also to limit errors. This is the goal of explainable AI, or XAI (Anthes 2017; Monroe 2018; Pearl 2019; Weld and Bansal 2019). The problem is that many AI systems run programs based on algorithms whose output cannot usually be traced back to specific parts of the input. The new generation of ML algorithms, based on deep learning, is even more inscrutable, due to the complexity of the processing steps and the huge size of the data required and produced.
A full comprehension of NL would be necessary to address explainability issues. New results in this field could also be of help to systems which do not have NL as their normal input or output (e.g., a medical system which takes patient data and produces a structured output), but which would benefit from being able to store their knowledge and explain it in a comprehensible manner using NL (Garigliano and Mich 2019). Another problem is that of accountability, or responsibility, whose relevance is evident for social networking platforms (who is to blame for fake news?) or for self-driving cars (who is responsible for a driverless car collision? the algorithm designer? the lawyer who suggested the criteria for decision making during the trips? . . . ). Besides, datasets may be insufficient. The highest risk may be posed by data bias, or data reflecting the values of the people who design and realize the datasets. A number of cases warn us to be careful from this point of view (O’Neil 2016). AI companies should avoid inappropriate data practices, also because behaviors like these could reduce the acceptance of AI solutions.
One of the most discussed ethical issues is that of job losses due to AI. The most extreme vision is the so-called singularity theory, depicting a world in which AI will be superior to human intelligence (see, e.g., Hawking 2014). More realistically, there is a problem of job losses, whose scope and scale are difficult to foresee. Automation is replacing human jobs in a number of industries and in professions that were considered almost impossible to automate. On the other hand, AI requires more researchers and experts in many different fields. What may happen is that the loss of low- and medium-skilled jobs will not be fully compensated by new higher-skilled jobs. An alternative vision to that of “singularity” is that of “multiplicity,” in which AI is considered an amplifier of human intelligence, and people and machines collaborate to solve problems and innovate (Goldberg 2019). Final recommendations have to include the need to invest in educational and communicative initiatives, as AI is of utmost importance; it implies a cultural and social revolution – almost silent if compared with its impact – whose drivers are more and more AI systems. Even if the singularity scenarios sound distant and improbable, many of our daily activities, regardless of our awareness and attention, are already regulated by algorithms (Tegmark 2017).
Cross-References

Advanced Web Technologies and E-Tourism Web Applications
Recommender Systems in Tourism
Robotics in Tourism and Hospitality
Semantic Web Empowered E-Tourism
The Evolution of Online Booking Systems
References Algorithmia (2018) Introduction to emotion recognition, https://blog.algorithmia.com/ introduction-to-emotion-recognition Alur R (2015) Principles of cyber-physical systems. The MIT Press, Cambridge, MA Anthes G (2017) Artificial intelligence poised to ride a new wave. Commun ACM 60(7):19–21. https://doi.org/10.1145/3088342 Anthes G (2019) Lifelong learning in artificial neural networks. Commun ACM 62(6):13–15. https://doi.org/10.1145/3323685 Barfield W, Blitz MJ (eds) (2018) Research handbook on the law of virtual and augmented reality. Edward Elgar Publishing, Cheltenham, UK. https://doi.org/10.4337/9781786438591 Berners-Lee T, Hendler J, Lassila O (2001) The semantic web. Scientific American 284(5):34–43 Burkov A (2019) The hundred-page machine learning book. Kindle Direct Publishing, Québec Chen Z, Liu B (2018) Lifelong machine learning. Synthesis lectures on artificial intelligence and machine learning, 2nd edn. Morgan & Claypool. https://doi.org/10.2200/ S00832ED1V01Y201802AIM037 Corea F (2019) Tificial intelligence. Where AI can be used in business. Springer, Cham Daugherty PH, Wilson HJ (2018) Human + Machine: reimagining work in the age of AI. Harvard Business Review Press, Boston, MA
David LP (2017) Inside risks of Artificial intelligence. Commun ACM 60(10):17–31. https://doi. org/10.1145/3132724 Debjani D (2019) Redefining personalization with AI. https://www.analyticsinsight.net/debjanideb-redefining-personalization-with-artificial-intelligence Elsevier (2019) Artificial intelligence: how knowledge is created, transferred, and used. Trends in China, Europe, and the United States. Elsevier Ferrucci D, Brown E, Chu-Carroll J, Fan J, Gondek D, Kalyanpur AA, Lally A, Murdock JW, Nyberg E, Prager J, Schlaefer N, Welty C (2010) The AI behind Watson – The Technical Article. AI Mag. http://www.aaai.org/Magazine/Watson/watson.php Forbes Insights and Intel AI (2018) On your marks: business leaders prepare for arms race in Artif Intell. https://www.forbes.com/sites/insights-intelai/2018/07/17/on-your-marks-businessleaders-prepare-for-arms-race-in-artificial-intelligence/#47112da1946a Garigliano R, Mich L (2019) Looking inside the black box: core semantics towards accountability of artificial intelligence. To be published in LNCS Gale A, Mochizuki T (2019) Robot hotel loses love for robots. The Wall Street Journal (July 16) https://www.wsj.com/articles/robot-hotel-loses-love-for-robots-11547484628 Géron A (2019) Hands-on machine learning with Scikit-Learn and TensorFlow, 2nd edn. O’Reilly, Birmingham Goldberg K (2019) Multiplicity: how AI and robots can diversify, rather than replace, human thinking. https://docs.google.com/document/d/1UzPdp51v8K1uut7-OWj8j5i3Kii3CFhh2mxY wX1cTlQ/edit Gondek D (2011) How Watson “sees,” “hears,” and “speaks” to play Jeopardy!. IBM Research News Hawking S (2014) AI could end human race. BBC 2 Dec 2014. Hussain A (2019) AlgoFace: Transforming consumer experience at the convergence of computer vision, AI and AR. https://www.analyticsinsight.net/algoface-transforming-consumerexperience-convergence-computer-vision-ai-ar Kapur A, Kapur S, Maes P (2018) AlterEgo: a personalized wearable silent speech interface. 
In: 23rd International Conference on Intelligent User Interfaces (IUI 2018), pp 43–53 Knuth D (1997) The art of computer programming. 1: fundamental algorithms, 3rd edn. AddisonWesley Professional Lance N (2018) What Microsoft and Google are not telling you about their AI. Medium AI. https://medium.com/s/story/what-microsoft-and-google-are-not-telling-you-about-their-ai-b609 f5395a8e Laudon K, Laudon J (2019) Management information systems: managing the digital firm, 16th edn. Pearson, Harlow Lim M (2018) History of AI winters. Actuaries Digital. https://www.actuaries.digital/2018/09/05/ history-of-ai-winters O’Neil C (2016) Weapons of math destruction: how big data increases inequality and threatens democracy. New York Time Marr B, Ward D (2019) Artificial Intelligence in practice: how 50 successful companies used Artificial Intelligence to solve problems. John Wiley & Sons Inc, New York McCarthy J (2007) What is Artificial Intelligence? www-formal.stanford.edu/jmc/whatisai McCorduck P (2004) Machines who think, 2nd edn. A.K. Peters Ltd, Natick McKinsey (2018) Notes from the AI frontier. AI adoption advances, but foundational barriers remain. https://www.mckinsey.com/featured-insights/artificial-intelligence/ai-adoptionadvances-but-foundational-barriers-remain Mich L (2012) Requirements for a comprehensive and automated web reputation monitoring system: first iteration. In: IEEE International Conference on Software Science, Technology and Engineering (SWSTE), 11–19 Monroe D (2018) AI, explain yourself. Commun ACM 61(11):11–13 Morgan B (2018) The customer of the future. 10 guiding principles for winning tomorrow’s business, http://www.blakemichellemorgan.com Neapolitan RE (2019) Learning Bayesian Networks. Prentice Hall Series in Artificial Intelligence
Neapolitan RE, Jiang X (2018) Artificial Intelligence: With an introduction to Machine Learning (2nd ed). Chapman & Hall/CRC Artificial Intelligence and Robotics Series Newborn M (2003) Deep Blue. An artificial intelligence milestone. Springer, New York, NY Newell A, Simon HA (1972) Human problem solving. Prentice-Hall, Englewood Cliffs Nilsson NJ (2009) The quest for artificial intelligence. A history of ideas and achievements. Cambridge University Press, Cambridge, NY Parnas DL (2017) Inside risks of artificial intelligence. Commun ACM 60(10):17–31 Pearl J (2019) The seven tools of causal inference, with reflections on machine learning. Commun ACM 62(3):54–60. https://doi.org/10.1145/3241036 Ren Y, Tan X, Qin T, Zhao S, Zhao Z, Liu T (2019) Almost unsupervised text to speech and automatic speech recognition. In: Proceedings of the 36th International Conference on Machine Learning, in PMLR, vol 97, pp 5410–5419. http://proceedings.mlr.press/v97/ren19a.html Reynolds M (2017) Google uses neural networks to translate without transcribing. Daily News, https://www.newscientist.com/article/2126738-google-uses-neural-networks-totranslate-without-transcribing Rosenblatt F (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychol Rev 65(6): 386–408. https://doi.org/10.1037/h0042519 Russell SJ, Norvig P (2020) Artificial intelligence: a modern approach, 4th edn. Pearson, Boston Samuel A (1959). Some studies in machine learning using the game of checkers. IBM J Res Dev 3(3):210–229. https://doi.org/10.1147/rd.33.0210 Schoenick C, Clark P, Tafjord O, Turney P, Etzioni O (2017) Moving beyond the Turing test with the Allen AI Science Challenge. Commun ACM 60(9):60–64. https://doi.org/10.1145/3122814 Searle J (2011) Watson doesn’t know it won on ‘Jeopardy!’. The Wall Street Journal Searle JR (1980) Minds, brains, and programs. 
Behav Brain Sci 3(3):417–424 Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, Hubert T, Baker L, Lai M, Bolton A, Chen Y, Lillicrap T, Hui F, Sifre L, van den Driessche G, Graepel T, Hassabis D (2017) Mastering the game of Go without human knowledge. Nature 550:354–359 Sorrells M (2018) A review of 2018, a preview for what’s next: artificial intelligence in travel. https://www.phocuswire.com/2018-review-2019-preview-AI Steinwart I, Christmann A (2008) Support vector machines. Springer, New York, NY Strickland E (2019) How IBM Watson overpromised and underdelivered on AI health care. Spectrum IEEE. https://spectrum.ieee.org/biomedical/diagnostics/how-ibm-watson-overpromisedand-underdelivered-on-ai-health-care Tegmark M (2017) Life 3.0: Being human in the age of artificial intelligence. Penguin Books Ltd, London, UK Theobald O (2018) Machine learning for absolute beginners: a plain English introduction, 2nd edn. Scatteredplot Press, Palo Alto, CA Turing A (1950) Computing machinery and intelligence. Mind 59(236):433–60 Venturi D (2017) Every single Machine Learning course on the internet, ranked by your reviews. https://www.freecodecamp.org/news/every-single-machine-learning-course-onthe-internet-ranked-by-your-reviews-3c4a7b8026c0 Weizenbaum J (1966) ELIZA—a computer program for the study of natural language communication between man and machine. Commun ACM 9(1):36–45. https://doi.org/10.1145/365153. 365168 Weld DS, Bansal G (2019) The challenge of crafting intelligible intelligence. Commun ACM 62(6):70–79. https://doi.org/10.1145/3282486 Yamada M, Chen J, Chang Y (2018) Transfer learning: algorithms and applications. Morgan Kaufmann Yao M, Jia M, Zhou A (2018) Applied artificial intelligence: a handbook for business leaders. TOPBOTS, Middletown, DE
Recommender Systems in Tourism
19
Francesco Ricci
Contents
Introduction
Recommender System Components
  Preference Elicitation
  Scoring and Ranking Recommendations
  Recommendation Refinement
Key Recommendation Dimensions for Tourism RSs
  Context-Awareness
  Recommendations for Groups
  Sequential Recommendations
Open Issues and Conclusions
Cross-References
References
Abstract
Recommender systems (RSs) are information search and filtering tools that provide suggestions for items to be of use to a user. They are now common in many Internet applications (Google News, Amazon, TripAdvisor), helping users to make better choices while searching for news, books, or vacations. RSs exploit data mining and information retrieval techniques to predict to what extent an item fits the user’s needs and wants. RSs interact with the user to fine-tune these suggestions while presenting a selection of the items with the largest predicted fit scores. RSs have been used in tourism applications for suggesting points of interest to visit, holiday properties, and flights, or even for generating complete holiday plans, that is, bundling different types of more elementary items (e.g., accommodations and events) in one recommendation bundle. In this chapter, we will first introduce basic recommender systems principles and techniques. We will discuss the general functioning of a recommender system and how various techniques are used to implement the model components. We will then present key dimensions for recommender systems, especially considering the travel and tourism application scenario. We will close this chapter by discussing some limitations and open challenges for recommender systems research.
F. Ricci, Faculty of Computer Science, Free University of Bozen-Bolzano, Bozen-Bolzano, Italy. © Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_26
Keywords Recommender systems · User preferences · Information overload · Context-awareness · Group recommendations · Decision-making
Introduction
The explosive growth and variety of information available on the Web and the rapid introduction of new e-business and social services (product purchasing, product comparison, auctions, forums, social networking, multimedia consumption) have created an overabundance of options (services and products) and substantially increased the difficulty and the cost of making a choice. In this scenario, while having some choice is good, too much choice can be confusing (choice overload). Moreover, if dozens of different types of jam on a supermarket shelf are already likely to confuse a buyer, as illustrated in Schwartz (2004), millions of songs or movies are simply impossible to scan, let alone compare. So, if the user’s goal is to discover and play some songs or watch a movie that she/he will like, then some form of system support is clearly necessary. Such a scenario motivated the introduction of recommender systems (RSs) (Ricci et al. 2015a; Konstan and Riedl 2012).
RSs are defined as information search and filtering tools that provide suggestions for items to be of use to a user. In that sense, many applications may be classified as RSs. In this chapter we adopt a more restrictive view and assume that an RS must always implement, among others, a core computation task: predicting the potential interest of the user in the items of the available catalogue. This prediction task is accomplished with artificial intelligence algorithms. RSs have now become ubiquitous in the large majority of Internet applications, helping users to make better choices while searching for or exploring a collection of news, music, hotels, or videos. “Item” is the general term used to denote what the system recommends to its users, and a specific RS normally focuses on one type of item (e.g., either hotels or movies). Accordingly, its core algorithmic components and its graphical user interface are tailored to provide useful and effective suggestions for that specific type of item. Recommender systems play an important role in highly rated Internet sites, such as Amazon, YouTube, Netflix, and TripAdvisor. Social networks, such as LinkedIn and Facebook, have introduced recommender systems to suggest posts, groups to join, or people to connect with.
In their simplest form, personalized recommendations are offered as user-tailored selections of items. In performing this personalization, the system tries to predict what the most suitable items are, based on the user’s characteristics, preferences, and contextual situation. In order to complete that computational task, an RS must elicit from users such characteristics (e.g., gender or personality), preferences (e.g., items that the user likes), and contextual conditions (e.g., when or with whom the user is supposed to consume the item). This information is acquired both from the full history of previous system interactions with a population of users and from information entered by users at the time the recommendation is delivered. Moreover, such information either can be explicitly entered by users, e.g., in the form of ratings for items, or can be inferred by interpreting users’ actions. For instance, the navigation to a particular Web page can be interpreted as an implicit sign of preference for the items shown on that page, or the current time could be hypothesized as the time when the searched item will also be consumed.
The study of recommender systems emerged as an independent research area in the mid-1990s (Goldberg et al. 1992; Resnick et al. 1994), and it is still growing. Research works on RSs are published in major conferences on machine learning (ICML, KDD, NIPS), information retrieval and the Web (SIGIR, WSDM, CIKM, WWW), artificial intelligence (IJCAI, AAAI), and intelligent user interfaces and personalization (IUI, UMAP). A dedicated ACM conference on recommender systems was launched in 2007, and every year it attracts more submissions and attendees from both academia and industry.
In this chapter, we will first introduce basic recommender systems principles and techniques. We will then discuss the general functioning of a recommender system and how various techniques are used to implement the model components, along with their pros and cons. We will then present key dimensions for recommender systems, especially considering the travel and tourism application scenario. We will close this chapter by discussing some limitations and open challenges for recommender systems research.
Recommender System Components
In Adomavicius and Tuzhilin (2005) the authors introduce a general computational model for recommender systems. There, the fundamental goal of an RS is formulated as the prediction of a real-valued function defined on the product space of users and items, $r^* : U \times I \to R$; $r^*(u, i)$ is the RS prediction of how a user u in the set U evaluates the “goodness” of an item i in the set I or, in other words, how the item i scores for user u. These predictions are based on a collection of sample observations, available to the RS, of scores of items in I given by users in U, i.e., data points $(v, j, r(v, j))$ in the set T of observed scores, where v is a generic user and j is a generic item. This set T is used for training the predictive model of the RS, i.e., the $r^*$ function. Then, having a set of predicted scores of a user u for a subset of the items in the catalogue I, the RS recommends to the user u some of the items i among those with the largest predicted scores $r^*(u, i)$ and further interacts with the user to refine these initial suggestions and identify an even smaller subset of items to recommend. Hence, in the next sections, we detail this RS model by considering three fundamental subcomponents: preference elicitation, item score prediction, and recommendation refinement.
In Adomavicius and Tuzhilin (2005) the authors call $r(u, i)$ the observed and $r^*(u, i)$ the predicted “utility” of the item for the user. We prefer here not to use the term “utility,” as normally no special assumptions are made on the characteristics of the score function, whereas a utility function must satisfy specific constraints (Steele and Stefánsson 2016). For that reason we simply call it either the predicted or the observed “score” of an item for a user, depending on whether the score is predicted by the RS or given by the user. Observed scores may be ratings that users gave to items, as in collaborative filtering RSs (see section “Scoring and Ranking Recommendations”). But scores may also identify observed or absent interactions between users and items, e.g., a user browsing an item description page. Both types of scores provide the system with information, more or less explicit and reliable, about the users’ preferences; this is the essential ingredient for constructing an RS.
We also note that in this chapter we use a generalization of the abovementioned two-dimensional model ($\mathit{Users} \times \mathit{Items}$), which was first introduced in Adomavicius et al. (2005) and further discussed in Adomavicius et al. (2011) and Ricci (2018). That is, in addition to the sets U and I of users and items, we assume that the user score for an item depends on the contextual situation, which we generically assume here to belong to a set C of possible contextual situations.
In other words, an item i may score differently for a user u depending on the conditions under which the user consumes or experiences the item. For instance, a contextual situation c in C may be the specific user’s mood (e.g., happy) or the occasion (e.g., a birthday party) when the item, e.g., a restaurant, was experienced, or even the particular group of people with whom the user experienced the restaurant. Hence, the same restaurant may obtain different scores from the user if visited for a birthday party vs. a business lunch. Or, to give another example, a music track may be listened to and appreciated when the user is in a melancholic mood and not when the user is thrilled. The training data T of observed scores of users for items in contexts then consists of quadruples $(u, i, c, r(u, i, c))$ belonging to $U \times I \times C \times R$, where u is the user who interacted with item i in the situation c, and $r(u, i, c)$ in R is the observed score: either a score explicitly formulated by the user and recorded by the RS, or the bare indication, e.g., with a 1/0 score, of whether or not the user interacted with the item. R is an evaluation scale containing the possible (ordered) alternative scores that the user can enter, or a quantitative measure of the “size” of the observed user interaction with the item (e.g., the number of times the user played a track in a day). Explicitly entered scores are commonly called ratings, and a popular rating scale is the five-star set $\{1, 2, \ldots, 5\}$. This scale is used, for instance, by Amazon.com and TripAdvisor. A simpler scoring scale contains only two values: positive (“like”, +1) vs. negative (“not like”, −1). This scale is used, for instance, by YouTube.com.
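To make the context-aware model more concrete, the following minimal sketch stores a training set T as quadruples (user, item, context, score) and derives a deliberately simple, non-personalized predictor from it. The data, the function names `fit_baseline` and `predict`, and the back-off rule (item-and-context average, then item average, then a default) are illustrative assumptions and not a method described in this chapter.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical training set T: quadruples (user, item, context, observed score).
T = [
    ("anna", "trattoria", "birthday", 5),
    ("anna", "trattoria", "business_lunch", 2),
    ("ben", "trattoria", "birthday", 4),
    ("ben", "sushi_bar", "business_lunch", 5),
]


def fit_baseline(training):
    """Average the observed scores per (item, context) pair and per item."""
    by_item_context = defaultdict(list)
    by_item = defaultdict(list)
    for _user, item, context, score in training:
        by_item_context[(item, context)].append(score)
        by_item[item].append(score)
    return (
        {key: mean(values) for key, values in by_item_context.items()},
        {key: mean(values) for key, values in by_item.items()},
    )


def predict(model, user, item, context, default=3.0):
    """Toy stand-in for r*(u, i, c); it ignores the user (assumption).

    Backs off from the (item, context) average to the item average,
    and finally to a default score for unseen items.
    """
    by_item_context, by_item = model
    if (item, context) in by_item_context:
        return by_item_context[(item, context)]
    return by_item.get(item, default)


model = fit_baseline(T)
birthday_score = predict(model, "carla", "trattoria", "birthday")
date_score = predict(model, "carla", "trattoria", "date_night")
```

Even this crude baseline shows why the context dimension matters: the same restaurant receives a different predicted score for a birthday than for a context in which it was never observed.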
Preference Elicitation
In order to generate recommendations, an RS should first collect a set of scores from the users in U, for some of the items in I, in some contextual situations in C, i.e., acquire what we have called above a training set T. In some cases, explicitly acquiring these training scores $r(u, i, c)$ from the users is not possible. In these cases, the predicted scores $r^*(u, i, c)$ are derived by the system from observing the interactions that the user had with the items, which are therefore called “implicit” preferences. Let us discuss these two approaches in the following.
Many recommender systems allow users to “explicitly” score (rate) items as the users encounter them while interacting with the system. More rarely, RSs explicitly request the user to evaluate a certain number of items (e.g., 20 books) before providing recommendations (for new books). This may happen at the sign-up stage, i.e., when the user registers with the system. In more sophisticated approaches, the system may identify the items to evaluate by implementing a precise preference elicitation strategy, i.e., by applying active learning techniques (Elahi et al. 2016). Active learning techniques in RSs identify which items the user should be asked to score. The goal is to acquire the most useful information for the predictive algorithm that the recommender will subsequently use. So, for instance, the system may identify the most popular items and request a user to rate them, with the objective of maximizing the probability that the user knows these items and can score them. Or, in a completely different solution, the system may ask the user to score the items that have so far received the most diverse evaluations, since the opinion of the user on these items can better reveal the user’s specific preferences.
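The two elicitation strategies just mentioned can be sketched in a few lines. The ratings below are invented, the function names are hypothetical, and using the population variance as the measure of “diverse evaluations” is an assumption made for illustration; active learning systems may use other criteria.

```python
from collections import defaultdict
from statistics import pvariance

# Hypothetical observed ratings: (user, item, rating on a 1-5 scale).
ratings = [
    ("u1", "louvre", 5), ("u2", "louvre", 5), ("u3", "louvre", 4),
    ("u1", "catacombs", 1), ("u2", "catacombs", 5),
    ("u3", "flea_market", 3),
]


def group_by_item(observed):
    """Collect the list of ratings received by each item."""
    by_item = defaultdict(list)
    for _user, item, rating in observed:
        by_item[item].append(rating)
    return by_item


def most_popular(observed, k):
    """Popularity strategy: ask about the items with the most ratings."""
    by_item = group_by_item(observed)
    return sorted(by_item, key=lambda i: (-len(by_item[i]), i))[:k]


def most_controversial(observed, k, min_ratings=2):
    """Diversity strategy: ask about the items whose ratings vary the most."""
    by_item = {
        item: values
        for item, values in group_by_item(observed).items()
        if len(values) >= min_ratings
    }
    return sorted(by_item, key=lambda i: (-pvariance(by_item[i]), i))[:k]


ask_popular = most_popular(ratings, 2)
ask_controversial = most_controversial(ratings, 1)
```

The popularity strategy favors items the new user is likely to know, while the variance strategy favors items on which an opinion is most informative; in practice the two goals may need to be balanced.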
A more subtle approach consists of exposing the user to some recommended items and then either explicitly asking the user whether this information was useful at that point in time (as in the Google search app) or unobtrusively observing the actions that the user performs on the suggested items. This brings us to the “implicit” approach to preference elicitation.
In “implicit” feedback approaches, the user does not directly score items by assigning them values from an evaluation scale. The underlying assumption is that the user may not want to spend time on this task. Hence, in these cases, the system predicts the target score from a range of observations of user actions (Gurbanov and Ricci 2017). In this scenario the system developer must first choose a target action for the RS prediction, for instance, whether or not the user makes a hotel reservation. Then, the presence/absence of this action must be coded with a numeric score. For instance, hotels that were previously booked by a user may receive target score 1, and all the non-booked hotels are assigned target score 0. In addition to the booking actions, the RS may observe other types of actions, e.g., the addition of a hotel to a wish list or simply the browsing of a page describing the hotel features. These actions can be considered as auxiliary scores and are used all together to predict the target scores, which tell us how likely it is that the user will book the hotels in the catalogue. Another example of the use of implicit feedback can be found in the music recommender system presented in Moling et al. (2012). Here the system measures the percentage of a suggested track that is actually listened to by the user in order to decide what type of track to recommend next (those with the largest predicted percentage of listening).
Other examples of less “implicit” preference elicitation are offered by systems that do not ask users to evaluate individual items, but allow them to compare item pairs (Kalloori et al. 2018) or to critique items (McGinty and Reilly 2011). In the first case, given a collection of comparisons of pairs of items performed by a set of users, specific ranking algorithms, which we will discuss later, are able to predict the target score of the items. In the second case, the system first presents some recommended items. These are primarily generated to encourage the user to specify the characteristics of the preferred items. In response, the user can inform the system that the presented items do not completely conform to his/her preferences and select one item that is almost good but still lacks a preferred feature (e.g., it should be cheaper). In these cases the system uses the user input, which is composed of the selected item and the “critique”, to update the predicted scores of the items for that user.
As we have briefly illustrated above, there is a large variability in the type of preference data that an RS may collect and use to predict scores of items. The type of preference data the system collects strongly influences the choice of the score prediction algorithm that the system can use. This is discussed in the next section.
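The coding of implicit feedback into target and auxiliary scores, described earlier in this section for the hotel booking example, could look roughly as follows. The interaction log, the action names, and the `encode` helper are hypothetical; the sketch only shows the data preparation step, not the model that would later predict the target score from the auxiliary ones.

```python
# Hypothetical interaction log: (user, hotel, action).
log = [
    ("anna", "alpine_lodge", "view"),
    ("anna", "alpine_lodge", "wishlist"),
    ("anna", "alpine_lodge", "book"),
    ("anna", "city_hostel", "view"),
    ("ben", "city_hostel", "wishlist"),
]
catalogue = ["alpine_lodge", "city_hostel", "lake_resort"]

TARGET_ACTION = "book"                    # the action the RS should predict
AUXILIARY_ACTIONS = ("view", "wishlist")  # weaker signals used as features


def encode(log, users, catalogue):
    """Return {(user, hotel): {"target": 0/1, "view": 0/1, "wishlist": 0/1}}.

    Every user-hotel pair starts at 0 (action absent); each logged action
    sets the corresponding score to 1 (action present).
    """
    table = {
        (user, hotel): {"target": 0, **{a: 0 for a in AUXILIARY_ACTIONS}}
        for user in users
        for hotel in catalogue
    }
    for user, hotel, action in log:
        key = "target" if action == TARGET_ACTION else action
        table[(user, hotel)][key] = 1
    return table


table = encode(log, ["anna", "ben"], catalogue)
```

Note that a target score of 0 here means only that no booking was observed, which is weaker evidence than an explicit negative rating.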
Scoring and Ranking Recommendations
The second component of an RS is in charge of predicting scores and ranking recommendations. This is achieved by using a training data set of either user-provided scores or observed users’ actions and by leveraging them to predict scores for items that the target user has not scored yet. In fact, this is the characterizing task of a recommender system: exploiting data related to users’ interactions with items, i.e., the system background knowledge, in order to generate predictions of the user scores for other items. The predicted score function $r^*$ is estimated with a specific recommendation technology. In past research, a basic distinction was made between techniques that use only the observed interactions between users and items, in the form of ratings or simple actions (e.g., clicks), and techniques that leverage, in addition to this form of preference data, other types of information about the users and the items. The first group of approaches has been generically called collaborative filtering, while the second goes by more diverse names, depending on the type of additional information that is used: for instance, content-based, case-based, community-based, tag-based, trust-based, and hybrid recommendation techniques, among many others.
Collaborative Filtering
The simplest and original implementation of this approach predicts that the active user, i.e., the user asking for recommendations, will give higher evaluations to the items that other users with similar tastes liked in the past (Ning et al. 2015). The similarity in taste of two users is calculated on the basis of the correlation of the users’ evaluations of the items. Collaborative filtering is probably the most popular and widely implemented technique. The latest approaches to collaborative filtering use latent factor models, such as matrix factorization (e.g., using singular value decomposition, SVD) or factorization machines. These methods map both items and users to the same latent factor space. Then the predicted evaluation of a user for an item is computed as the dot product of their representative vectors, which gives a kind of similarity or matching between the user and the item in this common representation space (Koren and Bell 2015).
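As a rough illustration of the two ideas above, the sketch below implements a small user-based predictor that weights neighbors by the Pearson correlation of their co-rated items, followed by the dot-product scoring used by latent factor models. The ratings, the mean-centered weighted average, and the example vectors are assumptions chosen for brevity; production systems typically add neighbor selection, shrinkage, and learned factors.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "anna": {"museum": 5, "beach": 1, "hike": 4},
    "ben": {"museum": 4, "beach": 2, "hike": 5, "spa": 2},
    "carla": {"museum": 1, "beach": 5, "hike": 2, "spa": 5},
}


def mean_rating(user):
    values = ratings[user].values()
    return sum(values) / len(values)


def pearson(u, v):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings[u][i] for i in common) / len(common)
    mean_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common)
    den = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common)) * sqrt(
        sum((ratings[v][i] - mean_v) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(user, item):
    """User mean plus the correlation-weighted deviations of the neighbors."""
    neighbors = [v for v in ratings if v != user and item in ratings[v]]
    weights = {v: pearson(user, v) for v in neighbors}
    norm = sum(abs(w) for w in weights.values())
    if norm == 0:
        return mean_rating(user)
    offset = sum(w * (ratings[v][item] - mean_rating(v)) for v, w in weights.items())
    return mean_rating(user) + offset / norm


def latent_score(user_vector, item_vector):
    """Latent factor prediction: dot product of user and item vectors."""
    return sum(p * q for p, q in zip(user_vector, item_vector))


spa_for_anna = predict("anna", "spa")
```

In this toy data Anna agrees with Ben and disagrees with Carla, so Ben's low and Carla's high rating of the spa both push Anna's predicted score below her own average.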
Content-Based
The system in this case implements for each user a machine learning “classifier” that learns to score (classify) higher the items that are similar to the ones that the user scored higher in the past. The similarity of items, or more generally the item classification rule, is calculated by relying on the features associated with the compared items, and this is why this method is called “content-based.” For example, if a user has systematically rated movies of the “action” genre positively, then the system can predict that other movies from this genre should have a high value for that user (de Gemmis et al. 2015). Recent research has moved away from simple feature-based descriptions of the content and uses “semantic” approaches that also leverage external knowledge sources, such as ontologies and encyclopedic knowledge.
Knowledge-Based
Knowledge-based techniques predict items’ scores based on specific domain knowledge about how certain item characteristics meet the user’s needs and preferences and ultimately how the item is useful for the user. Notable knowledge-based recommender systems are case-based (Bridge et al. 2006). In these systems a similarity function estimates how well the user’s needs (problem description) match the recommendations (solutions to the problem); the more similar the two are, the higher the predicted score for the considered item. Here the similarity score can be directly interpreted as the predicted score of the item for the user. Another group of knowledge-based systems uses constraints to represent user preferences and to find relevant items (Felfernig et al. 2015). These techniques have been applied especially in more complex application domains, such as financial services and travel and tourism (Ricci and Werthner 2002), where a good recommendation for a user may be defined as a special configuration of a generic item.
For instance, a travel package recommendation may be appropriate for a user if it satisfies a number of user-specific configuration constraints, e.g., it falls in a particular price range and the starting date is a Monday in August.
Community- and Trust-Based
In this type of technique, item score predictions are based on the preferences and opinions of the user’s friends. Evidence suggests that people tend to rely more on recommendations from their friends than on recommendations from similar but anonymous individuals. This observation, combined with the growing popularity of open social networks, is generating a rising interest in community-based systems or social recommender systems (Guy 2015). This type of RS technique acquires and exploits information about the social relations of the users and the preferences of the user’s friends. The item evaluation predictions are based on ratings that were provided by the user’s friends. Similar techniques use trust relationships between users, which are normally deduced from their social relationships (Victor et al. 2011). These techniques are the first examples of more general recommendation algorithms that use the users’ and items’ network structure to generate recommendations (Schall 2015).
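A minimal sketch of a friend-based prediction is given below. The social graph, the trust weights, and the use of a trust-weighted average are illustrative assumptions; they are one simple way to realize the idea above, not the specific algorithms of the cited works.

```python
# Hypothetical social graph, trust weights, and ratings.
friends = {"anna": {"ben", "carla"}, "ben": {"anna"}, "carla": {"anna"}}
trust = {("anna", "ben"): 0.9, ("anna", "carla"): 0.3}
ratings = {
    "ben": {"rooftop_bar": 5},
    "carla": {"rooftop_bar": 1, "jazz_club": 4},
}


def predict_from_friends(user, item, default=None):
    """Trust-weighted average of the ratings given to the item by friends.

    Returns `default` when no trusted friend has rated the item.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for friend in sorted(friends.get(user, ())):
        rating = ratings.get(friend, {}).get(item)
        if rating is None:
            continue
        weight = trust.get((user, friend), 0.0)
        weighted_sum += weight * rating
        total_weight += weight
    return weighted_sum / total_weight if total_weight else default


bar_for_anna = predict_from_friends("anna", "rooftop_bar")
```

Because Anna trusts Ben more than Carla, Ben's high rating dominates the prediction for the rooftop bar; with equal trust the result would be the plain average of the friends' ratings.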
Hybrid Recommender Systems
Hybrid score prediction techniques are based on the combination of the abovementioned techniques. A hybrid system combining techniques A and B tries to use the advantages of A to fix the disadvantages of B. For instance, collaborative filtering methods suffer from the new-item problem, i.e., they cannot generate evaluation predictions for items that have no ratings. This does not limit content-based approaches, since the prediction for new items is based on their descriptions (features), which are typically easily available. Given two (or more) basic RS techniques, several ways have been proposed for combining them to create a new hybrid system (see Burke et al. (2007) for the precise descriptions). Nowadays, we can state that all the commonly used RS techniques are hybrid, as a single source of information on the users and the items does not suffice and there are plenty of additional information sources that can be leveraged to improve the performance of a non-hybrid system. For instance, in a hotel RS, one can improve the performance of a system based on ratings by also considering the textual content of the reviews (Hu et al. 2015).
Ranking
As we will discuss later, the predicted scores for items are normally used only to select the top-scored items. Hence, the actual computational problem that an RS is facing is normally just that of ranking items that the users have not scored yet. Therefore, ranking techniques have been widely used in RSs, and many authors claim that they should be preferred to techniques that predict a score only in order to use it for ranking. For instance, a seminal work in this area introduced Bayesian personalized ranking (Rendle et al. 2012). This work is also interesting because it uses only implicit feedback and assumes that items that have been clicked by a user are preferred to items not yet considered and clicked.
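The pairwise assumption behind this kind of ranking approach can be illustrated as follows. The sketch builds (user, clicked item, non-clicked item) triples from a hypothetical click log and evaluates a pairwise logistic loss for a given scoring function. It shows only the data construction and the unregularized loss term; the optimization procedure and regularization of the cited work are omitted, and all names are assumptions.

```python
from math import exp, log

# Hypothetical click log: user -> set of clicked items.
clicks = {"anna": {"hotel_a", "hotel_c"}, "ben": {"hotel_b"}}
catalogue = ["hotel_a", "hotel_b", "hotel_c"]


def preference_triples(clicks, catalogue):
    """All (user, clicked item, non-clicked item) triples.

    Each triple encodes the assumption that the user prefers the clicked
    item to the non-clicked one.
    """
    return [
        (user, clicked, other)
        for user in sorted(clicks)
        for clicked in sorted(clicks[user])
        for other in catalogue
        if other not in clicks[user]
    ]


def pairwise_loss(score, triples):
    """Mean of -ln(sigmoid(score(u, i) - score(u, j))); lower is better."""

    def neg_log_sigmoid(x):
        return log(1.0 + exp(-x))

    total = sum(neg_log_sigmoid(score(u, i) - score(u, j)) for u, i, j in triples)
    return total / len(triples)


triples = preference_triples(clicks, catalogue)
uninformed = pairwise_loss(lambda u, i: 0.0, triples)
informed = pairwise_loss(lambda u, i: 1.0 if i in clicks[u] else 0.0, triples)
```

A scoring function that ranks clicked items above non-clicked ones obtains a lower loss than one that scores every item identically, which is what a ranking-oriented learner exploits; absolute score values are irrelevant as long as the order is right.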
Recommendation Refinement
After the RS has generated score predictions for items or predicted a personalized ranking, it can finally identify a small set of recommendations and present them to the target user. The classical and common approach for accomplishing this task is to
recommend to a user, in a contextual situation, the N items that have the largest predicted score in that situation. N is normally a small value, such as 5 or 10, sometimes even just 1, as in a system that suggests the next track or movie to watch. Reducing the number of recommendations is essential to address the main goal of an RS, namely, to filter irrelevant items and simplify the user’s decision-making process. However, there is a growing understanding that this apparently obvious design choice, which is based on the principle of maximizing the expected utility of the decision-maker, may not be the most appropriate one in many cases. For instance, the top N items may all be very similar; hence, it would be more useful to include in the recommendation list items that may have a lower predicted score but that, taken together, provide a more useful combination of information to the user. Moreover, the items with the top predicted scores are often the popular ones, i.e., well-known items that are appreciated by many users (e.g., a popular attraction in a city). One may wonder whether these suggestions are really useful, as it is very likely that the user already knows them, and they would make the usefulness and the “intelligence” of the system questionable. The essential point that we raise here is that, while the item score prediction function estimates to what extent the user may like an item, the user decides what item to select by browsing the recommendation list. Hence, items presented in the recommendation list must fulfill two, possibly conflicting, criteria: be interesting to the user and help the user to make a good decision. The items that can better help the user to make a decision may not be just the top five ones.
For instance, if a travel recommender system predicts that the user will like beach resorts more than mountain resorts, it may still be helpful to show one mountain resort in a list of five recommendations; it may help to recover from a possible error of the prediction algorithm, letting the user choose the mountain resort, or, if the user really prefers beach resorts, make the user more aware of this preference while browsing the alternative options. It has also been observed that the items presented in a recommendation list influence the user’s estimation of the perceived utility of the items; hence, in order to better predict what the user will likely choose, more complex choice models should be used (Osogami and Otsuka 2014). Another related issue pertains to the information that is provided together with the recommendations. In practice, this has been dealt with by including explanations for the recommendations (Tintarev and Masthoff 2015). Explanations may be directed to enhance the transparency or the scrutability of the system, i.e., giving the user hints about the system’s internal behavior. Or they may increase the trust, the persuasiveness, and the subjective satisfaction of the user. Or, ultimately, explanations can improve the efficiency and effectiveness of the decision-making process supported by the recommender. Moreover, it has also recently been shown that the explanations should be personalized, as different decision-makers may look for different types of information and justifications (Coba et al. 2019). Hence, a critical component of an RS is the graphical user interface and the full interaction design. In fact, with a proper design, more than one recommendation technique may be effectively used simultaneously, and the system can become conversational
(Mahmood et al. 2009), i.e., information requests and responses can alternate between the user and the system according to a structured interaction process.
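The difference between the classical top-N selection and a list that trades some predicted score for variety, as in the beach and mountain resort example of this section, can be sketched as follows. The greedy re-ranking used here is maximal marginal relevance, a generic diversification heuristic chosen for illustration; the chapter does not prescribe it. The item names, scores, similarity function, and trade-off value are hypothetical.

```python
def top_n(scores, n):
    """Classical selection: the n items with the largest predicted score."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def diversified_top_n(scores, similarity, n, trade_off=0.7):
    """Greedy re-ranking that penalizes items similar to those already chosen.

    trade_off = 1.0 reproduces the score-only ranking; lower values favor
    diversity. The default is an arbitrary illustrative value.
    """
    selected = []
    candidates = sorted(scores)  # sorted for deterministic tie-breaking
    while candidates and len(selected) < n:
        def marginal_relevance(item):
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return trade_off * scores[item] - (1 - trade_off) * redundancy
        best = max(candidates, key=marginal_relevance)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical predicted scores: the user seems to prefer beach resorts.
SCORES = {"beach_a": 0.90, "beach_b": 0.85, "beach_c": 0.80, "mountain_a": 0.60}


def same_type(x, y):
    """Toy similarity: 1 if two resorts are of the same type, else 0."""
    return 1.0 if x.split("_")[0] == y.split("_")[0] else 0.0


print(top_n(SCORES, 3))                         # ['beach_a', 'beach_b', 'beach_c']
print(diversified_top_n(SCORES, same_type, 3))  # ['beach_a', 'mountain_a', 'beach_b']
```

In the diversified list the mountain resort appears despite its lower predicted score, which gives the user a chance to correct a possible prediction error.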
Key Recommendation Dimensions for Tourism RSs
Recommender systems have been deployed in many application domains and support a wide range of user needs. It is impossible in a short document such as this chapter to provide a comprehensive survey. However, we believe that, especially in the area of travel and tourism, three important dimensions must be considered: the context of the recommendation, the group of users that are supposed to benefit together from a recommended item, and the temporal relationships between the recommended items. Actually, all these aspects can be considered as complementary kinds of contextual situations for the recommendation.
Context-Awareness
As we discussed previously, the score given by a user to an item depends, even significantly, on the contextual situation in which the item is experienced. For instance, Baltrunas et al. (2012) describe a point-of-interest (POI) recommender system for the tourism domain, where 14 contextual factors are considered. To mention some examples, there are factors that describe the time of travel, the weather conditions, the mood of the user, and the type of group that is accompanying the traveler. For each factor a finite set of possible values is defined. For instance, the weather factor can take the values snowing, clear sky, sunny, or rainy. Hence, in this recommender system, the set C is the space of all the possible combinations of the values for the 14 considered contextual factors. This system uses a set of users’ scores for a collection of points of interest in Bolzano (items), in different contextual situations, to predict the users’ evaluations for POIs not yet experienced in some possibly new contextual situations. The main technical issue related to context management in RSs is that when context is explicitly modeled, it is even more difficult to obtain sufficient information from the users, that is, training data, so that the predicted scores for items can be reliably estimated by the system in alternative contextual situations. Hence, first of all, it is important to parsimoniously acquire contextual information that is really useful in the prediction step. This goal has been addressed in Braunhofer and Ricci (2017), where a statistical analysis of the training data is performed in order to identify the most useful contextual factors to be elicited when a user is rating an item; these are heuristically identified as the factors that do have an impact on the user’s predicted rating for that item.
This is, again, a kind of active learning task, as it is defined in machine learning, where the learning system does not passively acquire all the possible data but actively identifies the most promising data before starting the true learning process, such as the prediction of the items’ scores (Elahi et al. 2016).
The development of context-aware RSs is now an active research area, and we refer the reader to Adomavicius and Tuzhilin (2015) for more examples and techniques. The major technical difficulties are related to understanding the impact of the contextual factors on the personalization process; dynamically selecting the right factors, i.e., those relevant for a particular personalization task; obtaining sufficient and reliable data describing the user preferences in context; and embedding the contextual information in a more classical recommendation computation technique.
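The context space C described in this section, and the idea of asking users only for the factors that matter, can be illustrated with a small sketch. It uses two factors instead of 14; the weather values are those mentioned in the text, while the companion factor values, the sample ratings, and the `factor_impact` heuristic are hypothetical. The heuristic (the spread of the mean rating across the values of a factor) is only a crude stand-in and is not the statistical analysis of Braunhofer and Ricci (2017).

```python
from collections import defaultdict
from itertools import product
from statistics import mean

# Two illustrative contextual factors, each with a finite set of values.
FACTORS = {
    "weather": ["snowing", "clear sky", "sunny", "rainy"],
    "companion": ["alone", "family", "friends"],  # hypothetical values
}


def context_space(factors):
    """Enumerate C: every combination of one value per contextual factor."""
    return [dict(zip(factors, combo)) for combo in product(*factors.values())]


def factor_impact(ratings, factor):
    """Spread of the mean rating across the observed values of one factor.

    ratings is a list of (context, score) pairs for a single item. A larger
    spread suggests that the factor is worth eliciting for that item.
    """
    by_value = defaultdict(list)
    for context, score in ratings:
        if factor in context:
            by_value[context[factor]].append(score)
    means = [mean(scores) for scores in by_value.values()]
    return max(means) - min(means) if len(means) > 1 else 0.0


# Hypothetical ratings of a hiking trail in different contextual situations.
TRAIL_RATINGS = [
    ({"weather": "sunny", "companion": "family"}, 5),
    ({"weather": "rainy", "companion": "family"}, 2),
    ({"weather": "sunny", "companion": "alone"}, 5),
    ({"weather": "rainy", "companion": "alone"}, 2),
]

print(len(context_space(FACTORS)))                # 12
print(factor_impact(TRAIL_RATINGS, "weather"))    # 3
print(factor_impact(TRAIL_RATINGS, "companion"))  # 0.0
```

Even with two factors the context space has 12 situations; with 14 factors the number of combinations grows multiplicatively, which is why obtaining enough ratings per situation is the main difficulty noted above.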
Recommendations for Groups
The second dimension and application area that we would like to underline for its importance in travel and tourism applications is group recommender systems (GRSs) (Masthoff 2015). In this case the context space C represents the possible groups of travelers that the target user may belong to, and the system goal is to offer the same set of recommendations to the members of the same group. The underlying assumption is that the items will be experienced together, e.g., the group members will travel together to the recommended place. The training data set of users’ scores in this case should contain scores for items when the user was in a particular group g, for instance, with her/his family or coworkers. Hence, the basic idea is that the user score for an item is also influenced by the presence of other users, i.e., an item may appear to be better or worse if consumed with other particular users. This effect is also called emotional contagion. A group recommender system with such background knowledge must predict the scores of a user for items when she/he is together with other users whom she/he may not even have joined before. This is a rather complex task, and, as a matter of fact, group recommender systems do not directly try to predict the user evaluations as a function of the group g the user belongs to. Instead, the current leading approach is to first predict the individual scores independently from the group, i.e., using a standard RS technique for individual users and neglecting the contextual situation expressed by the group. Then, in a second step, group RSs compute the “group score prediction” for an item by aggregating the various score predictions for that item for the individual group members (Nguyen and Ricci 2018). So, for instance, if the average aggregation method is used, the predicted score of a group for an item is the average of the predicted scores for that item of the various group members.
Then, as usual, the items with the largest (group aggregated) scores are recommended. This aggregation of individual predictions may be seen as the group score prediction, but, as we discuss later, there is no group score that can be predicted, as the “group” cannot score any item; only individuals can. Other approaches, instead of aggregating score predictions, aggregate, for each group and item by item, the known scores for items given by the users belonging to the group, hence generating a (fictitious) group profile of known item scores (Ardissono et al. 2003). These scores, which constitute the group profile, are again computed, for instance, by averaging the evaluations of the users in the group for
a given item. After that, group score predictions are computed for these fictitious users, which represent groups, considering them as normal users. As we have already noted, a group cannot express a “group evaluation”; only an individual can score items. Therefore, evaluating GRSs off-line is nearly impossible, since it would require comparing the predicted group score with a true group score (ground truth), which is not available. In practice, off-line evaluations have tried to compare, for the group members, the group recommendation with the individual recommendations. One of these measures is the mean individual loss, i.e., how much utility, on average, the group members will lose if they choose the group recommendation and not their own individual recommendation (Nguyen et al. 2019). Group recommendation applications stress once more the fundamental, but so far neglected, difference between score or ranking prediction and the actual identification of the recommendations to be presented to the user. As we mentioned above, the most useful recommendations may not be the items having the largest predicted scores for the group members, as the group as a whole may come up with a different common choice; the recommender should be designed only as a tool for supporting group choice. In fact, the user-to-user interactions in a group become very important for determining the group choice. For instance, in Nguyen and Ricci (2018), the authors present a novel approach to group recommendation, which is also implemented in a mobile system, that monitors and exploits users’ interactions during a group discussion and offers appropriate recommendations, as well as other types of suggestions, to guide and help the group members in reaching an agreement. The system is based on a chat platform and focuses primarily on the discussion support, which is aimed at criticizing and refining recommendations coming from the system or from some group members.
This development is inspired and supported by a live user study, which was conducted by observing groups while they were discussing and choosing a touristic destination to visit. In this analysis it emerged that the group choices may be very different from the top scored items and users are often happy with choices that are not matching their own priorities (Delic et al. 2018).
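The two-step approach described in this section (individual predictions first, aggregation second) and the mean individual loss can be sketched as follows. The sketch assumes the average aggregation method named in the text. The loss is computed as the difference between each member's best individually predicted score and that member's predicted score for the group item; this is one plausible reading of the verbal definition above, and the exact formulation in Nguyen et al. (2019) may differ. Names and sample predictions are hypothetical.

```python
from statistics import mean


def group_scores(predictions):
    """Average aggregation of individual score predictions, item by item.

    predictions: {member: {item: predicted score}}. Only items with a
    prediction for every member are aggregated.
    """
    common = set.intersection(*(set(p) for p in predictions.values()))
    return {item: mean(p[item] for p in predictions.values()) for item in common}


def recommend_for_group(predictions):
    """Return the item with the largest aggregated score (ties: alphabetical)."""
    scores = group_scores(predictions)
    return max(sorted(scores), key=scores.get)


def mean_individual_loss(predictions, group_item):
    """Average utility each member gives up by accepting the group item."""
    return mean(max(p.values()) - p[group_item] for p in predictions.values())


# Hypothetical individual predictions for three destinations.
PREDICTIONS = {
    "anna": {"beach": 5.0, "mountain": 2.0, "city": 4.0},
    "ben": {"beach": 1.0, "mountain": 4.0, "city": 4.0},
    "carla": {"beach": 3.0, "mountain": 3.0, "city": 4.0},
}

choice = recommend_for_group(PREDICTIONS)
print(choice)                                     # city
print(mean_individual_loss(PREDICTIONS, choice))  # 0.333... (only anna loses 1.0)
```

As the section stresses, the aggregated value is a convenience for ranking and not a true group score, and the item a real group settles on after discussion may differ from `choice`.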
Sequential Recommendations
An RS may try to optimize a sequence of recommendations rather than a single one (Shani et al. 2005; Baccigalupo and Plaza 2006), for instance, a sequence of songs broadcast by a personalized radio channel. Sequential recommendations may be generated by systems that manage a structured dialogue between the user and the recommender. These systems are called conversational and have emerged in the attempt to improve the quality of the recommendations that are normally offered by simpler systems based on a one-time request/response. Conversational RSs can be further improved by implementing learning capabilities that can optimize not only the items that are recommended at each conversational step but also how the dialogue between the user and the system should unfold in all possible situations
(Mahmood et al. 2009). Please note that these systems do not necessarily use natural language techniques, and the “dialogue” may be implemented by a structured set of requests/responses that either the user or the system may initiate. More generally, in many application domains, but especially in travel and tourism, multiple user-item interactions of different types can be recorded over time. A user may browse a hotel, then book it, and then review it. A number of recent works have shown that this multistep process can be used to build better scoring prediction models that can take into account a range of user-item interactions and discover behavioral patterns that can be leveraged in the recommendation process (Quadrana et al. 2018). We have already cited, in that sense, the approach to movie recommendation developed in Gurbanov and Ricci (2017), where multiple types of action observations are leveraged to predict the target “watch” action and score the items. In travel and tourism, a particular task for a sequential recommender is the next point-of-interest recommendation or, more generally, the travel planning problem. In Massimo and Ricci (2019) the authors present a novel approach to recommendation that supports tourists in choosing interesting and novel points of interest (POIs) while dealing with situations where users’ data is scarce and there is no additional information about users apart from their past visited POIs. User behavior is modeled by first clustering users with similar visit trajectories and then learning a general user behavior model, which is common to all the users in the same cluster, via inverse reinforcement learning (IRL) (Ng and Russell 2000). Finally, recommendations are generated by exploiting the learned behavioral models.
Interestingly, it is shown that with the proposed approach, the next POI recommendations are more novel, and they better fit the general user preferences compared to those identified by a standard sequence-aware collaborative filtering solution based on nearest neighbor computation. Finally, full trip planning has been a persistent goal for travel and tourism recommender systems. Here, the core problem is tackled with techniques that are not in the core tradition of recommender systems and are instead derived from operational research (Gavalas et al. 2014). The task is usually that of defining a specific, i.e., user-dependent, optimization function that depends on the user preferences related to the POIs’ characteristics and integrating in that function a component that depends on temporal and budget constraints, such as the opening hours of the POIs and the cost of visiting them.
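The trip planning task just described can be made concrete with a small sketch: maximize the sum of user-dependent POI utilities subject to opening hours, a time window, and a budget. The exhaustive search below is only practical for a handful of POIs and is a deliberately naive stand-in for the algorithms surveyed by Gavalas et al. (2014). The POI names, utilities, costs, opening hours, and the constant travel time between any two places are hypothetical assumptions.

```python
from itertools import permutations


def evaluate(order, pois, start, end, budget, travel_time):
    """Return the total utility of visiting `order`, or None if infeasible."""
    clock, cost, utility = start, 0.0, 0.0
    for name in order:
        poi = pois[name]
        clock = max(clock + travel_time, poi["opens"])  # travel, wait if closed
        clock += poi["duration"]
        cost += poi["cost"]
        utility += poi["utility"]
        # The visit must end before closing time and before the trip ends.
        if clock > min(poi["closes"], end) or cost > budget:
            return None
    return utility


def plan_trip(pois, start, end, budget, travel_time=0.5):
    """Exhaustively search all visit orders and keep the best feasible one."""
    best_order, best_utility = (), 0.0
    for k in range(1, len(pois) + 1):
        for order in permutations(sorted(pois), k):
            utility = evaluate(order, pois, start, end, budget, travel_time)
            if utility is not None and utility > best_utility:
                best_order, best_utility = order, utility
    return best_order, best_utility


# Hypothetical POIs; times are hours of the day, utility is user-dependent.
POIS = {
    "museum": {"utility": 4, "duration": 2, "cost": 10, "opens": 9, "closes": 17},
    "castle": {"utility": 5, "duration": 3, "cost": 15, "opens": 10, "closes": 16},
    "market": {"utility": 2, "duration": 1, "cost": 0, "opens": 8, "closes": 12},
}

print(plan_trip(POIS, start=9, end=15, budget=20))  # (('market', 'castle'), 7.0)
```

In this example the museum and the castle together would give the highest utility, but they exceed the budget; the market must be visited first because it closes at noon.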
Open Issues and Conclusions
The research on RSs is very active, and several issues and challenges are still open. We conclude this chapter by listing some of them. Surely the most important trend in RSs in recent years has been the application of deep neural network (DNN) techniques. In general, deep learning algorithms adopt a sequence of nonlinear processing layers to represent complex functions. In that sense, when applied to RSs, DNNs try to model the complex and nonlinear mapping that may exist from the user,
the item, and the context combinations to the corresponding score (Karatzoglou and Hidasi 2017). The underlying assumption is that DNNs can approximate any function with arbitrarily low error if they have enough neurons arranged in enough layers. They have become very popular because of the increased computational resources offered by modern computers, such as those using GPU technology, and because in many machine learning tasks, large data repositories have been made available. However, while they have proved to be particularly successful in certain areas, e.g., in generating sequences of recommendations or in dealing with implicit definitions of context, they cannot solve all types of recommendation tasks. In fact, often we do not have at our disposal the large training data sets of user-scored items or action logs that they require, and, maybe even more importantly, it is hard to find effective ways to explain the rationale of the generated recommendation, as the internal computation of a DNN is far too complex to be unfolded in an explanation. Moreover, it has also been repeatedly observed that precise score predictions, such as those that are potentially achievable by a DNN, are not always needed, and sometimes even much simpler solutions do not perform worse when it is the user who evaluates the recommendations, rather than the goodness of the prediction algorithm being measured with standard machine learning approaches based on cross-validation. As we mentioned above, RSs must often interact with the target user, and in this case it could be more important to correctly design the human-machine interaction than to focus excessively on the optimization of the score prediction functionality. Hence, the usage of new interaction modalities, such as those provided by bots or speech-based interfaces, may represent a better investment of the available resources.
This focus also calls for a more user-centered analysis of the impact of RSs both at the individual level and at the larger community level. Moreover, since users approach RSs for different reasons, it is also important to develop solutions that may optimize alternative criteria. For instance, if the user goal is to discover new items she/he was not aware of, the system must strive to identify items that are less likely to be known by the user, hence probably not very popular, but still relevant for the user. Another question that has not yet been explored extensively, but is clearly very important in many applications, including travel and tourism, is how to manage the item price and the profit that the seller will make with a purchase of a recommended item (Jannach and Adomavicius 2017). Regarding the first question (price), some researchers have tried to model the “price sensitivity” or the user’s “willingness to pay” in the scoring prediction method (see Jannach and Adomavicius (2017) for a survey). It has been shown that these models have better precision than the corresponding models that do not take this personal information into account. With respect to the second question (profit), some researchers have shown, both in off-line and on-line experiments, that the ranking produced by the RS can be easily altered by taking into account the items’ profit without producing a significant decrease of the user satisfaction and the precision of the recommendations. These results further support the intuition that the ranked list produced by the item score
prediction component should be taken just as a starting point for the generation of the actual selection of items to be presented to the user; other criteria, such as those related to the cost or the profit of an item, or the diversity of the recommendations, should and could be considered. It is also becoming very clear that RS technologies can influence users’ opinions (Aljukhadar et al. 2012), and therefore the application of these technologies in certain domains, such as politics, health, or employment, is attracting criticism and attention. For instance, the popular LinkedIn social network makes extensive use of RSs to match profiles with job posts. This means that recruiters tend to scrutinize more closely the profiles of job seekers that the RS has matched with the job positions that they have entered. In that sense, the prospects of a person looking for a job may be influenced, and in some cases also penalized, by the algorithmic decision-maker implemented in an RS. However, we should not overweight these negative effects of RSs, as it has been shown that they can even increase choice diversity and, for instance, produce a higher amount of topic diversity, compared to non-personalized methods, in journalistic recommendations (Möller et al. 2018). Travel and tourism applications are probably not that sensitive, but, still, one can imagine that a biased usage of an RS may drive more attention to certain destinations or services and therefore may have an impact that goes beyond the optimization of the single-user satisfaction. As was recently discussed in Werthner et al. (2015), we should critically consider the fundamental five layers that structure the field of e-tourism (individual, group, corporation, network, and government) and identify the roles and potential applications of RSs in all these layers.
Too much attention has been given so far to the first two layers (individual and group), while the most important and far-reaching impact on people’s lives can be determined by operating at the highest levels of this architecture. In that respect, RSs still have many interesting problems to solve, and the research on the subject can be further motivated by these important applications.
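The earlier observation in this section, that the ranked list can be adjusted for item profit with limited effect on the user, can be illustrated with a small re-ranking sketch. It assumes a linear blend of the predicted score (taken to lie in [0, 1]) and the profit normalized by the largest profit in the candidate set. The blend, the weight, and the hotel data are hypothetical and are not taken from the works surveyed by Jannach and Adomavicius (2017).

```python
def profit_aware_ranking(items, weight=0.2):
    """Re-rank items by blending predicted score with normalized profit.

    items: list of dicts with "name", "score" (assumed in [0, 1]), "profit".
    weight = 0 reproduces the score-only ranking; the default is arbitrary.
    """
    max_profit = max(item["profit"] for item in items) or 1.0  # avoid / 0

    def blended(item):
        return (1 - weight) * item["score"] + weight * item["profit"] / max_profit

    return [item["name"] for item in sorted(items, key=blended, reverse=True)]


# Hypothetical hotels: hotel_b is almost as relevant as hotel_a but more profitable.
HOTELS = [
    {"name": "hotel_a", "score": 0.80, "profit": 10},
    {"name": "hotel_b", "score": 0.78, "profit": 40},
    {"name": "hotel_c", "score": 0.50, "profit": 50},
]

print(profit_aware_ranking(HOTELS, weight=0.0))  # ['hotel_a', 'hotel_b', 'hotel_c']
print(profit_aware_ranking(HOTELS))              # ['hotel_b', 'hotel_a', 'hotel_c']
```

With a small weight only near-ties are swapped, so the most profitable but clearly less relevant hotel stays at the bottom; this is the kind of bounded adjustment that keeps the loss in predicted relevance small.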
Cross-References
Advanced Web Technologies and E-Tourism Web Applications
Artificial Intelligence and Machine Learning
Big Data Technologies
Business Intelligence in Tourism
Consumer Behavior in e-Tourism
Data Mining and Predictive Analytics for E-Tourism
Data Privacy and the Travel Sector
e-Tourism: An Informatics Perspective
Group Decision-Making and Designing Group Recommender Systems
Interactive and Context-Aware Systems in Tourism
Internet of Things and Ubiquitous Computing in the Tourism Domain
Mobile Applications for e-Tourism
Travel Information Search
User Modelling in E-Tourism: A Human-Computer Interaction Perspective
Web Information Retrieval and Search
References
Adomavicius G, Sankaranarayanan R, Sen S, Tuzhilin A (2005) Incorporating contextual information in recommender systems using a multidimensional approach. ACM Trans Inf Syst 23(1):103–145
Adomavicius G, Tuzhilin A (2015) Context-aware recommender systems. In: Ricci et al. (2015b), pp 191–226
Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749
Adomavicius G, Mobasher B, Ricci F, Tuzhilin A (2011) Context-aware recommender systems. AI Mag 32(3):67–80
Aljukhadar M, Senecal S, Daoust C-E (2012) Using recommendation agents to cope with information overload. Int J Electron Commer 17(2):41–70
Ardissono L, Goy A, Petrone G, Segnan M, Torasso P (2003) Intrigue: personalized recommendation of tourist attractions for desktop and handset devices. Appl Artif Intell 17:687–714
Baccigalupo C, Plaza E (2006) Case-based sequential ordering of songs for playlist recommendation. In: Roth-Berghofer T, Göker MH, Altay Güvenir H (eds) ECCBR. Lecture notes in computer science, vol 4106. Springer, pp 286–300
Baltrunas L, Ludwig B, Peer S, Ricci F (2012) Context relevance assessment and exploitation in mobile recommender systems. Pers Ubiquit Comput 16(5):507–526
Braunhofer M, Ricci F (2017) Selective contextual information acquisition in travel recommender systems. J IT Tour 17(1):5–29
Bridge D, Göker M, McGinty L, Smyth B (2006) Case-based recommender systems. Knowl Eng Rev 20(3):315–320
Burke R (2007) Hybrid web recommender systems. In: The adaptive web. Springer, Berlin/Heidelberg, pp 377–408
Coba L, Rook L, Zanker M, Symeonidis P (2019) Decision making strategies differ in the presence of collaborative explanations: two conjoint studies. In: Fu W-T, Pan S, Brdiczka O, Chau P, Calvary G (eds) Proceedings of the 24th international conference on intelligent user interfaces, IUI, Marina del Ray, 17–20 March 2019. ACM, pp 291–302
Delic A, Neidhardt J, Nguyen TN, Ricci F (2018) An observational user study for group recommender systems in the tourism domain. J IT Tour 19(1–4):87–116
de Gemmis M, Lops P, Musto C, Narducci F, Semeraro G (2015) Semantics-aware content-based recommender systems. In: Ricci et al. (2015b), pp 119–159
Elahi M, Ricci F, Rubens N (2016) A survey of active learning in collaborative filtering recommender systems. Comput Sci Rev 20:29–50
Felfernig A, Friedrich G, Jannach D, Zanker M (2015) Constraint-based recommender systems. In: Ricci et al. (2015b), pp 161–190
Gavalas D, Konstantopoulos C, Mastakas K, Pantziou GE (2014) A survey on algorithmic approaches for solving tourist trip design problems. J Heuristics 20(3):291–328
Goldberg D, Nichols D, Oki BM, Terry D (1992) Using collaborative filtering to weave an information tapestry. Commun ACM 35(12):61–70
Gurbanov T, Ricci F (2017) Action prediction models for recommender systems based on collaborative filtering and sequence mining hybridization. In: Seffah A, Penzenstadler B, Alves C, Peng X (eds) Proceedings of the symposium on applied computing, SAC 2017, Marrakech, 3–7 April 2017. ACM, pp 1655–1661
Guy I (2015) Social recommender systems. In: Ricci et al. (2015b), pp 511–543
Hu G-N, Dai X-Y, Song Y, Huang S, Chen J (2015) A synthetic approach for recommendation: combining ratings, social relations, and reviews. In: Yang Q, Wooldridge M (eds) Proceedings of the twenty-fourth international joint conference on artificial intelligence, IJCAI, Buenos Aires, 25–31 July 2015. AAAI Press, pp 1756–1762
Jannach D, Adomavicius G (2017) Price and profit awareness in recommender systems. CoRR, abs/1707.08029
Kalloori S, Ricci F, Gennari R (2018) Eliciting pairwise preferences in recommender systems. In: Pera S, Ekstrand MD, Amatriain X, O’Donovan J (eds) Proceedings of the 12th ACM conference on recommender systems, RecSys 2018, Vancouver, 2–7 Oct 2018. ACM, pp 329–337
Karatzoglou A, Hidasi B (2017) Deep learning for recommender systems. In: Cremonesi P, Ricci F, Berkovsky S, Tuzhilin A (eds) Proceedings of the eleventh ACM conference on recommender systems, RecSys 2017, Como, 27–31 Aug 2017. ACM, pp 396–397
Konstan JA, Riedl J (2012) Recommender systems: from algorithms to user experience. User Model User-Adap Inter 22(1–2):101–123
Koren Y, Bell RM (2015) Advances in collaborative filtering. In: Ricci et al. (2015b), pp 77–118
Möller J, Trilling D, Helberger N, van Es B (2018) Do not blame it on the algorithm: an empirical assessment of multiple recommender systems and their impact on content diversity. Inf Commun Soc 21(7):959–977
Mahmood T, Ricci F, Venturini A (2009) Improving recommendation effectiveness by adapting the dialogue strategy in online travel planning. Int J Inf Technol Tour 11(4):285–302
Massimo D, Ricci F (2019) Clustering users’ POIs visit trajectories for next-POI recommendation. In: Pesonen J, Neidhardt J (eds) Information and communication technologies in tourism, ENTER 2019, Proceedings of the international conference in Nicosia, Cyprus, Jan 30–Feb 1 2019. Springer, pp 3–14
Masthoff J (2015) Group recommender systems: aggregation, satisfaction and group attributes. In: Ricci et al. (2015b), pp 743–776
McGinty L, Reilly J (2011) On the evolution of critiquing recommenders. In: Ricci F, Rokach L, Shapira B (eds) Recommender systems handbook. Springer, pp 419–453
Moling O, Baltrunas L, Ricci F (2012) Optimal radio channel recommendations with explicit and implicit feedback. In: RecSys ’12: Proceedings of the 2012 ACM conference on recommender systems, pp 75–82
Ng A, Russell S (2000) Algorithms for inverse reinforcement learning. In: Proceedings of the 17th international conference on machine learning – ICML’00, pp 663–670
Nguyen TN, Ricci F (2018) A chat-based group recommender system for tourism. J IT Tour 18(1–4):5–28
Nguyen TN, Ricci F, Delic A, Bridge DG (2019) Conflict resolution in group decision making: insights from a simulation study. User Model User-Adapt Interact 29(5):895–941
Ning X, Desrosiers C, Karypis G (2015) A comprehensive survey of neighborhood-based recommendation methods. In: Ricci et al. (2015b), pp 37–76
Osogami T, Otsuka M (2014) Restricted Boltzmann machines modeling human choice. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ (eds) Advances in neural information processing systems 27: Annual conference on neural information processing systems, Montreal, 8–13 Dec 2014, pp 73–81
Quadrana M, Cremonesi P, Jannach D (2018) Sequence-aware recommender systems. ACM Comput Surv 51(4):66:1–66:36
Rendle S, Freudenthaler C, Gantner Z, Schmidt-Thieme L (2012) BPR: Bayesian personalized ranking from implicit feedback. CoRR, abs/1205.2618
Resnick P, Iacovou I, Suchak M, Bergstrom P, Riedl J (1994) Grouplens: an open architecture for collaborative filtering of netnews. In: Proceedings ACM conference on computer-supported cooperative work, pp 175–186
Ricci F (2018) Recommender systems: models and techniques. In: Alhajj R, Rokne JG (eds) Encyclopedia of social network analysis and mining, 2nd edn. Springer, New York
Ricci F, Werthner H (2002) Case-based querying for travel planning recommendation. Inf Technol Tour 4(3/4):215–226
Ricci F, Rokach L, Shapira B (2015a) Recommender systems: introduction and challenges. In: Ricci et al. (2015b), pp 1–34
Ricci F, Rokach L, Shapira B (eds) (2015b) Recommender systems handbook. Springer, New York
Schall D (2015) Social network-based recommender systems. Springer, New York
Schwartz B (2004) The paradox of choice. ECCO, New York
Shani G, Heckerman D, Brafman RI (2005) An MDP-based recommender system. J Mach Learn Res 6:1265–1295
Steele K, Stefánsson HO (2016) Decision theory. In: Zalta EN (ed) The Stanford encyclopedia of philosophy. Metaphysics Research Lab, Stanford University, winter 2016 edition
Tintarev N, Masthoff J (2015) Explaining recommendations: design and evaluation. In: Ricci et al. (2015b), pp 353–382
Victor P, De Cock M, Cornelis C (2011) Trust and recommendations. In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York, pp 645–675
Werthner H, Alzua-Sorzabal A, Cantoni L, Dickinger A, Gretzel U, Jannach D, Neidhardt J, Pröll B, Ricci F, Scaglione M, Stangl B, Stock O, Zanker M (2015) Future research issues in IT and tourism. J IT Tour 15(1):1–15
Blockchain and Tourism
20
Horst Treiblmaier
Contents
Introduction
Blockchain Technology
The Impact of the Blockchain on Tourism
Previous Research
Blockchain Applications in Tourism
Blockchain and Tourism: An Academic Perspective
Expected Future Developments
Cross-References
References
H. Treiblmaier, Modul University Vienna, Vienna, Austria, e-mail: [email protected]
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_28
Abstract
Blockchain technology has the potential to substantially transform the tourism industry. Its salient features such as immutability, transparency, programmability, and decentralization allow for innovative ways to design customer relationships, enable novel organizational structures and processes, and facilitate new forms of interorganizational collaboration. In this chapter, I first elaborate on the basic functioning of the blockchain and highlight those characteristics which are crucial for understanding the rest of the chapter. I not only equip scholars and practitioners with the knowledge needed to better comprehend blockchain technology and how to apply it in the context of the tourism industry but also highlight shortcomings and important research topics. Furthermore, I investigate the disruptive potential of the blockchain on an economic level by discussing various ways in which it can alter existing market structures and potentially lead to the disintermediation of incumbents in the tourism industry and the emergence of new players. I draw on economic theory to better understand how blockchain characteristics might shape the future of the tourism industry and who the main beneficiaries will be. I end the chapter with several suggestions for future research and expected future developments.
Keywords Blockchain · Blockchain use cases · Distributed ledger technology · Smart contracts · Tourism · Hospitality
Introduction
Tourism can be defined as “a social, cultural and economic phenomenon which entails the movement of people to countries or places outside their usual environment for personal or business/professional purposes” (UNWTO 2015, p. 1). Over recent decades, tourism has increasingly turned into an information-intense business which relies heavily on information and communication technologies (ICT) (Werthner and Klein 1999). Consequently, each new wave of technological progress has led to profound changes in areas such as bookings and hospitality operations. Since the 1990s, the Internet has exerted a strong impact on the tourism industry as a whole and spawned new reservation systems as well as novel forms of direct interaction with prospective and existing customers. This has led to the notion of e-tourism (aka eTourism), defined as “the digitization of all the processes and value chains in the tourism, travel, hospitality and catering industries that enable organizations to maximize their efficiency and effectiveness” (Buhalis 2002, p. xxiv).
A fairly recent development is the shift from e-tourism toward smart tourism, which entails the move from the digital sphere into a combined digital and physical sphere. This move is characterized by the gradual replacement of websites by sensors and smartphones, the shift from information to big data, the central paradigm of interactivity being replaced by technology-mediated cocreation, and main exchanges no longer being B2B, B2C, and C2C, but rather public-private-consumer collaborations (Gretzel et al. 2015). The blockchain is but another step in this gradual process of technical progress and not only offers new opportunities but might also pose a serious threat for numerous incumbent stakeholders. In this chapter, I discuss the potential impact of the blockchain on the tourism industry in a general sense and without going into much technical detail.
Blockchain technology, which will be discussed in more detail in the following section, is a common denominator for a multitude of technologies, many of which are under constant development. It is therefore its general characteristics, rather than specific implementation details, which are of interest in this chapter. Given the wide variety of potential applications, it is sometimes hard to draw a clear line between those applications that are specifically relevant for hospitality and tourism and those which
only have an indirect impact. For example, one of the most prominent use cases of blockchain technology is the tracking and tracing of goods in the supply chain, which necessarily affects the whole area of food tourism (i.e., gastronomy tourism, culinary tourism) as well as a major part of hospitality operations management. Another example is identity management, a prominent use case for blockchain technologies, which might include the recovery of tourists’ lost passports as a specific application. In this chapter I therefore focus mainly on potential use cases that are particular to the hospitality and tourism industry while acknowledging that many more applications may emerge, for example, in payment systems or supply chain management, which could indirectly yet substantially affect the hospitality and tourism industry. I start by elaborating on the foundations of blockchain technology which are essential for understanding the rest of the chapter. All explanations are rather nontechnical and focus instead on the main characteristics of the technology. The main part of this chapter is the analysis of potential implications of blockchain technology on tourism, starting with a brief overview of the current literature. I split the implications into a discussion of practical use cases and a more academic examination of the underlying rationale for blockchain adoption and the organizational and economic changes that this development might trigger. Finally, I provide some suggestions for future rigorous and theory-based academic research and end the paper with a brief outlook on expected developments.
Blockchain Technology
Initially developed by Satoshi Nakamoto, the pseudonymous creator of the cryptocurrency Bitcoin, as a solution for the electronic transfer of value, the blockchain was soon recognized for its potential in more general use cases, and the technology rapidly gained widespread attention, with tourism being but one application field. A blockchain can be defined as “a digital, decentralized and distributed ledger in which transactions are logged and added in chronological order with the goal of creating permanent and tamperproof records” (Treiblmaier 2018, p. 547).
Fig. 1 Simplified bitcoin structure as an example for a public blockchain
Figure 1 illustrates the basic and simplified structure of a blockchain using the example of Bitcoin. A list of transactions is bundled together into a block, and a hash function is applied to each transaction, which generates a number of fixed length that serves as a fingerprint of the underlying data; hashes of this kind are also used to connect the blocks in the chain. Next, all the individual hashes are mapped into a single hash, called the Merkle root, which is stored in the block header. It is a noteworthy feature of hash functions (of which numerous types exist) that a slight modification in the underlying data yields a completely different hash value, which makes it easy to spot modifications of the data. In addition to the Merkle root, the block header contains information such as a timestamp and a nonce (“number only used once”), an arbitrary number that blockchain miners are trying to find. Miners are specialized computers that validate transactions and add them to the network. In order to find a nonce that leads to a valid solution, miners need to try out numerous numbers, and the first miner who
finds a solution is granted the right to add a new block to the chain in exchange for some Bitcoin. This mechanism ensures that the right to add new information depends only on computing power and is not granted by a central authority. An important feature of the blockchain is the inclusion of the hash from the previous block header as part of the header of the subsequent block, which creates a data structure that cannot be altered without destroying the integrity of the entire chain following the modification.
Similar technologies that do not exhibit a chainlike structure are also often subsumed under the umbrella term “blockchain,” although these may be more accurately described by the broader term “distributed ledger technology,” or the even more general “trustless systems” (Treiblmaier 2019). A ledger in this context is an auditable log of the complete history of transactions. Examples that do not present a chainlike structure include directed acyclic graphs, in which the links between the respective nodes all go in one direction and look more like a network, and the Hedera Hashgraph, which is based on a so-called gossip protocol that ensures that information spreads throughout the network in a parallel manner. Most authors and organizations do not care much about the underlying technology, but more about its core features such as immutability and decentralization, and therefore the term “blockchain” is frequently used to denote a wide range of such trustless systems. In this chapter I follow this convention.
The blockchain is often referred to as the technology underlying Bitcoin, but this statement is an oversimplification. A more detailed investigation of the seminal ideas that enabled Bitcoin and paved the way for further applications reveals the separate development of sophisticated concepts such as linked timestamping, proof of work, Byzantine fault tolerance, public keys as identities, and smart contracts (Narayanan and Clark 2017).
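The block structure described above (transaction hashes, Merkle root, nonce, and the hash of the previous header) can be illustrated with a minimal Python sketch. It is a teaching toy under stated assumptions, not Bitcoin's actual format: the header fields, the JSON serialization, the single SHA-256 pass, and the "leading zeros" difficulty rule are all simplifications chosen for this example, and the function names are hypothetical.

```python
import hashlib
import json


def sha256(text):
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def merkle_root(transactions):
    """Hash every transaction, then hash pairs of hashes until one remains."""
    level = [sha256(tx) for tx in transactions]
    if not level:
        return sha256("")
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def header_hash(header):
    """Hash a block header, serialized with sorted keys so the result is stable."""
    return sha256(json.dumps(header, sort_keys=True))


def mine_block(prev_hash, transactions, timestamp, difficulty=3):
    """Toy proof of work: try nonces until the header hash starts with zeros."""
    header = {
        "prev_hash": prev_hash,
        "merkle_root": merkle_root(transactions),
        "timestamp": timestamp,
        "nonce": 0,
    }
    while not header_hash(header).startswith("0" * difficulty):
        header["nonce"] += 1
    return {"header": header, "transactions": list(transactions)}


def chain_is_valid(chain, difficulty=3):
    """Check Merkle roots, proof of work, and the links between block headers."""
    for index, block in enumerate(chain):
        header = block["header"]
        if header["merkle_root"] != merkle_root(block["transactions"]):
            return False  # transactions were changed after mining
        if not header_hash(header).startswith("0" * difficulty):
            return False  # nonce does not satisfy the difficulty rule
        if index > 0 and header["prev_hash"] != header_hash(chain[index - 1]["header"]):
            return False  # link to the previous block is broken
    return True


genesis = mine_block("0" * 64, ["alice pays bob 1"], timestamp=1)
second = mine_block(header_hash(genesis["header"]),
                    ["bob pays carol 1", "carol pays dave 2"], timestamp=2)
chain = [genesis, second]
print(chain_is_valid(chain))     # True

# Changing a recorded transaction breaks the stored Merkle root.
tampered = [dict(genesis, transactions=["alice pays bob 100"]), second]
print(chain_is_valid(tampered))  # False
```

Because every header includes the hash of its predecessor, an attacker who wanted to hide the change would have to recompute the proof of work for the modified block and for every block after it.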
It was the clever combination of all these technologies with the Internet as a communication layer that eventually solved the problem of double spending (i.e., the repeated spending of a digital asset) (Nakamoto 2008)
and turned the Internet of information into an Internet of value. More specifically, the double spending problem was solved by the Bitcoin blockchain through a clever combination of five technologies: append-only databases, asymmetric (public key) cryptography, P2P networking, game theoretic incentives, and a consensus mechanism (Potts 2018). In Table 1, I briefly describe the core characteristics of the blockchain. The immutability of the blockchain results from the fact that new data can only be appended, such that existing data remains unaltered in the chain. Blockchain can help to create transparency in systems since all participants can access the same data. In private blockchains, access rights can be restricted, depending on the participants’ specific role. One of the major shortcomings of the Bitcoin blockchain remains its lack of flexibility, expressed by its limited programmability. This means that the language which Bitcoin uses for processing transactions is fairly limited in its functionality. As a solution, blockchains were developed that included full-fledged programming languages and allowed for the comprehensive development of software. One notable example is Ethereum, which enables software developers to build and deploy decentralized applications. Decentralization is one of the core features of the blockchain which means that data (e.g., transactions) or code are stored in an identical manner on a number of computers (nodes). This not only prevents a single authority from assuming control but also makes it difficult to compromise the network, since a multitude of targets would need to be attacked simultaneously. The degree of anonymity on a blockchain mostly depends on its type (public vs. private) and is not a major issue in many commercial applications with a clearly defined group of participants. 
Table 1 Blockchain characteristics (cf. Treiblmaier 2019)
Characteristic | Explanation
Immutability | Data in a blockchain is unchangeable unless a specified part of the network (e.g., the majority of the hashing power in Bitcoin) decides to change it. Data that has been tampered with can be easily identified
Transparency | Data on a blockchain is visible to a specified group of users. Most importantly, they all share the same view on the data
Programmability | Programmable blockchains allow the specification of rules (often called smart contracts) that are automatically executed in case prespecified conditions occur
Decentralization | Blockchain technologies do not rely on a central point of control. Consensus protocols define how dispersed entities agree on what should be written onto a blockchain and the prevailing state of truth
Anonymity | The visibility of identifying data in a blockchain ranges from full anonymity over pseudonymity to full identity
Consensus | A consensus mechanism is applied to achieve agreement on the state of a network including the validity of transactions and how decisions can be made
Contrary to commonly held assumptions, Bitcoin, as a prominent example of a public blockchain, does not provide anonymity but rather pseudonymity, with the public address serving as the pseudonym. Previous research has shown that the true identity of participants
in this peer-to-peer network can potentially be revealed (Meiklejohn et al. 2016). Finally, an important feature of a decentralized network is the procedure to reach consensus pertaining to the validity of transactions and the determination of which entities are entitled to add data. The issue of reaching consensus is mainly a problem for public blockchains in which the participants do not know and trust each other. The most well-known consensus mechanism is proof of work (PoW), which is applied in Bitcoin. In order to determine who is entitled to add the next transaction, miners search for the answer to a mathematical problem which requires substantial computational power. Numerous other consensus mechanisms sharing the same goal exist, but the underlying principles vary. Other examples include proof of stake (PoS), in which users can signal interest by locking up their tokens for a certain amount of time; proof of capacity (PoC), which requires participants to allocate an amount of memory or disk space; and proof of elapsed time (PoET), in which each participant needs to wait a randomly assigned period of time. A detailed analysis of existing consensus mechanisms is beyond the scope of this chapter, but it is important to mention that they are also under constant development to create effective and efficient solutions. Additional features of the blockchain can be derived from its basic characteristics. One of the most important is (distributed) trust, namely, the partial substitution of trust in people or organizations by trust in systems based on blockchain technology in which data is immutably recorded and algorithms are executed automatically. Care has to be taken, however, not to mistakenly assume that blockchain technology can solve all trust-related issues. Possible problems include bugs in the code, hostile takeovers of the network, or powerful entities exercising control in the background. 
Furthermore, the use of asymmetric cryptography promises authenticity, which can be assured by employing a pair of keys to encrypt and decrypt data. The so-called public key can be shared with everyone, while the private key stays in the possession of the owner. A message encrypted with one key can only be decrypted with the matching key from the pair. In addition to ensuring confidentiality by encryption, asymmetric cryptography can also be used for authentication, which is done by signing a message with a private key. Anyone in possession of the corresponding public key can easily verify that a specific message could only have been created by the owner of the respective private key. Auditability is another feature that can be derived from the combination of immutability and transparency of the data. Similarly, traceability is ensured by the immutability of the data in append-only databases, wherein existing data can be neither removed nor altered. Another feature that is commonly associated with blockchain technology is the security that is ensured by using cryptographic techniques. However, a closer examination reveals that the term “secure” is hard to define and that two core questions need to be asked: (1) secure from whom? and (2) secure for what? (Orcutt 2018). Privacy is also a double-edged sword when it comes to blockchain technology. On the one hand, blockchain technology can enable individuals to gain back control over their personal data, while on the other hand, it also might allow companies to easily match data, going far back into the past, in order to create a holistic picture of individuals’ preferences and behaviors. At a market
level, disintermediation is frequently mentioned as an implication of the blockchain, but it might also be the case that new intermediaries emerge as controllers of the technology. Cryptocurrency coins such as Bitcoin or Ether represent encrypted digital currencies that can be used for payment purposes and operate on their respective blockchains. In contrast, a token is hosted by a platform and has a broader range of functionality: payment tokens can be used for buying goods or services, utility tokens provide digital access to applications or services, and asset tokens represent the underlying asset such as debt or equity claims (Goudarzi and Martin 2018). Blockchain technology is not a silver bullet for all organizational problems and several shortcomings exist, many of which can be directly linked to its characteristics. For example, a blockchain cannot guarantee the accuracy of the data, and its immutability can lead to adverse effects if data is wrong or legal regulations are violated. This is especially crucial if personal data are involved, given that strong privacy regulations are in place especially in the European Union with the General Data Protection Regulation (GDPR). Numerous legal uncertainties exist when smart contracts are used. Other disadvantages include the complexity and inefficiency of setting up and managing a network of distributed computers that store data in a highly redundant manner and the problem of creating scalable distributed solutions that allow for a high throughput of data. Particularly in public networks relying on a PoW consensus algorithm, it cannot be ruled out that the control of the network is taken over through the application of massive computing power. Figure 2 shows a decision tree that can serve as a guideline for organizations who want to assess whether they need a blockchain solution and, if so, which form would be the most appropriate to deploy. Blockchain applications can be differentiated
into public vs. private as well as permissionless vs. permissioned. No blockchain solution is needed if there is no need to continuously reflect the system’s current status, if there is only one entity that writes in the database, or if a trusted third party exists. In all other cases, the correct choice depends on whether the entities that write to the blockchain are known. If this is not the case, a permissionless blockchain would be the appropriate choice. Otherwise the pertinent decision criterion is that of public verifiability: the question of whether all participants should be able to verify the correctness of the system state. If this is the case, a public permissioned blockchain should be chosen. If not, a private permissioned blockchain that is only accessible to a clearly defined number of participants is the solution of choice (Wüst and Gervais 2018). The latter type is the preferred option for many blockchain solutions in the supply chain.
Fig. 2 Blockchain decision flowchart (Goudarzi and Martin 2018; Wüst and Gervais 2018). Decision nodes: Do you need to store a transaction? Are there multiple writers? Can you use a trusted third party? Are all writers known? Do the writers have a conflict of interest? Is public verifiability required? Outcomes: no need for a blockchain, permissionless blockchain, public permissioned blockchain, private permissioned blockchain.
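The decision logic described above can be written down as a short function. The following Python sketch is one possible reading of the flowchart, not a definitive implementation: the function and parameter names are hypothetical, and the handling of the "conflict of interest" question (no conflict means no blockchain is needed) is an assumption that follows the corresponding "are all writers trusted?" branch in Wüst and Gervais (2018).

```python
NO_BLOCKCHAIN = "no need for a blockchain"
PERMISSIONLESS = "permissionless blockchain"
PUBLIC_PERMISSIONED = "public permissioned blockchain"
PRIVATE_PERMISSIONED = "private permissioned blockchain"


def recommend_ledger(*, store_transactions, multiple_writers, trusted_third_party,
                     all_writers_known, conflict_of_interest, public_verifiability):
    """Walk through the decision questions in order and return a recommendation."""
    if not store_transactions:
        return NO_BLOCKCHAIN       # nothing needs to be recorded
    if not multiple_writers:
        return NO_BLOCKCHAIN       # a single writer can use an ordinary database
    if trusted_third_party:
        return NO_BLOCKCHAIN       # a trusted intermediary can keep the records
    if not all_writers_known:
        return PERMISSIONLESS      # unknown writers require open participation
    if not conflict_of_interest:
        return NO_BLOCKCHAIN       # assumption: writers without conflicts trust each other
    if public_verifiability:
        return PUBLIC_PERMISSIONED
    return PRIVATE_PERMISSIONED


# Hypothetical example: a hotel consortium with known but competing members
# that does not need outsiders to verify the shared state.
print(recommend_ledger(store_transactions=True, multiple_writers=True,
                       trusted_third_party=False, all_writers_known=True,
                       conflict_of_interest=True, public_verifiability=False))
# private permissioned blockchain
```

Keyword-only arguments are used so that each call spells out which question is being answered, which keeps the six yes/no inputs from being mixed up.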
The Impact of the Blockchain on Tourism
Blockchain technology is expected to have a huge impact on numerous industries, and tourism is no exception (Treiblmaier and Beck 2019a,b). Bitcoin, the first practical implementation of an online blockchain, was released in January 2009,¹ but it took several years before the industry started to take notice, which for most organizations happened between 2015 and 2017. Given the long duration of academic publication cycles, rigorous research on tourism use cases is still scarce, but a huge amount of gray literature and numerous newspaper and magazine articles have been published on the topic in recent years. Many organizations are trying to incorporate the blockchain into their existing business models, with varying degrees of publicity. Simultaneously, start-ups are being launched in an effort to disrupt or alter existing business models. Due to the topicality of this subject, I refrain from mentioning individual applications or specific decentralized applications (Dapps) but encourage other authors to systematically analyze and compare them. Existing work in this area includes the studies by Nam et al. (2019) and Ozdemir et al. (2019).
The coming sections briefly summarize existing academic research, before discussing potential use cases from a practitioner’s perspective. Subsequently, I take a look from an academic perspective and suggest various theories which can be used to investigate the topic from a behavioral (i.e., individual-centric), organizational, or economic perspective. The borders between academic and industry perspectives remain rather blurred at this early developmental stage, and academic research can easily transcend any perceived boundaries: for example, by taking a design science approach and actively participating in deploying applications that are useful for the industry or by assessing their potential impact, as is done in the popular research stream of technology adoption. Furthermore, neither the list of potential use cases nor the theories I present are meant to be exhaustive and should rather serve as a starting point for researchers who want to delve into this subject in more depth.
¹ Most people believe that Bitcoin was the first blockchain. This is true if the definition of blockchain encompasses decentralization and the consensus of a number of online nodes that do not trust each other. If the term is taken literally (i.e., as a chain of blocks containing data), the first blockchain was launched by the two cryptographers Stuart Haber and Scott Stornetta. Starting in 1995, they used a cryptographic hash value to produce a unique ID and to connect data blocks. This hash has been published in the New York Times on a weekly basis ever since to ensure that this chain of records cannot be tampered with.
Previous Research
Rigorous academic research on the implications of the blockchain on tourism is scarce. In one of the first publications in a premier tourism journal, Önder and Treiblmaier (2018) suggest three research propositions pertaining to the impact of the blockchain on the tourism industry, namely, (1) that new forms of evaluations and review technologies will lead to trustworthy review systems, (2) that the widespread adoption of cryptocurrencies will lead to new types of C2C markets, and (3) that blockchain technology will lead to increased disintermediation in the tourism industry. Calvaresi et al. (2019) provide a comprehensive summary of current research, including gray literature, with a focus on how blockchain technology can help to foster trust in tourism. Their research highlights both strengths (e.g., cost reductions, increased efficiency) and weaknesses (e.g., low transaction speeds, lack of standardization). Treiblmaier (2019) present the findings from a qualitative study with managers from destination management organizations (DMOs) from ten major European cities. Their findings reveal that the blockchain has the potential to reduce costs and change market structures and may be a valuable asset for many tourism organizations. However, given the nascent state of the field, future research is needed to better understand and predict the full implications of blockchain technologies. Kwok and Koh (2018) describe blockchain technology as a potential watershed for tourism development and focus especially on its effect on small island economies. They identify six major areas of impact: inventory management, credential management, digital payment, loyalty programs, identity management, and reservations and ticketing. Nam et al. (2019) describe the latest trends and challenges regarding blockchain technology for smart cities and smart tourism and develop research propositions postulating an emergence of new market structures and business models.
Finally, Ozdemir et al. (2019) present a comparison framework for Dapps and present several practical examples of how they are currently used in the tourism industry.
Blockchain Applications in Tourism
There exist different ways of categorizing the potential applications of the blockchain in the tourism and hospitality industry. In Table 2, I present various use cases that can be derived from the characteristics of the blockchain and that are already implemented in the industry or at least discussed in the practitioner literature.
Table 2 Blockchain use cases in tourism
Use case | Source
Inventory management | Bell and Hollander (2018), HTNG (2018), and Willie (2019)
Maintenance and tracking | Goudarzi and Martin (2018), Irvin and Sullivan (2018), Nam et al. (2019), Pilkington (2017), and Willie (2019)
Content, reservations and ticketing | Bell and Hollander (2018), Goudarzi and Martin (2018), HTNG (2018), and Larchet (2017)
Payments and tax compliance | HTNG (2018), Kwok and Koh (2018), Nam et al. (2019), Önder and Treiblmaier (2018), and Willie (2019)
Loyalty programs and personalized marketing | Dogru et al. (2018), Goudarzi and Martin (2018), HTNG (2018), Irvin and Sullivan (2018), Kwok and Koh (2018), Pilkington (2017), and Willie (2019)
Tokenization and dedicated coins | Bell and Hollander (2018), Goudarzi and Martin (2018), and Ying et al. (2018)
Identity, credential management and privacy | Bell and Hollander (2018), Dogru et al. (2018), Goudarzi and Martin (2018), HTNG (2018), Kwok and Koh (2018), Nam et al. (2019), Önder and Treiblmaier (2018), Pilkington (2017), and Willie (2019)
Baggage tracking | Goudarzi and Martin (2018) and Ludeiro (2019)
Smart contract | Dogru et al. (2018), Goudarzi and Martin (2018), Irvin and Sullivan (2018), Nam et al. (2019), and Willie (2019)
Dapp for smart tourism | Nam et al. (2019) and Ozdemir et al. (2019)
Disintermediation | Calvaresi et al. (2019), HTNG (2018), Önder and Treiblmaier (2018), and Pilkington (2017)
Coordination and coopetition | Goudarzi and Martin (2018), HTNG (2018), Irvin and Sullivan (2018), Willie (2019), and Ying et al. (2018)
Examples of blockchain applications in the hospitality industry include
booking and facilities management. The former includes the customer interface as well as interorganizational data processing and the latter the administration of accommodations. When it comes to transportation, it is especially the airline industry which scrutinizes the technology’s potentials. It has to be mentioned, however, that the classification in Table 2 is not disjunct and numerous overlaps exist, which result from the complexity and the presumed far-reaching implications of the technology. For example, an alternative classification might investigate impacts based on the type of tourism, such as health tourism or food tourism (Pilkington 2017; Willie 2019). In the following sections, I briefly discuss selected potential use cases and how blockchain technology can potentially transform them.
Inventory Management
Inventory in tourism often refers to the number of rooms available in the hospitality industry or the number of seats available in the airline industry. The blockchain can help to provide information regarding the availability and rate of inventory and to share this information among various stakeholders. In the hospitality industry, blockchain-based solutions can replace proprietary property management systems (PMS) and central reservation systems (CRS) that connect various nodes such
as channel managers and global distribution systems. These new systems can further synchronize the data with sales outlets that face the customer, such as online travel agencies (OTAs), traditional travel agencies, and tour operators, which helps to foster coordination, as discussed below. The complexity of data handling, processing, and transmission often leads to situations in which the inventory owner has to pay a commission or fee to third parties. Blockchain technology can directly link suppliers of inventory with customer-facing sales outlets and thus remove intermediaries (HTNG 2018) and related expenses. Another example is the food service industry in which smart contracts can be used for inventory standing orders (Willie 2019).
Maintenance and Tracking
Supply chain management is seen as one of the main application areas of blockchain technology (Treiblmaier 2018). This includes the tracking and tracing of food in order to certify its origin and handling procedures so as to avoid health and hygiene issues. This is especially important for those sectors of food tourism where the use of organic, local, authentic, or sustainable products creates a competitive edge (Nam et al. 2019). A different example includes the tracking of the status and the location of important assets such as aircraft spare parts along the supply chain, which not only streamlines existing processes but also helps to create more resilient value chains (Goudarzi and Martin 2018). This is especially important in the airline industry since recording an aircraft’s component parts and their respective conditions requires elaborate documentation to comply with mandatory maintenance requirements. In these cases the blockchain can avoid inconsistencies in the lifecycle documentation of an asset, keep the credentials uncorrupted, and verify that all maintenance regulations have been followed (Irvin and Sullivan 2018).
Content, Reservations, and Ticketing
Keeping the customer interface, most often a company’s website, up to date is a major challenge for many tourism organizations. For example, when hotels refurbish, rebrand, or open new facilities, they need to update their content including textual descriptions, photos, and videos. In such cases a blockchain can constitute a central location for storing data (HTNG 2018). The blockchain can also be used for reservations and ticketing and to eliminate black markets. This can be done by creating a standard ticketing protocol that allows buyers to use their wallets to prove ticket ownership. In case the ticket holder passes it on, the original ticket will be cancelled and a new one will be created (Larchet 2017).
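The cancel-and-reissue rule for tickets described above can be sketched in a few lines of Python. This is a simplified, in-memory illustration only: the class and method names are hypothetical, wallets are represented by plain strings, and ownership is checked by string comparison, whereas a real protocol would verify a cryptographic signature from the wallet.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Ticket:
    ticket_id: int
    event: str
    owner: str          # wallet address, represented here as a plain string
    valid: bool = True


class TicketRegistry:
    """In-memory stand-in for a shared ticket ledger (hypothetical design)."""

    def __init__(self):
        self._ids = count(1)
        self.tickets = {}
        self.log = []   # append-only history of all ticket events

    def issue(self, event, owner):
        ticket = Ticket(next(self._ids), event, owner)
        self.tickets[ticket.ticket_id] = ticket
        self.log.append(("issue", ticket.ticket_id, owner))
        return ticket

    def proves_ownership(self, ticket_id, wallet):
        ticket = self.tickets.get(ticket_id)
        return ticket is not None and ticket.valid and ticket.owner == wallet

    def transfer(self, ticket_id, sender, recipient):
        """Cancel the sender's ticket and create a new one for the recipient."""
        if not self.proves_ownership(ticket_id, sender):
            raise PermissionError("sender does not hold a valid ticket")
        old = self.tickets[ticket_id]
        old.valid = False
        self.log.append(("cancel", ticket_id, sender))
        return self.issue(old.event, recipient)


registry = TicketRegistry()
first = registry.issue("concert", "wallet-alice")
resold = registry.transfer(first.ticket_id, "wallet-alice", "wallet-bob")
print(registry.proves_ownership(first.ticket_id, "wallet-alice"))  # False
print(registry.proves_ownership(resold.ticket_id, "wallet-bob"))   # True
```

Because the original ticket is invalidated at the moment of transfer, the same ticket cannot be sold twice, and the log retains the full resale history.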
Similarly, the management of airline tickets, a complicated process which involves numerous business partners, can be facilitated with blockchain-based solutions (Goudarzi and Martin 2018).
Payments and Tax Compliance
The widespread adoption of cryptocurrencies could substantially impact payments in the tourism industry (Önder and Treiblmaier 2018). This also includes the use of tokens for payment purposes. Certain destinations such as small island economies
have already experimented with cryptocurrency payments for residents and tourists to gain a competitive edge. Blockchain networks facilitate cross-border remittances and avoid the problem of foreign currency conversion. In general, the removal of commission fees can help to reduce the operating costs of various market participants in the tourism industry (Kwok and Koh 2018). The current market structure is dominated by intermediaries that charge substantial commissions and fees. Using coins or tokens as payment can make the market less hierarchical and could even help to create efficient reward systems for travelers who provide feedback on online review sites (Nam et al. 2019). One further interesting area is tax compliance, which is complicated by the fact that reservations might be subject to taxation at many levels (city, county, regional, country, and inter-country). In such cases the blockchain can help by enabling tax authorities to publicly post tax structures, while smart contracts can automatically transfer taxes, and the paying entities receive a proof of compliance (HTNG 2018).
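The tax-compliance idea described above (published tax structures, automatic transfer, and a proof of compliance) can be sketched as follows. This is an illustration under stated assumptions rather than an actual smart contract: the tax levels and rates in TAX_SCHEDULE are invented for the example, the function name is hypothetical, and the "proof" is simply a SHA-256 hash of the settlement record.

```python
import hashlib
import json
from decimal import Decimal

# Hypothetical, publicly posted tax structure (rates invented for illustration).
TAX_SCHEDULE = {
    "city": Decimal("0.02"),
    "county": Decimal("0.01"),
    "country": Decimal("0.07"),
}


def settle_reservation(reservation_id, net_amount, schedule):
    """Compute the tax owed at each level and return a hash-based receipt."""
    taxes = {
        level: (net_amount * rate).quantize(Decimal("0.01"))
        for level, rate in schedule.items()
    }
    total = net_amount + sum(taxes.values())
    record = {
        "reservation_id": reservation_id,
        "net": str(net_amount),
        "taxes": {level: str(amount) for level, amount in taxes.items()},
        "total": str(total),
    }
    # The receipt is a fingerprint of the record that the payer can present later.
    receipt = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode("utf-8")
    ).hexdigest()
    return taxes, total, receipt


taxes, total, receipt = settle_reservation("R-1", Decimal("200.00"), TAX_SCHEDULE)
print(taxes["city"], taxes["county"], taxes["country"], total)
# 4.00 2.00 14.00 220.00
```

Decimal arithmetic is used instead of floating point so that monetary amounts are rounded to whole cents in a predictable way.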
Loyalty Programs and Personalized Marketing
Many existing loyalty programs create considerable administrative overhead yet leave the benefits of participation rather unclear to customers. Most of the accumulated points are therefore never redeemed, and customers tend to selectively choose the programs which appeal most to them (Pilkington 2017). Since it is not clear when someone will redeem a point or mile, expenses sit in limbo, accrued but unable to be recognized (Irvin and Sullivan 2018). Using loyalty tokens that can be freely exchanged with others would create a competitive market that provides organizations with feedback on how they are doing. Additionally, loyalty tokens can be used fairly easily across industries (Dogru et al. 2018). Loyalty wallets that track tokens across partners and purchase types enable connectivity and allow for real-time transaction processing and reconciliation, the managed exchange of points, coordination through smart contracts, and a full audit trail (Irvin and Sullivan 2018). Typical loyalty program transactions that can be processed on a blockchain include the transfer of loyalty points between accounts, the exchange of points between loyalty programs, and the bundling of redemption offers across multiple partners (HTNG 2018). Information aggregated over different loyalty programs also offers companies the opportunity for improved personalized marketing. While this might make marketing measures more efficient, it might also raise serious privacy concerns among consumers.
Tokenization and Dedicated Coins
Apart from using tokens for payments or loyalty programs, a broad range of additional use cases exists. Tokens not only ease accounting and reconciliation but also prevent digital items from being double spent (Goudarzi and Martin 2018). Travel tokens that work across suppliers’ ledgers can build the foundation for novel offerings in the tourism industry (Bell and Hollander 2018). 
Other potential applications include the raising of money through the issuance of security tokens, dedicated insurance tokens, or the tokenization of sensitive data by replacing them with unique identification symbols. Another example is a blockchain-enabled
e-commerce platform in which digital coins can be deposited in employees’ digital wallets, which gives them greater freedom in their decision on how to spend them compared to providing them with traditional benefits (Ying et al. 2018).
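The claims that tokens prevent digital items from being double spent and that loyalty points can be transferred with a full audit trail can be illustrated with a small in-memory ledger. The sketch below is a deliberately simplified assumption: a real blockchain enforces these rules through distributed consensus rather than a single Python object, and the names `TokenLedger` and `DoubleSpendError` are hypothetical.

```python
class DoubleSpendError(Exception):
    """Raised when a token is redeemed twice or moved by a non-owner."""


class TokenLedger:
    """In-memory stand-in for a shared ledger of uniquely identified tokens."""

    def __init__(self):
        self._owner = {}     # token_id -> current owner
        self._spent = set()  # token_ids that have already been redeemed
        self.history = []    # append-only audit trail

    def issue(self, token_id, owner):
        if token_id in self._owner:
            raise ValueError(f"token {token_id} already exists")
        self._owner[token_id] = owner
        self.history.append(("issue", token_id, owner, None))

    def _check_spendable(self, token_id, actor):
        # A token can only be moved or redeemed by its current owner,
        # and only as long as it has not been redeemed before.
        if token_id in self._spent:
            raise DoubleSpendError(f"token {token_id} was already redeemed")
        if self._owner.get(token_id) != actor:
            raise DoubleSpendError(f"{actor} does not own token {token_id}")

    def transfer(self, token_id, sender, receiver):
        self._check_spendable(token_id, sender)
        self._owner[token_id] = receiver
        self.history.append(("transfer", token_id, sender, receiver))

    def redeem(self, token_id, owner, merchant):
        self._check_spendable(token_id, owner)
        self._spent.add(token_id)
        self.history.append(("redeem", token_id, owner, merchant))

    def owner_of(self, token_id):
        return self._owner.get(token_id)


if __name__ == "__main__":
    ledger = TokenLedger()
    ledger.issue("T1", "alice")
    ledger.transfer("T1", "alice", "bob")   # alice passes her token on
    ledger.redeem("T1", "bob", "hotel")     # bob spends it once
    try:
        ledger.redeem("T1", "bob", "airline")  # second attempt is rejected
    except DoubleSpendError as err:
        print("rejected:", err)
```

Because every change of ownership is checked against the current state and appended to `history`, the same record serves both as double-spend protection and as the audit trail mentioned for loyalty wallets.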
Identity, Credential Management, and Privacy
Blockchain technology can help to unambiguously determine a person’s identity. In the future, the sharing of identity information across suppliers and even across blockchains could result in a global traveler identity (Bell and Hollander 2018). This would offer a straightforward safeguard against identity theft, which is especially important for tourists, who frequently have to produce their IDs to check into flights or hotel rooms, to pick up rental cars, or even to purchase alcoholic beverages; each instance might reveal information not only to authorized individuals but also to bystanders. IDs that contain cryptographically secured codes can allow for identity verification without producing personal information (Dogru et al. 2018). Easy identity verification would also be beneficial for professionals working in the tourism industry such as airline crews (Goudarzi and Martin 2018). A blockchain-based identity solution can share selected data according to the respectively applicable law. Biometric information (e.g., fingerprints, iris recognition, facial recognition) can be added to meet the requirements of different authorities. Having all such information stored on the blockchain would make work easier for hotels, which would no longer need to report a guest’s arrival to the police or immigration authorities but could instead store the arrival and departure dates on the blockchain (HTNG 2018).
Online customer reviews are of utmost importance in the tourism industry, but their authenticity and reliability are often doubtful. The certification of reviews prior to storing them on the blockchain, which can be done by signing them with a private key, could increase the perceived credibility of online feedback (Pilkington 2017; Treiblmaier and Önder 2019). However, this does not necessarily guarantee the reliability of an individual review and comes with the drawback that unfair or biased reviews cannot be removed once posted (Nam et al. 2019).
The sharing of personal information inevitably raises privacy issues and, more specifically, the question of who has access to sensitive data and under what conditions the data can be accessed. Blockchain technology offers a wide range of potential application scenarios, ranging from comprehensive surveillance to the self-determined sharing of data following the principles of data avoidance and data minimization.
Baggage Tracking
A special case of traceability is the tracking of tourists’ personal belongings. The transparency of blockchain systems facilitates the tracking of the location and status of travelers’ assets by recording every change in custody (Goudarzi and Martin 2018). Additionally, it is possible to provide tourists with up-to-date information about the current location of their belongings on their mobile device (Ludeiro 2019).
Smart Contracts
Smart contracts are computer programs that are automatically executed if certain prespecified conditions occur. These conditions are often confirmed by so-called
oracles, trusted data sources that sense and verify external information and submit it to the blockchain. The term “smart contract” is actually misleading. In their current state, these programs are not “smart” in the sense that they learn to adapt to environmental changes; rather, they are highly deterministic, since their execution across different nodes needs to produce identical results and the various triggering conditions have to be defined as precisely as possible in advance. Neither are they contracts in a legal sense. Nonetheless, programmability and automated execution independent of any human interference offer considerable potential for the tourism industry by reducing the effort required for contract execution, fulfillment monitoring, reconciliation, invoicing, and settlement (Goudarzi and Martin 2018). For example, smart contracts can trigger immediate payment after the recording of a transaction based on the contractual terms, which can facilitate collaboration between hotels and travel agencies. Furthermore, the hotel check-in process can be eliminated by assigning hotel rooms to guests via a digital key on the blockchain once the room rate has been paid. The same principle applies to the rental of apartments, office space, and cars, all of which can even be equipped with locks that can be controlled via a blockchain. Another application scenario is the airline industry, where revenue recognition is becoming increasingly complex since revenues include not only seat fares but also selection fees, baggage costs, and in-flight enhancements. Sharing rules defined by the International Air Transport Association (IATA) together with the airlines and realized with the help of smart contracts might lead to more innovative total revenue drivers and rapid revenue distribution (Irvin and Sullivan 2018). Smart contracts can also facilitate flight insurance by automatically paying out the agreed sum in case of delay or cancellation (Dogru et al. 2018). 
Another example is the food service industry, where smart contracts can be used for inventory procurement standing orders, support services, kitchen maintenance, and the purchase or lease of equipment (Willie 2019).
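The flight insurance example above lends itself to a short sketch of the deterministic, condition-triggered logic that characterizes smart contracts. The Python below is an illustration under stated assumptions rather than code for any particular blockchain platform: `FlightStatus` stands in for the report an oracle would submit, `DelayPolicy` for the contract, and the field names, the delay threshold, and the rule that a policy settles once on the first report for its flight are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlightStatus:
    """Report an oracle might submit; the field names are hypothetical."""
    flight: str
    delay_minutes: int
    cancelled: bool


@dataclass
class DelayPolicy:
    flight: str
    insured: str
    payout: int                   # agreed sum in minor currency units
    delay_threshold_minutes: int  # condition fixed in advance
    settled: bool = False

    def settle(self, report: FlightStatus) -> int:
        """Return the amount paid to the insured for this report (0 if none)."""
        if self.settled:
            return 0  # a policy pays out at most once
        if report.flight != self.flight:
            raise ValueError("report refers to a different flight")
        self.settled = True
        triggered = (
            report.cancelled
            or report.delay_minutes >= self.delay_threshold_minutes
        )
        return self.payout if triggered else 0


if __name__ == "__main__":
    policy = DelayPolicy("XY123", "alice", payout=10_000, delay_threshold_minutes=120)
    print(policy.settle(FlightStatus("XY123", delay_minutes=180, cancelled=False)))
    print(policy.settle(FlightStatus("XY123", delay_minutes=180, cancelled=False)))
```

The function has no randomness and no hidden inputs: given the same policy state and the same oracle report, every node would compute the same payout, which is the sense in which such programs are deterministic rather than “smart.”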
Dapps for Smart Tourism
Dapps are applications that are stored and executed on a decentralized peer-to-peer network rather than on a centralized server. A detailed description of all possible application scenarios is beyond the scope of this chapter, but it takes little imagination to predict that novel business models will emerge as the technology matures. One of the major goals of many Dapps is to provide a user-friendly interface and to exploit the characteristics of the blockchain as I have described above. Although the development of Dapps is in an early stage, numerous projects exist among established organizations and start-ups that implement use cases in areas such as online reviews, planning of trips, direct communication with property owners, reservations and bookings, marketplaces for coins, selling personal information, and personalized marketing. Two academic publications have so far described Dapps according to various criteria. Ozdemir et al. (2019) present an in-depth assessment of four Dapps using the following evaluation criteria: governance, platform, consensus mechanism, use of cryptocurrency, smart contracts, and tokens. Nam et al. (2019) briefly describe and compare 13 Dapps and predict for the
future of the tourism industry a reduction of costs, an increasing adoption of cryptocurrencies, and the development of all-encompassing ecosystems.
Disintermediation
Disintermediation caused by blockchain technology arises through the introduction of a decentralized layer of nodes which removes rent-seeking middlemen. In tourism, the two main types of intermediaries are Global Distribution Systems (GDS), which enable transactions between service providers (e.g., hotels, airlines, travel agencies, car rentals), and Online Travel Agencies (OTA), both of which charge substantial fees (Nam et al. 2019) and can potentially be substituted by systems that allow for peer-to-peer communication and transactions (Önder and Treiblmaier 2018). The underlying driver for this development is a shift in participants’ confidence, from trust in organizations and trust that emerges from personal relationships to trust in the platform and, if applicable, its provider(s) (Calvaresi et al. 2019). The total cost of trust for the US economy, based on an activity assessment of various occupations, was estimated to be as high as 35 per cent of total employment (Davidson et al. 2018), which illustrates the economic potential of a technology that strives to create “trustless systems.” Blockchain solutions in the tourism industry can reduce the need for intermediaries that provide information regarding rates and availabilities or enable bookings and payments. These intermediaries are currently used for transactional convenience, but more direct interaction between hoteliers and their customers can lead to lower-cost distribution channels and more competitive rates (HTNG 2018). It remains to be seen how the current services offered by intermediaries, such as an easy comparison of offerings, the provision of online reviews, and a central contact point, can be fulfilled by blockchain-based applications. The introduction of blockchain-based platforms might even lead to new intermediaries who (partially) take market activities over from incumbents. 
Coordination and Coopetition
Blockchain-based structural changes can go far beyond disintermediation and the emergence of new intermediaries. Existing relationships between organizations might change fundamentally, and new forms of coordination and coopetition (i.e., the simultaneous occurrence of competition and cooperation between businesses) might emerge. Examples include the sharing of revenues and the facilitation of B2B settlement. Through blockchain-based coordination, the distribution reach of all involved parties can be expanded, and the aggregation of travel products and services can be achieved more efficiently (Goudarzi and Martin 2018). Coordination of trips across suppliers can also help to reduce inefficiencies, for instance when a traveler fails to check in for a flight and the hotel as well as the rental car supplier can immediately release inventory for sale to others (HTNG 2018). Ying et al. (2018) present a case study of the implementation of a blockchain-enabled e-commerce platform and illustrate how it has been developed in three phases. Initially, digital coins were offered for carrying out transactions, followed by an invitation to suppliers to join the platform and, finally, an integration with third-party e-commerce platforms to increase the product offering. This example shows that blockchain technology need not completely replace existing structures, but can also be integrated into existing business models.
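The coordination example above, in which a missed flight check-in lets the hotel and the rental car supplier release their inventory, can be sketched as a shared trip record. The following Python is a minimal illustration under stated assumptions: `TripLedger` and its methods are hypothetical names, and a plain in-memory object stands in for a ledger shared across suppliers.

```python
from collections import defaultdict


class TripLedger:
    """Stand-in for a record of trip components shared across suppliers."""

    def __init__(self):
        self._bookings = defaultdict(list)  # traveler -> [(supplier, item), ...]
        self.released = []                  # inventory returned for resale

    def book(self, traveler, supplier, item):
        self._bookings[traveler].append((supplier, item))

    def report_no_show(self, traveler):
        """Called when the airline records a missed flight check-in.

        All dependent bookings of the traveler are released at once,
        so every supplier sees the change without bilateral messaging.
        """
        dependent = self._bookings.pop(traveler, [])
        self.released.extend(dependent)
        return dependent


if __name__ == "__main__":
    trips = TripLedger()
    trips.book("alice", "hotel", "room-12")
    trips.book("alice", "car_rental", "compact-7")
    print(trips.report_no_show("alice"))
```

The point of the sketch is the single shared state: the airline writes one event, and the hotel and the rental car supplier read the consequence from the same record instead of reconciling separate systems.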
Blockchain and Tourism: An Academic Perspective
Academic research is characterized by its rigorous design and methodology, which often includes the use of theory and the expansion of previous research. Given the comprehensive nature of the blockchain and its expected wide-ranging impact, numerous academic theories are potentially applicable. Without claiming to be exhaustive, I present in Table 3 examples of theories that can be used to investigate antecedents and patterns of blockchain adoption as well as theories that analyze structural and managerial changes from a new institutional economics perspective. The table also contains references to seminal literature as well as early publications that adapt those theories to blockchain-induced changes. The first group includes the “classic” theories of reasoned action and planned behavior, focusing on behavioral antecedents such as attitudes toward the behavior, subjective norm, and perceived control (Ajzen 1991; Fishbein 1967), the parsimonious technology acceptance model with ease of use and usefulness as the main predictors, and the unified theory of acceptance and use of technology (UTAUT), which adds performance and effort expectancy as well as social influence, facilitating conditions, and several moderating variables (Venkatesh et al. 2016). Numerous modifications of these models exist that can potentially be applied. The final example in this group is the technology-organization-environment framework, which describes how decision-making regarding innovations depends on the specific technological, organizational, and environmental contexts (DePietro et al. 1990). This framework has previously been applied to blockchain adoption by Clohessy et al. (2019).
Table 3 Theories for blockchain adoption and blockchain-induced change
Theories | Source
Technology Adoption (Theory of Reasoned Action, Theory of Planned Behavior, Technology Acceptance Model, Unified Theory of Acceptance and Use of Technology, Technology-Organization-Environment Framework) | Ajzen (1991), Clohessy et al. (2019), DePietro et al. (1990), Fishbein (1967), and Venkatesh et al. (2016)
Agency Theory | Jensen and Meckling (1976) and Treiblmaier (2018)
Transaction Cost Theory | Berg et al. (2019), Coase (1937), Treiblmaier (2018), and Williamson (1981)
Resource based view of the firm | Treiblmaier (2018) and Wernerfelt (1984)
Actor network theory | Johanson and Mattsson (1987) and Treiblmaier (2018)
The second group of theories is centered around two main questions based on new institutional economics (Halldorssón et al. 2007; Treiblmaier 2018): (1) How to structure a blockchain-based value chain? and (2) What is needed to manage such a structure? A more fine-grained analysis of the central blockchain-related categories of each theory, based on an analysis of qualitative interviews with managers from DMOs, can be found in Treiblmaier (2019), and a general description of the economic impact of blockchain technology from an institutional economics perspective in Berg et al. (2019). Agency theory is derived from the principal-agent problem, which states that the principal bears substantial costs to monitor, supervise, and control the agent to make sure that the latter acts in the best interest of the former (Jensen and Meckling 1976). Blockchain technology can significantly change this relationship through increased transparency and the automation of business processes (Treiblmaier 2018). Transaction cost theory (transaction cost economics) deals with the question of the conditions under which economic activities occur within a market or within a firm. Transaction costs include costs of initiation (search and information), agreement, control, and adjustment (Coase 1937; Williamson 1981). Given that blockchain technology opens up new ways of ensuring data quality and facilitates the sharing and auditing of information, it impacts internal as well as external transaction costs, facilitates new market structures, and potentially triggers changes in the size of organizations (Treiblmaier 2018). The resource based view of the firm postulates that it is a subset of a company’s resources that creates competitive advantage and an even smaller subset that ensures superior performance in the long term (Wernerfelt 1984). 
From this perspective the blockchain represents an asset that can be used to further develop an organization’s core competencies as well as to create new market offerings, and companies need to carefully investigate how to best exploit its features (Treiblmaier 2018). Finally, actor network theory, or simply network theory, focuses on interorganizational relations, expressed by personal relationships between representatives of different organizations and the trust structures that emerge (Johanson and Mattsson 1987). Since blockchain technology has the potential to alter trust by creating immutable, traceable, and auditable records, the question arises as to how the nature and quality of business relationships will change and how this might alter market structures (Treiblmaier 2018).
Expected Future Developments
Every investigation of the potential impact of blockchain technology needs to take into account three important aspects. First, the blockchain is just a collective term for a number of technologies. More specifically, numerous blockchain protocols exist, along with even more implementations of them. Public and permissionless blockchains provide different solutions and face different challenges than private and permissioned ones. Second, blockchain technology or, more specifically, its main building blocks are in a state of constant development. On a low level, this pertains to the underlying protocols that are constantly being refined, while on a higher
level, this includes new applications that build on these foundations. For example, new consensus mechanisms are under development in order to create alternatives to the energy-intensive PoW algorithm frequently used in public and permissionless networks. Additionally, new scalability solutions are currently being tested in order to speed up the complex process of adding transactions. Many of these problems pertain especially to public and permissionless blockchains rather than to systems in which the participants know each other, as is the case in private blockchains. Third, numerous contingency factors exist that complicate the prediction of future developments. Most importantly, blockchain is a highly disputed technology. The potential to disrupt powerful intermediaries and especially to curtail state power, for example, by creating monetary systems that operate independently of governments and central banks, will probably lead to legal repercussions whose implications are hard to predict. Additionally, similar to the Internet, the characteristics of blockchain technology may lead to apparently contradictory outcomes. For example, it has been praised as a means of protecting individuals’ privacy by allowing for pseudonymity (or anonymity) in online communication and transactions but could also be used to exert tighter control through immutable data storage and easy auditability. States have only recently begun to react to developments in this area by introducing legislation, and a consistent cross-national strategy is lacking. Smaller states might want to attract business by introducing blockchain-friendly laws, while bigger ones might feel threatened by the lack of control over borderless transactions. On an economic level, the further development of markets might be just as hard to forecast. 
Similar to the Internet, which was once predicted to become a major force of disintermediation but which actually led to the emergence of new and powerful intermediaries, blockchain technologies can be used for numerous, sometimes even conflicting, purposes. The combination of blockchain with other technologies such as the Internet of Things (IoT) and artificial intelligence (AI) might yield new applications that create synergies and benefit each other. For example, in the hospitality industry, AI in combination with sensor technology can be used to administer buildings, and the data can be stored on a blockchain. Potential use cases include controlling air conditioning systems, monitoring room occupancy, or estimating the number of meals that need to be prepared. However, this might also lead to systems that are hard to control and to seamless monitoring, which might be acceptable for things but raises serious privacy issues where humans are involved.
As I have outlined in the chapter, severe implications of blockchain technology are expected for individuals, organizations, markets, and economies, and the tourism industry is no exception. My recommendation for organizations is to carefully scrutinize their existing business models and to identify the potential of the blockchain (in combination with other technologies) to foster organizational effectiveness and efficiency, but also to pinpoint those weak spots that could become targets for disruptive start-ups. Academic researchers also face a major challenge. So far, only a few papers have been published in leading journals that rigorously investigate the potential impact of the blockchain. Given the importance of the topic, academia is clearly lagging behind, which can be partly attributed to lengthy publication cycles.
Nonetheless, research is needed that guides the industry and explores, explains, and predicts technological evolution in this area. Most importantly, rigorous case studies and action research can help to better understand how the technology can be exploited without exploiting humans. I therefore believe that the combination of blockchain and tourism is a fruitful research area for the coming years and an exciting opportunity for academia to make a major contribution through creating understanding, raising awareness, and educating companies, governments, and the general public.
Cross-References
Advanced Web Technologies and E-Tourism Web Applications
Artificial Intelligence and Machine Learning
Consumer Behavior in e-Tourism
Drivers of e-Tourism
e-Tourism: An Informatics Perspective
Experimental Research in E-Tourism: A Critical Review
Internet of Things and Ubiquitous Computing in the Tourism Domain
References
Ajzen I (1991) The theory of planned behavior. Organ Behav Hum Decis Process 50(2):179–211. https://doi.org/10.1016/0749-5978(91)90020-T
Bell A, Hollander D (2018) Blockchain and distributed ledger technology at travelport, pp 1–12 [A travelport white paper]. Retrieved from https://www.travelport.com/sites/default/files/travelport-blockchain-whitepaper.pdf
Berg C, Davidson S, Potts J (2019) Understanding the blockchain economy: an introduction to institutional cryptoeconomics. Retrieved from https://www.e-elgar.com/shop/understanding-the-blockchain-economy
Buhalis D (2002) eTourism: information technology for strategic tourism management. International edition. Financial Times/Prentice Hall, Harlow
Calvaresi D, Leis M, Dubovitskaya A, Schegg R, Schumacher M (2019) Trust in tourism via blockchain technology: results from a systematic review. In: Pesonen J, Neidhardt J (eds) Information and communication technologies in tourism 2019. Springer International Publishing, Nicosia, pp 304–317
Clohessy T, Acton T, Rogers N (2019) Blockchain adoption: technological, organisational and environmental considerations. In: Treiblmaier H, Beck R (eds) Business transformation through blockchain, vol I. Palgrave Macmillan, Cham, pp 47–76. https://doi.org/10.1007/978-3-319-98911-2
Coase RH (1937) The nature of the firm. Economica 4(16):386–405
Davidson S, Novak M, Potts J (2018) The cost of trust: a pilot study. J Br Blockchain Assoc 1(2):1–7. https://doi.org/10.31585/jbba-1-2-(5)2018
DePietro R, Wiarda E, Fleischer M (1990) The context for change: organization, technology and environment. In: Tornatzky LG, Fleischer M (eds) The processes of technological innovation. Lexington Books, Lexington, pp 151–175
Dogru T, Mody M, Leonardi C (2018, Winter) Blockchain technology & its implications for the hospitality industry. Boston Hospitality Review. Retrieved from https://www.bu.edu/bhr/2018/02/13/blockchain-technology-its-implications-for-the-hospitality-industry/
Fishbein M (1967) Attitude and the prediction of behavior. In: Fishbein M (ed) Readings in attitude theory and measurement. Wiley, New York, pp 477–492
Goudarzi H, Martin JI (2018) Blockchain in aviation. Retrieved from International Air Transport Association website: https://www.iata.org/contentassets/2d997082f3c84c7cba001f506edd2c2e/blockchain-in-aviation-white-paper.pdf
Gretzel U, Sigala M, Xiang Z, Koo C (2015) Smart tourism: foundations and developments. Electron Mark 25(3):179–188. https://doi.org/10.1007/s12525-015-0196-8
Halldorssón A, Kotzab H, Mikkola JH, Skøjtt-Larsen T (2007) Complementary theories to supply chain management. Supply Chain Manag 12(4):284–296
HTNG (2018) Blockchain for hospitality. Retrieved from Hospitality Technology Next Generation website: https://www.hospitalitynet.org/file/152008497.pdf
Irvin C, Sullivan J (2018) Using blockchain to streamline airline finance, pp 1–6. Retrieved from Deloitte Development LLC website: https://www2.deloitte.com/us/en/pages/consulting/articles/airlines-blockchain-finance.html
Jensen MC, Meckling WH (1976) Theory of the firm: managerial behavior, agency costs and ownership structure. J Financ Econ 3(4):305–360
Johanson J, Mattsson L-G (1987) Interorganizational relations in industrial systems: a network approach compared with the transaction-cost approach. Int Stud Manag Organiz 17(1):34–48. https://doi.org/10.1080/00208825.1987.11656444
Kwok AOJ, Koh SGM (2018) Is blockchain technology a watershed for tourism development? Current Issues in Tourism. Retrieved from https://www.tandfonline.com/doi/abs/10.1080/13683500.2018.1513460
Larchet V (2017) Blockchain: solution for the black market threat to the tourism industry, pp 1–14. Retrieved from SecuTix website: https://www.secutix.com/wp-content/uploads/2017/07/White-paper_Blockchain_final.pdf
Ludeiro AR (2019) Blockchain technology for luggage tracking. In: Rodríguez S, Prieto J, Faria P, Kłos S, Fernández A, Mazuelas S, . . . Navarro EM (eds) Distributed computing and artificial intelligence, Special sessions, 15th International Conference. Springer International Publishing, pp 451–456
Meiklejohn S, Pomarole M, Jordan G, Levchenko K, McCoy D, Voelker GM, Savage S (2016) A fistful of bitcoins: characterizing payments among men with no names. Commun ACM 59(4):86–93. https://doi.org/10.1145/2896384
Nakamoto S (2008) Bitcoin: a peer-to-peer electronic cash system, pp 1–9. Retrieved from https://bitcoin.org/bitcoin.pdf. Accessed Feb 10, 2017
Nam K, Dutt CS, Chathoth P, Khan MS (2019) Blockchain technology for smart city and smart tourism: latest trends and challenges. Asia Pac J Tour Res 1–15. https://doi.org/10.1080/10941665.2019.1585376
Narayanan A, Clark J (2017) Bitcoin’s academic pedigree. Commun ACM 60(12):36–45. https://doi.org/10.1145/3132259
Önder I, Treiblmaier H (2018) Blockchain and tourism: three research propositions. Ann Tour Res 72(C):180–182
Orcutt M (2018) How secure is a blockchain, really? MIT Technol Rev 2018:40–41
Ozdemir AI, Ar IM, Erol I (2019) Assessment of blockchain applications in travel and tourism industry. Qual Quant 1–15. https://doi.org/10.1007/s11135-019-00901-w
Pilkington M (2017) Can blockchain technology help promote new tourism destinations? The example of medical tourism in Moldova. SSRN Scholarly Paper No. ID 2984479. Retrieved from Social Science Research Network website: https://papers.ssrn.com/abstract=2984479
Potts J (2018) Innovation economics: blockchain uses in higher education. Presented at the Finance in Higher Education Conference, Sydney, Australia
Treiblmaier H (2018) The impact of the blockchain on the supply chain: a theory-based research framework and a call for action. Supply Chain Manag Int J 23(6):545–559
Treiblmaier H (2019) Toward more rigorous blockchain research: recommendations for writing blockchain case studies. Front Blockchain 2(3):1–15. https://doi.org/10.3389/fbloc.2019.00003
Treiblmaier H, Önder I (2019) The impact of blockchain on the tourism industry: a theory-based research framework. In: Treiblmaier H, Beck R (eds) Business transformation through blockchain, vol II. Palgrave Macmillan, Cham, pp 3–21
Treiblmaier H, Beck R (2019a) Business transformation through blockchain, vol I. Horst Treiblmaier/Palgrave Macmillan, Cham
Treiblmaier H, Beck R (2019b) Business transformation through blockchain, vol II. Horst Treiblmaier/Palgrave Macmillan, Cham
UNWTO (2015) Understanding tourism: basic glossary. Retrieved from https://webunwto.s3-eu-west-1.amazonaws.com/2019-08/glossary_EN.pdf
Venkatesh V, Thong JYL, Xu X (2016) Unified theory of acceptance and use of technology: a synthesis and the road ahead. J Assoc Inf Syst 17(5):328–376
Wüst K, Gervais A (2018) Do you need a blockchain? In: 2018 Crypto valley conference on blockchain technology (CVCBT), pp 45–54. https://doi.org/10.1109/CVCBT.2018.00011
Wernerfelt B (1984) A resource-based view of the firm. Strateg Manag J 5(2):171–180. Retrieved from JSTOR
Werthner H, Klein S (1999) Information technology and tourism: a challenging relationship. Springer, Wien/New York
Williamson OE (1981) The economics of organization: the transaction cost approach. Am J Sociol 87(3):548–577
Willie P (2019) Can all sectors of the hospitality and tourism industry be influenced by the innovation of blockchain technology? Worldwide Hosp Tour Themes. https://doi.org/10.1108/WHATT-11-2018-0077
Ying W, Jia S, Du W (2018) Digital enablement of blockchain: evidence from HNA group. Int J Inf Manag 39:1–4. https://doi.org/10.1016/j.ijinfomgt.2017.10.004
Business Intelligence in Tourism
21
Wolfram Höpken and Matthias Fuchs
Contents
Introduction
A Business Intelligence Architecture
Data Sources
Data Extraction, Transformation, and Loading (ETL)
  Data Extraction
  Data Transformation
  Load
  Chronology of the ETL Process
  Update Strategies
  ETL Tools
Data Warehousing
  Traditional Data Warehouse Approaches: Normalized vs. Dimensional Modeling
  Data Lakes
Data Analysis and Visualization
  Reporting
  OLAP
  Data Mining
  Structuring a Tourism MIS
Case Study: DMIS Halland
  Data Sources and Extraction
  Data Warehouse
  Data Mining
  Reporting and Dashboards
  Technical Architecture
W. Höpken
Institute for Digital Transformation, Ravensburg-Weingarten University of Applied Sciences, Weingarten, Germany

M. Fuchs
Department of Economics, Geography, Law and Tourism, The European Tourism Research Institute, Mid-Sweden University, Östersund, Jämtland, Sweden

© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_3
W. Höpken and M. Fuchs
Expected Future Developments
  Big Data
  Operational Business Intelligence
  Real-Time Business Intelligence
Cross-References
References
Abstract

Business intelligence encompasses all activities dealing with collecting, storing and managing, and analyzing business-relevant data with the objective of generating knowledge as input to decision support. The term is often used as an umbrella term for data warehousing, reporting and OLAP (online analytical processing), MIS/DSS (management information systems/decision support systems), and data mining. Taking all of these topics together, business intelligence has quite a long history in the tourism domain as well. Early examples in tourism are the DINAMO system, introduced by American Airlines in 1988, and TourMIS, introduced in 1998. The widespread use of ICT, especially the uptake of the Internet and social media, has led to an increase in available data on customers, competitors, and the whole market in all major business domains, including tourism. More powerful hardware and more sophisticated methods to store and analyze such data have turned business intelligence into one of the fastest-growing technologies and most challenging areas of the last decade. This chapter gives an overview of business intelligence and of all technical components of a BI architecture (i.e., information extraction and transformation, data warehousing, and different mechanisms and tools to access and analyze data, like reporting or OLAP tools, dashboards, or data mining toolsets). Moreover, the chapter looks at the history of BI in tourism and presents and discusses typical application scenarios in tourism. Finally, it looks at current trends and the latest developments in business intelligence and their expected implications for the tourism domain.
Keywords

Business intelligence · Data warehousing · Data analysis · Online analytical processing · Management information systems · Decision support systems
Introduction

Business intelligence (BI) encompasses all activities, IT applications, and technologies to collect, analyze, and visualize business-relevant data in order to support operative and strategic decision-making (Kimball and Ross 2016). BI is nowadays typically used as an umbrella term for the domains of data extraction, data
21 Business Intelligence in Tourism
warehousing (DW), and different kinds of data analyses, like reporting, OLAP (online analytical processing), and data mining (Williams 2016). The first appearance of the term business intelligence can be traced back to Hans Peter Luhn, who used it to describe “the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal” (Luhn 1958, p. 314). Later, in 1989, Howard J. Dresner broadened the scope of BI toward an umbrella term, which encompasses “concepts and methods to improve business decision making by using fact-based support systems” (Power 2007, p. 128).

BI, as it is currently understood, offers historical, current, and predictive or even prescriptive views on business processes (Kimball et al. 2008). Typical functions include reporting, direct access and OLAP, data mining, and specific analytic applications (Rud 2009; Williams 2016). Enterprises use BI to support a wide range of operational business decisions, such as product positioning or yield management, as well as to gain strategic insights into new markets, to assess customer demand and the suitability of products and services developed for different market segments, and to evaluate the impact of marketing and advertising strategies (Chugh and Grandhi 2013).

Three factors mainly explain the growing importance of BI: the explosive growth of data collection in most everyday situations (e.g., shopping/payment, travelling), especially due to the Internet and the WWW; the tremendous growth in computing power and storage capacity; and, from a business perspective, increasing competitive pressure due to globalization. Accordingly, the global big data and business analytics market increased from 122 billion US dollars in 2014 to 168.8 billion US dollars in 2018 and is forecast to expand to 274.3 billion US dollars by 2022.
In the tourism domain as well, the amount of available data on, e.g., customers, products, markets, and competitors is increasing dramatically. The click or navigation behavior of customers on tourism websites, travelers’ booking behavior and customer profiles within computer reservation systems (CRS), and information on tourism offers and suppliers within property management systems (PMS) and destination management systems (DMS) are typical sources of big data in the e-Tourism area. In fact, methods of business intelligence (BI) have been applied in tourism since the early stages of ICT adoption. They offer up-to-date and strategically relevant information on tourists’ travel motives, channel use, booking behavior, service experience, and added value per guest segment (Min and Emam 2002; Ritchie and Ritchie 2002; Sambamurthy and Subramani 2005; Wong et al. 2006; Höpken et al. 2014; Fuchs et al. 2014). Modern IT systems in tourism (e.g., GDS/CRS, PMS, DMS) typically offer built-in BI and, in particular, reporting functionalities. Additionally, more sophisticated approaches from the area of knowledge extraction and data mining are increasingly used to support decision-making processes in tourism.

Airline companies, for example, started to analyze transactional data as input to product optimization and yield management. Accordingly, prominent BI application areas in the airline industry are demand forecasting (Subramanian et al. 1999),
as well as the prediction of cancellation behavior and no-shows (Garrow and Koppelman 2004). The DINAMO system, introduced by American Airlines in 1988, is among the historically first examples of an airline revenue and yield management system (Smith et al. 1992), building on American Airlines’ GDS SABRE as the data source. The system was employed for forecasting cancellations, no-shows, or tourism demand, based on exponential smoothing of time series as well as decision trees and clustering approaches.

Early BI applications can also be found in the area of tourism destinations and the hospitality industry. The Austrian TourMIS (tourism marketing information system; Wöber 1998) offers market research and decision support for tourism stakeholders, like destinations, hotels, attraction providers, etc. TourMIS collects data on tourism arrivals, overnight stays, and visits at tourism attractions directly from the tourism stakeholders and offers descriptive analyses of various performance indicators as well as trend analysis and forecasting methods to identify seasonal or long-term trends or to predict tourism demand.

The Austrian MIS MANOVA WEBMARK (www.webmark.eu) supports tourism destinations and related stakeholders by analyzing guest satisfaction and various performance indicators based on descriptive and strategic analyses (like importance/performance or SWOT analyses). Data collection is based on manual data gathering or online surveys.

DestiMetrics (www.destimetrics.com) supports performance analyses and decision-making for tourism destinations and accommodation providers in the United States and Canada. Accommodation reservation data are imported from PMS companies on a monthly basis, enabling descriptive analyses of performance indicators, like occupancy rate, average daily room rate, or revenue per available room (RevPAR).
T-Stats (www.t-stats.co.uk), an MIS for tourism destinations, mainly offers descriptive analyses of performance indicators for accommodations and attractions, general tourism statistics, customer satisfaction, and web traffic. Tourism stakeholders gather the data mainly manually on a monthly or daily basis.

Geztio (www.gezt.io), a Swedish destination management information system, offers descriptive analyses in the form of reporting and interactive dashboards as well as knowledge discovery, based on various data mining techniques, as direct input to decision support. Geztio automatically collects data from a multitude of different data sources, e.g., booking and arrival data, web navigation data, survey data, or customer feedback on Facebook, and enables analyses across data sources based on a homogeneous data warehouse model.

To conclude, all typical BI techniques to visualize and analyze data are used in tourism nowadays. Descriptive analyses are commonly offered in the form of reporting/dashboards or OLAP (Höpken et al. 2015; Keil et al. 2017). Data mining (i.e., machine learning) techniques are applied in both a supervised and an unsupervised manner. Supervised learning (e.g., classification, estimation, and prediction) is used to explain tourists’ consumption behavior (Morales and Wang 2008), to predict tourism demand (Vlahogianni and Karlaftis 2010; Law et al. 2019; Höpken et al. 2019), or to analyze customer feedback (Kasper and Vela 2011; Gräbner et al. 2012;
Schmunk et al. 2014; Menner et al. 2016). Unsupervised learning (i.e., clustering and association rule analysis) is used for customer segmentation (Bloom 2004), product recommendation (Pitman et al. 2010; Zhu et al. 2017), or click-stream analysis (Jiang and Gruenwald 2006).

Finally, when looking at thematic awareness within the tourism research community, proxied by the number of related research articles listed on Scopus or Web of Science in the past 5 years, we can observe that, on the one hand, the topic of BI (including big data) is gaining more and more attention. On the other hand, with 77 papers on BI and 96 on big data, the overall number is still limited, and, interestingly, the majority of papers are not even published in tourism-specific journals, which limits their visibility for the tourism research community (Mariani et al. 2018).
A Business Intelligence Architecture

When looking at a business intelligence architecture on a conceptual level, we can distinguish between a knowledge generation layer and a knowledge application layer. The former extracts and collects relevant information from different customer- and supplier-based data sources and generates new knowledge by data analyses and data mining. The latter uses the collected and generated knowledge as input to offer intelligent services and decision support for both customers and suppliers (Fig. 1; Höpken et al. 2015).

The customer-based knowledge generation layer provides content, e.g., in the form of tourists’ feedback (generated by guest surveys, review platforms, etc.), information traces (e.g., generated by booking systems, online platforms, etc.; Pitman et al. 2010), or mobility behavior (e.g., generated by mobile applications or customer cards; Zanker et al. 2010). On the supplier side, information on, e.g., customers, products, and markets (i.e., competitors, cooperation partners) is typically available in the form of CRM or product databases or is extracted from websites or online platforms (e.g., competitor websites, DMS, etc.) (Ritchie and Ritchie 2002; Gretzel and Fesenmaier 2004; Pyo 2005).

The customer-oriented knowledge application layer provides intelligent and knowledge-based services, like recommendation services (recommending products and services based on customer preferences as well as real-time information on product and service availability), or adaptive community or location-based services. On the supplier side, knowledge-based applications mainly fall into the category of management information and decision support systems, offering reporting (Explorative Data Analyses [EDA], OLAP, etc.) and data mining or predictive analytics functionalities (Cho and Leung 2002; Olmeda and Sheldon 2002; Fuchs and Höpken 2009).

Fig. 1 A conceptual business intelligence architecture (Höpken et al. 2015). Knowledge application layer: customer-oriented knowledge application (recommendation services, community services, location-based services) and supplier-oriented knowledge application (de-centralized access to competitive knowledge bases: EDA, OLAP, DM). Knowledge generation layer: customer-based knowledge generation (tourists’ feedback, information traces, mobility behavior) and supplier-based knowledge generation (customer profiles, products, processes, competitors and cooperation partners).

Figure 2 shows a business intelligence architecture from a technical perspective. The knowledge generation layer consists of:
• Structured and unstructured data sources (e.g., reservation and booking data, web navigation data, customer feedback, product availability, price information, etc.)
• The ETL (extraction, transformation, load) process, which extracts relevant data from different data sources, transforms source data into a homogeneous data format appropriate for further analyses, and loads the data into the data warehouse
• The data warehouse (DW) as central and homogeneous data storage, combining data from different business processes and data sources and enabling analyses that span business processes and data sources
• Methods of data mining and knowledge generation to gain relevant knowledge as input to adaptive and intelligent services as well as decision support

The knowledge application layer mainly provides reporting and OLAP functionalities, thus enabling both access to the data and knowledge stored in the central data warehouse and interactive data visualization and explorative/descriptive data analyses. Following the trend of operative BI, operative applications make use of the data and knowledge stored in the central data warehouse as well, in order to offer intelligent and adaptive services (e.g., recommender systems, yield management systems, etc.).

Fig. 2 Technical BI architecture (Höpken et al. 2015). Knowledge application layer: operative application; reporting & OLAP. Knowledge generation layer: data mining & knowledge generation; data warehouse; data extraction (ETL); structured data; unstructured data.
Data Sources

The aim of any kind of business intelligence activity is to analyze the performance of business processes or single business transactions and to identify and understand factors or circumstances influencing the organization’s performance. The data necessary to analyze the performance of business transactions are generated either by the business transaction itself (e.g., booking records) or by a separate measurement process (e.g., web application tracking). Factors or circumstances influencing the performance of business transactions either stem from preceding business transactions within the overall business process (e.g., positive customer feedback driving information search or bookings) or constitute exogenous factors (e.g., weather or economic indicators like the GDP of sending countries).

Table 1 shows the most relevant tourism business transactions and the corresponding data, generated by the business transaction, as input to business transaction performance analyses. In addition to the data sources listed in Table 1, the following exogenous factors, influencing or explaining the performance of business transactions, can be identified:
• Econometric indicators of the supplier environment (e.g., GDP, investment behavior, etc.) (affected business transactions: offer generation)
• Econometric indicators of sending countries (e.g., GDP, unemployment rate, spending power) (affected business transactions: all)
• Environmental factors (e.g., weather/climate, traffic, pollution, etc.) (affected business transactions: all except information provision and feedback)
• Events (e.g., mega events, catastrophes, diseases, etc.) (affected business transactions: all)
Table 1 Tourism business transactions and generated data

Business transaction: Offer (generation). Generation of adaptive and personalized offers of tourism products and services
Generated data:
• Offer description in supplier IT systems (e.g., ERP system, CRS/PMS) or online supplier listings (e.g., yellow pages or registries like UDDI)
• Official product/offer statistics

Business transaction: Marketing. Offline marketing via traditional media as well as online marketing activities via websites, display advertising, search engine marketing (SEM), content marketing (e.g., UGC marketing), social media marketing (SMM), or email marketing
Generated data:
• Website content
• Advertising/placement statistics in marketing networks, search engines, social media
• Posts, comments in social media
• (email) Marketing messages
• Offline marketing statistics

Business transaction: Information (consumption) and search. Information search via search engines, travel websites (from DMOs, suppliers, intermediaries, etc.), mobile apps/guides (e.g., Google Maps, mobile city guides, etc.), social media sites (e.g., TripAdvisor, YouTube, Facebook, Twitter, etc.), as well as information requests via email, messaging services (e.g., WhatsApp), or social networks (e.g., Facebook, Twitter, etc.)
Generated data:
• Search engine traffic and search terms (e.g., Google Trends)
• Web navigation behavior/traffic (page views and search terms)
• (geo-referenced) Mobile interactions
• Social media traffic (e.g., downloads, impressions, followers, likes, etc.)
• Social media interactions: post, comment, etc.
• Information request/message

Business transaction: Reservation and booking. Reservation or booking of tourism services via CRS/GDS, PMS, or any other kind of booking system, online platform and Internet booking engine, or social media platform (e.g., Facebook)
Generated data:
• Booking records including customer information (e.g., PNRs)
• Web navigation behavior/traffic
• Social media navigation behavior
• Official statistics with booking information

Business transaction: Consumption. Arrival or stay at a destination or consumption of accommodation products, transportation services, or local tourism products like attractions, activities, events, food and beverage, etc.
Generated data:
• Arrival/overnight figures via accommodation providers’ CRS/PMS, official statistics, mobile apps/guides usage, or consumption of other local tourism products (e.g., paid by credit card, etc.)
• Overnight figures via accommodation providers’ CRS/PMS or official statistics
• Transportation figures by suppliers’ CRS/GDS, ticket system, etc. or official statistics (by IATA, etc.), including traffic, delays, no-shows, etc.
• Consumption figures via ticket offices/sales, supplier systems (e.g., cash systems), credit card payment or online/mobile payment, (mobile) customer cards or loyalty programs, location tracking (by mobile phones, GPS tracking, beacons, etc.), sensors and cameras, or official statistics

Business transaction: Information provision and feedback. Provision of travel information and feedback via online review sites (e.g., TripAdvisor), supplier-specific online feedback or survey systems, or social media platforms (e.g., Facebook, Flickr, Google Maps, etc.)
Generated data:
• Product reviews from online platforms, typically including basic demographic customer information
• Posts, comments, photo uploads, check-in/checkout, etc., typically including basic demographic customer information and often geo-referenced
Data Extraction, Transformation, and Loading (ETL)

ETL represents the process of data extraction, transformation, and loading (Kimball et al. 2008). In the first step, relevant data are extracted from different data sources (e.g., operational databases, CRM systems, webserver log files, etc.). The second step transforms these data into a data format appropriate for visualization or data mining activities (separated into the single tasks of cleaning, migration, and record linkage). In the third and final step, the data are loaded into a database (typically a data warehouse). Figure 3 shows the overall process flow and the single steps of the ETL process, which are described in more detail in the following subsections.
Data Extraction

The first step of the ETL process deals with extracting relevant data from different data sources. The most important requirement for data extraction is the support of all possible types of data sources and data formats, which can be differentiated into structured and unstructured data. Structured data come in different formats, depending on the data origin and source system. Typical data formats are text files (the most common data export format, e.g., for official statistics or Google Trends data); databases (provided by, e.g., ERP systems, CRS/GDS, or PMS); application-specific formats (e.g., SPSS files, MS Excel files, etc.); XML files (used, e.g., by data interchange standards like OpenTravel); and JSON files (used, e.g., as output format of web APIs).
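As a minimal sketch of how structured exports in different formats can be brought into one record structure, the following Python example reads a text (CSV), a JSON, and an XML export using only the standard library. The sample data, field names (`booking_id`, `arrival`, `guests`), and function names are hypothetical and serve illustration only; they are not taken from any of the systems mentioned above.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample exports; real sources would be files, database dumps,
# or web API responses.
CSV_EXPORT = "booking_id,arrival,guests\nB1,2019-07-01,2\nB2,2019-07-03,4\n"
JSON_EXPORT = '[{"booking_id": "B3", "arrival": "2019-07-05", "guests": 1}]'
XML_EXPORT = (
    "<bookings>"
    "<booking id='B4'><arrival>2019-07-08</arrival><guests>3</guests></booking>"
    "</bookings>"
)


def extract_csv(text):
    """Read a delimited text export into a list of dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def extract_json(text):
    """Read a JSON export (e.g., a web API response) into a list of dictionaries."""
    return json.loads(text)


def extract_xml(text):
    """Read an XML export into a list of dictionaries with the same keys."""
    root = ET.fromstring(text)
    return [
        {
            "booking_id": booking.get("id"),
            "arrival": booking.findtext("arrival"),
            "guests": booking.findtext("guests"),
        }
        for booking in root.iter("booking")
    ]


# All three sources end up in the same flat record structure. Values from
# CSV and XML are still strings here; harmonizing data types is left to the
# transformation step.
records = extract_csv(CSV_EXPORT) + extract_json(JSON_EXPORT) + extract_xml(XML_EXPORT)
```

The sketch deliberately stops at a flat list of records per source and leaves type conversion and schema mapping to the subsequent transformation step.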
Fig. 3 ETL process structure. Process steps shown: data source, extraction, cleaning, record linkage, load, data warehouse. Figure annotations: flat structure, corresponding to a complete transaction or a single business object like customer, product, etc.; record linkage can take place (1) after cleaning, (2) after migration, or (3) during load; n:m relationship; merged data, if several data sources are loaded simultaneously.
Unstructured data can take different formats, like semi-structured HTML documents, free text, or even images or videos. Methods for extracting data from unstructured data sources vary quite widely:
• HTML documents: Structured data is extracted from HTML documents (e.g., product review sites) by means of wrappers, either created manually based on static patterns or (semi-)automatically generated by means of (un-)supervised learning methods (Liu 2008).
• Free text: Free text is stored in the data warehouse as a blob (binary large object) and/or transformed into structured data by means of statistical language models (e.g., word vectors with TF/IDF weights) or linguistic approaches (Manning and Schütz 2001).
• Images, audio, and video: Media objects (e.g., photos uploaded on Flickr or videos uploaded on YouTube) are either stored as separate files, referenced by the data warehouse, or as blobs, if supported by the database system (e.g., NoSQL databases).

As seen in the section “Data Sources,” most of the relevant data in tourism are external and, thus, owned by different stakeholders (e.g., CRS/GDS, PMS, etc.) or online platforms (e.g., Facebook or TripAdvisor) (Cerba et al. 2015). To deal with this externality and heterogeneity of data, different technical mechanisms to access such data exist: file access via FTP (file transfer protocol), email, or other file exchange mechanisms; direct database access; streaming, i.e., data constantly generated and provided in small packages; web page access via HTTP requests and crawlers; and web resource access via APIs (application programming interfaces).
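To illustrate the transformation of free text into structured data mentioned above, the following sketch computes word vectors with TF/IDF weights for a few hypothetical review texts. It assumes one common weighting variant (relative term frequency multiplied by the natural logarithm of the inverse document frequency) and a very simple tokenizer; other variants and proper linguistic preprocessing are equally possible, and the function names are illustrative only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dictionary per document.

    Weight = (term count / document length) * ln(number of documents /
    number of documents containing the term).
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {
                term: (count / total) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return vectors


# Hypothetical customer reviews
reviews = [
    "Great hotel, friendly staff",
    "Friendly staff but noisy room",
    "Great location, great breakfast",
]
vectors = tfidf_vectors(reviews)
```

In this weighting, terms that occur in only one review (such as "hotel") receive a higher weight than terms shared by several reviews (such as "great" or "staff"), which is what makes the resulting vectors useful as structured input to data mining.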
Data Transformation

The second step of the ETL process transforms the data into a format appropriate for visualization/OLAP and data mining. Data transformation can in turn be separated into the following steps (Kimball et al. 2008):
• Data cleaning: Data cleaning aims at increasing the quality of data by dealing with missing, incorrect, or useless data (Chu et al. 2016). Data cleaning handles the data quality dimensions accuracy (i.e., syntactic and semantic accuracy dealing with data type/format, misspellings, or wrong values), completeness (i.e., dealing with missing values), and consistency (typically checked by integrity constraints).
• Data migration: Data migration is the process of migrating heterogeneous data (from different data sources) into a homogeneous data format, including data type or value domain transformations, combination and separation of data elements (e.g., a single field for street name and number vs. separate fields for street name and street number), and structure mapping, i.e., transforming the data into the final data warehouse schema. Automatic or semi-automatic mapping techniques, based, for example, on approaches of schema matching (Liu 2008), often do not reach sufficient accuracy, e.g., in the case of complex booking data, and, thus, such mappings are typically defined manually or based on semantic web technologies (Dell’Erba et al. 2005).
• Record linkage: Record linkage refers to the process of identifying duplicated data entries representing the same real-world entity and typically occurs after integrating data from different data sources. Typical approaches are exact matching, based on unique global identifiers (e.g., booking number, customer id) or a collection of key attributes (e.g., name and address to uniquely identify customers), fuzzy matching for comparing attribute values, rule-based approaches, or even machine learning techniques (e.g., naïve Bayesian classifiers, neural networks) (Christen and Winkler 2016).
• Data enrichment: Data enrichment denotes the process of enriching existing data with additional information available in separate data sources. A typical example in the context of web navigation data is the enrichment of webserver log file data with additional information, like the user’s country, city, and provider, based on external data sources or services (e.g., ip2location, www.ip2location.com). Additionally, customer data can be enriched with demographic or economic data of the sending country (e.g., inhabitants, GDP per capita, etc.).
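The record linkage step described above can be sketched as a combination of exact and fuzzy matching. The example below is an illustration under stated assumptions, not a production approach: the record fields (`customer_id`, `name`, `address`), the averaging of the two similarity scores, and the threshold of 0.85 are all hypothetical choices, and string similarity is computed with `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher


def normalize(value):
    """Lowercase and collapse whitespace before comparing attribute values."""
    return " ".join(value.lower().split())


def similarity(a, b):
    """String similarity between 0.0 and 1.0 based on matching subsequences."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def same_customer(rec_a, rec_b, threshold=0.85):
    """Decide whether two records refer to the same real-world customer."""
    # Exact matching: a shared unique identifier settles the question.
    if rec_a.get("customer_id") and rec_a.get("customer_id") == rec_b.get("customer_id"):
        return True
    # Fuzzy matching on key attributes (name and address); the threshold is
    # an assumed value that would need tuning on real data.
    name_sim = similarity(rec_a["name"], rec_b["name"])
    addr_sim = similarity(rec_a["address"], rec_b["address"])
    return (name_sim + addr_sim) / 2 >= threshold


# Hypothetical records from a CRM system and a booking system
crm = {"customer_id": None, "name": "Anna Svensson", "address": "Storgatan 12, Halmstad"}
booking = {"customer_id": None, "name": "Ana Svensson", "address": "Storgatan 12 Halmstad"}
other = {"customer_id": None, "name": "Erik Larsson", "address": "Kungsgatan 3, Varberg"}

is_duplicate = same_customer(crm, booking)
is_other_duplicate = same_customer(crm, other)
```

In this sketch, the first pair is linked despite the misspelled name and the missing comma, whereas the second pair is kept apart. Rule-based or machine learning approaches would replace the fixed threshold by learned or hand-crafted decision logic.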
Load

The complete ETL process takes place in a staging area, logically and physically separated from the operative systems (the data sources) on the one side and the data warehouse on the other side, in order not to compromise the integrity and performance of either system. Thus, the final ETL process step deals with loading the extracted and transformed data from the staging area into the data warehouse. The concrete procedure of writing data into the data warehouse strongly depends on the type of data warehouse model, e.g., a fully normalized data model, a multidimensional data model, or a data lake-like repository with a structure close to that of the original data sources.
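A minimal sketch of the load step, assuming a staging area and a data warehouse that are kept as two separate databases, could look as follows. It uses two in-memory SQLite databases via Python's `sqlite3` module; the table names (`stg_booking`, `booking`) and columns are hypothetical, and a real warehouse would follow one of the data models named above.

```python
import sqlite3

# Two separate databases stand in for the staging area and the warehouse.
staging = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

# Hypothetical staging table holding extracted and transformed data
staging.execute("CREATE TABLE stg_booking (booking_id TEXT, arrival TEXT, guests INTEGER)")
staging.executemany(
    "INSERT INTO stg_booking VALUES (?, ?, ?)",
    [("B1", "2019-07-01", 2), ("B2", "2019-07-03", 4)],
)
staging.commit()

# Hypothetical target table in the warehouse
warehouse.execute(
    "CREATE TABLE booking (booking_id TEXT PRIMARY KEY, arrival TEXT, guests INTEGER)"
)


def load(staging_conn, warehouse_conn):
    """Copy all staged rows into the warehouse within one transaction."""
    rows = staging_conn.execute(
        "SELECT booking_id, arrival, guests FROM stg_booking"
    ).fetchall()
    # The connection used as a context manager commits on success and
    # rolls back if an error occurs, so the warehouse is never half-loaded.
    with warehouse_conn:
        warehouse_conn.executemany(
            "INSERT OR REPLACE INTO booking VALUES (?, ?, ?)", rows
        )
    return len(rows)


loaded = load(staging, warehouse)
```

Because the sketch uses `INSERT OR REPLACE` on the primary key, running the load twice does not create duplicate rows; whether overwriting existing rows is acceptable depends on the update strategy chosen for the data source.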
Chronology of the ETL Process

ETL processes are executed either on demand (i.e., on explicit request), periodically (i.e., at regular intervals, depending on the time criticality of the data), or event-driven (i.e., triggered by the occurrence of certain events, e.g., changes in data sources) (Kimball et al. 2008). Comprehensive tourism data warehouses, dealing with the data sources listed above, have to make use of all three types of chronology, depending on the characteristics of the data source at hand. Long-term data, like economic data or sending country demographics, are typically integrated on demand. By contrast, data from operative systems (e.g., CRS/GDS, online booking systems, etc.) are usually integrated periodically. Finally, time-critical data, which serve as input to real-time analyses, are typically integrated in an event-driven manner, e.g., traffic information or usage and consumption behavior as input to a mobile real-time recommender system.
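The three types of chronology can be sketched as a small scheduling rule that decides, per data source, whether its ETL process should run now. The class `SourceSchedule`, the mode names, and the function `is_due` are hypothetical names introduced for illustration only; real ETL tools offer their own scheduling facilities.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SourceSchedule:
    """Hypothetical description of how a data source is integrated."""
    name: str
    mode: str  # "on_demand", "periodic", or "event_driven"
    interval: Optional[timedelta] = None   # only used for periodic sources
    last_run: Optional[datetime] = None


def is_due(source, now, requested=False, event_occurred=False):
    """Return True if the ETL process for this source should run now."""
    if source.mode == "on_demand":
        return requested
    if source.mode == "periodic":
        return source.last_run is None or now - source.last_run >= source.interval
    if source.mode == "event_driven":
        return event_occurred
    raise ValueError(f"unknown mode: {source.mode}")


# Example sources following the text: long-term data on demand, operative
# data periodically, time-critical data event-driven.
gdp = SourceSchedule("sending country GDP", "on_demand")
bookings = SourceSchedule(
    "online booking system", "periodic",
    interval=timedelta(days=1), last_run=datetime(2019, 7, 1),
)
traffic = SourceSchedule("traffic information", "event_driven")

now = datetime(2019, 7, 2, 6, 0)
due = [s.name for s in (gdp, bookings, traffic) if is_due(s, now)]
```

With the values above, only the booking system is due, since more than one day has passed since its last run and neither an explicit request nor an event has been signaled for the other two sources.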
Update Strategies

The data sources relevant for a comprehensive tourism data warehouse are heterogeneous with regard to not only their structure but also their coverage of historical data and their chronology of change. Thus, different update strategies are necessary to enable an efficient and timely data update:
• Static capture loads a full snapshot of the source data into the data warehouse, either completely reloading the data (not preserving historical data) or appending, i.e., inserting only new or changed data. As the approach suffers from performance issues, it is only suitable for small data sources.
• Timestamp capture only processes new or changed data records, leading to a significant performance improvement.
• File comparison capture identifies changes by comparing two snapshots of the source data.
• Application-assisted capture requires the change detection logic to be implemented as part of the source application and is, thus, difficult to add to legacy systems.
• Trigger-based capture uses database triggers to store data changes.

As most source data in the tourism context (cf. the data sources listed above) contain a timestamp (as is the case for most transactional data, e.g., webserver log files or booking data), timestamp capture is the most appropriate update mechanism. In certain cases, where changes to old data may occur (e.g., a booking status changing from booked to cancelled), a combination of timestamp capture and file comparison capture may be necessary. Application-assisted capture and trigger-based capture require specific support by the source system, which will often not be in place in the tourism context.
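The combination of timestamp capture and file comparison capture mentioned above can be illustrated with a short sketch. It assumes, hypothetically, that each source record carries a `created` timestamp that is not updated when the booking status changes later, so that the status change is only visible when two snapshots are compared. Record fields and function names are illustrative.

```python
from datetime import datetime


def timestamp_capture(source_rows, last_load):
    """Return only the rows created after the previous load."""
    return [row for row in source_rows if row["created"] > last_load]


def snapshot_diff(old_snapshot, new_snapshot, key="booking_id"):
    """File comparison capture: compare two full snapshots of the source."""
    old = {row[key]: row for row in old_snapshot}
    new = {row[key]: row for row in new_snapshot}
    inserted = [new[k] for k in new if k not in old]
    updated = [new[k] for k in new if k in old and new[k] != old[k]]
    deleted = [old[k] for k in old if k not in new]
    return inserted, updated, deleted


# Hypothetical snapshots of a booking source before and after a change
old_snapshot = [
    {"booking_id": "B1", "status": "booked", "created": datetime(2019, 7, 1)},
    {"booking_id": "B2", "status": "booked", "created": datetime(2019, 7, 2)},
]
new_snapshot = [
    {"booking_id": "B1", "status": "cancelled", "created": datetime(2019, 7, 1)},
    {"booking_id": "B2", "status": "booked", "created": datetime(2019, 7, 2)},
    {"booking_id": "B3", "status": "booked", "created": datetime(2019, 7, 4)},
]

last_load = datetime(2019, 7, 3)
new_rows = timestamp_capture(new_snapshot, last_load)
inserted, updated, deleted = snapshot_diff(old_snapshot, new_snapshot)
```

Here, timestamp capture alone finds the new booking B3 but misses the cancellation of B1, which only the snapshot comparison reveals. The price is that the comparison has to read both snapshots completely, which is why it is typically reserved for the cases where changes to old data are expected.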
ETL Tools

ETL processes can be defined and executed with a range of different programming environments and tools. General-purpose programming languages, like Python, Java, or C++, are suitable for developing ETL processes, and most languages offer specific libraries supporting typical ETL functionalities, e.g., Bubbles (https://pythonhosted.org/bubbles) or petl (petl.readthedocs.io) for Python. Additionally, dedicated ETL tools exist that make it possible to develop and schedule complex ETL processes, typically using a graphical interface for process design, which makes them more suitable for users without a software development background. Examples of ETL tools can be found on the professional side, offered by all well-known business intelligence and database providers like SAP, Oracle, IBM (Cognos), SAS, Microsoft, or RapidMiner (www.rapid-i.com), as well as on the open-source side, e.g., Pentaho Kettle (www.pentaho.com), Talend Open Studio (www.talend.com), or Jaspersoft ETL (www.jaspersoft.com).
21 Business Intelligence in Tourism
Data Warehousing At the core of a business intelligence architecture in tourism is a central data warehouse that embraces data related to all different business processes and tourism stakeholders (Cho and Leung 2002). Heterogeneous data from a multitude of different data sources are stored in a central data repository. Only through this integration is it possible to carry out analyses that span all stakeholders and business processes (Pyo et al. 2002; Kimball and Ross 2013, p. 40). Compared to an operational database, a data warehouse is theme-oriented (dealing with themes or subjects of the business like products or customers), time-oriented (periodic updates), integrated (aggregating data from different data sources), and invariant (new data are appended, but existing data are never changed) (Inmon 2002).
Traditional Data Warehouse Approaches: Normalized vs. Dimensional Modeling In the past decades, two main approaches to modeling and implementing a data warehouse have emerged: Inmon’s approach of a fully normalized data warehouse, on the one hand, and Kimball’s approach of multidimensional modeling, on the other hand. The normalized data warehouse approach by Inmon (2002) follows the fundamental database design principle of normalization, meaning that each data entry within a database table is atomic (1st normal form) and data tables do not contain any redundancies (3rd normal form). A normalized data warehouse adheres to normalization and models all business entities on the most granular level in order to avoid any redundancies. Dimensional modeling (DM) is a conceptual design technique for data warehousing. DM is business process- or transaction-oriented, i.e., each single DM models one business process. The fundamental principle of dimensional modeling is differentiating between performance indicators of a business process, called facts (e.g., the turnover or person number of a booking), and the context of the business process execution divided into different context dimensions (e.g., the date and time of a booking, the booked product, or the customer) (cf. Fig. 4).
Fig. 4 Dimensional model for booking process (Höpken et al. 2015). Fact table Booking (Turnover, PersonNumber) with the dimensions DimDate (Day, Month), DimTime (Minutes, Hours), DimProduct (Description, Category), and DimCustomer (Age, Origin)
W. Höpken and M. Fuchs
Accordingly, a master DM diagram for a large company or a tourism destination may consist of 10 to 25 single DM diagrams, and, in turn, each DM diagram may have only a few or up to 15 or more dimensions. To support cross-process analyses (e.g., analyzing relationships between customers’ booking and web navigation behavior), separate fact tables are linked together through conformed (shared) dimensions (e.g., the customer dimension being part of all customer-centric business processes like booking, web navigation, etc.). Table 2 shows a master DM for typical tourism processes and their corresponding dimensions and clearly demonstrates that most dimensions are used by several business processes (Höpken et al. 2015). When comparing both approaches, Inmon’s normalized data warehouse approach offers the advantage of providing a neutral, integrated, and redundancy-free data representation (single version of truth). Changing business requirements can be reflected more easily, and redundancy-free data structures simplify data integration (in the course of the ETL process). On the other hand, data models in 3rd normal form tend to become quite complex, which makes them harder for business users to understand and significantly decreases the speed of data access. By contrast, dimensional modeling leads to a simple, straightforward database design that is highly recognizable to the business user and offers better support for dynamic OLAP functionalities. Often, both approaches are combined into a two-layer data model architecture with a normalized data structure as the first layer to foster data integration and a dimensional data structure as the second layer, generated from the normalized data structure, to support data analyses and OLAP (Inmon 2011, p. 30). Additionally, both approaches can be combined in a so-called snowflake schema, a dimensional structure with fully normalized dimensions.
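The role of a conformed dimension in cross-process analyses can be illustrated with a small sketch based on Python's built-in sqlite3 module. The table and column names loosely follow Fig. 4; the WebNavigation fact table, its PageViews column, and all sample rows are assumptions made for illustration only. Two fact tables share the customer dimension and are combined through it:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE DimCustomer (CustomerID INTEGER PRIMARY KEY, Age INTEGER, Origin TEXT);
CREATE TABLE Booking (Customer INTEGER REFERENCES DimCustomer,
                      Turnover REAL, PersonNumber INTEGER);
CREATE TABLE WebNavigation (Customer INTEGER REFERENCES DimCustomer,
                            PageViews INTEGER);
INSERT INTO DimCustomer VALUES (1, 34, 'Sweden'), (2, 51, 'Germany');
INSERT INTO Booking VALUES (1, 400.0, 2), (2, 900.0, 4), (2, 300.0, 1);
INSERT INTO WebNavigation VALUES (1, 12), (2, 30), (2, 5);
""")

# Aggregate each fact table to the grain of the shared dimension first and
# join the aggregates afterwards; joining the raw fact tables directly would
# multiply rows and inflate the sums.
rows = con.execute("""
SELECT c.Origin, b.Turnover, w.PageViews
FROM DimCustomer c
JOIN (SELECT Customer, SUM(Turnover) AS Turnover
      FROM Booking GROUP BY Customer) b ON b.Customer = c.CustomerID
JOIN (SELECT Customer, SUM(PageViews) AS PageViews
      FROM WebNavigation GROUP BY Customer) w ON w.Customer = c.CustomerID
ORDER BY c.Origin
""").fetchall()

for origin, turnover, page_views in rows:
    print(origin, turnover, page_views)
```

Because both fact tables reference the same customer keys, booking turnover and web navigation behavior can be compared per customer origin without any additional mapping step.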
Table 2 Process and dimensions of a dimensional model in tourism (Höpken et al. 2015)
Business processes (rows): Information request, Web navigation, Booking, Stay, Consumption, Location tracking, Feedback, Capacity, Marketing activity
Dimensions (columns): Time, Date, Customer, Product, Vendor, Supplier, Channel, Location, Feedback, URI, Session, Survey, Marketing
Data Lakes All approaches discussed above rely on relational databases as storage mechanism (Kimball and Ross 2013, p. 43; Inmon 2011, p. 29). With the advent of big data, i.e., huge data volumes with a high velocity and variety (Mariani et al. 2018), relational databases reach their limits as a storage mechanism within a business intelligence architecture (Fang 2015). Tourism-relevant data sources, discussed in the section on data sources, come in formats like JSON files, XML files, or semi-structured HTML files, or are unstructured, like free text or audio/video files. Relational databases are not designed to store data in heterogeneous formats but require a predefined and precise data model (cf. approaches above). A data lake is a new storage concept better suited to cope with big data requirements (Fang 2015). A data lake is a collection of different storage instances to store data in heterogeneous formats and can, thus, store source data in their original format. In contrast to the classical data warehouse approaches described above, data in a data lake can simply be stored as is, independent of their concrete structure. Different technical solutions exist to implement a data lake, like Hadoop or NoSQL databases. Hadoop is a data management platform providing at its core a distributed file system that can store non-relational data of different formats and offers specific support of OLAP analyses (Singh and Kaur 2014). Similarly, NoSQL databases, like MongoDB or Cassandra, are characterized by their ability to store data in a non-relational structure, in contrast to traditional relational SQL databases, like MySQL, PostgreSQL, or Oracle. Additionally, cloud services are available to implement a data lake, e.g., Amazon Web Services (AWS) or Microsoft Azure. Compared to the classical data warehouse approaches discussed above, the data lake concept offers high flexibility by storing source data in their original and heterogeneous formats. The complex and, in the big data context, even impracticable task of transforming such data into a homogeneous data format can be avoided (Dixon 2010; Fang 2015). Data stored in a data lake are well-suited to serve as input to sophisticated analyses, like data mining and predictive analytics. The necessary transformation of the source data into a format appropriate for further analyses, called data preprocessing, is an integral part of data mining anyway. However, reporting and OLAP analyses, as the most important access mechanisms for a data warehouse, are typically not well-supported by a data lake. Accessing data and executing necessary transformations are more complex, especially when compared to a dimensional modeling approach, and, thus, data lake solutions will suffer from comparably low performance. Additionally, the lack of homogeneous data structures for all data sources complicates analyses that span data sources and, thus, business processes, by definition one of the core functionalities of OLAP analyses. The situation in the tourism domain is complex and multi-faceted by nature. On the one hand, lots of transaction-oriented and well-structured data are available, e.g., booking data, statistical data on arrivals and overnights, web navigation data, etc.
Most of these data can be mapped into homogeneous data structures in a traditional data warehouse and serve as input to powerful reporting and OLAP analyses. On the other hand, a huge amount of semi-structured and unstructured data is available, e.g., customer feedback on product review sites or other social media sites, free-text feedback within customer surveys, uploaded pictures or videos, etc. These data can hardly be stored in a traditional data warehouse, thus making a data lake approach more promising. Consequently, a combination of both approaches seems most promising for the tourism domain. In a first step, all source data are loaded into a data lake in their original and proprietary format. In a second step, all data that are relevant for reporting and OLAP analyses and are sufficiently well-structured are transformed and loaded into a normalized or dimensional data warehouse. Customer survey data, for example, typically show a quite complex structure, containing questions on different abstraction levels and related to different business processes. Following the approach above, the full survey data are stored in the data lake as a flat structure, serving as a basis for detailed question-by-question analyses. Only relevant parts, properly overlapping with transactional data of corresponding business processes, are transformed and loaded into the traditional data warehouse, serving as input to OLAP and especially cross-process analyses. Similarly, in the case of customer feedback from review platforms like TripAdvisor, the complete feedback, or even the complete HTML page, would be stored as free text in the data lake.
Structured data, like the review date, user characteristics, or the reviewed tourism product, extended by the sentiment of a customer statement and the feature or topic the feedback is about (both identified by a sentiment analysis), are stored in the traditional data warehouse.
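The two-step approach described above can be sketched as follows. This is a simplified illustration, not the implementation of any particular system: a Python dict stands in for the data lake storage, and the review document with its field names (`date`, `hotel`, `rating`, `text`) is a hypothetical example.

```python
import json

# Step 1: the "data lake" keeps every source document unchanged.
# A dict keyed by a document id stands in for file or object storage.
data_lake = {}

def store_raw(doc_id, raw_text):
    """Store a source document as is, without parsing or validating it."""
    data_lake[doc_id] = raw_text

# Step 2: only the well-structured fields are transformed for the warehouse.
def to_warehouse_row(doc_id):
    """Extract the structured part of a stored review as a warehouse row."""
    review = json.loads(data_lake[doc_id])
    return {
        "review_date": review["date"],
        "product": review["hotel"],
        "rating": int(review["rating"]),
    }

raw = ('{"date": "2021-07-14", "hotel": "Seaside Inn", "rating": "4", '
       '"text": "Lovely beach, slow service."}')
store_raw("review-001", raw)
row = to_warehouse_row("review-001")
print(row)
```

The free text remains available in the data lake for later text mining, while the warehouse row contains only the fields needed for reporting and OLAP.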
Data Analysis and Visualization When looking at different approaches to analyze and visualize data, stored in a data warehouse, we can distinguish between reporting, OLAP, data mining, and specific analytical or operative applications (cf. Fig. 2).
Reporting Reporting denotes the process of generating static and predefined reports consisting of a collection of charts and analysis results. Reporting results can range from simple descriptive analyses to complex data mining results, like clustering or classifications. Reports can be generated fully automatically (active reporting) or parameter-driven (passive reporting). Reporting is supported by most business intelligence systems (e.g., Pentaho, Jaspersoft, etc.) as well as specific reporting tools (e.g., BIRT), typically offering a template-based graphical interface to design the structure and layout of reports interactively. While the report design process is interactive and flexible, the resulting reports are predefined and fixed and cannot be changed or adapted by the end user. In order to overcome this limitation, reporting nowadays most often takes place in the form of interactive dashboards. Dashboards typically support high-level reporting, focusing on key performance indicators (KPIs) across multiple data sources, but allow users to drill down into more detailed analyses. Dashboards should be high-density and information-rich but at the same time easy to view and understand. To achieve these goals, dashboard design principles have been developed, e.g., the Hichert SUCCESS rules (Gerths and Hichert 2014), defining how business information can be presented in a standardized and structured way. Reporting does not necessarily build on a (dimensional) data warehouse; typical reporting tools can work on any type of data. As reports and dashboards should be easy to understand and offer limited interaction possibilities, they directly address the management or other types of untrained business users.
OLAP OLAP (Online Analytical Processing) is a specific technique of interactive and multidimensional data analysis (Codd et al. 1993). Based on a so-called OLAP cube, corresponding to the star schema of one business process within a dimensional data warehouse model, OLAP makes it possible to analyze performance indicators (i.e., facts)
from different perspectives (i.e., dimensions). Within an OLAP query, the user can flexibly specify by which dimension characteristics (e.g., customer age or gender) one or several performance indicators (i.e., facts, e.g., the turnover) should be calculated on an aggregated level and by which aggregation function (e.g., sum or average). The hierarchical arrangement of several dimension characteristics (e.g., country, state, province, city) enables the user to drill up and down along such a hierarchy by selecting dimension characteristics on different abstraction levels. Under the FASMI acronym (Fast Analysis of Shared Multidimensional Information), we can summarize typical requirements for OLAP systems, i.e., they should support a multidimensional analysis perspective, cross-dimensional and cross-process analyses, homogeneous access to heterogeneous data from different (external) data sources, fast multi-user (shared) access, and an intuitive and flexible user interface (Pendse and Creeth 1995). Typically, OLAP systems (e.g., Pentaho Analysis Services) are based on dimensional structures within a relational database (i.e., star schemas) and, thus, presuppose a (multi-)dimensional data warehouse model. Additionally, OLAP requires good knowledge of this data model on the part of the end user. Thus, in contrast to reporting, OLAP mainly addresses experienced business analysts.
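The structure of such an OLAP query (dimension characteristics, performance indicator, aggregation function) can be sketched in a few lines of plain Python. The function `olap_query`, the flat fact rows, and all values are hypothetical; a real OLAP system would evaluate the query against the star schema of the data warehouse.

```python
from collections import defaultdict
from statistics import mean

# Each fact row already carries its dimension characteristics (i.e., the
# fact table is shown joined with its dimensions for brevity).
facts = [
    {"country": "Sweden", "city": "Halmstad", "age_group": "30-39", "turnover": 400.0},
    {"country": "Sweden", "city": "Varberg", "age_group": "30-39", "turnover": 250.0},
    {"country": "Sweden", "city": "Halmstad", "age_group": "50-59", "turnover": 900.0},
    {"country": "Germany", "city": "Hamburg", "age_group": "50-59", "turnover": 300.0},
]

def olap_query(rows, group_by, measure, aggregate):
    """Aggregate one measure by the chosen dimension characteristics."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in group_by)
        groups[key].append(row[measure])
    return {key: aggregate(values) for key, values in groups.items()}

# Same fact, same aggregation function, two levels of the location hierarchy.
by_country = olap_query(facts, ["country"], "turnover", sum)       # drilled up
by_city = olap_query(facts, ["country", "city"], "turnover", sum)  # drilled down

# A different dimension and a different aggregation function.
avg_by_age = olap_query(facts, ["age_group"], "turnover", mean)
```

Drilling up or down only changes the `group_by` argument, which mirrors how the user moves along a dimension hierarchy without redefining the analysis.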
Data Mining Data mining, often also called knowledge discovery in databases (KDD), denotes the process of discovering correlations, patterns, and trends by sifting through large amounts of data, using pattern recognition and statistical/mathematical techniques (Feyyad 1996). Data mining is, thus, capable of generating new knowledge as input to decision support or adaptive and intelligent services or applications (cf. Fig. 2). Data mining techniques are differentiated into supervised and unsupervised learning, depending on whether the system is trained to explain or predict a target variable (e.g., classification, estimation, or prediction) or whether patterns are learned without any supervision (e.g., clustering or association rule analysis). In tourism, for example, flight or hotel bookings can be classified into show/no-show as input to overbooking. Moreover, demand can be predicted (e.g., based on big data like Google Trends) as input to dynamic pricing and yield management, customers can be clustered as input to segment-specific marketing and CRM, or products often bought together can be identified (by association rules) as input to product recommendations. Data mining can take place on data in a normalized or dimensional data warehouse but can also work on the raw data sources directly (e.g., stored within a data lake in their original format). As data mining needs specific preprocessing steps to transform the data into a format appropriate for the respective data mining technique anyway, a data lake approach is well suited as input to data mining, especially in the context of big data. Data mining requires specific methodological knowledge and is, thus, typically executed by skilled business analysts or data scientists, based on specific data
mining toolsets (e.g., Pentaho, KNIME, Weka, SAS Enterprise Miner, RapidMiner, etc.). However, data mining results can also be integrated into reports or dashboards and in this way be made available to end users as well. Instead of simply showing static results of predefined data mining processes, modern approaches give users the possibility to define their own processes or adapt predefined ones. Approaches from the area of meta learning support the selection or recommendation of appropriate methods or parameters depending on the dataset at hand or the defined objective of the analysis. Additionally, data mining can be used to enrich existing data with additional information or knowledge (as already described before in relation to data lakes). In this case, data are extracted from the data warehouse, new information is generated by data mining techniques, and the resulting data are integrated into the existing data warehouse structures again (Meyer et al. 2015). Results of a sentiment analysis can, for example, enrich the information on customer reviews. Customer segments, identified by a cluster analysis, can enrich customer profiles. Frequent item-sets, representing products often bought together, can enrich data on booking transactions. Forecasts (e.g., of bookings or arrivals) can be stored in parallel to the corresponding actual transactions to foster an easy comparison. Finally, classification results, like the cancellation or no-show likelihood, can be stored as additional characteristics of a booking.
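As a small illustration of frequent item-sets, the following sketch counts how often pairs of products occur together in booking transactions and keeps the pairs that reach a minimum support. It is a deliberately simple exhaustive count restricted to pairs, not a full association rule algorithm such as Apriori; the function name, the product names, and the transactions are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Return product pairs whose share of all transactions is >= min_support."""
    counts = Counter()
    for items in transactions:
        # Sort the distinct items so that each pair has one canonical order.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    total = len(transactions)
    return {
        pair: count / total
        for pair, count in counts.items()
        if count / total >= min_support
    }

bookings = [
    ["hotel", "ski pass", "ski rental"],
    ["hotel", "ski pass"],
    ["hotel", "spa"],
    ["ski pass", "ski rental"],
]
pairs = frequent_pairs(bookings, min_support=0.5)
print(pairs)
```

The resulting pairs and their support values could be written back to the data warehouse as additional information on booking transactions, in line with the enrichment approach described above.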
Structuring a Tourism MIS A full BI portal or management information system (MIS) typically consists of a multitude of different analyses and visualizations (i.e., single dashboards), dealing with different performance indicators and business objects on different levels of granularity (Höpken et al. 2015; Fuchs et al. 2014). This leads to the challenging question of how to structure such a system in order to keep it simple and easy to survey from the end users’ perspective. A common approach here is to structure such a system around the business processes or transactions defined in a corresponding dimensional data warehouse model (cf. Table 2), extended by sections for the most important business objects, like products or customers. As can be easily seen, a business process and theme (i.e., business object)-oriented dimensional data warehouse model is an ideal basis for this MIS structure. However, although this leads to a clear structure that is highly recognizable to the management or business analysts, cross-process analyses, which constitute an important type of analysis within a tourism BI system, do not properly fit into this data modeling approach (Höpken et al. 2015). A different structuring approach for MISs in tourism, better suited to incorporate cross-process analyses, has been proposed by Keil et al. (2017), following the management cockpit approach presented by Daum (2006) and tourism-relevant performance indicators defined by Bornhorst et al. (2010) and Fuchs and Weiermair (2004). The fundamental approach here is to differentiate between three information perspectives: resources, performance, and demand. Resources cover any kind of tourism or non-tourism resources provided by or available to tourism
Fig. 5 Resources/performance/demand perspectives – a usage scenario (cf. Keil et al. 2017)
business (e.g., available tourism products, employees and their skill level, overall population, GDP, available funding, etc.). Performance mainly covers economic indicators (e.g., arrivals/overnights, revenue, etc.) and satisfaction rates (e.g., of guests, residents, or employees). Demand includes information on the customers (e.g., customer characteristics, needs, and online or offline behavior), together with external factors influencing the demand (e.g., weather conditions, terrorism, disasters, etc.). To summarize, the three different perspectives correspond to the business questions: Which resources are available and how supportive is our environment? Who are our customers, what are their needs, and which factors might influence the market? How “performant” is our destination in fulfilling the demand and how well do our resources fit? Additionally, each perspective is structured into three different abstraction levels. A first overview level presents high-level and abstract information on all relevant (key) performance indicators (e.g., the absolute turnover of the current year, the change compared to the previous year, and the difference to the planned turnover). A second, more detailed level presents (graphical) visualizations of the indicators (e.g., the turnover for each month of the year or grouped by product categories). Finally, a third level offers detailed and flexible OLAP-like analyses and data mining results to further explain the information presented on the two upper levels. Figure 5 shows a usage scenario for the three different perspectives on the first abstraction level, spanning the business processes capacity, stay, and feedback. The cross-process business questions to be answered are as follows: What is the most important travel motive driving our customers’ demand? Did our corresponding resources increase in order to fulfill the demand? How well do we actually perform in fulfilling the demand?
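The indicators named for the first overview level reduce to simple arithmetic, as the following sketch shows. The function name, the dictionary keys, and the figures are hypothetical and chosen only to make the three values concrete.

```python
def turnover_overview(actual, previous_year, planned):
    """First-level KPIs: absolute value, change to previous year, plan deviation."""
    return {
        "turnover": actual,
        # Relative change compared to the previous year, in percent.
        "change_vs_previous_year_pct": (actual - previous_year) * 100 / previous_year,
        # Negative values mean the plan has not been reached.
        "difference_to_plan": actual - planned,
    }

overview = turnover_overview(actual=1_200_000, previous_year=1_000_000, planned=1_250_000)
print(overview)
```

In this example the destination grew by 20 percent compared to the previous year but still missed the planned turnover by 50,000, which is the kind of combined signal the overview level is meant to make visible at a glance.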
Case Study: DMIS Halland DMIS Halland applies methods of BI and data warehousing to the Swedish tourism destination, Halland, by conceptualizing and prototypically implementing a destination management information system (DMIS). DMIS Halland collects data from a multitude of different data sources and is based on a data warehouse model (cf. Table 2), combining all relevant business processes, like web navigation, bookings, arrival, feedback, capacity, etc. into a homogeneous data pool. Based on this homogeneous data pool, cross-process analyses, spanning across different business processes and data sources, are enabled as rich input to decision support.
Data Sources and Extraction DMIS Halland extracts information from different customer- as well as supplier-based data sources (cf. Table 3). Customer-based data sources comprise (1) survey data, generated by online guest surveys executed via a specifically developed eSurvey tool; (2) web navigation data, i.e., customer click behavior on DMO and supplier websites (generated by the web tracking tool Matomo – matomo.org); and (3) social media data, i.e., interaction patterns and metrics, as well as customer comments and feedback on the social media platform Facebook (extracted via the Facebook Graph API – https://developers.facebook.com/docs/graph-api/reference). Supplier-based data sources comprise (1) supply-side statistics on the offer structure, capacity, occupancy, arrivals/overnights, or turnover and (2) economic statistics like population and growth statistics. Both supplier-based data sources originate from the national statistics database SCB (Statistiska centralbyrån – www.scb.se), provided via ASCII-file export on a monthly basis.
Data Warehouse The DMIS Halland data warehouse is based on the dimensional data model depicted in Table 2. Figure 6 shows, as an example, the single star schema for the feedback process, consisting of the feedback fact table and surrounding dimension tables. Splitting up a complete customer survey into single answers (fact FeedbackValue) to each question (described in FeedDescription within the feedback dimension) makes it possible to analyze the average feedback value for different feedback categories, product categories, or any other dimension attributes, demonstrating the power of well-designed dimensional data warehouse models.
Table 3 DMIS Halland data sources
Customer-based data sources
• Survey data: online guest surveys executed by e-survey tool
• Web navigation data: customer behavior on DMO and supplier websites (web statistics)
• Social media data: social media metrics, interaction patterns, user comments and feedback (Facebook)
Supplier-based data sources
• Supply statistics: offer structure, capacity, occupancy, arrivals/overnights, turnover (SCB data)
• Economic statistics: population and growth statistics (SCB data)
Fig. 6 Star schema for the feedback process
Fact table Feedback: Time (FK), Date (FK), Customer (FK), CusUsageProfile (FK), CusDemographicProfile (FK), Product (FK), Vendor (FK), Supplier (FK), Channel (FK), Location (FK), Feedback (FK), Survey (FK), QuestionnaireNo (DD), FeedbackValue (F)
DimTime: TimeID (PK), DayTime, Minutes, Hours
DimDate: DateID (PK), DayInWeek, Weekend, DayInMonth, Week, Month, Year, Season
DimLocation: LocID (PK), POI, POIDescription, GPSCoordinates, LocCity, LocCountry
DimFeedback: FeedID (PK), FeedDescription, FeedCategory
Data Mining DMIS Halland makes use of different data mining techniques, including a sentiment analysis on Facebook posts and comments, i.e., identifying the topic (e.g., food and beverage, service, location, room, etc.) and sentiment (i.e., positive, neutral, or negative) by a text classification, in order to enrich the data warehouse with such additional characteristics of Facebook posts or comments. Additionally, data mining results are presented in the form of static reports. As an example, Fig. 7 shows a classification by a decision tree, explaining the overall customer satisfaction (classes high or low) based on specific feedback categories or questions within a customer survey. As can be deduced from the decision tree, customers showing a high satisfaction with the service quality and at the same time with expectation fulfilment, fun/excitement, or the travel motive “to have fun” are classified as highly satisfied customers. As an example of customer segmentation, Fig. 8 shows the results of a cluster analysis based on customers’ travel motives, important activities, and satisfaction values, revealing interesting segments of customers with specific preferences and satisfaction & loyalty values. Analyzing the most important characteristics distinguishing the found clusters (based on the centroid plot on the left-hand side of Fig. 8) results in the four customer segments: “unpretentious and fully satisfied
Fig. 7 Explaining overall customer satisfaction by a decision tree
regular customers,” “unpretentious and unsatisfied unregular customers,” “lonely but satisfied beach enthusiast,” and “unsatisfied beach bum who might nevertheless come back,” illustrated by the cluster profiles on the right-hand side of Fig. 8.
Reporting and Dashboards DMIS Halland offers a web application as user interface to visualize the data within the data warehouse. Figure 9 shows exemplarily a dashboard with different visualizations of turnover, overnights, and population growth statistics.
Technical Architecture The general technical architecture, depicted in Fig. 2, has been prototypically implemented, making use of appropriate state-of-the-art technologies and tools. Figure 10 shows the different layers and components of the DMIS Halland technical architecture. The data source layer contains all different data sources and source systems, like the survey tool, collecting survey data by executing e-surveys. The BI layer consists of all components or systems dealing with collecting, storing, and analyzing data. The ETL server extracts data from various data sources, transforms them into an appropriate and homogeneous data format, and stores them in the central data warehouse, based on the open source database system Postgres. The ETL server makes use of the data mining toolset RapidMiner as well as Python scripts. The open source web tracking system Matomo collects data on customers’ web navigation behavior on partner websites and feeds such data into the ETL server. The survey tool stores its data in its own survey database (based on the open source database system MySQL) to support a question-by-question visualization within the application layer. Additionally, the survey tool feeds selected survey results into the ETL process to be stored in the central data warehouse.
Fig. 8 Cluster analysis based on travel motives, important activities, and satisfaction values
Cluster 0 = unpretentious and fully satisfied regular customer
• No specific preferences
• High satisfaction & loyalty
Cluster 1 = lonely but satisfied beach enthusiast
• Focus on beach activities
• High satisfaction & loyalty (except for emotional & social values)
Cluster 2 = unsatisfied beach bum who might nevertheless come back
• Expecting “sun, warmth and beautiful weather”
• Focus on “swimming and sunbathing on the beach”
• Most unsatisfied
• Low loyalty (but even higher than cluster 3)
Cluster 3 = unpretentious and unsatisfied unregular customer
• No specific preferences
• Low satisfaction & loyalty (except for emotional & social values)
Fig. 9 Dashboard with data visualizations
Fig. 10 Technical architecture DMIS Halland
Layers: Data Source Layer, BI Layer, Application Layer, Frontend Layer
Components: Partner Websites, SCB Website, SCB Excel Files, Facebook, Survey Tool, Web Tracking (Matomo), ETL Server (RapidMiner Server, Python Scripting), Data Mining (RapidMiner), Data Warehouse (Postgres), Survey Database (mySQL), Application Server (Ruby on Rails), User Database (Postgres), Web App (AngularJS)
Finally, the data mining tool RapidMiner executes specific data mining processes in order to gain new knowledge on, e.g., customer behavior and satisfaction, and stores its results in the data warehouse again (or provides results in the form of static data mining reports). The application layer provides an application server (based on the web application framework Ruby on Rails), offering API-based access to the central data warehouse for different client applications. The application server stores specific user profiles within its own user database. The frontend layer contains a web application as graphical frontend (GUI) to the customer, based on the open source web application framework AngularJS.
Expected Future Developments
Big Data Traditional BI applications focus on the core business transactions, like bookings, arrivals, overnights, etc. The corresponding data sources normally have a well-defined and known structure. Nowadays, new data sources are becoming available that show different characteristics, denoted by the term big data. Big data sources are mainly characterized by a high volume, a high velocity (i.e., speed of data provision), or a high variety (i.e., range of different data types and sources). Big data sources in tourism can be categorized into:
• Website content, like user-generated content (on review or social media platforms), or data on markets and competitors (on competitor websites or yellow pages and registries)
• Environmental data, like economic data (e.g., GDP or employment data in sending countries), or weather data (historic weather data and weather forecasts)
• Tourists’ interactions with the local environment, e.g., interacting with in-room equipment within a hotel room or IoT services within a tourism destination, or tourists’ movements (e.g., gathered by GPS tracking)
Big data sources, besides offering a wealth of opportunities, also constitute new technical challenges, like extracting information from unstructured data sources, integrating heterogeneous data into homogeneous data structures, and coping with huge data volumes, which can hardly be stored or processed by traditional database systems and BI applications. Data lakes (cf. Dixon 2010; Fang 2015) may help to solve at least some of these issues, as they can store data in many different data formats (e.g., by using NoSQL databases) and can handle big data volumes (e.g., based on a Hadoop cluster). Nevertheless, we have to assume that data volumes and also the number of different data source types will increase dramatically in the future, constituting big challenges for modern BI applications. In the context of big data, researchers often postulate the end of theory (Anderson 2008). According to this view, a theory to guide research and, e.g., the gathering of data by a customer survey would no longer be necessary if data on all relevant aspects, like customer behavior, needs, etc., already exist as part of big data sources; a confirmatory analysis based on sample data would simply be replaced by an exploratory analysis of full data on all customers. It is, however, questionable whether this holds true: although data mining is exploratory by nature, a theory in the form of a causal model may still help to avoid spurious correlations and wrong causal conclusions.
Additionally, big data, especially when gathered from social media platforms like TripAdvisor or Facebook, are often highly biased, as only a specific subset of customers frequently uses such platforms. And even if complete data were available in the future, the technology itself, e.g., the social media platform, might cause a bias, as it is not designed as a research tool but as a service fulfilling certain customer needs.
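The platform bias described above can be illustrated with a small, entirely hypothetical example. The guest ratings and the rule deciding who posts a review are assumptions chosen for illustration, not data from any study; the sketch only shows how an average computed from review posters can differ from the average over all guests.

```python
from statistics import mean

# Hypothetical satisfaction ratings (1-5) of ALL guests of a hotel.
all_guests = [5, 4, 4, 3, 3, 3, 4, 2, 5, 1, 4, 3]


def posts_review(rating: int) -> bool:
    """Assumption: only guests with a strong opinion post a review."""
    return rating in (1, 5)


# The subset of guests that is visible on the (hypothetical) platform.
reviewers = [r for r in all_guests if posts_review(r)]

population_mean = mean(all_guests)
platform_mean = mean(reviewers)
bias = platform_mean - population_mean

print(f"All guests:     n={len(all_guests)}, mean={population_mean:.2f}")
print(f"Review posters: n={len(reviewers)}, mean={platform_mean:.2f}")
print(f"Bias of the platform sample: {bias:+.2f}")
```

Even with complete platform data, the estimate would describe the posting guests only; correcting for this would require information about who does not post, which the platform itself cannot provide.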
Operational Business Intelligence
Operational BI denotes the approach of an automatic backflow of information and generated knowledge into applications, either operative applications, like ERP systems or yield management systems, in order to support operational decision-making, or customer applications, like online booking systems, in order to support adaptive, intelligent, and smart services. In more detail, analysis results, typically models learned by supervised or unsupervised machine learning methods, are automatically used within operative applications. A dynamic price setting function, as part of a yield management system, for example, makes use of prediction models for demand predictions or
classification models for no-show or cancellation predictions. Intelligent product recommendations build on frequent item sets or association rules, specifying which products are often bought together. Personalization of offers and marketing activities (i.e., targeting) is based on a cluster model for customer segmentation. Thus, operational BI puts a higher emphasis on analytical BI, i.e., data mining, in this context often called predictive analytics. While a report or a dashboard supports the management in strategic or tactical decision-making, data mining models provide knowledge in an explicit format and, thus, can be used to automatically take operative decisions or adapt customer applications.
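The backflow of model results into an operative application can be sketched as follows. This is a minimal illustration under stated assumptions: the `Booking` class, the no-show probabilities, and the linear pricing rule in `dynamic_price` are hypothetical and do not describe any particular yield management system. The point is only that model outputs (a demand forecast, per-booking no-show probabilities) are consumed automatically by operative logic.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Booking:
    booking_id: str
    # Assumed to be the output of a previously trained classification model.
    no_show_probability: float


def expected_arrivals(bookings: list[Booking]) -> float:
    """Expected number of booked guests who actually show up."""
    return sum(1.0 - b.no_show_probability for b in bookings)


def dynamic_price(base_price: float, forecast_demand: float,
                  capacity: int, max_markup: float = 0.5) -> float:
    """Hypothetical pricing rule driven by a demand forecast.

    The price rises linearly from 80% of the base price (no forecast demand)
    to (1 + max_markup) times the base price (forecast demand >= capacity).
    """
    load = min(max(forecast_demand / capacity, 0.0), 1.0)
    factor = 0.8 + (0.2 + max_markup) * load
    return round(base_price * factor, 2)


bookings = [Booking("B1", 0.10), Booking("B2", 0.30), Booking("B3", 0.05)]
print("Expected arrivals:", round(expected_arrivals(bookings), 2))
print("Price at 80% forecast load:", dynamic_price(100.0, 80.0, 100))
```

In an operational BI setting, both inputs would be refreshed by the analytical layer, while the pricing function itself runs inside the operative system without manual intervention.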
Real-Time Business Intelligence
Real-time BI describes the concept of providing information and knowledge in real time, thus with near to zero latency (therefore often called near-real time). This trend is closely connected to operational BI (see above), as operational decision support typically depends on timely and up-to-date information. Real-time BI can be reached by different technical approaches:
• Frequent data updates: Periodic updates of the data warehouse are executed more frequently in order to achieve a real-time appearance of the BI application. Typically, ETL processes have to be simplified, and consequently data quality is reduced. Therefore, real-time data are often stored in a separate real-time partition of the data warehouse and are overwritten by the full ETL process periodically.
• Event-driven data updates: Data updates are initiated by events, triggered by the source application or database (e.g., based on database triggers). As above, ETL processes may have to be simplified, feeding a separate real-time partition.
• On-demand data access: Source data are not loaded into a data warehouse in advance but accessed on demand. Thus, heterogeneous source data are not transformed into a homogeneous data structure by a complex and time-consuming ETL process, but lightweight approaches to access and combine different data sources and operative systems are used and a central data warehouse is omitted (serverless analysis), or a data lake approach is used for data storage.
Real-time BI, although quite powerful in certain circumstances, does not come without additional costs. Omitting a central data warehouse or simplifying the ETL process has a negative effect on data quality. Additional mechanisms, like real-time partitions, or specific hardware to reduce latency (e.g., in-memory databases), increase overall effort and costs.
Thus, advantages and costs have to be weighed in each individual application case, and it should be critically considered whether simply reducing the update frequency would already satisfy most business needs.
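The idea of a separate real-time partition that is periodically overwritten by the full ETL process can be sketched with a toy in-memory model. All names here (`Warehouse`, `clean`, the record fields) are hypothetical, and the "full ETL" is reduced to a single cleaning step; the sketch only illustrates the trade-off between latency and data quality described above.

```python
from __future__ import annotations


def clean(record: dict) -> dict:
    """Stand-in for the full ETL transformation: normalize the country code."""
    return {**record, "country": record["country"].strip().upper()}


class Warehouse:
    """Toy warehouse with a cleaned main store and a real-time partition."""

    def __init__(self) -> None:
        self.main: dict[str, dict] = {}      # filled by the full ETL process
        self.realtime: dict[str, dict] = {}  # filled by simplified, fast updates

    def on_source_event(self, record: dict) -> None:
        """Event-driven update: stored immediately, without cleaning."""
        self.realtime[record["booking_id"]] = record

    def run_full_etl(self, source_records: list[dict]) -> None:
        """Periodic full ETL: reload cleaned data and reset the real-time partition."""
        self.main = {r["booking_id"]: clean(r) for r in source_records}
        self.realtime.clear()

    def query(self) -> list[dict]:
        """Cleaned data plus newer, not yet cleaned real-time records."""
        return list({**self.main, **self.realtime}.values())


source = [{"booking_id": "B1", "country": " se "}]
wh = Warehouse()
wh.run_full_etl(source)

new_booking = {"booking_id": "B2", "country": "de "}
source.append(new_booking)
wh.on_source_event(new_booking)   # visible at once, but in raw quality
before = wh.query()

wh.run_full_etl(source)           # later: the full ETL overwrites the partition
after = wh.query()

print("Before full ETL:", before)
print("After full ETL: ", after)
```

Between two full ETL runs, queries see the new booking immediately, but with the uncleaned country value; this is the reduced data quality that has to be weighed against the lower latency.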
Cross-References
• Artificial Intelligence and Machine Learning
• Big Data Technologies
• Compositional Data Analysis in E-Tourism Research
• Content Analysis of Online Travel Reviews
• Data Mining and Predictive Analytics for E-Tourism
• Log File Analysis
• Spatial Analytics and Data Visualization
• Visual Methods and Visual Analysis in Tourism Research
References
Anderson CD (2008) The end of theory: the data deluge makes the scientific method obsolete. Wired, https://pdfs.semanticscholar.org/f7ad/e77c6572b2b5c8d4eb9831605273ef473634.pdf. Accessed 20 July 2019
Bloom J (2004) Tourist market segmentation with linear and non-linear techniques. Tour Manag 25(6):733
Bornhorst T, Ritchie J, Sheehan L (2010) Determinants for DMO & destination success: an empirical examination. Tour Manag 31(5):572–589
Cerba O, Janecka K, Jedlicka K, Mildorf T, Fryml J, Vlach P, Kozuch D, Charvat K (2015) Integration and Visualization of Tourism Data. https://doi.org/10.13140/RG.2.1.4611.6568
Cho V, Leung P (2002) Knowledge discovery techniques in database marketing for the tourism industry. Qual Assur Hosp Tour 3(3):109–131
Christen P, Winkler WE (2016) Record linkage. In: Sammut C, Webb G (eds) Encyclopedia of machine learning and data mining. Springer, Boston
Chu X, Ilyas IF, Krishnan S, Wang J (2016) Data cleaning: overview and emerging challenges. In: SIGMOD’16 proceedings of the 2016 international conference on management of data, pp 2201–2206
Chugh R, Grandhi S (2013) Why business intelligence? Significance of business intelligence tools and integrating BI governance with corporate governance. Int J Entrep Innov 4(2):1–14
Codd EF, Codd SB, Salley CT (1993) Providing OLAP (on-line analytical processing) to user-analysts: an IT mandate. Codd & Associates, Ann Arbor
Daum JH (2006) Management cockpit war room: objectives, concept and function, and future prospects of a (still) unusual, but highly effective management tool. Controlling 18:311–318
Dell’Erba M, Fodor O, Höpken W, Werthner H (2005) Exploiting semantic web technologies for harmonizing e-markets. Inf Technol Tour 7(3/4):201–220
Dixon J (2010) Pentaho, Hadoop and data lakes. https://jamesdixon.wordpress.com/2010/10/14/pentaho-hadoop-and-data-lakes/. Accessed 20 May 2019
Fang H (2015) Managing data lakes in big data era: what’s a data lake and why has it became popular in data management ecosystem. In: 2015 IEEE international conference on cyber technology in automation, control, and intelligent systems (CYBER). IEEE, pp 820–824
Feyyad UM (1996) Data mining and knowledge discovery: making sense out of data. IEEE Expert 11(5):20–25. IEEE
Fuchs M, Höpken W (2009) Data Mining im Tourismus. Praxis der Wirtschaftsinformatik 270(12):73–81
Fuchs M, Weiermair K (2004) Destination benchmarking – an indicator-system’s potential for exploring guest satisfaction. J Travel Res 42(3):212–225
Fuchs M, Höpken W, Lexhagen M (2014) Big data analytics for knowledge generation in tourism destinations – a case from Sweden. J Destin Mark Manag 3(4):198–209
Garrow L, Koppelman F (2004) Predicting air travelers’ no-show and standby behavior using passenger and directional itinerary information. J Air Transp Manag 10(6):401–411
Gerths H, Hichert R (2014) Designing Business Charts with Excel based on the standards of HICHERT®SUCCESS. Haufe, Freiburg
Gräbner D, Zanker M, Fliedl G, Fuchs M (2012) Classification of customer reviews based on sentiment analysis. In: Fuchs M, Ricci F, Cantoni L (eds) Information and communication technologies in tourism. Springer, Wien/New York, pp 460–470
Gretzel U, Fesenmaier D (2004) Implementing a knowledge-based tourism marketing information system: the Illinois tourism network. Inf Technol Tour 6:245–255
Höpken W, Fuchs M, Lexhagen M (2014) The knowledge destination – applying methods of business intelligence to tourism applications. In: Wang J (ed) Encyclopedia of business analytics and optimization. IGI Global, Hershey, pp 2542–2556
Höpken W, Fuchs M, Keil D, Lexhagen M (2015) Business intelligence for cross-process knowledge extraction at tourism destinations. Inf Technol Tour 15(2):101–130
Höpken W, Eberle T, Fuchs M, Lexhagen M (2019) Google trends data for analysing tourists’ online search behaviour and improving demand forecasting: the case of Åre, Sweden. Inf Technol Tour 21(1):45–62
Inmon W (2002) Building the data warehouse, 2nd edn. Wiley, New York
Inmon WH (2011) A tale of two architectures. Database Mag 1:28–31
Jiang N, Gruenwald L (2006) Research issues in data stream association rule mining. SIGMOD 35(1):14–19
Kasper W, Vela M (2011) Sentiment analysis for hotel reviews. In: Computational linguistics-applications conference, Katowice, pp 45–52
Keil D, Höpken W, Fuchs M, Lexhagen M (2017) Optimizing user interface design and interaction paths for a destination management information system. In: Marcus A, Wang W (eds) Design, user experience, and usability: understanding users and contexts. DUXU 2017. Lecture Notes in Computer Science, vol 10290. Springer, Cham, pp 473–487. https://doi.org/10.1007/978-3-319-58640-3_34
Kimball R, Ross M (2013) The data warehouse toolkit: the definitive guide to dimensional modeling, 3rd edn. Wiley, Indianapolis
Kimball R, Ross M (2016) The Kimball Group Reader: relentlessly practical tools for data warehousing and business intelligence, 2nd edn. Wiley, Indianapolis
Kimball R, Ross M, Thronthwaite W, Mundy J, Becker B (2008) The data warehouse lifecycle toolkit, 2nd edn. Wiley, Indianapolis
Law R, Li G, Fong DKC, Han X (2019) Tourism demand forecasting: a deep learning approach. Ann Tour Res 75:410–423
Liu B (2008) Web data mining, 2nd edn. Springer, New York
Luhn HP (1958) A business intelligence system. IBM J Res Dev 2(4):314–319
Manning C, Schütz H (2001) Foundations of statistical natural language processing. MIT Press, Cambridge
Mariani M, Baggio R, Fuchs M, Höpken W (2018) Business intelligence and big data in hospitality and tourism: a systematic literature review. Int J Contemp Hosp Manag 30(12):3514–3554
Menner T, Höpken W, Fuchs M, Lexhagen M (2016) Topic detection – identifying relevant topics in tourism reviews. In: Inversini A, Schegg R (eds) Information and communication technologies in tourism. Springer, Heidelberg, pp 411–423
Meyer V, Höpken W, Fuchs M, Lexhagen M (2015) Integration of data mining results into multidimensional data models. In: Tussyadiah I, Inversini A (eds) Information and communication technologies in tourism. Springer, Heidelberg, pp 155–168
Min H, Emam A (2002) A DM approach to develop the profile of hotel customers. Contemp Hosp Manag 14(6):274–285
Morales D, Wang J (2008) Passenger name record data mining based cancellation forecasting for revenue management. Innov Appl O.R. 202(2):554–562
Olmeda I, Sheldon P (2002) Data mining techniques and applications for tourism Internet marketing. Travel Tour Mark 11(2/3):1–20
Pendse N, Creeth R (1995) The OLAP Report: succeeding with on-line analytical processing. Business Intelligence, Wimbledon
Pitman A, Zanker M, Fuchs M, Lexhagen M (2010) Web usage mining in tourism – a query term analysis and clustering approach. In: Gretzel U, Law R, Fuchs M (eds) Information and communication technologies in tourism. Springer, New York, pp 393–403
Power DJ (2007) A brief history of decision support systems, version 4.0, available at: DSSResources.com. Accessed 6 June 2017
Pyo S (2005) Knowledge-map for tourist destinations. Tour Manag 26(4):583–594
Pyo S, Uysal M, Chang H (2002) Knowledge discovery in databases for tourist destinations. J Travel Res 40(4):396–403
Ritchie R, Ritchie J (2002) A framework for an industry supported destination marketing information system. Tour Manag 23:439–454
Rud O (2009) Business intelligence success factors: tools for aligning your business in the global economy. Wiley, Hoboken
Sambamurthy V, Subramani M (2005) Information technologies and knowledge management. Manag Inf Syst Q 29(1):1–7
Schmunk S, Höpken W, Fuchs M, Lexhagen M (2014) Sentiment analysis – extracting decision-relevant knowledge from UGC. In: Xiang Z, Tussyadiah I (eds) Information and communication technologies in tourism. Springer, Heidelberg, pp 253–265
Singh K, Kaur R (2014) Hadoop: addressing challenges of Big Data. In: 2014 IEEE international advance computing conference (IACC). IEEE, pp 686–689
Smith B, Leimkuhler J, Darrow R (1992) Yield management at American airlines. Interfaces 22(1):8–31
Subramanian J, Stidham S, Lautenbacher C (1999) Airline yield management with overbooking, cancellations, and no-shows. Transp Sci 33(2):147–167
Vlahogianni EI, Karlaftis MG (2010) Advanced computational approaches for predicting tourist arrivals. In: Evans T (ed) Nonlinear dynamics. InTech, Vienna, pp 309–324
Williams S (2016) Business intelligence strategy and Big Data analytics. Morgan Kaufmann, Cambridge, MA
Wöber K (1998) Global statistical sources – TourMIS: an adaptive distributed marketing information system for strategic decision support in national, regional or city tourist offices. Pac Tour Rev 2(3):273–286
Wong J-Y, Chen H-J, Chung P-H, Kao N-C (2006) Identifying valuable travellers by the application of data mining. Asia Pac J Tour Res 11(4):355–373
Zanker M, Jessenitschnig M, Fuchs M (2010) Automated semantic annotation of tourism resources based on geo-spatial data. Inf Technol Tour 11(4):341–354
Zhu G, Cao J, Li C, Wu Z (2017) A recommendation engine for travel products based on topic sequential patterns. Multimed Tools Appl 76(16):17595–17612
Part III e-Tourism Methods
Data Mining and Predictive Analytics for E-Tourism
22
Nuno Antonio, Ana de Almeida, and Luis Nunes
Contents
Introduction
Patterns, Applications, and Processes
  Pattern Types
  How to Choose the Technique/Tool That Best Suits the Problem Type
  How to Conduct a Data Mining Project
Conclusion
  Challenges and Limitations of Data Mining Projects
  Expected Future Developments
Cross-References
References
N. Antonio
Nova Information Management School (NOVA IMS), Universidade Nova de Lisboa, Lisboa, Portugal
A. de Almeida
Department of Information Science and Technology, ISCTE-IUL, Lisbon, Portugal
CISUC, Coimbra, Portugal
ISTAR-IUL, Lisbon, Portugal
L. Nunes
Department of Information Science and Technology, ISCTE-IUL, Lisbon, Portugal
ISTAR-IUL, Lisbon, Portugal
Instituto de Telecomunicações, Aveiro, Portugal
© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_29
Abstract
Computers and devices, today ubiquitous in our daily life, foster the generation of vast amounts of data. Turning data into information and knowledge is the core of data mining and predictive analytics. Data mining uses machine learning, statistics, data visualization, databases, and other computer science methods to find patterns in data and extract knowledge from information. While data mining is usually associated with causal-explanatory statistical modeling, predictive analytics is associated with empirical prediction modeling, including the assessment of the quality of the prediction. This chapter intends to offer readers, even those unfamiliar with the topic, a general overview of the key concepts and potential applications of data mining and predictive analytics and to help them to successfully apply e-tourism concepts in their research projects. As such, the chapter presents the fundamentals and common definitions of data mining and predictive analytics, including the types of problems to which they can be applied and the most common methods and techniques employed. The chapter also explains what is known as the life cycle of data mining and predictive analytics projects, describing the tasks that compose the most widely employed process model in both industry and academia: the Cross-Industry Standard Process for Data Mining (CRISP-DM).
Keywords Database mining · Data visualization · Knowledge discovery · Machine learning · Predictive analytics · Predictive modeling
Introduction
Computers and devices that are nowadays omnipresent in every aspect of our lives foster the generation of vast amounts of data. Following this trend, the capability to conveniently or automatically collect and process data to find patterns and extract knowledge intensifies (Witten et al. 2011; Han et al. 2012). The process of identifying original, useful, and understandable patterns from data is known as Knowledge Discovery in Databases (KDD). Data mining resides at the core of the KDD process and involves the use of different algorithms to identify patterns in data (such as trends, associations, or affinities) or build predictive models (Maimon and Rokach 2010; Delen and Demirkan 2013; Zaki and Meira 2014; Malik et al. 2018). Above all, patterns have to be meaningful and “interesting.” To be considered interesting, patterns should be “(1) easily understood by humans, (2) valid on new or test data with some degree of certainty, (3) potentially useful, and (4) novel. A pattern is also interesting if it validates a hypothesis that the user sought to confirm. An interesting pattern represents knowledge” (Han et al. 2012, p. 21). Although there is no consensus on the matter, it is not surprising that several researchers acknowledge that the use of data mining and big data is inducing a paradigm shift from scientific research to data-driven
research (Strasser 2012; Kitchin 2014; Mazzocchi 2015), that is, decisions supported by the (underlying) data. This shift runs counter to the traditional deductive approach found in current science because it states that (1) big data is able to capture the whole domain; (2) there is no need for a priori theory or hypothesis; (3) data analytics methods can give a non-biased view of patterns in data; and (4) results can be interpreted by someone who understands statistics or data visualizations (even with limited domain knowledge) (Kitchin 2014).
Data mining tasks fall into one of two categories: descriptive or predictive. While descriptive tasks characterize/learn properties in data, predictive tasks perform induction on the data in order to make estimations and predictions (Han et al. 2012). From a taxonomical point of view, analytics is divided into three main categories (Delen and Demirkan 2013):
• Descriptive Analytics: uses data and standard reporting/aggregations to answer the questions “What happened?” and “What is happening?”.
• Predictive Analytics: uses data and algorithms to build mathematical models to discover explanatory and predictive patterns that can answer the questions “What will happen?” and “Why will it happen?”.
• Prescriptive Analytics: uses data and algorithms to build mathematical models to determine high-value “courses of actions” or “what-if scenarios.” The objective is to answer the questions “What should I do?” and “Why should I do it?”.
Together with text mining, web and media mining, or statistical time series forecasting, data mining is considered one of the main enablers of predictive analytics (Delen and Demirkan 2013). Probably because of this, the terms data mining and predictive analytics are either used interchangeably (e.g., Malik et al. 2018) or used to identify the type of task to be undertaken.
While data mining is employed for causal-explanatory statistical modeling (descriptive tasks), predictive analytics is used to describe predictions and estimations of future outcomes (predictive tasks) (e.g., Larose and Larose 2015). Data mining can be applied to almost any kind of data as long as the data are meaningful for the intended application. This covers almost all types of data: database data, data warehouse data, transactional data, hypertext and multimedia data (text, image, video, or audio), and graph and network data (e.g., social and information networks), among others (Han et al. 2012). However, when modeling is intended, most data mining techniques require the data to be transformed into a two-dimensional view of the dataset, where each row represents a unit of analysis (e.g., traveler, customer, online review) and each column represents the measure of an attribute (e.g., year of birth, nationality, review rating) (Zaki and Meira 2014; Hastie et al. 2017). As an example, take a data mining project to analyze online review ratings. The dataset should be structured in a table-like manner, like the one shown in Table 1. Depending on the field of study, the attributes (columns) in the dataset can assume different names. Nevertheless, these column identifiers should be meaningful in the sense that they provide context; a data dictionary is therefore an important piece of data semantics. The column with the dependent variable,
Table 1 Example of a dataset structure (online review ratings)
Rating | User ID    | User location  | User age | Attraction      | Comment
5      | John Doe 1 | London, UK     | 30–40    | London Eye      | Excellent, bla, . . .
4      | Mary Doe 1 | NY, USA        | 50–60    | Tower of London | Very good, bla, . . .
5      | John Doe 2 | Manchester, UK | 30–40    | Windsor Castle  | Excellent, bla, . . .
3      | Mary Doe 2 | Chicago, USA   | 20–30    | Tower of London | Not so good, bla, . . .
...    | ...        | ...            | ...      | ...             | ...
the variable that is the focus of the problem, is also known as a response variable, target, or label (as is the case of the variable “Rating” in Table 1). The remaining columns are usually known as independent or explanatory variables, predictors, or features. According to Zaki and Meira (2014), variables can be classified into two main types: (i) numeric, that is, real- or integer-valued, and classified as discrete or continuous depending on whether they assume a finite or an infinite set of values, respectively, and (ii) categorical, that is, with a set-valued domain (e.g., gender or education levels). Categorical attributes can be further classified as nominal (when only differentiation and no order is assumed, e.g., gender) or ordinal (when the values are meant as a ranking, as for education levels). The process to carry out a data mining and predictive analytics project, including the preparation of the modeling dataset and its attributes, is presented in Section “How to Conduct a Data Mining Project”.
As illustrated by Fig. 1, data mining incorporates techniques coming from different domains, including (Han et al. 2012; Zaki and Meira 2014):
• Statistics: exploratory study, analysis, interpretation, or explanation of data
• Data visualization: a visual representation of data
• Information retrieval: search of documents or information in documents
• Data warehouse and database systems: creation, maintenance, and use of databases, including the integration of data from multiple sources to build systematic data analysis capabilities (data warehouses)
• Machine learning: investigates how computers can learn from data
Machine learning plays an essential role in data mining (Witten et al. 2011; Han et al. 2012), and Fig. 2 shows some types of machine learning problems that are highly related to data mining (Han et al. 2012; Hastie et al. 2017). It is usual to divide machine learning problems or tasks according to the following categories:
• Supervised learning: uses labeled input attributes to build models to predict an outcome/target. Supervised learning problems are commonly divided into:
– Classification problems: when the target is discrete or categorical. The goal is to identify to which of the possible classes an observation belongs (e.g., classify travelers by predefined segments based on the time to service at which they book their trips and the length of the stay, as in the visual illustration shown in Fig. 3).
Fig. 1 Techniques adapted from the different domains by data mining. (Adapted from Han et al. 2012) Diagram labels: Data Mining; Statistics; Machine Learning; Pattern Recognition; Database Systems; Data Warehouse; Information Retrieval; Data Visualization; Algorithms; High-Performance Computing; Applications.
Fig. 2 Machine learning’s types of problems. Diagram labels: Machine Learning; Supervised Learning (Classification, Regression); Unsupervised Learning (Clustering, Dimension Reduction); Semi-Supervised Learning; Reinforcement Learning.
– Regression problems: when the target is a continuous measurement (e.g., predict the average occupancy rate of a hotel or the average amount spent per tourist in a destination).
• Unsupervised learning: when the input attributes are not labeled and there is no target. The goal of unsupervised learning is to explain or discover associations and patterns between input attributes, and it is divided into:
– Clustering: identifies clusters of observations with similar input attributes (e.g., identifies customer segments)
– Dimension reduction: “simplifies” data by combining variables with similar contribution to the dataset variance
• Semi-supervised learning: makes use of non-labeled and labeled input attributes to gain more understanding of the population.
• Reinforcement learning: when the input attributes are not labeled or labels are not defined and the model learns (improves performance) based on a rewarding process.
Fig. 3 Classification model visualization example (identify customers’ segment based on two attributes). Axes: time to arrival and length of stay.
Although information and communications technologies (ICTs) have been transforming tourism globally since the 1980s (Buhalis and Law 2008), until 2007 the literature related to the application of data mining and machine learning in tourism and travel research was still scarce (Delen and Sirakaya 2006; Law et al. 2007). Since then, however, work on multiple topics in these research areas has been published, such as trend analysis/forecasting (Wu et al. 2010; Claveria et al. 2015; Moro and Rita 2016; Höpken et al. 2020), travel planning/recommendation (Hsu et al. 2010; Zhao and Ji 2013), understanding travelers’ preferences/personalization (Li et al. 2015; Chang et al. 2016; Shapoval et al. 2017; Antonio et al. 2018), understanding travel patterns (Chen et al. 2018; Hu et al. 2019), analysis of customers’ profitability (Pei 2013), analysis of destination competitiveness (Srivihok and Intrapairot 2016), impact on customer relationship management (Xie and Tang 2009), and predicting and understanding hotel cancellation drivers (Falk and Vieru 2018; Antonio et al. 2019), among others. Bach et al. (2013) performed a keyword analysis of the literature published between 1995 and 2013 and identified 88 peer-reviewed articles applying data mining in tourism. In these, the authors identified six core areas of application: forecasting, personalization, tourism management, tourism systems (e.g., recommender systems), multi-agent systems (e.g., swarm optimization), and machine learning-based applications. More recently, Mariani et al.
(2018), in a literature review about research published between 2000 and 2016 on business intelligence and big data in hospitality
and tourism, found that there was a wide distribution of publications over those years, presenting a linear and relevant growth over time. The authors used, among others, the search keywords “Business Intelligence,” “Data Mining,” and “Data Warehouse” and identified the existence of an upsurge in the number of research publications in hospitality and tourism literature that apply analytical techniques to large quantities of data. However, they also concluded that research is somewhat fragmented in scope and limited in methodologies and, overall, shows gaps. In fact, despite the growing interest in the application of data mining in tourism research, reflected by the increasing number of publications and topics covered, there is still much to explore. The potential of data-driven research is enormous.
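To make the tabular dataset structure and the distinction between learning problems described in this Introduction more concrete, the following minimal sketch represents the rows of Table 1 as Python dictionaries, separates the target from the features, and maps the kind of target to a problem type. The function names and the string labels used for the target kinds are assumptions introduced for illustration only.

```python
from __future__ import annotations

from typing import Optional

# Rows of Table 1: each dictionary is one unit of analysis (an online review).
reviews = [
    {"rating": 5, "user_id": "John Doe 1", "user_location": "London, UK",
     "user_age": "30-40", "attraction": "London Eye"},
    {"rating": 4, "user_id": "Mary Doe 1", "user_location": "NY, USA",
     "user_age": "50-60", "attraction": "Tower of London"},
    {"rating": 5, "user_id": "John Doe 2", "user_location": "Manchester, UK",
     "user_age": "30-40", "attraction": "Windsor Castle"},
    {"rating": 3, "user_id": "Mary Doe 2", "user_location": "Chicago, USA",
     "user_age": "20-30", "attraction": "Tower of London"},
]


def split_features_target(rows: list[dict], target: str) -> tuple[list[dict], list]:
    """Separate the explanatory variables (features) from the target (label)."""
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    labels = [row[target] for row in rows]
    return features, labels


def learning_problem(target_kind: Optional[str]) -> str:
    """Map the kind of target to the problem types named in the text."""
    if target_kind is None:
        return "unsupervised learning"
    if target_kind in ("categorical", "discrete"):
        return "classification"
    if target_kind == "continuous":
        return "regression"
    raise ValueError(f"unknown target kind: {target_kind!r}")


X, y = split_features_target(reviews, target="rating")
print("Features of first row:", X[0])
print("Labels:", y)
# A 1-5 star rating takes a small number of discrete values.
print("Problem type:", learning_problem("discrete"))
```

Whether a star rating is better treated as a class label or as a numeric quantity is itself a modeling decision; the sketch leaves that choice to the caller rather than inferring it from the data.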
Patterns, Applications, and Processes
In this section, we present examples of insights that data mining and predictive analytics can expose to help explain and predict tourism phenomena. We also describe possible applications and the recommended process to conduct data mining and predictive analytics-based projects.
Pattern Types
Figure 4 presents the five primary types of patterns that can be mined (Han et al. 2012; Zaki and Meira 2014):
• Class/concept description: Description of a class or concept that can be derived from the summarization of the class under study (data characterization), from the comparison of the target class with other classes (data discrimination), or from both. While the outputs of data characterization are charts (e.g., pie, bar, or column plots), single or multidimensional tables, or rules, the outputs of data discrimination mostly take the form of rules. An example of this type of pattern could be a hotel company’s study of segment distribution by nationality and distribution channel.
• Frequent patterns, associations, and correlations: Frequent patterns are patterns that occur frequently in data, such as frequent itemsets, frequent subsequences (known as sequential patterns), and frequent substructures. Frequent itemsets usually refer to a set of items that often appear together in a transactional dataset. Frequent subsequences refer to a sequential order found in the purchase history, such as the actions a traveler tends to perform when buying travel services (e.g., first the airline ticket is selected/bought, followed by a hotel booking and then a transfer booking). A frequent substructure can refer to the combination of frequent itemsets and frequent subsequences with other structures (e.g., graphs or trees). These patterns are usually designated as “association rules” in machine learning.
Fig. 4 Data mining types of patterns. Diagram labels: Data Mining Patterns; Class/Concept Description (Data Characterization, Data Discrimination); Frequent Patterns, Associations, and Correlations (Frequent Patterns, Association Analysis); Predictive Analysis (Classification, Regression); Outlier Analysis; Cluster Analysis.
• Predictive analysis: The process of finding a model (function) that predicts categorical labels (classification) or continuous-valued labels (regression). For this, the modeling dataset must be fully labeled, i.e., one of the attributes (columns) of the dataset must be the dependent variable or label. For example, to forecast hotel room occupation demand, the dataset should comprise attributes such as monthly demand per country, distribution channel, market segment, and room type, among others, and also an attribute serving as the label, in this case, the number of rooms occupied.
• Outlier analysis: The identification of observations in a dataset that do not comply with the general pattern or model of the data (also known as anomaly mining or anomaly detection). Outlier analysis is used for fraud detection or for the identification of customers with high customer lifetime value (CLV).
Fig. 5 Clustering visualization example (analyzing travelers per amount spent and time to service booking; axes X1 and X2)
• Cluster analysis: Contrary to supervised learning, unsupervised learning is employed to analyze “non-labeled” data, that is, data where no target is predetermined and the goal is to find similarity patterns. One common example of unsupervised learning methods is cluster analysis. As illustrated in Fig. 5, cluster analysis can be used to identify groups in data, such as customer segments or buying profiles.
How to Choose the Technique/Tool That Best Suits the Problem Type
For most data mining problems, multiple tools/techniques/algorithms can be applied. Each one requires particular conditions for its application and brings its own consequences and trade-offs (Maimon and Rokach 2010; Han et al. 2012). In this subsection, we introduce some of the techniques/algorithms employed for the different types of problems, but we do not delve into the details of each one, as that would be beyond the scope of this chapter. Details of the techniques/algorithms can be found in the data mining reference literature, such as Hastie et al. (2017), Maimon and Rokach (2010), Han et al. (2012), or Zaki and Meira (2014). To illustrate which tools, techniques, or algorithms are used for each type of problem, we present the flowchart in Fig. 6. While not exhaustive, this visual aid covers the most popular tools and techniques. The process for selecting the best one for a particular data mining project is discussed in Section “How to Conduct a Data Mining Project”.
Fig. 6 Algorithm/tool selection diagram (flowchart that begins at START and branches on yes/no questions such as whether the problem is descriptive, whether it is a frequent pattern or association problem, whether the data is labeled, the target type, the number of categories, the size of the dataset, and the need for interpretability. The branches end in the following groups. Class/concept description: relational or multidimensional database analytical tools. Dimension reduction: principal component analysis, singular value decomposition, and latent Dirichlet allocation. Frequent patterns/associations: Apriori, Eclat, FP-Growth, GSP, SPADE, PrefixSpan, and graphs. Predictive analysis, classification: logistic regression, naïve Bayes, decision tree, random forest, gradient boosting tree, support vector machine, (deep) neural network, and, when one of the categories is rare, one-class support vector machine. Predictive analysis, regression: linear regression, decision tree, random forest, gradient boosting tree, and neural network. Outlier analysis: DBSCAN, K-means, and robust principal component analysis. Clustering: K-means, hierarchical, and DBSCAN)

Class/Concept Description
Problems that require a class or concept description are frequently descriptive problems, that is, problems that rely on summary statistics, statistical tests, and aggregations and whose output is presented in the form of visualizations, tables, crosstabs, or rules. When mining a single dataset, summary statistics and aggregations can be implemented, like most of the techniques described in this section, using free, noncommercial programming languages (such as Python or R) or data mining tools (such as Weka or SAS Miner). However, if data is multidimensional, it should be stored in a relational or a dimensional database. This does not necessarily require a database management system (DBMS), as many current analytical tools, including spreadsheets, support the creation of dimensional database models. Relational databases, also referred to as online transaction processing (OLTP) systems, are user-oriented databases employed for transaction and query processing. Dimensional databases, or online analytical processing (OLAP) systems, are decision-making oriented and used for data analysis. OLAP systems allow data to be modeled and viewed in multiple dimensions, a structure usually designated as a “cube” (Han et al. 2012). If the goal involves the analysis of a multidimensional problem (such as the study of the impact of different dimensions and factors, like the weather forecast, currency exchange rates, or local events, on travelers’ demand), then a dimensional database should be used. Dimensional databases allow the creation of measures based on aggregation functions, such as frequency counts, minimum values, maximum values, mean, median, and rank order, among many others. The measures can be consulted through simple reporting or explored through the four main types of OLAP operations (see examples in Fig. 7):
• Roll-up: Performs aggregation on a data cube, either by climbing up a concept hierarchy in a dimension or by dimension reduction.
• Drill-down: The reverse of roll-up. It allows navigation from an aggregated level to a more detailed level.
• Slice and dice: While “slice” creates a sub-cube by performing a selection in one of the dimensions, “dice” creates a sub-cube by performing a selection in two or more dimensions (e.g., (month=“January”) and (region=“NY”)).
• Pivot (rotate): A visualization operation that rotates the data axes to offer an alternative view of the data. For example, instead of observing the average length of stay of travelers per region and year (rows) and per month (columns), move the year to the columns and the month to the rows of the report.
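The four OLAP operations listed above can be illustrated with a minimal Python sketch. It is not tied to any particular OLAP product: the fact table `SALES`, its values, and the helper names `aggregate`, `select`, and `pivot` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical fact table: (region, segment, quarter, month, sales).
SALES = [
    ("USA", "leisure", "Q1", "Jan", 100),
    ("USA", "leisure", "Q1", "Feb", 150),
    ("USA", "MICE", "Q1", "Jan", 200),
    ("Canada", "leisure", "Q1", "Mar", 50),
    ("Canada", "MICE", "Q2", "Apr", 300),
    ("Brazil", "corporate", "Q2", "May", 250),
]
FIELDS = ("region", "segment", "quarter", "month", "sales")
ROWS = [dict(zip(FIELDS, record)) for record in SALES]


def aggregate(rows, dims):
    """Sum the sales measure over every combination of the given dimensions."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row["sales"]
    return dict(totals)


def select(rows, **criteria):
    """Keep the rows whose value for each named dimension is in the allowed set."""
    return [r for r in rows if all(r[d] in allowed for d, allowed in criteria.items())]


def pivot(cube):
    """Swap the two axes of a two-dimensional aggregate."""
    return {(b, a): value for (a, b), value in cube.items()}


by_month = aggregate(ROWS, ("region", "month"))      # detailed level (drill-down target)
by_quarter = aggregate(ROWS, ("region", "quarter"))  # roll-up: months to quarters
by_region = aggregate(ROWS, ("region",))             # roll-up by dropping a dimension
q1_slice = select(ROWS, quarter={"Q1"})              # slice: selection on one dimension
diced = select(ROWS, quarter={"Q1"}, region={"USA", "Canada"})  # dice: two dimensions
rotated = pivot(by_quarter)                          # pivot: keys become (quarter, region)
```

Drill-down is simply the move from `by_quarter` back to `by_month`; a dedicated OLAP engine would precompute such aggregates rather than rescan the rows each time.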
Fig. 7 Examples of main OLAP operations (adapted from Han et al. 2012). The fact table holds the sales, and the dimensions are the customers’ segment (MICE, corporate, sports, leisure), the customers’ region (USA, Canada, Brazil, Mexico), and time (in quarters). The panels show a roll-up on region (from countries to continents), a drill-down on time (from quarters to months), a slice for (time=“Q1”), a dice for (region=“Brazil” or “Mexico”) and (time=“Q1” or “Q2”) and (segment=“corporate” or “MICE”), and a pivot

Frequent Patterns, Associations, and Correlations
Frequent pattern mining investigates recurring relationships in a dataset, like the co-occurrence of two or more objects of interest (Han et al. 2012; Larose and Larose 2015). The best-known application of frequent pattern mining is “market basket analysis,” that is, the identification of sets of items that are purchased together (e.g., in supermarket transactions). However, “market basket analysis” is not limited to identifying customers’ purchase habits. It can also be used in e-tourism applications, such as identifying which travel services were purchased together on an online travel website, understanding page visiting patterns on an official destination website, or understanding visitors’ patterns during a trip by analyzing the online reviews posted by the travelers. The most common techniques to identify co-occurrence patterns are the Apriori algorithm, the Eclat algorithm, and the FP-Growth algorithm. Although the first is the best known and most widely employed, the other algorithms were designed to overcome some of the limitations of the Apriori algorithm, such as its high computational requirements when processing large datasets (Han et al. 2012; Larose and Larose 2015). One example of “market basket analysis” in e-tourism research is the work of Liao et al. (2010), where the authors combined the Apriori algorithm with clustering to propose suggestions and solutions for tourism product development.

Also of interest in e-tourism research is sequence pattern mining. Sequence mining allows for the discovery of patterns across time or positions in a dataset (Larose and Larose 2015). Sequence patterns can be used to identify what travelers say, across time, in online reviews about a destination or a hotel brand, travelers’ purchase patterns in hotels across time, or travelers’ destination type selection through time or even, as in Bermingham and Lee (2014) and Cai et al. (2014), to explore travelers’ trajectories. There are several techniques to identify sequence patterns, like the GSP, SPADE, or PrefixSpan algorithms.

Graph mining is another interesting pattern mining tool. With the ubiquitous presence of social networks, graph data has grown in importance. Graph mining aims to find interesting subgraphs in data. For those not familiar with the term, a graph is a structure that represents a set of objects and the interconnections between them. Objects are usually named vertices or nodes, and the connections between them are called edges or links. One example of a graph is the representation of “friendship” in a social network, with users being the vertices and the “friendship” between users being the edges (Fig. 8). Graph analysis can be used to study and measure service quality based on online reviews (Li et al. 2010), to understand travelers’ movement patterns in a destination (Hu et al. 2019), or to identify influential social media users and predict their network impact (Francalanci and Hussain 2016). A common graph pattern mining task is the mining of frequent subgraphs from a database of graphs, which is usually done with the gSpan algorithm (Han et al. 2012; Zaki and Meira 2014). Other patterns such as closed graphs, coherent graphs, or dense graphs can also be mined.

Fig. 8 Example of a graph network representation
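As an illustration of the “market basket analysis” idea described in this section, the following is a minimal, unoptimized sketch of the level-wise Apriori search in plain Python. The `BOOKINGS` transactions, the 0.4 support threshold, and the function names are assumptions made for this example and do not come from any of the cited studies.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        return sum(itemset <= basket for basket in baskets) / n

    # Level 1: frequent single items.
    items = {item for basket in baskets for item in basket}
    current = {frozenset([item]) for item in items}
    current = {c for c in current if support(c) >= min_support}
    frequent = {}
    k = 1
    while current:
        frequent.update({c: support(c) for c in current})
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the association rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical bookings made on an online travel website.
BOOKINGS = [
    {"flight", "hotel", "transfer"},
    {"flight", "hotel"},
    {"flight", "car"},
    {"hotel", "transfer"},
    {"flight", "hotel", "car"},
]
frequent = apriori(BOOKINGS, min_support=0.4)
rule_confidence = confidence(frequent, {"flight"}, {"hotel"})
```

Each level rescans all transactions, which is exactly the cost that motivates alternatives such as Eclat and FP-Growth on large datasets.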
Predictive Analysis
As mentioned in Section “Pattern Types”, the objective of predictive analysis is to build, based on past events, a mathematical function (model) to estimate future events. Beyond estimating or forecasting future events, researchers and modelers sometimes also seek to uncover the causes of the events. Causes should be uncovered not only through statistical modeling but also through predictive modeling (Shmueli 2010). However, high-performance machine learning algorithms usually generate black box models, that is, highly complex mathematical functions, which makes interpreting and understanding the causal mechanisms behind the prediction virtually impossible. An example of these algorithms is artificial neural networks (ANNs). ANNs are being increasingly employed in e-tourism research (Moro and Rita 2016). ANNs are inspired by the biological neural networks of the human brain. An ANN is a collection of nodes (neurons) connected by edges, consisting of an input layer (a1 in Fig. 9) with an entry for each input variable, an output layer (a4 in Fig. 9), and one or more hidden layers (a2 and a3 in Fig. 9). The output of each node is usually calculated by applying a nonlinear function to the sum of the node’s inputs multiplied by the edges’ weights (wn in the Fig. 9 example).

Fig. 9 Example of a neural network (one input layer, two hidden layers, and one output layer)

Other algorithms, like decision tree-based algorithms, are quite straightforward for humans to interpret. Figure 10 presents a model built for the well-known pedagogical “play golf” dataset, where the label is a YES or NO answer to the “play golf” question. The predictors are weather-related variables: “Outlook,” “Humidity,” and “Windy.” It is relatively easy to understand the model built from the 14 observations of the dataset. Besides allowing a certain degree of interpretation, decision tree-based algorithms also have the advantages of automatically handling missing data, incorporating the treatment of outliers, inherently detecting variable interactions, and not being affected by variable skewness. Additionally, decision tree-based algorithms are nonparametric, in the sense that no statistical distribution assumptions are made about the explanatory variables and the label. Nevertheless, these algorithms also present some disadvantages, namely, a tendency to overfit, i.e., to perform well on training data but generalize poorly to unseen data. A common remedy is to employ ensembles to overcome the overfitting issue. Ensemble methods combine the results of multiple trees into one model. The best-known ones, random forest and gradient boosting tree, usually show excellent performance while maintaining a certain level of interpretability.

Fig. 10 Decision tree example (play golf dataset: 14 weather condition observations, 9 “play” and 5 “don’t play”). The root node splits on OUTLOOK (sunny, overcast, rain); the sunny branch splits on HUMIDITY at a threshold of 70, the overcast branch is a YES leaf with 4 observations, and a TRUE/FALSE split leads to a NO leaf and a YES leaf

Interpretability is an important point. As illustrated in Fig. 6, both for regression and for classification, some of the algorithms generate models with a certain degree of interpretability. For regression problems, this interpretability is mostly achievable with techniques such as decision tree, random forest, gradient boosting tree, or linear regression. For classification problems, a certain level of interpretability is achievable with decision tree, random forest, gradient boosting tree, or logistic regression. On the other hand, high-performance algorithms such as neural network or support vector machine generate less interpretable models. It is up to the modeler to select the “right” algorithm, depending on the type of estimation problem and on whether an understanding of the causal mechanisms behind the estimation is wanted. For example, Falk and Vieru (2018) and Antonio et al. (2019), when predicting the classification outcome of hotel booking cancellations (cancel or not cancel), opted for more interpretable algorithms, which make it possible not only to predict the cancellation outcome but also to understand cancellation drivers. In contrast, an example of a regression problem where interpretability was not sought is the research of Claveria et al. (2015), which employed different types of ANNs to forecast tourist demand.
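To make the ANN computation described in this section more concrete, the sketch below runs a forward pass through a small network with two hidden layers. It is only illustrative: the weights and biases are arbitrary, untrained values, the sigmoid is one possible choice of nonlinear function, and the bias terms and function names are assumptions that do not appear in the chapter.

```python
import math


def sigmoid(z):
    """A common nonlinear activation function (one choice among many)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Each node applies the nonlinearity to its weighted sum of inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Propagate the inputs through every layer in turn."""
    activation = inputs
    for weights, biases in layers:
        activation = layer_forward(activation, weights, biases)
    return activation


# Hypothetical, untrained parameters: 2 inputs -> 3 hidden -> 2 hidden -> 1 output.
LAYERS = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.2, -0.5, 0.3], [0.7, 0.1, -0.4]], [0.05, -0.05]),
    ([[1.0, -1.0]], [0.0]),
]
output = forward([1.0, 2.0], LAYERS)
```

Training would adjust the weights to reduce the prediction error on labeled data; the resulting stack of nested nonlinear functions is what makes such models hard to interpret.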
Cluster Analysis
Clustering is a method for partitioning multiple entities into groups (Fig. 5) so that entities within the same group share a certain degree of similarity but are, ideally, very dissimilar to the entities in the other groups (Han et al. 2012). There are several types of clustering methods: (1) partitioning, which creates completely separated partitions (groups or clusters); (2) hierarchical, which builds a hierarchical decomposition of the dataset, identifying subgroups inside groups; (3) density-based, which differs from the other techniques by enabling the definition of nonspherical (arbitrarily shaped) clusters; and (4) grid-based, where the entities are quantized into a finite number of cells (grid). The most famous clustering algorithm is K-means, a partitioning method. After a numeric parameter k is set for the number of clusters, that many points, named centroids, are generated, and each data point is assigned to its closest centroid. The centroids are then redefined in an iterative process: each centroid is recomputed as the mean of the data points assigned to it, and the points are reassigned until the assignments stabilize. A typical application of this algorithm in e-tourism is travelers’ segmentation (Pesonen et al. 2011; Chen et al. 2014). Another popular clustering algorithm is DBSCAN. Instead of assigning data points to the nearest centroid, DBSCAN uses the local density of points to define the clusters. Though slower than K-means, DBSCAN has the advantage of working well in settings where the clusters to be obtained are not spherically shaped, have very different sizes or densities, or are not well separated.
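The K-means procedure described above can be sketched in a few lines of plain Python. To keep the example deterministic, the initial centroids are passed in explicitly instead of being generated at random; that choice, the `TRAVELERS` data, and the function name are assumptions made for illustration.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: return (centroids, labels) for the given starting centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its closest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, labels


# Hypothetical travelers: (amount spent, days between booking and service).
TRAVELERS = [(100, 5), (120, 7), (110, 6), (900, 60), (950, 65), (1000, 70)]
centroids, labels = kmeans(TRAVELERS, centroids=[TRAVELERS[0], TRAVELERS[3]])
```

In practice the variables would usually be rescaled first, since here the amount spent dominates the Euclidean distance simply because of its larger range.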
Outlier Analysis
An outlier (or anomaly) is a data point that deviates significantly from the dataset’s remaining points (Fig. 11). Thus, the objective of an outlier analysis is the discovery of unusual data points in the dataset. Outliers can occur due to human errors (data entry errors), experimental errors (errors related to data extraction or preparation), processing errors (data manipulation errors), sampling errors (incorrect sampling or selection of data sources), or novelties (indicating, e.g., new trends). Besides being used in an exploratory analysis to better understand the data, outlier analysis can also be the central methodological approach of a research project. As an example, in tourism research outlier analysis can be used to identify the variety of existing accommodations in a given region (Sánchez-Martín et al. 2019) or to discover new tourist points of attraction using social media data (Halim et al. 2018). As with other types of data mining patterns, there are several techniques/algorithms for performing outlier analysis. If the dataset is labeled, the most common detection algorithm is the One-Class Support Vector Machine, a machine learning-based algorithm that treats the task as a classification problem. If the dataset is not labeled, then a clustering algorithm such as K-means or DBSCAN is usually employed to identify data points that lie far away from the cluster centroids. Another method used for unlabeled data is a version of PCA known as robust principal component analysis.
Fig. 11 Example of outliers in a dataset (points inside the circle; axes x1 and x2)
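The idea of flagging points that lie far from a cluster center can be sketched as follows. This is a deliberately simplified stand-in, not one of the algorithms named above: it treats the whole dataset as a single cluster, uses the component-wise median as its center, and applies a cutoff based on the median absolute deviation (MAD). The threshold of 3.5, the `OBSERVATIONS` data, and the function name are all assumptions.

```python
import math
import statistics


def distance_outliers(points, threshold=3.5):
    """Return the points lying unusually far from the component-wise median center."""
    center = tuple(statistics.median(axis) for axis in zip(*points))
    distances = [math.dist(p, center) for p in points]
    typical = statistics.median(distances)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = statistics.median([abs(d - typical) for d in distances])
    if mad == 0:
        return []  # all distances are (nearly) identical, so nothing stands out
    return [p for p, d in zip(points, distances) if (d - typical) / mad > threshold]


# Hypothetical observations on two variables (x1, x2).
OBSERVATIONS = [
    (1, 1), (1, 2), (2, 1), (2, 2), (1.5, 1.5),
    (2, 1.5), (1, 1.5), (1.5, 2), (10, 10),
]
outliers = distance_outliers(OBSERVATIONS)
```

With several clusters, the same distance test would be applied per cluster after running a clustering algorithm such as K-means.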
Data Preprocessing
Before applying any of the previously mentioned techniques, data must be processed and transformed, a phase also known as data preparation and described in the following section. Of the many existing transformation techniques, two are worth mentioning: dimension reduction and topic modeling. Dimension reduction is the process of scaling down the number of predictor variables, generating a reduced representation of the dataset that preserves the essential properties of the full dataset (Han et al. 2012; Zaki and Meira 2014). High-dimensional datasets can be complex to analyze, require long processing times, and, for predictive modeling, can lead to overfitted models. There are several dimensionality reduction algorithms, but principal component analysis (PCA) is the best known and one of the most used. The goal of PCA is to project the original dataset onto a lower-dimensional space while preserving as much variance as possible, which is equivalent to minimizing the projection error. At the same time, PCA also helps to understand the correlations between a set of predictor variables. This property was used by Brida et al. (2011) and Muresan et al. (2016) to understand residents’ perceptions of tourism. Topic modeling is a text mining technique for discovering the topics that are addressed within a collection of texts. Like clustering, topic modeling is sometimes used as a data reduction technique. By assigning each document a topic, the variables used in the identification of the topics can be replaced by a single variable, the identified topic. Two algorithms commonly used for topic modeling are Singular Value Decomposition and Latent Dirichlet Allocation. A well-known application of these techniques/algorithms is the use of Latent Dirichlet Allocation to extract the dimensions of travelers’ satisfaction and dissatisfaction from online reviews (Rossetti et al. 2016; Guo et al. 2017).
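The PCA projection described above can be sketched for the first principal component only. The example centers the data, builds the sample covariance matrix, and finds its dominant eigenvector by power iteration, which is one simple way to obtain it and not necessarily how statistical packages do it. The `SURVEY` data, the function name, and the fixed number of iterations are assumptions.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, explained_variance_ratio, scores) for the first component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]

    def times_cov(v):
        return [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]

    # Power iteration: repeated multiplication converges to the dominant eigenvector
    # (assuming the start vector is not orthogonal to it).
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = times_cov(v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, times_cov(v)))
    ratio = eigenvalue / sum(cov[a][a] for a in range(d))
    # Scores: coordinates of each observation along the component.
    scores = [sum(x * y for x, y in zip(r, v)) for r in centered]
    return v, ratio, scores


# Hypothetical survey answers (three items rated from 1 to 5 by five residents).
SURVEY = [(5, 4, 1), (4, 5, 2), (2, 1, 5), (1, 2, 4), (3, 3, 3)]
direction, ratio, scores = first_principal_component(SURVEY)
```

The entries of `direction` show how strongly each original variable contributes to the component, which is the kind of information used to interpret correlations among predictors.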
How to Conduct a Data Mining Project
The Cross-Industry Standard Process for Data Mining (CRISP-DM) (Chapman et al. 2000) is today the de facto standard process model for data mining projects, both in industry and academia. CRISP-DM’s popularity seems to stem from its foundation in principles derived from practical, real-world experience of how modelers conduct data mining projects. CRISP-DM describes the life cycle of a data mining project (Fig. 12) as six phases, each comprising multiple tasks. The sequencing is not rigid, and it is usual to move back and forth between phases and tasks. The outcome of one phase or task defines the one that should be performed next. The arrows connecting the phases in Fig. 12 illustrate the principal and most frequent interactions; multiple iterations between different phases are usually necessary until a model can be deployed. The outer arrows in Fig. 12 represent the cyclical nature of data mining projects. Projects do not end when models are deployed. Lessons learned from the modeling process and, in particular, changes in the phenomena under study brought about by the model deployment are reincorporated for the sake of the model’s continuous improvement.

Fig. 12 Phases and generic tasks of the CRISP-DM reference model

Without going into a high level of detail, this section describes the generic tasks of each phase, including their outputs and main challenges:
1. Business understanding: the initial phase, where project objectives and requirements are studied from the perspective of the domain and the particular problem and converted into an analytics project, resulting in the definition of a plan to achieve the objectives. As Han et al. (2012) recognize, “It’s tempting to jump straight into mining,” but, for a data mining project to be successful, special attention should be paid to the problem’s goal and domain.
a. Determine business objectives: Understand the problem correctly, balancing objectives with constraints, and accurately identify the questions to be answered, in order to avoid wasting a great deal of effort producing the right answers to the wrong questions.
b. Assess situation: Identify in detail all available resources, constraints, assumptions, and other factors, including data, stakeholders, experts, and computational resources, among others, that could influence project design and goal achievement.
c. Determine data mining goals: Define the project goals and the criteria for a successful outcome in technical terms (e.g., a certain level of predictive accuracy).
d. Project planning: Describe in detail the plan to answer the objective questions and achieve the project goal.
2. Data understanding: starts with the initial data collection and continues with the activities required for modelers to become acquainted with the data, including identifying any patterns, tendencies, or anomalies.
a. Initial data collection: Obtain the data identified in the project resources. Data collection can be performed through SQL database queries, access to files in data warehouses, downloads from the internet through websites’ application programming interfaces (APIs), website scraping, or any other required method. The output of this task includes not only the obtained data but also a detailed description of the methods by which the data was obtained, which is important for future reference.
b. Data description: Analyze the general data details and generate a report with the basic properties of the data (e.g., number of columns or number of observations) to evaluate whether the data satisfies the project requirements so far.
c. Data exploration: Use statistics, reporting, and data visualization to perform exploratory analysis, including the distribution of key variables, correlations, results of simple aggregations, class frequencies of categorical variables, and properties of significant sub-populations, among other statistical analyses.
d. Data quality check: Identify any quality issues in the data (does the data cover all cases?), and discover errors and typos (e.g., different names used to identify the same classes in categorical variables) and missing values, among other issues. If issues arise, enumerate them and define possible solutions (knowledge of the problem domain is usually a key factor in overcoming data quality problems).
3. Data preparation: all activities related to the creation of the final dataset (modeling dataset).
a. Data selection: Select the data (columns and rows) to be included in the modeling dataset based on criteria that take into consideration the goal of the project, data quality, and the resources available, including, among others, the computational power required.
b. Data cleaning: Improve data quality by, for instance, cleaning subsets of data, inserting default values, or estimating missing values.
c. Data construction: Construct new observations from aggregations of observations, or new variables derived from other variables (known as “engineered features” in machine learning, e.g., a ratio built from two other variables). Engineered features are considered a critical success factor in predictive modeling due to the information gain obtained from the association of multiple input variables. Sound feature engineering requires not only technical knowledge but also creativity, intuition, and domain knowledge (Antonio et al. 2019).
d. Data integration: Merge data from different sources or with the data constructed in the previous task, if any.
e. Data formatting: Format data according to algorithm requirements (e.g., place the label in the first column of the dataset, express the classes of binary categorical variables as 0 and 1, or change the distribution of highly skewed variables with a mathematical function such as the logarithm).
4. Modeling: comprises all final activities related to the preparation of the dataset for modeling and the application of the chosen algorithm, including the calibration of parameters.
a. Modeling techniques: Select the technique/algorithm to use according to the problem (e.g., neural network or decision tree). If multiple techniques/algorithms must be applied, this task should be performed separately for each technique/algorithm.
b. Test design generation: Set up the tests to evaluate the model’s quality and validate it. For predictive modeling, it is in this task that the dataset is typically divided into a training set (usually comprising between 60% and 80% of the data) and a test set (with the remaining data). While the training set is used to build the model, the model’s quality is tested using the test set.
c. Modeling: Using the modeling tool, create the model.
d. Model assessment: Analyze the model’s performance according to technical performance measures suited to the type of model. For example, for classification models, measures such as accuracy, precision, recall, F1 score, or area under the curve (AUC) are preferred. In regression problems, measures such as mean absolute error (MAE), root mean squared error (RMSE), or mean squared error (MSE) are used. For clustering, different measures are employed, such as Average Silhouette Width (ASW), p-Separation Index, and p-Stability Index, among others. If models were built using different techniques/algorithms, make relative performance comparisons.
5. Evaluation: assessment of the model’s performance according to the objectives initially determined.
a. Evaluate results: While in the previous phase results were evaluated from a technical perspective, now results are analyzed in terms of the project’s goal and questions. The task’s output should be an assessment based on the success criteria.
b. Review process: Review the whole process to identify any activities left undone and those that should be repeated.
c. Determine next steps: Based on the results of the previous two tasks, decide either to finish the project and proceed to deployment or to initiate further iterations.
6. Deployment: application of the model in a production environment. For descriptive models, deployment is the application of the findings to the problem’s objective.
a. Plan deployment: Summarize the deployment strategy, including not only the “what to do” but also the “how to do it.”
b. Plan monitoring and maintenance: If the goal involves putting the model into production, plan the monitoring and support of the deployed model, since models are not eternal and are data-dependent.
c. Final report: Generate the final project report and, for accomplished projects, a comprehensive presentation of the project results.
d. Review project: Assess what went well and what did not, in order to document the experience gained during the project.

An example of the use of CRISP-DM in tourism research can be seen in the paper by Antonio et al. (2019) on how to use big data to predict cancellations in hotel bookings.
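The test design and model assessment tasks of the modeling phase can be sketched in plain Python as follows. The 70% training fraction is just one value within the 60% to 80% range mentioned above, and the function names, the fixed random seed, and the treatment of undefined ratios as 0.0 are assumptions made for this illustration.

```python
import math
import random


def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of the rows and split it into a training set and a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def classification_metrics(actual, predicted, positive=1):
    """Accuracy, precision, recall, and F1 score for a binary classification."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def regression_metrics(actual, predicted):
    """MAE, MSE, and RMSE for a regression model."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


# Hypothetical usage: ten booking records, then labels compared with predictions.
train, test = train_test_split(range(10))
cancellation_scores = classification_metrics([1, 1, 0, 0], [1, 0, 1, 0])
demand_scores = regression_metrics([3, 5], [1, 5])
```

The metrics should be computed on the test set only, so that they estimate how the model behaves on data it did not see during training.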
Conclusion
We hope that this chapter entices readers not previously familiar with data mining to explore the possibilities and methods it offers and to learn more about the topic so as to apply it in e-tourism research projects. We believe that data-driven research has the potential to greatly enhance e-tourism research.
Challenges and Limitations of Data Mining Projects
Significant challenges and limitations in the application of data mining and predictive analytics to e-tourism research still exist. Let us begin with access to data. Although data is being generated at an unprecedented scale, much of it is privately owned (e.g., data from hotels, airlines, and other companies), which makes access to it difficult. Even public data, such as user-generated content on social network websites, is now more challenging to use since it has become subject to rules of use, such as those imposed by the European Union’s General Data Protection Regulation (GDPR). Another important point is data quality. Data must be accurate, reliable, unbiased, timely, and appropriate for the problem under analysis. Another key factor is multidisciplinary knowledge. Since data mining and predictive analytics combine techniques from multiple fields of study, including statistics, machine learning, and data visualization, e-tourism research teams should include not only people with a background in social sciences but also individuals knowledgeable in data science or, at least, with a background in ICT or quantitative fields. Lastly, one must consider the ethical implications. When personal or ethnic information is involved, data mining and predictive analytics models could easily be used to discriminate. Profiling should carefully consider the possibility of results being used as a discriminatory tool, which is one of the concerns of the GDPR and of the European Union ethics committees that are presently studying ethical regulations concerning data usage in the European research arena. Even unintentionally, modeling could result in improper uses due to unfair data. As an example, take a hotel website and a model that determines prices or cancellation policies dynamically. Zip code or country variables may end up discriminating against anyone who lives in regions typically inhabited by particular ethnic groups, thereby limiting the diversity of the hotel’s customers. It is crucial that researchers always consider the ethical and privacy implications of their data mining and predictive analytics treatment and modeling design.
Expected Future Developments
While descriptive analytics and predictive analytics are now commonly employed in business and increasingly employed in e-tourism research, only a few examples of prescriptive analytics can be found in academic research (Lepenioti et al. 2020). Prescriptive analytics has the potential to enable optimized decision-making ahead of time, allowing users and modelers to comprehend the results of possible courses of action or scenarios. Due to this potential, prescriptive analytics is expected to become one of the hottest topics within data mining research in the coming years, for instance, in creating post-pandemic tourism scenarios or simulating the impact of future crises on the tourism industry.
Cross-References
Artificial Intelligence and Machine Learning
Big Data Technologies
Visual Methods and Visual Analysis in Tourism Research
References
Antonio N, de Almeida A, Nunes L et al (2018) Hotel online reviews: different languages, different opinions. Inf Technol Tour 18:157–185. https://doi.org/10.1007/s40558-018-0107-x
Antonio N, de Almeida A, Nunes L (2019) Big data in hotel revenue management: exploring cancellation drivers to gain insights into booking cancellation behavior. Cornell Hosp Q 60:298–319. https://doi.org/10.1177/1938965519851466
Bach MP, Schatten M, Marušic Z (2013) Data mining applications in tourism: a keyword analysis. In: Hunjak T, Lovrenčić S, Tomičić I (eds) Proceedings of the 24th central European conference on information and intelligent systems, Varaždin, pp 26–32
Bermingham L, Lee I (2014) Spatio-temporal sequential pattern mining for tourism sciences. Proc Comput Sci 29:379–389. https://doi.org/10.1016/j.procs.2014.05.034
Brida JG, Disegna M, Osti L (2011) Residents’ perceptions of tourism impacts and attitudes towards tourism policies in a small mountain community. SSRN Electron J. https://doi.org/10.2139/ssrn.1839244
Buhalis D, Law R (2008) Progress in information technology and tourism management: 20 years on and 10 years after the Internet – the state of eTourism research. Tour Manag 29:609–623. https://doi.org/10.1016/j.tourman.2008.01.005
Cai G, Hio C, Bermingham L et al (2014) Mining frequent trajectory patterns and regions-of-interest from Flickr photos. In: 2014 47th Hawaii international conference on system sciences, pp 1454–1463
Chang K-C, Chen M-C, Kuo N-T et al (2016) Applying data mining methods to tourist loyalty intentions in the international tourist hotel sector. Anatolia 27:271–274. https://doi.org/10.1080/13032917.2015.1099554
Chapman P, Clinton J, Kerber R et al (2000) CRISP-DM 1.0: step-by-step data mining guide. In: The modeling agency. https://the-modeling-agency.com/crisp-dm.pdf. Accessed 10 Sept 2015
Chen G, Bao J, Huang S (Sam) (2014) Segmenting Chinese backpackers by travel motivations. Int J Tour Res 16:355–367. https://doi.org/10.1002/jtr.1928
Chen Q, Hu Z, Su H et al (2018) Understanding travel patterns of tourists from mobile phone data: a case study in Hainan. In: 2018 IEEE international conference on big data and smart computing (BigComp), pp 45–51
Claveria O, Monte E, Torra S (2015) Tourism demand forecasting with neural network models: different ways of treating information. Int J Tour Res 17:492–500. https://doi.org/10.1002/jtr.2016
Delen D, Demirkan H (2013) Data, information and analytics as services. Decis Support Syst 55:359–363. https://doi.org/10.1016/j.dss.2012.05.044
Delen D, Sirakaya E (2006) Determining the efficacy of data-mining methods in predicting gaming ballot outcomes. J Hosp Tour Res 30:313–332. https://doi.org/10.1177/1096348006286795
Falk M, Vieru M (2018) Modelling the cancellation behaviour of hotel guests. Int J Contemp Hosp Manag 30:3100–3116. https://doi.org/10.1108/IJCHM-08-2017-0509
Francalanci C, Hussain A (2016) Discovering social influencers with network visualization: evidence from the tourism domain. Inf Technol Tour 16:103–125. https://doi.org/10.1007/s40558-015-0030-3
Guo Y, Barnes SJ, Jia Q (2017) Mining meaning from online ratings and reviews: tourist satisfaction analysis using latent dirichlet allocation. Tour Manag 59:467–483. https://doi.org/10.1016/j.tourman.2016.09.009
Halim MA, Saraf NM, Hashim NI et al (2018) Discovering new tourist attractions through social media data: a case study in Sabah Malaysia. In: 2018 IEEE 8th international conference on system engineering and technology (ICSET). IEEE, Bandung, pp 157–161
Han J, Kamber M, Pei J (2012) Data mining: concepts and techniques, 3rd edn. Elsevier, Waltham
Hastie T, Tibshirani R, Friedman J (2017) The elements of statistical learning. Springer series in statistics, 2nd edn. Springer, Berlin
Höpken W, Eberle T, Fuchs M, Lexhagen M (2020) Improving tourist arrival prediction: a Big Data and artificial neural network approach. J Travel Res 0047287520921244. https://doi.org/10.1177/0047287520921244
Hsu L, Hsu C, Lin T (2010) Data mining in personalized travel information system. In: 2010 2nd international conference on information technology convergence and services, pp 1–4
Hu F, Li Z, Yang C, Jiang Y (2019) A graph-based approach to detecting tourist movement patterns using social media data. Cartogr Geogr Inf Sci 46:368–382. https://doi.org/10.1080/15230406.2018.1496036
Kitchin R (2014) Big Data, new epistemologies and paradigm shifts. Big Data Soc 1:205395171452848. https://doi.org/10.1177/2053951714528481
Larose DT, Larose CD (2015) Data mining and predictive analytics, 2nd edn. Wiley, Hoboken
Law R, Mok H, Goh C (2007) Data mining in tourism demand analysis: a retrospective analysis. In: Alhajj R, Gao H, Li J et al (eds) Advanced data mining and applications. Springer, Berlin/Heidelberg, pp 508–515
Lepenioti K, Bousdekis A, Apostolou D, Mentzas G (2020) Prescriptive analytics: literature review and research challenges. Int J Inf Manag 50:57–70. https://doi.org/10.1016/j.ijinfomgt.2019.04.003
Li S, Hao J, Chen Z (2010) Graph-based service quality evaluation through mining web reviews. In: Proceedings of the 2010 international conference on natural language processing and knowledge engineering. IEEE, Beijing, pp 280–287
Li G, Law R, Vu HQ et al (2015) Identifying emerging hotel preferences using Emerging Pattern Mining technique. Tour Manag 46:311–321. https://doi.org/10.1016/j.tourman.2014.06.015
Liao S, Chen Y-J, Deng M (2010) Mining customer knowledge for tourism new product development and customer relationship management. Expert Syst Appl 37:4212–4223. https://doi.org/10.1016/j.eswa.2009.11.081
Maimon O, Rokach L (eds) (2010) Data mining and knowledge discovery handbook, 2nd edn. Springer, Boston
Malik MM, Abdallah S, Ala’raj M (2018) Data mining and predictive analytics applications for the delivery of healthcare services: a systematic literature review. Ann Oper Res 270:287–312. https://doi.org/10.1007/s10479-016-2393-z
Mariani M, Baggio R, Fuchs M, Höepken W (2018) Business intelligence and big data in hospitality and tourism: a systematic literature review. Int J Contemp Hosp Manag 30:3514–3554. https://doi.org/10.1108/IJCHM-07-2017-0461
Mazzocchi F (2015) Could Big Data be the end of theory in science? EMBO Rep 16:1250–1255. https://doi.org/10.15252/embr.201541001
Moro S, Rita P (2016) Forecasting tomorrow’s tourist. Worldwide Hosp Tour Themes Bingley 8:643–653
Muresan I, Oroian C, Harun R et al (2016) Local residents’ attitude toward sustainable rural tourism development. Sustainability 8:100. https://doi.org/10.3390/su8010100
Pei S (2013) Application of data mining technology in the tourism product’s marketing CRM. In: 2013 2nd international symposium on instrumentation and measurement, sensor network and automation (IMSNA). IEEE, Toronto, pp 400–403
Pesonen J, Laukkanen T, Komppula R (2011) Benefit segmentation of potential wellbeing tourists. J Vacat Mark 17:303–314. https://doi.org/10.1177/1356766711423322
Rossetti M, Stella F, Zanker M (2016) Analyzing user reviews in tourism with topic models. Inf Technol Tour 16:5–21. https://doi.org/10.1007/s40558-015-0035-y
Sánchez-Martín J-M, Rengifo-Gallego J-I, Blas-Morato R (2019) Hot spot analysis versus cluster and outlier analysis: an enquiry into the grouping of rural accommodation in Extremadura (Spain). ISPRS Int J Geo-Inf 8:176. https://doi.org/10.3390/ijgi8040176
Shapoval V, Wang MC, Hara T, Shioya H (2017) Data mining in tourism data analysis: inbound visitors to Japan. J Travel Res 0047287517696960. https://doi.org/10.1177/0047287517696960
Shmueli G (2010) To explain or to predict? Stat Sci 25:289–310. https://doi.org/10.1214/10-STS330
Srivihok A, Intrapairot A (2016) To be or not be competitive country: analysis of travel and tourism competitiveness index by multiple data mining techniques. In: 2016 6th international workshop on computer science and engineering, WCSE 2016, Tokyo, pp 206–213
Strasser BJ (2012) Data-driven sciences: from wonder cabinets to electronic databases. Stud Hist Philos Sci Part C: Stud Hist Philos Biol Biomed Sci 43:85–87. https://doi.org/10.1016/j.shpsc.2011.10.009
Witten IH, Frank E, Hall MA (2011) Data mining: practical machine learning tools and techniques, 3rd edn. Morgan Kaufmann, Burlington
Wu EHC, Law R, Jiang B (2010) Data mining for hotel occupancy rate: an independent component analysis approach. J Travel Tour Mark 27:426–438. https://doi.org/10.1080/10548408.2010.481585
Xie H, Tang W (2009) Application research of Data Mining in travel agency’s Customer Relationship Management. In: Li Q, Yu F, Liu Y, Russell M (eds) 2009 second international workshop on computer science and engineering. IEEE, Qingdao, pp 464–467
Zaki MJ, Meira W (2014) Data mining and analysis: fundamental concepts and algorithms. Cambridge University Press, New York
Zhao X, Ji K (2013) Tourism e-commerce recommender system based on web data mining. In: 2013 8th international conference on computer science education, pp 1485–1488
23 Content Analysis of Online Travel Reviews
Estela Marine-Roig
Contents
Introduction
Online Travel Reviews as a Data Source for Research in Hospitality and Tourism
Destination Image Analytics Through Online Travel Reviews
Selecting Sources of Online Travel Reviews
Data Collection
Data Arrangement
Content Analysis Procedures
Unit of Analysis
Term Frequency
Categorization
Metrics
Case Study: Impacts of Serious Events on Barcelona’s Destination Image During the Second Half of 2017
Data Collection and Arrangement
Measuring Destination Image Aspects
Descriptive Aspect
Appraisive Aspect
Advisory Aspect
Results and Discussion
Descriptive Aspect
Appraisive and Advisory Aspects
Discussion
Concluding Remarks
Cross-References
References
E. Marine-Roig
University of Lleida, Lleida, Catalonia, Spain
© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_31
Abstract
In the past decade, the rise of social media, the proliferation of user-generated content (UGC), and, consequently, the effect of electronic word-of-mouth (eWoM) have revolutionized marketing. In tourism and hospitality, traveler-generated content (TGC), including travel blogs and especially online travel reviews (OTRs), has become a prominent source of information. OTRs hosted on travel-related websites collect information about visitors’ experiences, opinions, and appraisals of places, attractions, activities, products, and services, which makes it possible to measure visitors’ perceived image, satisfaction, and loyalty. The aim of this chapter is to describe a computerized content analysis of OTRs, from data collection to content analysis. To illustrate the most basic procedures, this study examined a sample of English-language OTRs from TripAdvisor about attractions and activities in Barcelona, Catalonia, between August 17 and December 31, 2017, a period characterized by a terrorist attack and an unrelated independence movement. The results show that these serious events had a minimal impact on the city’s image as perceived and shared by reviewers, despite the enormous amount of media coverage of both events.
Keywords
Traveler-generated content · Electronic word-of-mouth · Online travel agency · Content analysis · Online travel review · TripAdvisor
Introduction
According to O’Reilly (2007), for more than a decade, Web 2.0 companies have relied on users as co-developers and taken advantage of collective intelligence. As the generation and sharing of information between peers, that is, user-generated content (UGC), has become commonplace on the Internet, database-backed websites with dynamic UGC have come to replace static web pages. In that process, social media, all based on Web 2.0, have facilitated the creation and exchange of UGC (Kaplan and Haenlein 2010), which has consequently become essential in today’s information and knowledge societies. In parallel, the profusion and improvement of mobile devices that allow the installation of third-party apps have contributed to the rise of social media and UGC. Social media and UGC have, in turn, given rise to electronic word-of-mouth (eWoM), which encompasses all informal person-to-person communication via online technologies about the use or attributes of brands, products, services, or their vendors or retailers. In general, eWoM is more trusted, believable, and persuasive than commercial information, and its most common forms are online user or consumer reviews and ratings.
In the field of travel, tourism, and hospitality, social media is an important source of online information for travelers and a means to share their experiences (Xiang and Gretzel 2010; Xiang et al. 2015; Volo 2020). The usual UGC formats used by travelers have been travel blogs (Marine-Roig 2010) and online travel reviews (OTRs), the latter of which have grown dramatically in the past decade. For instance, the travel website TripAdvisor.com hosted 10 million OTRs in 2007 (Gretzel and Yoo 2008) and currently features more than 900 million OTRs concerning roughly 8 million travel-related properties, experiences, and services. In fact, the expansive policy of TripAdvisor, Inc. has contributed to the decline of travel blogs, chiefly through the acquisition of two major websites hosting travel blogs (i.e., TravelPod.com and VirtualTourist.com), which it closed in 2017. Although TripAdvisor is the self-proclaimed world’s largest travel site, other important websites also host OTRs. Most of them specialize in travel and lodging and are online travel agencies (OTAs) such as Booking.com, which reports hosting more than 206 million verified reviews from real guests, 25 million destination reviews, and 14 million photos shared by real travelers, along with 4 million travel gurus who share their best travel tips.
Several researchers have demonstrated the usefulness of OTRs in travel planning (Shin and Xiang 2021). For instance, in a web-based survey of 7,000 TripAdvisor users (Gretzel et al. 2007), the 1,480 suitable responses revealed that 97.7% of the Internet users surveyed had read other travelers’ OTRs and used them in their decision-making processes related to past or planned trips. The aspects considered to be extremely or very important were where to stay (77.9%), where to eat (33.6%), what to do (32.5%), where to go (27.0%), and when to go (26.6%). In another online survey with a nationally representative sample of US adults (Analysts 2018), 58.2% of the 2,025 leisure travelers who completed the survey reported that, among the resources and services used to plan their leisure travel, they had referred to UGC: primarily OTRs about hotels, followed by ones about restaurants. Taken together, the findings of both studies indicate that users mostly consult OTRs about lodging, followed by OTRs about dining.
Researchers in travel, tourism, and hospitality have also used UGC as a source of data in their studies (Lu and Stepchenkova 2015; Chen and Law 2016), especially traveler-generated content (TGC), which includes narratives, opinions, pictures, and ratings shared on social media and based on visitors’ experiences of traveling, sightseeing, entertaining, shopping, lodging, and dining in a tourist destination (Marine-Roig and Huertas 2020). TGC joins the online tourism and hospitality ecosystem via eWoM (Xiang et al. 2017). OTRs have received priority attention from researchers because they are highly available, are a relatively structured type of TGC, and include interesting information about the tourist destination image (TDI) perceived by visitors and the evaluation of their experiences at destinations.
Given the importance of OTRs for users and researchers alike, this study aimed to offer an overview of the subject from a practical perspective, with special attention to the automation of data collection, extraction, and arrangement and to the content analysis of OTRs. The research questions addressed where OTRs are, how they can be collected, how they can be arranged, what information they contain, how useful data in them can be extracted, and how insights into their content can be gained. To answer those questions by demonstration, this study performed a content analysis of a sample of English-language OTRs published between August 17 and December 31, 2017, a period characterized by a terrorist attack followed by an unrelated secessionist process, with the aim of determining the impact of these serious events on the online destination image of Barcelona, Catalonia.
Online Travel Reviews as a Data Source for Research in Hospitality and Tourism
OTRs can be defined as narratives, opinions, pictures, and ratings posted on travel-related websites by users or consumers and based on their travel-related experiences with places, attractions, and tourism-related activities, products, and services. This section demonstrates their usefulness as a source of big data for research on travel, tourism, and hospitality. OTRs influence the process by which potential travelers choose a destination, attraction, or service because these consumers consider demand-side information to be more reliable than that emanating from destination or resource promoters (Gretzel 2022). Users can filter reviews by areas such as traveling (e.g., flights, cruises, rental cars, and shuttles), sightseeing (e.g., top attractions to visit, best outdoor activities, best day trips, and most popular things to do), accommodations (e.g., hotels, hostels, resorts, and P2P lodgings), and dining (e.g., restaurants, pubs, and bakeries). Within each area, there may be filters according to popularity, recency, specialties or amenities, cost, and so forth.
Figure 1 shows an outline of the information that is usually presented in OTRs. Basically, the reviews contain textual elements (i.e., narratives and opinions of reviewers) as well as paratextual elements (e.g., visual components, dates and ratings of experiences, profile of reviewers, and types of trips). There may also be options for interaction with readers, such as assessing the usefulness of the review (“helpful” votes) and asking the reviewer for more information about the experience. Paratextual elements can be generated by reviewers, webmasters/platforms, or both (Marine-Roig 2017a) and are useful for readers and researchers to delimit the context of reviews and to learn about the characteristics of reviewers. Another source of information for researchers, which is not visible to readers, is the HTML metadata and code (Marine-Roig 2017b) intended for search engines and Internet browsers.
Fig. 1 Schematic sample of an online travel review, showing the review date, title, username, reviewer’s city and country, the reviewer’s numbers of collaborations and helpful votes, the review text, photos, the date of stay or visit, and a “Helpful?” option
Ratings range from one to five bubbles (TripAdvisor) or stars (Yelp), or from 1 to 10 points (Booking). The vast majority of scores tend to be positive, especially in the case of Airbnb (Marine-Roig 2019, 2021). Scores can be global or dedicated to specific resource attributes. For example, a reviewer can separately score the food, service, atmosphere, and value for money in restaurants, and the cleanliness, location, staff, and amenities in hotels.
Since the seminal work developed in the Laboratory for Intelligent Systems in Tourism (Gretzel et al. 2007; Gretzel and Yoo 2008), investigations involving the content analysis of OTRs have increased considerably. Following the trends of users, most published studies have addressed OTRs about accommodation (Muritala et al. 2020) and collected data from TripAdvisor (Chen and Law 2016; Zarezadeh et al. 2018). Table 1 displays a sample of research on travel, tourism, and hospitality that used OTRs as a data source and shows that about two-thirds of the studies addressed the accommodation sector.
Table 1 Sample of studies on online travel reviews in hospitality and tourism
Research domain   Schuckert et al. (2015)   Kwok et al. (2017)   Hlee et al. (2018)   Sum
                  N      %                  N      %             N      %             N      %
Lodging           30     60.0               47     72.3          35     63.6          112    65.9
Dining            9      18.0               8      12.3          8      14.5          25     14.7
Travel/Tour       11     22.0               10     15.4          12     21.8          33     19.4
Destination Image Analytics Through Online Travel Reviews
TDI analysis deserves special mention because it has been one of the most studied topics in the field of tourism for more than 50 years, from the image of cities (Lynch 1960) and regions (Gunn 1972) to destinations in general (Hunt 1975). Most of the conceptual models for analyzing TDIs were demonstrated through surveys (Yilmaz and Yilmaz 2020). Recently, Lin et al. (2021) demonstrated the usefulness of combining two data sources, TGC and surveys, to deduce TDIs. From the perspective of obtaining useful data for research on TDI through OTRs, Marine-Roig (2019, 2021) proposed a TDI analytics framework designed to make the most of TGC. In short, the model consists of three hierarchically interrelated aspects of the TDI, namely, descriptive, appraisive, and prescriptive. The descriptive aspect includes the structure/form and facilities of the tourist attributes and resources, located in space and time. The appraisive aspect has two dimensions: evaluative (i.e., rating scale from worst to best) and affective (i.e., feelings and moods of tourists). Finally, the prescriptive aspect has two dimensions: behavioral (e.g., intention to visit or revisit a place) and attitudinal (e.g., recommending or advising against visiting a place). In the case of OTRs, the descriptive aspect allows identifying tourism resources, products, or services and placing them in time and space. Secondary to the descriptive aspect, the appraisive and prescriptive (advisory) aspects allow developing a sentiment analysis. In particular, the evaluative dimension consists of opinions and judgments expressed in standardized scores, usually on a scale from one to five stars or bubbles. By contrast, the affective dimension encompasses emotional responses conveyed in expressions of feelings and moods. The feelings, moods, and recommendations can have a positive or negative polarity.
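The appraisive and prescriptive dimensions described above can be illustrated with a minimal Python sketch. It is not the framework's implementation: the word lists, the sample reviews, and the polarity formula (positive minus negative hits, divided by total hits) are assumptions introduced only for illustration.

```python
import re
from statistics import mean

# Hypothetical mini-lexicons (assumptions for illustration only).
FEEL_POS = {"amazing", "beautiful", "love", "wonderful"}
FEEL_NEG = {"crowded", "dirty", "boring", "overpriced"}
ADVICE_POS = {"recommend", "must"}
ADVICE_NEG = {"avoid", "skip"}

def polarity(text, positive, negative):
    """Return a score in [-1, 1]; 0.0 when no lexicon word occurs."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)

# Hypothetical reviews: a rating (evaluative dimension) plus free text.
reviews = [
    {"rating": 5, "text": "Amazing views, we love it and recommend it"},
    {"rating": 2, "text": "Crowded and overpriced, better to avoid"},
]

evaluative = mean(r["rating"] for r in reviews)                         # scores
affective = [polarity(r["text"], FEEL_POS, FEEL_NEG) for r in reviews]  # feelings
advisory = [polarity(r["text"], ADVICE_POS, ADVICE_NEG) for r in reviews]
```

In this sketch the evaluative dimension is read from the paratextual score, whereas the affective and attitudinal dimensions are estimated from the review text.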
Selecting Sources of Online Travel Reviews
The Internet hosts a massive amount of TGC, which makes its manual analysis virtually impossible. Consequently, minimally structured information is needed to automate the collection and processing of data. OTRs with similar structures can be obtained on travel websites such as TripAdvisor.com, from subsidiaries and affiliates of TripAdvisor, Inc., and on metasearch engines such as Kayak.com. Other metasearch engines that show OTRs, such as Google Hotel Finder and trivago.com, specialize in the accommodation sector. In the travel and hospitality sectors, the major online travel agencies (OTAs) on the Nasdaq-100 (Booking.com, Expedia.com, and Ctrip.com) stood out before the COVID-19 pandemic. Many OTRs are also generated in the homestay sector, in which Airbnb.com acts as a broker that charges a fee to providers of accommodations. The choice of one or more travel websites or OTAs from which to collect OTRs depends on the case study. According to Marine-Roig (2014), the selection of the most suitable sources of OTR data depends primarily on three web rankings based on visibility (V) in terms of quantity and quality of inbound links; popularity (P) as the number of unique visitors, visits, and traffic in general; and size (S) as the number of OTRs related to the case study. Such webometrics can be combined in a weighted formula (1) that aggregates rankings, inspired by de Borda’s (1781) count (B):
\[ \mathrm{TBRH} = 1 \cdot B(V) + 1 \cdot B(P) + 2 \cdot B(S) \tag{1} \]
The calculation of the most suitable source is done in two phases. First, three full lists ‘L’ are constructed, ordered by the variables ‘V’, ‘P’, and ‘S’, each containing all the competing OTAs. Then, the function ‘B’ assigns a score to each candidate ‘c’, which consists of the number of candidates ranked below ‘c’ in ‘L’. Once these operations have been carried out, the candidates are sorted in descending order of total score.
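The two phases can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the site names and metric values are hypothetical, higher values are assumed to mean a better rank for all three variables, and ties are broken by input order, a detail the text does not specify.

```python
def borda_scores(values):
    """Borda score of each candidate: the number of candidates ranked below it."""
    # Full list 'L', best candidate first (higher metric value = better rank).
    ranked = sorted(values, key=values.get, reverse=True)
    n = len(ranked)
    return {candidate: n - 1 - position for position, candidate in enumerate(ranked)}

def tbrh(visibility, popularity, size, weights=(1, 1, 2)):
    """Weighted sum of Borda scores as in formula (1), sorted in descending order."""
    b_v, b_p, b_s = (borda_scores(m) for m in (visibility, popularity, size))
    w_v, w_p, w_s = weights
    totals = {c: w_v * b_v[c] + w_p * b_p[c] + w_s * b_s[c] for c in visibility}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical webometric values for three candidate sources.
visibility = {"site_a": 900, "site_b": 500, "site_c": 100}   # inbound links
popularity = {"site_a": 50, "site_b": 80, "site_c": 20}      # visitors
size = {"site_a": 1000, "site_b": 4000, "site_c": 3000}      # OTRs for the case

ranking = tbrh(visibility, popularity, size)
# ranking == [("site_b", 7), ("site_a", 3), ("site_c", 2)]
```

Because size carries a weight of 2, the hypothetical site_b, which hosts the most OTRs for the case study, ends up first even though site_a has the highest visibility.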
Data Collection
After the most appropriate web sources have been selected, data collection begins. In a literature review about online UGC, Ukpabi and Karjaluoto (2018) found that most researchers collected data manually or with software programs (e.g., spiders and crawlers) that they had developed themselves. Currently, numerous web crawling and scraping tools allow researchers to obtain data from websites; they include commercial computer applications (e.g., Data Toolbar, Mozenda, and Diffbot), open-source Python libraries (e.g., Beautiful Soup and Scrapy), open-source Java libraries (e.g., Apache Nutch and Heritrix), and offline browsers (e.g., the free HTTrack Website Copier and the commercial Offline Explorer from MetaProducts). In all cases, it is essential to configure these tools according to the results of an accurate analysis of the content and hyperlink network of the webhost (Liu 2011; Marine-Roig and Anton Clavé 2016a). Otherwise, incorrect settings can overload the local computer.
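The importance of configuring a crawler from a prior analysis of the site's hyperlink network can be illustrated with a minimal sketch that uses only the Python standard library rather than any of the tools named above. The URLs, the in-memory `SITE` dictionary, and the `fetch` callable are hypothetical; a real crawl would replace `fetch` with an HTTP client that respects the website's terms of use and rate limits.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkParser(HTMLParser):
    """Collects the href values of <a> tags found in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def crawl(start_url, fetch, allow_pattern, max_pages=100):
    """Breadth-first crawl limited to URLs matching allow_pattern and to max_pages."""
    allowed = re.compile(allow_pattern)
    queue, seen, pages = deque([start_url]), {start_url}, {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        page = fetch(url)
        if page is None:          # page missing or not retrievable
            continue
        pages[url] = page
        parser = LinkParser()
        parser.feed(page)
        for href in parser.links:
            link = urljoin(url, href)   # resolve relative links
            if link not in seen and allowed.search(link):
                seen.add(link)
                queue.append(link)
    return pages

# Hypothetical in-memory website standing in for real HTTP requests.
SITE = {
    "https://example.org/reviews/p1": '<a href="/reviews/p2">next</a><a href="/profile/u1">user</a>',
    "https://example.org/reviews/p2": '<a href="/reviews/p1">prev</a><a href="https://other.example/ad">ad</a>',
    "https://example.org/profile/u1": "<p>profile</p>",
}
pages = crawl("https://example.org/reviews/p1", SITE.get,
              r"^https://example\.org/reviews/")
```

The URL filter and the page cap are the two settings that keep the download bounded: without them, the crawler would follow profile pages, advertisements, and external links indefinitely.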
Data Arrangement
OTRs are composed of a text and some paratextual elements that allow their systematic arrangement (Marine-Roig 2017a). The basic classification of reviews is by area (travel, tourism, and hospitality): things to do and see (e.g., attractions and activities), lodging (e.g., hotels, hostels, apartments, and homestays), dining (e.g., restaurants, bars, and pubs), and traveling (e.g., flights, cruises, and rental cars). Other paratextual elements allow the segmentation of OTRs by language, dates of experience, dates of publication (Fazzolari and Petrocchi 2018), reviewer’s nationality, travel purpose, and the type or class and location of the tourism resource, among other things. Researchers encounter three sources of data on OTR web pages: text and photos posted by the reviewer (Mak 2017; Lojo et al. 2020), paratextual elements generated by the webhost (Marine-Roig 2017a), and HTML metadata for web browsers and search engines (Marine-Roig 2017b). The subsequent preprocessing of information requires separating and organizing the textual and paratextual elements of OTRs downloaded from the website. Once the HTML entities have been decoded (e.g., from &#39; to '), a search-and-replace utility that supports regular expressions in search patterns (e.g., Notepad++) can be used to extract these elements. The same utility can be used to give a standard format (ISO: International Organization for Standardization) to dates (ISO 8601: YYYY-MM-DD), languages (ISO 639-1: en, zh), and countries (ISO 3166-1: gb, us, cn). A specific application is needed to standardize locations (ISO 6709); for example, the Google Maps Geocoding API converts addresses into geographical coordinates. If the language is not indicated in OTRs, then open-source libraries can facilitate the classification; a good example is Nakatani Shuyo’s Language Detection Library for Java, which detects more than 50 languages and is expandable through Wikipedia’s corpus.
A structured arrangement of OTRs can be achieved with two comma-separated values (CSV) files linked by a field that contains a common code. A CSV file for resources contains the fields: resource code, name, location, global score, scope, type or class, and others, if any. A second file for OTRs contains the fields: resource code, language, reviewer’s profile (e.g., username, nationality, and other), date, score, title, and text. The CSV files are plain text, can be manipulated with any text editor, and are compatible with spreadsheets and database systems.
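The preprocessing and arrangement steps above can be sketched with the Python standard library in place of a search-and-replace utility. The HTML snippet, its class names, the field names, and the resource record are hypothetical; real review pages have a different and more complex markup, and the month name is assumed to be in English.

```python
import csv
import html
import io
import re
from datetime import datetime

# Hypothetical downloaded fragment (markup and class names are assumptions).
RAW = (
    '<div class="review" data-resource="R001">'
    '<span class="date">August 17, 2017</span>'
    '<span class="lang">English</span>'
    '<p class="text">Don&#39;t miss the park &amp; museum</p></div>'
)
PATTERN = re.compile(
    r'data-resource="(?P<code>[^"]+)".*?'
    r'class="date">(?P<date>[^<]+)<.*?'
    r'class="lang">(?P<lang>[^<]+)<.*?'
    r'class="text">(?P<text>[^<]+)<',
    re.S,
)
LANG_CODES = {"English": "en", "Chinese": "zh"}  # ISO 639-1

def parse_review(raw):
    """Extract paratextual and textual elements and standardize their formats."""
    m = PATTERN.search(raw)
    iso_date = datetime.strptime(m["date"], "%B %d, %Y").date().isoformat()  # ISO 8601
    return {"resource_code": m["code"], "language": LANG_CODES[m["lang"]],
            "date": iso_date, "text": html.unescape(m["text"])}

def to_csv(rows):
    """Serialize a list of dictionaries as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

review = parse_review(RAW)
resources_csv = to_csv([{"resource_code": "R001", "name": "Example Park",
                         "type": "attraction"}])
reviews_csv = to_csv([review])

# Link the two files through the common resource code.
names = {r["resource_code"]: r["name"] for r in csv.DictReader(io.StringIO(resources_csv))}
linked = [(names[r["resource_code"]], r["date"])
          for r in csv.DictReader(io.StringIO(reviews_csv))]
```

Keeping resources and reviews in separate files avoids repeating the resource attributes in every review row; the common code plays the role of a key when the files are loaded into a spreadsheet or database.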
Content Analysis Procedures Content analysis comprises a series of procedures aimed at transforming diverse and unstructured information into a format that allows analysis (GAO 1989). Table 2 presents four classic definitions of content analysis from different perspectives. In the content analysis of written texts, the basic analytical technique is counting terms. Subtypes of text content analysis focus on the occurrence of themes (i.e., thematic text analysis), the relationships between themes within sentences (i.e., semantic text analysis), and the locations of themes or sentences, if not both, within networks of interrelated themes (i.e., network text analysis) (Roberts 2001). In parallel to the spectacular growth of UGC as a rich source of subjective information, methods of computerized content analysis have been developed. In fields involving artificial intelligence (Ku et al. 2019) and soft computing (Kumar
Table 2 Sample of classic definitions of content analysis
Author | Definition
Berelson (1952) | A research technique for the objective, systematic, and quantitative description of the manifest content of communication (p. 18)
Krippendorf (2004) | A research technique for making replicable and valid inferences from texts (or other meaningful matter) to the contexts of their use (p. 18)
GAO (1989) | A set of procedures for collecting and organizing information in a standardized format that allows analysts to make inferences about the characteristics and meaning of written and other recorded material (p. 6)
Roberts (2001) | A class of techniques for mapping symbolic data into a data matrix suitable for statistical analysis (p. 2697)
Table 3 Sample of computer tools used for analyzing online travel reviews
Software | Type | Research | Goal | Resource | Source | Dataset
NVivo | QDA | (Mate et al. 2019) | Negative OTRs | Hotels | TripAdvisor | 57
CoreNLP | NLP | (Perikos et al. 2018) | Opinion mining | Hotels | Booking | 1682
KH Coder | FA | (Murakami 2018) | Text mining | Attractions | TripAdvisor | 5404
Word2vec | ML | (Li et al. 2018a) | Sentiment lexicon | All tourist products | Ctrip, Qunar | 30,180
NLTK | NLP | (Hou et al. 2019) | Opinion mining | All tourist products | Ctrip, Tuniu, Tongcheng | 165,429
Own algorithm | FA | (Marine-Roig et al. 2019) | Gastronomic image | Restaurants | TripAdvisor | 500,000
OpenCoDa | MSA | (Lalicic et al. 2021) | Tourism design | P2P lodging | Airbnb | 811,235
Weka | ML | (Lee et al. 2018) | Helpfulness | Hotels | TripAdvisor | 1,170,246
OpenNLP | NLP | (Guy et al. 2017) | Travel tips | Attractions | TripAdvisor | 3,362,296
LIBSVM | SVM | (Martin-Fuentes et al. 2018) | Star rating | Hotels | Booking | 18,710,881
Note. QDA qualitative data analysis, NLP natural language processing, FA frequency analysis, MSA multivariate statistical analysis, ML machine learning, SVM support vector machine
and Jaiswal 2019), researchers have used supervised machine learning algorithms (e.g., regression analysis, support vector machines, decision trees, naïve Bayes, and k-nearest neighbors), unsupervised machine learning algorithms (e.g., fuzzy c-means and k-means clustering), neural networks, evolutionary computing, fuzzy logic, probabilistic reasoning, and natural language processing. Table 3 presents a sample of computer tools that researchers have used to analyze OTRs. The area of research that has stood out most in literature on social media content analysis is opinion mining, or subjectivity analysis, particularly sentiment analysis (Li et al. 2018a; Alaei et al. 2019). Sentiment analysis has been elaborated at the level of not only the term, sentence, paragraph, and document but also the aspect, namely, in aspect-based sentiment analysis. For instance, Do et al. (2019) categorized more than 40 proposals for aspect-based sentiment analysis according to their main architecture and classification tasks. In relation to OTRs, aspects are distinguished based on the subject or target (e.g., an attraction, hotel, or restaurant). For example, hotels have some different evaluable features than museums.
Unit of Analysis According to Prasad (2008), before proceeding to content analysis, analysts need to answer two interrelated questions: what unit of analysis will be selected to classify the information, and what enumeration system will be used for its quantification?
The unit of analysis can be a character (e.g., letter, digit, or symbol), a word, a phrase, a theme, an entire document, a news story, or even an entire film, among other things. By extension, methods of enumeration can involve measures of space (e.g., area occupied in the columns of a newspaper), amount of time (e.g., air time on radio or television), the frequency of units (e.g., number of times that a keyword or key phrase appears in a body of text), and level of intensity (e.g., in terms of font type, color, and size). Gauging the intensity level allows a far more sensitive data analysis: spaces, times, and frequencies are still counted, but each unit of analysis is adjusted by a weight that measures its relative intensity. The system for quantifying the units of analysis is part of a process known as feature extraction, which, according to Alaei et al. (2019), involves “building or deriving a set of discriminative, informative and non-redundant values from a set of data, which eventually facilitates the learning process” (p. 179). Term frequency is one of the most common ways to extract this set of values.
Term Frequency In this way of extracting data from OTRs, the term is the minimum unit of content analysis, understood as a single keyword (e.g., Barcelona, distressed, wonderful, and pickpocket) or as a group of consecutive words that together mean what the words alone do not (e.g., New York, never disappoints, not so nice, and off-putting). To calculate other metrics, including the popularity and assessment of resources and their spatial and temporal distribution, the unit of analysis is the entire OTR. Counting the occurrence of terms deserves special mention as the most common method of measuring the content in written texts. In any text, the most frequent terms used are perceived, produced, and remembered faster and more effectively than low-frequency terms (Brysbaert and New 2009). Analysts assume that the terms mentioned most often are those that reflect the greatest concerns in each text (Stemler 2001). Marine-Roig (2019) proposed a pseudocode algorithm that can be used to calculate the frequency of terms in a text. The algorithm requires a list of terms composed of two or more consecutive words included in the keyword categories and lexicons used in the content analysis, as well as a list of nonsignificant words in the case study (e.g., adverbs, conjunctions, determiners, prepositions, and pronouns). Lists of stop words in several languages are available from the literature, e.g., from Neuchâtel University (UniNe 2020), and can be adapted to a researcher’s needs. To convert the analyzed text into a word list, the aforementioned algorithm considers the non-letter characters as word separators. Because the algorithm is case-insensitive in data searches and comparisons, it works with lowercase text. Regarding the processing of inflections, two techniques are possible: including all of the variants of a word in the keyword categories and lexicons or reducing all words to their stems. In languages with highly inflected forms, it is advisable to work with stems. 
For example, the adjective beautiful has several inflections in German: schön, schöne, schönem, schönen, schöner, and schönes. The process
of stemming in different languages requires an auxiliary library such as that of Snowball (Porter 2021). In that case, the stemming function of the algorithm delegates the task of generating the list of stems to Snowball’s library. The algorithm loads the text into memory, along with the list of terms composed of two or more words and the list of nonsignificant words, and assumes that both lists contain lowercase text. The algorithm then runs as a single sequence of six steps. First, it converts the text into lowercase text. Second, it extracts the compound terms in the text and stores them with their frequency in the results table. Third, it splits the text into words via tokenization. Fourth, it removes nonsignificant words. Fifth, it strips affixes and reduces words to their stems. Sixth and last, it stores the stems with their frequencies in the results table. The fifth step is unnecessary in languages such as English with few inflections; in that case, the sixth step involves storing keywords directly with their frequencies. If two terms overlap, then the algorithm prioritizes the compound terms. For example, “not friendly” has priority over the stop word “not” and the keyword “friendly.” If two compound terms overlap, then the first one on the list takes priority.
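The six steps can be sketched in Python as follows. This is an illustrative reading of the description above, not the pseudocode published in Marine-Roig (2019): the function name, the regular expressions, and the optional `stem` hook (which a stemming library could supply) are assumptions.

```python
import re
from collections import Counter
from typing import Callable, Iterable, Optional

LETTERS = r"[^\W\d_]"  # any Unicode letter


def term_frequencies(
    text: str,
    compound_terms: Iterable[str],
    stop_words: Iterable[str],
    stem: Optional[Callable[[str], str]] = None,
) -> Counter:
    """Count compound terms first, then the remaining significant words."""
    counts: Counter = Counter()
    remaining = text.lower()  # step 1: lowercase

    # Step 2: compound terms in list order, so earlier entries win overlaps.
    for term in compound_terms:
        pattern = re.compile(
            rf"(?<!{LETTERS}){re.escape(term)}(?!{LETTERS})"
        )
        remaining, hits = pattern.subn(" ", remaining)
        if hits:
            counts[term] += hits

    # Step 3: tokenize, treating non-letter characters as separators.
    words = re.findall(rf"{LETTERS}+", remaining)

    stop = set(stop_words)
    for word in words:
        if word in stop:  # step 4: drop nonsignificant words
            continue
        # Steps 5 and 6: optional stemming, then store the frequency.
        counts[stem(word) if stem else word] += 1
    return counts


sample = ("Not friendly staff, but Barcelona never disappoints. "
          "Barcelona is not so nice in August; friendly people though.")
freq = term_frequencies(
    sample,
    compound_terms=["not friendly", "never disappoints", "not so nice"],
    stop_words={"but", "is", "in", "not", "so", "though"},
)
```

Removing each matched compound term from the text before tokenization is what gives “not friendly” priority over the stop word “not” and the keyword “friendly”; the second, standalone “friendly” in the sample is still counted as a keyword.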
Categorization A central idea of textual content analysis is to classify many terms into a few categories (Weber 1990). Categories are structures that allow the grouping of recorded units of analysis with similar meaning or connotations. Together with frequency counts, categorization is a crucial procedure in content analysis, and in most studies in tourism published from 1986 to 2012, researchers used the categorization of data (Camprubi and Coromina 2016). Indeed, according to Berelson (1952), the success or failure of any content analysis depends on having clearly formulated categories that are well adapted to problems and content. Categories have to be exhaustive and mutually exclusive (Stemler 2001). The criterion of exhaustiveness is met when all relevant units of analysis can be assigned to any of the categories. By contrast, categories are mutually exclusive when no item can be assigned to more than one category. However, in the era of big data, which is marked by the automated processing of millions of terms, it is virtually impossible to fully meet those classic requirements. After all, it is difficult for an algorithm to detect ironic or sarcastic comments, rhetorical questions, paradoxical phrases, polysemic terms, idioms, and typographical alterations, to name just a few sources of ambiguity. Furthermore, the categorization can be applied to examine a specific aspect of the destination or tourism resource studied. For example, the categories of crowdedness and dirtiness can allow detecting problems of sustainability in a destination (Marine-Roig 2019). Categories can be built by way of a priori coding, emergent coding, or a combination of both techniques. On the one hand, a priori coding supposes the agreement of researchers based on some theory that allows the establishment of categories before analysis. For example, in sentiment analysis, a category of positive feelings can be formed by adjectives that generally reflect positive sentiments or moods. 
The
same category can also be generated automatically from seeds (e.g., “excellent” and “ecstatic”) and a thesaurus by using a recursive propagation algorithm. That algorithm builds a tree from the root node; first-level nodes contain the synonyms of the seed adjective, whereas second-level nodes successively contain the synonyms of the synonyms without any duplicates. That procedure allows assigning a weight to each adjective of the category based on the distance between its node and the root of the tree. By the same procedure, a researcher can generate a category of negative feelings as well, and both categories can be cross-checked by means of an antonyms dictionary. In relation to the example categories of positive and negative feelings, numerous web-based services (Serrano-Guerrero et al. 2015) and countless studies (Mäntylä et al. 2018) are available to facilitate sentiment analysis. For example, the Vrije Universiteit van Amsterdam (VU 2014) provides a Python library to generate sentiment lexicons in several languages. On the other hand, in emergent coding researchers construct categories from checklists of terms arising from preliminary data analysis. Continuing with the abovementioned example, the category of positive feelings can be initially formed with the most frequent terms with positive meaning that appear in the titles of TripAdvisor OTRs rated with five bubbles (i.e., excellent). Likewise, the category of negative feelings can be formed with the most frequent negative terms in OTRs rated with one bubble (i.e., terrible). It is convenient to start with OTR titles, for they concentrate the most significant terms, present an overview of the OTR, and give a preview of the reviewer’s opinion (Marine-Roig 2017a; Marine-Roig and Ferrer-Rosell 2018). That construction process proceeds iteratively until acceptable levels of reliability in the classification of terms according to their polarity are achieved. 
In both example categories for sentiment analysis, terms composed of two or more words that represent a polarity change from positive to negative (e.g., “not so nice”) or from negative to positive (e.g., “never disappoints”) should be considered. At the same time, the combination of a priori coding and emergent coding can improve compliance with the requirements for content categories, although reliability in terms of stability, reproducibility, and accuracy needs to be confirmed (Krippendorf 2004). Stability is calculated as the degree of change in the process and results over time, whereas reproducibility, or intercoder reliability, can be calculated as the degree to which a process is replicable by different analysts under similar conditions. Last, accuracy can be calculated as the degree of compliance with the specifications and purposes of the process. Neuendorf (2017) has argued that reliability of at least 90% is acceptable in any case and that at least 80% reliability is acceptable in most situations. Lower degrees of reliability are especially acceptable in exploratory studies.
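The seed-propagation procedure described earlier in this section can be sketched as a breadth-first traversal of a thesaurus. The text describes a recursive algorithm; the iterative queue used here is a swapped-in equivalent that visits the same levels. The toy thesaurus, the depth limit, and the weighting rule 1 / (1 + distance) are assumptions chosen for illustration, since the text only states that the weight depends on the distance to the root.

```python
from collections import deque


def propagate(seed: str, thesaurus: dict[str, list[str]],
              max_depth: int = 2) -> dict[str, int]:
    """Expand a seed adjective level by level; return term -> distance."""
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        term = queue.popleft()
        if distance[term] == max_depth:
            continue  # do not expand beyond the last level
        for synonym in thesaurus.get(term, []):
            if synonym not in distance:  # skip duplicates already in the tree
                distance[synonym] = distance[term] + 1
                queue.append(synonym)
    return distance


# Hypothetical thesaurus entries, for illustration only.
toy_thesaurus = {
    "excellent": ["superb", "outstanding"],
    "superb": ["excellent", "magnificent"],
    "outstanding": ["superb", "remarkable"],
}

levels = propagate("excellent", toy_thesaurus)
# Assumed weighting: terms closer to the seed weigh more.
weights = {term: 1 / (1 + depth) for term, depth in levels.items()}
```

Running the same function on a negative seed would yield the negative category, which could then be cross-checked against the positive one.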
Metrics Reviews contain structured or semi-structured information (e.g., date, place, language, rating) and unstructured information (e.g., text, photos). The first allows
statistical analysis using the standards discussed in Section “Data Arrangement”. In contrast, data extracted from unstructured information requires normalization prior to analysis. Considering the example of TripAdvisor as the website that hosts the largest quantity and diversity of OTRs, the most basic general metric is the number of tourism resources and OTRs in each section (i.e., Things to Do, Restaurants, and Hotels and Places to Stay) and language. Other general metrics derive from the territorial distribution of resources, their popularity by rank, their percentage of the total number of OTRs, and the temporal distribution of OTRs by dates of experience and publication. Within each section, multiple classifications allow accounting for resources by subtypes. For example, in the Things to Do section are sights, landmarks, points of interest, museums, neighborhoods, and sightseeing tours, among other things; in the Restaurants section are region of cuisine (e.g., Mediterranean, European, American, and Asian), country of cuisine (e.g., Spanish, Chinese, Italian, and Japanese), and kind of food (e.g., seafood, steak house, vegetarian, and fast food); and in the Hotels and Places to Stay section are property type (e.g., hotel, B&B or inn, and hostel) and hotel class (i.e., from one to five stars). Continuing the previous example, reviewers from TripAdvisor evaluate tourism resources on a 5-point scale: five bubbles (i.e., excellent), four bubbles (i.e., very good), three bubbles (i.e., average), two bubbles (i.e., poor), and one bubble (i.e., terrible). The first two scores are positive and give rise to the first metric of the evaluative dimension: percentage of OTRs with positive scores (score+). The last two are reflected in the second metric: percentage of OTRs with negative scores (score-). 
The third metric of the evaluative dimension is the weighted average of scores on a scale from 0 to 100, according to conversion formula (2), where N is the total number of OTRs and N_k is the number of OTRs rated with k bubbles:
\[ \mathrm{avgScore} = \frac{100\,N_5 + 75\,N_4 + 50\,N_3 + 25\,N_2 + 0\,N_1}{N} \tag{2} \]
Since the categories, resources, and the OTRs themselves can have different sizes, in the case of metrics derived from the frequency of terms, normalization can be achieved by calculating the percentage of terms in relation to the total number of words in each text, including “stop words.”
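As a worked illustration of the three evaluative metrics, the short Python sketch below computes score+, score-, and avgScore from the number of OTRs per bubble rating. The function name and the sample counts are hypothetical.

```python
def evaluative_metrics(counts: dict[int, int]) -> dict[str, float]:
    """Compute score+, score-, and avgScore from OTR counts per rating (1-5)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one review is required")
    # Conversion of the 5-point scale to 0-100, as in formula (2).
    weights = {5: 100, 4: 75, 3: 50, 2: 25, 1: 0}
    avg = sum(w * counts.get(k, 0) for k, w in weights.items()) / total
    positive = 100 * (counts.get(5, 0) + counts.get(4, 0)) / total
    negative = 100 * (counts.get(2, 0) + counts.get(1, 0)) / total
    return {"score+": positive, "score-": negative, "avgScore": avg}


# Hypothetical resource with 100 reviews.
metrics = evaluative_metrics({5: 60, 4: 20, 3: 10, 2: 6, 1: 4})
```

For the sample counts, 80% of the reviews are positive, 10% are negative, and the weighted average is (6000 + 1500 + 500 + 150 + 0) / 100 = 81.5.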
Case Study: Impacts of Serious Events on Barcelona’s Destination Image During the Second Half of 2017 Barcelona, the capital city of Catalonia, is a leading smart city and an outstanding Mediterranean destination (Marine-Roig and Anton Clavé 2015). On August 17, 2017, a terrorist attack was perpetrated in the heart of the city (Bolon et al. 2017), a few hours after which the Catalan police killed five members of the terrorist cell responsible for the attack. The previous day, two other members of the cell had died in an accidental explosion. Coincidentally, the months following the attack were characterized by an independence movement known as “el procés” (the process), marked most notably by the Catalan independence referendum held on October
[Figure: bar chart of overnight stays in thousands (y-axis, 0 to 2500) by month, September to December (x-axis), with series for 2016, 2017, and 2018]
Fig. 2 Overnight stays (in thousands) in hotels in Barcelona during the last quarter of 2016–2018 (IdEsCat 2021)
1, 2017 (Dewan et al. 2017). When the Spanish government removed the Catalan government and dissolved the parliament (Jones et al. 2017), pro-independence social and political leaders left the country or entered prison under charges of sedition and rebellion (BBC News 2017). Scholars have analyzed the impact of the terrorist attack on tourism (Huertas and Oliveira 2019), the period of political instability (Perles-Ribes et al. 2019), and the confluence of both events on safety as perceived by visitors (Marine-Roig and Huertas 2020) and the reactions of residents (Gray 2019). Since this case study analyzes only a sample of tourists (English-speaking visitors), it is useful to compare the impact of the events on this segment with their impact on the total number of tourists over the same period. To that end, Fig. 2 shows the impact of the events on the influx of tourists during the last quarter of 2017, including an approximately 10% decrease in overnight stays in hotels in Barcelona, according to official data published by the Statistical Institute of Catalonia (IdEsCat 2021).
Data Collection and Arrangement Using the website selection formula (Section “Selecting Sources of Online Travel Reviews”), TripAdvisor ranked first, far ahead of other travel-related websites. For example, TripAdvisor currently hosts more than 160,000 OTRs and nearly 120,000 photographs of the Basilica of La Sagrada Família in Barcelona, in contrast to the few hundred reviews hosted on other websites. Figure 3 shows the 116,298 English-language TripAdvisor OTRs about things to do and see published between August 17 and December 31 for each fortnight in the years 2013–2017. It shows that the number of OTRs had grown in the years prior to the terrorist attack and independence movement and that, in the year of those events, it fell by
[Figure: number of OTRs (y-axis, 0 to 40,000) by year, 2013 to 2017 (x-axis), with series for the fortnights 08-2 through 12-2]
Fig. 3 Number of English-language OTRs between August 17 and December 31 by year and fortnight, 2013–2017
almost 30%. That percentage nearly tripled the decrease in overnight stays (Fig. 2). Figure 2 represents all visitors, whereas Fig. 3 represents English speakers only. Taken together, the impact of the events on English-speaking tourists was greater than that on other groups.
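The fortnight labels used in the figures (e.g., 08-2 for the second half of August) can be derived from review dates with a small helper. This is a sketch under the assumption that the first fortnight of a month covers days 1 to 15 and the second covers the rest; the text does not state the exact boundary, and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def fortnight_label(day: date) -> str:
    """Return 'MM-1' for days 1-15 and 'MM-2' for later days (assumed split)."""
    half = 1 if day.day <= 15 else 2
    return f"{day.month:02d}-{half}"


def percent_change(previous: float, current: float) -> float:
    """Relative change from a previous value to a current one, in percent."""
    return 100 * (current - previous) / previous


# Hypothetical publication dates of four reviews.
dates = [date(2017, 8, 17), date(2017, 8, 30),
         date(2017, 9, 3), date(2017, 10, 1)]
per_fortnight = Counter(fortnight_label(d) for d in dates)
```

Counting labels per year in this way produces the series needed for a year-over-year comparison, and `percent_change` expresses a drop such as the one discussed above as a negative percentage.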
Measuring Destination Image Aspects In order to analyze, through the OTRs, the impact of the events on the online TDI of Barcelona, the study measures some dimensions of the aspects of the TDI framework seen in Section “Destination Image Analytics Through Online Travel Reviews” and compares them with the same dimensions in relation to the matching period of previous years. Thus, it distinguishes three aspects of the TDI: descriptive (i.e., structure or shape, facilities, and spatial and temporal dimensions), appraisive (i.e., evaluative and affective dimensions), and advisory (i.e., recommendations and warnings). The measurement of the dimensions derives from the units of analysis, categories, and metrics seen in the third section.
Descriptive Aspect In addition to the date of the OTRs and the types and location of the resources hosted on TripAdvisor, the ad hoc categories of terrorism (ISIS attack(s), terror attack(s), terrorist activities, terrorist attack(s), terrorist incident, terrorist killings, terrorist tragedy, terrorist van attack, terrorist victims, attack(s), terrorism, terrorist(s)) and secession (pro-independence, independence movement, independence vote, vote
for independence, independence referendum, referendum day, referendum Sunday, referendum process, Catalonian referendum, Catalan independence, Catalonian independence, Catalonian secession, separatist(s), referendum) allowed investigating the impact of the terrorist attack and the frustrated independence process on tourism in Barcelona.
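A minimal sketch of how such ad hoc categories might be applied is shown below: each category is compiled into one case-insensitive pattern, and the share of OTRs mentioning at least one of its terms is computed. Only a subset of the terms listed above is used, an optional plural "s" is accepted for every term as a simplification, and the sample reviews are invented for illustration.

```python
import re
from typing import Iterable

# Subsets of the category terms listed in the text.
TERRORISM = ["terror attack", "terrorist attack", "terrorist incident",
             "terrorism", "terrorist", "attack"]
SECESSION = ["independence referendum", "catalan independence",
             "pro-independence", "separatist", "referendum"]


def category_pattern(terms: Iterable[str]) -> re.Pattern:
    """Build one pattern per category, longest terms first."""
    ordered = sorted(terms, key=len, reverse=True)
    alternatives = "|".join(re.escape(term) + "s?" for term in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def share_mentioning(reviews: list[str], pattern: re.Pattern) -> float:
    """Percentage of reviews that mention at least one category term."""
    if not reviews:
        return 0.0
    hits = sum(1 for text in reviews if pattern.search(text))
    return 100 * hits / len(reviews)


# Hypothetical review texts.
sample_reviews = [
    "Beautiful walk along the boulevard.",
    "We visited two days after the terrorist attack.",
    "Shops were closed because of the referendum.",
    "Great tapas and friendly staff.",
]
terror_share = share_mentioning(sample_reviews, category_pattern(TERRORISM))
secession_share = share_mentioning(sample_reviews, category_pattern(SECESSION))
```

A generic term such as "attack" can also match unrelated uses (e.g., a heart attack), so results from a sketch like this would need manual checking on a sample of matches.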
Appraisive Aspect The evaluative dimension is obtained through Formula (2), seen in Section “Metrics”, and the affective dimension metrics emerge from the two categories of positive and negative feelings described in Section “Categorization”. The first metric, positive sentiments (feel+), is the percentage of terms that represent feelings and moods with positive polarity (e.g., amazing, great, never disappoints, and pleasant). The second metric, negative sentiments (feel-), is the same percentage in the case of negative polarity (e.g., not friendly, pickpocket, overcrowded, and disappointed).
Advisory Aspect It is common for reviewers to make recommendations and give warnings to subsequent visitors, which is considered an indicator of their loyalty to the tourism brand or destination. In parallel with the affective dimension, the first metric, positive recommendations (recom+), is the percentage of terms that represent recommendations with positive polarity (e.g., must see, not to be missed, recommend, and unmissable) in relation to the total number of words in the text, including stop words. The second metric, negative recommendations (recom-), is the same percentage in the case of negative polarity (e.g., avoid, nothing to see, don’t bother, and can’t recommend) or warnings (e.g., beware, stay away, watch out, and be careful).
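The affective and advisory metrics share one computation: the number of category terms found in a text, divided by the total number of words including stop words. The Python sketch below illustrates it; the function name, the tokenization rule (runs of letters count as words), and the sample sentence are assumptions, and the term lists are short subsets of the examples above.

```python
import re
from typing import Iterable

LETTERS = r"[^\W\d_]"  # any Unicode letter


def category_percentage(text: str, terms: Iterable[str]) -> float:
    """Percentage of category terms relative to all words in the text."""
    lowered = text.lower()
    total_words = len(re.findall(rf"{LETTERS}+", lowered))
    if total_words == 0:
        return 0.0
    hits = 0
    # Longest terms first, so a compound term is counted once as a unit.
    for term in sorted(terms, key=len, reverse=True):
        pattern = re.compile(
            rf"(?<!{LETTERS}){re.escape(term)}(?!{LETTERS})"
        )
        lowered, found = pattern.subn(" ", lowered)
        hits += found
    return 100 * hits / total_words


# Hypothetical ten-word review text.
sample = "Great views and a must see but avoid the queues"
feel_pos = category_percentage(sample, ["great", "amazing"])
recom_pos = category_percentage(sample, ["must see", "recommend"])
recom_neg = category_percentage(sample, ["avoid", "stay away"])
```

In the ten-word sample, each category is matched once, so each metric equals 10%; the two-word term “must see” counts as a single term in the numerator while both of its words count in the denominator.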
Results and Discussion Once the data were cleaned and arranged, the first step was to apply the algorithm in section “Measuring Destination Image Aspects” to obtain the frequency of key terms during the period considered and during the last 5 years, in OTRs referring to all of Barcelona and ones referring to the boulevard where the terrorist attack occurred. Table 4 shows the most frequent terms according to the percentage of total words, including stop words, in English-language OTRs about attractions and activities in Barcelona (i.e., from TripAdvisor’s Things to Do section) published between August 17 and December 31, 2017. Comparing the results of Table 4 with those of a previous study on OTRs regarding attractions and activities in a region in Greece (Marine-Roig 2019) reveals a coincidence in the first three positions: “tour,” “Barcelona” (i.e., the capital city), and “great” versus “Athens” (i.e., the capital city), “tour,” and “great.” They also agree by having two positive qualifying adjectives, “amazing” and “good,” among the most frequent keywords.
Table 4 Most frequent terms in online travel reviews about attractions in Barcelona
Rank | Term | % | Rank | Term | % | Rank | Term | %
1 | Tour | 0.6326 | 12 | Day | 0.2107 | 23 | Walk | 0.1396
2 | Barcelona | 0.5851 | 13 | Worth | 0.2063 | 24 | People | 0.1345
3 | Great | 0.4533 | 14 | Just | 0.1993 | 25 | Best | 0.1340
4 | Time | 0.3607 | 15 | Recommend | 0.1951 | 26 | Service | 0.1338
5 | Visit | 0.2937 | 16 | City | 0.1949 | 27 | Like | 0.1332
6 | Amazing | 0.2700 | 17 | Really | 0.1810 | 28 | Building | 0.1312
7 | Place | 0.2581 | 18 | Nice | 0.1779 | 29 | Inside | 0.1312
8 | Beautiful | 0.2359 | 19 | Experience | 0.1765 | 30 | Interesting | 0.1294
9 | Gaudi | 0.2319 | 20 | Way | 0.1645 | 31 | Excellent | 0.1243
10 | Good | 0.2224 | 21 | Tickets | 0.1473 | 32 | Area | 0.1222
11 | Guide | 0.2158 | 22 | Park | 0.1443 | 33 | Architecture | 0.1191
Source: TripAdvisor OTRs in English between August 17 and December 31, 2017
[Figure: number of OTRs (y-axis, 0 to 80) by fortnight, 08-2 through 12-2 (x-axis), with series “Terrorist attack” and “Secession process”]
Fig. 4 Fortnightly frequency of online tourist reviews mentioning the terrorist attack or process of secession
Descriptive Aspect Figure 3 illustrates the timing of the OTRs. In this case study, no territorial distribution existed, for all OTRs were located in the same city; however, they could be classified by neighborhood or district (Marine-Roig 2021). Figure 4 shows the temporal evolution of OTRs mentioning the terrorist attack or the Catalan independence process. Although both events received great international media coverage (Bolon et al. 2017; Dewan et al. 2017; Jones et al. 2017), they seemed to have little impact on OTRs. English-language OTRs that commented upon the attack did not amount to even 1% of those published during the period, despite the fact that visitors had to be informed about the attack and that Las Ramblas hosted an expansive carpet of flowers, candles, and other tributes to the victims. Despite the interval between the
date of the attack and the date of the publication of OTRs (Fazzolari and Petrocchi 2018), in the fortnight following the attack, the number of OTRs dropped by more than half. The impact of the secessionist process on OTRs stood out only in the fortnight after the independence referendum, probably due to the activity of Spanish police that received considerable international media coverage (Dewan et al. 2017). Boxes 1 and 2 show the wealth of information that the OTRs may contain (the identifying data of the review and reviewer have been overwritten). The OTR in Box 1 reflects the chaos in the early stages of the terrorist attack, whereas the OTR in Box 2 confirms the reaction of residents and visitors 2 days after the attack.
Box 1. Review on a market near Las Ramblas on August 17, 2017
gCode, g187497; dCode, d190164; rCode, r000000001; resource, Mercat de la Boqueria; userName, X1; isFrom, Montreal, Canada; score, 1; date, 2017-08-17; lang, en; title, EXTREMELY DANGEROUS PLACE TODAY!!! STAY AWAY; writingBody – This afternoon, our 1st day in Barcelona, we had just walked into La Boqueria from La Rambla... FIRST IMPRESSION WAS SHOCK... FIRST THOUGHTS WERE THIS CAN’T BE SAFE! THE PLACE WAS SOOOOO PACKED WITH PEOPLE, we couldn’t proceed even a few steps with our baby stroller. PEOPLE PACKED IN LIKE SARDINES. LITERALLY NO WHERE TO GO... Suddenly COMPLETE HELLISH CHAOS BROKE OUT! People screaming, people running, brutally shoving, loud bursts of gunshots or bombs echoing all around us! Violently knocked to the ground by the stampede of hysterical, panicked people! HELL ON EARTH EXPERIENCE. By some miracle we managed to escape out of La Boqueria alive, running down a side street and seeking refuge in a nearby hair salon, where we waited 4 hours on lockdown until police permitted us to leave via a specific route. It turned out it had been a terrorist attack with a white van on La Rambla, that killed and injured many people, but it’s very likely that there were several people/children injured inside La Boqueria market as well.
Box 2. Review on Las Ramblas on August 20, 2017
gCode, g187497; dCode, d190163; rCode, r000000002; resource, Las Ramblas; userName, X2; isFrom, Manchester, UK; score, 5; date, 2017-08-20; lang, en; title, What to say...; writingBody – What is there to say apart from standing in solidarity with the people of Barcelona. There were thousands of people on Las Ramblas 2 days after the attack, showing the rats how they/we were not scared. My sorrow was for the victims, the casualties and all their families and friends. Las Ramblas is alive and kicking!!
Table 5 Top ten experiences in Barcelona by percentage of online travel reviews
Resource | % | Type
Sagrada Família | 15.81 | Churches and cathedrals, architectural buildings, sights and landmarks
Suntransfers | 6.94 | Taxis and shuttles, transportation
Park Güell | 6.78 | Nature and Parks
Casa Batlló | 5.78 | Architectural buildings, historic sites, museums, sights and landmarks
Las Ramblas | 4.41 | Points of interest and landmarks, sights and landmarks
Gothic Quarter | 3.30 | Historic walking areas, neighborhoods, sights and landmarks
SANDEMANs | 3.05 | Bar, club and pub tours, city tours, walking tours, sightseeing tours
Camp Nou | 3.01 | Arenas and stadiums, sights and landmarks
Magic Fountain | 2.49 | Fountains, points of interest and landmarks, sights and landmarks
Casa Milà | 2.20 | Architectural buildings, museums, sights and landmarks
Source: English-language OTRs published on TripAdvisor from August 17 to December 31, 2017
Table 5 shows the 10 most reviewed attractions or activities in Barcelona. Among them are four works of Catalan architect Antoni Gaudí (i.e., Basilica of the Sagrada Família, Park Güell, Casa Batlló, and Casa Milà, or “La Pedrera”) that are UNESCO World Heritage Sites (UNESCO 2005).
Appraisive and Advisory Aspects According to the metrics defined in section “Appraisive Aspect”, the scores in Table 6 were used to calculate metrics in the evaluative dimension (i.e., avgScore, score+, and score-) of the most popular attractions listed in Table 5. Table 4 shows several positive qualifying adjectives among the most frequent terms: “great,” “amazing,” “beautiful,” “good,” “nice,” “best,” “interesting,” and “excellent.” The verb “recommend” also appears. Those terms and other similar ones circumscribed to Las Ramblas (Table 7) were used to calculate metrics of the affective dimension (i.e., feel+ and feel-) of the appraisive aspect and metrics of their advisory aspect (i.e., recom+ and recom-). Table 6 shows the ratings given by reviewers to the most popular attractions and activities in Barcelona by the number of OTRs during the period considered. The best rated was SANDEMANs NEW Europe, a company dedicated to organizing tours, which had higher scores than the works by Gaudí that are UNESCO World Heritage Sites. Remarkably, the same circumstance was previously observed in relation to companies that organize tours and the Parthenon in Athens, another UNESCO World Heritage Site (Marine-Roig 2019). Table 7 compares the results for the period from August 17 to December 31, 2017, with those for the same period in the previous 4 years according to several metrics (section “Appraisive Aspect”) in relation to English-language OTRs of the boulevard, Las Ramblas, where the terrorist attack occurred on August 17, 2017. The popularity of Las Ramblas according to number of OTRs had increased during those months in previous years before decreasing during the year of the attack.
Table 6 Reviewer scores for top ten attractions and activities from Table 5
Resource | 5* | 4* | 3* | 2* | 1* | avgScore | score+ | score-
Sagrada Família | 3110 | 483 | 153 | 33 | 27 | 93.46 | 94.40 | 1.58
Suntransfers | 1459 | 94 | 11 | 9 | 99 | 91.94 | 92.88 | 6.46
Park Güell | 800 | 487 | 257 | 56 | 33 | 80.08 | 78.81 | 5.45
Casa Batlló | 1008 | 284 | 73 | 13 | 13 | 90.64 | 92.88 | 1.87
Las Ramblas | 460 | 304 | 216 | 57 | 25 | 76.29 | 71.94 | 7.72
Gothic Quarter | 578 | 194 | 19 | 3 | 0 | 92.41 | 97.23 | 0.38
SANDEMANs | 668 | 46 | 8 | 6 | 6 | 96.46 | 97.28 | 1.63
Camp Nou | 463 | 174 | 58 | 12 | 17 | 86.40 | 87.98 | 4.01
Magic Fountain | 361 | 138 | 67 | 15 | 18 | 83.76 | 83.31 | 5.51
Casa Milà | 342 | 126 | 43 | 9 | 10 | 86.84 | 88.30 | 3.58
Note. 5*, excellent; 4*, very good; 3*, average; 2*, poor; 1*, terrible

Table 7 Annual comparison (%) according to metrics shown above
Year | Rank | OTRs | score+ | score- | avgScore | feel+ | feel- | recom+ | recom-
2013 | 11 | 2.15 | 65.85 | 14.63 | 70.02 | 2.97 | 1.65 | 0.36 | 0.16
2014 | 7 | 3.21 | 63.62 | 13.14 | 68.48 | 3.10 | 1.74 | 0.33 | 0.26
2015 | 5 | 4.57 | 68.39 | 10.27 | 72.72 | 3.53 | 1.70 | 0.44 | 0.26
2016 | 4 | 5.23 | 65.69 | 12.44 | 71.35 | 3.49 | 2.00 | 0.43 | 0.33
2017 | 5 | 4.41 | 71.94 | 7.72 | 76.29 | 3.45 | 1.68 | 0.45 | 0.19
Source: Online travel reviews of Las Ramblas published on TripAdvisor from August 17 to December 31, 2013–2017
Nevertheless, it is noteworthy that the latest reviewers, perhaps out of empathy, gave the place better ratings and recommendations than their counterparts in the previous 4 years.
Discussion In this case study, the method allowed us to extract significant information about the impact of the terrorist attack in Barcelona and the Catalan independence movement during the last quarter of 2017. First, among the research findings, English-language OTRs of Barcelona (29%) and the boulevard where the attack occurred (40%) decreased in number after the attack, perhaps due to the decrease in the influx of English-speaking tourists at the time. Second, the reviewers rated the boulevard more highly and gave it more positive recommendations after the attack, perhaps out of empathy with the victims. Third, the number of OTRs mentioning the extraordinary events analyzed did not amount to even 1% of the total OTRs, which suggests that the dissemination of the events via OTRs and eWoM was minimal. This result coincides with that obtained in a previous study on the same events based on the natural language processing of Airbnb OTRs in several languages
(Marine-Roig and Huertas 2020). Fourth, the number of OTRs mentioning the attack declined rapidly from fortnight to fortnight after the event. Fifth and last, the number of OTRs mentioning the secessionist process was significant only in the fortnight leading up to the referendum and dropped by 75% in the following fortnight. In relation to the last two findings, it should be taken into account that an interval might have separated the date of the experience from the date of the publication of the OTRs (Fazzolari and Petrocchi 2018).
Concluding Remarks
OTRs are a valuable source of big data for research in travel, tourism, and hospitality. However, to process textual big data, researchers require considerable skills with computers and statistics, especially to apply classification techniques based on artificial intelligence (Martin-Fuentes et al. 2018). The example developed in this chapter involved using classification techniques based on quantifiable data provided by websites that host OTRs and on the analysis of the frequency and categorization of key terms in OTRs written by reviewers. The reliability of that second part of the method depends primarily on the accurate elaboration of categories and lexicons. By contrast, the first part has allowed multiple classifications by language of publication and reviewer’s nationality (Marine-Roig and Mariné Gallisà 2018); dates of experience and of publication (Fazzolari and Petrocchi 2018); location (Marine-Roig and Anton Clavé 2016b); type of attractions, restaurants, and hotels reviewed (Marine-Roig 2019); and traveler type (e.g., solo, in a couple, with family, on business, and with friends) (Banerjee and Chua 2016).
In addition to the descriptive analysis of the available information, the proposed method allows deducing aspects of the reviewers’ satisfaction and loyalty by analyzing their emotional responses, whether positive or negative, as expressed in ratings and textual terms denoting feelings, moods, recommendations, and warnings. Moreover, it is possible to discover patterns in the writing of OTRs, including the mentioned coincidence between Barcelona and the region of Greece, and of keywords located in the first position in the frequency table.
The opinions and ratings of reviewers are influenced by their personal characteristics and social environment. For example, a relaxing destination (i.e., positive feeling) for an older person can be boring (i.e., negative feeling) for a younger one.
Big data help to neutralize such subjective deviation, because they allow obtaining an average of the assessment of diverse people from various countries and cultures. In addition to the aforementioned limitations derived from language ambiguities, the quality (Hwang et al. 2018), reliability (Xiang et al. 2018), and helpfulness (Lee et al. 2018; Ma et al. 2018; Shin et al. 2019, 2021) of OTRs can vary considerably between platforms and within the same platform. Other authors (Mariani et al. 2019) detected differences in the trends and features of OTRs according to the sending device (mobile vs. desktop). Another limitation is the high positivity of the reviews on some platforms. For instance, in a study of Airbnb reviews, negatively rated properties did not even
reach 1% of the total (Marine-Roig 2021). It also does not seem normal for a tour-organizing company to obtain better marks than World Heritage Sites such as the Basilica of La Sagrada Família in Barcelona or the Parthenon in Athens. However, quality control is difficult because review data are proprietary. Thus, given the importance of reviews for users or consumers, businesses, and researchers, some nonprofit organizations have recently proposed building an open-data ecosystem for online customer reviews (ORA 2021).
Future studies on content analysis of OTRs could attempt to demonstrate several relationships between variables (e.g., destination image, satisfaction, and loyalty) that scholars have repeatedly demonstrated through surveys. A first exploratory attempt failed (Marine-Roig 2021), perhaps because Airbnb reviews were not suitable for this type of research.
Acknowledgments
This work was supported by the Spanish Ministry of Science, Innovation and Universities [Grant ID: TURCOLAB ECO2017-88984-R].
Cross-References
Big Data Technologies
Compositional Data Analysis in E-Tourism Research
Consumer Behavior in e-Tourism
Service Management in the E-Tourism Era
Social Media and Crisis Communication in Tourism and Hospitality
Social Media Approaches and Communication Strategies in Tourism
Travel Information Search
Trust in E-Tourism: Antecedents and Consequences of Trust in Travel-Related User-Generated Content
Web Information Retrieval and Search
References
Alaei AR, Becken S, Stantic B (2019) Sentiment analysis in tourism: capitalizing on big data. J Travel Res 58:175–191. https://doi.org/10.1177/0047287517747753
Analysts (2018) The state of the American traveler. Destinations edition, vol 27. Destination Analysts, San Francisco
Banerjee S, Chua AYK (2016) In search of patterns among travellers’ hotel ratings in TripAdvisor. Tour Manag 53:125–131. https://doi.org/10.1016/j.tourman.2015.09.020
BBC News (2017) Catalonia crisis: sacked ministers held in Spanish jails. BBC Eur
Berelson B (1952) Content analysis in communication research. Free Press, New York
Bolon A-S, Karasz P, McKinley JC (2017) Van hits pedestrians in deadly Barcelona terror attack. New York Times
Brysbaert M, New B (2009) Moving beyond Kučera and Francis: a critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behav Res Methods 41:977–990. https://doi.org/10.3758/BRM.41.4.977
Camprubí R, Coromina L (2016) Content analysis in tourism research. Tour Manag Perspect 18:134–140. https://doi.org/10.1016/j.tmp.2016.03.002
Chen Y-F, Law R (2016) A review of research on electronic word-of-mouth in hospitality and tourism management. Int J Hosp Tour Adm 17:347–372. https://doi.org/10.1080/15256480.2016.1226150
de Borda JC (1781) Mémoire sur les élections au scrutin. In: Mémoire de l’Académie Royale. Histoire de l’Académie des Sciences, Paris, pp 657–665
Dewan A, Cotovio V, Clarke H (2017) Catalonia independence referendum: what just happened? Cable News Netw
Do HH, Prasad P, Maag A, Alsadoon A (2019) Deep learning for aspect-based sentiment analysis: a comparative review. Expert Syst Appl 118:272–299. https://doi.org/10.1016/j.eswa.2018.10.003
Fazzolari M, Petrocchi M (2018) A study on online travel reviews through intelligent data analysis. Inf Technol Tour 20:37–58. https://doi.org/10.1007/s40558-018-0121-z
GAO (1989) Content analysis: a methodology for structuring and analyzing written material. Government Accountability Office, Washington
Gray C (2019) Afraid of what? Why Islamist terrorism and the Catalan independence question became conflated in representations of the 2017 Barcelona attacks. Bull Spanish Stud 96:1113–1133. https://doi.org/10.1080/14753820.2019.1651007
Gretzel U (2022) Online reviews. In: Buhalis D (ed) Tourism management and marketing. Edward Elgar Publishing, Northampton (forthcoming)
Gretzel U, Yoo KH (2008) Use and impact of online travel reviews. In: O’Connor P, Höpken W, Gretzel U (eds) Information and communication technologies in tourism 2008. Springer, Vienna, pp 35–46
Gretzel U, Yoo KH, Purifoy M (2007) Online travel review study: role and impact of online travel reviews, Texas
Gunn CA (1972) Vacationscape: designing tourist regions. Bureau of Business Research, University of Texas, Austin
Guy I, Mejer A, Nus A, Raiber F (2017) Extracting and ranking travel tips from user-generated reviews. In: Proceedings of the 26th international conference on world wide web – WWW’17. ACM Press, New York, pp 987–996
Hlee S, Lee H, Koo C (2018) Hospitality and tourism online review research: a systematic analysis and heuristic-systematic model. Sustainability 10:article 1141. https://doi.org/10.3390/su10041141
Hou Z, Cui F, Meng Y et al (2019) Opinion mining from online travel reviews: a comparative analysis of Chinese major OTAs using semantic association analysis. Tour Manag 74:276–289. https://doi.org/10.1016/j.tourman.2019.03.009
Huertas A, Oliveira A (2019) How tourism deals with terrorism from a public relations perspective: a content analysis of communication by destination management organizations in the aftermath of the 2017 terrorist attacks in Catalonia. Catalan J Commun Cult Stud 11:39–58. https://doi.org/10.1386/cjcs.11.1.39_1
Hunt JD (1975) Image as a factor in tourism development. J Travel Res 13:1–7. https://doi.org/10.1177/004728757501300301
Hwang J, Park S, Woo M (2018) Understanding user experiences of online travel review websites for hotel booking behaviours: an investigation of a dual motivation theory. Asia Pac J Tour Res 23:359–372. https://doi.org/10.1080/10941665.2018.1444648
IdEsCat (2021) Statistical yearbook of Catalonia. Statistical Institute of Catalonia, Barcelona
Jones S, Burgen S, Graham-Harrison E (2017) Spain dissolves Catalan parliament and calls fresh elections. Guard
Kaplan AM, Haenlein M (2010) Users of the world, unite! The challenges and opportunities of Social Media. Bus Horiz 53:59–68. https://doi.org/10.1016/j.bushor.2009.09.003
Krippendorf K (2004) Content analysis: an introduction to its methodology. SAGE Publications, London
Ku CH, Chang Y-C, Wang Y et al (2019) Artificial intelligence and visual analytics: a deep-learning approach to analyze hotel reviews & responses. In: 52nd Hawaii international conference on system sciences (HICSS 2019), pp 5268–5277
Kumar A, Jaiswal A (2019) Systematic literature review of sentiment analysis on Twitter using soft computing techniques. Concurr Comput Pract Exp e5107. https://doi.org/10.1002/cpe.5107
Kwok L, Xie KL, Richards T (2017) Thematic framework of online review research: a systematic analysis of contemporary literature on seven major hospitality and tourism journals. Int J Contemp Hosp Manag 29:307–354. https://doi.org/10.1108/IJCHM-11-2015-0664
Lalicic L, Marine-Roig E, Ferrer-Rosell B, Martin-Fuentes E (2021) Destination image analytics for tourism design: an approach through Airbnb reviews. Ann Tour Res 86:article 103100. https://doi.org/10.1016/j.annals.2020.103100
Lee P-J, Hu Y-H, Lu K-T (2018) Assessing the helpfulness of online hotel reviews: a classification-based approach. Telemat Informatics 35:436–445. https://doi.org/10.1016/j.tele.2018.01.001
Li J, Xu L, Tang L et al (2018a) Big data in tourism research: a literature review. Tour Manag 68:301–323. https://doi.org/10.1016/j.tourman.2018.03.009
Li W, Zhu L, Guo K et al (2018b) Build a tourism-specific sentiment lexicon via Word2vec. Ann Data Sci 5:1–7. https://doi.org/10.1007/s40745-017-0130-3
Lin MS, Liang Y, Xue JX et al (2021) Destination image through social media analytics and survey method. Int J Contemp Hosp Manag. https://doi.org/10.1108/IJCHM-08-2020-0861
Liu B (2011) Web data mining: exploring hyperlinks, contents, and usage data. Springer, Berlin/Heidelberg
Lojo A, Li M, Xu H (2020) Online tourism destination image: components, information sources, and incongruence. J Travel Tour Mark 37:495–509. https://doi.org/10.1080/10548408.2020.1785370
Lu W, Stepchenkova S (2015) User-generated content as a research mode in tourism and hospitality applications: topics, methods, and software. J Hosp Mark Manag 24:119–154. https://doi.org/10.1080/19368623.2014.907758
Lynch K (1960) The image of the city. The MIT Press, Cambridge, MA
Ma Y, Xiang Z, Du Q, Fan W (2018) Effects of user-provided photos on hotel review helpfulness: an analytical approach with deep leaning. Int J Hosp Manag 71:120–131. https://doi.org/10.1016/j.ijhm.2017.12.008
Mak AHN (2017) Online destination image: comparing national tourism organisation’s and tourists’ perspectives. Tour Manag 60:280–297. https://doi.org/10.1016/j.tourman.2016.12.012
Mäntylä MV, Graziotin D, Kuutila M (2018) The evolution of sentiment analysis – a review of research topics, venues, and top cited papers. Comput Sci Rev 27:16–32. https://doi.org/10.1016/j.cosrev.2017.10.002
Mariani MM, Borghi M, Gretzel U (2019) Online reviews: differences by submission device. Tour Manag 70:295–298. https://doi.org/10.1016/j.tourman.2018.08.022
Marine-Roig E (2010) Los “Travel Blogs” como objetos de estudio de la imagen percibida de un destino [Travel blogs as objects of study of the perceived destination image]. In: Guevara Plaza AJ, Aguayo Maldonado A, Caro Herrero JL (eds) Turismo y Tecnologías de la Información y las Comunicaciones. Facultad de Turismo, Málaga, pp 61–76
Marine-Roig E (2014) A webometric analysis of travel blogs and review hosting: the case of Catalonia. J Travel Tour Mark 31:381–396. https://doi.org/10.1080/10548408.2013.877413
Marine-Roig E (2017a) Online travel reviews: a massive paratextual analysis. In: Xiang Z, Fesenmaier DR (eds) Analytics in Smart Tourism design: concepts and methods. Springer, Heidelberg, pp 179–202
Marine-Roig E (2017b) Measuring destination image through travel reviews in search engines. Sustainability 9:article 1425. https://doi.org/10.3390/su9081425
Marine-Roig E (2019) Destination image analytics through traveller-generated content. Sustainability 11:article 3392. https://doi.org/10.3390/su11123392
Marine-Roig E (2021) Measuring online destination image, satisfaction, and loyalty: evidence from Barcelona districts. Tour Hosp 2:62–78. https://doi.org/10.3390/tourhosp2010004
Marine-Roig E, Anton Clavé S (2015) Tourism analytics with massive user-generated content: a case study of Barcelona. J Destin Mark Manag 4:162–172. https://doi.org/10.1016/j.jdmm.2015.06.004
Marine-Roig E, Anton Clavé S (2016a) A detailed method for destination image analysis using user-generated content. Inf Technol Tour 15:341–364. https://doi.org/10.1007/s40558-015-0040-1
Marine-Roig E, Anton Clavé S (2016b) Perceived image specialisation in multiscalar tourism destinations. J Destin Mark Manag 5:202–213. https://doi.org/10.1016/j.jdmm.2015.12.007
Marine-Roig E, Ferrer-Rosell B (2018) Measuring the gap between projected and perceived destination images of Catalonia using compositional analysis. Tour Manag 68:236–249. https://doi.org/10.1016/j.tourman.2018.03.020
Marine-Roig E, Huertas A (2020) How safety affects destination image projected through online travel reviews. J Destin Mark Manag 18:article 100469. https://doi.org/10.1016/j.jdmm.2020.100469
Marine-Roig E, Mariné Gallisà E (2018) Imatge de Catalunya percebuda per turistes angloparlants i castellanoparlants [Image of Catalonia perceived by English-speaking and Spanish-speaking tourists]. Doc d’Anàlisi Geogràfica 64:219–245. https://doi.org/10.5565/rev/dag.429
Marine-Roig E, Ferrer-Rosell B, Daries N, Cristobal-Fransi E (2019) Measuring gastronomic image online. Int J Environ Res Public Health 16:article 4631. https://doi.org/10.3390/ijerph16234631
Martin-Fuentes E, Fernandez C, Mateu C, Marine-Roig E (2018) Modelling a grading scheme for peer-to-peer accommodation: stars for Airbnb. Int J Hosp Manag 69:75–83. https://doi.org/10.1016/j.ijhm.2017.10.016
Mate MJ, Trupp A, Pratt S (2019) Managing negative online accommodation reviews: evidence from the Cook Islands. J Travel Tour Mark 36:627–644. https://doi.org/10.1080/10548408.2019.1612823
Murakami KH (2018) A comparison of destination images from three different perspectives. J Glob Tour Res 3:107–114
Muritala BA, Sánchez-Rebull M-V, Hernández-Lara A-B (2020) A bibliometric analysis of online reviews research in Tourism and Hospitality. Sustainability 12:9977. https://doi.org/10.3390/su12239977
Neuendorf KA (2017) The content analysis guidebook, 2nd edn. SAGE Publications, London
O’Reilly T (2007) What is web 2.0: design patterns and business models for the next generation of software. Int J Digit Econ 65:17–37
ORA (2021) Building an open data ecosystem for reviews. In: Open Review Association. https://open-reviews.net. Accessed 19 Mar 2021
Perikos I, Tsirtsi A, Kovas K et al (2018) Opinion mining and visualization of online users reviews: a case study in Booking.com. In: 2018 9th international conference on information, intelligence, systems and applications (IISA). IEEE, pp 1–5
Perles-Ribes JF, Ramón-Rodríguez AB, Such-Devesa MJ, Moreno-Izquierdo L (2019) Effects of political instability in consolidated destinations: the case of Catalonia (Spain). Tour Manag 70:134–139. https://doi.org/10.1016/j.tourman.2018.08.001
Porter MF (2021) SnowBall: a language for stemming algorithms. https://snowballstem.org/. Accessed 19 Mar 2021
Prasad BD (2008) Content analysis: a method of social science research. In: Lal Das DK, Bhaskaran V (eds) Research methods for social work. Rawat Publications, New Delhi, pp 174–193
Roberts CW (2001) Content analysis. Int Encycl Soc Behav Sci 2697–2702
Schuckert M, Liu X, Law R (2015) Hospitality and tourism online reviews: recent trends and future directions. J Travel Tour Mark 32:608–621. https://doi.org/10.1080/10548408.2014.933154
Serrano-Guerrero J, Olivas JA, Romero FP, Herrera-Viedma E (2015) Sentiment analysis: a review and comparative analysis of web services. Inf Sci 311:18–38. https://doi.org/10.1016/j.ins.2015.03.040
Shin S, Xiang Z (2021) Contextual effects of online review recency: three research propositions. In: Information and communication technologies in tourism 2021. Springer International Publishing, Cham, pp 315–321
Shin S, Chung N, Xiang Z, Koo C (2019) Assessing the impact of textual content concreteness on helpfulness in online travel reviews. J Travel Res 58:579–593. https://doi.org/10.1177/0047287518768456
Shin S, Du Q, Ma Y et al (2021) Moderating effects of rating on text and helpfulness in online hotel reviews: an analytical approach. J Hosp Mark Manag 30:159–177. https://doi.org/10.1080/19368623.2020.1778596
Stemler S (2001) An overview of content analysis. Pract Assess Res Eval 7:article 17. https://doi.org/10.7275/z6fm-2e34
Ukpabi DC, Karjaluoto H (2018) What drives travelers’ adoption of user-generated content? A literature review. Tour Manag Perspect 28:251–273. https://doi.org/10.1016/j.tmp.2018.03.006
UNESCO (2005) Works of Antoni Gaudi. In: World Herit. List. http://whc.unesco.org/en/list/320. Accessed 19 Mar 2021
UniNe (2020) IR multilingual resources at UniNE. University of Neuchâtel. http://members.unine.ch/jacques.savoy/clef/. Accessed 19 Mar 2021
Volo S (2020) Tourism statistics, indicators and big data: a perspective article. Tour Rev 75:304–309. https://doi.org/10.1108/TR-06-2019-0262
VU (2014) VU sentiment lexicon. Vrije Universiteit Amsterdam. https://github.com/opener-project/VU-sentiment-lexicon. Accessed 19 Mar 2021
Weber R (1990) Basic content analysis, 2nd edn. SAGE Publications, Inc., Thousand Oaks
Xiang Z, Gretzel U (2010) Role of social media in online travel information search. Tour Manag 31:179–188. https://doi.org/10.1016/j.tourman.2009.02.016
Xiang Z, Magnini VP, Fesenmaier DR (2015) Information technology and consumer behavior in travel and tourism: insights from travel planning using the internet. J Retail Consum Serv 22:244–249. https://doi.org/10.1016/j.jretconser.2014.08.005
Xiang Z, Du Q, Ma Y, Fan W (2017) A comparative analysis of major online review platforms: implications for social media analytics in hospitality and tourism. Tour Manag 58:51–65. https://doi.org/10.1016/j.tourman.2016.10.001
Xiang Z, Du Q, Ma Y, Fan W (2018) Assessing reliability of social media data: lessons from mining TripAdvisor hotel reviews. Inf Technol Tour 18:43–59. https://doi.org/10.1007/s40558-017-0098-z
Yilmaz Y, Yilmaz Y (2020) Pre- and post-trip antecedents of destination image for non-visitors and visitors: a literature review. Int J Tour Res 22:518–535. https://doi.org/10.1002/jtr.2353
Zarezadeh ZZ, Rastegar HR, Gretzel U (2018) Reviewing the past to inform the future: a literature review of social media in tourism. Czech J Tour 7:115–131. https://doi.org/10.1515/cjot-2018-0006
Network Science and e-Tourism
24
Julia Neidhardt
Contents
Introduction
Network Science Overview
Software and Tools
Network Science in e-Tourism
Conclusion
Cross-References
References
Abstract
This chapter provides an introduction to network science and its applications within e-tourism research. In the first part, an overview of network science as a continuously growing scientific field is given. Network science provides various concepts and methods for the analysis of the structure and dynamics of all kinds of networks such as social networks, information networks, and economic networks. Afterward, popular software and tools to model, analyze, and visualize network data are briefly discussed. In the third part, an overview of research in e-tourism that utilized network science methods is provided. In existing studies, different types of networks were constructed and analyzed, in particular networks of travelers, networks of tourism websites, networks capturing behavioral patterns of travelers, or text networks of travel-related posts. Furthermore, the data sources typically used in the literature are briefly discussed. Finally, the main points are summarized and conclusions are drawn.
J. Neidhardt
Research Unit E-Commerce, TU Wien, Vienna, Austria
e-mail: [email protected]
© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_33
Introduction
In this chapter, network science is introduced as a rapidly growing scientific field that draws from various disciplines. Networks or graphs have proven to be very powerful means to model and analyze all types of phenomena that are constituted by entities and the relations between them. Network science therefore provides various concepts and methods for the analysis of structural relations and their dynamics in very different domains, including social systems, communication, biology, technology, and transactions. This chapter introduces some basic concepts and definitions as well as their interpretation. These include paths and graph density, components and connectivity, centrality indices (degree centrality, closeness centrality, betweenness centrality, and PageRank), groups and community structure, as well as large-scale properties of networks (including preferential attachment and power-law distributions as well as the small-world phenomenon). Although the distinction is typically blurred, there is a slight difference between a network and a graph, which will also be characterized.
The tourism domain exhibits a high complexity. The tourism product is typically a bundle of interconnected products that are increasingly offered and searched for on the Web, tourism is emotional, and travel decisions are not made on rational criteria alone (Werthner et al. 1999). Furthermore, traveling is a social experience; people typically travel with their friends and family and discuss their experiences online. These are just some examples that illustrate how, in the context of e-tourism, multilevel relational structures emerge that can be captured with the help of networks. E-tourism is therefore a rich application domain for network science. Another aim of this chapter is thus to assess, through a look at the literature, to what extent network science is used in e-tourism research. Surprisingly, not many studies in the e-tourism domain utilize network science. In existing studies, different types of networks were constructed and analyzed, in particular networks of travelers, networks of tourism websites, networks capturing behavioral patterns of travelers, or text networks of travel-related posts. The study of large-scale networks has become more common in recent years due to the accessibility of such data.
The rest of the chapter is organized as follows. In section “Network Science Overview”, a brief introduction to network science is provided. In section “Software and Tools”, popular software and tools to model, analyze, and visualize network data are briefly discussed. In section “Network Science in e-Tourism”, an overview of research in e-tourism that utilized network science methods is provided. Finally, in section “Conclusion”, the main insights are summarized and conclusions are drawn.
Network Science Overview
Network science is a rapidly growing, multidisciplinary field, which aims to model and to understand all kinds of networks including information networks, social networks, technological networks, and biological networks (Barabási 2013). Obviously
these networks capture a wide range of empirical phenomena. However, it has turned out that the structure of such networks exhibits certain general properties and that their evolution is driven by similar dynamics (Tiropanis et al. 2015). Therefore, a unified methodology can be applied, which integrates approaches from various disciplines including mathematics, computer science, physics, social sciences, and biology (Vespignani 2018). In the following, a brief overview of relevant concepts to characterize networks and their attributes is given. This overview is based on a previous article (Piazzi et al. 2011). For a comprehensive introduction to major concepts, definitions, and algorithms of network science, see Easley et al. (2012) or Newman (2018).
A network can be mathematically described by a graph $G = (V, E)$. Both terms, network and graph, are used in the literature, sometimes synonymously. However, in this chapter, the term network is used to describe empirical phenomena, and the term graph is used when talking about the corresponding abstract data structure, which is a common distinction in network science (Hamilton 2020). The observed entities form a set $V$ of nodes or vertices, and the set $E$ consists of edges or links connecting pairs of nodes. An edge connects two nodes if there is a certain relationship or interaction between them. Two nodes that are connected by an edge are called neighbors. A graph is called complete if there is an edge between each pair of nodes. In a directed graph, each edge captures an asymmetric relationship, e.g., the who-follows-whom relationship on Twitter. Therefore, the edge has an origin and a destination. An undirected graph, on the other hand, captures a symmetric relationship, e.g., the friendship relationship on Facebook. Thus, it contains edges with no orientation. In a weighted graph, an additional numerical value that captures, for example, the strength of the relationship can be assigned to each edge.
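The definitions above can be made concrete with a small sketch. The following is a minimal illustration, not a reference implementation: the node names, the edge set, and the helper functions `neighbors_undirected`, `successors_directed`, and `is_complete` are hypothetical and chosen only for this example. The same set of node pairs is read once as an undirected friendship graph and once as a directed who-follows-whom graph.

```python
from itertools import combinations

# Hypothetical toy data: four travelers (nodes) and four relations (edges).
V = {"ana", "ben", "cleo", "dev"}
E = {("ana", "ben"), ("ana", "cleo"), ("ben", "cleo"), ("cleo", "dev")}


def neighbors_undirected(nodes, edges):
    """Map each node to the set of its neighbors, ignoring edge orientation."""
    adj = {v: set() for v in nodes}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    return adj


def successors_directed(nodes, edges):
    """Map each node to the nodes its outgoing edges point to (origin -> destination)."""
    adj = {v: set() for v in nodes}
    for origin, destination in edges:
        adj[origin].add(destination)
    return adj


def is_complete(nodes, edges):
    """True if every pair of distinct nodes is joined by an (undirected) edge."""
    present = {frozenset(edge) for edge in edges}
    return all(frozenset(pair) in present for pair in combinations(nodes, 2))


friends = neighbors_undirected(V, E)  # symmetric reading of E
follows = successors_directed(V, E)   # asymmetric reading of E

# A weighted graph additionally assigns a numerical value to each edge,
# here an invented tie strength per pair.
weights = {("ana", "ben"): 3.0, ("cleo", "dev"): 0.5}
```

In the undirected reading, `friends["cleo"]` contains all three other travelers, whereas in the directed reading `follows["cleo"]` contains only `"dev"`, which illustrates why orientation matters for all measures introduced later.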
The structure of a graph is often captured by an adjacency matrix $A$, an $n \times n$ matrix, where $n$ is the number of nodes of the graph. In an unweighted, undirected graph, the element $A_{ij}$ of the adjacency matrix equals 1 if nodes $i$ and $j$ are connected by an edge and 0 otherwise. In an unweighted, directed graph, $A_{ij}$ equals 1 if there is an edge from $j$ to $i$ (by convention) and 0 otherwise. Thus, in contrast to the undirected case, the adjacency matrix is in general not symmetric. In a weighted graph, the elements of the adjacency matrix usually represent the weights of the edges. The degree $\deg(v)$ of a node $v$ in an undirected graph is the number of neighbors of $v$. The average degree of a graph is the arithmetic mean of the degrees of all nodes in the graph. In a directed graph, the in-degree and the out-degree of a node have to be distinguished: in-links are connections leading to a node, and out-links are those leading away from a node.
A path in a graph is a sequence of nodes such that every two consecutive nodes are connected by an edge. The number of these edges is called the length of the path. The geodesic distance $d(v, w)$ between two nodes $v$ and $w$ is defined as the length of the shortest path between them. Note that there can be more than one shortest path between $v$ and $w$. If there is a path from a node $v$ to a node $w$, these nodes are called connected. A connected component is a maximal set of nodes that are all connected to one another. If, on the other hand, there is no path at all between $v$ and $w$, these nodes belong to different components
and their geodesic distance is generally defined as infinite. The average distance in a graph is the arithmetic mean of the finite distances between all pairs of nodes. Furthermore, a graph’s diameter is the longest finite geodesic distance existing in the graph. When considering directed graphs, these concepts take into account the edges’ orientations. In this case, two types of connected components are defined: strongly and weakly connected components. In a strongly connected component (SCC), there is a directed path from each node to every other node. In a weakly connected component, there is a path from each node to every other node when the edges’ orientation is ignored.
Let $G$ be an undirected graph with $n$ nodes and $m$ edges. The graph density $\rho$ of $G$ is defined as the number of edges divided by the maximum possible number of edges, i.e., the $n(n-1)/2$ edges that would be present if $G$ were a complete graph:
$$\rho = \frac{2m}{n(n-1)}. \tag{1}$$
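The structural notions introduced so far (adjacency matrix, degree, geodesic distance, connected components, diameter, and density) can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the five-node toy graph and all function names are hypothetical, nodes are referred to by their index in the matrix, and the matrix follows the convention used above, i.e., $A_{ij} = 1$ if there is an edge from $j$ to $i$. Distances are obtained with a plain breadth-first search.

```python
from collections import deque

# Hypothetical toy graph: a path a-b-c-d plus an isolated node e.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]


def adjacency_matrix(nodes, edges, directed=False):
    """Return A with A[i][j] = 1 if there is an edge from node j to node i."""
    index = {v: k for k, v in enumerate(nodes)}
    A = [[0] * len(nodes) for _ in nodes]
    for origin, destination in edges:
        A[index[destination]][index[origin]] = 1
        if not directed:
            A[index[origin]][index[destination]] = 1
    return A


def distances_from(A, source):
    """Breadth-first search: geodesic distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        j = queue.popleft()
        for i in range(len(A)):
            if A[i][j] and i not in dist:  # follow the edge j -> i
                dist[i] = dist[j] + 1
                queue.append(i)
    return dist


def connected_components(A):
    """Connected components of an undirected graph, as sets of node indices."""
    seen, components = set(), []
    for v in range(len(A)):
        if v not in seen:
            component = set(distances_from(A, v))
            seen |= component
            components.append(component)
    return components


def density(A):
    """Eq. (1). The entries of A sum to 2m (undirected) or m (directed), so the
    same expression also yields the halved density of the directed case."""
    n = len(A)
    return sum(map(sum, A)) / (n * (n - 1))


A = adjacency_matrix(nodes, edges)
degrees = [sum(row) for row in A]            # row sums give the degree of each node
finite = [d for s in range(len(A))
          for t, d in distances_from(A, s).items() if t != s]
average_distance = sum(finite) / len(finite)  # mean over finite distances only
diameter = max(finite)                        # longest finite geodesic distance
```

For this toy graph, the isolated node forms a component of its own, and its infinite distances to the other nodes are simply absent from `finite`, in line with the definitions of average distance and diameter given above.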
A local density can also be defined – the clustering coefficient. Let $k_i$ be the number of neighbors of a node $v_i$ and $e_i$ the number of edges between these neighbors. If each pair of neighbors of $v_i$ were connected by an edge, there would be $k_i(k_i-1)/2$ such edges. Therefore, the clustering coefficient $C_i$ of node $v_i$ is as follows:
$$C_i = \frac{2e_i}{k_i(k_i-1)}. \tag{2}$$
Hence, $C_i$ reflects the probability that two arbitrary neighbors of $v_i$ are connected by an edge. The clustering coefficient $C$ of the entire graph $G$ is defined as the arithmetic mean of the clustering coefficients $C_i$ of all nodes. In directed graphs, there can be two edges between each pair of nodes – one in each direction. Taking this into account, both the graph density $\rho$ and the clustering coefficient $C_i$ as defined above have to be divided by two.
Since network analysis methods should provide a better understanding of the underlying empirical structure, some concepts that facilitate a richer interpretation have been proposed. In this context, a very important category is the class of so-called centrality measures. These try to formalize the idea that in many instances some nodes or edges play a more important role than others; hence, they should be considered more central. Three widely used centrality measures, which have a long tradition in the study of social networks (Wasserman et al. 1994), are degree centrality, closeness centrality, and betweenness centrality. The degree centrality $C_D(v)$ of a node $v$ is defined as the number of edges incident to it:
$$C_D(v) = \deg(v). \tag{3}$$
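The two local measures defined above, the clustering coefficient of Eq. (2) and the degree centrality of Eq. (3), can be sketched as follows for an undirected graph stored as a dictionary of neighbor sets. The toy graph and the function names are hypothetical. Eq. (2) is undefined for nodes with fewer than two neighbors; assigning them the value 0 is an assumption made here for illustration.

```python
# Hypothetical undirected toy graph: a triangle a-b-c with a pendant node d.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}


def clustering_coefficient(adj, v):
    """Eq. (2): C_i = 2 e_i / (k_i (k_i - 1)) for node v."""
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumption: the undefined case is mapped to 0
    # Each edge between two neighbors is seen from both of its endpoints.
    e = sum(1 for u in neighbors for w in adj[u] if w in neighbors) // 2
    return 2 * e / (k * (k - 1))


def degree_centrality(adj, v, normalized=False):
    """Eq. (3): number of edges incident to v, optionally divided by n - 1."""
    degree = len(adj[v])
    return degree / (len(adj) - 1) if normalized else degree


# Clustering coefficient of the whole graph: mean of the local coefficients.
average_clustering = sum(clustering_coefficient(adj, v) for v in adj) / len(adj)
```

Node `a` has three neighbors, of which only `b` and `c` are connected, so one of three possible edges is present and its coefficient is 1/3, whereas `b` and `c` each reach the maximum value of 1.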
In a directed graph, two kinds of degree centrality are usually distinguished, namely, in-degree centrality and out-degree centrality. The closeness centrality $C_C(v)$ of a node v is often defined as the reciprocal of the sum of all distances between v and every other node w:

\[ C_C(v) = \frac{1}{\sum_{w \in V \setminus \{v\}} d(v, w)}. \tag{4} \]
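Eq. (4) can be sketched with a breadth-first search, which yields geodesic distances in an unweighted graph. The example assumes a connected, undirected graph; the path graph and the function names are hypothetical, and unreachable nodes are simply ignored here rather than treated as infinitely distant.

```python
from collections import deque

def bfs_distances(adj, source):
    """Geodesic distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness(adj, v):
    """C_C(v) = 1 / sum of distances from v to all other reachable nodes."""
    dist = bfs_distances(adj, v)
    total = sum(d for w, d in dist.items() if w != v)
    return 1 / total if total > 0 else 0.0

# Hypothetical path graph a - b - c - d - e.
adj = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
c_mid = closeness(adj, "c")   # 1 / (1 + 1 + 2 + 2)
c_end = closeness(adj, "a")   # 1 / (1 + 2 + 3 + 4)
# One common normalization multiplies by (n - 1).
c_mid_norm = (len(adj) - 1) * c_mid
```

The node in the middle of the path obtains the highest value, in line with the interpretation that it reaches all other nodes through the fewest intermediaries.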
The betweenness centrality $C_B(v)$ of a node v is typically defined as follows:

\[ C_B(v) = \sum_{u \neq v \in V} \; \sum_{w \neq v \in V} \frac{\sigma_{uw}(v)}{\sigma_{uw}}. \tag{5} \]
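A direct, unoptimized sketch of Eq. (5) counts shortest paths with one breadth-first search per node. It is not the algorithm used by any particular library; the graph and function names are hypothetical. The double sum runs over ordered pairs, so in an undirected graph each pair of nodes contributes twice, whereas some libraries count each unordered pair only once.

```python
from collections import deque

def bfs_counts(adj, s):
    """Return (dist, sigma): distances from s and numbers of shortest paths from s."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # w is reached for the first time
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(adj, v):
    """C_B(v): sum over ordered pairs (u, w) of sigma_uw(v) / sigma_uw."""
    info = {s: bfs_counts(adj, s) for s in adj}
    dist_v, sigma_v = info[v]
    total = 0.0
    for u in adj:
        if u == v:
            continue
        dist_u, sigma_u = info[u]
        for w in adj:
            if w in (u, v) or w not in dist_u:
                continue
            # v lies on a shortest u-w path iff d(u,v) + d(v,w) = d(u,w)
            if v in dist_u and w in dist_v and dist_u[v] + dist_v[w] == dist_u[w]:
                total += sigma_u[v] * sigma_v[w] / sigma_u[w]
    return total

# Hypothetical graph: two triangles (a, b, c) and (d, e, f) joined by the edge c - d.
adj = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
bridge_score = betweenness(adj, "c")
corner_score = betweenness(adj, "a")
```

Every shortest path between the two triangles passes through c, which gives it a high score, while a lies on no shortest path between other nodes.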
Here $\sigma_{uw}$ denotes the number of shortest paths between node u and node w and $\sigma_{uw}(v)$ the number of shortest paths between those nodes that run through v. For all three measures, i.e., degree centrality, closeness centrality, and betweenness centrality, it is also common to calculate normalized values.

The interpretation of these centrality measures is quite intuitive. According to the degree centrality, a node is more important the more neighbors it has; such a node is able to influence many others. A node that lies on many shortest paths between pairs of nodes in the network is important according to the betweenness centrality. Such a node is involved in the interaction between those nodes and is capable of controlling their communication. A node with high betweenness can also be regarded as a bridge connecting two different areas of the network and may assume the role of a bottleneck. When considering closeness centrality, nodes that have in total a small geodesic distance to all other nodes are considered to be more central or important. The node with the highest closeness centrality reaches all the other nodes through a minimum number of intermediaries and is, for example, able to communicate faster with the whole network than any other node. Of course, the applicability and the significance of any centrality index depend on the area observed and on the questions asked.

Another important and well-known centrality measure is PageRank PR (Page et al. 1999). It was introduced in the context of Google Web Search and defines the importance of Web pages in a recursive way, i.e., the centrality of a Web page depends on the number and centrality of the Web pages linking to it:

\[ PR(p_i) = \frac{1-d}{n} + d \sum_{p_j \in N(p_i)} \frac{PR(p_j)}{l_j}, \tag{6} \]
where $PR(p_i)$ is the PageRank of page $p_i$, $N(p_i)$ the set of Web pages that link to $p_i$, $l_j$ the number of outgoing links of page $p_j$, n the total number of pages, and d a damping factor.

There are a number of properties that many real-world networks have in common. One of them is the so-called power-law degree distribution, i.e., the degree
distribution of the network's nodes can be approximated by a function of the form $p(k) = c k^{-\gamma}$, where $k \in \mathbb{N}$ denotes the degree of a vertex and $c \in \mathbb{R}$ and $\gamma \in \mathbb{R}$ are positive constants. This implies that in such a network the majority of nodes have a very low degree, while very few members of the system have a remarkably high number of neighbors, thus acting as hubs for the network. Such networks are also called scale-free. One of the most common mechanisms for obtaining such a network structure is that links are not added randomly but are attached preferentially to specific nodes. This mechanism is therefore called preferential attachment.

Another common property is the so-called small-world phenomenon. The term goes back to an experiment by Stanley Milgram on social networks in the 1960s and expresses that the average distance within such a network is relatively short.

A third property that many real-world networks have in common is a distinctive community structure, i.e., the network's nodes can be divided into groups within which the edges are denser than between different communities. Appropriate measurements can be used to assess the quality of a particular division of a network into communities. The most commonly used measure is called modularity (Newman 2018):

\[ Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j), \tag{7} \]
where $A_{ij}$ is the corresponding element of the adjacency matrix, $k_i$ ($k_j$) the degree of vertex i (j), $c_i$ ($c_j$) the community to which node i (j) is assigned, m the total number of edges, and the δ-function equals 1 if $c_i$ equals $c_j$ and 0 otherwise. Given a specific community structure of a network, the idea is to compare the fraction of edges that fall within the communities present to the expected fraction of such edges if edges were distributed randomly. A high modularity (Q ≈ 1) indicates a strong community structure, whereas a low modularity (Q ≈ 0) means that there is no evidence that the network exhibits a community structure (i.e., the proposed division is not better than random). Finally, many social and economic networks exhibit a very large connected component, which contains the vast majority of all nodes, alongside other components of significantly smaller size.

By looking at such characteristics, methods from network science make it possible to model and to predict structural properties and dynamics on different levels, i.e., at the level of individual nodes or edges, at the level of groups of nodes, or at the level of the entire system. Current research in network science benefits from unparalleled computing power, large data sets, and new computational modeling techniques (Vespignani 2018). In this context, network embeddings, i.e., methods for embedding network data into metric spaces, graph neural networks, and deep generative models of graphs make networks more accessible for machine learning and are specific fields of interest at the moment (Hamilton 2020).
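The modularity of Eq. (7) can be evaluated term by term for a given partition. The sketch below is a literal, quadratic-time transcription of the double sum for an undirected, unweighted graph; the graph, the partitions, and the function name are illustrative assumptions.

```python
def modularity(adj, community):
    """Q = 1/(2m) * sum_{i,j} (A_ij - k_i k_j / (2m)) * delta(c_i, c_j)."""
    two_m = sum(len(nbrs) for nbrs in adj.values())  # sum of degrees equals 2m
    q = 0.0
    for i in adj:
        for j in adj:
            if community[i] != community[j]:
                continue  # delta(c_i, c_j) = 0
            a_ij = 1 if j in adj[i] else 0
            q += a_ij - len(adj[i]) * len(adj[j]) / two_m
    return q / two_m

# Hypothetical graph: two triangles (a, b, c) and (d, e, f) joined by the edge c - d.
adj = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
all_in_one = {v: 0 for v in adj}

q_split = modularity(adj, by_triangle)   # 5/14, roughly 0.357
q_single = modularity(adj, all_in_one)   # 0: no better than random
```

Splitting the graph along the bridge yields a clearly positive value, whereas placing all nodes in a single community gives Q = 0, in line with the interpretation given above.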
Software and Tools

There exist a number of well-established software packages and tools for exploring, analyzing, and visualizing networks. In the following, we briefly describe a few of them; note that almost all of the listed software and tools are free to use.

Nowadays, the programming languages Python (Van Rossum and Drake 2009) and R (R Core Team 2020) together with appropriate libraries are widely used in the context of network science. Both Python and R are well documented and have a very active community of users that provides support whenever needed. However, Python is a general-purpose language and therefore more flexible. Important Python libraries that are suitable for analyzing and manipulating large-scale networks are NetworkX (Hagberg et al. 2008) and Graph-tool (Peixoto 2014). Another relevant library for network analysis tasks is igraph (Csardi and Nepusz 2006). It is written in C and is available as an R and as a Python library. Furthermore, there are various libraries for network embeddings and deep learning, e.g., the Python library PyTorch (Paszke et al. 2019).

In recent years, Gephi (Bastian et al. 2009), a tool for the interactive exploration and visualization of networks, has become very popular. It runs on all systems that support Java. Due to its interactive character, it is not appropriate for large-scale network analysis. NodeXL (Hansen et al. 2010) is a plug-in for Excel and provides a convenient way of importing network data from certain media sources as well as analyzing and visualizing these data. No programming skills are required. However, although there is a free basic version, many relevant functionalities for network science are only available with a license purchase. Large-scale analyses cannot be conducted with NodeXL.
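To give an impression of how such a library is used, the following minimal sketch builds a small directed hyperlink graph with NetworkX, counts its strongly and weakly connected components, and evaluates PageRank by iterating Eq. (6). The site names and link structure are hypothetical, and the hand-written `pagerank` function assumes that every node has at least one outgoing link. NetworkX also provides its own `pagerank` function; the explicit loop is used here only to mirror the formula.

```python
import networkx as nx

# Hypothetical hyperlink graph: an edge (u, v) means "site u links to site v".
G = nx.DiGraph()
G.add_edges_from([
    ("hotel", "dmo"), ("tour_op", "dmo"), ("blog", "dmo"),
    ("blog", "hotel"), ("dmo", "hotel"), ("dmo", "tour_op"),
])

# Connected components with and without regard to edge orientation.
n_scc = len(list(nx.strongly_connected_components(G)))
n_wcc = len(list(nx.weakly_connected_components(G)))

def pagerank(G, d=0.85, iterations=100):
    """Fixed-point iteration of Eq. (6); assumes every node has out-degree >= 1."""
    n = G.number_of_nodes()
    pr = {p: 1 / n for p in G}
    for _ in range(iterations):
        pr = {
            p: (1 - d) / n
            + d * sum(pr[q] / G.out_degree(q) for q in G.predecessors(p))
            for p in G
        }
    return pr

ranks = pagerank(G)
top = max(ranks, key=ranks.get)
```

In this toy graph the site that receives links from all other sites obtains the highest PageRank, while the site without incoming links keeps only the baseline share (1 − d)/n.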
Network Science in e-Tourism

Although many aspects in e-tourism are highly complex and exhibit a rich relational structure (as also pointed out in Werthner et al. (2015)), a surprisingly low number of works apply methods from network science. To illustrate this, the proceedings of the ENTER e-tourism conference of the last 15 years (i.e., from 2007 to 2021) have been systematically searched for full papers applying this methodology, i.e., papers that refer to network science or network analysis when describing their approach. Only a very small fraction of the papers in the ENTER proceedings meets this criterion. In the following, all the papers identified are briefly summarized in chronological order.

In Baggio et al. (2007), the network statistics and topology of the Italian tourism destination Elba are investigated. To this end, the websites of companies and services related to tourism, such as accommodation (i.e., hotels, apartments, camping sites, etc.), intermediaries (i.e., travel agencies and tour operators), and means of transport, as well as the hyperlinks between them, are analyzed. The data used for the study was obtained with the help of a Web crawler and manually enhanced.
In Baggio and Corigliano (2009), a follow-up study is presented, where random walks are used to simulate the behavior of a Web surfer who navigates through the Elban network. Additional links are subsequently added in the simulation, and their impact on the navigability is assessed.

In Piazzi et al. (2012), a network analysis of the Austrian e-Tourism Web graph is conducted. Properties of the Web graph are described. The data was collected with the help of a Web crawler.

The study presented in Baggio and Del Chiappa (2013) assesses virtual as well as physical connections among stakeholders within the tourism destinations Elba and Livigno in Italy. Physical entities represent the actual companies and organizations, whereas virtual entities represent the websites of tourism organizations. The study shows that these components are strongly coupled and coevolve.

In Stienmetz and Fesenmaier (2014), patterns of traveler activities are studied in Baltimore, Maryland (USA). A weighted B2B network is constructed, where the nodes represent attractions and the links represent the sharing of visitors between these attractions. Centrality measures are calculated, and community detection helps to find clusters of attractions. The data was collected with the help of an online survey filled out by travelers who had recently visited Baltimore.

In Inversini et al. (2015), Twitter data related to the Glastonbury music festival in the UK is used to examine how event stakeholders engage in socially motivated online discussions. Tweets were collected based on certain keywords. The nodes in the network represent users and the links represent interactions such as retweets. A text analysis of the content of the tweets is also conducted.

In Marchiori et al. (2016), outgoing links of 161 websites of hotels within the tourism destination Ticino in Switzerland are studied. However, the study is qualitative, i.e., the outgoing links were manually checked to determine to what extent they refer to destination-related content.

Another work (Akbar et al. 2016) looks at the dissemination of content in multiple online channels. The goal is to maximize the audience reached. In the formalization of the problem, communication channels are seen as nodes and the communication flow as edges between them. Linear optimization is used to identify the optimal publication flow. An experiment is conducted using social media posts from Austrian hotels and restaurants. A Web crawler was used to gather the data.

In Neidhardt et al. (2016), user interactions and user-generated content in an online travel forum are investigated. In addition to the network structure of the users' discussions, the sentiment of their postings is taken into account, and a joint analysis is conducted. The goal is to assess whether the sentiments are contagious. The data was obtained from a collaboration with a travel start-up.

In Baggio (2017), mobile data is used to study movement patterns of travelers in Fribourg, a canton in Switzerland. From these patterns, a network is constructed, and properties of the network are identified. The data stem from a collaboration with Swisscom, a major Swiss telecommunication provider.

In Stienmetz and Fesenmaier (2017), a model to facilitate the understanding of destination value creation is proposed. Based on geo-tagged Flickr photos and tax records from Florida, network structure and travel expenses for destinations in
Florida are investigated. The data was downloaded with the help of the Flickr API. Based on these data, traveler activity networks are constructed, i.e., activity paths for the single users are created for each destination in each year of observation. Out of all activity patterns, a joint network is constructed, which captures how many tourists took certain trip segments. For each network in each year, a separate network analysis is conducted. The network statistics serve as input for a feasible generalized least squares (FGLS) model.

In Stienmetz (2018), touristic movement patterns and the expressed sentiments of the tourists are investigated within London. Photos plus metadata were downloaded using the Flickr API. Visitors and residents are distinguished based on the number of days between the first and the last photo taken (one month is used as the threshold). The networks are constructed in the same way as in Stienmetz and Fesenmaier (2017). Based on network analysis, differences between visitors and residents are identified. In addition, a sentiment analysis of the photo titles, tags, and descriptions is conducted. Here, too, the sentiments of visitors and residents are compared.

In Cheng et al. (2019), a word co-occurrence network based on WeChat comments is constructed and analyzed in order to find out how users perceive travel-related WeChat mini programs, i.e., mini applications within the WeChat platform that can make use of certain features provided by the platform such as e-commerce features or payment features. The data was crawled and downloaded. Some basic network analysis is conducted, i.e., betweenness centrality is calculated and a community detection algorithm is applied to find important words and groups of words, respectively.

Summing up, 13 full papers that apply network science methods were identified in 15 years of ENTER proceedings, i.e., on average less than one per year.
Many of the first papers apply network science to study the Web graph of tourism destinations. One paper compares online connections with offline connections. Another focus is the analysis of online user discussions. Here, the content of user posts is also considered. One paper aims to integrate network and content analysis to also consider the dynamics of user discussions. Furthermore, networks are used to capture activity or movement patterns. This type of analysis started to emerge with the increasing availability of appropriate data such as large-scale data from location-based social networks (LBSN). A semantic network related to traveling is the focus of one paper. Furthermore, one paper looks at dynamic aspects and captures information flow with the help of networks.

The fact that network science has rarely been applied in tourism research has also been pointed out in Baggio (2017), a literature survey on the topic of network science in tourism published a few years ago. Other observations listed above are also confirmed by this survey. It emphasizes, in particular, that, compared to other disciplines, tourism research started to apply concepts and methods from network science rather late. In the beginning, only some qualitative studies were conducted. The survey points out, moreover, that the analysis of structural properties of tourism destinations and supply chains can help policy makers and governance bodies. Furthermore, examples are provided where network science helped to identify the most important actors in a tourism ecosystem. Another application domain for
network science in tourism, which is discussed in the survey, is the study of collaboration and citation patterns among researchers. However, the lack of studies on network dynamics is also clearly highlighted.

Another survey published a few years ago focuses on social network analysis and its application in tourism research (Casanueva et al. 2016). Here, the following main topics were identified: (i) the study of interorganizational relationships, (ii) meta-investigations such as bibliometric studies, (iii) studies that look at social capital, and (iv) the study of destination networks. It is also worth mentioning that there was a special section on network science and e-tourism in the journal Information Technology & Tourism (JITT) in 2018, comprising three full papers and one research note (Baggio and Fuchs 2018).
Conclusion

This chapter aimed to show that networks are a powerful means for modeling and analyzing complex domains such as tourism. Network science offers a rich repertoire of methods and models as well as software and tools that are ready to use, but applications of this methodology in e-tourism are still scarce. On the other hand, in machine learning, recommender systems, or user modeling, travel and tourism is a popular domain of application in academia and industry, also in the context of network science (e.g., graph representation learning for travel time estimation (Li et al. 2018) or using social relations to improve group recommender systems for traveling (Delic et al. 2018)). As also outlined in Neidhardt and Werthner (2018), e-tourism research should aim to establish closer connections to these communities, as this would lead to a fruitful exchange for all sides and could ensure, moreover, that the research contributions are cutting-edge and foster innovations. Furthermore, code and data sharing for studies that apply network science in e-tourism should be encouraged, as this would ensure reproducibility and would also allow the community to do follow-up research and explore further levels of the typically very rich network data.
Cross-References

Big Data Technologies
Business Intelligence in Tourism
Data Mining and Predictive Analytics for E-Tourism
Digital Ecosystems, Complexity, and Tourism Networks
Log File Analysis
Semantic Web Empowered E-Tourism
Web Information Retrieval and Search
References

Akbar Z, Toma I, Fensel D (2016) Optimizing the publication flow of touristic service providers on multiple social media channels. In: Information and communication technologies in tourism 2016. Springer, pp 211–224
Baggio R (2017) Network science and tourism–the state of the art. Tour Rev 72:120–131
Baggio R, Corigliano MA (2009) On the importance of hyperlinks: a network science approach. In: Information and communication technologies in tourism 2009, pp 309–318
Baggio R, Del Chiappa G (2013) Tourism destinations as digital business ecosystems. In: Information and communication technologies in tourism 2013. Springer, pp 183–194
Baggio R, Fuchs M (2018) Network science and e-tourism
Baggio R, Corigliano MA, Tallinucci V (2007) The websites of a tourism destination: a network analysis. In: ENTER, pp 279–288
Barabási AL (2013) Network science. Philos Trans R Soc A Math Phys Eng Sci 371(1987):20120375
Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for exploring and manipulating networks. In: Third international AAAI conference on weblogs and social media
Casanueva C, Gallego Á, García-Sánchez MR (2016) Social network analysis in tourism. Current Issues Tour 19(12):1190–1209
Cheng A, Ren G, Hong T, Nam K, Koo C (2019) An exploratory analysis of travel-related WeChat mini program usage: affordance theory perspective. In: Information and communication technologies in tourism 2019. Springer, pp 333–343
Csardi G, Nepusz T (2006) The igraph software package for complex network research. InterJournal Complex Systems, 1695. https://igraph.org
Delic A, Masthoff J, Neidhardt J, Werthner H (2018) How to use social relationships in group recommenders: empirical evidence. In: Proceedings of the 26th conference on user modeling, adaptation and personalization, pp 121–129
Easley D, Kleinberg J et al (2012) Networks, crowds, and markets. Cambridge Books. New York
Hagberg A, Swart P, Chult DS (2008) Exploring network structure, dynamics, and function using NetworkX. Technical report, Los Alamos National Lab. (LANL), Los Alamos
Hamilton WL (2020) Graph representation learning. Synth Lect Artif Intell Mach Learn 14(3):1–159
Hansen D, Shneiderman B, Smith MA (2010) Analyzing social media networks with NodeXL: insights from a connected world. Morgan Kaufmann, Amsterdam
Inversini A, Sage R, Williams N, Buhalis D (2015) The social impact of events in social media conversation. In: Information and communication technologies in tourism 2015. Springer, pp 283–294
Li Y, Fu K, Wang Z, Shahabi C, Ye J, Liu Y (2018) Multi-task representation learning for travel time estimation. In: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, pp 1695–1704
Marchiori E, Casnati F, Cantoni L (2016) The role of destination in hotels' online communications: a bottom-up approach. In: Information and communication technologies in tourism 2016. Springer, pp 113–125
Neidhardt J, Werthner H (2018) IT and tourism: still a hot topic, but do not forget IT. Inf Technol Tour 20(1–4):1–7
Neidhardt J, Rümmele N, Werthner H (2016) Can we predict your sentiments by listening to your peers? In: Information and communication technologies in tourism 2016. Springer, pp 593–603
Newman M (2018) Networks. Oxford University Press, Oxford
Page L, Brin S, Motwani R, Winograd T (1999) The PageRank citation ranking: bringing order to the web. Technical report, Stanford InfoLab
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L et al (2019) PyTorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32:8026–8037
Peixoto TP (2014) The graph-tool Python library. figshare. https://doi.org/10.6084/m9.figshare.1164194. http://figshare.com/articles/graph_tool/1164194
Piazzi R, Baggio R, Neidhardt J, Werthner H (2011) Destinations and the web: a network analysis view. Inf Technol Tour 13(3):215–228
Piazzi R, Baggio R, Neidhardt J, Werthner H (2012) Network analysis of the Austrian etourism web. In: Information and communication technologies in tourism 2012. Springer, pp 356–367
R Core Team (2020) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org/
Stienmetz JL (2018) Deconstructing visitor experiences: structure and sentiment. In: Information and communication technologies in tourism 2018. Springer, pp 489–500
Stienmetz JL, Fesenmaier DR (2014) Analysing the traveller activities network for strategic design: a case study of Baltimore, MD. In: Information and communication technologies in tourism 2014. Springer, pp 453–465
Stienmetz JL, Fesenmaier DR (2017) Structural implications of destination value system networks. In: Information and communication technologies in tourism 2017. Springer, pp 159–171
Tiropanis T, Hall W, Crowcroft J, Contractor N, Tassiulas L (2015) Network science, web science, and internet science. Commun ACM 58(8):76–82
Van Rossum G, Drake FL (2009) Python 3 reference manual. CreateSpace, Scotts Valley
Vespignani A (2018) Twenty years of network science
Wasserman S, Faust K et al (1994) Social network analysis: methods and applications. Cambridge University Press, Cambridge
Werthner H, Klein S et al (1999) Information technology and tourism: a challenging relationship. Springer, Wien
Werthner H, Alzua-Sorzabal A, Cantoni L, Dickinger A, Gretzel U, Jannach D, Neidhardt J, Pröll B, Ricci F, Scaglione M et al (2015) Future research issues in IT and tourism. Inf Technol Tour 15(1):1–15
Spatial Analytics and Data Visualization
25
Yang Yang
Contents

Introduction
Exploratory Analytics
  A-Spatial Exploratory Methods
  Spatial Network Analysis
  Spatial Clustering
  Point Pattern Analysis
  Exploratory Spatial Data Analysis (ESDA)
  Sequence Analysis
Explanatory Analytics
  A-Spatial Explanatory Methods
  Spatial Interaction Model
  Spatial Econometrics
  Geographically Weighted Regression
Geovisualization
  Choropleth Maps
  Proportional Symbol Maps
  Density and Interpolation Maps
  Flow Maps
  Space-Time Maps
Challenges and Future Directions
  Tourist Identification
  Spatial-Temporal Analysis
  Computation Burden
  Web-Based GIS
Conclusion
Cross-References
References
Y. Yang
Department of Tourism and Hospitality Management, Temple University, Philadelphia, PA, USA
e-mail: [email protected]

© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_34
Abstract

Along with the growing availability of geospatial data generated with information and communications technologies, spatial analytics and data visualization have prevailed in e-tourism research. This chapter systematically reviews the application of spatial analytics and data visualization in tourism. Specifically, the chapter discusses various exploratory analytics, such as spatial network analysis, spatial clustering, point pattern analysis, ESDA, and sequence analysis, and explanatory analytics, such as spatial interaction model, spatial econometrics, and geographically weighted regression. Some popular geovisualization methods are discussed with examples of their e-tourism applications. Lastly, the chapter discusses several major challenges of spatial analytics and geovisualization, including tourist identification, temporal angle in addition to the spatial analysis, computation burden, and web-based GIS applications.

Keywords

Spatial analytics · Geovisualization · Spatial econometrics · GIS
Introduction

With the revolution of modern information and communications technologies (ICTs) and their increasing penetration into the tourism industry, a wider variety of geospatial data have become available from various sources (Pan and Yang 2017b). Advances in ICTs can facilitate location identification via different technologies. First and foremost, WiFi and cellphone roaming networks provide infrastructure to capture the geographic locations of mobile devices; tourists' spatial footprints are now readily available given the popularity of smartphones and associated apps (Shoval and Ahas 2016). Second, Internet users' IP addresses are geo-trackable, albeit with limited precision. This type of data can be aggregated on a large geographic scale for rigorous analysis. Third, the emergence of geographic data based on physical addresses makes it easier to convert such information into spatial coordinates via geocoding (Lo Duca and Marchetti 2019). Lastly, as more mobile devices are equipped with a GPS antenna, high-resolution geographic locations are easily obtainable from satellite-based radio navigation systems.

At least four types of spatial data are prevalent in e-tourism: point data, areal data, dyadic data, and trajectory data. Most geospatial data consist of point data with specific geographic coordinates; examples include geotagged social media posts (Vu et al. 2017), points of interest (POIs) and areas of interest (AOIs) (Yang et al. 2017), and individual visitors' web traffic (Yang et al. 2014). Point data can be easily aggregated into areal units to become areal data, such as Google trends in geo-information for given geographic areas (e.g., states and cities) (Pan and Yang 2017a). Furthermore, point data can be aggregated into origin-destination pairs to reveal typical dyadic data containing geospatial information for an origin and
a destination (Yang et al. 2019a). Point data also include "time stamps," through which a set of geospatial data can be linked to form a sequential route or itinerary (e.g., travel routes) via cellphone tracking. Such data reveal spatial trajectories with time-ordered information about an object's movement through space (Chua et al. 2016).

In sum, the availability of geospatial information greatly contributes to understanding the spatial dimension of tourism. This information also affords tourism researchers and practitioners unprecedented opportunities to monitor and investigate tourism-related spatial activities that would be difficult to track using conventional data sources.

The remainder of this chapter is composed of five main sections. First, exploratory analytics are discussed and applied to track, identify, and recognize spatial patterns embedded in e-tourism data. Second, explanatory analytics are introduced, which incorporate various factors to explain spatial patterns via rigorous spatial modeling. Third, spatial data visualization is presented using various types of maps. Fourth, several overarching challenges and research directions are presented relative to spatial analytics and visualization in e-tourism. Lastly, pertinent conclusions are drawn.
Exploratory Analytics

In spatial analytics, exploratory analytics summarizes the general spatial characteristics of the data. Some traditional exploratory data analysis approaches can be leveraged as a-spatial exploratory methods. Popular spatial exploratory analytics include spatial network analysis, spatial clustering, point pattern analysis, exploratory spatial data analysis, and sequence analysis. These methods have been applied in e-tourism research to better monitor and understand the spatial pattern of e-tourism activities and impacts.
A-Spatial Exploratory Methods Several a-spatial methods that do not explicitly consider spatial information can be used to understand spatial patterns. For example, a location quotient (LQ) is a measure evaluating the concentration of activities and phenomena in a region compared to the nation (or a larger-scale region). Therefore, for areal data, LQs can be used to track e-tourism agglomeration, clustering, and hotspots within a particular region. For example, Batista et al. (2018) used an LQ measure of tourism to plot tourist activities across EU countries based on hotel booking data and traditional tourism statistics. Majewska (2017) developed an LQ measure to understand the patterns of tourism firms in regions of Poland. Another a-spatial method, dimension reduction, can be used to describe patterns in dyadic data. In a typical form of dyadic data (i.e., origin-destination pair data), the origin-destination matrix may contain many dimensions contingent upon the number of origins and/or destinations. Various dimension reduction methods
can transform the high-dimensional data into fewer dimensions. Conventional dimension reduction methods, such as multidimensional scaling (MDS) and factor analysis (FA), have been used to examine the dimensional structure of a dyadic flow matrix. Yang and Durarte (2019) applied MDS to reduce the dimensions of tourist flows between POIs using Foursquare check-in data. The similarity between POIs after MDS was considered the functional distance. The other dimension reduction method, FA, can extract the spatial structure of the matrix by treating each extracted “factor” as a spatial field. Also, the factor loading and factor score of each origin-destination pair can be particularly helpful in identifying linkages between areas embedded in areal data. In e-tourism research, Li and Yang (2017) applied FA to unveil patterns of inter-province tourist flow data based on Chinese Weibo check-in data.
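The location quotient described at the start of this subsection can be illustrated with a short calculation. The sketch below assumes the common definition, in which a region's share of tourism activity is divided by the corresponding national share; the function name and the employment counts are hypothetical and serve only as an illustration.

```python
def location_quotients(tourism, total):
    """Return {region: LQ}, assuming LQ = regional tourism share / national tourism share.

    tourism: tourism activity per region (e.g., tourism jobs or bookings)
    total:   all activity per region (e.g., all jobs)
    """
    national_share = sum(tourism.values()) / sum(total.values())
    return {r: (tourism[r] / total[r]) / national_share for r in tourism}


# Hypothetical counts for three regions
tourism_jobs = {"A": 30, "B": 10, "C": 20}
all_jobs = {"A": 100, "B": 200, "C": 100}

lq = location_quotients(tourism_jobs, all_jobs)
for region, value in lq.items():
    print(region, round(value, 2))
```

Under this definition, a value above 1 suggests that tourism is more concentrated in the region than in the nation as a whole, and a value below 1 suggests the opposite.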
Spatial Network Analysis Derived from graph theory, network analysis (NA) investigates the relations between discrete objects within a network. Specifically, in the context of spatial analytics, a network can be constructed by observing movement over space, with different locations as the nodes and the distances between them as the linkages. By regarding destinations/attractions within a particular area as nodes, NA has been used to investigate the structural and topological configuration of an e-tourism network with dyadic and trajectory data (D’Agata et al. 2013; Shih 2006; van der Knaap 1999). NA output provides valuable insights into the organization, positions, and functions of tourist destinations within a network. A spatial network of flows consists of multiple nodes, which often represent stops or sites of interest along a tour route. Spatial network analysis can unveil the topological pattern of a network and the roles of different nodes in shaping the network (Jin et al. 2018). Various measures can be generated, such as the in-degree (total flow to a specific node), out-degree (total flow from a specific node), degree centrality (the sum of the in-degree and out-degree), and index vergence (the difference between inflow and outflow) (Jin et al. 2018). For a more comprehensive review of the application of network analysis, please refer to the chapter on network analysis in this book.
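The node-level measures listed above can be computed directly from a table of origin-destination flows. The following is a minimal sketch, assuming flows are stored as a dictionary keyed by (origin, destination) pairs; the site names and volumes are hypothetical.

```python
from collections import defaultdict


def flow_measures(flows):
    """Compute in-degree, out-degree, degree centrality, and vergence per node.

    flows: {(origin, destination): flow volume}
    """
    in_deg = defaultdict(float)
    out_deg = defaultdict(float)
    for (origin, dest), volume in flows.items():
        out_deg[origin] += volume   # total flow leaving the origin
        in_deg[dest] += volume      # total flow arriving at the destination
    nodes = set(in_deg) | set(out_deg)
    return {
        n: {
            "in": in_deg[n],
            "out": out_deg[n],
            "degree": in_deg[n] + out_deg[n],    # sum of in-degree and out-degree
            "vergence": in_deg[n] - out_deg[n],  # inflow minus outflow
        }
        for n in nodes
    }


# Hypothetical tourist flows between three sites
flows = {
    ("museum", "harbor"): 40,
    ("harbor", "museum"): 10,
    ("museum", "old_town"): 25,
    ("old_town", "harbor"): 15,
}
measures = flow_measures(flows)
print(measures["harbor"])
```

In this toy network, the harbor receives more flow than it sends (positive vergence), whereas the museum mainly acts as a source of flows.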
Spatial Clustering As a standard multivariate statistical tool, clustering analysis has been widely applied in e-tourism research to segment tourists based on sociodemographics, attitudes, behavior, and loyalty (Dolnicar 2002). In spatial analytics, traditional clustering analysis cannot guarantee that the clustering outcomes will be spatially contiguous; therefore, specific spatial clustering methods have been proposed to classify spatial information (e.g., locations) in e-tourism research. Hasnat and Hasan (2018) compared the validity of three spatial clustering methods in an attempt to
recognize tourist patterns using filtered Twitter data: k-means clustering, density-based spatial clustering of applications with noise (DBSCAN), and the mean shift algorithm. Spatial clustering can also be applied to trajectory data. For example, Grinberger et al. (2014) used a trajectory clustering method to examine resource allocation behavior. This method involves three steps: (1) transforming data into segmented trajectories, (2) computing two time-space measures (i.e., the average stop distance and the ratio between the length of the shortest route and the actual distance), and (3) k-means clustering based on these measures.
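To make the k-means step concrete, the sketch below applies Lloyd's algorithm to a handful of planar coordinates. It is an illustration only and is not the implementation used in the cited studies: the coordinates are hypothetical, distances are assumed to be Euclidean (i.e., projected coordinates), and the initial centers are supplied by hand so that the result is reproducible.

```python
import math


def kmeans(points, centers, iterations=20):
    """Plain k-means (Lloyd's algorithm) on 2-D points with given starting centers."""
    centers = list(centers)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center
        labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


# Hypothetical geotagged posts around two attractions
posts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(posts, centers=[(0, 0), (10, 10)])
print(labels, centers)
```

Because k-means groups points only by distance to cluster centers, it does not by itself filter out noise points, which is one reason density-based alternatives are compared with it in the literature.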
Point Pattern Analysis Point pattern analysis consists of multiple tools designed to recognize the spatial arrangement of points. Several spatial statistics can be used to measure general point patterns. The mean center refers to the arithmetic mean of point coordinates, whereas the median center is the point with the shortest Euclidean distance to all datapoints. Each measure reflects the central tendency of points and geographic features (Su et al. 2020). Based on the center, the standard distance can be calculated to reveal the dispersion of points and geographic features around the center (Majewska 2017). A standard deviational ellipse is a tool used to measure the intensity and orientation of spatial features; that is, it measures the standard deviation of features from the mean center for x-coordinates and y-coordinates, respectively (Derek et al. 2019). The lengths of the axes and orientation of the ellipse are calculated using specific geometry formulas. The K-function represents another powerful tool in point pattern analysis by calibrating the second-order process of point patterns. Specifically, the method investigates whether the individual distributions of points are independent. The number of neighboring features (points) is counted within a given distance of each feature. If this number exceeds the expected value from a random distribution, then the pattern is considered to be clustered at that distance. Using the location data of Chinese A-grade attractions, Wang et al. (2020) applied the K-function method to understand the clustering pattern of these attractions over space, and an aggregated pattern over space was recognized.
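The centrographic measures and the neighbor-counting idea behind the K-function can be sketched in a few lines. The example assumes planar coordinates and hypothetical point locations; the neighbor count shown is only the counting step of the K-function, without edge correction or comparison against a random benchmark.

```python
import math


def mean_center(points):
    """Arithmetic mean of the x- and y-coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def standard_distance(points):
    """Root mean squared distance of the points from the mean center."""
    cx, cy = mean_center(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points))


def mean_neighbors_within(points, d):
    """Average number of other points lying within distance d of each point."""
    n = len(points)
    count = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= d
    )
    return count / n


# Hypothetical attraction locations at the corners of a square
attractions = [(0, 0), (2, 0), (2, 2), (0, 2)]
center = mean_center(attractions)
spread = standard_distance(attractions)
neighbors = mean_neighbors_within(attractions, 2.0)
print(center, spread, neighbors)
```

Repeating the neighbor count over a range of distances and comparing it with the count expected under a random pattern is what allows clustering to be assessed at each distance.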
Exploratory Spatial Data Analysis (ESDA) ESDA represents a set of analytical and visualization methods that can reveal spatial distributions, detect hotspot patterns, and suggest spatial regimes or other forms of spatial heterogeneity (Anselin 1988). Global and local methods are both included in ESDA. For the global method, the global Moran’s coefficient (i.e., Moran’s I ) can be used to detect the existence of global spatial autocorrelation among particular variables across space. According to Anselin (2001), spatial autocorrelation is defined as the coincidence of value similarity with location similarity: a positive spatial autocorrelation between spatial units indicates that they are surrounded
by neighbors with similar values of a particular variable, whereas a negative spatial autocorrelation implies that spatial units share dissimilar values with their neighbors. The most frequently used measurement of global spatial autocorrelation is the global Moran’s coefficient (Cliff and Ord 1981), which is written as follows:
$$ I = \frac{n}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i}(x_i - \bar{x})^2} $$
where i and j index the spatial units and x̄ is the mean value of x. W denotes the spatial weighting matrix: the elements wii on the diagonal are set to zero, and wij indicates the way spatial unit i spatially connects to unit j. The statistic typically ranges from −1, suggesting a perfect negative autocorrelation, to 1, suggesting a perfect positive autocorrelation. A value near 0 indicates no spatial autocorrelation. Despite its utility, a global test of autocorrelation can mask localized correlation patterns. The local indicator of spatial association (LISA) can confirm the importance of hotspots. Anselin (1995) defined LISA as any statistic satisfying two criteria: (1) the LISA for each observation suggests important hotspots with similar values around that observation; and (2) the sum of the LISA for all observations is proportional to a global indicator of spatial association. Several local spatial statistics satisfy these requirements, including local Moran’s I statistics, local Geary’s C statistics, and Getis-Ord G and G∗ statistics. The local version of Moran’s I statistic for each spatial unit i is written as:
$$ I_i = \frac{(x_i - \bar{x})}{m_0} \sum_{j} w_{ij}(x_j - \bar{x}), \quad \text{with } m_0 = \sum_{i}(x_i - \bar{x})^2 / n $$
A positive value of Ii indicates spatial clustering of similar values; a negative value indicates spatial clustering of dissimilar values. The local Geary’s C is written as:
$$ C_i = \frac{1}{m_0} \sum_{j} w_{ij}(x_i - x_j)^2, \quad \text{with } m_0 = \sum_{i}(x_i - \bar{x})^2 / n $$
Getis and Ord suggested two versions of G statistics. The first metric does not include the focal point and is specified as follows:
$$ G_i = \frac{\sum_{j,\, j \neq i} w_{ij} x_j}{\sum_{j,\, j \neq i} x_j} $$
The other metric, including the focal point, is specified as:
$$ G_i^{*} = \frac{\sum_{j} w_{ij} x_j}{\sum_{j} x_j} $$
When the G∗i and Gi statistics are expressed in standardized form, their expected values under the null hypothesis are each zero and their variance is one, so they can be interpreted directly as z-values. In e-tourism research, van der Zee et al. (2020) used Getis and Ord hotspot and coldspot analysis (i.e., G statistics) to unveil the geographic pattern of urban tourism in five cities based on TripAdvisor restaurant review data. Similarly, Kirilenko et al. (2019) applied G∗i statistics to identify three hotspots in Florida based on TripAdvisor attraction reviews. Salas-Olmedo et al. (2018) used global and local Moran’s I when analyzing tourists’ digital footprints from three e-tourism datasets: Panoramio, Foursquare, and Twitter. For supply-side analysis, Majewska (2017) utilized local Moran’s I to identify the location patterns of tourism accommodations and attractions in Poland.
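The global and local Moran's I statistics can be computed directly from their definitions. The following is a minimal sketch that assumes a small, hypothetical dataset of four spatial units arranged in a line, with binary contiguity weights; it is meant to illustrate the arithmetic rather than to replace a dedicated spatial statistics package, and it omits significance testing.

```python
def morans_i(x, w):
    """Global Moran's I for values x and spatial weights w (list of rows)."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]                      # deviations from the mean
    s0 = sum(sum(row) for row in w)                # sum of all weights
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in z)
    return (n / s0) * (num / den)


def local_morans_i(x, w):
    """Local Moran's I_i for every spatial unit."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    m0 = sum(v * v for v in z) / n
    return [z[i] / m0 * sum(w[i][j] * z[j] for j in range(n)) for i in range(n)]


# Hypothetical review counts in four zones along a street (neighbors share a border)
x = [1, 2, 3, 4]
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
global_i = morans_i(x, w)
local_i = local_morans_i(x, w)
print(round(global_i, 4), [round(v, 2) for v in local_i])
```

In this example the local statistics sum to the global statistic multiplied by the sum of the weights, which illustrates the second LISA criterion: the local values add up to a quantity proportional to the global indicator.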
Sequence Analysis Sequence analysis is tailored to trajectory data generated from various digital data sources in tourism. This analytical approach uncovers sequential points along tourists’ trajectories to reveal visitors’ travel sequences and preferences (Cai et al. 2018). Bermingham and Lee (2014) applied the PrefixSpan algorithm to investigate frequent visitation sequences among regions of interest using geotagged tourism photos. According to Vu et al. (2017), the sequential rule consists of two essential metrics: support describes the popularity of a rule, and confidence shows the sequence pattern between any items in the trajectory. Zheng et al. (2017) proposed a heuristic prediction algorithm to understand the trajectory sequence. Unlike the Markov-based model that solely considers a one-step transition between locations, this new method can incorporate historical information on the basis of multistep transition probability. Cai et al. (2018) leveraged semantic information in geotagged photo data of individual tourists to conduct semantic trajectory pattern mining, and the semantic trajectory patterns were analyzed to provide itinerary recommendations. Exploratory spatial analytics can extract the spatial information embedded in the data and provide insight into the spatial patterns and characteristics of the data. They are typically employed as the first step of spatial analytics. However, exploratory analytics can hardly provide rigorous inferences about the underlying factors that explain spatial patterns and associations.
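The support and confidence metrics mentioned under sequence analysis can be illustrated with a small example. The sketch assumes the definitions commonly used in rule mining, which may differ in detail from those in the cited studies: support is the share of all trajectories in which the first location is later followed by the second, and confidence is that count divided by the number of trajectories containing the first location. The trajectories are hypothetical.

```python
def rule_metrics(trajectories, first, then):
    """Support and confidence of the sequential rule first -> then."""
    with_first = 0
    with_rule = 0
    for traj in trajectories:
        if first in traj:
            with_first += 1
            start = traj.index(first)
            if then in traj[start + 1:]:   # 'then' must occur after 'first'
                with_rule += 1
    support = with_rule / len(trajectories)
    confidence = with_rule / with_first if with_first else 0.0
    return support, confidence


# Hypothetical visit sequences reconstructed from geotagged photos
trajectories = [
    ["museum", "harbor", "park"],
    ["museum", "park"],
    ["harbor", "museum"],
    ["park", "harbor"],
]
support, confidence = rule_metrics(trajectories, "museum", "harbor")
print(support, round(confidence, 3))
```

Here the rule "museum, then harbor" holds in one of four trajectories, and in one of the three trajectories that include the museum.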
Explanatory Analytics Unlike exploratory analytics, explanatory analytics have been applied to better elucidate spatial patterns and associations and to identify factors shaping said patterns and associations.
A-Spatial Explanatory Methods Despite neglecting spatial information, some traditional explanatory tools can be used in spatial analytics. For example, point data can be aggregated at a particular geographic unit, and regression analysis can be performed at the unit level. In an effort to understand the distribution pattern of US restaurants based on online POI data, Yang et al. (2017) used zonal regression to explain the number of restaurants within each US zip code area. The discrete choice model (DCM) has shown particular promise in modeling efforts with dyadic and trajectory data. This approach can identify the factors that influence destination choice in dyadic data (e.g., destination attractiveness, accessibility, and weather conditions) and the choice of the next stop in trajectory data (Nicolau 2017). The specification of DCM is consistent with random utility theory, which postulates that when facing an array of choices, tourists will select the most “reasonable” choice to maximize the corresponding utility subject to certain constraints (Eymann and Ronning 1997).
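As an illustration of how a discrete choice model turns utilities into choice probabilities, the sketch below assumes a multinomial logit form, which is one common member of the DCM family and is not necessarily the specification used in the cited studies. The utility function, its coefficients, and the destination attributes are hypothetical.

```python
import math


def utility(attractiveness, travel_cost, beta_attr=1.0, beta_cost=-0.5):
    """Hypothetical linear utility: higher attractiveness helps, higher cost hurts."""
    return beta_attr * attractiveness + beta_cost * travel_cost


def choice_probabilities(utilities):
    """Multinomial logit probabilities from {alternative: utility}."""
    top = max(utilities.values())                     # subtract the max for numerical stability
    expv = {a: math.exp(v - top) for a, v in utilities.items()}
    total = sum(expv.values())
    return {a: e / total for a, e in expv.items()}


# Hypothetical destinations described by (attractiveness, travel cost)
destinations = {"beach": (3, 2), "city": (2, 0), "mountain": (1, 2)}
utilities = {name: utility(*attrs) for name, attrs in destinations.items()}
probs = choice_probabilities(utilities)
print({name: round(p, 3) for name, p in probs.items()})
```

In estimation, the coefficients would be chosen to maximize the likelihood of the observed choices; here they are fixed by hand simply to show how utilities translate into probabilities.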
Spatial Interaction Model In addition to describing patterns, gravity-based spatial interaction models can be used to explore various types of movement/flow in dyadic e-tourism data with the information of origin and destination (e.g., credit card transaction data, mobile phone data, and geotagged social media data). Spatial interaction is a common phenomenon that captures the realized movement of people, commodities, money, information, and technology between origins and destinations. The spatial pattern of e-tourism flows is shaped by spatial interaction between tourist origins and destinations. Theoretically, this spatial interaction can be elucidated by classical Newtonian principles. The spatial interaction model (SIM) has been applied to capture this interaction, the basic equation of which is:
$$ T_{ij} = G \frac{P_i A_j}{D_{ij}^{b}} $$
where i is the origin region, j is the destination region, Tij is the number of tourist arrivals from i to j, Pi is the population of the origin region, Aj is the destination attractiveness, and Dij is the distance between the origin and destination. G and b are parameters to be estimated. The SIM family includes several model specifications.
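The basic gravity equation can be evaluated directly once the parameters are known. The sketch below uses hypothetical values for G and b, as well as for the population, attractiveness, and distance, purely to show how the distance-decay parameter shapes predicted flows.

```python
def gravity_flow(population, attractiveness, distance, G=1.0, b=2.0):
    """Predicted flow T_ij = G * P_i * A_j / D_ij**b."""
    return G * population * attractiveness / distance ** b


# Hypothetical origin with 1 million residents and a destination 100 km away
flow_near = gravity_flow(1_000_000, 50, 100, G=0.001, b=2.0)
flow_far = gravity_flow(1_000_000, 50, 200, G=0.001, b=2.0)
print(flow_near, flow_far)
```

With b = 2, doubling the distance cuts the predicted flow to one quarter. In empirical work, G and b are typically estimated from observed flows, for instance after taking logarithms so that the model becomes linear in its parameters.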
The sizes of origin areas and destinations are important factors as indicated by the functional forms of SIMs. The size of an origin area describes the potential for spatial interaction, whereas a destination’s size reflects its attractiveness and opportunities. Taking e-tourism demand as a brief example, GDP and population can be considered size factors for origin areas and destinations; the number of tourism opportunities and visit costs represent size variables for destinations. Measuring the separation between origin areas and destinations is similarly essential in SIM. In most cases, geographic distance is considered reflective of physical separation. However, as tourists use various transportation modes when traveling, travel time and transport costs can also serve as proxies for geographic distance in SIM (Taplin and Qiu 1997; Um and Lee 1998). Another distance measure that has often been overlooked in prior research is cultural distance, which assesses the proximity between cultures. Potential cultural conflicts are considered barriers to international travel and can thus negatively influence international tourist flows (Yang and Wong 2012). Several other measures have been used to represent separation, such as common borders (physical separation), common language (cultural separation), colonial links and visa restrictions (political separation), and shared currency unions and free trade agreements (economic separation) (Yang and Durarte 2019).
Spatial Econometrics As an econometric tool designed specifically for spatial analytics, spatial econometric models capture spatial dependence in a model. The two essential specifications of spatial econometric models are a model that contains spatial error autocorrelation and a model that includes a spatially lagged dependent variable. In spatial econometrics, the former is a spatial error model (SEM), and the latter is a spatial autoregressive model (SAR). A SAR model is specified as:
$$ Y_i = \delta \sum_{j=1}^{N} w_{ij} Y_j + X_i \beta + \varepsilon_i, \quad E(\varepsilon_i) = 0, \quad E(\varepsilon \varepsilon^{\prime}) = \sigma^2 I_N $$
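A SEM model is specified as:
$$ Y_i = X_i \beta + \varphi_i, \quad \varphi_i = \lambda \sum_{j=1}^{N} w_{ij} \varphi_j + \varepsilon_i, \quad E(\varepsilon_i) = 0, \quad E(\varepsilon \varepsilon^{\prime}) = \sigma^2 I_N $$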
where i and j are indices for spatial units (cross-sectional dimension), with i, j = 1, ..., N. wij is an element of the spatial weighting matrix W. εi is an independently and identically distributed error term for i. In SAR, δ is called the spillover coefficient, used to measure spillover effects of the dependent variable
based on the specified spatial weighting matrix. In SEM, λ is called the spatial autocorrelation coefficient, used to capture spillover effects of unobserved factors in the model. Apart from these two basic forms, other specifications apply to spatial econometric models. Accurate specification is vital when establishing spatial panel models, as misspecification leads to biased estimation and unreliable results. Various tests have been developed to choose accurate specifications of spatial dependence (i.e., spillover effects) within spatial econometric models: the single cross-section Lagrange multiplier (LM) test statistic for the SEM specification, $LM_E$ (Burridge 1980), and the single cross-section LM test statistic for the SAR model, $LM_L$ (Anselin 1988). Specification of the spatial weighting matrix W is pivotal for identifying the spatial channels of spillover effects. This matrix can be specified in three ways: by nearest-neighbor weights, distance-based weights, and contiguity-based weights. The nearest-neighbor spatial weighting matrix is specified as follows:
$$ w_{ij}(k) = \begin{cases} 0 & \text{if } i = j \\ 1 & \text{if } d_{ij} \le D_i(k) \\ 0 & \text{if } d_{ij} > D_i(k) \end{cases} $$
where dij is the distance between the centroids of regions i and j. Di(k) is the critical cutoff distance defined for each region i (the distance from i to its kth nearest neighbor); k indicates the number of nearest neighbors specified in the matrix. For distance-based weights, it is assumed that spatial interactions are negligible beyond a certain distance. Thus:
$$ w_{ij} = \begin{cases} 0 & \text{if } i = j \\ 1 & \text{if } d_{ij} \le D \\ 0 & \text{if } d_{ij} > D \end{cases} $$
where dij is the geographic distance between the centers of two regions and D is the selected distance threshold above which spillover effects are considered insignificant. Researchers can also assign distance decay weights, such as $w_{ij} = d_{ij}^{-\alpha}$, where α > 0. Contiguity-based weights are specified as wij = 1 if regions i and j share a common boundary, and wij = 0 otherwise.
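The nearest-neighbor and distance-based weighting schemes, together with row standardization, can be sketched as follows. The example assumes planar centroid coordinates and Euclidean distances; the coordinates, the function names, and the threshold are hypothetical, and ties in the nearest-neighbor search are broken by index order.

```python
import math


def distance_band_weights(coords, threshold):
    """w_ij = 1 if i != j and d_ij <= threshold, else 0."""
    n = len(coords)
    return [
        [1.0 if i != j and math.dist(coords[i], coords[j]) <= threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def knn_weights(coords, k):
    """w_ij = 1 if j is among the k nearest neighbors of i, else 0."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )
        for j in others[:k]:
            w[i][j] = 1.0
    return w


def row_standardize(w):
    """Divide each row by its sum so that rows with neighbors sum to one."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


# Hypothetical region centroids along a line
centroids = [(0, 0), (1, 0), (3, 0), (7, 0)]
w_band = distance_band_weights(centroids, threshold=2.5)
w_knn = knn_weights(centroids, k=1)
w_std = row_standardize(w_band)
print(w_band, w_knn, w_std, sep="\n")
```

Note that the distance-band matrix leaves the most remote region without neighbors, whereas the nearest-neighbor matrix guarantees every region exactly k neighbors at the cost of being asymmetric.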
The spatial weighting matrix can also be specified in other ways, such as on the basis of economic distance and institutional proximity. To obtain unbiased and consistent estimates, the spatial weighting matrix must be exogenous. Usually, this matrix is row-standardized to render interpretation more intuitive. To estimate the model, the maximum likelihood estimator yields efficient estimates via the log-likelihood function. Researchers can also use two-stage least squares regression analysis or the generalized method of moments to estimate the SAR model, considering the endogeneity problem (LeSage and Pace 2009). The
efficiency of alternative estimators depends on the instrumental variables specified. In the context of e-tourism, Kim et al. (2022) applied the spatial econometric model to investigate the spatial spillover of tourism demand embedded in the Flickr photo data. Specifically, the spillover effect can be calibrated based on the spatially lagged terms with a spatial weighting matrix in the model.
Geographically Weighted Regression Spatial heterogeneity suggests that the spatial process may vary over space, and the existence of spatial non-stationarity calls for a local model to capture the localized relationship. Geographically weighted regression (GWR) is the most popular tool to capture spatial heterogeneity by allowing the coefficient to vary across space. This form of regression is written as:
$$ y_i = a_0(u_i, v_i) + \sum_{k} a_k(u_i, v_i) x_{ik} + \varepsilon_i $$
where ui, vi denote the respective coordinates of the ith point and the coefficients a0 and ak become functions of geographic coordinates, suggesting that the coefficients may vary over space. GWR is a local multivariate regression function that weights data samples by their spatial proximity. This method produces a separate set of regression parameters and model goodness-of-fit for each local subset of observations across the study area. The model relaxes the assumption in traditional OLS models that relationships (i.e., regression coefficients) between the dependent and independent variables being modeled remain constant across a study area. Under the GWR framework, estimated regression coefficients can vary over the geographic space, highlighting the spatial heterogeneity of relationships between dependent and independent variables. When using GWR, the parameters can be estimated by solving the following equation:
$$ \beta(g) = (X^{T} W(g) X)^{-1} X^{T} W(g) Y $$
where W(g) is the weight matrix, denoting the weight given to each observation according to its proximity to location g. Note that this weight matrix differs from the spatial weighting matrix in spatial econometrics. The weight function can be determined using several approaches. Three common methods are the Gaussian function, the exponential function, and the tri-cube function. In the case of the Gaussian function, the weight for observation i is as follows:
$$ W_i = \phi\left(d_i / (\sigma \theta)\right) $$
where ϕ denotes the standard normal density, and σ represents the standard deviation of the distance; di is the Euclidean distance between the location of observation i and location g; and θ is a quantity known as the bandwidth of sampled observations. Another weighting scheme is the exponential function:
$$ W_i = \exp(-d_i / \theta) $$
Still another approach is based on the tri-cube function, which is specified as:
$$ W_i = \left(1 - (d_i / q_i)^3\right)^3 I(d_i < q_i) $$
where qi denotes the distance of the qth nearest neighbor to the observation and I(·) is an indicator function that equals one when the condition holds and zero otherwise. The bandwidth may be defined either by a fixed number of nearest neighbors or by a given distance as a threshold. The optimal number of nearest neighbors and distance thresholds can be determined by minimizing the cross-validation (CV) statistic or the Akaike information criterion (AIC) score (Hurvich et al. 1998). The AIC method is advantageous because it is more generally applicable than CV statistics, especially when a more complicated model form is used. Additionally, the AIC approach can be used to select between several competing models by accounting for differences in model complexity (Fotheringham et al. 2002). In e-tourism research, Soler and Gemar (2018) applied GWR to model TripAdvisor room rates, and their results unveiled distinct submarkets within the same destination.
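The local estimation idea behind GWR can be illustrated with a model containing an intercept and a single covariate, for which the weighted least squares solution has a simple closed form. The sketch below uses the exponential weighting function with a hand-picked bandwidth; the data are hypothetical and constructed so that the slope is 1 in the west of the study area and 3 in the east.

```python
import math


def exponential_weights(coords, g, theta):
    """W_i = exp(-d_i / theta), with d_i the distance from observation i to location g."""
    return [math.exp(-math.dist(c, g) / theta) for c in coords]


def local_fit(x, y, weights):
    """Weighted least squares for y = a0 + a1 * x, i.e. (X'WX)^-1 X'Wy with X = [1, x]."""
    sw = sum(weights)
    swx = sum(w * xi for w, xi in zip(weights, x))
    swy = sum(w * yi for w, yi in zip(weights, y))
    swxx = sum(w * xi * xi for w, xi in zip(weights, x))
    swxy = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    det = sw * swxx - swx ** 2
    a1 = (sw * swxy - swx * swy) / det
    a0 = (swy - a1 * swx) / sw
    return a0, a1


# Hypothetical hotels: three in the west (y = x) and three in the east (y = 3x)
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [1, 2, 3, 3, 6, 9]

west = local_fit(x, y, exponential_weights(coords, g=(1, 0), theta=1.0))
east = local_fit(x, y, exponential_weights(coords, g=(11, 0), theta=1.0))
print(west, east)
```

A single global regression on these data would report one compromise slope, whereas the locally weighted fits recover the two regimes. In practice, the bandwidth would be selected by CV or AIC rather than fixed by hand.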
Geovisualization Visualization represents an important step in spatial analytics, and many hidden facets of high-dimensional data can be revealed using certain visualization tools. Many visualization approaches are closely linked to exploratory and explanatory analytics. For example, as a result of point pattern analysis, the standard deviational ellipse is often demonstrated on a map. ESDA depends heavily on geovisualization, especially LISA indicators, and a Moran significance map should be used in addition to LISA (Anselin and Bao 1997). In terms of explanatory analytics, GWR relies on geovisualization to present geographically varying coefficients on a map. In this section, several popular geovisualization tools will be discussed, including choropleth maps, proportional symbol maps, density and interpolation maps, flow maps, and space-time maps.
Choropleth Maps As the most popular type of geovisualization, choropleth maps are appropriate for phenomena exhibiting spatial variation that coincides with unit boundaries. To use this form of mapping, the phenomenon should be uniformly distributed within a given geographic unit (Slocum et al. 2009). Therefore, choropleth maps are best
Fig. 1 Choropleth map of US restaurant growth potential using online data. (Source: created by the author based on the data from Yang et al. 2017)
suited to areal data wherein each area is considered a single mapping unit. The primary design of a choropleth map involves a color scale, and three color schemes are most popular: sequential, divergent, and qualitative (Maciejewski 2018). If data are ordered sequentially, the sequential scheme is useful for demonstrating data values by shade (i.e., darker colors reflect higher values). Similar to the sequential scheme, the divergent scheme can also be used for data with a sequential order; however, the divergent scheme combines two sequential schemes for values above and below a benchmark value (e.g., the zero point). In Fig. 1, a choropleth map from the CARTO platform with a sequential color scheme is presented to visualize restaurant growth potential in US zip code areas; darker areas are associated with higher potential for future restaurant business growth.
Proportional Symbol Maps Proportional symbol maps can be created to visualize numerical data associated with point locations, where the size of a symbol at each location is proportional to the corresponding numerical value of interest (Slocum et al. 2009). Two types of proportional symbols can be used, namely, geometric symbols (e.g., circles and cubes) and pictographic symbols (e.g., icons of tourists or icons of hotels). Figure 2 presents an example of a proportional symbol map displaying the number of overseas tourists to the 48 contiguous US states. Circle sizes represent the number of tourists for each state.
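When circles are used as proportional symbols, a scaling rule is needed to convert values into radii. The sketch below assumes the common convention of making the circle's area, rather than its radius, proportional to the value; the arrival figures and the maximum radius are hypothetical.

```python
import math


def circle_radius(value, max_value, max_radius=30.0):
    """Radius such that circle AREA is proportional to value (assumed convention)."""
    if value <= 0:
        return 0.0
    return max_radius * math.sqrt(value / max_value)


# Hypothetical arrivals for three states
arrivals = {"A": 9_000_000, "B": 2_250_000, "C": 360_000}
largest = max(arrivals.values())
radii = {state: circle_radius(v, largest) for state, v in arrivals.items()}
print(radii)
```

Under this rule, a state with one quarter of the largest value is drawn with half the radius, so that the drawn areas keep the same ratio as the data.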
Fig. 2 Proportional symbol map of overseas tourist number of contiguous states in 2016 (Note: created by the author; only states with a number 100,000 or above are displayed)
Density and Interpolation Maps Density and interpolation maps “generate” new values from surrounding units to better visualize a spatial phenomenon. A density map is used to depict point patterns and assigns a probability density to each pixel of a fine grid over space. The most straightforward type of density map includes small grids to present granular details. For example, Batista et al. (2018) used tourist density grids at a 100 × 100 m resolution to present tourism density in the European Union based on online hotel booking data. Kernel density estimation is particularly useful in the event of a sparse point pattern; it produces a continuous estimate of intensity over space. The kernel intensity estimate, $\hat{\lambda}_k(x)$, can be specified as follows:
$$ \hat{\lambda}_k(x) = \sum_{i=1}^{n} \frac{1}{h^2} k\left(\frac{x - x_i}{h}\right) $$
A major task in kernel density estimation is to determine the kernel function k and the value of the bandwidth h. Several methods have been suggested, including cross-validation. Density maps ultimately display a probability density surface that is continuous over space. Unlike density maps, interpolation maps generate a continuous attribute over space using raw location-based values. This type of map lays a pixel grid over the space and estimates the value at each pixel. Three interpolation methods are available (Brunsdon and Comber 2015). Nearest-neighbor interpolation estimates a new value using the value at the closest observation, whereas the inverse distance weighting approach estimates the value as the weighted average of nearby observations (Salas-Olmedo et al. 2018). The third and most complicated method, the Kriging approach, leverages spatial autocorrelation based on semivariance and a
Fig. 3 Interpolation map of tourists’ sentiment scores regarding smog in Beijing. (Source: created by the author based on the data from Zhang et al. 2020)
semivariogram to estimate the value at each grid point (Slocum et al. 2009). Figure 3 presents an interpolation map displaying tourists’ sentiment scores (from Chinese social media site Weibo) regarding smog in Beijing.
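The kernel intensity estimate and inverse distance weighting can both be written in a few lines. The sketch below assumes a bivariate Gaussian kernel and an inverse-distance power of 2, both of which are choices made for illustration; the point locations and sentiment scores are hypothetical.

```python
import math


def kernel_intensity(x, points, h):
    """Sum over points of (1 / h^2) * k((x - x_i) / h), with a Gaussian kernel k."""
    total = 0.0
    for p in points:
        u2 = (math.dist(x, p) / h) ** 2
        total += math.exp(-0.5 * u2) / (2 * math.pi) / h ** 2
    return total


def idw(x, samples, power=2.0):
    """Inverse distance weighted average of (location, value) samples at x."""
    num = 0.0
    den = 0.0
    for loc, value in samples:
        d = math.dist(x, loc)
        if d == 0:
            return value          # exactly on a sample: return its value
        w = d ** -power
        num += w * value
        den += w
    return num / den


# Hypothetical check-in locations and sentiment scores
checkins = [(0, 0), (1, 0), (0, 1), (5, 5)]
scores = [((0, 0), 2.0), ((2, 0), 4.0)]

density_near = kernel_intensity((0.3, 0.3), checkins, h=1.0)
density_far = kernel_intensity((3, 3), checkins, h=1.0)
midpoint_score = idw((1, 0), scores)
print(density_near, density_far, midpoint_score)
```

Evaluating either function at every pixel of a grid yields the surface that a density or interpolation map displays; a larger bandwidth h produces a smoother density surface.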
Flow Maps With dyadic and occasionally trajectory data, flow maps are necessary to visualize movement between geographic locations. The volume of movement can be reflected in the width of the line connecting locations, while the direction of movement can be indicated by an arrow. Figure 4 offers an example of global tourist movement by pairs of origin and destination countries. Flow maps are also suitable for spatial trajectory data. Chua et al. (2016) developed a visualization tool to depict spatial-temporal patterns of geotagged tweets from tourists in Europe. The system can display three major spatial typological characteristics: circulation (diffusion and scale of flows), direction (directionality of movement), and centrality (popularity among tourists).
Space-Time Maps As discussed in previous sections, temporal information is coupled with geospatial information in most e-tourism datasets. Therefore, clear visualization of spacetime patterns can be especially revealing. Two popular methods are map animation and the space-time cube. Animating maps of the same scope and scheme over time is the most intuitive way to visualize geospatial and temporal information simultaneously (Maciejewski 2018). In this animation, each map at a particular
Fig. 4 Flow map of international tourism demand. (Source: created by the author based on the data from Yang et al. 2019b)
time represents an animation frame. Class interval selection is challenging when designing the animation, as maps should be consistent across time. Space-time cubes provide another approach; the cubes expand a two-dimensional map into a three-dimensional space with time as the third dimension (Maciejewski 2018). Many software packages and tools are available to geovisualize data. First, professional geographical information system packages, such as ESRI ArcGIS, QGIS, and MapInfo, can generate a wide array of maps and visualizations, in addition to offering general spatial data management features. Second, various mainstream statistical software packages, such as SPSS and Stata, incorporate mapping functions for basic geovisualization. Third, many geovisualization libraries are available for programming in R, Python, and Java, which helps better integrate geospatial information. Lastly, several web mapping platforms, such as Carto (https://carto.com/), MapBox (https://www.mapbox.com/), and GeoHey (https://geohey.com/home/), provide easy-to-use portals to geovisualize data.
Challenges and Future Directions Tourist Identification Because many e-tourism data sources are not designed to target tourists specifically (e.g., general social media and cellphone roaming), the tourist population should be identified first. Several methods have been proposed for this purpose. Su et al. (2020) identified two straightforward criteria. First, tourist origin information, which can
be derived from self-reported user profiles, should be distinct from the location where data are collected (i.e., the destination). Second, the length of stay in a destination should be limited; studies have suggested a maximum span of 30 days. However, this oversimplified operationalization of tourists is problematic. Origin information may be inaccurate, and a more precise method is needed to examine individual users’ historical data to identify origin information from the locations where tourists tend to be active. For example, Hasnat and Hasan (2018) proposed a heuristic classifier to distinguish tourists from residents based on active historical coordinates. Furthermore, the use of a duration threshold is overly simplistic; some commuters may be mistakenly included in a tourist sample. Therefore, a more sophisticated framework is necessary to isolate tourist samples. Yang and Durarte (2019) proposed a semi-supervised method to identify tourists based on K-means clustering with manual improvement. Hasnat and Hasan (2018) demonstrated the use of ensemble classifiers, which combine multiple classifiers, to obtain a tourist sample. Other information embedded in e-tourism data can be incorporated to achieve greater classification accuracy, such as social media content and activities or transport modes as revealed by cellphone data (Grinberger and Shoval 2019). Spatial analytics can also be applied to recognize specific types of tourists based on geospatial information. For instance, Zhao et al. (2018) used a fine-grained travel party partition method to identify accompanied tourists in mobile data.
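The two straightforward criteria (an origin that differs from the destination and a stay of at most 30 days) can be expressed as a simple rule. The sketch below is only a baseline illustration of that rule, with hypothetical field names and records; as noted above, it can misclassify users such as commuters, which is why more sophisticated classifiers have been proposed.

```python
from datetime import date


def is_tourist(home, destination, first_seen, last_seen, max_stay_days=30):
    """Rule-based filter: home differs from destination and the stay is short.

    The stay is counted inclusively (first and last day both count), which is
    an assumption made for this illustration.
    """
    stay_days = (last_seen - first_seen).days + 1
    return home != destination and stay_days <= max_stay_days


# Hypothetical users observed in Orlando
records = [
    ("u1", "Boston", date(2019, 7, 1), date(2019, 7, 7)),    # short visit from elsewhere
    ("u2", "Orlando", date(2019, 7, 1), date(2019, 7, 7)),   # local resident
    ("u3", "Boston", date(2019, 7, 1), date(2019, 8, 14)),   # stay longer than 30 days
]
flags = {user: is_tourist(home, "Orlando", start, end) for user, home, start, end in records}
print(flags)
```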
Spatial-Temporal Analysis
As argued by Shoval and Ahas (2016), the big data revolution and various tracking technologies have enabled the seamless integration of spatial and temporal information. A corresponding challenge is to incorporate both types of information into analytics. Extending some spatial analytics to spatiotemporal analysis is relatively intuitive, as in the use of spatial econometric models based on panel data (Elhorst 2010). For many other spatial methods, however, a rigorous spatiotemporal extension can be proposed to unveil patterns along both dimensions.
Computation Burden
Along with advances in ICTs, improved computational power and expanded storage capabilities can generate an unprecedented amount of geospatial data for e-tourism from many big data sources. Such massive-scale data may impede the efficiency of traditional analytical paradigms (Li et al. 2018). Lower efficiency is especially concerning in spatial analytics because of the complexity of the embedded spatial information. To process spatial information more effectively, advanced analytical methods should be developed and tailored to the spatial analysis of large-scale data.
Fig. 5 Web-GIS dashboard monitoring the global impact of COVID-19 pandemic on tourism. (Source: https://experience.arcgis.com/experience/6e1ccb1ee1bb4469871898646aa62f54)
Web-Based GIS
Web-based GIS aims to deliver various GIS functionalities interactively through websites. Web GIS applications offer several advantages given Internet accessibility: they are platform-independent, have low distribution and maintenance costs, and provide user-friendly access for potential audiences to search for, analyze, and visualize data online (Yang et al. 2015). During the COVID-19 pandemic, various dashboards built on web-based GIS played a significant role in monitoring the pandemic, providing future outlooks, and helping governments and other agencies formulate policies (Zhang et al. 2020). In tourism specifically, Yang et al. (2021) developed a web-GIS dashboard to monitor the impact of the pandemic on global tourism with the COVID19tourism index (see Fig. 5). The dashboard compiled data from different sources, such as hotel performance, airport traffic, mobility, and online search, to provide a comprehensive picture of the COVID-19 impact on each country as a tourism destination. A major challenge for web-based GIS platforms is the real-time monitoring of tourism. Such monitoring requires a compatible data feeding structure across different big data sources, as well as efficient data cleaning and analytical algorithms that provide actionable insights in a timely manner.
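One way to picture the "compatible data feeding structure" mentioned above is a shared record schema that every source is mapped into before analysis. The sketch below is purely illustrative: the feed layouts, field names, and the functions `from_hotel_feed`, `from_airport_feed`, and `pivot` are hypothetical and do not describe how the COVID19tourism dashboard or any cited system is built.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Observation:
    """Common record that all feeds are converted into (assumed schema)."""
    country: str    # country code used as the join key
    day: date
    indicator: str  # e.g. "hotel_occupancy"
    value: float


def from_hotel_feed(row: dict) -> Observation:
    # Hypothetical layout: {"iso": "FR", "date": "2020-04-01", "occ": 0.21}
    return Observation(row["iso"], date.fromisoformat(row["date"]),
                       "hotel_occupancy", float(row["occ"]))


def from_airport_feed(row: dict) -> Observation:
    # Hypothetical layout: {"country_code": "FR", "yyyymmdd": "20200401", "passengers": 1200}
    day = datetime.strptime(row["yyyymmdd"], "%Y%m%d").date()
    return Observation(row["country_code"], day,
                       "airport_passengers", float(row["passengers"]))


def pivot(observations):
    """Group harmonized records by (country, day) so indicators line up."""
    table = defaultdict(dict)
    for obs in observations:
        table[(obs.country, obs.day)][obs.indicator] = obs.value
    return dict(table)
```

Keeping one small adapter per source means a new feed can be added without touching the cleaning or mapping steps that operate on `Observation` records.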
Conclusion
This chapter summarizes spatial analytics and geovisualization approaches that can be used to analyze four types of geospatial data in e-tourism: point, areal, dyadic, and trajectory data. In terms of analytical methods, exploratory and explanatory methods both offer a comprehensive toolbox to recognize and investigate the spatial-time dimension embedded in data. Table 1 lists the analytics introduced, which can be categorized into quadrants depending on whether they are a-spatial or spatial and exploratory or explanatory. Notably, certain methods can be applied to analyze more than one type of geospatial data.
Table 1 Summary of spatial analytics in e-tourism
Exploratory analytics
  A-spatial methods
    Location quotient: A
    Dimension reduction: D
  Spatial methods
    Spatial network analysis: D, T
    Spatial clustering: P, A, D, T
    Point pattern analysis: P
    Exploratory spatial data analysis: P, A, D
    Sequence analysis: T
Explanatory analytics
  Regression: P, A
  Discrete choice model: A, D, T
  Spatial interaction model: D
  Spatial econometrics: P, A, D
  Geographically weighted regression: P, A, D
Notes: P point data, A areal data, D dyadic data, T trajectory data
The availability of geospatial data sources and analytical methods enables e-tourism researchers to better understand the spatial (and time) patterns of tourism activities. Geospatial analytics provide vital insights into many specific areas (e.g., geomarketing), helping to determine potential geographic markets, provide travel route recommendations (Bermingham and Lee 2014), evaluate the accessibility of new infrastructure (Yang et al. 2015), monitor overcrowding in urban destinations (Grinberger and Shoval 2019), and assess and enhance the tourist experiencescape (O’Leary and Fesenmaier 2017).
Taken together, this chapter provides intriguing examples of the state of spatial analytics and geovisualization in e-tourism. With the advance of modern ICTs (e.g., 5G networks), the improvement of spatial data analytical approaches (e.g., artificial intelligence), and the continuous development of e-tourism infrastructure (e.g., smart tourism databases), research leveraging spatial analytics and geovisualization is expected to help transform the future of e-tourism research.
Cross-References
Digital Ecosystems, Complexity, and Tourism Networks
Mobile Applications for e-Tourism
Network Science and e-Tourism
References
Anselin L (1988) Spatial econometrics: methods and models. Kluwer Academic Publishers, Dordrecht
Anselin L (1995) Local indicators of spatial association-LISA. Geogr Anal 27(2):93–115
Anselin L (2001) Spatial econometrics. In: Baltagi B (ed) A companion to theoretical econometrics. Blackwell, Oxford
Anselin L, Bao S (1997) Exploratory spatial data analysis linking SpaceStat and ArcView. In: Fischer M, Getis A (eds) Recent developments in spatial analysis. Springer, Berlin/Heidelberg/New York, pp 35–59
Batista e Silva F, Marín Herrera MA, Rosina K, Ribeiro Barranco R, Freire S, Schiavina M (2018) Analysing spatiotemporal patterns of tourism in Europe at high-resolution with conventional and big data sources. Tour Manag 68:101–115
Bermingham L, Lee I (2014) Spatio-temporal sequential pattern mining for tourism sciences. Proc Comput Sci 29:379–389
Brunsdon C, Comber L (2015) An introduction to R for spatial analysis and mapping. Sage, Thousand Oaks
Burridge P (1980) On the Cliff-Ord test for spatial autocorrelation. J R Stat Soc B 42:107–108
Cai G, Lee K, Lee I (2018) Itinerary recommender system with semantic trajectory pattern mining from geo-tagged photos. Expert Syst Appl 94:32–40
Chua A, Servillo L, Marcheggiani E, Moere AV (2016) Mapping Cilento: using geotagged social media data to characterize tourist flows in southern Italy. Tour Manag 57:295–310
Cliff AD, Ord JK (1981) Spatial processes: models and applications. Pion, London
D’Agata R, Gozzo S, Tomaselli V (2013) Network analysis approach to map tourism mobility. Qual Quant 47(6):3167–3184
Derek M, Woźniak E, Kulczyk S (2019) Clustering nature-based tourists by activity. Social, economic and spatial dimensions. Tour Manag 75:509–521
Dolnicar S (2002) A review of data-driven market segmentation in tourism. J Travel Tour Mark 12(1):1–22
Elhorst JP (2010) Spatial panel data models. In: Fischer MM, Getis A (eds) Handbook of applied spatial analysis. Springer, Berlin, pp 377–407
Eymann A, Ronning G (1997) Microeconometric models of tourists’ destination choice. Reg Sci Urban Econ 27(6):735–761
Fotheringham AS, Brunsdon C, Charlton M (2002) Geographically weighted regression: the analysis of spatially varying relationships. Wiley, Chichester
Grinberger AY, Shoval N (2019) Spatiotemporal contingencies in tourists’ intradiurnal mobility patterns. J Travel Res 58(3):512–530
Grinberger AY, Shoval N, McKercher B (2014) Typologies of tourists’ time–space consumption: a new approach using GPS data and GIS tools. Tour Geogr 16(1):105–123
Hasnat MM, Hasan S (2018) Identifying tourists and analyzing spatial patterns of their destinations from location-based social media data. Transp Res Part C Emerg Technol 96:38–54
Hurvich CM, Simonoff JS, Tsai CL (1998) Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. J R Stat Soc Ser B (Stat Methodol) 60(2):271–293
Jin C, Cheng J, Xu J (2018) Using user-generated content to explore the temporal heterogeneity in tourist mobility. J Travel Res 57(6):779–791
Kim YR, Liu A, Stienmetz J, Chen Y (2022) Visitor flow spillover effects on attraction demand: a spatial econometric model with multisource data. Tour Manag 88:104432
Kirilenko AP, Stepchenkova SO, Hernandez JM (2019) Comparative clustering of destination attractions for different origin markets with network and spatial analyses of online reviews. Tour Manag 72:400–410
LeSage JP, Pace RK (2009) Introduction to spatial econometrics. CRC Press, Boca Raton
Li D, Yang Y (2017) GIS monitoring of traveler flows based on big data. In: Xiang Z, Fesenmaier DR (eds) Analytics in smart tourism design. Springer, Cham, pp 111–126
Li J, Xu L, Tang L, Wang S, Li L (2018) Big data in tourism research: a literature review. Tour Manag 68:301–323
Lo Duca A, Marchetti A (2019) Open data for tourism: the case of Tourpedia. J Hosp Tour Technol 10(3):382–398
Maciejewski R (2018) Geovisualization. In: Fischer MM, Nijkamp P (eds) Handbook of regional science. Springer, Berlin, pp 1–19
Majewska J (2017) GPS-based measurement of geographic spillovers in tourism – example of Polish districts. Tour Geogr 19(4):612–643
Nicolau JL (2017) Travel demand modeling with behavioral data. In: Xiang Z, Fesenmaier DR (eds) Analytics in smart tourism design. Springer, Cham, pp 31–43
O’Leary JT, Fesenmaier D (2017) Concluding remarks: tourism design and the future of tourism. In: Fesenmaier DR, Xiang Z (eds) Design science in tourism. Springer, Cham, pp 265–272
Pan B, Yang Y (2017a) Forecasting destination weekly hotel occupancy with big data. J Travel Res 56(7):957–970
Pan B, Yang Y (2017b) Monitoring and forecasting tourist activities with big data. In: Muzaffer U, Schwartz Z, Turk E (eds) Management science in hospitality and tourism: theory, practice and applications. Apple Academic Press, Watertown, pp 43–62
Salas-Olmedo MH, Moya-Gómez B, García-Palomares JC, Gutiérrez J (2018) Tourists’ digital footprint in cities: comparing Big Data sources. Tour Manag 66:13–25
Shih H-Y (2006) Network characteristics of drive tourism destinations: an application of network analysis in tourism. Tour Manag 27(5):1029–1039
Shoval N, Ahas R (2016) The use of tracking technologies in tourism research: the first decade. Tour Geogr 18(5):587–606
Slocum TA, McMaster RB, Kessler FC, Howard HH (2009) Thematic cartography and geovisualization, 3rd edn. Pearson, Upper Saddle River
Soler IP, Gemar G (2018) Hedonic price models with geographically weighted regression: an application to hospitality. J Destin Mark Manag 9:126–137
Su X, Spierings B, Dijst M, Tong Z (2020) Analysing trends in the spatio-temporal behaviour patterns of mainland Chinese tourists and residents in Hong Kong based on Weibo data. Curr Issues Tour 23:1542–1558
Taplin JHE, Qiu M (1997) Car trip attraction and route choice in Australia. Ann Tour Res 24(3):624–637
Um S, Lee CK (1998) An application of the gravity model in a practical setting: estimating the effect of road network improvement in generating foreign tourists’ trips within Bali. Pac Tour Rev 2(1):21–27
van der Knaap WGM (1999) GIS-oriented analysis of tourist time-space patterns to support sustainable tourism development. Tour Geogr 1(1):56–69
van der Zee E, Bertocchi D, Vanneste D (2020) Distribution of tourists within urban heritage destinations: a hot spot/cold spot analysis of TripAdvisor data as support for destination management. Curr Issues Tour 23:175–196
Vu HQ, Li G, Law R, Zhang Y (2017) Travel diaries analysis by sequential rule mining. J Travel Res 57(3):399–413
Wang T, Wang L, Ning Z-Z (2020) Spatial pattern of tourist attractions and its influencing factors in China. J Spatial Sci 65:327–344
Yang L, Durarte CM (2019) Identifying tourist-functional relations of urban places through Foursquare from Barcelona. GeoJournal 86:1–18
Yang Y, Wong KKF (2012) The influence of cultural distance on China inbound tourism flows: a panel data gravity model approach. Asian Geogr 29(1):21–37
Yang Y, Pan B, Song H (2014) Predicting hotel demand using destination marketing organization’s web traffic data. J Travel Res 53(4):433–447
Yang Y, Tang J, Luo H, Law R (2015) Hotel location evaluation: a combination of machine learning tools and web GIS. Int J Hosp Manag 47:14–24
Yang Y, Roehl WS, Huang J-H (2017) Understanding and projecting the restaurantscape: the influence of neighborhood sociodemographic characteristics on restaurant location. Int J Hosp Manag 67:33–45
Yang Y, Li D, Li X (2019a) Public transport connectivity and intercity tourist flows. J Travel Res 58(1):25–41
Yang Y, Liu H, Li X (2019b) The world is flatter? Examining the relationship between cultural distance and international tourist flows. J Travel Res 58(2):224–240
Yang Y, Altschuler B, Liang Z, Li XR (2021) Monitoring the global COVID-19 impact on tourism: the COVID19tourism index. Ann Tour Res 90:103120
Zhang X, Yang Y, Zhang Y, Zhang Z (2020) Designing tourist experiences amidst air pollution: a spatial analytical approach using social media. Ann Tour Res 84:102999
Zhao X, Lu X, Liu Y, Lin J, An J (2018) Tourist movement patterns understanding from the perspective of travel party size using mobile tracking data: a case study of Xi’an, China. Tour Manag 69:368–383
Zheng W, Huang X, Li Y (2017) Understanding the tourist mobility using GPS: where is the next place? Tour Manag 59:267–280
The Hive Mind at Work: Crowdsourcing E-Tourism Research
26
Jing Ge-Stadnyk
Contents
Introduction
The Building Blocks of Crowdsourcing Research
A Crowd of Participants
Research Types
Crowdsourcing Research in E-Tourism
Crowdsourcing E-Tourism Research: Guidelines
Conclusion
Cross-References
References
Abstract
Tourism scholars are increasingly turning to web-based platforms to conduct e-tourism research. The availability of crowdsourcing websites (e.g., Amazon Mechanical Turk or “MTurk”) has made a range of research approaches, including survey and experimental investigations, more efficient. When used to analyze social media data, human intelligence, an essential component of crowdsourcing research, can also help researchers tackle issues unsolvable through automation or machine learning, such as text and image annotation. However, compared to other domains (e.g., social science, computer science), within e-tourism, crowdsourcing research has not yet been fully leveraged as a scientific method. It is argued herein that, in order to move the field forward, e-tourism scholars must better grasp the unique and dynamic structure and principles of crowdsourcing research. This chapter reviews and synthesizes the relevant literature, proposing a set of building blocks upon which crowdsourcing research may be structured, that is, a crowd of participants, crowdsourcing platforms, and research types. Further, it offers seven guidelines to inform e-tourism crowdsourcing research practice: determining research types, choosing crowdsourcing platforms, defining crowdsourced populations, recruiting participants, managing crowds, handling ethical issues, and reporting.
J. Ge-Stadnyk, University of California, Berkeley, Berkeley, CA, USA
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_119
Keywords
Crowdsourcing research · E-tourism · Crowdsourced data · Crowdsourced data analysis · Crowdsourcing platforms · Crowdsourcing research guidelines
Introduction
Academic research across different fields (e.g., psychology, sociology) relies heavily on the use of study participants, and e-tourism research is no exception. Crowdsourcing platforms (e.g., Amazon Mechanical Turk or “MTurk”) constitute a new frontline for scholars, allowing them to harness social wisdom and collective intelligence for their research (Ghezzi et al. 2018). Crowdsourcing, i.e., the act of recruiting a large group of people in the form of an open call (Howe 2006), is employed in a wide range of social and economic activities, such as open innovation and product assessment (Simperl 2015). Enticed by the prospect of reaching large numbers of participants, scholars have moved quickly to unlock the great potential of crowdsourcing as a scientific method. In short, crowdsourcing research aims to acquire meaningful data through identifying, recruiting, and managing specific groups of people with different experiences, skills, and interests (Keating et al. 2013). Compared to traditional survey and experimental studies, crowdsourcing research possesses several advantages, including relatively low cost, participant diversity, flexibility, and sound data quality (Goodman and Paolacci 2017). Because of the central role of human intelligence, crowdsourcing research is also very attractive to those involved in computer science-based studies, particularly in the area of artificial intelligence (AI), generally defined as the simulation of human intelligence processes (e.g., learning, reasoning, self-correction) by computer systems (Russell and Norvig 2016). AI has become a hot, perhaps trendy, topic in e-tourism research; the spectrum of possible applications ranges from data mining techniques and recommendation systems to chatbots and robotics (Goh et al. 2009; Xiang and Fesenmaier 2017).
However, rather than focusing on, and trying to predict, whether AI will displace human workers (as seen in some disciplines), crowdsourcing research is built upon a premise of coexistence and collaboration between human and machine. As crowdsourcing scholars convincingly argue, while AI has advanced to the point where automation and machine learning now allow for certain tasks to be performed with a high degree of efficiency and accuracy, human expertise remains an indispensable resource for tackling
problems that computers find challenging, such as the analysis of social media data (e.g., humor, emoji, memes) (Hill et al. 2015; Simperl 2015). This critical role of human intelligence is also recognized within tourism studies. Ge and Gretzel (2018) point out that automated sentiment analysis in tourism research is challenging, due to the growth of subjective and opinionated information and multimodal communication (e.g., text, images, video). Recently, Gretzel et al. (2020) have emphasized the importance of transparency (i.e., making implicit human values explicit) in e-tourism while critically asserting that AI is unable to make its knowledge and value structures explicit. The authors conclude that transformative e-tourism research should not let AI guide human patterns of speaking and action. To deal with these challenges, e-tourism scholars can and should employ crowdsourcing research, which also empowers individuals to contribute to system-wide knowledge and value co-creation (Xiang and Fesenmaier 2017). Despite the increasing interest in, and value of, crowdsourcing research, scholars across different fields have yet to determine how to fully leverage it as a scientific method (Van Nguyen et al. 2018). Moreover, compared to other domains, crowdsourcing research is less prominent within the field of e-tourism. Several authors have voiced concerns over crowdsourcing platforms, questioning, for example, the reliability and validity of crowdsourced populations or pointing to ethical problems (e.g., power imbalances) (e.g., Chambers and Nimon 2019; Goodman and Paolacci 2017; Sheehan 2018). However, such concerns reflect a lack of understanding of what crowdsourcing research is, the various types it can take, and what its best practices are (Van Nguyen et al. 2018). 
It is thus important to provide a systematic view of crowdsourcing research, inferring theoretical foundations, in order to demonstrate its rich potential for scientific application and informed research practice. This chapter first presents and illustrates the building blocks of crowdsourcing research by reviewing relevant literature; this is followed by a discussion of how such research is currently being used in tourism studies. The chapter then offers guidelines for crowdsourcing research, applicable to e-tourism scholars and to those working in other fields. Lastly, the implications of crowdsourced e-tourism research are discussed.
The Building Blocks of Crowdsourcing Research
This section presents a crowdsourcing research framework comprising three building blocks: a crowd of participants, crowdsourcing platforms, and research types. These building blocks offer a firm conceptual foundation and so can help guide e-tourism scholars as they employ crowdsourcing as a scientific research method. Further, they capture and portray the composition of literature streams currently under scrutiny as well as larger related trends (Fig. 1).
Fig. 1 The Building Blocks of Crowdsourcing Research
A Crowd of Participants
This building block is designed to help researchers identify, recruit, and manage a group of individuals by denoting their respective roles and understanding their motivations for participation. Compared to the types of individuals involved in traditional surveys and experimental studies (e.g., survey panel members and students), participants in crowdsourcing research are typically more diverse. They may come from practical or scientific backgrounds and have a wide variety of experiences, skills, and interests (e.g., amateurs, students, scientists). These diverse specialists and nonspecialists take on the same roles (agent, worker, problem solver), while researchers act as problem owner or employer (Buecheler et al. 2010). While crowdsourced social and economic activities often involve general, undefined groups of people, crowdsourcing research selects participants according to strict criteria in order to ensure targeted, high-quality results (Lenart-Gansiniec 2018). Given that the latter requires greater time and effort, scholars should benefit from understanding participants’ motivations for becoming subjects of crowdsourcing research. Table 1 presents findings from a study by Ghezzi et al. (2018).
Table 1 A summary of motivations of crowdsourced participants. (Adapted from Ghezzi et al. 2018)
Motivation types: intrinsic motivations; extrinsic motivations
Motivations: show one’s creativity; sense of belonging, self-worth, and cooperation; enjoyment and entertainment; psychological compensation and sense of efficacy; social influence and identity; exchange of information; social search; learning; financial rewards; reputation; professional development and career benefits; self-marketing
Crowdsourcing Platforms
This section introduces a general paradigm of crowdsourcing platforms, in the research context, by comparing several platforms that are already well established. This particular building block refers to a variety of crowdsourcing websites (e.g., MTurk, TopCoder) that enable and facilitate research investigations. While a detailed explanation of how available platforms operate is not necessary here, it is important to recognize that each of these platforms specializes in solving different research problems; as such, each attracts different types of potential participants with varied interests and expertise. Although not originally designed for academic research, MTurk came to attract scholars because it specializes in microtasks, making large groups of individuals available to complete relatively small research tasks (e.g., survey participation, data validation). Such tasks usually take a very short amount of time (several minutes, as opposed to hours or days). Payments range from a few cents to a few dollars, depending on the level of effort and amount of time required (Goodman and Paolacci 2017). For instance, Tasci (2017) employed MTurk to conduct an online survey investigating consumer demand for sustainability benchmarks in tourism and hospitality. Collins et al. (2019) also adopted this platform to explore factors influencing outbound medical travel from the USA. Moreover, in examining the use of humor for consumer engagement on social media, Ge (2017) conducted a crowdsourced sentiment analysis by recruiting coders on Witmart, a Chinese crowdsourcing platform similar to MTurk. In contrast, some platforms support large tasks and require participants who have specific expertise or professional skills. For example, TopCoder is a network of over one million mathematicians, engineers, and software developers worldwide who participate in coding competitions (https://www.topcoder.com/about/). InnoCentive is a platform with a broader reach, consisting of over 300,000 engineers, scientists, inventors, businesspeople, and research organizations from over 190 countries (https://www.innocentive.com/about-us/). Appen offers AI- and machine learning-related research projects, such as text and natural language processing and speech and image recognition (https://appen.com). It is noteworthy that the crowdsourced research discussed above differs from recruiting participants through existing survey panels, such as SurveyMonkey Audience, Qualtrics Panels, and StudyResponse.
For example, StudyResponse specifically facilitates online research for behavioral, social, and organizational researchers by distributing email participation requests to pre-recruited participants (http://www.studyresponse.net/index.htm). While conducting crowdsourced research on the abovementioned platforms usually requires researchers to pay participants a nominal fee, citizen science platforms (e.g., Zooniverse, iNaturalist) attract many volunteers offering their time
for science and humanities research, such as generating and processing research data (Keating et al. 2013; Law et al. 2017).
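The microtask payments described above (a few cents to a few dollars for tasks lasting minutes) translate into a simple budgeting exercise when planning a paid crowdsourced study. The sketch below is a rough planning aid under stated assumptions: the function name `estimate_budget` and the `platform_fee_rate` parameter are hypothetical, and actual fee structures vary by platform and should be checked before use.

```python
def estimate_budget(n_participants: int, reward_per_task: float,
                    minutes_per_task: float, platform_fee_rate: float = 0.0) -> dict:
    """Estimate the cost of a paid microtask study.

    platform_fee_rate is a placeholder for whatever commission a platform
    charges on top of rewards; it is an assumption, not a quoted figure.
    """
    if n_participants < 0 or reward_per_task < 0 or platform_fee_rate < 0:
        raise ValueError("counts, rewards, and fee rates must be non-negative")
    if minutes_per_task <= 0:
        raise ValueError("minutes_per_task must be positive")

    rewards = n_participants * reward_per_task
    fees = rewards * platform_fee_rate
    # Effective hourly pay helps judge whether the reward is fair to participants.
    hourly_rate = reward_per_task * 60 / minutes_per_task
    return {
        "rewards": round(rewards, 2),
        "fees": round(fees, 2),
        "total": round(rewards + fees, 2),
        "effective_hourly_rate": round(hourly_rate, 2),
    }
```

Reporting the effective hourly rate alongside the total makes it easier to weigh cost against the fairness of participant compensation.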
Research Types
This building block refers to various types of research projects and their objectives. It helps researchers to explore and fully leverage the great potential of the crowdsourcing method. The first type, problem and idea generation, aims at generating research ideas and identifying research gaps with input from a broad and diverse group of participants (Keating et al. 2013). Buecheler et al. (2010) suggest that crowdsourced ideas and information can potentially help researchers with hypothesis formulation. The consensus is that contributions made by the pool of participants may help researchers overcome group biases (Buecheler et al. 2010; Goodman and Paolacci 2017). This crowdsourcing research type is often implemented through a challenge or contest without an entry fee. However, rewards can be substantial, depending on the platform. For instance, TopCoder winners typically receive $500 to $2,000, while InnoCentive winners receive $10,000 to $100,000 (Buecheler et al. 2010).
The second type, data collection, can include survey responses, targeted data collection, and crowdsourced online data. Crowdsourced survey responses, in particular, have gained momentum in psychological and behavioral studies (e.g., consumer research) because this type of data collection allows researchers to access large groups of study participants (Chambers and Nimon 2019). Moreover, while the traditional convenience sample strategy often leads to debates on external validity due to its strong dependence on a specific group of people (e.g., graduate students), crowdsourced populations permit researchers to easily access participants who are more demographically diverse (Goodman and Paolacci 2017). Behavioral research has also found that, compared with respondents from traditional participant pools (i.e., an undergraduate sample and a sample of organizational employees), MTurk respondents were more ethnically diverse and had more work experience.
Also, the reliability of the data from the crowdsourcing sample is as good as or better than the corresponding university sample (e.g., Behrend et al. 2011; Feitosa et al. 2015). By comparison, crowdsourcing targeted at data collection allows researchers to create unique and specific sets of data, which can be used to supplement traditional social science research data. Research on “emerging tobacco product detection” by Keating et al. (2013) can best illustrate this. To examine the demand-side dynamics of “snus” smokeless tobacco products, the researchers first employed survey questions to gauge participants’ knowledge of and potential interest in using these tobacco products. However, the survey did not capture one dimension that may influence the demand-side dynamics of the nearby local population, that is, the supply of these products. Given this, the researchers asked a question (i.e., where is snus sold in Chicago) and then recruited MTurk workers to call retailers and inquire whether they sold snus tobacco products. This crowdsourcing targeted
data collection helped researchers obtain an entirely new microlevel dataset that supplemented the survey data in critical ways.
Another type of data collection is largely carried out through citizen science. For example, eBird (www.ebird.org) allows users to gather dynamic data (e.g., bird sightings and migrations) and then share it with a community of bird watchers. Participants in this network are able to collectively accumulate large amounts of invaluable data for researchers, such as seasonal distribution changes in bird locations and differential timing in migration patterns (Sullivan et al. 2014). In contrast to those involved in survey response and targeted data collection, participants in these types of networks are mainly intrinsically motivated. In other words, the issues at hand matter to them personally; therefore, they are willing to volunteer. Importantly, this type of data collection allows researchers to identify real-time trends (Keating et al. 2013).
The third type, data analysis, underscores the essential role of nonexperts as well as human intelligence. The premise here is that humans can identify and interpret certain data that cannot be decoded by computer systems, and thus crowdsourced participants may interpret data differently, adding useful insights to findings obtained by other means (Buecheler et al. 2010). One application area focuses on social media data, acknowledging that human linguistic annotation (i.e., of texts or images) is a fast and accurate method and less expensive than computer-based approaches (Sabou et al. 2012). Indeed, Snow et al. (2008) evaluated the quality of nonexpert annotations for natural language tasks (e.g., affect recognition, word similarity) by comparing them against those conducted by expert annotators. The results show high agreement between MTurk nonexpert annotations and existing ones offered by experts. Building on Snow et al.
(2008), Callison-Burch (2009) demonstrated that it is feasible to perform manual evaluations of the quality of machine translations by using MTurk. Specifically, the study recruited bilingual graduate students to translate English sentences into French, German, and Spanish, showing that nonexpert judgments have a high level of agreement with the existing gold-standard judgments of machine translation quality. The study also suggests that MTurk can be employed to calculate a human-mediated translation edit rate (HTER), to conduct reading comprehension experiments with machine translation, and to produce high-quality reference translations. One of the implications for e-tourism is that crowdsourced translators can enhance translation accuracy in cross-lingual information retrieval (such as on hotel websites) (Li and Law 2007). Further, due to their capability to understand sarcasm and nuance in language, crowdsourced coders have also been used to conduct sentiment analysis (Keating et al. 2013). A recent exciting use of crowdsourcing in this area involves the judging of news source quality, a potentially valuable strategy for dealing with the spread of online fake news (Hilton and Azzam 2019). In contrast to human linguistic annotation, challenge-based data analysis and the coding of open-ended survey responses are applied in both online and offline domains. For instance, Kaggle (http://www.kaggle.com) allows researchers to host data analysis competitions. One example is a 2012 US Census competition, where the crowd was asked to predict and visualize census mail return rates. Other researchers have adopted MTurk to conduct
J. Ge-Stadnyk
data analysis through brainstorming approaches, asking participants to identify and explain different dimensions of online data visualizations (Willett et al. 2012). Recently, Jacobson et al. (2018) recruited and trained a large number of MTurk participants to code open-ended survey responses, asking them to select a code to represent each unit of text.
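The comparison between crowdsourced and expert coding described above can be made concrete with a small calculation. The following is a minimal sketch in Python, assuming three hypothetical crowd coders per item and invented sentiment labels; majority voting and Cohen's kappa are illustrative choices here, not the procedures reported by the cited studies.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among one item's crowd annotations.

    With an odd number of coders and two labels there are no ties;
    otherwise the label seen first among the tied ones is returned.
    """
    return Counter(labels).most_common(1)[0][0]


def cohens_kappa(a, b):
    """Chance-corrected agreement between two equal-length label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(a) | set(b)) / (n * n)
    if expected == 1.0:  # both sequences use a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical data: three crowd codes per post, plus one expert code.
crowd = [
    ["pos", "pos", "neg"],
    ["neg", "neg", "neg"],
    ["pos", "neg", "pos"],
    ["neg", "pos", "neg"],
]
expert = ["pos", "neg", "pos", "pos"]

aggregated = [majority_vote(item) for item in crowd]
agreement = sum(x == y for x, y in zip(aggregated, expert)) / len(expert)
kappa = cohens_kappa(aggregated, expert)

print(aggregated, agreement, kappa)
```

Raw agreement is easy to read but ignores agreement expected by chance, which is why a chance-corrected statistic is computed alongside it in this sketch.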
Crowdsourcing Research in E-Tourism
Compared to crowdsourcing research adopted across the aforementioned domains, its application remains scarce in the e-tourism context (Leung et al. 2017). A review of the literature leads to the identification of only a limited number of relevant publications. Moreover, current literature specifically focuses on crowdsourced data collection; crowdsourced data analysis is rarely used (although a few studies mentioned it). For instance, Ge (2017) recruited participants on Witmart (http://www.witmart.com/cn/), a Chinese crowdsourcing platform, to code consumer sentiment. In examining the use of humor for consumer engagement, Ge and Gretzel (2018) recruited crowdsourced coders from Witmart to determine the presence/absence of humor in tourism-related social media posts. Both studies followed seven steps: (1) organizing data, (2) recruiting raters, (3) providing coding instructions, (4) checking the coding results and identifying coding issues, (5) identifying solutions, (6) recoding ambiguous data, and (7) finalizing the results. The literature on crowdsourced data collection sheds light on crowdsourced survey responses, with an almost exclusive focus on MTurk (e.g., Dedeke 2016; Ert et al. 2016; Garrigos-Simon et al. 2016; Ghose et al. 2012; Tussyadiah 2016). State and Popescu’s 2014 study is the exception. In investigating the two-way relationship existing between the tourism clients and the beneficiaries of their services, the authors launched two crowdsourcing websites accessible to anyone willing to respond to their questionnaire: one for evaluating the quality of firms’ communication and another for assessing the level of satisfaction of their clients. To validate the design of a hotel ranking system, Ghose et al. (2012) recruited MTurk participants to obtain consumer-contributed opinions on key hotel characteristics. Similarly, Galdon-Salvador et al. 
(2016) conducted an extensive survey-based study, employing 450 MTurk workers to measure their approach to constructing travel itineraries. Other researchers have examined consumer behavior through crowdsourced data collection. For instance, Dedeke (2016) gathered data on MTurk to explore if and how a website’s design quality influences customer purchase intentions. Tussyadiah (2016) distributed a questionnaire through MTurk to investigate the relationship between tourist innovativeness traits and patterns of smartphone use during the travel stage. Specifically focusing on non-Asian tourists, Lee et al. (2016) employed MTurk to identify the preferred attributes of a 1-day Seoul tour package. Some studies have used crowdsourced data as a general guidepost, with no interaction with participants or crowdsourcing platforms (e.g., Garrigos-Simon et al. 2017; García-Palomares et al. 2015; Leal et al. 2017; Shi et al. 2017;
Zhou et al. 2017). Crowdsourced data are available in the form of user-generated content (UGC) across diverse social media platforms. For instance, Shi et al. (2017) examined tourism crowding by accessing “check-in” geo-tagged data, finding this method more effective than the use of traditional on-site questionnaire surveys in regard to scale, timeliness, and cost. Zhou et al. (2017) developed a travel-planning tool by crowdsourcing various UGC on TripAdvisor and Flickr. A study by Leal et al. (2017) employed crowdsourced hotel ratings and reviews to discover trends and patterns relevant for tourists and businesses. Overall, these studies indicate that compared to traditional survey and experimental studies, crowdsourcing research has several advantages, including relatively low cost, participant diversity, flexibility, and sound data quality (Goodman and Paolacci 2017). Meanwhile, it is also important to acknowledge that crowdsourcing could be misaligned with the needs of some e-tourism researchers with respect to feasibility (it might be impossible to use crowdsourcing in the first place), desirability (researchers are uncomfortable with adopting a new approach like crowdsourcing), and utility (researchers might not recognize the benefits of crowdsourcing as a research tool) (Law et al. 2017).
Crowdsourcing E-Tourism Research: Guidelines
In sum, e-tourism crowdsourcing research is still in a nascent stage. To help researchers make full use of its scientific potential and employ it to inform their research, this section proposes several guidelines based on the aforementioned building blocks.
Determine Research Type(s)
Before initiating a crowdsourcing study, scholars need to have a full understanding of the different potential research types (i.e., idea generation, data collection, data analysis) to be able to determine which one will allow them to most effectively achieve their research goals. They should keep abreast of recent research trends and even take into consideration types currently not in use. In the context of empirical science, it is well recognized that nonexperts can help researchers investigate phenomena, obtain greater knowledge, and correct and/or integrate already acquired knowledge (Buecheler et al. 2010). For these reasons, this study suggests that e-tourism researchers can and should unleash the power of consumers, collaborating with them to generate ideas and form research questions. For example, given the rapid, dynamic development of ICT within tourism, researchers might ask nonexperts to identify pressing issues that call for academic attention. Alternatively, researchers may prepare a list of questions or ideas and then ask a crowd to choose those that appear most impactful academically, socially, and/or economically. Moreover, researchers should keep in mind that crowdsourced data analysis is especially useful for dealing with social media, particularly to understand the creative use of Internet-based languages and nuanced symbolic meanings. Social media users, who have shaped the unique online communication culture, are qualified candidates for identifying and interpreting online-based data.
Defining a Crowdsourced Population
One of the methodological issues that requires attention is sample representativeness. Although crowdsourced populations are more demographically diverse than traditional samples, researchers still need to be aware of their idiosyncratic characteristics (e.g., psychographics, attention, and involvement) when developing theories (Goodman and Paolacci 2017). These two authors suggest that researchers should not entirely substitute crowdsourced populations (e.g., MTurk workers) for traditional samples (e.g., on-campus students) but use them as complements to one another in theory development. For example, researchers might test a theory under investigation across different samples to see if results converge. A second solution derived from e-tourism literature is found in a 2012 study by Ghose et al. To ensure that participants were representative of the overall Internet population, the authors conducted a survey where MTurk workers were asked a series of questions. Demographic information included respondents’ place of origin and residence, gender, age, educational attainment, income, marital status, household size, and number of children; other relevant information concerned the total time they spent each week on MTurk, the amount of work that they completed, the level of payment they received, and their reasons for participating in MTurk. Importantly, the authors also conducted the survey multiple times to confirm results were not accidental.
Selecting a Crowdsourcing Platform
After deciding on a specific research form and defining a crowd of participants, researchers should examine different crowdsourcing platforms, reviewing each platform’s functionalities and members (or available participants and their qualifications), and then choose an affordable one that can best facilitate productive research. 
For example, MTurk can be an optimal choice for conducting linguistic annotation studies or eliciting survey responses, both of which often require recruiting a large pool of nonexperts in a short period of time and within a limited budget. Demographic information about MTurk workers identified by Casey et al. (2015) can be a useful source in this step. The authors found that the average age of these workers is approximately 33.5, that a majority live in the USA, and that there is a roughly equal split between males and females. Further, these workers (participants) are moderately more liberal than the general population, while over 80% are white. If researchers seek experts (e.g., engineers, scientists, businesspeople) to conduct challenge-based data collection or challenge-based data analysis, Topcoder and InnoCentive can provide qualified candidates. Noteworthy is that these latter two platforms require a substantial budget.
Recruiting Participants
To ensure the recruitment of qualified participants, researchers need to design a series of clear and sound steps. First, researchers must review both their institution’s Human Subjects Guidelines and the privacy policies of the platform involved (Sheehan 2018). Next, they should keep in mind that the process of recruiting participants
varies according to the crowdsourcing platform; to grasp how a particular one works, researchers should, at the outset, gain firsthand experience of being one of its workers. This will allow for experiencing the platform’s functionalities and services from a potential participant’s perspective, thereby facilitating the crafting of a clear and persuasive job description (Sheehan 2018). For example, on MTurk, researchers publish tasks and then determine the subpopulation of participants who are qualified to complete them, based on information provided by the platform (e.g., ratio of submitted tasks, country of residence). Participants are free to choose and complete any available task for which they are eligible. In contrast, on Witmart, researchers must post a recruiting message to invite bids; prospective participants who are interested then place a bid, with or without a description of their qualifications. Researchers then select those capable of completing the required tasks, based on information offered by the platform (e.g., profession, total earnings, evaluations offered by previous employers) and/or additional qualification statements provided by participants. In addition to reviewing the demographic information and quality filters provided by the platform, researchers should make sure participants are capable of completing all required tasks. To this end, one recommendation is to conduct a pretest. For example, in examining social media-based humor used for consumer engagement, Ge (2017) conducted a pretest to recruit qualified participants based on two criteria: (1) The individuals are qualified coders well-versed in the language of Weibo (a local microblogging site in China) – that is, able to use specific emoji and creative wording – and (2) they are willing to provide quick responses. 
The author then selected those who met the following qualifications: (1) correctly coded the peculiarities of Weibo language and emotions using a small pretest dataset; (2) not only provided coding results but also elaborated on some of the coding categories; and (3) followed all requirements and provided coding results within 24 h. Noteworthy is also that while selecting participants for the purpose of idea generation, researchers should consider the diversity of the sample. As Fuchs and Baggio (2017) point out, network closure and structural holes can influence creativity. Several issues deserve researchers’ attention and efforts at solution. The first one involves self-selection, including choosing oneself to be a worker within a specific crowdsourcing platform and (especially on MTurk) deciding which tasks to participate in and, eventually, which to complete (Goodman and Paolacci 2017). Higher pay rates, a researcher’s reputation, and the recency of tasks posted influence task attractiveness (Chilton et al. 2010; Higgins et al. 2010; Mason and Watts 2009). Researchers have several ways to minimize self-selection. Goodman and Paolacci (2017) suggest providing basic task descriptions without specifying details that may influence the attractiveness of the study to participants of certain dispositions or characteristics. Alternatively, researchers can conduct surveys to collect the relevant information (e.g., attitudes toward a destination) from participants and then select only those individuals who fit into the target subpopulation. Moreover, researchers should also make use of prescreening surveys that conceal the required characteristics (e.g., asking about attitudes toward a destination without disclosing
that only people with extreme attitudes will later be considered for participation in the study) (Goodman and Paolacci 2017; Sheehan 2018). Relatedly, a second issue concerns how to minimize nonselective and selective attrition (Goodman and Paolacci 2017). The authors suggest that researchers should ask participants to formally enroll in a study (or accept a task) before evaluating it, in order to ensure the study’s content does not influence the choice as to whether to participate. Lastly, increasing payments to participants can also minimize the possibility of them quitting the study midway through (Horton et al. 2011).
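The pretest-based selection of coders described earlier in this section (correct coding of a small pretest dataset, elaboration on the coding categories, and results returned within 24 h) can be sketched as a simple filter. This is a minimal illustration in Python under stated assumptions: the record fields, the 80% accuracy threshold, and the sample data are hypothetical and are not taken from Ge (2017).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PretestResult:
    worker_id: str
    codes: List[str]        # codes assigned to the pretest items
    elaboration: str        # free-text explanation of the coding categories
    hours_to_return: float  # time taken to send back the results


def accuracy(codes: List[str], reference: List[str]) -> float:
    """Share of pretest items coded the same way as the reference coding."""
    return sum(c == r for c, r in zip(codes, reference)) / len(reference)


def select_coders(results, reference, min_accuracy=0.8, max_hours=24.0):
    """Keep workers who meet all three criteria (thresholds are assumptions)."""
    selected = []
    for r in results:
        accurate = accuracy(r.codes, reference) >= min_accuracy
        elaborated = bool(r.elaboration.strip())
        on_time = r.hours_to_return <= max_hours
        if accurate and elaborated and on_time:
            selected.append(r.worker_id)
    return selected


# Hypothetical pretest: five posts coded for presence/absence of humor.
reference = ["humor", "none", "humor", "none", "humor"]
results = [
    PretestResult("w1", ["humor", "none", "humor", "none", "humor"],
                  "Puns and emoji irony were coded as humor.", 5.0),
    PretestResult("w2", ["none", "none", "humor", "humor", "humor"],
                  "Coded by gut feeling.", 3.0),
    PretestResult("w3", ["humor", "none", "humor", "none", "humor"],
                  "", 2.0),
    PretestResult("w4", ["humor", "none", "humor", "none", "none"],
                  "Unsure about the last post.", 30.0),
]

selected = select_coders(results, reference)
print(selected)
```

Keeping the three criteria as separate named checks makes it straightforward to report, for each rejected candidate, which criterion was not met.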
Managing the Crowd
When conducting a crowdsourcing study, researchers should consider managing the crowd of participants in the following ways. First, they may monitor forums so as to better understand participants’ thinking (e.g., Mturkgrind, Turkernation, mTurklist) (Goodman and Paolacci 2017). Understanding participant motivations, as identified by Ghezzi et al. (2018), can also be of help. Second, researchers should initiate and maintain effective collaborations with participants by promptly responding to their questions and concerns and by providing clear job instructions and coder training (Jacobson et al. 2018; Sheehan 2018). The latter approach is especially useful when managing participants conducting crowdsourced data analysis. Third, experienced participants may speed through the study, which can result in satisficing behavior (Smith et al. 2016). One solution for slowing down such participants is to freeze the survey page for a specific period of time (e.g., the respondent will not be able to click to the next page for 30 s) (Sheehan 2018). Sheehan’s paper suggests that, prior to this stage, researchers should anticipate completion times by pretesting the survey. Last but not least, Goodman and Paolacci (2017) point out that researchers must also manage non-naïve participants, because previous exposure to studies may affect the validity of responses. In this case, researchers should ensure that they do not recruit participants who have already participated in the researchers’ field of study – for example, by using a prebuilt filter offered by some platforms (e.g., Prolific) or manually assigning qualifications.
Ethical Issues
In order to make crowdsourcing research feasible and meaningful, it is vital that a reliable relationship exists between researchers and participants. 
To build such a relationship, researchers can employ institutional guidelines to help participants understand their rights in academic studies; for example, a participant should be fully aware of his or her prerogatives with regard to providing informed consent. At the same time, researchers should also comply with guidelines provided by the platform selected. Further, the researcher–respondent relationship arguably represents an employer–employee relationship. Unfortunately, participants or workers have few legal protections (Sheehan 2018); researchers should thus avoid exacerbating any power imbalances by paying fair wages (Goodman and Paolacci 2017). They can refer to the minimum wage in their region and should also reveal, in their manuscripts, the exact levels of remuneration participants receive (Sheehan 2018). Noteworthy is that a researcher’s reputation among a participant population is
largely determined by whether he or she pays a fair wage (Wessling et al. 2017). In short, researchers should always provide fair compensation, taking into consideration not only their own reputations but the reputation of their particular field.
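The recommendation to anchor payments to a regional minimum wage amounts to simple arithmetic once completion time has been pretested. The sketch below, in Python, assumes a hypothetical hourly wage, hypothetical pretest durations, and a hypothetical platform fee percentage; none of these figures come from the cited sources. Amounts are kept in integer cents to avoid floating-point rounding surprises.

```python
import math
from statistics import median


def fair_payment_cents(pretest_minutes, hourly_wage_cents):
    """Per-participant payment matching the hourly wage at the median
    pretest completion time, rounded up to the next full cent."""
    minutes = median(pretest_minutes)
    return math.ceil(hourly_wage_cents * minutes / 60)


def total_budget_cents(payment_cents, n_participants, fee_percent):
    """Total cost including a platform fee given as a whole percentage."""
    return math.ceil(payment_cents * n_participants * (100 + fee_percent) / 100)


# Hypothetical inputs: five pretest runs, a 12.00 hourly wage,
# 200 participants, and an assumed 20% platform fee.
pretest_minutes = [10, 12, 15, 11, 14]
payment = fair_payment_cents(pretest_minutes, hourly_wage_cents=1200)
budget = total_budget_cents(payment, n_participants=200, fee_percent=20)

print(f"payment per participant: {payment / 100:.2f}")
print(f"total budget: {budget / 100:.2f}")
```

Using the median rather than the mean keeps one unusually slow or fast pretest run from distorting the estimate; rounding up ensures the effective hourly rate never falls below the chosen wage at the median duration.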
Report
Given the unique characteristics of crowdsourcing research, it is important that researchers justify their selections of platforms, providing details in the process. This approach can enhance the credibility of crowdsourced e-tourism research, thereby helping to strengthen the field. A review of related literature shows many tourism scholars currently rely on well-recognized social media channels (e.g., TripAdvisor, Flickr) for data collection and tend to avoid providing justifications. One can assume that such details may be redundant, because researchers already have a good understanding of how these channels work. In contrast, crowdsourcing platforms used in academic research differ markedly from social media channels in terms of functionalities and services, and therefore deserve explanation. Moreover, traditional empirical research strongly depends on relatively homogeneous populations (e.g., undergraduate students), which might attenuate journals’ requirement for details of sampling in data collection (Goodman and Paolacci 2017). This is also the case in e-tourism research. Due to the dynamic diversity of crowdsourced populations, however, it is necessary that researchers report their sampling strategies in detail, for example, in regard to wages, countries of residence, approval cutoffs (e.g., >90%), and demographic information (e.g., gender, age) (Goodman and Paolacci 2017).
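The sampling details listed above (wages, countries of residence, approval cutoffs, and demographic information) can be gathered into one summary at recruitment time so that they are available for the manuscript. The following Python sketch is one possible way to do so; the participant records, field names, and the 0.90 cutoff are hypothetical and serve only as an illustration.

```python
from collections import Counter
from statistics import mean


def sampling_report(participants, approval_cutoff, payment_usd):
    """Summarize the sampling strategy for reporting in a manuscript."""
    eligible = [p for p in participants if p["approval_rate"] > approval_cutoff]
    return {
        "approval_cutoff": approval_cutoff,
        "payment_usd": payment_usd,
        "n_recruited": len(participants),
        "n_eligible": len(eligible),
        "countries": dict(Counter(p["country"] for p in eligible)),
        "gender": dict(Counter(p["gender"] for p in eligible)),
        "mean_age": round(mean(p["age"] for p in eligible), 1),
    }


# Hypothetical participant records exported from a crowdsourcing platform.
participants = [
    {"country": "US", "gender": "F", "age": 30, "approval_rate": 0.98},
    {"country": "US", "gender": "M", "age": 40, "approval_rate": 0.95},
    {"country": "IN", "gender": "F", "age": 25, "approval_rate": 0.85},
    {"country": "DE", "gender": "M", "age": 35, "approval_rate": 0.92},
]

report = sampling_report(participants, approval_cutoff=0.90, payment_usd=2.40)
for key, value in report.items():
    print(f"{key}: {value}")
```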
Conclusion
The arrival and development of crowdsourcing marketplaces offer researchers across disciplines, including the field of e-tourism, a unique opportunity to leverage social wisdom and human intelligence. However, despite the growing interest in, and acceptance of, crowdsourcing research, e-tourism scholars have yet to determine how to effectively employ it as a scientific method (Van Nguyen et al. 2018). In short, compared to its development in other domains, crowdsourcing within e-tourism research has fallen behind. It is argued herein that, in order to move the field forward, e-tourism scholars must better grasp the unique and dynamic structure and principles of this method. By reviewing and synthesizing the general literature, drawing from developments beyond the e-tourism arena, this chapter has proposed a number of building blocks for crowdsourcing research, offering key guidelines. It is expected that these proposals will be modified and expanded upon in future studies as part of the ongoing effort to deepen crowdsourced e-tourism research. Crowdsourced participants – experts and nonexperts alike – come with different experiences, motivations, skills, and interests. As platform members seeking paid tasks, they generally become more sophisticated than participants involved in traditional empirical studies (e.g., students). Noteworthy is that the researcher–participant relationship is analogous to that between employer and employee. For
these reasons, e-tourism researchers need to design sampling strategies carefully, continuously monitoring each participant crowd to ensure the integrity of their work. Another pressing issue is that e-tourism scholars need to expand the current research territory by considering other crowdsourcing platforms and employing other research types. The former approach will open up a new world of possibilities for reaching target participants and conducting specific types of research, while the latter will allow scholars to fully leverage the power of the crowd and human intelligence to solve problems, ultimately advancing e-tourism research. For instance, given the ubiquity of social media data, e-tourism researchers can and should use crowdsourced data analysis to deal with those problems that cannot be solved through automated analysis. Crowdsourced e-tourism research is a dynamic field rife with innovative technological platforms and rapidly emerging new issues to address. Unsurprisingly, its research paradigms are constantly changing, necessitating the continuous adjustment of research approaches and strategies. This does, however, make it particularly difficult to prescribe rules for enhancing efficient practices (e.g., in regard to research reliability and validity, or the aforementioned ethical concerns). But it is hoped this chapter has provided useful guidelines that can help e-tourism researchers push the field forward.
Cross-References
Artificial Intelligence and Machine Learning
Content Analysis of Online Travel Reviews
e-Tourism Research: A Review
References
Behrend TS, Sharek DJ, Meade AW, Wiebe EN (2011) The viability of crowdsourcing for survey research. Behav Res Methods 43(3):800–813
Buecheler T, Sieg JH, Füchslin RM, Pfeifer R (2010) Crowdsourcing, open innovation and collective intelligence in the scientific method: a research agenda and operational framework. In: The 12th international conference on the synthesis and simulation of living systems. [online] MIT Press, Odense, pp 679–686. Available at https://digitalcollection.zhaw.ch/bitstream/11475/2725/1/2010_Buecheler_Crowdsourcing%2C%20open%20innovation%20and%20collective%20intelligence_Alife%20Proceedings.pdf. Accessed 11 June 2019
Callison-Burch C (2009) Fast, cheap, and creative: evaluating translation quality using Amazon’s Mechanical Turk. In: Proceedings of the 2009 conference on empirical methods in natural language processing. [online] ACL and AFNLP, Singapore, pp 286–295. Available at https://www.aclweb.org/anthology/D09-1030. Accessed 19 June 2019
Casey L, Chandler J, Levine AS, Proctor A, Strolovitch D (2015) Demographic characteristics of a large sample of US workers. Unpublished manuscript
Chambers S, Nimon K (2019) Conducting survey research using MTurk. In: Information Resources Management Association (IRMA) (ed) Crowdsourcing: concepts, methodologies, tools, and applications. IGI Global, Pennsylvania, pp 410–439
Chilton LB, Horton JJ, Miller RC, Azenkot S (2010) Task search in a human computation market. In: Proceedings of the ACM SIGKDD workshop on human computation. ACM, Washington, DC, pp 1–9. Available at http://john-joseph-horton.com/papers/task_search_in_a_human_computation_market.pdf. Accessed 19 June 2019
Collins A, Medhekar A, Wong HY, Cobanoglu C (2019) Factors influencing outbound medical travel from the USA. Tour Rev 74(3):463–479
Dedeke AN (2016) Travel web-site design: information task-fit, service quality and purchase intention. Tour Manag 54:541–554
Ert E, Fleischer A, Magen N (2016) Trust and reputation in the sharing economy: the role of personal photos in Airbnb. Tour Manag 55:62–73
Feitosa J, Joseph DL, Newman DA (2015) Crowdsourcing and personality measurement equivalence: a warning about countries whose primary language is not English. Personal Individ Differ 75:47–52
Fuchs M, Baggio R (2017) Creativity and tourism networks – a contribution to a postmechanistic economic theory. In: Critical tourism studies, understand tourism – change tourism – understand ourselves – change ourselves. [online] Palma de Mallorca, Spain, pp 25–29. Available at https://www.iby.it/turismo/papers/fuchs_baggio(CTS).pdf. Accessed 10 January 2021
Galdon-Salvador JL, Garrigos-Simon FJ, Gil-Pechuan I (2016) Improving hotel industry processes through crowdsourcing techniques. In: Egger R, Gula I, Walcher D (eds) Open tourism. Springer, Berlin, pp 95–107
García-Palomares JC, Gutiérrez J, Mínguez C (2015) Identification of tourist hot spots based on social networks: a comparative analysis of European metropolises using photo-sharing services and GIS. Appl Geogr 63:408–417
Ge J (2017) Humour in customer engagement on Chinese social media – a rhetorical perspective. Doctoral dissertation summary. Eur J Tour Res 15:171–174
Ge J, Gretzel U (2018) Impact of humour on firm-initiated social media conversations. Inf Technol Tour 18(1–4):61–83
Ghezzi A, Gabelloni D, Martini A, Natalicchio A (2018) Crowdsourcing: a review and suggestions for future research. Int J Manag Rev 20(2):343–363
Ghose A, Ipeirotis PG, Li B (2012) Designing ranking systems for hotels on travel search engines by mining user-generated and crowdsourced content. Mark Sci 31(3):493–520
Goh C, Mok HM, Law R (2009) Artificial intelligence applications in tourism. In: Encyclopedia of information science and technology, 2nd edn. IGI Global, Pennsylvania, pp 241–247
Goodman JK, Paolacci G (2017) Crowdsourcing consumer research. J Consum Res 44(1):196–210
Gretzel U, Fuchs M, Baggio R, Höpken W, Law R, Neidhardt J, Pesonen J, Zanker M, Xiang Z (2020) E-tourism beyond COVID-19: a call for transformative research. J Inf Technol Tour 22:187–203
Higgins C, McGrath E, Moretto L (2010) MTurk crowdsourcing: a viable method for rapid discovery of Arabic nicknames? In: Proceedings of the NAACL HLT 2010 workshop on creating speech and language data with Amazon’s mechanical turk. Association for Computational Linguistics, Los Angeles, pp 89–92. Available at https://www.aclweb.org/anthology/W10-0714. Accessed 10 July 2019
Hill J, Ford WR, Farreras IG (2015) Real conversations with artificial intelligence: a comparison between human–human online conversations and human–chatbot conversations. Comput Hum Behav 49:245–250
Hilton LG, Azzam T (2019) Crowdsourcing qualitative thematic analysis. Am J Eval. Available at https://doi.org/10.1177/1098214019836674. Accessed 10 Aug 2019
Horton JJ, Rand DG, Zeckhauser RJ (2011) The online laboratory: conducting experiments in a real labor market. Exp Econ 14(3):399–425
Howe J (2006) The rise of crowdsourcing. Wired Mag 14(6):1–4
Jacobson MR, Whyte CE, Azzam T (2018) Using crowdsourcing to code open-ended responses: a mixed methods approach. Am J Eval 39(3):413–429
Keating M, Rhodes B, Richards A (2013) Crowdsourcing: a flexible method for innovation, data collection, and analysis in social science research. In: Hill CA, Dean E, Murphy J (eds) Social media, sociality, and survey research. Wiley, New Jersey, pp 179–201
Law E, Gajos KZ, Wiggins A, Gray ML, Williams A (2017) Crowdsourcing as a tool for research: implications of uncertainty. In: Proceedings of the 2017 ACM conference on computer supported cooperative work and social computing, pp 1544–1561
Leal F, Malheiro B, Burguillo JC (2017) Prediction and analysis of hotel ratings from crowdsourced data. In: World conference on information systems and technologies. Springer, Cham, pp 493–502
Lee WS, Ko DW, Moon J, Park J (2016) Non-Asian Tourists’ preferred attributes: a choice experiment. Asia Pac J Tour Res 21(12):1300–1309
Lenart-Gansiniec R (2018) Methodological challenges of research on crowdsourcing. J Entrep Manag Innov 14(4):107–126
Leung XY, Sun J, Bai B (2017) Bibliometrics of social media research: a co-citation and co-word analysis. Int J Hosp Manag 66:35–45
Li KW, Law R (2007) A novel English/Chinese information retrieval approach in hotel website searching. Tour Manag 28(3):777–787
Mason W, Watts DJ (2009) Financial incentives and the performance of crowds. In: Proceedings of the ACM SIGKDD workshop on human computation. ACM, New York, pp 77–85. Available at http://crowdsourcing-class.org/readings/downloads/econ/financial-incentives-andthe-performance-of-crowds.pdf. Accessed 10 July 2019
Russell SJ, Norvig P (2016) Artificial intelligence: a modern approach. Pearson Education Limited, Malaysia
Sabou M, Bontcheva K, Scharl A (2012) Crowdsourcing research opportunities: lessons from natural language processing. In: Proceedings of the 12th international conference on knowledge management and knowledge technologies. ACM, New York. Available at https://eprints.weblyzard.com/51/1/SabouEtAl.pdf. Accessed 11 Aug 2019
Sheehan KB (2018) Crowdsourcing research: data collection with Amazon’s Mechanical Turk. Commun Monogr 85(1):140–156
Shi B, Zhao J, Chen PJ (2017) Exploring urban tourism crowding in Shanghai via crowdsourcing geospatial data. Curr Issues Tour 20(11):1186–1209
Simperl E (2015) How to use crowdsourcing effectively: guidelines and examples. Liber Q 25(1):18–39
Smith SM, Roster CA, Golden LL, Albaum GS (2016) A multi-group analysis of online survey respondent data quality: comparing a regular USA consumer panel to MTurk samples. J Bus Res 69(8):3139–3148
Snow R, O’Connor B, Jurafsky D, Ng AY (2008) Cheap and fast – but is it good? Evaluating nonexpert annotations for natural language tasks. In: Proceedings of the conference on empirical methods in natural language processing. Association for Computational Linguistics, Honolulu, pp 254–263. Available at https://www.aclweb.org/anthology/D08-1027. Accessed 11 Aug 2019
State C, Popescu D (2014) A new option for customer relationship management of tourism units: crowdsourcing. In: Proceedings of the international management conference “managing challenges for sustainable development”, Bucharest. Available at http://conference.management.ase.ro/archives/2014/pdf/1.pdf. Accessed 11 Aug 2019
Sullivan BL, Aycrigg JL, Barry JH, Bonney RE, Bruns N, Cooper CB, Damoulas T, Dhondt AA, Dietterich T, Farnsworth A, Fink D (2014) The eBird enterprise: an integrated approach to development and application of citizen science. Biol Conserv 169:31–40
Tasci AD (2017) Consumer demand for sustainability benchmarks in tourism and hospitality. Tour Rev 72(4):375–391
Tussyadiah IP (2016) The influence of innovativeness on on-site smartphone use among American travelers: implications for context-based push marketing. J Travel Tour Mark 33(6):806–823
Tourism Design: Articulating Design Beyond Science
27
Mads Bødker
Contents
Introduction
Design Science and Design Thinking
Design (Science) and the Nature of Problems
The Legacy of DS in Design Thinking and Service Design
Critiques and Responses to DS and Design Thinking
Design Is an Engagement with Unknowable Futures
Alternatives to Design Science
Implications and Conclusion
Concluding Remarks
Cross-References
References
Abstract Design has received increasing attention in the field of tourism research and practice. A number of researchers have pointed out how design as an approach to innovation might benefit the development of digitally enhanced tourism products and services. The central and prevailing view of design in tourism relies on an understanding of design as a rational problem-solving activity, fundamentally devoid of creative judgment and largely decoupled from the situated and embodied context of organizations, creative design practice, and the enactment and use of products in the lifeworld of people. I thus argue that design in tourism research has tended to be aligned with the modernist legacy of Herbert Simon’s account of design as a largely cognitivist and rationalizing practice and tends to maintain a view on design grounded in scientific and managerial discourses.
M. Bødker, Department of Digitalization, Copenhagen Business School, Copenhagen, Denmark
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_32
In this chapter, I suggest how such approaches to design in tourism research potentially limit the development of innovative practices. The conceptual underpinnings of design in tourism make it challenging to reflect on alternative accounts or conceptualizations of design based within the humanities and anthropology and could impede the construction of a more pluralist engagement with design. The chapter reflects on accounts and activities in tourism research that involve designerly work, particularly focusing on the design of digital experience products and services in tourism. It problematizes the modernist decision science legacy of design thinking and design science approaches and suggests how tourism as a research field might usefully extend its vocabulary and conceptual grasp of design. Design is more than a science, and this chapter ultimately suggests some directions that go beyond the currently dominant rationalizing and science-based positions. Based on emerging understandings of design as speculative and as a material and highly situated practice, the chapter outlines alternative conceptualizations of design. It suggests positions and approaches that can facilitate a broader view and a more reflective practice of design in tourism. The chapter proceeds as a “think piece,” reflecting on two distinct legacies of and approaches to design. One is broadly rationalistic and solution-oriented, borne out of a desire to practice design as a science, and the other is a speculative approach that treats design as a critical, situated, and reflective practice of “studying” the future and the intricate consequences of innovation.
Keywords Design · Design science · Speculative design · Design thinking · Futures
Introduction

A number of tourism scholars have taken up an interest in design. Evolving from the planning and innovation literatures in tourism (e.g., Dredge 1999; Hjalager 2010; Eide and Mossberg 2013; Sundbo 2008; Hall and Williams 2008) and picking up speed with the broad introduction of user-facing digital infrastructures, devices, and experiences in the tourism ecosystems, design-oriented studies have emerged as a noticeable trend in research on e-tourism. Clearly, there are many ways in which design is involved in both product and process innovation, and a number of studies have been conducted to explore the links between design practices and the innovative capacity of tourism businesses (e.g., Hjalager 2010; Sundbo 2007; Liburd and Carlsen 2013). While “design” as an orientation in tourism research is picking up speed, there is little scrutiny of the foundational worldviews that give shape to these engagements. This chapter proposes that “design” is not a neutral practice of simply imagining and building things in a particular manner. Invoking “design” and practicing designerly work comes with commitments to particular worldviews and philosophies, emphasizing certain aspects while downplaying others. The
arguments presented in this chapter transcend any particular genre or practice of IT design, product design, graphical design, or experience design. I will, however, attempt to preserve a focus on issues related to digital design in (e-)tourism, to suggest how the arguments and critiques presented might be relevant to pursue.

A recent volume on design science in tourism edited by Fesenmaier and Xiang (2017) suggests how designerly activities and frames of inquiry are promising and potentially foundational undertakings for destination management and innovation in tourism. A guiding logic throughout the volume is how design (and “design thinking,” a somewhat unclear and controversial term that I will return to later) might play an important role in the assessment of (latent) customer value, providing a method for staging experiential qualities of a destination, and extending the value creation process in a productive tourism economy. With few exceptions, the focus in the introductory chapter is on design as a specific form of problem-solving, where indicative problems include issues such as destination brand equity, customer value, perceived attractiveness and distinctiveness, as well as the design of systems and procedures for the ongoing management of tourism products and destinations. What the authors term design science in tourism (DST) “[...] can be used to inform tourism research in such a way that it integrates design thinking and the science of design, the nature of the visitor experience, and the artefacts that can be developed to manage these experiences” (Fesenmaier and Xiang 2017: 6). The increasing availability of digital devices and infrastructures is a key concern in Fesenmaier and Xiang’s edited volume.
A number of questions are identified about the benefits of exploiting new technologies, in particular information and communications technologies (ICTs) such as the Internet of Things (IoT) or artificial intelligence (AI), in the field of tourism and “experience design.” How can design play a role in shaping such new technologies and the ways in which they allow for new forms of planning and new kinds of experience? Research on experience and how to conceptualize it (drawing on, e.g., the foundational work of Pine and Gilmore (1999)) as well as service design and new service innovation are some of the guiding theoretical trajectories in the volume. The introductory chapter to the volume notably trails the so-called design science tradition, based largely on Herbert Simon’s positions from The Sciences of the Artificial (orig. 1968/1996) and further developed and extended in the management and information systems (IS) literature by, e.g., Hevner et al. (2004), and suggests this approach as a particularly appropriate frame for a scholarly engagement with design in tourism. As Fesenmaier and Xiang put it, “[d]ifferent from conventional methods for tourism product development, design science in tourism is underpinned by a strong theoretical, scientific basis that supports the integration of a variety of tools” (2017: 6). The introduction also clearly designates design as a practice closely linked to the capitalization of consumer desires by suggesting that “it is widely recognized that it (i.e., design, designing or design thinking) is a hugely important facet of the value creation process, and as such, provides competitive advantage within an increasingly competitive and crowded market place” (2017: 8). Design for tourism, in this way, works along a behavioral scientific vector, productively geared toward the seemingly rational discovery of problems, “unmet needs,” marketable
solutions, and economically profitable experiences. These aims of a design science resonate with key tenets in the service design literature that is increasingly adopted in tourism design research and practice, typically advocating various customer and process mapping, customer profiling, and replicable evaluative procedures (e.g., Stickdorn and Zehrer 2009; Zehrer 2009). The service design concept has its origins in operations and management literatures (e.g., Shostack 1987) and tends toward defining design in the service industries as a relatively linear process of requirements gathering and other “customer”-centric work, implementation, measurement, and maintenance/management (Kingman-Brundage and Shostack 1991). The general aim of service design interventions is to create economically viable and competitive advantages in a marketplace of touristic offerings. Tourism, seen through the lens of “conventional” design approaches aligned with DS, is conceptualized as an industry and (service) design interventions aim to simply increase market acceptance of an offering (Lambert and Watson 1984) or to support more complex interventions such as producing novel and marketable tourism experiences through new technologies (Neuhofer et al. 2014), improving brand image (Morgan et al. 2004; Kotler and Gertner 2002), or streamlining processes to facilitate more efficient service delivery (Zehrer 2009). As suggested, design science and the more applied uses of service design are becoming popular vectors for research on tourism innovation, and I will consider some key legacies in the particular (modernist) tradition of design research referred to as design science (drawing primarily on the legacy of Herbert Simon and his work on The Sciences of the Artificial) and work toward how design in tourism can be narrated and practiced differently. Importantly, my critique does not indicate a dismissal of design science in tourism (DST) or DS in general. 
Indeed, I find the (academic) attention that design has gained to be inspiring and encouraging. I believe that design will continue to play an important role in tourism research and practice, and I concur that there are domains where DST is helpful, even if I find that it tends to overemphasize its scientific orientation. However, the argument in the following is that DST builds on a set of specific assumptions that might limit the role that design can play in tourism. These legacies are particularly important to interrogate as we face radically unknowable, complex, and contingent futures where consumer capitalism and current economic growth ideologies are, in increasingly obvious ways, complicit in the Anthropocene and its potentially devastating conclusions. As renowned science fiction author, futurist, and commentator Kim Stanley Robinson puts it: “The future is radically unknowable: it could hold anything from an age of peaceful prosperity to a horrific mass-extinction event. The sheer breadth of possibility is disorienting and even stunning” (Robinson 2018, my emphasis). In a world where technology, progress, politics, and an increasingly fragile environment are matters of concern that neither instinctively beget carefree optimism and post-war expectations of constant and linear improvement of our way of life nor promise a post-cold war “end of history,” there is a need to consider new worldviews and methods that extend our capacity for charting imaginative and “designerly” vectors in tourism research. As Dredge (1999) has pointed out, questions of planning and design are involved in profound and complex social
and experiential problems. Providing for the tourism experience involves questions such as “How can a destination’s spatial structure be manipulated to enhance its “sense of place”, to promote a sense of security, and to heighten environmental “legibility” for tourists who find themselves in an unfamiliar environment? How can a destination maximize its integration with the wider regional, provincial, or national tourism product? Can the spatial structure be manipulated in order to facilitate the protection of natural, social, and built attributes which make a destination appealing? What is the most appropriate and cost-effective spatial sequencing for tourism?” (Dredge 1999: 775). Indeed, the recent COVID-19 crisis, as well as the growing realization that rampant consumer capitalism and hyper-mobile societies have ominous environmental consequences, has prompted calls for more reflexive and pluralistic engagements with tourism research (Gretzel et al. 2020). Following from such calls, the argument in the current paper is that we need to reflectively append solution-hungry and problem-solving mindsets with more speculative inquiries that include opportunities for critical and conceptual design and affective/embodied and reflective research to critically inform the conception and design of tourism futures. Particularly, when design is invoked in a field such as tourism, often associated with and catering to managerial or administrative logics of optimization or commercialization of destinations, products, or experiences, there is a risk that productive rather than reflective perspectives become dominant. 
In this chapter, the overall aim is to discuss why and how the broader design science agenda in tourism should be supplemented with more marginal conceptions and philosophies of design; these are foundations that employ a developing set of practices such as speculation in the form of conceptual design approaches and sensory, affective, and “felt” encounters as means of understanding and performing design. Encouragingly, the volume edited by Fesenmaier and Xiang features chapters that go well beyond strictly scientific registers. Notably, the chapter titled “An Uncanny Night in a Nature Bubble: Designing Embodied Sleeping Experiences” (Salmela et al. 2017) discusses the use of literary- and sensory-oriented autoethnographies as an approach to understand embodied performances and responses to novel (designed) visitor experiences. Their chapter reflects on the prevailing modernist and anthropocentric agendas in tourism and the ways in which design can encourage new encounters with nature and the materialities of leisure and relaxation. By doing so, it creates a number of openings for a critical design engagement with human-nature relations in tourism. It furthermore considers the messy, judgmental, and thoroughly entangled work of design rather than seeing design as the application of “solutions” to “problems.” In this chapter, I wish to further encourage such directions and provide some initial support for reflective engagements with design in tourism. Hence, the purpose of the chapter is to support designerly work in tourism that explores more reflective, embodied, or critical registers of design to extend the scope of design-based research and practice.

This chapter will proceed in the following way: First, I will discuss the legacy of design science and design thinking, outlining some of the inquiries and critiques these have inspired.
Next, I discuss an alternative framing or paradigm of design as “speculative” to further a material and critical-imaginative account of design, and
I give two examples from studio-based design work. Lastly, I outline some of the further opportunities for design-oriented research in tourism that I see emerging from an alternative account of design.
Design Science and Design Thinking

Although there are subtly different positions in research carried out in the “design science” tradition, the fundamental underlying paradigm is rationalistic, based on particular assumptions about the nature of problems and problem-solving. Broadly speaking, the “problem understanding paradigm” (Hevner et al. 2004), that is, the configuration of theories and methods used to understand the nature of a problem within DS, is based on natural or behavioral science research (Hevner et al. 2004; March and Smith 1995; Simon 1996). Hevner et al. delimit the object of DS as follows: “Our definition of IT artifacts is both broader and narrower [than] those articulated above. It is broader in the sense that we include not only instantiations in our definition of the IT artifact but also the constructs, models, and methods applied in the development and use of information systems. However, it is narrower in the sense that we do not include people or elements of organizations in our definition nor do we explicitly include the process by which such artifacts evolve over time” (Hevner et al. 2004: 82–83). In other words, how technology and sociotechnical assemblages change, co-evolve, or learn is, according to Hevner et al., distinctly not part of DS. It represents a “scientistic” approach that unconsciously dispels the messy unpredictability of a “lived” reality of use of a designed artifact or service and any post-adoption contingencies or appropriations into a social reality. This affords a convenient “pruning” of a problem space, as the complex, emergent properties and sociomaterial entanglements of, e.g., new technologies or services are ignored. Key to understanding the legacy of design science is to understand how early proponents of the approach articulated what constitutes a problem for design and how such problems are solved.
Design (Science) and the Nature of Problems

Herbert Simon’s The Sciences of the Artificial (1968/1996, henceforth SOTA) consistently ranks as one of the most comprehensive and influential articulations of a “science of design.” The tropes and vocabularies established around this legacy are variously implicated in many significant current accounts of design work. This implies, for instance, that “the repression of judgment, intuition, experience, and social interaction in Simon’s “logic of design” has had, and continues to have, profound implications for design research and practice” (Huppatz 2015; for an extended discussion, see Dorst 1997). Simon’s views on design matured in an environment of a cognitive and positivist science tradition, fueled among other things by his work on computation and artificial intelligence, a good deal of which was carried out under the patronage of the military think tank RAND Corporation.
A central concern in Simon’s work was to articulate a logic of problem-solving and to formalize problem-solving logics to a degree where software programs could be composed that emulated human problem-solving, ending the human tendency to introduce judgment and bias in reasoning processes. Newell et al.’s (1959) early work on logic machines, artificial intelligence (AI), and computer programs such as the General Problem Solver (G.P.S.) is suggestive of Simon’s later synthesis of his attitudes toward human problem-solving presented in SOTA. Here, Simon programmatically suggests that “solving a problem simply means representing it so as to make the solution transparent” (Simon 1996: 132). Simon’s program sought to transform design education and practice, arguing that “[in] the past, much, if not most, of what we knew about design and about the artificial sciences was intellectually soft, intuitive, informal, and cookbooky” (Simon 1996: 135, quoted in Huppatz 2015: 33). Simon’s own synthesis of his work in AI presents the firm belief in the superiority of computers as problem-solvers, since they are able to break down problems into unambiguous informational bits, solving problems without the unscientific need to intuit or judge solutions. While he acknowledges that there are problems that are less well-structured and hence present a challenge to software-based formal logic algorithms such as the General Problem Solver, the title of his 1973 article demonstrates his position quite explicitly: The Structure of Ill-Structured Problems (1973). Like an algorithm trawling methodically through a program, the bits and pieces that make up an ill-structured problem must (and can) be solved sequentially, and hence, within an ill-structured problem, there is still structure, only hidden from (ir)rational man and most appropriately parsed by the formal logic of a software program.
Design theorist Kees Dorst has discussed and critiqued the notion of “ill-structured problems” in some detail (Dorst 2006). Dorst explores the notion of structure in a design problem, and how methodologically rigorous and apparently scientific approaches seem to ignore combinatorial explosion when designers engage with a process of problem-solving. “In a multistep problem-solving process,” Dorst writes, “each problem solver will get the chance to pile interpretation upon interpretation, and thus end up taking the problem-solving processes in completely different directions. Therefore, the use of memory and subjective interpretation becomes a major influence on the problem-solving behavior of designers. If we take this seriously, then it undermines the very idea of having one, knowable problem at the start of the problem-solving process” (Dorst 2006: 8). Since Simon and Newell’s General Problem Solver was hardly more than a brute force expert system, a problem space had to be formalized in strict symbolic representation; initial states, goal states, and operators and constraints had to be unambiguously represented. Arguably, we already sense the futility of the G.P.S. and its philosophical foundations. In the face of human and social problem spaces such as urban planning or “experience” that include nontrivial entities such as relevance, ethics, or aesthetics, even advanced expert systems must fail. As Huppatz suggests, “[f]reed of situated bodies, Simon’s “science of design” failed to engage with designing as a fundamentally social, political, cultural, and embodied activity” (Huppatz 2015: 37).
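To make concrete what such a strict symbolic formalization demands, the following is a minimal illustrative sketch in Python. It is not a reconstruction of Newell and Simon's program (which relied on means-ends analysis); it is a plain breadth-first search over a toy problem space, and the trip-planning facts and operator names are hypothetical and chosen only for illustration. Each operator must be spelled out as explicit preconditions, added facts, and removed facts.

```python
from collections import deque

# Hypothetical operators for a toy trip-planning problem (an assumption for
# illustration): name -> (preconditions, facts added, facts removed).
OPERATORS = {
    "book_flight": (frozenset({"has_budget"}), frozenset({"has_ticket"}), frozenset()),
    "book_hotel": (frozenset({"has_budget"}), frozenset({"has_room"}), frozenset()),
    "travel": (
        frozenset({"has_ticket", "has_room"}),
        frozenset({"at_destination"}),
        frozenset({"has_ticket"}),
    ),
}


def solve(initial, goal, operators=OPERATORS):
    """Breadth-first search from an initial state to any state containing the goal.

    States are sets of symbolic facts. Returns a list of operator names,
    or None if no sequence of operators reaches the goal.
    """
    start = frozenset(initial)
    goal = frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if goal <= state:  # every goal fact holds in this state
            return plan
        for name, (pre, add, remove) in operators.items():
            if pre <= state:  # operator is applicable
                nxt = (state - remove) | add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [name]))
    return None


plan = solve({"has_budget"}, {"at_destination"})
print(plan)
print(solve({"has_budget"}, {"sense_of_place"}))
```

In this sketch, a goal such as the hypothetical "sense_of_place", which no operator produces, simply has no solution. This loosely mirrors the point made above: whatever cannot be expressed as unambiguous states and operators falls outside the formalized problem space.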
Contemporary appellations of design science also subscribe explicitly to the same problem-oriented philosophy. Baskerville, for one, argues that for DS “[i]t is a fundamental premise that a design is problem-driven, and leads to an artifact that solves the problem when the artifact is introduced into nature” (Baskerville 2008: 441). However, as Dorst and Cross argue, “design is not a matter of first fixing the problem (through objective analysis or the imposition of a frame) and then searching for a satisfactory solution concept. Creative design seems more to be a matter of developing and refining together both the formulation of a problem and ideas for a solution, with constant iteration of analysis, synthesis, and evaluation processes between the two notional design “spaces” – problem space and solution space” (Dorst and Cross 2001: 434). In this way, problem and solution (or “solutioning”) are entangled practices.
The Legacy of DS in Design Thinking and Service Design

This “scientific” legacy underlies design-oriented work carried out in the tourism academy. The legacy dominates most accounts of design used in tourism research and design activities, with little attention to or reflection on the assumptions and limitations imported by it. Early design science work by Simon indicates a historically contextualized need to confer onto planning and design an image of predictability, orderliness, and efficiency rather than messiness, guesswork, and artistry (Huppatz 2015). Similarly, exploring the concept of design in tourism is commonly oriented toward informing or developing competitive products, services, or experiences, requiring quick, pragmatic methods that promise replicable results or useful frameworks based on logical and structured theoretical work. Subscribing to a formalist problem-solving approach, Kim and Fesenmaier suggest in a paper on the design of tourist places how they wish to first “[deconstruct] a travel experience into a series of events and then examining what constitutes each event, how travelers perceive the elements or components within the event and the outcomes of these components enable us to understand how tourism experiences are created and translated into meaning” (Kim and Fesenmaier 2015). The legacy of Simon’s design science is further evident in concepts and practices such as “design thinking” and service design, terms likely familiar to many tourism researchers with an interest in the innovation of tourism products and services. Service design has been praised as a useful and valuable design approach for a service-dominant tourism domain. Zehrer suggests that the blueprinting method in service design “[...] can be used to identify the “fail points” in the service-delivery process that precipitate such critical incidents in the customer experience.
As such, blueprinting can be utilised as the basis for a service design that enables the service provider to shape the customer’s emotional experience, and thus attain a competitive advantage” (Zehrer 2009: 337). A blueprint notation constitutes a means of formalizing a problem space, ideally representing complex situations such as service delivery and “customers’ emotional experience” as discrete steps in a sequence of events that can be manipulated and the results predicted.
“Design thinking” more broadly has been promoted as a remedy to the increasing managerial and experiential innovation needs of the tourism sector (Tussyadiah 2017). Tim Brown, CEO of IDEO, suggests that: “Design thinking is a human-centered approach to innovation that draws from the designer’s toolkit to integrate the needs of people, the possibilities of technology, and the requirements for business success” (https://designthinking.ideo.com). Although the history and legacy of design thinking is convoluted and the notion is largely found to be lacking any coherent theoretical base (Lindgaard and Wesselius 2017), it has become a relatively well-known term, typically describing a user-centered and iterative design process. In tourism research, design has been employed as a practical means for understanding customer needs and creating destination visitor experiences (experience design, e.g., Tussyadiah 2014), customer journeys (service design, e.g., Stickdorn and Zehrer 2009), touchpoints (often human-computer interaction, e.g., Cavadi et al. 2018), or event design (Nelson 2009). Tussyadiah points out how services research and theories of the experience economy have led tourism research to form an interest in design as a means of delivering on customers’ heightened anticipation of “experiences” when travelling. She suggests that “it is imperative for tourism destinations to identify design problems, the target behavior that will be effective as a means to achieve the overall goal of creating meaningful tourism experiences for tourists. In order to do this, designing starts with selecting the right target outcomes from both sides: the tourists (at the individual and social levels) and the destination” (Tussyadiah 2017: 184). This assumes that a “target outcome” can indeed be (fully, rightly) identified and articulated. Who, then, gets to decide when a design is “done”?
Further, who gets to select and “prune” questions and problems so as to make them eligible for designerly involvement? Design thinking provides conveniently simple frameworks that are proposed as useful “orientational” tools for research that includes problem-solving or innovation of new products and market competitiveness. These methods, which evolved from design consultancy work in renowned research-oriented businesses and schools such as IDEO, Hasso Plattner Institute, and Stanford d.school and from the work of notable authors such as Kelley and Kelley (2015) and Brown (2008), are now well known beyond narrowly defined design school curricula. Extending into the domains of management and business, design thinking has evolved into a creed that retains a firm admiration of a privileged and creative individual and an unproblematized access to knowledge and insights about “users” or “customers.” Furthermore, on this view, creativity is seen as a relatively benign or neutral attribute or skill that does not hold any specific ethical, moral, or ideological charges or worldviews (cf. Kampylis and Valtanen). Design science and attendant design systems such as design thinking are harnessed to provide companies with a competitive advantage and seem to follow the claims of widely cited and popularized assertions by, e.g., Porter (1985) that the function of a company’s innovative capacity is to achieve market dominance. Such legacies have invoked a convenient discourse of design as a scientific, rational, and replicable practice.
Critiques and Responses to DS and Design Thinking

The scientific legacy of Simon is thoroughly implicated in design philosophies, practices, and studies, and it continues to have a profound resonance in the academic conversation about and articulation of design. At the same time, as a distinct paradigm, it has also received substantial critique from a variety of scholars. In the following I will highlight five critiques, and subsequently I will outline a speculative design project that begins to respond to them.

1. Solutionism. Rittel and Webber (1973) notably took issue with the solutionism and assumed linearity of problem-solving and planning inherent in the Simonian approach. Their critique of the planning literature is echoed in the work of, e.g., Dobbins (2009) and Morozov (2013) who state strongly that technological innovation has the tendency to recast:

all complex social situations either as neatly defined problems with definite, computable solutions or as transparent and self-evident processes that can be easily optimized—if only the right algorithms are in place!—this quest is likely to have unexpected consequences that could eventually cause more damage than the problems they seek to address. I call the ideology that legitimizes and sanctions such aspirations “solutionism.” I borrow this unabashedly pejorative term from the world of architecture and urban planning, where it has come to refer to an unhealthy preoccupation with sexy, monumental, and narrow-minded solutions—the kind of stuff that wows audiences at TED Conferences. (Morozov 2013: 5–6)
Blythe et al. (2016) have argued that a key problem here is the abundance of and reliance on particular design representations, particularly “Silicon Valley”-styled demos and glossy scenarios that showcase new technological visions and agendas. These strongly support a discourse of solutionism and technical fixes to complex and interlocking challenges. Design representations are not neutral vehicles of communication, but actively structure imagination and designerly work to comply with solutionist approaches.
2. Universalism. Lucy Suchman, an anthropologist who has researched the field of technology production for decades, argues that design is an embodied and situated practice that cannot be understood as disassociated from its context (Suchman 2002). Design is a local accomplishment and does not end at the point of “adoption” – it is an ongoing re-configuration of actors, cultures, technologies, economies, and politics. Unlike engineering, computer science, or the natural sciences, design cannot rely on universal rules that lead to predictably similar outcomes. Local material, political, economic, managerial, or cultural settings circumscribe how design happens and ultimately what the design process is about. In this sense, the “science” of design science seems to be overstating the “scientific-ness” of its approach (i.e., natural science as a model, “physics envy”), since the natural sciences upon which Simon’s work is based are not well suited for engagement with the broad social contexts and emergent materialities of designed objects and infrastructures. A universalist model seems to require a built artifact and its “impact” as something that can be measured in the unambiguous results of a formal
27 Tourism Design: Articulating Design Beyond Science
evaluation. Proponents of continental philosophy of technology contributed to this critique by insisting that artifacts or things in their relation to humans are “multi-stable” (Ihde 1990) and do not have discrete and definable properties beyond their actuation within practice. Design is fundamentally an engagement with complex emergent and situated properties of artifact-human imbrications or entanglements. Participatory design (PD; or cooperative design) is a critical and practice-oriented response to the universalistic notion of design. This does not include those one-dimensional versions adopted in commercial projects that claim to be user-centered, where participation amounts to early stage user involvement or usability testing. As Sundblad has pointed out, these pay lip service to the radical approaches that emerged in the mid-1970s from Kristen Nygaard’s pioneering work under the democratic, cooperative, and participatory design labels, later followed by proponents such as Pelle Ehn, Susanne Bødker, Yngve Sundblad, and others (Sundblad 2010).
3. Conservatism. If the agenda of design is primarily to predict consumer preference and offer a product consistent with these preferences, design often plays into the mechanics of a competitive market. Design historian Daniel Huppatz observes that “[the] promise of greater control has proven popular in recent characterizations of design thinking closely aligned to management. The logic of optimization promises greater predictability and profit while rigorously stripping judgment, intuition, and experience from systems and service design” (Huppatz 2015: 38). Here, the intuitive and judgmental stand in stark contrast to the scientific and unambiguous project of design science. In the DS and DT processual schemes, there is very little scope for self-reflection and positionality, leading the designer to base design decisions on her own experiential worlds.
As Iskander argues: “When the designer acts as a gatekeeper for the meanings that are included in the design process, the potential for connections becomes limited not only to what the designer views as significant, but also to the relationships she can imagine” (Iskander 2018). The designer becomes the experiential nexus, but does not recognize her own complicity or consider the positionalities enacted in encounters with the other. So while design thinking suggests a number of methodical prescriptions, it fails to attend to the kinds of positions and perspectives that are enacted in the application of various methods: “The design thinking method does not stipulate rigorous attention to positionality, however. This omission signals that the designer, as creative visionary, is somehow suspended above the fray of bias, blind spots, and political pressure” (ibid.). As a simple example, one could suggest that the terms “customers” and “users” to describe the people interacting with a solution already entail fundamental commitments to particular ideological structures that emphasize capital gains (the customer) or decontextualize a person from a complex sociotechnical milieu (the user). Ignoring this is also ignoring how DS, DT, and service design might be complicit in enabling and reproducing existing structures rather than projecting ideas or concepts through a radical reimagining of sociomaterial worlds.
4. Colonialism. In recent years, a discussion of design as complicit in a “Western” colonial or imperialistic agenda has emerged. Design theorist Lilly Irani has suggested how design thinking evolved and was popularized in order to protect and extend US commercial interests. Echoing the discussion of design science above, Irani proposes that design thinking in the IDEO/Stanford appellation is preoccupied with the search for replicable solution algorithms and transferrable guidelines for research and design: “Design thinking promises to make innovation continuous and replicable. It is encoded in workbooks and guides published by IDEO, as well as universities” (2018: 3–4, my emphasis). Design thinking, she argues, prioritizes “an approach to design as market strategy over the craft skills of model making, typography, and mechanism design” (2018: 3). Irani suggests that design thinking indicates “a broader public understanding of the proper place of the US in a global economic order shaped by intellectual property law, as well as the opening up of markets in Eastern Europe and Asia as sources of labor and as potential consumer bases. American brands and patents [...] could become central to economic success while the science, technology, engineering and mathematics (STEM) jobs celebrated during the space race now seemed more outsourceable” (ibid, 11–12). In critiquing colonialist imperatives in design thinking, Irani and others (e.g., Chumley 2016; Lindtner 2017; Tunstall 2013) open up a discussion of how political economies and identities are involved in the articulation of design as a management “philosophy” and how it invokes specific orders of (gendered/racist/orientalized) divisions of labor.
Who gets to “think,” be creative, and propose change, and who gets to support the labor to produce artifacts (e.g., Chinese sweatshops), deliver services (e.g., Indian call centers), maintain infrastructures (e.g., an experiential service backstage populated with people of color), etc.? If you have one, take a look at the box from your latest Apple gadget: “Designed in California. Assembled in China.” So, when Tussyadiah claims, “Designing and design research in many different disciplines draw from theories in psychology, anthropology, social and behavioral sciences, cognitive and decision sciences, marketing and management, etc.” (Tussyadiah 2014: 559), this, of course, is not untrue. However, one wonders at the conspicuous lack of the arts, craft-based competences, etc., arguably because such embodied skills are unruly, underdisciplined, and perceived to be relatively inscrutable practices that, at best, have a tenuous relationship to the management disciplines. Reproducing the account of design to primarily favor “thinking” enables a widespread discourse where manual, routine, or craft-based practices are consistently downplayed or ignored. The noble task of the (white, male, cosmopolitan, well-educated, empathetic, etc.) Design Philosopher King beats the (soon to be automated and redundant) worker at her engineering bench. A reductive scientific model beats the intuitive, judgmental design situation.
5. Reflectivity. Unlike more anthropological work or ethnography, the behavioral social sciences (modelled on the natural sciences) do not accommodate reflective work that complicates the epistemological positions of the researcher or the
relations performed in the field and interpreted/reported after observations have been completed. As Kimbell (2011) and Lindgaard and Wesselius (2017) point out, the oft cited “empathy” of design thinking and the ability of a designer to understand what users need or want is typically left unproblematized in the design thinking literature, neglecting fundamental tenets of reflectivity on anthropological research and the social sciences. Through adherence to a particular method, the designer/researcher is ostensibly conferred with special powers of attentiveness or sensibilities that enable her or him to empathize with otherwise latent and indiscernible needs or desires in the target population or market. In this way, the important step of “empathy building” and the search for “insights” is shorthand that hides complex processes of anthropological inquiry and ongoing interpretations performed in and out of the field. Related to the critique of conservatism, this critique suggests that design researchers need to reflect their experience against their privileged positions and continuously pursue an understanding of how knowledge in the design process is produced and used.
Design Is an Engagement with Unknowable Futures
Design is always, somehow, an engagement with the future. It is a practice as well as a way of thinking that is fundamentally concerned with bringing forth materialities (including behaviors, activities, and processes or experiential affordances) that people are intended to engage with in some (more or less familiar if not-yet-known) future situation. Professor of design Cameron Tonkinwise suggests simply that “[d]esign makes futures. What designers make becomes the futures we inhabit. In this, design is unique. Other discourses imagine new and different things, but do not make, do not realise them as things that people in the future will experience as their reality. There are practices of making, but these crafts do not imagine new kinds of, and so future, things” (2015: 13). Seeing the application of speculation and futuring as indispensable for all forms of design, Tonkinwise suggests that “Design that does not already (imagine the) Future, (consider) Fiction, Speculate, Criticise, Provoke, (promote) Discourse, Interrogate, Probe, or Play is inadequate design. Not all (commercial) design does all those things, but it should” (Tonkinwise 2015: 13). Design, in this sense, must navigate within a complex space of that which currently exists and that which does not yet exist in the world (Ehn 1988; Nelson and Stolterman 2003). Hence, design involves ways of performing creative “leaps of faith” and ongoing critique and reflection on the kinds of futures (including narratives, discourses, and representations of such futures) we imagine and creatively respond to in design work. In this way, design can be a way to facilitate critical and speculative spaces for rich inquiries into radically contingent futures. This obviously goes beyond the idea of “design” as a universally cogent way of solving discrete problems.
Instead, this understanding of design encourages designers and researchers to be more aware of how design practices can be turned toward articulating, reflecting on, and “rehearsing” futures. Speculative design
questions the framings, discourses, and representations inherent in a modernist design tradition and sets out to pose “carefully crafted questions” (Dunne and Raby 2001) that take a broader view of the process as well as roles and wider responsibilities of design as a discipline (Fry 2011). A turn toward understanding design as “speculation” turns the responsibility of the designer from the production of new, innovative, and marketable artifacts to applying design as a means for (creatively) shaping speculative leaps of imagination. Ideally, speculation works to reflect on (rather than predict or determine) future worlds and what life might be like in some future setting. This, I believe, is pertinent for the ongoing study of tourism and its consequences, and I believe that design can be figured in a way that problematizes the singular focus on “problem-solving.” An alternative account of design and the subsequent understandings/practices of design is vital for understanding problematic, complex, and unpredictable sociomaterial futures of tourism. Risking the lumping together of quite disparate concerns that present their own specific logics, I would suggest that these are futures where complex matters such as “overtourism,” climate change and environmental concerns, cultural appropriation, gender and indigenous population concerns, gentrification, inequality and class conflict, identity politics, tribalization, and mobilities (and adding an indispensable “and so on” here) will be increasingly entangled and relevant matters for the tourism academy and beyond. While current technological innovations (including the multitude of digital technologies at our disposal as design materials for future interventions) will likely play an increasing role in tourism, we need to understand the orchestration of artifacts and interactions in tourism as more than problem-solving. Instead, we can turn to design as a form of material and critical-imaginative inquiry.
This will be the proposition explored in the following sections.
Alternatives to Design Science
It is possible to trace an alternative account of design, an account that relies on completely different intellectual and academic histories. Risking generalization, in the design science tradition, the purpose of design is to produce a product (service, experience) that can be formally projected and evaluated as fit for an intended (and typically well-defined) goal. Using the label speculative design, I wish to suggest an opening for alternative accounts of design and design research. Speculative design lumps together a number of concerns that have been labelled variously critical design (Dunne and Raby 2001), design fictions (Sterling 2005; Bleecker 2009), alternative futures (Angheloiu et al. 2017), or design frictions (Forlano and Mathew 2014). All these approaches share a willingness not only to relinquish the market-based logics of design agendas but also to fundamentally rethink how and why design is done. What constitutes knowledge in design processes? How are relations (from the designer’s apparent empathy with users to the performance of ownership and) performed in the design project? What kinds of stories are told about problems and solutions (and to whom)? They also share a concern with reframing the ways in which disciplinary commitments are organized and obligated toward
particular theories or worldviews. How might design, for instance, “[engender] new kinds of researchable entities and a new or rediscovered realm of the empirical, and it opens up new avenues for critique” (Buscher and Urry 2009: 99)? New conceptual accounts and practices of design thus expand the scope for what counts as an empirical field. Speculative design seeks to find “moments of working and experimenting with material worlds that expand our analysis” (Robinson 2018: 57). Rather than seeking closure and solution, design can become a way of making and “staying with the trouble” (Haraway 2016) to expose new researchable phenomena as well as insisting on complicating any linear or “natural” account of design as problem-solving. In this counternarrative, design is no longer a linear process of empathy building, insights synthesis, prototyping, and evaluation that leads inevitably to a solution to a well-specified problem. Rosner (2018) suggests how anthropologists and feminist scholars such as Lucy Suchman and Donna Haraway (along with a host of like-minded academics and activists) provide generative and conceptual counterpoints to such canonical accounts of design. Auger (2013) has warned that speculative design is not merely about speculating about grand, futuristic scenarios or extrapolating current (technological) situations into probable (and desirable) future ones. Rather, speculation should be understood as a form of critical “interrogation” of the structures that underlie the various design aims and artifacts in both current and future iterations. Auger suggests that speculative design works to “apply different ideologies or configurations to those currently directing product development. This method is similar to the historiographical practice of counterfactual histories and the literary genre of alternate histories” (2013: 2). Put simply, it amounts to asking “what if...” of both the contingent histories of designed things, places, or experiences already in place in our world, as well as of the possibilities and exigencies for new things in the world(s) of the future. A speculative approach allows designers to inquire more freely and without the constraints of creating for a competitive market of offerings. It facilitates “the removal of the commercial constraints that normally direct the creative process. This decoupling allows for the goals to be based on questions and discourse rather than market-led agendas; hypothetical possibilities not real products; utopian concepts and dystopian counter-products. They can inspire an audience to think not only about what they do want for their future selves but also what they do not want” (Auger 2013: 22). Dunne and Raby’s work on critical design (Dunne and Raby 2001) expresses similar concerns over a design discipline that is increasingly (perhaps unwittingly) complicit in corporate exploitation and rampant consumerism – with an uncanny ability to make the future of the things that are thus peddled look convincingly benign, devoid of ethical conundrums and emptied of political and ideological significance.
There are few examples of speculative design work in tourism. I wish, in the following section, to look at two examples, noting some of the directions and imperatives suggested by them. In a project called simply “Speculative Tourism,” artists and designers Shalev Moran and Mushon Zer-Aviv (personal communication) have created a series of location-sensitive audio tour guides for Jerusalem. While Jerusalem is ripe with history (“it sometimes feels impossible to even think of
Fig. 1 (Moran et al. n.d.) “Speculative Tourism” – audio guides from the future. https://www.shalevmoran.com/speculativetourism (used with kind permission)
Jerusalem in terms of the present, let alone the future”), these tours were staged as narratives from the future: “In each iteration, we conduct an extensive writing workshop with local talents and guide them through a series of exercises to imagine their own speculative futures for the local region, and to narrate them as personal tours in the actual local environment. We record those tours and place them in a geolocative mobile app, that is made available to anyone who visits the tour locations” (Moran et al. n.d.) (Fig. 1). Jerusalem is a site of conflict and dispute. One of the guided tours playfully trails the Zuckerbergs on their last visit to the city: “In the year 2086, the extended Zuckerberg family lands in Jerusalem for a last visit before they leave Earth. With them comes great grandfather Mark Zuckerberg, now 102. At the entrance of King David Hotel, where they stay, a local tour guide is waiting. It’s a morning they’ll never forget.” The invocation of Silicon Valley aristocracy, histories of belonging, and science fiction technologies points toward a recognizable but radically transformed vacationscape. Their departure for Mars suggests new geographies of migration. Another tour begins: “The year is 2037 and you are taking a tour with the CEO of “Equal but Separate,” a non-profit organization which has developed and implemented a sustainable solution for peace in Jerusalem. In contrast to previous peace solutions thinking in terms of East and West, “Equal but Separate” has divided Palestinian and Israeli societies on a vertical axis; Jerusalem is now split into two cities: Lower J. and Upper J.” Audiences are left to ponder the exigencies of urban planning, and how profoundly planning and design are entangled within political economies and geographies of belonging and heritage.
Both tours are commentaries on contemporary ideological and economic conditions but extrapolate current discourses of technology and social structures beyond statistical forecasting into sociotechnical imaginaries. From a performance perspective, the project appropriates state-of-the-art digital technologies (smartphones running location-sensitive audio-based augmented reality applications) for a noncommercial arts project that emphasizes participation and cross-boundary collaborations between local writers, technology designers, and community members in the locations where the performances are staged. This is a confident and optimistic articulation of the potential in creating more communal and action-based engagements in the experience economy. The choice of interaction design requires participants to move physically from site to site for listening
to the different parts of the story, creating a virtual layer within the embodied experience of being “there.” While in essence the functionality of the device resembles that of a standard heritage site audio guide, the narrative presented can create a sense of a future place and how it might be transformed. Similar appropriations of precise location awareness, context sensitivity, and other digital technologies have been performed by arts and project studios such as GlowLab (GlowLab 2005a,b) and Blast Theory (Benford et al. 2006). Galloway et al. (2003) describe a set of studio-based design experiments on city tourism. Inspired by situationist art, these are often playful reflections on the possibilities of new technologies, exploring the affordances and sociabilities created by them in situ (Bødker and Browning 2013). One such speculative artifact is “The Cube,” a dice or cube with screens on all sides that can show pictures. When visiting a city (in the case of the project, Rome), a local is asked to “roll the dice,” and both the guest and the host are tasked with figuring out what the cube face shows, and hence where the tourist should visit. The interactions between locals and visitors are rendered lighthearted and serendipitous and require playful collaboration and interpretation to become meaningful. Their work follows that of Gaver, Dunne, and Pacenti, who write: “instead of designing solutions for user needs, then, we work to provide opportunities to discover new pleasures, new forms of sociability, and new cultural forms. We often act as provocateurs through our designs, trying to shift current perceptions of technology functionally, aesthetically, culturally, and even politically” (1999). Some work has also been carried out in the field of human-computer interaction, contributing reflective versions of user-centered design that feature interventionist and performative research (e.g., Sengers et al. 2005).
Similarly, a number of perspectives on reflective workshops and performance practices in tourism research have been suggested by Bødker and Browning (2013).
Implications and Conclusion
The set of responsibilities that arises for designers when pondering and engaging with the futures of tourism requires a critical expansion of the concept of design. This includes speculative and performative inquiries into the contexts and complexities of future artifacts or services. Such design interventions, as I have discussed, take place outside of commercial settings and agendas, and they tend to eschew the distinctly solution-hungry and problem-solving logics of design science and approaches such as design thinking or service design. Speculative design is a way to extend the engagement within a problem space, where abstaining from solutionism and acknowledging the situated and reflective character of engaging with futures are key. It represents an opportunity for a number of more playful, poetic, and expressive (or even disruptive or disturbing) encounters with design in tourism (see, e.g., Veijola 2014). More studies are needed that explore in more detail how a shift in outlook – what we might call a “speculative ontology” of tourism futures – can be performed in the context of design in tourism. Some recent
work has taken up this discussion. Practicing a game-based design method, Nielsen (2019) has begun to explore how co-design supported by a design game can support “futuring” and speculative/generative work to renew our approaches to innovation in “smart” tourism. In the following section, I will outline three opportunities for further research in this space. Speculative design feeds off a desire to subvert convenient narratives of design (Dunne and Raby 2001). As a method responsive to this, one concrete suggestion is to do work on conceptually driven design as a means for exploring tourism futures. Stolterman and Wiberg (2010) suggest how a theory-driven approach to design is tied to speculation and “that concept-driven interaction design research can be understood as rooted in futuristic use scenarios (“disciplined imagination,” Weick’s term) and in reasoning grounded in theory rather than in careful studies of present user conditions and situations” (2010: 97). The approach tasks designers with creating “a concept and an artifact that manifests desired theoretical ideas as a compositional whole” (ibid. 109). The design should embody a theoretical stance or concept that seems to be generative in bringing forth critical reassessments of the theories and truisms that underlie design for tourism. This sets concept-driven design apart from the standard design science approach (even if it retains an academic and intellectual position) – instead of focusing on solutions, concept-driven design is concerned with exploring theoretical concepts and how they might lead us to comprehend what axiomatic assumptions underlie current design practices, and how artifacts embody these assumptions. Stolterman and Wiberg reference the prototype work on “Bricks” (Fitzmaurice et al. 1995), a merging of physical bricks with a flat-screen display, as an example where designers are more interested in exploring concepts such as tangibility and the embodiment of interaction than in marketability, short-term viability, or distinct user needs. This led to a productive stream of research on both fundamental and applied aspects of the role of the body, movements, and space for interaction design. In tourism, one could, for example, imagine interaction design approaches where a concept such as “sound” or “listening” acts as a novel sensory motif for, e.g., destination design (see, e.g., Veijola 2014; Chamberlain et al. 2016). Another conceptual vector that I see as promising is the renewed interest in “atmospheres,” drawing in particular on the work of Gernot Böhme. Notably, Böhme’s concept of atmosphere is far from Kotler’s managerial use of the term “atmospherics” (1973) and his claim that carefully arranged and orchestrated (designed) atmospheres can contribute positively to a customer’s purchasing decision. Whereas Kotler understands atmospheres as staged, modulating a receiver to conform to a behavioral scheme, Böhme suggests how atmospheres are a fundamental condition for our engagements in the world. First and foremost, Böhme’s idea of an atmosphere suggests that experiences constitute “felt” relations to the world, mediated through precognitive and embodied sensory formations. I suggest that atmosphere or related concepts from the theory of affect and materiality (e.g., Massumi 2002; Seigworth and Gregg 2010; Bennett 2010) can help us reconsider the technical rationality embodied in the scientific-rational accounts of design and planning and the particular embodied figurations of the human
(human needs, human experience, human values) attended to under this perspective. In this way, they might help us uncover (and design for) new understandings of (technical) mediation, experience, and embodiment. This, as Böhme suggests, ultimately speaks to an “aesthetic” encounter with things, not as a judgment of the artistic qualities of a design but as a sensory and embodied encounter with things in their context. Moods and atmospheres are highly transient and contextual phenomena, and design arguably always fails in the attempt to “manage” the intensities of feeling mediated by technology. And, yet, moods and atmospheres, designed or not, materialize all the time. Design of new touristic interactions or uses of technology is not merely about creating new things, but about how such things last, how they are appreciated over time, and how their presence and the practices they afford contribute to the felt shapes and dispositions of people in context.
A second and related opening is what Gatt and Ingold have termed “experimental anthropologies” (Gatt and Ingold 2013). The term suggests how anthropology might not only “inform” design as a kind of knowledge generated in the problem-setting phases of a design project. Instead, anthropological practices (what is likely referred to as empathy building or user studies in a conventional account of design) can include embodied experimentation, prototyping, making, crafting, etc. as means by which new understandings of concerns and local futures can emerge. Designing (together) can be understood as a way of encountering, perceiving, and learning about aspirations and complex formations of human and nonhuman agency. Lenskjold and Olander (2016) use an example of designing with elderly residents and use their relations to the chirping birds neighboring their care home as a means for reflecting on nonhuman actors in the assemblages of design practice. Specifically, their work urges us to reconsider the single-mindedness of human-centric approaches to design and instead allow design processes to discover novel relations between humans and nonhumans.
Concluding Remarks
The aim of this paper was to supplement rational, science-focused, and solutionist design paradigms in tourism with more speculative inquiries, including conceptual design and affective and reflective research, that can inform how we conceive tourism futures. The organizing principle of the dominant, science-focused account of design is a productive one, aimed at generating universal solutions to well-described problems or the optimization of existing artifacts or processes. It tells the story of design with a clear telos, a goal of producing a future thing structured around a representable (and often unambiguous) set of customer requirements or an unproblematized access to “empathy” resulting from customer engagement and the designer’s insights. Given the expanding palette of design materials in the form of new interactive technologies and digital infrastructures, this paper examines a dominant legacy of design in tourism and offers a short review of some key critiques, pointing toward alternative foundations for design in e-tourism and related areas. The hope is that this chapter can pave the way for a broader narrative about the role and dispositions of design.
M. Bødker
Such narratives, of course, never represent complete worlds, in the same way that the map is not the territory. I agree with Donna Haraway that “[i]t matters which stories tell stories, which concepts think concepts” (Haraway 2015: 160). In other words, the ways in which technologies and sociotechnical assemblages change, co-evolve or learn are, according to Hevner et al., distinctly not part of DS. The narrative I have attempted to inspect and the narratives or conceptual bases I offer that are based on more marginal traditions, philosophies, and practices are both inadequate in terms of drawing complete and final stories or strongly normative accounts of design. These stories remain sketches; like those loose sketches of imagined future artifacts (say a sketch of a new house), they are incomplete and failing, but I intend them to have a thrust. They aim to insert themselves into some of those unchallenged stories and discourses that we tend to accept in our research. The hope is to create new positions in the field, generate new researchable entities (Büscher and Urry 2009), and pave the way for an increased plurality of paradigms for a design-based engagement with new technologies and materialities of tourism.
Cross-References
A Post-disciplinary Perspective on e-Tourism
Group Decision-Making and Designing Group Recommender Systems
Information and Communication Technology in Event Management
Internet of Things and Ubiquitous Computing in the Tourism Domain
Mobile Applications for e-Tourism
Service Management in the E-Tourism Era
User Experience and Usability: The Case of Augmented Reality
Value Co-creation in Dynamic Networks and E-Tourism
References Angheloiu C, Chaudhuri G, Sheldrick L (2017) Future tense: alternative futures as a design method for sustainability transitions. Des J 20(Sup. 1):S3213–S3225 Auger J (2013) Speculative design: crafting the speculation. Digit Creat 24(1):11–35 Baskerville R (2008) What design science is not. Eur J Inf Syst 17:441–443 Benford S, Crabtree A, Flintham M, Drozd A, Anastasi R, Paxton M (2006) Can you see me now? ACM Trans Comput-Hum Interact 13(1):100–133 Bennett J (2010) Vibrant matter: a political ecology of things. Duke University Press, Durham Bleecker J (2009) Design fiction: a short essay on design, science, fact and fiction. Near Future Laboratory Blythe M, Andersen K, Clarke R, Wright P (2016) Anti-solutionist strategies: seriously silly design fiction. In: Proceedings of CHI’16, San Jose, 07–12 May 2016 Bødker M, Browning D (2013) Tourism sociabilities and place: challenges and opportunities for design. Int J Des [Online] 7:2. http://www.ijdesign.org/index.php/IJDesign/article/view/1181/ 580. Accessed July 2019 Brown T (2008) Design thinking. Harvard Bus Rev 86(6):84–92, 141. https://doi.org/10.5437/ 08956308X5503003
27 Tourism Design: Articulating Design Beyond Science
Büscher M, Urry J (2009) Mobile methods and the empirical. Eur J Soc Theory 12(1):99–116 Cavadi D, Elahi M, Massimo D, Maule S, Not E, Ricci F, Venturini A (2018) Tangible tourism with the internet of things. In: Stangl B, Pesonen J (eds) Information and communication technologies in tourism 2018. Springer, Cham Chamberlain A, Bødker M, Hazzard A, Benford S (2016) Audio in place: media, mobility & HCI: creating meaning in space. In: Proceedings of MobileHCI’16: 18th international conference on human-computer interaction with mobile devices and service. ACM, New York Chumley L (2016) Creativity class: art school and culture work in postsocialist China. Princeton University Press, Princeton Dobbins M (2009) Urban design and people. Wiley, Hoboken Dorst CH (1997) Describing design: a comparison of paradigms. PhD thesis, TUDelft Dorst K (2006) Design problems and design paradoxes. Des Issues 22(3):4–17 Dorst K, Cross N (2001) Creativity in the design process: co-evolution of problem-solution. Des Stud 22:425–437 Dredge D (1999) Destination place planning and design. Ann Tour Res 26(4):772–791 Dunne A, Raby F (2001) Design noir: the secret life of electronic objects. Springer Science & Business Media. Basel, Birkhäuser Ehn P (1988) Work-oriented design of computer artifacts. Arbetslivscentrum, Falköping Sweden Eide D, Mossberg L (2013) Towards a conceptual framework of innovation types in experience economy: innovation through experience design with focus on customer interactions. In: Sunbo J, Sørensen F (eds) Handbook on the experience economy. Edward Elgar, Cheltenham, pp 248–268 Fesenmaier DR, Xiang Z (2017) Introduction to Tourism Design and Design Science in Tourism. In: Fesenmaier D, Xiang Z (eds). Design Science in Tourism. Tourism on the Verge. Springer, Switzerland, Cham. pp. 3–16. https://doi.org/10.1007/978-3-319-42773-7_1 Fitzmaurie GW, Ishii H, Buxton WAS (1995) Bricks: laying the foundations for graspable user interfaces. 
In CHI ’95: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp 442–449. https://doi.org/10.1145/223904.223964 Forlano L, Mathew A (2014) From design fiction to design friction: speculative and participatory design of values-embedded urban technology. J Urban Technol 21(4):7–24 Fry T (2011) Design as politics. Berg, Oxford Galloway A, Sundholm H, Ludvigsen M, Munro A (2003) From Bovine Horde to urban players: multidisciplinary interaction design for alternative city tourisms. Workshop proposal for MUM 2003, Norrköping, 10–12 Dec 2003. https://adk.elsevierpure.com/ws/files/76962/ MUM2003workshop_final.pdf. Accessed June 2019 Gatt C, Ingold T (2013) From description to correspondence: anthropology in real time. In: Gunn W, Otto T, Smith RC (eds) Design anthropology: theory and practice. Bloomsbury, London, pp 139–158 Gaver B, Dunne T, Pacenti E (1999) Design: cultural probes. ACM Interact 6(1):21–29 GlowLab (2005a) https://www.artinteractive.org/glowlab/ GlowLab (2005b) http://digicult.it/news/glowlab-open-lab/ Gretzel U, Fuchs M, Baggio R, Hoepken W, Law R, Neidhardt J, Pesonen J, Zanker M, Xiang Z (2020) e-Tourism beyond COVID-19: a call for transformative research. Inf Technol Tour. https://doi.org/10.1007/s40558-020-00181-3 Hall CM, Williams AM (2008) Tourism and innovation. Routledge, London Haraway D (2015) Anthropocene, capitalocene, plantationocene, chthulucene: making Kin. Environ Hum 6:159–165 Haraway D (2016). Staying with the Trouble: Making Kin in the Chthulucene. Duke University Press. Hevner AR, March ST, Park J, Ram S (2004) Design Science in Information Systems Research. MIS Quarterly 28(1):75–105. https://doi.org/10.2307/25148625 Hjalager AM (2010) A review of innovation research in tourism. Tour Manag 31(1):1–12 Huppatz DJ (2015) Revisiting Herbert Simon’s ‘science of design’. Des Issues 31(2):29–40 Ihde D (1990) Technology and the lifeworld: from garden to Earth. Indiana University Press
Irani L (2018) “Design thinking”: defending silicon valley at the apex of global labor hierarchies. Catal Fem Theory Technosci 4(1):1–19 Iskander N (2018) Design thinking is fundamentally conservative and preserves the status quo. Harvard Business Review 2018. https://hbr.org/2018/09/design-thinking-is-fundamentallyconservative-and-preserves-the-status-quo. Accessed July 2019 Kelley D, Kelley T (2015) Creative confidence: unleashing the creative potential within us all. Harper Collins, London Kim JJ, Fesenmaier DR (2015) Designing tourism places: understanding the tourism experience through our senses. Travel and Tourism Research Association: Advancing Tourism Research Globally 19. Available at https://scholarworks.umass.edu/ttra/ttra2015/Academic_Papers_Oral/ 19. Accessed July 2019 Kimbell L (2011) Rethinking design thinking: part I. Des Cult 3(3):285–306 Kingman-Brundage J, Shostack LG (1991) How to design a service. In: Congram CA, Friedman ML (eds) The AMA handbook of marketing for the service industries. Amacom, New York, pp 243–261 Kotler P (1973) Atmospherics as a marketing tool. J Retail 49(4):48–64 Kotler P, Gertner D (2002) Country as brand, product, and beyond: a place marketing and brand management perspective. J Brand Manag 9(4–5):249–261 Lambert CU, Watson KM (1984) Restaurant design: researching the effects on customers. Cornell Hotel Restaur Admin Q 24(4):68–76 Lenskjold TU, Olander S (2016) Design anthropology as ontological exploration and inter-species engagement. In: Smith RC, Vangkilde KT, Kjærsgaard MG, Otto T, Halse J, Binder T (eds) Design anthropological futures. Bloomsbury Academic, London Liburd, JJ, Carlsen J (2013) Introduction to networks for innovation in sustainable tourism. In Liburd JJ, Carlsen J, Edwards D (Eds.), Networks for Innovation for Sustainable Tourism. Case Studies and Cross-Case Analysis. 
Melbourne: Tilde University Press, pp 1–12 Lindgaard K, Wesselius H (2017) Once more, with feeling: design thinking and embodied cognition. She Ji: J Des Econ Innov 3(2):83–92 Lindtner S (2017) Laboratory of the precarious: prototyping entrepreneurial living in Shenzhen. Women’s Stud Q 45(3–4):287–305 March TS, Smith G (1995) Design and natural science research on information technology. Decis Support Syst 15(4):251–266 Massumi B (2002) Parables for the virtual: movement, affect, sensation. Duke University Press, Durham Moran S, Aviv MZ, Adiram MG (n.d.) Speculative tourism. https://www.speculativetourism.com. Accessed July 2019 Morgan N, Pritchard A, Pride R (2004) Destination branding: creating the unique destination proposition. Butterworth-Heinemann, Oxford Morozov E (2013) To save everything, click here: the folly of technological solutionism. Public Affairs, New York Nelson KB (2009) Enhancing the attendee’s experience through creative design of the event environment: applying Goffman ìs dramaturgical perspective. J Conv Event Tour 10(2):120– 133 Nelson H, Stolterman E (2003) The design way – intentional change in an unpredictable world. Educational Technology Publications, Englewood Cliffs Neuhofer B, Buhalis D, Ladkin A (2014) A typology of technology-enhanced tourism experiences. Int J Tour Res 16:340–350 Newell A, Shaw JC, Simon HA (1959) Report on a general problem-solving program. In: Proceedings of the international conference on information processing, pp 256–264 Nielsen T (2019) Co-designing smart tourism: evoking possible futures through speculation and experimentation. Unpublished PhD Thesis, University of Southern Denmark Pine BJ, Gilmore JH (1999) The Experience Economy. Work is Theatre and Every Business a Stage. Harvard Business School Press, Boston, USA
Porter ME (1985) Competitive Advantage – Creating and Sustaining Superior Performance. The Free Press, New York Rittel HWJ, Webber MM (1973) Dilemmas in a General Theory of Planning. Policy Sci 4(2):155– 169 Robinson KS (2018) Empty half the Earth of its humans. It’s the only way to save the planet. The Guardian. 20 March 2018. https://www.theguardian.com/cities/2018/mar/20/save-the-planethalf-earth-kim-stanley-robinson. Accessed 20 June Rosner DK (2018) Critical Fabulations: Reworking the Methods and Margins of Design. MIT Press, Cambridge Salmela T, Valtonen A, Miettinen S (2017) An uncanny night in a nature bubble: designing embodied sleeping experiences. In: Fesenmaier DR, Xiang Z (eds) Design science in tourism. Springer International Publishing, Switzerland, pp 69–93 Seigworth GJ, Gregg M (eds) (2010) The affect theory reader. Duke University Press, Durham Sengers P, Boehner K, David S, Kaye J (2005) Reflective design. In: Proceedings of the 4th decennial conference on critical computing: between sense and sensibility (CC’05). ACM, New York, pp 49–58 Shostack LG (1987) Service positioning through structural change. J Mark 51(1):34–43 Simon HA (1973) The Structure of III Structured Problems. Artificial Intelligence, 4:181–20 Simon H (1996) The sciences of the artificial, 3rd edn. MIT Press, Cambridge, MA Sterling B (2005) Shaping things. MIT Press, Cambridge, MA Stickdorn M, Zehrer A (2009) Service design in tourism: customer experience driven destination management. In: Proceedings of first nordic conference on service design and service innovation, Oslo, 24–26 Nov 2009 Stolterman E, Wiberg M (2010) Concept-driven interaction design research. Hum Comput Interact (HCI) 25(2):95–118 Suchman L (2002) Located accountabilities in technology production. Scand J Inf Syst 14(2):91– 106 Sundblad Y (2010) UTOPIA: participatory design from Scandinavia to the World. 
In: 3rd history of nordic computing (HiNC), Stockholm, Oct 2010, pp 176–186 Sundbo J, Orfila-Sintes F, Sørensen F (2007) The innovative behaviour of tourism firms – Comparative studies of Denmark and Spain’, Research Policy, 36(1):88–106 Sundbo J (2008) Innovation and involvement in services. In: Fuglsang L (ed) Innovation and the creative process. Towards innovation with care. Edward Elgar, Cheltenham/Northampton, pp 87–111 Tonkinwise C (2015) Just design: being dogmatic about defining speculative, critical design future fiction. In: Moline K, Hall P (eds) Experimental thinking/design practices, Griffith University Art Gallery, Brisbane. https://research.gold.ac.uk/16591/1/ ExperimentalThinkingDesignPractices.pdf. Accessed July 2019 Tunstall E (2013) Decolonizing design innovation: design anthropology, critical anthropology, and indiginous knowledge. In: Gunn W, Otto T, Smith RC (2013) Design anthropology: theory and practice. Bloomsbury, London Tussyadiah IP (2014) Toward a Theoretical Foundation for Experience Design in Tourism. J Travel Res, Switzerland, 53(5):543–564. https://doi.org/10.1177/0047287513513172 Tussyadiah IP (2017) Technology and behavioral design in tourism. In: Fesenmaier DR, Xiang Z (eds) Design science in tourism. Springer International Publishing, pp 173–191 Veijola S (2014) Towards silent communities. In: Veijola S, Molz JG, Pyyhtinen O, Hockert E, Grit A (eds) Disruptive tourism and its untidy guests: alternative ontologies for future hospitalities. Palgrave Macmillan, Basingstoke Zehrer A (2009) Service experience and service design: concepts and application in tourism SME’s. Manag Serv Qual 19:332–349
Log File Analysis
28
Constantine J. Aivalis
Contents
Introduction
Basic Log File Mechanism
Basic Tagging System Mechanism
A Generic Log File Analyzer
LFA Requirements and Specifications
Nonfunctional Requirements
Functional Requirements
Database Design
Input-Process-Output Outline
Types of Log File Analyzers
Standard Analyzer
Extended Log File Analyzer
Hybrid Analyzer
Hybrid Analyzer with Real-Time Extensions
NRT Analyzer with Social Media Integration
Conclusions
References
Abstract Log files record invaluable information about the operational details of applications, database management systems, operating systems, and devices. They are autogenerated “diaries” that keep timelines of all data reflecting every event that took place during the operation of the system. Every web site visitor request and the corresponding response are registered in an access log file generated by the web server. Access log files keep the entire operational history. Publicly
C. J. Aivalis, Hellenic Mediterranean University of Crete, Heraklion, Crete, Greece. © Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_39
accessible web applications and e-commerce sites that operate 24 h a day, 7 days a week are exposed to the global Internet community. Analyzing the log file is crucial not only for security reasons but also for assessing the community of visitors, gaining insight into their operational habits, knowing their requests, measuring response times, spotting implementation errors, and locating problems at all levels. This chapter starts with a description of how to configure and customize a web server in order to produce a useful access log file. It then describes conceptually various software compositions that constitute contemporary web analytics applications: applications that deal with log files, inventory, and customer data; a hybrid application combining log files and tagging system data; near real-time extensions; and social media aware applications that support data streams and provide a more global image of the way a web application is approached by visitors. The impact and the implications of the technology paradigm shift toward rich Internet applications (RIAs) on web analytics applications in Web 3.0 are taken into consideration, and remedies that solve the reduced log file problem are proposed.
Keywords Log file · Analyzer · Tagging systems · Near real time · ETL · Social media
Introduction
Commercial software and web-based applications need to operate under controlled conditions. Analytics systems collect information and provide measurements of an application’s performance and response times, as well as details on load and activity over time periods, exact actions, and usage history. They can be complex multilayer software applications, running on dedicated machines, that collect information and operational data from the log files and databases of systems in remote locations, store it in their local database, process it, and provide accurate reports and visualizations. It is very important for every software system to offer some means of registering and measuring activities, performance, operations, and events. Log files allow collecting detailed historical data and detailed actions in textual format. These files can be used as the basis and source for behavioral analysis. Log file analyzers parse and load log files and offer mechanisms that provide exact metrics and even predictions. Two basic types of analysis techniques exist: log file analyzers (LFA) and tagging systems (TS). LFAs are closed extract, transform, load (ETL) applications that operate on the access log file of the web server, on the custom-made log files generated by the application, or sometimes even on log files produced by the database management system of the application and the installation they are scrutinizing. Tagging systems, on the other hand, are applied strictly for web-based
applications. They collect their data from the browsers of all clients as they visit the web application. A short tag is sent from the browser of the visitor to the central tagging system operator. Google Analytics is such an operator and a major player in this area. Operators of this kind collect, generate, and distribute statistics, information, and visualized results to the registered owner of the web application on demand. Since the advantages of one technique are the disadvantages of the other, hybrid systems, having the functionalities of both, alleviate the disadvantages at the cost of slightly higher complexity. Big Data techniques often need to be applied to keep up with the huge volume of interaction data and to feed the analyzer with log file data without delays. Streaming techniques can support near real-time (NRT) access to crucial events that occur on the web server and allow live visualizations and the mixing of log file data with operational data and social media information. Log files are semi-structured text files that often offer very detailed information about the interaction of visitors with a web application. Although the size and level of detail of the access log file depend on the configuration of the web server, its size almost always makes manual processing impractical. Log file analyzers read log file lines, perform preprocessing procedures such as extracting the details of every line and filtering out irrelevant data, and load the information into relational database tables that offer indexing for quick searching. Report generator and visualization software can then be used to produce metrics, statistics, and graphs in any form and file format and make the data accessible and readable. This chapter describes how the access log file can be customized, deals with requirements and specifications for log file analyzers, and presents several architectures with different capabilities. 
It deals with hybrid LFAs and the reduced log file produced by rich Internet applications and shows techniques that allow streaming near real-time feeds of data.
Basic Log File Mechanism
Programmers can always roll their own logging mechanisms. It is, however, usually more convenient to avoid reinventing the wheel and stick to existing logging systems that have been used by professionals for years, have become de facto standards, are popular, are known to perform well, and are familiar to many administrators. For the Java developer, for example, the site java-source.net (http://java-source.net/open-source/logging) includes a list of logging libraries and front ends, many of which belong to the open-source community and can be easily embedded into any application. The web servers of the Apache Software Foundation offer two basic types of logs: internal and external. Internal logs report startup and shutdown events and history and include error logs. They are used for reporting the status of the web server operations and register administrator information and errors:
25-Apr-2019 10:03:14.734 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["ajp-nio-8788"]
25-Apr-2019 10:03:14.754 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in 16687 ms

Textbox 1 Internal Log File Sample

…
62.1.183.170 127.0.1.1 - 0 62.1.183.170 - GET 8780 ?prodId=8 - [05/Apr/2019:00:00:03 +0300] "GET /konakart/SelectProd.do?prodId=8 HTTP/1.1" 302 /konakart/SelectProd.do 19 6924E3EAC9BAE3649D0DCF6B4EFFFFB9 "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0" "http://aivalisco.ddns.net:8780/konakart/Welcome.do"
62.1.183.170 127.0.1.1 25906 25906 62.1.183.170 - GET 8780 ?prodId=8&manufacturer=Warner&category=Cartoons&name=A+Bug%27s+Life&model=DVD-ABUG - [05/Apr/2019:00:00:03 +0300] "GET /konakart/SelectProd.do?prodId=8&manufacturer=Warner&category=Cartoons&name=A+Bug%27s+Life&model=DVD-ABUG HTTP/1.1" 200 /konakart/SelectProd.do 29 6924E3EAC9BAE3649D0DCF6B4EFFFFB9 "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0" "http://aivalisco.ddns.net:8780/konakart/Welcome.do"
62.1.183.170 127.0.1.1 4558 4558 62.1.183.170 - GET 8780 - [05/Apr/2019:00:00:03 +0300] "GET /konakart/images/dvd/a_bugs_life_3.jpg HTTP/1.1" 200 /konakart/images/dvd/a_bugs_life_3.jpg 0 6924E3EAC9BAE3649D0DCF6B4EFFFFB9 "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0" "http://aivalisco.ddns.net:8780/konakart/SelectProd.do?prodId=8&manufacturer=Warner&category=Cartoons&name=A+Bug%27s+Life&model=DVD-ABUG"
…

Textbox 2 External Log File Sample
External logs are used for logging external events, such as visitor access requests to the application via the web server. External log files register all external requests to the web server, including many details about each transaction. Every web server application offers its own way of reporting and logging events into log files. The format of the log file line is always, to a large extent, configurable. The standard settings usually report only a few items per request, such as IP address, request, timestamp, status code, and bytes transferred. A reconfiguration of the web server settings is usually required to make the server report more detailed user access information in the access log file. This is done by setting appropriate values in a specific XML file. With Tomcat, for example, it is necessary to reconfigure the server.xml file to obtain access log files that describe visiting users precisely enough to allow timing and response measurements, as well as information on the access behavior of prospective or existing customers. The default valve parameters and patterns can easily be modified, either manually through a text editor or by the LFA application. A valve configuration that yields a reasonably detailed log file for access metrics may look like the one in Textbox 3. Almost all necessary options are activated in that example, to obtain as much information as possible in the access log file. In general, the more pattern codes are used, the larger each line of the access log file will be. The included codes
Textbox 3 An Access Log Valve

Table 1 Pattern variable options for access valve (https://tomcat.apache.org/tomcat-8.5-doc/config/valve.html#Access_Logging)

Code  Description
%a    Remote IP address
%A    Local IP address
%b    Bytes sent, excluding HTTP headers, or ‘-’ if zero
%B    Bytes sent, excluding HTTP headers
%h    Remote host name (or IP address if enableLookup for the connector is false)
%H    Request protocol
%l    Remote logical username from identd (always returns ‘-’)
%m    Request method (GET, POST, etc.)
%p    Local port on which this request was received. See also %{xxx}p below
%q    Query string (prepended with a ‘?’ if it exists)
%r    First line of the request (method and request URI)
%s    HTTP status code of the response
%S    User session ID
%t    Date and time, in Common Log Format
%u    Remote user that was authenticated (if any), else ‘-’
%U    Requested URL path
%v    Local server name
%D    Time taken to process the request in milliseconds. Note: In httpd %D is microseconds. Behavior will be aligned to httpd in Tomcat 10 onwards
%T    Time taken to process the request, in seconds. Note: This value has millisecond resolution, whereas in httpd it has second resolution. Behavior will be aligned to httpd in Tomcat 10 onwards
%F    Time taken to commit the response, in milliseconds
%I    Current request thread name (can compare later with stack traces)
are a matter of company policy and define the contents and the sequence of the information stored in each line of the produced log file. The LFA will parse each log file and load the data into a database. This phase offers a second chance to leave out information produced by any pattern code that seems excessive. If the pattern is modified often, the web server generates log files with different formats, which are difficult to process; it is therefore advisable not to change the pattern too often. Changes to the pattern also imply changes to the software that parses the log file for loading into a database. The set of pattern variables usually grows with every new version of the web server. The pattern variables of Apache Tomcat 8.5 are listed in Table 1.
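The parse-and-load step described here can be sketched in Python. This is a minimal sketch under stated assumptions, not the chapter's implementation: the field order in the regular expression is inferred from the external log sample of Textbox 2 (the valve pattern itself is not reproduced in the text), and the table name access_log, the column names, and the function names are hypothetical. SQLite merely stands in for whatever relational database an LFA would use.

```python
import re
import sqlite3
from datetime import datetime

# Field order inferred from the Textbox 2 sample (an assumption):
# %a %A %b %B %h %l %m %p %q %u %t "%r" %s %U %D %S "user agent" "referer"
LINE_RE = re.compile(
    r'(?P<remote_ip>\S+) (?P<local_ip>\S+) (?P<bytes_or_dash>\S+) '
    r'(?P<bytes_sent>\d+) (?P<remote_host>\S+) (?P<logname>\S+) '
    r'(?P<method>\S+) (?P<port>\d+) (?P<query>\?\S*)?\s*(?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3}) '
    r'(?P<path>\S+) (?P<millis>\d+) (?P<session>\S+) '
    r'"(?P<agent>[^"]*)" "(?P<referer>[^"]*)"'
)

# First record of the external log sample, used as test input.
SAMPLE = (
    '62.1.183.170 127.0.1.1 - 0 62.1.183.170 - GET 8780 ?prodId=8 - '
    '[05/Apr/2019:00:00:03 +0300] "GET /konakart/SelectProd.do?prodId=8 HTTP/1.1" '
    '302 /konakart/SelectProd.do 19 6924E3EAC9BAE3649D0DCF6B4EFFFFB9 '
    '"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0" '
    '"http://aivalisco.ddns.net:8780/konakart/Welcome.do"'
)


def parse_line(line):
    """Split one access log line into named fields; return None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Month abbreviations such as "Apr" assume an English locale.
    record["time"] = datetime.strptime(
        record["time"], "%d/%b/%Y:%H:%M:%S %z").isoformat()
    for key in ("status", "millis", "bytes_sent"):
        record[key] = int(record[key])
    return record


def load(lines, conn):
    """Load parsed lines into a table, keeping only the columns of interest."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS access_log (time TEXT, remote_ip TEXT, "
        "method TEXT, path TEXT, query TEXT, status INTEGER, millis INTEGER, "
        "bytes_sent INTEGER, session TEXT)")
    rows = [r for r in map(parse_line, lines) if r is not None]
    conn.executemany(
        "INSERT INTO access_log VALUES (:time, :remote_ip, :method, :path, "
        ":query, :status, :millis, :bytes_sent, :session)", rows)
    conn.commit()
    return len(rows)
```

Note that the INSERT statement stores only a subset of the parsed fields, which mirrors the "second chance" to drop excessive information at load time. If the valve pattern changes, the regular expression has to change with it, which illustrates why frequent pattern changes are discouraged.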
Besides the content of the log file, its location (directory path relative to the default path), name, and characteristics can be configured. When the rotatable parameter is set to “false,” the system creates one single log file that grows as event lines are appended. The basic problem of this setting is that the log file of a busy site quickly becomes huge and hence difficult to manage. When the rotatable attribute is set to “true,” a new log file is created every time the date or time stamp defined by the fileDateFormat parameter changes. When fileDateFormat is set to “yyyy-MM-dd,” the date is used as part of the file name, so there is a new log file for every day: as soon as the day changes, the current log file is closed and a new one is opened. If “.HH” were appended to the file date format, the system would produce a new log file with a new file name every hour. The AccessLogValve allows Apache Tomcat to produce the details described by the pattern variable. In earlier default configurations, the AccessLogValve tag was commented out and no access log files were generated. Recent versions offer limited information using the common pattern. The value pattern = “common” corresponds to ‘%h %l %u %t “%r” %s %b’, while pattern = “combined” appends the values of the referrer and user-agent headers, each in double quotes, to the common pattern. When the modified valve pattern of Textbox 3 is used, the outcome looks like the lines shown in the External Log File Sample of Textbox 2. This file is relatively difficult to read manually since it may contain hundreds of thousands of lines. Extracting information and statistical facts from it is very difficult without transforming it. The log file in its raw format may be scrutinized in ad hoc searches for specific situations that appear seldom or for exceptional events, such as attacks. 
The work becomes easy when a log file analyzer (LFA) is available.
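The effect of the rotatable and fileDateFormat settings on file names can be illustrated with a small Python sketch. It is not Tomcat's own code: the prefix localhost_access_log, the suffix .txt, and the function names are assumptions chosen for illustration, and only the four Java date tokens mentioned in the text are translated.

```python
from datetime import datetime

# Only the Java date tokens used in the text are mapped to strftime codes.
_JAVA_TO_STRFTIME = [("yyyy", "%Y"), ("MM", "%m"), ("dd", "%d"), ("HH", "%H")]


def to_strftime(file_date_format):
    """Translate a simple Java-style date pattern into a strftime pattern."""
    for java_token, py_code in _JAVA_TO_STRFTIME:
        file_date_format = file_date_format.replace(java_token, py_code)
    return file_date_format


def log_file_for(ts, file_date_format="yyyy-MM-dd", rotatable=True,
                 prefix="localhost_access_log", suffix=".txt"):
    """Name of the log file that would receive an event with timestamp ts.

    prefix and suffix are hypothetical values, not taken from the chapter.
    """
    if not rotatable:
        # A single, ever-growing file.
        return prefix + suffix
    # A new file starts whenever the formatted stamp changes.
    stamp = ts.strftime(to_strftime(file_date_format))
    return f"{prefix}.{stamp}{suffix}"
```

Two requests arriving a few seconds apart around midnight end up in different files with the daily format, while appending ".HH" to the format yields one file per hour.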
Basic Tagging System Mechanism
Log files are generated locally, on the system where the web server is operating, and handling them is a responsibility of the administrator of that system. Tagging systems, in contrast, are usually offered as services by specialized companies that deal with visitor clicks and requests. The web site administrator must register, create an account with the remote service operator, and subscribe. The operator then provides a specific identification code and a JavaScript snippet with a function that asynchronously sends a tag containing this unique code to the remote service. This JavaScript code is embedded in every web page that needs to be tracked. Every time a page is requested by a client, the script is loaded into the DOM space of the remote browser, forcing it to send the tag to the service operator’s web site. The service operator registers all these clicks, recognizes the source of the page by the unique code in the received tag, and combines it with the details of the client. The operator of the service provides a web site that visualizes results and statistics. Google Analytics is a popular tagging system operator, used by millions of web sites to analyze their traffic. Many companies do not want their click history and
visitor and customer details exposed to third parties. Matomo (https://matomo.org/) offers on-premises, self-hosted open-source software that keeps this data private, since the service is hosted locally.
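The tag exchange described in this section can be sketched in Python as a pair of functions: one builds the request that the embedded snippet would issue from the visitor's browser, and one shows how an operator could decode it. Everything here is hypothetical, including the collector URL, the parameter names id, page, and ref, and the function names; it does not reproduce the protocol of Google Analytics, Matomo, or any other real service.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical collector endpoint of a tagging system operator.
COLLECTOR = "https://collector.example/tag"


def build_tag(site_code, page_url, referrer=""):
    """URL that the embedded snippet would request from the visitor's browser."""
    params = {"id": site_code, "page": page_url, "ref": referrer}
    return COLLECTOR + "?" + urlencode(params)


def read_tag(tag_url, client_ip, user_agent):
    """Operator side: identify the site by its code and attach client details."""
    params = parse_qs(urlsplit(tag_url).query, keep_blank_values=True)
    return {
        "site": params["id"][0],
        "page": params["page"][0],
        "referrer": params["ref"][0],
        "client_ip": client_ip,      # taken from the incoming connection
        "user_agent": user_agent,    # taken from the request headers
    }
```

The sketch makes the key difference from log files visible: the data reaches the operator from the browser rather than from the web server, so the site owner depends on the operator for storage and reporting.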
A Generic Log File Analyzer
A log file analyzer is usually developed around a toolbox of specialized tools for loading the necessary data from a web application, measuring performance, calculating throughput, and presenting customer behavioral patterns that give insight into the operation details of past or current periods. Standard general-purpose log file analyzers usually process only log files: they evaluate access hits, calculate bandwidth, and report visited pages on an hourly and daily basis, as well as visitor countries and browser-agent statistics. This information is very useful for the administrator of a content management system, a portal, or even a static web site, because the pages visited and the visit durations are sufficient to measure the success of the site. An e-commerce site administrator, on the other hand, needs more specific information about the performed actions and transactions. If this information is available, it can be combined with the log file data. E-shop-specific data about services, products, product categories, orders, and customers can be used to describe the access events more precisely. This way, all reports and other output generated by the LFA can carry more business-specific detail than the log file line alone provides. This makes the LFA’s output easier to comprehend and richer for everybody in the company. For example, the first line of the sample access log file in Textbox 2, above, has a request with the following substring: “SelectProd.do?prodId=8.” Obviously, a product with ID 8 has been selected. This request redirects to the one logged in the next line, which additionally shows the product description.
If the product database table of the e-commerce application is accessible from the LFA, then any additional information concerning product 8, such as retail price, full description, or quantity on hand, may be visualized along with the data extracted from the log file line. To achieve the goal of measuring and displaying access to the e-shop, a system with a general architecture can be built as a standalone desktop application with a graphical user interface that makes it easy to use (Fig. 1). The analyzer maintains its own database and includes options that allow the user to configure the web server of the e-business site and to extract and load data from both the log files and the web site database. Such an application can run anywhere, and its database can be located on any database server connected to the Internet, including the machine where the web server resides. It is an advantage if it can accommodate multiple e-commerce sites, running on multiple web server architectures, located anywhere.
Fig. 1 General architecture
As the clients visit the web server, the web server provides web pages and generates log file entries. The LFA has a bidirectional connection (1) with the web server and a unidirectional connection (2) with the database of the web application. The direction of connection (1) from the web server to the LFA is used whenever the LFA needs to load one or many log files, or a portion of a log file, on demand. This is the main connection. The opposite direction of connection (1), from the LFA to the web server, is used whenever the LFA needs to configure the server, which may not happen at all, or may happen once when the cooperation between the LFA and the web application is initialized. The unidirectional connection (2) allows data from the database of the web application to be loaded into the LFA. This information enhances the data stored in the log file and makes the visualizations easier to comprehend, since products and services appear with their descriptions and not as codes and IDs. This functionality can be activated when needed and requires only read permissions on certain schemata of the web application database. It is practical to design an LFA in such a way that it can support more than one e-business site or web application, since corporations often run multiple web sites. In that case there will be multiple e-business site boxes, as well as multiple pairs of connections like (1) and (2) in Fig. 1.
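The enrichment made possible by connection (2) can be sketched in a few lines of Python. The sketch uses an in-memory SQLite table as a stand-in for the e-shop database; the table and column names, the product values, and the URL path in front of SelectProd.do are invented for illustration, and only the prodId=8 parameter comes from the sample discussed above.

```python
import re
import sqlite3

# Stand-in for the e-shop's product table; in practice the LFA would read it
# over connection (2) with read-only permissions. Names and values are
# hypothetical.
shop_db = sqlite3.connect(":memory:")
shop_db.execute(
    "CREATE TABLE products (prod_id INTEGER PRIMARY KEY, description TEXT, price REAL)"
)
shop_db.execute("INSERT INTO products VALUES (8, 'Hiking map', 12.5)")

PROD_ID = re.compile(r"[?&]prodId=(\d+)")


def enrich(request):
    """Attach product details to a logged request that carries a prodId."""
    match = PROD_ID.search(request)
    if match is None:
        return {"request": request}
    prod_id = int(match.group(1))
    row = shop_db.execute(
        "SELECT description, price FROM products WHERE prod_id = ?", (prod_id,)
    ).fetchone()
    if row is None:
        # Product unknown to the shop database: keep the bare id.
        return {"request": request, "prod_id": prod_id}
    return {
        "request": request,
        "prod_id": prod_id,
        "description": row[0],
        "price": row[1],
    }


enriched = enrich("GET /shop/SelectProd.do?prodId=8 HTTP/1.1")
print(enriched)
```

A report built from the enriched record can show the product description and price instead of the bare ID.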
LFA Requirements and Specifications
A log file analyzer can take various forms. It can function as a web application, as a standard desktop application built around a graphical user interface, like the generic LFA seen in the previous section, or as a desktop application with a web interface, combining both forms. The e-shop administrator should be given a user-friendly and safe application to work with. It should be able to quickly track the transaction history of any e-shop, set up the required parameters, find measurements and sequences of actions, compute performance indexes of actions or of the entire application, reveal the results of measurements under various load conditions, and compare them with previous operation periods. The architecture must be easily expandable to allow data imports from various types of web servers, so that whenever web sites that run on new web server architectures must be analyzed, their respective log files and any e-shop application databases can be adapted and evaluated relatively fast. To be usable, the LFA must contain a tool that loads entire log files or log file deltas as easily as possible and displays the e-shop’s latest status for evaluation at any time. The use case diagram in Fig. 2 describes the system requirements for the LFA visually. The goal of such a system is simple: support the easy loading of log files into an environment that allows the generation of statistics, reports, and visualizations.
Fig. 2 Use case diagram of the LFA
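Loading a log file delta, as opposed to the entire file, can be sketched as follows. The approach shown here (remembering a byte offset between loads) is one possible technique, assumed for illustration; the function name read_delta is hypothetical, and rotated files are not handled.

```python
import os
import tempfile


def read_delta(path, offset):
    """Return the complete lines appended to path since byte offset.

    The second return value is the offset to store for the next call.
    A trailing line without a newline is left for the next call, because
    the web server may still be writing it.
    """
    with open(path, "rb") as log:
        log.seek(offset)
        data = log.read()
    end = data.rfind(b"\n") + 1          # 0 if no complete line was found
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


# Demonstration on a temporary file standing in for an access log.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "access_log.txt")
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write("line 1\nline 2\n")
    first, offset = read_delta(path, 0)          # initial load
    with open(path, "a", encoding="utf-8", newline="\n") as f:
        f.write("line 3\npartial")
    second, offset = read_delta(path, offset)    # delta load

print(first, second, offset)
```

Only the stored offset has to survive between two loads, so a delta load costs time proportional to the new lines rather than to the size of the whole file.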
Nonfunctional Requirements
• The LFA system must respond quickly, without long waiting times. This point is where sophisticated approaches and techniques need to be applied, since log files are usually very large ASCII files containing lots of details. They are often located in remote web servers and they must be extracted and loaded through DSL connections that might be slow.
• The LFA should scale well because data volumes grow fast as time passes.
• The storage capacity should be unrestricted. Shifting from a local file system to a Hadoop file system cluster could become necessary and should be pre-planned to become feasible.
• The LFA must be reliable, and the data should be safe.
• The software should easily adapt to new web server types and versions since any platform may be used with web applications.
• It should be able to run on multiple operating environments and its portability should be guaranteed.
• It must be easy to set up, manage, and use to be effective.
Functional Requirements
• Each log file processed should be reloadable and easy to delete from the database of the analyzer.
• The parsing should be done using a regular expression that corresponds to the pattern of the AccessLogValve entry.
• Authentication is required to maintain secure access to any remote system involved that hosts the web application, the configuration files, and the log file under scrutiny, as well as access to read and download data from its operational database.
• The LFA must adapt to the standards of the web servers it is checking.
• Graphics and reports should be generated on demand. Visualization and reporting are more efficient if they are based on a well-designed standard commercial report generator application. This simplifies the ad hoc and routine reporting procedure.
• The database should accommodate historical data to support year, month, and day value comparisons and data mining.
The application must be easily expandable, since web servers evolve and support for different types of web servers would be an asset. Even if the LFA is used on only one web application, it should support new versions, if possible with minor software modifications.
Figure 3 shows the main software components of the application for log file analysis on demand. On-demand analyzers have no automatic data feeds and receive and load data only when their administrator demands input. The administrator runs the log file loader module on demand; the system parses the selected log file data, loads it into the database, and is then ready to produce visualizations. The basic characteristic of the on-demand analyzer is that the administrator is responsible for loading log file data from the web server. Even if the time span used for selecting lines from the log file includes the current minute, the data loaded, and hence the metrics generated, refer to the past. Near real-time analyzers, on the other hand, automatically keep loading log file entries soon after they are generated, without any need for human interaction.
Fig. 3 Main components of the extended log file analyzer on demand (Boucouvalas and Aivalis 2010)
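The functional requirement that the parsing regular expression correspond to the AccessLogValve pattern can be sketched as a small translator. This is an illustrative sketch, not the implementation of the analyzer described in the chapter: it covers only the directives of the common pattern, the group names are assumptions, and the sample line is made up.

```python
import re

# Regular-expression fragment for each directive of the common pattern.
# Only the directives named in the text are covered; anything else is
# rejected rather than guessed.
DIRECTIVES = {
    "%h": r"(?P<host>\S+)",
    "%l": r"(?P<logname>\S+)",
    "%u": r"(?P<user>\S+)",
    "%t": r"\[(?P<time>[^\]]+)\]",
    "%r": r'(?P<request>[^"]*)',
    "%s": r"(?P<status>\d{3})",
    "%b": r"(?P<bytes>\d+|-)",
}


def pattern_to_regex(pattern):
    """Translate an access log pattern into a compiled regular expression."""
    parts = []
    for token in re.split(r"(%[a-zA-Z])", pattern):
        if token in DIRECTIVES:
            parts.append(DIRECTIVES[token])
        elif token.startswith("%"):
            raise ValueError(f"unsupported directive: {token}")
        else:
            parts.append(re.escape(token))   # literal text such as spaces or quotes
    return re.compile("^" + "".join(parts) + "$")


COMMON = pattern_to_regex('%h %l %u %t "%r" %s %b')

# Made-up line in the common format.
sample = '192.0.2.10 - - [02/Mar/2021:10:15:30 +0200] "GET /index.html HTTP/1.1" 200 5120'
fields = COMMON.match(sample).groupdict()
print(fields)
```

Because the expression is derived from the pattern string, a change to the valve pattern only requires a new entry in the directive table, not a hand-written expression.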
Database Design
A log file analyzer uses XML files for storing settings and a relational database management system for storing the parsed data that are transferred and loaded from each log file and each web site’s operational database. The database must be simple and portable, and it suffices if it consists of the following tables:
• Servers table: Stores information about the standard locations of the different web servers supported (log file directory, configuration file names, and locations).
• Site table: Stores the specific details of the various web applications the analyzer works for, including the type of the server and all details needed to handle the generated log file. This table also contains the URL and IP port, the database type, and the usernames and passwords needed to access the server configuration file and the database in order to read the product categories, product table and codes, as well as orders and customers, from the remote web site database.
• Product categories for every e-commerce site.
• Products for every e-commerce site.
• Orders for every e-commerce site.
• Order statuses for every e-commerce site.
• Customer details of every e-commerce site.
• Business actions (like “add to cart” or “pay”) for every e-commerce site framework implementation. These are matched against the log file request strings.
• Transactions table: The main and largest table, into which every line of each loaded log file is parsed after its extraction.
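A possible shape of such a database is sketched below with SQLite. All table and column names are assumptions made for illustration, only a subset of the listed tables is shown, and the credential columns of the site table are deliberately left out.

```python
import sqlite3

# Table and column names are illustrative; only a subset of the tables
# listed above is shown.
SCHEMA = """
CREATE TABLE servers (
    server_type   TEXT PRIMARY KEY,
    log_directory TEXT,
    config_file   TEXT
);
CREATE TABLE sites (
    site_id     INTEGER PRIMARY KEY,
    server_type TEXT REFERENCES servers(server_type),
    url         TEXT,
    port        INTEGER,
    db_type     TEXT
);
CREATE TABLE products (
    site_id     INTEGER REFERENCES sites(site_id),
    product_id  INTEGER,
    description TEXT,
    price       REAL,
    PRIMARY KEY (site_id, product_id)
);
CREATE TABLE business_actions (
    site_id         INTEGER REFERENCES sites(site_id),
    request_pattern TEXT,
    action_name     TEXT
);
CREATE TABLE transactions (
    transaction_id INTEGER PRIMARY KEY,
    site_id        INTEGER REFERENCES sites(site_id),
    logged_at      TEXT,
    host           TEXT,
    request        TEXT,
    status         INTEGER,
    bytes_sent     INTEGER
);
"""

analyzer_db = sqlite3.connect(":memory:")
analyzer_db.executescript(SCHEMA)

table_names = [
    row[0]
    for row in analyzer_db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(table_names)
```

Keying every business table by site_id is what lets a single analyzer database serve several e-commerce sites.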
Input-Process-Output Outline
The process, key inputs, and resulting outputs of a log file analyzer are outlined below.
Inputs:
• Settings of server configurations
• E-commerce site detail characteristics and data
• Log file deltas
• Data from the operational database of the e-commerce site, like products, services, categories, customers, and orders
Process:
• Preprocessing of the log file and transfer into the transactions table
• Analysis of the log file contents
• Statistical information query processing
• Preparation of data for chart and graph generation
• Clustering visitor algorithms
• Data mining procedures
• Analysis of robots and spider visits
Outputs:
• Graphical reports in multiple formats
• Printouts of detailed and summarized results
• Visualizations of data
• Session or customer behavior business graph
Types of Log File Analyzers
In data warehousing (DW), data coming from multiple distributed and heterogeneous storage systems are integrated in a central repository (Kozielski and Wrembel 2009). E-commerce log file analyzers are specialized forms of data warehouses. This section describes their various architectures and types.
Standard Analyzer
Standard log file analyzers, as shown in Fig. 4, are designed like typical ETL (extract-transform-load) applications. They operate on the access log file of the web application only and provide a base for calculating metrics and generating visualizations and statistical calculations.
Fig. 4 Traditional log file analyzer dataflow diagram
This architecture is simple and easy to implement, but it has limitations. A major shortcoming is that it cannot cross-correlate the encoded data that appear in the Uniform Resource Locator (URL) requests of the e-commerce site, which are stored in the log file and used by the visualization system, with product and action descriptions, and therefore cannot decode them.
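The extract-transform-load flow of a standard analyzer can be sketched as follows. The sketch is an assumption-laden miniature: the log lines are made up, the table name hits and its columns are hypothetical, and an in-memory SQLite database stands in for the analyzer database.

```python
import re
import sqlite3

LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-)$'
)

# Made-up log lines in the common format, standing in for an extracted file.
raw_lines = [
    '192.0.2.10 - - [02/Mar/2021:10:15:30 +0200] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.11 - - [02/Mar/2021:10:15:31 +0200] "GET /cart.do HTTP/1.1" 200 2048',
    '192.0.2.10 - - [02/Mar/2021:10:16:02 +0200] "GET /index.html HTTP/1.1" 304 -',
    'not a log line',
]


def transform(line):
    """Turn one raw line into a row tuple, or None if it does not parse."""
    m = LINE.match(line)
    if m is None:
        return None
    size = 0 if m["bytes"] == "-" else int(m["bytes"])
    return (m["host"], m["time"], m["url"], int(m["status"]), size)


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE hits (host TEXT, time TEXT, url TEXT, status INTEGER, bytes INTEGER)"
)
rows = [row for row in map(transform, raw_lines) if row is not None]  # extract + transform
db.executemany("INSERT INTO hits VALUES (?, ?, ?, ?, ?)", rows)        # load

# A typical metric: hits and transferred bytes per requested URL.
hits_per_url = db.execute(
    "SELECT url, COUNT(*), SUM(bytes) FROM hits GROUP BY url ORDER BY url"
).fetchall()
print(hits_per_url)
```

The URL remains an opaque string throughout, which is exactly the limitation of this architecture: nothing in the flow knows what a request such as /cart.do means in business terms.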
Extended Log File Analyzer
A better approach would be an analytics application for e-commerce sites, with the following additional capabilities:
• Cross-correlation of the encoded information collected from the requests registered in the log file with product item and group details, like descriptions, prices, customer details, and so on. This feature makes all produced visualizations, reports, and graphs easier to comprehend, without any need to consult external references manually.
• Data are loaded from the e-commerce database into the analyzer database, either on demand or automatically, whenever unknown product groups and items appear in the log file. Massive inventory loads are thus avoided during daily operation of the e-shop: whenever a new product or category is inserted into the e-shop database, or a price is changed, the update also triggers an update of the analyzer database, resulting in a self-updating system.
• Additionally, the LFA can customize the configuration of the web server hosting the e-commerce application, to adapt the access log file information to the specific needs of every e-commerce site. The log file parsing mechanism also makes use of the configuration to extract information.
The modules and components of this type of analyzer are presented in Fig. 3. It shows the two log file loading tools, the rotatable and the single-file reader, which are also used in the standard analyzer; the e-commerce product or services data reader; the web server customizer and configurator, which defines the preferences of the operator about the dynamic generation and form of the log file produced and parsed; and, finally, the two visualization tools: one for statistics and one for various built-in reports. The dataflow of this extended LFA is shown in Fig. 5.
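The automatic loading of unknown items can be sketched as a load-on-first-encounter lookup. The dictionaries below are hypothetical stand-ins for the e-shop product table and the analyzer’s local copy, and the sketch covers only the first-encounter path, not the propagation of later price or description changes.

```python
# Hypothetical stand-ins: shop_products plays the remote e-shop table,
# analyzer_products the analyzer's local copy.
shop_products = {8: "Hiking map", 9: "City guide"}
analyzer_products = {}
remote_reads = []          # records each read from the e-shop, for illustration


def fetch_from_shop(product_id):
    """Read one product from the e-shop database (simulated by a dict)."""
    remote_reads.append(product_id)
    return shop_products.get(product_id)


def describe(product_id):
    """Return a product description, loading it on first encounter only."""
    if product_id not in analyzer_products:
        analyzer_products[product_id] = fetch_from_shop(product_id)
    return analyzer_products[product_id]


# Product ids as they might appear in successive log file lines.
for pid in [8, 8, 9, 8]:
    describe(pid)

print(remote_reads)
```

Four log lines cause only two reads from the e-shop database, one per product that the analyzer had not seen before.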
Hybrid Analyzer
As pointed out, web traffic and visitor interaction of web sites are mainly measured in two ways (Clifton 2008): by analysis of log files, which contain very detailed information about each request and click and from which the data must be extracted and carefully selected, and by page tagging. Page tagging requires an extra web server. The web pages include a “tag,” which is JavaScript code that forces the visitor’s browser to automatically visit this extra web server. This server then collects the log data generated by this visit and stores it in a specific database for each site, based on the account number sent in the tag. Log file analysis is a precise methodology but has two main disadvantages. The first is proxy and caching inaccuracies: if a page is cached, no record is logged on the web server because no web server request is needed. The second is the lack of event tracking when JavaScript or Flash code is executed. Tagging systems, on the other hand, do not have these disadvantages, but firewalls can mangle or restrict tags. They cannot track bandwidth or completed downloads, because tags are set when a page or file is requested and not when its download is complete. They also cannot track search engine spiders, since robots ignore page tags. According to Clifton (2008), only hybrid solutions can provide a complete analysis of web site visitor behavior. Because of their complexity, only a small number of vendors can offer a hosted hybrid solution.
Fig. 5 Dataflow diagram of extended LFA version
Still, it is worth going the extra mile and creating a hybrid system, since all disadvantages of the two methods can be eliminated and, additionally, one system can always back the other up in case of a technical problem. Any e-commerce system can easily be configured to use an external page tagging analytics application, like Google Analytics or Matomo. The tag is just a specific JavaScript function, which is inserted either manually or programmatically in every web page that must be traced and is called when the page is loaded in the client’s browser. These calls send a specific code ID to the Tagging System Operator (TSO), and the TSO registers them in a log file accessible to the operator of the e-commerce system (Aivalis and Boucouvalas 2014b). To support tagging, the LFA makes use of the associated API. The Google API (Fig. 6), for example, allows the information collected by the service provider to be downloaded; this information enhances the data of the log file and substitutes or enriches, to some extent, data that may be missing. The log file analysis and the tagging system back each other up and provide a safer overall system.
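How the two sources can back each other up is sketched below for page-view counts. The numbers and page names are made up, and the merge rule (keep the higher count per page) is an assumption chosen for this illustration, not a rule taken from the cited work.

```python
from collections import Counter

# Made-up page-view counts from the two sources for the same period.
log_file_hits = Counter({"/index.html": 120, "/cart.do": 40, "/robots.txt": 15})
tag_hits = Counter({"/index.html": 150, "/cart.do": 40, "/app#checkout": 25})


def reconcile(log_hits, tags):
    """Combine both sources, keeping the higher count for each page."""
    combined = {}
    for page in sorted(set(log_hits) | set(tags)):
        from_log = log_hits.get(page, 0)
        from_tag = tags.get(page, 0)
        combined[page] = {"log": from_log, "tag": from_tag, "best": max(from_log, from_tag)}
    return combined


report = reconcile(log_file_hits, tag_hits)
log_only = [p for p, c in report.items() if c["tag"] == 0]   # e.g. spider traffic
tag_only = [p for p, c in report.items() if c["log"] == 0]   # e.g. client-side events
print(report)
```

Pages seen only in the log file hint at robots, which ignore tags, while pages seen only by the tagging system hint at cached pages or client-side events that never reached the web server.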
Fig. 6 Hybrid log file analyzer
Hybrid Analyzer with Real-Time Extensions
The evolution from Web 1.0 to the social media aware Web 2.0 and Web 3.0, together with the need to introduce rich Internet applications (RIA) that offer desktop-quality user interfaces with crisp response and behavior on all browsers, has changed the importance and the use of the traditional access log file. Because a very large portion of the user interaction software has moved to the browser of the visitor, RIAs produce much smaller and less detailed log files than traditional web applications. This is due to the heavier use of the client browser’s JavaScript interpreter, AJAX, and asynchronous remote procedure calls (RPC), and it makes pure log file-based analyzers less useful. The communication between the client and the server becomes different, and the interface of the application becomes very similar to what the user is used to in a modern desktop application environment. The following snippet is an extract from a Tomcat access log file, which shows the very sparse nature of the log file resulting from using the Google Web Toolkit (GWT) framework:
…
… "GET /LogDB/ HTTP/1.1" 200 /LogDB/ 117
… "GET /LogDB/LogDB.css HTTP/1.1" 200 /LogDB/LogDB.css 19
… "GET /LogDB/com.art.logdb.LogDB/com.art.logdb.LogDB.nocache.js HTTP/1.1" 200 44
… "GET /LogDB/com.art.logdb.LogDB/1D630481E14BBD07DE7ED3D963A012CE.cache.html HTTP/1.1" 200 23
… "POST /LogDB/com.art.logdb.LogDB/DBActions HTTP/1.1" 200 450
… "POST /LogDB/com.art.logdb.LogDB/DBActions HTTP/1.1" 200 371
… "POST /LogDB/com.art.logdb.LogDB/DBActions HTTP/1.1" 200 449
… "POST /LogDB/com.art.logdb.LogDB/DBActions HTTP/1.1" 200 378
… "POST /LogDB/com.art.logdb.LogDB/DBActions HTTP/1.1" 200 885
…
Textbox 4 Extract from a GWT Application Access Log File
Although this is a very short portion of the access log file, containing just a few lines, it was produced after relatively heavy interaction with the system:
1. The user logged into the application.
2. The user made four menu selections that loaded four different web forms, filled with data from the database, into the browser.
The details of the interaction are missing here, unlike in the very verbose log file of the sample Struts 2 application of Textbox 2, where similar actions would have produced a few hundred lines. In this example all small details and clicks are handled by the client’s browser and are not requested from the web server directly, so they do not appear in the access log file. The visible GET lines are generated when loading the CSS, the JavaScript, and the HTML file. The POST requests are generated by interaction with the “DBActions” Java class running on the server, where the user logs into the system using an email and a password. After a successful log-in, the user runs a few search queries and transactions, none of which results in the familiar detailed information that would be found in the access log file if the web application were written using traditional frameworks.
Filling the information gaps of a sparse log file is not an issue for the tagging component of a hybrid analyzer, since tags do not depend on the framework technology used for developing the interface of an e-commerce site in order to register detailed actions, as long as each page correctly sends the tag it is supposed to send. Actions are registered by sending and collecting the tag for every web page request. Each tag sent is logged by the operator’s server into a very traditional log file, and log file-based analyzers used by the tag-based operator provide the metrics. To keep the hybrid character of the analyzer and make it capable of operating with RIA web applications, near real-time extensions can be applied. Near real-time extensions not only solve the problems that are expected to occur whenever rich Internet applications replace traditional frameworks for web applications, but also provide metrics and information in near real time to the operating personnel of the e-commerce site.
Fig. 7 Main components of the real-time analyzer
Figure 7 shows the components of the near real-time analyzer (NRTA) web application. They provide an alternative input source for the log file entries, plus a technique for receiving business data, like any additional event information, right after it is generated. The log file entries are continuously queued, as soon as they are generated by the web server, by a background process (daemon) that runs on the web server hosting the e-commerce site and feeds the queue whenever new lines are appended to the log file. This data is dequeued by the real-time analyzer’s Web Reader application, which sends the lines of the log file to the analyzer. These lines are parsed by the log file parser and loaded into the analyzer database. In addition to the two log file readers used in the on-demand analyzer in Fig. 3, the log file queue reader of Fig. 7 is the third log file input method. This reader constantly provides fresh, near real-time information to the LFA and eliminates the need to load data on demand and selectively. The access log file produced by RIA applications becomes so sparse that, unlike the log file of traditional applications, it cannot provide sufficient data to accurately reflect the events needed for gaining insight. A web application can be modified to produce the necessary information. This approach can be expensive and is inconvenient for public domain applications that are updated often, since the changes have to be made for every new version. Most of the necessary data missing from the log file can be found in the database. For example, whenever an order is placed, a new row is added to the orders database table. Contemporary database management systems support trigger mechanisms. Triggers allow SQL statements to run automatically before and/or after rows are inserted, updated, or deleted. New database tables can be used to store collections
of data describing every order placement. For example, the orders table includes the orders_id, customer_id, and customers_name. The orders_products table contains rows for all products of the order with the current id, including price and quantity. After the application inserts the order into the orders table, a trigger may select all information, package it together, and insert it into an additional specialized table, created for gathering information that a daemon process polls for the NRT analyzer. Message broker software can be used to support reliable, safe, and fast streaming of all necessary data from the web application to the NRT analyzer. The source code of the web application under scrutiny could be modified to prepare the necessary information for the analyzer. Modifying the web application can be expensive and often may not be feasible. The necessary data can instead be generated by inserting triggers into the site database, with no need to modify the application. Thus, the components used for input are:
• A data extractor from the operational e-commerce site database.
• A log file queue processing tool, which dequeues the log file data received and parses it using the same algorithms as the on-demand analyzer. The data ends up in the transaction database and becomes immediately available for the analyzer.
• A further data queue for every metrics presentation tool of the NRTA and every instrument selected to display measurements. The information originates from the e-shop database. For example, every time a sale is completed, or a user has logged into the system, the generated information is pushed into the queue, and the appropriate presentation tool is refreshed in near real time.
The processing is targeted at quickly producing graphical information and results for the user. In addition, action is taken whenever a user logs in to the system, in order to identify the session statistics and to lift anonymity. Demographic information about the user, as well as her shopping habits and behavior before logging in, become known, since the entire session corresponds to a known customer. Any event can be tracked down this way. The following metrics can be visualized in near real time:
• Number of current users
• Real-time turnover meter
• Throughput meter
• Statistical information about origin of visitor, browser, operating system, etc.
The real-time extensions can be used both for the traditionally designed web applications, just in order to receive real-time metrics, and for RIA web applications. In RIA applications the extensions complete the missing log file entries. Figure 8
Fig. 8 Near real-time sequence diagram (Aivalis and Boucouvalas 2014a)
shows the sequence of the steps involved in queuing and dequeuing the order data described above. Near real-time extensions (NRTE) are used to produce “live” metrics of the operational aspects of the site under scrutiny. They rely on automated extraction of data from the site and automated loading into the database of the LFA. The data is processed upon arrival, and the metrics are made available to the operating personnel of the site via a separate, special web application. Metrics can cover any aspect of the operation, including the turnover of the e-shop and the number of visitors analyzed by category, speed, geolocation, etc., as well as the load of the system, its throughput, and its responsiveness. The implementation is based on queuing systems, and the architecture and distribution are completely flexible in terms of the location of the sensors (data producers) and the location of the consumers. The overall system is designed to be light to operate and does not require a lot of server-side resources (Vassiliadis and Simitzis 2009). The principle behind the NRTE implementation is the introduction of a set of daemon processes, each of which tracks events occurring in the e-shop operation. For example, when a purchase is concluded, or when a prospective customer adds an item to the cart, the appropriate data is collected by the producer daemon, enqueued in a named queue as a data package, and marked as sent. Whenever the client runs the corresponding consumer daemon, the data is dequeued, moved to the analyzer database, and finally visualized whenever needed, or used in any other manner according to its content. There may be a short delay of a few minutes if the consumer has not been running for a while and the enqueued datasets are large, but since the packages are designed
to be small, this delay may be negligible. The data sources operate fast, and adding metrics for any new need becomes a simple task. Near real-time metrics can include:
• Overall speed measured in bytes per second
• Number of active visitors
• Requested items per visitor and overall
• Orders completed
• Customer behavioral graph
• Number of logged in customers
• Turnover or profit per hour, day
• Internet bot counter
Whether the application under scrutiny generates a traditional, detailed, full log file or a sparse one, a specific daemon places every single line of the log file in a named queue, in addition to the metrics mentioned above, so that the information is sent to a consumer that parses it straight into the analyzer transaction database. This can be considered the third tool for loading the parsed contents of the access log file into the analyzer, next to the rotatable and the nonrotatable access log file readers presented in Fig. 2, which operate on demand. The queued log file daemon automatically transfers the contents of the log file in near real time and stores the parsed data in the transactions database table. An arbitrary number of queues, with their corresponding producer and consumer daemon applications, can be implemented to support the needs and requirements of every e-shop operator.
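The producer and consumer daemons can be sketched with two cooperating routines and a queue. The sketch is a simplification under stated assumptions: an in-process queue from the Python standard library stands in for the named queue of a message broker, the log lines are made up, and splitting on whitespace stands in for the real parser.

```python
import queue
import threading

SENTINEL = None   # signals the consumer that the producer has finished


def producer(lines, q):
    """Enqueue every log line, as the daemon on the web server would."""
    for line in lines:
        q.put(line)
    q.put(SENTINEL)


def consumer(q, sink):
    """Dequeue lines and hand them to the analyzer side (a list here)."""
    while True:
        line = q.get()
        if line is SENTINEL:
            break
        sink.append(line.split())   # stand-in for the real parser


# Made-up, shortened log lines appended since the last run.
new_lines = [
    "GET /index.html 200",
    "POST /order.do 200",
    "GET /missing 404",
]
log_queue = queue.Queue()
transactions = []   # stand-in for the transactions table

worker = threading.Thread(target=consumer, args=(log_queue, transactions))
worker.start()
producer(new_lines, log_queue)
worker.join()

print(transactions)
```

Because producer and consumer only share the queue, they can run on different machines once the in-process queue is replaced by a broker, which is the flexibility in sensor and consumer location described above.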
NRT Analyzer with Social Media Integration
A more advanced form of the analyzer application, with most of its add-ons, is presented in Fig. 9, and a summary of its dataflow diagram in Fig. 10. The e-shop service requests coming from the customers (clients) are not shown in Fig. 9. While the requests are registered in the log file, the file is polled by the producer daemon. The data is queued and sent to the consumer daemon, which parses it and inserts it into the analyzer. The web server of the analyzer application makes the data available to any authorized client through the World Wide Web.
Fig. 9 Social media aware near real-time analyzer (Aivalis and Boucouvalas 2014a)
The turnover feed shown in Fig. 8, for example, is activated whenever an order is placed by a customer. An after-insert trigger is placed on the orders table. This trigger copies the order ID, an auto-number unique key, to a database table that was added to the e-shop database. The queue producer polls this table periodically for pending records to send. If pending entries are found, all required additional information is gathered, based on the order number, with the necessary SQL operations on other e-shop database tables and from the appropriate log file data, and the resulting data structure is enqueued. The queue consumer on the LFA server side dequeues the information, transforms it, and loads it into the LFA database. From there the order data is made available to the web application of the analytics server (Aivalis and Boucouvalas 2014a). The web application, running on the web server of the analyzer, must contain several visualization tools, one for every producer-consumer daemon pair, each pair acting as a specialized “sensor” that transmits information from the web server in near real time. It must also provide general statistics and graphs generated by comparing historical data.
The “social media awareness” of the analyzer can be achieved through the Social Media Processor (SMP), a publisher application that can be embedded in the analyzer (Fig. 10) and can automatically post prewritten messages to selected social media applications (Facebook, Twitter, etc.) at specified times, either randomly or following the list created by the administrators. The administrator can define what content is posted, where, and when. The SMP generates the postings and places them according to schedule. All social media offer application programming interfaces that make it possible to measure social feedback directly from the applications used.
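The turnover feed described above (an after-insert trigger filling a table that the queue producer polls) can be sketched with SQLite. The column names orders_id, customer_id, and customers_name and the orders_products table follow the text; the pending_orders table, the trigger name, the product column, the sample values, and the function poll_pending are assumptions made for this illustration.

```python
import sqlite3

shop = sqlite3.connect(":memory:")
shop.executescript("""
CREATE TABLE orders (
    orders_id      INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_id    INTEGER,
    customers_name TEXT
);
CREATE TABLE orders_products (
    orders_id INTEGER, product TEXT, price REAL, quantity INTEGER
);
-- Additional table polled by the queue producer (name is hypothetical).
CREATE TABLE pending_orders (orders_id INTEGER PRIMARY KEY, sent INTEGER DEFAULT 0);

CREATE TRIGGER orders_after_insert AFTER INSERT ON orders
BEGIN
    INSERT INTO pending_orders (orders_id) VALUES (NEW.orders_id);
END;
""")

# The unmodified application places an order.
cur = shop.execute(
    "INSERT INTO orders (customer_id, customers_name) VALUES (7, 'A. Customer')"
)
shop.execute(
    "INSERT INTO orders_products VALUES (?, 'Hiking map', 12.5, 2)", (cur.lastrowid,)
)


def poll_pending(db):
    """Gather the data of every unsent order and mark it as sent."""
    packages = []
    pending = db.execute(
        "SELECT orders_id FROM pending_orders WHERE sent = 0"
    ).fetchall()
    for (order_id,) in pending:
        name = db.execute(
            "SELECT customers_name FROM orders WHERE orders_id = ?", (order_id,)
        ).fetchone()[0]
        total = db.execute(
            "SELECT SUM(price * quantity) FROM orders_products WHERE orders_id = ?",
            (order_id,),
        ).fetchone()[0]
        packages.append({"orders_id": order_id, "customer": name, "turnover": total})
        db.execute("UPDATE pending_orders SET sent = 1 WHERE orders_id = ?", (order_id,))
    return packages


first_poll = poll_pending(shop)    # packages ready to be enqueued
second_poll = poll_pending(shop)   # nothing left to send
print(first_poll, second_poll)
```

The trigger only records the key, so the insert performed by the e-shop stays cheap; the more expensive gathering of order details happens later, in the polling producer.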
Fig. 10 Dataflow of the social media aware NRTA (Aivalis and Boucouvalas 2014a)
The selected media can also regularly be measured to evaluate their impact in terms of visits or sales through the access log file, as referrers.
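Measuring social media impact through referrers can be sketched as follows, assuming an access log written with the combined pattern so that the referrer header is available. The list of host names counted as social media and the sample referrer values are assumptions of this sketch.

```python
from collections import Counter
from urllib.parse import urlsplit

# Host names counted as social media; the list is an assumption of this sketch.
SOCIAL_HOSTS = {
    "facebook.com": "Facebook",
    "twitter.com": "Twitter",
    "t.co": "Twitter",
}


def social_source(referrer):
    """Return the social medium a referrer URL belongs to, or None."""
    host = (urlsplit(referrer).hostname or "").lower()
    for domain, name in SOCIAL_HOSTS.items():
        if host == domain or host.endswith("." + domain):
            return name
    return None


# Made-up referrer values as they would appear in a combined-pattern log.
referrers = [
    "https://www.facebook.com/",
    "https://t.co/abc123",
    "https://www.example.org/blog",
    "-",
    "https://m.facebook.com/story",
]

visits_by_medium = Counter(
    source for source in map(social_source, referrers) if source is not None
)
print(visits_by_medium)
```

Joining these counts with the orders placed in the same sessions would extend the measurement from visits to sales.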
Conclusions Log files are important for documenting operation and functionality details of hardware and software applications. Their configuration and usage were presented in this chapter, along with specifications for designing software log file analyzers. Since the detail of the captured information seems to decline for certain types of rich Internet applications, due to the shift of their operational model that is shifted toward the client, four architectures of log file analyzers were presented to cover the needs of different types of systems. These architectures can be used in any combination and evolve, allowing the integration of various external data types to enhance the information in near real time also including social media. Applications need to evolve fast enough to accurately adapt to the evolving environment they operate in. Currently social media extensions and the interconnections of queues generated and consumed by daemon processes allow analyzers to support and provide any data feed from any client-server application in near real time if
the appropriate application programming interface exists. This capability allows an e-commerce site operator to gain an immediate global view of the feedback and a full perspective of the events that take place on the e-commerce site and the social media. A next step could be an analytics system that collects data not only from the e-commerce site but also from the physical shopping floor of the company. This can mainly be achieved with low-cost mobile technologies, and all interactions can be forwarded for processing by a customized, sophisticated e-commerce analyzer. This analyzer can merge the two worlds of physical and virtual shopping into one. The information can be collected automatically during the customer’s interaction in the physical store with the support of iBeacons and Near Field Communication, used for data acquisition on the retail shop floor. The only prerequisite is that the customer uses a specialized app on a mobile device. The generated retail data can be pushed to the analyzer in near real time. This data then enriches the analyzer input with physical store data and can be used to improve the customer experience and to provide customer clustering and precise behavioral analysis. By extending the operational data sets collected by a web analytics application, it provides a better overview and supports decision-making for the entire corporation (Aivalis and Gatziolis 2016). The sources, the format, and the characteristics of data are in permanent evolution. The collected data are used as a base for models and algorithms that collect, preprocess, analyze, and evaluate data, drawing on various fields such as statistics, system theory, machine learning, pattern recognition, and computational intelligence (Aivalis et al. 2016). Log file analysis depends heavily on the technologies and frameworks used to implement the web application under scrutiny.
Very large heavily visited sites generate very large sets of data. The selected and preprocessed data collection can be used as the basis for training a neural network for predictive analysis. The model can also analyze customer reviews, classifying them as positive, negative, or neutral, analyze if customers have read them before purchasing items, and measure their impact, cluster products by popularity, and provide more accurate recommendations.
References
Aivalis CJ, Boucouvalas AC (2014a) Future proof analytics techniques for web 2.0 applications, TEMU 2014. IEEE
Aivalis CJ, Boucouvalas AC (2014b) Near real time support techniques for web 2.0 applications, TEMU 2014. IEEE
Aivalis CJ, Gatziolis K (2016) Integrating retail and e-commerce using web analytics and intelligent sensors. In: E-business and telecommunications: 12th international joint conference, ICETE 2015, Colmar, 20–22 July 2015, Revised Selected Papers. Springer International Publishing. ISBN: 978-3-319-30222-5
Aivalis CJ, Boucouvalas AC, Gatziolis K (2016) Evolving analytics for e-commerce applications, TEMU 2016. IEEE
Boucouvalas AC, Aivalis CJ (2010) An E-shop log file analysis toolbox, CSNDP 2010 Newcastle IEEE
Clifton B (2008) Web traffic data sources & vendor comparison, Omega digital media, http://www.advanced-web-metrics.com/docs/web-data-sources.pdf. Retrieved: May 2011
Kozielski S, Wrembel R (eds) (2009) New trends in data warehousing and data analysis. Springer, New York, p v
Vassiliadis P, Simitzis A (2009) Near real time ETL. In: Kozielski S, Wrembel R (eds) New trends in data warehousing and data analysis. Springer, New York, pp 25–28
Eye-Tracking Technology for Measuring Banner Advertising Efficacy on E-Tourism Websites: A Methodological Proposal
29
Francisco Muñoz-Leiva
Contents
Introduction: Eye-Tracking Concept and Typologies
The Eye-Tracking Technology
  A Brief Historical Outline
  How Eye-Tracking Methodologies Operate in the Study of Consumer Behavior
  Critical Issues and Possible Applications of Eye-Tracking Technology in the Marketing Sphere
Banner Advertising on E-Tourism Websites
Theoretical Background: Determining Factors of Banner Advertising
  Causes and Consequences of Banner Blindness
  Determining Factors of Attention and Banner Recall
Practical Application: Research Questions, Methodology, Results, and Analysis
  Research Questions
  Methodology
  Analysis of Results
Final Discussion, Implications, Limitations, and Future Research
  Conclusions
  Implications and Recommendations
  Limitations and Future Research
Cross-References
References
Abstract
In this chapter, we focus on the case of e-tourism websites (also known as Travel 2.0 tools or T2T, such as travel blogs, profiles on social networks, and online travel communities) and measure the effectiveness of advertising banners
F. Muñoz-Leiva, Department of Marketing and Market Research, Universidad de Granada, Granada, Spain; Sport and Health University Research Institute (iMUDS), Universidad de Granada, Granada, Spain
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_121
on these sites. To do this, we propose a methodological approach known as VIROG, video-based infrared oculography, or infrared eye-tracking technology. This technique is used here in a practical application to track subjects’ visual attention when exposed to three different e-tourism websites. To date, marketing scholars have paid very little attention to analyzing the moderating effect of customer engagement (CE) or ad type when measuring visual attention and recall for a certain banner. To achieve the objectives of the practical application, a within-subject and between-group experimental design was applied to a sample of participants, using their eye-tracking data in addition to a self-administered questionnaire. Based on variables of visual attention and cognitive processing, the results of this study reveal the following: (i) no banner blindness was identified for any of the e-tourism websites under analysis, and the greatest advertising efficacy of all sites tested was achieved by Facebook; (ii) post hoc measurements of recall were explained by other gaze metrics related to advertising effectiveness (such as the number of fixations or the duration of the visit); and (iii) the level of CE and the degree of animation of the particular banner exerted a moderating effect on the relationship between attention and recall or memory.
Keywords
Banner effectiveness · Travel 2.0 tools · Eye tracking · Customer engagement · Banner type
Introduction: Eye-Tracking Concept and Typologies
Starting from the premise that companies want to know (a) how consumers process their commercial messages and (b) how to make their advertising strategies more effective (Varadarajan and Yadav 2009), most of the self-reporting tools that measure the level of attention consumers pay to a set of stimuli prove problematic or limited in scope. Over the last decade, several techniques or methodologies based on cognitive neuroscience and psychology have been used to address these shortcomings, such as eye tracking (e.g., Drèze and Hussherr 2003) or biofeedback monitoring and facial coding (Hill 2003). These tools can be used to gauge marketing efforts by measuring nonverbal responses of subjects. From this “neuromarketing” approach, the application of these methodologies opens an infinite number of possibilities for studying the attention consumers pay to marketing communications in general and online advertising in particular.
Eye tracking is a technique that records the user’s eye movements while viewing a given scene or image (Ehmke and Wilson 2007; Hassan Montero and Herrero Solana 2007). Basically, there are two types of eye-tracker devices: (1) those fixed on the participant’s head (mounted on a helmet or glasses) and (2) those
Fig. 1 (a) Eye-tracking device in an LCD monitor. (b) Eye-tracking device in a smartphone. (c) Eye-tracking glasses
that record the ocular movement remotely (using a camera placed underneath the computer screen). The former are particularly useful for tasks requiring more freedom of movement (Goldberg and Wichansky 2003). The latest devices feature a miniature camera set into an LCD monitor, which enables participants to move their head slightly during the experiment, or into specially adapted glasses, which enable full movement. Figure 1 shows the eye-tracking device mounted on an LCD, a smartphone, and glasses. This technology enables physiological data on the individual’s attention to be recorded and decoded, and provides information about their emotional states (such as attraction or surprise) and the cognitive processes that are triggered.
Let us now turn to a brief overview of the evolution of different techniques for recording eye movements, from the earliest eye-tracking prototypes to the very latest devices. We will then clarify the link between eye movement, attention, and consumer behavior, according to certain theoretical models and studies that deal with this methodology, setting out its broad characteristics and basic workings.
The Eye-Tracking Technology
A Brief Historical Outline
At the end of the nineteenth century, the French ophthalmologist Louis-Émile Javal observed that a person’s eye movements when reading were not smooth, continuous, and linear but rather composed of a series of movements combined with brief fixations on certain points of the text (Javal 1878). Since this discovery, different studies on eye movement have been applied to reading tasks, photography, flight simulation for pilots, printed advertising, and so on. Findings published in the early 1980s focused on the eye–mind hypothesis, which holds that when a subject looks at a word (or set of words) or at an object (or set of objects), they also think about it or them. That is, along with eye movements, simultaneous cognitive processing takes place (Just and Carpenter 1980). This hypothesis was then accepted in subsequent studies.
Between the 1990s and the present day, eye-tracking devices have become cheaper yet more accurate, which has facilitated their application in an even wider range of scientific fields, including marketing and consumer behavior (Gamito and Rosa 2014). Nowadays, it is even possible to conduct such tests by tracking eye movement live online, via webcams. From the 2000s onward, greater processor speed and memory capacity, the number and size of pixels offered by digital video cameras, and improved artificial vision techniques all combined to produce the fourth generation of eye trackers, which are common today and more user-friendly (Duchowski 2007).
How Eye-Tracking Methodologies Operate in the Study of Consumer Behavior
Consumers move their eyes to obtain information, and they stop when they see something that catches their attention. The eye is an extremely complex organ, capable of moving at high speeds; hence, the ability to collect data on rapid eye movement and fixations offers a very interesting option for studying an individual’s information processing (Russo 1978). What at first glance may appear to be a smooth eye movement, in fact, comprises two components, known as fixations and saccades (Buswell 1935; Goldberg and Wichansky 2003):
• Saccades are rapid eye movements lasting between 20 and 40 ms, in which specific locations in a scene are projected onto the eye. Saccadic movements occur in both eyes at the same time, and each saccade is followed by a fixation. These are the fastest movements made by the human body, and it is estimated that the average person makes 70,000 such movements every day.
• Fixations are periods of time during which a person’s gaze remains stationary. Fixations occur when the eye stabilizes for a duration of 200 to 300 ms (Pan and Zhang 2010; Holmqvist et al. 2011: 381), to 400 ms (Salvucci and Goldberg 2000), or to 500 ms for reading tasks (Rayner 1998; Goldberg and Wichansky 2003).
The sequence of fixations and saccades that occur in relation to a given stimulus (such as an advert or product placement) is known as the “scanpath.” Eye trackers record scanpaths of consumers exposed to certain visual stimuli (Noton and Stark 1971), and some also record pupil dilation and blinking patterns. According to Rayner and Castelhano (2007), when a person observes a scene or searches a specific area within their field of vision, their eyes move, on average, every 250–350 ms. Significantly, the volume of information transmitted via the optic nerve exceeds what the brain can process.
Hence, we have developed certain mechanisms for paying attention based on selecting a subset of relevant informational elements on which we then focus and we process more effectively. When this selective attention is focused on a specific spot or object within a scene, it can be processed in more detail, whereas processing of remaining areas or objects simultaneously is
Fig. 2 (a) The four Purkinje images, captured by the eye-tracker camera. (b) Relative positions of the pupil. (Source: Based on Duchowski 2007)
halted. Hence, during fixation, a certain area of the scene is projected onto the fovea centralis (the most central and sensitive part of the retina, located opposite the lens), which provides sharp foveal processing. At any given time, the human eye projects just 8% of the field of vision onto the fovea centralis, meaning that visual attention is highly selective. Duchowski (2007) provided an in-depth description of different methods of eye tracking. Video-based infrared oculography (VIROG) is the method most favored for market research or marketing purposes. A majority of eye trackers used for commercial purposes are based on infrared corneal reflection. These devices measure the distance and the angle between the reflection of the light source on the cornea (the glint) and the center of the pupil. These reflections of light on the cornea are known as Purkinje images (Crane 1994). Once reflections form, VIROG systems identify the first Purkinje image (Fig. 2) by means of calibration procedures (normally with 5 or 9 calibration points). In Fig. 2b, we can observe how the difference between the center of the pupil and the corneal reflection changes as the eyes rotate, whereas this difference remains fairly constant so long as the head moves relatively little (Mele and Federici 2012). Various mathematical algorithms compare the position of the center of the pupil with the position of the reference point, estimating from this the x/y coordinates for each fixation. Following calibration of the system, eye trackers are able to extract an individual’s fixation point at any given moment, with a minimal error level. The eye-tracking system also records the (x, y) coordinates of the fixation at sampling rates that conventionally range from 30 to 600 Hz or more. This methodology, therefore, offers sufficiently high resolution and spatial accuracy for use in marketing research applications, for both commercial and academic purposes. Mikalef et al.
(2018) distinguished between three types of processing which can be identified through these approaches:
• Cognitive processing. In relation to the dual-process theory, the pupil diameter has a positive relationship with the depth of cognitive processing or
mental activity of individuals during a specific task (Kahneman et al. 1969) and indicates an enhanced cognitive function load (Croson et al. 2013). The larger the pupil diameter, the greater the load on working memory and the faster the calculations. This measure should be normalized with respect to the first seconds of data to remove effects of several external factors (such as fatigue, age, stimulating substances, or medication).
• Engagement. Peak saccadic velocity represents the percentage of the saccade length at which the maximum speed is reached (Duchowski 2007). This measure has been inversely linked to the level of user engagement; that is, the faster the eyes attain the maximum speed (which returns a lower value in terms of percentage of the saccade length), the more the user is engaged.
• Observation. Just and Carpenter’s (1980) eye–mind hypothesis states that there is no significant lag between what we fixate on and what we process. This hypothesis has been widely confirmed by previous studies in different tasks (e.g., in reading and scene perception). In particular, careful observation refers to the average period of time during which the eye is relatively stable. This measure has been associated with research focused on the effects of stimuli. Careful observation essentially denotes that consumers are performing local processing on the information type and format, rather than overall processing of the entire page or product to find relevant information for their purchase.
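Two of the measures in this list lend themselves to a short computational sketch: baseline normalization of pupil diameter and the average fixation duration used as the indicator of careful observation. The code below is an illustration under stated assumptions rather than the procedure of Mikalef et al. (2018). Fixations are found with a dispersion-threshold method in the spirit of Salvucci and Goldberg (2000); the function names, the dispersion limit, the 200 ms minimum duration, and the baseline length are choices made for this example.

```python
from statistics import mean


def normalize_pupil(diameters, baseline_samples):
    """Express each pupil diameter relative to the mean of the first samples."""
    baseline = mean(diameters[:baseline_samples])
    return [d / baseline for d in diameters]


def dispersion(points):
    """Spread of a group of gaze points: horizontal range plus vertical range."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples, sample_rate_hz, max_dispersion, min_duration_ms=200):
    """Group consecutive (x, y) gaze samples into fixations.

    Returns (start_index, end_index_exclusive, duration_ms) tuples. A window
    covering the minimum duration is grown while its dispersion stays within
    the limit; otherwise the window slides forward by one sample.
    """
    window = max(2, round(min_duration_ms * sample_rate_hz / 1000))
    fixations = []
    start = 0
    while start + window <= len(samples):
        end = start + window
        if dispersion(samples[start:end]) <= max_dispersion:
            while end < len(samples) and dispersion(samples[start:end + 1]) <= max_dispersion:
                end += 1
            fixations.append((start, end, (end - start) * 1000 / sample_rate_hz))
            start = end
        else:
            start += 1
    return fixations


def mean_fixation_duration(fixations):
    """Average fixation duration in ms (0.0 if no fixation was detected)."""
    return mean(f[2] for f in fixations) if fixations else 0.0


# Synthetic 50 Hz recording: a 15-sample fixation, a 2-sample saccade,
# and a 20-sample fixation (coordinates in pixels, invented for the example).
SAMPLES = [(100, 100)] * 15 + [(200, 150), (300, 200)] + [(400, 250)] * 20
FIXATIONS = detect_fixations(SAMPLES, sample_rate_hz=50, max_dispersion=25)
```

Real recordings contain noise, blinks, and missing samples, so the dispersion limit would normally be set in degrees of visual angle and tuned to the device.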
Critical Issues and Possible Applications of Eye-Tracking Technology in the Marketing Sphere
The study of these movements and fixations has been applied to the design of stimuli used in visual marketing (“bottom-up factors”), which is intended to attract consumers’ attention, as well as to determine the impact of an individual’s specific characteristics (“top-down factors”) on their visual processing, such as their particular search objectives or their recall of the object in question, in the voluntary attention process (Wedel and Pieters 2008). But the challenge of measuring subjects’ attention when they are exposed to visual marketing stimuli results in measures that are too closely related to others, such as recall or emotional response. By contrast, consumers can also often pay attention even with a low level of awareness. In other words, the information can be actively processed by consumers, in interaction with internal knowledge representations already present in their memory or with external information (e.g., brand name, endorsement, etc.). Thus, external information may be “enriched” or misinterpreted due to (spontaneous) associations co-activated in the brain (Van Trijp 2009). There is, therefore, a clear need for more research on consumer attention and perceptions of marketing stimuli (such as ads, brands, and colors) and information about market-related conditions, as lack of attention can impede further information processing on the part of the consumer. Such research should be based on experimental and behavioral observation methods rather than on purely recall-based
survey research, as was previously the case (Van Trijp 2009). In the present case, eye-tracking methodologies provide a solution to this potential misrepresentation and enable the subjects’ attention data to be compared with self-reported recall or memory. Yet despite the clear predominance of visual stimuli in marketing contexts, the growing application of eye-tracking methodologies in business practices, and the importance of visual–cognitive processing for consumer behavior, the scientific literature dealing with applications based on the eye-tracking methodology is relatively scant. In the past, it may have been the case that the study of visual attention in the marketing literature was impeded by three major (erroneous) beliefs among scholars (Wedel and Pieters 2008):
• That attention is merely a precondition, that is, merely a door via which information enters on its way toward high-level cognitive processes.
• That capturing and retaining attention are relatively simple reactions, for example, related to an individual’s cognitive strategies for dealing with comparisons between competing products and processing repeated messages.
• That using these methodologies is complicated and that, by contrast, recall-based measures (such as surveys) are a simpler, yet valid, alternative.
On the basis of this rationale, and given the high cost of the first generation of eye-tracking devices, primarily recall-based methods were employed for measuring visual attention. However, further academic research showed that visual attention is a significant topic in its own right and that the aforementioned beliefs were unjustified, for the following reasons:
• Psychological research has demonstrated that visual attention is more than just a “door,” a notion exemplified in hierarchical processing models such as AIDA (awareness, interest, desire, and action). Rather, it is a fundamental coordination mechanism that sustains information processing and other objectives (LaBerge 1995).
Attention is more closely related to behavior than we might think. Rizzolatti et al. (1994) found a close link between eye movements and certain higher-order cognitive processes, which indicates the importance of applying this methodology in our discipline.
• Measurements of attention based on subsequent recall may be prone to certain biases, whereas eye movements have become indicators par excellence of visual attention. What is more, these are now relatively easy to measure, thanks to modern devices.
• Eye tracking offers wide-ranging possibilities for nonintrusive recording of consumer attention to visual marketing stimuli that are simply not achievable using other means. This methodology can show the influence of (a) the acquisition of information during fixations; (b) the evaluation of the brand during the selection process, based on certain stimuli; and (c) the change in states of attention during processing of ads.
• Finally, in the marketing realm, it is increasingly difficult to isolate the point at which consumers’ attention is attracted and held, given the high level of advertising saturation that exists. Consumers are continually exposed to a high volume of ads, at any moment and in any setting, be it via the television, print media, outdoor advertising, or the Internet, and to a considerable number of different brands and products on display in commercial establishments. Dealing with this saturation and capturing the attention of consumers is a major challenge. Furthermore, in the experimental research context, if the consumer’s attention is not successfully attracted, any result relating to acceptance (or not) of the message under forced exposure conditions is rendered invalid.
Therefore, eye tracking complements traditional measurement methods by offering greater detail on the spatial–temporal dynamic of attention. The combination of eye-tracking techniques and self-reported recall enables structural models to be developed that deal with information storage and retrieval, in order to test how certain marketing stimuli heighten or diminish the attention being paid. Examples include the effect of different banner ad types (animated or static), types of browsing, Internet user types, changes to the presentation of products, product labeling and pricing, application of security seals, specific location of the ad, and so on. In light of the predictive validity of the eye-tracking methodology, the ease with which eye movements can now be recorded, and the emergence of increasingly robust theories related to visual attention, research is called for that deals with the impact of marketing stimuli on consumer behavior in general and on e-tourism website users in particular.
Banner Advertising on E-Tourism Websites
A new online world of collaboration and communication has emerged in recent years, Web 2.0 (Cheung et al. 2011). Here, consumers are more connected to each other than ever, with greater access to content and more active, in-depth participation in that content (Nielsen 2014). This development of the Web has affected all sectors, including the tourism industry, and tourist behaviors when planning trips have also changed thanks to the availability of Travel 2.0 tools (i.e., e-tourism websites) (Mendes-Filho and Tan 2008; Muñoz-Leiva et al. 2012, 2018). These websites (travel blogs, travel social networks, online forums, and so on) enable tourists to play a more proactive role when planning their travel and making relevant decisions and also to help fellow travelers develop a vision of their destination prior to their trip (Muñoz-Leiva et al. 2012). In this context, Web designers and advertising agencies endeavor to attract the attention of consumers by using banners strategically placed on these e-tourism websites. Banners, the oldest and most widely used form of online advertising, first appeared on the website www.hotwired.com in October 1994 (Hollis 2005; Abuín 2008) and subsequently led to the new “reign” of online advertising
(Lohtia et al. 2007). Since then, advertisers have continued to invest resources in creating banners that not only attract the user’s attention but also inspire trust (Gong and Maddox 2003). Once banners have attracted the user’s attention, the objective is for them to click on the ad and be redirected to the advertiser’s website (Abuín 2008). However, due to the growing saturation of advertising, the phenomenon of “banner blindness” has arisen, which results in a lower proportion of banner displays being clicked (click-through rate, or CTR). Considering this issue, advertisers continue to search for solutions that will help them improve their advertising effectiveness. However, to find solutions, we must first understand the main characteristics and effects of online advertising. First, interactivity influences the design and implementation of advertising; and, in turn, it shapes consumer opinions and attitudes toward that advertising (Chandon et al. 2003). Furthermore, as the Internet is an interactive medium, it allows for bidirectional communication in which users can participate and collaborate (Chandon et al. 2003). In the case of social media, advertising is characterized by the significant coverage these platforms provide, as well as their high degree of innovation and creativity, which further draw attention toward advertised brands. Once the advertising message has successfully attracted the user’s attention, their information processing will generate cognitive storage mechanisms for the information displayed, which will affect their recall (Yoo 2008; Lee and Ahn 2012; Nihel 2013). Studies based on self-report instruments (such as Baack et al. 2008; Putrevu 2008) identify the need for a comprehensive examination of (a) the role of recall in the customer decision-making process and (b) advertising effectiveness.
Different studies have sought to understand how ads are processed, encoded, and stored in human memory and later recalled, since these variables directly affect purchasing decisions (Krishnan and Chakravarti 1999). Furthermore, marketing scholars have paid scant attention to analyzing the moderating effect of customer engagement (CE) when measuring the visual attention and recall of a certain banner. In particular, our review of the relevant scientific literature found no study focused on the effect of CE and banner type (and the effect of the interaction between these two variables) on advertising effectiveness in the e-tourism context. The present study will be especially pertinent as a methodological resource with which to assess the relationship between visual attention and recall via a within-subject and between-group experimental design (combining eye-tracking techniques and personal interviews) that takes into account moderating variables such as the level of CE and banner type. Therefore, the aim of the practical application is to evaluate the advertising efficacy of three specific e-tourism websites (a travel blog, a social network, and a virtual community with tourism content) by measuring psychophysiological variables of attention and cognitive processing, along with post hoc measurements of spontaneous (unprompted) and guided (prompted) recall. In the context of e-tourism websites, the specific objectives of this study were (i) to confirm whether participants experienced any banner blindness in relation to each e-tourism site, (ii) to explain post hoc measurements of recall based on other indicators related to advertising effectiveness and approached through psychophysiological variables
of visual attention and cognitive processing, and (iii) to identify the moderating effect of CE level and the degree of animation of a particular banner (based on two commonly used types, static and animated) on the relationship between attention and memory. To fulfill these objectives, an experiment was conducted on a sample of 60 participants, using an eye-tracking methodology, in addition to a self-administered questionnaire.
Theoretical Background: Determining Factors of Banner Advertising
Causes and Consequences of Banner Blindness
Studies conducted with eye-tracking technology typically model the user’s scanpath, that is, whether they look from left to right or vice versa, what they focus on first, how many areas of interest they cover, which areas they return to most frequently, and so on (e.g., Lindgaard et al. 2006). These latter authors conducted three studies to determine how much time users take to form an opinion about the visual appeal of a website and discovered that Internet users can make a reliable decision about whether they like a site or not in 50 ms. Later, Djamasbi et al. (2010a) demonstrated that, when viewing a website, users focus most on primary images, on faces of famous people, and on search engine results. By contrast, they tend to focus less on parts of the site featuring lots of text. Furthermore, it has been demonstrated that websites with images of people are not only perceived as more attractive but also enable users to complete the task at hand more quickly (Djamasbi et al. 2010b). There is also empirical evidence showing that the complexity of a website’s design (text size and format, inclusion of images, and so on) exerts an effect on the viewing patterns of different subjects exploring the same website (Djamasbi et al. 2011; Pan et al. 2004) and on user attention (Pieters and Wedel 2004). This visual complexity is caused, in part, by an increase in the number of advertising stimuli on the website; this is known as “clutter” and is related to advertising saturation. With regard to banner efficacy in terms of generating attention, the literature review showed that many users do not recall the banners that appear during their visit to a website; and there are other users who actively avoid them (Bayles 2002; Nielsen 2007; Lapa 2007).
In other words, users not only quickly learn a website’s structure (Lapa 2007) but also use their prior browsing experience to avoid banners, thereby focusing on the main content (Hsieh and Chen 2011). This is known as “banner blindness,” which alludes to the fact that users ignore (Burke et al. 2005) and/or do not recall banner content (Hervet et al. 2011). Banners may be considered intrusive and generate negative perceptions among Internet users, toward not only
29 Eye-Tracking Technology for Measuring Banner Advertising Efficacy. . .
advertised products and services but also the brand, the format, and the website itself (Abuín 2008). The majority of fixations on banners has been found to occur in the first few ocular movements, and these help the user avoid banners during their website visit (Burke et al. 2005). Furthermore, users’ peripheral vision allows them to skim over the website content and, since ads are usually in graphic format, quickly filter them from the editorial content (León Sáez de Ybarra 2009). Usually, when ads are published in traditional media such as television, the entire available space is used to attract viewers’ attention. However, banners typically occupy less than 10% of the area of a Web page (Drèze and Hussherr 2003). Nielsen (2007) showed that users barely look at Web ads, while Burke et al. (2005) found that subjects usually spend less time looking at banners than at editorial content and they tend not to focus on design elements that resemble an ad, even if it has no promotional purpose. In contrast to other aforementioned studies, Burke et al. (2005) found little evidence of banner blindness. Their results showed that at least 82% of participants focused on one or more of the four banners on the page during their website visit. Furthermore, they looked at the first ad they saw during the visit more than the rest (greater frequency and average duration of fixation). Elsewhere, a study by IAB Spain Research and The Cocktail Analysis (2009) found that 75% of the banners analyzed during the experiment made a visual impact on at least 50% of the participants. With regard to social media, Margarida’s study (2013) analyzed whether Facebook users look at ads published on this social network. Results revealed that subjects pay more attention to their friends’ recommendations on Facebook than to banners on the site. In this sense, ads are also ignored on social media and banner blindness occurs (Margarida 2013). 
Effective advertising both attracts the public’s attention and also remains in their short- or long-term memory (Margarida 2013). Therefore, advertisers are concerned not only that people pay attention to their ads but also that they remember them. Our review of the relevant literature on this subject confirmed that research study participants have been found not to recall many of the ads to which they are exposed online (Benway 1998; Bayles and Chaparro 2001; Bayles 2002; Drèze and Hussherr 2003; Burke et al. 2005). For example, results of the study by Drèze and Hussherr (2003) demonstrated that 46.9% of subjects indicated that they recalled seeing some banners on the website, but they did not remember them well or they avoided banners altogether. According to Crespo (2011), what people most recall about banners is the advertised brand. Danaher and Mullarkey (2003) studied factors that influence advertising recall and recognition. They demonstrated that the longer a person remains on a website, the more likely they are to recall advertisements. They also found that subjects who intend to perform a task on the website are less likely to recall banners than people who browse the Internet without any specific purpose.
F. Muñoz-Leiva
Determining Factors of Attention and Banner Recall Different studies have addressed the study of factors that determine the attention and recall of a banner, such as its position on the website, nature of the task performed, and engagement and experience with the website or ad type, among others. The position of the banner on the Web page can also influence users’ attention and recall. Banners located at the top of the screen are recalled more frequently than those located in the lower part (Dos Santos Meirinhos 2002; Burke et al. 2005; Nihel 2013). Owens et al. (2011) found that banner blindness (to miss information in text ads) occurs more frequently when the ad appears on the right-hand side of the page than when it is positioned at the top. In the case of online newspapers, there is no banner blindness when ads are located at the top of the page. Therefore, to help increase users’ fixation time on the banner, it should be placed very close to the main text of the news story or even in the middle of the story (Mosconi et al. 2008). The nature of the task performed by users also influences how much attention they pay to banners. Searches posing a greater degree of difficulty require more attention, which reduces the amount of time users have to process irrelevant objects; hence, sometimes these are ignored (Burke et al. 2005). Only ads that are closely related to the subject’s purpose achieve positive results, as users avoid all content that does not correspond to that purpose (León Sáez de Ybarra 2009). The study by Djamasbi et al. (2011) indicated that viewing patterns depend on whether users are simply browsing a website or searching for specific information. The presence of customers on tourism-related social media platforms has attracted moderate attention in recent literature (Harrigan et al. 
2017) linked to the fact that engagement and commitment are associated with numerous important indicators of brand performance, including sales growth, customer participation in product development, customer feedback, and customer referrals (Van Doorn et al. 2010). The relationship between user experience in online contexts and CE has been supported by multiple studies. In particular, Malthouse and Calder’s (2009) conceptualization of experience and engagement has been used to refer to the concept of people’s involvement with the medium they have viewed, based on the audience as proactive individuals. Calder et al. (2009) also found that higher experience and engagement levels are associated with improved ad effectiveness. Drèze and Hussherr (2003) found that those individuals with a low level of CBE browsing online would struggle to recall advertising messages shown on different websites. Yaveroglu and Donthu (2008) analyzed effects of banner repetition on brand recall and the user’s intention to click on the ad. They concluded that ad repetition leads to greater recall and greater intention to click on it in an online environment. Furthermore, results of Gong and Maddox’s study (2003) suggest that even one additional exposure to a banner improves brand recall. More specifically, Kuisma et al. (2010) conclude that, when many animated ads are shown simultaneously, the effectiveness of animation in terms of being eye-catching decreases. Interestingly, these types of banner ads also attract most fixations near the beginning or end of the primary task, suggesting that bottom-up
salience may be more likely to interfere with top-down processing during these early and late periods of information search. However, various studies have shown that animated ads are not an effective tool (Hong et al. 2007), since they can have a negative impact on the viewer’s attitude and response to the ad (Baltas 2003). Furthermore, although animation reduces banner blindness (Bayles 2002), it also makes users remember or recognize the banner less than in the case of static banner positioning (Bayles 2002; Hong et al. 2007). Other more recent studies (e.g., Hernández-Méndez and Muñoz-Leiva 2015) have analyzed the direct effect of other variables for classifying potential tourists such as gender or age.
Practical Application: Research Questions, Methodology, Results, and Analysis
Research Questions
Based on an exhaustive literature review, the practical application sought to respond to these four research questions (RQ):
RQ1: Do users pay attention to banners placed on different e-tourism websites or do they ignore them (banner blindness)?
RQ2: Do readers of travel blogs, users of travel social networks, and members of online travel communities recall online ads to which they have been exposed?
RQ3: Does engagement with e-tourism websites have any effect on the influence of user visual attention on banner recall?
RQ4: Is the relationship between visual attention and recall moderated by banner type (static vs. animated) on e-tourism websites?
The conceptual model of proposed research questions is shown in Fig. 3.
Fig. 3 Conceptual model and research hypotheses. [Diagram labels: Xk, Y, Z1, Z2; research questions RQ1, RQ2, RQ3, RQ4.]
Where Xk refers to visual attention indicator k, Y is recall, Z1 is customer engagement (CE), and Z2 is banner type (TYPE).
Methodology
Fieldwork and Eye-Tracking System Used
The fieldwork was conducted at the University of Granada’s Mind, Brain and Behaviour Research Centre (CIMCYC). The final sample comprised a total of 60 adults (30 male and 30 female participants) between 16 and 57 years of age (average age = 34.27, standard deviation = 10.84). Participants were recruited using the “snowball” sampling method, were invited by e-mail or phone, and were paid 15 each for their participation. The experiment was based on an eye-tracking methodology to determine participants’ eye movements and fixation patterns while browsing different e-tourism websites. This technique has been increasingly applied in different consumer-related disciplines and more specifically in studies related to the online context, for example, Web search services (Cutrell and Guan 2007), processing text and data diagrams (Ho et al. 2014), e-commerce websites (Wong et al. 2014), and measurement of ad effectiveness (Hernández-Méndez and Muñoz-Leiva 2015), among others. The experiment was performed on one participant at a time, in a quiet room, isolated from outside noise, and with ambient light of 200 lux, as recommended in ITU (2002) to simulate a “home” environment. The specific eye-tracking system used (Tobii) has a typical accuracy level of 0.5° and a head movement error of 0.2° and is integrated into a 17” TFT monitor, with a screen resolution of 1280 × 1024 pixels, a maximum vertical sync frequency of 75 Hz, and a horizontal frequency of 60 Hz. The camera used in the experiment has a resolution of 640 × 480 pixels and a frame rate of 30 fps (Fig. 4).
Experimental Design
This research was based on a within-subject experimental design, where all participants visited replicas of three e-tourism websites referencing Hotel Jardín Tropical (located in Tenerife, Spain): the hotel blog, the Facebook profile, and the TripAdvisor profile.
Fig. 4 Eye-tracking system used. (Source: www.tobii.com)
Fig. 5 Static airline banner used in the experiment
An Air Europa airline banner (which was the area of interest or AOI) was also embedded in each site, featuring three famous celebrities (see Fig. 5). The banner contained text (“We fly just for you!” + Visit www.volamossoloparati.com + “for the chance to win prizes every week”), together with a composite image (celebrities + airplane). One version of the banner contained only static elements (http://webcim.ugr.es/polls/EP_ET/banner/B1.jpg), while the other featured animated elements (http://webcim.ugr.es/polls/EP_ET/banner/B2.swf). The final stimuli (Travel 2.0 websites) used in the experiment can be visited at the following URL addresses:
• EG1: Blog, Facebook page, and TripAdvisor profile with static airline banner (n1 = 30)
• EG2: Blog, Facebook page, and TripAdvisor profile with animated airline banner (n2 = 30)
Under these conditions, the experiment was counterbalanced; that is, the number of subjects was the same for each experimental group (EG) according to banner type (between-group factor). Randomness was ensured in assigning test units to treatment groups and in the order in which the stimuli were presented to the experimental groups (Malhotra 1997: 247), based on the level of engagement and banner type. Randomization permits the equal distribution of effects of independent variables or factors under all possible conditions (Zikmund 2003: 203) and ensures that the experiment’s total number of repetitions under these conditions will demonstrate true effects, should any exist (Luque 1997: 157; Zikmund 2003: 203).
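The balanced, randomized allocation described above can be illustrated with a minimal sketch. The chapter does not detail the randomization procedure, so the helper name `assign_participants`, the group labels, and the fixed seed are assumptions introduced purely for illustration. The sketch splits 60 participants into two equal groups by banner type and draws a random presentation order of the three sites for each participant.

```python
import random

SITES = ["blog", "Facebook", "TripAdvisor"]  # the three e-tourism website replicas

def assign_participants(participant_ids, seed=2024):
    """Hypothetical helper: balanced random split into EG1/EG2 plus a random site order."""
    rng = random.Random(seed)        # fixed seed only so the sketch is reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)                 # random allocation of test units to groups
    half = len(ids) // 2
    plan = {}
    for rank, pid in enumerate(ids):
        group = "EG1 (static)" if rank < half else "EG2 (animated)"
        order = rng.sample(SITES, k=len(SITES))  # random presentation order
        plan[pid] = (group, order)
    return plan

plan = assign_participants(range(1, 61))

# Count participants per group to confirm the design is balanced.
group_sizes = {}
for group, _ in plan.values():
    group_sizes[group] = group_sizes.get(group, 0) + 1
print(group_sizes)
```

With 60 participants, each group receives 30, matching n1 = n2 = 30; the site order differs from participant to participant.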
Finally, participants were moved to another room, where they responded to an online questionnaire regarding sociodemographic characteristics; behavioral variables, such as engagement with social media; and questions regarding whether they were able to recall the banner displayed during their visits to different websites.
Measurement Scales and Data Analyses Applied
In this study, variables measuring visual attention were the number of fixations before focusing on the AOI (FB), time to first fixation (TFF), fixation count (FC), fixation duration (FD), visit count (VC), and visit duration (VD). FC measures the number of times the participant fixates on an AOI. FD measures the duration of each individual fixation within an AOI. If the participant returned to the same media element during their recording, new fixations would be included in the calculation of the metric. A visit is defined as the interval of time between the first fixation on the AOI and the next fixation outside the AOI. VC measures the number of visits within an AOI. And finally, VD measures the duration of each individual visit within the AOI. Therefore, as the FC increases, the VD also increases. To answer research questions 1 and 2, we applied one-factor analyses of variance (ANOVA) to different dependent variables (TFF and the FC before reaching the banner, FCB, as well as the average FD and average FC). To respond to research questions 3 and 4, we used several linear regressions. Our conditional process model included self-reported recall as an outcome or dependent variable and previous visual attention measurements as antecedent causal agents or predictor factors interacting with CE and banner type (or level of animation). CE with social networks was included as a discrete quantitative moderator, while banner type was approached as a moderating factor with two values (“static” or low level of animation vs. “animated” or high level). With regard to the proposed CE measurement scale, Harmeling et al. (2017) assessed the multidimensional nature of the engagement construct, which comprises behavioral and psychological dimensions. CE measurement was based on Hollebeek et al.’s (2014) scale adapted to social media (see questionnaire in http://webcim.ugr.es/polls/EP_ET/questionnaire.pdf), specifically the activation factor.
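The metric definitions above can be made concrete with a short sketch that derives them from an ordered list of fixations. This is not the eye-tracker vendor's implementation: the `Fixation` record, the `aoi_metrics` helper, and the toy recording are hypothetical, and FD and VD are reported here as per-recording means.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    start: float     # onset, in seconds from stimulus onset
    duration: float  # fixation length in seconds
    in_aoi: bool     # True when the fixation falls inside the banner AOI

def aoi_metrics(fixations):
    """Hypothetical helper: AOI metrics for one recording, or None if the AOI was never fixated."""
    hits = [i for i, f in enumerate(fixations) if f.in_aoi]
    if not hits:
        return None
    # Group consecutive AOI fixations into visits, stored as [first_index, last_index].
    visits = []
    for i in hits:
        if visits and visits[-1][1] == i - 1:
            visits[-1][1] = i
        else:
            visits.append([i, i])
    # A visit runs from its first AOI fixation to the next fixation outside the AOI;
    # if the recording ends inside the AOI, it runs to the end of the last fixation.
    visit_durations = []
    for first, last in visits:
        if last + 1 < len(fixations):
            end = fixations[last + 1].start
        else:
            end = fixations[last].start + fixations[last].duration
        visit_durations.append(end - fixations[first].start)
    aoi = [fixations[i] for i in hits]
    return {
        "FB": hits[0],                                  # fixations before the first AOI fixation
        "TFF": aoi[0].start,                            # time to first fixation (s)
        "FC": len(aoi),                                 # fixation count
        "FD": sum(f.duration for f in aoi) / len(aoi),  # mean fixation duration (s)
        "VC": len(visits),                              # visit count
        "VD": sum(visit_durations) / len(visits),       # mean visit duration (s)
    }

# Toy recording (invented values, not data from the study).
example = [
    Fixation(0.00, 0.25, False),
    Fixation(0.30, 0.20, False),
    Fixation(0.55, 0.20, True),
    Fixation(0.80, 0.30, True),
    Fixation(1.15, 0.25, False),
    Fixation(1.45, 0.10, True),
]
metrics = aoi_metrics(example)
print(metrics)
```

In the toy recording, two fixations precede the first banner fixation, the banner receives three fixations, and these fall into two separate visits.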
Banner recall, adapted from research by Danaher and Mullarkey (2003), among others, was composed of two open and two closed questions measuring spontaneous and guided or prompted recall, which was re-encoded, and resulting values varied from 0 (“does not remember anything”) to 4 (“remembers all the presented banner elements”). Research questions dealing with different moderating effects will be answered if, in the proposed model, (a) the interaction between moderators on visual attention measures (FB, TFF, FC, FD, VC, and VD) becomes significant in terms of variance explained (i.e., have a significant weight) and (b) conditional effects of attention on recall for different levels of moderators are also present. All variables were standardized to avoid a negative effect due to the measurement unit. The initial statistical model is expressed as follows:
$$y_i = b_0 + b_1 Z_{1i} + b_2 Z_{2i} + b_3 X_{1i} + b_4 X_{2i} + \dots + b_j X_{ki} + b_{j+1} X_{1i} Z_{1i} + b_{j+2} X_{1i} Z_{2i} + \dots + b_p X_{ki} Z_{2i} + \varepsilon_i \qquad (1)$$
Table 1 Visual attention and recall measurements (without standardization)

| Measurement | Average | Stand. dev. |
|---|---|---|
| Recall | 1.14 | 1.233 |
| FC | 13.14 times | 7.622 |
| FD | 0.20 s | 0.043 |
| VC | 5.42 times | 2.815 |
| VD | 0.56 s | 0.247 |
where $y_i$ is the recall value for subject $i$, $X_k$ is visual attention indicator $k$, $Z_1$ is customer engagement (CE), $Z_2$ is banner type (TYPE), and $\varepsilon_i$ is the estimation error. As expected, FC impacted on VD; thus, VD can be considered a mediating factor between FC and recall (see Appendix B).
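To make the structure of Eq. (1) explicit, the following sketch builds the regressors for one subject in the order of the coefficients b0 to bp: intercept, the two moderators, the k attention indicators, and then each indicator multiplied by each moderator. The helper names `design_row` and `predict` and the numeric inputs are hypothetical; the sketch shows how the terms are assembled and does not reproduce the estimation carried out in the study.

```python
def design_row(x, z1, z2):
    """Regressors of Eq. (1) for one subject, ordered as the coefficients b0..bp."""
    row = [1.0, z1, z2]      # intercept, customer engagement (Z1), banner type (Z2)
    row.extend(x)            # main effects X_1 .. X_k
    for xk in x:             # interaction terms, indicator by indicator
        row.extend([xk * z1, xk * z2])
    return row

def predict(b, x, z1, z2):
    """Fitted recall value: dot product of the coefficients and the regressors."""
    row = design_row(x, z1, z2)
    if len(b) != len(row):
        raise ValueError("coefficient vector does not match the number of regressors")
    return sum(bi * ri for bi, ri in zip(b, row))

# With the six attention indicators (FB, TFF, FC, FD, VC, VD) the model has
# 1 + 2 + 6 + 2 * 6 = 21 coefficients.
n_terms = len(design_row([0.0] * 6, 1.0, 1.0))

# Tiny illustration with k = 2 indicators and invented values.
row = design_row([2.0, 3.0], 0.5, 1.0)
value = predict([1.0] * len(row), [2.0, 3.0], 0.5, 1.0)
print(n_terms, row, value)
```

A backward selection procedure, as used in the chapter, would then drop the terms of this full specification that do not reach significance.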
Analysis of Results
Descriptive Analysis: Visual Attention and Banner Recall
With regard to average measurements of participants’ visual attention, the FC was 13.14 times; the FD was approximately 0.20 s; the VC was 5.42; and the VD was 0.56 s (see Table 1). Results of the self-administered questionnaire revealed that two-thirds of the participants recalled having seen an ad during the three website visits; the rest of the sample did not remember any ads. Additionally, of the two-thirds that did register recall, approximately half did not remember the brand or company in the ad. Only 10% of the subjects indicated that the ad was for the airline Air Europa, and a very small minority (3%) accurately recalled the image in the ad. The combined measurement of recall level obtained a mean value of 1.14 out of a maximum value of 4.
Banner Blindness: Analysis by E-Tourism Website
First, we analyzed whether banner blindness occurred on each e-tourism website. Results of this analysis revealed that 95% of the participants fixated at least once on the blog banner, 98% on the Facebook banner, and 67% on the TripAdvisor banner. We can, therefore, confirm that, overall, banner blindness did not occur on websites selected for our study, although each one demonstrated this to a different extent. Below, we compare different degrees of advertising efficacy in terms of attention toward the banner, according to e-tourism website type (blog, Facebook, and TripAdvisor), through several one-factor analyses of variance (ANOVA) for different dependent variables: TFF, FD, and FC. In particular:
• Participants took the least time to focus on the Facebook banner (14.16 s), followed by the blog banner (23.10 s) and, lastly, the TripAdvisor banner (38.56 s); TFF: F(2, 153) = 17.977, p < .001. This result suggests that users somehow recognize the ad with their peripheral vision but devote few, if any, fixations to it. Although these are average values, we confirmed that the first fixation on the banner did not occur with the first ocular movements.
• The longest total FD was achieved by the Facebook banner (4.08 s), followed by the blog (2.43 s) and lastly TripAdvisor (1.23 s); FD: F(2, 153) = 15.864, p < .001.
• Participants fixated more times on the Facebook banner (19.05), followed by the blog (11.72) and lastly TripAdvisor (6.08); FC: F(2, 153) = 18.578, p < .001.
In summary, we can conclude that participants not only took less time to fixate on the banner placed on Facebook but also fixated on it more often and for longer. That is, the website delivering the greatest advertising efficacy of the three was Facebook, followed by the blog and, finally, TripAdvisor. This may be explained by the greater simplicity of editorial content on Facebook and thus its greater visual simplicity, which contributes to banner visibility.
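The one-factor ANOVA comparisons reported above follow the usual between-group versus within-group variance ratio, which the sketch below computes from scratch. The function name `one_way_anova` and the toy data are illustrative assumptions, not the study's measurements. The last lines check the degrees of freedom under one further assumption: that only participants who fixated on the banner at least once (95%, 98%, and 67% of 60) contribute an observation for each website.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples, one list per group."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of each observation around its own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Toy data (invented, NOT the study's measurements): three small groups.
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova(toy)
print(f_stat, df_b, df_w)

# Degrees-of-freedom check under the assumption stated in the lead-in.
fixation_rates = {"blog": 0.95, "Facebook": 0.98, "TripAdvisor": 0.67}
group_sizes = [round(rate * 60) for rate in fixation_rates.values()]
df_within_check = sum(group_sizes) - len(group_sizes)
print(group_sizes, df_within_check)
```

Under that assumption the three groups contain 57, 59, and 40 observations, which gives 153 within-group degrees of freedom, in line with the F(2, 153) statistics reported for TFF and FD.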
Explanation of Memory Based on Visual Attention
The following data analysis used a combined measurement based on spontaneous and prompted memory, which can be explained using visual attention variables. With regard to initial assumptions about the linear regression analysis, an absence of multicollinearity was observed in the low condition index (2.19). The resulting model after performing a multiple linear regression analysis using a backward calculation method is shown below. Coefficients attained significant values at all times (see Appendix B).
$$y_i = -0.09^{*} + 1.67 \cdot FC_i + 1.85 \cdot VC_i + 1.07 \cdot FC_i \cdot ENG_i + 0.69 \cdot VC_i \cdot ENG_i + 0.44 \cdot VD_i \cdot ENG_i - 0.81 \cdot FC_i \cdot TYPE_i - 0.90 \cdot VC_i \cdot TYPE_i + \varepsilon_i \qquad (2)$$
*constant is not significant
A positive relationship was observed between the recall measurement and the FC (B = 1.67; T = 2.89; p = 0.005) and between the recall and the VC (B = 1.85; T = 3.01; p = 0.004). This means that subjects who present a greater number of fixations on the banner will have greater recall of it, as supported by previous literature.
Moderating Effect of CE and Banner Type
With regard to the moderating effect of engagement, higher levels of this variable strengthened the positive effect of the fixation count (FC) on recall of the banner ad, for both the static and dynamic versions, although loading values are somewhat lower in the case of dynamic banners (see Fig. 6). To be more precise, FC loading values varied, increasing from 1.93 to 4.07 when users visited e-tourism
Table 2 Net effect for indicators FC, VC, and VD according to ENG and TYPE level

| Attention measurement | Conditions | ENG level | Estimated regression line |
|---|---|---|---|
| FC (a) | VC and VD = mean (0), TYPE = 1 (static) | 1 (low) | y = –0.09 + 1.93 · FC |
| | | 2 (medium) | y = –0.09 + 3.00 · FC |
| | | 3 (high) | y = –0.09 + 4.07 · FC |
| | VC and VD = mean (0), TYPE = 2 (animated) | 1 (low) | y = –0.09 + 1.12 · FC |
| | | 2 (medium) | y = –0.09 + 2.19 · FC |
| | | 3 (high) | y = –0.09 + 3.26 · FC |
| VC (b) | FC and VD = mean (0), TYPE = 1 (static) | 1 (low) | y = –0.09 + 1.64 · VC |
| | | 2 (medium) | y = –0.09 + 2.33 · VC |
| | | 3 (high) | y = –0.09 + 3.02 · VC |
| | FC and VD = mean (0), TYPE = 2 (animated) | 1 (low) | y = –0.09 + 0.74 · VC |
| | | 2 (medium) | y = –0.09 + 1.43 · VC |
| | | 3 (high) | y = –0.09 + 2.12 · VC |
| VD (c) | FC and VC = mean (0) | 1 (low) | y = –0.09 + 0.44 · VD |
| | | 2 (medium) | y = –0.09 + 0.88 · VD |
| | | 3 (high) | y = –0.09 + 1.32 · VD |

Note. (a) If VC and VD reach their mean value (zero since they are standardized) and type = 1 (static) or 2 (dynamic), values for recall present these value curves with regard to engagement (1, 2, and 3) values. For instance, in the case of a high level of engagement (3) and a dynamic type of banner (2), calculations for the value curve are as follows: Y = –0.09 + 1.67 · FC + 1.85 · (0) + 1.07 · FC · (3) + 0.69 · (0) · (3) + 0.44 · (0) · (3) – 0.81 · FC · (2) – 0.90 · (0) · (2) = –0.09 + 3.26 · FC
(b) If FC and VD reach their mean value (zero) and type = 1 (static) or 2 (dynamic), values for recall present the following value curves for different levels of engagement 1, 2, and 3
(c) If FC and VC reach their mean value (zero) and type = 1 (static) or 2 (dynamic), recall will present these value curves for different levels of engagement 1, 2, and 3
websites displaying static banners. By contrast, FC loading values increased from 1.12 to 3.26 among those visiting websites displaying a dynamic banner (see Table 2). When VC and VD reach their mean value (zero, since they are standardized), the calculations to determine recall yield the curves displayed in Fig. 6 for the different levels of engagement: 1 (low), 2 (medium), and 3 (high).
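The conditional slopes in Table 2 follow directly from the coefficients of Eq. 2: for a given indicator, the slope is its direct coefficient plus the engagement interaction times ENG plus the banner-type interaction times TYPE, with the other indicators held at their standardized mean of zero. The sketch below reproduces that arithmetic; the dictionary layout and the function name `conditional_slope` are illustrative choices, while the coefficients are those reported in Eq. 2.

```python
# Coefficients taken from Eq. 2; VD has no direct or banner-type term in that equation.
COEFFICIENTS = {
    "FC": {"direct": 1.67, "eng": 1.07, "type": -0.81},
    "VC": {"direct": 1.85, "eng": 0.69, "type": -0.90},
    "VD": {"direct": 0.00, "eng": 0.44, "type": 0.00},
}

def conditional_slope(indicator, eng, banner_type):
    """Slope of recall on one indicator for ENG in {1, 2, 3} and TYPE in {1, 2}."""
    c = COEFFICIENTS[indicator]
    return round(c["direct"] + c["eng"] * eng + c["type"] * banner_type, 2)

# Rebuild the slopes of Table 2 (TYPE 1 = static, TYPE 2 = animated).
for indicator in ("FC", "VC", "VD"):
    for banner_type in (1, 2):
        slopes = [conditional_slope(indicator, eng, banner_type) for eng in (1, 2, 3)]
        print(indicator, "TYPE =", banner_type, slopes)
```

For example, for FC with high engagement (3) and an animated banner (2), the slope is 1.67 + 1.07 · 3 − 0.81 · 2 = 3.26, the value used in the note to Table 2. Because VD has no banner-type term, its slopes are identical for both banner types.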
Fig. 6 Scatter plots. FC vs. recall for both types of banners (levels of engagement)
Fig. 7 Scatter plots displaying VC vs. recall for both banner types (levels of engagement)
The level of CE also has a positive impact on banner recall in terms of the average VC, regardless of the FC in the area of the banner (see Fig. 7). Loading values are higher for static banners than for animated ones (see Table 2). Lastly, in the case of the average VD value, no differences in this indicator caused by moderating effects of banner type were identified. When FC and VC reach their mean value, calculations for recall present value curves displayed in Fig. 8 for the three levels of engagement. In this case, it is also observable that CE positively influences recall in terms of the VD of the banner. This result falls in line with previous observations. Thus, research questions 3 and 4 are answered with empirical evidence. Although the direct effect of VD on recall does not appear in Eq. 2, this variable on its own explains part of the variability in recall, as shown by a separate linear regression (B = 0.41, t = 4.06, p < 0.001). It is excluded from Eq. 2 because its effect is already captured by the interaction effects assessed previously. To summarize, Internet users who present a high level of engagement with e-tourism websites and who pay a high level of visual attention to the banner ad (in terms of fixation and visit counts) also have a more accurate and clearer recall of it. Results from the multivariate analysis conducted in this research are synthesized in Fig. 9.
Fig. 8 Scatter plot for VD vs. recall in the full sample (levels of engagement)
Fig. 9 Summary of the extracted model, beta values. [Diagram: effects of visual attention on recall: FC +1.67, VC +1.85, VD +0.41*; moderation by level of customer engagement: +1.07 (FC), +0.69 (VC), +0.44 (VD); moderation by animation level: −0.81 (FC), −0.90 (VC), n.s. (VD). *When analyzing direct effects; n.s. = not significant.]
Final Discussion, Implications, Limitations, and Future Research
Conclusions
This methodological proposal is framed in the area of marketing, specifically relating to advertising effectiveness or cognitive avoidance generated by different banner types and personal characteristics of users. The chapter responds to the research gap identified in the area of e-tourism tools and the role of users’ personal and browsing characteristics in ad effectiveness within these tools. The latest eye-tracking devices, which are able to monitor large samples at an ever-decreasing cost, constitute a major and unprecedented opportunity for consumer research. Furthermore, the calibration process has become increasingly fast and simple, making it possible to create conditions for exposure that users find very natural. Hence, eye tracking can make a significant contribution to online marketing, helping it to achieve its potential by offering an exceptional insight into consumers’ processing of visual stimuli that provides a high level of predictive validity. There is also now a significant body of knowledge about visual marketing, on which new studies can build (Wedel and Pieters 2008). Based on the extant knowledge regarding the relationship between activation and memory, our practical application developed a general framework to help us understand mechanisms responsible for the influence of visual attention on self-reported banner recall. Findings of this research suggest that the impact of attention on recall can be moderated by variables manipulated a priori (in this case, banner type) and also by self-reported variables (in this case, level of engagement). We analyzed banner recall, confirming that, despite the fact that approximately two-thirds of the participants remembered some advertisement during their visit, more than half of the subjects did not recall the brand/company associated with the banner in question.
A small minority (3%) accurately remembered the image in the advertisement, and about one-third of the participants marked the correct option regarding the brand and slogan advertised in the banner. It is known that some users only look at banners without recalling them during an initial website visit (e.g., Nielsen 2007; Bayles 2002; Lapa 2007) and quickly learn how sites work, using their browsing experience to quickly filter banners from editorial content (León Sáez de Ybarra 2009) or to avoid them in the future during other website visits (Burke et al. 2005). In terms of our first research question, we can confirm that, similar to other studies (e.g., Burke et al. 2005; IAB Spain Research and The Cocktail Analysis 2009), banner blindness does not occur in the case of e-tourism websites. This could be due to the fact that the image on the banner used in our study featured three famous celebrities, which may have influenced users’ attention during their website visits.
Regarding the second research question, we can affirm that the advertising on Facebook achieved highest levels of efficacy, followed by the blog and lastly TripAdvisor. Participants not only focused on the banner sooner but also on a greater number of times and for longer periods of time, even though it was located in the same position on different websites. This may be due to the fact that the complexity of a website’s design can affect viewing patterns of subjects exploring different sites (Djamasbi et al. 2011; Pan et al. 2004). Furthermore, if users have a stronger memory or experience of how a Facebook page is structured, in contrast to a blog with which they are unfamiliar, for example, it would further increase banner blindness on the latter. This would make Facebook’s advertising efficacy even stronger. This study also found that the greater the number of fixations and visits on the banner, the better and clearer the recall of said banner—a finding supported by the previous scientific literature. In this regard, our analysis of moderating effects shows that, as users become increasingly engaged with e-tourism websites, a higher level of visual attention on the banner (in terms of fixation counts and visits, as well as visit duration) will lead to better banner recall. This moderating effect has also been found in different types of banners employed in this research, static banners making a greater impact on recall compared to the lower impact generated by dynamic banners that use animation to attract users’ attention.
Implications and Recommendations
The scientific literature shows that eye movements, under normal conditions, are directly linked to higher-order cognitive processes and that an individual’s attention plays an important role in their information processing and in communications effectiveness, to a much greater degree than previously thought. Studies to date have confirmed that the attention paid by individuals to various marketing stimuli is subject to the influence of numerous “bottom-up factors” (those relating to the marketing stimulus itself) and “top-down factors” (those relating to personal characteristics of the user). In recent years, some of these factors, such as brand engagement, brand loyalty, and product experience, have attracted significant interest in scientific publications applying structural equation modeling. Some studies propose these variables as future research topics, as determinant or moderating variables of visual attention, cognitive processing, and recall. We have incorporated these themes into the practical application of our proposed methodology. Our literature review on such topics (in the practical application section) led to a series of interesting findings with regard to advertising in online social networks.
These helped us formulate our research questions and complete our findings. We should highlight that no prior studies were found in the literature review that analyze the advertising effectiveness of banners in terms of visual attention regarding different e-tourism websites, moderated by customer engagement variables. Moreover, marketing scholars have paid little attention to the effect of banner ads on this effectiveness, or results are inconclusive with regard to Web 2.0 websites. However, given that the majority of the studies we analyzed involves the banner blindness variable, we have determined that users simply ignore banners during their website visit (e.g., Bayles 2002; Burke et al. 2005; Nielsen 2007; Drèze and Hussherr 2003; Margarida 2013). With regard to banner recall, some authors confirm that many users do not recall banners after viewing a website (Pagendarm and Schaumburg 2001; Drèze and Hussherr 2003; Heath and Nairn 2005; Chatterjee et al. 2008), while other authors find no such banner blindness. Many factors are responsible for these inconclusive results: level of use, banner type, or type of measures used, among others. Banner blindness and low recall could be explained, in part, by the subjects’ type of browsing. When searching for specific information on the Internet (the searching task in our study), people are more likely to ignore banners than when browsing the Internet without a specific purpose; in other words, if the informational content they are searching for is not in the advertisement, subjects will most likely avoid areas of the screen where banners are located (Pagendarm and Schaumburg 2001). These results are consistent with the study conducted by Danaher and Mullarkey (2003), which found that subjects who have a task-related objective while browsing online are less likely to remember banners than those who are browsing the Internet with no particular aim in mind. 
Based on these results, we have concluded that online banner ads can annoy users, thereby decreasing their performance in their chosen task (Brajnik and Gabrielli 2010). In addition, our results show that the higher the number of fixations on the banner and the longer the visit duration on the e-tourism website, the higher the banner recall, and even more so among more engaged users. This could be explained by a shifting effect or negative impact on memory among subjects who use this type of website less frequently. It could even be due to banner blindness causing low recall based on the more exploratory type of browsing these subjects conduct throughout the website or their own cognitive factors (Resnick and Albert 2014). Furthermore, we found that participants who presented a higher level of engagement with e-tourism websites and showed increased fixation and visit counts had a better recall of static than animated banners. This result falls in line with results
29 Eye-Tracking Technology for Measuring Banner Advertising Efficacy. . .
of Hwang and Jeong’s study (2016), which found that animated banner ads not only capture less attention than static ads but also have a less positive effect on the viewer’s memory. This could be explained by the fact that the presence of animation alerts users to the existence of advertising in that particular spot, causing them to adopt behaviors of rejection and psychological opposition toward it (Edwards et al. 2002; Chandon et al. 2003). Therefore, as noted in other studies (Pagendarm and Schaumburg 2001; Drèze and Hussherr 2003; Heath and Nairn 2005; Chatterjee et al. 2008), users hardly remembered the banner that was the subject of this study, with opposing results in our case, although they also exhibited low attention values measured with the eye-tracking technique. This suggests that the use of both research methodologies (eye-tracking techniques and self-reported memory) provides complementary results.
Limitations and Future Research
This study analyzes advertising effectiveness in terms of recall by measuring users’ visual attention and its interaction effect with CE and banner type. In the future, it would be interesting to complement these studies with other measurements of advertising effectiveness, such as click-through rate (CTR), to determine which subjects voluntarily visit the selected advertiser’s website. Many eye-tracker brands also record this measurement. With regard to limitations of the present study, the practical application considers only two independent variables (engagement level and banner type) in the analysis of banner effectiveness on e-tourism sites. Additional classification variables (e.g., gender, employment status, level of studies) could, therefore, be considered for a more in-depth analysis of their moderating effects. In light of the nature of the product/website, we should also consider the analysis of (high or low) involvement with it, as well as the user’s reaction to it. Lastly, we would also recommend expanding the study to include other geographical areas for a more comparative, cross-cultural perspective, including other European countries and countries on other continents.
Acknowledgments
This study was funded by the Spanish Ministry of Economy and Competitiveness under research project grant ECO2012-39576 and the Spanish Ministry of Science, Innovation and Universities, National R&D&I Plan, and FEDER Plan under research project grant B-SEJ-209UGR18. We would like to express our sincere thanks to Janet Hernández-Méndez (University of La Laguna, ULL) for her collaboration in the fieldwork of the practical application.
Appendix A: Areas of the Travel 2.0 e-Tourism Web Pages
Fig. 10 (a) Areas of the hotel blog (b) Areas of the hotel’s Facebook page (c) Areas of the hotel’s TripAdvisor profile
Appendix B
Table 3 Linear regression. Coefficients (outcome, VD; predictor, FC)
Model      Unstandardized coefficient   Std. error   t      p       LLCI(a)   ULCI(b)
Constant   .000                         .107         .00    1.000   −.2130    .213
FC         .321                         .169         1.90   .061    −.005     .646
Model summary: R2 = .103; F = 3.62; p = .061
(a) LLCI, lower level for confidence interval
(b) ULCI, upper level for confidence interval
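As a reading aid for Table 3, the reported statistics can be cross-checked with two standard identities of simple linear regression: the t statistic of a coefficient is the coefficient divided by its standard error, and, with a single predictor, the model F statistic equals the squared t statistic of that predictor. The following is only an illustrative sketch; the variable names are our own, and small differences from the published values are to be expected because the table entries are rounded.

```python
# Cross-check of Table 3 (outcome VD, predictor FC) using the rounded table values.
coef_fc = 0.321      # unstandardized coefficient of FC
se_fc = 0.169        # standard error of FC
reported_t = 1.90    # t value printed in Table 3
reported_f = 3.62    # F value printed in the model summary

t_fc = coef_fc / se_fc   # t = B / SE
f_model = t_fc ** 2      # with one predictor, F = t^2

print(f"t = {t_fc:.2f} (reported {reported_t})")
print(f"F = {f_model:.2f} (reported {reported_f})")
```

The recomputed values agree with the table up to rounding, which suggests that the flattened columns of Table 3 have been read in the right order.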
Table 4 Linear regression. Coefficients and statistics. Dependent variable, recall of banner
             Unstandardized coefficients
Model        B        Std. error   t        Sig.
(Constant)   −0.09    0.104        −0.87    0.387
FC           1.67     0.577        2.89     0.005
VC           1.85     0.616        3.01     0.004
FC_x_ENG     1.07     0.457        2.34     0.022
VC_x_ENG     0.69     0.363        1.90     0.061
VD_x_ENG     0.44     0.224        1.97     0.052
FC_x_TYPE    −0.81    0.423        −1.91    0.060
VC_x_TYPE    −0.90    0.442        −2.04    0.044
R2 = 0.252
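The t column of Table 4 can be checked in the same way, since each t value should equal the coefficient B divided by its standard error. The sketch below copies the rounded values from the table; the function name `t_statistic` and the row layout are illustrative assumptions, not part of the original analysis.

```python
# Table 4 rows as (term, B, standard error, reported t); values copied from the table.
rows = [
    ("(Constant)", -0.09, 0.104, -0.87),
    ("FC", 1.67, 0.577, 2.89),
    ("VC", 1.85, 0.616, 3.01),
    ("FC_x_ENG", 1.07, 0.457, 2.34),
    ("VC_x_ENG", 0.69, 0.363, 1.90),
    ("VD_x_ENG", 0.44, 0.224, 1.97),
    ("FC_x_TYPE", -0.81, 0.423, -1.91),
    ("VC_x_TYPE", -0.90, 0.442, -2.04),
]


def t_statistic(b, se):
    """Return the t statistic of a regression coefficient (B divided by its SE)."""
    return b / se


for term, b, se, reported in rows:
    t = t_statistic(b, se)
    print(f"{term:<10} t = {t:6.2f} (reported {reported:5.2f})")
```

Every recomputed value lies within rounding distance of the printed one, so the row order of the reconstructed table is internally consistent.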
Index
• AIDA. It is an acronym that stands for attention (or awareness), interest, desire, and action. This model is widely used in marketing and advertising to describe the stages that occur from the time when a consumer first becomes aware of a brand or product to when the consumer makes a purchase decision or finally trials a product.
• Blindness. It is the phenomenon whereby users ignore and/or do not recall banner content.
• Bottom-up factors. These are characteristics of the stimulus itself. Bottom-up attentional processing is a rapid form of attention capture that relies on physical characteristics of the stimulus, such as its color, size, or the shape of the elements in which it is included.
• Click-through rate or CTR. It is commonly used to measure the success of an online advertising campaign. It is calculated as the number of users who click on a link divided by the total number of users who view a page or advertisement.
• Cross-cultural perspective. It is a perspective or approach adopted in anthropology and sister sciences (such as sociology, political science, psychology, or economics) that uses field data from some societies to examine the scope of human behavior and test hypotheses about human behavior and individuals’ culture.
• Customer engagement. It is the customers’ behavioral manifestation toward a brand or company, beyond the purchase, resulting from motivational drivers.
• Eye tracking. It is a technique that records the user’s eye movements while viewing a given scene or image.
• Fixations. These are periods of time during which a person’s gaze remains stationary.
• Lux. It is the International System of Units (SI) derived unit of illuminance and luminous emittance, measuring luminous flux per unit area.
• Saccades. These are rapid eye movements in which specific locations in a scene are projected onto the eye.
• Scanpath. It is a sequence of fixations and saccades that occur in relation to a given stimulus, such as an advert or product packaging.
• Top-down factors. These are previous ideas about the product that consumers already have. Top-down attentional processing requires the consumer to voluntarily search for and pay attention to specific information or characteristics.
• VIROG. Video-based infrared oculography is a method based on infrared corneal reflection. This technology measures the distance and the angle of the reflection of the light source in the center of the pupil.
• Visual attention. In cognitive psychology, visual attention is thought to operate as a two-step process. In the first step, attention is distributed uniformly over the external (visual) scene, and processing of information is performed in parallel. In the second step, attention is concentrated or focused on a specific area of the scene, and processing is performed in a serial fashion.
• Web 2.0. It is also known as the “participative Web” or “social Web” and refers to websites that emphasize user-generated content, participatory culture, and ease of use for end users. Interoperability, i.e., compatibility with other products, systems, and devices, is another characteristic of Web 2.0.
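The click-through rate defined in the index can be written as a one-line calculation. The following is a minimal sketch under stated assumptions: the function name `click_through_rate`, the sample figures, and the choice to return 0.0 when nobody has viewed the advertisement are all illustrative and are not taken from the study.

```python
def click_through_rate(clicks: int, views: int) -> float:
    """Return clicks divided by views; 0.0 when there are no views (an assumed convention)."""
    if clicks < 0 or views < 0:
        raise ValueError("clicks and views must be non-negative")
    if views == 0:
        return 0.0
    return clicks / views


# Hypothetical campaign: 12 clicks from 1,500 users who viewed the banner.
ctr = click_through_rate(12, 1500)
print(f"CTR = {ctr:.2%}")
```

Guarding against zero views avoids a division error for pages that have not yet been shown; other conventions, such as reporting the rate as undefined, would be equally defensible.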
Cross-References
• Advanced Web Technologies and e-Tourism Web Applications
• Digital Marketing in Tourism
• Log File Analysis
• User-Centered Design
References Abuín N (2008) La publicidad en periódicos electrónicos: Creación y evaluación de un modelo de eficacia. Universidad Complutense de Madrid. http://eprints.ucm.es/7958/. Accessed 15 Nov 2014 Baack DW, Wilson RT, Till BD (2008) Creativity and memory effects: recall, recognition, and an exploration of nontraditional media. J Adv 37(4):85–94 Baltas G (2003) Determinants of Internet advertising effectiveness: an empirical study. Int J Mark Res 45(4):505–513 Bayles ME (2002) Designing online banner advertisements: should we animate? In: Proceedings of the SIGCHI conference on human factors in computing systems, Minneapolis, Apr 2002. ACM Press, New York, pp 363–368 Bayles ME, Chaparro B (2001) Recall and recognition of static vs. animated banner advertisements. Proc Hum Factors Ergon Soc Annu Meet 45(15):1201–1204. SAGE Publications Benway JP (1998) Banner blindness: the irony of attention grabbing on the World Wide Web. Proc Hum Factors Ergon Soc Annu Meet 42(5):463–467. SAGE Publications Brajnik G, Gabrielli S (2010) A Review of Online Advertising Effects on the User Experience. International Journal of Human-Computer Interaction, 26(10):971–997 Burke M, Hornof A, Nilsen E, Gorman N (2005) High-cost banner blindness: ads increase perceived workload, hinder visual search, and are forgotten. ACM Trans Comput-Hum Interact (TOCHI) 12(4):423–445 Buswell GT (1935) How people look at pictures: a study of the psychology of perception in art. University of Chicago Press, Chicago Calder BJ, Malthouse EC, Schaedel U (2009) An experimental study of the relationship between online engagement and advertising effectiveness. J Interact Mark 23(4):321–331 Chandon JL, Chtourou MS, Fortin DR (2003) Effects of configuration and exposure levels on responses to web advertisements. J Advert Res 43(2):217–229 Chatterjee P (2008) Are unclicked ads wasted? Enduring effects of banner and pop-up ad exposures on brand memory and attitudes. 
Journal of Electronic Commerce Research: Online Advertising and Sponsored Search 9:51–61 Cheung CMK, Chiu PY, Lee MKO (2011) Online social networks: why do students use Facebook? Comput Hum Behav 27(4):1337–1343 Crane HD (1994) The Purkinje image eyetracker, image stabilization, and related forms of stimulus manipulation. In: Kelly DH (ed) Visual science and engineering: models and applications. Marcel Dekker, New York, pp 13–89 Crespo E (2011) Eficacia de la promoción de ventas on-line. Influencia del tipo de incentivo promocional y la experiencia de uso web. Departamento de Comercialización e Investigación de Mercados. Universidad de Granada Croson R, Schultz K, Siemsen E, Yeo M (2013) Behavioral operations: the state of the field. J Oper Manag 31(1–2):1–5 Cutrell E, Guan Z (2007) What are you looking for? An eye-tracking study of information usage in web search. In: Proceedings of the SIGCHI conference on human factors in computing systems. ACM, pp 407–416 Danaher PJ, Mullarkey GW (2003) Factors affecting online advertising recall: a study of students. J Advert Res 43(3):252–267 Djamasbi S, Siegel M, Tullis T (2010a) Generation Y, web design, and eye tracking. Int J HumComput Stud 68(5):307–323 Djamasbi S, Siegel M, Tullis T, Dai R (2010b) Efficiency, trust, and visual appeal: usability testing through eye tracking. In: System sciences (HICSS), 2010 43rd Hawaii international conference on IEEE, pp 1–10
Dos Santos Meirinhos G (2002) El tamaño y la posición de los web banners publicitarios y su recuperación de la memoria episódica. Un análisis desde el enfoque del procesamiento de la información, TDX (Doctoral thesis in Xarxa). http://www.tdx.cat/handle/10803/4127. Accessed 12 Nov 2014 Djamasbi S, Siegel M, Tullis T (2011) Visual hierarchy and viewing behavior: an eye tracking study. In: Human-computer interaction. Design and development approaches. Springer, Heidelberg/Berlin, pp 331–340 Drèze X, Hussherr FX (2003) Internet advertising: is anybody watching? J Interact Mark 17(4):8– 23 Duchowski A (2007) Eye tracking methodology: theory and practice, 2nd edn. Springer, London Edwards SM, Li H, Joo-Hyun L (2002) Forced exposure and psychological reactance: antecedents and consequences of the perceived intrusiveness of pop-up ads. J Advert 31(3):83–95 Ehmke C, Wilson S (2007) Identifying web usability problems from eye-tracking data, In: Proceedings of the 21st British HCI group annual conference on people and computers: HCI. . . but not as we know it-, vol 1. British Computer Society, pp 119–128 Gamito P, Rosa PJ (2014) I see me, you see me: inferring cognitive and emotional processes from gazing behavior. Cambridge Scholars Publishing, Newcastle upon Tyne Goldberg JH, Wichansky AM (2003) Eye tracking in usability evaluation: a practitioner’s guide. In: Hyona J, Radach R, Duebel H (eds) The mind’s eye: cognitive and applied aspects of eye movement research. Elsevier, Boston, pp 573–605. To appear Gong W, Maddox LM (2003) Measuring web advertising effectiveness in China. J Advert Res 43(1):34–49 Harmeling CM, Moffett JW, Arnold MJ, Carlson BD (2017) Toward a theory of customer engagement marketing. J Acad Mark Sci 45(3):312–335 Harrigan P, Evers U, Miles M, Daly T (2017) Customer engagement with tourism social media brands. Tour Manag 59:597–609 Hassan Montero Y, Herrero Solana V (2007) Eye-tracking en interacción persona-ordenador, No solo usabilidad, 6. 
http://www.nosolousabilidad.com/articulos/eye-tracking.htm. Accessed 10 June 10 2012 Heath R, Nairn A (2005) Measuring emotive advertising: implications of low attention processing on recall. J Advert Res 45:269–281 Hernández-Méndez J, Muñoz-Leiva F (2015) What type of online advertising is most effective for e-tourism 2.0? An eye tracking study based on the characteristics of tourists. Comput Hum Behav 50:618–625 Hervet G, Guérard K, Tremblay S, Chtourou MS (2011) Is banner blindness genuine? Eye tracking internet text advertising. Appl Cogn Psychol 25(5):708–716 Hill D (2003) Tell me no lIES: using science to connect with consumers. J Interact Mark 17(4):61– 72 Ho HNJ, Tsai MJ, Wang CY, Tsai CC (2014) Prior knowledge and online inquiry-based science reading: Evidence from eye tracking. Int J of Sci and Math Educ 12:525–554 Hollebeek LD, Glynn MS, Brodie RJ (2014) Consumer brand engagement in social media: conceptualization, scale development and validation. J Interact Mark 28(2):149–165 Hollis N (2005) Ten years of learning on how online advertising builds brand. J Advert Res 45:255– 268 Holmqvist K, Nyström M, Andersson R, Dewhurst R, Jarodzka H, van de Weijer J (2011) Eye tracking: a comprehensive guide to methods and measures. Oxford University Press, Oxford Hong W, Thong JY, Tam KY (2007) How do Web users respond to non-banner-ads animation? The effects of task type and user experience. J Am Soc Inf Sci Technol 58(10):1467–1482 Hsieh YC, Chen KH (2011) How different information types affect viewer’s attention on internet advertising. Comput Hum Behav 27(2):935–945 Hwang Y, Jeong J (2016) Electronic commerce and online consumer behavior research A literature review. Inf Dev 32(3):377–388 IAB Spain Research and The Cocktail Analysis (2009) Claves sobre la Interacción visual con la publicidad web. Aplicación de la técnica eye tracking. https://iabspain.es/wp-content/uploads/ Estudio_IAB_Eyetracking_junio_09v2_Modo_de_compatibilidad.pdf. Accessed 10 Nov 2014
International Telecommunication Union, ITU (2002) Methodology for the subjective assessment of the quality of television pictures, ITU-R BT.500–11 Javal L (1878) Essai sur la physiologie de la lecture. Annales d’Oculistique 79:97–117 Just MA, Carpenter PA (1980) A theory of reading: from eye fixations to comprehension. Psychol Rev 87(4):329–354 Kahneman D, Tursky B, Shapiro D, Crider A (1969) Pupillary, heart rate, and skin resistance changes during a mental task. J Exp Psychol 79(1):164–167 Krishnan HS, Chakravarti D (1999) Memory measures for pretesting advertisements: an integrative conceptual framework and a diagnostic template. J Consum Psychol 8(1):1–37 Kuisma J, Simola J, Uusitalo L, Öörni A (2010) The effects of animation and format on the perception and memory of online advertising. J Interact Mark 24(4):269–282 LaBerge D (1995) Attentional processing: the Brain’s art of mindfulness. Harvard University Press, London Lapa C (2007) Using eye tracking to understand banner blindness and improve website design. Doctoral Thesis, Rochester Institute of Technology Lee J, Ahn JH (2012) Attention to banner ads and their effectiveness: an eye-tracking approach. Int J Electron Commerce 17(1):119–137 Lindgaard G, Fernandes G, Dudek C, Brown J (2006) Attention web designers: you have 50 ms to make a good first impression! Behav Inf Technol 25(2):115–126 Lohtia R, Donthu N, Yaveroglu I (2007) Evaluating the efficiency of Internet banner advertisements. J Bus Res 60(4):365–370 Luque T (1997) Investigación de marketing. Fundamentos. Ed. Ariel, Barcelona Malhotra NK (1997) Investigación de Mercados. Un enfoque práctico, 2ªedn. Prentice Hall Hispanoamericana, México Malthouse EC, Calder BJ (2009) Media engagement is as important as advertising execution. In: Terlutter R, Diehl S, Karmasin M, Smit E (eds) Proceedings of the 8th ICORIA international conference on research. Alpen-Adria University, Klagenfurt Margarida A (2013) Do users look at banner ads on Facebook? 
J Res Interact Mark 7(2):119–139 Mele M, Federici S (2012) Gaze and eye-tracking solutions for psychological research. Cogn Process 13(1):261–165 Mendes-Filho LAM, Tan FB (2008) An overview on user-generated content and the empowerment of online travellers. FARN J 7(2):17–30 Mikalef P, Sharma K, Pappas I (2018) Social commerce and consumer search behavior: an eye-tracking study. In: Proceedings of the 24th Americas conference on information systems (AMCIS), New Orleans, pp 1139–1148. https://aisel.aisnet.org/cgi/viewcontent.cgi?article= 1374&context=amcis2018. Accessed 15 Jan 2019 Mosconi M, Porta M, Ravarelli A (2008) On-line newspapers and multimedia content: an eye tracking study. In: Proceedings of the 26th annual ACM international conference on design of communication. ACM, pp 55–64 Muñoz-Leiva F, Hernández-Méndez J, Sánchez-Fernández J (2012) Generalising user behaviour in online travel sites through the Travel 2.0 website acceptance model. Online Inf Rev 36(6):879– 902 Muñoz-Leiva F, Hernández-Méndez J, Gómez-Carmona D (2018) Measuring advertising effectiveness in Travel 2.0 websites through eyetracking technology. Physiol Behav 200(1): 83–95 Nielsen J (2007) Banner blindness: old and new findings. http://www.nngroup.com/articles/ banner-blindness-old-and-new-findings/. Accessed 11 Nov 2014 Nielsen (2014) The digital consumer. http://www.nielsen.com/content/dam/corporate/us/en/ reports-downloads/2014%20Reports/the-digital-consumer-report-feb-2014.pdf. Accessed 11 Nov 2014 Nihel Z (2013) The effectiveness of internet advertising through memorization and click on a banner. Int J Mark Stud 5(2):93–101 Noton D, Stark L (1971) Eye movements and visual perception. Sci Am 224:34–43 Owens JW, Chaparro BS, Palmer EM (2011) Text advertising blindness: the new banner blindness? J Usability Stud 6(3):172–197
Pagendarm M, Schaumburg H (2001) Why are users banner-blind? The impact of navigation style on the perception of web banners. J Digit Inf 2(1). Available at: http://journals.tdl.org/jodi/ article/view/36/38. Accessed 10 Nov 2014 Pan B, Zhang L (2010) An eye tracking study on online hotel decision making: the effects of images and number of options. In: Travel and tourism research, association annual conference, San Antonio, 20–22 June Pan B, Hembrooke HA, Gay GK, Granka LA, Feusner MK, Newman JK (2004) The determinants of web page viewing behavior: an eye-tracking study. In: Duchowski A (ed) Proceedings of eye tracking research & applications symposium (ETRA 04). ACM, New York, pp 147–154 Pieters R, Wedel M (2004) Attention capture and transfer in advertising: brand, pictorial, and textsize effects. J Mark 68(2):36–50 Putrevu S (2008) Consumer responses toward sexual and nonsexual appeals: the influence of involvement, need for cognition (NFC), and gender. J Advert 37(2):57–70 Rayner K (1998) Eye movements in reading and information processing: 20 years of research. Psychol Bull 124(3):372–422 Rayner K, Castelhano MS (2007) Eye movements during reading, scene perception, visual search, and while looking at print advertisements. In: Wedel M, Pieters R (eds) Visual marketing: from attention to action. Lawrence Erlbaum, New York, pp 9–42 Resnick M, Albert W (2014) The impact of advertising location and user task on the emergence of banner ad blindness: an eye-tracking study. Int J Hum-Comput Interact 30(3):206–219 Rizzolatti G, Riggio L, Sheliga BM (1994) Space and selective attention. Atten Perform 15:231– 265 Russo JE (1978) Eye fixations can save the world: a critical evaluation and a comparison between eye fixations and other information processing methodologies. 
Adv Consum Res 5(1):561–570 Salvucci DD, Goldberg JH (2000) Identifying fixations and saccades Sciences (HICSS), 2010 43rd Hawaii International Conference on IEEE, pp 1–10 Van Doorn J, Lemon KN, Mittal V, Nass S, Pick D, Pirner P, Verhoef PC (2010) Customer engagement behavior: theoretical foundations and research directions. J Serv Res 13(3):253– 266 Van Trijp HCM (2009) Consumer understanding and nutritional communication: key issues in the context of the new EU legislation. Eur J Nutr 48(1):41–48 Varadarajan R, Yadav MS (2009) Marketing strategy in an Internet-enabled environment: a retrospective on the first ten years of JIM and a prospective on the next ten years. J Interact Mark 23:11–22 Wedel M, Pieters R (2008) A review of eye-tracking research in marketing. Rev Mark Res 4:123–147 Wong W, Bartels M, Chrobot N (2014) Practical eye tracking of the ecommerce website user experience. In: Stephanidis C, Antona M (eds) Universal access in human-computer interaction. Design for all and accessibility practice. Springer International Publishing, Heraklion, pp 109– 118 Yaveroglu I, Donthu N (2008) Advertising repetition and placement issues in on-line environments. J Advert 37(2):31–43 Yoo CY (2008) Unconscious processing of Web advertising. Effects on implicit memory, attitude toward the brand, and consideration set. J Interact Mark 22(2):2–18 Zikmund WG (2003) Fundamentos de investigación de mercados. Thomson, Madrid León Sáez de Ybarra JL (2009) Nuevos Soportes y mercados de la publicidad digital. Transiciones y experiencias. Pensar la Publicidad. Revista internacional de investigaciones publicitarias 3(2):17–30
User-Centered Design
30
Hilda Tellioğlu
Contents
Introduction
Sociotechnology as a Principle
User-Centered Design
User-Centered Design Methods
Conclusions
Cross-References
References
Abstract
In this chapter, we will present the iterative user-centered design process as an adaptive and agile life cycle for open innovation and successful development of socio-technical systems. The user acceptance and usability of the technical system are central to the design process, which determines how to proceed at several stages of designing and evaluating. Besides an overview of the most accepted and well-established methods of design thinking, which are based on user-centered design principles and presented also in tables for easy accessibility, we will describe cultural probes, provocative requisites, design games, design workshops, sketches, wireframes, mockups, low- and high-fidelity prototypes, as well as technology probes in detail. This will enable, on the one hand, an understanding of the methods by different stakeholders in multidisciplinary project teams and, on the other hand, a base for deciding which methods and approaches are suitable for a specific phase or type of a project.
H. Tellioğlu
Faculty of Informatics, Institute of Visual Computing and Human-Centered Technology, Artifact-Based Computing and User Research (ACUR), Vienna University of Technology (TU Wien), Vienna, Austria
© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_122
Keywords
User-centered design · Design thinking · Iterative design · Sociotechnology · Design methods · Qualitative methods
Introduction
Due to digitization almost everywhere around us, we are dealing more and more with contradicting requirements for technology-based services provided by companies, educational institutions, or governmental bodies. Design with users, design for users, and voice of the customer techniques have become very important in industry practice, especially where user acceptance is concerned. Methodologically, everything is possible, but not everything is successful. We need the right approach for a user-centered development of innovative products. To avoid the gap between the use and design of systems, the sociotechnology framework can be utilized as a guiding approach (Emery and Trist 1960). Based upon the principles of participation at all stages of development processes, user-centered methods have proved to be very useful means of facilitating open and cooperative settings. Co-creation and mutual understanding among users and designers make it possible to design and develop successful systems that are acceptable to their users in their shape, look and feel, and functionality. This is a promising way to create sustainable systems for (real) use.
First of all, we have to understand why we should focus on users in design processes (Ritter et al. 2014). The answer is very simple: We want to create a system or technology that is intended for human use, no matter how much artificial intelligence is helping to carry out certain tasks automatically or semiautomatically in the background, that is, non-transparently. We want to design and develop an effective, safe, efficient, scalable, and, most important of all, enjoyable and usable system for people. That is why we need to understand users, their characteristics, skills, prior experiences, commonalities, differences, and the use context in which they will perform certain tasks when using the system or technology we provide.
Understanding users entails specific methodological knowledge and skills on the part of designers, focusing on how to get to know the potential users of an intended system and, furthermore, how to involve them intelligently in the design process as much as possible. Understanding in order to inform the design for users includes the following actions (Ritter et al. 2014, p. 4):
• Knowing how to observe and document what people do, by using appropriate methods
• Understanding why people do what they do, by gaining insights about people’s intrinsic and extrinsic motivations
• Understanding and predicting when people are likely to do things, by identifying people’s behavioral patterns
• Understanding how people choose to do things the way they do them, by studying the options people have as well as the constraints and resources that are given
All these actions require certain knowledge and skills on the part of designers to establish the appropriate methodology, research, and design setting at the right time in the design process and to carry out the necessary design activities, including the above-mentioned factors for understanding users. User-centered design (UCD) helps to achieve not only an understanding of users but also their involvement throughout the whole design process. The consideration of human characteristics and capabilities as central factors in the design process (Preece et al. 2015) facilitates the creation of better accepted, actually used, and sustainable systems. Successful systems are the ones that go beyond individuals’ requirements and capabilities by also explicitly considering the social interactions and environments of their users. Here, sociotechnology is the framework needed as a basis for the design.
Addressing all these aspects in a design process is not easy. We need to know what to ask, and when, in the course of design projects. For instance, we have to find out who is going to use the system or technology and why. What are the goals of users when using the system or technology? Are users willing to put effort into learning how to use the system or technology? How often are they going to use the system or technology? Will they use it alone or with others? Besides the who question, we have to ask why, how, and when the use of the system or technology will occur. These questions are central, especially during the evaluation and experimentation stages with potential users of the system or technology.
UCD methods like brainstorming, storyboarding, creating cultural probes, use scenarios and personas, mockups, and low- and high-fidelity prototypes and then later, as the design process progresses, user tests, thinking aloud evaluation sessions, focus groups, etc. are very useful for answering these questions. In the next section, we will first summarize the principles of sociotechnology to provide the base for a user-centered design process. Second, we will introduce the user-centered design approach with its methods by showing their characteristics, especially in innovative design processes. Besides an overview of the most accepted and well-established methods of design thinking, which are based on user-centered design principles, we will describe cultural probes, provocative requisites, design games, design workshops, sketches, wireframes, mockups, low- and high-fidelity prototypes, as well as technology probes in detail. This will enable, on the one hand, an understanding of the methods by different stakeholders in multidisciplinary project teams and, on the other hand, a base for deciding which methods and approaches are suitable for a specific phase or type of a project.
Sociotechnology as a Principle
Emery and Trist (1960) introduced the term socio-technical systems. These are systems that involve a complex interaction between humans, machines, and the environment. The goal is to consider people, machines, and context when
designing and developing such systems. Badham et al. (2000) described socio-technical systems with five characteristics:
• Systems should have interdependent parts.
• Systems should adapt to and pursue goals in external environments.
• Systems have an internal environment comprising separate but interdependent technical and social subsystems.
• Systems have equifinality. In other words, system goals can be achieved by more than one means. This implies that there are design choices to be made during system development.
• System performance relies on the joint optimization of the technical and social subsystems. Focusing on one of these systems to the exclusion of the other is likely to lead to degraded system performance and utility.
Baxter and Sommerville (2011) introduced the term socio-technical system engineering to address the need to deliver the expected support for real work in organizations. By socio-technical system engineering, they mean “the systematic and constructive use of socio-technical principles and methods in the procurement, specification, design, testing, evaluation, operation and evolution of complex systems” (p. 4). It is still a common problem that systems often meet their technical requirements but are seen by their users as failures because they do not deliver the expected support for real use. To avoid such failures in system engineering, socio-technical principles and methods should be used in design and engineering processes. In this chapter, we show how this approach can be integrated into (user-centered) design processes. First, we introduce user-centered design as a process, and then we describe its methods for different goals and activities in design.
User-Centered Design
User-centered thinking is about creating a direct link between the current and future users (Baek et al. 2008; Wallach and Scholz 2012). Gould and Lewis (1985) defined three principles for a user-centered design process:
• Early focus on users and tasks, to gather knowledge about the social, cultural, personal, and all other types of characteristics of users
• Empirical measurement, gained by capturing and analyzing user feedback
• Iterative design, based on iterations after each round of user feedback
The iterative process of user-centered design makes it possible to approach a final product step by step, reducing development risks and avoiding the need to discard large parts of the components or results already achieved (Fig. 1). Design thinking was introduced as a cognitive process of designers almost two decades ago (Cross et al. 1992; Eastman et al. 2001). The goal was to understand design creativity and to improve design thinking abilities. Today, design thinking is defined as “a complex thinking process of conceiving new realities, expressing
Fig. 1 The iterative process of user-centered design (ISO 9241-210:2010)
the introduction of design culture and its methods into fields such as business innovation” (Tschimmel 2012, p. 2). The most popular design thinking models are the 3 I Model (Inspiration, Ideation, Implementation) by IDEO (2001) (Brown and Wyatt 2010, 33ff); the HCD Model (Hearing, Creating and Delivering), again by IDEO; the model of Understand, Observe, Point of View, Ideate, Prototype and Test by the Hasso Plattner Institute (Thoring and Müller 2011); the 4 D or Double Diamond design process model (Discover, Define, Develop, Deliver) by the British Design Council (2005); and the Service Design Thinking Model (Exploration, Creation, Reflection, Implementation) by Stickdorn and Schneider (2010).
Besides involving users in design processes, we believe that design thinking is a very helpful approach to designing socio-technical systems. “Design thinking is a human-centered approach to innovation that draws from the designer’s toolkit to integrate the needs of people, the possibilities of technology, and the requirements for business success” (Tim Brown, IDEO). The exploration of the role and potential of design thinking within organizations has changed the original objective of this research (Brown 2009; Martin 2009). So, “design thinking is not only a cognitive process or a mindset, but has become an effective toolkit for any innovation process, connecting the creative design approach to traditional business thinking, based on planning and rational problem solving” (Tschimmel 2012, p. 2). This has shifted design thinking more and more from the design disciplines to the fields of management and marketing. In this chapter, we use design thinking as a framework providing a set of methods that are used in user-centered design processes. If we take design thinking seriously as an approach and apply (all) its methods thoroughly throughout the whole design process, we can easily achieve the goal of understanding everyday practice and its actors.
This would lead to system designs that consider the context
of use, user experiences, and the needed technology support. Our objectives in designing systems are to be innovative and to improve the user experience. We think this can be done only by understanding the actors, their actions, and their use context and, of course, by including them as experts in the design process. In this section, we will explore the methods (see also Chap. 27 “Tourism Design: Articulating Design Beyond Science”) that are needed to set up and carry out a user-centered design process. Artifacts created in each step of the process are both enablers and hosts of the evolving design ideas. In the course of design processes, especially if they are user-centric, several models are created (Tellioğlu 2016, p. 24) (Fig. 2). In the user-centered design process, contextual inquiry, capturing ludic experiences, and use context definition are needed to describe and understand the context of use and, furthermore, to specify the users’ requirements. Later on, designing and evaluating solutions for users require the steps of interactive user-centered system and product design. The following models help to support these processes (see also Chap. 35 “User Modelling in E-Tourism: A Human-Computer Interaction Perspective”):
• Use models are personas, scenarios, use cases, flow models, storyboards, and narrative posters, mainly presented as models and descriptions using a standard modeling language like UML (Unified Modeling Language). The aim of these models is not only to detail and describe the design for the design team but also to make it accessible to others who are not involved in the design process. Use models help to address several requirements and answer the following design and specification questions: Who are the final users? What are the interaction and interface elements? What do the layout, user interface, and interaction look like? What does the user do? What does the product do? What are the use scenarios and use contexts? What are the use qualities? What are the specifics of the product? What is the positive impact of the product? In which cases does it help users? What are the features of the product? Who would like to have the product? Is it feasible?
• System models are interface and interaction visualizations, technology probes, as well as (high-fidelity) executable 2D or 3D prototypes showing what the original idea looks like in action in the envisioned context. Interaction models are product descriptions and presentations with final corporate identity elements, demonstrating the use and features of the product, pricing, and measures for dissemination. They show the idea of the final product or service by referring to its technology features, interfaces, architectural elements, or its real-time use. System and interaction models help designers to deal with the following (re)design, interaction, and evaluation questions: What types of layout elements are needed for surfaces, interfaces, colors, etc.? What are the dimensions and scales of the product? Are there variations of the design? What are the functions that are usable and show affordance? How are ergonomic factors considered in the design? Which technologies should be used to implement the idea? Which interactions are implemented? Which part will be implemented with Wizard of Oz (a method
Fig. 2 User-centered design in relation to use, system, and interaction models and assigned methods
that asks users to interact with a system that they believe to be autonomous but which is in reality controlled by a human)? Which materials, tools, hardware, etc. will be used? What are the sketches, wireframes, technology probes, and prototypes? What is the final product? Are there different visualizations? How is the product documented? What are the user references and technical documents of the product? How are intermediaries or the final product evaluated? How is the evaluation set up, and what are the points to evaluate? How are evaluation results translated into (new) requirements and changes to the existing requirements?
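Use models such as personas and scenarios are usually written as prose or drawn in UML, but they can also be kept as small structured records so that a design team can check them for gaps. The following is only a minimal sketch under that assumption: the class names, the field names, and the example data (Anna, Ben, the booking scenario) are invented for illustration and do not come from the chapter or from any modeling standard.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """Hypothetical persona record; the fields are illustrative assumptions."""
    name: str
    role: str  # "primary" or "secondary"
    expectations: list[str] = field(default_factory=list)
    limitations: list[str] = field(default_factory=list)


@dataclass
class Scenario:
    """Hypothetical use scenario linking personas to a context and actions."""
    title: str
    context: str
    actors: list[str]  # names of the personas taking part
    actions: list[str] = field(default_factory=list)


def uncovered_personas(personas: list[Persona], scenarios: list[Scenario]) -> list[str]:
    """Return the names of personas that do not appear in any scenario."""
    covered = {actor for scenario in scenarios for actor in scenario.actors}
    return [p.name for p in personas if p.name not in covered]


# Invented example data, for illustration only.
personas = [
    Persona("Anna", "primary", ["book a trip quickly"], ["small phone screen"]),
    Persona("Ben", "secondary", ["compare offers"]),
]
scenarios = [
    Scenario("Booking on the go", "commuter train", ["Anna"],
             ["open app", "search destination", "confirm booking"]),
]

print(uncovered_personas(personas, scenarios))  # ['Ben']
```

A check like this only flags personas for which no use scenario has been written yet; it does not replace the qualitative work of creating and discussing the models with users.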
User-Centered Design Methods
In this chapter, only those methods are presented in detail that are relevant for user involvement in the design process. These are the very first idea, observations with video analysis, cultural probes, provocative requisites, design games, scenarios, design workshops, sketches, wireframes, (video) mockups, prototypes, technology probes, and product and corporate identity. Others, which are mainly carried out by the design team without interacting with users directly, like literature review, interviews with experts, and narrative posters, are left out for space reasons (for details see Tellioğlu 2016).
The very first idea is a crucial step in a design process (Table 1). There is always an idea that drives the design process. In most cases, the idea is vague and needs a lot of elaboration, which happens at different design stages in a project. The main method applied in that case is brainstorming, which needs to be well documented and archived. All associations, even those not used in the current idea development or later in the project, need to be kept in a retrievable form for later access and for design discussions, e.g., in a design workshop, in case some options or ideas need to be reopened for further consideration during the project.
Table 1 The very first idea as initiator of a design project
The very first idea
Description: Brainstorming the very first associations and impressions in team
Goal: Gather all associations and possible ideas based on images and impressions
Type: Idea generation
Example of use: Very early stages of idea generation or orientation
Steps:
• After a discussion in the project team, create associations to the subject
• Brainstorm (brain write and brain sketch) different ideas, also by using images or impressions from media
• Map all results in a shared representation of ideas and associations
• Start thinking further on ideas following up the collected associations and impressions
• Document all relevant data for further reference
Table 2 Observations with video analysis to understand real use scenarios
Observations with video analysis
Description: Creating views of real scenarios
Goal: Observe and understand the context, use, or other aspects of design; analyze interactions, requirements, evaluation, or usability; document or illustrate
Type: Data inquiry and recording method
Example of use: Workplace studies
Steps:
• Find the place and/or people for the observation
• Set up the video equipment
• Carry out the observation and recording; if needed, repeat it several times or change the location and time of recording
• Analyze the material qualitatively and quantitatively, select relevant parts, and create a short video to visualize the most relevant observations for further communication in teams
It is not enough to imagine how a system will be used; it is necessary to observe how the targeted users actually use it. And not everyone is the same. Tests make sure that designers understand their users and their use behaviors in a specific context with a specific system or technology. Observations with video analysis help to understand real use scenarios (Table 2) (see also Chap. 32 “Mobile Ethnography in Tourism and Hospitality: Concept, Tools, and Applications”). Before starting to develop solutions, the current situation needs to be captured in detail. How are people acting now? What tools are they using? Are there patterns of acting, individually or cooperatively? What types of help do people need and get? Where are the problems or gaps in handling? What can be the reasons for these problems in general and in specific situations? These questions and several more can be answered by just observing people in the focal environment, such as a workplace, for which a design idea should be generated and further developed. This method creates a lot of video and audio material that must be analyzed afterward and will probably be used to create short clips illustrating certain situations to be addressed in design teams. The effort needed for this subsequent work should not be underestimated, but it is worthwhile. Designers should watch for the point at which actions and occurrences start to repeat and stop observing the setting then, to avoid unnecessary additional screening work.
Cultural probes are very playful and multifaceted (Table 3). This is the beginning of creating first ideas using colors or other aesthetic elements in the design process. These elements are mainly used in the design of the probes, but they also include several parts, which are normally kept for the further design of the system.
On the other hand, designers consider important questions they might have thought of after analyzing the videos and identifying the problems they want to tackle with the design. These questions and other additional data about the users and use scenarios can be captured by cultural probes in an asynchronous way (Beyer and Holtzblatt 1998), normally within at least 2 weeks. Users get the probes and fill in their data, which are then handed over to the designers for further analysis. The handing over
Table 3 Cultural probes to understand the cultural context of the use and users
Cultural probes
Description: Understanding the cultural context of future users
Goal: Create qualitative approach to understand the user, inspire the design functionally and aesthetically, evoke creative reaction of (potential) users, support the creation of design material
Type: Experimental research method
Example of use: Early stages of user-centered design processes
Steps:
• Define data to inquire via the probe
• Design probe elements; consider corporate identity, visuals, sounds, tangible elements, and texts
• Create a cultural probe package for distribution
• Distribute the probe to users
• Analyze the data gathered in probes: qualitative; ask for clarification if needed; compare; extract particular occurrences including emotions, ideas, and inspirations
• Document the analysis and comparison without interpretation
step is normally accompanied by a short interview in which designers clarify open questions or misunderstandings directly with the originator of the data. Cultural probes have to provoke inspirational responses from the target group in diverse cultural settings (Gaver et al. 1999, p. 22). It is a responsive form of experimental design. Cultural probes host a personal dialogue between designers and target user groups: the probes act as interventions that provoke in both directions, conveying the design ideas to the target user group and conveying the user groups’ ways of doing or living to the designers, in the form of visual rather than textual material.
Provocative requisites are characterized by their capacity for provocation and their experimental nature (Table 4). Through the ambiguity of the designed requisite, users’ attention is drawn to certain playful and provocative statements or questions, usually presented in a public space to make the requisite accessible to people. Designers capture the user reaction with the help of video and audio recordings, which then need to be analyzed. Everything integrated into the provocative requisite has to relate to an aspect of the design idea, including optional aspects or questioned interaction elements connected to the design artifact. Parts of the requisite might help to evaluate some of the features of the design, probably at a very early stage, like layout, interaction, sound, or visual elements of the design.
A design game enables exploration of imaginaries, interaction elements, and empathic aspects of co-creation or co-design (Vaajakallio and Mattelmäki 2014) (Table 5). It is based on the assumption that designing is a social process consisting of interrelated communication, negotiation, and compromise. The process is as important as the final product (Brandt 2006). User participation is prepared with certain pre-structured settings and tasks.
The result of the interactions is not a system or a design artifact but rather a co-created understanding of a context, user experiences, possible design ideas, and ideal situations or “dreams.” Through gaming, an exchange between users and designers can take place that also has a
Table 4 Provocative requisites to provoke and activate the potential users
Provocative requisites
Description: Provocation, ambiguity, inspiration in context
Goal: Represent a design idea creatively and playfully, question and discuss the design ideas by letting them experience in context, create inspiration for design
Type: Experimental research method
Example of use: Dealing with ambiguity and dubiety of the idea
Steps:
• Define a situation, a scenario, or a context for the requisite
• Design the requisite, populate it with data, play it, or set it up
• Observe the requisite in action
• Document the scenario, the observation, the interaction with the requisite
• Analyze and explain in relation to the design idea
Table 5 Design games as playful ways to design ideas
Design games
Description: Playful way to gain design ideas
Goal: Generate design ideas, concretize a design idea in form of a party game, play different options of interaction, experiment with use and functionality of design elements
Type: Design creation method
Example of use: Create playful elements of a design idea
Steps:
• Define the goal of a design game
• Document the process of creation of the design game, also the dismissed ideas
• Describe the game with all elements (props, content, rules, etc.)
• Play several times with the design game
• Document the games played, describe how it was perceived by the players
• Adapt the game if necessary
learning effect. The game elements and interaction mechanisms enable players to take roles in the game and, through this, to experience other perspectives. These elements in turn stimulate the creativity of the players. Innovative ideas may emerge and be developed cooperatively in the game (Pedersen and Buur 2000). Design games help to create decision-making situations that are very close to reality. Discussions and negotiations might occur in such situations, which are framed by rules, regulations, or conventions to keep the focus on the goals of the game. At the same time, all regulatory measures are loose enough to leave room for improvisation or for innovative combinations of ideas. A design game normally consists of a game board, several cards, dice, meeples, paper-based material, or additional digital elements. The game has to be played several times to get fruitful feedback from the players, who must be selected carefully. The games must be recorded for further detailed analysis. Design games are bridges between past experiences, current subjects, and future visions (Sanders and Stappers 2012). To create scenarios that describe intended use
Table 6 Scenarios to set up the use context
Scenarios
Description: Scenarios of use context with personas and actions
Goal: Identify problems and search for solutions in certain settings, provoke ideas
Type: Experimental research method
Example of use: Product design, interaction design
Steps:
• Define the goal, context, prerequisites, actors as personas, interactions, and processes of a scenario
• Start with a rough scenario
• Observe and play the scenario, analyze, and refine it
• Create a positive scenario: adapt the scenario until it no longer contains any negative aspect
• Create a negative scenario: adapt the scenario until it no longer contains any positive aspect
• Analyze the results and their impact on the design idea
• Document all actions and results
situations (Brandt 2006), design games are used by designers as tools, by game players as ways of (design) thinking, and by designers of the design game as structures with concrete design materials (Vaajakallio and Mattelmäki 2014).
Scenarios are used in UCD to support the tensions between reflection and action, as well as between typical and critical situations (Bødker 1999) (Table 6). Over the last two decades, scenarios containing several use cases have become more abstract, have come to pre-direct users’ actions, and support prototyping in which users are very much involved. Later on, the role of scenarios in design processes evolved (Bardram 1998; Bødker and Christiansen 1997; Carroll 1995; Kyng 1995): they became “the basis for overall design, technical implementation, co-operation within design teams, and across professional boundaries, e.g., between users and designers or between usability people and technical designers and implementers” (Bødker 1999, p. 2). The purpose of a scenario depends on two factors: the type of situation the scenario is dealing with and the type of design situation that the constructors want to support. There are several reasons why scenarios should be used in design processes: use and exploration scenarios (Kyng 1995) help to present, situate, and illustrate solutions; explanation scenarios, on the other hand, help to identify potential problems. If designers want broad and conceptual answers, they need open-ended scenarios. To gain (more) detailed, specific answers, closed scenarios are very useful. Usually, scenarios are designed based on knowledge about typical ways of doing things. Nevertheless, it is also important to address specific, critical instances of the typical. Scenarios can provoke new ideas and are, therefore, valuable in design processes. With small scenarios, prototypes can be evaluated in a structured way.
To evaluate prototypes that evolve vertically and horizontally during the design process, scenarios must be moved from typical ones to critical ones. Besides use cases, scenarios contain personas. Personas (Cooper et al. 2007) represent potential target users of a system. They are created after studying the
potential users intensively. Non-personas show the limits of a system by identifying who would not be addressed by the design. The number of personas should be limited to primary and secondary personas. Personas, if used actively and consistently throughout the whole design process, ensure clear communication between the design team members and a focus on the intended design. Personas must include the characteristics of the persons they represent; their experiences, expectations, and limitations; their life or work situation, as far as it is relevant for the design; and some unique quotes showing the main design aspect as formulated by the persona themselves (Calabria 2004; Grudin and Pruitt 2002). To make the best use of scenarios in user-centered design processes, they must be attuned to the particular use purposes and must be very selective based on these purposes (Bødker 1999). To conclude, scenarios mediate the thinking and communication that take place in design processes.
Design workshops are creative meetings of design teams. They can be used at different stages of a design project (Table 7). At an early stage, design ideas can be generated cooperatively in a small heterogeneous group, usually in an inspiring room populated by several artifacts and materials that encourage the participants to interact with each other or to improvise and use the material around them to illustrate their ideas. At a later stage in a design process, such workshops can be very useful to detail the interaction with the mockup or prototype, by considering users’ requirements and abilities to use the system under development. Design workshops focus on a small number of features or properties of a system. Workshops need to be prepared and documented properly in order to benefit from all the creative ideas that come up during the workshop, even if not all of them are used in the current project.
Table 7 Design workshops as settings for designing collaboratively
Design workshops
Description: Being creative and exploring design ideas in teams, or exploring options for systems design
Goal: Communicate different views on the design idea in a group, generate new ideas in a team, discuss different perspectives on the design on the table in a group, explore different options for systems design based on a decided idea at a later stage of a design process
Type: Design in teams
Example of use: Create common understanding of a (rather complex) design idea in a team, e.g., in product design
Steps:
• Define the goal of the workshop
• Select the participants of the workshop and define their role
• Set up a place, date, and process for the workshop
• Prepare the necessary material like models, plans, creative material, etc. as well as devices for audio/video recording and photos
• Carry out the workshop: introduction of participants and process, brainstorming related to the defined goal, working on different ideas, discussion and refinement of ideas that came up during the workshop
• Identify and document results of the workshop
Table 8 Sketches as first visualizations of interface elements or ideas
Sketches
Description: Creating the first low-fidelity design artifacts
Goal: Sketching the design ideas, for an overview but also for details
Type: Design generation and evaluation method
Example of use: User-centered design projects, prototyping
Steps:
• Create sketches of interaction, with different details
• Compare and update sketches, explain their use
• Evaluate critical sketches with users
• Document the evaluation results
Design workshops are also places in which design artifacts are generated, with different fidelity levels depending on the design progress. Design artifacts evolve through iterations while sketching, wireframing, creating (video) mockups, and prototyping.
Sketching is a quick, easy, and cheap way of generating and discussing design ideas and concepts (Table 8). Sketches are simple, ambiguous, inspiring, and by their nature never finished (Buxton 2007). Even if they are seen as “throwaway” artifacts, they can be used later on in a design project, e.g., to try out alternatives or combine different ideas. Sketches are identifiable as such and show the form and function of the artifact or idea they represent. They are a somewhat vague, low-fidelity representation, intended only for the current stage in argument building. The fidelity must be at the right level: with too little, the argument might be unclear; with too much, the argument might be overdone, or the idea may appear already decided or completely worked out rather than suggested.
Wireframes are documents showing the design structures, hierarchies of information, elements of control, and contents (Table 9). They contain specifications, notes, metadata, navigation, and the interplay of interface elements. They play an important role in product design because they help to convey features of the product and the technological and business logic. Wireframes are blueprints of the product’s functionality and can be of different types depending on their use and on who uses them: almost a prototype of input and output interfaces and interactions (Saffer 2010), or a basic setup of interface elements without deeper functionalities. Wireframes can be reference zones, low- or high-fidelity wireframes, storyboards, standalone, or specifications, to name a few.
(Video) Mockups are the first artifacts showing the look and feel of the design idea (Table 10).
They extend wireframes with experiential components, e.g., colors, graphics, and materials, without straying too far from the wireframes’ definitions. Mockups are dummies, that is, scaled models or representations made to be presented to the users and other stakeholders. The input and output interfaces or interaction elements of mockups are prototypical without deeper functionalities. The main goal of mockups is to check requirements with users during the development process and to confirm decisions made so far on interaction and interface elements of the product.
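Because wireframes capture navigation as well as structure, a simple consistency check can be run on them before they are evaluated with users. The sketch below assumes, purely for illustration, that the navigation of a set of wireframes has been written down as a mapping from each screen to the screens it links to; the screen names and the function `unreachable_screens` are hypothetical and not part of any wireframing tool.

```python
from collections import deque


def unreachable_screens(links: dict[str, list[str]], start: str) -> list[str]:
    """Return the screens that cannot be reached from `start` via the links."""
    seen = {start}
    queue = deque([start])
    while queue:
        screen = queue.popleft()
        for target in links.get(screen, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # Any declared screen that was never visited has no navigation path to it.
    return sorted(set(links) - seen)


# Invented navigation structure, for illustration only.
wireframe_links = {
    "home": ["search", "profile"],
    "search": ["results"],
    "results": ["home"],
    "profile": [],
    "help": ["home"],
}

print(unreachable_screens(wireframe_links, "home"))  # ['help']
```

In this invented example, the help screen links back to the home screen, but no screen links to it, which is the kind of gap that linking all parts of the system with the wireframes is meant to reveal.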
Table 9 Wireframes as blueprints of the product’s functionality
Wireframes
Description: Creating more linked and organized design artifacts
Goal: Design structures, control elements, contents as a blueprint
Type: Design generation and evaluation method
Example of use: User-centered design projects
Steps:
• Create wireframes to cover all parts of the system
• Link all parts of the system with the wireframes, including the navigation
• Evaluate the wireframes with users
• Document the evaluation results
• Update the wireframes based on the evaluation results
Table 10 (Video) Mockups as first implementations of the look and feel of the design idea
(Video) Mockups
Description: Creating the first prototypical systems
Goal: Create look and feel of the design with visual and audio elements
Type: Design generation and evaluation method
Example of use: User-centered design projects
Steps:
• Create mockups to visualize the look and feel for the interaction with the system
• Use video and audio elements if needed
• Evaluate the (video) mockups with users
• Document the evaluation results
• Update the mockups based on the evaluation results
A prototype is a simple and partly executable version of the product in development (Table 11). “In fact, a prototype can be anything from a paper-based storyboard through a complex piece of software, from a cardboard mockup to a modeled piece of metal” (Preece et al. 2002). In a prototype, the look and feel of the final product must be well represented to make the interaction with it possible and the evaluation realistic enough for the final product development. If all interactions with the product are implemented in a prototype, we speak of horizontal prototypes, whereas if only one specific interaction is implemented completely, we speak of vertical prototypes (Moggridge 2006). The characteristics of a prototype can also be described with two terms: fidelity and resolution. Low-fidelity prototypes are created in a quick-and-dirty manner, are used mainly for early validation, enable open discussions, and require prompting. High-fidelity prototypes, on the other hand, represent sharp opinions and concrete ideas and are self-explanatory and well refined. Low-resolution prototypes contain fewer details, focus on the core interaction, are created in a quick-and-dirty manner, and are used for early validation. In contrast, high-resolution prototypes show more details, focus on the whole, are well refined, and show concrete ideas.
Technology probes help to find out the feasibility of the technology chosen for the final product (Table 12). They might be a complex or simple implementation
Table 11 Prototypes as first impressions of the executable product at an early stage
Prototypes
Description: The first impression of the last design step
Goal: Create 2D, 3D, or executable prototypes to illustrate the idea as an interactive artifact
Type: Executable design generation and evaluation method
Example of use: Product design
Steps:
• Gather all positively evaluated design ideas
• Define the most important functions of the system under development
• Implement a prototype by focusing on the selected functions and applying look and feel from the positively evaluated mockup
• Evaluate the prototype and update it
• Describe the final prototype
Table 12 Technology probes to measure the feasibility of the technology selected
Technology probes
Description: Getting a hint about real-life interaction
Goal: Examine and experiment with a challenging technology implementation as a possible solution to the design idea
Type: Experimental technology application in field
Example of use: User-centered design projects
Steps:
• Select a relevant technology for the system under development
• Select an interaction aspect of the system under development
• Set up the technology infrastructure and implement the selected interaction
• Evaluate the technology probe with users
• Document the evaluation results by stressing the pros and cons of the technology selected for evaluation
of a future technology that is planned to be used in the final product. Technology probes “... are a particular type of probe that combine the social science goal of collecting information about the use and the users of technology in a real-world setting, the engineering goal of field-testing the technology, and the design goal of inspiring users and designers to think of new kinds of technologies to support their needs and desires” (Hutchison et al. 2003). So, they offer a means of gathering and testing interactions, technologies, and user feedback. They can be deployed in real use environments, where they deliver real data (e.g., through logging) for further developments in the project (Fitton et al. 2004). In this way, they play a very relevant role in decision-making about the technology and interactions of the prototype and, later on, of the product.
At the end of a user-centered design process, the product and corporate identity need to be finalized (Table 13). The product and corporate identity definition is concerned with illustrating the product vision, including all of its interactions, physical properties, surrounding systems, marketing strategy, target groups, and target markets. It shows the product as a whole (e.g., purpose, cost, usage,
Table 13 Product and corporate identity as the design of the executable product
Product and corporate identity
Description: Designing the whole story
Goal: Define and present the product as a result of the whole design process
Type: Product and context definition
Example of use: Product design
Steps: (1) Define all interactions, functions, and interrelated systems, including hard facts like costs and target users; (2) Finalize the visual and technical design of all product components; (3) Describe the use and administration of the product with a guide or handbook; (4) Design a corporate identity for the product, apply it for its presentation; (5) Create a product folder with all data relevant for target stakeholders
handling, maintenance, etc.) and the environment it is embedded in. It is defined in a clear product language following a defined product vision, corporate design, communication, behavior, and identity, containing aesthetics, target group features, logo, materials, color scheme, etc. (Balmer 2001; Van Riel and Balmer 1997).
Conclusions
In this chapter, we introduced user-centered design as a dynamic multidimensional process that draws on several design thinking methods and artifacts. The whole design process is an iterative circle of intertwined factors, namely people (users, designers, other stakeholders), particular design phases, and artifacts as intermediaries or final results that represent certain design aspects and parameters. The iteration of a UCD process is accompanied by user studies for design and for evaluation, which methodologically need different approaches in each design phase. In UCD projects, usability studies have to be seen as integral parts of design processes. This turns usability studies into activities that are responsible for the product and its future use (Bødker 1999). The methods presented in this chapter show how a UCD process can be established and how a design process can evolve from the very beginning until the definition and presentation of the product design. We showed the relevance and use of single methods by stressing their strengths and weaknesses. There is no strict rule saying that all the methods presented here should be used in every type of design project. Each project is unique, and designers have to select the most suitable methods for their particular projects. This chapter aims only to show possible and useful ways of doing user-centered design.
Cross-References
Tourism Design: Articulating Design Beyond Science
User Modelling in E-Tourism: A Human-Computer Interaction Perspective
References
Badham R, Clegg C, Wall T (2000) Socio-technical theory. In: Karwowski W (ed) Handbook of ergonomics. Wiley, New York
Baek EO, Cagiltay K, Boling E, Frick T (2008) User-centered design and development. In: Spector JM, Merrill MD, van Merriënboer J, Driscoll MP (eds) Handbook of research on educational communications and technology, 3rd edn. Lawrence Erlbaum Associates, New York/London, pp 659–670
Balmer J (2001) Corporate identity, corporate branding and marketing – seeing through the fog. Eur J Mark 35(3/4):248–291
Bardram J (1998) Scenario-based design of cooperative systems. In: Proceedings of COOP’98, Cannes
Baxter G, Sommerville I (2011) Socio-technical systems: from design methods to systems engineering. Interact Comput 23:4–17
Beyer H, Holtzblatt K (1998) Contextual design. Defining customer-centered systems. Morgan Kaufmann Publishers, San Francisco
Bødker S (1999) Scenarios in user-centered design – setting the stage for reflection and action. In: Proceedings of the 32nd Hawaii international conference on system sciences. IEEE
Bødker S, Christiansen E (1997) Scenarios as springboards in design. In: Bowker G, Gasser L, Star SL, Turner W (eds) Social science research, technical systems and cooperative work. Erlbaum, Mahwah, pp 217–234
Brandt E (2006) Designing exploratory design games: a framework for participation in participatory design? In: Proceedings of the 9th participatory design conference
Brown T (2009) Change by design. How design thinking transforms organizations and inspires innovation. Harper Collins Publishers, New York
Brown T, Wyatt J (2010) Design thinking for social innovation. Stanford Social Innovation Review, pp 31–35
Buxton B (2007) Sketching user experiences, getting the design right and the right design. Morgan Kaufmann, Amsterdam
Calabria T (2004) An introduction to personas and how to create them. https://www.steptwo.com.au/papers/kmc_personas
Carroll JM (ed) (1995) Scenario-based design. Envisioning work and technology in system development. Wiley, New York
Cooper A, Reimann R, Cronin D (2007) About face 3: the essentials of interaction design. Wiley
Cross N et al (eds) (1992) Research in design thinking. Delft University Press, Delft
Eastman C et al (eds) (2001) Design knowing and learning: cognition in design education. Elsevier Science Ltd., Oxford
Emery FE, Trist EL (1960) Socio-technical systems. In: Churchman CW, Verhulst M (eds) Management science models and techniques 2. Pergamon, Oxford, pp 83–97
Fitton D, Cheverst K, Rouncefield M, Crabtree A (2004) Probing technology with technology probes. In: Equator workshop on record and replay technologies, London
Gaver W, Dunne T, Pacenti E (1999) Design: cultural probes. ACM Interactions 6(1):21–29
Gould JD, Lewis C (1985) Designing for usability: key principles and what designers think. Commun ACM 28(3):300–311
Grudin J, Pruitt J (2002) Personas, participatory design and product development: an infrastructure for engagement. In: Proceedings of participation and design conference (PDC2002), Sweden, pp 144–161
Hutchison H, Mackay W, Westerlund B et al (2003) Technology probes: inspiring design for and with families. In: Proceedings of international conference on human interaction (CHI’03)
ISO (2010) 9241-210, http://www.procontext.com/aktuelles/2010/03/iso-9241210-prozess-zurentwicklung-gebrauchstauglicher-interaktiver-systeme-veroeffentlicht.html
Kyng K (1995) Creating contexts for design. In: Carroll JM (ed) Scenario-based design. Envisioning work and technology in system development. Wiley, New York, pp 85–108
Martin R (2009) The design of business. Why design thinking is the next competitive advantage. Harvard Business Press, Boston
Moggridge B (2006) Designing interactions. MIT Press, Cambridge
Pedersen J, Buur J (2000) Games and movies: towards innovative co-design with users. In: Collaborative design. Springer, London, pp 93–100
Preece J, Rogers Y, Sharp H (2002) Interaction design: beyond human-computer interaction, 1st edn. John Wiley & Sons, Ltd., New York
Preece J, Sharp H, Rogers Y (2015) Interaction design: beyond human-computer interaction, 3rd edn. John Wiley & Sons, Ltd., New York
Ritter FE, Baxter GD, Churchill EF (2014) Foundations for designing user-centered systems. What system designers need to know about people. Springer, London
Saffer D (2010) Designing for interaction: creating innovative applications and devices. New Riders, Berkeley
Sanders EBN, Stappers PJ (2012) Convivial toolbox: generative research for the front end of design. BIS Publishers B.V., Amsterdam
Stickdorn M, Schneider J (eds) (2010) This is service design thinking. Basic – Tools – Cases. BIS Publishers, Amsterdam
Tellioğlu H (2016) Models as bridges from design thinking to engineering. In: Proceedings of the 10th international conference on interfaces and human computer interaction (IHCI 2016), multi conference on computer science and information systems, Funchal, Madeira, 1–4 July, pp 21–28
Thoring K, Müller RM (2011) Understanding the creative mechanisms of design thinking: an evolutionary approach. In: Proceedings of the DESIRE’11 conference creativity and innovation in design. ACM, Eindhoven, pp 137–144
Tschimmel K (2012) Design thinking as an effective toolkit for innovation. In: Proceedings of the CCIII ISPIM conference: action for innovation: innovating from experience, Barcelona, pp 1–20
Vaajakallio K, Mattelmäki T (2014) Design games in codesign: as a tool, a mindset and a structure. CoDesign 10(1):63–77
Van Riel C, Balmer J (1997) Corporate identity: the concept, its measurement and management. Eur J Mark 31(5/6):340–355
Wallach D, Scholz SC (2012) User-centered design: why and how to put users first in software development. In: Software for people: fundamentals, trends and best practices. Springer, Berlin/Heidelberg
E-Tourism Research, Cultural Understanding, and Netnography
31
Robert V. Kozinets
Contents
Introduction
A Brief History: Evolving Definitions of Netnography for Evolving Social Technologies
Understanding Netnography Today
Comparing Big Data and Netnography Approaches
E-Tourism and Netnography: a Natural Fit
Applying Netnography to E-Tourism Research
Concluding Thoughts About Netnography and E-Tourism
Cross-References
References
Abstract
For well over a decade, e-tourism researchers have been using netnography. Yet despite this use, netnography has thus far been under-utilized. Big data methods still predominate as a way to understand social media content, obscuring the potential for a more humanistic and meaning-rich understanding. This chapter is about netnography, a way to research social media that is flexible, contextualized, and enthusiastically agnostic about the type of data. Netnography has been developed as a way to study social media that maintains the cultural complexities of people’s experiences. This chapter introduces the reader to the rigorous practice of netnography as it exists today. Then, it contrasts netnographic methods and insights with those provided by big data analysis approaches. Finally, it uses examples and illustration to explore key territories and implications of netnographic research for the understanding sought by e-tourism researchers, including electronic word of mouth, online reviews, online communities, selfies, and other travel and tourism-related phenomena.
R. V. Kozinets, Jayne and Hans Hufschmid Professor of Strategic Public Relations and Business Communication, Annenberg School for Communication and Journalism and Marshall School of Business at the University of Southern California, Los Angeles, CA, USA
© Springer Nature Switzerland AG 2022. Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_43
Keywords Ethnography · E-tourism · Netnography · Online community · Online reviews · Qualitative research · Social media · Smart tourism · Word of mouth
Introduction
The fields of e-tourism and smart tourism collect, aggregate, analyze, and harness data that derives from sources such as the social connections of social media, as well as various other sources (Gretzel et al. 2015, 181). As you examine the contents of this book about e-tourism, you will find that, similar to business practice in tourism and other fields today, quantitative methods for understanding social media appear to predominate. These e-tourism quantitative methods include a range of big data analytics approaches such as using automated data scraping, data mining, predictive analytics, natural language processing, tourist tracking, and sentiment analysis to study social media phenomena. These types of analysis can be very valuable tools for revealing patterns in the large quantities of information that social media data collection methods typically generate. However, there is a rich cultural and contextual aspect to e-tourism phenomena such as online travel reviews and electronic word of mouth, online communities, online influencers and audiences, travel food porn, and selfies. For almost 50 years now, qualitative researchers across the social sciences have been using and combining techniques such as focus groups, interviews, observation, participation, reflection, and interpretation to understand various aspects of online experiences and social interactions. Various types of qualitative research on social media data, using varieties of narrative, thematic, discursive, semiotic, and content analysis and interpretation techniques, have produced a rich body of theory as well as a curatorial type of chronicling of various aspects of life online. Cultural studies and technology studies researchers have been adapting qualitative research methods to the understanding of this deep, rich, multimedia, linguistically, symbolically, and visually complex mass of social media data. 
Across the social sciences, the application of a rigorous set of standards for qualitative social media research called netnography has been growing steadily. For well over a decade, e-tourism researchers have been using these techniques. They have been using netnography, for example, to study how people talk online about the brands of destinations such as Bologna and Florence (Woodside et al. 2007); Mumbai, Seoul, Singapore, and Tokyo (Martin et al. 2007); and Beijing, Lijiang, Shanghai, and Xi’an (Hsu et al. 2009). Woodside et al. (2007) studied social media accounts of overseas visitors writing about their
first visits to two Italian cities and found that these reports followed a narrative storytelling structure. The authors found that this rich data also offered “creative clues for positioning a destination uniquely and meaningfully in the minds of potential future visitors” such as “explanations of their own photographs that capture what these informants find especially worthwhile to report to others” which can offer both “an early warning system for learning [about] problems with a destination’s image as well as an early opportunity system for learning the images that excite visitors to advocate visiting the destination to friends, family members” and social media audiences (173). Yet, despite increases in the use of netnography in tourism and e-tourism research, netnography still has a lot of unrealized potential (Tavakoli and Mura 2018; Whalen 2018). Big data methods still predominate in e-tourism and smart tourism (Gretzel et al. 2015; Lu and Stepchenkova 2015), obscuring the potential for a more humanistic and meaning-rich understanding – and also limiting dataset pattern recognition to relatively large and mundane patterns: the inhalations and exhalations of the mainstream. This chapter is about netnography, a way to research social media that is flexible, contextualized, and enthusiastically agnostic about the type of data, device formats, sources of information, and other aspects of its inputs. Netnography has been developed as a way to study social media that maintains the cultural complexities of its interactants’ itinerant experiences. In Kozinets (1997), I introduced netnography as a way to study online media fan cultures and have been developing and adapting it, along with many others, ever since. 
In this chapter, I will introduce the reader to the rigorous and interdisciplinary world of netnography today, contrast netnographic methods and insights with those provided by big data analysis approaches, and then explore some of the key territories and implications of netnographic research for the types of understanding sought by e-tourism researchers.
A Brief History: Evolving Definitions of Netnography for Evolving Social Technologies
Before we can understand the application of netnography to e-tourism research, it is vital that we understand what netnography is, and to do that, I’d like to first provide some history. Working at the dawn of the contemporary age of social media, I developed netnography as a tool to study emerging online phenomena in a way that remained sensitive to their experiential, social, contextual, and cultural qualities. I began by honing the online ethnographic approach used by Baym (1993), Correll (1995), and Jenkins (1995) in my cultural study of fan groups online (Kozinets 1997). But unlike these other scholars, my investigations of fandoms in the mid-to-late 1990s broadened very naturally to a range of other topics including wine, food, technology, beauty and fashion bloggers, videogames, and all sorts of review writing – usually finding ways in which these topics and phenomena interrelated with brands.
Using netnography to investigate early developments which indicated a rise in commercialism on social networks, I published some of the first articles discussing the rise of virtual communities (the social media platforms of their day), online influencers, their storytelling and brand-building functions, and the power they would eventually wield over marketers (Kozinets 1999a,b). A few years later, I joined a team of skilled qualitative consumer researchers to use netnography to reveal the wider use of storytelling narratives, which were present both in market communications and in the public social media brand discussions of popular retro brands like the Volkswagen Beetle and Star Wars (Brown et al. 2003). Later, I secured an industry partnership to conduct a netnographic field study. That project studied a social influencer campaign conducted in Canada by Nokia. It conceptualized electronic WOM as a series of “networked narratives” where influencers evaluate, explain, embrace, and endorse brands, services, and products in various discursive ways adapted to their unique social media ecosystems and narrative storytelling arcs (Kozinets et al. 2010). Netnography’s substantive and theoretical developments include the realization that social media is a contested, divisive, activist gathering place for online electronic tribes and a generator of e-tribalized markets (Kozinets 1999a,b). They chart social media developments into an online community and influencer-based ecosystem (Kozinets et al. 2017; Kozinets et al. 2010). More recently, they elaborate social media’s role in the evolution of vast, decentralized, and often passionate human-object-energy assemblages called “networks of desire” (Kozinets et al. 2017). These developments and findings were only possible because of the additional capacities provided by netnography. As they have with any method, good researchers have adapted netnography to meet a changing range of challenges and opportunities. 
When I began developing the method in 1995, there were about 23,500 websites online and fewer than 40 million internet users concentrated mainly in the United States. I developed the method mainly from my fieldwork on fan newsgroups that were posting on the Usenet service and accessed through early versions of the Netscape browser. A few years later, the dot-com boom turned to bust and it looked, for a while, like the Internet’s potential had been over-hyped. But then, blogs started to expand and gain popular attention, and so did social networking sites like Friendster, changing the social media game forever. From there, a range of different social media platforms including YouTube, Facebook, and Twitter came on the scene. Today, social media is a rapidly changing, complex, industrially and geopolitically critical, multi-billion-dollar industry with over 3.8 billion participants worldwide. Positioned as a way to gather data in order to understand social media phenomena, netnography is a dynamic set of techniques for the study of this constantly evolving ecosystem. As social media platforms emerged and transformed, netnographic research followed and adapted existing methods to hunt for data. This means that netnography evolved and continues to evolve in relation to the larger field of social media and social media research and science (Kozinets 2020). Originally, in the world of newsgroups and forums, netnographic research tended to be located much more in single online sites, which were held to resemble traditional ethnographic field
sites (Kozinets 1998, 2002). At that point, it seemed to many researchers that the deep hanging out of traditional ethnographers in particular physical and cultural places could be transplanted to an online hanging out in particular online venues. Thus, early netnography was rather closely aligned with traditional ethnography, positioned as a written account of online fieldwork. In fact, through about the first decade of its use, netnography was still mainly used to study particular online sites and thus was methodologically linked to cultural anthropology’s notions of field sites and ideas of participant-observation (Kozinets 2010). However, in more recent years, little of this original orientation remains. As it became evident that social media was becoming something very different and far more complex than newsgroups and chat rooms, netnographic researchers began exploring adaptations of netnography. Singular sites became much more complex, spreading across multiple platforms. Different platforms had different types of communication modalities and offered consumers different affordances. Even single platforms like Facebook contained complex online worlds within worlds. Anthropological notions of field sites were increasingly problematic and their principles increasingly difficult to maintain and adapt. Ethnographic notions of participation confused readers and researchers, who often simply substituted their own “observational” and “non-interactive” approaches. Sometimes, authors and editors confused netnography with content analysis, even claiming that covert techniques could be ethically used (they cannot). Eventually, I redefined and reconfigured the method away from field sites and participation (in Kozinets 2015) and toward an emphasis on data operations and researcher engagement (in Kozinets 2020). Contemporary netnography is built on the experimentation and publication of hundreds of researchers across the social sciences. 
Netnography combines scientific curiosity with a type of investigative journalistic predilection, casting the netnographer in the role of a sort of social media detective who must follow the cultural trail in order to reveal embedded truths. The emphasis in netnography today is on providing a number of well-defined sub-procedures that guide the collection and analysis of social media data in order to provide quality qualitative research. However, there is wide latitude and an encouragement of innovation in representation and data usage in judging the quality of the work. Because social media ecosystems are dynamic, contextualized, and complex, contemporary netnographic research is designed to offer a wide range of different research techniques that can be adapted for a variety of different platforms, phenomena, and research foci.
Understanding Netnography Today
Netnography has developed into a synthesis of different procedures, operations, and academic fields. It is an amalgam of research perspectives that draw from its application to tourism studies, computer science, cultural studies, media anthropology, education, sociology, addiction studies, game studies, medicine and health, nursing, and many other fields – as well as from my own native fields of marketing and
consumer culture research. Netnography’s procedures and practices offer a new conceptual vocabulary to the social scientist who is interested in using social media and its multifaceted forms of communication as sources of qualitative data. Netnography today is a sophisticated and explicit set of operational procedures for conducting qualitative social media research. It is founded in four basic steps of (1) research inquiry, (2) data collection, (3) data analysis and interpretation, and (4) research communication. These four steps are further developed into six movements of initiation, investigation, interaction, immersion, integration, and instantiation. Living within those movements are a number of different, detailed, research operations. These operations are sets of procedures that are adaptable to particular research contexts. They are there to guide netnographic researchers through the entire research process, from finding a research question to presenting and submitting the final manuscript. For example, the operation of turning a research question into queries and keyword searches that can be used in conventional search engines is called “simplification” in netnography. It contains specific rules. Another operation, called “selecting,” provides researchers with five standard criteria – relevance, activity, interactivity, diversity, and richness – that researchers can use to evaluate potential sites of social media data in order to ensure that they meet the needs of particular research projects. Other operations cover research procedures such as cleaning and coding data and finding good sources of online traces, interviewing people, creating research web-pages, conducting mobile ethnography, and interpreting cultural themes. Despite the surface similarities, such as their focus on language and meaning, almost everything else about netnography is different from the workings of traditional anthropology. 
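To make the "selecting" operation described above a little more concrete, here is a minimal Python sketch of how a researcher might keep track of their judgments when comparing candidate data sites against the five criteria. The chapter names the criteria but does not prescribe how to score them, so the 1-to-5 rating scale, the unweighted sum, the `CandidateSite` class, the `rank_sites` helper, and the two example sites are all assumptions made purely for illustration. The ratings themselves remain the researcher's qualitative judgment.

```python
from dataclasses import dataclass

# The five selection criteria named in the text.
CRITERIA = ("relevance", "activity", "interactivity", "diversity", "richness")


@dataclass
class CandidateSite:
    """A potential site of social media data (hypothetical helper class)."""
    name: str
    ratings: dict  # criterion -> researcher's rating, 1 (low) to 5 (high); assumed scale

    def total(self) -> int:
        # Refuse to score a site that has not been rated on every criterion.
        missing = [c for c in CRITERIA if c not in self.ratings]
        if missing:
            raise ValueError(f"{self.name}: missing ratings for {missing}")
        # Unweighted sum: an illustrative assumption, not a rule of the method.
        return sum(self.ratings[c] for c in CRITERIA)


def rank_sites(sites):
    """Return candidate sites ordered from highest to lowest total rating."""
    return sorted(sites, key=lambda s: s.total(), reverse=True)


# Example ratings are invented for illustration only.
sites = [
    CandidateSite("hotel review threads",
                  dict(relevance=5, activity=4, interactivity=2, diversity=4, richness=3)),
    CandidateSite("travel vlog comments",
                  dict(relevance=4, activity=5, interactivity=4, diversity=3, richness=4)),
]

for site in rank_sites(sites):
    print(site.name, site.total())
```

A tally like this is only a bookkeeping aid. It can help document why one data site was preferred over another, but it does not replace the interpretive reading of each site that the method calls for.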
Unlike traditional anthropological fieldwork, netnography leaves behind core notions that no longer fit in the world of social media: ethnographic field sites, field notes, and research participation. Yet, netnography’s ethnographic sensibility still includes a detailed focus on cultural understanding. The netnographic researcher is encouraged to pay close attention to the use of language, imagery, symbolism, hierarchy, ritual, and other nuances of human experience as they present themselves within the vast range of social media behaviors. Through the operations of the method, the netnography of today is partly grounded in the ethnographic sensibility of past qualitative research inquiries, but also extends into the computer science reality of conducting research on social media. In netnography, the ethnographic field site has become a multi-sited online and offline site of potential data, a place to collect and then to curate consumer traces that relate to social media use. In e-tourism, for example, these cultural elements could be located in such data sites as YouTube travel videos and Pinterest fantasy travel board collections, in public comments to online news stories reported in the New York Times or The Guardian, or, of course, public reviews of hotels and tour operators on the TripAdvisor platform. Netnography includes historical and other probing forms of secondary research as well. It is a source of deep consumer insight that uses interviews of various kinds, from short online interviews to long in-person conversations. There is no prescribed mode of researcher participation, which is similar to the diversity of
practices in traditional participant-observational techniques. However, there is a range of positions from which the researcher can choose to engage with the data site in a variety of different ways. Because social media is diverse, varying by nation, platform, and type of media, researchers are constantly making adaptations to netnography’s basic techniques in order to get better results. The resulting operations in netnography are dynamic. Netnographers are also expected to read and synthesize the published work of a small and growing community of active netnographic researchers, which is easily accessible through academic search engines such as Google Scholar and those present on ResearchGate and Academia.edu. Included in this now fairly substantial body of work are the works of groups of scholars working in the field of tourism and e-tourism. Netnographers scan for, study, and then adapt relevant prior work of other qualitative social media scholars, their theories, and research operations, to the needs of the particular research project at hand. They are empowered by clear rules of ethical research in social media. These guidelines are different from those of in-person and emplaced ethnographers, and other online researchers, and they are also internationally diverse and continually changing. Netnography’s ethical guidelines are in place to handle all the ethical questions and institutional requirements that this research entails. These are some key differences to keep in mind. Netnography deals in data sites rather than field sites. It is focused on researcher engagement rather than participation. It presents an evolving and flexible methodology. And it offers clear and reputable research ethics guidelines. No other method for qualitative social media research can legitimately make these claims. 
The transition from traditional qualitative techniques to a set of rigorous qualitative social media research standards and procedures (i.e., netnography) has not been a simple one. In fact, the transition from traditional qualitative research to research on social media is still developing. This development is accompanied by significant amounts of confusion. In this chapter and in my recent works, I try to explain netnography as a way to help researchers address and ameliorate that confusion about how to rigorously apply and communicate about qualitative research that is conducted with social media data. With netnography, researchers can set forth confidently to research intriguing and complex topics using data drawn from social media sources using a set of more or less standardized, but still adaptable and always in flux, procedures for data collection, interviewing, journal keeping, data analysis, data interpretation, and communicating research results to various audiences. To understand why these are important to e-tourism, it may be useful to understand what netnography offers in comparison to the undisputed heavyweight champion of both social media data science and e-tourism-based studies: big data analytics.
Comparing Big Data and Netnography Approaches
Big data analysis is defined in relation to the size of the dataset. It is an analysis of datasets so large that they require unconventional means to handle them. Successful big data methods can provide broad overviews of massive amounts of data. Methods
like automated data analysis, for instance, can allow e-tourism researchers to quickly classify millions upon millions of online reviews into various categories based upon keywords found in their subject lines or text. However, these methods currently have some important limitations. First, the view that they provide of e-tourism behavior tends to be very broad and decontextualized. Second, the types of data they can use are still somewhat limited (e.g., travel photos on Instagram or travel videos on YouTube might be difficult). Third, the reliability of the massive hardware and software machinery required for true big data analysis can be challenging (Jacobs 2009). Fourth, a lot of data tends to be included, so it can be difficult to filter or sort for the types of data needed for a particular study or research question. Fifth, the results of these complex and difficult analyses might not yield theoretically interesting or actionable information. Where there is a lot of data being examined, it can often be difficult to differentiate meaningful signals from background noise (Few 2015). Finally, the ethical standing of big data analysis in business research is unclear – and some scholars link the use of these methods to the manipulation and behavioral modification techniques that Zuboff (2019) dubs “surveillance capitalism.” Lu and Stepchenkova (2015) identify some of the other problems with big data approaches that they considered in their examination of user-generated content-based research in tourism: sample representativeness, completeness of the data, reliability, validity, and the fact that discussions of the generalizability of research results are often left out of the published articles’ discussion sections. Unlike big data approaches, netnography is not defined in relation to the size of a dataset, but to its depth. 
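To make the contrast concrete, the keyword-based classification of online reviews mentioned earlier in this section might, in its simplest form, look something like the following Python sketch. It is a toy illustration only: the category names, the keyword lists, the `classify` function, and the sample reviews are all invented assumptions, and real big data pipelines are far more elaborate than this.

```python
import re
from collections import Counter

# Hypothetical categories and keywords, invented for illustration.
CATEGORY_KEYWORDS = {
    "safety": {"unsafe", "danger", "threat", "safe"},
    "service": {"staff", "service", "host"},
    "cleanliness": {"clean", "dirty", "dust"},
}


def classify(review: str) -> list:
    """Assign a review to every category whose keywords appear in its text."""
    # Lowercase the text and split it into simple word tokens.
    tokens = set(re.findall(r"[a-z']+", review.lower()))
    matched = [cat for cat, kws in CATEGORY_KEYWORDS.items() if tokens & kws]
    return matched or ["uncategorized"]


# Sample reviews are made up for this sketch.
reviews = [
    "The host was rude and the room felt unsafe.",
    "Spotless and clean, lovely staff.",
    "Lovely view of the bay.",
]

# Tally how many reviews fall into each category.
counts = Counter(cat for review in reviews for cat in classify(review))
print(counts)
```

The sketch also shows why such output tends to be decontextualized. A review saying "I never once felt unsafe here" would land in the "safety" category alongside genuine complaints, because the classifier sees only the keyword and not the meaning of the sentence around it.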
Netnographic analysis and interpretation of social media data requires no major innovations in hardware construction, no advanced new software applications, and in fact no mathematical or other calculative skills. However, some netnographies might quantify data, use social network analysis, or use other techniques such as data visualizations and reports to reveal cultural tendencies. Netnography is designed to do something that big data analysis cannot do well: hear and understand the voice of individual digital informants and their groups and various collectives and socialities. We can see this in the work of Björk and Kauppinen-Räisänen (2012, 70), whose netnography not only revealed that the language of risk assessment in social media revolves around perceptions of “safety, threat, and danger” but also found that “risk perception was destination specific.” Relatedly, in Sthapit and Björk’s (2019) study of Airbnb and distrust, the research finds that the language of “‘stress’, ‘totally unacceptable’, ‘inconvenience’, ‘headache’, ‘deceived’, ‘screwed’, ‘worst service experience’, and ‘unsafe’ suggests that the guests experienced psychological discomfort alongside losses in self-esteem and self-efficacy” in ways that contrast both with “guests’ expectations” and “AirBnB’s marketing pitches” (p. 250). Netnography’s strengths, then, are big data’s weaknesses. First, netnography offers a local and contextualized view of phenomena. Second, it is enthusiastically promiscuous about using all types of data, from Pinterest visual boards to travel photos on Instagram or YouTube travel vlogs. Third, it doesn’t require any sort of supercomputer, or anything more complex than a tablet or smartphone. Fourth, it is selective about which data it collects, minimizing the need to deal with a lot
of unhelpful background noise. Fifth, netnography has clear ethical guidelines, and these standards are linked to a deep concern about the abuse of data and manipulation of behavior that occur in so-called surveillance capitalism. Finally, and in conclusion to this section, I feel compelled to argue that the dominance of big data analytics methods for understanding social media phenomena comes at a steep cost in terms of overshadowing a particular, and to my mind a more genuine and useful, form of understanding. I will now turn to this chapter’s central topic, which is the application of this form of research and understanding to the questions that concern e-tourism researchers.
E-Tourism and Netnography: A Natural Fit
Contemporary netnography is concerned with engaged cultural approaches to data operations that include key topics and perspectives within e-tourism, such as the collection, analysis, and interpretation of data pertaining to eWOM, online reviews, online communities, selfies, and other forms of important travel and tourism-related communication. Digital cultural consumer insight techniques such as netnography draw researcher attention to the structures, systems, and influences of cultural sociality. They study phenomena like tourism and travel as part of an embedded human experience. Examining a range of extant e-tourism research practices and topics, the final sections of this chapter seek to explain, explore, and extend the use of updated netnographic research methods for high-quality and impactful e-tourism scholarship. Netnography seems extraordinarily relevant to e-tourism, as travel customers visit an average of at least 140 different sites or sub-sites during their search process (Whalen 2018). Yet Whalen (ibid, 3424) notes that “many of Kozinets’ (2015) prescriptions are not followed in hospitality and tourism research.” Across the sample of published peer-reviewed tourism articles, 53.9% were focused on destination image and tourism types of topics, a type of branding concern that has wide application across e-tourism studies. Of these publications, 39.7% used fan forums or online communities as their sites, and 84.1% used non-participative methods. Only 11% of the articles mentioned researcher disclosure or participant informed consent. Forty-six percent of them used thematic analysis. Sixty-seven percent of them gave no specifics about the type of coding or brand of QDA software package used.
This is what Whalen (2018) found – a use of netnography in tourism that is predominantly focused on destination image, uses online communities and thematic analysis, is non-participative, and does not mention ethical or data analysis practices. However, there are at least five big advantages that netnography holds for researchers conducting e-tourism research today. The first benefit was detailed in the section above that considered how netnography is able to reveal textures and meanings of travel phenomena that elude big data analysis methods. Gretzel (2018b) points out that, although data mining may help to discover patterns, it does not provide deeper understandings about the meanings of those patterns. She sees
qualitative approaches to understanding social media like netnography as continuing to play an important role in social media research in general and in tourism-related topics specifically. The epistemology and axiology of netnography offer a stark contrast to those underlying the operation of big data analytics. They offer a different way to think about data. When a linguistic, meaning-based, and cultural understanding is the goal – something which can inform market communications and positioning strategies and tactics – netnography can be advantageous. Second, netnography is both an interdisciplinary and an established method in the tourism and travel research field. In a non-exhaustive historical study whose data ended in 2012, Bartl et al. (2016) found that almost a quarter of the public peer-reviewed articles using netnography focused on tourism- or leisure-related topics. Xu and Wu (2018, 249) found that “most” of the netnography journal publications in tourism “were published with the key tourism and hospitality journals” such as Tourism Management, Annals of Tourism Research, and the Journal of Travel Research. They concluded that due to these journals’ “vast readership and high citation rate”, “netnography as a new research method has been well documented and introduced to researchers in tourism and hospitality studies” (249). However, and somewhat mysteriously, Mkono and Markwell (2014) and Tavakoli and Mura (2018, 190) both suggest that netnography may “not be fully legitimized” as a methodology in tourism research. Yet, with articles using and mentioning netnography published in all of the top travel and tourism peer-reviewed journals, full legitimacy is likely not very far away. Scholars from across the social sciences are applying and adapting netnography procedures to the theoretical and practical queries that fascinate their fields. Travel and tourism researchers have long been a part of this conversation. The third advantage is clarity.
To learn, to teach, and to communicate qualitative social media research methods, there is no easier approach than netnography. Netnography provides a clear instruction set and practice exercises to introduce novice e-tourism researchers as well as to hone the skills of veteran investigators. Fourth, in the age of COVID-19, the conduct of both tourism and e-tourism research is bound to change in response to the dramatic reduction in global travel. In this situation, the ability to study online travel experiences up close may prove a huge boon to researchers. Finally, netnography fits very well with a number of the topical interests of e-tourism scholarship. I will develop these ideas, with examples, in the remainder of this chapter.
Applying Netnography to E-Tourism Research
There is a rich cultural and contextual element to many e-tourism phenomena that have already been studied using qualitative social media research techniques. Baka (2016), for instance, uses netnography to study user-generated travel reviews and their effect upon reputation management in the travel sector. Baka not only
observed TripAdvisor’s online travel communities and conversations, but also interviewed a very substantial number of managers, property owners, community founders, and users (around 50 people) as part of the netnography, including TripAdvisor’s Cofounder and CEO. The resulting analysis is comprehensive and wide-ranging. The resulting netnography, which has been cited over 100 times in Google Scholar already, provides a useful portrait of TripAdvisor and other user-generated travel review platforms as complex phenomena embedded in a range of online and offline reputation management practices and challenges. The author notes, intriguingly, that social media and user-generated reviews have “become platforms where truth is negotiated in a public ‘online court’” (160). Netnographic techniques are particularly important when the researcher is seeking to understand how online reviewers are commenting upon or attempting to capture particular cultural aspects of their travel experience. For example, Holder and Ruhanen (2017, 7) found that a “netnographic approach utilising TripAdvisor reviews has allowed for a systematic and rigorous review of post-consumption online narratives of indigenous tourism experiences in Australia” and reveals the importance of the holistic “servicescape” to traveler impressions. In their useful bibliometric analysis of netnography’s use in tourism journal articles, Xu and Wu (2018) found that, although blogs, Facebook, and Twitter were used in published netnographies in tourism, “TripAdvisor, the world’s largest online review community, has been most popular with the researchers” (249). Travel experiences are fascinatingly multidimensional – they present us with a vast panoply of topics, socialities, languages, subcultures, influencers, and much more. Netnography allows us to understand some of these aspects as they are discussed and shared in various ways online.
Textual communication, videos, and photography share travel experiences and reveal much that would otherwise be difficult or impossible for researchers to study. For example, Luo et al. (2015) used netnography to understand Chinese “donkey friends” travel behavior and compared them with Western backpackers, finding how their actions reflected some of the contemporary cultural forces shaping Chinese society. Wu and Pearce wanted to investigate a group that might be even more difficult to access: Chinese tourists who drive recreational vehicles (RVs) in unfamiliar countries. They compared them to mature RV users who had been studied in other contexts. The study found a variety of interesting participant characteristics, motivations, and travel patterns and also was able to identify that participants “are generally young, affluent, and well-connected to social media” (Wu and Pearce 2017, 710–711). The researchers found that netnography was a useful “methodology for exploring new hard to access tourism interest groups” (710) and note “the value of contextual and comparative information during the data interpretation, and the potential value of using user-generated images” (712). Providing another example, Goulding et al. (2013) studied “death tourism” at von Hagen’s “Body Worlds” exhibits. The researchers combined observational work at the exhibits with netnography of seven blogs that “contained rich and detailed information and there was also evidence of deep reflection on a number of philosophical issues relating to the body and its use as exhibit”
(312). In another example, Ao (2018) conducted netnographic research about space tourism using 19,116 tweets by 36 NASA astronauts. From this analysis, the author concludes that space travel offers specific and different phases of leisure experience (training, lift-off, in-space, reentry, and memory) and suggests that these phases may lay the foundation for future marketing efforts for a space tourism industry. In each of these examples, social communication about a travel experience is shared on social media and then collected and analyzed for its theoretical and substantive insights by researchers using netnography. The way that various sorts of travel photographs and videos are created and shared, along with their contents and the public reaction to them, is another rich area for further investigation by e-tourism researchers using netnography. Using netnography, Gretzel (2017) studied the identities communicated through the sharing of travel selfies on Instagram. The author uses the content and style of the photograph to classify travel selfies into a variety of different categories, from mundane to aesthetic/artistic, animal, sunglass, panoramic, drinks, ironic, and contemplative travel selfies. Contrasting the findings of this research with prior investigations such as Dinhopl and Gretzel (2016), the author nuances prior findings that travel selfies tend to redirect the gaze away from the destination and onto the self (Gretzel 2017, 124). Shakeela and Weaver (2016) studied responses to a YouTube video depicting a fake ceremony conducted at a Maldivian resort for Western guests that was intended to mock the tourists. Their netnography found two main types of responses to this inflammatory tourism-related social media video by potential tourists: “one which was hegemonic and tolerant, and the other polemical and intolerant” (122).
The authors used their research as an opportunity to speculate about whether social media acts as an amplifier of conflict and about whether such exhibitions might have lasting effects upon travel destinations. Another important phenomenon in the realm of e-tourism is the growth and power of social media travel influencers. Netnography can help researchers seeking to understand the characteristics and roles of professional and semi-professional social media micro-celebrities and how they influence both the online travel ecosystem and the global travel industry. Chatzigeorgiou (2017) pointed out the importance of social media influencers to rural businesses wishing to attract global millennial travelers. This study proposed that proper attention paid to the fame, image, and activities of social media influencers can lead to economic growth and touristic development of rural locations. In a useful introduction to and overview of the phenomenon, Gretzel (2018a) notes how travel consumers have evolved from being “occasional endorsers to micro-celebrity-seeking social media influencers,” many of whom have amassed dedicated followings as well as lucrative relationships with brand marketing and talent management agencies. Because travel marketers have long recognized the potential of online communications to amplify their messages and target particular audiences, these further developments should be of great relevance to them. Yet, “despite its prominence and practical significance, there is a lack of research that investigates the travel and tourism influencer marketing phenomenon” (Gretzel 2018a, 155). In the field of marketing, netnographic studies
of influencers have been appearing for about a decade and have helped to inform general theory and specific practice regarding this important phenomenon.
Concluding Thoughts About Netnography and E-Tourism
E-tourism researchers often use an interesting term that appears to have originated in the field of computer science: “user-generated content” or “UGC” (Lu and Stepchenkova 2015). In using the term, Lu and Stepchenkova (2015, 142) point out the applications for the use of user-generated content deriving from social media communication toward increasing researchers’ understanding of service quality, intangibles such as the “destination image and reputation, experiences and behavior, the persuasive power of UGC as eWOM, as well as tourist mobility patterns.” UGC-based studies in the e-tourism field mention things like monitoring people and tracking them, a type of research which is clearly related to smart tourism and also has privacy-related ethical implications. E-tourism UGC researchers also like to use visualizations of digital journeys, sentiment analysis, and customer reviews. The realm of business-to-consumer communication is included in these trends and must include all forms of advertising, social media marketing, as well as public relations and crisis communications. Finally, UGC researchers like to study “tourist behavior in real time” (Lu and Stepchenkova 2015, 143), another objective that they have in common with many e-tourism and smart tourism scholars. There are many ways to understand human behavior, of course, and the same is true of the vast amounts of public information shared on social media platforms such as Twitter, Pinterest, Reddit, and TripAdvisor. We can use big data analytics to discover interesting major patterns that occur across millions of entries, and we can also drill down to do detailed microscopic examinations. In netnographic work, we tend to contextualize, historicize, and dig deep for the cultural meanings and implications that connect particular phenomena to the wider world.
So, for instance, in work Kozinets (2016, 835) published on Amazon customer reviews, the researcher points out that “reviews and ratings offer consumers a social conversation, a communications environment that they use not only to talk about the objective and subjective characteristics of products and services, but also to socialize and communicate about themselves.” It turns out that Amazon.com’s rating and review system is used not only “as a source of peer opinion and information to inform decisions about potential purchases,” but that “it also acts as a platform for cultural connection, witty repartee, social commentary, entertainment, personal revelation, self-promotion, revenge seeking, and many other activities” (836). I think it very likely that the social communications in e-tourism fulfil a similarly wide range of functions and that travel-related exchanges are actually very social and cultural experiences in their own right – just as has been revealed and explored by some of the excellent netnographic e-tourism research already cited in this chapter. In some sense, a netnography that attends to the reflective reality of the researcher-as-instrument provides a type of detailed snapshot of a particular phenomenon suspended at a particular point in time, viewed from a particular linguistic and cultural lens. Over time, these netnographies aggregate into substantial sections of a field, forming a multifaceted view that collects the online chronicles of interested travel researchers. Methodologically, we might see this undertaking as the combining of a variety of smaller scale bottom-up projects. These articles, chapters, books, and other research communications offer theoretical contributions, certainly, but they are also individual assets of a type that a digital humanities scholar might appreciate, works locked into cultural times and spaces, reflexive and also archival. In this chapter, I have attempted to provide a concise guide for e-tourism researchers interested in or perhaps considering the use of netnography. The chapter began with an explanation of netnography that examined the method’s evolution over time and charted how its definition adapted to the changing social media environment and the growing sophistication and adoption of data collection and analysis procedures. Netnography today is a flexible yet well-defined collection of different research operations. Each of these operations is adapted to the qualitative study of social media environments and data and can be adapted further to the contingencies of particular research contexts. The chapter also examined some of the strengths and weaknesses of netnography in comparison to big data analytics, in particular how they might be used in e-tourism to understand social media. The chapter then detailed and described a number of different areas where e-tourism studies could benefit from more netnographic research, such as with online travel reviews and electronic word of mouth, online communities, online influencers and audiences, and travel selfies.
Drawing on the contributions of the many scholars who have already published netnographic research in the travel and tourism field, this chapter has emphasized the value of the unique modality of cultural understanding that netnography offers to contemporary tourism, hospitality, and e-tourism researchers. Social media is so much a part of our world and so much a part of the contemporary tourism experience. In the age of COVID-19, this use of connective technology has accelerated dramatically across every sphere of human activity and, in many cases, may be replacing “traditional” travel and tourism. Because social media is interactive, experiential, and cultural, to fully understand it, it may well be that travel, tourism, and e-tourism researchers need netnography now more than ever before.
Cross-References
Big Data Technologies
Content Analysis of Online Travel Reviews
Data Mining and Predictive Analytics for E-Tourism
e-Tourism Research: A Review
Mobile Ethnography in Tourism and Hospitality: Concept, Tools, and Applications
References
Ao J (2018) Ride of a lifetime: a netnographic research to unveil the leisure experience attached to orbital space tourism. Unpublished Ph.D. dissertation, Middle Tennessee State University
Baka V (2016) The becoming of user-generated reviews: looking at the past to understand the future of managing reputation in the travel sector. Tour Manag 53:148–162
Bartl M, Kannan VK, Stockinger H (2016) A review and analysis of literature on netnography research. Int J Technol Mark 11(2):165–196
Baym NK (1993) Interpreting soap operas and creating community: inside a computer-mediated fan culture. J Folk Res 30:143–176
Björk P, Kauppinen-Räisänen H (2012) A netnographic examination of travelers’ online discussions of risks. Tour Manag Perspect 2:65–71
Brown S, Kozinets RV, Sherry JF Jr (2003) Teaching old brands new tricks: retro branding and the revival of brand meaning. J Mark 67(July):19–33
Chatzigeorgiou C (2017) Modelling the impact of social media influencers on behavioural intentions of millennials: the case of tourism in rural areas in Greece. J Tour Herit Serv Mark 3(2):25–29
Correll S (1995) The ethnography of an electronic bar: the lesbian cafe. J Contemp Ethnogr 24(3):270–298
Dinhopl A, Gretzel U (2016) Selfie-taking as touristic looking. Ann Tour Res 57:126–139
Few S (2015) Signal: understanding what matters in a world of noise. Analytics Press, Burlingame
Goulding C, Saren M, Lindridge A (2013) Reading the body at von Hagen’s ‘body worlds’. Ann Tour Res 40:306–330
Gretzel U (2018a) Influencer marketing in travel and tourism. In: Sigala M, Gretzel U (eds) Advances in social media for travel, tourism and hospitality: new perspectives, practice and cases. Routledge, New York, pp 147–156
Gretzel U (2018b) Tourism and social media. In: Cooper C, Gartner W, Scott N, Volo S (eds) The sage handbook of tourism management, vol 2. Sage, Thousand Oaks, pp 415–432
Gretzel U (2017) #travelselfie: a netnographic study of travel identity communicated via Instagram. In: Carson S, Pennings M (eds) Performing cultural tourism: communities, tourists, and creative practices. Routledge, London, pp 129–142
Gretzel U, Sigala M, Xiang Z, Koo C (2015) Smart tourism: foundations and developments. Electron Mark 25(3):179–188
Holder A, Ruhanen L (2017) Identifying the relative importance of culture in Indigenous tourism experiences: netnographic evidence from Australia. Tour Recreat Res 42(3):316–326
Hsu S-Y, Dehuang N, Woodside AG (2009) Storytelling research of consumers’ self-reports of urban tourism experiences in China. J Bus Res 62(12):1223–1254
Jacobs A (2009) The pathologies of big data. Commun ACM 52(8):36–44
Jenkins H (1995) “Do you enjoy making the rest of us feel stupid?” alt.tv.twinpeaks, the Trickster Author, and Viewer Mastery. In: Lavery D (ed) Full of secrets: critical approaches to twin peaks. Wayne State University Press, Detroit, pp 51–69
Kozinets RV (1997) “I want to believe”: a netnography of the x-philes’ subculture of consumption. In: Brucks M, MacInnis DJ (eds) NA-advances in consumer research, vol 24. Association for Consumer Research, Provo, pp 470–475
Kozinets RV (1998) On netnography: initial reflections on consumer research investigations of cyberculture. In: Alba J, Hutchinson W (eds) Advances in consumer research, vol 25. Association for Consumer Research, Provo, pp 366–371
Kozinets RV (1999a) How online communities are growing in power. In: Dickson T (ed) Mastering marketing: complete MBA companion in marketing. Pearson Education, London, pp 291–297
Kozinets RV (1999b) E-tribalized marketing? The strategic implications of virtual communities of consumption. Eur Manag J 17(3):252–264
Kozinets RV (2002) The field behind the screen: using netnography for marketing research in online communities. J Mark Res 39(February):61–72
Kozinets RV (2010) Netnography: doing ethnographic research online. Sage, London
Kozinets RV (2015) Netnography: redefined. Sage, London
Kozinets RV (2016) Amazonian forests and trees: multiplicity and objectivity in studies of online consumer-generated ratings and reviews, a commentary on de Langhe, Fernbach, and Lichtenstein. J Consum Res 42(6):834–839
Kozinets RV (2020) Netnography 3e: the essential guide to qualitative social media research. Sage, London
Kozinets RV, de Valck K, Wojnicki A, Wilner S (2010) Networked narratives: understanding word-of-mouth marketing in online communities. J Mark 74(March):71–89
Kozinets R, Patterson A, Ashman R (2017) Networks of desire: how technology increases our passion to consume. J Consum Res 43(February):659–682
Lu W, Stepchenkova S (2015) User-generated content as a research mode in tourism and hospitality applications: topics, methods, and software. J Hosp Mark Manag 24(2):119–154
Luo X, Huang S, Brown G (2015) Backpacking in China: a netnographic analysis of donkey friends’ travel behavior. J China Tour Res 11(1):67–84
Martin D, Woodside AG, Dehuang N (2007) Etic interpreting of naïve subjective personal introspections of tourism behavior: analyzing visitors’ stories about experiencing Mumbai, Seoul, Singapore, and Tokyo. Int J Cult Tour Hosp Res 1(1):14–44
Mkono M, Markwell K (2014) The application of netnography in tourism studies. Ann Tour Res 48:289–291
Shakeela A, Weaver D (2016) The exploratory social-mediatized gaze: reactions of virtual tourists to an inflammatory YouTube incident. J Travel Res 55(1):113–124
Sthapit E, Björk P (2019) Sources of distrust: Airbnb guests’ perspectives. Tour Manag Perspect 31:245–253
Tavakoli R, Mura P (2018) Netnography in tourism–beyond web 2.0. Ann Tour Res 73:190–192
Whalen EA (2018) Understanding a shifting methodology: a content analysis of the use of netnography in hospitality and tourism research. Int J Contemp Hosp Manag 30(11):3423–3441
Woodside AG, Cruickshank BF, Dehuang N (2007) Stories visitors tell about Italian cities as destination icons. Tour Manag 28(1):162–174
Wu M-Y, Pearce PL (2017) Understanding Chinese overseas recreational vehicle tourists: a netnographic and comparative approach. J Hosp Tour Res 41(6):696–718
Xu JB, Wu M-Y (2018) Netnography as a new research method in tourism studies: a bibliometric analysis of journal articles (2006–2015). In: Handbook of research methods for tourism and hospitality management. Edward Elgar Publishing, Northampton, pp 242–255
Zuboff S (2019) The age of surveillance capitalism: the fight for the future at the new frontier of power. PublicAffairs, New York
Mobile Ethnography in Tourism and Hospitality: Concept, Tools, and Applications
32
Elaine Yulan Zhang, Dan Wang, and Sut Ieng Lei
Contents
Introduction
Ethnography Traditions and Mobile Ethnography
Types of Mobile Ethnographic Studies
Studies Adopting Real-Time Data Collection
Historical Data Collection
Methodology Studies
Literature Reviews
Tools and Scopes
Advantages and Opportunities
Reduced Average Cost
Reduced Effort and Skills from Participants
Validity and Credibility
Augmented Effect of Multi-types of Data
More Possibilities
Concerns and Limitations
Difficulties and Bias in Data Collection
Challenges in Data Analysis
Privacy
Concluding Remarks on Mobile Ethnography and Media Convergence
Cross-References
References
E. Y. Zhang · D. Wang
The Hong Kong Polytechnic University, Hong Kong, China
S. I. Lei
Macao Institute for Tourism Studies, Macau, China
© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_44
Abstract
Digital consumers rely on smartphones and the mobile Internet for different tasks in their daily lives. Travel activities as an essential part of life for many people are greatly supported by a variety of connected mobile devices. In turn, these mobile devices and applications capture the digital footprint of tourists, which reflects their behaviors, preferences, and intentions. As a result, mobile ethnography has emerged as a valid and insightful method for researching tourist behavior. Mobile ethnography is distinct from classical ethnography in terms of data collection tools and procedures. Researchers collect data captured by mobile devices to understand the behavior of targeted groups, and participants can share insightful information such as their locations, photos, videos, and data points that indicate their emotions with the researchers. Mobile ethnography provides researchers with a variety of new tools and instruments to capture tourists’ behaviors and opinions from multiple perspectives. This chapter introduces the concept of mobile ethnography, the tools and instruments for data collection and analysis, and the applications of mobile ethnography in tourism and hospitality research. This chapter also discusses the opportunities and challenges such as ethical and privacy concerns in mobile ethnography research.
Keywords Mobile ethnography · Smartphones · Wearable devices · Experience sampling · Real-time data collection · Data mining
Introduction Ethnography has always been mobile, dating back to anthropologists such as Bronislaw Malinowski who traveled to Australia in 1914 to carry out participant observation (Novoa 2015). Since then, increasing human mobility has promoted the development of geographical studies with new approaches such as “mobile ethnography,” which involves researchers participating in the movement. Specifically, researchers travel with people, interviewing them either during or after they participate in certain activities (Urry 2016). For example, Watts (2008) traveled on a train and observed how other passengers used time and space during train travel. As summarized in Fig. 1, participant observation in stage 1 becomes “mobile” participant observation in stage 2 (Novoa 2015). With the development of mobile technology, a new stream of mobile ethnography research has emerged as stage 3. Since mobile technology removes the barriers of time and space, researchers have a new option to conduct ethnographic studies without physically traveling with people. Ethnographic studies supported by mobile technologies enable researchers to capture tourist experience in real time at different locations. As most research in tourism is still using conventional research methods such as questionnaires, mobile ethnography provides an innovative approach for researchers to capture authentic consumer insights. The mobile world has changed
32 Mobile Ethnography in Tourism and Hospitality: Concept,. . .
Fig. 1 Development of mobile ethnography
consumer behaviors, and mobile ethnography enables researchers to capture consumer data that explains new behavior patterns and relates behaviors with consumer demographics and contexts (Stickdorn et al. 2014). Correspondingly, there are mainly two types of mobile ethnography research: using mobile devices such as smartphones or wearable devices to collect data while people are traveling and analyzing historical data that travelers create with their mobile devices such as geotagged photos shared online (Muskat et al. 2018). The body of research adopting these two approaches has grown to a scale that prompts scholars to conduct literature reviews on mobile ethnographic research supported by technologies (Muskat et al. 2018; Shoval and Ahas 2016). This article focuses on a review of stage 3, namely, technology-mediated mobile ethnography without the need to be co-present. In the review that follows, the connections between traditional and mobile ethnography are first discussed. Then different types of mobile ethnographic studies will be summarized with the tools they used and their respective scopes. We then examine the advantages, opportunities, concerns, and limitations of mobile ethnography as a research approach. Finally, an observation of mobile ethnography as media convergence is presented as an open-ended conclusion.
Ethnography Traditions and Mobile Ethnography “Ethnos” is a Greek term that means “folk” (Savin-Baden and Major 2013). In turn, ethnography has become the term for research that aims to understand people, cultures, and values, usually via intensive fieldwork that sometimes may take years. The key characteristics of ethnographic research include a focus on everyday life, long-term immersion of researchers in the field, participant observation as the primary method, an in-depth and unstructured approach, and findings presented from the perspective of participants (Savin-Baden and Major 2013). Early ethnographic studies aimed to understand other cultures and countries, focusing on cultural differences with existing primary Western values. Most studies were conducted in the field of comparative cultural anthropology (Creswell 2013). Once modern ethnography developed into more diverse types, traditional ethnography and its focus on cultures came to be referred to as “cultural ethnography”
(Savin-Baden and Major 2013). Over the years, several different types of ethnography have been developed, and they vary in terms of ontological and epistemological stances. For example, realist ethnography aims to discover existing reality; autoethnography considers reality as an individual’s mental construction, and critical ethnography assumes knowledge is obtained by co-construction of critical consciousness. Without intending to draw a clear boundary that differentiates ethnographic studies from other qualitative studies, Hammersley and Atkinson (2007) refer to ethnography as a method or a set of methods for qualitative research, which typically involves a researcher’s participation in the subjects’ lives for an extended period of time. Ethnography involves “watching what happens, listening to what is said, asking questions” to collect information relevant to the research topic. With this unstructured approach, the research design of ethnographic studies is “almost superfluous” because studies are often conducted with open-ended observation (Hammersley and Atkinson 2007, p. 24). The research objectives are to describe and explain a phenomenon and to develop theories. The research problem is normally not pre-defined but formed during the early stage of data collection. Participant observation is an important tool in ethnographic studies. Ethnographers need to spend a significant amount of time (ideally 2 years), to participate in the daily life of subjects and observe what they experience in detail (Tedlock 2005). In the branch of “autoethnography,” researchers merge the autobiographic impulse (the gaze inward) and ethnographic impulse (the gaze outward) by self-reflection in their writing. Interviewing is another important tool, consisting of either informal conversations or formal interviews (Hammersley and Atkinson 2007). Only formal interviews are distinctive from participant observation. 
Hammersley and Atkinson (2007) consider the distinctive setting of these interviews as a resource rather than a problem. Participants’ verbal explanations enable researchers to better understand how participants behave in different situations. As a new branch of ethnography, mobile ethnography refers to ethnographic studies that adopt mobile technology-based data collection methods (Muskat et al. 2018). Aligning with the tradition of participant observation, mobile devices such as mobile phones have become an “extension of the participant observer’s self” in mobile ethnographic studies (Hein et al. 2011, p. 258). Since mobile ethnography is defined by its data collection approach, it carries no pre-assigned philosophical stance. Researchers from both realist and idealist traditions can adopt mobile ethnography in their studies. For example, Vu et al. (2018a) investigated the potential privacy risks of location-based social media via data mining of venue check-in data with a positivist philosophical stance, while Shoval et al. (2018) explored travelers’ experience at a destination from a constructivist perspective. Mobile ethnography carries on ethnographic traditions while exhibiting its own unique characteristics. For example, in traditional ethnographic studies, researchers normally observe the behaviors of participants by physically immersing themselves in the field. With the assistance of technology, researchers are able to obtain real-time information about subjects without maintaining a physical
presence near them. It is typical to collect different types of information, including geographical location, emotional arousal, and self-reported surveys at different time points (Birenboim et al. 2015; Kim and Fesenmaier 2015; Shoval et al. 2018). It is also possible to collect multimedia information from one self-reported survey throughout the trip (Idris et al. 2017). Moreover, mobile ethnography does not require a long period of time to obtain rich information. Typically, researchers follow the subjects from several hours to several days during a trip (Muskat et al. 2013; Stickdorn et al. 2014), or researchers collect historical online information shared by travelers during their trip (Vu et al. 2015). Two traditional ethnography concepts are close to mobile ethnography: netnography and digital ethnography. Mobile ethnographic studies that use secondary online information may seem similar to netnography, but netnography has a different scope compared to that of mobile ethnography. Netnography “is a qualitative research method that allows for in-depth, contextual understanding of data derived from websites, social media or mobile phone applications” (Gretzel 2021, p. 258). Netnography typically has three data collection movements: (1) immersion (participant observation); (2) investigation (use of archival data available online); and (3) interaction (elicitation of data through interviews, research websites, apps— including those used for mobile ethnography) (Gretzel 2021). Without contact between researchers and the informants when focused on immersion and investigation, netnography is often less obtrusive and can be used to study sensitive topics such as cosmetic surgery (Langer and Beckman 2005). One example of adopting netnography in tourism studies is to understand travelers’ perceptions toward a cultural-themed dining experience by reviewing their online comments (Mkono 2013). The comments, however, are not necessarily generated by mobile technology. 
Thus, netnography studies are not necessarily mobile ethnographic studies. Digital ethnography broadly refers to digitized ethnography where the contact between ethnographers and participants is mediated by digital platforms (Pink et al. 2016). The data collected, the data collection method, and how data is presented are changed by digital media. For example, instead of having a physical presence, researchers may ask participants to involve them in their social media practices. The traditional ethnographical way of taking field notes is replaced by multimedia recordings. The presentation of findings can also be digitized with video clips and computer-generated graphs. Thus, both mobile ethnography and netnography operate under the larger umbrella of digital ethnography. Table 1 clarifies the meaning of the different terms used in this article.
Table 1 Terms
Term | Type of studies | References
Mobile ethnography | Researchers travel with participants, conducting participant observation and interviews | Urry (2016)
Mobile (technology) ethnography (in this article) | Researchers use data collected or generated by mobile technology (e.g., smartphone, wearable devices) to derive rich cultural insights about participants | Muskat et al. (2018)
Netnography | Netnography has three data collection movements: (1) immersion (participant observation); (2) investigation (use of archival data available online); and (3) interaction (elicitation of data through interviews, research websites, apps, including those used for mobile ethnography) | Gretzel (2021)
Digital ethnography | Researchers use digital media and tools to conduct ethnographic studies | Pink et al. (2016)
Virtual ethnography | Researchers explore the social interactions and consumer behaviors in online settings | Hine (2000)

Types of Mobile Ethnographic Studies There are four types of studies with topics related to mobile ethnography currently present in the literature: (1) studies collecting real-time data using mobile devices; (2) studies analyzing historical data collected and stored by online, mobile-based platforms; (3) studies focusing on revealing the effectiveness of, and issues with, adopting mobile ethnography; and (4) literature review studies discussing the above empirical studies.
Studies Adopting Real-Time Data Collection This type of study collects data while participants are traveling, a form closer to a traditional ethnographic study than studies that analyze historical data collected through mobile online platforms. Real-time data is created and collected almost simultaneously. Instead of referring to themselves as mobile ethnographic studies, studies of this type often state that an experience sampling method is used (Birenboim et al. 2015; Shoval et al. 2018). Experience sampling is a data collection method that is not widely used in tourism studies (Quinlan Cutler et al. 2018). It captures mental processes by eliciting self-reported data at different points of time during a day (Csikszentmihalyi and Larson 2014). Thus, it is used in psychology studies to collect information at certain moments in real time that is often difficult to capture afterward, as in the case of real-time emotions (Killingsworth and Gilbert 2010). In tourism studies, experience sampling focuses on in situ data, real-time data, and the subjective experience of travelers (Quinlan Cutler et al. 2018). Usually, multiple types of data are collected in one study. Locational and emotional information is collected almost continuously due to the high-resolution measurement of mobile devices. Self-reported data reflecting travelers’ views and feelings is collected a predetermined number of times during a day, either via text or other media. Observation of participants is more structured in terms of location and emotion, and it remains unstructured when participants share their views. There
are mobile applications developed to assist in collecting this real-time self-reported data (e.g., Sensometer). Typically, experience sampling studies involve seven to ten short self-reports per day, and each report can be completed within 2 min (Quinlan Cutler et al. 2018). In tourism studies, researchers have tried to reduce the required effort of participants by allowing them to use their preferred way of expression. For example, Dimanche and Prayag (2016) collected real-time data via mobile applications, on participants’ own phones or loaned devices, to evaluate each touch point using Likert scale questions and recordings of perceptions in multimedia form. Muskat et al. (2013) used the mobile application myServiceFellow to collect data at different touchpoints during a museum visit. Participants were invited to share multimedia content including text, audio, photos, and videos. Similarly, Bosio et al. (2017) allowed travelers to choose their preferred media type to record their experience, and the overall satisfaction of the experience was scored at each touchpoint. Sometimes, the number of report times is reduced to optimize participants’ overall experience. Liu et al. (2016) collected event attendees’ emotions at three times during the event as well as one time after the event via a mobile application. They found that participants’ emotions during the events can be different even though their overall experiences are similar. In these studies, mobile ethnography is an active approach involving interactions between participants and researchers (Hein et al. 2011). Participants report their experience through mobile devices and collect data like a researcher under guidance (Bosio et al. 2017). They are involved in ethnography for recording observations and data co-creation (Hein et al. 2011). While real-time data is collected, researchers have found that traditional ethnography data collection methods, such as an interview, are useful as a supplement. 
Supplementary data collected in a traditional way can help researchers understand and interpret the real-time data they collect. For example, Birenboim et al. (2015) conducted a survey before they monitored activities in real time. Shoval et al. (2018) carried out both a questionnaire survey and a semi-structured interview at the beginning of data collection. Kim and Fesenmaier (2015) conducted a post hoc interview with each participant after emotional data was collected, to understand how the time, locations, activities, and perceptions corresponded to detected emotional variance. Quinlan Cutler et al. (2018) conducted two semi-structured in-depth post-trip interviews, one 3–4 months and one 16–17 months after the trip, to understand momentary experience over the short term and long term. Outside the field of tourism, a similar approach has been used to understand residents’ movements and trips taken around a city. Spatial data is considered the basis for understanding individuals’ physical movements, and textual data is collected to explain the reasons for those movements. Asakura et al. (2014) adopted the Probe Person Survey System, which runs on GPS-enabled mobile phones, to collect self-reported travel purposes, travel modes, and facilities together with auto-recorded GPS data. Chai et al. (2014) used both a GPS tracker and online diaries to understand the transportation choices and movements of residents.
While most studies have used traditional data collection methods such as interviews to collect extra information and interpret spatial data, one study used spatial data as a supplement. Kou et al. (2017) conducted in-depth interviews as well as participant and non-participant observation in their study of seasonal residents’ life at the destination, and they also collected location information with a GPS tracker. The GPS data was used to understand the daily activities of participants, which supplemented their main field work.
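The experience sampling schedule described in this section (a handful of short self-reports spread over each day) can be illustrated with a minimal Python sketch. It draws one random prompt time per equal time block so that prompts are spread across the day and never fall too close together. The function name sample_prompt_times, the 09:00 to 21:00 window, the eight prompts, and the 30-minute minimum gap are illustrative assumptions, not parameters reported by the studies cited above.

```python
import random
from datetime import datetime, timedelta


def sample_prompt_times(day_start, day_end, n_prompts, min_gap_minutes=30, seed=None):
    """Draw n_prompts random prompt times between day_start and day_end.

    Hypothetical helper: the window is cut into n_prompts equal blocks and one
    prompt is drawn per block, leaving min_gap_minutes free at the end of each
    block so that consecutive prompts are at least that far apart.
    """
    rng = random.Random(seed)
    block = (day_end - day_start) / n_prompts
    gap = timedelta(minutes=min_gap_minutes)
    if block <= gap:
        raise ValueError("window too short for the requested prompts and gap")
    times = []
    for i in range(n_prompts):
        block_start = day_start + i * block
        # Draw inside the block, stopping min_gap before the next block begins.
        offset = rng.uniform(0, (block - gap).total_seconds())
        times.append(block_start + timedelta(seconds=offset))
    return times


if __name__ == "__main__":
    # Assumed example: eight prompts between 09:00 and 21:00 on one travel day.
    start = datetime(2022, 1, 1, 9, 0)
    end = datetime(2022, 1, 1, 21, 0)
    for t in sample_prompt_times(start, end, n_prompts=8, seed=42):
        print(t.strftime("%H:%M"))
```

Drawing one prompt per block is only one possible design. A study that reduces the number of reports, as some of the studies above did, would simply pass a smaller n_prompts.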
Historical Data Collection Historical data collection studies gather empirical evidence from publicly recorded data that is created by mobile technology and carries spatial and temporal stamps (e.g., geo-tagged photos shared online). The data can be collected at any time after its creation, and researchers have the freedom to choose related data according to their research objectives. Unlike traditional ethnographic studies, these studies are more structured. However, they still inform mobile ethnographic research because they aim to understand the mobility of a population, which is indispensable to understanding culture (Greenblatt and Županov 2010). Flickr has become a popular data collection site because researchers have limited access to the same type of data from other sites like Snapchat and Facebook (Miah et al. 2017). Vu et al. (2015) collected geo-tagged photos posted on Flickr taken in Hong Kong by inbound visitors. They analyzed visitors’ movements for locations of interest and the sequences and times of visiting. Miah et al. (2017) also used geo-tagged photos from Flickr. They analyzed visited places in Melbourne and the different behaviors among visitors from different regions. Wood et al. (2013) conducted a similar study but extended the scope from a city to the entire world. They collected photos shared on Flickr and identified popular destinations with a focus on natural recreation areas. The visitation rate for 836 sites was calculated based on secondary data. To investigate whether this is an effective way of estimating visitations, the researchers selected five nations to conduct a comparison study. The results showed that the demographic characteristics of Flickr users are similar to those of overall actual visitors to these locations, based on the data collected by immigration authorities. By looking at the places visited by the same person over time, Vu et al.
(2018c) analyzed the geographical data from geo-tagged photos shared on Flickr, and they identified sequential rules associated with travelers’ visitations of destinations. For example, many travelers who have visited Bangkok also visit Kuala Lumpur or Singapore later. Data mining techniques were used in this study, and the results may help predict tourist destination choices. Vu et al. (2018d) proposed a concept of “venue-referenced social media data” which refers to data that allows researchers to know where users have visited. The researchers collected 1 year of data from Foursquare via a Twitter API, and they analyzed the destination activities of travelers from different nations. For example, about one-fifth of travelers from Malaysia would “check in” at a coffee shop during travel. By comparing results across countries, researchers were able to identify the different preferences for local activities among travelers from different nations.
The results further identified the popular visiting times by venue types, which also demonstrated differences across nations. In another study, Vu et al. (2018b) used check-in data from Foursquare to understand popular travel routes and times for tourists visiting Hong Kong. A similar study has been conducted in the destination of Macau (Luo et al. 2019), revealing popular attractions visited and the time of check-in via Foursquare. Since Macau is a typical gaming destination, Luo et al. (2019) focused on visitors’ local activities to explore what motivates or interests them to visit Macau. The findings show that tourists not only visit Macau for gaming; they also make diverse activity choices. A few studies have collected data stored by mobile applications other than Flickr and Twitter. For example, GPS data from visits to a green area for sports activities can be collected by sport-tracking applications, and researchers can collect these data for analysis if the participants are willing to share the information. Korpilo et al. (2017) asked runners to share their tracking data related to their visitations of a big park within a large green area. Rodríguez et al. (2018) collected data of travelers’ routes via a travel information application, and they segmented travelers based on those routes. The contributions of this type of study seem mainly focused on proposing and demonstrating new research methods. The studies conducted by Vu et al. (2015, 2018b) revealed patterns which are easily recognized by industry experts. Using historical data, studies often successfully discover patterns but could not explain the reasons behind those patterns. The study conducted by Vu et al. (2018c) found the preferred sequence of visiting destinations by Australian tourists, but they would need further information to explain the reasons behind this, such as demographic characteristics. Yao et al. 
(2018) collected spatial-geographical data from Weibo, a Chinese microblogging platform comparable to Twitter. The researchers analyzed the geographical information shared when users post on the platform to understand individual movement patterns. They also focused on proposing data mining algorithms for individual mobility patterns instead of explaining movement patterns. Rodríguez et al. (2018) segmented visitors based on their movement patterns but could not justify the meaning of the segments from socioeconomic and psychological perspectives. They commented that a mobile-integrated survey would help to verify the segments, and this is a strategy often used in studies that involve real-time data collection. To explain behavioral patterns as well as identify them, further studies can build on the findings of historical data collection research. While most previous studies in this category used mobile data that contributes to our understanding of mobility, they were mainly quantitative in nature and did not adhere to overall ethnographic principles. This implies that ethnographic methods could add value to these approaches and vice versa.
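The idea of a sequential rule, such as "travelers who visited Bangkok later visited Kuala Lumpur," can be made concrete with a small Python sketch. It estimates the confidence of each rule "visited A, later visited B" from chronologically ordered visit lists, one per traveler. This is a simplified illustration, not the data mining procedure used in the studies cited above; the function name sequential_rule_confidence, the min_support threshold, and the sample itineraries are hypothetical.

```python
from collections import Counter


def sequential_rule_confidence(sequences, min_support=2):
    """Estimate confidence of rules 'visited A, later visited B'.

    sequences: one list of destinations per traveler, in chronological order
    (for instance, derived from the timestamps of geo-tagged photos).
    Returns {(A, B): confidence} for rules backed by at least min_support
    travelers, where confidence = travelers with A then B / travelers with A.
    """
    visited = Counter()   # number of travelers who visited A at all
    followed = Counter()  # number of travelers who visited A and, later, B
    for seq in sequences:
        seen = set()
        pairs = set()
        for place in seq:
            # Every place seen earlier in this trip history precedes 'place'.
            pairs.update((earlier, place) for earlier in seen if earlier != place)
            seen.add(place)
        visited.update(seen)    # each traveler counts once per destination
        followed.update(pairs)  # and once per ordered pair
    return {
        (a, b): count / visited[a]
        for (a, b), count in followed.items()
        if count >= min_support
    }


if __name__ == "__main__":
    # Made-up itineraries for illustration only.
    itineraries = [
        ["Bangkok", "Kuala Lumpur"],
        ["Bangkok", "Singapore"],
        ["Bangkok", "Kuala Lumpur", "Singapore"],
        ["Singapore", "Bangkok"],
    ]
    for rule, conf in sequential_rule_confidence(itineraries).items():
        print(rule, round(conf, 2))
```

As the section notes, a rule of this kind describes a pattern but does not explain it; the reasons behind a sequence would still have to come from surveys, interviews, or other ethnographic material.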
Methodology Studies Since mobile ethnography is an emerging approach, a few methodology studies exhibit the feasibility of adopting this approach and identify related issues. Idris et al. (2017) evaluated the performance of using smartphones as a data entry tool
for collecting different types of data. While participants were entering data on a smartphone, the video cameras they were carrying allowed researchers to observe the process and performance. Performance was evaluated by the efficiency of data entry, user-friendliness, and ease of recovery from errors. The interactions between users and the interface showed that radio buttons and image-icon forms outperform free text forms and drop-down lists. In terms of efficiency, free text forms seem to have the worst performance on user-friendliness. Participants were able to recover from most errors, meaning that most of them could contribute data successfully within the time limit. The findings of this study contribute to the design of mobile data collection interfaces. Miyasaka et al. (2018) investigated the limitations of using smartphones to monitor tourist behaviors by collecting locational data via mobile applications and distributing surveys to investigate why some individuals might not want to participate. They identified low response rates and demographically biased samples as limitations. The major reasons people refuse to participate are time and inconvenience; other reasons include unfamiliarity with the technology, privacy concerns, lacking a capable mobile phone or flat-rate internet access, and lack of interest. Hardy et al. (2017) also mentioned the limitations of using advanced mobile technologies in tourism research and emphasized the need to develop alternative methods to increase the effectiveness of tourist tracking studies. Employing digital tracking technologies, recruiting participants, and gaining their consent have been challenging, which in the worst case might jeopardize data quality.
Literature Reviews With the growing number of studies adopting mobile ethnography in the tourism field, a few literature review papers have outlined the development and problems of mobile ethnography as a research approach (Muskat et al. 2018; Shoval and Ahas 2016). Shoval and Ahas (2016) reviewed mobile ethnographic studies using technology for tracking tourist behaviors published between 2005 and 2015. According to their findings, the main research interest has changed from the possibility of using technologies to study tourist behaviors to discovering new aspects of tourism with new data types and challenging fundamental questions in tourism with new data. Reviewing studies conducted in the field of tourism, health, and retail, Muskat et al. (2018) attempted to better define mobile ethnography by considering the role of the researcher, the focus of research, data collection tools, and data analysis approaches. They identified two approaches that are related to, but different from, mobile ethnography: multi-site ethnography which studies people’s movements at multiple sites and netnography which studies social phenomena represented in online settings. Mobile ethnography also collects online data from online communities such as social media platforms and applications. Muskat et al. (2018) believe that mobile ethnography data is co-created by researchers and participants. Thus, it is suggested that researchers should specify their role in the data co-creation process as self-reflexive to the method.
Tools and Scopes In mobile ethnographic studies, the smartphone is the “glue” that brings together different methods and different sources of data (Hein et al. 2011). As shown in Table 2, the smartphone can be used to collect multiple types of data including locations, survey results, SMS messages, pictures, videos, and voice data. Shoval and Ahas (2016) reviewed papers about new tracking technology and found that two-thirds of the studies used GPS receivers. Other less frequently used methods include passive positioning (e.g., secondary data from the mobile operator), Bluetooth, and geo-tagged photos. Table 2 shows that many studies conducted after 2015 have started to use smartphones to track locations instead of using a GPS tracker, although the type of information collected may still be the same. Other than smartphones and GPS trackers, some studies have used emotion sensors and video cameras, depending on the purpose of the studies. Mobile ethnography has been widely recognized as a research approach, and research companies now support data collection in real time. For example, SIS International Research (2019) has a real-time survey application service named “app ethnography.” Other companies that provide mobile ethnography applications or services include TSR (2019), experiencefellow (2019), indeemo (2019), dscout (2019), and Metricwire (2019). Among studies based on real-time data, geographical scopes range from attractions such as a museum (Muskat et al. 2013), a zoo (Birenboim et al. 2015), an event venue (Liu et al. 2016), or a green area (Korpilo et al. 2017) to cities (Asakura et al. 2014; Bosio et al. 2017; Chai et al. 2014; Dimanche and Prayag 2016; Kim and Fesenmaier 2015; Kou et al. 2017; Shoval et al. 2018), provinces, and countries (Quinlan Cutler et al. 2018; Rodríguez et al. 2018). Studies of historical data have larger geographical scopes ranging from cities, such as Hong Kong (Vu et al. 2015, 2018b), Wuhan (Yao et al. 2018), and Melbourne (Miah et al. 2017), to a global level (Vu et al. 2018c,d; Wood et al. 2013).
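One way to picture the smartphone as the "glue" between data sources is to attach each self-report to the GPS fix recorded closest to it in time. The Python sketch below does this with a binary search over time-sorted fixes. It is a hypothetical illustration rather than a procedure taken from the studies above; the function name nearest_fix, the tuple layout (unix_time, latitude, longitude), and the five-minute tolerance are assumptions.

```python
from bisect import bisect_left


def nearest_fix(gps_fixes, report_time, max_gap_seconds=300):
    """Return the GPS fix closest in time to a self-report, or None.

    gps_fixes: list of (unix_time, latitude, longitude), sorted by unix_time.
    report_time: unix time at which the participant submitted the report.
    A fix further away than max_gap_seconds is treated as no match, so a
    report is never tied to a location recorded long before or after it.
    """
    times = [fix[0] for fix in gps_fixes]
    i = bisect_left(times, report_time)
    # Only the fixes just before and just after the report can be the nearest.
    candidates = gps_fixes[max(i - 1, 0): i + 1]
    if not candidates:
        return None
    best = min(candidates, key=lambda fix: abs(fix[0] - report_time))
    if abs(best[0] - report_time) > max_gap_seconds:
        return None
    return best


if __name__ == "__main__":
    # Made-up fixes one minute apart, for illustration only.
    fixes = [(0, 22.30, 114.17), (60, 22.31, 114.18), (120, 22.32, 114.19)]
    print(nearest_fix(fixes, report_time=70))
```

The tolerance matters in practice because location is logged almost continuously while self-reports arrive only a few times a day, so a report submitted while tracking was switched off should stay unmatched rather than inherit a stale position.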
Table 2 Devices used and data collected
Device | Data | References
Smartphone | Location (high-resolution spatiotemporal and locational data, e.g., mobile applications such as the Zeeland mobile app, which is developed by the local tourism information agency, Simple Logger, and sports tracking applications such as Sports Tracker, Strava, Polar, Endomondo) | Chai et al. (2014), Korpilo et al. (2017), Miyasaka et al. (2018), Rodríguez et al. (2018), and Shoval et al. (2018)
Smartphone | Real-time survey via mobile applications (e.g., Sensometer, myServiceFellow, ExperienceFellow, Probe Person Survey, and MetricWire) | Asakura et al. (2014), Bosio et al. (2017), Dimanche and Prayag (2016), Liu et al. (2016), Muskat et al. (2013), and Shoval et al. (2018)
Smartphone | Guided SMS (what participants are doing, looking at, and feeling) | Birenboim et al. (2015)
Smartphone | Pictures and videos shared along the trip | Kim and Fesenmaier (2015)
Smartphone | Voice recording of open-ended questions | Quinlan Cutler et al. (2018)
Wrist-worn and wireless sensors, such as Empatica E4 (clinical device) or Affective Qsensor | Emotion (skin conductance, heart rate measures, blood pressure, skin temperature) | Kim and Fesenmaier (2015) and Shoval et al. (2018)
GPS tracker | Flextrack Lommy Phoenix Personal | Birenboim et al. (2015) and Kou et al. (2017)
Video camera | Observation of the data entry process | Idris et al. (2017)
No device required | Social media, such as Twitter and Flickr’s Application Programming Interface (API) | Vu et al. (2015) and Yao et al. (2018)

Advantages and Opportunities Reduced Average Cost In traditional ethnographic studies, researchers could normally approach only individuals who are physically near them. Mobile ethnography enables researchers to study a larger number of participants at the same time, thus lowering the cost of data collection (Bosio et al. 2017), especially for data collected via experience sampling. Scientists used to have only small sample sizes when studying human emotions due to the high cost of collecting data, but Killingsworth and Gilbert (2010) developed a mobile application that could contact a large number of participants from 83 different countries, asking respondents at random moments to self-report what they are doing and how they feel. Since then, enlarging sample sizes or launching new research projects no longer incurs high data collection costs. Without mobile technology, by contrast, the scope and execution of such studies would be limited. Moreover, when researchers use secondary data for analyzing travelers’ movements, the scope of the study can be extended globally without a significant increase in cost (Wood et al. 2013). Many studies are now using smartphones instead of GPS trackers to collect spatial data. In the past, costs related to the purchase of trackers, website development, and the resources spent on monitoring data collection were considered
relatively high. Inviting participants to use their own smartphones for GPS tracking is thus an effective cost-reduction approach. However, this advantage is based on the assumption that most of the data is collected via smartphones. When other devices such as emotion trackers must be purchased, the cost of purchasing the equipment may increase depending on the number of participants required.
Reduced Effort and Skills from Participants
In real-time mobile ethnographic studies, a large amount of high-resolution data on locations and emotions can be collected from participants after the initial setup of the data collection devices, which demands less participation than conventional techniques such as activity diaries (Shoval et al. 2018). In studies that use historical data, participants’ own devices and mobile applications can replace traditional GPS tracking devices. The effort required for data collection is limited to giving consent and sharing the data, as in studies that monitor individuals’ sports activities (Korpilo et al. 2017). New mobile-enabled data collection approaches also require fewer language skills from participants. For example, Kim and Fesenmaier (2015) invited participants to take photos and videos with their own smartphones to compose a travel diary. Compared with writing a textual travel diary, taking photos and videos seems to be a less demanding and less time-consuming task.
Validity and Credibility
In mobile ethnographic studies, the validity and credibility of research findings can be enhanced by reducing recall bias. In a retrospective report, participants might have difficulty answering some of the questions raised by the researchers, such as the number of days a patient has experienced a headache in the past 12 months (Schwarz 2007). Such recall bias can be largely minimized using a mobile ethnography approach since the data is collected almost in real time (Bosio et al. 2017) or drawn from historical data that was stored in real time. During real-time data collection, participants can immediately report small events that might otherwise not be remembered over a longer reference period. Detailed spatial and temporal data can also be collected. For example, the exact check-in time, which is sometimes difficult for participants to recall, can be obtained from Twitter posts (Vu et al. 2018b). By adopting a mobile ethnographic approach in these ways, recall bias can be minimized because the participants’ retrieval of information from their memories is no longer required (Quinlan Cutler et al. 2018). Moreover, data collected in mobile ethnographic studies is more continuous, thanks to mobile technology. Travelers are able to recall discrete events that happened during their trips. Tools such as GPS and devices used to measure emotional changes can capture continuous physical and psychological movement, a
part of experiences that has normally been neglected in self-report data collection (e.g., Kim and Fesenmaier 2015). Although survey data in experience sampling methodology is not completely continuous, it captures data at multiple times and locations, thus enabling researchers to understand the interactions between the person and the context (Quinlan Cutler et al. 2018). Bosio et al. (2017) believe that using mobile technology in data collection reduces the influence of the researcher. Hein et al. (2011, p. 265) found that using a participant’s mobile phone, which is part of the natural setting, benefits the research, since other tools such as disposable cameras would bring “an alien, unnatural and uncomfortable element” into the relationship between the researchers and the participants. Using the participants’ own smartphones to collect data is less obtrusive, thus increasing ecological validity (Quinlan Cutler et al. 2018). Validity can be further improved by various types of triangulation. Investigator triangulation can be achieved by having different researchers analyze the same set of data, which might otherwise require the researchers to travel together (Bosio et al. 2017). Method triangulation is often performed by including interviews and surveys in mobile ethnographic studies (e.g., Birenboim et al. 2015). Data triangulation can be achieved by collecting multimedia data in the same study (e.g., Muskat et al. 2013).
Augmented Effect of Multi-types of Data
Different types of data collected simultaneously from the same participant enable researchers to observe more than they could if each type of data were collected separately through different studies. Heckhausen and Heckhausen (2018) commented that a much richer picture of layered social realities can be captured than by using methods in isolation. For example, by mapping location data, physical emotion tracked by a clinical device, and emotion reported by participants, Shoval et al. (2018) were able to interpret physical emotional data along with self-reported emotional experience and at the same time identify the participants who did not complete the survey in real time. By combining tourists’ physical movement and their instant sharing of experience, Birenboim et al. (2015) prepared a map of a zoo experience that highlights both attractive and less attractive areas. They found that the dynamic zoo experience was perceived positively overall after the whole trip. Thus, moments with experience ranging from negative to positive can be better captured by the real-time reports of participants.
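The alignment of several data types from one participant, as described above, can be illustrated with a minimal Python sketch. It is not the procedure used by Shoval et al. (2018): the record layouts, the field names, and the five-minute threshold are all assumptions made for illustration. Each survey response is matched to the sensor reading closest in time to the survey prompt, and responses answered long after the prompt are flagged as not completed in real time.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class SensorReading:
    # Hypothetical record from a wrist-worn sensor combined with a GPS fix.
    t: float                 # seconds since the start of the visit
    lat: float
    lon: float
    skin_conductance: float  # assumed unit: microsiemens


@dataclass
class SurveyResponse:
    # Hypothetical record from an experience sampling app.
    triggered_at: float  # when the prompt was sent (seconds)
    answered_at: float   # when the participant answered (seconds)
    self_reported: int   # e.g., 1 (negative) to 5 (positive)


def nearest_reading(readings, t):
    """Return the reading closest in time to t.

    Assumes readings is non-empty and sorted by timestamp.
    """
    times = [r.t for r in readings]
    i = bisect_left(times, t)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = readings[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda r: abs(r.t - t))


def align(readings, responses, max_delay=300.0):
    """Join each survey response with location and physiological data.

    max_delay (seconds) is an assumed cut-off for "answered in real time".
    """
    rows = []
    for resp in responses:
        r = nearest_reading(readings, resp.triggered_at)
        rows.append({
            "lat": r.lat,
            "lon": r.lon,
            "skin_conductance": r.skin_conductance,
            "self_reported": resp.self_reported,
            "real_time": (resp.answered_at - resp.triggered_at) <= max_delay,
        })
    return rows
```

Rows with `real_time` set to `False` would need separate treatment, since the self-reported emotion may no longer refer to the place and moment at which the prompt was issued.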
More Possibilities
Many people are willing to share their experience and are intrinsically motivated to participate in research, even without incentives (Bosio et al. 2017). For example, Asakura et al. (2014) successfully recruited 48 participants to join a survey
conducted on smartphones without distributing any incentives. This is not difficult to imagine in the tourism and hospitality field, since travelers are self-motivated to share their travel experience online or with friends. Mobile ethnography provides more forms and points in time for participants to share those experiences. Since researchers do not need to be with participants physically, data collection can be conducted even if participants enter private areas where security policies may not allow researchers to enter (SIS International Research 2019). For example, in the study by Hein et al. (2011), a researcher could not join events that traditionally exclude females but used mobile technology to get involved in those events. It seems that the development of technology has also inspired researchers to conduct studies that measure experience at multiple points in time. Even before the popularity of smartphones, researchers could ask participants to fill in the respective form at certain times during the day. But the development of technology and real-time experience sampling has encouraged researchers to conduct such research (Liu et al. 2016). Thus, the availability of this new research approach can inspire new research ideas. There are also more possibilities for suggesting practical implications based on the results of rich data collected in natural settings. Studies based on online historical data include those that use mobile big data. Findings based on big data can often generate advice that assists consumers’ decision-making (Ahmed et al. 2018). Bothorel et al. (2018) have proposed algorithms for recommending places and activities for social media users. With location-based data shared by users on applications such as Foursquare, Gowalla, and Facebook Places, as well as point-of-interest (POI) data, they developed strategies to suggest recommendations based on users’ historical data, profiles, and online friend networks.
They also came up with ways to evaluate whether users would take the recommendations. Locational information is not only effective for making location-based recommendations but can also make recommendations of activities, such as movies, more efficient (Bothorel et al. 2018).
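To make the idea of recommending places from check-in histories and friend networks more concrete, a toy Python sketch follows. It is not one of the algorithms proposed by Bothorel et al. (2018); it is a deliberately simple heuristic, and the data structures and names are hypothetical. Places the user has not yet visited are ranked by how many of the user's online friends have checked in there.

```python
from collections import Counter


def recommend(user, checkins, friends, k=3):
    """Rank unvisited places by the number of friends who checked in there.

    checkins: dict mapping a user name to a list of visited place names
    friends:  dict mapping a user name to a list of friends' user names
    Both structures are illustrative assumptions, not a platform API.
    """
    visited = set(checkins.get(user, []))
    scores = Counter()
    for friend in friends.get(user, []):
        # Count each friend at most once per place.
        for place in set(checkins.get(friend, [])):
            if place not in visited:
                scores[place] += 1
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [place for place, _ in ranked[:k]]
```

A realistic recommender would also weigh user profiles, distance, and time, and would have to be evaluated against whether users actually follow the suggestions.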
Concerns and Limitations
Difficulties and Bias in Data Collection
In mobile ethnographic studies, researchers use different types of data collection methods, and experience sampling is one of them. Since participants need to collect and report data under guidance, clear instructions and training are needed to ensure the process is successfully executed. If the devices and applications used for mobile ethnography are user-friendly, the training can be completed in a short time (Bosio et al. 2017). As various types of mobile communication channels have penetrated our daily life, it is expected that the required training and instruction will be less intensive in the future. Other than clear instructions, participants’ commitment to the data collection task, from the beginning to the end, is also important to ensure the success of data
collection. The quality of data largely depends on the participants’ performance. For example, Bosio et al. (2017) received insufficient data from participants because of the low number of reported touchpoints and the way participants reported them. More than one-third of the participants reported only one touchpoint, making it difficult for the researchers to interpret why travelers were satisfied or dissatisfied. Birenboim et al. (2015) guided participants to send SMSs during their visit and instructed them to send at least ten messages, but the average number of messages received at the end was only seven. Liu et al. (2016) commented that a certain level of commitment is required from participants even if the researchers have minimized the time required for participants to answer the survey questions. One advantage of experience sampling is real-time reflection of tourists’ feelings. However, the effectiveness of such an approach depends on the participants. Shoval et al. (2018) found that some individuals may not respond to surveys in real time. Even if the survey is designed with locational or temporal triggers, participants may choose to complete multiple questions at the same time and same location. This below-expected performance of participants may be attributed to insufficient instruction before the research or insufficient incentives, as reflected by Bosio et al. (2017). The success of data collection also relies on the extent to which the device used for data collection works properly and is user-friendly. For example, in the study by Quinlan Cutler et al. (2018), a research assistant followed the participants to provide support. Despite this, the researchers experienced recording errors and failed to transcribe the full content because of the participants’ carelessness, fast talking speed, and background noise. Muskat et al.
(2018) commented that mobile devices and applications that are not user-friendly can severely limit the effectiveness of data collection. There are also concerns regarding sampling bias in both real-time and historical data-driven studies, as well as data bias in the latter. For real-time studies, Asakura et al. (2014) identified an issue related to sampling bias because only people who own smartphones could participate in this type of research. For historical data studies, the data were often associated with young and tech-savvy individuals (Miah et al. 2017). This concern has become less significant as the popularity of the smartphone has increased. Concerns about data bias stem mainly from studies involving analysis of historical data. First, the availability of the data depends on the policies of a specific platform. Most recent studies have collected data from Flickr and Twitter because they are open for data collection. If their access policies change in the future, many studies of this type will not be feasible. Second, in studies using secondary data, the data is limited to the information shared by travelers (Wood et al. 2013). For example, to understand travel destination activities, Vu et al. (2018d) collected only venue check-in data from Foursquare because that was what travelers shared. Similarly, Vu et al. (2018c) used the travel diaries of Flickr users to examine which destinations they visited, but the researchers later realized that travelers may not include every trip in their album. Thus, the sequences of visited destinations may not reveal actual travel behaviors.
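The limitation of destination sequences derived from shared photos can be sketched in Python. The example below is only an illustration and is not the sequential rule mining method of Vu et al. (2018c); the input format and the two-day threshold are assumptions. It orders timestamped, destination-labelled photos, collapses consecutive photos from the same destination, and flags long gaps between destinations, which may conceal stops that the traveler never posted.

```python
from datetime import datetime, timedelta


def destination_sequence(photos, max_gap=timedelta(days=2)):
    """Derive a destination sequence from (timestamp, destination) pairs.

    Returns the ordered list of destinations and a list of suspicious gaps,
    each given as (previous_destination, next_destination, gap_length).
    max_gap is an assumed threshold, not an empirically derived value.
    """
    ordered = sorted(photos, key=lambda p: p[0])
    visits = []  # each item: (destination, first_photo_time, last_photo_time)
    gaps = []
    for ts, dest in ordered:
        if visits and visits[-1][0] == dest:
            # Same destination as the previous photo: extend the visit.
            visits[-1] = (dest, visits[-1][1], ts)
            continue
        if visits and ts - visits[-1][2] > max_gap:
            # A long silence between destinations may hide unrecorded stops.
            gaps.append((visits[-1][0], dest, ts - visits[-1][2]))
        visits.append((dest, ts, ts))
    return [dest for dest, _, _ in visits], gaps
```

A flagged gap does not prove that a stop is missing; it only marks where the shared data alone cannot confirm the actual itinerary, so follow-up questions or other data sources would be needed.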
The last concern is mainly associated with studies that involve real-time data collection, where participating in data collection may change travelers’ behaviors. It can be challenging to find the balance between intrusion into travelers’ experience and the effectiveness of data collection during the trip. Some researchers have considered this issue and minimized the number of questions for each measurement dimension (Liu et al. 2016).
Challenges in Data Analysis
In traditional ethnographic studies, participant observation requires researchers to capture the essence of moments instantly during the observation. Thus, “doing mobile ethnography means moving while analyzing movement” (Novoa 2015, p. 106). The meanings of moments could be lost if researchers could not capture them during participant observation. Smartphones and other mobile devices enable researchers to capture and analyze subjects’ movements after they have occurred. However, there are new challenges in data analysis. For example, in real-time data studies (Bosio et al. 2017; Muskat et al. 2013), participants may share multimedia data during data collection and may have the freedom to choose the type of media they prefer. Thus, heterogeneous data from different individuals can become a challenge for analysis. One technique used by Bosio et al. (2017) and Muskat et al. (2013) is to include a Likert scale rating question to gauge whether an experience is positive or not. However, the rich information shared in the multimedia data is still partially lost and mainly used for triangulation. Chai et al. (2014) developed a website to assist data collection and asked participants to write diaries about the trips they took during the day in order to understand their physical movements, which were recorded by GPS trackers. However, they found that participants’ diary entries did not match the GPS data, creating issues with data analysis. In studies involving historical data, challenges related to data analysis include the quality of the data and the skills needed. The secondary data is often collected, stored, and shared by participants for other purposes and not designed to match specific research purposes. Researchers, while enjoying the convenience of collecting ready-made data, need to take care of heterogeneity issues, which create challenges for ensuring data accuracy and consistency (Korpilo et al. 2017). Yao et al.
(2018) proposed new algorithms for studying individual mobility by using the data shared on social media platforms, as social media data often provides only a portion of locational data, which might not be as complete as that collected by GPS devices. Data mining skills are required to manage the large data sets that are often found in studies using secondary data (Vu et al. 2018c). Collecting and analyzing secondary data stored in open sources requires technological skills and knowledge. For example, many photos on Flickr cannot be directly downloaded but require programming procedures to do so (Vu et al. 2015). Again, some of these mining exercises violate ethnographic principles by stripping away context. Qualitative research approaches are helpful in interpreting and providing extra explanations of results generated from big data.
It is important to explain behaviors after discovering behavioral patterns. For example, studies discovering routes based on GPS data can often reveal movement patterns but cannot explain why participants have decided to choose a route. Thus, follow-up questions need to be asked (Korpilo et al. 2017). Many studies have demonstrated new approaches to analyze check-in or geo-tagged data from social media. However, while such studies have been successful and their methods are effective, their findings could have been more interesting and insightful if qualitative research techniques were also used (Vu et al. 2015, 2018a,b,d). Methodological innovation is worth appreciating, but how these methods can be further developed to reveal useful results is still a question.
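The mismatch between diary entries and GPS traces noted earlier in this section suggests a simple consistency check, sketched below in Python. This is a hypothetical illustration rather than the procedure of any cited study: the input formats, the 200-metre radius, and the 10-minute window are assumptions. A diary entry counts as matched if at least one GPS fix recorded close to the reported time lies near the reported place, with distances computed by the haversine formula.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    radius = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def check_diary(diary, track, places, radius_m=200.0, window_s=600.0):
    """Flag diary entries that no nearby GPS fix supports.

    diary:  list of (time_s, place_name) reported by the participant
    track:  list of (time_s, lat, lon) GPS fixes
    places: dict mapping a place name to its (lat, lon)
    The thresholds are illustrative assumptions.
    """
    results = []
    for t, name in diary:
        place_lat, place_lon = places[name]
        # Keep only fixes recorded within the time window around the entry.
        nearby_fixes = [(lat, lon) for ts, lat, lon in track
                        if abs(ts - t) <= window_s]
        matched = any(
            haversine_m(lat, lon, place_lat, place_lon) <= radius_m
            for lat, lon in nearby_fixes
        )
        results.append((t, name, matched))
    return results
```

Unmatched entries are natural candidates for the follow-up questions discussed above, since the check cannot tell whether the diary, the GPS signal, or the place coordinates are at fault.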
Privacy
As Asakura et al. (2014) have pointed out, privacy issues should be addressed through careful preparation and the design of easy-to-understand privacy policies at the data collection stage. Privacy can also be protected by limiting how the collected data will be used (Korpilo et al. 2017). It seems that researchers using secondary data for analysis have less concern about security and privacy issues (Vu et al. 2015; Yao et al. 2018). Researchers who collect data from participants directly tend to consider privacy as an issue of mobile ethnography. This is interesting because similar types of information are collected in both types of research, and researchers are usually able to seek consent from the participants. Researchers’ misunderstanding of policies related to privacy and data use might sometimes cause further issues. For example, researchers might assume that the channel/platform (where consumers shared their data) is liable for privacy protection.
Concluding Remarks on Mobile Ethnography and Media Convergence
Unlike those conducting traditional ethnographic studies in tourism, many researchers who adopt mobile ethnography do not classify their studies under the scope of mobile ethnography. The recognition and understanding of mobile ethnography as a research approach are still lacking. Sometimes, mobile ethnography is considered a specific data collection method. For example, Kou et al. (2017) used GPS to track the physical movement of participants and introduced “mobile ethnography” as a data collection method. Many researchers who have used real-time experience sampling or historical data also did not categorize their studies as mobile ethnography despite their research designs implying it. One should note that not all studies using mobile data should be considered mobile ethnography research, as the latter should, at the very minimum, contribute to a cultural understanding of mobility. While the body of literature using mobile ethnography is growing and the concept of mobile ethnography has been established, this chapter clarifies the concept of mobile ethnography and encourages future studies that adopt this research approach to
acknowledge the uniqueness and benefits of mobile ethnography so as to further develop this line of research. One reason why researchers rarely acknowledged mobile ethnography in their studies could be that a mobile technology-based approach is a natural consequence of media convergence. Media convergence refers to the process of technology advancement changing the relationship among “existing technologies, industries, markets, genres and audiences” (Jenkins 2004, p. 23). Mobile ethnography emerges along with the rapid growth of communication and technology innovation. Enabling data to be collected via one single device is more than a technological change; media convergence has fundamentally changed the relationship between researchers and participants. It is likely that mobile ethnography will continue to change and evolve in the future, as “convergence refers to a process, but not an endpoint” (Jenkins 2004, p. 23).
Cross-References
A Futuristic Look at Tourism in the Era of the Internet Ecosystem
Big Data Technologies
Strategic Use of Information Technologies in Tourism: A Review and Critique
References
Ahmed E, Yaqoob I, Hashem IA et al (2018) Recent advances and challenges in mobile big data. IEEE Commun Mag 56(2):102–108
Asakura Y, Hato E, Maruyama T (2014) Behavioural data collection using mobile phones. In: Rasouli S (ed) Mobile technologies for activity-travel data collection and analysis. IGI Global, Hershey, pp 17–35
Birenboim A, Reinau KH, Shoval N et al (2015) High-resolution measurement and analysis of visitor experiences in time and space: the case of Aalborg zoo in Denmark. Prof Geogr 67(4):620–629
Bosio B, Rainer K, Stickdorn M (2017) Customer experience research with mobile ethnography: a case study of the alpine destination serfaus-fiss-ladis. In: Belk R (ed) Qualitative consumer research. Emerald Publishing Limited, Bingley, pp 111–137
Bothorel C, Lathia N, Picot-Clemente R (2018) Location recommendation with social media data. In: Social information access. Springer, Cham, pp 624–653
Chai Y, Chen Z, Liu Y et al (2014) Space-time behavior survey for smart travel planning in Beijing, China. In: Rasouli S (ed) Mobile technologies for activity-travel data collection and analysis. IGI Global, pp 79–90. https://doi.org/10.1080/00045608.2013.792179
Creswell JW (2013) Qualitative inquiry and research design: choosing among five traditions. In: Hine C (ed) Virtual Ethnography. Thousand Oaks
Csikszentmihalyi M, Larson R (2014) Validity and reliability of the experience-sampling method. In: Flow and the foundations of positive psychology: the collected works of Mihaly Csikszentmihalyi. Springer, Dordrecht, pp 35–54
Dimanche F, Prayag G (2016) Visitor driven service experiences in a city destination: a mobile ethnographic approach. In: 2012 TTRA international conference, University of Massachuse, Virginia, 17–19 June 2012
dscout (2019) In-context insights via remote qualitative research. https://dscout.com/. Accessed 18 June 2019
experiencefellow (2019) experiencefellow. https://www.experiencefellow.com/pricing.html. Accessed 27 May 2019
Greenblatt S, Županov IG (2010) Cultural mobility: a manifesto. Cambridge University Press, Cambridge
Gretzel U (2021) Dreaming about travel: a pinterest netnography. In: Information and communication technologies in tourism 2021. Springer, Cham, pp 256–268
Hammersley M, Atkinson P (2007) Ethnography: principles in practice. Routledge, London
Hardy A, Hyslop S, Booth K, Robards B, Aryal J, Gretzel U, Eccleston R (2017) Tracking tourists’ travel with smartphone-based GPS technology: a methodological discussion. Inf Technol Tour 17(3):255–274
Heckhausen J, Heckhausen H (2018) Motivation and action: introduction and overview. In: Motivation and action. Springer, Cham, pp 1–14
Hein W, O’Donohoe S, Ryan A (2011) Mobile phones as an extension of the participant observer’s self: reflections on the emergent role of an emergent technology. Qual Mark Res Int J 14(3):258–273
Hine C (2000) The virtual objects of ethnography. In: Virtual ethnography
Idris NH, Osman MJ, Kanniah K et al (2017) Engaging indigenous people as geo-crowdsourcing sensors for ecotourism mapping via mobile data collection: a case study of the Royal Belum State Park. Cartogr Geogr Inf Sci 44(2):113–127
indeemo (2019) https://indeemo.com/. Accessed 27 May 2019
Jenkins H (2004) The cultural logic of media convergence. Int J Cult Stud 7(1):33–43
Killingsworth MA, Gilbert DT (2010) A wandering mind is an unhappy mind. Science 330(6006):932–932
Kim J, Fesenmaier DR (2015) Measuring emotions in real time: implications for tourism experience design. J Travel Res 54(4):419–429
Korpilo S, Virtanen T, Lehvävirta SJL et al (2017) Smartphone GPS tracking: inexpensive and efficient data collection on recreational movement. Landsc Urban Plan 157:608–617
Kou L, Xu H, Hannam K (2017) Understanding seasonal mobilities, health and wellbeing to Sanya, China. Soc Sci Med 177:87–99
Langer R, Beckman SC (2005) Sensitive research topics: netnography revisited. Qual Mark Res Int J 8(2):189–203
Liu W, Sparks B, Coghlan A (2016) Measuring customer experience in situ: the link between appraisals, emotions and overall assessments. Int J Hosp Manag 59:42–49
Luo JM, Vu HQ, Li G et al (2019) Tourist behavior analysis in gaming destinations based on venue check-in data. J Travel Tour Mark 36(1):107–118
Metricwire (2019) https://metricwire.com/. Accessed 27 May 2019
Miah SJ, Vu HQ, Gammack J et al (2017) A big data analytics method for tourist behaviour analysis. Inf Manag 54(6):771–785
Miyasaka T, Oba A, Akasaka M et al (2018) Sampling limitations in using tourists’ mobile phones for GPS-based visitor monitoring. J Leis Res 49(3–5):298–310
Mkono M (2013) Using net-based ethnography (netnography) to understand the staging and marketing of “authentic African” dining experiences to tourists at Victoria Falls. J Hosp Tour Res 37(2):184–198
Muskat M, Muskat B, Zehrer A et al (2013) Generation Y: evaluating services experiences through mobile ethnography. Tour Rev 68(3):55–71
Muskat B, Muskat M, Zehrer A (2018) Qualitative interpretive mobile ethnography. Anatolia 29(1):98–107
Novoa A (2015) Mobile ethnography: emergence, techniques and its importance to geography. Hum Geogr J Stud Res Hum Geogr 9(1):98–107
Pink S, Horst H, Postill J et al (2016) Digital ethnography: principles and practices. Springer, Los Angeles
Quinlan Cutler S, Doherty S, Carmichael B (2018) The experience sampling method: examining its use and potential in tourist experience research. Curr Issues Tour 21(9):1052–1074
Rodríguez J, Semanjski I, Gautama S et al (2018) Unsupervised hierarchical clustering approach for tourism market segmentation based on crowdsourced mobile phone data. Sensors 18(9):2972
Savin-Baden M, Major CH (2013) Qualitative research: the essential guide to theory and practice. Routledge, Abingdon
Schwarz N (2007) Retrospective and concurrent self-reports: the rationale for real-time data capture. In: Nebeling L (ed) The science of real-time data capture: self-reports in health research. Oxford University Press, Oxford/New York, pp 11–26
Shoval N, Ahas R (2016) The use of tracking technologies in tourism research: the first decade. Tour Geogr 18(5):587–606
Shoval N, Schvimer Y, Tamir M (2018) Real-time measurement of tourists’ objective and subjective emotions in time and space. J Travel Res 57(1):3–16
SIS International Research (2019) Mobile Ethnography Research. http://www.sisinternational.com/solutions/innovation/mobile-ethnography/. Accessed 10 May 2019
Stickdorn M, Frischhut B, Schmid JS (2014) Mobile ethnography: a pioneering research approach for customer-centered destination management. Tour Anal 19(4):491–503
Tedlock B (2005) The observation of participation and the emergence of public ethnography. In: Denzin NK, Lincoln YS (eds) The Sage handbook of qualitative research, vol 3. Sage Publications, Thousand Oaks, pp 467–481
TSR (2019) Mobile ethnography. https://touchstoneresearch.com/market-research-toolstechnologies/mobile-ethnography/. Accessed 27 May 2019
Urry J (2016) Mobilities: new perspectives on transport and society. Routledge, London/New York
Vu HQ, Li G, Law R et al (2015) Exploring the travel behaviors of inbound tourists to Hong Kong using geotagged photos. Tour Manag 46:222–232
Vu HQ, Law R, Li G (2018a) Breach of traveller privacy in location-based social media. Curr Issues Tour 22(15):1–16
Vu HQ, Li G, Law R et al (2018b) Tourist activity analysis by leveraging mobile social media data. J Travel Res 57(7):883–898
Vu HQ, Li G, Law R et al (2018c) Travel diaries analysis by sequential rule mining. J Travel Res 57(3):399–413
Vu HQ, Li G, Law R et al (2018d) Cross-country analysis of tourist activities based on venue-referenced social media data. J Travel Res 46(3):245–255
Watts L (2008) The art and craft of train travel. Soc Cult Geogr 9(6):711–726
Wood SA, Guerry AD, Silver JM et al (2013) Using social media to quantify nature-based tourism and recreation. Sci Rep 3(2976):1–7
Yao H, Xiong M, Zeng D et al (2018) Mining multiple spatial–temporal paths from social media data. Fut Gen Comput Syst 87:782–791
Experimental Research in E-Tourism: A Critical Review
33
Lawrence Hoc Nang Fong, Erin Yirun Wang, Rob Law, and Shousheng Chai
Contents
Introduction
Literature Review
  Review Articles in the Tourism Literature
  Fundamentals of Experiments
Methodology
  Dimensions Included in the Review
  Literature Search Criteria
  Filtering Criteria
  Coding Procedure
Results
  Article Characteristics
  Research Design
  Sample Characteristics
  Data Analysis
  Technology-Assisted Experiment
Discussions of Findings
  Article Characteristics
L. H. N. Fong · E. Y. Wang
Faculty of Business Administration, Department of Integrated Resort and Tourism Management, University of Macau, Macau, S.A.R., China
R. Law
Asia-Pacific Academy of Economics and Management, University of Macau, Macau, S.A.R., China
Faculty of Business Administration, Department of Integrated Resort and Tourism Management, University of Macau, Macau, S.A.R., China
S. Chai
Department of Business Administration, Ocean University of China, Shandong, China
© Springer Nature Switzerland AG 2022
Z. Xiang et al. (eds.), Handbook of e-Tourism, https://doi.org/10.1007/978-3-030-48652-5_123
  Research Design
  Sample Characteristics
  Data Analysis
  Advanced Technology Used to Assist Experiments
Implications and Limitations
References
Abstract Tourism researchers have been using experimental design for many years. Although e-tourism studies are well documented in the literature, their trend and rigor remain unknown. Therefore, this review critically analyzed corresponding publications by focusing on five major aspects, namely, article characteristics, research design, sample characteristics, data analysis, and advanced technology used to assist experiments. A total of 50 articles consisting of 60 studies were analyzed after a thorough literature review. Findings revealed that experimental design has been recently gaining popularity in e-tourism research, featuring the use of advanced technology to assist manipulation in experiments. Future e-tourism experimental research should have diversified disciplinary foci, conduct multidisciplinary studies in various contexts, avoid excessive independent variables in a study, perform manipulation checks, report effect sizes, and increase the examination of psychological mechanisms using mediation analysis.
Keywords Content analysis · Experimental design · Hospitality journal · Information technology · Systematic review · Tourism journal
Introduction Psychology, education, marketing, and management researchers have been using experimental design for a long time. Researchers in tourism and hospitality have also increasingly adopted experimental design in recent years (Fong et al. 2016). Alongside this methodological trend, topical trends have primarily drawn on e-tourism. Prevalent topics are social media marketing, electronic word of mouth, mobile technology, virtual reality, augmented reality, wearable devices, and geolocation (Navío-Marco et al. 2018). While tourism researchers’ use of experimental design and study of e-tourism have decades of history, the evolution, trend, and quality of experimental design in e-tourism research remain poorly understood, leaving a void in the literature. Without filling in this void, the growing application of experimental design, which is useful for examining causal effect in the e-tourism phenomenon, may encounter bottlenecks. A critical review enables researchers to understand the trends and issues in previous e-tourism experimental studies, which are crucial in enhancing the quality of experimental design in the future and contributing to the methodological maturity of e-tourism research. As such,
33 Experimental Research in E-Tourism: A Critical Review
this book chapter critically reviews experimental research on e-tourism in tourism journals. To explore the evolution and the trend of experimental research in e-tourism as well as to assess the rigor of their designs, this study analyzed the retrieved articles according to dimensions including article characteristics, research design, sample characteristics, and data analysis (Fong et al. 2016). Since recent years have experienced the growing application of virtual reality and eye-tracking devices, we also examined the prevalence of advanced technology used to assist experiments. Drawing from the review of tourism and experimental design literature, the next section will articulate the justifications for conducting this critical review.
Literature Review
Review Articles in the Tourism Literature
The tourism literature has a plethora of review articles. In recent years, even reviews of review articles have been published (Kim et al. 2018; Pahlevan-Sharif et al. 2019). Review articles in tourism can be categorized into three major types, which serve different purposes. First, quantitative meta-analysis refers to the aggregation of statistical results of previous empirical studies, through which errors and biases in individual studies can be overcome (Gretzel and Kennedy-Eden 2012). Meta-analytical articles have started to emerge in the tourism literature in recent years, focusing on well-established topics such as destination image (Zhang et al. 2014), international tourism demand (Peng et al. 2014, 2015), residents’ attitude toward tourism development (Gursoy et al. 2018), and electronic word of mouth (Yang et al. 2018). Second, literature review articles use a purely qualitative approach by reviewing, synthesizing, and critically discussing the theories, concepts, and arguments in previous studies. Researchers aim to present a comprehensive framework of the topic of concern at the end of their papers. This kind of review is common in the tourism literature. Examples of topics covered are social media applications (Leung et al. 2013), cocreation of tourist experiences (Campos et al. 2018), mobile marketing (Kim and Law 2015), peer-to-peer accommodation (Prayag and Ozanne 2018), and big data (Li et al. 2018). The third type of review is qualitative meta-analysis (also called meta-synthesis), which is interpretive instead of aggregative (Park and Gretzel 2007). Typical meta-synthesis articles summarize relevant information from previous studies and portray the evolution and trend over the years. This type of review is the most frequent in the tourism literature and has covered various topics (to be articulated in the next paragraph).
Most meta-synthesis articles in tourism interpreted the trend of the topics of concern by journal outlets (including SSCI-listed or not), disciplinary foci, contexts, and research goals. Among the three types of review, meta-synthesis is deemed suitable for this review study that focuses on a research method (i.e., experimental design), which is not about aggregation of statistical results (suitable for quantitative meta-analysis) or theoretical issues (suitable for literature review).
Meta-synthesis papers in tourism have covered various foci, which can generally be classified into three major types, namely, topic-, place-, and method-focused reviews. Topic-focused reviews are the most prevalent in the tourism literature, covering topics such as tourism entrepreneurship (Solvoll et al. 2015), revenue management (Guillet and Mohammed 2015), air transport (Spasojevic et al. 2018), customer relationship management (Chan et al. 2018), sustainability communication (Tölkes 2018), wine tourism (Gómez et al. 2019), and others. Place-based reviews center upon a particular region or country but still have a concerned topic such as Asia-Pacific tourism research (Fong et al. 2015), China hotel research (Gross et al. 2013), China tourism research in general (Leung et al. 2014), China outbound tourism research (Jin and Wang 2016; Law et al. 2016), Chinese family tourism research (Wu and Wall 2016), and sustainable tourism in Cambodia (Carter et al. 2015). Method-based reviews focus on the use of a particular research method or analysis in tourism research, such as bibliometric approach (Koseoglu et al. 2016), experimental design (Fong et al. 2016), netnography (Tavakoli and Wijesinghe 2019), mixed-method research (Khoo-Lattimore et al. 2019), and partial least squares structural equation modeling (Do Valle and Assaker 2016; Mostafiz et al. 2019). Our review belongs to method-based research and adds knowledge to the literature by focusing on experimental design in e-tourism research. Early e-tourism research can be traced back to the 1980s (Leung and Law 2007). The accumulation of e-tourism knowledge has not slowed down over the last three decades, whereas related publications have experienced an upsurge recently given the widespread application of technology in the tourism industry and the proliferation of tech-savvy consumers (Neidhardt and Werthner 2018). 
Trends in research practices can be reflected by review articles because a review requires a sufficiently large body of literature. Numerous e-tourism review articles have been published over the previous decades. The topics were diversified, including technology application (Law et al. 2009), website evaluation (Ip et al. 2011; Law et al. 2010), online reputation of destinations (Marchiori and Canton 2011), social media (Leung et al. 2013; Zeng and Gerritsen 2014), online reviews (Lu and Stepchenkova 2015; Schuckert et al. 2015; Yang et al. 2018), Internet marketing (Leung et al. 2015), virtual reality or augmented reality (VR/AR) (Yung and Khoo-Lattimore 2019), big data (Li et al. 2018), and eye tracking (Scott et al. 2019). The chronological sequence of these review studies implies that e-tourism research has evolved from general application of technology, through website evaluation, and social media marketing to the latest technology of VR/AR, big data, and eye tracking. These reviews are clearly topic based, whereas place- and method-based e-tourism review articles are lacking. The current study contributes to the e-tourism literature by conducting a method-based (i.e., experimental design) e-tourism review.
Fundamentals of Experiments In contrast with other methods such as survey and in-depth interview, experimental design has been underutilized in tourism research (Fong et al. 2016), though
the advocacy for using this research design has not been scarce (Dolnicar and Ring 2014; Levy et al. 2011; Viglia and Dolnicar 2020). Experimental design is characterized by a comparison of conditions that vary by the extent of manipulated variables, followed by an assignment of participants to the conditions. To illustrate, a study that examines if an e-coupon promotion increases tourists’ intention to book a hotel exposes participants in conditions A (treatment condition) and B (control condition) to the receipt of an e-coupon and nothing, respectively, whereas all other conditions remain unchanged. Participants in both conditions will be asked about their intention to book a hotel. Given this design, the only variation is the availability of the e-coupon. Thus, the effect of e-coupon promotion on intention can be assessed with minimal contamination by unrelated factors. With the control condition and the high internal validity, experimental design is especially suitable for testing of theory. To design a rigorous experiment, researchers need to carefully consider every element and step throughout the design. To successfully design an experiment, researchers need numerous pretests because failure (e.g., of the manipulation) is common. Experimental design requires researchers to plan for the following elements:
(1) The setting (experiment conducted in laboratory or field)
(2) The type of experiment (true experiment or quasi-experiment)
(3) The participants (students, general population, and others)
(4) The number of conditions
(5) The sample size in each condition
(6) The type of design (between-subjects, within-subjects, or mixed)
(7) The assignment of participants (random assignment, matching, or others)
(8) The manipulation method (scenario-based, priming, or others)
(9) The manipulation check
(10) The measures of other nonmanipulated variables
(11) The analysis
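The e-coupon illustration above can be sketched in a few lines of Python. This is only a minimal sketch of random assignment in a two-condition between-subjects design; the function names (`assign_conditions`, `mean_difference`), the participant identifiers, and the simulated intention scores are all hypothetical and are not taken from any of the reviewed studies.

```python
import random
from statistics import mean


def assign_conditions(participant_ids, conditions=("treatment", "control"), seed=0):
    """Randomly assign participants to conditions with near-equal group sizes."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled participants round-robin, so group sizes differ by at most one.
    return {pid: conditions[i % len(conditions)] for i, pid in enumerate(shuffled)}


def mean_difference(assignment, outcomes, treated="treatment", control="control"):
    """Mean outcome in the treatment condition minus mean outcome in the control condition."""
    treated_scores = [outcomes[p] for p, c in assignment.items() if c == treated]
    control_scores = [outcomes[p] for p, c in assignment.items() if c == control]
    return mean(treated_scores) - mean(control_scores)


# Hypothetical illustration: 40 participants with simulated booking-intention
# scores on a 1-7 scale. These numbers are made up, not data from any study.
participants = list(range(40))
assignment = assign_conditions(participants)
score_rng = random.Random(1)
intention = {p: score_rng.randint(1, 7) for p in participants}

group_sizes = {c: list(assignment.values()).count(c) for c in ("treatment", "control")}
print(group_sizes)
print(mean_difference(assignment, intention))
```

Because assignment is random, any systematic difference in mean intention between the two groups can be attributed to the manipulated variable rather than to participant characteristics. A real analysis would add a significance test and an effect size on top of this raw difference.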
In their review of experimental design in the tourism and hospitality literature, Fong et al. (2016) critically and collectively assessed all these elements, identifying whether the effect size was reported and the number of studies per article. Similarly, the present critical review covered the same set of elements. Furthermore, we added the component of technology used to assist the experiment to examine if it is a trend in e-tourism research.
Methodology Dimensions Included in the Review This critical review focused on e-tourism studies that used experimental design. As noted in the literature review, the trend of publications can be reflected in journal outlets, disciplinary foci, research contexts, research goals, and the technology
used to assist the experiments. The rigor of experimental design is assessed using the components listed in the aforementioned fundamentals of experiments. Therefore, this review generally focuses on five major dimensions, namely, article characteristics, research design, sample characteristics, data analysis, and advanced technology used to assist experiments. First, with regard to article characteristics, we covered the following aspects:
(1) Publication year
(2) SSCI journals or not
(3) Disciplinary foci
(4) Research goals
(5) Number of studies per article
(6) Research contexts
Second, in terms of research design, we investigated:
(7) Types of experiment
(8) Settings
(9) Assignment of subjects
(10) Manipulation method
(11) Availability of manipulation check
(12) Number of independent variables
Third, with regard to sample characteristics, two aspects were examined:
(13) Types of subjects
(14) Sample size
Fourth, for data analysis, the following were analyzed:
(15) Statistical analyses
(16) The availability of effect size
The final aspect is:
(17) The adoption of advanced technology in the experiments
Literature Search Criteria Relevant experimental articles in major tourism journals were searched, retrieved, and content analyzed. Consistent with Lee et al. (2014), we focused the search on the tourism journals listed in McKercher (2012). Data collection was undertaken in April 2019.
To begin, the search was conducted in the Hospitality & Tourism Complete Database. To obtain a comprehensive result of experimental studies, we entered the term “experiment” in all texts rather than only in subject or title. The journal title was then entered in journal name. No limit was set to the results. We repeated this process and eventually retrieved 781 articles in total for further filtering.
Filtering Criteria
The filtering exercise was conducted by considering predetermined inclusion and exclusion criteria (Fong et al. 2016). Concerning the inclusion criteria, the study method had to be a conventional experimental design (Fong et al. 2016). In addition, the topic of the study had to be e-tourism (Buhalis and Law 2008). Following Fong et al. (2016) and Law et al. (2020), the excluded articles were as follows:
(1) Conceptual studies, case studies, reviews, editorials, book reviews, and conference reports
(2) Studies using choice experiment or computational experiment as the method
(3) Articles in which the word “experiment” was merely mentioned
(4) Papers not dealing with e-tourism
(5) Duplicated articles
After a perusal of the articles, we reached a consensus that 50 articles in 16 journals (see Table 1) qualified for subsequent content analysis.
Coding Procedure Following previous review papers (Fong et al. 2016; Hulland et al. 2018), we used content analysis to code the selected 50 articles. Certain articles contain multiple experimental studies, which were coded separately. Any coding discrepancy was resolved through discussions among three investigators until a consensus was achieved.
Results
Article Characteristics
The timeline in Fig. 1 presents the number of experimental articles in e-tourism published from 2003 to April 2019. The first e-tourism article using experimental design appeared in 2003. The number of publications increased steadily thereafter and surged in 2018 (n = 11). That year alone accounted for more than one-fifth of the e-tourism experimental articles that we content analyzed.
Table 1 Experimental e-tourism research in tourism journals (n = 50 articles)
SSCI   Journal outlets                                                      No. of articles   Proportion (%)
Yes    Tourism Management                                                   10                20.0
       Journal of Travel Research                                           8                 16.0
       Journal of Travel & Tourism Marketing                                5                 10.0
       International Journal of Tourism Research                            4                 8.0
       Journal of Vacation Marketing                                        3                 6.0
       Annals of Tourism Research                                           2                 4.0
       Asia Pacific Journal of Tourism Research                             2                 4.0
       Journal of Sustainable Tourism                                       1                 2.0
       Scandinavian Journal of Hospitality and Tourism                      1                 2.0
       Total                                                                36                72.0
No     Information Technology & Tourism                                     6                 12.0
       e-Review of Tourism Research                                         2                 4.0
       Event Management                                                     2                 4.0
       Anatolia                                                             1                 2.0
       International Journal of Culture, Tourism and Hospitality Research   1                 2.0
       Journal of Heritage Tourism                                          1                 2.0
       Journal of Teaching in Travel & Tourism                              1                 2.0
       Total                                                                14                28.0
Other tourism journals that have been included in the searching process are Acta Turistica; Current Issues in Tourism; European Journal of Tourism Research; International Journal of Tourism Policy; International Journal of Tourism Sciences; Journal of China Tourism Research; Journal of Convention and Event Tourism; Journal of Ecotourism; Journal of Hospitality & Tourism Management; Journal of Hospitality, Leisure, Sport & Tourism Education; Journal of Sport & Tourism; Journal of Tourism and Cultural Change; Journal of Tourism Challenges and Trends; Journal of Unconventional Parks, Tourism and Recreation Research; Tourism Analysis; Tourism and Hospitality Planning & Development; Tourism and Hospitality Research; Tourism Economics; Tourism Geographies; Tourism in Marine Environments; Tourism Recreation Research; Tourism Review; Tourism Review International; Tourism, Culture and Communication; Tourist Studies
Most e-tourism experimental articles were published in SSCI journals (n = 36, 72.0%) (χ²(1) = 9.680, n = 50, p < 0.01), in particular Tourism Management (n = 10) and Journal of Travel Research (n = 8). Together, these two journals published over one-third of the articles included in this study. Among the non-SSCI journals, Information Technology & Tourism published the most, with six e-tourism experimental publications. In sum, these three journals published almost half of the total publications (n = 24, 48.0%) (see Table 1). Table 2 presents the nature of articles, including disciplinary foci and the number of studies per article. In terms of disciplinary foci, we adopted the framework developed by Davis et al. (2011) to categorize the articles. A significant variation of proportions was observed among the five disciplinary foci identified from the
Fig. 1 Experimental articles in e-tourism during the period of 2003–April 2019 (bar chart titled “Experiment in e-Tourism”; x-axis: Year; y-axis: No. of Articles)
articles (χ²(4) = 85.200, n = 50, p < 0.001). Marketing was the primary focus (n = 35, 70.0%), followed by computer science (n = 11, 22.0%). Beyond these top two foci, the results also indicate research interest in other disciplines such as finance (n = 2, 4.0%), environment (n = 1, 2.0%), and management and administration (n = 1, 2.0%). The distribution of research goals was also statistically unequal (χ²(9) = 56.800; n = 50; p < .001). The goal of influencing purchase decision, particularly by manipulating information from marketers, attracted the most attention (n = 14, 28.0%). The second most prevalent goal was to develop effective tourism marketing communication strategies (n = 13, 26.0%), especially by examining, among others, credibility, trustworthiness, and helpfulness. Much attention was also paid to the goal of influencing consumer destination attitude (n = 12, 24.0%) through information delivered on official websites or in consumer-generated content. The other goals that we identified were scattered and recorded low frequencies. The number of studies per article exhibited an uneven distribution (χ²(2) = 58.240, n = 50, p < 0.001). Researchers mostly adopted a single-study approach (n = 42, 84.0%). Multi-study articles featuring two studies (n = 6, 12.0%) and three studies (n = 2, 4.0%) were relatively scant. A total of 60 studies were involved in the articles. The following analyses and reporting of results were conducted by studies instead of articles, considering that the analyzed items would vary with studies in multi-study articles. We categorized research contexts by following Fong et al. (2015). Seven research contexts were identified from the studies (see Table 3). The unequal proportion of study contexts was statistically significant (χ²(6) = 52.933; n = 60; p < .001). Among the seven contexts, destination was most frequently investigated (n = 23, 38.3%), followed by hotel or accommodation (n = 15, 25.0%) and tourism in general (n = 14, 23.3%).
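The chi-square values reported in this section are consistent with a Pearson goodness-of-fit test against equal expected frequencies across categories. The sketch below recomputes three of them from the category counts given in the text. That the authors used exactly this procedure is an assumption, and the function name `chi_square_equal_proportions` is illustrative only.

```python
def chi_square_equal_proportions(observed):
    """Pearson chi-square goodness-of-fit statistic against equal expected counts.

    Returns a tuple (statistic, degrees_of_freedom).
    """
    n = sum(observed)
    k = len(observed)
    expected = n / k  # assumed null hypothesis: every category is equally likely
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    return statistic, k - 1


# Category counts as reported in the text (n = 50 articles in each case).
reported_counts = {
    "SSCI vs. non-SSCI journals": [36, 14],
    "Disciplinary foci": [35, 11, 2, 1, 1],
    "Studies per article": [42, 6, 2],
}

for label, counts in reported_counts.items():
    stat, df = chi_square_equal_proportions(counts)
    print(f"{label}: chi2({df}) = {stat:.3f}")
# SSCI vs. non-SSCI journals: chi2(1) = 9.680
# Disciplinary foci: chi2(4) = 85.200
# Studies per article: chi2(2) = 58.240
```

The degrees of freedom equal the number of categories minus one, which offers a quick consistency check between a reported test and the number of categories in the corresponding table.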
Table 2 Nature of articles (n = 50 articles)
Criteria                    Categories                                               No. of articles   Proportion (%)   Test statistics
Disciplinary foci           Marketing                                                35                70.0             85.200a***
                            Computer Science                                         11                22.0
                            Finance                                                  2                 4.0
                            Environmental studies                                    1                 2.0
                            Management and administration                            1                 2.0
Research goal               To influence purchase decision                           14                28.0             56.800a***
                            To develop tourism marketing communication strategies    13                26.0
                            To influence destination attitude                        12                24.0
                            To enhance tourist experience                            4                 8.0
                            To increase public donations in tourism project          2                 4.0
                            To enhance employee training                             1                 2.0
                            To enhance tourist online loyalty                        1                 2.0
                            To enhance tourist online positive rating                1                 2.0
                            To enhance tourist online sharing behavior               1                 2.0
                            To enhance tourist sustainable behavior                  1                 2.0
No. of studies per article  1                                                        42                84.0             58.240a***
                            2                                                        6                 12.0
                            3                                                        2                 4.0
a Pearson’s chi-square test is performed
***p < .001
Table 3 Study contexts (n = 60 studies)
Criteria   Categories            No. of articles   Proportions (%)   Test statistics
Context    Destination           23                38.3              52.933a***
           Hotel/accommodation   15                25.0
           Tourism in general    14                23.3
           MICE                  3                 5.0
           Attraction            2                 3.3
           Restaurant            2                 3.3
           Cruise                1                 1.7
a Pearson’s chi-square test is performed
***p < .001
Research Design
Table 4 provides a summary of research designs in experimental e-tourism studies. Concerning the types (χ²(1) = 15.000; n = 60; p < .001) and settings (χ²(1) = 15.000; n = 60; p < .001) of experiments, significant variations of proportion were found. The types of experiment are true experiment and quasi-experiment. True experiment features random assignment of participants to experimental and control groups, while quasi-experiment adopts nonrandomized intervention (Fong et al. 2016; Viglia and Dolnicar 2020). The results showed that true experiments (n = 45, 75.0%) outnumbered quasi-experiments (n = 15, 25.0%). For the setting of experiment, the majority of the studies were conducted in a laboratory (n = 45, 75.0%), whereas the remaining works were conducted in the field (n = 15, 25.0%). The proportional difference was statistically significant in assignment of subjects (χ²(2) = 63.100; n = 60; p < .001). A considerable number of studies used between-subjects design (n = 49, 81.7%), while within-subjects design (n = 6, 10.0%) and mixed design (n = 5, 8.3%) were equally rare (χ²(1) = 0.091; n = 11; ns). The proportion varies significantly with regard to manipulation method in e-tourism research (χ²(1) = 56.067; n = 60; p < .001). Based on the retrieved papers, almost all the researchers used scenario techniques to manipulate the
Table 4 Research design (n = 60 studies)
Criteria                       Categories         No. of studies   Proportion (%)   Test statistics
Types of experiment            True experiment    45               75.0             15.000a***
                               Quasi-experiment   15               25.0
Settings                       Laboratory         45               75.0             15.000a***
                               Field              15               25.0
Assignment of subjects         Between-subjects   49               81.7             63.100a***
                               Within-subjects    6                10.0
                               Mixed              5                8.3
Manipulation method            Scenario           59               98.3             56.067a***
                               Priming            1                1.7
Manipulation check             Yes                31               51.7             0.067a
                               No                 29               48.3
No. of independent variables   1                  18               30.0             48.167a***
                               2                  30               50.0
                               3                  8                13.3
                               4                  3                5.0
                               5                  1                1.7
a Pearson’s chi-square test is performed
***p < .001
subjects (n = 59, 98.3%). Priming was used in only one study (n = 1, 1.7%). No significant variation of proportions was observed in whether a manipulation check was performed (χ²(1) = 0.067; n = 60; ns): about half of the studies tested the validity of their manipulation (n = 31, 51.7%). The number of independent variables, ranging from one to five, differs significantly among the e-tourism experimental studies (χ²(4) = 48.167; n = 60; p < .001). Half of the studies adopted two independent variables (n = 30, 50.0%), followed by one variable (n = 18, 30.0%), three variables (n = 8, 13.3%), four variables (n = 3, 5.0%), and five variables (n = 1, 1.7%). The results indicated that factorial design, which uses more than one independent variable, was widely applied in e-tourism experimental research (n = 42, 70.0%).
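In a full factorial design, the number of experimental conditions is the product of the numbers of levels of the independent variables, so each added variable multiplies the cells that must be filled with participants. The following minimal sketch enumerates the cells of such a design. The factor names and levels are hypothetical and serve only as an illustration.

```python
from itertools import product


def factorial_conditions(factors):
    """Enumerate every cell of a full factorial design.

    `factors` maps each independent variable to the list of its levels.
    """
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


# Hypothetical 2 x 2 design; the variable names are illustrative only.
design = {
    "review_valence": ["positive", "negative"],
    "e_coupon": ["offered", "not offered"],
}

cells = factorial_conditions(design)
for cell in cells:
    print(cell)
print(len(cells))  # 4 conditions for a 2 x 2 design
```

With five two-level variables the same function would return 32 cells, which shows how quickly the required sample grows in a between-subjects design.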
Sample Characteristics
Considering the sample characteristics of experiments in e-tourism research (see Table 5), a significant difference of proportions can be observed in the sample size (χ²(4) = 18.167; n = 60; p < .01). The largest proportion of studies had more than 300 participants (n = 22, 36.7%). The median of overall sample size was 249.5, which is smaller than the mean value of sample size (M = 500.7). Overall, the size of all the study samples ranges from 20 to 8,431. Sample type was also found to be significantly different in terms of proportions (χ²(3) = 27.600; n = 60; p < .001). Student samples were used in almost half of the studies (n = 27, 45.0%), followed by general population (n = 22, 36.7%), tourists (n = 10, 16.7%), and residents (n = 1, 1.7%). The largest sample size among the included studies was 8,431 subjects recruited from the general population.
Table 5 Sample characteristics (n = 60 studies)
Variable      No. of studies   Proportion (%)   Test statistics   Mean
Sample size                                     18.167a**         500.700

\[ x = (x_1, x_2, \ldots, x_D), \quad \text{with } x_j > 0 \text{ for all } j = 1, 2, \ldots, D, \tag{1} \]
where D is the number of parts or components. Individual compositions could be, for instance, hotels, and parts could be the possible reasons for complaining about them in TripAdvisor, or the content of photos posted by them in their Facebook accounts. In order to focus on the relative importance of the parts, the closure of x to a constant sum is common practice. It can also be the case that the raw data already have a fixed sum (e.g., 100% in market share data). Without loss of generality, we consider the unit sum, so that after closure, z contains part proportions:
\[ z = C(x) = \left( \frac{x_1}{S}, \frac{x_2}{S}, \ldots, \frac{x_D}{S} \right) = (z_1, z_2, \ldots, z_D), \quad \text{with } z_j > 0 \text{ for all } j = 1, 2, \ldots, D; \quad \sum_{j=1}^{D} x_j = S; \quad \sum_{j=1}^{D} z_j = 1. \tag{2} \]
Regardless of whether closure is performed or not, the relative information contained by the D parts should remain the same, thus ensuring the so-called compositional equivalence property (Barceló-Vidal and Martín-Fernández 2016). This implies that results of a compositional analysis should be invariant to changes of scale in the data (scale invariance principle).
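A minimal sketch of the closure operation in Eq. (2) is given below, assuming strictly positive parts. The function name `closure` and the example counts are hypothetical. The second call illustrates scale invariance at the level of the closed composition: multiplying every part by the same constant leaves z unchanged.

```python
def closure(x, total=1.0):
    """Rescale a composition of strictly positive parts so that it sums to `total`."""
    if any(part <= 0 for part in x):
        raise ValueError("all parts must be strictly positive")
    s = sum(x)
    return [total * part / s for part in x]


# Hypothetical counts of photos by content category for one hotel.
x = [12, 6, 30, 12]
z = closure(x)

# Rescaling all parts by a constant does not change the closed composition.
z_scaled = closure([10 * part for part in x])

print(z)
print(z_scaled)
```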
38 Compositional Data Analysis in E-Tourism Research
Fine-Tuning the Research Questions It is up to the researcher to select which D parts to analyze. In a content analysis of photos posted by hotels in their Facebook accounts, one could, for instance, think of x1 = outside facilities (garden, terrace, swimming pool, etc.), x2 =inside facilities (gym, sauna, etc.), x3 = rooms, x4 = common inside spaces, x5 = menu, x6 = events, x7 = natural surroundings, and x8 = urban surroundings. If the distinction between urban and natural surroundings is not of interest to the researcher (after all, a hotel in an urban environment has no other choice than to picture urban surroundings and a hotel in a natural environment natural surroundings), both categories can be merged into one part termed “x7 + x8 = surroundings as a whole.” This operation is called amalgamation. Due to the particularities of CoDa, amalgamated parts cannot be analyzed separately at a later stage (e.g., Van den Boogaart and Tolosana-Delgado 2013). In other words, amalgamated parts remain so forever, and amalgamation should take place in the problem definition stage. Following up with the same example, one could decide to study only the subset of parts x1 to x6 having to do with the hotel itself. This is referred to as a subcomposition in CoDa. In this particular example, the subcomposition would imply that the researcher is uninterested in content about surroundings. The amalgamation x7 + x8 would imply that the researcher is interested in comparing the relative importance of content about surroundings as a whole with content about the hotel itself. Analyzing all parts x1 to x8 would imply that the researcher is additionally interested in comparing the relative importance of contents about urban and natural surroundings. It is often claimed that all compositions are, in fact, subcompositions and amalgamations. After all, gym and sauna could have been treated as separate parts instead of amalgamating them within inside facilities. 
Additional contents could also have been added in order to have a more general composition of which the current one is only a subcomposition. What if, for instance, the researcher would have been interested in pictures about city events or about guest celebrities?
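The two operations described in this section can be sketched as follows. The helper names `amalgamate` and `subcomposition` and the photo counts are hypothetical. Parts x1 to x8 are stored at zero-based positions 0 to 7.

```python
def amalgamate(x, indices):
    """Merge the parts at `indices` into a single part (their sum), placed last."""
    index_set = set(indices)
    merged = sum(x[i] for i in index_set)
    kept = [part for i, part in enumerate(x) if i not in index_set]
    return kept + [merged]


def subcomposition(x, indices):
    """Keep only the parts at `indices` and re-close them to unit sum."""
    selected = [x[i] for i in indices]
    s = sum(selected)
    return [part / s for part in selected]


# Hypothetical photo counts for parts x1..x8 of one hotel.
x = [10, 5, 20, 5, 4, 6, 30, 20]

# x7 + x8 merged into "surroundings as a whole": seven parts remain.
with_surroundings = amalgamate(x, [6, 7])

# Subcomposition of x1..x6: only content about the hotel itself.
hotel_only = subcomposition(x, range(6))

print(with_surroundings)
print(hotel_only)
```

After amalgamation the separate counts for urban and natural surroundings are no longer available, which mirrors the point that amalgamated parts cannot be analyzed separately at a later stage.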
Why Are Classical Statistical Techniques Inappropriate? The closed composition z resides in a subspace called the simplex, which is constrained by positiveness and unit sum, with different operations, angles, and distances from the full real space. This explains why most statistical workhorses, such as mean, variance, correlation, and distance, are to a greater or lesser extent meaningless when applied to z. Since one part can only increase if one or more of the others decrease(s), negative spurious correlations among the parts emerge (Pearson 1897). Euclidean distances among the individual compositions are also meaningless (Aitchison et al. 2000). Euclidean distance considers the pair of proportions 0.01 and 0.02 to be as mutually distant as 0.21 and 0.22, while in the first pair the difference is twofold and in the second it is less than 5% (Coenders and Ferrer-Rosell 2020).
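The numerical example above can be checked directly. The sketch below compares the absolute difference, the ratio, and the log-ratio for the two pairs of proportions. The function name `compare_pair` is illustrative only.

```python
import math


def compare_pair(a, b):
    """Return (absolute difference, ratio, log-ratio) for two proportions a < b."""
    return abs(b - a), b / a, math.log(b / a)


for a, b in [(0.01, 0.02), (0.21, 0.22)]:
    diff, ratio, log_ratio = compare_pair(a, b)
    print(f"{a} vs {b}: difference = {diff:.2f}, ratio = {ratio:.4f}, log-ratio = {log_ratio:.4f}")
```

Both pairs have the same absolute difference of 0.01, whereas the ratio is 2 for the first pair and below 1.05 for the second, which is the contrast that a log-ratio-based distance captures and a Euclidean distance on raw proportions misses.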
B. Ferrer-Rosell et al.
In addition, statistical modeling with unbounded distributions such as the normal distribution is not feasible, as it results in values larger than 1 or lower than 0 having a positive probability of occurrence. The statistical and distributional assumptions of most classical statistical models are to a greater or lesser extent violated in z (Aitchison 2001; Pawlowsky-Glahn et al. 2015). Uncritical use of standard statistical models on raw untransformed compositional data is thus generally inappropriate. Finally, the fact that one part can only increase in relative terms if some other(s) decrease(s) makes interpretation of the results dependent on which parts are made to decrease. Interpretation around one single part is thus bound to be misleading, which means that CoDa necessarily uses multivariate statistical methods.
Compositional Data Analysis in Practice
Log-Ratio Transformations
In order to solve the aforementioned drawbacks, the most common CoDa approach is to express an original compositional vector of D parts in logarithms of ratios among parts (Aitchison 1986; Egozcue et al. 2003). Log-ratios are unbounded and thus have a chance to meet the distributional assumptions of classical statistical models (Pawlowsky-Glahn et al. 2015). In addition, they constitute a natural way of distilling the information about the relative size of parts and form the basis for defining association, variance, and distance in a meaningful way. Finally, it must be noted that they yield the same result regardless of whether they are computed from x or z, thus adhering to the scale invariance principle. In some instances, compositional data are even defined as those data for which the relevant information is carried by ratios (Egozcue and Pawlowsky-Glahn 2019). Log-ratios may, for instance, be computed among all possible pairs of parts in the so-called pairwise log-ratios:
\[ \ln \frac{z_j}{z_k} = \ln \frac{x_j}{x_k}, \quad \text{with } j < k; \; k = 2, 3, \ldots, D; \; j = 1, 2, \ldots, k - 1, \tag{3} \]
or between each part and the geometric mean of all parts including itself, in the so-called centered log-ratios. From now on we present only the formulation using the closed z parts:
\[ \ln \frac{z_j}{\sqrt[D]{z_1 z_2 \cdots z_D}}, \quad \text{with } j = 1, 2, \ldots, D. \tag{4} \]
There are alternative interpretations and expressions of centered log-ratios (Filzmoser et al. 2018; Pawlowsky-Glahn et al. 2015). They can also be understood
as the balance between one part and the geometric mean of the rest. The corresponding expression, which is equivalent to Eq. (4), is:
\[ \frac{D-1}{D} \ln \frac{z_j}{\sqrt[D-1]{z_1 z_2 \cdots z_{j-1} z_{j+1} \cdots z_D}}, \quad \text{with } j = 1, 2, \ldots, D. \tag{5} \]
Equation (5) stresses the fact that the value and the interpretation of centered log-ratios are subject to the definition of the research problem and the research questions. They will change when adding parts or when defining an amalgamation or a subcomposition. It also stresses the fact that both problem definition and data analysis must be mutually coherent and multivariate. Once the compositional research problem has been defined precisely, the greater the centered log-ratio, the greater the importance of the part, compared to the geometric mean of the rest of the parts included in the research problem.

One attractive feature of CoDa is that once the raw composition has been transformed into centered log-ratios, classical statistical techniques for unbounded data can be applied in the usual way and even with standard software. Log-ratio transformations thus constitute the easy way out in compositional problems. The applied researcher can concentrate on interpreting the results, taking the compositional nature of the data and the research questions into account: what increases at the expense of what? To this end it must be taken into account that the D centered log-ratios sum to zero for any individual. This reflects the fact that, in relative terms, one part can only increase if some others decrease. Statistically speaking, the covariance matrix among the D centered log-ratios is singular and non-invertible. Among the methods described in this chapter, singularity only affects outlier detection, but the researcher must bear in mind that centered log-ratios cannot be used in statistical techniques which require inverting the covariance matrix without taking extra precautions. Alternative transformations which lead to invertible covariances and can be readily used in more advanced statistical methods are described in Van den Boogaart and Tolosana-Delgado (2013), Egozcue et al. (2003), and Pawlowsky-Glahn et al. (2015).
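The properties of centered log-ratios discussed above can be checked numerically. The following is a minimal sketch in plain Python, not taken from the chapter; the function names (closure, clr, clr_vs_rest) and the raw amounts in x are hypothetical and chosen only for illustration. It verifies that the centered log-ratios sum to zero, that they are the same whether computed from x or z, and that (Eq. 4) and (Eq. 5) agree.

```python
import math

def closure(x):
    """Rescale positive parts so that they sum to 1 (the closed composition z)."""
    total = sum(x)
    return [v / total for v in x]

def clr(parts):
    """Eq. 4: log of each part over the geometric mean of all D parts."""
    logs = [math.log(v) for v in parts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]

def clr_vs_rest(parts):
    """Eq. 5: (D-1)/D times the log-ratio of each part over the geometric mean of the rest."""
    D = len(parts)
    logs = [math.log(v) for v in parts]
    total_log = sum(logs)
    return [(D - 1) / D * (lj - (total_log - lj) / (D - 1)) for lj in logs]

def same(a, b):
    """Element-wise comparison up to floating-point error."""
    return all(math.isclose(u, v, abs_tol=1e-12) for u, v in zip(a, b))

x = [120.0, 45.0, 30.0, 5.0]  # hypothetical raw amounts, not from the chapter
z = closure(x)
clr_z = clr(z)

zero_sum = abs(sum(clr_z)) < 1e-12        # the D centered log-ratios sum to zero
scale_invariant = same(clr_z, clr(x))     # same result from x and from z
eq4_equals_eq5 = same(clr_z, clr_vs_rest(z))
print(zero_sum, scale_invariant, eq4_equals_eq5)
```

Because the last centered log-ratio is always minus the sum of the others, any D − 1 of them carry the full information, which is why the covariance matrix of all D is singular.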
Basic Statistical Concepts

Center

In order to assess the overall relative importance of each part for all individual compositions, the composition center can be described from the arithmetic means of the centered log-ratios. For ease of interpretation, the researchers may wish to exponentiate these means (which then become geometric means), and close them to the original unit sum of the composition, in order to express them in the original scale.
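As an illustration of the center, the sketch below averages the centered log-ratios part by part, exponentiates the means, and closes the result. It is only a sketch under stated assumptions: the three compositions in data are hypothetical, and the function names are not from the chapter.

```python
import math

def clr(parts):
    """Centered log-ratios (Eq. 4)."""
    logs = [math.log(v) for v in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def center(compositions):
    """Closed, exponentiated arithmetic means of the centered log-ratios."""
    clr_rows = [clr(z) for z in compositions]
    n = len(clr_rows)
    clr_means = [sum(column) / n for column in zip(*clr_rows)]
    exponentiated = [math.exp(m) for m in clr_means]
    total = sum(exponentiated)
    return [v / total for v in exponentiated]

# hypothetical closed compositions with D = 3 parts
data = [
    [0.50, 0.30, 0.20],
    [0.40, 0.40, 0.20],
    [0.60, 0.25, 0.15],
]
c = center(data)
print([round(v, 3) for v in c])
```

The result coincides with the closed geometric means of the parts, since the geometric mean of each composition only contributes a common factor that disappears on closure.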
Association

Proportionality between pairs of parts is a valid alternative to correlation (Lovell et al. 2015). It is assessed from the variances of the same pairwise log-ratios (Eq. 3):

$$\mathrm{Var}\left(\ln\frac{z_j}{z_k}\right), \quad \text{with } j < k;\ k = 2, 3, \ldots, D;\ j = 1, 2, \ldots, k-1. \qquad (6)$$

These variances can be arranged in a symmetric matrix with parts defining both D rows and D columns, with the same layout as a correlation matrix. This is the so-called variation matrix. Variance (Eq. 6) is zero when zj and zk behave perfectly proportionally (compositions with twice the amount of part j also have twice the amount of part k), corresponding to perfect positive association. It goes without saying that a part is proportional to itself, hence the zeros in the matrix diagonal. The further variance (Eq. 6) is from zero, the lower the association. There is neither a clearly defined threshold representing no association, nor is there an upper bound representing perfect negative association, so values in the variation matrix can only be assessed comparatively. This comparative assessment may be carried out relative to the mean log-ratio variance (Pawlowsky-Glahn et al. 2015). There are D(D−1)/2 distinct elements in the variation matrix, so that their mean is:

$$\frac{1}{D(D-1)/2}\sum_{k=2}^{D}\sum_{j=1}^{k-1}\mathrm{Var}\left(\ln\frac{z_j}{z_k}\right). \qquad (7)$$
Log-ratio variances larger than (Eq. 7) show pairs of parts contributing to a larger share of the variation matrix than the average log-ratio, and variances lower than (Eq. 7) show pairs of parts with a small contribution, in other words, with positive association. Strong association can be inferred, for instance, when the ratio of (Eq. 6) over (Eq. 7) is lower than 0.2 (Egozcue and Pawlowsky-Glahn 2019).
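A small numerical sketch may help to fix ideas about the variation matrix and the 0.2 rule of thumb. The data below are hypothetical and constructed so that the second part is always twice the first; the use of the sample variance (divisor n − 1) is an assumption, since the chapter does not state which divisor is used.

```python
import math
from statistics import variance  # sample variance; the divisor is an assumption

def variation_matrix(compositions):
    """Symmetric D x D matrix of pairwise log-ratio variances (Eq. 6)."""
    D = len(compositions[0])
    T = [[0.0] * D for _ in range(D)]
    for j in range(D):
        for k in range(j + 1, D):
            log_ratios = [math.log(z[j] / z[k]) for z in compositions]
            T[j][k] = T[k][j] = variance(log_ratios)
    return T

def mean_log_ratio_variance(T):
    """Average of the D(D-1)/2 distinct off-diagonal elements (Eq. 7)."""
    D = len(T)
    upper = [T[j][k] for k in range(1, D) for j in range(k)]
    return sum(upper) / (D * (D - 1) / 2)

# hypothetical compositions: part 2 is always twice part 1
data = [
    [0.10, 0.20, 0.70],
    [0.20, 0.40, 0.40],
    [0.05, 0.10, 0.85],
    [0.30, 0.60, 0.10],
]
T = variation_matrix(data)
mean_var = mean_log_ratio_variance(T)

# ratio of (Eq. 6) over (Eq. 7) for the first two parts; below 0.2 suggests strong association
ratio_12 = T[0][1] / mean_var
print(ratio_12 < 0.2)
```

In this constructed example the first two parts are perfectly proportional, so their log-ratio variance is zero up to rounding error, while their log-ratios with the third part vary considerably.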
Total Variance

Total variance in a compositional data set can be computed in two alternative equivalent manners: firstly as the sum of variances of the D centered log-ratios:

$$\sum_{j=1}^{D}\mathrm{Var}\left(\ln\frac{z_j}{\sqrt[D]{z_1 z_2 \cdots z_D}}\right), \qquad (8)$$

and secondly from the sum of the distinct elements in the variation matrix:

$$\frac{1}{D}\sum_{k=2}^{D}\sum_{j=1}^{k-1}\mathrm{Var}\left(\ln\frac{z_j}{z_k}\right). \qquad (9)$$
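The equivalence of (Eq. 8) and (Eq. 9) can be verified on any data set. The sketch below does so for a few hypothetical compositions; the function names are illustrative, and the sample variance is assumed in both computations (the equality holds for any divisor as long as it is the same in both).

```python
import math
from statistics import variance

def clr(parts):
    """Centered log-ratios (Eq. 4)."""
    logs = [math.log(v) for v in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def total_variance_clr(compositions):
    """Eq. 8: sum of the variances of the D centered log-ratios."""
    clr_rows = [clr(z) for z in compositions]
    return sum(variance(column) for column in zip(*clr_rows))

def total_variance_pairwise(compositions):
    """Eq. 9: sum of the distinct pairwise log-ratio variances, divided by D."""
    D = len(compositions[0])
    total = 0.0
    for k in range(1, D):
        for j in range(k):
            total += variance([math.log(z[j] / z[k]) for z in compositions])
    return total / D

# hypothetical closed compositions with D = 3 parts
data = [
    [0.50, 0.30, 0.20],
    [0.40, 0.40, 0.20],
    [0.60, 0.25, 0.15],
    [0.20, 0.30, 0.50],
    [0.35, 0.15, 0.50],
]
tv_clr = total_variance_clr(data)
tv_pairs = total_variance_pairwise(data)
print(math.isclose(tv_clr, tv_pairs, rel_tol=1e-9))
```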
Distance

Aitchison’s distance (Aitchison 1983; Aitchison et al. 2000) between two individual compositions z and z* considers that pairwise log-ratios (Eq. 3) carry all the required information about the difference between them:

$$d(z, z^*) = \sqrt{\frac{1}{D}\sum_{k=2}^{D}\sum_{j=1}^{k-1}\left(\ln\frac{z_j}{z_k} - \ln\frac{z_j^*}{z_k^*}\right)^2}. \qquad (10)$$

Two compositions at zero distance have identical part proportions. When there is a larger difference between the log-ratios of two compositions, their distance is likewise larger. Aitchison’s distances can also be expressed in terms of centered log-ratios (Eq. 4) as:

$$d(z, z^*) = \sqrt{\sum_{j=1}^{D}\left(\ln\frac{z_j}{\sqrt[D]{z_1 z_2 \cdots z_D}} - \ln\frac{z_j^*}{\sqrt[D]{z_1^* z_2^* \cdots z_D^*}}\right)^2}. \qquad (11)$$

Expression (Eq. 11) has the attractive feature that it equals Euclidean distance computed from data transformed as centered log-ratios (Eq. 4). Computing centered log-ratios from equation (Eq. 5) makes no difference.
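The two expressions of Aitchison’s distance can be compared in a few lines. The sketch below is illustrative only; the two-part compositions are hypothetical and chosen so that both pairs differ by one percentage point in the first part, which the Euclidean distance on raw proportions treats as the same difference.

```python
import math

def clr(parts):
    """Centered log-ratios (Eq. 4)."""
    logs = [math.log(v) for v in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_pairwise(z, w):
    """Eq. 10: distance computed from differences of pairwise log-ratios."""
    D = len(z)
    total = 0.0
    for k in range(1, D):
        for j in range(k):
            total += (math.log(z[j] / z[k]) - math.log(w[j] / w[k])) ** 2
    return math.sqrt(total / D)

def aitchison_clr(z, w):
    """Eq. 11: Euclidean distance between centered log-ratio vectors."""
    return math.dist(clr(z), clr(w))

# hypothetical two-part compositions: 1% vs 2%, and 11% vs 12%
small_a, small_b = [0.01, 0.99], [0.02, 0.98]
large_a, large_b = [0.11, 0.89], [0.12, 0.88]

d_small = aitchison_clr(small_a, small_b)
d_large = aitchison_clr(large_a, large_b)
print(d_small > d_large)                                   # relative change is larger in the first pair
print(math.isclose(d_small, aitchison_pairwise(small_a, small_b)))
print(math.isclose(math.dist(small_a, small_b), math.dist(large_a, large_b)))  # raw Euclidean: equal
```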
Data Preprocessing

Zero Replacement

As is well known, computing log-ratios implies that x and z may contain no zero values. If the x and z vectors contain zeros, they must be replaced beforehand (Martín-Fernández et al. 2011). Treatment of zeros in CoDa depends on the assumed reason for their occurrence, which is deemed more important than their sheer existence. On the one hand, there are absolute zeros, essential zeros, or structural zeros, which represent values that can only be zero given certain characteristics of the individual compositions (e.g., nature pictures in a hotel located in an urban environment, tobacco consumption in a nonsmoking home). The presence of this kind of zero may lead to different variance structures of the parts of interest and usually indicates that the choice of parts to be analyzed is not meaningful to a certain subpopulation. Thus, data with absolute zeros should be considered as distinct subpopulations (Bacon-Shone 2003) and either be excluded (e.g., by analyzing only hotels in a natural environment) or analyzed separately. Amalgamation of problematic parts can constitute an alternative, if the researcher is happy with its implications for the definition of the research questions.
On the other hand, so-called rounded zeros, trace zeros, or zeros below detection limit constitute parts which are believed to be present, but are not observed due to randomness or limitations of measurement. Consider a study about dollar spending in e-shops by product categories (parts: apparel, books, music, travel, hobbies, and other). Certain consumption values may be zero in a short reference period, but might not be if observed over a longer period. They are, thus, analogous to missing data with the added information that they have to be below a detection limit. If there is no external or theoretical indication on what the detection limit should be, it can be set as the minimum observed value of each part. The situation is therefore analogous to missing value imputation and zeros can be replaced with a value below the detection limit following certain criteria. Palarea-Albaladejo and Martín-Fernández (2008, 2015) modified the well-known EM imputation method to the compositional case by introducing the restriction that imputed values are below the detection limit. At least one part must be complete for all individuals. If this is not the case, the researcher can take the part with the fewest zeros and first replace its zeros with a small amount of around two thirds of the detection limit.

Finally, the x data can also be counts of phenomena, whose sum for the ith individual is Si. For instance, an individual’s total count of Si tweets can be classified into D content categories. Our hotel photo and complaint examples also constitute count data. The counts of the ith individual can be considered to be a realization of a multinomial distribution with θi1, θi2, . . . , θiD unobserved nonzero probabilities. Even if these probabilities are nonzero, a combination of a small probability and a small Si may result in certain x values being zero, referred to as count zeros. This opens up the possibility of using the Bayesian methods described in Martín-Fernández et al. (2015). An alternative approach is to treat count zeros as rounded zeros, which is considered appropriate if the total counts Si are large (Filzmoser et al. 2018). In this case, detection limits are straightforward. Since the minimum observable count is 1, the detection limit in terms of the closed composition can be set for each individual at 1/Si.

The references in this section acknowledge the fact that zero imputation can introduce distortion when the proportion of zero values to be imputed in the data set is large. What constitutes a large proportion may depend on many circumstances, but in many cases sizeable distortion starts occurring when around 15% or 20% of data are zeros. In this case, dropping parts with many zeros by means of a subcomposition analysis or amalgamating them together with other parts can mitigate the distortion, although it goes without saying that it affects the definition of the compositional research questions.
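To make the role of the detection limit 1/Si concrete, the sketch below replaces count zeros with two thirds of the detection limit and shrinks the nonzero parts so that the composition still sums to one. This is a deliberately simple stand-in and is not the modified EM algorithm mentioned above; the function name, the counts, and the application of the two-thirds fraction to every zero are assumptions made for illustration.

```python
def replace_count_zeros(counts, fraction=2 / 3):
    """Replace zeros by a fraction of the detection limit 1/S and rescale the other parts.

    Simple multiplicative replacement, used here only as an illustration;
    it is NOT the modified EM algorithm referred to in the text.
    """
    S = sum(counts)
    detection_limit = 1 / S            # the minimum observable count is 1
    delta = fraction * detection_limit
    n_zeros = sum(1 for c in counts if c == 0)
    shrink = 1 - n_zeros * delta       # keeps the unit sum after replacement
    return [delta if c == 0 else (c / S) * shrink for c in counts]

counts = [12, 0, 7, 1, 0, 20]          # hypothetical counts for D = 6 parts, S = 40
z = replace_count_zeros(counts)
print([round(v, 4) for v in z])
```

Ratios among the nonzero parts are left untouched by this replacement, so the log-ratios that involve only observed parts do not change.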
Multivariate Outlier Detection

Zero replacement is usually the first step in CoDa, and some sort of outlier diagnostics the second. CoDa has implications for outlier detection. Given the fact that parts cannot be considered in isolation, multivariate outlier detection methods are called for (Aitchison 1986). Once compositions have been transformed into centered log-ratios, squared Mahalanobis distances between each composition and the overall mean can be computed (Filzmoser and Hron 2008). Mahalanobis distances measure how far away each individual is from the center, taking into account the variances and covariances among log-ratios. It must be taken into account that Mahalanobis distances require inverting the covariance matrix, and thus they cannot be computed from the whole set of D centered log-ratios. This can be solved by simply leaving one of the centered log-ratios out. Fortunately, results are invariant to the decision on which one is left out. Under multivariate normality, these squared Mahalanobis distances follow a χ² distribution with D − 1 degrees of freedom. An appropriate percentile of this distribution can be used as cutoff criterion for outlier detection. This percentile should not be uncritically set to the usual 0.95 cutoff criterion but should take sample size into account. For instance, if the sample size is n = 1000 and one used the 0.95 cutoff, around 50 cases would appear as outliers even if no true outlier were present at all. Setting the cutoff, for instance, at the 0.999 percentile would be far more reasonable. An exact percentile which adapts the common 0.95 practice to the existing sample size can be obtained as 0.95^(1/n). Since Mahalanobis distances are themselves affected by outliers, an alternative is to compute robust Mahalanobis distances (Filzmoser et al. 2005, 2018).
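The steps above can be sketched with NumPy, which is assumed to be available. The data are simulated and purely hypothetical. The sketch computes classical (non-robust) squared Mahalanobis distances after leaving one centered log-ratio out, checks that the result does not depend on which one is left out, and computes the sample-size-adjusted percentile. Converting that percentile into a cutoff requires a χ² quantile function with D − 1 degrees of freedom (for instance scipy.stats.chi2.ppf, if SciPy is installed), which is left out here to keep the example dependency-light.

```python
import numpy as np

def clr(Z):
    """Row-wise centered log-ratios (Eq. 4) of an n x D array of compositions."""
    L = np.log(Z)
    return L - L.mean(axis=1, keepdims=True)

def squared_mahalanobis(Y):
    """Squared Mahalanobis distance of each row of Y to the column means."""
    diff = Y - Y.mean(axis=0)
    inverse_cov = np.linalg.inv(np.cov(Y, rowvar=False))
    return np.einsum("ij,jk,ik->i", diff, inverse_cov, diff)

rng = np.random.default_rng(0)
X = rng.lognormal(size=(50, 4))          # hypothetical positive data, n = 50, D = 4
Z = X / X.sum(axis=1, keepdims=True)     # closed compositions
Y = clr(Z)

# the covariance of all D centered log-ratios is singular, so one column is dropped
d2_without_last = squared_mahalanobis(Y[:, :-1])
d2_without_first = squared_mahalanobis(Y[:, 1:])
invariant = np.allclose(d2_without_last, d2_without_first)

n = Z.shape[0]
percentile = 0.95 ** (1 / n)             # adapts the 0.95 practice to the sample size
print(invariant, round(percentile, 4))
```

With n = 50 the adjusted percentile rounds to 0.9990, the value used in the example later in the chapter.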
Compositional Principal Component Analysis and the CoDa Biplot

Like standard data, compositional data require visualization tools to help researchers interpret large data tables with many individuals and parts. To this end, Aitchison (1983) extended the well-known principal component analysis procedure to the compositional case. This method belongs to the family of multivariate statistical analysis, and the extension boils down to submitting centered log-ratios (Eq. 4) to an otherwise standard principal component analysis based on the covariance matrix. Together with Gabriel’s (1971) biplot, which jointly represents cases (i.e., individual compositions) and variables (i.e., parts) in a principal component analysis, this served as the basis for Aitchison and Greenacre (2002) developing CoDa biplots.

A compositional principal component analysis computes uncorrelated linear combinations of the centered log-ratios which explain the highest possible portion of total variance (Eq. 8), called dimensions. The first two dimensions are represented in the CoDa biplot, which can be understood as the most accurate graphical representation of a compositional data set in two dimensions (or, optionally, three dimensions). As in standard principal component analysis, overall biplot accuracy can be assessed from the percentage of the total variance (Eq. 8) explained by the first two dimensions. The accuracy of the representation of each part can likewise be computed from the percentage of variance of each centered log-ratio explained by the first two dimensions (Daunis i Estadella et al. 2011).

In particular, the so-called covariance biplot is the most commonly drawn type in CoDa. It optimizes the representation of the variation matrix among parts. Proximity among individuals is not interpretable in this type of biplot. Parts appear as rays emanating from a common origin and individual compositions appear as points.
The origin of coordinates represents the composition center. The interpretation is as follows (see Aitchison and Greenacre 2002; Blasco-Duatis et al. 2019; Van den Boogaart and Tolosana-Delgado 2013; Pawlowsky-Glahn et al. 2015 for further details):

1. Distances between the vertices of the rays of two parts are approximately proportional to the square root of the variance of their corresponding pairwise log-ratio (Eq. 6). Parts that behave proportionally for all individuals appear close together. It must be noted that unlike the general principal component analysis case, in the CoDa biplot angles between rays play no interpretational role.
2. The orthogonal projection of all individuals in the direction defined by a ray shows an approximate ordering of the importance of that part for all individuals, in relative terms, compared to the geometric average of the remaining parts in the composition.

Compared to standard principal component analysis, in CoDa parts can never have all coordinates of the same sign on any dimension, stressing the fact that along any dimension some parts increase and others decrease, in relative terms.

Like standard principal component analysis, compositional principal component analysis is not only a visualization tool, but also a data reduction tool. The first few dimensions contain a summary of the compositional information and can be used as numeric variables in further statistical analyses, provided that they can be interpreted. The composition can thus be related to external non-compositional variables, by means of correlations if the external variable is numeric or by comparing the dimension means by a categorical external variable.
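The computations behind a covariance biplot can be sketched as a singular value decomposition of the column-centered centered log-ratios. The sketch assumes NumPy and uses simulated, hypothetical data; the scaling of rays and points shown here is one common convention for covariance biplots and is an assumption, not necessarily the one implemented in any particular software.

```python
import numpy as np

def clr(Z):
    """Row-wise centered log-ratios (Eq. 4)."""
    L = np.log(Z)
    return L - L.mean(axis=1, keepdims=True)

def coda_pca(Z):
    """Compositional PCA via SVD of the column-centered clr data."""
    Y = clr(Z)
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    n = Z.shape[0]
    eigenvalues = s ** 2 / (n - 1)            # variances of the dimensions
    explained = eigenvalues / eigenvalues.sum()
    rays = Vt.T * (s / np.sqrt(n - 1))        # one row per part (covariance biplot scaling)
    points = U * np.sqrt(n - 1)               # one row per individual composition
    return explained, rays, points

rng = np.random.default_rng(0)
X = rng.lognormal(size=(30, 4))               # hypothetical data, n = 30, D = 4
Z = X / X.sum(axis=1, keepdims=True)

explained, rays, points = coda_pca(Z)
biplot_accuracy = explained[:2].sum()         # share of total variance shown in two dimensions

# using all dimensions, squared distances between ray vertices equal log-ratio variances (Eq. 6)
squared_gap = np.sum((rays[0] - rays[1]) ** 2)
log_ratio_var = np.var(np.log(Z[:, 0] / Z[:, 1]), ddof=1)
print(round(biplot_accuracy, 3), np.isclose(squared_gap, log_ratio_var))
```

In a two-dimensional plot only rays[:, :2] and points[:, :2] are drawn, so the equality between squared distances and log-ratio variances becomes an approximation whose quality depends on biplot_accuracy. The ray coordinates sum to zero on every dimension, which is why parts can never all share the same sign.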
Compositional Cluster Analysis

Like standard data, compositional data can benefit from classifying individual compositions into groups of mutually similar compositions, called clusters. In other words, pairs of compositions within the same cluster have lower Aitchison’s distances than pairs of compositions belonging to different clusters. Put differently, the sum of centered log-ratio variances within clusters is as small as possible. Cluster analysis is the typical multivariate statistical analysis method for this purpose, and many alternative clustering methods are available. An attractive feature of compositional cluster analysis is that once centered log-ratios have been computed, any standard cluster analysis method supporting Euclidean distances can be used (Ferrer-Rosell and Coenders 2018; Godichon-Baggioni et al. 2019; Martín-Fernández et al. 1998). This includes, among others, Ward’s method and the k-means method. Any such method can be applied with standard software on the centered log-ratios and will provide equivalent results to clustering based on Aitchison’s distances.

In particular, the k-means method minimizes the sum of variances of all centered log-ratios within clusters, as a measure of intra-cluster similarity. For a classification into k groups, k initial cluster centers are selected randomly. Each individual composition is assigned to the nearest center, and centers are iteratively updated according to the assigned individuals until no individual changes membership. Since the procedure may fall into a local minimum of within-cluster variance, it may be replicated a large number of times with different sets of random initial cluster centers.

As regards the cluster interpretation, cluster profiles can be described by means of within-cluster means of the centered log-ratios, if necessary exponentiated and closed back to the original composition unit sum. A standard graphical representation of cluster analysis results in CoDa is the geometric mean barplot. This plot depicts the log-ratios of the closed cluster means of each part over the closed mean of that part for the overall sample. Positive bars show above-average parts for that particular cluster and negative bars below-average parts. Since CoDa focuses on relative information, no cluster will ever have the highest or the lowest means on all parts. A well-known terminology distinguishes between clustering based on size and clustering based on shape, CoDa belonging to the latter category (Greenacre 2017).

The main procedural difference compared to standard cluster analysis is that standardization of centered log-ratios is not desirable because it modifies distances and would thus make Euclidean distances no longer equivalent to Aitchison’s distances. In most other respects, the analysis is carried out like a cluster analysis on any numeric data set. Decisions on the number of clusters (k) are made as usual. In any case these decisions involve a trade-off between accuracy and parsimony; in other words, the higher the desired similarity of individuals within the clusters, the higher the number of required clusters.
A pragmatic approach is to start with a low number of clusters k and keep adding clusters as long as the profiles of the additional clusters are meaningfully different and as long as none of the clusters is too small for practical purposes. A usual statistical measure of the aforementioned trade-off between accuracy and parsimony is the Calinski index, higher values tending to show a good choice of k:

$$\frac{(\text{total sum of variances} - \text{within-cluster sum of variances})/(k-1)}{\text{within-cluster sum of variances}/(n-k)}, \qquad (12)$$

where total sum of variances is (Eq. 8). Another statistical measure is the average silhouette width, which compares the average distances of each case with all cases in its own cluster and with all cases in the second best neighboring cluster. Higher values also tend to show a good choice of k.

Relationships between the cluster-membership variable and external non-compositional variables are also analyzed with the usual statistical tools in any cluster analysis. Such relationships constitute a convenient way to relate the composition to other variables in further statistical analyses. The simplest methods are contingency tables when the external variable is categorical and analysis of variance when the variable is numeric.
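The k-means procedure and the index in (Eq. 12) can be sketched as follows, assuming NumPy. The data are simulated with three well-separated hypothetical groups, and the function names are illustrative. Sums of squares are used instead of sums of variances; this is an assumption, and the index is unchanged as long as the total and within-cluster quantities share the same divisor. Note that the centered log-ratios are not standardized.

```python
import numpy as np

def clr(Z):
    """Row-wise centered log-ratios (Eq. 4)."""
    L = np.log(Z)
    return L - L.mean(axis=1, keepdims=True)

def kmeans(Y, k, n_starts=50, max_iter=100, seed=0):
    """Plain k-means with random restarts; returns the labels of the best run."""
    rng = np.random.default_rng(seed)
    best_labels, best_within = None, np.inf
    for _ in range(n_starts):
        centers = Y[rng.choice(len(Y), size=k, replace=False)]  # random initial centers
        labels = None
        for _ in range(max_iter):
            d2 = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            new_labels = d2.argmin(axis=1)
            if labels is not None and np.array_equal(new_labels, labels):
                break                                            # no membership change
            labels = new_labels
            for c in range(k):
                if np.any(labels == c):
                    centers[c] = Y[labels == c].mean(axis=0)
        within = sum(((Y[labels == c] - centers[c]) ** 2).sum() for c in range(k))
        if within < best_within:
            best_labels, best_within = labels, within
    return best_labels

def calinski(Y, labels):
    """Eq. 12 computed from total and within-cluster sums of squares."""
    groups = np.unique(labels)
    n, k = len(Y), len(groups)
    total = ((Y - Y.mean(axis=0)) ** 2).sum()
    within = sum(((Y[labels == c] - Y[labels == c].mean(axis=0)) ** 2).sum() for c in groups)
    return ((total - within) / (k - 1)) / (within / (n - k))

# three hypothetical groups of 20 compositions each, D = 3 parts
rng = np.random.default_rng(1)
log_means = np.array([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]])
X = np.exp(np.repeat(log_means, 20, axis=0) + rng.normal(scale=0.1, size=(60, 3)))
Z = X / X.sum(axis=1, keepdims=True)
Y = clr(Z)                                   # clustered as is, without standardization

labels2, labels3 = kmeans(Y, 2), kmeans(Y, 3)
index2, index3 = calinski(Y, labels2), calinski(Y, labels3)
print(index3 > index2)
```

In this constructed example the index clearly favors k = 3, the number of groups used to simulate the data; with real data the choice is rarely this clear-cut and the pragmatic criteria described above remain relevant.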
Limitations and Extensions

The inability to work with sparse data tables with many zeros is indeed one of the most often quoted limitations of CoDa. This precludes using CoDa, for instance, in web mining of short texts if single words or single word combinations are treated as parts. Alternatives such as correspondence analysis are recommended in these cases (Greenacre 2018). Another often quoted limitation is that, in a log scale, parts with very small values may end up dominating the analysis results. Advanced methods for downweighting small parts are discussed in Greenacre (2018). Amalgamation of very small parts is an alternative, as long as it is coherent with the research problem definition.

This chapter has only presented descriptive methods. Of course CoDa lends itself to statistical inference. The composition can be the dependent or the explanatory variable in statistical models ranging from simple multivariate analysis of variance or regression models to mixture models, time series models, generalized linear models, and structural equation models (Filzmoser et al. 2018; Pawlowsky-Glahn et al. 2015), with many applications in the tourism field (Coenders and Ferrer-Rosell 2020). To make these applications possible, alternative log-ratio transformations (Egozcue et al. 2003), robust methods, and methods for high-dimensional data (Filzmoser et al. 2018) have been duly developed. In the particular case of text content analysis, a noteworthy variation on the theme is that by Roberts et al. (2016), a multistep procedure including a compositional regression.
Example

Data

The example presented in this chapter shows that CoDa methodology serves as an important complementary tool for content analysis. In this case, the aim of the application is to analyze the content of hotel reviews, and more particularly to relate complaint topics to one another, as well as to distinguish hotel clusters based on major complaint topics. The primary unit of content analysis is the review itself. Some of the other available and relevant variables per review are review identification, review date, hotel identification, user identification, and score given (from 1 to 5). Hotel reviews of the city of Barcelona were downloaded in September 2016 from TripAdvisor, as it is one of the leading online traveler opinion platforms (Martin-Fuentes 2016). The downloading process was done automatically with a web scraper tool developed in Python, and the process took less than 24 h to obtain a random selection of 31,000 reviews from hotels of all categories. For the example, out of the total hotels included in the sample, we selected those which had at least 150 reviews (n = 50 hotels). Then, we randomly selected 50 reviews of each hotel. Thus, 2,500 reviews were analyzed.
The topics of complaints were deduced from the content analysis of reviews. The hotel is the unit of statistical analysis in the compositional data set, and counting the topics of complaints in all reviews of each hotel constitutes count data. The hotel’s total count of identified contents in the 50 reviews (Si ) was classified into D = 8 topics (content categories or parts). The topics were nothing (the review did not include any complaint or negative comment), facilities (the review included negative comments about hotel facilities in general, beds, rooms in general, bath, or common facilities), services provided (the review included negative comments about the services provided by the hotel such as the breakfast, the Wi-Fi, the bar/restaurant, the pool, etc.), cleanliness (the review included negative comments about the hotel cleanliness in general and about the rooms in particular), location (the review included negative comments related to the location of the hotel), environment (the review included negative comments about the neighborhood, external noise, etc.), staff (the review included negative comments about staff, for example, staff not being helpful or problem-solving), and other complaints (the review included negative comments unrelated to the former topics). Apart from the reviews, the average hotel score was also used to relate it to the biplot dimensions and to describe the clusters.
Results

The location part has a large percentage of zeros (42.0%). Its conceptual similarity with the environment part makes amalgamation a feasible option. Neither part is under the control of the hotel management, at least in the short term; both depend mostly on where the hotel is located. We name the amalgamated part environment, understanding that it covers both concepts. After amalgamation, the percentage of zeros (12.57%) is deemed appropriate for replacement by the modified EM algorithm, setting the detection limits at 1/Si. If the relative importance of complainers versus non-complainers is outside the focus of the example and the main aim is to study the distribution of the importance of complaints by topics, then a subcomposition excluding the nothing part makes sense. In the rest of the example, we concentrate on the parts facilities, services, cleanliness, staff, environment, and other. The outlier detection threshold is set at 0.95^(1/50) = 0.9990. No outliers are found.

Table 1 shows the variation matrix and the center. The average of the variation matrix elements is 0.867. Pairs of parts with a log-ratio variance below 0.2 × 0.867 = 0.173, if any, would be considered to move proportionally. Conversely, the pairs of parts environment versus services and environment versus staff have comparatively high log-ratio variances, meaning that hotels with relatively more complaints about environment tend to have relatively fewer about services and staff. The center shows that the most often quoted reasons for complaining are facilities, environment, and services, and the least often quoted reasons are staff and cleanliness.

Figure 1 shows the biplot. The distances among pairs of rays closely mirror the log-ratio variances in Table 1. Orthogonal projections along the directions defined by a ray constitute an approximate ordering of hotels according to the ratio of the frequency of a complaint topic over the geometric mean of the frequency of the remaining complaints. For instance, hotel 1 has the largest frequency of complaints about services in relative terms and hotel 24 the lowest. Hotels in the upper left quadrant stand out for having relatively more complaints on other reasons and staff and relatively fewer on cleanliness and facilities. The percentage of variance explained by the first two dimensions is deemed satisfactory at 60.5%, which argues for good biplot accuracy.

Table 1 Center, variation matrix, and centered log-ratio variances

             Center  Environment  Services  Other  Staff  Cleanliness  Clr variances
Facilities   0.344   0.728        0.862     0.797  0.942  0.406        0.261
Environment  0.243                1.379     1.045  1.433  0.625        0.507
Services     0.157                          0.817  0.828  0.788        0.418
Other        0.103                                 0.771  0.694        0.326
Staff        0.092                                        0.891        0.449
Cleanliness  0.062                                                     0.206
Total                                                                  2.168

Fig. 1 CoDa biplot of hotel complaints
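The summary figures quoted for Table 1 can be reproduced from the reported variation matrix. The sketch below is a consistency check rather than part of the authors' analysis; it takes the 15 log-ratio variances from Table 1 and applies (Eq. 7) and (Eq. 9), plus the standard identity that the variance of a centered log-ratio equals the sum of the log-ratio variances involving that part, minus the total variance, all divided by D. Small discrepancies in the third decimal are to be expected because the table entries are rounded.

```python
parts = ["Facilities", "Environment", "Services", "Other", "Staff", "Cleanliness"]

# upper triangle of the variation matrix as reported in Table 1
upper = {
    ("Facilities", "Environment"): 0.728,
    ("Facilities", "Services"): 0.862,
    ("Facilities", "Other"): 0.797,
    ("Facilities", "Staff"): 0.942,
    ("Facilities", "Cleanliness"): 0.406,
    ("Environment", "Services"): 1.379,
    ("Environment", "Other"): 1.045,
    ("Environment", "Staff"): 1.433,
    ("Environment", "Cleanliness"): 0.625,
    ("Services", "Other"): 0.817,
    ("Services", "Staff"): 0.828,
    ("Services", "Cleanliness"): 0.788,
    ("Other", "Staff"): 0.771,
    ("Other", "Cleanliness"): 0.694,
    ("Staff", "Cleanliness"): 0.891,
}

D = len(parts)
mean_log_ratio_variance = sum(upper.values()) / (D * (D - 1) / 2)   # Eq. 7
total_variance = sum(upper.values()) / D                            # Eq. 9
threshold = 0.2 * mean_log_ratio_variance                           # proportionality rule of thumb

# variance of each centered log-ratio recovered from the variation matrix
clr_variance = {
    p: (sum(v for pair, v in upper.items() if p in pair) - total_variance) / D
    for p in parts
}

print(round(mean_log_ratio_variance, 3), round(total_variance, 3), round(threshold, 3))
proportional_pairs = [pair for pair, v in upper.items() if v < threshold]
print(proportional_pairs)   # pairs considered to move proportionally, if any
```

The smallest reported log-ratio variance (facilities versus cleanliness, 0.406) is well above the threshold, so no pair of parts qualifies as moving proportionally under the 0.2 rule.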
According to Table 2, hotels with the highest average scores are at the bottom of the biplot; thus, satisfied reviewers tend to complain more about services. The most unsatisfied customers are those lodged at hotels located in the upper part of the biplot, who tend to complain more about staff. Having said this, the correlation is admittedly low.

Table 2 Correlations between biplot dimensions and hotel score (mean over 50 reviews)

                    1st dimension (horizontal axis)  2nd dimension (vertical axis)
Hotel score (mean)  −0.033                           −0.194

The results of a k-means clustering with three clusters are depicted in the geometric mean barplot in Fig. 2. The first cluster (black bars, n = 25) stands out for relatively more complaints on environment, cleanliness, and facilities, and relatively fewer on staff, services, and other topics, compared to the overall sample average. The second cluster (red bars, n = 17) stands out for relatively more complaints on services and staff, and relatively fewer on environment, compared to the overall sample average. Finally, the third cluster (green bars, n = 8) stands out for relatively more complaints on other topics, staff, and environment and relatively fewer on facilities and cleanliness. These results can be numerically observed in Table 3, which shows the centers of the parts of each cluster. The three clusters are also depicted in Fig. 3. Half of the hotels are included in the first cluster, with relatively more complaints about facilities, cleanliness, and environment. According to Table 4, hotels in clusters 1 and 3 are the ones with the highest mean and median hotel score, while hotels in the second cluster are the worst evaluated (lowest hotel score mean and median), although differences are admittedly minimal. Figure 4 shows the corresponding boxplots.

Fig. 2 Geometric mean barplot of clusters and complaint topics

Table 3 Centers of parts in each cluster

           Facilities  Services  Cleanliness  Staff  Environment  Other
Cluster 1  0.392       0.100     0.065        0.049  0.321        0.073
Cluster 2  0.330       0.248     0.054        0.153  0.112        0.103
Cluster 3  0.154       0.151     0.044        0.132  0.330        0.189
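The bar heights of a geometric mean barplot can be recomputed from the published centers. The sketch below takes the overall center from Table 1 and the cluster centers from Table 3 and computes the log-ratio of each cluster center over the overall center, part by part. It only illustrates how the bars are defined; the values are based on rounded table entries and may differ slightly from those drawn in Fig. 2.

```python
import math

# overall center (Table 1) and cluster centers (Table 3), as reported in the chapter
overall = {"Facilities": 0.344, "Services": 0.157, "Cleanliness": 0.062,
           "Staff": 0.092, "Environment": 0.243, "Other": 0.103}
clusters = {
    "Cluster 1": {"Facilities": 0.392, "Services": 0.100, "Cleanliness": 0.065,
                  "Staff": 0.049, "Environment": 0.321, "Other": 0.073},
    "Cluster 2": {"Facilities": 0.330, "Services": 0.248, "Cleanliness": 0.054,
                  "Staff": 0.153, "Environment": 0.112, "Other": 0.103},
    "Cluster 3": {"Facilities": 0.154, "Services": 0.151, "Cleanliness": 0.044,
                  "Staff": 0.132, "Environment": 0.330, "Other": 0.189},
}

# bar height = log of (closed cluster mean / closed overall mean) for each part
bars = {
    name: {part: math.log(cluster_center[part] / overall[part]) for part in overall}
    for name, cluster_center in clusters.items()
}

for name, heights in bars.items():
    above = [part for part, h in heights.items() if h > 0]
    below = [part for part, h in heights.items() if h < 0]
    print(name, "above average:", above, "below average:", below)
```

Every cluster has both positive and negative bars, in line with the remark that no cluster can have the highest or the lowest means on all parts.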
Fig. 3 CoDa biplot of hotel complaints per cluster

Table 4 Statistics of hotel score per cluster. Mean, standard deviation, and percentiles

                             Percentiles
           Mean   Std. dev.  0      25     50     75     100
Cluster 1  4.115  0.443      2.840  3.940  4.200  4.380  4.960
Cluster 2  4.007  0.417      3.400  3.640  4.120  4.300  4.660
Cluster 3  4.168  0.395      3.560  4.000  4.240  4.460  4.880

Fig. 4 Boxplots of hotel score per cluster
Summing up, the application of the CoDa methodology in this example has made it possible to plot the relative importance of complaint topics for each specific hotel and to draw clusters with different complaint profiles, which are related to the hotel average score and could also be related to any other hotel characteristic.
Conclusion

As stated throughout the chapter, data carrying relative information have particular characteristics which may lead to interpretational difficulties, among other problems, when using standard statistical analyses. The proportional nature of data must then be taken into account from the outset. For instance, we cannot consider the distance between 1% and 2% to be the same as between 11% and 12%, which is what the Euclidean distance does. In the first pair, the increase is 100%, while in the second pair, it is less than 10%. Most standard and classical statistical methods do not consider the restricted nature of data expressed as proportions and are subject to spurious correlations among the parts and to violations of statistical and distributional assumptions, for instance, normality. The main advantage and also the appeal of the CoDa methodology is that it solves the aforementioned problems. It is also worth mentioning that once the data (components) have been transformed into logarithms of ratios, any present and future standard statistical technique may be used, since the relative importance of the parts is put on the table and distributional assumptions such as normality have a chance to be met.

As stated in Ferrer-Rosell (2021), CoDa has already been used to analyze e-tourism data. The CoDa methodology in e-tourism is considered to be an ideal complement to content analysis techniques and to research regarding dominance of contents in any kind of (online) source. Regarding the future of the CoDa methodology in e-tourism, apart from being a simple and straightforward tool to use when researchers focus on proportions, a promising direction is to also consider the total (volume) of contents. The total has been used to advantage in research about tourist expenditure, where it is interesting to analyze the distribution of the trip budget and the total trip budget in the same statistical model (Ferrer-Rosell et al. 2016b), but has not been used in e-tourism yet. In the e-tourism context, analyzing the composition of contents (which contents are more emphasized in online sources) in, for instance, social media is as relevant as analyzing the total number of posts, or its ratio to the number of tourists at the destination. The total number of posts in social media relative to the number of visitors determines how active the social media profiles are.
Other possible developments are to take advantage of the usability of any statistical technique on the log-ratios, including the composition as dependent, explanatory, or mediating variable in static or dynamic models, although more advanced log-ratio transformations than those presented in this chapter are sometimes needed (Filzmoser et al. 2018; Pawlowsky-Glahn et al. 2015).
Appendix: CoDaPack Menus Used for the Example

CoDaPack is an intuitive menu-driven freeware for CoDa developed by the Research Group in Statistics and Compositional Data Analysis at the University of Girona. The philosophy of CoDaPack is to reduce the analysis steps the users must perform by themselves. The program computes by itself the needed log-ratios for each type of analysis. CoDaPack can be downloaded at: http://ima.udg.edu/codapack/

The File menu handles opening and saving data files, including importing and exporting them in a variety of formats, at the moment of writing this chapter .xls, .csv, .txt, and .RData. Ideally the file contains some columns indicating a closed composition (Eq. 2) together with non-compositional numeric and categorical variables as wished by the researcher. Zeros are not coded as “0,” but coded as below a certain detection limit, which may be different for each zero cell. For instance, if a value is known to be below 0.005, the data file entry in .xls, .csv, and .txt formats is “