Contributions to Management Science
Massimiliano Nuccio Sofia Mogno
Mapping Digital Skills in Cultural and Creative Industries in Italy A Natural Language Processing Approach
Contributions to Management Science
The series Contributions to Management Science contains research publications in all fields of business and management science. These publications are primarily monographs and multiple author works containing new research results; selected conference-based publications are also considered. The focus of the series lies in presenting the development of the latest theoretical and empirical research across different viewpoints. This book series is indexed in Scopus.
Massimiliano Nuccio Department of Management and Bliss Digital Impact Lab Ca’ Foscari University of Venice Venice, Italy
Sofia Mogno Department of Management and Bliss Digital Impact Lab Ca’ Foscari University of Venice Venice, Italy
This work was supported by Ca’ Foscari University of Venice
ISSN 1431-1941 ISSN 2197-716X (electronic)
Contributions to Management Science
ISBN 978-3-031-26866-3 ISBN 978-3-031-26867-0 (eBook)
https://doi.org/10.1007/978-3-031-26867-0
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
Innovation scholars are well aware that an industrial revolution is defined as such when the transformation induced by new technologies is pervasive and affects organizations and individuals in every aspect of daily life. Over the last decade, digital transformation has profoundly affected both the way we produce goods and services and the behavior of citizens and consumers. However, it would be dangerous to focus only on the technological dimension without considering the different aspects that characterize the interaction between humans and machines. The history of industrial revolutions has also taught us that, to maximize the positive effects of technology in production systems and limit the negative ones, it is necessary to act on the knowledge and skills of workers. The faster the workforce adapts to new technologies and learns to use them, the less severe the short-term impact on employment will be. This volume investigates the evolution of knowledge and skills generated by digital transformation in the cultural and creative industries (CCI). The research starts from the awareness of the extreme heterogeneity intrinsic to the notion of CCI and therefore from the need to explain the characteristics and relative weight of each sector included in the standard definition of CCI. Accordingly, the analysis does not take a normative approach based on a generic aspiration for larger and faster digitization of the CCI. The goal is to understand what mix of competences and knowledge cultural workers declare and which new technological skills emerge in the different sectors. Although the analysis has an exploratory value and relies on a relatively limited sample, clusters of digital and managerial skills clearly emerge in association with more traditional humanistic and cultural skills.
The authors are convinced that this type of analysis based on natural language processing algorithms, if conducted regularly, can be a very powerful tool for planning education and training in the different cultural occupations and for selecting the professional profiles that best address the knowledge and competence gaps of cultural organizations and creative firms. The research project was funded by the Management Department of Ca’ Foscari University of Venice, and it is the outcome of the collaboration between two research centers of the same department: Maclab (Management, Arts & Culture Laboratory, https://www.maclab.info/) and Bliss (Digital Impact Lab, www.unive.it/bliss/).
Venice, Italy
Massimiliano Nuccio
Sofia Mogno
Acknowledgments
This research project was funded by the Management Department of Ca’ Foscari University of Venice. In particular, we would like to thank Maclab—Management, Arts & Culture Laboratory (https://www.maclab.info/). Also, we would like to thank Aldo Razzino, CEO and MD at Open Search Group (https://opensearchgroup.com/), and his team for their support in building the dataset that was provided in fully anonymous format. All chapters were co-authored by Massimiliano Nuccio and Sofia Mogno (BLISS—Digital Impact Lab, Management Department, Ca’ Foscari University of Venice). Nicolò Tamagnone and Grazia Sveva Ascione co-authored Sect. 3.3.
Introduction
The research presented in this book analyzes the digital and creative knowledge, skills, and competences (KSC) of workers in the creative and cultural industries (CCI) in Italy. It identifies which KSC are most relevant and which skill gaps and mismatches may exist between KSC demand and supply, and it draws managerial implications for the future. The book analyzes the CCI workforce in Italy by combining a literature review and empirical research. The former compiles existing literature on digital and creative KSC development and its nexus with education, human resource management (HRM), organizational and digital innovation, and policymaking. The latter measures KSC supply in the CCI by extracting KSC from nearly 8000 curricula vitae (CVs) through an innovative natural language processing (NLP) analysis based on an algorithm trained on the KSC taxonomies from ESCO, and measures KSC demand from 131,504 job adverts, in order to assess current KSC and the existing gaps between their demand and supply among the CCI workforce in Italy. The final aim of the study is to (1) highlight noteworthy digital and creative KSC, drawing a geographical mapping of creative and cultural workers in Italy, (2) assess whether relevant KSC gaps or mismatches between labor demand and supply exist in the Italian CCI, also taking into consideration the impact of digital transformation on the degree of complexity and the characteristics of KSC, and (3) propose how these new and/or missing KSC affect labor demand and supply, either opening new professional opportunities and careers or hindering others, and how they affect long-term sustainable growth and innovation in the Italian CCI. Escalating digitization and the changes in lifestyle triggered by COVID-19 have pushed organizations toward continuous and cutting-edge innovation to boost their competitive advantage (Ren & Song, 2020) and value creation. Digital transformation is defined as a “process” (Vial, 2019, p.
118) of change, which can be analyzed from multiple perspectives. At a macro level, it describes a social and industrial phenomenon sparked by the systematic introduction and application of digital technologies (Agarwal et al., 2010), while at a micro (single firm) level, it identifies an organization’s process of substantially changing its properties “through combinations of
information, computing, communication, and connectivity technologies” (Vial, 2019, p. 118) with the aim of improving its efficiency and effectiveness. In both cases, digital transformation involves a strategic internal reaction to disruptive changes in the external environment by creating new streams and ways for value creation through digital technology. Since, at an organizational level, the notion of digital transformation is strictly related to the concept of digital business strategy, which has to leverage “digital resources to create differential value” (Mithas et al., 2013, p. 472), digital transformation can also be explained as a daily rolling process demanding capacities and competences that may steer an agile “strategic renewal of an organization’s business model, collaborative approach, and [. . .] culture” (Warner & Wäger, 2019, p. 344). Indeed, digital transformation can be depicted as not only an innovation process or business strategy but also a novel culture that must be fostered by all the workforce within an organization, triggering new awareness, introducing original ways of thinking, and generating ideas within it (Warner & Wäger, 2019). That is why collaborative network forms have gained predominance in the digital hyperconnected era as they build on informal linkages for technological development and R&D as well as rapid knowledge sharing and acquisition (Dodgson, 1993, p. 78). In particular, the business ecosystem, defined as a network of interrelated stakeholders who cooperate to jointly create value for the whole ecosystem itself (Eamonn, 2015), has prevailed as a successful business model. However, many firms face remarkable difficulties in properly and successfully addressing digital transformation, which may entail reviewing and reorganizing the business model, due to poor digital strategy development and implementation (Correani et al., 2020). 
As operations and processes must be frequently revised to keep pace with relentlessly changing technology, customer demand, and systems, new roles and skills continuously emerge as necessary, turning people into a crucial asset to detect, address, and seize the opportunities of digital transformation. Thus, digital transformation requires a new organizational culture combining flexibility, agility, entrepreneurship, and risk-taking for a business to implement changes and novel technology successfully and fast (Hartl & Hess, 2017), to react rapidly to ever-changing consumer needs (Horlach et al., 2017), and to collaborate and share information frictionlessly within innovation-driven business ecosystems (Dremel et al., 2017). Because system and integrative thinking (Warner & Wäger, 2019) consequently become fundamental, developing cross-discipline skills and competences seems to be essential in the digital era (Dremel et al., 2017, p. 99). Lazzaretti (2020) explains the role of culture as a “resource and digital creative capacity” (p. 3), claiming that the synergy between “art and digital creativity” (p. 8) is set to become a strategic opportunity and innovation driver in the emerging ecosystem- and knowledge-based economy. Clarke and Clegg (2000, p. 45) also highlight the positive impact of creativity on innovation potential, proposing a tripartite set of factors that could underpin and engender sustainable business growth: creativity, intelligence, and ideas. Since new KSC are required to thrive in the hyperconnected and innovative twenty-first century and in constantly
changing and uncertain environments (Dodgson, 1993), not only is a new supply of KSC in high demand in the labor market, but more integrated and wide-ranging educational curricula and assessments combining both digital and creative aspects also urgently need to be developed and implemented (Darling-Hammond, 2012). The development of creative and digital synergies and skills is essential to enhance innovation in the knowledge-based era (van Laar et al., 2020), even more so in the aftermath of the COVID-19 pandemic and its related restriction policies. This is particularly urgent in the CCI, which are considered among the industries most damaged by the pandemic, mainly because of social distancing and physical restriction policies, which not only disrupted the processes and business models of many CCI organizations but also sharply increased unemployment in the industry. In 2020, the CCI reported a global contraction in gross value added (GVA) of US$750 billion with respect to 2019, with revenue losses ranging from 20 to 40% across countries and 10 million jobs lost (UNESCO, 2021). Advanced IT skills are reported as a major skill shortage by employers in the CCI (Giles et al., 2019; VVA, 2021). According to Giles et al. (2019), skill mismatch comprises skill shortages, which arise when employers are unable to find the qualifications, skills, and experience they need when recruiting; skill gaps, which arise when the skills possessed by workers do not meet the requirements of their role; and skill underutilization, which describes the extent to which the skills, capabilities, and qualifications of individuals exceed the level required for a role. Accelerating digitization and recent worldwide events call not only for new business models, a new organizational culture, and new products and services but also for new KSC that can successfully address more hyperconnected and complex situations (Vial, 2019).
Therefore, a more integrated approach is needed to understand which new types and combinations of KSC are required to thrive in the new economy in the CCI. In particular, since creativity is expected to be a major source of value creation in the new knowledge-based and digitally driven economy if properly combined with digital skills (van Laar et al., 2020), it is crucial to understand how their synergy can be successfully built and boosted in the CCI and how it can be turned into a strategic driver for innovation and competitive advantage in contemporary dynamic contexts. The analysis of the creative and digital skills needed in the CCI has thus become more vital than ever. The book is organized as follows. The first two chapters review past literature on knowledge, skills, and competences and on the troublesome definition of the creative and cultural industries. In particular, in the first chapter we introduce various KSC frameworks to define KSC according to four main research areas, highlighting the rising need to focus on digital and creative KSC in management studies. In the second chapter, we discuss the complex definitions of creativity and the boundaries of the CCI, reporting some statistics on the economic impact of the CCI on the national economy and employment. After showing the relevance of our research, Chap. 3 introduces our novel methodology and research approach through a tripartite structure. Compiling existing literature and studies on both NLP and CV (or resume) analysis, the chapter shows the relevance of our methodological approach, which implements NLP techniques that extract KSC from job candidates’ CVs in order to identify the digital skills supplied and demanded in the labor market of the Italian CCI. Indeed, the third section explains the major steps of our empirical strategy, which adopts an NLP approach to the mapping of creative and digital skills in the CCI in Italy. Chapter 4 reports the main results of our research, which analyzed KSC from a sample of CVs consisting of two sub-samples through a descriptive, a KSC, and a cluster analysis. These analyses were used to explain the main gaps between KSC demand and supply in the CCI and to derive managerial implications.
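The comparison between KSC demand and supply outlined above can be illustrated with a minimal sketch. It is not the procedure used in the book: it assumes that KSC labels have already been extracted from each job advert and each CV, and it simply compares the share of adverts and the share of CVs that mention each label. The function names (ksc_shares, ksc_gap) and the toy data are hypothetical and serve only as an illustration.

```python
from collections import Counter


def ksc_shares(documents):
    """Return, for each KSC label, the share of documents that mention it.

    `documents` is a list of label lists, one per job advert or CV
    (hypothetical input; extraction is assumed to have happened upstream).
    """
    if not documents:
        return {}
    counts = Counter()
    for labels in documents:
        counts.update(set(labels))  # count each label at most once per document
    total = len(documents)
    return {label: n / total for label, n in counts.items()}


def ksc_gap(demand_docs, supply_docs):
    """Return demand share minus supply share for every observed label.

    A positive value suggests a label requested more often than it is
    declared; a negative value suggests the opposite.
    """
    demand = ksc_shares(demand_docs)
    supply = ksc_shares(supply_docs)
    labels = set(demand) | set(supply)
    return {label: demand.get(label, 0.0) - supply.get(label, 0.0) for label in labels}


# Toy data, invented purely for illustration.
job_adverts = [
    ["data analysis", "project management"],
    ["data analysis", "graphic design"],
]
cvs = [
    ["graphic design"],
    ["graphic design", "project management"],
    ["art history"],
    ["graphic design"],
]

gaps = ksc_gap(job_adverts, cvs)
for label, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{label}: {gap:+.2f}")
```

In this toy case, "data analysis" appears in every advert and in no CV, so it shows the largest positive gap, whereas "graphic design" is declared more often than it is requested. A difference in shares is only one possible indicator of mismatch and says nothing about skill levels or underutilization.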
References
Agarwal, R., Gao, G., DesRoches, C., & Jha, A. K. (2010). The digital transformation of healthcare: Current status and the road ahead. Information Systems Research, 21(4), 796–809. https://doi.org/10.1287/isre.1100.0327
Barrena-Martínez, J., Cricelli, L., Ferrándiz, E., Greco, M., & Grimaldi, M. (2020). Joint forces: Towards an integration of intellectual capital theory and the open innovation paradigm. Journal of Business Research, 112, 261–270. https://doi.org/10.1016/j.jbusres.2019.10.029
Clarke, T., & Clegg, S. (2000). Management paradigms for the new millennium. International Journal of Management Reviews, 2(1), 45–64. https://doi.org/10.1111/1468-2370.00030
Correani, A., De Massis, A., Frattini, F., Petruzzelli, A. M., & Natalicchio, A. (2020). Implementing a digital strategy: Learning from the experience of three digital transformation projects. California Management Review, 62(4), 37–56. https://doi.org/10.1177/0008125620934864
Darling-Hammond, L. (2012). Policy frameworks for new assessments. In P. Griffin, B. McGaw, & E. Care (Eds.), Assessment and teaching of 21st century skills (pp. 301–399). Springer.
Dodgson, M. (1993). Learning, trust, and inter-firm technological linkages: Some theoretical associations. Human Relations, 46(1), 77–95. https://doi.org/10.1177/001872679304600106
Dremel, C., Wulf, J., Herterich, M. M., Waizmann, J. C., & Brenner, W. (2017). How AUDI AG established big data analytics in its digital transformation. MIS Quarterly Executive, 16(2), 81–100.
Eamonn, K. (2015). Blurring boundaries, uncharted frontiers. In Business ecosystems come of age (pp. 16–29). Deloitte University Press, Business Trends Series.
Giles, L., Spilsbury, M., & Carey, H. (2020). A skills monitor for the creative industries. Multiple: Creative Industries Policy and Evidence Centre and Work Advance. Available from: https://pec.ac.uk/discussion-papers/creative-skillsmonitor
Harris, L. (2000). A theory of intellectual capital. Advances in Developing Human Resources, 2(1), 22–37. https://doi.org/10.1177/152342230000200104
Hartl, E., & Hess, T. (2017). The role of cultural values for digital transformation: Insights from a Delphi study. In Twenty-third Americas conference on information systems, Boston, 2017.
Horlach, B., Drews, P., Schirmer, I., & Böhmann, T. (2017). Increasing the agility of IT delivery: Five types of bimodal IT organization. In Proceedings of the 50th Hawaii international conference on system sciences, 2017 (pp. 5420–5429). http://hdl.handle.net/10125/41818
Lazzaretti, L. (2020). What is the role of culture facing the digital revolution challenge? Some reflections for a research agenda. European Planning Studies, 30(9), 1617–1637. https://doi.org/10.1080/09654313.2020.1836133
Mithas, S., Tafti, A., & Mitchell, W. (2013). How a firm’s competitive environment and digital strategic posture influence digital business strategy. MIS Quarterly, 511–536.
Paoloni, M., Coluccia, D., Fontana, S., & Solimene, S. (2020). Knowledge management, intellectual capital and entrepreneurship: A structured literature review. Journal of Knowledge Management, 24(8), 1797–1818. https://doi.org/10.1108/JKM-01-2020-0052
Ren, S., & Song, Z. (2021). Intellectual capital and firm innovation: Incentive effect and selection effect. Applied Economics Letters, 28(7), 617–623. https://doi.org/10.1080/13504851.2020.1767281
UNESCO, Naylor, R., Todd, J., Moretto, M., & Traverso, R. (2021). Cultural and creative industries in the face of COVID-19: An economic impact outlook. Published online. https://unesdoc.unesco.org/ark:/48223/pf0000377863
van Laar, E., Van Deursen, A. J., Van Dijk, J. A., & De Haan, J. (2020). Measuring the levels of 21st-century digital skills among professionals working within the creative industries: A performance-based approach. Poetics, 81, 101434. https://doi.org/10.1016/j.poetic.2020.101434
Vial, G. (2019). Understanding digital transformation: A review and a research agenda. Journal of Strategic Information Systems, 28, 118–144. https://doi.org/10.1016/j.jsis.2019.01.003
VVA, Hausemer, P., Richer, C., Klebba, M., & Amann, S. (2021). Creative FLIP final report work package 2: Learning. Skills needs and gaps in the CCSI. http://creativeflip.creativehubs.net/wp-content/uploads/2021/07/FINAL-WP2_FinalReport-on-Skills-mismatch-2.pdf
Warner, K. S. R., & Wäger, M. (2019). Building dynamic capabilities for digital transformation: An ongoing process of strategic renewal. Long Range Planning, 52, 326–349. https://doi.org/10.1016/j.lrp.2018.12.001
Contents
1 Knowledge, Skills, and Competences (KSC) in the Knowledge-Based Economy
  1.1 The Crucial Role of Intellectual Capital (IC) in the Knowledge-Based Economy
  1.2 The KSC Framework: Multiple Perspectives
    1.2.1 Learning and Education
    1.2.2 Organizational Theory and Human Resource Management
    1.2.3 Labor Economics and Innovation
    1.2.4 Multi-level Policymaking
  1.3 The Need for Twenty-First Century Skills: From STEM to STEAM
  References
2 A Review of the Creative and Cultural Industries (CCI)
  2.1 Creativity: A Complex Definition
    2.1.1 Creative Knowledge, Skills, and Competences (CKSC)
    2.1.2 The Role of Creativity in the Twenty-First Century
  2.2 Creative and Cultural Industries (CCI)
    2.2.1 Boundaries and Sectors: A Contested Definition
    2.2.2 The Economic Contribution and Value of CCI
    2.2.3 Employment in the CCI Worldwide
  2.3 The Creative and Cultural Industries in Italy
    2.3.1 Main Subsectors and Contribution to the National Economy
    2.3.2 Creative and Cultural Employment in Italy
  2.4 The Need for a New Integrated Approach to KSC Development in the CCI in Italy
  References
3 Methodology and Empirical Strategy
  3.1 CV (or Resume) Analysis
    3.1.1 The Study of KSC Supply and Demand in the CCI: A Brief Literature Review
    3.1.2 CV Analysis and KSC Assessment
  3.2 Natural Language Processing (NLP)
    3.2.1 NLP: A Definition
    3.2.2 A Brief History of NLP Developments
    3.2.3 Main NLP Approaches
      3.2.3.1 Symbolic NLP Approaches
      3.2.3.2 Statistical NLP Approaches
      3.2.3.3 Connectionist NLP Approaches
    3.2.4 NLP Analysis and Applications
      3.2.4.1 Main Steps in NLP Analysis
      3.2.4.2 Levels of Analysis in NLP
      3.2.4.3 An Overview of Main NLP Applications
      3.2.4.4 Major Challenges in Applying NLP
    3.2.5 NLP and Topic Modeling
      3.2.5.1 NLP and LDA
      3.2.5.2 LDA Applications
    3.2.6 NLP in Management Science
      3.2.6.1 NLP in Accounting and Finance
      3.2.6.2 NLP in Marketing and Sales
      3.2.6.3 NLP in Supply Chain and Operations Management
      3.2.6.4 NLP in Strategic Management
      3.2.6.5 NLP in Sustainability Management
      3.2.6.6 NLP in Innovation and Information Management
      3.2.6.7 NLP in Human Resource Management (HRM)
    3.2.7 NLP and CV Analysis
  3.3 The Empirical Strategy
    3.3.1 Dataset Construction
      3.3.1.1 ESCO Skills, Knowledge, and Competences Taxonomy
      3.3.1.2 Mapping CV to ESCO and ATECO Classification
    3.3.2 Skills’ Extraction
      3.3.2.1 Creation of Vector-Based Representations of KSC
      3.3.2.2 Network Representation
    3.3.3 Clustering
  References
4 Creative and Digital Skills in Italian Cultural and Creative Industries
  4.1 Descriptive Analysis
  4.2 KSC Analysis
    4.2.1 Specific KSC Analysis
    4.2.2 Title KSC Analysis
    4.2.3 Type of KSC Analysis
    4.2.4 KSC Network Analysis
  4.3 Cluster Analysis
    4.3.1 KSC Cluster Analysis
    4.3.2 CV Cluster Analysis
  4.4 KSC Gap Between Demand and Supply
  Reference
Conclusion
Chapter 1
Knowledge, Skills, and Competences (KSC) in the Knowledge-Based Economy
Keywords Knowledge · Skills · Competences · Knowledge-based economy · Digital skills · Creative skills
1.1 The Crucial Role of Intellectual Capital (IC) in the Knowledge-Based Economy
Ever-changing technology and accelerating digitization have radically overturned business processes and consumer habits, calling for new approaches to managing organizations and for the development of new skills. Since digital transformation triggered the need for constant and relevant innovation in structure, products, processes, and business models to address the escalating knowledge intensity and globalization of economic activities, knowledge turned into the most fundamental economic resource (Houghton & Sheehan, 2000), leading to a new economic structure that places knowledge at its core: the knowledge-based economy. Furthermore, brand names, trademarks, copyright (Roslender & Fincham, 2004), goodwill and R&D (Lev et al., 2005), and company reputation and relationships (Fincham & Roslender, 2003) have grown into core resources and sources of value creation for firms. Advanced capabilities for acquiring, managing, and organizing all this knowledge, and for effectively selecting and communicating information, must therefore complement increasingly efficient information technology. This demands new business models that place the human element at the very core of a firm’s sustainable growth and take advantage of cutting-edge technology and digital assets for fast, on-the-spot, and relevant responses to external changes. As adaptation and change become crucial to long-term survival, information and knowledge become the main drivers of economic value creation and growth as well as determinants of a company’s life (Paoloni et al., 2020). As a consequence, a new type of economy has emerged, the knowledge-based economy (Clarke & Clegg, 2000, p. 45), where intellectual capital (IC), defined as the sum of an organization’s human, structural, and relational capital (Barrena-Martínez et al., 2020), and, especially, its human capital (HC), comprising individual knowledge and skills
(Harris, 2000), become fundamental assets (Paoloni et al., 2020) for enhancing a firm’s growth, competitiveness, and long-term sustainable performance. Therefore, since digital transformation is a multifaceted concept, it may also be seen as a paradigm shift because it triggers new “means of understanding the world and basis for informing action” (Clarke & Clegg, 2000, p. 45). This demands the new skills and knowledge necessary to build business models and innovation-based ecosystems that can address a disruptive external environment as well as constantly changing technology and habits (Correani et al., 2020). Businesses operate in a highly interrelated ecosystem of stakeholders that affect business operations and dictate the rules of the game and of survival, given the increasing ephemerality of business structures and boundaries (Powell, 1990) in the quest for constant knowledge sharing and cutting-edge innovation. In this context, intellectual capital (IC), which can also be defined as the combination of an organization’s structural, human, and customer capital (Bellucci et al., 2021), has turned into a strategic asset and value-adding resource in addition to natural, capital, and labor resources, leading to the rise of the knowledge economy (Cronje & Moolman, 2013). As IC becomes a major determinant of an organization’s market value (Cronje & Moolman, 2013, p. 1) and competitive advantage, the human factor (Roslender & Fincham, 2004) turns into a key asset for organizations as well. Whereas in the past management mainly focused on production, the creation and development of IC has become fundamental to sustain competitiveness and generate economic value (Chen et al., 2005, pp. 1–2).
Since higher levels of IC have been positively associated with higher employment and growth, with creativity expected to become a key driver of innovation (EC, 2010), the individual KSC of employees, which collectively add to a firm’s HC (and thus, IC), acquire utmost importance as crucial catalysts for organizational value and sustainable competitive advantage (Houghton & Sheehan, 2000). This makes the study of IC within organizations vital in order to understand the growing need for creative and digital synergies in organizational innovation, human resource management, education and learning, and policymaking.
1.2
The KSC Framework: Multiple Perspectives
The definition of KSC has been extensively debated in educational, managerial, innovation-, and policy-related literature. Most literature on KSC development relates IC to a firm’s competitive and innovation performance, portraying it as an important determinant of its “survival and growth” (Bellucci et al., 2021, p. 744) or as a main enhancer of its competitive advantage and innovation processes (Rogo et al., 2014). Since IC is also described as “knowledge that can be converted into value” (Edvinsson & Sullivan, 1996, p. 361), Bellucci et al. (2021) define IC as an intangible asset and form of tacit knowledge that can be further segmented into five capital-generating components: human, structural, organizational, process, and customer capital. Nahapiet and Ghoshal
(1998, p. 253) introduce a social perspective on IC, explaining it as “the knowledge and knowing capability of a social collectivity,” such as an enterprise or a community. From this viewpoint, the notion of IC is strictly related to the concept of HC, since the latter constitutes one of the former’s components (Bellucci et al., 2021) and results from the accumulation of individual KSC (Becker, 1994; Khalique et al., 2011). Harris (2000) explains HC as “the acquired skills, knowledge, and abilities of human beings” (p. 24), specifying that KSC (and, consequently, HC as well) can be acquired and improved over time. Indeed, Adam Smith was one of the first to recognize the socioeconomic benefit and competitive advantage resulting from an individual’s education and knowledge as well as the costs related to the acquisition of talents, defining human talent as “capital in [a] person [. . .] part of his fortune and likewise that of society” (Smith, 1776, p. 217) and introducing a cost–benefit dichotomy into IC research. As a result, the investment perspective on human capital gained relevance: investments in education and training on the part of the individual, the firm, or the government could result in human and social capital growth. An organizational and human resource management approach to IC emerged, focusing on human capital (HC) as the individual KSC of each employee (Mouritsen et al., 2001b). Since HC is embodied (Becker, 1994), in that a person is inseparable from his/her KSC and vice versa, it is the uniqueness of HC, rather than its value, that directly impacts innovation (Cabello-Medina et al., 2011). This makes education crucial for KSC development (Becker, 1994), and consequently, HC improvement. While Schultz (1972) stressed the significant effect of leisure on education forgoing, the World Economic Forum (WEF, 2020) underlined that both formal and informal training are effective in KSC acquisition. 
IC is a tripartite concept, comprising knowledge in the form of people, structures, and relationships (Delgado, 2011) as the summa of an organization’s human, structural, and relational capital. In particular, structural capital encompasses “intellectual property, methodologies, software, documents, and other knowledge artifacts” (Stewart, 2007, p. 13), processes included (p. 124), constituting a framework for knowledge distribution and sharing (Harris, 2000), fostering HC (Edvinsson, 1997) as a result, whereas relational (Chen et al., 2005; Delgado, 2011) or customer (Harris, 2000) capital identifies a firm’s ties with external stakeholders, such as the community and its competitors (Tumwine et al., 2012), being related to the capacity of nurturing relationships with external stakeholders (Kamukama, 2013). Indeed, Djumalieva and Sleeman (2018) prove that KSC can be grouped into hierarchical clusters according to the strength of their interrelation. Hence, innovation results from the successful and original exploitation of the synergies coming from all three components of an organization’s IC (Subramaniam & Youndt, 2005). If the capacity of constantly integrating and acquiring new knowledge is crucial for economic growth and innovation in the knowledge-based society (Nonaka & Nishiguchi, 2001), IC becomes a fundamental driver of value and competitive advantage creation through its human, structural, and relational capital. Thus, it is essential to understand which KSC should be developed to do so and further enhance an organization’s IC to thrive in this new turbulent era.
1.2.1
Learning and Education
Schultz (1972, p. 25) claims that education and HC investment may trigger increases in output “by generating new ideas and techniques [..]; by improving links among consumers, workers and managers; and by extending the useful life of the stock of knowledge and skills.” Hine et al. (2007) add that such a value is not only discrete but also collective: an individual’s KSC are not only valuable for the individual but for the overall society, adding to the overall social capital by accumulation. If HC is defined as “the investment in human resources to increase their efficiency [. . .] for future use” (Pasban & Nojedeh, 2016, p. 250), social capital identifies both the compendium of all individuals’ HC and the value added by their interaction and cooperation. As such, HC becomes a crucial source of competitive advantage and wealth in the “knowledge society” (Ananiadou & Claro, 2009), asking education to update accordingly. Martínez de Morentin de Goñi (2006) stresses the importance of framing education as a multidisciplinary process, which goes beyond the development and acquisition of professional qualifications and promotes civic, moral, social, and cultural attitudes for daily and social life as well. Indeed, Etzkowitz and Leydesdorff (2000, p. 109) prove the positive impact of university on innovation, with regional knowledge-intensive clusters comprising the state, academia, and industry players (p. 111) having an enhancing effect. Hence, education positively affects not only the personal development and growth of the individual but also sustainable economic, social, environmental, and cultural development. As the knowledge-based society is characterized by and needs higher levels and continuous cycles of information production, sharing, and consolidation, high-quality education becomes vital for social and economic growth (European Commission, 2017) as it may empower individuals with the possibility of accessing constantly updated information and knowledge. 
Winterton et al. (2006) introduce a matrix defining four main types of competences according to their scope and breadth. In terms of scope, competences may focus on either job-related matters (occupational) or individual development and effectiveness (personal), while in terms of breadth, they can be associated with either the cognitive sphere (conceptual) or practical efficiency (operational). Occupation-related competences can be classified as knowledge (if cognitive) or skills (if functional), while personal competences usually deal with individual learning processes and with the dispositions that facilitate learning, social attitudes, and behavior (Winterton et al., 2006). Such a classification recalls the hard–soft skill dichotomy introduced by Salman et al. (2020), who classify hard competences as either knowledge-related or skill-related, and soft competences as either behavior-related or self-actualization-related. Unlike Winterton et al. (2006), Salman et al. (2020) frame competences within the premise of sustainable development, stressing the criticality of not only social skills but also ethical, cross-cultural, and emotional competences, while Hodges and Burchell (2003) describe them as context-related, that is, long-lasting individual characteristics that result in effective
performance given normal conditions, so they can be categorized according to their level of salience. Considering five major types of competences, Spencer and Spencer (2003, p. 11) introduce the Iceberg Model of Competency, classifying competences as either visible (i.e., skill and knowledge), and so easier to teach, acquire, and assess, or hidden (i.e., self-concept, trait, and motive), which require more time and effort to develop and adjust because they are more closely related to an individual’s core personality. Bogers et al. (2018) advocate for absorptive capabilities to acquire external knowledge and bring it into a workplace and organization, stimulating business innovation. Barrena-Martínez et al. (2020) claim that absorptive capacity has become a knowledge-based asset because it encompasses collaborative skills, such as the effective acquisition, assimilation, and exploitation of external knowledge, technology, and ideas, which can turn inter-organizational knowledge into innovation (Xie et al., 2018). Since people can acquire KSC not only during their academic education but also through their daily experiences and interactions, the knowledge-based economy asks for education to be a wide-ranging, multifaceted, and ongoing process. Educational curricula should focus on cross-integration and cross-interaction across disciplines, teaching cross-field KSC that are easily transferable across subjects, contexts, and time frames (Ananiadou & Claro, 2009). At the same time, since lifelong learning has become crucial in the knowledge society, Blundell et al. 
(1999) study KSC acquisition through formal and informal methods, underlining the importance of both formal education and on-the-job training on competence development, as human capital may be built and enriched through three main factors: “early ability (acquired or innate), qualifications and knowledge acquired through formal education, and skills, competencies and expertise acquired through training on the job” (p. 2). EMSI also highlights how communication failure due to mismatching KSC taxonomies between workers, educators, and organizations presses for the development of a common KSC language and framework, proposing a new one. Based on personal elaboration, Table 1.1 summarizes the educational and learning perspective on KSC.
1.2.2
Organizational Theory and Human Resource Management
As KSC become crucial for social, economic, and individual growth in digital transformation, they turn into critical organizational assets as well. If intellectual capital (IC) constitutes a new source of a company’s goodwill (Fincham & Roslender, 2003), the development and improvement of this new type of intangible asset become critical for improving business performance (Buenechea-Elberdin, 2017). Berzkalne and Zelgalve (2014) claim that IC could explain the discrepancies
between the market and book value of an organization, as IC assets may enhance profitability by boosting efficiency and productivity. Since IC is positively associated with a firm’s competitiveness, differentiation from competitors, and monetary value, if properly aligned with the overall corporate strategy (Brown et al., 2005), it can be labeled as its “knowledge-based equity” (Maditinos et al., 2011, p. 58), consisting of a company’s structural, human, and relational capital. At an organizational level, the firm’s processes and culture establish its structural capital, relationships and exchanges with external stakeholders such as customers, competitors, institutions, and suppliers its relational capital, and the “cumulative tacit knowledge skills” of its human resources its human capital (Cricelli et al., 2014). Hence, assessing owned and needed resources may be beneficial for a firm to discover its sources of value creation, diversification, and competitive advantage, as well as to detect potential resources to acquire for future innovation and profitability. Wernerfelt (1984) develops a resource-based view of the firm, claiming that companies should divert their strategic focus from products to resources that could optimize their product and market activities (p. 171). Barney (1991) classifies resources as related to physical capital, human capital, or organizational capital, alleging that not only technology but also the training of, the relationships between, and experiences of workers as well as the business model, structure, and processes could be major sources of value creation for an organization. If resources must be valuable, rare, inimitable, and non-substitutable to do so, HC perfectly fits the description, so it may drastically boost competitive advantage and value creation (Barney, 1991). 
Table 1.1 Educational and learning perspective on KSC. Personal elaboration
Knowledge
● Knowledge comprises the set of information (principles and facts) acquired about a specific area through both formal and informal learning processes
● Knowledge can be classified as a visible competence, so it is easier to learn and develop
● Knowledge can be defined as a cognitive occupation-related competence
Skills
● Skills encompass the demonstrated set of learned abilities and organized sequence of actions for a specific purpose in a specific context
● Skills can be classified as visible competence, so they are easier to learn, develop, and lose
Competences
● Competences identify the combination of skills and knowledge, self-concept, traits, and motives. Involving individual characteristics, competences require more time to be learnt and developed
● Competences may relate to the cognitive sphere (conceptual) or practical efficacy (operational)
● Competences can be both hard and soft: the former can be either skill- or knowledge-related, while the latter deal with either individual behavior or self-actualization (Salman et al., 2020)
Since the employees’ KSC and experiences constitute an organization’s human capital (HC), highly skilled personnel and in-house knowledge have a direct positive effect on profitability (Wernerfelt, 1984, p. 172). Indeed, the resource-based approach (Helfat & Peteraf, 2003) ascribes competition to the heterogeneity
and varied distribution of capabilities and resources across businesses, stressing that competition is highly likely when they differ in owned capabilities. Yet, HC is an extremely movable asset, as workers may decide to leave a company, bringing with them their unique set of KSC (Barney, 1991). This implies that human resource management must also favor the development and enhancement of human resources through valuable training, motivational working environments, and career growth opportunities to retain them. However, Aloini et al. (2017) prove that excessive protection methods of intangible assets may actually hinder rather than favor innovation. So, if continuous learning and innovation as well as communication should be incentivized as they improve HC, and, eventually, long-term competitiveness in a globalized world (Buenechea-Elberdin, 2017), Storck and Hill (2000) also propose more open-ended and network configurations through intra-firm alliances as a successful alternative to leverage and protect firm-specific resources at the same time, supporting the establishment of strategic communities of dispersed human resources to foster innovation with the IT industry as a reference. If HC embedded in an organization’s workforce directly affects a firm’s potential for value creation (Chen et al., 2005, p. 160), profitability, and revenue growth (p. 174), it may also do so through exchanges and interactions in cross-firm configurations. Harris (2000) ascribes the positive relationship between structural and HC to the fact that the former provides “the framework and patterns for the transmission of knowledge [. . .] to maximize [. . .] human capital” (p. 26). Hence, knowledge management and human resource management become essential for improving innovation (Paoloni et al., 2020). 
Nevertheless, if the resource-based approach may explain the importance of HC, it seems a poor methodology to support the evaluation and assessment of KSC needed in the digital transformation era, where constantly changing technology, customer demand, and habits require extreme agility, continuous innovation, and fast responses, urging organizations, managers, and workers to regularly update their KSC to succeed. According to Helfat and Peteraf (2003), both the internal and external environment deeply affect the development and growth of KSC within organizations. While the resource-based method adopts a discrete approach, providing a snapshot of the set of routines that characterize a company in a given point of time, a more dynamic perspective may be more informative as it accounts for the evolution of KSC over time, more effectively supporting business decision-making in more uncertain and volatile environments. If Schumpeter considers that innovation is engendered by novel combinations or re-combinations of knowledge in a new context or channel (Langlois, 2002), Nahapiet and Ghoshal (1998) stress that the combination and exchange of resources, including IC, are essential for organizational learning and innovation, with motivation, expected value creation, and opportunity arousal as moderating factors. Since IC positively affects an organization’s market value and financial performance (Brown et al., 2005, p. 1), Teece et al. (1997) apply a dynamic approach to KSC development, advocating for the rising strategic importance of acquiring dynamic capabilities, namely the “ability to integrate, build, and reconfigure internal and external competences to address rapidly changing environments [. . .] to achieve new and innovative forms of competitive advantage”
(p. 516). Furthermore, they considered such an approach more effective for network business models like the business ecosystem, as dynamic capabilities such as “innovation capabilities, environmental scanning and sensing capabilities, and integrative capabilities” (Helfat & Raubitschek, 2018) are critical in contemporary multi-sided (Vial, 2019) competition frameworks. Both Senge (1990) and Harris (2000) combine this dynamic view of organizational capabilities with system theory: the former underlines the crucial role of dynamic capabilities like team learning, personal mastery, system thinking, and shared vision in building learning organizations, whereas the latter highlights the direct positive effect of the leaders’ and managers’ KSC on the successful implementation of system-wide structures and sharing of information. Pike et al. (2005) add that IC is not only strategic, but it may also create value if “used in combinations” (p. 491), encouraging the adoption of a multi-stakeholder perspective within the company, whereas Djumalieva and Sleeman (2018) reveal the existence of networks of interrelated KSC. Nahapiet and Ghoshal (1998) claim that exchange and combination are pivotal conditions for IC development and improvement within organizations. Warner and Wäger (2019) also emphasize the strategic significance of dynamic capabilities in digital transformation, which they define as a continuing process through which organizations implement digital technologies in their daily functioning and which underpins agile capabilities for strategically renewing their business model and culture and transitioning toward a more collaborative approach. 
In particular, dynamic capabilities like digital sensing, digital seizing, and digital transforming are fundamental to address a constantly changing external environment and facilitate strategic renewal to tackle digital transformation, if effectively supported by internal enablers such as cross-functional teams, fast decision-making processes, and executive support (Warner & Wäger, 2019, p. 336). Furthermore, a vision-oriented organizational culture and continuous organizational learning processes aimed at developing flexible and agile response patterns are set to become critical as well (Dierkes et al., 1998). The dynamic approach highlights the relevance of recording and analyzing “capability lifecycles” (CLC; see Helfat & Peteraf, 2003, p. 997), which depict the evolution and growth patterns of individual and accumulated capabilities within an organization or network over time. This allows businesses to notice potential spillover and branching effects stemming from an originally acquired single capability. Zott (2003) adds that dynamic capabilities may be studied in terms of not only the timing necessary for their implementation and deployment with a business entity but also the cost of variation (i.e., cost of imitation and experimentation) and learning (i.e., to imitate or to experiment) to explain intra-industry variations in firm performance. He also emphasizes the negative correlation between organizational learning and the costs related to resource deployment: as the former increases, the latter decreases. So, upward-sloping learning curves may result from continuous and lifelong learning within organizations, resulting in lower innovation and technology costs for the firm in the long term as human resources are provided with the necessary dynamic capabilities to rapidly sense and seize innovation and adapt accordingly at any time. This makes human
resources and continuous learning crucial to organizations (Teece et al., 1997), urging them to acquire “lifelong learning capacity” (Csapó et al., 2012, p. 158). Based on personal elaboration, Table 1.2 summarizes the organizational theory and human resource management perspective on KSC.
Table 1.2 Organizational theory and human resource management perspective on KSC. Personal elaboration
Knowledge
● Organizational knowledge can be both internal (resulting from the cumulation and interaction of the individual knowledge of all human resources) and external (resulting from external exchanges and relationships)
● As the accumulation of employees’ KSC, organizational knowledge adds to an organization’s human capital (HC). So, it is an organizational resource and intangible asset, in addition to other physical and organizational capital resources
● In-house knowledge becomes a major source of competitive advantage and positively affects profitability and potential for value creation
● Organizational knowledge is dynamic, so it can be described in terms of lifecycles. Upward-sloping learning curves may result from continuous and lifelong learning within organizations
Skills
● Organizational skills identify the cumulative skills of all human resources of a firm, thus adding to its human capital (HC)
● Hence, organizational skills are organizational resources, adding to the intangible assets of a company, in addition to its physical and organizational capital
● Highly skilled personnel become a major source of competitive advantage and positively affect profitability and potential for value creation
● Organizational skills are dynamic, so they can be described in terms of lifecycles. Upward-sloping learning curves may result from continuous and lifelong learning within organizations
Competences
● Organizational competences identify the aggregation of knowledge and skills of an organization, given the unique structural capital of an organization (e.g., frameworks and patterns for knowledge transmission), so they directly relate to knowledge management and human resource management
● Organizational dynamic capabilities (i.e., team learning, personal mastery, system thinking and shared vision, digital sensing, digital seizing, and digital transforming) of both employees and managers are main sources of competitive advantage, and, consequently, strategic
● Organizational competences are dynamic, so they can be described in terms of lifecycles. Lifelong learning capacity is crucial in digital transformation
1.2.3
Labor Economics and Innovation
Mouritsen et al. (2001a) use the notion of “capable firm” to identify an organization prospering thanks to its IC, that is, its unique combination of human, structural, and relational capital, with IC (and HC, specifically) directly affecting firm performance (Clark et al., 2011). Pasban and Nojedeh (2016, p. 250) consider HC as a pre-investment incurred by an organization today for increasing the efficiency of human resources in the future. This is strictly related to the investment approach to IC and HC and asks for new types of contracts (Mouritsen et al., 2001b) between
labor and management. Nerdrum and Erikson (2001) consider the individual marginal product of a worker (i.e., their wages) as the function of the quantity of working hours, the sum of diverse human capital goods, and other capacities that may positively or negatively affect their individual productivity (p. 129), declaring that both formal education and informal training on the job add to IC (p. 131). In studying the relationship between HC and labor supply, Blinder and Weiss (1975) develop a lifecycle human investment model based on an individual’s utility-maximizing allocation of daily time on leisure, work, and education, while Imai and Keane (2004) state that human capital accumulation directly determines an individual’s labor supply with age playing a moderating role due to intertemporal elasticities of substitution, so that momentary wage variations may modestly impact on labor supply at younger age ranges. If investment in HC directly affects an organization’s potential for innovation and economic growth, it also comprises employees’ education, work-related competences, and psychometric assessments (Tumwine et al., 2012). Therefore, HC can be considered as an investment in education and training that is undertaken by the individual, the firm, and the government toward its improvement (Becker, 1994). Vincent (2008) studies the dichotomy of competence and capability, describing the former as an individual’s know-how and the latter as a collaborative functional process for the effective deployment of the former. While competences are specific to a certain workplace, so that they may apply only to specific contexts, environments, circumstances, or events and may be assessed against standards, capabilities are not context-specific as they identify processes, thus being applicable in a large set of circumstances (Vincent, 2008). 
Since the former are difficult to imitate by competitors because they concern the capacity of performing a task by integrating knowledge and skills through independent individual action (Méhaut & Winch, 2012; Prahalad & Hamel, 2003), human resources, as holders of these skills, become important value-creating assets in innovation. Cabello-Medina et al. (2011) prove the direct and positive effect of HC on firm innovativeness, especially if coupled with social capital and human resources management practices, whereas Chen et al. (2005) classify IC as either internal or external, claiming that the interaction between the two types could foster successful innovation. Buenechea-Elberdin (2017) also demonstrates that knowledge resulting from internal (interactions among employees) and external relational capital remarkably increases HC, with Cabrilo et al. (2020) remarking that it is the synergy between the two that fosters organizational innovation. So, individual KSC become key elements for innovation purposes through both their discrete and cumulative effect, highlighting not only the need for lifelong learning for continuously learning, updating, and acquiring new types of KSC but also the strategic importance of their unique combination for sustained competitive advantage. Educational diversity of a firm’s employees has been proven to be positively associated with organizational open innovation (Bogers et al., 2018), so that recruitment should be done accordingly. At the same time, CEO’s characteristics may influence the transition toward more innovative business models, solutions, and processes, as they may facilitate or hamper possibilities for knowledge flows,
information sharing, and collaboration through their attitude, entrepreneurial orientation, patience, and education (Bogers et al., 2018). Li et al. (2020) highlight the centrality of creative stars in promoting innovation at both individual and aggregate levels during teamwork, pushing other employees toward innovative thinking and creativity and encouraging product and process innovation in their organization. As such, KSC are crucial for innovation not only at an individual level but at an aggregate (organizational) level as well, since the KSC of each individual may aggregate to foster organizational innovation through employees’ and CEO’s knowledge and educational diversity. Nahapiet and Ghoshal (1998) introduce the notion of “social embeddedness of intellectual capital” to highlight the positive impact of network configurations on resource access and information communication, alleging that more various and higher quality exchanges and (re-)combinations of IC may result from higher levels of social capital. Storck and Hill (2000) also emphasize how creating a “strategic community” through “dispersed human resources” positively affects IC deployment and development, and, consequently, organizational performance in the IT industry. Cappiello et al. (2020) define a cluster as “a dense geographic concentration” (p. 422) of interrelated firms aimed at enhancing innovation and competitiveness of both the organizations and the region, urging policymakers to promote these kinds of initiatives. This is even more relevant for small-and-medium enterprises (SMEs) that may be able to overcome their resource limitations, in terms of both tangible and intangible assets, by sharing information, knowledge, and talent for innovation purposes with other stakeholders in more informal business models and collaborative ecosystems (Demartini & Beretta, 2020). 
Hence, from the demand side, highly skilled entrepreneurs (Bublitz et al., 2015) and places with higher human capital concentration (Berry & Glaeser, 2005) are more likely to increase labor demand for talent with higher levels of KSC. By analyzing job adverts, Djumalieva and Sleeman (2018) explain that networks of skills also exist, in which skills are the vertices and the links (edges) between them reflect how frequently the skills co-occur and, therefore, how strongly they are related. Grouping KSC into hierarchical clusters, they show how not only the single skill but also synergies and hierarchies between KSC are significant, proposing a novel KSC taxonomy. In conclusion, unlike the canonical model, which classifies the workforce’s skills as either low or high, the knowledge-based economy builds on more flexible, agile, dynamic, reactive, collaborative, and integrative skills that can integrate multiple perspectives together for constant innovation. In particular, “knowledge leaders” (Storck & Hill, 2000) are great facilitators of knowledge acquisition, assimilation, and exchange within a firm and across organizations, underscoring the role of a common organizational culture in the creation of effective innovation communities. Based on personal elaboration, Table 1.3 summarizes the labor economics and innovation perspective on KSC.
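The idea of a skill network built from job adverts can be illustrated with a minimal sketch. It is not the procedure used by Djumalieva and Sleeman (2018): the adverts are hypothetical toy data, the Jaccard index is an assumed measure of link strength, and the grouping step uses simple connected components at two thresholds as a stand-in for a proper hierarchical clustering.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: each advert is the set of skills it mentions.
ADVERTS = [
    {"python", "sql", "data visualization"},
    {"python", "sql"},
    {"python", "machine learning", "sql"},
    {"photo editing", "video editing"},
    {"photo editing", "video editing", "social media"},
    {"social media", "copywriting"},
]

def cooccurrence_graph(adverts):
    """Count adverts per skill (vertices) and per unordered skill pair (edges)."""
    skill_counts = Counter()
    edge_counts = Counter()
    for skills in adverts:
        skill_counts.update(skills)
        for a, b in combinations(sorted(skills), 2):
            edge_counts[(a, b)] += 1
    return skill_counts, edge_counts

def jaccard_strength(skill_counts, edge_counts):
    """Assumed link strength: adverts with both skills / adverts with either."""
    return {
        (a, b): n / (skill_counts[a] + skill_counts[b] - n)
        for (a, b), n in edge_counts.items()
    }

def clusters(strengths, threshold):
    """Connected components after dropping edges weaker than the threshold."""
    neighbours = defaultdict(set)
    for (a, b), s in strengths.items():
        if s >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

skill_counts, edge_counts = cooccurrence_graph(ADVERTS)
strengths = jaccard_strength(skill_counts, edge_counts)
for level in (0.3, 0.6):
    # A lower threshold gives broad groups; a higher one gives tight sub-groups.
    print(level, clusters(strengths, level))
```

With these toy adverts, the looser threshold separates a data-oriented group from a content-oriented group, and the stricter threshold keeps only the most tightly linked pairs inside each of them, which gives a rough sense of how nested skill clusters can emerge from co-occurrence alone.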
Table 1.3 Labor economics and innovation perspective on KSC. Personal elaboration
Knowledge
● Knowledge can be either internal (resulting from the interaction of individual knowledge of employees) or external (resulting from relationships with external stakeholders). The unique combination of both types and the educational diversity of employees positively affect and drive innovation
● Depending on its level of codification, knowledge can be tacit or explicit. Knowledge adds to an organization’s human capital (HC), and, therefore, intellectual capital (IC), underpinning innovation potential
● Cross-disciplinary knowledge, in addition to cross-disciplinary skills (and therefore, competences), is essential for innovation
Skills
● Skills are dynamic so they can be learnt, developed, and lost, and are context-specific
● Dynamic, collaborative, and reactive skills, promoting flexibility, agility, and system thinking, are crucial to promote innovation
● Skills add to an organization’s human capital (HC), and, therefore, intellectual capital (IC), underpinning innovation potential
Competences
● Competences identify the individual know-how of employees and managers, so they are specific to a certain occupation and can be assessed against standards
● Competences relate to the capacity of performing a task by integrating knowledge and skills through independent individual action, so they are difficult for competitors to imitate, constituting a main source of sustained competitive advantage
● Competences add to an organization’s human capital (HC), and, therefore, intellectual capital (IC), underpinning innovation potential
1.2.4
Multi-level Policymaking
If IC is a major source of value creation for businesses, its positive effect has also been demonstrated at the aggregate level, as an important driver of national development (Chen et al., 2005, p. 174). Glewwe (2002) advocates for the role of high-quality education in economic growth, calling on public policy to promote high levels of educational attainment in emerging countries (p. 1), while the OECD (2021) urges countries to ensure a smooth transition toward a knowledge-based economy in order to prosper in the future. KSC-related policymaking may take either an entrepreneurial or a macroeconomic perspective, depending on whether value is created at the firm level or at the community/national level (Roberts & Townsend, 2016), respectively. If new KSC, expanding beyond mere technological knowledge to include more dynamic, reactive, and multidisciplinary capabilities, must be urgently taught, developed, and acquired by the workforce to meet digital transformation, national and international policymaking should focus on renewing educational and training systems accordingly to grant sufficient living standards to everyone (UNCTAD, 2008). The World Bank proposes a Knowledge Assessment Methodology (KAM) to help countries transition toward a knowledge-based economy by directing policies and investments toward the four main pillars of the knowledge economy: economic and institutional regimes, education and skills, information and communication
infrastructure, and innovation system (World Bank Institute, 2010, p. 1). All four pillars are meant to incentivize and ease the acquisition and development of new knowledge as well as the dissemination and processing of information. Since the pillars aim to help people gain the skills needed to thrive in the digital and innovative twenty-first century (Ananiadou & Claro, 2009), and since contemporary learning processes encompass both formal and informal educational approaches (e.g., educational and training institutions and daily leisure activities, respectively), education must evolve from being a chapter of an individual’s existence to becoming a never-ending, continuing process (Lapiņa & Aramina, 2011). This demands a lifelong perspective on education from policy as well, which must frictionlessly combine the formal and informal (Méhaut & Winch, 2012) and the online and offline dimensions of education, integrating life skills (e.g., decision-making, learning from failure, information and stress management, interdisciplinary and interpersonal skills; Seltzer & Bentley, 1999) into curricula. So, the development of collective intellectual systems and innovation capabilities (Secundo et al., 2016) becomes a central topic in public policymaking, implying a coordinated multi-level action across all involved stakeholders. Not only the time horizon and content but also the experience of learning must change through policy. According to Secundo et al. (2016), the establishment of integrated networks of multiple stakeholders through knowledge-intensive cooperation among educational institutions (universities), local entities, and enterprises may result in higher quality education, and, therefore, more highly skilled HC, and, eventually, more relevant innovation. Hence, universities can foster the development of future-proof KSC if they work within an ecosystem of relationships with multiple stakeholders that promotes fast and reliable knowledge sharing.
In doing so, universities may turn into “loci of knowledge” (Cricelli et al., 2018, p. 72), where IC (and, thus, HC) is enhanced and expanded by the convergence and interaction of tacit and explicit knowledge of professors, researchers, students, managers, and staff (Ramírez Córcoles, 2013). Moreover, structural capital can further reinforce and enable innovation and research when innovative knowledge and learning processes are introduced and improved, and relational capital can do so when networks and relationships with organizations, governments, the public, and other external partners are established and nurtured. Etzkowitz and Leydesdorff (2000) corroborate the positive effect on innovation of tripartite “knowledge-intensive” systems (p. 109) consisting of universities, industries, and governments. In 2020, the World Economic Forum (WEF) proposed an Education 4.0 Framework to support institutions and governments in implementing these changes. The framework sets out four new skill sets (technology, interpersonal, innovation and creativity, and global citizenship skills) and four new types of learning experiences (personalized and self-paced, accessible and inclusive, problem-based and collaborative, lifelong and student-driven) to make education lifelong and beneficial for digital transformation (WEF, 2020). In doing so, international, national, and regional policy may increase both economic and social capital in addition to IC and HC. Besides education and organizational innovation, policy should promote more efficient and effective accounting procedures for measuring the value of IC (Alcaniz
Table 1.4 Policy perspective on KSC. Personal elaboration
Knowledge
● Knowledge comprises the set of principles, theories, practices, and facts that is structured to apply to a specific field and underpins skills. Hence, knowledge defines the proper conduct of a job
● Knowledge-intensive clusters comprising multiple stakeholders (i.e., universities, governments, businesses) significantly promote innovation and related cross-disciplinary education
Skills
● Skills are associated with applied knowledge to perform a task or solve a problem. So, skills determine the ability to perform a task or a job properly
● New skills gain relevance in the digital transformation: technology, interpersonal, innovation and creativity, and global citizenship skills
Competences
● Competences identify the collection of skills and knowledge as well as social, ethical, personal abilities, which apply to both professional and personal development
● Lifelong, collaborative, cross-disciplinary, and problem-based competences become crucial in the twenty-first century
et al., 2011). Unlike financial accounting, IC accounting deals with more informal and free “mechanisms of value creation” (Mouritsen et al., 2001b, p. 400), calling for an original “intellectual capital approach” (Pike et al., 2005). Although financial capital identifies the net present value of a firm while IC describes how “value is created and transformed” within it, both approaches focus on stakeholder value creation. Since financial performance is significantly and mutually related to IC (Tanideh, 2013, p. 1) in the knowledge-based economy, IC accounting must combine both financial and non-financial indicators to define the set of actions that led to the creation and transformation of value. The value-added intellectual coefficient (VAIC) was introduced to measure IC at an organizational level for guiding the management of intangible assets and resources. Yet, Ståhle et al. (2011) demonstrate the inconsistency of this methodology by testing it on a sample of 125 companies in Finland. Their research shows that one of the major hurdles in making VAIC reliable is the misuse of the notion of IC, so international policy is needed to set a common framework for IC definition, interpretation, and measurement. Based on personal elaboration, Table 1.4 summarizes the policy perspective on KSC.
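The chapter does not spell out how VAIC is computed. As a minimal sketch, the snippet below assumes the commonly cited formulation usually attributed to Pulic, in which VAIC is the sum of human capital efficiency (value added divided by human capital costs), structural capital efficiency (structural capital divided by value added, with structural capital taken as value added minus human capital costs), and capital employed efficiency (value added divided by capital employed). The function name and the figures are hypothetical and serve only to show the arithmetic.

```python
def vaic_components(value_added, human_capital, capital_employed):
    """Return the three efficiency ratios and their sum (VAIC).

    Assumed formulation (not taken from this chapter):
      HCE = VA / HC, SCE = (VA - HC) / VA, CEE = VA / CE, VAIC = HCE + SCE + CEE
    """
    if min(value_added, human_capital, capital_employed) <= 0:
        raise ValueError("all inputs must be positive for the ratios to be meaningful")
    structural_capital = value_added - human_capital
    hce = value_added / human_capital        # human capital efficiency
    sce = structural_capital / value_added   # structural capital efficiency
    cee = value_added / capital_employed     # capital employed efficiency
    return {"HCE": hce, "SCE": sce, "CEE": cee, "VAIC": hce + sce + cee}


# Hypothetical figures, in arbitrary currency units.
result = vaic_components(value_added=200.0, human_capital=100.0, capital_employed=400.0)
print(result)
```

Because every input comes from financial statements, the sketch also makes visible why the measure depends so heavily on how IC is defined: HC is proxied by personnel costs and structural capital is simply a residual.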
1.3 The Need for Twenty-First Century Skills: From STEM to STEAM
Constantly changing technology and digital transformation have radically altered competition, operations, and cooperation patterns, asking for continuous learning and more collaborative KSC development. Teece and Linden (2017) introduce the term “next-generation competition” to describe a type of competition that is more dynamic, occurs in the present globalized world, and deals with multi-component
innovation by relying on business ecosystems and organizational dynamic capabilities to thrive. Dynamic capabilities help a company coordinate its tangible and intangible assets in innovative ways and create interlaced collaborations and alliances (Teece & Linden, 2017) that jointly exploit existing internal and external resources to build difficult-to-imitate combinations of knowledge and skills; they are therefore set to gain increasing relevance as main sources of competitive advantage (Teece et al., 1997). Griffin et al. (2012) ascribe economic growth in the new economy to the “synergy between new knowledge and human capital” (p. 4), which confirms KSC development as crucial for innovation, and define twenty-first century skills as those “skills that are essential for navigating the twenty-first century.” Twenty-first century skills are therefore the skills and competences that emerged to meet new economic and social models that place knowledge at their core, superseding the previous focus on industrialized production. By developing these skills, people can contribute to economic development in the new century and better meet the labor demand as well as the educational and innovation requirements of a knowledge-based economy. Indeed, van Laar et al. (2020) describe twenty-first century skills as those necessary for success in education and the workplace in contemporary society, stressing that skills are so interconnected and interwoven in the knowledge society that they cannot be studied only in isolation but must also be approached comprehensively. Since twenty-first century skills encompass a broader set of knowledge, skills, and competences, including digital skills (van Laar et al., 2017, p. 583), digital skills alone are necessary but not sufficient to thrive nowadays.
Digital skills are critical in the knowledge-based economy because they support knowledge management through more efficient information acquisition, sharing, integration, and analysis in business ecosystems and networks (Ananiadou & Claro, 2009), but communication and social skills become fundamental too. Since technology is now a necessary but no longer sufficient condition for successful innovation, human creativity driven by inner motivation has emerged as a crucial source of competitive advantage (Bruno & Canina, 2019). Van Laar et al. (2017) enumerate digital skills among the twenty-first century skills, such as technical, information management, communication, collaboration, creativity, critical thinking, and problem-solving skills, with lifelong learning, cultural awareness, flexibility, ethical awareness, and self-direction as core contextual skills. Darling-Hammond (2012) adds that new ways of thinking, critical thinking, metacognition, learning to learn, and, thus, innovation and creativity should be not only the target of higher education policies but also the core focus of scientific investigation on the issue. The WEF (2020) highlights the interconnectedness between innovation and creative skills, like complex problem-solving, analytical thinking, creativity, and systems analysis, to foster innovation and address digital transformation in addition to global citizenship, technology, and interpersonal skills. Binkley et al. (2012) also argue that the unique combination of digital and creative skills is set to become critical in the twenty-first century. In particular, digital and creative skills can be described as follows:
• Digital skills are related to information and data literacy, communication and collaboration, digital content creation, safety, and problem-solving (Carretero et al., 2018).
• Creative skills deal with “the application of knowledge and skills in new ways” (Seltzer & Bentley, 1999).
Since the interaction and symbiosis between digital and creative skills is expected to be the real source of successful innovation in the near future (Lazzaretti, 2020), the acronym STEAM was coined to identify a cross-curricular KSC set (Land, 2013) integrating STEM (science, technology, engineering, and mathematics) and arts (Aguilera & Ortiz-Revilla, 2021). STEAM skills combine analytical and creative thinking to build twenty-first century KSC that may improve information retention and encourage innovative problem-solving (Land, 2013), meeting the rising quest for high-quality and cross-disciplinary combination and synergy of convergent (STEM) and divergent (arts) skills for global competitiveness (Land, 2013). If digital skills and creativity jointly enhance and foster innovation and sustainable development, learning must become more cross- and multidisciplinary, personalized, and self-paced, accessible and inclusive, more problem-based, collaborative, and lifelong. Education, policy, organizational innovation strategies, and human resource management should shift from a technological and digital focus to a more comprehensive, interdisciplinary, and holistic approach to KSC development. Instead of focusing on developing STEM skills only, people must develop STEAM skills that integrate both technology- and creativity-related KSC to face the challenges and meet the demands of the twenty-first-century knowledge-based economy and to generate outstanding innovation (Maeda, 2013).
If digital transformation requires these new skills and this creative-digital synergy, the main concern is how they can be effectively and successfully learnt, assimilated, and developed among current and future workers and across sectors from a lifelong perspective.
References
Aguilera, D., & Ortiz-Revilla, J. (2021). STEM vs. STEAM education and student creativity: A systematic literature review. Education Sciences, 11(7), 331. https://doi.org/10.3390/educsci11070331
Alcaniz, L., Gomez-Bezares, F., & Roslender, R. (2011). Theoretical perspectives on intellectual capital: A backward look and a proposal for going forward. Accounting Forum, 35(2), 104–117. https://doi.org/10.1016/j.accfor.2011.03.004
Aloini, D., Lazzarotti, V., Manzini, R., & Pellegrini, L. (2017). IP, openness, and innovation performance: an empirical study. Management Decision, 55(6), 1307–1327. https://doi.org/10.1108/MD-04-2016-0230
Ananiadou, K., & Claro, M. (2009). 21st century skills and competences for new millennium learners in OECD countries (OECD Education Working Papers, 41). OECD. https://doi.org/10.1787/218525261154
Barney, J. (1991). Firm resources and sustained competitive advantage. Journal of Management, 17(1), 99–120. https://doi.org/10.1177/014920639101700108
Barrena-Martínez, J., Cricelli, L., Ferrándiz, E., Greco, M., & Grimaldi, M. (2020). Joint forces: towards an integration of intellectual capital theory and the open innovation paradigm. Journal of Business Research, 112, 261–270. https://doi.org/10.1016/j.jbusres.2019.10.029
Becker, G. S. (1994). Human capital revisited. In Human capital: A theoretical and empirical analysis with special reference to education, third edition (pp. 15–28). The University of Chicago Press.
Bellucci, M., Marzi, G., Orlando, B., & Ciampi, F. (2021). Journal of Intellectual Capital: A review of emerging themes and future trends. Journal of Intellectual Capital, 22(4), 744–767. https://doi.org/10.1108/JIC-10-2019-0239
Berry, C. R., & Glaeser, E. L. (2005). The divergence of human capital levels across cities. Papers in Regional Science, 84(3), 407–444. https://doi.org/10.1111/j.1435-5957.2005.00047.x
Berzkalne, I., & Zelgalve, E. (2014). Intellectual capital and company value. Procedia-Social and Behavioral Sciences, 110, 887–896. https://doi.org/10.1016/j.sbspro.2013.12.934
Binkley, M., Erstad, O., Herman, J., Raizen, S., Ripley, M., Miller-Ricci, M., & Rumble, M. (2012). Defining twenty-first century skills. In P. Griffin, B. McGaw, & E. Care (Eds.), Assessment and teaching of 21st century skills (pp. 17–66). Springer.
Blinder, A., & Weiss, Y. (1975). Human capital and labor supply: A synthesis (NBER Working Paper Series (January 1975)).
Blundell, R., Dearden, L., Meghir, C., & Sianesi, B. (1999). Human capital investment: the returns from education and training to the individual, the firm and the economy. Fiscal Studies, 20(1), 1–23.
Bogers, M., Foss, N. J., & Lyngsie, J. (2018). The “human side” of open innovation: The role of employee diversity in firm-level openness. Research Policy, 47, 218–231. https://doi.org/10.1016/j.respol.2017.10.012
Brown, A., Osborn, T., Chan, J. M., & Jaganathan, V. (2005). Managing intellectual capital. Research-Technology Management, 48(6), 34–41. https://doi.org/10.1080/08956308.2005.11657346
Bruno, C., & Canina, M. (2019). Creativity 4.0. Empowering creative process for digitally enhanced people. The Design Journal, 22(sup1), 2119–2131. https://doi.org/10.1080/14606925.2019.1594935
Bublitz, E., Nielsen, K., Noseleit, F., & Timmermans, B. (2015). Entrepreneurship, human capital, and labor demand: A story of signaling and matching (HWWI Research Paper, No. 166). Hamburgisches WeltWirtschaftsInstitut (HWWI). http://hdl.handle.net/10419/113684
Buenechea-Elberdin, M. (2017). Structured literature review about intellectual capital and innovation. Journal of Intellectual Capital, 18(2), 262–285. https://doi.org/10.1108/JIC-07-2016-0069
Cabello-Medina, C., López-Cabrales, Á., & Valle-Cabrera, R. (2011). Leveraging the innovative performance of human capital through HRM and social capital in Spanish firms. The International Journal of Human Resource Management, 22(04), 807–828. https://doi.org/10.1080/09585192.2011.555125
Cabrilo, S., Dahms, S., Mutuc, E. B., & Marlin, J. (2020). The role of IT practices in facilitating relational and trust capital for superior innovation performance: the case of Taiwanese companies. Journal of Intellectual Capital, 21(5), 753–779. https://doi.org/10.1108/JIC-07-2019-0182
Cappiello, G., Giordani, F., & Visentin, M. (2020). Social capital and its effect on networked firm innovation and competitiveness. Industrial Marketing Management, 89, 422–430. https://doi.org/10.1016/j.indmarman.2020.03.007
Carretero, S., Vuorikari, R., & Punie, Y. (2018). DigComp 2.1: the digital competence framework for citizens with eight proficiency levels and examples of use. European Commission, Joint Research Centre, Publications Office. https://data.europa.eu/doi/10.2760/836968
Chen, M. C., Cheng, S. J., & Hwang, Y. (2005). An empirical investigation of the relationship between intellectual capital and firms’ market value and financial performance. Journal of Intellectual Capital, 6(2), 159–176. https://doi.org/10.1108/14691930510592771
Clark, M., Seng, D., & Whiting, R. H. (2011). Intellectual capital and firm performance in Australia. Journal of Intellectual Capital, 12(4), 505–530. https://doi.org/10.1108/14691931111181706
Clarke, T., & Clegg, S. (2000). Management paradigms for the new millennium. International Journal of Management Reviews, 2(1), 45–64. https://doi.org/10.1111/1468-2370.00030
Correani, A., De Massis, A., Frattini, F., Petruzzelli, A. M., & Natalicchio, A. (2020). Implementing a digital strategy: Learning from the experience of three digital transformation projects. California Management Review, 62(4), 37–56. https://doi.org/10.1177/0008125620934864
Cricelli, L., Greco, M., & Grimaldi, M. (2014). An overall index of intellectual capital. Management Research Review, 37(10), 880–901. http://www.emeraldinsight.com/doi/full/10.1108/MRR-04-2013-0088
Cricelli, L., Greco, M., Grimaldi, M., & Llanes Duenas, L. P. (2018). Intellectual capital and university performance in emerging countries: Evidence from Colombian public universities. Journal of Intellectual Capital, 19(1), 71–95. https://doi.org/10.1108/JIC-02-2017-0037
Cronje, C. J., & Moolman, S. (2013). Intellectual capital: measurement, recognition and reporting. South African Journal of Economic and Management Sciences, 16(1), 1–12.
Csapó, B., Ainley, J., Bennett, R. E., Latour, T., & Law, N. (2012). Technological issues for computer-based assessment. In P. Griffin, B. McGaw, & E. Care (Eds.), Assessment and teaching of 21st century skills (pp. 143–230). Springer.
Darling-Hammond, L. (2012). Policy frameworks for new assessments. In P. Griffin, B. McGaw, & E. Care (Eds.), Assessment and teaching of 21st century skills (pp. 301–399). Springer.
Delgado, M. (2011). The role of intellectual capital assets on the radicalness of innovation: Direct and moderating effects (Working Paper 2011/05). Management of Innovation, Autonomous University of Madrid (UAM), Faculty of Economics and Accenture on the Economics and Management of Innovation.
Demartini, M. C., & Beretta, V. (2020). Intellectual capital and SMEs’ performance: A structured literature review. Journal of Small Business Management, 58(2), 288–332. https://doi.org/10.1080/00472778.2019.1659680
Dierkes, M., Hofmann, J., & Marz, L. (1998). Technological development and organisational change: differing patterns of innovation. In 21st century technologies: Promises and perils of a dynamic future (pp. 97–122). Organization for Economic Co-operation and Development (OECD).
Djumalieva, J., & Sleeman, C. (2018, August). An open and data-driven taxonomy of skills extracted from online job adverts (ESCoE Discussion Paper 2018-13). ISSN 2515-4664. https://escoe-website.s3.amazonaws.com/wp-content/uploads/2020/07/13161304/ESCoE-DP-2018-13.pdf
Edvinsson, L. (1997). Developing intellectual capital at Skandia. Long Range Planning, 30(3), 366–373. https://doi.org/10.1016/S0024-6301(97)90248-X
Edvinsson, L., & Sullivan, P. (1996). Developing a model for managing intellectual capital. European Management Journal, 14(4), 356–364. https://doi.org/10.1016/0263-2373(96)00022-9
Etzkowitz, H., & Leydesdorff, L. (2000). The dynamics of innovation: from National Systems and “Mode 2” to a Triple Helix of university–industry–government relations. Research Policy, 29(2), 109–123. https://doi.org/10.1016/S0048-7333(99)00055-4
European Commission (EC). (2010). Green paper—Unlocking the potential of cultural and creative industries. https://op.europa.eu/s/vKy4
European Commission, Directorate-General for Education, Youth, Sport and Culture, Hoelck, K., Engin, E., & Airaghi, E. (2017). Mapping the creative value chains: a study on the economy of culture in the digital age: final report. Publications Office. https://data.europa.eu/doi/10.2766/868748
Fincham, R., & Roslender, R. (2003). The management of intellectual capital and its implications for business reporting. Institute of Chartered Accountants of Scotland.
Glewwe, P. (2002). Schools and skills in developing countries: education policies and socioeconomic outcomes. Journal of Economic Literature, 40(2), 436–482.
Griffin, P., Care, E., & McGaw, B. (2012). The changing role of education and schools. In P. Griffin, B. McGaw, & E. Care (Eds.), Assessment and teaching of 21st century skills. Springer. https://doi.org/10.1007/978-94-007-2324-5_1
Harris, L. (2000). A theory of intellectual capital. Advances in Developing Human Resources, 2(1), 22–37. https://doi.org/10.1177/152342230000200104
Helfat, C. E., & Peteraf, M. A. (2003). The dynamic resource-based view: Capability lifecycles. Strategic Management Journal, 24(10), 997–1010. https://doi.org/10.1002/smj.332
Helfat, C. E., & Raubitschek, R. S. (2018). Dynamic and integrative capabilities for profiting from innovation in digital platform-based ecosystems. Research Policy, 47(8), 1391–1399. https://doi.org/10.1016/j.respol.2018.01.019
Hine, D. C., Helmersson, H., & Mattsson, J. (2007). Individual and collective knowledge: An analysis of intellectual capital in an Australian biotechnology venture using the text analytic tool Pertex. International Journal of Organizational Analysis, 15(4), 358–378. https://doi.org/10.1108/19348830710900151
Hodges, D., & Burchell, N. (2003). Business graduate competencies: Employers’ views on importance and performance. International Journal of Work-Integrated Learning, 4(2), 16.
Houghton, J., & Sheehan, P. (2000, February). A primer on the knowledge economy (CSES Working Paper No. 18). Victoria University of Technology, Centre for Strategic Economic Studies.
Imai, S., & Keane, M. P. (2004). Intertemporal labor supply and human capital accumulation. International Economic Review, 45(2), 601–641. https://doi.org/10.1111/j.1468-2354.2004.00138.x
Kamukama, N. (2013). Intellectual capital: company’s invisible source of competitive advantage. Competitiveness Review: An International Business Journal, 23(3), 260–283. https://doi.org/10.1108/10595421311319834
Khalique, M., Isa, A. H. B. M., Nassir Shaari, J. A., & Ageel, A. (2011). Challenges faced by the small and medium enterprises (SMEs) in Malaysia: An intellectual capital perspective. International Journal of Current Research, 3(6), 398–401. https://ssrn.com/abstract=1891867
Land, M. H. (2013). Full STEAM Ahead: The Benefits of Integrating the Arts into STEM. Procedia Computer Science, 20, 547–552. https://doi.org/10.1016/j.procs.2013.09.317
Langlois, R. N. (2002, August). Schumpeter and the obsolescence of the entrepreneur (Working Paper 2002-19). University of Connecticut, Department of Economics Working Paper Series.
Lapiņa, I., & Aramina, D. (2011). Competence based sustainable development: quality of education. Management Theory and Studies for Rural Business and Infrastructure Development, 26(2), 138–145.
Lazzaretti, L. (2020). What is the role of culture facing the digital revolution challenge? Some reflections for a research agenda. European Planning Studies, 30(9), 1617–1637. https://doi.org/10.1080/09654313.2020.1836133
Lev, B., Canibano, L., & Marr, B. (2005). An accounting perspective on intellectual capital. Perspectives on Intellectual Capital, 42–55.
Li, Y., Li, N., Li, C., & Li, J. (2020). The boon and bane of creative “stars”: a network exploration of how and when creativity is (and is not) driven by star teammate. Academy of Management Journal, 63(2), 613–635. https://doi.org/10.5465/amj.2018.0283
Maditinos, D., Chatzoudes, D., Tsairidis, C., & Theriou, G. (2011). The impact of intellectual capital on firms’ market value and financial performance. Journal of Intellectual Capital, 12(1), 132–151. https://doi.org/10.1108/14691931111097944
Maeda, J. (2013). STEM + Art = STEAM. The STEAM Journal, 1(1), 34.
Martínez de Morentin de Goñi, J. I. (2006). ¿Qué es educación de adultos? Responde la UNESCO. Editorial Centro UNESCO de San Sebastián. ISBN 84-88737-69-6. https://unesdoc.unesco.org/ark:/48223/pf0000149413.locale=en
Méhaut, P., & Winch, C. (2012). The European qualification framework: Skills, competences or knowledge? European Educational Research Journal, 11(3), 369–381. https://doi.org/10.2304/eerj.2012.11.3.369
Mouritsen, J., Larsen, H. T., & Bukh, P. N. (2001a). Intellectual capital and the ‘capable firm’: narrating, visualising and numbering for managing knowledge. Accounting, Organizations and Society, 26(7-8), 735–762. https://doi.org/10.1016/S0361-3682(01)00022-8
Mouritsen, J., Larsen, H. T., & Bukh, P. N. (2001b). Valuing the future: intellectual capital supplements at Skandia. Accounting, Auditing & Accountability Journal, 14(1), 399–422. https://doi.org/10.1108/09513570110403434
Nahapiet, J., & Ghoshal, S. (1998). Social capital, intellectual capital, and the organizational advantage. Academy of Management Review, 23(2), 242–266. https://doi.org/10.5465/amr.1998.533225
Nerdrum, L., & Erikson, T. (2001). Intellectual capital: a human capital perspective. Journal of Intellectual Capital, 2(2), 127–135. https://doi.org/10.1108/14691930110385919
Nonaka, I., & Nishiguchi, T. (2001). Knowledge emergence: Social, technical, and evolutionary dimensions of knowledge creation. Oxford University Press.
OECD. (2021). Economic and social impact of cultural and creative sectors. Note for Italy G20 Presidency Culture Working Paper. https://www.oecd.org/cfe/leed/OECD-G20-Culture-July-2021.pdf
Paoloni, M., Coluccia, D., Fontana, S., & Solimene, S. (2020). Knowledge management, intellectual capital and entrepreneurship: a structured literature review. Journal of Knowledge Management, 24(8), 1797–1818. https://doi.org/10.1108/JKM-01-2020-0052
Pasban, M., & Nojedeh, S. H. (2016). A review of the role of human capital in the organization. Procedia-Social and Behavioral Sciences, 230, 249–253. https://doi.org/10.1016/j.sbspro.2016.09.032
Pike, S., Fernström, L., & Roos, G. (2005). Intellectual capital: Management approach in ICS Ltd. Journal of Intellectual Capital, 6(4), 489–509. https://doi.org/10.1108/14691930510628780
Powell, W. W. (1990). Neither market nor hierarchy. In Research in organizational behavior, 12 (pp. 295–336). JAI Press.
Prahalad, C. K., & Hamel, G. (2003). The core competence of the corporation. International Library of Critical Writings in Economics, 163, 210–222.
Ramírez Córcoles, Y. (2013). Importance of intellectual capital disclosure in Spanish universities. Intangible Capital, 9(3), 931–944. https://doi.org/10.3926/ic.348
Roberts, E., & Townsend, L. (2016). The contribution of the creative economy to the resilience of rural communities: exploring cultural and digital capital. Sociologia Ruralis, 56(2), 197–219. https://doi.org/10.1111/soru.12075
Rogo, F., Cricelli, L., & Grimaldi, M. (2014). Assessing the performance of open innovation practices: A case study of a community of innovation. Technology in Society, 38, 60–80. https://doi.org/10.1016/j.techsoc.2014.02.006
Roslender, R., & Fincham, R. (2004). Intellectual capital: who counts, controls? Accounting and the Public Interest, 4(1), 1–23. https://doi.org/10.2308/api.2004.4.1.1
Salman, M., Ganie, S. A., & Saleem, I. (2020). The concept of competence: a thematic review and discussion. European Journal of Training and Development, 44(6/7), 717–742. https://doi.org/10.1108/EJTD-10-2019-0171
Schultz, T. W. (1972). Human capital: Policy issues and research opportunities. In Economic research: Retrospect and prospect, Volume 6, Human resources (pp. 1–84). NBER. http://www.nber.org/chapters/c4126
Secundo, G., Dumay, J., Schiuma, G., & Passiante, G. (2016). Managing intellectual capital through a collective intelligence approach. Journal of Intellectual Capital, 17(2), 298–319. https://doi.org/10.1108/JIC-05-2015-0046
Seltzer, K., & Bentley, T. (1999). The creative age: Knowledge and skills for the new economy. Demos.
Senge, P. M. (1990). Give me a lever long enough . . . and single-handed I can move the world. In The art and practice of the learning organization (pp. 3–16). Currency Doubleday.
Smith, A. (1776). An inquiry into the nature and causes of the wealth of nations.
Spencer, M. L., & Spencer, M. (2003). Competence at work: Models for superior performance. Wiley.
Ståhle, P., Ståhle, S., & Aho, S. (2011). Value added intellectual coefficient (VAIC): a critical analysis. Journal of Intellectual Capital, 12(4), 531–551. https://doi.org/10.1108/14691931111181715
Stewart, T. A. (2007). The wealth of knowledge: Intellectual capital and the twenty-first century organization. Currency.
Storck, J., & Hill, P. A. (2000). Knowledge diffusion through “strategic communities”. MIT Sloan Management Review. Magazine Winter 2000. https://sloanreview.mit.edu/article/knowledge-diffusion-through-strategic-communities/
Subramaniam, M., & Youndt, M. A. (2005). The influence of intellectual capital on the types of innovative capabilities. Academy of Management Journal, 48(3), 450–463. https://doi.org/10.5465/amj.2005.17407911
Tanideh, S. (2013). Relationship between innovation capital and intellectual capital with value and financial performance. Life Science Journal, 10(2013), 251–254.
Teece, D. J., & Linden, G. (2017). Business models, value capture, and the digital enterprise. Journal of Organization Design, 6(1), 1–14.
Teece, D. J., Pisano, G., & Shuen, A. (1997). Dynamic capabilities and strategic management. Strategic Management Journal, 18(7), 509–533. https://doi.org/10.1002/(SICI)1097-0266(199708)18:7%3C509::AID-SMJ882%3E3.0.CO;2-Z
Tumwine, S., Kamukama, N., & Ntayi, J. M. (2012). Relational capital and performance of tea manufacturing firms. African Journal of Business Management, 6(3), 799–810. http://www.academicjournals.org/AJBM
UNCTAD. (2008). Creative economy report 2008. The challenge of assessing the creative economy: towards informed policy-making. United Nations. Published online April 19, 2008. https://unctad.org/webflyer/creative-economy-report-2008-challenge-assessing-creative-economy-towards-informed-policy
van Laar, E., Van Deursen, A. J., Van Dijk, J. A., & De Haan, J. (2017). The relation between 21st-century skills and digital skills: A systematic literature review. Computers in Human Behavior, 72, 577–588. https://doi.org/10.1016/j.chb.2017.03.010
van Laar, E., Van Deursen, A. J., Van Dijk, J. A., & De Haan, J. (2020). Measuring the levels of 21st-century digital skills among professionals working within the creative industries: A performance-based approach. Poetics, 81, 101434. https://doi.org/10.1016/j.poetic.2020.101434
Vial, G. (2019). Understanding digital transformation: A review and a research agenda. Journal of Strategic Information Systems, 28, 118–144. https://doi.org/10.1016/j.jsis.2019.01.003
Vincent, L. (2008). Differentiating competence, capability, and capacity. Innovating Perspectives, 16(3).
Warner, K. S. R., & Wäger, M. (2019). Building dynamic capabilities for digital transformation: An ongoing process of strategic renewal. Long Range Planning, 52, 326–349. https://doi.org/10.1016/j.lrp.2018.12.001
Wernerfelt, B. (1984). A resource-based view of the firm. Strategic Management Journal, 5(2), 171–180. https://doi.org/10.1002/smj.4250050207
Winterton, J., Delamare-Le Deist, F., & Stringfellow, E. (2006). Typology of knowledge, skills and competences: clarification of the concept and prototype (pp. 13–16). Office for Official Publications of the European Communities.
World Bank Institute. (2010). Measuring knowledge in the world’s economies. Knowledge assessment methodology and knowledge economy index. Knowledge for development program.
World Economic Forum (WEF). (2020). Schools of the future: Defining new models of education for the fourth industrial revolution. Reports. Published online January 14, 2020. https://www.weforum.org/reports/schools-of-the-future-defining-new-models-of-education-for-the-fourth-industrial-revolution
Xie, X., Wang, L., & Zeng, S. (2018). Inter-organizational knowledge acquisition and firms’ radical innovation: A moderated mediation analysis. Journal of Business Research, 90, 295–306. https://doi.org/10.1016/j.jbusres.2018.04.038
Zott, C. (2003). Dynamic capabilities and the emergence of intraindustry differential firm performance: insights from a simulation study. Strategic Management Journal, 24(2), 97–125. https://doi.org/10.1002/smj.288
Chapter 2
A Review of the Creative and Cultural Industries (CCI)
Keywords Creativity · Cultural and creative industries · Creative economy · Cultural employment · Creative employment · Cultural capital
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Nuccio, S. Mogno, Mapping Digital Skills in Cultural and Creative Industries in Italy, Contributions to Management Science, https://doi.org/10.1007/978-3-031-26867-0_2
2.1 Creativity: A Complex Definition
Bourdieu (1986) conceives creativity as a source of economic value by proposing three fundamental types of capital, namely economic, social, and cultural, with the last being “convertible, in certain conditions, into economic capital” (p. 16) and then institutionalized in the form of educational qualifications. Digitization and ICTs have radically boosted creativity and self-expression by providing new media (e.g., platforms, social networks, software, virtual and augmented reality) that facilitate collaboration, idea sharing and design, project management, and fast communication (NESTA, 2018). This has accelerated and fostered the intra-industry and globalized dissemination of creative works and ideas, as well as interdisciplinary collaborations, partnerships, and synergies between creators and technologists, challenging conventional innovation outputs and processes (NESTA, 2018) through systems, dynamic, and proactive thinking (WEF, 2020). As a consequence, traditional sectors have seen their business models change and new digitally driven creative sectors, such as video games and music, have emerged (The Economist, 2021). These shifts alter not only labor demand in terms of job requirements but also the nature of intrafirm competition and the definition of sectors: as boundaries between industries and disciplines blur, competitive advantage is increasingly related to the capacity to combine seemingly unrelated knowledge into novel contexts and possibilities. As innovation demands interdisciplinary and experimental approaches, the workforce, organizations, and policies should adapt accordingly by developing ambidextrous skills that synergistically combine digital and creative skills. Chung et al. (2015) define creativity as the “search for alternative ways of generating revenues and innovative ways to leverage their internal resources” (p. 93) through creative processes, creative persons, and creative products, with technologies as the main enablers of creativity. While Amabile (1996) proposes a similar
definition, linking creativity with the generation of ideas that, by being useful and original, may engender unprecedented products or ideas, Helsper and Eynon (2013) consider creative skills a subset of digital skills, together with social, technical, and critical skills. Conversely, van Laar et al. (2020) recognize a mutually beneficial relation between digital technology and creativity, introducing the notion of creative digital skills or “digital creativity” as key drivers of competitiveness in contemporary hyper-dynamic environments. While technology can support the production and design of creative ideas through social networks and platforms, technology is itself generated by individual creativity. Bakhshi et al. (2019) use the word “createch” to identify the synergy between digital and creative skills, while Roberts and Townsend (2016) show that digital technology is a major enabler of creativity, both promoting community resilience and development and building cultural capital. KEA (2006) considers creativity a source of innovation, explaining it as “the use of cultural resources as an intermediate consumption in the production process of non-cultural sectors”, which eventually leads to innovation (p. 2), and highlighting the interdependence between creativity and ICTs. Illustrating the effects of digital transformation on creativity, Bruno and Canina (2019) develop the Creativity 4.0 Model, presenting creativity as a multi-level phenomenon that can be described as cognitive, individual, and sociocultural. Indeed, as an artistic phenomenon, creativity usually refers to the interaction of a multiplicity of approaches and the reshaping of knowledge (Papadopoulou et al., 2018). Since creativity, digitization, and innovation are intricately interconnected, with Amabile (1996) defining innovation as “the successful implementation of creative ideas within an organization” (p.
1), creative talent and collaboration systems become major enablers and drivers of organizational innovation. That is why lifelong learning and education should help people develop not only technical skills but also personal attributes, cognitive skills, and ethical values, so that they can leverage more tacit knowledge and social skills (Ananiadou & Claro, 2009) to succeed in more complex contexts. Florida et al. (2008) link creativity and innovation by introducing the creative class as “a set of occupations [...] including science, engineering, arts, culture, entertainment and the knowledge-based professions of management, finance, law, healthcare and education” (p. 3). Since the development of the “creative class” is positively related to increases in employment, economic resilience and growth (Currid-Halkett & Stolarick, 2013), and entrepreneurship and regional growth (Stolarick et al., 2011), creative talent (i.e., human resources) becomes a main driver of innovation development and success (Florida, 2002) in the knowledge-based economy. Building on Schumpeter’s creative destruction, entrepreneurship is considered a “creative act” (Stolarick et al., 2011), which may lead to innovation under specific circumstances. Moreover, in studying regional unemployment during economic downturns, Stolarick and Currid-Halkett (2013) find lower unemployment rates among the creative class, which proves instrumental in sustaining economic growth. Because the development of a creative class positively affects technology and regional growth (Hansen et al., 2005), creative education becomes vital for talent and regional development, with universities being pivotal hubs for regional and knowledge development in the creative economy (Mellander & Florida, 2006) and major determinants of the geographic distribution of the creative class (Florida et al., 2008, p. 617). This, in turn, determines the geographic distribution of innovating firms and regional growth (Florida, 2002). Yet cultural infrastructure and public policy mediate this nexus (Comunian, 2011), which calls for a multi-level policy approach to creativity development, rather than top-down policy application, in order to develop creative talent that can be a key driver of innovation and competitiveness (Hansen et al., 2005).
2.1.1 Creative Knowledge, Skills, and Competences (CKSC)
Different scholars have identified the core elements of creativity, with Seltzer and Bentley (1999) recognizing trust, freedom of action, context variations, skill–challenge balance, interactive interchanges of knowledge and ideas, and a focus on real outcomes as its six main drivers. Indeed, owing to its multidisciplinary nature, creativity allows multiple perspectives and bodies of knowledge to come together in innovative ways, and existing knowledge and skills to be recombined into new configurations that fit new contexts. According to Bourdieu (1986, p. 17), cultural capital can be objectified into cultural goods, institutionalized into educational qualifications, or embodied into mind or dispositions constituting “external wealth converted into an integral part of the person” (p. 18). Hence, it can be acquired by organizations through their workforce. Hunter et al. (2012) underline that creative performance is a method and source for innovation within organizations, considering creative ideas necessary for innovation output (p. 2), while Wilson (2010) defines creativity as a social phenomenon underpinned by cross-border human relationships, with these boundaries being geographical, sectorial, disciplinary, or cultural. KEA (2006) introduces a four-perspective notion of creativity, which can be scientific, socioeconomic, cultural, or technological, stressing that the relationships and exchanges between the four unleash creativity at a market level. Bogers et al. (2018), in turn, claim that educational diversity among employees and a firm’s openness to innovation are positively related, as they may lead to new combinations of internal and external knowledge, with educational level as a moderator. Interestingly, in studying the creative industry, van Laar et al. (2020) classify creativity as a digital skill together with information, critical thinking, and problem-solving.
Mietzner and Kamprath (2013) outline a tripartite classification of creative competences as personal-social, methodological, or professional. While professional competences directly affect innovation as they involve IT, business, and legislative knowledge, personal-social and methodological competences deal with strategic and proactive thinking and creativity. Hunter et al. (2012) study the elements of Knowledge, Skills, Abilities, and Other characteristics (KSAOs) that could potentially predict creative performance within organizations. As far as creativity is concerned, knowledge can be either domain specific or broad: the former generates high-quality and relevant ideas, while the latter allows for the novel combination of previously seemingly unrelated concepts that may result in new ideas. Creative skills can likewise be either domain specific, if they add to individual expertise, or related to creative processing, when it comes to opportunity identification, information gathering, and conceptual combination (Hunter et al., 2012, p. 4). Among abilities, divergent thinking, intelligence, and analogical ability help in generating creative ideas by creating novel associations. However, creativity may have an innate component in some individuals, who may have particular dispositions or motivations for being open to discovery. The “other” feature accounts for these characteristics. Innovation output results from the interplay between the creative potential resulting from KSAOs and contextual (organizational) variables (Hunter et al., 2012). According to the componential theory of creativity (Amabile, 2012), creativity involves the generation of ideas and/or outcomes that can be recognized as novel and purposeful, and it depends on factors that are both internal (i.e., domain-relevant skills, creativity-relevant processes, and intrinsic task motivation) and external (the social context) to the individual. Creativity thus results from the capacity to find novel solutions and problem-solving processes by combining diversity, which boosts organizational innovation. Since innovation entails the effective execution, application, and realization of creative ideas in a company (Amabile, 1996), individual creativity may drive innovation, so that both internal and external components of creativity should be addressed when educating and training the present-day and future workforce.
2.1.2 The Role of Creativity in the Twenty-First Century
Despite the rising need for digital creative skills, the literature lacks proper frameworks and extensive analysis of the issue, especially when it comes to their development and application in the cultural and creative industries (Poce et al., 2020). Escalating digitization has overturned traditional production and consumption models, challenging traditional ways of generating ideas as well as creating and distributing content (EC, 2010) and opening opportunities for novel experimentation and business models (Poce et al., 2020, p. 8). Poce et al. (2020) apply the five digital competency dimensions of Carretero et al. (2018) to the creative sector to identify how creativity must complement digital skills in the CCI in the knowledge-based economy in five main areas: information and data literacy (e.g., Internet and digital curation), digital communication and collaboration (e.g., synchronous communication with digital audiences and artists, and social media), digital content creation (e.g., storytelling, AR and VR, mobile media, and digital publishing), digital safety (e.g., open and digital licenses), and digital problem-solving (e.g., digital culture management and mobile user experience). Further research is needed on the wide-ranging opportunities, collaborations, and interactions between creative disciplines and other sectors, between scientific and artistic education, and between private and public entities to boost these synergies (EC, 2010, p. 9).
In terms of outcomes, UNCTAD (2008) proposes four major outputs of creativity beyond its manifestation itself: human capital, cultural capital, structural capital, and social capital. Rodríguez-Pose and Lee (2020) study the effects of creativity at a city level, proposing scientific and creative activities as the two main inputs for innovation to take place and remarking that it is the combination of the two, rather than their simple co-presence, that allows for such achievements. UNCTAD (2008) further explores the economic aspects of creativity, considering it a driver of not only innovation but also productivity and economic growth, and referring to creativity as the “formulation of new ideas and [. . .] the application of these ideas to produce original works of art and cultural products, functional creations, scientific inventions and technological innovations” (p. 3). Hence, it is possible to talk about a creative economy that replaces traditional economic models with more multidisciplinary, transversal innovation models that integrate economic value, culture, and technology synergistically, with content and services as main outputs. The creative economy is therefore related to the knowledge economy, as it is deeply rooted in the notion of KSC as crucial drivers of economic value creation, since creativity may identify the “capacity of being experimental” (NESTA, 2018). Indeed, creativity is characterized by multiple dimensions: cultural, social, economic, and sustainable aspects (Bakhshi & Throsby, 2010). UNCTAD (2008) also underlines the multidisciplinary nature of the creative economy, which is characterized by economic, social, cultural, and technological aspects, describing it as “a set of knowledge-based economic activities with a development dimension and cross-cutting linkages at macro and micro levels to the overall economy” (p. 1). In particular, it emphasizes its socioeconomic advantages, including job creation, social inclusion, cultural diversity, human development, and income generation, which may eventually result in sustainable economic development and growth in the long term.
2.2 Creative and Cultural Industries (CCI)
2.2.1 Boundaries and Sectors: A Contested Definition
According to the framework of Throsby (2008), cultural and creative industries are at the very core of the creative economy and can be described through a concentric circles model. While creative arts are placed at the heart of the cultural and creative industries (CCI), technology adds to the human side of creativity as the circles expand, passing from other core cultural industries (e.g., film, museums, and photography) to wider cultural industries (e.g., publishing and print, television and radio, video and computer games). Related industries that may involve creativity, such as advertising, design, and fashion, are included in the outer circle. The Work Foundation (2007) proposes a similar scheme of subsectors of the CCI on the basis of their expressive value, that is, their symbolic and cultural component. This model comprises four concentric circles, distinguishing different levels of value, ranging from the core creative fields (e.g., performing and visual arts, literature, and music), through cultural industries (e.g., libraries, film, and museums) and creative industries and activities (e.g., publishing, heritage, TV and radio, video games), to the rest of the economy (e.g., architecture, design, fashion, and advertising). UNCTAD (2008) classifies creative industries according to their focus and outputs, which may deal with heritage, arts, media, or functional creations, identifying three key features of the CCI: the involvement of some degree of human creativity in production, the considerable symbolic value of the output beyond its utilitarian value, and the existence of some level of intellectual property attributable to the producer(s). Nonetheless, Innocenti and Lazzeretti (2019) ascribe the successful impact of the CCI on employment and on social and economic growth to collaborative business models, claiming that cross-fertilization processes between creative industries and companies in other related sectors are the real driver of innovation and economic growth. While these collaborative systems usually take the form of innovative and creative hubs or incubators, digitization has further boosted opportunities for establishing broader and more geographically scattered knowledge-intensive networks, such as intercommunity, inter-regional, or regional networks (Roberts & Townsend, 2016). Furthermore, creative synergies and intra-industry contamination seem to be not only increasingly advantageous but also widespread. Under the Standard Industrial Classification (SIC) 2007, the boundaries of the CCI are extremely difficult to define, as creative industries tend to overlap more or less significantly with the digital, cultural, and tourism sectors (DCMS, 2019). This is the case for film, TV, recorded media, heritage, and gaming, for instance.
If creative and digital skills alone constitute poor innovative resources, the key to future success lies in their combined effect and cooperation. Digital creativity may better exploit the new technological opportunities and societal frameworks available in the digital era, which may contribute to the creation of digital ecosystems (e.g., smart cities, creative hubs, and virtual-reality environments), the improvement of knowledge transfer and sharing opportunities through social networks and asset digitization, and the emergence of new creative sectors (Lazzeretti, 2020). Digital technology can provide a more integrated and synchronized functioning of the creative economy (UNCTAD, 2018). NESTA (2006, p. 54) distinguishes between creative service providers and creative content producers, describing the CCI as an industrial sector, and not only as creative activities driven by human talent. Whether delivering content or experiences, content producers are those subsectors that own and exploit their skills, knowledge, and abilities to produce and distribute creative content or services to customers (e.g., antiques, designer-making, crafts, fashion, museums, galleries), while service providers are those providing supporting services that also include creative dimensions and whose labor demand is more likely to rise (e.g., PR, marketing, advertising, and post-production). This calls for more multidisciplinary approaches to KSC development, assimilation, and acquisition. KEA (2006) differentiates between the cultural industry and the creative industry, with the former comprising non-industrial and industrial sectors producing outputs such as arts, performing arts, heritage, and film, video games, broadcasting, music, and book publishing, and the latter encompassing those activities producing non-cultural goods in which culture “becomes a creative input” (p. 2), such as design. Valentino (2013, pp. 282–283) also distinguishes between the creative and the cultural industry as subsectors of the CCI, defining a framework of three concentric circles on the basis of technological innovation spillovers. The more traditional and pre-capitalist cultural activities, including heritage and visual arts, sit at the core; the cultural industry engendered by the Industrial Revolution, encompassing TV and radio, publishing, and cinema, occupies the second tier; and the creative industry prompted by digital technology, comprising design, web, PR, and advertising, forms the outer circle. Despite considering creative industries and the cultural sector separately, the Department of Culture, Media & Sport (DCMS) in the UK builds on the Standard Industrial Classification (SIC) 2007 to identify their respective subsectors: publishing, computer games, software publishing, computer programming, computer consultancy, film, TV, music, radio, heritage, retail of music and video recordings, manufacture of musical instruments, reproduction of recorded media, heritage, arts and museum activities (DCMS, 2019, p. 6). This DCMS Sector framework highlights the overlap of the digital and creative domains in many activities and sub-industries. Conversely, although discriminating between creative service providers and creative content producers, NESTA (2006, p. 54) explains the CCI as an industrial sector, and not merely as creative activities driven by human talent. Miller et al. (1998) and the World Intellectual Property Organization (WIPO, 2015) classify the CCI according to the degree of intellectual property involved, stressing the importance of intellectual property as a major value-adding element in the CCI, as it promotes creativity, innovation, economic growth, and improved working conditions in the sector.
In particular, the CCI may be subdivided into four main categories based on the type of property rights on their output’s final value (WIPO, 2015): core copyright industries, which produce intellectual property goods for final consumption (e.g., publishing, motion picture and sound recording, broadcasting and telecommunications, information and data processing; pp. 50–51); interdependent copyright industries, which facilitate the creation, distribution, and use of goods protected by property rights (e.g., producers of electronic goods, computers, smartphones, musical instruments and recording devices, photocopiers, photographic instruments, and cinematographic equipment; pp. 59–60); partial copyright industries, in which some activities are related to protected works (e.g., textile, footwear, furniture, museums, interior design, architecture, engineering, household goods, toys, jewelry; p. 60); and non-dedicated support industries, in which some activities facilitate the distribution or sale of protected goods and services (e.g., wholesale and retail, information and communications through the Internet; p. 62). EUROSTAT (2019) classifies the cultural sector as comprising 18 economic activities, which, in turn, may be further subdivided into seven major groups: printing and reproduction of media, instruments manufacturing, and jewelry; retail sales; publishing; motion picture and television, music, and renting of videotapes; programming and broadcasting; architecture, design, and photography; and translation and interpretation. UNCTAD (2018) proposes a similar classification of creative goods and products according to eight sub-groups: art crafts, audiovisuals, design, digital fabrication, new media, performing arts, publishing, and visual arts, while categorizing advertising, market research, and public opinion as well as engineering and architectural, personal, cultural, and recreational services as “creative services” (p. 13). The European Commission (EC) considers as creative industries those using “culture as an input” and having a “cultural dimension” even if their outputs can be functional (EC, 2010, p. 6), including advertising, graphic design, and fashion design among them as well. A wide-ranging definition is the one provided by ESSnet-Culture (2012, p. 20):
Cultural activities are understood as any activity based on cultural values and/or artistic expressions. Cultural activities include market- or non-market-oriented activities, with or without a commercial meaning and carried out by any kind of organization (individuals, businesses, groups, institutions, amateurs or professionals).
The report Io Sono Cultura 2021 (Symbola & Unioncamere, 2021) proposes a similar seven-domain classification based on a more system-wide and production-focused perspective of the CCI or “Sistema Produttivo Culturale e Creativo” (SPCC, Creative and Cultural Production System; Symbola & Unioncamere, 2021, p. 68): architecture and design; communications; audiovisual and music; video games and software; editorial publishing and printing; performing and visual arts; and historical and artistic heritage. In studying the effects of the COVID-19 pandemic on the CCI in 2021, UNESCO (2021, p. 9) introduces a new classification of the CCI according to six main sub-domains, which, in turn, can be further classified into sub-activities: design and creative services (e.g., fashion, graphic and interior design, advertising, architecture), audiovisual and interactive media (e.g., film, TV and radio, streaming, podcasts, video games), visual arts and crafts (e.g., crafts, photography, fine arts), performance and celebration (e.g., music, dance, festivals, fairs, performing arts), cultural and natural heritage (e.g., historical places, archeological sites, cultural landscapes, natural heritage, museums), and books and press (e.g., books, newspapers, magazines, libraries, book fairs). An interesting aspect of this classification is the inclusion of natural heritage among the CCI-related activities. In conclusion, a wide-ranging definition of the CCI seems to be more illuminating in the knowledge-based economy of the twenty-first century. As accelerating digitization and COVID-19 have hastened the adoption of ICT for creative purposes, disrupting traditional business models in the CCI and opening new opportunities for value creation and product development (The Economist, 2021), digital industries and the CCI gradually converge, making the distinction between the two increasingly ephemeral and pushing for the symbiosis of creative and digital skills.
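A classification of this kind lends itself to a simple lookup structure. The sketch below is only an illustration of how the six UNESCO (2021) sub-domains and their example activities, as listed above, might be encoded for later matching against job or firm descriptions. The dictionary layout, the helper name `find_subdomains`, and the splitting of "graphic and interior design" into two entries are assumptions made for this example, not part of the UNESCO framework.

```python
# Illustrative encoding of the six UNESCO (2021) sub-domains named in the text.
# Layout and helper name are assumptions made for this sketch only.
UNESCO_2021_SUBDOMAINS = {
    "design and creative services": [
        "fashion", "graphic design", "interior design", "advertising", "architecture",
    ],
    "audiovisual and interactive media": [
        "film", "tv and radio", "streaming", "podcasts", "video games",
    ],
    "visual arts and crafts": ["crafts", "photography", "fine arts"],
    "performance and celebration": [
        "music", "dance", "festivals", "fairs", "performing arts",
    ],
    "cultural and natural heritage": [
        "historical places", "archeological sites", "cultural landscapes",
        "natural heritage", "museums",
    ],
    "books and press": [
        "books", "newspapers", "magazines", "libraries", "book fairs",
    ],
}


def find_subdomains(activity):
    """Return every sub-domain whose example activities include `activity`.

    Matching is exact (after trimming and lower-casing), so an activity that
    is not one of the listed examples returns an empty list.
    """
    needle = activity.strip().lower()
    return [
        domain
        for domain, activities in UNESCO_2021_SUBDOMAINS.items()
        if needle in activities
    ]
```

Because the examples in each sub-domain are illustrative rather than exhaustive, an empty result means only that the activity is not among the listed examples, not that it falls outside the CCI.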
2.2.2 The Economic Contribution and Value of CCI
From 2002 to 2015, the creative industries accounted for 3% of global GDP and created nearly 30 million jobs, while the global value of creative goods more than doubled from US$208 billion to US$509 billion (The Economist, 2021). Cultural and creative industries also outperformed other sectors in terms of youth employment (15–29-year-olds) and are expected to account for 10% of global GDP in the near future (UNESCO, 2018). In 2016, the number of cultural firms amounted to more than 1.2 million among the 28 European Member States (EU-28), generating a total value added of 192 billion €, accounting for nearly 3% of the total value of the non-financial business economy (EUROSTAT, 2019, p. 83). In the same year, Italy and France were reported as having the highest number of cultural organizations, which amounted to respectively 14.5% and 13.4% of total EU-28 cultural organizations (EUROSTAT, 2019, p. 83). Between 2011 and 2016, the growth in the number of cultural enterprises was positive, as the annual average rate of change was 2% in the EU-28 (EUROSTAT, 2019, p. 86). Yet significant discrepancies were reported among countries, with Lithuania reporting an annual change rate of 11.6%, and Greece of –6.9% (EUROSTAT, 2019, p. 86). In 2017, there were 1.1 million creative and cultural enterprises among 27 European Member States (EU-27), producing a total value added of 145 billion (EC, 2021, p. 5). Yet the pandemic radically changed not only the structure but also the statistics related to work and production in the CCI, whose gross value added (GVA) contracted by US$750 billion in 2020 relative to 2019 (UNESCO, 2021) and whose contribution to global GDP dropped by 21% (UNESCO, 2021, p. 23). Average revenue loss fluctuated between 20 and 40% across countries, whereas urban centers with high CCI concentration reported the biggest drops in the economic contribution of the CCI.
If the CCI were also largely damaged by the pandemic, with performing arts and music experiencing a revenue loss of 90% and 76% respectively in Europe, they were one of the fastest-growing sectors globally in 2021 (VVA et al., 2021) through. Digital innovation proved itself as crucial for thriving and surviving in challenging times, thanks to software enabling fast communication and collaboration, music and video production, as well as new digital assets such as blockchain and non-functional token (NFTs) and digital publishing, and new available platforms (e.g., metaverse, virtual reality, and augmented reality, streaming), to distribute, sell, and design creative and cultural goods. In 2020, the streaming sector reported US$62 billion in total revenues, with an increase of 31% with respect to the previous year (The Economist, 2021), and 1.1 billion online subscribers, accounting for a 26% annual increase (The Economist, 2021). Furthermore, the creative economy has also boosted the “creator” economy, thanks to the increasing number and availability of digital tools and social media platforms enabling people with the possibility to easily create, produce, and disseminate their own contents with (almost) no costs involved (The Economist, 2021). If this process had been undergoing well before the pandemic, it was exponentially accelerated by it. This significantly enhanced the overlapping and cooperation between
digital and creative industries, blurring the boundaries between the two and between creative subsectors. Multidisciplinary skills that span the two are therefore most likely to become a source of future resilience, innovation, and competitive advantage for individuals, firms, and national economies. In terms of sectors, in 2015, design was the product category accounting for the largest share of creative goods exported globally, amounting to 62% of world exports, while visual arts and new media had the second and third largest shares, at 11% and 8%, respectively (UNCTAD, 2018). In 2016, architecture, design, and photography accounted for 51.7% of cultural enterprises in the EU-28, constituting the largest share. Motion picture and television, music, and renting of videotapes ranked second, and printing, reproduction of recorded media, instrument manufacturing, and jewelry ranked third, with shares of 12.6% and 12.4%, respectively. In 2017, the CCI reported a value added of 413 billion €, which accounted for a 5.5% contribution to the European economy, with the audio-visual and media (AVM) subsector as the main driver (EIF & KEA, 2021). In 2019, similar results were found in terms of value added at factor cost, with architecture, design, and photography accounting for 22.3% of total value added for the cultural sector, and motion picture and television, music, and renting of videotapes for 18.2% (EUROSTAT, 2019, p. 88). However, publishing reported the second largest share (20.5%) in 2016 in terms of value added at factor cost (EUROSTAT, 2019, p. 88). In 2020, during the pandemic, the subsectors that could not properly function with remote working or without physical audiences or visitors, such as performance and celebration and cultural and natural heritage, experienced the most severe economic losses (UNESCO, 2021, p. 25), which, in turn, radically affected the tourism and hospitality sectors.
2.2.3 Employment in the CCI Worldwide
At the beginning of 2021, 51.2 million people reported on LinkedIn that they worked in the CCI, counting all types of contracts (full-time, part-time, internship; UNESCO, 2021, p. 4). Considering that LinkedIn’s user base represents nearly 20% of the total global workforce, the impact of the CCI on employment and on the global economy is more remarkable than ever (UNESCO, 2021). In 2018, 8.7 million people worked in culture-related industries in the EU-28, accounting for almost 4% of overall European employment (EUROSTAT, 2019), an 8% increase with respect to 2013. In 2019, the CCI accounted for 3.7% of all EU-27 employment, with 7.4 million people employed in the sector (EC, 2021). Furthermore, 60% of cultural employees reported having tertiary-level education. In terms of contracts, self-employment accounted for 32% of cultural employment (OECD, 2021) and full-time employment for roughly 76% on average, although with significant discrepancies among EU Member States (EUROSTAT, 2019). Evidence suggests that the CCI tend to concentrate spatially in large urban areas in terms of both firms and employees. In a recent paper, Gutierrez-Posada et al.
(2022) measure the spillover effect of the CCI in UK urban areas. Using historical instruments to identify the causal effects of creative activity on non-creative firms and employment, they find robust, positive employment impacts of creative industries on urban local services. According to their results, between 1998 and 2018 in the UK each creative job generated at least 1.96 non-tradable jobs, although job multipliers declined substantially after the 2007 financial crisis. The CCI were even more radically disrupted by the COVID-19 pandemic and related policies, which, in turn, severely affected employment. Indeed, ten million jobs were reported lost globally in 2020, roughly 20% of the total CCI workforce (UNESCO, 2021, p. 24), with the self-employed suffering the highest income losses and unemployment levels. This adds to the already mentioned US$750 billion contraction in the GVA of the CCI (UNESCO, 2021), and these statistics relate only to direct effects. In Europe (EU-27 and UK), the total turnover of the sector declined by roughly 200 billion €, a fall of more than 30% (OECD, 2021). As for educational attainment, in the EU-27 59% of the CCI workforce reported having tertiary-level education, almost double the share in the total workforce (34%; OECD, 2021). In terms of skills, competences related to AR and VR were the most relevant skillset, held by 22% of professionals employed in the creative industries in the EU-27 in 2020, followed by Cloud and AI technologies (EC, 2021, p. 19).
2.3 The Creative and Cultural Industries in Italy
2.3.1 Main Subsectors and Contribution to the National Economy
In 2015, the total economic value of the Italian CCI was 47.9 billion €, with an annual growth of 2.4% in direct economic value (EY, 2016). In 2016, there were 178,907 cultural enterprises, accounting for almost 5% of the non-financial business economy in terms of number of enterprises and 2.3% in terms of value added. However, the average annual rate of change in the number of cultural enterprises between 2011 and 2016 was negative in Italy (–0.8%), in contrast with the 2% average growth reported by the EU-28 States. During the pandemic, the GVA of the CCI in Italy fell by 30% year on year, far more than the loss of nearly 12% in national GDP in 2020 (UNESCO, 2021). Although the total wealth produced shrank by 8.1% if the entire creative and cultural production system is considered (Symbola & Unioncamere, 2021, p. 66), the contribution of the CCI to the whole national economy remained stable with respect to 2019 at 5.7% (p. 72), with a total value added of 84.6 billion € in 2020 (p. 66).
As for sectors, audiovisual, visual arts, and advertising reported the highest economic value in 2015 (14 billion €, 11.9 billion €, and 7.4 billion €, respectively; EY, 2016), whereas music, video games, and radio registered the highest growth rates in economic value with respect to 2014 (10%, 9.5%, and 9.3%, respectively; EY, 2016). In 2016, businesses related to architecture, design, and photography accounted for the largest share of cultural enterprises (almost 60%), while printing and reproduction of recorded media, music instrument manufacturing, and jewelry accounted for the highest share of value added (EUROSTAT, 2019, pp. 87–88). Although Italy was the EU-28 State with the highest share of cultural enterprises (14.5%), it ranked fourth in terms of the value created by cultural sectors, with a share of 8.4% (EUROSTAT, 2019, p. 83). In terms of economic output, Italy was the third largest exporter of creative goods among developed countries in 2015 (UNCTAD, 2018). While digital cultural goods were the biggest source of revenues in 2020, COVID-19 sharply accelerated digitization and the demand for digital content and goods, with great gains for the streaming industry (OECD, 2021). Conversely, core cultural subsectors were the major drivers of the decline of the CCI during 2020, with performing and visual arts and historical and artistic heritage contracting by 26.3% and 19%, respectively (Symbola & Unioncamere, 2021, p. 79). Table 2.1 reports the total number of active enterprises and the annual average number of workers employed in active enterprises in the CCI in Italy in 2019 according to the Istituto Nazionale di Statistica (Istat, 2022a).
To identify CCI subsectors, we used ATECO 2007, the alphanumeric classification of economic activities adopted in Italy and defined by Istat, because it is based on NACE, the European classification of economic activities. Under a broader definition of the CCI, as in DCMS (2019), subsectors partially overlap, encompassing and sometimes merging the digital sector, the creative industries, and the cultural sector (i.e., publishing, software publishing, computer programming, computer consultancy activities, computer games, telecommunications, film, TV, music, radio, heritage, music and video recordings, music instruments’ manufacturing, recorded media, arts, and museums) in addition to tourism, sports, and gambling. If all these subsectors are considered, the number of people employed in active enterprises in Italy in 2019 would reach approximately 1.56 million. Yet, for the purposes of our research, we opted for a narrower definition of the CCI, considering only subsectors that specifically refer to the digital, creative, and cultural domains and thus excluding sports and gambling. Therefore, only the related ATECO codes were considered in our research. The last two columns of the table show the relative contribution of each subsector to the total in terms of active enterprises and people employed.
Table 2.1 Number of active enterprises and related people employed in the CCI in Italy in 2019 by ATECO 2007
ATECO 2007 | Total number of active enterprises | Total number of persons employed in active enterprises | Relative share of active enterprises | Relative share of persons employed in active enterprises
58—Publishing activities | 4954 | 32,614 | 0.92% | 2.73%
59—Motion picture, video and television programme production, sound recording, and music publishing activities | 7936 | 27,599 | 1.47% | 2.31%
60—Programming and broadcasting activities | 1463 | 14,053 | 0.27% | 1.18%
62—Computer programming, consultancy, and related activities | 52,059 | 306,120 | 9.65% | 25.61%
70—Activities of head offices, management consultancy activities | 70,942 | 181,621 | 13.15% | 15.19%
71—Architectural and engineering activities, technical testing, and analysis | 195,341 | 285,324 | 36.20% | 23.87%
73—Advertising and market research | 22,027 | 71,847 | 4.08% | 6.01%
74—Other professional, scientific, and technical activities | 148,935 | 216,068 | 27.60% | 18.07%
85.52—Cultural education | 3144 | 4685 | 0.58% | 0.39%
90—Creative, arts, and entertainment activities | 31,752 | 43,760 | 5.88% | 3.66%
91—Libraries, archives, museums, and other cultural activities | 1022 | 11,797 | 0.19% | 0.99%
Total CCI | 539,575 | 1,195,488 | |
Based on Istat data (ISTAT, 2022a)
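The relative shares in the last two columns of Table 2.1 are each subsector's count divided by the corresponding column total. As a minimal sketch of that arithmetic (an illustration, not the authors' own procedure), the Python snippet below recomputes the shares for three rows of the table. The dictionary layout and the function name `relative_shares` are assumptions made for this example; the figures are those reported in the table.

```python
# Counts taken from Table 2.1 (Istat data for 2019); only three rows are shown.
# The data layout and the function name are illustrative assumptions.
TOTAL_ENTERPRISES = 539_575
TOTAL_PERSONS = 1_195_488

SUBSECTORS = {
    "58": {"enterprises": 4_954, "persons": 32_614},
    "62": {"enterprises": 52_059, "persons": 306_120},
    "71": {"enterprises": 195_341, "persons": 285_324},
}


def relative_shares(rows, total_enterprises, total_persons):
    """Return each subsector's percentage share of enterprises and persons employed."""
    return {
        code: {
            "enterprises_pct": round(100 * row["enterprises"] / total_enterprises, 2),
            "persons_pct": round(100 * row["persons"] / total_persons, 2),
        }
        for code, row in rows.items()
    }


shares = relative_shares(SUBSECTORS, TOTAL_ENTERPRISES, TOTAL_PERSONS)
for code, share in shares.items():
    print(code, share)
```

Run on these three rows, the sketch reproduces the shares printed in the table: for example, ATECO 62 accounts for under 10% of active enterprises but more than a quarter of persons employed, whereas ATECO 71 shows the opposite pattern.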
2.3.2 Creative and Cultural Employment in Italy
In 2015, the CCI employed more than 1 million people in Italy, nearly 4% of the total Italian workforce, up 1.7% on the previous year (EY, 2016). The share of cultural employment in total employment increased steadily from 3.5% in 2013 to 3.6% in 2018, as the total number of workers grew from 784,000 to 831,000 over the same period (EUROSTAT, 2019). If the entire production system is considered, total CCI employment reached 1.55 million in 2018 (Symbola & Unioncamere, 2019, p. 42). While CCI
employment dropped by 3.5% on average due to the pandemic, significantly outpacing the average Italian employment decline of 2.1%, the industry still contributed greatly to the national economy, employing 1.5 million workers, that is, 5.9% of total national employment in 2020 (Symbola & Unioncamere, 2021, pp. 66–67). As for contracts, self-employment accounted for 46% of cultural employment in Italy (OECD, 2020), significantly higher than the EU-28 average of 33% in 2018 (EUROSTAT, 2019, p. 70). In 2019, the share of dependent workers in the CCI was 67.1%, significantly lower than the 78.1% share in the total economy (Symbola & Unioncamere, 2021, p. 88). In terms of socio-demographic profiles, the share of female workers in cultural employment in Italy was one of the lowest among all 28 EU Member States in 2018, and only about 50% of people in cultural employment reported having tertiary-level education (EUROSTAT, 2019). In 2019, women accounted for 37.4% of people employed in the CCI, as against 42.1% of total employees in Italy, while almost 43% of total CCI employees had tertiary education (Symbola & Unioncamere, 2021, p. 84). Furthermore, most workers in the Italian CCI are between 35 and 54 years old, although employees aged 25–34 account for 20% of total CCI employment, compared with only 17.6% of total employment (Symbola & Unioncamere, 2021, p. 80). Disruptive changes may also be observed in the subsectors from 2015 to 2020. In 2015, visual arts, audiovisual, and performing arts reported the highest employment in terms of the total number of people employed (250,200, 180,500, and 168,900, respectively), whereas video games, music, and performing arts recorded the highest percentage growth in employment with respect to 2014, at 7.8%, 6.1%, and 5%, respectively (EY, 2016).
In 2020, editorial and publishing, video games and software, and architecture and design were the three subsectors with the highest employment levels, employing nearly 13.5%, 11.6%, and 10.2% of the entire CCI workforce and roughly 0.8%, 0.7%, and 0.6% of the total workforce, respectively (Symbola & Unioncamere, 2021, pp. 75–76). By contrast, the performing and visual arts and the historical and artistic heritage subsectors reported the largest job losses in 2020, with reductions in employment of 11.9% (against an increase of 4.4% in 2019) and 11.2% (against an increase of 7.5% in 2019), respectively (Symbola & Unioncamere, 2021, p. 79). Table 2.2 reports Istat data (ISTAT, 2022b), summarizing the annual average employment levels in active creative and cultural organizations in Italy in 2017 according to CCI-related ATECO 2007 subsectors and type of contract. While national employment in the CCI would reach nearly 1.6 million under a broader definition of the CCI, as in DCMS (2019), it amounts to 1.2 million under our narrower definition if all types of contracts are considered.
Table 2.2 Human resources employed in CCI in Italy in 2017 by ATECO 2007 and contract type
ATECO 2007 | Self-employed | Employed | Outworkers | Temporary workers | Total employment (regardless of contract) | Relative share out of total employment
58—Publishing activities | 3544 | 29,435 | 1659 | 184 | 34,822 | 2.86%
59—Motion picture, video and television programme production, sound recording, and music publishing activities | 6322 | 21,129 | 1035 | 121 | 28,607 | 2.35%
60—Programming and broadcasting activities | 839 | 13,138 | 323 | 900 | 15,200 | 1.25%
62—Computer programming, consultancy, and related activities | 46,673 | 240,458 | 5615 | 2660 | 295,406 | 24.24%
70—Activities of head offices, management consultancy activities | 54,496 | 106,407 | 6070 | 2782 | 169,755 | 13.93%
71—Architectural and engineering activities, technical testing, and analysis | 211,173 | 81,911 | 2300 | 2208 | 297,592 | 24.42%
73—Advertising and market research | 19,410 | 52,535 | 4694 | 265 | 76,904 | 6.31%
74—Other professional, scientific, and technical activities | 138,254 | 67,426 | 3018 | 458 | 209,156 | 17.17%
85.52—Cultural education | 31,249 | 35,297 | 6287 | 277 | 37,813 | 3.10%
90—Creative, arts, and entertainment activities | 28,930 | 11,892 | 865 | 61 | 41,748 | 3.43%
91—Libraries, archives, museums, and other cultural activities | 721 | 10,494 | 227 | 57 | 11,499 | 0.94%
Total CCI | 541,611 | 634,825 | 32,093 | 9973 | 1,218,502 |
Based on Istat data (ISTAT, 2022b)
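The contract-type breakdown in Table 2.2 also allows the weight of self-employment to be derived for any subsector. The sketch below is an illustration rather than part of the original analysis: it computes the share of self-employed workers in total employment for three rows and for the CCI total, using the figures reported in the table. The variable names and the function `self_employment_share` are assumptions made for this example.

```python
# Self-employed and total employment figures taken from Table 2.2 (Istat data for 2017).
# Only three subsectors and the CCI total are shown; names are illustrative assumptions.
EMPLOYMENT_2017 = {
    "58": {"self_employed": 3_544, "total": 34_822},
    "62": {"self_employed": 46_673, "total": 295_406},
    "71": {"self_employed": 211_173, "total": 297_592},
    "Total CCI": {"self_employed": 541_611, "total": 1_218_502},
}


def self_employment_share(row):
    """Percentage of total employment (all contract types) that is self-employed."""
    return round(100 * row["self_employed"] / row["total"], 2)


self_employed_pct = {
    code: self_employment_share(row) for code, row in EMPLOYMENT_2017.items()
}
for code, pct in self_employed_pct.items():
    print(f"{code}: {pct}% self-employed")
```

On these figures, self-employment is a small minority in publishing (ATECO 58) and in computer programming (ATECO 62) but the dominant contract type in architectural and engineering activities (ATECO 71), and it makes up roughly 44% of employment in the CCI total as defined here.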
2.4 The Need for a New Integrated Approach to KSC Development in the CCI in Italy
The literature and data review clearly highlighted the crucial role of creativity and creative KSC in promoting, fostering, and successfully implementing innovation at the social, economic, and organizational levels, with important implications for education and learning, human resource management, organizational innovation, and (inter-)national policymaking. In particular, the synergy between creative and digital skills emerged as crucial. As technologies like virtual reality (VR), blockchain, and artificial intelligence (AI) have emerged in multiple subsectors of the CCI (EC, 2021), enabling engaging online content distribution and access, enhancing immersive content creation and experience co-creation, and improving the possibilities for content protection and online payment, the digital and the cultural and creative industries are progressively converging. Therefore, the synergy between the digital and creative domains, through digital creative skills, is expected to be the main source of value creation and driver of innovation (Lazzaretti, 2020). This raises the question of how to train the current and future workforce effectively, so that it is equipped with the cross-disciplinary skills needed to thrive in the digital transformation. If this was pressing before, it has become imperative now due to the impact of COVID-19 on national economies, which negatively affected the employment and output levels of many industries. In particular, the CCI were severely disrupted both globally and in Italy by social distancing policies, which further accelerated the already ongoing digitization of the sector but also created an urgent demand for new KSC and innovative solutions for it to recover and adapt. Digital and creative skills and their synergy are expected to become critical assets for the CCI to thrive again through the adoption of novel business models that may meet new customer demands and habits as well as advanced technology that may boost creative potential.
This will change not only intra- and cross-industry competition and boundaries but also the working requirements and KSC needed in the sector. Since little research has been done to identify current labor demand and supply in terms of digital and creative KSC and their synergy in the CCI in Italy, our analysis aims to fill this gap. It highlights mismatches between demand and supply and the main reasons behind them, in order to understand which digital and creative KSC most need to be developed and implemented in the Italian CCI in the near future and to draw managerial implications in terms of new digital creative skills that may hinder or boost innovation potential.
References
Amabile, T. M. (1996). Creativity and innovation in organizations (Vol. 5). Harvard Business School.
Amabile, T. M. (2012). Componential theory of creativity (Working Paper 12-096). Harvard Business School.
Ananiadou, K., & Claro, M. (2009). 21st century skills and competences for new millennium learners in OECD countries (OECD Education Working Papers, 41). OECD. https://doi.org/10.1787/218525261154
Bakhshi, H., Djumalieva, J., & Easton, E. (2019). The creative digital skills revolution. October 2019. Creative Industries Policies and Evidence Centre led by NESTA. PEC Policy Unit.
Bakhshi, H., & Throsby, D. (2010). Culture of innovation: An economic analysis of innovation in arts and cultural organisations. NESTA, Research report: June 2010.
Bogers, M., Foss, N. J., & Lyngsie, J. (2018). The “human side” of open innovation: The role of employee diversity in firm-level openness. Research Policy, 47, 218–231. https://doi.org/10.1016/j.respol.2017.10.012
Bourdieu, P. (1986). The forms of capital. In J. G. Richardson (Ed.), Handbook of theory and research for the sociology of education (pp. 241–258). Greenwood Press.
Bruno, C., & Canina, M. (2019). Creativity 4.0. Empowering creative process for digitally enhanced people. The Design Journal, 22(sup1), 2119–2131. https://doi.org/10.1080/14606925.2019.1594935
Carretero, S., Vuorikari, R., & Punie, Y. (2018). DigComp 2.1: the digital competence framework for citizens with eight proficiency levels and examples of use. European Commission, Joint Research Centre, Publications Office. https://data.europa.eu/doi/10.2760/836968
Chung, S., Lee, K. Y., & Choi, J. (2015). Exploring digital creativity in the workspace: The role of enterprise mobile applications on perceived job performance and creativity. Computers in Human Behavior, 42, 93–109. https://doi.org/10.1016/j.chb.2014.03.055
Comunian, R. (2011). Rethinking the creative city: The role of complexity, networks and interactions in the urban creative economy. Urban Studies, 48(6), 1157–1179. https://doi.org/10.1177/0042098010370626
Currid-Halkett, E., & Stolarick, K. (2013). Baptism by fire: did the creative class generate economic growth during the crisis? Cambridge Journal of Regions, Economy and Society, 6(1), 55–69. https://doi.org/10.1093/cjres/rss021
Department of Digital, Culture, Media and Sport (DCMS). (2019). Economic estimates of DCMS Sectors. Crown. https://www.gov.uk/government/collections/dcms-sectors-economic-estimates
Ernst & Young (EY). (2016). Italia Creativa. L’Italia che crea, crea valore. 2° Studio sull’Industria della Cultura e della Creatività. http://www.italiacreativa.eu/
European Commission (EC). (2010). Green paper—Unlocking the potential of cultural and creative industries. https://op.europa.eu/s/vKy4
European Commission (EC), Executive Agency for Small and Medium-sized Enterprises, Roche, C., & Izsak, K. (2021). Advanced technologies for industry: sectoral watch: technological trends in creative industries. Publications Office. https://data.europa.eu/doi/10.2826/444418
European Investment Fund (EIF) & KEA European Affairs. (2021). Market analysis of the cultural and creative sectors in Europe. https://keanet.eu/wp-content/uploads/ccs-market-analysiseurope-012021_EIF-KEA.pdf
European Statistical System Network on Culture (ESSnet-Culture). (2012). Final Report 2012. Published May 22, 2014. https://ec.europa.eu/eurostat/cros/content/essnet-culture-final-report_en
Eurostat. (2019). Culture Statistics 2019. Publications Office of the European Union, Published online 21 October 2019, https://ec.europa.eu/eurostat/web/products-statistical-books/-/ks-01-19-712. https://doi.org/10.2785/118217
Florida, R. (2002). The economic geography of talent. Annals of the Association of American Geographers, 92(4), 743–755.
Florida, R., Mellander, C., & Stolarick, K. (2008). Inside the black box of regional development—Human capital, the creative class and tolerance. Journal of Economic Geography, 8(5), 615–649. http://www.jstor.org/stable/26161285
Gutierrez-Posada, D., Kitsos, T., Nathan, M., & Nuccio, M. (2022). Creative clusters and creative multipliers: evidence from UK cities. Economic Geography, 1–24.
Hansen, K. H., Vang-Lauridsen, J., & Asheim, B. (2005). The creative class and regional growth—Towards a knowledge based approach (CIRCLE Working Paper WP 2005/15).
Helsper, E. J., & Eynon, R. (2013). Distinct skill pathways to digital engagement. European Journal of Communication, 28(6), 696–713. https://doi.org/10.1177/0267323113499113
Hunter, S. T., Cushenbery, L., & Friedrich, T. (2012). Hiring an innovative workforce: A necessary yet uniquely challenging endeavor. Human Resource Management Review, 22(4), 303–322. https://doi.org/10.1016/j.hrmr.2012.01.001
Innocenti, N., & Lazzeretti, L. (2019). Do the creative industries support growth and innovation in the wider economy? Industry relatedness and employment growth in Italy. Industry and Innovation, 26(10), 1152–1173. https://doi.org/10.1080/13662716.2018.1561360
ISTAT. (2022a). Enterprises and persons employed: Size class of persons employed, economic activity (Nace 2 digit) - prov. Published online. http://dati.istat.it/Index.aspx?lang=en&SubSessionId=ae69fadb-ecad-496e-8cec-66a58be85eeb#
ISTAT. (2022b). Enterprises: human Resources. Published online. http://dati.istat.it/Index.aspx?lang=en&SubSessionId=ae69fadb-ecad-496e-8cec-66a58be85eeb#
KEA European Affairs. (2006). The economy of culture in Europe. Study prepared for the European Commission (Directorate-General for Education and Culture). https://ec.europa.eu/assets/eac/culture/library/studies/cultural-economy_en.pdf
Lazzaretti, L. (2020). What is the role of culture facing the digital revolution challenge? Some reflections for a research agenda. European Planning Studies, 30(9), 1617–1637. https://doi.org/10.1080/09654313.2020.1836133
Mellander, C., & Florida, R. (2006). The creative class or human capital. Explaining regional development in Sweden.
Mietzner, D., & Kamprath, M. (2013). A competence portfolio for professionals in the creative industries. Creativity and Innovation Management, 22(3), 280–294. https://doi.org/10.1111/caim.12026
Miller, R., Michalski, W., & Stevens, B. (1998). The promises and perils of 21st century technology: an overview of the issues. In 21st century technologies: promises and perils of a dynamic future (pp. 7–32). Organization for Economic Co-operation and Development (OECD).
NESTA. (2006). Creating growth. How the UK can develop world class creative businesses. NESTA Research Report. Published April 1, 2006. https://www.nesta.org.uk/report/creatinggrowth/
NESTA. (2018, March). Experimental culture: A horizon scan commissioned by Arts Council England. Arts Council England.
OECD. (2020). Culture shock: Covid-19 and the cultural and creative sectors. Tackling Coronavirus (COVID-19) contributing to a global effort. OECD Policy Responses to Coronavirus (COVID-19). Published online 7th September 2020. https://www.oecd.org/coronavirus/policyresponses/culture-shock-covid-19-and-the-cultural-and-creative-sectors-08da9e0e/
OECD. (2021). Economic and social impact of cultural and creative sectors. Note for Italy G20 Presidency Culture Working Paper. https://www.oecd.org/cfe/leed/OECD-G20-CultureJuly-2021.pdf
Papadopoulou, A., Kaimara, P., Poulimenou, S. M., & Deliyannis, I. (2018). Art didactics and creative technologies: Digital culture and new forms of students’ activation. In Digital culture & audiovisual challenges interdisciplinary creativity in arts and technology, DCAC international conference. Ionian University.
Poce, A., Amenduni, F., & De Medio, C. (2020). Guidelines for digital competences for creative industries.
Roberts, E., & Townsend, L. (2016). The contribution of the creative economy to the resilience of rural communities: exploring cultural and digital capital. Sociologia Ruralis, 56(2), 197–219. https://doi.org/10.1111/soru.12075
Rodríguez-Pose, A., & Lee, N. (2020). Hipsters vs. geeks? Creative workers, STEM and innovation in US cities. Cities, 100, 102653. https://doi.org/10.1016/j.cities.2020.102653
Seltzer, K., & Bentley, T. (1999). The creative age: Knowledge and skills for the new economy. Demos.
Stolarick, K., & Currid-Halkett, E. (2013). Creativity and the crisis: The impact of creative workers on regional unemployment. Cities, 33, 5–14. https://doi.org/10.1016/j.cities.2012.05.017
Stolarick, K., Lobo, J., & Strumsky, D. (2011). Are creative metropolitan areas also entrepreneurial? Regional Science Policy & Practice, 3(3), 271–286. https://doi.org/10.1111/j.1757-7802.2011.01041.x
Symbola & Unioncamere. (2019). Io sono cultura 2019. L’Italia della qualità e della bellezza sfida la crisi. I quaderni di Symbola. ISBN 9788899265519. https://www.symbola.net/ricerca/iosono-cultura-2019/
Symbola & Unioncamere. (2021). Io sono cultura 2021. L’Italia della qualità e della bellezza sfida la crisi. I quaderni di Symbola. Copygraph sas, Roma, ISBN 9788899265663. https://www.symbola.net/ricerca/io-sono-cultura-2021/
The Economist. (2021). Creative industries. Trade challenges and opportunities post pandemic. Great Britain and Northern Ireland. The Economist Intelligence Unit. https://impact.economist.com/perspectives/sites/default/files/eiu_dit_creative_industries_2021.pdf
The Work Foundation. (2007). Staying ahead: The economic performance of the UK’s creative industries. Department for Culture, Media, and Sport. Crown. United Kingdom. https://static.an.co.uk/wp-content/uploads/2013/11/4175593.pdf
Throsby, D. (2008). The concentric circles model of the cultural industries. Cultural Trends, 17(3), 147–164. https://doi.org/10.1080/09548960802361951
UNCTAD. (2008). Creative economy report 2008. The challenge of assessing the creative economy: towards informed policy-making. United Nations. Published online April 19, 2008. https://unctad.org/webflyer/creative-economy-report-2008-challenge-assessing-creative-economytowards-informed-policy
UNCTAD. (2018). Creative Economy Outlook. Trends in international trade in creative industries 2002-2015. Country profiles 2005-2014. United Nations. https://unctad.org/webflyer/creativeeconomy-outlook-trends-international-trade-creative-industries
UNESCO. (2018). Re | shaping cultural policies. Advancing creativity for development. https://en.unesco.org/creativity/global-report-2018
UNESCO, Naylor, R., Todd, J., Moretto, M., & Traverso, R. (2021). Cultural and creative industries in the face of COVID-19: an economic impact outlook. Published online. https://unesdoc.unesco.org/ark:/48223/pf0000377863
Valentino, P. A. (2013). L’impresa culturale e creativa: verso una definizione condivisa. Economia della cultura, 23(3), 273–288.
van Laar, E., Van Deursen, A. J., Van Dijk, J. A., & De Haan, J. (2020). Measuring the levels of 21st-century digital skills among professionals working within the creative industries: A performance-based approach. Poetics, 81, 101434. https://doi.org/10.1016/j.poetic.2020.101434
VVA, Hausemer, P., Richer, C., Klebba, M., & Amann, S. (2021). Creative FLIP final report work package 2: Learning. Skills needs and gaps in the CCSI. Brussels. http://creativeflip.creativehubs.net/wp-content/uploads/2021/07/FINAL-WP2_Final-Report-on-Skillsmismatch-2.pdf
Wilson, N. (2010). Social creativity: re-qualifying the creative economy. International Journal of Cultural Policy, 16(3), 367–381. https://doi.org/10.1080/10286630903111621
World Economic Forum (WEF). (2020). Schools of the future: Defining new models of education for the fourth industrial revolution. Reports. Published online January 14, 2020. https://www.weforum.org/reports/schools-of-the-future-defining-new-models-of-education-for-the-fourthindustrial-revolution
World Intellectual Property Organization (WIPO). (2015). Guide on surveying the economic contribution of the copyright industry. 2015 revised edition.
Chapter 3
Methodology and Empirical Strategy
Keywords Natural language processing · Curriculum vitae analysis · Resume analysis · Text mining · Clustering · Word2Vec · Skill extraction · ESCO taxonomy · ATECO
3.1 CV (or Resume) Analysis
3.1.1 The Study of KSC Supply and Demand in the CCI: A Brief Literature Review
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Nuccio, S. Mogno, Mapping Digital Skills in Cultural and Creative Industries in Italy, Contributions to Management Science, https://doi.org/10.1007/978-3-031-26867-0_3
Existing research on KSC supply and demand takes two main approaches, relying either on statistical data to infer skill demand and supply in an industry or on qualitative methods such as interviews. Furthermore, as far as the CCI are concerned, it primarily investigates the gaps between the competences acquired through education and those actually needed at work, studying the connection between graduation and future work in the creative sector with an emphasis on career development and education. Bridgstock (2011) analyzes the skills required from graduates in the creative industries in Australia through self-reported surveys, while Ball et al. (2010) propose a longitudinal study of the career patterns and progression of UK graduates between 2008 and 2010. Barroso et al. (2021) use a qualitative methodology to analyze creative skills in advertising, interviewing creatives about the skills they had developed during education and those they used at work. van Laar et al. (2020) opt for semi-structured interviews to identify which twenty-first-century skills are most in demand in the creative industry, whereas van Laar et al. (2022) adopt the same methodology to infer top managers’ perspectives on twenty-first-century digital skill development in the same industry. Royle and Laing (2014) also apply interview-based research to the study of digital marketing skill gaps in the communication industries. Although interview-based research is widely used, both Talja (2005) and van Deursen et al. (2016) point out the limitations of research based on self-reports in the study of computer-based skills: because self-reports are interpretative, they can be biased by the under- or overstatement of skills. Thus, van Deursen et al. (2014) combine multiple
interview stages to prevent biases, conceptualizing digital skills and identifying potential shortages in the UK and the Netherlands through a multi-stage study involving cognitive interviews and online surveys. Existing research on skill demand relies on semi-structured interviews with managers and managing directors while studies on skill supply consisted of surveys, interviews, or CV analysis of the workforce. Our analysis explores skill shortages between labor demand and supply in the industry through CV analysis. A CV is a document summarizing and reporting education, work experience, skills and competences, achievements, and accomplishments of a job seeker. In exploring the developments and evolution of CV studies, Hanemann and Kanninen (1996) define CV responses as binary and discrete, so that statistical methods could be applied to CV analysis, claiming that supplementary parameters could bring additional richness though. Ben Abdessalem and Amdouni (2011) add that CV analysis may also result in the identification of patterns defining a particular profile, emphasizing the dichotomy nature of this document which comprises both semantic and syntactic features. Furthermore, Werz et al. (2019) state that this also allows researchers to detect and depict specific unique career patterns. Indeed, while studying scientists’ and engineers’ career changes, Dietz et al. (2000) use CVs as the foundation for career analysis for its richness in longitudinal data: by reporting a person’s entire career, a CV provides a detailed historical account of a person’s evolution in skills, interests, and jobs while outlining the job applicant’s potential career trajectories. 
For instance, Dietz and Bozeman (2005) apply CV analysis to study engineers’ and scientists’ careers in research, highlighting novelties in their career paths with respect to the past and possible implications for government investment policies, while Dietz (2004) uses this methodology to evaluate the impact of job changes on productivity. Cañibano et al. (2008) study mobility patterns of Spanish researchers through CV analysis. Cañibano and Bozeman (2009) consider this type of research particularly useful for identifying career trajectories, studying researchers’ mobility, and mapping collective capacity. Indeed, CV analysis allows scholars to appreciate collective processes (Woolley & Turpin, 2009) such as the geographical distribution of specific skills and workforce in a given area. Yet, self-assessment methods are the most used in measuring these skills (van Deursen et al., 2014) and, like interviews, CV analysis relies on the self-assessment and self-reporting of the skills, competences, and knowledge possessed, which may result in understating or overrating skills and competences (Hargittai, 2005; Merritt et al., 2005) or even in misrepresentation (Phillips et al., 2019). Some scholars consider this type of analysis a complementary methodology that can support other primary research by bringing a collective perspective on the issue. Woolley and Turpin (2009) conceive it as a useful complementary methodology to filter survey data, whereas Sandström (2009) shows the advantages of combining advanced bibliometrics and curriculum vitae cluster analysis to analyze researchers’ mobility and performance.
By contrast, CV analysis is widely used in hiring processes within organizations and in more automated CV recommendation and resume ranking procedures (Kelkar et al., 2020). Alanoca et al. (2020) demonstrate how text mining and natural language processing (NLP) techniques can support CV recommendation on the basis of term frequency and relevance, with Kumar and Bhatia (2013) remarking on the role of NLP in text mining as a crucial input for extracting information from ambiguous and unstructured data. Faliagka et al. (2014) show how machine learning algorithms may improve and speed up semantic matching between the skills reported on LinkedIn profiles and job requirements and prerequisites.
Text mining techniques were applied in our research to extract the digital and creative skills of the workforce in the CCI. As our research aims at highlighting specific skills, the skill section was first extracted from the CVs. A CV comprises four main sections, covering personal details, academic experience, work experience, and (soft and digital) skills (Haddad & Mercier-Laurent, 2021); this last section was the main focus of our empirical research, with particular attention to creative and digital skills. As our study examines the diffusion and importance of both creative and digital skills in a specific collective (the creative and cultural workforce), with implications in terms of skill shortages, this type of analysis is appropriate for our research.
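As an illustration of how a skill section might be isolated from a plain-text CV, the following minimal sketch splits a CV on known section headings and lists the entries found under "Skills". The heading list, the separators, and the sample CV are assumptions made for this example only; they do not reproduce the procedure actually applied to our dataset.

```python
import re

# Hypothetical section headings; real CVs vary widely and need a richer list.
SECTION_HEADINGS = ("personal details", "education", "work experience", "skills")


def split_sections(cv_text):
    """Split a plain-text CV into {heading: [lines]} using known heading lines."""
    sections = {}
    current = None
    for line in cv_text.splitlines():
        key = line.strip().lower().rstrip(":")
        if key in SECTION_HEADINGS:
            current = key
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections


def extract_skills(cv_text):
    """Return the individual entries of the 'skills' section, lowercased."""
    body = split_sections(cv_text).get("skills", [])
    skills = []
    for line in body:
        # Assumption: skills are separated by commas, semicolons, or bullets.
        for item in re.split(r"[,;•]", line):
            item = item.strip(" -\t").lower()
            if item:
                skills.append(item)
    return skills


sample_cv = """Personal details
Jane Doe, Venice

Education
MA in Arts Management

Work experience
Curator, 2018-2022

Skills
Photoshop, video editing; Python
- social media marketing
"""

print(extract_skills(sample_cv))
```

A heading-based split of this kind only works when section titles are recognizable, which is one reason why the semi-structured nature of CVs makes large-scale processing demanding.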
3.1.2 CV Analysis and KSC Assessment
Jiechieu and Tsopze (2021) underscore that extracting skills is a critical step in resume classification and, therefore, a fundamental task for candidate recommender systems. Since skill extraction aims at identifying the main skills and knowledge expressed in CVs, the issue is how to extract them successfully and quickly. With escalating digitization, job posting and resume submission have become digital as well, requiring organizations to develop systems that can collect, extract, and process CV data to select suitable candidates. A resume can be defined as a text document comprising sections (Maheshwari et al., 2010) that contain self-reported information about personal details, past employment, academic background, skills, and knowledge (Ankala & Karra, 2016); such information can be compared with the requirements listed in job advertisements to identify the candidates showing the highest levels of skill matching. Hence, resumes are key tools for screening job applications on the basis of a profile (Ankala & Karra, 2016), and skill extraction is critical in HRM: it is one of the fundamental tasks that job recommender systems must perform effectively and efficiently (Jiechieu & Tsopze, 2021). Kelkar et al. (2020) underline how digitization has made the work of human resource managers increasingly tedious and time-consuming as the number of CVs to be processed has escalated. This has raised the need for recommender systems exploiting text mining and machine learning to process, rank, and compare candidates’ resumes with respect to specific parameters and job position requirements. Moreover, escalating digitization has introduced digital technology and innovation in HRM, leading to the development and implementation of AI solutions, such as human resource management systems (HRMS) and human resource information systems (HRIS), in human resource practices (Votto et al., 2021). In particular, the use of technology in HRM mostly aims at reducing the manual workload, working as an effective time-saving solution in the era of hyperconnectivity (Trinh & Dang, 2021).
However, the analysis of CVs is difficult because they are semi-structured. While some sections may be common (e.g., education, work experience, skills, and languages), there is no compulsory predefined structure for CVs, which makes the classification of large numbers of CVs time-consuming. Moreover, resumes can be textual, video-recorded, or a mix of both. While our research uses textual resumes, Waung et al. (2014) investigate the differential impact of different CV types on candidate assessment. They show that video resumes are related to a harsher evaluation of a candidate’s skills and abilities than paper resumes, although they facilitate the inference of applicants’ personality traits such as extraversion and agreeableness. As for paper-based CVs, while Cole et al. (2005) claim that they accurately help CV reviewers infer traits like conscientiousness, extraversion, and openness, Cole et al. (2009) find that this holds only for extraversion. In studying how the resume selection process can be improved, Maheshwari et al. (2010) claim that CVs are complex to process as they contain semi-formatted or unformatted text. Moreover, resumes for the same job posting may present completely different structures, formats, and file types (Ankala & Karra, 2016). Therefore, selecting the right resume with the needed skills is a very delicate task. Maheshwari et al. (2010) propose a model matching skill types with the corresponding skill values through their degree of specialness (DS). Ankala and Karra (2016) claim that skill extraction from CVs first requires segmenting the CV into blocks labeled with the type of information.
Indeed, Randazzo (2016) reframes CVs as qualitative research projects or “assignments” (p. 280) synthesizing career information through relevant and career-specific keywords. Hence, machine learning and AI may reduce the time and labor needed to process CVs. Deep learning (DL) may be advantageous for analyzing human language and deriving meaning from it in a useful way. As CVs are mostly composed of four sections, containing self-reported information about personal life, academic background, professional experience, and skills (soft and/or technical; Haddad & Mercier-Laurent, 2021), El Mohadab et al. (2020) employ a data mining algorithm to automatically process CVs to improve scientific research. Maer-Matei et al. (2019) also opt for a text mining approach to extract assessment skills required from early career researchers in Europe, although without using NLP technology, while Woolley and Turpin (2009) use CV analysis to filter survey data, as a complementary methodology between a large-scale survey and a further qualitative study of a stratified sample of domestically based participants. Jiechieu and Tsopze (2021) apply convolutional neural network (CNN) models to the study of CVs to extract skills for matchmaking purposes. Zhao et al. (2015) combine named entity recognition (NER) and named entity normalization (NEN), developing a NER model trained on a set of resumes and capable of spotting certain skills in a resume text. In studying the implications of math skills for labor market outcomes, Koedel and Tyhurst (2012) adopt a resume-based field experiment, responding to online job postings by sending fictitious resumes, some of which randomly overstate math skills, to show that stronger math skills receive more attention from recruiters. Similarly, Fahrenbach et al. (2019) evaluate LinkedIn CVs through a text mining system, showing how information and communication technology may support the assessment of professional competences through the analysis of prior learning. Therefore, we opted for an NLP-based approach to extract digital and creative KSC from our dataset of CVs.
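The skill matching between resumes and job requirements discussed in this section can be sketched, under strong simplifying assumptions, as a dictionary look-up followed by an overlap score. The vocabulary below is a hypothetical stand-in for a full taxonomy (such as ESCO, listed among the chapter keywords), and the score is merely one possible choice; neither is taken from the systems cited above.

```python
import re

# Hypothetical mini-vocabulary; a real study would rely on a full taxonomy.
SKILL_VOCABULARY = {"python", "photoshop", "video editing", "seo", "data analysis"}


def find_skills(text, vocabulary=SKILL_VOCABULARY):
    """Return the vocabulary skills occurring in the text as whole words or phrases."""
    lowered = text.lower()
    found = set()
    for skill in vocabulary:
        # Word boundaries keep 'seo' from matching inside a longer word.
        if re.search(r"\b" + re.escape(skill) + r"\b", lowered):
            found.add(skill)
    return found


def match_score(cv_text, job_text):
    """Share of the skills required by the job ad that also appear in the CV."""
    required = find_skills(job_text)
    if not required:
        return 0.0
    offered = find_skills(cv_text)
    return len(required & offered) / len(required)


cv = "Graphic designer skilled in Photoshop and video editing; basic Python."
job = "We need Photoshop, SEO and data analysis experience."
print(sorted(find_skills(cv)))
print(round(match_score(cv, job), 2))
```

Exact look-up of this kind misses synonyms and misspellings, which is part of the motivation for the NLP-based approach adopted here.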
3.2 Natural Language Processing (NLP)
3.2.1 NLP: A Definition
NLP is a set of computational techniques (Zhao et al., 2021) enabling computer tools to mimic, understand, and manipulate natural language (Vijayarani et al., 2015) through algorithms (Ly et al., 2020). While text mining applies statistical analysis to texts to derive high-quality information (Oramas et al., 2018), NLP implements text mining techniques (Vijayarani et al., 2015) in the extraction of information from a textual document through a “human-like language processing” (Zhao et al., 2021) that can be implemented by computer tools (Ly et al., 2020). As NLP can analyze and generate both written and spoken language (Meurers, 2012) through such means, it is generally considered as merging computer science (IBM) and linguistics (Khurana et al., 2022). NLP is also referred to as an artificial intelligence (AI) discipline (Liddy, 2001) or subfield (Rahmani & Kamberaj, 2021) aiming at adopting a “human-like” approach to language processing (Liddy, 2001). In particular, Marsoof et al. (2022) define NLP as an AI deep learning application enabling computers to deconstruct textual data as humans would understand it, so that it can be employed in textual content analysis. Since NLP is an AI system, the main advantages of AI, such as automation and speed in content analysis, also apply to NLP applications (Marsoof et al., 2022), as NLP can successfully automate qualitative data analysis (Crowston et al., 2012). Indeed, NLP identifies a set of both theories and technologies defining a computer-based approach to text analysis (Liddy, 2001), and it has been defined as a subset of data analytics techniques that enables computers to interpret human language by processing textual data converted into numerical form through ML algorithms (Donnelly et al., 2022). In particular, NLP extracts word frequencies from a textual document and can thus transform qualitative data (i.e., texts and words) into quantitative data (Hu et al., 2019).
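The conversion of text into word frequencies mentioned above can be illustrated with a minimal sketch based on Python's standard library. The sample sentence and the simple alphabetic word pattern are assumptions chosen for brevity, not part of the cited studies.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text, split it into alphabetic words, and count them."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words)


document = "Digital skills matter. Creative skills and digital tools matter too."
counts = word_frequencies(document)

# The counts are the quantitative representation of the qualitative text.
print(counts.most_common(3))
```

Each distinct word becomes a variable and its count a value, which is what allows statistical methods to be applied to textual material.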
However, textual documents may contain both structured and unstructured data, requiring multiple levels of linguistic analysis (Zhao et al., 2021). Markham et al. (2015) investigate the differences between structured and unstructured data on the basis of four variables: variety, volume, veracity, and velocity. While structured data is derived from known sources, being large and real-time or archival, unstructured data comes from unknown sources, with data volume not necessarily being large and requiring the triangulation of multiple sources for validity. While the former can be advantageous for operational efficiency as it continuously aggregates new data, the latter can be used in strategic decision-making as it isolates specific pieces of information (Markham et al., 2015). By partially automating content analysis (Crowston et al., 2012), NLP may decrease the research time and manual effort needed for coding text by reducing the volume of text to be processed and analyzed in qualitative analyses. Moreover, while NLP allows for the automation of content analysis and qualitative data analysis (Crowston et al., 2012, p. 523), the strength of NLP systems is that they not only isolate single words but also spot the relations and patterns among them (Wanless et al., 2022), which supports inference making (Friedman & Hripcsak, 1999). Since NLP also facilitates the collection of written data from many sources (Rajput, 2020), it is widely used for information retrieval (IR; Liddy, 1998). NLP also identifies a wide-ranging research area, encompassing multiple applications, approaches, and research topics (Crowston et al., 2012), and deals with the design of databases out of unstructured text data (Auer, 2018). Asemie et al. (2017) state that NLP enables a computer not only to understand the language of human beings but also to communicate easily with them. Since NLP automates content analysis (Crowston et al., 2012), Votto et al. (2021) conceive NLP as an ability, describing it as a machine’s capability of communicating with humans by understanding both voice and text as well as by generating appropriate responses in its interaction with human beings. Therefore, NLP encompasses a set of computational techniques for analyzing and converting human language so that it can be processed by computers in a “human-like” way (Liddy, 2001).
3.2.2 A Brief History of NLP Developments
Jones (1994) organizes the history of NLP’s introduction and development into four main phases ranging from the 1940s to the early 2000s according to main research focus and areas: machine translation, artificial intelligence, logico-grammatical style development, and the advent of massive language data. NLP was first introduced in the 1950s (Chen et al., 2017) to enable computers to automatically analyze and understand natural (i.e., “human-like”; Liddy, 2001) language in order to perform specific tasks (Joseph et al., 2016). Research on natural language started to gain prominence in the 1940s, leading to the first computer-based processing of natural language after the introduction of machine translation (MT), and mostly focusing on syntactic analysis in the 1950s (Lehnert & Ringle, 2014). Therefore, NLP research mostly dealt with MT issues (Jones, 1994). Following the first international conference on machine translation, the first journal dealing with the subject, MT (Mechanical Translation), started publication in 1954, and the first automatic translation machine was introduced in the same year, although it allowed only for a basic, rudimentary, and limited Russian-to-English text translation (Jones, 1994).
While research mostly focused on the successful development of a “dictionary-based word-for-word processing” (Jones, 1994, p. 5), linguistic differences were mainly ascribed to differences in vocabularies and word orders, with major attention given to lexical ambiguity (Liddy, 2001). Hence, NLP mostly dealt with speech recognition since the major focus of the first phase’s research was on syntax (Jones, 1994). In his well-known syntax study, Chomsky (1957) defined syntax as the study of how sentences are built in a specific language, highlighting the role of linguistic and grammatical principles and properties and considering grammar as a collection of constructional rules (Jones, 1994). In the 1960s, attention started to move toward computational linguistics (Jones, 1994), with the introduction of ELIZA (Weizenbaum, 1966), one of the first chatbots mimicking human conversation (Rahmani & Kamberaj, 2021) and able to answer questions by matching words to their literal meaning, and of DOCTOR, which promoted a keyword-based approach in NLP. During the same period, Green et al. (1961) worked on the BASEBALL program, which automatically extracted information from stored data about baseball games by deriving a query from an input question. In the late 1960s, Minsky (1968) published his studies on semantic information, so that NLP researchers started to pay attention to the positive impact of external knowledge on the interpretation of language, addressing the problem of developing a “semantically driven processing” (Jones, 1994, p. 5). While in the 1960s most approaches were “engineering-based” (Lehnert & Ringle, 2014), the semantic perspective gained increasing prominence in information processing throughout the 1970s, so that by the end of the decade scholars came to realize that language had a “communicative function,” with linguistic expressions having both a surface and an underlying meaning (Jones, 1994, p. 8).
This led to a new awareness that “common sense” was needed to analyze natural language when dealing with words or expressions that have multiple meanings or are context-sensitive (Lehnert & Ringle, 2014), as natural languages are “fuzzy”: words often have a vague or context-dependent meaning (Zimmermann, 2001). While this period’s research mainly focused on AI systems, with Woods (1978) developing the LUNAR program for parsing phrases, in the 1980s computational grammar theory received significant attention (Khurana et al., 2022), leading to the creation of SRI’s Language Engine and Discourse Representation theory. Since automated linguistic production emerged as a practical need (Stock, 2000), NLP research and applications concerned the processing of both human text and voice in a way that is easily readable by computer tools (Crowston et al., 2012), with an increasing focus on the statistical NLP approach. This led to the further development of techniques like part-of-speech (POS) tagging, sentiment analysis, natural language generation, word sense disambiguation, and speech recognition. Indeed, new research areas dealt with new tasks, such as information extraction, discourse analysis, and generation, while innovative practical tools for processing large text corpora, such as machine-readable dictionaries, parsers, and database query systems, started to be developed and implemented (Jones, 1994). By the 1990s, statistical NLP acquired increasing relevance in information extraction (Stock, 2000), given the rising influence of technology and, thus, the escalating preference for statistical language processing (Manning & Schütze, 1999). Indeed, the development of data analysis methodologies and the advent of innovative technology enabled the processing of much larger quantities of data than in the past by means of machine power and machine learning, unveiling syntactic features and deriving probabilities. In particular, three main research streams characterized the first studies of statistical NLP (Basili et al., 1996): the first focused on studying word co-occurrences and their probability, the second on calculating a word’s probability given all the words preceding it, and the third on exploiting ML in the classification of language phenomena. Furthermore, NLP research addressed the challenge of managing, automatically processing, extracting information from, and summarizing the large corpora of data that became widely available, though scattered, with the advent of the Internet (Khurana et al., 2022). New NLP research areas and techniques emerged in response, with NLP techniques also being applied to conversational speech, which led to the development of advanced speech recognition technology (Jones, 1994), like Verbmobil (Wahlster, 2000). In 1995, Richard Wallace launched ALICE (Artificial Linguistic Internet Computer Entity), a chatbot capable of conversing with human beings through a heuristic pattern based on human feedback (Rahmani & Kamberaj, 2021). While NLP-based language processors had been underperforming because of their unrealistic and unsystematic codification of lexical knowledge, the statistical approach to NLP appeared as a solution for this as well (Basili et al., 1996).
Between 1993 and 1997, the Text Retrieval Conference (TREC) Programme analyzed large files through various indexing and searching strategies, testing full-text retrieval using LMI (Jones, 1999), leading to the development of highly efficient information retrieval systems capable of processing very large volumes of text (Strzalkowski, 1995). By the 2000s, NLP research merged information and machine technology, as powerful systems or interfaces can now be developed for both experiments and daily use. However, Jones (1994) points out that NLP applications and systems are usually goal-specific, because they are purposely designed for a given application, which demands further study on how to build a full-scale and highly performing NLP system. Moreover, while most NLP approaches still focus on the syntactic analysis of text, adopting a word-based approach, there is a growing interest in shifting attention from the syntactic to the semantics curve, focusing on the intrinsic meaning of natural language (Cambria & White, 2014). In particular, NLP is increasingly aiming at a superior and appropriate natural language understanding (NLU; Kang et al., 2020) by taking into consideration both common sense and common (shared) knowledge and by updating them continuously (Cambria & White, 2014).
3.2.3 Main NLP Approaches
3.2.3.1 Symbolic NLP Approaches
Symbolic approaches were the first to characterize NLP research, having been implemented since the 1950s (Dale et al., 2000) and using semantic information and background knowledge to extract precise information from texts (Khoury et al., 2008). Relying on formal rules for building knowledge, this approach is knowledge-based (Jackson & Moulinier, 2002), conceiving language knowledge as encoded into rules or formal expressions (Dale et al., 2000). Since symbolic NLP approaches build on grammar rules (Jackson & Moulinier, 2002), they analyze, treat, and manage textual data on the basis of human-defined rules determining meaning at the syntactic, semantic, and discourse levels of analysis (Crowston et al., 2012). Therefore, they adopt a top-down approach to NLP (Jackson & Moulinier, 2002). As symbolic NLP thus uses domain-specific knowledge, a major drawback of this type of approach relates to the validity and relevance of results, which may be restricted to a specific domain or field (Khoury et al., 2008), with limited generalizability and applicability elsewhere. According to Dale et al. (2000), the symbolic approach to NLP involves five steps, starting with tokenization, which breaks down the text into its basic forms, followed sequentially by lexical, syntactic, semantic, and pragmatic analysis, so that the speaker’s intended meaning can be inferred from the surface text. In applying NLP to information retrieval (IR), Rajput (2020) identifies four discrete steps, excluding pragmatic analysis. While semantic NLP approaches may also be applied to automated information extraction (IE), that is, the extraction of desired information from a text (Zhang & El-Gohary, 2015), the major problems of symbolic NLP concern flexibility, as it fails to adapt easily to new language features, and the handling of unfamiliar or ungrammatical inputs (Zhao et al., 2021).
Other symbolic NLP approaches are explanation-based learning, case-based learning, and analogical reasoning methods, with information extraction being a crucial NLP subfield (Wermter et al., 1996). Basili et al. (1996) also considered empirical symbolic approaches to NLP, defining empirical methods as those based on probabilistic language models.
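The rule-based, top-down logic of symbolic NLP can be illustrated with a minimal sketch in which hand-written patterns assign labels to phrases. The two rules and the labels are hypothetical and chosen only for illustration; they are not drawn from any of the systems cited in this section.

```python
import re

# Hand-written rules (hypothetical): each maps a surface pattern to a label.
RULES = [
    (re.compile(r"\b(?:proficient|skilled|experienced) in ([a-z ]+)"), "SKILL"),
    (re.compile(r"\bdegree in ([a-z ]+)"), "EDUCATION"),
]


def apply_rules(sentence):
    """Return (label, extracted phrase) pairs for every rule that fires."""
    results = []
    lowered = sentence.lower()
    for pattern, label in RULES:
        for match in pattern.finditer(lowered):
            results.append((label, match.group(1).strip()))
    return results


# Input that follows the expected phrasing is captured by the rules.
print(apply_rules("Proficient in video editing, with a degree in art history."))
# Input phrased differently is missed, although it conveys a similar meaning.
print(apply_rules("Knows her way around video editing."))
```

The second call returns no result, which mirrors the limited flexibility of symbolic approaches when they face unfamiliar input.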
3.2.3.2 Statistical NLP Approaches
Statistical approaches significantly prevailed in NLP research in the 1990s (Jackson & Moulinier, 2002), employing computer algorithms, machine learning (ML), and deep learning (DL) to automatically extract and classify text or voice constituents, whose meanings are assigned a statistical probability. By applying ML to large amounts of textual data (Zhao et al., 2021), statistical NLP is able to develop and implement predictive language models (Basili et al., 1996).
Hence, unlike symbolic NLP, statistical NLP is a bottom-up methodology (Jackson & Moulinier, 2002), adopting an empirical approach to NLP by extracting language data from larger texts. Nevertheless, like symbolic NLP, statistical NLP also shows limitations when processing unfamiliar or ungrammatical input. While this implies that statistical NLP systems require domain-specific training when dealing with domain-specific text, deep learning (DL) solutions have been combined with statistical NLP to train machines that can automatically process vast amounts of data while understanding and extracting the features needed for future detection in NLP analysis (Zhao et al., 2021). Moreover, DL models are also used to solve NLP and speech recognition problems (Alshemali & Kalita, 2020). Unlike symbolic NLP techniques, statistical NLP systems build linguistic models through mathematical techniques (Crowston et al., 2012), such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs). As a DL technique, neural networks (NNs) analyze extremely vast amounts of data quickly, supporting human and business decisions (Votto et al., 2021). Lawrence et al. (2000) consider how to successfully develop linguistic capabilities in NN architectures, finding that the discriminatory power of NNs enables them to automatically classify natural language sentences as grammatical or ungrammatical. Focusing on artificial neural networks (ANNs), Lee and Dernoncourt (2016) introduce a new CNN-based model for short text classification, arguing that the supply and accumulation of sequential information positively affects prediction quality and model performance.
Nevertheless, while a major advantage of statistical NLP techniques is their scalability, a significant disadvantage relates to the interpretation of results, which are given in mathematical form (i.e., word clusters with corresponding probabilities or word strings), leaving the definition of a conceptual explanation to the researcher (Basili et al., 1996).
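The bottom-up logic of statistical NLP, in which probabilities are estimated from observed text rather than stated as rules, can be sketched with a simple bigram model that estimates the probability of a word given the word preceding it. The three-sentence corpus is a toy assumption; actual language models are trained on far larger corpora and condition on longer contexts.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Estimate P(next word | previous word) by relative frequency."""
    pair_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for previous, following in zip(words, words[1:]):
            pair_counts[previous][following] += 1
    model = {}
    for previous, followers in pair_counts.items():
        total = sum(followers.values())
        model[previous] = {w: c / total for w, c in followers.items()}
    return model


corpus = [
    "digital skills are in demand",
    "creative skills are valuable",
    "digital tools are everywhere",
]
model = train_bigram_model(corpus)

# Probabilities of the words observed after "digital" and after "are".
print(model["digital"])
print(model["are"])
```

The output is a table of probabilities with no built-in explanation, which illustrates why the conceptual interpretation of statistical results is left to the researcher.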
3.2.3.3 Connectionist NLP Approaches
According to Christiansen and Chater (1999, p. 419), connectionism can be referred to as “neural networks” (NNs), as this type of approach focuses on the role that ANNs and connectionist networks (CNs) may have in NLP, with the latter being established as highly effective in modeling language learning tasks (Wermter et al., 1996). Mimicking the brain’s neurons, connectionist models consist of a complex network of deeply interconnected processors (i.e., units or nodes) operating “simultaneously and cooperatively” (Christiansen & Chater, 1999, p. 419). Therefore, unlike symbolic approaches, connectionist approaches are based on the dynamic reinterpretation of previous inputs: CNs can continuously reinterpret prior inputs as soon as new input is received, which also allows for the integration of multiple knowledge sources (Dyer, 1995). This is why connectionism is also called “parallel distributed processing” (Christiansen & Chater, 1999, p. 419) or simply “parallel processing,” given its highly integrated forms of language processing (Selman, 1989).
In particular, connectionist NLP systems process language from multiple sources in an integrated way, thus adopting a multi-level perspective on language (e.g., semantic, syntactic, and pragmatic; Selman, 1989, p. 23). Miikkulainen and Dyer (1991, p. 343) also describe a connectionist NLP approach adopting modular parallel distributed processing (PDP) networks and distributed lexicon, since PDP models can automatically extract knowledge from examples, learning to process inputs with no need for being pre-trained to do so. In studying the development of connectionist language models in NLP, Christiansen and Chater (1999) report that the most interesting feature of connectionist systems is their capability to learn from experience, thus being “self-organizing” (p. 419).
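The idea that a connectionist system learns from examples rather than from explicit rules can be illustrated, in a deliberately reduced form, with a single artificial unit trained with the classic perceptron rule. The four training phrases and their labels (1 for a digital skill, 0 otherwise) are hypothetical, and a single unit is far simpler than the modular networks discussed above.

```python
def train_perceptron(examples, epochs=10):
    """Learn word weights from (phrase, label) pairs with the perceptron rule."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for phrase, label in examples:
            words = phrase.lower().split()
            activation = bias + sum(weights.get(w, 0.0) for w in words)
            predicted = 1 if activation > 0 else 0
            error = label - predicted
            if error:
                # Adjust only the connections that took part in the mistake.
                for w in words:
                    weights[w] = weights.get(w, 0.0) + error
                bias += error
    return weights, bias


def predict(phrase, weights, bias):
    """Return 1 if the unit fires for the phrase, otherwise 0."""
    words = phrase.lower().split()
    activation = bias + sum(weights.get(w, 0.0) for w in words)
    return 1 if activation > 0 else 0


# Hypothetical labeled examples: 1 = digital skill, 0 = other skill.
examples = [
    ("python scripting", 1),
    ("video editing", 1),
    ("oil painting", 0),
    ("stone sculpture", 0),
]
weights, bias = train_perceptron(examples)
print(predict("python editing", weights, bias))
print(predict("stone painting", weights, bias))
```

No rule about individual words is written by hand: the weights emerge from repeated exposure to the examples, which is the sense in which such systems are described as learning from experience.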
3.2.4 NLP Analysis and Applications
3.2.4.1 Main Steps in NLP Analysis
Collobert et al. (2000) identify four main phases in NLP analysis: part-of-speech (POS) tagging, chunking (or parsing), named entity recognition, and semantic role labeling (p. 1). Pandey et al. (2017, p. 80) point out that computer-aided textual analysis entails three crucial initial steps, encompassing the collection of a set of documents, the definition and application of a computer code, and the definition of a dictionary. Considering the NLP literature reviewed, the main steps involved in NLP application are the following.
Input Data Collection
First, data are collected from multiple sources according to specific parameters. While the appropriate collection of a set of textual documents is a critical initial step in NLP (Pandey et al., 2017, p. 80), Oramas et al. (2018) claim that collecting text data from diverse sources may be a major research challenge, as information should be complete and accurate but may be scattered across a plethora of sources, especially in the era of massive and diffused digitization.
Data Preprocessing
In classifying the main steps involved in NLP studies, Crowston et al. (2012) highlight preprocessing as a first essential step to convert raw data into a format that can be further processed. This usually entails eliminating unnecessary sections and/or deleting unnecessary elements like links (Bharadwaj et al., 2022), removing stop words, and deciding on special characters, such as punctuation marks, which may be either kept or treated as stop words according to the specific analysis being conducted (Ly et al., 2020). Since text data must be preprocessed to be used for further NLP analysis (Kang et al., 2020), this also involves cleaning data, eliminating non-textual elements (e.g., images), and correcting misspellings through N-grams. Preprocessing may also encompass stemming (Ly et al., 2020).
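The preprocessing operations described above (removing links, handling punctuation, and removing stop words) can be sketched as follows. The stop-word list is a tiny illustrative assumption, and the `keep_punctuation` switch is a hypothetical parameter reflecting the choice, mentioned above, of either keeping special characters or discarding them.

```python
import re

# A tiny illustrative stop-word list; real analyses use language-specific lists.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "with", "at", "for"}


def preprocess(raw_text, keep_punctuation=False):
    """Remove links, optionally punctuation, and stop words from raw text."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", raw_text)  # drop links
    if not keep_punctuation:
        text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation with spaces
    words = text.lower().split()
    return [w for w in words if w not in STOP_WORDS]


raw = "Expert in the Adobe Suite and HTML, see https://example.com/portfolio!"
print(preprocess(raw))
print(preprocess(raw, keep_punctuation=True))
```

Spelling correction and the removal of non-textual elements are omitted from this sketch, as they depend on the data source and on the language of the documents.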
Tokenization After textual data is cleaned, it undergoes tokenization (Crowston et al., 2012), which breaks down the text into its smallest complete units (i.e., tokens) by identifying “word boundaries” (Dale et al., 2000). Tokenization involves sentence segmentation into tokens, which are “discrete units” (Donnelly et al., 2022), such as words or numbers (Jain et al., 2018). This is why tokenization can also be referred to as a sentence segmentation process where sentences are identified through sentence boundaries (Dale et al., 2000). Since tokenization relates to the segmentation of a text into well-defined basic units, such as words or sentences (Ly et al., 2020), the main challenge is defining what “a word” is through word boundaries (Palmer, 2000). While the delimiter is usually a space, with the text being separated into tokens accordingly (Gopalakrishna & Vijayaraghavan, 2019), major problems arise for languages or linguistic expressions whose word boundaries are not necessarily defined by spaces between terms (Dale et al., 2000). Indeed, while some languages are easier to decompose, as words are usually neatly separated by spaces, others require researchers to define rules of what a word is (Dale et al., 2000). Therefore, the language of the text being processed deeply affects the tokenization phase, requiring a deep understanding of the writing system of the text (Palmer, 2000). Moreover, “agglutinating” (Palmer, 2000) or multiword constructions may make text segmentation troublesome, as spaces do not necessarily correspond to word or token boundaries.
Stemming and Lemmatization Ly et al. (2020) consider stemming as a critical step in data preprocessing for NLP. Stemming can be defined as a reduction mechanism that maps each word to its root. Porter’s five-stage stemmer, introduced in 1980, is one of the most used algorithms for stemming (Willett, 2006). 
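As a rough illustration of tokenization and stemming as described above, the following sketch splits a space-delimited text into tokens and strips a few suffixes. The token pattern, the suffix list, and the function names are assumptions introduced here; the toy stemmer is not Porter's algorithm, which applies staged rules with additional conditions.

```python
import re

# One token per run of word characters (allowing internal apostrophes or hyphens),
# or per single punctuation mark. Assumes a space-delimited writing system.
TOKEN_PATTERN = re.compile(r"\w+(?:['’-]\w+)*|[^\w\s]")

# Hypothetical suffix list for illustration, checked from longest to shortest.
SUFFIXES = ("ations", "ation", "ings", "ing", "ers", "er", "ed", "es", "s")


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


def naive_stem(token, min_stem_length=3):
    """Strip the first matching suffix, keeping at least min_stem_length characters."""
    word = token.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_length:
            return word[: -len(suffix)]
    return word


tokens = tokenize("Designers are designing user-friendly apps.")
print(tokens)   # ['Designers', 'are', 'designing', 'user-friendly', 'apps', '.']
print([naive_stem(t) for t in tokens])
# ['design', 'are', 'design', 'user-friendly', 'app', '.']
```

Treating the hyphenated "user-friendly" as one token is a design choice of this sketch; it shows that even in a space-delimited language the definition of a word depends on rules set by the researcher.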
While stemming implies further processing tokens to reach a word’s root form, lemmatization delivers linguistically correct word lemmas. This is because lemmatization is context-sensitive, considering the possibility of words to have different meanings given a specific context (Daryani et al., 2020) by exploiting parts of speech. In particular, lemmatization is a particular kind of stemming that eliminates words’ prefixes and/or suffixes (Vodithala & Mohammed, 2021), reducing words to a base form as well. However, unlike stemming, the results of lemmatization are more precise, as they are actual words, requiring an error-free and large dictionary for a more precise look-up (Kamath et al., 2019). Therefore, lemmatization reduces words to general base forms, whereas stemming extracts root forms which do not necessarily carry semantic meaning.
Part-of-Speech (POS) Tagging In addition to tokenization, data preprocessing involves part-of-speech (POS) tagging (Kang et al., 2020). After deleting stop words, each token is tagged with a part of speech, such as “verb,” “noun,” “adjective,” “conjunction,” “pronoun,” “preposition,” and “adverb” (Gopalakrishna & Vijayaraghavan, 2019). This is why this phase is also referred to as “word-category disambiguation” or “grammatical tagging” (Daryani et al., 2020, p. 100), because each word is associated with a label
highlighting its syntactic role in the POS phase (Collobert et al., 2000, p. 1). Jackson and Moulinier (2002) distinguish two approaches to POS tagging: rule-based POS tagging deletes syntactically incorrect tags by following linguistic and contextual rules, whereas stochastic approaches learn from training data, using probabilities and frequencies for word disambiguation. Hence, POS tagging affects the correct understanding of a sentence’s meaning (Daryani et al., 2020) and is a crucial step in NLP, so that, like tokenization, it may be a built-in function in NLP tools (Crowston et al., 2012).
Model Training NLP models must be trained to provide accurate results if no pre-trained model is already available (Bharadwaj et al., 2022). The training process of NLP models is iterative (Haddad & Mercier-Laurent, 2021), involving a comparison between a reference (e.g., embedding language model) and the model’s predictions. Galassi et al. (2021) focus on the different relevance of text elements depending on the task that is conducted, proposing attention models as a solution for obtaining a more compact data representation in terms of information relevance. By highlighting more relevant textual elements through a weight distribution, attention models assign different values to data elements, with greater ones going to the more relevant features. Similarly, the word embedding technique implies the conversion of textual data into vectors of numerical values (Xu et al., 2022). Word embeddings identify “distributional vectors” (Young et al., 2018, p. 2), mapping words onto vectors (Jain et al., 2018) and working as a “feature input” (Wang et al., 2018) for ML techniques, NLP included, aimed at spotting similarities between words (Young et al., 2018, p. 2) and contextualizing data. 
They can be trained either through internal or external data: while the former ensures topic-specific languages, the latter can provide additional knowledge that may enhance a specific NLP task in a given domain. Yet, word embeddings require pre-training to capture syntactic and semantic information needed within a large corpus of data (Young et al., 2018). Conversely, Cambria and White (2014) propose a semantics-based approach to NLP, which makes the use of word co-occurrence count obsolete, focusing on the intrinsic and implicit meanings of natural language texts instead. This is strictly related to the fact that natural language contains constructions that are considered as multiword expressions (Sag et al., 2002) that should not be processed separately, such as words with spaces and idiomatic constructions. If this phenomenon refers to idiomaticity (Sag et al., 2002), it also highlights how critical the choice of NLP library is (Al Omran & Treude, 2017) in the model training phases. Nevertheless, past literature often fails to justify how an NLP library is chosen (Al Omran & Treude, 2017). Ushio et al. (2021) use BERT as a pre-trained embedding language model in NLP for recognizing analogies, highlighting the importance of choosing a fine-grained embedding language model to reach effective results. Indeed, text documents contain both unstructured and structured data (Vijayarani et al., 2015), which complicates the implementation of computer-based tools in interpreting natural language as this may require developing rules and algorithms that can process natural language
input (Ng & Zelle, 1997). In describing the phases of NLP, Webster and Kit (1992) underscore that word entity definition is still rather complicated, despite being one of the first steps needed for NLP. Indeed, while most past literature relied on explicit delimiters such as spaces between words, this rule may overlook the complexity of other units, including idioms and fixed expressions. If defining what a word is proves troublesome, additional complexity is related to the fact that words may acquire different meanings in relation to their context, requiring word sense disambiguation and semantic parsing (Ng & Zelle, 1997). In named entity recognition (or NER; Collobert et al., 2000, p. 4), the sentence’s components are analyzed at the atomic level, so that each constituent is classified into a specific category (e.g., location and person). In semantic role labeling (or SRL; Collobert et al., 2000), each syntactic element of a phrase is assigned a semantic role. Hence, NER is an NLP tool that extracts information from unstructured data, identifying a text’s named entities by denoting them with a specific noun or tag, such as “location,” “time,” “person,” “organization,” or other user-defined named entities (Shelar et al., 2020, p. 324). This implies that pre-training is a critical step for NER models, and NLP libraries are available for building them (Shelar et al., 2020). According to Zhang and El-Gohary (2015), NLP is characterized by two major approaches, based on either rules or machine learning (ML). Combining statistics, algorithm complexity, and probability theories, the latter regards computers’ simulation of human learning capabilities of continuous acquisition, upgrading, and reorganization of skills and knowledge to reach superior performance (Song et al., 2022, p. 2). 
Therefore, while the former entails the iterative manual coding of rules for processing textual data, the latter trains text processing models by applying ML algorithms to a training text. In studying the role of data augmentation (DA) in NLP, Feng et al. (2021) define NLP-based DA as a set of strategies enabling higher training examples’ diversity without collecting new data. Yet, Stenetorp et al. (2012) emphasize the limited availability of rapid pre-training methods for NLP, developing a rapid annotation tool (BRAT) for increasing annotators’ productivity. Conversely, Agerri et al. (2014) propose a ready-to-use modular set of linguistic annotations, the IXA pipeline.
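The contrast between manually coded rules and models trained on annotated text can be sketched with a toy entity tagger. Everything below is hypothetical: the gazetteer, the two annotated sentences, and the function names are assumptions made for illustration, and the "learned" model is only a most-frequent-tag baseline, far simpler than the ML methods discussed in the literature.

```python
from collections import Counter, defaultdict

# Hypothetical hand-written gazetteer: here the "rules" are coded manually.
RULES = {"venice": "LOCATION", "italy": "LOCATION", "rome": "LOCATION"}


def rule_based_tag(tokens):
    """Tag tokens by looking them up in hand-coded rules; unknown tokens get 'O'."""
    return [(token, RULES.get(token.lower(), "O")) for token in tokens]


def train_most_frequent_tag(annotated_sentences):
    """Learn, for each word, the tag it most often carries in the annotated training text."""
    counts = defaultdict(Counter)
    for sentence in annotated_sentences:
        for token, tag in sentence:
            counts[token.lower()][tag] += 1
    return {word: tag_counts.most_common(1)[0][0] for word, tag_counts in counts.items()}


def learned_tag(tokens, model):
    """Tag tokens with the learned word-to-tag mapping; unseen tokens get 'O'."""
    return [(token, model.get(token.lower(), "O")) for token in tokens]


# Made-up annotated examples standing in for a training corpus.
TRAINING = [
    [("Turin", "LOCATION"), ("hosts", "O"), ("designers", "O")],
    [("Designers", "O"), ("in", "O"), ("Turin", "LOCATION")],
]

model = train_most_frequent_tag(TRAINING)
sentence = ["Designers", "move", "from", "Turin", "to", "Venice"]
print(rule_based_tag(sentence))      # finds Venice (in the rules) but misses Turin
print(learned_tag(sentence, model))  # finds Turin (in the training text) but misses Venice
```

Each tagger only recognizes what its rules or its training text cover, which is one reason why annotation tools and data augmentation matter for the ML-based approach.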
3.2.4.2 Levels of Analysis in NLP
While multiple NLP tools are available (Hellmann et al., 2013), NLP is based on the idea that natural language requires a tripartite analysis: syntax, semantics, and pragmatics. After breaking down a sentence into its ordered structure (i.e., syntax), literal meaning can be more easily inferred (i.e., semantics), so that the meaning of the text or utterance can eventually be analyzed in relation to the context (i.e., pragmatics; Liddy, 2001). In explaining linguistic analysis, Crowston et al. (2012) add phonological, morphological, and discourse levels of analysis as well, while Khurana et al. (2022) consider that NLP deals with natural language understanding, focusing on linguistic levels like semantics, syntax, pragmatics, phonology, and morphology. In particular, if NLP aims at natural language understanding, it may adopt seven levels of linguistic analysis, each of which infers a specific level of meaning from text or conversational language: phonetic (i.e., pronunciation), morphological, lexical, syntactic, semantic, discourse, and pragmatic (Crowston et al., 2012). Based on Liddy (2001) and Crowston et al. (2012), Table 3.1 explains NLP levels of analysis.
Table 3.1 NLP levels of analysis
Level of analysis | Definition
Phonological | It analyzes text as it is uttered, considering characteristics like speech, sound, and inflection
Morphological | It analyzes morphemes, which are the smallest meaning-carrying linguistic elements, such as prefixes and suffixes
Lexical | It analyzes words through POS tagging
Syntactic | It extracts meaning by considering the arrangement of words in a sentence
Semantic | It defines a word’s meaning depending on the context, for instance disambiguating terms that may have multiple meanings
Discourse | It defines a sentence’s meaning by taking into account the text preceding and following the sentence itself
Pragmatic | It determines the meaning of an expression, word, or sentence by taking into consideration “world knowledge” such as shared understandings
Based on Liddy (2001) and Crowston et al. (2012)
Phonological Analysis In the phonological level of analysis, the utterance of the sentence is analyzed through its sound and inflection.
Morphological Analysis Morphological study focuses on morphemes, such as prefixes and suffixes.
Lexical Analysis Lexical analysis involves the processing (Dale et al., 2000) of a text into lexemes referring to a specific lemma, that is, the root token (Rajput, 2020). Indeed, this phase may also involve the stemming of several terms (Vijayarani et al., 2015) into a root or base form, called a stem (Bamman et al., 2014; Rajput, 2020). By reducing different words or grammatical elements to a base form (Vijayarani et al., 2015), stemming provides a better estimation of a word’s frequency distribution within a text, but also of similarity between documents, as the number of common terms rises after stemming (Lease, 2007): it eliminates suffixes and shrinks the number of words being processed (Vijayarani et al., 2015). However, this process may lead to lower precision (Lease, 2007), as it may trigger ambiguity by reducing words with different meanings to the same morpheme (Kamath et al., 2019). Furthermore, segmentation may be problematic for those languages with no visual detection of word boundaries (Bird et al., 2009). Yet, this type of analysis may also eliminate stop words, diminishing dimensionality (Vijayarani et al., 2015) by deleting words bringing no additional meaning to the document.
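The claim that the number of common terms between documents rises after stemming, at some cost in precision, can be illustrated with a small sketch. The two word lists, the Jaccard overlap measure, and the toy suffix stripper below are assumptions chosen for illustration; they are not the measures or the stemmer used in the works cited.

```python
def jaccard(terms_a, terms_b):
    """Share of distinct terms that two documents have in common."""
    set_a, set_b = set(terms_a), set(terms_b)
    return len(set_a & set_b) / len(set_a | set_b)


def strip_suffix(word, suffixes=("ing", "ed", "s")):
    """Toy stemmer (an assumption for illustration, not Porter's algorithm)."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


doc_a = ["museum", "curators", "designed", "digital", "exhibits"]
doc_b = ["curator", "designing", "exhibit", "online"]

raw_overlap = jaccard(doc_a, doc_b)
stemmed_overlap = jaccard(map(strip_suffix, doc_a), map(strip_suffix, doc_b))
print(raw_overlap, stemmed_overlap)  # 0.0 0.5

# The same reduction can conflate unrelated words: "news" collapses onto "new".
print(strip_suffix("news"))  # new
```

Before stemming the two documents share no term; afterwards they share three of six distinct stems. The last line shows the downside noted above, with words of different meaning reduced to the same form.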
Syntactic Analysis The syntactic level of analysis assigns each word a function within the sentence or text (i.e., its “syntactic category”; Bird et al., 2009, p. 421) through POS tagging, labeling each word with its “syntactic role” (Collobert et al., 2011, p. 2494) to sort out lexical ambiguities (Joseph et al., 2016). Unlike lexical analysis, syntactic analysis requires both a parser and a grammar (Khurana et al., 2022; Liddy, 2001), as it aims at uncovering grammatical structures. In particular, parsing entails delimiting sentences and recognizing their structure (Tepper et al., 2002) by defining the syntactic role of each of their components. In studying sentence-level ambiguity, Chomsky (1957) explains how the reciprocal relation between syntax and semantics deeply impacts language and sentence understanding, because a sentence may be ambiguous when it carries more than one meaning or is highly likely to be understood similarly to another. In this regard, Young et al. (2018) differentiate between two types of parsing: dependency and constituency parsing (p. 20).
Semantic Analysis While all levels of analysis contribute to the understanding of a sentence’s meaning, semantic analysis implements semantic disambiguation of polysemous words to investigate the possible meanings of a word in a sentence (Liddy, 2001). Since a sentence’s semantic analysis aims at revealing all its possible meanings by examining the interactions and relations among its words’ meanings, it may require semantic word disambiguation for more accurate results. Therefore, it assigns tags to each word, which is classified as a noun, a verb, an adjective, and so on (Jain et al., 2018).
Discourse Analysis After the semantic analysis, the discourse analysis focuses on the whole text (Liddy, 2001), studying the role of each sentence through its connections to what precedes and follows (Crowston et al., 2012). 
Therefore, this type of analysis determines a sentence’s function in a text, evaluating how this brings additional meaning to the whole (Liddy, 2001).
Pragmatic Analysis If the semantic level of analysis reveals the word meaning within context (Crowston et al., 2012), this is further refined in the pragmatic analysis, which contextualizes the text to explicate “extra meaning” (Liddy, 2001). As language is “situated” (Bamman et al., 2014) in a specific time and space, an utterance carries “latent information” (Hovy & Spruit, 2016, p. 592) about the subject (i.e., speaker or writer) and the context. Thus, this level of analysis also evaluates meaning in relation to experience-related connotations and shared understandings (Crowston et al., 2012, p. 526).
3.2.4.3 An Overview of Main NLP Applications
In summarizing the applications of NLP to information retrieval (IR), Lease (2007) applies text retrieval (TR) to human language, whether textual or vocal, considering it as the information being retrieved. In studying how NLP can be advantageous for qualitative analysis, Crowston et al. (2012, p. 525) count machine translation, information retrieval, and text summarization as some of the tasks that NLP can automate and speed up. Song et al. (2022) claim that NLP technologies also support opinion extraction and text classification, with Dale et al. (2000) adding text critiquing and natural language database query as other NLP applications. Indeed, main NLP tasks are document retrieval (IR; Jones, 1999), that is, finding the relevant documents given a query, and document routing, that is, the automatic forwarding of a document to a specific user/profile, with Strzalkowski (1995) defining IR as the capability to extract “relevant documents” from a larger corpus on the basis of a user’s queries. This implies that NLP may also be implemented in document classification, facilitating the assignment of documents to specific classes depending on their content. In addition to information extraction (IE), NLP programs can extract relevant pieces of information from a text, creating a surrogate document, through document summarization (Jackson & Moulinier, 2002). Markham et al. (2015) include morphological segmentation, parsing, term or sentence disambiguation, discourse analysis, sentiment analysis, relationship and information extraction, and entity recognition among business-related NLP tasks for supporting business decisions. Wanless et al. (2022) add that NLP can also be employed in customer service (e.g., chatbots) and sentiment analysis on social media. Crossley et al. (2014) also emphasize how NLP can be effective in feedback-providing systems that can automatically provide answers and hold conversations with users, with Feng et al. 
(2021) pointing out how NLP can also be advantageous for correcting grammatical errors, tagging sequences, and generating open-ended text. NLP can also be used for linking and categorizing data as well as providing information to users in a friendlier way (Jones, 1999). Accordingly, Crossley et al. (2014) develop SiNLP (“simple NLP”), which functions as an NLP extension for discourse processing in the prediction of human judgments. In underlining how NLP effectively enables human–computer interaction, Asemie et al. (2017) enumerate question answering, language translation, and natural language interface to database (NLIDB) among NLP applications. In particular, NLIDB may improve a user’s search and retrieval of pertinent information from a database through queries in his or her native language (Asemie et al., 2017). Indeed, conversational software capable of simulating human conversations via text or voice (i.e., chatbots) uses NLP methods to understand users and reply appropriately by identifying their intention and isolating significant information (Ayanouz et al., 2020). Ray et al. (2021) apply NLP technology to user-generated content in social media platforms and websites to understand consumption intention and customer value in e-Learning, whereas Kolleck and Yemini (2020) combine NLP and social network analysis (SNA) to
assess environment-related education (ERE) representation in Global Citizenship Education (GCE) scholarships. Implementing NLP methods in musicology research, Oramas et al. (2018) show that NLP pipelines and approaches can also be effective in the analysis of digitized music texts and music knowledge discovery, implementing NLP for information extraction, sentiment analysis, and knowledge graph generation as well (p. 365). Wanless et al. (2022) study NLP application in sports management research, while Robeer et al. (2016) conduct an NLP analysis of user stories, with Marrone et al. (2022) assessing mathematical creativity automatically by using NLP techniques. Some scholars also implement NLP methods to measure innovation levels. Arts et al. (2021) use NLP to assess the impact of patents and their technical novelty through keyword extraction, with Rezende et al. (2022) also examining similarities among patents. Song et al. (2022) also combine NLP and machine learning (ML) to evaluate how digital systems of patents affect the efficiency of research and development (R&D) cooperation between organizations and research institutions, while Chiarello et al. (2021) investigate technology trajectories in terms of blockchain’s value creation. If Pandey et al. (2017) measure innovation in the public sector through NLP applications, Preuss (2017) analyzes statements dealing with organizational culture through NLP methods, proving that NLP approaches can also be adopted in cultural research. Fitzgerald et al. (2012) classify various NLP techniques for file fragment classification in digital forensics, blending support vector machines and bag-of-words model, considering texts as unordered bags of words. However, much NLP literature does not address its implementations and resulting implications in industrial settings (Rosadini et al., 2017). Rosadini et al. (2017) use NLP to detect defects in a railway company’s documents, while Femmer et al. 
(2014) apply their newly developed tool Stella to three companies’ databases. NLP successfully detects fake news (Bourgonje et al., 2017; Chokshi & Mathew, 2021). In particular, Bourgonje et al. (2017) propose automated NLP procedures for clickbait detection. Past scholars also address the implementation of NLP techniques in medical research and healthcare settings. Kaufman et al. (2016) show that NLP can effectively enable electronic recording in healthcare, improving documentation quality, time, and usability, whereas Denny et al. (2009) validate a new algorithm, named SecTag, combining NLP methods to identify note section headers in history and physical documents, improving clinical decisions and competency assessment systems. An interesting application of NLP is in human resource management (HRM) where it works as a time- and labor-saving solution for processing high volumes of applications (see Sect. 3.2.6.7).
3.2.4.4 Major Challenges in Applying NLP
Variability and Context Dependency of Meaning Due to the personal nature of language use (Lynn et al., 2017), linguistic expressions may be endowed with multiple meanings, depending on the context and the user. For
instance, Loughran and McDonald (2016) emphasize the problematic treatment of positive and negative words, which depend on their usage context. Hence, understanding a word’s context may result in a better understanding of its meaning, thus providing more relevant and specific results to the question in comparison to search engines while also reducing ambiguity (Markham et al., 2015).
Ambiguity According to Friedman and Hripcsak (1999), variety and ambiguity are critical issues in NLP applications. Ambiguity refers to the degree of “understanding” of a sentence (Chomsky, 1957, p. 9), being the property of those phrases that can be understood in multiple ways or in an analogous manner with respect to others. This is related to the syntactic structure of the sentence. Ambiguities may also result from segmentation (Webster & Kit, 1992). Linguistic ambiguity refers to the fact that words and phrases can have multiple interpretations (Jackson & Moulinier, 2002), which human beings can easily differentiate by referring to their context and real-world knowledge. Information may contain either rigid or flexible elements (Zhang & El-Gohary, 2015): while the former include predefined and fixed concepts, the latter comprise a variable number of concepts and relations depending on the context (Zhang & El-Gohary, 2015). Moreover, Rosadini et al. (2017) distinguish between anaphoric and coordination ambiguity, the former arising when a pronoun may have more than one antecedent in the text, the latter being due to multiple interpretations of a sentence because of a coordinating conjunction. Krovetz and Croft (1992) distinguish between syntactic and semantic lexical ambiguity: the former encompasses words that may belong to multiple syntactic categories, whereas the latter involves polysemy and homonymy cases having unrelated meaning across various categories. 
Passive voice, excessive length, vague terms with no precise semantics, and missing references and units of measurement also contribute to ambiguity (Rosadini et al., 2017). In their analysis of lexical ambiguity in document retrieval, Krovetz and Croft (1992) claim that ambiguous words or phrases may cause unwanted documents to be retrieved in the search as well. Since the query’s words determine which documents will be extracted as relevant, a search may also retrieve documents that are irrelevant for the research even though they contain the exact words of the query. Hence, they propose that words’ senses (that is, the semantics) should be preferred to words to improve information retrieval (Krovetz & Croft, 1992). In studying textual data from financial documents, Loughran and McDonald (2016) consider the ambiguity of positive words, which may have both a positive and negative usage, depending on how the statement is framed. Nevertheless, Markham et al. (2015) ascribe ambiguity to poorly defined rules, which are responsible for guiding the search and establishing the context for narrowing possibilities. While such rules were hand-programmed in the past, NLP now exploits ML and algorithms to screen much broader datasets of information (Markham et al., 2015). Hence, they claim that NLP may solve ambiguity by assigning part-of-speech (POS) tags, thus considering the term’s context in understanding its meaning, which is not possible in search engine results.
Background Knowledge and Model Training A major limitation of AI models is related to the dataset selection for training the model itself, as this determines whether the AI system can parse specific textual content (i.e., expressions, sentences, and words). Since NLP tools are more effective in processing content matching “the data they were trained on” (Marsoof et al., 2022, p. 15), accuracy may be improved by training models on sufficiently large and relevant corpora of data and by utilizing clear parameters to select them (Marsoof et al., 2022). Galassi et al. (2021) highlight the different relevance of text elements depending on the task that is conducted, proposing attention models as a solution for obtaining a more compact data representation in terms of information relevance. Kang et al. (2020) explain how data must then be represented through either discrete or distributed methods to better understand a word’s meaning. Discrete representation may exploit tools like bag-of-words (BOW) to represent a document as a vector by counting the frequency of each of its terms (Kang et al., 2020). This approach may appear too simplistic because it disregards both the context in which a word is uttered and the order of words within a sentence; it also offers a limited perspective on the semantic relationships among words within the document and is vexed by high dimensionality problems in representation (Kang et al., 2020). Term frequency-inverse document frequency (TF-IDF) provides a better account of a word’s frequency within a text by also taking into consideration the proportion of documents in the collection in which the term appears. However, like BOW, this method suffers from high dimensionality of representation (Kang et al., 2020). Tools like Word2Vec and Global Vector (GloVe) may solve such problems, allowing for superior word representation (Kang et al., 2020). 
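A minimal sketch may help to make the two discrete representations concrete. It assumes documents that are already tokenized and non-empty, and it uses one common textbook weighting (term count divided by document length, multiplied by the logarithm of the number of documents over the document frequency); other TF-IDF variants exist, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(tokens, vocabulary):
    """Represent one document as a vector of raw term counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def tf_idf(documents):
    """Return the sorted vocabulary and one TF-IDF vector per tokenized document.

    Assumes every document is non-empty. Uses tf = count / document length and
    idf = log(N / document frequency), one textbook variant among several.
    """
    vocabulary = sorted({term for doc in documents for term in doc})
    n_docs = len(documents)
    doc_freq = {term: sum(term in doc for doc in documents) for term in vocabulary}
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append([
            (counts[term] / len(doc)) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


docs = [
    ["digital", "skills", "for", "digital", "museums"],
    ["skills", "for", "designers"],
    ["museums", "hire", "designers"],
]
vocab, weights = tf_idf(docs)
print(vocab)                          # ['designers', 'digital', 'for', 'hire', 'museums', 'skills']
print(bag_of_words(docs[0], vocab))   # [0, 2, 1, 0, 1, 1]
print([round(w, 3) for w in weights[0]])  # [0.0, 0.439, 0.081, 0.0, 0.081, 0.081]
```

In both representations each vector has one entry per vocabulary term, so its length grows with the vocabulary, which is the dimensionality problem mentioned above. TF-IDF additionally lowers the weight of terms that appear in many documents, such as "for" and "skills" in this example.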
As the identification of word meanings requires a context of lexicons, pre-trained embedding language models were developed to do so, with Word2Vec being one of the most used in addition to BERT (Chiarello et al., 2021). Word2Vec depicts words as vectors, capturing their semantic and syntactic similarity (Rezende et al., 2022) through their spatial disposition, which relates to the number of co-occurrences of two words within the document being considered. Thus, the closer to perpendicular two vectors are with respect to each other, the more different the two words are. The main limitation of this approach arises when the model encounters unfamiliar words (Lease, 2007).
Evaluation Derczynski (2016) adds evaluation as another issue for NLP systems, which may be assessed through precision and recall, estimating whether the system can find a specific linguistic element within a corpus. Hence, he proposes an improvement of the F-score for achieving a superior performance scale, despite its lack of detail.
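For reference, the standard definitions of precision, recall, and F-score can be sketched as follows. This is the conventional measure and not the improvement proposed by Derczynski (2016); the function name and the two entity sets are hypothetical and serve only as an illustration.

```python
def precision_recall_f(predicted, gold, beta=1.0):
    """Compute precision, recall, and F-beta for a set of predicted items against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    beta_sq = beta ** 2
    f_score = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
    return precision, recall, f_score


# Made-up example: entities a system found versus entities annotated by hand.
gold_entities = {"Venice", "Ca' Foscari", "Italy", "Springer"}
system_entities = {"Venice", "Italy", "Digital"}

p, r, f1 = precision_recall_f(system_entities, gold_entities)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.667 0.5 0.571
```

Precision is the share of retrieved items that are correct (two of three here), recall is the share of correct items that were retrieved (two of four), and with `beta=1.0` the F-score is their harmonic mean.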
3.2.5 NLP and Topic Modeling
Topic modeling is an unsupervised ML technique, automatically spotting words with high co-occurrence probability in a text (Lind et al., 2022). Since NLP extracts word frequencies, it is strictly related to topic modeling, whose aim is discovering main topics in a text with no need for researchers to fully read it (Hu et al., 2019). Among NLP topic modeling techniques, latent Dirichlet allocation (LDA) may solve such dimensionality problems, being a quantitative topic modeling tool and text mining technique that may unveil “latent” thematic structures in a large corpus of documents automatically and with no supervision through word co-occurrences, reducing dimensionality (Ambrosino et al., 2018).
3.2.5.1 NLP and LDA
Latent Dirichlet allocation (LDA) is a dimensionality-reducing text mining algorithm (Lind et al., 2022; Mutanga & Abayomi, 2022) and topic modeling method (Calheiros et al., 2017; Hu et al., 2019) that scouts hidden regularities and themes (i.e., topics) in a large database of documents on the basis of how likely two or more words are to co-occur (Ambrosino et al., 2018). As such, it can be defined as a generative model, as each text is processed as if it were generated from a probabilistic topic distribution, with each topic being a probabilistic distribution of terms (Hu et al., 2019). In particular, LDA extracts main themes (i.e., “topics”; Lind et al., 2022) from a corpus of documents (Mutanga & Abayomi, 2022), treating them as “analytic objects” (Bernier et al., 2021) or “features” revealing a document’s content (Basili et al., 2006). It does so on the assumption that a document can be considered as a “bag-of-words,” that is, a bag containing many words with a specific frequency and number of co-occurrences, and that similar words should refer to similar contents, thus being highly likely to be featured and co-occur in the same text sections (Ambrosino et al., 2018). As each document is processed as a “vector of word counts” (Mutanga & Abayomi, 2022, p. 165), LDA does not consider word order or grammar (Ambrosino et al., 2018), but provides probability distributions, clustering words into groups accordingly: each set of word groups is classified into a specific theme, or “topic,” that is inferred by processing the text documents depending on the parameters chosen (Mutanga & Abayomi, 2022). Hence, LDA requires prior knowledge of the database being modeled, as the more a word appears in a document, the higher its related topic’s relevance and frequency in that document (Basili et al., 2006). This implies that co-occurrence frequency can be used as a proxy for content representation (Basili et al., 2006). 
As topics are latent and hidden and are inferred from the elaboration and analysis of data, LDA reveals the “latent thematic structure” (Ambrosino et al., 2018) of a document collection, representing each topic as a set of vectors, each representing a latent theme, and each document as a distribution of multiple topics (Bernier et al.,
2021). If each document is conceived as composed of multiple topics, each topic is then assigned a certain random probability distribution, in turn, being composed of words characterized by a random probability distribution as well (Tian et al., 2022). If a document’s probability to belong to a topic is a crucial parameter (Chiarello et al., 2021), one of the key problems in applying topic modeling techniques is classification as the choice of categories affects how topics will be evinced (i.e., modeled) from the archive of documents (Bernier et al., 2021). Since the choice of model parameters is critical because it directly affects the number of topics that will be modeled, like NLP, LDA also requires an accurate preprocessing of textual data and interpretation of resulting topics (Maier et al., 2018). Another key issue is choosing the proper number of topics to be found in the dataset, because this number affects the model’s topic interpretation as well (Tian et al., 2022). Tian et al. (2022, p. 4) base the calculation of the optimal number of topics on the trade-off between the highest level possible of “intra-topic coherence” and highest level of “inter-topic differentiation.”
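The generative assumption behind LDA, in which each document is a mixture of topics and each topic is a probability distribution over words, can be sketched with the Python standard library alone. The sketch only simulates how a document would be generated under this assumption; it does not fit a model to data, which in practice is done with inference procedures such as variational inference or Gibbs sampling. The vocabulary, the number of topics, the Dirichlet parameters, and the function names are all assumptions chosen for illustration.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from a Dirichlet distribution via normalized gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [draw / total for draw in draws]


def generate_document(topic_word_dists, vocabulary, doc_alpha, length, rng):
    """Generate one document as LDA assumes: pick a topic per word, then a word from that topic."""
    n_topics = len(topic_word_dists)
    # Document-level distribution over topics.
    topic_mix = sample_dirichlet([doc_alpha] * n_topics, rng)
    words = []
    for _ in range(length):
        topic = rng.choices(range(n_topics), weights=topic_mix)[0]
        word = rng.choices(vocabulary, weights=topic_word_dists[topic])[0]
        words.append(word)
    return topic_mix, words


rng = random.Random(0)  # fixed seed so that repeated runs give the same draw
vocabulary = ["patent", "invention", "museum", "exhibit", "data", "software"]
# Each topic is a probability distribution over the vocabulary, drawn from a Dirichlet prior.
topics = [sample_dirichlet([0.5] * len(vocabulary), rng) for _ in range(2)]
topic_mix, document = generate_document(topics, vocabulary, doc_alpha=0.5, length=8, rng=rng)
print([round(p, 2) for p in topic_mix])
print(document)
```

Word order plays no role in this process, which is consistent with the bag-of-words assumption. The number of topics (two here) must be fixed in advance, which is why its choice is treated as a key modeling decision.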
3.2.5.2
LDA Applications
Bernier et al. (2021) use LDA to study the impact of employees’ post-conversations on organizational innovation through novel idea generation, while Afolabi et al. (2020) exploit LDA algorithms in social network analysis to spot co-authorship networks in civil engineering research. Tian et al. (2022) implement LDA when studying the global development of standard-essential patents (SEPs) through a technological topic analysis. Since LDA is an efficient topic model for processing patent texts, treating patents as multi-topic documents (Tian et al., 2022), Wang et al. (2020) apply LDA to the analysis of a firm’s patent portfolio, extracting technologies as topics to build an enterprise-patent-topic probability distribution and develop a two-axis model for evaluating a company’s competitiveness in R&D strategy. By employing LDA in technology and strategic management, they show that LDA can be an effective tool for evaluating an organization’s competitiveness. Edison and Carcel (2021) implement LDA in discourse analysis in finance, applying it to the US Federal Open Market Committee (FOMC) to detect the evolution of discussion topics and topic priorities of the banking system. Xu et al. (2022) support the implementation of the LDA model in the analysis of textual data extracted from online financial platforms (i.e., Shenzhen Stock Exchange Easy Interaction) to understand how interaction affects stock market efficiency through good and bad news, whereas Li et al. (2020) identify which factors affect commodity futures prices by applying dependent-sentence-LDA to a large set of news headlines. The existing LDA literature also deals with its application in marketing research and data processing. Wang et al. (2019) opt for a big data analytic approach through LDA to extract features of destination images of roughly 20 Chinese cities from a large volume of online travel blogs.
While Britt (2021) uses LDA in his longitudinal study of people’s interactions and discourses in online communities during the COVID-19 pandemic and their influence on individuals’ health information
gathering and sensemaking, Ryoo and Bendle (2017) also opt for LDA to investigate key topics in the social media strategies of US presidential candidates through online textual analysis of tweets, promoting the application of LDA in political marketing as well. Since LDA topic models are a reliable and valid approach to analyzing textual data in communication research (Maier et al., 2018), Sjøvaag and Pedersen (2018) also adopt an LDA approach to process online news content when investigating the impact of direct press support on online news content diversity. Focusing on brand-related user-generated content (UGC) on Twitter, Liu et al. (2017) apply text mining, LDA, and sentiment analysis to approximately 1.7 million tweets, showing that product, service, and promotion are significant topics in customer-brand interactions and that brand-related customer sentiments differ according to the industry. Özdağoğlu et al. (2018) use LDA when inspecting group customers’ online product and service reviews of Italian restaurants according to topics to determine customer needs and preferences. Similarly, Lim and Lee (2020) examine passengers’ online reviews to evince significant service quality dimensions of full-service and low-cost carriers, while Zhang (2019) explores topics in Airbnb customer reviews to study their implications for listing performance on the app. Calheiros et al. (2017) also study customer-generated online reviews of an eco-hotel, applying LDA topic modeling to the sentiment classification of customer feelings toward hotel issues and thereby supporting managers in enhancing customer satisfaction. LDA is thus also a valuable tool in brand management and customer relationship management (CRM). In performing topic modeling on publicly available corporate social responsibility (CSR) reports, Goloshchapova et al.
(2019) use LDA analysis to extract the main CSR topics for firms indexed in major stock markets in 15 industrialized countries, highlighting how employees’ safety, carbon emissions, human rights, and employees’ training support are among the most common. Hence, LDA can also support sustainability management and CSR strategy.
3.2.6
NLP in Management Science
As a major technique for text processing and textual data analysis, NLP has been applied to multiple fields in management science to better understand textual data. According to Kang et al. (2020), NLP can have multiple applications in management science, from accounting and finance to marketing and sales, also improving information systems and strategic management. Indeed, Markham et al. (2015) highlight how NLP may improve decision-making through proper information triangulation, as it can retrieve critical and/or needed pieces of information from multiple sources. In the following sections, we summarize the main NLP applications in various fields of management science.
3.2.6.1
NLP in Accounting and Finance
Most literature on NLP applications in accounting and finance concerns NLP techniques for stock market analysis and market reaction. In particular, NLP is used for analyzing news’ impact on sales. Loughran and McDonald (2016) advocate the use of textual analysis in finance and accounting research through methods such as sentiment analysis of financial documents. Eachempati and Srivastava (2021) confirm the effect of sentiment polarity on stock returns given unadjusted news in the market price, whereas Liao et al. (2021) implement automatic speech recognition (ASR), machine learning, and voice mining in combination with NLP to process a finance company’s unstructured voice and text data from debt collection calls to evince which strategies are more effective in customer persuasion. Huang et al. (2019) also encourage the adoption of financial textual analysis in China’s financial market research to better understand monetary decision-making. NLP can also support auditing procedures by automatically processing textual data from annual reports. For instance, Boskou et al. (2019) develop a classification model to assess internal audit quality, while Fisher et al. (2016) use NLP to process financial statement content and corporate reports to predict future performance. Wang and Guo (2012) study the correlation between a company’s performance, identified as the annual average stock price (AASP), and online recruitment information by adopting NLP, opinion mining, and competitive intelligence techniques, showing how investors also consider such information when evaluating firms to spot fraudulent ones.
3.2.6.2
NLP in Marketing and Sales
NLP is widely used in digital marketing for customer value creation and user-generated content (UGC) analysis. In particular, NLP may automate the study of customers’ posts and comments in and across multiple social networks, platforms, and websites, supporting sentiment analysis as well. Öhman and Metcalfe (2021) combine NLP and critical discourse analysis to study how beauty is marketed in Japan through social media. Similarly, Hasan et al. (2019) utilize NLP in the sentiment analysis of tweets to support an organization’s product marketing. In particular, they use NLP to preprocess data for tweet filtering, while they use bag-of-words (BOW) and TF-IDF for analyzing sentiment. Ramaswamy and DeClerck (2018) consider both structured and unstructured text data across multiple channels of customer feedback to evince customer perception. NLP can also be applied to online customer reviews to evaluate customer perception of attribute performance in product rating through sentiment analysis (Oh & Yi, 2021). Oh and Yi (2021) process online customer reviews on Amazon to extract major quality dimensions and related words describing customers’ sentiment, investigating not only the effect of features on customer satisfaction but also the implications of such sentiments for product ratings. This is also applicable to experience
marketing. Antonio et al. (2018) opt for ML and NLP approaches in their analysis of hotel online reviews to develop rating indexes that predict review ratings from a review’s textual elements. Barbierato et al. (2021) mix NLP, text mining, and sentiment analysis to sort TripAdvisor reviews of an experience (i.e., wine tour) and verify which elements create a successful wine tour experience in terms of customer satisfaction. Using latent semantic analysis (LSA), an NLP technique, Zhang and Koshijima (2019) try to reveal implicit information from online travel reviews, using text mining to analyze tourists’ feedback and problems. Indeed, in studying the increasing importance of “unpaid influence,” Williams et al. (2019) show how algorithm word of mouth (aWOM) has emerged in NLP, replacing traditional word of mouth (WOM) and electronic word of mouth (eWOM) in affecting and driving organizational decision-making. Sun et al. (2017) adopt an NLP approach to opinion mining in social media. Zhang and Huang (2022) show how influencer attraction and government promotion positively affect public travel interest by applying NLP to Weibo posts. Markham et al. (2015) examine unstructured textual data to support product development decision-making. While NLP may support sentiment analysis at the level of product/experience features and product/experience ratings, it can also be applied in B2B marketing strategies and analysis. In studying the implications of artificial intelligence (AI) systems for market knowledge, Paschen et al. (2019) explain the role of NLP in B2B marketing, claiming that NLP may identify AI-service systems capable of understanding human language, deciding which actions to take, and responding accordingly, while improving competitive intelligence by providing insights for B2B sales management.
3.2.6.3
NLP in Supply Chain and Operations Management
Literature dealing with NLP in operations management mostly regards the adoption of NLP techniques to process textual data for improving operations within an organization. Lu and Zhang (2021) process almost 6000 injury reports to detect issues in construction safety, whereas Asadabadi et al. (2022) use a mixed methodology approach, combining NLP and quality function deployment (QFD) to examine online product reviews from customers to improve products. NLP may also be valuable in operations management research, especially in compiling existing studies on NLP for systematic literature research. If Sahoo et al. (2022) choose DL techniques, including NLP, to conduct a systematic literature review mapping six knowledge clusters in manufacturing operations, Lu and Zhang (2021) underline that NLP is a major emerging technology in big data application in the construction industry. Rizun et al. (2021) apply NLP to business process management (BPM) research, extracting information and insights about business process complexity from business process textual data. Hirata et al. (2020) combine NLP, ML, and text mining approaches to retrieve from existing literature the key components of blockchain technology in supply chain, with blockchain technology integration with an organization’s Internet of things (IoT) being vital for efficient supply chain management.
Guo et al. (2017) consider another important external factor in operations management (i.e., competitors), highlighting that NLP can also be applied to competitor analysis for informing companies about their competitors and the relationships between them in the fitness mobile app business.
3.2.6.4
NLP in Strategic Management
NLP may be advantageous in strategic management when analyzing mergers and acquisitions (M&A) documents and annual reports. For instance, Vinocur et al. (2022) employ NLP to process unstructured data from both annual reports and M&A synopses to investigate the impact of M&A capability on long-term firm performance in terms of return on equity (ROE) and price-to-book value. Much related literature also deals with the application of NLP to the analysis of corporate and executive-level speech. Key and Keel (2020) use an NLP system (i.e., IBM Watson) to examine how C-suite marketers’ speech articulation may increase their strategic influence within an organization. The influence of top management’s communication on a firm’s performance may also be studied through sentiment analysis. Pengnate et al. (2020) study top management’s communication in times of economic crisis to reveal how managers’ interpretation and discretion subsequently affect performance and performance trajectory. Menon et al. (2018) employ NLP when comparing companies’ strategies and finding key strategic elements, combining two NLP techniques (i.e., topic modeling and vector space models) to spot and compare critical strategy constructs across multiple industries through the analysis of business descriptions and annual reports.
3.2.6.5
NLP in Sustainability Management
In terms of sustainability management, Samant and Sangle (2016) focus on the stakeholders’ changing role in sustainable value creation, using NLP in combination with text mining methodology to evince trends in firms’ approaches from past sustainability literature. In studying existing literature on the role of blockchain technology in supply chain management through NLP, Hirata et al. (2020) show its critical role in food sustainability, as it improves supply chain traceability and transparency and accelerates the implementation of supply chain sustainability strategies. Existing literature on the application of NLP techniques to sustainability management also deals with the analysis of sustainability-related documents, such as sustainability reports. Conducting a sentiment analysis on various sustainability reports, Kang and Kim (2022) show that NLP can be useful in organizational settings for investigating which sustainability trends and themes are crucial for companies, while also clarifying how differences in the consideration given to sustainability reporting across businesses affect company image. Amel-Zadeh et al. (2021) show that NLP is also effective in measuring a company’s
compliance with UN SDGs by applying NLP techniques to an organization’s sustainability-related documents, while Gutierrez-Bustamante and Espinosa-Leal (2022) evaluate sustainability reports of Nordic listed companies through NLP, assigning a quantitative ranking score to each company. Nevertheless, as many companies still fail to account for financial data on sustainability achievements, Luccioni et al. (2020) embed NLP applications into a custom model that scrutinizes financial reports to identify sections containing climate information and other relevant sustainability data. NLP is also applied to sustainability management to better understand the views of customers and other stakeholders on sustainability issues. Ballestar et al. (2020) implement social listening on Twitter to examine conversations’ sentiments and infer people’s feelings toward sustainability, whereas Yamano et al. (2022) combine NLP, co-occurrence network analysis, and principal component analysis (PCA) to process students’ open-ended answers and investigate how students’ perception of sustainable development changed from the beginning to the end of a sustainability class.
3.2.6.6
NLP in Innovation and Information Management
According to Kang et al. (2020), NLP applications in information systems and innovation management mostly deal with R&D investment, process improvement, and the effect of interpersonal communication on original idea generation. Taskin and Al (2019) claim that NLP can speed up and improve data processing through improved bibliometrics and social network analysis at the organizational level, while Kostelník and Dařena (2021) study how conversational interfaces (i.e., chatbots) may be a novel approach to accessing business data more efficiently, claiming that chatbots may enable constant feedback, decompose complex database queries, and build conversations. While Hutchinson (2020) advocates the development of NLP and ML methodologies for archival processing (e.g., e-mails), as they can be applied to sensitivity reviews for extracting personal information from records and to appraisal, Gaizauskas and Wilks (1998) state that NLP may support text processing through efficient information extraction from legal texts and formal software system requirements specifications. Yang et al. (2006) use NLP and ontology concepts to evaluate the effectiveness of a prototype IT system that automatically extracts IT product specifications from online textual sources, while Rizun et al. (2021) implement NLP to understand business process complexity in IT ticketing service management, applying NLP to the analysis of the interdependence between business processes and organizational IT systems and service management.
3.2.6.7
NLP in Human Resource Management (HRM)
Jia et al. (2018) define six basic tasks of HRM, including the recruitment of human resources. If this implies that recruiting is a critical activity for businesses, escalating
digitization has radically changed how companies advertise, search, and recruit workers, as innovative technology and the Internet have overturned decision-making in HRM in the matchmaking between a company’s required profiles and job applicants. NLP is usually used in the recruitment phase to process and scan enormous quantities of resumes. Haddad and Mercier-Laurent (2021) opt for an ML approach to evaluate CVs to replace tedious manual processing. However, in dealing with resumes, NLP models must be trained with resume data to acquire necessary vocabulary. In addition to supporting data collection and processing for constant innovation, NLP can be implemented in HRM to choose valuable people for innovation projects. In studying the crucial role of collaborative innovation platforms in organizational innovation strategies, Montelisciani et al. (2014) deal with the problem of building effective collaborative teams in crowdsourcing platforms, advocating for a successful matchmaking mechanism. In developing an effective selection method, they use a mixed methodology, combining NLP algorithms and semantic ontologies to find which characteristics these teams should have, developing a team-building method that calculates the contribution of each problem solver to the final solution to select the most suitable people. The advantages of using NLP in HRM for CV screening will be analyzed in more detail in Sect. 3.2.7. Based on personal elaboration, Table 3.2 summarizes main NLP-based techniques used in management science.
3.2.7
NLP and CV Analysis
As a resume can be conceived as composed of textual data encompassing words and sentences (Maheshwari et al., 2010), it can be considered a bag of words revealing information about a job applicant’s acquired skills and knowledge. Hence, NLP techniques can be applied to resume screening, that is, determining whether a job applicant is qualified for a job position given the self-reported CV information and the profile required (Bharadwaj et al., 2022). Jia et al. (2018) propose a framework combining the six dimensions of HRM with their AI applications, including computer vision and NLP, to process massive sets of natural language data. In applying NLP to CV analysis, Sanyal et al. (2017) define a three-layer approach involving lexical, syntactic, and semantic analyses. Daryani et al. (2020) also develop an NLP system for extracting relevant information from unstructured CVs to support resume screening, matching resumes with job descriptions through cosine similarity computed on a vectorization model. Kinge et al. (2022) propose a resume screening system combining both ML and NLP, whereas Harsha et al. (2022) combine NLP and an automated ML algorithm to reduce human involvement and errors, considering the massive number of resumes received by companies today. Opting for an NLP approach to resume screening as well, Najjar et al. (2021) develop a three-block system, encompassing a training block that uses a portion of resumes for training, a matching block for matching resumes with the job
Table 3.2 Main applications of NLP in management science. Personal elaboration
Management science | NLP applications
Finance and accounting | Stock market analysis and market reaction; news’ impact on sales; textual analysis of finance and accounting research; sentiment analysis of financial statements and corporate reports for predicting future performance; sentiment analysis of news about market prices for analyzing their effect on stock returns; automatic speech recognition (ASR), machine learning, and voice mining for more effective customer persuasion; automatic processing of textual data from organizations’ annual reports for more efficient auditing procedures
Marketing and sales | Customer value creation: sentiment analysis of customers’ feedback and reviews across multiple channels for more effective customer perception analysis of attribute performance in product/experience rating and customer satisfaction. Digital marketing: automation of user-generated content (UGC) analysis (e.g., customers’ posts and comments in and across multiple social networks and platforms) and discourse analysis of UGC through social media. Product/Experience marketing: sentiment analysis of UGC across multiple social media and online reviews for rating index development; sentiment analysis of online UGC for improving elements and investigating problems of an experience or product in terms of customer satisfaction, thus boosting product development decision-making. B2B marketing: enhanced competitive intelligence by providing insights for B2B sales management
Supply chain and operations management | Enhanced data collection and extraction of information and insights for improving business process management (BPM) and operations within an organization (e.g., safety issues, product development, analyzing business process complexity, blockchain technology in supply chain) and refining competitor analysis
Strategic management | Analysis of mergers and acquisitions (M&A) documents and annual reports; sentiment analysis of corporate, top management, and executive-level speech and its strategic influence on organizational performance; analysis of business descriptions and annual reports for finding critical strategic constructs and comparing key strategic themes across companies and/or competitors
Sustainability management | Analysis of existing sustainability management literature for identifying changing trends in organizations’ approaches to sustainable value creation; textual analysis of organizations’ sustainability reports and documents for checking companies’ compliance with UN SDGs or assigning sustainability scores; social listening and sentiment analysis of social media conversations for understanding people’s perceptions of sustainability-related issues
Innovation and information management | R&D investment, process improvement, and the effect of inter- or intra-organizational interpersonal communication on generation of innovative ideas: archival processing (e.g., e-mails); improved bibliometrics and social network analysis at the organizational level; analysis of conversational interfaces (i.e., chatbots) and IT ticketing service management for more efficient business data access, processing, and IT systems and service management
Human resource management | Automation of resume screening for matchmaking a job applicant’s self-reported information and a job profile on the basis of specific job requirements; enhanced matchmaking and human resource management mechanisms for resume classification and resume/candidate ranking through cosine similarity; automatic interview-to-text conversion and processing for matching a candidate’s characteristics and interview results with corresponding positions
description, and an extracting block isolating top-ranking applicants accordingly. While Gopalakrishna and Vijayaraghavan (2019) classify IT profiles’ resumes, Trinh and Dang (2021) focus on a similar task, opting for cosine similarity as a way to check the proximity between a resume and a job post and drafting a recommendation list accordingly. Similarly, Anand and Dubey (2022) opt for NLP to check the affinity between a job profile requirement and a resume, claiming that in this way NLP supports not only recruiters in choosing the best-fitting candidates but also job applicants in submitting their CVs to the most suitable positions. In evaluating resumes extracted from LinkedIn, Faliagka et al. (2014) choose ML and semantic matching techniques to improve candidate ranking by inferring candidates’ personality. Ben Abdessalem and Amdouni (2011) implement text mining methods in e-recruiting, highlighting how correct information extraction in HRM requires that a CV’s structure and constituent elements are correctly identified, so that NLP-driven systems can extract similar information even from CVs that differ widely in structure and content. In examining NLP applications in resume analysis, Haddad and Mercier-Laurent (2021) highlight how many terms are not useful for research purposes and candidate assessment, which makes data polishing critical. This entails eliminating stop words and performing lemmatization. In particular, after cleaning the data, they use an NLP algorithm for tokenizing and for training a named entity recognition (NER) model to extract information from CVs (Haddad & Mercier-Laurent, 2021). Similarly, Alanoca et al. (2020) blend text mining techniques with NLP ones to identify the most pertinent CVs given a job offer’s requirements through average values of term frequency-inverse document frequency (TF-IDF). Phillips et al. (2019) analyze faculty members’ CVs to spot potential misrepresentation and misstatements regarding research accomplishments when applying for a position.
Combining NLP and long short-term memory (LSTM), Bharadwaj et al. (2022) develop an algorithm-based model for sorting CVs into various job positions, given the skills reported in a CV, through a linear support vector classifier (SVC). Adopting an NLP approach as well, Sanyal et al. (2017) develop a parse model that finds keywords and clusters them into sectors, thus converting unstructured CV text into a structured data format and providing the most suitable CV on the basis of keyword matching.
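Several of the studies above rank resumes against a job post by combining TF-IDF weighting with cosine similarity. The following is a minimal, standard-library sketch of that general idea only. The tokenizer, the smoothed IDF variant, and the toy texts are assumptions made here for illustration, and the code does not reproduce the pipeline of any cited study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenization: lowercase alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight dicts) for raw texts."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            # Relative term frequency times a smoothed IDF (an assumed variant).
            term: (count / len(toks))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical job post and resumes.
job_post = "data scientist with python and machine learning experience"
resumes = {
    "cv_1": "python developer with machine learning and data analysis experience",
    "cv_2": "graphic designer skilled in illustration and branding",
}

vectors = tfidf_vectors([job_post] + list(resumes.values()))
job_vec, cv_vecs = vectors[0], vectors[1:]

# Rank resumes by decreasing similarity to the job post.
ranking = sorted(
    ((name, cosine(job_vec, vec)) for name, vec in zip(resumes, cv_vecs)),
    key=lambda pair: pair[1],
    reverse=True,
)
```

In this toy case the first resume shares most of its vocabulary with the job post and therefore ranks first, while the second shares only a function word. A real system would also remove stop words and lemmatize terms, as noted above.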
In conclusion, while HRM uses AI systems for unveiling trends and insights about human characteristics and profiles (Zeng, 2020), NLP technology makes recruiters’ work more efficient by enabling them to avoid typing, save time through speech-to-text conversion, and easily match a candidate’s characteristics and interview results with corresponding positions (Jia et al., 2018).
3.3
The Empirical Strategy1
This section reports the approach we applied to the analysis of KSC of candidates in our CV dataset. In particular, in Sect. 3.3.1 we explain the dataset construction, in Sect. 3.3.2 we focus on the skill extraction procedure, and in Sect. 3.3.3 we describe the clustering procedure.
3.3.1
Dataset Construction
We built the dataset for this analysis from two main sources. The first is a private CV structured dataset from a headhunting company. This dataset has 4001 observations linked by an id variable which uniquely identifies candidates. The dataset has 22 variables,2 including both numeric variables, such as “years of experience,” and alphanumeric variables such as “city,” “skills,” “job title,” and more extended textual description about each candidate, such as “candidate summary,” “resume,” the whole resume of the candidate, “candidate info,” and “employers list.” For the purpose of this analysis, we only retain two of them: the job title, which allows us to match each candidate to an ATECO code, and the whole text of the CVs in the variable “resume,” which we use to conduct the textual analysis. This dataset covers ATECO classes which are related to the ICT job market such as data scientists, computer scientists, and so on. The second dataset is composed of 4074 CVs retrieved from LinkedIn, belonging to ATECO classes related to creative and cultural fields. We selected the number of CVs to be extracted according to the ISTAT proportions for the ICC. In order to retrieve CVs specifically related to the ATECO sectors of our interest, we queried the LinkedIn professionals database with a list of job titles that are established ex ante as related to the ATECO sectors, selecting only profiles of those working in Italy at any point of their career, and preferentially, having their profiles
1 This paragraph is also co-authored with Nicolò Tamagnone and Grazia Sveva Ascione.
2 The first dataset of candidates has been supplied by Open Search Group, a headhunting company, founded in 2013, specialized in STEM profiles recruiting. The company is London-based, and its database includes both Italian and international candidates, which are then uploaded into Salesforce to be exported as comma delimited files.
written in English and with a verbose candidate summary.3 Then, we anonymized each CV for privacy reasons, creating a consistent and distinct candidate id across the two datasets. Finally, we merged the two datasets, obtaining a dataset with 8075 observations, each one uniquely corresponding to a candidate with their own CV.
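The merge step described above can be sketched as follows. This is an illustrative sketch only: the field names, the id format, and the records are hypothetical, and dropping a name field does not by itself anonymize the free text of a resume. It shows one way of assigning a single, distinct candidate id across two sources before merging them.

```python
from itertools import count


def merge_with_ids(sources, drop_fields=("name", "email")):
    """Merge record lists, drop identifying fields, assign one id per candidate."""
    next_number = count(1)
    merged = []
    for source_name, records in sources:
        for record in records:
            # Keep only non-identifying fields (hypothetical field names).
            kept = {k: v for k, v in record.items() if k not in drop_fields}
            # One shared counter guarantees ids are distinct across sources.
            kept["candidate_id"] = f"C{next(next_number):05d}"
            kept["source"] = source_name
            merged.append(kept)
    return merged


# Hypothetical toy records standing in for the two CV sources.
headhunter_cvs = [
    {"name": "A. Rossi", "job_title": "data scientist", "resume": "Python, SQL, statistics"},
    {"name": "B. Bianchi", "job_title": "software developer", "resume": "Java, cloud services"},
]
linkedin_cvs = [
    {"name": "C. Verdi", "job_title": "graphic designer", "resume": "Illustration, branding"},
]

merged = merge_with_ids([("headhunter", headhunter_cvs), ("linkedin", linkedin_cvs)])
ids = [row["candidate_id"] for row in merged]
```

With the sizes reported in the text, the same logic yields 4001 + 4074 = 8075 rows, one per candidate.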
3.3.1.1
ESCO Skills, Knowledge, and Competences Taxonomy
The goal of this analysis was to study the different set of skills, knowledge, and competences of Italian candidates through CV analysis. To do that, we made express reference to the ESCO skills taxonomy. ESCO is a multilingual classification of Skills, Competences, Qualifications, and Occupations created by the European Commission to improve the supply of information on skills demand in the labor market. It is designed to assist individuals, employers, universities, and training providers by giving them up-to-date and standardized information on skills. More in detail, ESCO classification distinguishes between skills/competence concepts and knowledge.4 Each concept also includes an explanation through a description. ESCO is a live entity that is undergoing a constant process of updating and enriching. The last release is ESCO v1.1 at the end of 2021.5 The skills pillar includes 13485 concepts referring to 2942 occupations organized in hierarchical structures which contain four sub-classifications, each of them targeting different types of skills/ knowledge concepts: knowledge; skills; attitudes and values; and language skills and knowledge. In addition, the skills are also divided into transversal, languages, and digital skills. Few works previously used ESCO classification to describe the job market. For instance, le Vrang et al. (2014) worked on semantic interoperability to address the chronic mismatch between unemployed workers’ skills and companies’ needs. Mirski et al. (2017) used the ESCO database to triangulate skills, learning items, and job offerings. Furthermore, recent work by Colombo et al. (2019) developed a set of innovative tools for labor market intelligence by applying machine learning techniques to web vacancies on the Italian labor market. 
Their approach creates a taxonomy for skills and maps it into the ESCO classification, calculating, for each occupation, the different types of skills required by the market alongside a set of relevant variables such as region, sector, education, and level of experience.
3 The list of job titles related to each ATECO class is available upon request.
4 ESCO classification is partially based on O*Net and Canadian skills and knowledge glossary.
5 ESCO v1.1 includes 515 new skills and occupations, in line with recent evolutions of the labor market.
3.3 The Empirical Strategy
3.3.1.2 Mapping CV to ESCO and ATECO Classification
The first task was to assign an ATECO code to each CV in the Open Search Group dataset. To this end, we first mapped each candidate’s job title to the ESCO classification. In particular, we used the columns “Title,” which reports the candidate’s last job title, and “Candidate_short_list,” which reports the candidate’s previous occupations and skills. We encoded both columns using FastText pretrained embeddings,6 creating an average vector over all the words in the two columns. At the same time, we retrieved from the ESCO occupations dataset all the job titles (identified by different labels, such as “preferred labels” and “alternate labels”) related to STEM disciplines, for a total of 426 occupations. We then created embeddings for these labels using the same technique and averaged the resulting vectors, so that each occupation was represented by a single vector combining its preferred and alternate labels. Finally, we calculated the cosine similarity between each ESCO occupation vector and each candidate vector, and we assigned to each candidate the ESCO label whose vector was closest to the candidate vector.
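The matching logic described above can be illustrated with a minimal sketch. It assumes toy three-dimensional word vectors in place of the pretrained FastText embeddings, and the names `TOY_VECTORS`, `average_vector`, `occupation_vector`, and `closest_occupation` are hypothetical helpers introduced only for illustration, not the code used in the study.

```python
import math

# Toy 3-dimensional word vectors standing in for pretrained FastText
# embeddings (hypothetical values, for illustration only).
TOY_VECTORS = {
    "data": [0.9, 0.1, 0.0],
    "analyst": [0.8, 0.2, 0.1],
    "scientist": [0.7, 0.3, 0.0],
    "civil": [0.0, 0.9, 0.2],
    "engineer": [0.1, 0.8, 0.3],
}


def average_vector(text):
    """Average the vectors of all known words in a text (None if no word is known)."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return None
    return [sum(components) / len(vectors) for components in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def occupation_vector(labels):
    """Average the vectors of the preferred and alternate labels of one occupation."""
    vectors = [v for v in (average_vector(label) for label in labels) if v is not None]
    return [sum(components) / len(vectors) for components in zip(*vectors)]


def closest_occupation(candidate_text, occupations):
    """Return the occupation whose vector is most similar to the candidate vector."""
    candidate = average_vector(candidate_text)
    return max(
        occupations,
        key=lambda name: cosine_similarity(candidate, occupation_vector(occupations[name])),
    )


# Hypothetical occupations: the first label plays the role of the preferred label.
occupations = {
    "data scientist": ["data scientist", "data analyst"],
    "civil engineer": ["civil engineer"],
}
best = closest_occupation("Data Analyst", occupations)
```

With real embeddings, the same averaging would be applied to the words of the “Title” and “Candidate_short_list” columns, and the candidate would be compared against all 426 occupation vectors.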
3.3.2 Skills’ Extraction
Fig. 3.1 (personal elaboration) reports the main steps we performed to extract skills from our data. The skills’ extraction process can be summarized in three main steps:
1. Data pre-processing/cleaning
2. Skills/knowledge/competences detection
3. ESCO mapping
As for the first step, we homogenized our dataset to one language. Since about 20% of the CV texts were in Italian and the remaining 80% were in English, we created a monolingual dataset by translating the Italian texts into English with the open-source translation systems provided by the Language Technology Research Group at the University of Helsinki.7 Pre-processing then involved a general text cleaning. CVs’ texts are the result of extracting textual content from digital documents, which is a nontrivial process since it involves transforming a graphical text representation with a complex layout (PDF document) into a plain form (txt document). This process inevitably creates some noise within texts, such as wrong spacing and word order. In addition, CV texts are rich in symbols such as special characters (for instance, bullet symbols), graphic structures, and non-standard punctuation. For this reason, we built a simple preprocessing process to remove these elements and keep plain text contents as similar as possible to that of the corresponding source files.

6 FastText is an open-source, free, lightweight library that allows users to learn text representations and text classifiers. The embeddings are learned using both English web scrape and Wikipedia text.
7 https://blogs.helsinki.fi/language-technology/hi-nlp/crosslingual/

3 Methodology and Empirical Strategy

Fig. 3.1 Skill extraction diagram. Personal elaboration. [Diagram labels: Data; IT to EN translation; NESTA Open Jobs Observatory Skills Project; EMSI Skills; Knowledge base; approximate string matching; KSC Detection; ESCO mapping; European Skills, Competences, Qualifications and Occupations (ESCO); ESCO Semantic Index; Mapped Skills/Knowledge/Competences List]

There are several works in the literature regarding skills identification. Khaouja et al. (2021) survey current research and possible future directions on skill detection through a systematic review of one hundred research articles on the topic. Several methods have been developed, but they can mainly be grouped into two categories. The former, which we refer to as the “top-down” approach, is based on taxonomies and knowledge bases of skills terms, that is, a dictionary-based search for terms within target texts. The latter is instead based on the annotation of big datasets, which enables the development of machine learning (ML) models, particularly named entity recognition models (Lample et al., 2016) or multilabel classification systems (Bhola et al., 2020). Although today more attention is given to strictly ML-based methods (Zhang et al., 2022), we decided to use a modified top-down approach, adding more advanced NLP/ML components to improve its results. We made this choice because we did not have an annotated dataset, and because our dataset is limited in size and not suitable for the development of more complex models. In addition, all the cited works identify skill terms within job postings, which are very different texts from resumes. A resume does not present descriptive text but rather an unordered list of skills, knowledge, and competences divided into macro-sections (education, work experience, etc.), which makes advanced models, often based on word context analysis, less effective. For this research, we retrieved the knowledge/skills/competence (KSC)-related keywords using two different tools.
First, we followed the methodology proposed by Nesta (2021) in the Open Jobs Observatory (OJO) project,8 which combines NLP and rule-based techniques to retrieve skills from job ads and map the retrieved keywords into the ESCO taxonomy. As described in Sect. 3.3.1, ESCO is a multilingual classification of Skills, Competences, Qualifications, and Occupations created by the European Commission that includes more than 13,000 KSC terms.

8 https://github.com/nestauk/ojo_daps_mirror/tree/main/ojd_daps/flows/enrich/labs/skills

Each entity consists of one preferred term (representing the entity itself), alternative labels, and a description that is usually composed of one or several sentences. Nesta proposes a top-down approach, which searches for ESCO taxonomy terms within job advertisements. However, such a rule-based detection of exact keyword matches within a text has many limitations, such as its inability to capture the context of the words and, especially, its failure to account for the variety of ways in which a KSC concept might be expressed. For instance, the skill of “creative thinking” could also be expressed as “to think creatively,” or perhaps even “innovative thinking.” To remedy this problem, the OJO project extends the ESCO taxonomy by building, for each term, an object composed of the available textual information on the skill’s preferred and alternative labels, as well as its description. It then identifies within these objects what it calls “surface forms,” meaning simpler and more generic words and phrases that represent the underlying “skill entity.” This idea mirrors the theory of surface and deep structures in Noam Chomsky’s work on transformational grammar (Chomsky, 1971). Through some preprocessing and NLP methods, such as lemmatization9 and noun phrase detection, this methodology yielded an average of approximately ten surface forms per KSC entity. The quality of surface forms was then evaluated by combining machine learning techniques (such as tf-idf and contextual sentence embeddings) with manual supervision, in order to retain only the higher-quality ones and exclude terms that were too general or too ambiguous, i.e., surface forms that were found to be present in numerous KSCs. With this method, each keyword (preferred label) was supported by simplified alternative versions and synonyms, increasing the generality and accuracy of the extraction. 
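The exclusion of ambiguous surface forms can be illustrated with a deliberately simplified sketch. It only counts in how many skill entities each surface form appears and drops the forms shared by too many entities; the tf-idf scoring, sentence embeddings, and manual supervision mentioned above are not reproduced. The data, the function `filter_ambiguous`, and the `max_skills` threshold are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical surface forms per skill entity (illustrative data only).
surface_forms = {
    "creative thinking": {"creative thinking", "think creatively", "thinking"},
    "critical thinking": {"critical thinking", "think critically", "thinking"},
    "project management": {"project management", "manage project"},
}


def filter_ambiguous(forms_by_skill, max_skills=1):
    """Drop surface forms that occur in more than `max_skills` skill entities."""
    counts = Counter(form for forms in forms_by_skill.values() for form in forms)
    return {
        skill: {form for form in forms if counts[form] <= max_skills}
        for skill, forms in forms_by_skill.items()
    }


filtered = filter_ambiguous(surface_forms)
```

Here the generic form “thinking” is removed because it is shared by two entities, while the more specific forms are kept.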
To further improve the effectiveness of this approach, KSC terms and related surface forms were searched not only as exact matches but also through an approximate string matching technique, computing Jaro-Winkler similarity scores10 between the searched keyword and the previously preprocessed and tokenized words of the CV text. We retained only string comparisons with a score above 0.95, thus reducing false negatives due to typos or small differences. Secondly, we extracted other KSC-related keywords using “Emsi,” an API service which is rapidly gaining popularity in the human resources (HR) field. It allows us to identify more than 32,000 skills extracted from job postings, profiles, and resumes, and it is very up to date for skills related to the ICT sectors (for instance, software names and development techniques), which are updated every 2 weeks.11 In this case, it was not possible to reapply the full methodology proposed by Nesta, since the only textual data available is the KSC term itself (preferred label), which makes the derivation of the respective surface forms ineffective. For this reason, we filtered the Emsi skills dictionary, keeping only proper software names, technologies, and occupation-specific terms. Again, we matched keywords through the same string matching procedure described above.
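The approximate matching step can be sketched as follows. This is a plain-Python implementation of the standard Jaro-Winkler similarity (prefix scale 0.1, prefix length capped at four characters), combined with the 0.95 threshold reported above. The function names are hypothetical, and the sketch compares single tokens only; how multi-word keywords were handled is not specified in the text.

```python
def jaro(s1, s2):
    """Jaro similarity between two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and not too far apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i in range(len1):
        start = max(0, i - window)
        end = min(i + window + 1, len2)
        for j in range(start, end):
            if not matched2[j] and s1[i] == s2[j]:
                matched1[i] = True
                matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted for strings sharing a common prefix (max 4 chars)."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == 4:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def fuzzy_matches(keyword, tokens, threshold=0.95):
    """Return the tokens whose similarity to the keyword exceeds the threshold."""
    return [token for token in tokens if jaro_winkler(keyword, token) > threshold]


# Toy CV tokens: "pyhton" is a typo that exact matching would miss.
hits = fuzzy_matches("python", ["pyhton", "java", "sql"])
```

A high threshold such as 0.95 tolerates a swapped pair of letters in a short word but rejects merely related terms; for instance, “java” against “javascript” scores 0.88 and is not matched.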
9 https://en.wikipedia.org/wiki/Lemmatisation
10 https://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
11 More information on this project is available at https://skills.emsidata.com/
The last step was to normalize the terms detected in the CVs against a common taxonomy. Since the Nesta extraction methodology was built from the ESCO taxonomy, we decided to use the latter as the reference taxonomy. This normalization only affects skills detected through the Emsi dictionary, since the terms extracted with the Nesta approach are already mapped to ESCO. To map the extracted Emsi KSC into the ESCO taxonomy, we encoded both ESCO and Emsi skills following the steps explained in the following subsection (Sect. 3.3.2.1). After the encoding procedure, we compared the encoded Emsi skills with those of ESCO, and we assigned each Emsi skill to its closest vector in the ESCO list (reported as ESCO semantic index in Fig. 3.1) in terms of cosine similarity. In our research, a total of 5534 unique skills were identified in the KSC extraction process over the whole dataset. The result for each CV analyzed is an unordered list of objects as shown in Fig. 3.2, where each element represents a skill detected in the document. Fig. 3.2 (personal elaboration) depicts a KSC object. Thanks to the mapping to ESCO, we have information not only on the specific label, its alternative labels, and its description, but also on the type (Type KSC), indicating how widely a knowledge, skill, or competence concept can be applied, and on the preceding hierarchy level of the skill (Title KSC). The latter is particularly useful for assigning each specific KSC to its broader category, thanks to the hierarchical tree structure of the taxonomy.
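To make the structure of a KSC object more concrete, the following sketch models it as a small Python data class and groups detected skills by their broader category. The field names (`label`, `alt_labels`, `description`, `type_ksc`, `title_ksc`) and all example values are hypothetical: they mirror the elements listed above but are not taken from Fig. 3.2 or from ESCO itself.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class KSC:
    """One detected skill mapped to the taxonomy (field names are hypothetical)."""
    label: str
    alt_labels: list = field(default_factory=list)
    description: str = ""
    type_ksc: str = ""   # how widely the concept can be applied
    title_ksc: str = ""  # preceding hierarchy level (broader category)


def group_by_broader_category(kscs):
    """Collect the labels of the detected skills under their broader category."""
    groups = defaultdict(list)
    for ksc in kscs:
        groups[ksc.title_ksc].append(ksc.label)
    return dict(groups)


# Made-up skills detected in a single CV (illustrative values only).
cv_skills = [
    KSC(label="Python", type_ksc="sector-specific", title_ksc="computer programming"),
    KSC(label="SQL", title_ksc="computer programming"),
    KSC(label="creative thinking", title_ksc="thinking skills"),
]
groups = group_by_broader_category(cv_skills)
```

Grouping on the parent level in this way is what allows specific skills to be aggregated into broader categories in later analyses.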
3.3.2.1 Creation of Vector-Based Representations of KSC
In order to compare different skills, we used text encoding, defined as the process of converting meaningful text into a vectorial representation that preserves the context and the relationships between words. In our research, we combined two distinct encoding techniques, as represented in Fig. 3.3 (personal elaboration), which depicts our KSC encoding process. First, for each skill we found, we created a short corpus by joining its title, description, and alternate labels. To get the best encoding, we created embeddings for this text using both a pre-trained BERT-style transformer-based language model (SentenceTransformers, as defined in Reimers & Gurevych, 2019) and a custom Word2Vec model (Mikolov et al., 2013). We trained the Word2Vec model with the continuous bag-of-words (CBOW) architecture and a window size of ten words, using the text of around 132,000 Italian job ads extracted from LinkedIn12 and collected over a period of 2 months, and keeping the same vector size (384) as the pre-trained transformer-based model we selected. We included only ads written in English, so that, unlike for our starting dataset, no translation model was applied. We followed this strategy because job-related text (like curricula or job advertisements) is a peculiar kind of text, for which a generic transformer-based model might not create tailor-made vector embeddings.

12 Each job ad has been preprocessed to exclude special characters, punctuation, and stop words.

Fig. 3.2 KSC object. Personal elaboration
Fig. 3.3 Encoding process of KSC. Personal elaboration

Hence, to improve our vectorial representations, we followed the idea of Alghanmi et al. (2020), who suggest combining generic transformer-based embeddings with static vector representations trained on a domain-specific dataset. To obtain the final vectorial representation for each skill, we first averaged the Word2Vec vectors of all the words in each document, obtaining a document-level encoding. We then combined this encoding with the transformer-based one through mean pooling, where each element of the final vector is the average of the respective components of the two vectors.
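The combination of the two encodings can be sketched in a few lines. The example uses toy four-dimensional vectors instead of the 384-dimensional ones, and the helper names `mean_vector` and `encode_skill` are hypothetical. Falling back to the sentence vector when no word of the document has a static vector is an assumption of this sketch, not something stated in the text.

```python
def mean_vector(vectors):
    """Element-wise average of a list of equal-length vectors."""
    return [sum(components) / len(vectors) for components in zip(*vectors)]


def encode_skill(tokens, word_vectors, sentence_vector):
    """Mean-pool a document-level static vector with a contextual sentence vector."""
    known = [word_vectors[token] for token in tokens if token in word_vectors]
    if not known:
        # Assumption: without any known word, keep the sentence vector as is.
        return list(sentence_vector)
    document_vector = mean_vector(known)  # average of the static word vectors
    return mean_vector([document_vector, sentence_vector])


# Toy static word vectors and a toy sentence embedding (hypothetical values).
word_vectors = {
    "manage": [1.0, 0.0, 2.0, 0.0],
    "budgets": [3.0, 2.0, 0.0, 0.0],
}
sentence_vector = [0.0, 1.0, 1.0, 4.0]
final_vector = encode_skill(["manage", "budgets"], word_vectors, sentence_vector)
```

Because the two representations are averaged rather than stacked, the final vector keeps the same length as its inputs, which is why both models need the same vector size.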
3.3.2.2 Network Representation
As a first step of the analysis, we studied our skill corpus through a graph structure. Graphs are a powerful representation formalism that can be applied to a variety of aspects of natural language processing; in our case, we can use graphs to understand the relationships between extracted KSC if we imagine them as part of a knowledge network. Hence, we built a standard graph, consisting of a set of nodes and of edges that connect pairs of nodes. In our case, the nodes were the unique skills identified, and an edge was added between two different KSC when they co-occurred at least once within the same CV. In this way, we constructed an undirected graph, where we weighted each edge with the cosine similarity between the corresponding KSC embeddings obtained in the encoding process. The length of each edge was inversely proportional to its weight, so that two skills with high cosine similarity were represented closer to each other within the graph. As a final step, we set the size of each node to the frequency with which the respective skill was identified across all documents in the dataset.
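The construction of the graph can be sketched with plain dictionaries. The function `build_skill_graph` and the toy data are hypothetical, and the sketch counts each skill once per CV when computing node sizes, which is one possible reading of the frequency described above.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def build_skill_graph(cv_skills, embeddings):
    """Return node sizes and weighted undirected edges of the co-occurrence graph."""
    node_size = Counter()
    edges = {}
    for skills in cv_skills:
        unique = sorted(set(skills))  # sorting gives one canonical key per pair
        node_size.update(unique)      # assumption: one count per CV
        for a, b in combinations(unique, 2):
            if (a, b) not in edges:
                edges[(a, b)] = cosine_similarity(embeddings[a], embeddings[b])
    return node_size, edges


# Toy data: three CVs and two-dimensional skill embeddings (hypothetical values).
cv_skills = [["python", "sql"], ["python", "design"], ["python", "sql"]]
embeddings = {"python": [1.0, 0.0], "sql": [1.0, 1.0], "design": [0.0, 1.0]}
node_size, edges = build_skill_graph(cv_skills, embeddings)
```

In this toy case, “design” and “sql” never appear in the same CV, so no edge connects them even though their embeddings are similar: co-occurrence decides whether an edge exists, while similarity only sets its weight.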
3.3.3 Clustering
At this point of the analysis, we were interested in understanding the proximity between skills and the proximity between candidates’ CVs. To do that, we applied clustering techniques to the vectorial representations of skills obtained as explained in the previous step; further, we followed the same procedure to create a single vector representing each CV.13 The clustering technique used in both cases was HDBSCAN (Malzer & Baum, 2020; McInnes et al., 2017), because we expected irregular clusters with different shapes, sizes, and densities, considering that not all skills are equally represented in our dataset. HDBSCAN is a hierarchical density-based clustering algorithm based on the concept that clusters are data partitions that have a higher density than their surroundings. Our vectors have a length of 384, and high-dimensional data requires more observed samples to produce sufficient density. HDBSCAN can suffer in this scenario, and in order not to incur the curse of dimensionality, we had to reduce the dimensionality of the vectors, making density more evident before the clustering procedure. To do that, we used the UMAP (Uniform Manifold Approximation and Projection for Dimension Reduction) algorithm (McInnes et al., 2018), which has been proven useful in reducing the dimensionality of embeddings by the literature (see, for instance, Asyaky & Mandala, 2021; Hu et al., 2020).

13 In the latter case, to represent each CV we averaged the vectors of all the skills contained in that CV.

UMAP was set to perform nonlinear, manifold-aware dimension reduction so that we could get the dataset down to a number of dimensions small enough for a density-based clustering algorithm to make progress. Using UMAP, we reduced the dimensionality of our embeddings from 384 to 30. We selected 30 as the size of the local neighborhood used for manifold approximation, where larger values result in a more global view of the manifold and smaller values preserve more local structure, and cosine distance as the metric. After the dimensionality reduction step, we created clusters setting the minimum cluster size to 40 and the hyperparameter K to 5, where K is the minimum number of samples, i.e., the neighbor rank at which the K-th nearest neighbor distance is computed for each data point. To visualize the clusters, we further reduced the dimension from 384 to 2 using the same UMAP parameters.
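The role of the hyperparameter K can be illustrated with the two quantities on which HDBSCAN builds its density estimate: the core distance of a point (the distance to its K-th nearest neighbor) and the mutual reachability distance between two points. The sketch below is not a clustering implementation. It uses Euclidean distance on toy two-dimensional points and excludes the point itself when ranking neighbors, whereas library implementations may count it. All names and values are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def core_distances(points, k):
    """Distance from each point to its k-th nearest neighbor (self excluded)."""
    result = []
    for i, p in enumerate(points):
        distances = sorted(euclidean(p, q) for j, q in enumerate(points) if j != i)
        result.append(distances[k - 1])
    return result


def mutual_reachability(points, k):
    """Pairwise distances inflated so that points in sparse regions move apart."""
    core = core_distances(points, k)
    n = len(points)
    return [
        [
            0.0 if i == j else max(core[i], core[j], euclidean(points[i], points[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Three nearby points and one isolated point (toy data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
core = core_distances(points, k=2)
reachability = mutual_reachability(points, k=2)
```

The isolated point receives a large core distance, so every distance involving it stays large and it tends to be treated as noise. If the umap-learn and hdbscan Python packages were used (an assumption, since the text does not name the implementation), the settings reported above would plausibly correspond to `umap.UMAP(n_neighbors=30, n_components=30, metric="cosine")` and `hdbscan.HDBSCAN(min_cluster_size=40, min_samples=5)`.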
References
Afolabi, I. T., Badejo, J., Adubi, S. A., & Odetunmibi, O. A. (2020). Identifying major civil engineering research influencers and topics using social network analysis. Cogent Engineering, 7(1), 1–17. https://doi.org/10.1080/23311916.2020.1835147
Agerri, R., Bermudez, J., & Rigau, G. (2014, May). IXA pipeline: Efficient and ready to use multilingual NLP tools. In Proceedings of the ninth international conference on language resources and evaluation (LREC’14) (pp. 3823–3828). http://www.lrec-conf.org/proceedings/lrec2014/pdf/775_Paper.pdf
Alanoca, H. A., Vidal, A. A., & Saire, J. E. C. (2020). Curriculum vitae recommendation based on text mining. https://doi.org/10.48550/arXiv.2007.11053
Alghanmi, I., Espinosa-Anke, L., & Schockaert, S. (2020). Combining BERT with static word embeddings for categorizing social media.
Al Omran, F. N. A., & Treude, C. (2017). Choosing an NLP library for analyzing software documentation: A systematic literature review and a series of experiments. In IEEE/ACM 14th international conference on mining software repositories (MSR) (pp. 187–197). https://doi.org/10.1109/MSR.2017.42
Alshemali, B., & Kalita, J. (2020). Improving the reliability of deep neural networks in NLP: A review. Knowledge-Based Systems, 191, 105210. https://doi.org/10.1016/j.knosys.2019.105210
Ambrosino, A., Cedrini, M., Davis, J. B., Fiori, S., Guerzoni, M., & Nuccio, M. (2018). What topic modeling could reveal about the evolution of economics. Journal of Economic Methodology, 25(4), 329–348. https://doi.org/10.1080/1350178X.2018.1529215
Amel-Zadeh, A., Chen, M., Mussalli, G., & Weinberg, M. (2021). NLP for SDGs: Measuring corporate alignment with the sustainable development goals. Columbia Business School Research Paper. https://doi.org/10.2139/ssrn.3874442
Anand, A., & Dubey, S. (2022). CV analysis using machine learning. International Journal for Research in Applied Science & Engineering Technology (IJRASET), 10(V), 1316–1322. https://doi.org/10.22214/ijraset.2022.42295
Ankala, K. M., & Karra, S. (2016). Resume analysis for skill-set estimation using HDFS, MapReduce and R. In Proceedings of the world congress on engineering and computer science (vol. 1).
Antonio, N., de Almeida, A. M., Nunes, L., Batista, F., & Ribeiro, R. (2018). Hotel online reviews: creating a multi-source aggregated index. International Journal of Contemporary Hospitality Management, 30(12), 3574–3591. https://doi.org/10.1108/IJCHM-05-2017-0302
Arts, S., Hou, J., & Gomez, J. C. (2021). Natural language processing to identify the creation and impact of new technologies in patent text: Code, data, and new measures. Research Policy, 50(2), 104144. https://doi.org/10.1016/j.respol.2020.104144
Asadabadi, M. R., Saberi, M., Sadghiani, N. S., Zwikael, O., & Chang, E. (2022). Enhancing the analysis of online product reviews to support product improvement: integrating text mining with quality function deployment. Journal of Enterprise Information Management (ahead-of-print). https://doi.org/10.1108/JEIM-03-2021-0143
Asemie, S., Tepi, E., Jimma, E., & Mamo, G. (2017). Possibility of Amharic query processing in database using natural language interface. International Journal of Engineering Research & Technology (IJERT), 6(5).
Asyaky, M. S., & Mandala, R. (2021, September). Improving the performance of HDBSCAN on short text clustering by using word embedding and UMAP. In 2021 8th international conference on advanced informatics: Concepts, theory and applications (ICAICTA) (pp. 1–6). IEEE. https://doi.org/10.1109/ICAICTA53211.2021.9640285
Auer, E. M. L. (2018). Detecting deceptive impression management behaviors in interviews using natural language processing. Old Dominion University. https://digitalcommons.odu.edu/psychology_etds/70
Ayanouz, S., Abdelhakim, B. A., & Benhmed, M. (2020, March). A smart chatbot architecture based NLP and machine learning for health care assistance. In Proceedings of the 3rd international conference on networking, information systems & security (pp. 1–6). https://doi.org/10.1145/3386723.3387897
Ball, L., Pollard, E., & Stanley, N. (2010, January). Creative graduates creative futures. Creative Graduates Creative Futures Higher Education Partnership and the Institute for Employment Studies. https://static.a-n.co.uk/wp-content/uploads/2016/12/Creative-graduates-creativefutures.pdf
Ballestar, M. T., Cuerdo-Mir, M., & Freire-Rubio, M. T. (2020). The concept of sustainability on social media: A social listening approach. Sustainability, 12(5), 2122. https://doi.org/10.3390/su12052122
Bamman, D., Dyer, C., & Smith, N. A. (2014, June). Distributed representations of geographically situated language. In Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 828–834). Baltimore, Maryland, USA.
Barbierato, E., Bernetti, I., & Capecchi, I. (2021). Analyzing TripAdvisor reviews of wine tours: An approach based on text mining and sentiment analysis. International Journal of Wine Business Research, 34(2), 212–236. https://doi.org/10.1108/IJWBR-04-2021-0025
Barroso, C. L., Abad, M. V., & Solís, F. M. (2021). Essential skills in current creative advertising: University vs. professional reality. ICONO 14, Revista de comunicación y tecnologías emergentes, 19(2), 93–117.
Basili, R., Moschitti, A., & Pazienza, M. T. (2006). Extensive evaluation of efficient NLP-driven text classification. Applied Artificial Intelligence, 20(6), 457–491. https://doi.org/10.1080/08839510600753725
Basili, R., Pazienza, M. T., & Velardi, P. (1996). An empirical symbolic approach to natural language processing. Artificial Intelligence, 85(1–2), 59–99. https://doi.org/10.1016/0004-3702(95)00116-6
Ben Abdessalem, W. K., & Amdouni, S. (2011). E-recruiting support system based on text mining methods. International Journal of Knowledge and Learning, 7(3–4), 220–232. https://doi.org/10.1504/IJKL.2011.044542
Bernier, C., DiMaggio, P., & Heckscher, C. (2021). When content is king: using topic models to analyze online innovation crowdsourcing. Innovation: Organization & Management, 1–24. https://doi.org/10.1080/14479338.2021.2016417
Bharadwaj, S., Varun, R., Aditya, P. S., Nikhil, M., & Babu, G. C. (2022, July). Resume screening using NLP and LSTM. In 2022 international conference on inventive computation technologies (ICICT) (pp. 238–241). IEEE. https://doi.org/10.1109/ICICT54344.2022.9850889
Bhola, A., Halder, K., Prasad, A., & Kan, M. Y. (2020, December). Retrieving skills from job descriptions: A language model based extreme multi-label classification framework. In Proceedings of the 28th international conference on computational linguistics (pp. 5832–5842). https://doi.org/10.18653/v1/2020.coling-main.513
Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with Python: analyzing text with the natural language toolkit. O’Reilly Media, Safari Books Online.
Boskou, G., Kirkos, E., & Spathis, C. (2019). Classifying internal audit quality using textual analysis: the case of auditor selection. Managerial Auditing Journal, 34(8), 924–950. https://doi.org/10.1108/MAJ-01-2018-1785
Bourgonje, P., Schneider, J. M., & Rehm, G. (2017). From clickbait to fake news detection: An approach based on detecting the stance of headlines to articles. In Proceedings of the 2017 EMNLP workshop: Natural language processing meets journalism (pp. 84–89). Association for Computational Linguistics. https://doi.org/10.18653/v1/W17-4215
Bridgstock, R. S. (2011). Skills for creative industries graduate success. Education and Training, 53(1), 9–26. https://doi.org/10.1108/00400911111102333
Britt, B. C. (2021). The evolution of discourse in online communities devoted to a pandemic. Health Communication, 1–13. https://doi.org/10.1080/10410236.2021.1991618
Calheiros, A. N., Moro, S., & Rita, P. (2017). Sentiment classification of consumer-generated online reviews using topic modeling. Journal of Hospitality Marketing & Management, 26(7), 675–693. https://doi.org/10.1080/19368623.2017.1310075
Cambria, E., & White, B. (2014). Jumping NLP curves: A review of natural language processing research. IEEE Computational Intelligence Magazine, 9(2), 48–57.
Cañibano, C., & Bozeman, B. (2009). Curriculum vitae method in science policy and research evaluation: the state-of-the-art. Research Evaluation, 18(2), 86–94. https://doi.org/10.3152/095820209X441754
Cañibano, C., Otamendi, J., & Andújar, I. (2008). Measuring and assessing researcher mobility from CV analysis: the case of the Ramón y Cajal programme in Spain. Research Evaluation, 17(1), 17–31. https://doi.org/10.3152/095820208X292797
Chen, X., Chen, B., Zhang, C., & Hao, T. (2017, September). Discovering the recent research in natural language processing field based on a statistical approach. In Huang et al. (Eds.), International symposium on emerging technologies for education (pp. 507–517). Springer. https://doi.org/10.1007/978-3-319-71084-6_49
Chiarello, F., Belingheri, P., Bonaccorsi, A., Fantoni, G., & Martini, A. (2021). Value creation in emerging technologies through text mining: the case of blockchain. Technology Analysis & Strategic Management, 33(12), 1404–1420. https://doi.org/10.1080/09537325.2021.1876221
Chokshi, A., & Mathew, R. (2021). Deep learning and natural language processing for fake news detection: A survey. In International conference on IoT based control networks and intelligent systems (ICICNIS 2020) (pp. 716–728). https://doi.org/10.2139/ssrn.3769884
Chomsky, N. (1957). Syntactic structures. De Gruyter Mouton.
Chomsky, N. (1971). Deep structure, surface structure, and semantic interpretation. Semantics, 183–216.
Christiansen, M. H., & Chater, N. (1999). Connectionist natural language processing: The state of the art. Cognitive Science, 23(4), 417–437. https://doi.org/10.1207/s15516709cog2304_2
Cole, M. S., Feild, H. S., Giles, W. F., & Harris, S. G. (2009). Recruiters’ inferences of applicant personality based on resume screening: do paper people have a personality? Journal of Business and Psychology, 24(1), 5–18. https://doi.org/10.1007/s10869-008-9086-9
Cole, M. S., Feild, H. S., & Stafford, J. O. (2005). Validity of resume reviewers’ inferences concerning applicant personality based on resume evaluation. International Journal of Selection and Assessment, 13(4), 321–324. https://doi.org/10.1111/j.1468-2389.2005.00329.x
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2000). Natural language processing (almost) from scratch. Journal of Machine Learning Research, 1, 1–48.
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural language processing (Almost) from Scratch. Journal of Machine Learning Research, 12, 2493–2537. https://www.jmlr.org/papers/volume12/collobert11a/collobert11a.pdf?source
Colombo, E., Mercorio, F., & Mezzanzanica, M. (2019). AI meets labor market: Exploring the link between automation and skills. Information Economics and Policy, 47, 27–37. https://doi.org/10.1016/j.infoecopol.2019.05.003
Crossley, S. A., Allen, L. K., Kyle, K., & McNamara, D. S. (2014). Analyzing discourse processing using a simple natural language processing tool. Discourse Processes, 51(5–6), 511–534. https://doi.org/10.1080/0163853X.2014.910723
Crowston, K., Allen, E. E., & Heckman, R. (2012). Using natural language processing technology for qualitative data analysis. International Journal of Social Research Methodology, 15(6), 523–543. https://doi.org/10.1080/13645579.2011.625764
Dale, R., Moisl, H., & Somers, H. (Eds.). (2000). Handbook of natural language processing. CRC Press.
Daryani, C., Chhabra, G. S., Patel, H., Chhabra, I. K., & Patel, R. (2020). An automated resume screening system using natural language processing and similarity. Topics In Intelligent Computing and Industry Design (ICID), 2(2), 99–103. https://doi.org/10.26480/etit.02.2020.99.103
Denny, J. C., Spickard, A., III, Johnson, K. B., Peterson, N. B., Peterson, J. F., & Miller, R. A. (2009). Evaluation of a method to identify and categorize section headers in clinical documents. Journal of the American Medical Informatics Association, 16(6), 806–815. https://doi.org/10.1197/jamia.M3037
Derczynski, L. (2016, May). Complementarity, F-score, and NLP evaluation. In Proceedings of the tenth international conference on language resources and evaluation (LREC’16) (pp. 261–266).
Dietz, J. S. (2004). Scientists and engineers in academic research centers: An examination of career patterns and productivity. Georgia Institute of Technology.
Dietz, J. S., & Bozeman, B. (2005). Academic careers, patents, and productivity: industry experience as scientific and technical human capital. Research Policy, 34(3), 349–367. https://doi.org/10.1016/j.respol.2005.01.008
Dietz, J., Chompalov, I., Bozeman, B., Lane, E., & Park, J. (2000). Using the curriculum vita to study the career paths of scientists and engineers: An exploratory assessment. Scientometrics, 49(3), 419–442.
Donnelly, L. F., Grzeszczuk, R., & Guimaraes, C. V. (2022, April). Use of natural language processing (NLP) in evaluation of radiology reports: An update on applications and technology advances. Seminars in Ultrasound, CT and MRI, 43(2), 176–181. https://doi.org/10.1053/j.sult.2022.02.007
Dyer, M. G. (1995). Connectionist natural language processing: A status report. In R. Sun & L. A. Bookman (Eds.), Computational architectures integrating neural and symbolic processes. A perspective on the state of the art (pp. 389–429). Kluwer Academic.
Eachempati, P., & Srivastava, P. R. (2021). Accounting for unadjusted news sentiment for asset pricing. Qualitative Research in Financial Markets, 13(3), 383–422. https://doi.org/10.1108/QRFM-11-2019-0130
Edison, H., & Carcel, H. (2021). Text data analysis using Latent Dirichlet Allocation: an application to FOMC transcripts. Applied Economics Letters, 28(1), 38–42. https://doi.org/10.1080/13504851.2020.1730748
El Mohadab, M., Bouikhalene, B., & Safi, S. (2020). Automatic CV processing for scientific research using data mining algorithm. Journal of King Saud University-Computer and Information Sciences, 32(5), 561–567. https://doi.org/10.1016/j.jksuci.2018.07.002
Fahrenbach, F., Revoredo, K., & Santoro, F. M. (2019). Valuing prior learning: Designing an ICT artifact to assess professional competences through text mining. European Journal of Training and Development, 44(2/3), 209–235. https://doi.org/10.1108/EJTD-05-2019-0070
Faliagka, E., Iliadis, L., Karydis, I., Rigou, M., Sioutas, S., Tsakalidis, A., & Tzimas, G. (2014). On-line consistent ranking on e-recruitment: seeking the truth behind a well-formed CV. Artificial Intelligence Review, 42(3), 515–528. https://doi.org/10.1007/s10462-013-9414-y
Femmer, H., Kučera, J., & Vetrò, A. (2014, September). On the impact of passive voice requirements on domain modelling. In Proceedings of the 8th ACM/IEEE international symposium on empirical software engineering and measurement (pp. 1–4).
Feng, S. Y., Gangal, V., Wei, J., Chandar, S., Vosoughi, S., Mitamura, T., & Hovy, E. (2021). A survey of data augmentation approaches for NLP. https://doi.org/10.48550/arXiv.2105.03075
Fisher, I. E., Garnsey, M. R., & Hughes, M. E. (2016). Natural language processing in accounting, auditing and finance: A synthesis of the literature with a roadmap for future research. Intelligent Systems in Accounting, Finance and Management, 23(3), 157–214. https://doi.org/10.1002/isaf.1386
Fitzgerald, S., Mathews, G., Morris, C., & Zhulyn, O. (2012). Using NLP techniques for file fragment classification. Digital Investigation, 9, S44–S49. https://doi.org/10.1016/j.diin.2012.05.008
Friedman, C., & Hripcsak, G. (1999, August). Natural language processing and its future in medicine. Academic Medicine, 74(8), 890–895.
Gaizauskas, R., & Wilks, Y. (1998). Information extraction: Beyond document retrieval. Journal of Documentation, 54(1), 70–105. https://doi.org/10.1108/EUM0000000007162
Galassi, A., Lippi, M., & Torroni, P. (2021). Attention in natural language processing. IEEE Transactions on Neural Networks and Learning Systems, 32(10), 4291–4308. https://doi.org/10.1109/TNNLS.2020.3019893
Goloshchapova, I., Poon, S. H., Pritchard, M., & Reed, P. (2019). Corporate social responsibility reports: topic analysis and big data approach. The European Journal of Finance, 25(17), 1637–1654. https://doi.org/10.1080/1351847X.2019.1572637
Gopalakrishna, S. T., & Vijayaraghavan, V. (2019). Automated tool for Resume classification using Sementic analysis. International Journal of Artificial Intelligence and Applications (IJAIA), 10(1). https://ssrn.com/abstract=3349094
Green Jr., B. F., Wolf, A. K., Chomsky, C., & Laughery, K. (1961, May). Baseball: an automatic question-answerer. In Papers presented at the May 9-11, 1961, western joint IRE-AIEE-ACM computer conference (pp. 219–224).
Guo, L., Sharma, R., Yin, L., Lu, R., & Rong, K. (2017). Automated competitor analysis using big data analytics: Evidence from the fitness mobile app business. Business Process Management Journal, 23(3), 735–762. https://doi.org/10.1108/BPMJ-05-2015-0065
Gutierrez-Bustamante, M., & Espinosa-Leal, L. (2022). Natural language processing methods for scoring sustainability reports: A study of Nordic listed companies. Sustainability, 14, 9165. https://doi.org/10.3390/su14159165
Haddad, R., & Mercier-Laurent, E. (2021). Curriculum vitae evaluation using machine learning approach. Artificial intelligence for knowledge management, IFIP AICT 614, hal-03496596.
Hanemann, W. M., & Kanninen, B. (1996). The statistical analysis of discrete-response CV data (Working paper no. 798). University of California Berkeley.
Hargittai, E. (2005). Survey measures of web-oriented digital literacy. Social Science Computer Review, 23(3), 371–379. https://doi.org/10.1177/0894439305275911
Harsha, T. M., Moukthika, G. S., Sai, D. S., Pravallika, M. N. R., Anamalamudi, S., & Enduri, M. (2022, April). Automated resume screener using natural language processing (NLP). In 2022 6th international conference on trends in electronics and informatics (ICOEI) (pp. 1772–1777). IEEE. https://doi.org/10.1109/ICOEI53556.2022.9777194
Hasan, M. R., Maliha, M., & Arifuzzaman, M. (2019, July). Sentiment analysis with NLP on Twitter data. In 2019 international conference on computer, communication, chemical, materials and electronic engineering (IC4ME2) (pp. 1–4). IEEE. https://doi.org/10.1109/IC4ME247184.2019.9036670
86
3
Methodology and Empirical Strategy
Hellmann, S., Lehmann, J., Auer, S., & Brümmer, M. (2013, October). Integrating NLP using linked data. In International semantic web conference (pp. 98–113). Springer. https://doi.org/ 10.1007/978-3-642-41338-4_7 Hirata, E., Lambrou, M., & Watanabe, D. (2020). Blockchain technology in supply chain management: insights from machine learning algorithms. Maritime Business Review, 6(2), 114–128. https://doi.org/10.1108/MABR-07-2020-0043 Hovy, D., & Spruit, S. L. (2016, August). The social impact of natural language processing. In Proceedings of the 54th annual meeting of the association for computational linguistics (Volume 2: Short papers) (pp. 591–598). Hu, S., He, Z., Wu, L., Yin, L., Xu, Y., & Cui, H. (2020). A framework for extracting urban functional regions based on multiprototype word embeddings using points-of-interest data. Computers, Environment and Urban Systems, 80, 101442. https://doi.org/10.1016/j. compenvurbsys.2019.101442 Hu, Y., Deng, C., & Zhou, Z. (2019). A semantic and sentiment analysis on online neighborhood reviews for understanding the perceptions of people toward their living environments. Annals of the American Association of Geographers, 109(4), 1052–1073. https://doi.org/10.1080/ 24694452.2018.1535886 Huang, A., Wu, W., & Yu, T. (2019). Textual analysis for China’s financial markets: a review and discussion. China Finance Review International, 10(1), 1–15. https://doi.org/10.1108/CFRI-082019-0134 Hutchinson, T. (2020). Natural language processing and machine learning as practical toolsets for archival processing. Records Management Journal, 30(2), 155–174. https://doi.org/10.1108/ RMJ-09-2019-0055 Jackson, P., & Moulinier, I. (2002). In R. Mitkov (Ed.), Natural language processing for online applications. Text retrieval, extraction and categorization. John Benjamins.. Jain, A., Kulkarni, G., & Shah, V. (2018). Natural language processing. International Journal of Computer Sciences and Engineering, 6(1), 161–167. 
https://doi.org/10.26438/ijcse/v6i1. 161167 Jia, Q., Guo, Y., Li, R., Li, Y. R., & Chen Y. W. (2018, December 2–6). A conceptual artificial intelligence application framework in human resource management. In Proceedings of the 18th international conference on electronic business (pp. 106–114). ICEB. Jiechieu, K. F. F., & Tsopze, N. (2021). Skills prediction based on multi-label resume classification using CNN with model predictions explanation. Neural Computing & Applications, 33, 5069–5087. https://doi.org/10.1007/s00521-020-05302-x Jones, K. S. (1994). Natural language processing: a historical review. In A. Antonio Zampolli, N. Calzolari, & M. Palmer (Eds.), Current issues in computational linguistics: in honour of Don Walker (Linguistica Computazionale, 9) (pp. 3–16). Springer. Jones, K. S. (1999). What is the role of NLP in text retrieval? In T. Strzalkowski (Ed.), Natural language information retrieval. text, speech and language technology, 7. Springer. https://doi. org/10.1007/978-94-017-2388-6_1 Joseph, S. R., Hlomani, H., Letsholo, K., Kaniwa, F., & Sedime, K. (2016). Natural language processing: A review. International Journal of Research in Engineering and Applied Sciences, 6(3), 207–210. Kamath, U., Liu, J., & Whitaker, J. (2019). Deep learning for NLP and speech recognition. Springer. Kang, H., & Kim, J. (2022). Analyzing and visualizing text information in corporate sustainability reports using natural language processing methods. Applied Sciences, 12, 5614. https://doi.org/ 10.3390/app12115614 Kang, Y., Cai, Z., Tan, C.-W., Huang, Q., & Liu, H. (2020). Natural language processing (NLP) in management research: A literature review. Journal of Management Analytics, 7(2), 139–172. https://doi.org/10.1080/23270012.2020.1756939 Kaufman, D. R., Sheehan, B., Stetson, P., Bhatt, A. R., Field, A. I., Patel, C., & Maisel, J. M. (2016). Natural language processing-enabled and conventional data capture methods for input to
References
87
electronic health records: a comparative usability study. JMIR Medical Informatics, 4(4), e5544. https://doi.org/10.2196/medinform.5544 Kelkar, B., Shedbale, R., Khade, D., Pol, P., & Damame, A. (2020). Resume analyzer using text processing. Journal of Engineering Sciences, 11(5), 353–361. Key, T. M., & Keel, A. L. (2020). How executives talk: Exploring marketing executive value articulation with computerized text analysis. European Journal of Marketing, 54(3), 546–569. https://doi.org/10.1108/EJM-01-2019-0105 Khaouja, I., Kassou, I., & Ghogho, M. (2021). A survey on skill identification from online job ads. IEEE Access, 9, 118134–118153. Khoury, R., Karray, F., & Kamel, M. S. (2008). Keyword extraction rules based on a part-of-speech hierarchy. International Journal of Advanced Media and Communication, 2(2), 138–153. Khurana, D., Koli, A., Khatter, K., & Singh, S. (2022). Natural language processing: State of the art, current trends and challenges. Multimedia Tools and Applications, 1–32. https://doi.org/10. 1007/s11042-022-13428-4 Kinge, B., Mandhare, S., Chavan, P., & Chaware, S. M. (2022). Resume screening using machine learning and NLP: A proposed system. International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), 8(2), 253–258. https://doi.org/ 10.32628/CSEIT228240 Koedel, C., & Tyhurst, E. (2012). Math skills and labor-market outcomes: Evidence from a resumebased field experiment. Economics of Education Review, 31(1), 131–140. https://doi.org/10. 1016/j.econedurev.2011.09.006 Kolleck, N., & Yemini, M. (2020). Environment-related education topics within global citizenship education scholarship focused on teachers: A natural language processing analysis. The Journal of Environmental Education, 51(4), 317–331. https://doi.org/10.1080/00958964.2020.1724853 Kostelník, P., & Dařena, F. (2021). Conversational interfaces for unconventional access to business relational data structures. 
Data Technologies and Applications, 56(1), 87–102. https://doi.org/ 10.1108/DTA-03-2021-0062 Krovetz, R., & Croft, W. B. (1992). Lexical ambiguity and information retrieval. ACM Transactions on Information Systems (TOIS), 10(2), 115–141. https://doi.org/10.1145/146802. 146810 Kumar, L., & Bhatia, P. K. (2013). Text mining: concepts, process and applications. Journal of Global Research in Computer Science, 4(3), 36–39. Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., & Dyer, C. (2016). Neural architectures for named entity recognition. https://doi.org/10.48550/arXiv.1603.01360 Lawrence, S., Giles, C. L., & Fong, S. (2000). Natural language grammatical inference with recurrent neural networks. IEEE Transactions on Knowledge and Data Engineering, 12(1), 126–140. https://doi.org/10.1109/69.842255 le Vrang, M., Papantoniou, A., Pauwels, E., Fannes, P., Vandensteen, D., & De Smedt, J. (2014). ESCO: Boosting job matching in Europe with semantic interoperability. Computer, 47(10), 57–64. https://doi.org/10.1109/MC.2014.283 Lease, M. (2007, November). Natural language processing for information retrieval: the time is ripe (again). In Proceedings of the ACM first Ph. D. workshop in CIKM (pp. 1–8). Lee, J. Y., & Dernoncourt, F. (2016). Sequential short-text classification with recurrent and convolutional neural networks. https://doi.org/10.48550/arXiv.1603.03827 Lehnert, W. G., & Ringle, M. H. (Eds.). (2014). Strategies for natural language processing. Psychology Press. Li, J., Li, G., Zhu, X., & Yao, Y. (2020). Identifying the influential factors of commodity futures prices through a new text mining approach. Quantitative Finance, 20(12), 1967–1981. https:// doi.org/10.1080/14697688.2020.1814008 Liao, C., Du, P., Yang, Y., & Huang, Z. (2021). Carrots or sticks in debt collection services? A voice metrics and text analysis of debt collection calls. Journal of Service Theory and Practice, 31(6), 960–973. https://doi.org/10.1108/JSTP-12-2020-0290
88
3
Methodology and Empirical Strategy
Liddy, E. D. (1998). Enhanced text retrieval using natural language processing. Bulletin of the American Society for Information Science and Technology, 24(4), 14–16. Liddy, E. D. (2001). Natural language processing. In Encyclopedia of library and information science (2nd ed). Marcel Decker. Lim, J., & Lee, H. C. (2020). Comparisons of service quality perceptions between full service carriers and low cost carriers in airline travel. Current Issues in Tourism, 23(10), 1261–1276. https://doi.org/10.1080/13683500.2019.1604638 Lind, F., Eberl, J. M., Eisele, O., Heidenreich, T., Galyga, S., & Boomgaarden, H. G. (2022). Building the bridge: Topic modeling for comparative research. Communication Methods and Measures, 16(2), 96–114. https://doi.org/10.1080/19312458.2021.1965973 Liu, X., Burns, A. C., & Hou, Y. (2017). An investigation of brand-related user-generated content on Twitter. Journal of Advertising, 46(2), 236–247. https://doi.org/10.1080/00913367.2017. 1297273 Loughran, T., & McDonald, B. (2016). Textual analysis in accounting and finance: A survey. Journal of Accounting Research, 54(4), 1187–1230. https://doi.org/10.1111/1475-679X.12123 Lu, Y., & Zhang, J. (2021). Bibliometric analysis and critical review of the research on big data in the construction industry. Engineering, Construction and Architectural Management. https:// doi.org/10.1108/ECAM-01-2021-0005 Luccioni, A., Baylor, E., & Duchene, N. (2020). Analyzing sustainability reports using natural language processing. https://doi.org/10.48550/arXiv.2011.08073 Ly, A., Uthayasooriyar, B., & Wang, T. (2020). A survey on natural language processing (NLP) and applications in insurance. https://doi.org/10.48550/arXiv.2010.00462 Lynn, V., Son, Y., Kulkarni, V., Balasubramanian, N., & Schwartz, H. A. (2017, September). Human centered NLP with user-factor adaptation. In Proceedings of the 2017 conference on empirical methods in natural language processing, Copenhagen, Denmark (pp. 1146–1155). 
https://doi.org/10.18653/v1/D17-1119 Maer-Matei, M. M., Mocanu, C., Zamfir, A. M., & Georgescu, T. M. (2019). Skill needs for early career researchers—a text mining approach. Sustainability, 11(10), 2789. https://doi.org/10. 3390/su11102789 Maheshwari, S., Sainani, A., & Reddy, P. K. (2010, March). An approach to extract special skills to improve the performance of resume selection. In International workshop on databases in networked information systems (pp. 256–273). Springer. Maier, D., Waldherr, A., Miltner, P., Wiedemann, G., Niekler, A., Keinert, A., Pfetsch, B., Heyer, G., Reber, U., Häussler, T., Schmid-Petri, H., & Adam, S. (2018). Applying LDA topic modeling in communication research: Toward a valid and reliable methodology. Communication Methods and Measures, 12(2–3), 93–118. https://doi.org/10.1080/19312458.2018. 1430754 Malzer, C., & Baum, M. (2020, September). A hybrid approach to hierarchical density-based cluster selection. In 2020 IEEE international conference on multisensor fusion and integration for intelligent systems (MFI) (pp. 223–228). IEEE. https://doi.org/10.1109/MFI49285.2020. 9235263 Manning, C. D., & Schütze, H. (1999). Foundations of statistical natural language processing (Vol. 999). MIT Press. Markham, S. K., Kowolenko, M., & Michaelis, T. L. (2015). Unstructured text analytics to support new product development decisions. Research-Technology Management, 58(2), 30–39. https:// doi.org/10.5437/08956308X5802291 Marrone, R., Cropley, D. H., & Wang, Z. (2022). Automatic assessment of mathematical creativity using natural language processing. Creativity Research Journal. https://doi.org/10.1080/ 10400419.2022.2131209 Marsoof, A., Luco, A., Tan, H., & Joty, S. (2022). Content-filtering AI systems—Limitations, challenges and regulatory approaches. Information & Communications Technology Law, 1–38. https://doi.org/10.1080/13600834.2022.2078395
References
89
McInnes, L., Healy, J., & Astels, S. (2017). hdbscan: Hierarchical density based clustering. Journal of Open Source Software, 2(11), 205. https://doi.org/10.21105/joss.00205 McInnes, L., Healy, J., & Melville, J. (2018). Umap: Uniform manifold approximation and projection for dimension reduction. https://doi.org/10.48550/arXiv.1802.03426 Menon, A., Choi, J., & Tabakovic, H. (2018, July). What you say your strategy is and why it matters: natural language processing of unstructured text. In Academy of management proceedings (vol. 1, p. 18319). Academy of Management. Merritt, K., Smith, D., & Renzo, J. C. D. (2005). An investigation of self-reported computer literacy: Is it reliable. Issues in Information Systems, 6(1), 289–295. Meurers, D. (2012). Natural language processing and language learning. In C. A. Chapelle (Ed.), Encyclopedia of applied linguistics (pp. 4193–4205). Wiley. Miikkulainen, R., & Dyer, M. G. (1991). Natural language processing with modular PDP networks and distributed lexicon. Cognitive Science, 15(3), 343–399. https://doi.org/10.1207/ s15516709cog1503_2 Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. https://doi.org/10.48550/arXiv.1301.3781 Minsky, M. (1968). Semantic information processing. MIT Press. Mirski, P., Bernsteiner, R., & Radi, D. (2017). Analytics in human resource management the OpenSKIMR approach. Procedia Computer Science, 122, 727–734. https://doi.org/10.1016/j. procs.2017.11.430 Montelisciani, G., Gabelloni, D., Tazzini, G., & Fantoni, G. (2014). Skills and wills: the keys to identify the right team in collaborative innovation platforms. Technology Analysis & Strategic Management, 26(6), 687–702. https://doi.org/10.1080/09537325.2014.923095 Mutanga, M. B., & Abayomi, A. (2022). Tweeting on COVID-19 pandemic in South Africa: LDA-based topic modelling approach. African Journal of Science, Technology, Innovation and Development, 14(1), 163–172. 
https://doi.org/10.1080/20421338.2020.1817262 Najjar, A., Amro, B., & Macedo, M. (2021). An intelligent decision support system for recruitment: resumes screening and applicants ranking. Informatica, 45(4), 617–623. https://doi.org/10. 31449/inf.v45i4.3356 NESTA. (2021, September 21). Open jobs observatory: Extracting skills from online job adverts. https://www.nesta.org.uk/project-updates/skills-extraction-ojo/ Ng, H. T., & Zelle, J. (1997). Corpus-based approaches to semantic interpretation in NLP. AI Magazine, 18(4), 45–45. https://doi.org/10.1609/aimag.v18i4.1321 Oh, Y. K., & Yi, J. (2021). Asymmetric effect of feature level sentiment on product rating: an application of bigram natural language processing (NLP) analysis. Internet Research, 32(3), 1066–2243. https://doi.org/10.1108/INTR-11-2020-0649 Öhman, E., & Metcalfe, A. G. (2021, December). Japanese beauty marketing on social media: Critical discourse analysis meets NLP. In Proceedings of the workshop on natural language processing for digital humanities (pp. 131–137). Oramas, S., Espinosa-Anke, L., Gómez, F., & Serra, X. (2018). Natural language processing for music knowledge discovery. Journal of New Music Research, 47(4), 365–382. https://doi.org/ 10.1080/09298215.2018.1488878 Özdağoğlu, G., Kapucugil-Ikiz, A., & Celik, A. F. (2018). Topic modelling-based decision framework for analysing digital voice of the customer. Total Quality Management & Business Excellence, 29(13–14), 1545–1562. https://doi.org/10.1080/14783363.2016.1273106 Palmer, D. D. (2000). Tokenisation and sentence segmentation. In Handbook of natural language processing (pp. 11–35). Marcel Dekker. Pandey, S., Pandey, S. K., & Miller, L. (2017). Measuring innovativeness of public organizations: Using natural language processing techniques in computer-aided textual analysis. International Public Management Journal, 20(1), 78–107. https://doi.org/10.1080/10967494.2016.1143424 Paschen, J., Kietzmann, J., & Kietzmann, T. C. (2019). 
Artificial intelligence (AI) and its implications for market knowledge in B2B marketing. Journal of Business & Industrial Marketing, 34(7), 1410–1419. https://doi.org/10.1108/JBIM-10-2018-0295
90
3
Methodology and Empirical Strategy
Pengnate, S. F., Lehmberg, D. G., & Tangpong, C. (2020). Top management’s communication in economic crisis and the firm’s subsequent performance: sentiment analysis approach. Corporate Communications: An International Journal, 25(2), 187–205. https://doi.org/10.1108/CCIJ-072019-0094 Phillips, T., Saunders, R. K., Cossman, J., & Heitman, E. (2019). Assessing trustworthiness in research: a pilot study on CV verification. Journal of Empirical Research on Human Research Ethics, 14(4), 353–364. https://doi.org/10.1177/1556264619857843 Preuss, B. (2017). Text mining and natural language processing to capture cultural data (Working paper). https://doi.org/10.13140/RG.2.2.30937.42080. Rahmani, D., & Kamberaj, H. (2021). Implementation and usage of artificial intelligence powered chatbots in human resources management systems. In Conference: International conference on social and applied sciences at: University of New York Tirana. Rajput, A. (2020). Natural language processing, sentiment analysis, and clinical analytics. In Innovation in health informatics (pp. 79–97). Academic Press. https://doi.org/10.1016/B9780-12-819043-2.00003-4 Ramaswamy, S., & DeClerck, N. (2018). Customer perception analysis using deep learning and NLP. Procedia Computer Science, 140, 170–178. https://doi.org/10.1016/j.procs.2018.10.326 Randazzo, C. (2016). Where do they go? Students’ sources of résumé advice, and implications for critically reimagining the résumé assignment. Technical Communication Quarterly, 25(4), 278–297. https://doi.org/10.1080/10572252.2016.1221142 Ray, A., Bala, P. K., & Kumar, R. (2021). An NLP-SEM approach to examine the gratifications affecting user’s choice of different e-learning providers from user tweets. Journal of Decision Systems, 30(4), 439–455. https://doi.org/10.1080/12460125.2020.1847406 Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using Siamese BERTnetworks. https://doi.org/10.48550/arXiv.1908.10084 Rezende, J. M. D., Rodrigues, I. M. 
D. C., Resendo, L. C., & Komati, K. S. (2022). Combining natural language processing techniques and algorithms LSA, word2vec and WMD for technological forecasting and similarity analysis in patent documents. Technology Analysis & Strategic Management, 1–22. https://doi.org/10.1080/09537325.2022.2110054 Rizun, N., Revina, A., & Meister, V. G. (2021). Assessing business process complexity based on textual data: Evidence from ITIL IT ticket processing. Business Process Management Journal, 27(7), 1966–1998. https://doi.org/10.1108/BPMJ-04-2021-0217 Robeer, M., Lucassen, G., Van Der Werf, J. M. E., Dalpiaz, F., & Brinkkemper, S. (2016, September). Automated extraction of conceptual models from user stories via NLP. In 2016 IEEE 24th international requirements engineering conference (RE) (pp. 196–205). IEEE. https://doi.org/10.1109/RE.2016.40 Rosadini, B., Ferrari, A., Gori, G., Fantechi, A., Gnesi, S., Trotta, I., & Bacherini, S. (2017, February). Using NLP to detect requirements defects: An industrial experience in the railway domain. In International working conference on requirements engineering: Foundation for software quality (pp. 344–360). Springer. Royle, J., & Laing, A. (2014). The digital marketing skills gap: Developing a digital marketer model for the communication industries. International Journal of Information Management, 34(2), 65–73. https://doi.org/10.1016/j.ijinfomgt.2013.11.008 Ryoo, J., & Bendle, N. (2017). Understanding the social media strategies of U.S. primary candidates. Journal of Political Marketing, 16(3–4), 244–266. https://doi.org/10.1080/15377857. 2017.1338207 Sag, I. A., Baldwin, T., Bond, F., Copestake, A., & Flickinger, D. (2002, February). Multiword expressions: A pain in the neck for NLP. In International conference on intelligent text processing and computational linguistics (pp. 1–15). Springer. Sahoo, S., Kumar, S., Abedin, M. Z., Lim, W. M., & Jakhar, S. K. (2022). 
Deep learning applications in manufacturing operations: a review of trends and ways forward. Journal of Enterprise Information Management (ahead-of-print). https://doi.org/10.1108/JEIM-012022-0025
References
91
Samant, S. M., & Sangle, S. (2016). A selected literature review on the changing role of stakeholders as value creators. World Journal of Science, Technology and Sustainable Development, 13(2), 100–119. https://doi.org/10.1108/WJSTSD-01-2016-0002 Sandström, U. (2009). Combining curriculum vitae and bibliometric analysis: mobility, gender and research performance. Research Evaluation, 18(2), 135–142. https://doi.org/10.3152/ 095820209X441790 Sanyal, S., Hazra, S., Adhikary, S., & Ghosh, N. (2017). Resume parser with natural language processing. International Journal of Engineering Science and Computing, 17(2), 4484. Selman, B. (1989). Connectionist systems for natural language understanding. Artificial Intelligence Review, 3(1), 23–31. https://doi.org/10.1007/BF00139194 Shelar, H., Kaur, G., Heda, N., & Agrawal, P. (2020). Named entity recognition approaches and their comparison for custom NER model. Science & Technology Libraries, 39(3), 324–337. https://doi.org/10.1080/0194262X.2020.1759479 Sjøvaag, H., & Pedersen, T. A. (2018). The effect of direct press support on the diversity of news content in Norway. Journal of Media Business Studies, 15(4), 300–316. https://doi.org/10.1080/ 16522354.2018.1546089 Song, K., Ran, C., & Yang, L. (2022). A digital analysis system of patents integrating natural language processing and machine learning. Technology Analysis & Strategic Management, 1–17. https://doi.org/10.1080/09537325.2022.2035349 Stenetorp, P., Pyysalo, S., Topić, G., Ohta, T., Ananiadou, S., & Tsujii, J. I. (2012, April). BRAT: a web-based tool for NLP-assisted text annotation. In Proceedings of the demonstrations at the 13th conference of the European chapter of the Association for Computational Linguistics (pp. 102–107). Stock, O. (2000). Natural language processing and intelligent interfaces. Annals of Mathematics and Artificial Intelligence, 28(1), 39–41. https://doi.org/10.1023/A:1018995904244 Strzalkowski, T. (1995). Natural language information retrieval. 
Information Processing & Management, 31(3), 397–417. https://doi.org/10.1016/0306-4573(94)00055-8 Sun, S., Luo, C., & Chen, J. (2017). A review of natural language processing techniques for opinion mining systems. Information Fusion, 36, 10–25. https://doi.org/10.1016/j.inffus.2016.10.004 Talja, S. (2005). The social and discursive construction of computing skills. Journal of the American Society for Information Science and Technology, 56(1), 13–22. https://doi.org/10. 1002/asi.20091 Taskin, Z., & Al, U. (2019). Natural language processing applications in library and information science. Online Information Review, 43(4), 676–690. https://doi.org/10.1108/OIR-072018-0217 Tepper, J. A., Powell, H. M., & Palmer-Brown, D. (2002). A corpus-based connectionist architecture for large-scale natural language parsing. Connection Science, 14(2), 93–114. https://doi. org/10.1080/09540090210162074 Tian, C., Zhang, J., Liu, D., Wang, Q., & Lin, S. (2022). Technological topic analysis of standardessential patents based on the improved Latent Dirichlet Allocation (LDA) model. Technology Analysis & Strategic Management, 1–16. https://doi.org/10.1080/09537325.2022.2130039 Trinh, Q., & Dang, T. T. (2021). Automatic process resume in talent pool by applying natural language processing. In Proceedings of international conference on logistics and industrial engineering 2021 (pp. 234–240). Social Science Publishing House. Ushio, A., Espinosa-Anke, L., Schockaert, S., & Camacho-Collados, J. (2021). BERT is to NLP what AlexNet is to CV: can pre-trained language models identify analogies? https://doi.org/10. 48550/arXiv.2105.04949 van Deursen, A. J., Helsper, E. J., & Eynon, R. (2014). Measuring digital skills. From digital skills to tangible outcomes project report. Available at: www.oii.ox.ac.uk/research/projects/?id=112 van Deursen, A. J. A. M., Helsper, E. J., & Eynon, R. (2016). Development and validation of the Internet Skills Scale (ISS). 
Information, Communication & Society, 19(6), 804–823. https://doi. org/10.1080/1369118X.2015.1078834
92
3
Methodology and Empirical Strategy
van Laar, E., van Deursen, A. J., & van Dijk, J. A. (2022). Developing policy aimed at 21st-century digital skills for the creative industries: an interview study with founders and managing directors. Journal of Education and Work, 35(2), 195–209. https://doi.org/10.1080/13639080. 2022.2036710 van Laar, E., Van Deursen, A. J., Van Dijk, J. A., & De Haan, J. (2020). Measuring the levels of 21st-century digital skills among professionals working within the creative industries: A performance-based approach. Poetics, 81, 101434. https://doi.org/10.1016/j.poetic.2020. 101434 Vijayarani, S., Ilamathi, M. J., & Nithya, M. (2015). Preprocessing techniques for text mining-an overview. International Journal of Computer Science & Communication Networks, 5(1), 7–16. Vinocur, E., Kiymaz, H., & Loughry, M. L. (2022). M&A capability and long-term firm performance: a strategic management perspective. Journal of Strategy and Management (ahead-ofprint). https://doi.org/10.1108/JSMA-10-2021-0204 Vodithala, S., & Mohammed, S. W. (2021). Retrieval of software components using NLP based IR model. Materials Today: Proceedings. https://doi.org/10.1016/j.matpr.2021.03.362 Votto, A. M., Valecha, R., Najafirad, P., & Rao, H. R. (2021). Artificial intelligence in tactical human resource management: A systematic literature review. International Journal of Information Management Data Insights, 1(2), 100047. https://doi.org/10.1016/j.jjimei.2021.100047 Wahlster, W. (2000). Mobile speech-to-speech translation of spontaneous dialogs: An overview of the final Verbmobil system. In W. Wahlster (Ed.), Verbmobil: Foundations of speech-to-speech translation (pp. 3–21). Springer. https://doi.org/10.1007/978-3-662-04230-4_1 Wang, B., & Guo, X. (2012). Online recruitment information as an indicator to appraise enterprise performance. Online Information Review, 36(6), 903–918. https://doi.org/10.1108/ 14684521211287954 Wang, R., Hao, J.-X., Law, R., & Wang, J. (2019). 
Examining destination images from travel blogs: a big data analytical approach using latent Dirichlet allocation. Asia Pacific Journal of Tourism Research, 24(11), 1092–1107. https://doi.org/10.1080/10941665.2019.1665558 Wang, X., Yang, X., Wang, X., Xia, M., & Wang, J. (2020). Evaluating the competitiveness of enterprise’s technology based on LDA topic model. Technology Analysis & Strategic Management, 32(2), 208–222. https://doi.org/10.1080/09537325.2019.1648789 Wang, Y., Liu, S., Afzal, N., Rastegar-Mojarad, M., Wang, L., Shen, F., Kingsbury, P., & Liu, H. (2018). A comparison of word embeddings for the biomedical natural language processing. Journal of Biomedical Informatics, 87, 12–20. https://doi.org/10.1016/j.jbi.2018.09.008 Wanless, L., Seifried, C., Bouchet, A., Valeant, A., & Naraine, M. L. (2022). The diffusion of natural language processing in professional sport. Sport Management Review, 25(3), 522–545. https://doi.org/10.1080/14413523.2021.1968174 Waung, M., Hymes, R. W., & Beatty, J. E. (2014). The effects of video and paper resumes on assessments of personality, applied social skills, mental capability, and resume outcomes. Basic and Applied Social Psychology, 36(3), 238–251. https://doi.org/10.1080/01973533.2014. 894477 Webster, J. J., & Kit, C. (1992, August). Tokenization as the initial phase in NLP. In Proceedings of COLING 1992 volume 4: The 14th international conference on computational linguistics (pp. 1106–1110). Weizenbaum, J. (1966). ELIZA—a computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1), 36–45. Wermter, S., Riloff, E., & Scheler, G. (Eds.). (1996). Connectionist, statistical and symbolic approaches to learning for natural language processing (Vol. 1040). Springer. Werz, J. M., Varney, V., & Isenhardt, I. (2019, August). The curse of self-presentation: Looking for career patterns in online CVs. 
In 2019 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM) (pp. 733–736). IEEE. https://doi.org/10.1145/ 3341161.3343681 Willett, P. (2006). The Porter stemming algorithm: then and now. Program: Electronic Library and Information Systems, 40(3), 219–223. https://doi.org/10.1108/00330330610681295
Chapter 4
Creative and Digital Skills in Italian Cultural and Creative Industries
Keywords Cultural and creative industries · Curriculum vitae analysis · Knowledge, skills, and competences analysis · Natural language processing · Cluster analysis · Skills demand and supply · Digital skills · Creative skills · Managerial skills
4.1 Descriptive Analysis
Our sample consists of two subsamples containing 8075 curricula vitae in total (CV, hereafter also called observations). The sample is designed to be representative of the CCI workforce in Italy in terms of ATECO code shares. Hence, the distribution of ATECO codes in the sample reflects the distribution of the CCI workforce across those codes in Italy, with the largest groups working in computer-related activities (41%), design and photographic positions (12%), or architecture (9%). Interestingly, the shares of those employed in creative, arts, and entertainment activities and in marketing jobs are nearly identical (roughly 6% each), highlighting marketing as another important CCI subsector. Core CCI activities (ATECO codes 58, 59, 60, and 91) together account for approximately 10% of the sample. Education reports one of the lowest shares (0.74%). Table 4.1 and Fig. 4.1 (personal elaboration) show frequencies and shares of ATECO codes in the whole sample, while Figs. 4.2 and 4.3 (personal elaboration) report the shares of ATECO codes in subsample 1 and subsample 2, respectively. Although the sample as a whole represents the CCI workforce, subsample 1 contains mostly CV of workers employed in STEM occupations (roughly 63% of its CV refer to computer-related jobs), whereas subsample 2 contains a greater share of CV of workers employed in core creative and cultural jobs (e.g., design, architecture, creative, and entertainment activities).
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 M. Nuccio, S. Mogno, Mapping Digital Skills in Cultural and Creative Industries in Italy, Contributions to Management Science, https://doi.org/10.1007/978-3-031-26867-0_4
Table 4.1 Subsample 1 + Subsample 2. ATECO codes. Frequencies and shares. Personal elaboration
ATECO code | Absolute frequency | Share (%)
58: Publishing activities | 294 | 3.64
59: Motion picture, video and television programme production, sound recording, and music publishing activities | 239 | 2.96
60: Programming and broadcasting activities | 144 | 1.78
62: Computer programming, consultancy, and related activities | 3274 | 40.54
66: Activities auxiliary to financial services and insurance activities | 35 | 0.43
70: Activities of head offices, management consultancy activities | 441 | 5.46
71: Architectural and engineering activities, technical testing, and analysis | 738 | 9.14
73: Advertising and market research | 513 | 6.35
74: Other professional, scientific, and technical activities (*including specialized design activities, photographic activities) | 994 | 12.31
85: Education | 60 | 0.74
90: Creative, arts, and entertainment activities | 513 | 6.35
91: Libraries, archives, museums, and other cultural activities | 148 | 1.83
NA | 682 | 8.45
Total | 8075 |
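The shares in Table 4.1 follow directly from the absolute frequencies. The short Python sketch below recomputes them from the counts reported in the table; the variable and function names are illustrative and are not part of the original analysis.

```python
# Absolute frequencies per ATECO code, copied from Table 4.1.
ateco_counts = {
    "58": 294, "59": 239, "60": 144, "62": 3274, "66": 35, "70": 441,
    "71": 738, "73": 513, "74": 994, "85": 60, "90": 513, "91": 148,
    "NA": 682,
}


def shares(counts):
    """Return each category's share of the total, in percent (2 decimals)."""
    total = sum(counts.values())
    return {code: round(100 * n / total, 2) for code, n in counts.items()}


total_cv = sum(ateco_counts.values())
ateco_shares = shares(ateco_counts)

# Core CCI activities as grouped in the text: ATECO codes 58, 59, 60, and 91.
core_cci = sum(ateco_counts[code] for code in ("58", "59", "60", "91"))
core_cci_share = round(100 * core_cci / total_cv, 2)

print(total_cv)            # 8075
print(ateco_shares["62"])  # 40.54
print(core_cci_share)      # 10.22
```

The recomputed values match the table, and the four core CCI codes together give the "approximately 10%" quoted in the text.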
In subsample 1, computer programming reports the highest share (roughly 63%), followed by management consultancy (11%), advertising and market research (5%), and creative and cultural activities (e.g., creative, arts, and entertainment as well as libraries, archives, museums, and other cultural activities), at about 2%. By contrast, subsample 2 comprises CV in which the ATECO codes related to CCI sub-industries have the largest shares. Design and photography (ATECO code 74) reports the highest share, roughly 24%. A further 19% of subsample 2 workers are employed in computer programming and related services, 18% in architecture, 12% in creative, arts, and entertainment activities, 7.6% in advertising and marketing, 7.2% in publishing, 6% in motion picture, video and television production, sound recording, or music publishing, 3.5% in programming and broadcasting activities, 3% in libraries, archives, and museums, and 0.3% in education as their primary occupation. In our sample, female workers account for 24% of CV and male workers for 68%, while the remaining 8% of CV do not specify this information. Fig. 4.4 (personal elaboration) shows gender shares in the whole sample. As for the sample’s geographical distribution, Milan, Rome, and Turin are the three most frequent cities. The ten most frequent cities account for almost 50% of all observations, and Milan alone, with a share of roughly 28%, accounts for more than half of that. Hence, the distribution is highly polarized, with workers concentrated in three cities. Based on personal elaboration, Fig. 4.5 shows
Fig. 4.1 Subsample 1 + subsample 2. ATECO codes. Personal elaboration
Fig. 4.2 Subsample 1. ATECO codes. Personal elaboration
Fig. 4.3 Subsample 2. ATECO codes. Personal elaboration
Fig. 4.4 Sample (subsample 1 + subsample 2) by gender. Personal elaboration
Fig. 4.5 Sample. Geographical distribution. Top ten cities. Personal elaboration
the share of the ten most frequently reported cities in the whole sample, while Figs. 4.6 and 4.7 display such shares in subsamples 1 and 2, respectively. Even if the two subsamples are considered separately, Milan, Rome, and Turin rank top three in both. In subsample 1, Milan reports a relative share of almost 24%,
Fig. 4.6 Subsample 1. Geographical distribution. Top ten cities (relative share out of 4001 observations). Personal elaboration
Fig. 4.7 Subsample 2. Geographical distribution. Top ten cities (relative share out of 4074 observations). Personal elaboration
Rome 8.5%, and Turin 2.2%, with Bologna ranking fourth at 0.9%. Although 39% of observations do not specify any geographical location, the distribution reveals a distinct concentration of the Italian digital workforce in three main cities, which act as digital innovation hubs. Similarly, in subsample 2 Milan, Rome, and Turin are the top three, with relative shares of approximately 32%, 11%, and 5%, respectively. Florence and Bologna follow closely, with relative shares of 2.5% and 2.3%. However, cities with between 0 and 10 observations report a cumulative frequency of 92%, as almost 73% of cities are found in only 1 CV. This points to a highly polarized distribution of creative and cultural workers in Italy: on the one hand, three main cities with a prominent concentration of creative and cultural workers, which therefore act as creative hubs; on the other, a capillary distribution of dispersed workers in many towns across the peninsula. This dispersion may be due to the increasing adoption of remote working (smart working) and self-employment contracts (partite IVA), especially in many subsectors of the CCI, and it is interesting when combined with the results for subsample 1. Taken together, Milan, Rome, and Turin act as the main clusters and hubs attracting and concentrating both digital and creative talent.
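The concentration measures discussed above (share of the top cities, share of cities appearing in a single CV) can be computed from a list of reported locations. The following Python sketch is illustrative only: the city list is invented toy data, `None` stands for a CV that does not specify a location, and the function name is an assumption.

```python
from collections import Counter


def concentration(cities, k=3):
    """Summarize how concentrated CV locations are.

    cities: one entry per CV; None means the location is not specified.
    """
    counts = Counter(c for c in cities if c is not None)
    located = sum(counts.values())
    top_k = sum(n for _, n in counts.most_common(k))
    singletons = sum(1 for n in counts.values() if n == 1)
    return {
        # Share of the k most frequent cities out of all CV (as in Figs. 4.6 and 4.7).
        "top_k_share_of_all_cv": top_k / len(cities),
        # Same share, but only among CV that report a location.
        "top_k_share_of_located_cv": top_k / located,
        # Share of distinct cities that appear in exactly one CV.
        "singleton_city_share": singletons / len(counts),
    }


# Hypothetical toy data, not taken from the sample.
toy_cities = (["Milan"] * 5 + ["Rome"] * 2 + ["Turin"] * 2
              + ["Bologna", "Lecce", "Trento"] + [None] * 4)
summary = concentration(toy_cities)
```

Separating the two denominators matters here because a large fraction of CV (39% in subsample 1) report no location at all.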
4.2 KSC Analysis
The first part of our empirical research focuses on analyzing the KSC extracted from both subsample 1 and subsample 2. After extracting KSC from CV, we identified three main categories for calculating KSC frequencies. The first two differ in their level of aggregation, while the third encompasses different types of KSC.
1. A specific KSC is a particular knowledge, skill, or competence extracted from CV, at the most fine-grained level of aggregation.
2. A title KSC sits one level higher in the taxonomy than a specific KSC and works as an umbrella term that groups multiple, diverse specific KSC according to specific parameters of similarity.
3. Finally, the highest level of aggregation classifies KSC into four types, in order to compute the frequency of sector-specific, cross-sector, transversal, and occupation-specific KSC.
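The three levels of aggregation described above can be illustrated with a small frequency count. This is a minimal Python sketch under stated assumptions: the lookup table `TAXONOMY`, its title and type assignments, and the toy CV are all hypothetical and serve only to show how one extracted KSC contributes to three separate counts.

```python
from collections import Counter

# Hypothetical lookup: specific KSC -> (title KSC, KSC type).
# These rows are invented for illustration; the real assignments would come
# from the taxonomy used in the study.
TAXONOMY = {
    "SQL": ("database and network design and administration", "sector-specific"),
    "Java": ("software and applications development and analysis", "sector-specific"),
    "Adobe Photoshop": ("audiovisual techniques and media production", "sector-specific"),
    "communication": ("management skills", "cross-sector"),
    "teamwork": ("working in teams", "transversal"),
}


def count_levels(cvs, taxonomy):
    """Count KSC frequencies at the specific, title, and type level."""
    specific, title, ksc_type = Counter(), Counter(), Counter()
    for skills in cvs:
        for skill in skills:
            if skill not in taxonomy:
                continue  # unmapped KSC are simply skipped in this sketch
            title_name, type_name = taxonomy[skill]
            specific[skill] += 1
            title[title_name] += 1
            ksc_type[type_name] += 1
    return specific, title, ksc_type


# Hypothetical toy CV, each reduced to its list of extracted specific KSC.
toy_cvs = [
    ["SQL", "Java", "communication"],
    ["Adobe Photoshop", "communication", "teamwork"],
    ["SQL", "communication"],
]
specific, title, ksc_type = count_levels(toy_cvs, TAXONOMY)
print(specific.most_common(1))  # [('communication', 3)]
```

Each occurrence is counted once per level, so the three counters always sum to the same total.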
4.2.1 Specific KSC Analysis
Comparing specific KSC shows that managerial skills are the most frequent in both subsamples, with alter management (also defined as change management, according to
ESCO) and communication occupying the first two positions: the latter ranks first in subsample 2 and second in subsample 1. Computer equipment is the third most frequent KSC in subsample 1, whereas adaptation to change is third in subsample 2. This highlights the increasingly crucial role of managerial skills in both creative and cultural occupations. Figs. 4.8 and 4.9 (personal elaboration) show the top 50 specific KSC by frequency in subsamples 1 and 2, respectively. After managerial skills, computer-enabled or digital specific KSC report the highest frequencies in both subsamples, with KSC related to database and online communications management appearing in the top 50 in both. In particular, digital hard skills related to programming rank among the 50 most frequent KSC in both subsamples: KSC dealing with web development, search engine design, and online communication feature in both rankings (e.g., Java and PHP), alongside a noticeable presence of program-use skills (Java, SQL, and Python in subsample 1; Adobe Photoshop, AutoCAD, and SketchBook Pro in subsample 2). Hence, digital KSC emerge as crucial for both digital and creative occupations. However, while digital hard skills related to the use of software emerge as vital in both subsamples, there is an important qualitative difference in the type of software. In subsample 1, they mostly entail the ability to use, implement, and consult on software, ICT, and data analysis (e.g., SQL, ICT project management, computer science); in subsample 2, they mainly concern the use of software to support design, architectural, or multimedia activities (e.g., Adobe Photoshop, AutoCAD), further revealing the convergence between design and digital KSC. While PHP ranks similarly in both subsamples (40th in subsample 1 and 38th in subsample 2), Java ranks 13th in subsample 1 and 32nd in subsample 2 in terms of frequency.
Furthermore, in subsample 1 SQL is the most frequent technical hard skill, ranking 10th, followed by database (12th), Java/computer programming (13th), ventilation network design (15th), and CRM (18th). Conversely, Adobe Photoshop is the top technical competence in subsample 2, with the 9th highest frequency, followed by photography (11th), graphic design (14th), image editing (16th), and cinematography (19th). This highlights the increasingly symbiotic relationship between digital and artistic KSC in design and cinema, and the growing importance of digital and software skills for creative and cultural workers as well. Moreover, in subsample 2 digital marketing KSC (i.e., planning, communication tools, copywriting) and cinematographic KSC (i.e., film reeling and cinematography) also rank among the top 50 KSC, which underscores the growing combination of marketing and digital as well as cinema and digital skills. This indicates that digital KSC are becoming especially prominent in some CCI subsectors.
Fig. 4.8 Subsample 1. Top 50 specific KSC (frequency). Personal elaboration
Fig. 4.9 Subsample 2. Top 50 specific KSC (frequency). Personal elaboration
4.2.2 Title KSC Analysis
At a higher level of aggregation, title KSC rankings suggest similar results. Languages rank first among the 20 most frequent title KSC for workers in subsample 2 (Fig. 4.11), although this set of skills is arguably mentioned in almost any CV. KSC related to software and applications development and analysis and to audiovisual techniques and media production follow, albeit with remarkably lower frequencies, showing a convergence between some cultural and creative activities and digital ones. Indeed, while languages KSC place first, software and applications development and analysis ranks second, highlighting the increasing symbiosis between arts and digital skills in the CCI. Computer usage, management and administration, database management, and adaptation to change occupy the top positions in both subsamples, which further confirms the finding of the specific KSC analysis: the importance of managerial skills. However, the second half of the top 20 title KSC ranking shows remarkable differences between the two subsamples. In subsample 1, it encompasses KSC mostly dealing with finance and banking, wholesale and retail, and controlling and operations; in subsample 2, it includes KSC related to visual display design, product and service promotion, fashion, and interior and industrial design. Financial and operational KSC also rank among the top 20 KSC in subsample 2, but they report lower frequencies than in subsample 1 and focus on the planning rather than the execution of financial and operational activities. Figs. 4.10 and 4.11 (personal elaboration) show the top 20 title KSC by frequency in subsamples 1 and 2, respectively.
An interesting result is that title KSC related to marketing and advertising rank similarly high in both subsamples (12th in subsample 1 and 10th in subsample 2). This shows how marketing-related occupations now combine creative and digital KSC (and, therefore, workers), requiring more symbiotic approaches that merge the two in the era of hyperconnectivity. Furthermore, soft skills centered on flexibility, resilience, co-management, and adaptation are increasingly critical as innovation becomes pivotal for survival, regardless of the sector. Indeed, adaptation to change, communication, and teamwork feature among the top 50 specific KSC by frequency (Figs. 4.8 and 4.9), while management skills as well as personal skills and development are among the top 20 title KSC. This supports what was introduced in the literature: the increasingly crucial role in the CCI of “ambidextrous” workers equipped with both arts and technical KSC, who can ensure relentless innovation, and therefore sustainable competitive advantage and organizational resilience, through unique combinations of KSC, constantly updated knowledge sharing, and superior flexibility.
Fig. 4.10 Subsample 1. Top 20 title KSC (frequency). Personal elaboration
Fig. 4.11 Subsample 2. Top 20 title KSC (frequency). Personal elaboration
4.2.3 Type of KSC Analysis
When comparing types of KSC, sector-specific and cross-sector (or soft) KSC are the most frequently found in CV, whereas occupation-specific KSC are the least frequent. This highlights the convergence of KSC and, thus, of industries, which is driven by the escalating digitization of many activities under digital transformation. Sector-specific KSC report slightly higher frequencies than cross-sector ones and rank first in both subsamples, which means that hard skills still matter for the CCI workforce. However, the minimal gap between sector-specific and cross-sector KSC also shows how important it is for workers to complement hard skills with soft skills that support the management and coordination of digital and creative talents and KSC in digital transformation. Absolute frequencies nonetheless differ dramatically between the two subsamples: sector-specific and cross-sector KSC have frequencies of 70k and roughly 68k, respectively, in subsample 1, but approximately 32k and 30k, respectively, in subsample 2. Interestingly, the frequency of transversal skills is approximately 13k in both subsamples. This difference may be due to the nature and volume of CV information, which is larger for full CV (subsample 1) and smaller for LinkedIn CV (subsample 2), but it may also reflect divergent approaches to CV compilation and KSC specification between the two subsamples. Figs. 4.12 and 4.13 (personal elaboration) illustrate KSC frequency by KSC type (sector-specific, cross-sector, transversal, and occupation-specific) in subsamples 1 and 2, respectively.
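Because the two subsamples differ so much in absolute volume, relative measures make the comparison easier to read. The Python sketch below uses only the approximate frequencies quoted in the text (rounded to the nearest thousand; occupation-specific counts are not reported and are therefore left out). The function name is illustrative.

```python
# Approximate frequencies as quoted in the text; occupation-specific
# frequencies are not reported in the chapter and are omitted here.
reported = {
    "subsample 1": {"sector-specific": 70_000, "cross-sector": 68_000, "transversal": 13_000},
    "subsample 2": {"sector-specific": 32_000, "cross-sector": 30_000, "transversal": 13_000},
}


def relative_gap(larger, smaller):
    """Gap between two frequencies as a fraction of the larger one."""
    return (larger - smaller) / larger


comparison = {}
for name, freq in reported.items():
    comparison[name] = {
        # How far cross-sector KSC trail sector-specific KSC.
        "sector_vs_cross_gap": relative_gap(freq["sector-specific"], freq["cross-sector"]),
        # Transversal KSC relative to sector-specific KSC.
        "transversal_to_sector": freq["transversal"] / freq["sector-specific"],
    }

for name, values in comparison.items():
    print(name, values)
```

With these rounded inputs, the gap between sector-specific and cross-sector KSC is about 3% in subsample 1 and about 6% in subsample 2, while transversal KSC amount to roughly 19% of sector-specific KSC in subsample 1 and roughly 41% in subsample 2. The same absolute transversal frequency therefore weighs far more heavily in subsample 2.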
[Bar chart. Vertical axis: frequency, 0 to 70k. Horizontal axis: KSC type (sector-specific, cross-sector, transversal, occupation-specific)]
Fig. 4.12 Subsample 1. KSC frequency by type. Personal elaboration
[Bar chart. Vertical axis: frequency, 0 to 30k. Horizontal axis: KSC type (sector-specific, cross-sector, transversal, occupation-specific)]
Fig. 4.13 Subsample 2. KSC frequency by type. Personal elaboration
4.2.4 KSC Network Analysis
Before proceeding with our cluster analysis, we used a network analysis to check for likely connections between digital and creative KSC within the CCI. Figures 4.14, 4.15, and 4.16 depict the KSC in both subsamples, assigning each skill a node whose size is proportional to its frequency (and thus, relevance) in the sample. A network analysis not only highlights how crucial a skill is but also suggests likely connections between KSC in the sample, which are indicated by their spatial distance. When considering associations between top-frequency KSC in the sample, digital KSC again emerge as crucial: they report the highest frequencies and are easy to spot by their size in Fig. 4.14, which indicates their growing importance in the CCI. Moreover, alter management and communication, two managerial skills, and computer equipment and database, two digital skills, are fundamental. Therefore, like our previous analysis, the network analysis shows the increasing prevalence of managerial and digital skills in the CCI. In particular, it highlights a connection between these two sets of KSC: KSC related to computer equipment are closely linked to Java and computer science (Fig. 4.15), but they are also close to project management and research, which are management skills, while KSC dealing with database and SQL appear to be closely connected to alter management. This result suggests that CCI activities require not only digital but also managerial KSC in combination with creative ones, and that digital and managerial KSC are interrelated. Our cluster analysis provides further insights into these findings.
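One simple way to obtain a network of this kind is to treat each KSC as a node weighted by its frequency and to link two KSC whenever they appear in the same CV. The chapter does not spell out how associations were computed, so the Python sketch below is an assumption: it uses plain co-occurrence counts, hypothetical toy CV, and illustrative function names.

```python
from collections import Counter
from itertools import combinations


def build_network(cvs):
    """Return node frequencies and co-occurrence edge weights.

    A node's frequency is the number of CV mentioning the KSC; an edge's
    weight is the number of CV in which both KSC appear together.
    """
    node_freq = Counter()
    edge_weight = Counter()
    for skills in cvs:
        unique = sorted(set(skills))  # count each KSC once per CV
        node_freq.update(unique)
        edge_weight.update(combinations(unique, 2))
    return node_freq, edge_weight


def edge(edge_weight, a, b):
    """Look up an undirected edge regardless of argument order."""
    return edge_weight[tuple(sorted((a, b)))]


# Hypothetical toy CV, not taken from the sample.
toy_cvs = [
    ["computer equipment", "Java", "computer science"],
    ["computer equipment", "Java", "project management"],
    ["database", "SQL", "alter management"],
    ["database", "SQL", "computer equipment"],
]
node_freq, edge_weight = build_network(toy_cvs)
print(node_freq["computer equipment"])                  # 3
print(edge(edge_weight, "computer equipment", "Java"))  # 2
```

In a drawing based on such data, node size would follow `node_freq` and a layout algorithm would place strongly connected KSC closer together, which is how spatial distance comes to signal association.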
Fig. 4.14 Network analysis of KSC by associations and frequency. Personal elaboration
Fig. 4.14 (personal elaboration) shows the network analysis of KSC by associations and frequency, whereas Figs. 4.15 and 4.16 show associations between the most frequent KSC in the whole sample. In particular, Fig. 4.16 illustrates how AutoCAD, 3D modeling, Adobe Photoshop, and image editing are tightly associated not only with one another but also with fine arts, which is a core CCI sector. This suggests that ICT skills are also fundamental in creative activities and points to the digitization of fine arts as well. Furthermore, the close connection of computer equipment with computer science, Java, and Python is consistent with the convergence between digital and creative activities and KSC.
Fig. 4.15 Associations between top-frequency KSC in the sample. Personal elaboration
4.3 Cluster Analysis
4.3.1 KSC Cluster Analysis
After analyzing KSC rankings, we identified clusters of KSC and CV to bring out similarities. First, we mapped the specific KSC of both subsamples, obtaining 16 major clusters, distributed according to KSC frequencies. Fig. 4.17 (personal elaboration) shows their spatial distribution, which also reflects the degree of similarity or difference between KSC across clusters. The spatial distribution of KSC clusters resembles a sort of concentric galaxy, with some dispersed clusters in the outer section (i.e., languages; finance, banking, and insurance; medicine) and an interesting concentration of five clusters in the central area (i.e., creating visual displays and decorations; performing artistic or cultural activities; software and applications development and analysis; allocating and controlling physical resources; complying with health and safety procedures). In particular, we can highlight four main areas where clusters of KSC are located in the graph:
Fig. 4.16 Associations among the most frequent KSC in the sample. Personal elaboration
• Top area: cluster 6 (textiles: clothes, footwear, and leather)
• Upper-middle area: cluster 12 (food processing), cluster 16 (electricity and energy), and cluster 15 (mechanics and metal trades)
• Lower-middle area: cluster 2 (performing artistic or cultural activities), cluster 3 (software and applications development and analysis), cluster 11 (creating visual displays and decorations), cluster 4 (allocating and controlling physical resources), cluster 5 (complying with health and safety procedures), and cluster 8 (medicine)
• Lower area: cluster 1 (languages), cluster 14 (leading and motivating), cluster 9 (selling products or services), cluster 10 (finance, banking, and insurance), cluster 13 (teaching academic or vocational subjects), and cluster 7 (law).
Table 4.2 (personal elaboration) reports the 16 clusters, each with its five most frequent KSC. While Fig. 4.17 shows a dense concentration of most clusters at the center of the map, there are some interesting convergences between clusters in the middle area. There is also a remarkable presence of technical clusters, such as cluster 5 (complying with health and safety procedures), 8 (medicine), 15 (mechanics and metal trades), and 16 (electricity and energy), which include KSC that pertain mostly to engineering activities (respectively, biomedical,
Fig. 4.17 KSC clusters. Personal elaboration
mechanical, and electronics, computer engineering, and telecommunications). At first glance, this highlights the growing role of ICT activities, functions, and KSC in the CCI, which is consistent with the considerable presence of engineers and data scientists in the sample. Cluster 6 (textiles: clothes, footwear, and leather) is the only cluster at the very top and shows a highly polarized internal KSC distribution, as the top-ranking KSC reports a dramatically higher frequency than the next four most frequent KSC in the cluster (see Table 4.2). The closest cluster is cluster 15 (mechanics and metal trades). Considering that KSC related to fabricating garments and textile products, operating machinery for the manufacture and treatment of textiles, fur, and leather products, and designing systems and products rank second, third, and fourth, respectively, in cluster 6, this suggests that STEM skills are increasingly vital in the CCI subsector of fashion, especially in design and production. Cluster 15 and cluster 16 (electricity and energy) are also concentrated in the upper-middle
Table 4.2 KSC clusters with five top-frequency KSC in the sample. Personal elaboration
# | Cluster name | Top-frequency KSC (frequency)
1 | Languages | Languages: 69; Translating and interpreting: 7; Language acquisition: 5; Using foreign languages: 5; Classical languages: 3
2 | Performing artistic or cultural activities | Performing artistic or cultural activities: 18; Audiovisual techniques and media production: 11; Artistic and creative writing: 10; Operating audiovisual equipment: 8; Artistic and creative writing: 7
3 | Software and applications development and analysis | Software and applications development and analysis: 99; Database and network design and administration: 64; Managing and analyzing digital data: 34; Computer use: 15; Programming computer systems: 14
4 | Allocating and controlling physical resources | Allocating and controlling physical resources: 10; Planning and scheduling events and activities: 9; Purchasing goods or services: 8; Directing operational activities: 8; Storing goods and materials: 7
5 | Complying with health and safety procedures | Complying with health and safety procedures: 13; Maintaining and enforcing physical security: 12; Transport services: 11; Protecting ICT devices: 9; Database and network design and administration: 7
6 | Textiles (clothes, footwear, and leather) | Textiles (clothes, footwear, and leather): 17; Fabricating garments and textile products: 7; Operating machinery for the manufacture and treatment of textiles, fur, and leather products: 5; Designing systems and products: 4; Cutting materials and drilling holes: 4
7 | Law | Law: 34; Advising on legal, regulatory, or procedural matters: 7; Protection of persons and property: 2; Technical or academic writing: 1; Presenting research or technical information: 1
8 | Medicine | Medicine: 61; Therapy and rehabilitation: 16; Medical diagnostic and treatment technology: 14; Biology: 13; Diagnosing health conditions: 13
9 | Selling products or services | Selling products or services: 12; Developing financial, business, or marketing plans: 9; Marketing and advertising: 6; Providing information to the public and clients: 5; Directing operational activities: 4
10 | Finance, banking, and insurance | Finance, banking, and insurance: 22; Executing financial transactions: 19; Managing budgets or finances: 19; Providing financial advice: 12; Monitoring financial and economic resources and activity: 10
11 | Creating visual displays and decorations | Creating visual displays and decorations: 21; Audiovisual techniques and media production: 18; Creating artistic designs or performances: 13; Operating audiovisual equipment: 12; Using computer-aided design and drawing tools: 7
12 | Food processing | Food processing: 35; Crop and livestock production: 12; Preparing food and drinks: 11; Fabricating food and related products: 4; Developing recipes and menus: 4
13 | Teaching academic or vocational subjects | Teaching academic or vocational subjects: 15; Teacher training without subject specialization: 6; Education science: 5; Monitoring and evaluating the performance of individuals: 5; Teaching and training: 5
14 | Leading and motivating | Leading and motivating: 9; Supervising a team or group: 5; Recruiting and hiring: 5; Working in teams: 3; Coordinating activities with others: 2
15 | Mechanics and metal trades | Mechanics and metal trades: 21; Materials (glass, paper, plastic, and wood): 12; Operating cutting, grinding, and smoothing machinery: 12; Smoothing surfaces of objects and equipment: 9; Applying protective or decorative solutions or coatings: 8
16 | Electricity and energy | Electricity and energy: 29; Electronics and automation: 17; Operating energy production or distribution equipment: 10; Designing electrical or electronic systems or equipment: 10; Mechanics and metal trades: 8
area, showing the increasing importance of telecommunications and systems design KSC in the CCI as ICT spreads ever more widely with digital transformation. Indeed, KSC related to computer technology, automation technology, and ICT communications protocols and infrastructure are among the top-frequency KSC in cluster 16, as Fig. 4.18 (personal elaboration) indicates. This is even more evident if we analyze the lower-middle region of the distribution map, where there is an interesting concentration of three clusters, with cluster
[Bar chart. Vertical axis: frequency, 0 to 800. Horizontal axis: KSC in cluster 16; legible labels include electricity, electronics, automation technology, computer technology, computer control systems, ICT communications protocols, ICT infrastructure, and telecommunications engineering]
Fig. 4.18 Cluster 16 (energy and electricity). Top-frequency KSC. Personal elaboration
2 (performing artistic or cultural activities) substantially surrounding cluster 3 (software and applications development and analysis) and partially converging toward cluster 11 (creating visual displays and decorations). This highlights how some performing, artistic, architectural, design, and cultural KSC are becoming increasingly digitized, sometimes shifting toward technically focused, digitally based, or software-driven solutions that require mostly digital KSC. Indeed, the second and fourth most frequent title KSC coincide in clusters 2 and 11 (respectively, audiovisual techniques and media production and operating audiovisual equipment) and involve the digital and ICT domain. At the same time, the top-frequency KSC in cluster 3 are database and network design and administration, managing and analyzing digital data, and computer use, which also report the highest absolute frequencies for the top 3 KSC among all 16 clusters, further underlining the remarkable importance of digital skills in the CCI. As cluster 3 also comprises highly technical hard digital skills, this aggregation supports the symbiotic relationship between digital and creative skills in the CCI, with ICT supporting creativity. In the same section, clusters 4 (allocating and controlling physical resources) and 5 (complying with health and safety procedures) partially intermingle. These two clusters mainly concern managerial skills, involving KSC related to complying with health and safety in the workplace, protecting ICT devices, and database and network design and administration (cluster 5) as well as planning and scheduling events and activities or directing operational activities (cluster 4). This shows that not only digital KSC but also managerial KSC are increasingly important for the CCI workforce because the latter are necessary for supporting the former.
Moreover, the proximity of this aggregation to the convergence of clusters 2, 3, and 11 described in the previous paragraph, which identifies the symbiosis between digital and creative skills, indicates that managerial, digital, and creative skills coexist in the CCI in a mutually supportive relationship. Managerial KSC are crucial for coordinating creative work as well as for implementing digital tools successfully in the CCI.
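Statements about clusters being close to one another on a map such as Fig. 4.17 can be made explicit by comparing cluster centroids. The chapter does not describe how the map coordinates were obtained, so the Python sketch below is only an illustration: the coordinates are invented, the cluster numbers are borrowed from Table 4.2 purely as labels, and Euclidean distance between centroids is an assumed proximity measure.

```python
from math import dist
from statistics import mean


def centroids(points):
    """points: iterable of (cluster_id, x, y) -> {cluster_id: (mean_x, mean_y)}."""
    by_cluster = {}
    for cluster_id, x, y in points:
        by_cluster.setdefault(cluster_id, []).append((x, y))
    return {
        cluster_id: (mean(p[0] for p in pts), mean(p[1] for p in pts))
        for cluster_id, pts in by_cluster.items()
    }


def nearest_cluster(cluster_id, cents):
    """Return the id of the cluster whose centroid is closest to cluster_id's."""
    others = (other for other in cents if other != cluster_id)
    return min(others, key=lambda other: dist(cents[cluster_id], cents[other]))


# Hypothetical 2-D positions of KSC on the map, labeled by cluster number.
toy_points = [
    (2, 0.0, 0.0), (2, 2.0, 0.0),
    (3, 1.0, 0.5), (3, 1.0, 1.5),
    (11, 3.0, 0.0), (11, 3.0, 0.5),
    (6, 1.0, 9.0), (6, 2.0, 10.0),
]
cents = centroids(toy_points)
print(nearest_cluster(2, cents))   # 3
print(nearest_cluster(11, cents))  # 2
```

Centroid distance is a coarse summary: it does not capture one cluster surrounding another, which would require looking at the spread of points as well.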
4
Creative and Digital Skills in Italian Cultural and Creative Industries
Indeed, soft skills concentrate in the lower area of the graph, where cluster 14 (leading and motivating) moves closer to cluster 9 (selling products and services), which shows how soft skills of leadership and motivation, as well as marketing and advertising, are becoming pivotal in the CCI. Cluster 9 also partially reaches out to cluster 10 (finance, banking, and insurance), showing the importance of budgeting and monitoring KSC as well. These clusters mostly encompass KSC related to corporate functions, such as human resources (i.e., recruiting and hiring; cluster 14), finance (i.e., managing budgets or finances; cluster 10), and marketing (i.e., marketing and advertising; cluster 9). Moreover, they all involve soft skills: monitoring resources and activities (cluster 10); developing plans, directing activities, and providing information (cluster 9); and working in or supervising teams, coordinating activities, and leading (cluster 14). This is in line with our previous KSC analysis (see Sect. 4.2), which highlights how cross-sector skills closely follow sector-specific skills in terms of frequency. Conversely, cluster 7 (law) is quite dense, as it shows a highly polarized KSC frequency distribution, with the first-ranking KSC (i.e., law) showing radically higher frequencies than the remaining ones. Cluster 1 (languages) similarly occupies a sort of outlier position, possibly because this KSC is not digital- or creativity-related but is a crucial requirement and soft skill regardless of the work performed.
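One way to make the notion of a polarized frequency distribution concrete is to compute the share of a cluster's total KSC mentions that is held by its first-ranking KSC. The following Python sketch illustrates the idea under stated assumptions: the counts, the KSC labels other than law, and the function name top_share are hypothetical and are not taken from our sample or from the clustering procedure itself.

```python
from collections import Counter


def top_share(ksc_counts: Counter) -> float:
    """Return the share of all mentions held by the most frequent KSC."""
    total = sum(ksc_counts.values())
    if total == 0:
        return 0.0
    _, top_count = ksc_counts.most_common(1)[0]
    return top_count / total


# Hypothetical counts, for illustration only (not taken from the sample).
polarized_cluster = Counter({"law": 900, "ksc_b": 50, "ksc_c": 50})
balanced_cluster = Counter({"ksc_x": 350, "ksc_y": 330, "ksc_z": 320})

print(round(top_share(polarized_cluster), 2))  # 0.9
print(round(top_share(balanced_cluster), 2))   # 0.35
```

A value close to 1 indicates that a single KSC dominates the cluster, as described for cluster 7, whereas lower values indicate a more even distribution.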
4.3.2
CV Cluster Analysis
The CV cluster analysis was threefold: we clustered CV first according to KSC, then according to ATECO code, and finally according to demographic data (i.e., gender and geographical distribution). Based on personal elaboration, the following graph highlights CV clusters according to KSC. A square identifies a CV from subsample 1, while a cross identifies a CV from subsample 2 (Fig. 4.19). At first glance, the spatial distribution of clusters clearly shows a frontier (highlighted by the curve on the graph) separating two macro-clusters of CV. On the left, most CV contain digital-related KSC (group 1), while on the right CV mostly contain creative or soft KSC (group 2). This decision boundary persists in the following sections of our empirical analysis, dividing the sample into two groups: group 1 on the left and group 2 on the right. The most frequent KSC in group 1 are data analytics, web and software development, and business and project management, with the first being remarkably more frequent than the other two (i.e., a frequency of roughly 17k). In particular, cybersecurity, web and software development, and data analytics are the top-ranking KSC in clusters 0, 5, and 4, respectively, all of which belong to group 1. Conversely, multimedia and product design, marketing, and general workplace skills are the KSC with the highest frequency in group 2, with frequencies of approximately 16k, 12k, and 11k, respectively. In particular, at the frontier, cluster 9’s top-ranking KSC is marketing, while cluster 5’s
4.3
Cluster Analysis
Fig. 4.19 CV clusters according to KSC. Personal elaboration
top-ranking KSC is web and software development, which further points to the predominance of digital KSC in marketing. Yet, cluster 9 is also close to cluster 10, which reports multimedia and product design as its highest-frequency KSC. Therefore, the CV cluster analysis further underlines the symbiosis of digital and creative skills in marketing. Moreover, web and software development and data analytics rank fourth and fifth in terms of frequency in group 2. This shows the growing digitization of creative KSC, which is also highlighted by the ranking of multimedia and product design as the most frequent KSC in the group. Likewise, marketing reports the fifth highest frequency in group 1, which further indicates how this CCI subsector is becoming considerably digital- and software-driven. Furthermore, general workplace skills rank similarly high in both groups 1 and 2 (i.e., fourth and third, respectively), highlighting how soft skills are crucial regardless of the industry considered. This supports past literature and our results, which advocate for the rising importance of flexibility and learning in promoting
[Bar chart; y-axis ticks from 0 to 15; x-axis: KSC categories]
Fig. 4.20 Group 1. Top-frequency KSC. Personal elaboration
[Bar chart; y-axis ticks from 0 to 14; x-axis: KSC categories]
Fig. 4.21 Group 2. Top-frequency KSC. Personal elaboration
continuous and relevant innovation within organizations. Indeed, despite mostly encompassing STEM and digital KSC, group 1 blends CV from both subsample 2 (i.e., cross) and subsample 1 (i.e., square), showing how ICT skills are crucial in all CCI, as we expected from our literature research and previous analysis. Based on personal elaboration, Figs. 4.20 and 4.21 show the top-frequency KSC in groups 1 and 2, respectively. Moreover, the analysis of the most frequent KSC in the two groups further supports the digitization of KSC in the CCI and the symbiotic relationship between digital and creative skills within it, as there is a general lack of occupation-specific skills and the KSC ranked in Figs. 4.20 and 4.21 are similar. Indeed, soft skills report the fourth and third highest frequency among male and female workers, at roughly 11%. General workplace skills encompass KSC related to personal development, teamworking, leadership and motivation, and problem-solving, further highlighting
[Bar chart; y-axis ticks from 0 to 3000; x-axis: general workplace KSC]
Fig. 4.22 General workplace skills. Top-frequency KSC for both males and females. Personal elaboration
the importance of managerial skills among the CCI workforce as well. Based on personal elaboration, Fig. 4.22 illustrates the top-frequency general workplace KSC for both males and females. If we cluster CV by ATECO code, we obtain partially similar results, with group 1’s CV predominantly belonging to ATECO code 62 (computer programming, consultancy, and related activities). Another immediately noticeable cluster is the orange one on the upper-right side, which identifies marketing (i.e., advertising and market research, ATECO code 73). However, the remaining observations are quite dispersed, which points to the hybridization of KSC and CCI subsectors, with solutions merging digital and creative KSC in addition to managerial ones. On the left, the majority of CV refer to ATECO code 62, but only in the northern part of the island (blue). In the southern part, there is a mixture of colors, which depicts a merging of multiple ATECO codes, including motion picture, video and television programme production, sound recording and music publishing activities (59), marketing (73), consulting (70), creative, arts, and entertainment activities (90), and activities of libraries, archives, and other cultural activities (91). Similarly, on the right of the boundary line, a small cluster blends CV pertaining to television programming and broadcasting activities (60) with marketing (73) and computer programming (62) activities. This shows how ICT KSC and workforce are increasingly predominant in the CCI. Based on personal elaboration, Fig. 4.23 shows CV clusters by ATECO code. Indeed, in comparing the most frequent KSC for each ATECO code, it is possible to notice the general lack of occupation-specific skills, whereas sector-specific and cross-sector skills are mostly featured in these rankings. 
If we compare the top-frequency KSC in digital-related CV (ATECO code 62) and core CCI subsectors (ATECO codes 59, 90, and 91), data analytics, marketing, and multimedia and product design rank among the top 8 KSC in all four. In particular, multimedia and product design ranks first for ATECO code 59 (motion picture, video and television programme
Fig. 4.23 CV clusters by ATECO code. Personal elaboration
production, sound recording, and music publishing activities) and marketing for ATECO code 90 (creative, arts, and entertainment activities), while data analytics ranks 7th and 4th, respectively, reaching the 3rd position for ATECO code 91 (libraries, archives, and other cultural activities). Hence, even core CCI sub-industries, as defined by the Work Foundation (2007), increasingly demand digital skills. Moreover, managerial soft skills related to workplace and project management are likewise required in both digital and more creative and cultural occupations, with general workplace skills ranking among the three most frequent KSC in all four ATECO codes. This shows how cross-sector skills are crucial, but also how more technical and managerial KSC are pivotal for the CCI workforce. Based on personal elaboration, Figs. 4.24, 4.25, 4.26, and 4.27 report the most requested skills in activities that are more closely related to either the core creative and cultural activities (ATECO codes 59, 90, and 91) or core technology-driven
[Bar chart, panel title "Ateco 62.0"; y-axis ticks from 0 to 10k; x-axis: KSC categories]
Fig. 4.24 Most frequent title KSC. ATECO code 62 (computer programming, consultancy, and related activities). Personal elaboration
[Bar chart, panel title "Ateco 59"; y-axis ticks from 0 to 250; x-axis: KSC categories]
Fig. 4.25 Most frequent title KSC. ATECO code 59 (motion picture, video and television programme production, sound recording, and music publishing activities). Personal elaboration
occupations (ATECO code 62). By comparing them, it is possible to notice how there is no relevant difference in terms of top-frequency KSC. For instance, general workplace skills, business and project management, marketing, multimedia and product design, web and software development, and public relations characterize all ATECO codes. This further supports our claim that there is a growing convergence of KSC but also highlights an interesting finding: there is a general convergence of industries toward hybrid solutions due to escalating digitization which
[Bar chart, panel title "Ateco 90.0"; y-axis ticks from 0 to 400; x-axis: KSC categories]
Fig. 4.26 Most frequent title KSC. ATECO code 90 (creative, arts, and entertainment activities). Personal elaboration
[Bar chart, panel title "Ateco 91.0"; y-axis ticks from 0 to 150; x-axis: KSC categories]
Fig. 4.27 Most frequent title KSC. ATECO code 91 (libraries, archives, and other cultural activities). Personal elaboration
makes sectors’ boundaries more difficult to define. Consequently, the KSC required from both the digital and the creative workforce tend to converge, and creative and digital skills must have a symbiotic relationship in the CCI. This also shows how digital transformation requires an urgent update of ATECO codes to accommodate more hybrid solutions and sectors, especially in the CCI. Indeed, multimedia and product design as well as marketing rank first for two ATECO codes that identify core CCI
sub-industries for traditional CCI literature, which also considers marketing a peripheral CCI activity (The Work Foundation, 2007). The salience of ICT-related KSC even in core CCI activities shows how digitization urges the CCI workforce to acquire digital KSC and how research and literature must update their studies to keep pace with continuous industry developments. The decision boundary also seems to exist in terms of gender distribution, with group 1 being male-dominated and group 2 showing more balance, especially in the northern section of group 1 and group 2’s northern cluster on the right, which refers to marketing. Yet, on the lower-right side, the two clusters closer to the decision boundary show a slight majority of females. These two clusters merge multiple sub-industries of the CCI, as can be seen from the distribution of ATECO codes in Fig. 4.23. Hence, digital transformation seems to promote a balance between female- and male-dominated activities within the CCI, although ICT occupations (ATECO code 62) still show a heavy predominance of a male workforce. Further research should therefore examine how to promote digital KSC among the female CCI workforce to foster equal opportunities. Indeed, the largest shares of males report KSC related to data analytics (roughly 15%) and web and software development (almost 12%), while the largest shares of females report general workplace skills (almost 13%) and marketing skills (approximately 11%). While the comparison of the ten most frequent KSC for males and females shows no difference in terms of managerial skills, with business and project management ranking as the third and fourth most frequently supplied KSC by females and males, respectively, in the CCI, huge discrepancies exist in terms of digital skills. 
Indeed, roughly 9.5% of females report data analytics skills and 5.5% report web and software development skills, which rank only fourth and sixth among the skills most supplied by the female CCI workforce. Nevertheless, 8% of them have multimedia and product design KSC (compared with 4% of males). Moreover, more females show KSC related to scientific research (2% against 1% of males) and teaching skills (1%), which are not featured among the top-ranking KSC for males. Based on personal elaboration, Fig. 4.28 shows CV clusters by gender, while Figs. 4.29 and 4.30 depict the KSC most frequently reported by males and females, respectively. Similar to our descriptive analysis (see Sect. 4.1), the cluster analysis shows a polarized geographical distribution of the CCI workforce: the majority of both digital and creative talent is concentrated in three main regions, Lombardia, Lazio, and Piemonte, with the first accounting for most of it, while the remaining creative talent is dispersed across the peninsula. This highlights both the widespread distribution of digital and creative talent in Italy and the concentration of major digital creative centers in Lombardia. Based on personal elaboration, Fig. 4.31 displays CV clusters by Italian region, whereas Fig. 4.32 displays the regions’ frequencies in the sample.
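The gender comparison of digital and creative KSC reported above can be summarized as differences in percentage points. The following Python sketch is only illustrative: it uses the approximate shares quoted in the text, and the function name share_gaps is a hypothetical helper introduced here, not part of our analysis pipeline.

```python
# Approximate shares (%) of CV reporting each KSC, as quoted in the text.
male_share = {
    "data analytics": 15.0,
    "web and software development": 12.0,
    "multimedia and product design": 4.0,
    "scientific research": 1.0,
}
female_share = {
    "data analytics": 9.5,
    "web and software development": 5.5,
    "multimedia and product design": 8.0,
    "scientific research": 2.0,
}


def share_gaps(male: dict, female: dict) -> dict:
    """Female minus male share, in percentage points, for KSC found in both."""
    return {ksc: female[ksc] - male[ksc] for ksc in male if ksc in female}


gaps = share_gaps(male_share, female_share)

# Print from the largest female shortfall to the largest female surplus.
for ksc, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{ksc}: {gap:+.1f} pp")
```

Negative values mark KSC reported less often by females than by males (here, web and software development and data analytics), while positive values mark KSC reported more often by females.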
Fig. 4.28 CV clusters by gender. Personal elaboration
4.4
KSC Gap Between Demand and Supply
To compare KSC demand and supply in the industry, we set the KSC outlined in 131,504 job adverts against those of the CV in our sample. Based on personal elaboration, Fig. 4.33 reports the most frequently demanded KSC in the creative and cultural industries as resulting from the analysis of these job advertisements. The results in Fig. 4.33 show a huge demand for skills dealing with software and applications development and analysis, which is the most demanded KSC, and for management and administration KSC, which rank second with a slightly lower frequency. This is in line with the results of our sample, and, therefore, with the demand for skills in the industry, which showed the crucial role of digital and managerial KSC. If we compare this bar graph with those reporting the KSC supply in the CCI in our sample (Figs. 4.10 and 4.11), we notice similar rankings. In
[Bar chart, panel title "Male"; y-axis ticks from 0 to 14 (%); x-axis: KSC categories]
Fig. 4.29 Most frequent KSC. Males (%). Personal elaboration
[Bar chart, panel title "Female"; y-axis ticks from 0 to 12 (%); x-axis: KSC categories]
Fig. 4.30 Most frequent KSC. Females (%). Personal elaboration
Fig. 4.10, the most supplied KSC are software and applications development and analysis, database and network design and administration, and management and administration. Therefore, there is no significant gap between the demand and supply of managerial and digital KSC when computer-related occupations in the CCI are considered. In Fig. 4.11, software and applications development and analysis is the second most frequent skill supplied by the CCI workforce in subsample 2, which contains larger shares of people employed in traditionally core occupations of the CCI. However, software and applications development and analysis skills report a significantly lower frequency than languages, which rank as the most supplied KSC in subsample 2. Likewise, management and administration KSC is
Fig. 4.31 CV clusters by Italian region. Personal elaboration
placed only sixth among the KSC that are most frequently supplied in the subsample. The same applies to database and network design and administration skills, which are the fourth most demanded skills by organizations but rank seventh among the KSC supplied by the CCI workforce in subsample 2 (Fig. 4.11). This highlights a shortage of digital and managerial KSC supply among the more core CCI occupations. Moreover, this digital KSC gap becomes important when gender is considered. If we take into consideration Figs. 4.29 and 4.30, which depict the KSC most frequently supplied by males and females in our sample, respectively, web and software development (which is related to the most demanded skill in Fig. 4.33) is among the skills most frequently supplied by males, but ranks only sixth among females. So, while a gap between the demand and the supply of digital skills does not exist for male workers, it is quite remarkable for female workers. This makes gender significant when studying gaps in digital skills in the CCI. On the contrary, as regards managerial skills gaps, gender is not significant: business and project management skills rank high for both males (4th
[Bar chart; y-axis ticks from 0 to 1500; x-axis, in the order shown: Lombardia, Lazio, Piemonte, Emilia-Romagna, Veneto, Toscana, Campania, Liguria, Sicilia, Puglia, Friuli-Venezia Giulia, Trentino-Alto Adige/Südtirol, Marche, Sardegna, Abruzzo, Umbria, Calabria, Basilicata, Molise, Valle d’Aosta/Vallée d’Aoste]
Fig. 4.32 Region frequencies. Personal elaboration
[Bar chart; y-axis ticks from 0 to 120k; x-axis: KSC categories]
Fig. 4.33 Most in-demand KSC from job advertisements in the industry. Personal elaboration
frequency) and females (3rd frequency), while there is a low supply of managerial skills related to IT support services, sales, and customer service for both males and females (Figs. 4.29 and 4.30). If we take into consideration the remaining KSC, the first ten KSC most demanded and most supplied are the same, with small differences in rankings, for
both subsamples. For subsample 1 (Fig. 4.10), there is a small gap between marketing and advertising KSC demand (7th frequency) and supply (12th) as well as between personal skills and development KSC demand (3rd) and supply (10th), whereas there is an oversupply of electronics and automation KSC, which is placed 6th among the most supplied KSC but only 18th among those most demanded. As for subsample 2, if we compare Fig. 4.33 with Fig. 4.11, the first ten KSC most demanded and most supplied are the same, with small differences in ranking positions. For instance, there is a small gap between marketing and advertising KSC demand (7th frequency) and supply (10th frequency), whereas there is an oversupply of audiovisual techniques and media production and of languages, which rank 3rd and 1st in the supply, respectively, but only 10th and 5th in the demand. However, both subsamples also report more important gaps in KSC demand and supply, as some KSC that are demanded find no supply at all. In subsample 1 (Fig. 4.10), some skills are largely demanded, and are therefore placed high in terms of frequency in Fig. 4.33, but find no supply: KSC dealing with audiovisual techniques and media production and with working in teams, which rank as the 10th and 13th most demanded KSC in Fig. 4.33. Hence, when more digital profiles are considered, there is a significant lack of soft skills related to team working. As for subsample 2, the most important gaps in KSC demand and supply can be found in the middle section of both graphs in Figs. 4.11 and 4.33. Indeed, leading and motivating, coordinating activities with others, planning and scheduling events and activities, and working in teams are featured in the demand but not in the supply. Therefore, there is a general gap in the supply of soft managerial skills related to project management in subsample 2: leadership, organization, and teamworking. 
Furthermore, although marketing ranks high in both graphs, sales skills report an important gap, as wholesale and retail sales KSC are ranked the 11th most demanded but are not supplied. Finally, KSC dealing with managing and analyzing digital data, designing systems and products, and electronics and automation are highly demanded but not supplied, highlighting an important gap in digital management skills in subsample 2 as well, especially when it comes to data analysis and data management. Conversely, such gaps in digital skills (i.e., managing and analyzing digital data, electronics and automation), soft managerial skills related to leadership and organization (i.e., monitoring and evaluating the performance of individuals, coordinating activities with others, planning and scheduling events and activities), and sales skills (i.e., wholesale and retail sales) do not exist for subsample 1. Nevertheless, KSC dealing with leading and motivating as well as working in teams are missing in subsample 1’s supply of KSC (Fig. 4.10). Two main limitations need to be added to the previous discussion. First, although we queried the LinkedIn database using the job titles in our two datasets, the job ads we retrieved might not relate to exactly the same job titles, as the LinkedIn algorithm heavily influences the results of the queries. Second, LinkedIn job ads might not represent all ATECO sectors equally, as they might be skewed toward more ICT-related ATECO codes: for instance, it would be easier to find profiles of UX designers than of librarians, as the former might be more inclined to create a LinkedIn profile in the first place.
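The rank-based comparison of demand and supply used in this section can be sketched in a few lines of Python. The example below is a minimal illustration, assuming the rank positions quoted above for subsample 1; the labels returned by the hypothetical helper classify_gap are our own shorthand for this illustration and not a formal taxonomy.

```python
from typing import Optional

# Rank positions quoted in the text for subsample 1 (Fig. 4.33 vs. Fig. 4.10).
# None marks a KSC that does not appear in the supply ranking at all.
demand_rank = {
    "personal skills and development": 3,
    "marketing and advertising": 7,
    "audiovisual techniques and media production": 10,
    "working in teams": 13,
    "electronics and automation": 18,
}
supply_rank = {
    "personal skills and development": 10,
    "marketing and advertising": 12,
    "audiovisual techniques and media production": None,
    "working in teams": None,
    "electronics and automation": 6,
}


def classify_gap(demand: int, supply: Optional[int]) -> str:
    """Label a KSC by comparing its demand rank with its supply rank."""
    if supply is None:
        return "no supply"
    if supply > demand:  # ranked lower in supply than in demand
        return "undersupply"
    if supply < demand:  # ranked higher in supply than in demand
        return "oversupply"
    return "aligned"


for ksc, rank in sorted(demand_rank.items(), key=lambda item: item[1]):
    print(f"{ksc}: {classify_gap(rank, supply_rank.get(ksc))}")
```

Comparing rank positions rather than raw frequencies keeps the two sources comparable even though job adverts and CV differ greatly in number, at the cost of ignoring how large the frequency differences behind each rank are.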
Reference
The Work Foundation. (2007). Staying ahead: the economic performance of the UK’s creative industries. Department for Culture, Media, and Sport. Crown. https://static.a-n.co.uk/wpcontent/uploads/2013/11/4175593.pdf
Conclusion
Cultural and creative industries have received growing attention in business and economics studies, and not simply for their contribution to national economic growth. Recent studies have found heterogeneous but positive effects on local economies (Gutierrez-Posada et al., 2022) and innovation (Innocenti & Lazzeretti, 2019), stressing their pivotal role in the interaction between professionals, firms, and institutions even beyond the traditional boundaries of arts and culture. In 2020, the CCI accounted for roughly 5.9% of Italy’s national employment and 5.7% of the national economy, totaling almost 1.5 million workers and producing a total value added of 84.6 billion € (Symbola & Unioncamere, 2021, p. 66). However, the CCI experienced a dramatic fall in both total value added and employment levels, which shrank by 8.1% and 3.5%, respectively, with respect to 2019 (Symbola & Unioncamere, 2021, p. 66). This shock calls for the study of how KSC demand and supply have changed in the industries and which KSC have become crucial for creative and cultural workers to thrive in this new scenario. This research is a first attempt to empirically identify the impact of digital transformation on the CCI in terms of the evolution and mix of knowledge, skills, and competences required in cultural jobs. The CCI have been among the industries most exposed to both the effects of digital transformation and the shock of the COVID-19 pandemic. The former has completely changed the mode of production and distribution of many art forms, such as music, literature, and cinema, and has deeply revolutionized our habits of cultural participation and consumption. The latter has shown the fragility of many activities and organizations, which have been forced to close and have faced a dramatic decline in audiences, while digital technologies have significantly contributed to the survival of many cultural activities. 
Therefore, the analysis of the evolution of KSC is relevant not only for exploring the possibilities of art and cultural production in the digital age but also as an opportunity for organizations to become more resilient to technological unemployment and external shocks. Our research highlighted how both digital and managerial/creative KSC have become crucial in the CCI for both job supply and demand. In particular, digital KSC
related to data analytics and software and applications development and analysis emerge as critical, although a gap still exists between the demand and supply of digital KSC for female workers. This highlights how gender is relevant when analyzing digital KSC gaps in the CCI. Furthermore, managerial KSC emerge as fundamental for managing and coordinating creative and digital talent and KSC, for fostering a successful symbiosis between the two, and for the effective implementation of digital innovation and technology for creative purposes. However, our research shows the existence of some gaps between the demand and supply of managerial soft skills related to project management (i.e., leadership, organization and coordination, teamworking), digital management (i.e., data analysis and management), and sales. In general, the research shows the growing symbiotic relationship between digital, managerial, and creative KSC in the CCI, which urges the development of new KSC among CCI employees, as well as the accelerating convergence between CCI subsectors, whose boundaries are increasingly blurred by digital transformation. This not only exposes the limitations of traditional top-down classifications but also encourages the adoption of more skill-based and bottom-up approaches to the study of CCI and KSC, supporting our research methodology. The research has its limitations, and we have not been able to draw a definitive picture of the skills for the cultural jobs of the future. A more comprehensive analysis should expand the sample to other countries and compare differences and similarities among urban areas and regions. Our contribution should generate value primarily for the management of cultural and arts organizations because such an approach can support them in selecting new job profiles. 
Moreover, regular monitoring of KSC (not only in the CCI) would help higher education and training institutions better match their learning programs with the actual needs of the job market. The skills gap analysis can also help to set up specific up-skilling programs for employees in the CCI and re-skilling programs for people who have lost their jobs or simply want to change career. At the level of local governance, public administrations can also use the results of localized skills analysis to evaluate investment opportunities and support innovation and training in cultural organizations.
References
Gutierrez-Posada, D., Kitsos, T., Nathan, M., & Nuccio, M. (2022). Creative clusters and creative multipliers: evidence from UK cities. Economic Geography, 1–24. https://doi.org/10.1080/00130095.2022.2094237
Innocenti, N., & Lazzeretti, L. (2019). Do the creative industries support growth and innovation in the wider economy? Industry relatedness and employment growth in Italy. Industry and Innovation, 26(10), 1152–1173.
Symbola & Unioncamere. (2021). Io sono cultura 2021. L’Italia della qualità e della bellezza sfida la crisi. I quaderni di Symbola. Copygraph sas. ISBN 9788899265663. https://www.symbola.net/ricerca/io-sono-cultura-2021/