Stefan Schiffner Sébastien Ziegler Meiko Jensen Editors
Privacy Symposium 2023 Data Protection Law International Convergence and Compliance with Innovative Technologies (DPLICIT)
Editors Stefan Schiffner BHH University of Applied Sciences Hamburg, Germany
Sébastien Ziegler Mandat International Geneva, Switzerland
Meiko Jensen Karlstad University Karlstad, Sweden
ISBN 978-3-031-44938-3
ISBN 978-3-031-44939-0 (eBook)
https://doi.org/10.1007/978-3-031-44939-0
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland Paper in this product is recyclable.
Foreword
The Privacy Symposium held in Venice in April of 2023 brought together academics, civil society organizations, international organizations, government officials, and industry representatives in a week-long conversation that was remarkable for its openness and curiosity as well as its candor. It was an honor for me to participate in this conversation, and an even greater honor to have been asked to write this introduction. For an American who is not an academic, writing this introduction has allowed me to engage more deeply with the participants’ written papers, learn from their scholarship, and obtain a deeper understanding and appreciation of the strengths and weaknesses of the European data protection framework. I look forward to applying what I have learned in an increasingly international information ecosystem.

The essays in this volume examine a variety of aspects of the data protection framework in Europe. While this has been the subject of a growing number of academic publications, those in this volume are unique in reflecting a rising concern among European scholars over the tension between data protection law on the books and the law on the ground. The ideals of the data protection framework, expressed in legislation such as the European Union General Data Protection Regulation (GDPR) and international legal instruments such as Convention 108 and the European Charter of Fundamental Rights, encompass a body of universalist theory which sees in the rise of new information technologies a threat to longstanding social values, and adopts a fundamental rights theory to address these risks. Partly because of the universalist theoretical origins of this conventional wisdom, scholars have often been uncomfortable highlighting the misalignment of human experience with the legal architecture of data protection. In particular, little attention has been paid to the high levels of noncompliance with data protection laws by SMEs in Europe.
Likewise, in connection with the restrictions the GDPR purports to impose on data transfers outside of Europe, there has been a general disregard of the nonalignment of the legal framework with actual practice. This volume thus represents an important step in the right direction. In these introductory comments on the papers, I have tried to highlight these themes and to show how many of the works may raise more important concerns.
I recognize that some authors may be unwilling to explore these implications because of the constraints of academic culture rather than any failure of imagination. To question a conventional paradigm, particularly one that has been raised to the status of fundamental rights law, is not a recipe for a sanguine academic career, no matter how obvious some of these implications may be.
Comments on Specific Papers

Beatriz Aguinaga Glaría conducts a comparative analysis of the rules applicable to the processing of data related to criminal convictions and offenses, with a primary focus on Spanish law, and the consequences of unrestricted use of this information in connection with employment decisions in Spain. She observes that despite the claims grounded in legal theory about the need for uniformity of laws pertaining to the processing of personal data, the reality on the ground is that the rules are quite different from country to country when it comes to the use of criminal histories in connection with employee selection and management decisions.

The remarkable paper by Georgios Georgiadis and Geert Poels summarizes two of three very important studies of the potential use of Data Protection Impact Assessments (DPIAs) to evaluate the risks posed by big data analytics to the interests of data subjects. The specific intention of the studies is to identify potential recommendations on how the European Data Protection Board might provide guidance to industry in using DPIAs to evaluate the risks to data subjects posed by big data analytics, under the provisions of Article 35 of the GDPR. Under GDPR Article 35, DPIAs are required when a new project is likely to involve “a high risk” to personal data,1 but the language in Article 35 about the meaning of “high risk” is far from clear.
Some guidance has been provided, highlighting relatively obvious “high risk” activities such as tracking people’s location or behavior; systematically monitoring publicly accessible locations on a large scale; processing personal data related to racial or ethnic origins, political opinions, religious or philosophical beliefs, or trade union membership; the processing of genetic data; biometric data for the purpose of uniquely identifying a natural person; data concerning health or data concerning a natural person’s sex life or sexual orientation; processing of children’s data; or processing data that could have legal effects or result in harm to individuals if the information were leaked. However, as the authors correctly point out, outside of these obvious categories, it is unclear to what extent industry in Europe currently makes any meaningful use of DPIAs.
1 Article 35 DPIAs are required “[w]here a type of processing in particular using new technologies, and taking into account the nature, scope, context and purposes of the processing, is likely to result in a high risk to the rights and freedoms of natural persons, the controller shall, prior to the processing, carry out an assessment of the impact of the envisaged processing operations on the protection of personal data.”
The studies discussed in this paper provide great insight into the best ways to go about assessing risks to data subjects, and it is to be hoped that their work will prove to be of great value to regulators in providing better guidance identifying types of data processing which should fall into the requirements of Article 35. The recognition of the variety of risks associated with different forms of big data analytics implies different kinds of concerns by data subjects and other stakeholders. The study wisely recommends consultation by different stakeholder groups, for different kinds of uses of big data analytics, something that probably should be a best practice in connection with any assessment of privacy risks. The implications of this important insight, however, raise questions about the extent to which the GDPR, which to date has been interpreted with a centralized regulatory framework in mind, may need to be implemented with greater sensitivity to those aspects of the GDPR which envisaged its application in a more local and sectoral way to provide greater benefits to the protection of personal data.

It also may be helpful to contrast the use of DPIAs under Article 35 of the GDPR with the use of similar legal frameworks in other jurisdictions. In the United States, under the provisions of Section 208 of the E-Government Act of 2002, federal government agencies are required to prepare Privacy Impact Assessments (PIAs) before operating any information system affecting personally identifiable information. The purpose of the US requirement is not simply to ensure better protection of personal data but also to facilitate better agency decision-making by taking risks to personal data into account. Because the focus of the PIA requirement in the E-Government Act is on assessment of risks rather than enforcement of rights, the scope of PIAs under the E-Government Act is much broader than those covered by DPIAs under Article 35.
In connection with the assessment of risks, the insights generated by DPIAs may be of greater value than in connection with the enforcement of rights. In such risk-based jurisdictions, the insights of this paper will certainly be studied with interest and appreciation, particularly those that highlight the critical importance of stakeholder engagement in identifying and assessing risks and benefits, and how these invariably change depending on the sector and context one is addressing.

Elias Gruenewald, Johannes Halkenti, Nicola Leschke, Johanna Washington, Frank Pallas, and Cristina Paupini have developed a remarkable and versatile machine-readable interface that makes transparent the lifecycle of the personal information of an organization. They note that without the ability to understand how an organization processes personal data in online services, that organization cannot make informed decisions regarding how best to collect, maintain, use, and ultimately share the data. They propose two implementations of their idea: (1) a GDPR-aligned privacy dashboard and (2) a chatbot and virtual voice assistant enabled by conversational AI. They also suggest that use of this tool may assist data controllers in meeting their respective regulatory obligations. I suspect that the extreme modesty of their proposed uses of the tool they have designed is due to the limits the classical model of privacy has imposed on their imagination. The authors of this paper may not have initially fully appreciated the significance of their remarkable work.
As noted above, under the US legal framework, the purpose of conducting a privacy impact assessment is not to check a box on a mere regulatory requirement. It is to help agency decision-makers make better decisions by making them more aware of both the risks and benefits of alternative proposed uses of personal data. This goal has been elusive, and agencies have defaulted to mere regulatory compliance, but the goal is still one that occasionally can be achieved, even when PIAs are performed after the system is built, for the nature of continuously improving systems means that the analysis will be useful in future iterations of the system, as well as in responding to breaches. The tool discussed in this article thus represents one of the most exciting developments in years and is likely to be widely adopted by government agencies as well as businesses, because it enables them to achieve that elusive but important goal of better decision-making based on a better understanding of an institution’s uses of data.

Jan Czarnocki, Eyup Kun, and Flavia Giglio address how the EU’s laws regarding cross-border transfers of personal data have been applied with respect to China and India, examining the laws of each country with respect to the requirements of necessity and proportionality, particularly in connection with the laws of the two countries regarding government access to personal data held by the private sector for purposes of national security. After their review of the laws of China and India, the authors conclude, unremarkably, that the laws of neither country appear able to satisfy the requirements established by the CJEU in Schrems II for transfers of personal data. This is an important conclusion, for it highlights an important distinction.
In the EU-US Data Privacy Framework, the United States was able to accommodate the concerns of the CJEU about its legal framework for the protection of privacy in connection with government access to personal information for purposes of national security and law enforcement, by making substantial changes in the US legal framework to bring it into line with the requirements of the CJEU. It should be noted, however, that the United States was able to accommodate the CJEU because the economic benefits of making these legal adjustments were substantial, and the preexisting legal structures and social values in the United States already so closely resembled those of Europe due to our shared history and culture. When countries whose legal institutions are rooted in cultural traditions that differ far more from those of Europe approach the challenge of accommodating the concerns of the CJEU, the costs and benefits of such an undertaking are likely to look very different. Countries such as India and China that have become global superpowers in their own right cannot reasonably be expected to subordinate their own national institutions and national economic interests to meet the demands of a remote European institution, when the legal changes demanded by the CJEU would be far more significant, and increasingly appear presumptuous if not imperious. If there can be no reasonable expectation that these countries are likely to change their laws on the books, much less modify their culture so that laws on the books reflect changes in behavior on the ground, some very uncomfortable questions arise, questions that the authors appear to have been reluctant to discuss. Do the authors believe that transfers of personal data to China and India should therefore cease?
This is an awkward conclusion, since it would effectively end trade between Europe and the two largest countries in the world, whose combined populations constitute just shy of half the world’s population. Even if this were possible, since China’s GDP alone exceeds that of Europe and both India and China have very strong economic and cultural connections with all other countries in the world, cutting off trade with China or India would be unlikely to end transfers of personal data from Europe to them, unless Europe were to cut itself off from all other countries that conduct trade with China and India. Flows of personal data are inextricably intertwined because human relationships are inextricably intertwined, and these relationships increasingly span the artificial borders of nation-states. To what extent then should we reexamine the assumption underlying much of the conventional wisdom of data protection theory, that data is a thing to which quasi-property rights of control can be meaningfully attached in the first place?

In their very important paper, Malte Hansen, Nils Gruschka, and Meiko Jensen discuss what is perhaps the central problem under the General Data Protection Regulation (GDPR). In theory, the GDPR empowers European individuals to exercise their rights under the fair information practices; in reality, data subjects enjoy these rights only with respect to larger organizations because only such larger organizations maintain records management systems, while small- and medium-sized enterprises, which constitute 90% of any developed economy, generally do not, and lack any business need to implement them. The authors attempt to address this problem by proposing the establishment of an industry of intermediaries who would act as a bridge between the data subjects and data controllers, by providing, as a service on behalf of the SMEs, implementation of the fair information practices.
The solution that the authors propose, while ingenious, requires further economic analysis to determine if it would be economical for SMEs to utilize it. Assuming it were economical for SMEs to use such a service, it would then raise the question whether such a solution would consist of jumping from the frying pan into the fire: by solving the problem of SME compliance with the FIPPs, it would potentially create a massive conglomeration of personal data that would substantially increase the risks to data subjects. The importance of the paper is not in its proposed solution to the compliance problem, but in the logical conclusion that it fails to draw. If there is no meaningful way to improve the compliance rates among SMEs without creating even greater privacy risks, should the laws themselves be reexamined, or at least the ways in which they have been implemented as to SMEs? Shouldn’t greater effort be made to find less expensive ways for SMEs to comply with the law? The GDPR made substantial contributions to the international data protection conversation, but one of its weaknesses is its very high compliance costs for SMEs, because it was largely designed for large businesses and government agencies, not the community of SMEs. Other model laws have since been developed which provide substantial data protection for SMEs at a far lower cost, for example the Model Uniform Personal Data Protection Act promulgated by the Uniform Law Commission in the United States. Insights from these alternative models might lead to higher overall
rates of compliance with general data protection legal frameworks and improved rates of social trust.

Nadia Giusti discusses the Charter of Trust dedicated to Cybersecurity, presented in 2018 at the Munich Security Conference in Bavaria, and its ten principles, designed to help facilitate the protection of individuals’ personal data, prevent cyberattacks, and create a reliable foundation on which confidence in a networked, digital world can take root and grow. The objective of this work is to highlight the benefits of applying the principles of the Charter of Trust to avoid security threats and to investigate whether the application of the Charter constitutes an advantage also in the field of data protection. The focus on cybersecurity is an important reminder that data security is not only a critical aspect of data protection but one of the most important. Further, the Charter of Trust shows how much further ahead cybersecurity is compared with the other aspects of data protection. Cybersecurity generally has focused on standards, certifications, and training, on a continuous improvement model, rather than on the abstract concepts characteristic of the other areas of data protection, such as individual rights, judicial redress, and so forth.

Janis Wong’s paper studies the adoption of technologies and application of data-driven practices in the real estate sector. She notes that even where the property sector has managed to adhere to data protection regulations, nothing in these regulations addresses other risks associated with the vast amounts of data that are being collected, processed, and shared. While ethical considerations appear to have been incorporated into best practices for the industry as part of the urban planning process, the author calls for a centralized regulation making ethics best practices into a formal legal requirement. The paper would benefit from some empirical analysis of compliance rates with existing regulations.
If current rates of compliance with existing legal mandates are low, imposing more legal mandates may not result in significantly higher levels of protection for consumers.
Conclusion

These papers highlight the depth and breadth of intellectual ferment surrounding the GDPR taking place in Europe today. There have been two great European attempts to implement the ideals of data protection, first in the Data Protection Directive, and second in the GDPR. The GDPR was intended to rectify the low compliance rates of the Directive by establishing a single set of rules as well as by dramatically increasing the penalties for noncompliance. What impact has the GDPR had on the United States? How has the United States improved as a result of engagement with Europe? How can Europe improve implementation of its own data protection regime by engaging more deeply with the United States and with even more diverse nations such as India and China? Here’s one example: While the abstract, universalistic expression of data protection law embodied in the GDPR may have benefits within
the EU context, it also comes with costs that may be easier for observers situated outside the EU to recognize. For example, while the GDPR, as an instrument of “command and control,” top-down regulation, may have resulted in higher rates of compliance on the part of large enterprises, there is little indication that compliance rates have improved with respect to SMEs. The enormous significance of SMEs within any developed economy, and the lack of compliance with the law within this sector, raise an important question as to whether Europe should reconsider its policy of promoting its legal model to the rest of the world, at least until these fundamental problems are addressed. Laws with low compliance rates inevitably lead to breakdowns in social trust; after all, nothing destroys trust faster than the failure to do what the law requires. Addressing the problem of noncompliance on the part of SMEs may perhaps lead to re-examination of some of the premises of data protection theory, at least to the extent it has spawned a conventional wisdom that only a “one-size-fits-all” theory of data protection can protect a fundamental right. Perhaps the time has come for EU institutions to move beyond the attempt to impose their abstract, universal, “one-size-fits-all” approach to data protection law and engage in a deeper dialogue with their trading partners around the world. European leaders as well as European academics might find that dialogue based on mutual respect for diverse legal traditions, whether those with similar historical roots such as the United States or with radically different historical roots such as India and China, could help them address many of the challenges identified by the authors of the papers in this volume.

Washington, DC, USA
June 2023
Peter Winn
Preface
With the European General Data Protection Regulation (GDPR) entering into force in 2016, a chain reaction has been triggered: Many other jurisdictions are adopting similar regulations or at least accelerating their legislative process. Simultaneously, digitalization was already changing the world at high speed, but was further propelled by the emergence of the COVID-19 pandemic. These developments have impacted all economic and societal sectors. Consequently, almost any human endeavor leaves digital footprints, generating an exponentially growing volume of personal data. In this context, many questions and challenges emerge: How to support convergence of data protection requirements among and across distinct jurisdictions? How to apply a data protection by design approach to emerging technologies? How to integrate certification and processes to demonstrate compliance into organizational structures?

The present proceedings is a collection of original work presented in the scientific track at the Privacy Symposium in Venice, April 17–21, 2023. The Symposium promotes international dialogue, cooperation, and knowledge sharing on data protection, innovative technologies, and compliance. Legal and technology experts together with researchers, professionals, and data protection authorities meet and share their knowledge with decision-makers and prescribers, including supervisory authorities, regulators, law firms, DPO associations, and the DPOs and C-level executives of large companies. In order to reach out to the community in early 2022, we issued a call for papers stating: We welcome multidisciplinary contributions bringing together legal, technical, and societal expertise, including theoretical, analytical, empirical, and case studies.
We particularly encourage submissions that fall under one of the following thematic areas:

Law and Data Protection
• Multidisciplinary approaches, arbitration, and balance in data protection: arbitration in data protection law and compliance, multistakeholder and multidisciplinary approaches in data protection and compliance.
• International law and comparative law in data protection and compliance: cross-border data transfer approaches and solutions, international evolution of data protection regulations, international evolution of compliance regulations and norms, comparative law analysis in data protection domain, comparative law analysis in compliance domain, international law development in data protection domain, international law development in compliance, interaction between regulations, standards, and soft law in data protection.
• Data subject rights: right to be informed, right to access and rectify personal data, right to restrict or object to the processing of personal data, right to limit access, processing and retention of their personal data, right to lodge a complaint with a supervisory authority, right not to be subject to a decision based solely on automated processing, including profiling, right to withdraw consent at any time, right to data portability, delegation and representation of data subjects’ rights, effective processes, implementations and monitoring of data, automated mechanisms and tools to support data subjects’ rights and consent management.

Technology and Compliance
• Emerging technologies compliance with data protection regulation: emerging technologies and data protection compliance, data protection compliant technologies, artificial intelligence, compliance and data protection, blockchain and distributed ledger technology, 5G and beyond, data protection by design.
• Data protection compliance in Internet of Things, edge, and cloud computing: enabling data protection compliance in networking technologies, impact of extreme edge on privacy, network virtualization, seamless compliance from edge to core in multi-tenant environments.
• Technology for compliance and data protection: privacy enhancing technologies (PET), anonymization and pseudonymization, privacy by default, innovative legal tech and compliance technology, compliance standardization and interoperability, data sovereignty.

Cybersecurity and Data Protection
• Cybersecurity and data protection measures: technical and organizational measures for data protection, making cybersecurity, privacy and data protection by design and by default, authentication, digital identities, cryptography, network inspection, GDPR compliance, evaluation of the state-of-the-art technology compliance with data protection, cybercrime and data protection, identity theft and identity usurpation.

Data Protection in Practice
• Audit and certification: audit and certification methodologies, innovative solutions and services for audit and certification.
• Data protection best practices across verticals: health and medical technology, mobility, connected vehicles, smart cities, supply chains, telecommunication.
• Data protection economics: market analysis, economic models and their impact on data protection, compliance and financial valuation, compliance by technology, economic impact of international convergence in data protection, impact of data protection regulations, unintended harms of cybersecurity measures.

Data Protection in Finance and ESG
• Data Protection and Finance: impact of data regulations on financial valuation and markets, privacy in financial and banking sectors, impact of data availability and compliance on credit risk assessment, organizational implications of data protection and security, institutional adoption of data protection mechanisms, cross-border data transfer compliance, cryptocurrencies and privacy, privacy and law enforcement with financial transactions, digital financial instruments in the context of data protection measures, etc.
• Data Protection and Environmental, Social and Governance (ESG): role of privacy in ESG, implications on data governance and reporting, changing attitudes of societal data sharing, corporate culture and data protection, privacy, and sustainable development, etc.

Our call was answered by 28 researchers or groups of researchers. Our program committee and additional referees carefully reviewed these contributions and provided feedback to the authors. We selected eight papers for presentation at the conference resulting in a 28% acceptance rate. We further asked the authors of these contributions to consider our feedback and compile a final version of their work. The present book contains these reviewed and revised manuscripts.

Lastly, we would like to express our gratitude to the program committee members and referees for their voluntary contributions and to the authors and co-authors for their patience during the review process. We are looking forward to the exchange of ideas in Venice.

Stefan Schiffner, Hamburg, Germany
Sébastien Ziegler, Geneva, Switzerland
Meiko Jensen, Karlstad, Sweden
April 2023
Organization
The Privacy Symposium conference has been established to support international dialogue, cooperation, and knowledge sharing on data protection. The 2023 edition, like the 2022 edition, has been hosted by the University Ca’Foscari of Venice and was organized in collaboration with several institutions, including the Data Protection Unit of the Council of Europe, European Centre for Certification and Privacy (ECCP), European Law Students’ Association (ELSA), European Cyber Security Organization (ECSO), European Federation of Data Protection Officers (EFDPO), European Centre on Privacy and Cybersecurity (ECPC), Italian Institute for Privacy (IIP), Italian Association for Cyber Security (CLUSIT), IoT Forum, IoT Lab, Mandat International, and several European research projects. The overall coordination was provided by the foundation Mandat International. The call for papers of this second edition was focused on Data Protection Law International Convergence and Compliance with Innovative Technologies (DPLICIT).
Executive Committee
General Chair: Sébastien Ziegler (IoT Forum, ECCP, Switzerland)
Program Chairs: Stefan Schiffner (BHH, Germany) and Meiko Jensen (Karlstad University, Sweden)
Steering Committee
Constantinos Marios Angelopoulos (Bournemouth University)
Paolo Balboni (European Centre for Privacy and Cybersecurity)
Alessandro Bernes (University Ca’Foscari)
Natalie Bertels (KUL)
Luca Bolognini (Italian Institute for Privacy, University of Maastricht)
Andrew Charlesworth (University of Bristol)
Afonso Ferreira (CNRS-IRIT)
Romeo Kadir (University of Utrecht)
Zsófia Kräussl (University of Luxembourg)
Latif Ladid (University of Luxembourg)
Kai Rannenberg (Goethe University Frankfurt)
Stefan Schiffner (BHH University of Applied Sciences, Hamburg)
Antonio Skarmeta (University of Murcia)
Geert Somers (Timelex)
Steve Taylor (University of Southampton)
Sébastien Ziegler (European Centre for Certification and Privacy, IoT Forum)
Program Committee Florian Adamsky (Hof University of Applied Sciences) Bettina Berendt (TU Berlin) Alessandro Bernes (Ca’ Foscari University of Venice) Wilhelmina Maria Botes (SnT, University of Luxembourg) Athena Bourka (ENISA) Sébastien Canard (Orange Labs) Roberto Cascella (ECSO Secretariat) Afonso Ferreira (CNRS, IRIT) Michael Friedewald (Fraunhofer ISI) Meiko Jensen (Karlstad University) Sokratis Katsikas (Norwegian University of Science and Technology) Stephan Krenn (AIT Austrian Institute of Technology GmbH) Christiane Kuhn (Karlsruhe Institute of Technology) Gabriele Lenzini (SnT, University of Luxembourg) Elwira Macierzynska (Kozminski University) Joachim Meyer (Tel Aviv University) Sebastian Pape (Goethe University Frankfurt) Davy Preuveneers (KU Leuven) Delphine Reinhardt (University of Göttingen) Arnold Roosendaal (Privacy Company) Arianna Rossi (SnT, University of Luxembourg) Steve Taylor (University of Southampton) María Cristina Timón López (Universidad de Murcia) Julián Valero-Torrijos (Universidad de Murcia)
Additional Referees Nils Gruschka (Oslo University) Malte Hansen (Oslo University) Leonardo Martucci (Karlstad University)
Contents

1 A Methodology for the Assessment of the Impact of Data Protection Risks in the Context of Big Data Analytics: A Delphi Study
  Georgios Georgiadis and Geert Poels
2 Introducing the Concept of Data Subject Rights as a Service under the GDPR
  Malte Hansen, Nils Gruschka, and Meiko Jensen
3 Technical and Legal Aspects Relating to the (Re)Use of Health Data When Repurposing Machine Learning Models in the EU
  Soumia Zohra El Mestari, Fatma Sümeyra Doğan, and Wilhelmina Maria Botes
4 “We Are the Makers of Manners”: A Grounded Approach to Data Ethics for the Built Environment
  Janis Wong, Yusra Ahmad, and Sue Chadwick
5 How the Charter of Trust Can Support the Data Protection
  Nadia Giusti
6 Operationalizing the European Essential Guarantees in Cross-Border Personal Data Transfers: The Case Studies of China and India
  Jan Czarnocki, Eyup Kun, and Flavia Giglio
7 Enabling Versatile Privacy Interfaces Using Machine-Readable Transparency Information
  Elias Grünewald, Johannes M. Halkenhäußer, Nicola Leschke, Johanna Washington, Cristina Paupini, and Frank Pallas
8 Processing of Data Relating to Criminal Convictions and Offenses in the Context of Labor Relations in Spain
  Beatriz Aguinaga Glaría
Index
Chapter 1
A Methodology for the Assessment of the Impact of Data Protection Risks in the Context of Big Data Analytics: A Delphi Study

Georgios Georgiadis and Geert Poels
G. Georgiadis
Department Business Informatics and Operations Management, Ghent University, Ghent, Belgium

G. Poels
Department Business Informatics and Operations Management, Ghent University, Ghent, Belgium
CVAMO Lab, Flanders Make, Faculty of Economics and Business Administration, Ghent University, Ghent, Belgium

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Schiffner et al. (eds.), Privacy Symposium 2023, https://doi.org/10.1007/978-3-031-44939-0_1

1.1 Introduction

Big Data analytics (BDA) harnesses the power of Big Data, which due to their volume, variety, veracity, and velocity cannot be analysed by traditional analytical techniques and their associated tools. Essentially, BDA is a form of advanced analytics comprising computer algorithms, techniques, and supporting technologies capable of transforming raw and seemingly unconnected data into useful information, thus improving organisations’ operational efficiency and decision-making. While BDA has created significant opportunities for businesses and, for instance, also played a significant role in controlling recent large-scale public health incidents [1, 2], it has inadvertently increased the risks of data misuse. For example, data that are used on their own are not easily identifiable; however, when combined with other data and due to the hidden patterns that are discovered through BDA, sensitive personal information can be revealed [3].

In order to identify and mitigate risks arising out of the processing of personal data, the General Data Protection Regulation (GDPR) imposes the data protection impact assessment (DPIA). The DPIA is a structured means of enabling users to take a proactive rather than reactive approach to data protection [4]. By performing
the DPIA, which is more of a continuous endeavour than a one-time process, organisations can take concrete steps to meet their legal obligations, with the objective of incorporating or ‘engineering’ [5, p. 342] the data protection by design principle under GDPR Recital 78 in the early stages of their product and service development processes. Organisations that fail to prevent serious data misuse may face significant legal repercussions or be subject to financial penalties of up to €20 million or 4% of their total worldwide annual turnover, whichever is higher [6].

Although DPIA as a concept only became widely known with the GDPR, it has actually been around since the mid-1990s in the form of privacy impact assessments (PIAs) [7]. Several studies have been conducted on the use of PIAs since then, but little attention has been paid to whether these assessments require a different approach when used in the context of BDA.

In this paper, we briefly discuss what is at stake in conducting privacy or data protection impact assessments for BDA processing operations based on the findings of our systematic literature review (SLR) in [8]. We also describe the steps we are currently taking to further investigate and close the knowledge gap regarding how to conduct such impact assessments for BDA through our ongoing Delphi study with both BDA and privacy/data protection experts [9]. This Delphi study aims to inform the development of an improved variant of DPIAs in connection with the use of BDA, considering specific BDA-related risks to protecting individuals’ privacy and their personal data. As is common in the existing literature and for convenience, we use the term ‘methodology’ in this paper to refer to all types and forms of methodological support for PIAs. We also use the terms PIA and DPIA interchangeably, although we are aware of their significant differences, which we discuss extensively in [8]. The goal of this paper is to report on the Delphi study’s experts’ opinions about our SLR findings.
Do experts on BDA and privacy/data protection agree on the BDA-specific risks that we identified in the literature? How important do they believe these issues are for DPIAs as required by the GDPR? Once there is consensus on the relevance and importance for DPIAs of these BDA-specific issues with privacy and personal data protection, or of a selection of them, we can start investigating how to account for these risks and potential harms through the use of the DPIA.

This paper is structured as follows. In Sect. 1.2, we present the background of our research and our literature findings in the area of BDA and privacy impact assessment. In Sect. 1.3, we present our Delphi study strategy and describe the results of the data analysis from each round. Finally, in Sect. 1.4, we conclude the paper and describe the next steps in our Delphi study research.
1.2 Background

During our SLR and based on a sample of 159 articles, 9 BDA-specific risks and potential harms to personal data protection and individuals’ privacy were identified, leading us to the definition of so-called ‘Privacy Touch Points’ (PTPs) to avoid the frequent use of the term ‘risk’ in the paper (see Table 1.1).

Table 1.1 List of PTPs [8]

PTP  Description
1    Unclear data controllership
2    Identification of individuals from derived data
3    Discrimination issues affecting moral (e.g. stigmatisation) or material (e.g. reducing the chance of finding a job) personal matters
4    Lack of transparency
5    Increased scope leading to further processing incompatible with the initial purpose
6    Improper treatment of different types of privacy risks and data breaches
7    Limited range of stakeholder involvement
8    Practical issues due to procedural vagueness
9    Treatment of indirect privacy harms

We also investigated the extent to which these PTPs are currently covered by established PIA methodologies. For ten PIA methodologies, we assessed the coverage of the PTPs using the following three classes:

• PTP is documented or addressed in core process or supporting material.
• PTP is referred to but not sufficiently addressed.
• PTP is neither documented nor implied.

As shown in Fig. 1.1, our analysis in [8] indicates that none of the investigated PIA methodologies adequately covers all PTPs. In addition, we discovered practical issues in those methodologies when assessing the impact of large-scale operations with Big Data. After examining these methodologies closely, we concluded that a ‘best-of-breed’ approach, which would combine selected parts of different methodologies to allow for broader coverage of the PTPs, was neither feasible nor sustainable. A ‘pick-and-mix’ of different methodologies allows using the features that are best suited to the problem at hand but can also easily lead to a kind of ‘Franken-model’ unless it is first ensured that the chosen methodologies can work together harmoniously. Our approach, therefore, was to leave this to our Delphi experts, giving them the opportunity to express their opinions and even suggest additional methodologies for further research in the first round.

This led us to the decision to develop a new DPIA methodology that is adapted to the BDA context. As a first step, we started a Delphi study involving BDA and privacy/data protection experts to make an informed decision on which PTPs to include and prioritise and to obtain expert advice on how to address them in this type of enhanced methodology.
A Delphi study is a decision-making and consensus-building technique [10, 11], which involves an iterative process consisting of several rounds that aim to achieve group consensus on expert opinions. Our Delphi study is still ongoing, but after two rounds, the consensus in the expert opinions on the relevance and importance for the DPIA of the PTPs that we identified through the literature review is becoming clear. We consider these results as an expert validation of our literature-based findings, which provide us with a solid empirical basis for developing the envisioned DPIA methodology.

Fig. 1.1 Extent of coverage of PTPs by PIA methodologies [8]. The figure cross-tabulates the ten methodologies (Hong Kong Privacy Commissioner for Personal Data; IRL Data Protection Commission; ISO/IEC 29134:2017; La Commission Nationale de l’Informatique et des Libertés; LINDDUN; NZ Privacy Commissioner; Office of the Australian Information Commissioner; Office of the Privacy Commissioner of Canada; UK’s Information Commissioner’s Office; US Homeland Security) against the PTPs, which are grouped as more privacy risk related (PTP (4), PTP (6), PTP (8), PTP (9)) and more data protection risk related (PTP (1), PTP (2), PTP (3), PTP (5), PTP (7)). Each cell is marked with one of the three coverage classes listed above.
1.3 Overview of the Delphi Study

Our Delphi study consists of three phases, which are schematically depicted in Fig. 1.2. In the preparatory phase, we defined the minimum qualifications of candidate expert participants and verified them through a screening survey. Since we wanted to cover different areas of knowledge in the fields of law, security, and computer science, we paid close attention to the size, representativeness, and heterogeneity of our expert panel to ensure that the entire spectrum of expertise is covered. In doing so, we followed the methodological guidance provided in [12, 13], which suggests a heterogeneous composition of the expert panel while pointing out one of the weaknesses of Delphi, namely, the lack of a precise definition of the term ‘expert’. As far as the heterogeneous composition of the panel is concerned, we invited more than 30 experts from academia, the public sector, and industry to participate in the expert panel, with experience in BDA, artificial intelligence, security, law, and privacy. Eventually, 18 experts agreed to participate in the Delphi study. As shown in Fig. 1.3, most of our participants have expertise in data protection and privacy, while the panel also covers other areas important for the study such as BDA, security, and artificial intelligence.

Fig. 1.2 Delphi phases. Flowchart with three phases. Preparatory phase: identification of potential experts; expert selection on predefined criteria; preparation and transmission of the initial questionnaire to the expert panel; invitation sending. Intermediate phase: data analysis; consensus check; if no consensus, analysis feedback, a new questionnaire, and launch of round n+1; report of results to the experts. Final phase: final round; final data analysis; final consensus statement; report of results; journal paper.

Fig. 1.3 Domains of expertise. Bar chart of the share of experts per domain (Artificial Intelligence, Big Data Analytics, Data Science, Law, Privacy and Data Protection, Security, Other); the plotted shares range from 21% to 71%.
Next, in the intermediate phase, we employed successive rounds of online surveys with questions that could be reused in different rounds until a consensus is reached. The two rounds conducted so far focused mainly on validating and possibly extending the PTPs identified in our literature review [8] and seeking the opinion of the expert panel on key components of a DPIA framework. The third round will mainly focus on improvements for the DPIA in terms of the coverage of the PTPs for which we have reached a consensus on their relevance and importance for DPIA in environments employing BDA. In the final phase, we will conclude our research by obtaining the final consensus statements and the overall statistical and qualitative analysis of all survey results. This will lead to the preparation of additional reports with the results for our panellists and the submission of a scientific paper.
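The iterative structure of the intermediate phase can be illustrated with a minimal Python sketch. It is not the authors' tooling: the function names `run_delphi` and `replay_round` are hypothetical, and the rule that only items with strong consensus are dropped from later rounds is an assumption made for illustration. The stand-in survey simply replays the strong positive consensus on the level of agreement reported in Tables 1.4 and 1.5.

```python
# Hypothetical sketch of the intermediate Delphi phase: rounds repeat until
# every item is settled or the round budget is used up.

def run_delphi(items, run_round, max_rounds=3):
    """Return {item: round in which it was settled, or None if still open}.

    run_round(round_no, open_items) is a caller-supplied function that surveys
    the panel and returns the set of items that reached consensus.
    """
    settled_in = {item: None for item in items}
    for round_no in range(1, max_rounds + 1):
        open_items = [i for i, r in settled_in.items() if r is None]
        if not open_items:  # consensus on everything: stop early
            break
        for item in run_round(round_no, open_items):
            settled_in[item] = round_no
    return settled_in


# Stand-in for a real survey (assumption): replays the strong positive
# consensus on the level of agreement from Tables 1.4 and 1.5.
STRONG_AGREEMENT = {1: {2, 3}, 2: {4, 5}}


def replay_round(round_no, open_items):
    return STRONG_AGREEMENT.get(round_no, set()) & set(open_items)


result = run_delphi(range(1, 10), replay_round, max_rounds=2)
still_open = sorted(ptp for ptp, rnd in result.items() if rnd is None)
print(still_open)
```

Passing the survey step in as a function keeps the loop independent of how responses are collected and scored, so a different stopping or carry-over rule can be tried without touching the loop itself.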
Throughout our Delphi research project, we made a concerted effort to stay in touch with our participants and keep them informed about our progress. One of the ways we did this was by making relevant research papers and analysis results available to them. By sharing our findings in a transparent and accessible manner, we aimed to keep our participants engaged and informed throughout the process. In addition to providing access to our research papers, we also posted regular messages on our website, https://entarchitecture.eu/. These messages included updates on our progress, reminders about upcoming deadlines, and invitations to participate in online discussions. We actively encouraged our participants to share their own thoughts and feedback with us and with other experts on our website. By facilitating open communication and exchange of ideas, we hoped to foster a sense of collaboration and shared ownership over the research project. Through ongoing communication and collaboration, we were able to build a strong and supportive community of experts who were committed to advancing our shared research goals.
1.3.1 Delphi First Round

Round 1 contained questions with response options (i.e. scores) on a 5-point Likert scale. These questions aimed to explore the level of agreement and the level of importance of the PTPs. Concerning the level of agreement, we assessed whether a mixed panel of BDA and privacy/data protection experts were able to recognise the relevant BDA-specific potential issues and PTPs that we identified through our literature review on privacy and personal data protection. For the level of importance, we assessed whether there was agreement amongst the experts to cover these PTPs in the DPIA. We distinguished between the level of agreement and level of importance as, in principle, a PTP found to be relevant might not be considered important or feasible enough to be included as a focus area in the DPIA.

Along with Likert-type questions, we included open-ended questions so our experts could motivate and explain their responses. The answers to the open-ended questions allowed us to make sense of the scores provided by the experts. Providing an interpretation of the scores to the participants is essential in the consensus-seeking process, such that participants can adjust their opinion in the next Delphi study round. This additional feedback allowed us to enrich our findings by identifying additional issues for further investigation and possible controversies. The online survey also contained additional questions to collect basic demographic information and the consent of the experts to participate in the study. Eventually, 14 experts fully completed the first-round questionnaire.

To find out whether there is consensus in the expert opinions, consensus criteria were used. Inspired by the research of [14], we defined four consensus criteria (see Table 1.2). Note that consensus can be positive or negative. If positive, there is a consensus on agreeing that a PTP is relevant or important. If negative, there is a consensus on disagreeing that a PTP is relevant or important. Generally, we consider consensus strong if all criteria are met and almost strong if three out of four criteria are met (see Table 1.3). In some cases, we also consider the reasoning that the experts provided in response to the open-ended questions to identify possible contradictions in the scores.

Table 1.2 Consensus criteria

Condition  Definition
#1         35% of the experts strongly agree (i.e. score 5) or strongly disagree (i.e. score 1)
#2         70% of the experts agree (i.e. score 4 or 5) or disagree (i.e. score 1 or 2)
#3         The interquartile range (IQR) (i.e. the difference between the highest and lowest of the scores of the 50% of the experts that are in the middle when scores are ranked) is less than or equal to 1.25
#4         No expert strongly disagrees if conditions (1) and (2) based on the frequencies indicate a tendency towards positive consensus (and vice versa)

Table 1.3 Consensus types with conditions

Consensus type           Condition
Strong consensus         All four conditions are met: #1–#4
Almost strong consensus  At least three criteria are met
No consensus             Less than three criteria are met

Based on our analysis of the responses, we have reached a strong positive consensus on the level of agreement for two PTPs and an almost strong positive consensus for three other PTPs. In terms of the level of importance, we have reached an almost strong positive consensus for two PTPs. None of the PTPs had a strong or almost strong negative consensus on the level of agreement and level of importance. These results are presented in Table 1.4.

Table 1.4 Delphi results from round 1 (level of consensus per measurement)

Agreement
  Strong positive:         PTP (2), PTP (3)
  Almost strong positive:  PTP (1), PTP (4), PTP (7)
  No consensus:            PTP (5), PTP (6), PTP (8), PTP (9)
Importance
  Strong positive:         none
  Almost strong positive:  PTP (1), PTP (3)
  No consensus:            PTP (2), PTP (4), PTP (5), PTP (6), PTP (7), PTP (8), PTP (9)

Based on the feedback provided through the open-ended questions, the experts agreed with our findings that PTP (1) ‘unclear data controllership’, PTP (2) ‘identification of individuals from derived data’, PTP (3) ‘discrimination issues affecting moral or material personal matters’, PTP (4) ‘lack of transparency’, and PTP (7) ‘limited range of stakeholder involvement’ are indeed BDA-specific risks. Interestingly, for PTP (2) we noted that BDA inference techniques and the risk of algorithmic bias and discrimination are almost inevitable, while protection through anonymisation is difficult to ensure. Regarding PTP (1), data controllers need to consider both legal and non-legal aspects in their data processing operations. Some
experts even suggested measures to address this risk, such as the designation of different data controllers in the various steps of the data analysis process. With reference to PTP (4), it was stressed that the lack of transparency in BDA requires more attention as it is somehow related to the level of expertise of BDA or relevant protection techniques, although there is extensive research on fair and explainable artificial intelligence. Referring to PTP (7), involving a wider range of stakeholders in impact assessment is very relevant but very challenging to achieve from an operational perspective and should therefore not be underestimated. Panellists also thought that PTP (1) and PTP (3) were relatively more important to be addressed in the DPIA, but the level of importance reached was not as strong as the level of agreement, because they had not met conditions 1 and 3, respectively. However, the general feedback from the experts was that all PTPs are indeed important and that this sometimes depends on the specific case.
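The consensus criteria of Tables 1.2 and 1.3 can be expressed as a short Python sketch. Several details are assumptions rather than statements from the study: the percentages are read as minimum thresholds ("at least 35%", "at least 70%"), the direction of a possible consensus is taken from whichever of agreement or disagreement is more frequent, and the IQR is computed with the inclusive quartile method of Python's `statistics` module. The function name `classify_consensus` and the example scores are hypothetical.

```python
from statistics import quantiles


def classify_consensus(scores):
    """Classify 5-point Likert scores using the four criteria of Table 1.2."""
    n = len(scores)
    share_5 = sum(s == 5 for s in scores) / n       # strongly agree
    share_1 = sum(s == 1 for s in scores) / n       # strongly disagree
    share_agree = sum(s >= 4 for s in scores) / n
    share_disagree = sum(s <= 2 for s in scores) / n

    # Assumption: the tendency follows the more frequent side.
    positive = share_agree >= share_disagree
    if positive:
        c1 = share_5 >= 0.35            # criterion 1
        c2 = share_agree >= 0.70        # criterion 2
        c4 = share_1 == 0               # criterion 4: nobody strongly disagrees
    else:
        c1 = share_1 >= 0.35
        c2 = share_disagree >= 0.70
        c4 = share_5 == 0               # "and vice versa"

    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    c3 = (q3 - q1) <= 1.25              # criterion 3: IQR threshold

    met = sum([c1, c2, c3, c4])
    direction = "positive" if positive else "negative"
    if met == 4:
        return f"strong {direction}"
    if met == 3:
        return f"almost strong {direction}"
    return "no consensus"


# Hypothetical score vectors, not data from the study.
print(classify_consensus([5, 5, 5, 4, 4, 5, 4, 5, 4, 5, 5, 4, 3, 5]))
print(classify_consensus([4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 3, 3]))
print(classify_consensus([1, 2, 3, 4, 5, 1, 2, 3, 4, 5]))
```

Other quartile conventions can give a slightly different IQR for small panels, so the classification of borderline items may depend on that choice.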
1.3.2 Delphi Second Round

The second Delphi round questionnaire was completed by 12 participants and used the same 5-point Likert scale as in the first round. It aimed to reach a consensus on the remaining PTPs and to ask participants what they thought about the parts of a possible DPIA framework made for use in a BDA setting based on a concept model we built according to the research findings of [15] (see Fig. 1.4). The purpose of this concept model was to provide a visual representation of the different DPIA concepts or components and their relationships.

The framework element in the concept model represents the overall architecture of the impact assessment, which is governed by a policy that describes the applicable condition(s) and principle(s). Dariusz et al. [16] have compiled a comprehensive list of 16 conditions and principles that can be applied to any type of impact assessment. In practice, the framework is accompanied by a methodology informed by the principles, supported by templates, and complemented by various guidelines such as the knowledge base(s), software tools, and guidance documents. Some of the guidelines also explain the purpose and use of the template(s) in the assessment process. An example of a template is the documentation used throughout the DPIA process, which can be made available to the public or produced on request during inspections.

The consensus criteria were the same as in the previous round. Before the opening of the second round, all participants were informed about the results of the first round. This was delivered in the form of a detailed feedback report that included our detailed statistical analysis along with summaries of their feedback on the open-ended questions. The idea behind this was to inform the experts about the choice of other panel members regarding each question but without revealing their identities.

Fig. 1.4 DPIA concept model. Class diagram with the elements DPIA, Policy, Conditions, Principles, Framework, Method, Templates, and Aid (Software, Knowledge Base, Guidelines), connected by the relations describe, govern, shape, accompany, complement, assist, and explain.

Table 1.5 Delphi results from round 2 (level of consensus per measurement)

Agreement
  Strong positive:         PTP (4), PTP (5)
  Almost strong positive:  PTP (1), PTP (7), PTP (9)
  No consensus:            PTP (6), PTP (8)
Importance
  Strong positive:         PTP (1), PTP (2), PTP (3), PTP (5)
  Almost strong positive:  PTP (4), PTP (6), PTP (9)
  No consensus:            PTP (7), PTP (8)
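The DPIA concept model of Fig. 1.4 lends itself to a small data-structure sketch. The following Python dataclasses are one possible reading of the relationships described in the text (a policy describing conditions and principles, a framework accompanied by a method, templates supporting the method, and aids such as guidelines explaining templates). All class, field, and example names are hypothetical and are not taken from the model itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Policy:
    conditions: List[str] = field(default_factory=list)
    principles: List[str] = field(default_factory=list)


@dataclass
class Template:
    name: str


@dataclass
class Aid:
    kind: str  # "guidelines", "knowledge base" or "software"
    name: str
    explains: List[Template] = field(default_factory=list)


@dataclass
class Method:
    name: str
    templates: List[Template] = field(default_factory=list)


@dataclass
class Framework:
    policy: Policy   # the policy governs the framework
    method: Method   # the method accompanies the framework
    aids: List[Aid] = field(default_factory=list)  # aids complement it

    def unexplained_templates(self) -> List[str]:
        """Names of templates that no aid explains."""
        explained = [t for aid in self.aids for t in aid.explains]
        return [t.name for t in self.method.templates if t not in explained]


# Hypothetical example content.
report = Template("DPIA report")
register = Template("Risk register")
framework = Framework(
    policy=Policy(conditions=["processing likely to result in a high risk"],
                  principles=["stakeholder involvement"]),
    method=Method("BDA-aware DPIA", templates=[report, register]),
    aids=[Aid("guidelines", "Guidance on the DPIA report", explains=[report])],
)
print(framework.unexplained_templates())
```

A check such as `unexplained_templates` shows how the model could be used to spot components that lack supporting guidance, which is the kind of gap the experts were asked to prioritise.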
Based on the results of the second round (see Table 1.5), we achieved strong or almost strong positive consensus on the level of agreement for five of the seven PTPs assessed (PTP (2) and PTP (3) had already reached a consensus in round 1), which corresponds to an agreement rate of about 71%. For the level of importance, the rate was better at about 78%, as we achieved strong or almost strong positive consensus for seven (out of a total of nine) PTPs. On the other hand, we could not reach a consensus for two PTPs in each category. PTP (8) refers to the possible lack of practical guidance for assessing privacy or data protection risks when processing Big Data, whereas PTP (6) and PTP (7) refer to the treatment of different types of privacy risks and data breaches and the idea of extending stakeholders’ involvement when conducting impact assessments for BDA.
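The consensus rates for round 2 follow directly from the counts in Table 1.5. A minimal Python sketch of the arithmetic, with the table transcribed into a dictionary (the variable and function names are hypothetical):

```python
# PTP numbers per consensus level, transcribed from Table 1.5.
ROUND_2 = {
    "agreement": {"strong": [4, 5], "almost_strong": [1, 7, 9], "none": [6, 8]},
    "importance": {"strong": [1, 2, 3, 5], "almost_strong": [4, 6, 9], "none": [7, 8]},
}


def consensus_rate(row):
    """Share of assessed PTPs with strong or almost strong consensus."""
    reached = len(row["strong"]) + len(row["almost_strong"])
    assessed = reached + len(row["none"])
    return reached / assessed


for measurement, row in ROUND_2.items():
    print(f"{measurement}: {consensus_rate(row):.1%}")
```

Note that the two rates have different denominators: seven PTPs were assessed for agreement and nine for importance.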
Referring to PTP (4) ‘lack of transparency’ and PTP (5) ‘increased scope leading to further processing incompatible with the initial purpose’, experts stressed that the risk of data opacity and intervenability in BDA is sometimes associated with a lack of knowledge relating to what has been collected and processed, including the quality and complexity of BDA techniques, some of which are treated as a ‘black box’. They also agreed that BDA techniques allow the revealing of a wealth of sensitive information about an individual and noted that many companies do not want to do without this possibility. Concerning PTP (1) ‘unclear data controllership’, they associated the issue of unclear data controllership with the difficulty for controllers to understand their responsibilities which is often the result of a lack of technical expertise and the inner working of the BDA. On this topic, one of the experts also raised the issue of manuals and guidelines issued by national data protection authorities (DPA), which may differ from one member state to another. As for PTP (7) ‘limited range of stakeholder involvement’, except for some very specific cases (e.g. criminal suspects), the perspective of different stakeholders should always be considered, while for PTP (9) ‘treatment of indirect privacy harms’, although ethical and societal harms are very important, the impact assessment should primarily focus on direct harms. When it comes to the level of importance, the panellists expressed similar views. The only oddity was PTP (6) ‘improper treatment of different types of privacy risks and data breaches’, where no consensus was reached on the level of agreement; however, most experts still considered it as an important part of the DPIA. This means that we will have to reassess this PTP in Delphi round 3. Experts stated that the BDA assessments require more detailed scoping, which is why some consulting firms have crafted their own PIA or DPIA methodology. 
On the other hand, experts argued that existing templates are sufficient and attributed the problem to the low level of involvement of some DPAs during the impact assessment due to their lack of expertise. Even if they agree that privacy risks and data breaches often co-exist in BDA, the link between them is not always obvious. Moreover, the extended involvement of additional stakeholders during the assessment is practically difficult to achieve, but including external stakeholders is beneficial, especially for the detection of potential breaches that sometimes cannot be done by internal stakeholders.

We also compared the consensus results from the first and second rounds in terms of the stability of responses (see Figs. 1.5 and 1.6), that is, the degree to which experts’ opinions remain consistent over multiple rounds of feedback and refinement. We noted that the experts’ positive feedback for PTP (4) ‘lack of transparency’ and PTP (5) ‘increased scope leading to further processing incompatible with the initial purpose’ increased significantly for the level of agreement. For almost all PTPs, such an increase could also be observed for the level of importance. On the other hand, PTP (1) ‘unclear data controllership’, PTP (6) ‘improper treatment of different types of privacy risks and data breaches’, and PTP (8) ‘practical issues due to procedural vagueness’ were rated relatively low at the level of agreement, and PTP (8) also at the level of importance. This finding is consistent with the results of our consensus analysis in round 2. It also showed us that experts decided that PTP (8) should not be considered at all in the DPIA for BDA. The same applies to PTP (6), which will be re-evaluated in the Delphi round 3, as explained above.

Fig. 1.5 Stability of responses in the two rounds (level of agreement). Bar chart with one value per PTP for PTP (1) and PTP (4) to PTP (9); the plotted values range from –29.8% to 65.5%.

Fig. 1.6 Stability of responses in the two rounds (level of importance). Bar chart with one value per PTP for PTP (1) to PTP (9); the plotted values range from –15.5% to 79.8%.

When asked which of the components presented in the DPIA concept model depicted in Fig. 1.4 should be changed or improved and with what priority, the experts gave the highest priority to the guidelines, knowledge base, methodology, and templates, while the software tool supporting the DPIA was given the lowest priority.
This part of the survey, which focuses primarily on DPIA improvements, will be the central topic of Delphi round 3.
1.4 Conclusions

BDA is a form of advanced analytics encompassing computer algorithms, techniques, and supporting technologies that enable businesses and organisations to harness the power of Big Data. While BDA has created significant opportunities, it has also unintentionally increased the risks of data misuse. To counter the impact of these risks and harms to individual rights and freedoms, the GDPR imposed the DPIA, which can be seen as another type of PIA but specific to the protection of personal data. Although there are many types of PIA methodologies, little attention has been paid to whether these assessments require a different approach when used in the context of BDA.

Our research objective and contribution is to assist in the development of an improved variant of DPIA for environments employing BDA, considering specific BDA risks. As part of this effort, we have launched a Delphi study to assess and expand on the PTPs identified in our SLR [8] and to come up with a list of possible improvements for DPIAs with the help of a group of experts with diverse backgrounds, all relevant to the areas of our study. Our Delphi study, which consists of three rounds, is currently underway. So far, we have completed the first two rounds and analysed the experts’ responses. The results are quite promising, as we have reached a consensus on most of the PTPs and obtained some insights on practical issues in conducting a DPIA for BDA.

Building on the results of the first two rounds, the third Delphi round will focus on changes and improvements to DPIA methodologies. We may also supplement this round with one-on-one interviews with experts to clarify some of the open points we have come across during our analysis or obtain additional input from them. Our ambitious goal is to draw up a comprehensive list of improvements and submit it to the European Data Protection Board, which is responsible for ensuring the consistent application of the GDPR.
By doing this, we hope to actively contribute to the creation of a DPIA methodology that is more beneficial for initiatives and businesses that rely on BDA technologies as well as for users who are less familiar with privacy and data protection law.
A.1 Appendix Sample of questions from the Delphi 1 survey:
A.1.1 Level of Agreement Below we ask for your opinion on the PTPs (i.e. specific risks or harms related to Big Data analytics for data protection and privacy) that we have identified in our systematic literature review. Please indicate the extent to which you agree with the statements given, drawing on your specific expertise and experience in this area. We would be grateful if you could also justify your choice.
A.1.2 PTP (1) Unclear Data Controllership Please indicate to what extent you agree with the fact that it is often difficult to understand and comply with data protection requirements amongst data controllers when it comes to Big Data analytics.
o Strongly disagree (1)
o Disagree (2)
o Neither agree nor disagree (3)
o Agree (4)
o Strongly agree (5)
Can you give some reasons why you agree or disagree?
A.1.3 Level of Importance Next, we would like to know your opinion on the importance of the privacy touchpoints (PTPs) described earlier so that we can focus on them when developing specific guidance for conducting data protection impact assessments in environments where Big Data analytics are deployed.
Scale: Not at all important (1), Slightly important (2), Moderately important (3), Very important (4), Extremely important (5)
Unclear data controllership: o (1) o (2) o (3) o (4) o (5)
Identification of individuals from derived data: o (1) o (2) o (3) o (4) o (5)
Can you give some reasons for your choices?
A.1.4 DPIA Methodologies Finally, we would like to know your opinion on the following. During our research we came across a number of PIA or DPIA methodologies. Which of them do you think provide sufficient guidance for assessing the previously discussed PTPs in the context of Big Data analytics? Please only give an opinion on the methodologies that you are familiar with.
Scale: Not at all important (1), Slightly important (2), Moderately important (3), Very important (4), Extremely important (5), Not familiar (6)
Assessment Template for Smart Grid and Smart Metering systems: o (1) o (2) o (3) o (4) o (5) o (6)
ISO/IEC 29134:2017: o (1) o (2) o (3) o (4) o (5) o (6)
Can you give some reasons for your choices?
References
1. Q. Jia, Y. Guo, G. Wang, and S. J. Barnes, ‘Big Data Analytics in the Fight against Major Public Health Incidents (Including COVID-19): A Conceptual Framework’, IJERPH, vol. 17, no. 17, p. 6161, Aug. 2020, https://doi.org/10.3390/ijerph17176161.
2. L. Lin and Z. Hou, ‘Combat COVID-19 with artificial intelligence and big data’, Journal of Travel Medicine, vol. 27, no. 5, Aug. 2020, https://doi.org/10.1093/jtm/taaa080.
3. M. Kayaalp, ‘Patient Privacy in the Era of Big Data’, Balkan Med J, vol. 35, no. 1, pp. 8–17, Jan. 2018, https://doi.org/10.4274/balkanmedj.2017.0966.
4. WP29, ‘Guidelines on Data Protection Impact Assessment (DPIA) and determining whether processing is “likely to result in a high risk” for the purposes of Regulation 2016/679’, 2017. [Online]. Available: https://ec.europa.eu/newsroom/article29/itemdetail.cfm?item_id=611236.
5. G. Georgiadis and G. Poels, ‘Enterprise architecture management as a solution for addressing general data protection regulation requirements in a big data context: a systematic mapping study’, Information Systems and e-Business Management, vol. 19, pp. 313–362, 2021, https://doi.org/10.1007/s10257-020-00500-5.
6. K. A. Salleh and L. Janczewski, ‘Technological, Organizational and Environmental Security and Privacy Issues of Big Data: A Literature Review’, Procedia Computer Science, vol. 100, pp. 19–28, 2016, https://doi.org/10.1016/j.procs.2016.09.119.
7. R. Clarke, ‘Privacy impact assessment: Its origins and development’, Computer Law & Security Review, vol. 25, no. 2, pp. 123–135, 2009, https://doi.org/10.1016/j.clsr.2009.02.002.
8. G. Georgiadis and G. Poels, ‘Towards a privacy impact assessment methodology to support the requirements of the general data protection regulation in a big data analytics context: A systematic literature review’, Computer Law & Security Review, vol. 44, 2022, https://doi.org/10.1016/j.clsr.2021.105640.
9. G. Georgiadis and G. Poels, ‘Delphi Study to Identify Criteria for the Systematic Assessment of Data Protection Risks in the Context of Big Data Analytics’, in 2022 IEEE Eighth International Conference on Big Data Computing Service and Applications (BigDataService), Newark, CA, USA, Aug. 2022, pp. 177–178. https://doi.org/10.1109/BigDataService55688.2022.00037.
10. S. Keeney, F. Hasson, and H. Mckenna, The Delphi Technique in Nursing and Health Research. Wiley-Blackwell, 2011.
11. C. Okoli and S. D. Pawlowski, ‘The Delphi method as a research tool: An example, design considerations and applications’, Information and Management, vol. 42, no. 1, pp. 15–29, 2004, https://doi.org/10.1016/j.im.2003.11.002.
12. J. Baker, K. Lovell, and N. Harris, ‘How expert are the experts? An exploration of the concept of “expert” within Delphi panel techniques’, Nurse researcher, vol. 14, no. 1, pp. 59–70, 2006, https://doi.org/10.7748/nr2006.10.14.1.59.c6010.
13. P. M. Mullen, ‘Delphi: Myths and reality’, Journal of Health Organization and Management, vol. 17, no. 1, pp. 37–52, 2003, https://doi.org/10.1108/14777260310469319.
14. A. Van Looy, G. Poels, and M. Snoeck, ‘Evaluating Business Process Maturity Models’, Journal of the Association for Information Systems, vol. 18, no. 6, pp. 461–486, 2017.
15. D. Kloza et al., ‘Towards a method for data protection impact assessment: Making sense of GDPR requirements’, p. 8, 2019, https://doi.org/10.31228/osf.io/es8bm.
16. D. Kloza, N. Van Dijk, S. Casiraghi, S. Vazquez Maymir, and A. Tana, ‘The concept of impact assessment’, in Border Control and New Technologies, Academic & Scientific Publishers, 2021. https://doi.org/10.46944/9789461171375.
Chapter 2
Introducing the Concept of Data Subject Rights as a Service Under the GDPR
Malte Hansen, Nils Gruschka, and Meiko Jensen
M. Hansen · N. Gruschka: Department of Informatics, University of Oslo, Oslo, Norway. M. Jensen: Karlstad University, Karlstad, Sweden. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Schiffner et al. (eds.), Privacy Symposium 2023, https://doi.org/10.1007/978-3-031-44939-0_2
2.1 Introduction Using online services has become an integral part of everyday life for a large majority of people. These services naturally process and store plenty of data about the service user, data essential to the usage of the service (e.g., shipping address for an online shop) as well as data to create additional revenue (e.g., user profiling to offer tailored advertisement). Many users are concerned about the amount but also the correctness of their data stored in online services. However, service providers are in general reluctant to give customers access to their data. The European General Data Protection Regulation (GDPR) [1] strengthens the rights of individuals, the data subjects (DS), and aims to give them more control over their personal data. If organizations process personally identifying data of EU citizens, or if the processing of personal data is geographically happening within the European Union, they become data controllers (DC) or data processors (DP) and must make sure to process the data in compliance with the GDPR. That regulation defines the prerequisites for the processing of personal data, restricts which data can be collected, and states principles for data processing and storage. Additionally, the GDPR grants the DS a number of rights concerning their personal data, e.g., receiving information about the personal data processed, or demanding the erasure of that personal data. However, in practice, these data subject rights (DSR) generate plenty of challenges, mainly for the DC, but also for the DS. Among others, DCs must log
carefully which personal data they have stored in which location or shared with other organizations. Further, they must offer a user interface for the DSs to access their data and execute their DSR. This includes authentication of DSs, which has shown to be nontrivial [2, 3]. These requirements are especially challenging for small and medium organizations. The DSs face the problem of coping with different interfaces at different DCs and receiving the information in different formats of varying quality. Further, if data was shared among different DCs, the DS might not know which DC they shall contact. To address these issues, we extend the typical information flow architecture in the context of the GDPR and propose a further entity offering Data Subject Rights as a Service (DSRaaS). These DSRaaS providers act as brokers between DSs and DCs and can be contact points for Data Protection Authorities and certification organizations. Our contributions presented in this chapter are as follows:
• We introduce the concept of DSRaaS and analyze the requirements for such a service.
• We develop a catalog of services for fulfilling the legal and functional requirements.
• We integrate the DSRaaS concept into the current developments of the European Data Strategy (cf. [4]).
The chapter is organized as follows. Sections 2.2 and 2.3 describe the data subject rights according to the GDPR and the relevant related work. Section 2.4 then introduces and explains the novel approach of Data Subject Rights as a Service, focusing on its goals, services, data models, etc., and Sects. 2.5 and 2.6 then put that approach into the European legislative and general scientific context. The chapter concludes in Sect. 2.7 with future research directions.
2.2 Data Subject Rights DSR and their legal requirements are described in Chapter 3 of the GDPR [1, Art. 12-23]. The aim of these rights is to empower individuals to exercise control over their personal data. Each organization holding or processing the personal data of a European citizen has to comply with the enforcement of those rights by the individual. For our scenario, the most relevant DSR are the following:
• Article 15, Right of access (RoA): Right to obtain information about personal data of the DS being processed by the DC.
• Article 16, Right to rectification (RtR): The DC must rectify any inaccurate personal data concerning the DS on request.
• Article 17, Right to erasure (RtE): The DC must erase personal data concerning the DS on request.
• Article 18, Right to restriction of processing (RtRoP): The DS can limit how the DC processes personal data concerning them in certain circumstances.
• Article 20, Right to data portability (RtDP): The DC must provide a copy of all personal data concerning the DS on request.
The legal text does not specify how to realize these articles in practice. This leads to a poor understanding of DSR requirements and of the tools required for a secure and GDPR-compliant implementation.
2.3 Related Work The European Data Protection Board (EDPB) and its predecessor, the Article 29 Working Party, whose works have since been endorsed by the EDPB [5], have released guidelines for both the RtDP [6] and the RoA [7] to aid in the implementation of DSR. While addressing some open issues for organizations looking to comply with DSR, concrete instructions on the implementation and the technical tools and measures required are not given. This leaves many open challenges for DCs and DPs, especially for small- and medium-sized enterprises, demanding a significant amount of expertise and resources from them to comply with DSR enforcement. These issues can be seen in the perception of DSR from both DCs and DSs alike. As an example, companies in a Norwegian study have reported having a limited understanding of the GDPR and possessing neither the budget nor the technology nor the expertise to fulfill the GDPR and data subject rights requirements [8]. Further, roughly 40% of European DSs think they lack control over their personal data, and enforcement of DSR is still not utilized frequently [9]. Looking towards the implementation of DSR, several technical challenges for the RtDP have been identified [10], which, when combined with its ambiguity, may result in different solutions for approaching the RtDP (e.g., [11, 12]). A proposal for future research to progress the status quo of the RtDP is looking to address some of these problems [13]. To resolve the shortcomings of the RoA, a generic data request model to serve as a common baseline for implementations of the RoA has been proposed [2]. Another issue, requiring additional research, is the enforcement of data subject rights in non-European organizations (e.g., [14, 15]). For the RoA specifically, several works have shown pressing issues in its implementation by organizations.
Different studies have shown that the results of exemplary RoA requests are often lacking in several aspects, namely, the timeliness of the response, the completeness of the data, and the additional information provided about the processing of the data, especially processing that includes automated decision-making (e.g., [9, 16, 17]). Additionally, the current RoA implementations by data controllers have been shown to be prone to misuse. Several studies have succeeded in getting unauthorized access to personal data via RoA requests in multiple cases, employing social engineering attacks, impersonation techniques, and email forgery (e.g., [18–20]). Although currently no solutions for DSR enforcement with a focus on the DS side exist, especially in anticipation of the new European Data Governance Act (DGA, cf. [21]) and the upcoming European Data Act (DA, cf. [22]), some companies have identified the need for external aid in the implementation of DSR and are offering DSR management tools for DCs (e.g., Fair & Smart,1 OneTrust,2 DataGrail,3 or Osano4). In the current architecture of the GDPR environment, the information flows from the perspective of a DS are limited to the interaction with DCs and the possibility of complaints to data protection authorities. While the information flows between different DCs and DPs can be learned with the help of DSR, the process of doing this is very tedious, as it requires multiple DSR requests. A complete overview of information flows between different entities adapted from [23] can be seen in Fig. 2.1.
[Figure 2.1: diagram of the entities Data Protection Authority, Data Subject, Data Controller, Data Processor, and Seals & Certifications; flow labels: complaints, compliance, personal data, services & fulfillment of DS rights, joint controller, data sharing agreement, sub processor, audit compliance]
Fig. 2.1 Information flows between different entities in the context of the GDPR, adapted from [23]
1 https://www.fairandsmart.com/en/right-requests/. 2 https://www.onetrust.com/products/privacy-rights-automation/. 3 https://www.datagrail.io/platform/request-manager/. 4 https://www.osano.com/products/dsar-srr.
2.4 Data Subject Rights as a Service As stated above, DSR enforcement currently suffers from several issues. Implementing the different requirements for DSR can be quite challenging, especially for small- and medium-sized enterprises lacking the necessary know-how and resources. The scarcity of clear implementation guidelines and solutions worsens this situation. To overcome these problems, the introduction of a separate service for the execution and implementation of the RoA has been proposed [2]. As these problems span all DSR, we can expand this idea to a generic framework, called Data Subject Rights as a Service (DSRaaS). DSRaaS can provide a reliable point of contact for DS and DC alike to resolve these issues. In this section, we will describe the goals, a catalog of services offered by DSRaaS providers, and the interfaces needed to realize these services.
2.4.1 Goals The central goal of this work is to explore the concept of Data Subject Rights as a Service and identify the steps necessary to realize such an architecture. To achieve this, we need to take a look at the current issues in the process of DSR enforcement, the challenges regarding the implementation of DSR processes, and the questions and requirements derived from the surrounding legal framework, namely, the GDPR and the upcoming DGA. From this we can derive the following high-level goals for DSRaaS:
• Prevent misuse of DSR
• Reduce the duration and improve quality of the DSR process, especially comprehensibility of the data flow for DSs
• Provide aid for both DS and DC during the DSR process and its implementation
• Support interoperability between DSRaaS providers
• Comply with legal requirements
To reach these goals, several issues have to be resolved and a concrete, implementable solution has to be derived. These issues will be described in more detail in Sect. 2.4.2, where they will be mapped to a service, looking to address the corresponding goals:
• Provide a secure and definite authentication scheme for DSs
• Provide a data trail for DSs
• Perform and merge multiple DSR requests for multiple DCs
• Assert enforcement of DSR requests
• Present solution catalog for DCs to guide them in the implementation of DSR
The demand for a service solution for each of these problems may vary between DCs, depending on their own capacities. Hence, each service should be able to fulfill these goals individually while also functioning in a complete DSRaaS architecture, addressing every issue with a catalog of interoperating services.
2.4.2 Services As can be derived from the goals described in Sect. 2.4.1, the services offered by DSRaaS focus on helping DSs enact their DSR and aiding DCs in implementing and maintaining the appropriate infrastructure to comply with those DSR requests. In order to achieve this, we have identified five central, interrelated services, as seen in Fig. 2.2:
1. Data subject right enforcement to let a DS enact any DSR for any DC and get a satisfactory result
2. Authentication providing a reliable, privacy-compliant, and secure authentication of the DS
3. Data model offering a general data model for the exchange and storage of personal data, as well as acting as a shared standard for interoperability between different DSRaaS providers and providing DCs with a reliable tool to implement interfaces to the DSRaaS providers
4. Data logbook providing an overview of the data flows and current location of personal data for DSs
5. Consulting aiding both DSs and DCs in ongoing issues with DSR processes.
[Figure 2.2: diagram with boxes for Data Model (Template for Data Controllers, Shared standard for Service Providers), Data Subject Right Enforcement (Forward requests, Assert good result), Data Logbook (Map of location of data, Data trail), Authentication (MFA for Data Subjects, Mapping Data Subjects), and Consulting (For Data Subjects, For Data Controllers); connection labels include Use, Map to, Configure parameters, Visualize, and Issues concerning Rights and Implementation]
Fig. 2.2 Overview of the five central services of Data Subject Rights as a Service
2.4.2.1 Data Subject Right Enforcement
A DSRaaS provider should be able to initiate any request for a DSR by any DS for any DC, where the GDPR applies, and provide the DS with a meaningful result. This means that the provider delivers the relevant information about the DC to the DS or confirms the execution of a DSR by a DC. For DSR containing information about personal data, the DSRaaS provider should assert that a result is sent to the DS. However, while they can act as a relay point, leveraging the established channels for secure communication to both the DS and DC, the provider must not gain access to the result itself. Concerning the most relevant DSR, identified in Sect. 2.2, the following service should be provided for the DS:
• Right of access: Receive a complete copy of processed data sets by a DC, containing information about every aspect specified in Article 15 of the GDPR.
• Right to rectification: Update any incorrect or incomplete data set held by a DC with the correct and complete information.
• Right to erasure: Delete any data sets held by a DC, where processing is not necessary (see [1, Art. 17(3)]).
• Right to restriction of processing: Restrict the processing purposes for any data set held by a DC.
• Right to data portability: Receive a complete copy of all data sets held by a DC in a structured, commonly used, and machine-readable format (e.g., CSV, XML, JSON, or a format based on the data model). On request, the result can directly be forwarded to another DC of the DS’s choice.
The DSR enforcement service would greatly benefit from the implementation of the other services. For example, the data model service facilitates the automation of the DSR processes. If a data trail is built, the DSRaaS provider could also fulfill the notification obligation [1, Art. 19] in place of the DC. A DC is obliged to inform any other organization to which they have disclosed personal data about a DS’s enforcement of the RtR, RtE, or RtRoP. Leveraging a data trail, the DSRaaS provider can instantaneously forward those requests to all applicable data processors. This would greatly increase the speed and completeness of those DSR processes and relieve DCs. Additionally, a link to the authentication service can be established to verify the identity of the DS before the initiation of any DSR process.
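The forwarding step described above can be illustrated with a minimal Python sketch. It is not part of the proposed architecture; the names Right, Confirmation, recipients, and enforce, the representation of the data trail as a dictionary of sharing edges, and the send callback are all assumptions made for illustration. The provider only collects a status flag per organization, reflecting the requirement that it must not gain access to the result itself.

```python
from collections import deque
from collections.abc import Callable
from dataclasses import dataclass
from enum import Enum


class Right(Enum):
    """The DSR from Sect. 2.2, keyed by GDPR article number."""
    ACCESS = 15          # RoA
    RECTIFICATION = 16   # RtR
    ERASURE = 17         # RtE
    RESTRICTION = 18     # RtRoP
    PORTABILITY = 20     # RtDP


# Rights whose enforcement must be communicated to recipients (Art. 19).
NOTIFIABLE = {Right.RECTIFICATION, Right.ERASURE, Right.RESTRICTION}


@dataclass(frozen=True)
class Confirmation:
    controller: str
    right: Right
    done: bool  # status only; no result payload passes through the provider


def recipients(trail: dict[str, list[str]], start: str) -> list[str]:
    """Breadth-first walk over the (hypothetical) data trail: every
    organization that received data directly or indirectly from `start`."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        for nxt in trail.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def enforce(right: Right, controller: str, trail: dict[str, list[str]],
            send: Callable[[str, Right], bool]) -> list[Confirmation]:
    """Forward the request to the DC and, for notifiable rights, to all
    recipients found in the data trail; return one confirmation each."""
    targets = [controller]
    if right in NOTIFIABLE:
        targets += recipients(trail, controller)
    return [Confirmation(t, right, send(t, right)) for t in targets]
```

In this sketch, an erasure request for a hypothetical controller "shop" that shared data with "courier" and "ads" would be forwarded to all three, whereas an access request would reach only "shop".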
2.4.2.2 Authentication
The DSRaaS provider shall be able to securely and undeniably confirm the identity of the DS based on a multifactor authentication scheme. While many DCs can provide this form of authentication of the DS themselves, it has been identified as a huge challenge for DCs in scenarios where the DC holds no previously established authentication factors about the DS, such as a user account, email address, or mobile number [2]. Resolving these issues by providing a generic authentication service to these DCs is the main goal of this service. The second purpose of this service is to map the authenticated DS to the correct data sets in the system of the DC by leveraging the data model.
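The two purposes of this service can be sketched as follows. The chapter does not prescribe a concrete scheme, so everything here is an assumption for illustration: the three factor categories, the rule that two distinct verified categories suffice, and the use of a keyed hash (HMAC-SHA256) to derive a controller-specific lookup key for mapping the authenticated DS to the DC's data sets.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Factor:
    category: str   # assumed categories: "knowledge", "possession", "inherence"
    verified: bool


def is_authenticated(factors: list[Factor], required: int = 2) -> bool:
    """Multifactor check: count distinct categories among verified factors,
    so two passwords do not count as two factors."""
    categories = {f.category for f in factors if f.verified}
    return len(categories) >= required


def controller_lookup_key(subject_id: str, controller_secret: bytes) -> str:
    """Hypothetical mapping step: derive a per-controller key for the DS so
    that different DCs cannot trivially correlate the same identifier."""
    return hmac.new(controller_secret, subject_id.encode(),
                    hashlib.sha256).hexdigest()
```

A password combined with a hardware token would pass this check, while a password combined with a security question would not, since both belong to the knowledge category.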
2.4.2.3 Data Model
The goal of the data model service is to provide a general data model for use as a common standard for DSRaaS providers. This will facilitate interoperability between different providers, which can lead to the development of shared functionalities and faster exchange of key information to improve DSR results. The data model must be designed to include all relevant attributes for complying with DSR and to enable the construction of a data trail to use in the data logbook. It can then also serve as a standard for DCs to fit or model their own data architecture to meet the requirements for automated DSR processes and GDPR compliance. For these fitted data architectures the DSRaaS provider can then offer an API that transforms the result of a query for a DSR process into the desired data format. This way the DC can send the result for an RoA or RtDP request to the DS in the desired data format directly, without the provider having to process or receive the result in any way.
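A minimal sketch of such a transformation is given below. The chapter does not define the attributes of the data model, so the fields of PersonalDataRecord and the export function are hypothetical; they merely show how a DC whose data fits a shared model could serialize the result of an RoA or RtDP query locally into a machine-readable format, without the provider seeing the data.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass

# Assumed attribute set; the actual data model is left open in the text.
FIELDS = ["category", "value", "purpose", "source", "recipients"]


@dataclass(frozen=True)
class PersonalDataRecord:
    category: str                      # e.g. "shipping address"
    value: str
    purpose: str                       # processing purpose
    source: str                        # where the data was obtained
    recipients: tuple[str, ...] = ()   # organizations the data was shared with


def export(records: list[PersonalDataRecord], fmt: str) -> str:
    """Serialize records into the format requested by the DS."""
    rows = [asdict(r) for r in records]
    if fmt == "json":
        return json.dumps(rows)
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
        writer.writeheader()
        for row in rows:
            # Flatten the recipient list into one CSV cell.
            writer.writerow({**row, "recipients": ";".join(row["recipients"])})
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")
```

Keeping the export on the DC side matches the constraint that the provider does not have to process or receive the result.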
2.4.2.4 Data Logbook
The data logbook is a service for the DS to receive a visualization of their data trail (cf., e.g., [24, 25]). This visualization should be easily comprehensible for the DS, even if the DS has no technical knowledge. The data trail should include the sources and destinations of the data, as well as the purpose for which it was collected or shared. The goal is that the DS can see which DC currently holds which types of their data for which purposes. To achieve this, the DC must, in some way, share its data-sharing activities with the DSRaaS provider. This service creates a data overhead that may pose a significant privacy risk if maintained and secured improperly. However, it may also be used to improve auditing and compliance inspection of DCs and their processing of DSR. Therefore, a thorough assessment of the risks and advantages of this service is necessary.
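As a rough illustration of how a data trail could be turned into such an overview, the following Python sketch aggregates sharing events into a per-controller view. The TrailEntry structure and the functions holdings and render are assumptions, and the sketch deliberately ignores erasure or expiry of data, which a real logbook would have to track.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TrailEntry:
    source: str        # who provided the data (the DS or a DC)
    destination: str   # who received it
    data_type: str     # e.g. "address"
    purpose: str       # why it was collected or shared


def holdings(trail: list[TrailEntry]) -> dict[str, dict[str, list[str]]]:
    """Group the trail by receiving organization, then by data type,
    listing the purposes in sorted order."""
    view = defaultdict(lambda: defaultdict(set))
    for entry in trail:
        view[entry.destination][entry.data_type].add(entry.purpose)
    return {dc: {t: sorted(p) for t, p in types.items()}
            for dc, types in view.items()}


def render(trail: list[TrailEntry]) -> str:
    """Plain-text rendering intended to be readable without technical
    knowledge; a real service would presumably offer a graphical view."""
    lines = []
    for dc, types in sorted(holdings(trail).items()):
        for data_type, purposes in sorted(types.items()):
            lines.append(f"{dc} holds {data_type} for: {', '.join(purposes)}")
    return "\n".join(lines)
```

The same aggregated view could also support the auditing use mentioned above, since it records which organization received which data for which stated purpose.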
2.4.2.5 Consulting
In addition to the technical services presented above, the DSRaaS provider offers to consult for all issues regarding DSR for both DSs and DCs. For the DS this means that the DSRaaS provider will answer questions concerning which DSR are available to the DS, information about the DSR, and what is required by the DS to enforce said DSR. For the DC, the DSRaaS provider will provide information about the DSR that the DC needs to implement and its requirements. They can recommend tools and frameworks to realize the implementation. If the DC needs additional help with the implementation, the DSRaaS provider can extend the support, especially with the adaption of a proper data model.
2.4.3 Interfaces For the presented services we have identified three interfaces that must be implemented. The first is an interface for the communication between the DSRaaS provider and the DS; see Fig. 2.3. For the enforcement service, the DS can clarify the DSR to be enforced and the DCs to be approached by the provider. The DSRaaS provider then provides information about where and how to receive the result over this interface. Additionally, the realization of the data logbook and consulting for DSs are carried out over this interface.
[Figure 2.3: DSRaaS Provider and Data Subject; services: Enforcement, Consulting, Data Logbook, Authentication; flow labels: Initiate DSR related issue, Send request parameters, Consult about issues, Authenticate Data Subject, Confirm DSR execution, Send Data Logbook]
Fig. 2.3 Interface between DSRaaS provider and data subject
Another interface handles the communication between the DSRaaS provider and DC; see Fig. 2.4. The provider forwards the parameters for the enforcement services provided by the DS. The DC then confirms the completion of the request, possibly stating over which channel the result is provided. As the data logbook potentially requires information from the DC to create an accurate and up-to-date result, the necessary data can be communicated over this channel as well. Consulting for the DC side will also be handled here.
[Figure 2.4: DSRaaS Provider and Data Controller; services: Enforcement, Consulting, Data Model, Data Logbook, Authentication; flow labels: Forward request parameters, Consult about issues, Aid in implementation, Provide Authentication of DS, Provide Data Model, Initiate DSR related issue, Send data sharing activities]
Fig. 2.4 Interface between DSRaaS provider and data controller
Lastly, an interface for the communication between different DSRaaS providers has to be defined; see Fig. 2.5. Over this interface, communication about current standards and technologies can be exchanged to serve as a type of quality assurance. Further, data sets used for the data logbook can be exchanged to keep the result of this service accurate. This exchange can be realized either by periodically occurring automatic exchanges or on request. A DSRaaS provider can also leverage the authentication service of another provider if they lack the information to implement a multifactor authentication scheme themselves.
[Figure 2.5: two DSRaaS Providers; services: Data Logbook, Authentication, Data Model; flow labels: Provide authentication, Exchange Logbook information, Share standards]
Fig. 2.5 Interface between two DSRaaS providers
Many DCs already offer an interface for DSR functionalities to DSs. This is out of the scope of this chapter, as the DSRaaS provider has no influence on this interface. While most DCs should have already established a secure communication channel between themselves and the DS or should be able to establish it without any difficulties, this cannot be guaranteed for every DC. Whether the DSRaaS provider should in this case act as a bridge between the two parties and offer their interfaces for communication, e.g., for sending the result of an RoA request, is something that has to be discussed in the future. An overview of the different interfaces, integrated into the GDPR information flow, can be seen in Fig. 2.6.
[Figure 2.6: diagram of the entities Data Protection Authority, Data Subject, Data Controller, DSRaaS Provider A, DSRaaS Provider B, and Seals & Certifications; flow labels: complaints, compliance, audit compliance, DSR result, Initiate DSR issues, Send request parameters, Send data sharing activities, Consulting, Data Logbook, Aid in implementation, Confirm authentication, Provide data model, Provide authentication, Exchange Logbook information, Share standards]
Fig. 2.6 Information flows between different entities in Data Subject Rights as a Service
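The assignment of services to the three interfaces, as listed in Figs. 2.3 to 2.5, can be summarized in a small lookup structure. The following Python sketch is only an illustration; the names Interface, SERVICES, and interfaces_for are hypothetical and do not correspond to any specified API.

```python
from enum import Enum


class Interface(Enum):
    SUBJECT = "DSRaaS provider <-> data subject"        # Fig. 2.3
    CONTROLLER = "DSRaaS provider <-> data controller"  # Fig. 2.4
    PROVIDER = "DSRaaS provider <-> DSRaaS provider"    # Fig. 2.5


# Services carried over each interface, taken from the figure labels.
SERVICES = {
    Interface.SUBJECT: {"enforcement", "consulting", "data logbook",
                        "authentication"},
    Interface.CONTROLLER: {"enforcement", "consulting", "data model",
                           "data logbook", "authentication"},
    Interface.PROVIDER: {"data logbook", "authentication", "data model"},
}


def interfaces_for(service: str) -> set[Interface]:
    """Return every interface over which the given service is handled."""
    return {i for i, services in SERVICES.items() if service in services}
```

A lookup such as interfaces_for("data model") shows, for example, that the data model is exchanged with DCs and other providers but not with the DS directly.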
2.5 DSRaaS in the Context of the European Data Strategy The approach of dedicated DSRaaS providers is of special relevance in light of the recent European data strategy fostered by the European Commission (cf. [4]). To stimulate data sharing and data trading across actors in the European market, the Commission has issued a set of relevant new legislative acts, such as the proposed European Data Act [22], the European Data Governance Act [21], the European Digital Services Act [26], and others. Especially the Data Governance
Act (DGA) is of interest in this discourse, as it defines the novel role of a dedicated data intermediary (DI) as a stakeholder in data-sharing scenarios. The task of a DI is to act as a data-sharing platform provider, mediating between the DSs and data storage providers on one side and the data requesting organizations on the other side. These DIs are meant to be benevolent actors that do not utilize the data they obtain for themselves, according to the DGA. Their mission is just to care for data requests issued to them, trying to identify and aggregate as many matching datasets from their data sources as possible and forwarding these datasets to the requesting organizations. Moreover, these DIs are obliged to enforce the DSRs and other compliance aspects of GDPR and similar European laws in the data-sharing scenarios they participate in. Hence, DIs are obvious candidates to become DSRaaS providers as well, since they have to cater to DSR enforcement anyway. Also, they typically tend to know all other DCs and DPs involved in a data-sharing scenario, so most of the assumptions in the DSRaaS model are already fulfilled by the DIs in such scenarios.
2.6 Discussion The concept of DSRaaS comes with many open questions. One important factor to consider is the potential privacy risks that may come with it, especially considering that data about DSR enforcement is personal information as well. This means that a thorough analysis of the privacy threats, for example, as a part of a Data Protection Impact Assessment, for the whole framework is required. Aspects to focus on should be which additional personal data will be collected for each service, how and with whom personal data will be shared, and what risks the interactions between the different actors introduce, especially concerning the data flows between a DC and a DSRaaS provider. Trust, or rather the possible lack of trust, is also an important factor for the potential implementation of DSRaaS. How willing are DCs to cooperate with an external service provider, and to which degree would DCs allow them to access their data sets? This leads to the question if different implementations of a service, e.g., the Data Subject Right Enforcement, should be offered. This way, a DC would implement some of the functionalities itself and hence could reduce the degree of access required by the DSRaaS provider. While the DSRaaS provider should not get access to the result of a DSR request itself, the DS has to trust the process as well. This puts further emphasis on the requirement of proper certification and auditing for the DSRaaS providers. Additionally, an awareness campaign about DSRaaS before its launch could be useful. This could both build trust and recruit potential collaborators and users beforehand. Another issue is the question of which organizations are going to implement and run such a service and for which reason. In the previous section, we discussed the role of DIs as actors in DSRaaS. However, an implementation by an independent, private organization is feasible as well. One question for a private approach would
be how it would finance its business. As monetizing the data concerning the DSR requests should be out of the question, the two most likely approaches would be to invoice either the DC or the DS. In order to properly assess both options, a closer look at the exact scope of services offered is required. It is important to note, though, that neither the DS nor the DC can be forced to use a service, especially if it is a paid service. This makes a widely used private sector implementation more challenging. Additionally, it must be answered how different DSRaaS systems, especially government and private sector systems, can coexist and interoperate reasonably. Do they all cater to all possible DSs and DCs, or do they divide territories or sectors between them? How can they exchange data while upholding security standards and GDPR compliance? Do they all operate in one shared system, or do many distributed systems with some interfaces exist?
Independently of the organizational structure behind the DSRaaS provider, the highly confidential nature of a DSR request demands a proper framework for auditing and compliance. To guarantee a basic degree of trust and compliance, concepts for the certification of DSRaaS providers should be proposed. These certificates should be reviewed regularly. Possible candidates to conduct those assessments include data protection authorities, other providers, and independent auditors. One powerful tool to employ in this process would be computer-aided audit tools and techniques (CAATTS; cf., e.g., [27]). CAATTS would allow an audit to increase the sample size while reducing the margin of error, particularly with respect to human error. Considering the huge potential impact of a privacy breach in GDPR-related contexts, adequate privacy compliance auditing methods, including CAATTS, should be investigated further for both DSRaaS and the GDPR in general.
Lastly, the different services described above must be developed and implemented.
Each service can be implemented as a self-sustained solution for its intended use case, although some services would greatly benefit from other services, e.g., the data logbook from the data model. This means that a partial implementation might advance the state of the art, even if the concept of DSRaaS as a whole should prove not to be worth pursuing. An important note for the design of a concrete DSRaaS model going forward is that the list of services presented here should not be seen as immutable. Requirements and circumstances may change during the development period of this project, e.g., through the release of new regulations or of commercial state-of-the-art solutions. This means that some services may become redundant, additional services may be needed, or different DSRaaS implementations, each covering a different subset of services, may become more practical.
2.7 Conclusion
In this chapter, we have introduced the concept of Data Subject Rights as a Service. Successful implementation of its five services (data subject right enforcement, authentication, data model, data logbook, and consulting) addresses the issues DSRs are currently facing. DSs will be able to learn about the location and data trail
of their personal information and will be able to exercise any DSR for this data via one centralized interface. DCs receive a full-featured tool to implement a GDPR-compliant solution for authenticating DSs, fitting their own data model and automating the different DSR processes. The remaining open issues for both the DS and the DC can be resolved via consultation with the DSRaaS provider. A complete implementation of DSRaaS requires a full realization of each service, as well as the interfaces that connect the services with each other and with the involved actors. However, each service can be developed independently, possibly resolving several issues as a standalone solution. A special focus should fall on the data model, as it can serve as a foundation for the other services, greatly facilitating their development. While the organizational structure behind the DSRaaS framework remains an open issue, the outlook we have given on DSRaaS in the context of the DGA, with DIs as DSRaaS providers, presents a promising solution that should be developed further in the future.
Acknowledgments The contribution of M. Jensen was partly funded by the Swedish Knowledge Foundation (KK-Stiftelsen) as part of the TRUEdig project.
References
1. REGULATION (EU) 2016/679 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). OJ L 119, 4.5.2016, p. 1–88.
2. Malte Hansen and Meiko Jensen. “A Generic Data Model for Implementing Right of Access Requests”. In: Annual Privacy Forum. Springer. 2022, pp. 3–22.
3. Coline Boniface et al. “Security Analysis of Subject Access Request Procedures”. In: Privacy Technologies and Policy. Ed. by Maurizio Naldi et al. Cham: Springer International Publishing, 2019, pp. 182–209. ISBN: 978-3-030-21752-5.
4. European Commission. European data strategy: Making the EU a role model for a society empowered by data. https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fit-digital-age/european-data-strategy_en. 2022.
5. Endorsement of GDPR WP29 guidelines by the EDPB. https://edpb.europa.eu/news/news/2018/endorsement-gdpr-wp29-guidelines-edpb_de. Accessed on 12-10-22.
6. Article 29 Data Protection Working Party. Guidelines on the right to data portability. 16/EN WP 242 rev.01. Adopted on 13 December 2016, as last revised and adopted on 5 April 2017.
7. Guidelines 01/2022 on data subject rights - Right of access. Version 1.0. Adopted on 18 January 2022. https://edpb.europa.eu/system/files/2022-01/edpb_guidelines_012022_right-of-access_0.pdf. Accessed on 12-10-22.
8. Wanda Presthus, Hanne Sørum, and Linda Renate Andersen. “GDPR compliance in Norwegian Companies”. In: Norsk konferanse for organisasjoners bruk av IT. Vol. 26. 1. 2018.
9. Wanda Presthus and Hanne Sørum. “Consumer perspectives on information privacy following the implementation of the GDPR”. In: International Journal of Information Systems and Project Management 7.3 (2019), pp. 19–34.
10. Engin Bozdag. “Data Portability Under GDPR: Technical Challenges”. In: Available at SSRN 3111866 (2018).
11. Aysem Diker Vanberg and Mehmet B Ünver. “The right to data portability in the GDPR and EU competition law: odd couple or dynamic duo?” In: European Journal of Law and Technology 8.1 (2017).
12. Paul De Hert et al. “The right to data portability in the GDPR: Towards user-centric interoperability of digital services”. In: Computer law & security review 34.2 (2018), pp. 193–203.
13. Sophie Kuebler-Wachendorff et al. “The Right to Data Portability: conception, status quo, and future directions”. In: Informatik Spektrum 44.4 (2021), pp. 264–272.
14. Benjamin Greze. “The extra-territorial enforcement of the GDPR: a genuine issue and the quest for alternatives”. In: International Data Privacy Law (2019).
15. Danny S Guamán, Jose M Del Alamo, and Julio C Caiza. “GDPR Compliance Assessment for Cross-Border Personal Data Transfers in Android Apps”. In: IEEE Access 9 (2021), pp. 15961–15982.
16. Fatemeh Alizadeh et al. “GDPR-reality check on the right to access data: claiming and investigating personally identifiable data from companies”. In: Proceedings of Mensch Und Computer 2019. 2019, pp. 811–814.
17. Luca Bufalieri et al. “GDPR: When the Right to Access Personal Data Becomes a Threat”. In: 2020 IEEE International Conference on Web Services (ICWS). IEEE. 2020, pp. 75–83.
18. Matteo Cagnazzo, Thorsten Holz, and Norbert Pohlmann. “Gdpirated–stealing personal information on-and offline”. In: European Symposium on Research in Computer Security. Springer. 2019, pp. 367–386.
19. Mariano Di Martino et al. “Personal Information Leakage by Abusing the GDPR ‘Right of Access’”. In: Fifteenth Symposium on Usable Privacy and Security (SOUPS 2019). 2019.
20. James Pavur and Casey Knerr. “Gdparrrrr: Using privacy laws to steal identities”. In: arXiv preprint arXiv:1912.00731 (2019).
21. Proposal for a REGULATION OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL on European data governance (Data Governance Act). COM/2020/767 final.
22. Proposal for a REGULATION OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL on harmonised rules on fair access to and use of data (Data Act). SEC(2022) 81 final - SWD(2022) 34 final - SWD(2022) 35 final.
23. Harshvardhan J Pandit, Declan O’Sullivan, and Dave Lewis. “GDPR data interoperability model”. In: the 23rd EURAS Annual Standardisation Conference, Dublin, Ireland. 2018.
24. Farzaneh Karegar, Tobias Pulls, and Simone Fischer-Hübner. “Visualizing Exports of Personal Data by Exercising the Right of Data Portability in the Data Track - Are People Ready for This?” In: Privacy and Identity Management. Facing up to Next Steps, 11th IFIP WG 9.2, 9.5, 9.6/11.7, 11.4, 11.6/SIG 9.2.2 International Summer School, Karlstad, Sweden, August 21-26, 2016, Revised Selected Papers. Ed. by Anja Lehmann et al. Vol. 498. IFIP Advances in Information and Communication Technology. 2016, pp. 164–181. https://doi.org/10.1007/978-3-319-55783-0_12.
25. Tobias Pulls. “Privacy-Friendly Cloud Storage for the Data Track: An Educational Transparency Tool”. In: Secure IT Systems, 17th Nordic Conference, NordSec 2012, Karlskrona, Sweden, October 31–November 2, 2012. Proceedings. Ed. by Audun Jøsang and Bengt Carlsson. Vol. 7617. Lecture Notes in Computer Science. Springer, 2012, pp. 231–246. https://doi.org/10.1007/978-3-642-34210-3_16.
26. REGULATION (EU) 2022/2065 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 19 October 2022 on a Single Market For Digital Services and amending Directive 2000/31/EC (Digital Services Act). OJ L 277, 27.10.2022, p. 1–102.
27. Isabel Pedrosa and Carlos J Costa. “Computer assisted audit tools and techniques in real world: CAATT’s applications and approaches in context”. In: International Journal of Computer Information Systems and Industrial Management Applications (2012), pp. 161–168.
Chapter 3
Technical and Legal Aspects Relating to the (Re)Use of Health Data When Repurposing Machine Learning Models in the EU
Soumia Zohra El Mestari, Fatma Sümeyra Doğan, and Wilhelmina Maria Botes
3.1 Introduction
The emergence of data-driven solutions in a wide variety of health applications has led to major improvements in the services offered: better quality, better performance and better predictions. Especially since the recent COVID-19 pandemic, health data have gained significant importance. The effects of the pandemic created a critical need for the development of effective data processing models in the healthcare sector. This led to technology shifts aimed specifically at healthcare technologies. For technologies such as machine learning (ML) and its various forms, including deep learning, the quality, trustworthiness and performance of the predictions obtained depend highly on the volume and quality of the data used to build the models. As a solution to this technical requirement, machine learning engineers reuse already built models in order to avoid having to collect large new datasets to build new models for similar tasks. This is possible because, for similar tasks, the underlying elementary features learnt by certain models are also beneficial for other models that are built to solve similar or related tasks. For example, the underlying features for ImageNet classification have been shown to be a performant source for brain-tumour segmentation [3], which is beneficial because
Soumia Zohra El Mestari and Fatma Sümeyra Doğan contributed equally to this work.
S. Z. El Mestari · W. Maria Botes, University of Luxembourg, Esch-sur-Alzette, Luxembourg
F. S. Doğan, Jagiellonian University, Kraków, Poland
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Schiffner et al. (eds.), Privacy Symposium 2023, https://doi.org/10.1007/978-3-031-44939-0_3
it reduces the volume of the brain-tumour dataset required for this task. ImageNet is a public dataset, so reusing the knowledge extracted from it in the brain-tumour task does not cause any legal issues. However, when the model to be reused was trained on sensitive data, the impact of this practice on the privacy of the data used is understudied, and thus the interpretation of this phenomenon under EU data protection legislation remains unclear. In this study, we investigate this issue from both a technical and a legal perspective. We show that the main points of ambiguity stem from three open questions: the lack of clear legal definitions of the terms ‘secondary use’ and ‘reuse’ of data; whether the further use of the knowledge extracted by these models is considered lawful processing in terms of privacy laws and regulations; and whether such knowledge must be treated as being processed under the same conditions under which the input data were processed. In this chapter, we limit our legal analysis to the General Data Protection Regulation (GDPR), the Data Governance Act (DGA) and the European Health Data Space (EHDS) proposal of the EU, and to the processing of health data.
3.2 Machine Learning Model Repurposing
Machine learning models are data-driven tools that learn from data without a full explicit description of how a presented problem must be solved. In other words, the internal behaviour of a machine learning model that leads to a certain decision depends heavily on the features extracted from data, rather than on the explicit programming of those patterns. Features that exist in datasets can be reused to solve secondary tasks. For example, a model that was trained to classify photos of animals has learnt features about the morphology of animals, which suggests that these same features can also be used for any task in which visual information about animal morphology is important, such as classifying a dog breed or detecting whether a certain animal is present in a given photo. This is known in the machine learning literature as ‘transfer learning’, where a trained model is fine-tuned to perform another task. Model repurposing, or transfer learning, is a technique that adapts a trained model to a different task without reusing the initial dataset. The idea behind it is to reuse the knowledge previously extracted from a certain dataset to perform another task. It is often used in vision-related tasks, especially medical imaging tasks [6, 32]. Transfer learning can also be used for language models [8, 22], enabling interesting results for use cases in which large task-specific training datasets are not available. Model repurposing focuses on transferring the learnt features and patterns from a first model, often called the teacher model $M_{\text{teacher}}$, into another model called the student model $M_{\text{student}}$; these are labelled ‘model 1’ and ‘model 2’ in Fig. 3.1. The teacher model (model 1) is built using a first dataset that we call ‘Dataset1’; the trained model (model 1) is then fine-tuned to create another model (model 2) using another dataset, called ‘Dataset2’, to perform a new task.
The knowledge
Fig. 3.1 Model repurposing. [Figure labels: Consent for using dataset 1 for model 1; Dataset 1; Model 1; Features extracted from Model 1; Model repurposing; Consent for using dataset 2 for model 2; Model 2; End users]
that was learnt from ‘Dataset1’ is not erased, and the repurposing operation itself does not reuse ‘Dataset1’ (see Fig. 3.1). Because the repurposing operation does not require the original data to be used a second time, it is hard to determine how data and privacy protection regulations such as the GDPR apply to this scenario. To elaborate, technically it is not the first dataset (Dataset 1) that is reused to perform the repurposing (which builds the second model), but only the knowledge extracted from this dataset. Concretely, if we consider that building the first model (model 1) is the first use, or the first processing, then the knowledge extracted during this first processing is in fact the output of that processing, which opens up the legal discussion of whether its reuse falls within the ambit of the secondary use of data. The repurposing of machine learning models is a technical means of overcoming the lack of large volumes of specific datasets, whether because the data are highly regulated and consequently difficult to access, because of the high costs involved in data collection and labelling, or because the data are rare in the first place (data about minorities, rare scientific events, etc.). While this practice achieves good results at lower cost, its effect and impact on the lawful processing of data, especially sensitive personal data, are still understudied. Additionally, machine learning models are prone to several privacy attacks that are hard to detect, which further complicates their compliance with privacy laws. These attacks can be performed by the end user of a service without any noticeable malicious behaviour, because the attacker and the user will get the
output from these systems the same way. Hence, once the model is in service, the leakage of data becomes hard to detect or prevent.
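The repurposing operation described in this section can be illustrated with a minimal sketch. It is not the authors' implementation: the two toy tasks, the helper names (`train_logistic`, `teacher_feature`, `student_predict`) and the dataset sizes are assumptions chosen only to show that model 2 is built from the teacher's learnt weights and Dataset 2 alone, while Dataset 1 is never read again.

```python
import math
import random

rng = random.Random(0)

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(inputs, labels, epochs, lr=0.1):
    """Stochastic gradient descent for logistic regression; returns (weights, bias)."""
    w = [0.0] * len(inputs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(inputs, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def sample(n):
    # Hypothetical two-feature records
    return [[rng.uniform(-2, 2), rng.uniform(-2, 2)] for _ in range(n)]

def task1_label(x):  # assumed task of model 1
    return 1 if x[0] + x[1] > 0 else 0

def task2_label(x):  # assumed related task of model 2
    return 1 if x[0] + x[1] > 0.5 else 0

# First processing: Dataset 1 is used to build model 1 (the teacher).
dataset1 = sample(500)
teacher_w, _ = train_logistic(dataset1, [task1_label(x) for x in dataset1], epochs=50)
teacher_norm = math.sqrt(sum(w * w for w in teacher_w))
frozen_copy = list(teacher_w)  # kept only to show the teacher stays unchanged

def teacher_feature(x):
    """Feature learnt from Dataset 1, reused as-is (frozen) by model 2."""
    return sum(w * xi for w, xi in zip(teacher_w, x)) / teacher_norm

# Repurposing: only Dataset 2 and the teacher's learnt weights are touched.
dataset2 = sample(40)
features2 = [[teacher_feature(x)] for x in dataset2]
head_w, head_b = train_logistic(features2, [task2_label(x) for x in dataset2], epochs=300)

def student_predict(x):
    """Model 2: frozen teacher feature followed by a small new head."""
    return 1 if head_w[0] * teacher_feature(x) + head_b > 0 else 0

test_points = sample(200)
accuracy = sum(student_predict(x) == task2_label(x) for x in test_points) / len(test_points)
print(f"student accuracy on the new task: {accuracy:.2f}")
```

In this sketch the only thing that crosses from the first processing to the second is `teacher_w`, i.e. the extracted knowledge, which is exactly the object whose legal status the chapter discusses.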
3.3 Data Leakage in Machine Learning Models
Machine learning models can leak the data used to build them in unexpected ways. The leakage can reveal the membership of certain data points in the training dataset (membership inference attacks) or even allow the reconstruction of a full approximation of the training data (model inversion attacks). In this study, we focus on membership inference attacks, for three reasons: (i) membership inference attacks are cheap to perform, yet they are powerful and can achieve a high success rate; (ii) they can reidentify the data subjects through the output of a model; and (iii) they are not detectable, since the malicious actor uses the model in the same way as any regular benign user, so the leakage can happen unnoticed. In a membership inference attack [5, 21, 23, 29], the adversary tries to identify whether a given data point x or a sample of data points was part of the training dataset D that was used to train the given model f. Revealing that a certain record was used to train a specific machine learning model is a strong indication of private information leakage about the individual data points in the training set. For instance, knowing that a medical record was used to train a deployed machine learning model for diabetes detection may indicate that the person concerned suffers from diabetes. These attacks can be performed in black-box mode, which means that the attacker has only query access to the deployed model, without any inner information about it, just like any regular user of the service. Hence, only the query results are used to infer the membership of data points within the original training set.
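A black-box membership inference attack of this kind can be sketched as follows. This is an illustration under strong assumptions, not a reproduction of any attack from the cited literature: `deployed_model` is a hypothetical, deliberately over-fitted toy classifier that memorizes its training records, the records themselves are synthetic, and the attacker simply guesses "member" when the returned confidence exceeds an assumed threshold.

```python
import math
import random

rng = random.Random(1)

def make_record():
    # Hypothetical two-feature record with a binary diagnosis label
    x = (rng.uniform(0, 1), rng.uniform(0, 1))
    y = 1 if x[0] + x[1] > 1 else 0
    return x, y

training_set = [make_record() for _ in range(50)]  # the dataset D
outsiders = [make_record() for _ in range(50)]     # records the model never saw

def deployed_model(x, bandwidth=0.005, prior=0.1):
    """Toy over-fitted model f: returns a confidence vector [p(class 0), p(class 1)].

    It memorizes the training points, so it is confident only very close to them.
    """
    scores = [prior, prior]
    for (tx, ty) in training_set:
        d2 = (x[0] - tx[0]) ** 2 + (x[1] - tx[1]) ** 2
        scores[ty] = max(scores[ty], prior + math.exp(-d2 / bandwidth ** 2))
    total = sum(scores)
    return [s / total for s in scores]

def infer_membership(query, x, threshold=0.8):
    """Black-box attack: uses only the query result, like any regular user."""
    return max(query(x)) > threshold

member_hits = sum(infer_membership(deployed_model, x) for x, _ in training_set)
outsider_hits = sum(infer_membership(deployed_model, x) for x, _ in outsiders)
print(f"flagged as members: {member_hits}/50 training records, "
      f"{outsider_hits}/50 unseen records")
```

The attacker never inspects the model's internals or the training set; `infer_membership` calls the model exactly as a benign client would, which is why such queries are hard to distinguish from regular use.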
Moreover, membership inference attacks are not limited to a certain type of model or setting; they can be performed under different architectural choices, such as federated learning [24], transfer learning [38], generative models [18], language models [7, 30] and speech recognition models [28]. These attacks exploit the difference between a model’s behaviour on new data points and its behaviour on the already seen data used to build the model (the training data). Models tend to be more confident about the training data; hence, the prediction loss on training data is significantly lower than the prediction loss on an unseen data point. The membership inference attack becomes more accurate when the targeted model suffers from poor generalization, also known as ‘over-fitting’, in which a model performs well on the already seen data but fails to generalize to new data points. An over-fitted model tends to memorize the training data points instead of learning the underlying distribution. Yeom et al. [31, 37] proved that over-fitting is a sufficient condition for performing a membership inference attack but not a
necessary one. In other words, even if the model generalizes well to unseen data, there is still a chance of performing a successful attack. This is because even well-generalized models can memorize rare or unique sequences of training data, exposing minorities in datasets to the risk of such attacks [33]. The model architecture, model type and dataset structure are also factors that can affect the accuracy of the attack. Complex models exhibit a higher membership attack accuracy [25]; in addition, the higher the number of classes in the dataset, the greater the membership leakage [35]. While membership inference attacks can reveal the existence of certain data points in the training set of the model, model inversion attacks [14, 19, 36] go beyond that by trying to create a full approximation of the training data used. Model inversion attacks [14, 15, 19, 20, 36] use the confidence values, such as probability vectors, returned by machine learning models exposed as APIs. These vectors are used to calculate an average that represents a certain class. This practice becomes highly risky from a privacy perspective when a class represents a single individual, which is the case in face recognition tasks, for example. The strategy to reduce the risk of these attacks relies on limiting the adversary’s knowledge by returning the label of the predicted class instead of the confidence vectors, which include the probabilities of the input belonging to each class considered in the task. In addition to the difficulty of detecting those attacks, the failure of a certain attack does not necessarily mean that the targeted model is not prone to a more efficient leakage attack. Thus, from a data flow governance perspective, once the model knowledge is reused, the data owner whose data were used to compute this knowledge loses any governance over what can be revealed about him or her.
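The mitigation mentioned above, returning only the predicted label instead of the confidence vector, can be sketched as follows. The function names (`confidence_api`, `label_only_api`) and the example logits are hypothetical; the sketch only shows what information each kind of API response exposes and makes no claim about how well the mitigation holds up against stronger attacks.

```python
import math

def softmax(logits):
    """Turn raw model scores into a probability (confidence) vector."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_api(logits):
    """Exposes the full confidence vector to the caller."""
    return softmax(logits)

def label_only_api(logits):
    """Exposes only the index of the predicted class."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical raw scores for a record seen in training and for an unseen one
seen_logits = [0.2, 6.0, 0.1]
unseen_logits = [0.2, 1.0, 0.1]

# The confidence vectors differ clearly between the two records ...
print(confidence_api(seen_logits)[1], confidence_api(unseen_logits)[1])
# ... whereas the label-only responses are identical.
print(label_only_api(seen_logits), label_only_api(unseen_logits))
```

With the label-only response, the gap in confidence between the two records, which an adversary would otherwise read directly from the API output, is no longer part of what the service returns.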
3.4 Legal Analysis of Repurposing Machine Learning Models
Legal definitions relating to the use of data, including primary use, secondary use and reuse of health data, as contained in the General Data Protection Regulation (GDPR), the Data Governance Act (DGA) and the European Health Data Space (EHDS) proposal, are overly broad and contradictory; they therefore leave wide margins for interpretation. The fact that there is no harmonized or conclusive definition of the terms ‘secondary use’, ‘reuse’, ‘further use’ or ‘repurposing of data’ gives rise to a variety of interpretations and applications of the relevant regulatory instruments, which causes legal uncertainty. Moreover, what is considered to constitute ‘data’ in the context of ‘secondary use’ is also unclear. In order to provide some clarification in this regard, we investigate and analyse the provisions contained in the European GDPR, DGA and EHDS, including scholarly opinions about these issues.
3.4.1 General Data Protection Regulation
The key data protection legislation in the EU is the GDPR. The GDPR only provides for the ‘use’ of data as one of the actions that constitute ‘processing’ of data, in which case the GDPR is applicable. Except for mentioning the term ‘further use’ in passing, the GDPR offers no definition of or elaboration on the term ‘use’. In the context of processing health data, the GDPR defines ‘data concerning health’ as personal data that relate to the physical or mental health of a natural person, including the provision of healthcare services, and that reveal information about his or her health status. As will be discussed below, some of these ‘data concerning health’ may be contained in electronic health records (EHR), the use of which is dealt with extensively in the EHDS. In terms of Article 9 of the GDPR, all health-related data qualify as special personal data, the processing of which is strictly prohibited unless the explicit consent of the data subject has been obtained for the processing thereof with regard to one or more specified purposes. Article 5 of the GDPR makes clear that such specified processing purposes must also be explicit and legitimate, and that no further processing may take place if it is incompatible with the specified purposes. However, it does provide exemptions that allow further processing to proceed without consent if such processing is in the public interest or serves scientific or historical research or statistical purposes; such purposes shall not be considered incompatible with the initial purposes, as contemplated in Article 89(1). From this so-called purpose limitation clause, it may seem as if the GDPR does not allow for any secondary use or reuse of health data unless such use is compatible with the initial processing purpose for which the data were collected.
However, although there is no definition in the GDPR for the term ‘further processing’, Recital 50 provides some guidance in this regard by stating that ‘in order to ascertain whether a purpose of further processing is compatible with the purpose for which the personal data are initially collected, the controller, after having met all the requirements for the lawfulness of the original processing, should take into account, inter alia: any link between those purposes and the purposes of the intended further processing; the context in which the personal data have been collected, in particular, the reasonable expectations of data subjects based on their relationship with the controller as to their further use; the nature of the personal data; the consequences of the intended further processing for data subjects; and the existence of appropriate safeguards in both the original and intended further processing operations’ (GDPR, Recital 50). Recital 50 thus provides for the ‘further use’ of data on condition that such use is compatible with the initial purpose(s) of the processing of the same data. Any so-called secondary use or reuse of health data in terms of the GDPR is thus still strictly tied to the purpose of the initial processing, the reasonable expectations of the data subject and the implementation of a number of technical and organizational safeguards. It is in this context that pseudonymization is mentioned as a technical method to ensure the security of personal data (GDPR, Art. 89(1)).
In the context of the machine learning model repurposing discussed above, explicit consent is legally required for the processing of any health data forming part of Dataset 1 in Model 1. Any further use or processing of the output data, obtained as knowledge extracted through this first processing, will, in terms of the GDPR, only be allowed if the purpose of such further use is compatible with the purpose to which the data subject consented. However, machine learning models can leak data: such leakage can reveal the membership of certain data points in the training dataset and, via the reconstruction of a full approximation of the training data, can possibly reidentify individual data subjects or disclose details about their health conditions. Considering this risk, it is highly advisable to obtain further consent for any further use of the mentioned output data in the case of personal health data. It is for scenarios such as these that the GDPR provides very strict safeguards, on both a technical and an organizational level, for the privacy protection of health data, which qualify as special personal information. Regardless of the criticism of the complexity and vagueness of the GDPR, many scholars agree that this sophisticated regulation still offers the best protection for personal data [9]. Below we show to what extent the DGA and the EHDS proposal expand on these principles by providing for the secondary use and reuse of health data if such data are anonymized.
3.4.2 Data Governance Act
The Data Governance Act (DGA) is introduced as another main pillar of the EU data strategy and is anticipated to come into effect in September 2023. The first of the four main goals of the DGA relates to the reuse of data and explains how this regulation lays out ‘conditions for the re-use, within the Union, of certain categories of data held by public sector bodies’ [12]. Along with this goal, it also establishes data intermediation services, builds a framework for voluntary data registration for altruistic purposes and aims to establish a European Data Innovation Board. However, the main goal of this instrument is to regulate the reuse of data, and it defines the term ‘reuse’ as ‘the use by natural or legal persons of data held by public sector bodies, for commercial or noncommercial purposes other than the initial purpose within the public task for which the data were produced, except for the exchange of data between public sector bodies purely in pursuit of their public tasks’ [12]. The DGA thus expands further use, called reuse in this instrument, to ‘purposes other than the initial purpose’, exclusively for data held by public sector bodies, on condition that the protected nature of such data is preserved and that access to data for reuse purposes is only granted if the public sector body ensures that such data have been ‘anonymized, in the case of personal data; and modified, aggregated or treated by any other method of disclosure control, in the case of commercially confidential information, including trade secrets or content protected by intellectual property rights’ (Article 5(3), DGA). Accordingly, public sector bodies may only provide access to deidentified personal health data. In addition, recipients of such deidentified personal health
data are further prohibited by Article 5(5) of the DGA from ‘re-identifying any data subject to whom the data relates’ and are obligated to take technical and operational measures to prevent any reidentification. This provision can thus be interpreted to mean that even if it is technically possible to reidentify an individual, the recipient of deidentified data is legally prohibited from doing so. In contrast, the GDPR does not explicitly prohibit the reidentification of a data subject, but rather provides guidance with regard to the ‘means reasonably likely to be used’ and ‘taking into consideration the available technology at the time of the processing and technological developments’ to determine whether a natural person is directly or indirectly identifiable (Recital 26, GDPR). Any reuse of personal health data obtained from public sector bodies in terms of the DGA thus carries with it an extra layer of privacy protection through the prohibition of reidentification. Furthermore, considering that personal health information that arises from a doctor-patient relationship is considered to be confidential between those parties, Article 5(8) of the DGA also prescribes that public sector bodies ‘shall ensure that the confidential data is not disclosed as a result of allowing re-use’. The DGA therefore does not create an obligation to allow the reuse of data held by public sector bodies, but rather provides conditions to consider when those bodies ultimately decide to make data in their possession accessible for reuse in accordance with the purposes and scope of such access.
If personal health data is obtained from public sector bodies, such as a public hospital, and used or reused as input data in Model 1, it is important to note that such input data will already be deidentified and/or anonymized, as the DGA requires personal data to be anonymized before transmission to the recipient, specifically to prevent the identification of the data subjects. Consent for the use or reuse of such data is consequently not necessary in terms of either the GDPR or the DGA. The output data of Model 1, which can be used as input for Model 2, would necessarily also be anonymized and would likewise not need the consent of the data subject. With regard to non-personal data and pseudonymized data, the latter of which retains its status as personal data, the DGA still provides that such data may only be transmitted where there is no reason to believe that the combination of non-personal datasets would lead to the identification of data subjects. If deidentified or anonymized data do not suit the needs of recipients, the public sector bodies are obligated to facilitate the process of requesting consent from the relevant data subjects to reuse their personal health data. In this context, the DGA provides even better protection of personal health information for purposes of reuse.
3.4.3 European Health Data Space Proposal
Because an abundance of personal health information is contained in the EHR of private health institutions, the DGA will not be applicable to such data. The EHDS proposal accordingly further expands on the GDPR, the DGA and the Data Act by specifically providing for the secondary use of electronic health data. Where the DGA lays down generic conditions for the reuse of public sector data, the Data
Act aims to enhance the portability of certain user-generated data, including health data, but does not provide rules for all health data. The EHDS is intended to fill this regulatory gap by providing rules specifically tailored to the health sector. In this context, the EHDS defines ‘secondary use of electronic health data’ as the processing of electronic health data, which may include personal electronic health data initially collected in the context of primary use, but also electronic health data collected for the purpose of the secondary use. The resulting uncertainty about exactly which data fall under secondary use bears heavily on whether consent was obtained from the patient for the use of such data and whether one can reuse such data for purposes other than the initial or primary purpose of obtaining healthcare or medical treatment. Both the EDPB and the EDPS have expressed their concerns about the confusion that may result from this definition of ‘secondary use’, which may contribute to legal uncertainty in Member States, especially considering that the term ‘secondary use’ is not defined in the GDPR, and recommended that the definition be amended or supplemented with a clearer one [1]. They further recommended that the link between the terms ‘secondary use’ in the EHDS and ‘further use’ in the GDPR be clarified to support the implementation of these terms as contemplated in the EHDS proposal and the GDPR [1]. The main aim of the EHDS is to improve cooperation between Member States via harmonized codes of conduct and to ensure that such codes ‘minimize differences in GDPR interpretation and implementation and increase data quality, making the data findable, accessible, interoperable, and reusable (FAIR)’, as was mentioned in a workshop held by the European Medicines Agency [2].
The EHDS proposal further aims to achieve its goal of facilitating the reuse of health data by means of a revolutionary set of systems such as EHR and a digital infrastructure for cross-border data sharing called MyHealth@EU [11]. One of the most important ways in which the EHDS facilitates the secondary use of health data is its effort to enhance security and trust in the technical framework designed to enable the exchange of electronic health data for both primary and secondary use and to establish mechanisms for data altruism in the health sector. Electronic health data in this regard may consist of electronic health records, genomics data or patient registries, to name a few. Chapter IV of the EHDS proposal is dedicated to the domain-specific regulation of the secondary use of electronic health data for purposes such as research, innovation, policymaking, patient safety or regulatory activities [10]. An innovation brought about by the EHDS is the requirement that Member States must set up a health data access body to evaluate, consider and authorize any requests for secondary use of electronic health data. In addition to the standard applicable privacy regulations contained in the GDPR and the DGA as discussed above, such a health data access body provides yet another layer of patient and privacy protection, as the body may be able to consider ethical and even social, economic or political issues that may affect or influence the secondary use of data, considerations which may go undetected if the request is assessed from a purely normative point of view. Article 34 states that ‘health data access bodies shall only provide access to electronic health data referred to in Article 33 where the intended
purpose of processing pursued by the applicant complies with “training, testing, and evaluating of algorithms, including in medical devices, AI systems and digital health applications, contributing to the public health or social security, or ensuring high levels of quality and safety of health care, of medicinal products or of medical devices.”’ In this regard, it is important to note that Article 33(4) allows electronic health data which include protected intellectual property and trade secrets from private enterprises to be made available for secondary use on condition that ‘all measures necessary to preserve the confidentiality of IP rights and trade secrets shall be taken’. This article does not provide further details with regard to the specific and acceptable measures to be taken, but we can accept that the health data access body will play an instrumental role in evaluating such measures and in deciding whether they are acceptable in the specific circumstances. Consequently, if personal health data is obtained from EHR, such access and the purpose of the secondary use will in all probability be evaluated and considered by the health data access body, which will also consider the technical processing that the potential recipient of such data will use during secondary use, including model repurposing. If the health data access body thus authorizes access to and use of personal health data for the stated purposes, including the use of certain technologies during such secondary use, it should theoretically be safe to proceed with the suggested processing as discussed above. Although the EHDS proposal is not yet legally binding, it is a powerful document which clearly shows the European Commission’s attitude towards the sharing and reuse or secondary use of health data.
The European Parliamentary Research Service, in its briefing document about the EHDS, stated that ‘the European Commission’s proposal for a regulation on a European health data space aims to improve individuals’ access to and control of their electronic personal data (primary use) while facilitating data re-use for societal good across the EU (secondary use)’ [34].
3.5 Scholarly Opinions
Until the above legislative instruments harmonize their terminology, we have to rely on the insights of various expert scholars in this field to gain a better understanding of the different terms referred to above. Hansen et al. defined the term ‘secondary use’ as using the data for a purpose other than the primary purpose of collecting the data, which in healthcare is usually to give care to patients [17]. Geissbuhler et al. mentioned that secondary use and reuse are terms which could be used interchangeably and that their definition relies on the use of data for ‘purposes other than direct care of the patient’ [16]. Becker et al. remarked on the interchangeable usage of the terms ‘secondary use’, ‘data reuse’ and ‘repurposing’ and decided to proceed with the term secondary use to avoid confusion. They continue with their definition, ‘by which we
mean any use of the data beyond the scope for which the data was initially collected or generated’ [4]. Pasquetto et al. approached the concept of data reuse in scientific research studies and defined ‘first use’ as the use by an individual who asked a specific research question. When that dataset is stored in a repository, retrieved by someone else and deployed for another project, it would usually be considered a ‘reuse’. It is clear that they made this differentiation based on the scientist who uses the data; if the scientist changes, then they consider any further use as a reuse of these data [26]. Using data for research purposes creates another related term called ‘secondary research’. This term is defined by Peloquin et al. as ‘research conducted using data or bio-specimens collected either for research studies other than the proposed research or for non research purposes, such as clinical care’ [27]. Custers et al. focused on data protection impact assessments and examined data reuse in this context. They came up with different types of reuse based on the context and purposes of the reuse of such data. Their classification comprised ‘data recycling: using data several times for the same purpose in the same context; data re-purposing: using data for different purposes than for which they were initially collected, but still in the same context as the original purpose; data re-contextualization: using data in another context than for which they were initially collected’ [13].
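The three categories quoted from Custers et al. can be read as a decision over two questions: is the context unchanged, and is the purpose unchanged? The following is a minimal sketch of that reading, not a tool from the cited work. The names `ReuseType` and `classify_reuse` are hypothetical, and treating any change of context as re-contextualization, whether or not the purpose also changes, is an assumption made for illustration.

```python
from enum import Enum


class ReuseType(Enum):
    """Reuse categories named in the classification quoted from [13]."""
    RECYCLING = "data recycling"
    REPURPOSING = "data re-purposing"
    RECONTEXTUALIZATION = "data re-contextualization"


def classify_reuse(same_purpose: bool, same_context: bool) -> ReuseType:
    """Map a reuse scenario onto one of the three categories.

    Assumption (not stated in the source): a change of context is treated
    as re-contextualization regardless of whether the purpose changed.
    """
    if not same_context:
        return ReuseType.RECONTEXTUALIZATION
    if same_purpose:
        return ReuseType.RECYCLING
    return ReuseType.REPURPOSING


# Hypothetical scenario: same clinical context, new purpose.
scenario = classify_reuse(same_purpose=False, same_context=True)
```

Under this reading, the hypothetical scenario above falls under data re-purposing. The two-question framing is deliberately coarse and says nothing about the legal consequences attached to each category.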
3.6 Discussion
Based on the distinction laid out by Custers et al., data recycling, data repurposing and data re-contextualization scenarios will have different legal meanings and outcomes [13]. Thus, it is important to determine which of these terms and definitions applies to the use case discussed in this chapter and what the legal consequences will be under current EU privacy laws. To this end, the relevant provisions of the GDPR must be considered, as we have discussed in this chapter. According to Article 9 of the GDPR, apart from the exhaustively listed conditions, health data can only be processed by obtaining the data subject’s consent. This article stipulates that it must be ‘explicit consent to the processing of those personal data for one or more specified purposes’ (GDPR Article 9(2)(a)), with consent being defined as ‘any freely given, specific, informed and unambiguous indication of the data subject’s wishes by which he or she, by a statement or by a clear affirmative action, signifies agreement to the processing of personal data relating to him or her’ (GDPR Article 4(11)). Given this definition, and considering the provisions contained in Article 5 such as purpose and storage limitation, the GDPR clearly does not allow data to be processed and used for purposes beyond those consented to initially. However, it is not clear where the line is drawn. In a situation of data recycling, we can conclude that initial consent obtained from the data subject would be sufficient to use the data in the same
context and for the stipulated purpose. However, in the case of data repurposing, data will be used for a purpose other than the one for which the data subject gave his or her initial consent, but in the same context as the initial processing. Here, the discussion turns to the question of ‘whether or not initial consent would be sufficient to consider this processing a lawful processing under GDPR’. When we analysed the relevant EU laws, we found that in our use case the legal link between the data subject’s consent and the data processing purposes is even weaker than in data repurposing. Machine learning models, which were contemplated in the opening of this chapter, open another dimension in this debate. In our use-case scenario, output data are in the form of features and patterns extracted from the first model and used to train another model. The second model is able to reach conclusions by using these features, which were initially extracted by the first model. The question of whether or not data subjects’ initial consent would be sufficient to render this processing lawful does not have a simple or straightforward answer. There is no doubt that output data is an inference resulting from running models on the input data, and thus, it is no longer considered to be the initial personal data which constituted the input data. However, as we explained in Sect. 3.3, data subjects can be identified from these output data by means of privacy attacks that exploit these outputs. Meanwhile, assessing and measuring this type of leakage is a challenging technical task. At the time of writing this chapter, data scientists do not have the technical capability to provide precise guarantees about the resilience of any repurposing technique against this type of leakage. Thus, the output of repurposed machine learning models remains a potential source of information leakage about the input data.
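To make the use case above more concrete, the following is a minimal toy sketch, assuming NumPy and entirely synthetic data, of a second model trained only on features produced by a first model. It is not the authors' setup and it is not one of the privacy attacks referred to in Sect. 3.3. All names (`model1_features`, `W1`, `w2`, the record index 17) are hypothetical. The final step only illustrates why output features can remain linked to individual records: they are a deterministic function of the input, so someone who holds candidate records and can query the first model may be able to match a feature vector back to its record.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for input data: 200 records with 10 numeric attributes.
X = rng.normal(size=(200, 10))
# Hypothetical label for the second model's task.
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# "Model 1": a frozen feature extractor. Its weights are random here and
# merely stand in for a model that is assumed to have been trained already.
W1 = rng.normal(size=(10, 6))


def model1_features(x: np.ndarray) -> np.ndarray:
    """Return the output data of Model 1 (a 6-dimensional feature vector)."""
    return np.tanh(x @ W1)


F = model1_features(X)  # Model 2 only ever sees these features, never X.

# "Model 2": logistic regression trained on the features by gradient descent.
w2 = np.zeros(F.shape[1])
b2 = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(F @ w2 + b2)))
    grad = p - y
    w2 -= 0.1 * (F.T @ grad) / len(y)
    b2 -= 0.1 * grad.mean()

accuracy = float(((1.0 / (1.0 + np.exp(-(F @ w2 + b2))) > 0.5) == y).mean())

# Toy linkage check: match one released feature vector against the features
# of candidate records recomputed with Model 1.
target = F[17]
distances = np.linalg.norm(model1_features(X) - target, axis=1)
linked_record = int(np.argmin(distances))
```

In this toy setting the nearest candidate is the originating record itself, because the features are recomputed exactly. Real attacks operate under weaker assumptions and are considerably harder to carry out and to measure, which is the point made in the text about the absence of precise guarantees.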
Nonetheless, whether a separate consent is required for the usage of output data obtained from repurposed models remains unclear, and the decision cannot be based on solid proof. Classifying the repurposing of machine learning models under the classification proposed by Custers et al. is not an easy task [13]. Our use case works in a manner which is more complex than the given classification. On the one hand, it can be argued that the model does not use personal data but only repurposes the output data. On the other hand, the linkage between the input data and the second model is still intact, and thus, input data can still be reached. This leads to the question of whether the knowledge extracted during the first usage is still considered to be covered or processed under the terms of the first consent or needs another separate consent to legalize reuse, especially given the fact that this knowledge is in fact output data of the initial machine learning model. From the above-mentioned legal analysis, it seems that an actual usage or reusage of data is directly linked to the actual raw or input data. This interpretation causes the practice of knowledge transfer to bypass the existing regulations, as discussed above, unless the first consent explicitly covers the input and the output data of the first processing purpose. To elaborate, if the consent terms and purposes cover the initially collected data and any knowledge or statistical aggregations obtained as an
output from processing these data, then the reuse of the output data resulting from the first processing will be constrained by these terms. The risks of using health data in machine learning must be communicated, and more advanced safeguarding measures must be prescribed. However, we stress that even with state-of-the-art technical recommendations to reduce the success of the previously mentioned leakage attacks, these risks remain. Thus, for more informed consent, we believe that these risks should be studied and communicated properly to the data subjects, public sector bodies or private health institutions. Interpretation of this phenomenon under current EU legislation is another challenge considered by this chapter. In their article, Custers et al. concluded that the law neither makes a distinction between the different uses of data nor sets out their boundaries [13]. However, their article was written when the EU Data Protection Directive 95/46/EC was the only applicable law. Regardless, this conclusion still stands, as the GDPR does not differentiate between the different uses of data either. Considering the purpose limitation principle and the conditions of consent, we conclude that the GDPR’s philosophy is focused on legalizing each data processing operation and in this sense does not regulate the instance of secondary use. As we explained above, further processing is mentioned in a manner which ties closely with the initial processing and is foreseen under specific circumstances; thus, it does not meet the current complexity of data usage in today’s data economy. Novel legislation such as the DGA and the EHDS shows us that the European Commission’s approach is shifting in a different direction. To elaborate, these regulations, unlike the GDPR, adopt a more permissive methodology. However, these two regulatory instruments can also be criticized for not regulating in a detailed manner.
We argue that ambiguous regulations could cause discrepancies in the free movement of data to data controllers and thus jeopardize data subjects’ rights. We must also consider the imbalance between data subjects and data controllers, as data subjects have limited resources once they have given up control of their personal data, and the phenomenon that we discussed in this chapter should not be used as another tool by data controllers.
3.7 Conclusion
One of the main sources of fuel for the modern economy is data. Thus, extracting the maximum benefits from data is a key goal of today’s economy. The secondary use of data, or, as we mentioned, other versions of this term such as the ‘transfer learning’ term used by machine learning practitioners, is one of the tools which can serve this goal. As we demonstrated throughout this chapter, legislative actions are moving towards amplifying the usage of data in this manner. Applying FAIR principles to the policymaking process is an indication of this movement. In one regard, the advantages of this kind of use cannot be ruled out. Increasing the number of repurposed models/technologies can decrease the processing of new personal
data and this will serve the data minimization principle. Moreover, offering better healthcare services with the assistance of new technologies will greatly benefit almost every part of society. However, the other side of the coin is quite different: implementing this solution in the case of machine learning systems through model repurposing raises many difficulties from the legal standpoint. This is due to the particularity of this practice compared to the classical reuse of data. Ensuring the security of this process is technically challenging in many respects. Additionally, repurposing machine learning models could affect the application of the data subject’s rights. It complicates a situation that is already hard to handle in real life. As we discussed throughout the chapter, detecting personal data after it is used in a machine learning model is, technically, a challenging task. As we analysed in the technical parts of this chapter, data scientists cannot provide solid guarantees that this practice will not lead to a reidentification of the data subjects. It is even harder to implement the right to rectification after personal data are used in this manner. The ambiguity arising from the legal instruments does not help the situation. For this reason, the authors of this chapter believe that clearer definitions or guidelines must be provided to reach a better data, data reuse and data sharing ecosystem. The authors stress that these risks must be communicated to the data subjects in order to help them give more informed consent.
Acknowledgments This work has been funded by the European Union’s Horizon 2020 Innovative Training Networks, Legality Attentive Data Scientists (LeADS) under Grant Agreement ID 956562.
References
1. EDPB-EDPS Joint Opinion 03/2022 on the Proposal for a Regulation on the European Health Data Space (Jul 2022), https://edpb.europa.eu/system/files/2022-07/edpb_edps_jointopinion_202203_europeanhealthdataspace_en.pdf
2. European Medicines Agency: GDPR and the secondary use of health data - Report from EMA workshop held with the EMA Patients’ and Consumers’ Working Party (PCWP) and Healthcare Professionals’ Working Party (HCPWP) on 23 September 2020. Tech. rep. (Dec 2020)
3. Amin, J., Sharif, M., Yasmin, M., Saba, T., Anjum, M.A., Fernandes, S.L.: A new approach for brain tumor segmentation and classification based on score level fusion using transfer learning. Journal of medical systems 43(11), 1–16 (2019)
4. Becker, R., Chokoshvili, D., Comandé, G., Dove, E., Hall, A., Mitchell, C., Molnar-Gabor, F., Nicolás, P., Tervo, S., Thorogood, A.: Secondary use of personal health data: when is it ‘further processing’ under the GDPR, and what are the implications for data controllers? Available at SSRN 4070716 (2022)
5. Bernau, D., Grassal, P., Robl, J., Kerschbaum, F.: Assessing differentially private deep learning with membership inference. CoRR abs/1912.11328 (2019), http://arxiv.org/abs/1912.11328
6. Cao, M., Zhao, T., Li, Y., Zhang, W., Benharash, P., Ramezani, R.: ECG heartbeat classification using deep transfer learning with convolutional neural network and stft technique. arXiv preprint arXiv:2206.14200 (2022)
7. Carlini, N., Liu, C., Erlingsson, Ú., Kos, J., Song, D.: The secret sharer: Evaluating and testing unintended memorization in neural networks. In: 28th USENIX Security Symposium (USENIX Security 19). pp. 267–284. USENIX Association, Santa Clara, CA (Aug 2019), https://www.usenix.org/conference/usenixsecurity19/presentation/carlini
8. Chronopoulou, A., Baziotis, C., Potamianos, A.: An embarrassingly simple approach for transfer learning from pretrained language models. arXiv preprint arXiv:1902.10547 (2019)
9. Comandè, G., Schneider, G.: Can the GDPR make data flow for research easier? Yes it can, by differentiating! A careful reading of the GDPR shows how EU data protection law leaves open some significant flexibilities for data protection-sound research activities. Computer Law & Security Review 41, 105539 (Jul 2021). https://doi.org/10.1016/j.clsr.2021.105539, https://www.sciencedirect.com/science/article/pii/S0267364921000121
10. European Commission: A European Strategy for data | Shaping Europe’s digital future, https://digital-strategy.ec.europa.eu/en/policies/strategy-data
11. European Commission: Proposal for a REGULATION OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL on the European Health Data Space (2022), https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A52022PC0197
12. European Commission: Regulation (EU) 2022/868 of the European Parliament and of the Council of 30 May 2022 on European data governance and amending Regulation (EU) 2018/1724 (Data Governance Act) (May 2022), http://data.europa.eu/eli/reg/2022/868/oj/eng
13. Custers, B., U Vrabec, H., Friedewald, M.: Assessing the Legal and Ethical Impact of Data Reuse:. European Data Protection Law Review 5(3), 317–337 (2019). https://doi.org/10.21552/edpl/2019/3/7, http://edpl.lexxion.eu/article/EDPL/2019/3/7
14. Fredrikson, M., Jha, S., Ristenpart, T.: Model inversion attacks that exploit confidence information and basic countermeasures. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. p. 1322–1333. CCS ’15, Association for Computing Machinery, New York, NY, USA (2015). https://doi.org/10.1145/2810103.2813677
15. Fredrikson, M., Jha, S., Ristenpart, T.: Model inversion attacks that exploit confidence information and basic countermeasures. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. p. 1322–1333. CCS ’15, Association for Computing Machinery, New York, NY, USA (2015). https://doi.org/10.1145/2810103.2813677
16. Geissbuhler, A., Safran, C., Buchan, I., Bellazzi, R., Labkoff, S., Eilenberg, K., Leese, A., Richardson, C., Mantas, J., Murray, P., De Moor, G.: Trustworthy reuse of health data: A transnational perspective. International Journal of Medical Informatics 82(1), 1–9 (Jan 2013). https://doi.org/10.1016/j.ijmedinf.2012.11.003, https://www.sciencedirect.com/science/article/pii/S138650561200202X
17. Hansen, J., Wilson, P., Verhoeven, E., Kroneman, M., Verheij, R., van Veen, E.B.: Assessment of the EU Member States’ rules on health data in the light of GDPR (2021), https://data.europa.eu/doi/10.2818/546193
18. Hayes, J., Melis, L., Danezis, G., Cristofaro, E.D.: LOGAN: evaluating privacy leakage of generative models using generative adversarial networks. CoRR abs/1705.07663 (2017), http://arxiv.org/abs/1705.07663
19. He, Z., Zhang, T., Lee, R.B.: Model inversion attacks against collaborative inference. In: Proceedings of the 35th Annual Computer Security Applications Conference. p. 148–162. ACSAC ’19, Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3359789.3359824
20. He, Z., Zhang, T., Lee, R.B.: Model inversion attacks against collaborative inference. In: Proceedings of the 35th Annual Computer Security Applications Conference. p. 148–162. ACSAC ’19, Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3359789.3359824
21. Jia, J., Salem, A., Backes, M., Zhang, Y., Gong, N.Z.: Memguard: Defending against black-box membership inference attacks via adversarial examples. CoRR abs/1909.10594 (2019), http://arxiv.org/abs/1909.10594
22. Kindermans, P.J., Tangermann, M., Müller, K.R., Schrauwen, B.: Integrating dynamic stopping, transfer learning and language models in an adaptive zero-training erp speller. Journal of neural engineering 11(3), 035005 (2014)
23. Li, J., Li, N., Ribeiro, B.: Membership inference attacks and defenses in supervised learning via generalization gap. ArXiv abs/2002.12062 (2020)
24. Melis, L., Song, C., Cristofaro, E.D., Shmatikov, V.: Inference attacks against collaborative learning. CoRR abs/1805.04049 (2018), http://arxiv.org/abs/1805.04049
25. Nasr, M., Shokri, R., Houmansadr, A.: Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning. In: 2019 IEEE Symposium on Security and Privacy (SP). pp. 739–753 (2019). https://doi.org/10.1109/SP.2019.00065
26. Pasquetto, I., Randles, B., Borgman, C.: On the Reuse of Scientific Data. Data Science Journal 16(0), 8 (Mar 2017). https://doi.org/10.5334/dsj-2017-008, http://datascience.codata.org/articles/10.5334/dsj-2017-008/, Ubiquity Press
27. Peloquin, D., DiMaio, M., Bierer, B., Barnes, M.: Disruptive and avoidable: GDPR challenges to secondary research uses of data. European Journal of Human Genetics 28(6), 697–705 (Jun 2020). https://doi.org/10.1038/s41431-020-0596-x, https://www.nature.com/articles/s41431-020-0596-x, Nature Publishing Group
28. Shah, M.A., Szurley, J., Mueller, M., Mouchtaris, A., Droppo, J.: Evaluating the Vulnerability of End-to-End Automatic Speech Recognition Models to Membership Inference Attacks. In: Proc. Interspeech 2021. pp. 891–895 (2021). https://doi.org/10.21437/Interspeech.2021-1188
29. Shokri, R., Stronati, M., Song, C., Shmatikov, V.: Membership inference attacks against machine learning models. In: 2017 IEEE symposium on security and privacy (SP). pp. 3–18. IEEE (2017)
30. Song, C., Shmatikov, V.: Auditing data provenance in text-generation models. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. p. 196–206. KDD ’19, Association for Computing Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3292500.3330885
31. Song, C., Shmatikov, V.: Overlearning reveals sensitive attributes. arXiv preprint arXiv:1905.11742 (2019)
32. Tang, X., Du, B., Huang, J., Wang, Z., Zhang, L.: On combining active and transfer learning for medical data classification. IET Computer Vision 13(2), 194–205 (2019)
33. Thakkar, O.D., Ramaswamy, S., Mathews, R., Beaufays, F.: Understanding unintended memorization in language models under federated learning. In: Proceedings of the Third Workshop on Privacy in Natural Language Processing. pp. 1–10. Association for Computational Linguistics, Online (Jun 2021). https://doi.org/10.18653/v1/2021.privatenlp-1.1, https://aclanthology.org/2021.privatenlp-1.1
34. Thierry, E.C.: BRIEFING - European health data space. Tech. rep., EPRS | European Parliamentary Research Service (Sep 2022)
35. Truex, S., Liu, L., Gursoy, M.E., Yu, L., Wei, W.: Towards demystifying membership inference attacks. CoRR abs/1807.09173 (2018), http://arxiv.org/abs/1807.09173
36. Wu, X., Fredrikson, M., Jha, S., Naughton, J.F.: A methodology for formalizing model-inversion attacks. 2016 IEEE 29th Computer Security Foundations Symposium (CSF) pp. 355–370 (2016)
37. Yeom, S., Fredrikson, M., Jha, S.: The unintended consequences of overfitting: Training data inference attacks. CoRR abs/1709.01604 (2017), http://arxiv.org/abs/1709.01604
38. Zou, Y., Zhang, Z., Backes, M., Zhang, Y.: Privacy analysis of deep learning in the wild: Membership inference attacks against transfer learning. CoRR abs/2009.04872 (2020), https://arxiv.org/abs/2009.04872
Chapter 4
“We Are the Makers of Manners”: A Grounded Approach to Data Ethics for the Built Environment
Janis Wong, Yusra Ahmad, and Sue Chadwick
4.1 Introduction
Over the past decade, digital technologies have permeated the fabric of our society. This includes the real estate and property sector. From the Internet of Things (IoT) to smart city infrastructures, citizens pass through, and often rely on, digital networks to navigate, explore, and live in both urban and rural environments. In addition to using such data, citizens as data subjects also knowingly or unknowingly provide vast amounts of data that track their movements within these landscapes. Whilst data protection regulations such as the EU General Data Protection Regulation (GDPR) 2016 [1] and the UK Data Protection Act 2018 [2] have mandated that citizens’ data are safeguarded with enshrined protections and rights, data protection
Yusra Ahmad and Sue Chadwick contributed equally with all other contributors.
J. Wong, The Alan Turing Institute, London, UK; RED Foundation, London, UK; e-mail: [email protected]
Y. Ahmad, RED Foundation, London, UK; Acuity Data, London, UK; e-mail: [email protected]
S. Chadwick, RED Foundation, London, UK; Pinsent Masons LLP, London, UK; e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Schiffner et al. (eds.), Privacy Symposium 2023, https://doi.org/10.1007/978-3-031-44939-0_4
alone is insufficient to ensure transparency, accountability, and trust in our built environment. Our paper aims to address the gap between, on the one hand, the digitisation of the real estate sector and its response to data protection practices and, on the other, citizens’ expectations in the context of the property and built environment of which they are part. As part of the real estate sector, as the “maker of manners” [3], we take an introspective approach to assess the regulation, norms, and practices adopted within the real estate sector. First, in Sect. 4.2, we outline the landscape of the real estate sector to explore the opportunities and challenges of real estate in our data-driven society, in particular the sector’s response and reaction to existing data protection regulation and data-related policies. In Sect. 4.3, we examine specific data and data protection challenges within the real estate sector through three case studies. Finally, to address these challenges, in Sect. 4.4, we consolidate examples of current research, guidelines, and practices that aim to minimise these obstacles. In doing so, we introduce The RED Foundation’s Data Ethics Playbook as an ongoing attempt to ensure that data ethics is considered in the real estate sector. Whilst all authors are based in the UK, and our work draws from local examples and policy development, the principles outlined in the Playbook aim to be widely applicable beyond the UK’s real estate sector.
4.1.1 The RED Foundation
The authors of this paper are part of The RED Foundation, an initiative set up to ensure that the real estate sector benefits from an increased use of data, avoids some of the risks that this presents, and is better placed to serve society. This is done by connecting people, projects, and initiatives around the topic of data in the built environment and raising the sector’s engagement with the ethical challenges that the use of data can present. The Foundation is a not-for-profit organisation whose membership is made up of individuals with expertise representing the full real estate lifecycle, including data practitioners, solicitors, researchers, industry specialists, and startups. They are divided into three steering groups covering data ethics, data standards, and academia. Activities of these steering groups are overseen by an overarching committee that ensures that value and impartiality are delivered and facilitates access to key decision-making bodies within the industry to help drive the necessary change.
4.2 The Real Estate Sector: The Landscape, Data, and PropTech
Traditionally, whether in consideration of social, political, or legal norms, the real estate sector can be segmented into residential (for living), commercial (for
4 “We Are the Makers of Manners”: A Grounded Approach to Data Ethics. . .
business), industrial (for manufacturing and production), and infrastructure (the roads, rails, and increasingly the digital connections that link them all together) [4]. However, the growing application of technology in property’s physical and digital infrastructures has begun to blur those lines [5]. This section introduces how the real estate sector has changed in the context of this transformation; maps out the data that is collected, used, and potentially shared within the sector; and outlines more recent developments in property and planning technology (PropTech [6] and PlanTech [7]) as an industry. As part of this literature review, we draw from our domain expertise across The RED Foundation as well as survey academic, industry, and grey literature.
4.2.1 The Property Landscape
When referring to the real estate industry and property, there are many points of view as to what is being referred to. This is primarily due to “real estate” generally being used to discuss a specific domain within the industry. Fundamentally, the real estate sector should be viewed from two angles: as an asset type and as a lifestyle or lifecycle. The first of these is asset type, which relates to land or property that is under construction, purpose-built, or being used for a specific purpose (e.g. an office, residential property, or warehouse). Whilst all asset types require similar maintenance and management services, the nuances of what is required and the regulations that apply differ sufficiently to require specific experience and research. Beyond this, we must consider the real estate lifecycle, starting at the point of a transaction for a sale or lease of an asset. Broadly, the four main parts of the lifecycle can be attributed to acquisition, planning, construction, and occupation [8]. Within the lifecycle, regardless of asset type, there are two primary stakeholder perspectives: that of the landlord or controller of the space and that of the tenant or occupier. The real estate industry is ultimately focused on servicing the needs of these two parties, whether individually or in how they interact with each other. These needs are broken out into a number of distinct subject matters or services. These include planning and strategic development, property and estates management, project management, facilities management, health and safety, lease administration, transaction management, environmental reporting, advisory services, and investment management. Once these foundational pillars of the industry are understood, additional dimensions can be explored, for example, the difference between public and private sectors.
However, this does not change the core services being delivered, only the scale and the landscape in which real estate professionals must operate. It is this complexity and diversity within the real estate sector that makes it interesting. It touches every walk of life, providing crucial shelter and enablement for businesses and people. It is perhaps the ubiquitous nature of real estate that has given it a reputation for stability. Ultimately, we will always need to live somewhere; for some businesses, for example, logistics or healthcare, there will always be a need for bricks and mortar. However, we are entering a new digital era that brings
us Web3 and the metaverse, servicing the human need for space in a different way [9]. The COVID-19 global pandemic fast-tracked the cultural change required to enable agile working to a level that has legitimised entry for these technologies [10], thus changing our relationship with property. Although the digitisation of work and services may have been considered temporary at the time [11], there are now organisations that have adopted digital nomadism permanently, such as Novartis, which announced in 2020 that workers are free to choose where they work [12]. The relevance of such a change in policy, especially from an employer of such scale (in the case of Novartis, around 100,000 members of staff), is the demand it places on industry to reimagine space [13]. Despite the introduction of agile working or activity-based working strategies to design office space pre-pandemic, we were still closer to the way we used corporate office spaces during the industrial revolution than we are now as we come out of the pandemic. Currently, the physical and virtual are colliding, forcing us to consider space as boundaryless and imagine a world where we can meet, engage, work, and section flexibly. This opportunity to reimagine our physical and virtual worlds is both exhilarating and concerning. As we enter uncharted waters, the one certainty that we have is that with the digitisation of property comes great responsibility, in particular the management and security of the data we collect as well as how we use it.
4.2.2 Real Estate Data
Arguably, the 1086 Domesday Book was the first piece of data relied on as a formal record of land interests and transitions in England [14]. Since then, the documentation associated with land transactions and ownerships has proliferated [15], so that planning and property processes are routinely accompanied by the following:
• Strategic and policy documents, setting out overall aims for development of particular areas, which may be produced by private developers or, after a long process of public engagement and inquiry, by a local authority
• The documentation associated with the contractual aspects of a property transaction, ranging from a short licence for a car parking space through longer leases for industrial units to substantial development agreements and headleases regulating strategically significant developments such as the use and occupation of a city centre or the development of a new garden community
• The documents that govern the public/private aspects of land development such as planning and highway agreements, management agreements, codes of practice for construction, and management agreements for tenanted areas
Although we have moved from vellum to paper, from wax seals to DocuSign (a digital contract software), and from secure deed stores to digital property records, the property planning system is still heavily dependent on physical paper. Even where transactions are digitalised, the electronic records tend to be no more
than electronic replicas of their physical predecessors, rather than evolving digital resources. However, things are changing. Alongside the increased digitalisation of transactions, there is an increasing capacity for both places and spaces to become data repositories where information is harvested, processed, and stored. In these live digital environments, property data is a dynamic resource powering processes. These include real-time reporting of energy use [16], ongoing recovery and analysis of environmental data such as air quality [17], as well as the diversified use of biometric information to regulate and manage human environments including images from CCTV and facial recognition cameras, movement records from in-home sensors, and individual biometric information from fingerprint entry points [18]. The smarter the environment, the more diverse the range of actors involved in collecting this information. Examples include local authorities such as the police [19], large asset holders such as residential and commercial landlords [20], or smaller commercial interests such as those providing public EV charging points [21].
4.2.3 Technologies Within the Real Estate Sector
In light of the digital transformation of the real estate sector introduced in Sect. 4.2.1, the PlanTech and PropTech sectors have proliferated in recent years in the public and private sectors. The UK Government itself is a major player and, uniquely in the market, has the power to compel the production of data in almost any context. For example, Part 9 of The Levelling Up and Regeneration Bill [22] currently going through Parliament includes an enabling power to collect information on the ownership and control of land and transnational dealings. Both national and local government organisations are establishing new and innovative ways of digitalising property transactions, with the new platform for planning transactions [23] and an emerging digital twin for underground assets [24] among the standout examples. From the private sector perspective, there is a range of actors claiming a space in a burgeoning market. Last year’s PropTech Awards hosted by the UK PropTech Association [25] included Hounslow Council for its work on digitising local plan processes, built ID for its work on digital engagement, and recognition for outstanding individuals in the industry, such as The RED Foundation and Real Estate Data Foundation’s Dan Hughes, Naqash Tahir from PGIM Real Estate, and Gregory Dewerpe of A/O PropTech.
4.2.4 Data Regulations Within the Real Estate Sector
There are strong arguments both for law and guidance in data ethics, particularly when it comes to the built environment. Regulation creates hard lines between
legitimacy and illegitimacy; this is essential for formal enforcement, whether by established agencies such as the Information Commission or the individual seeking to assert their rights. However, those hard lines can be problematic without supplementary guidance that explores how legal principles should be implemented in practice. The best example of this is the Surveillance Camera Code, with its overarching purpose of enabling operators “to make legitimate use of available technology in a way that the public would rightly expect and to a standard that maintains public trust and confidence” [26]. Moreover, whilst law is a powerful tool, it is not agile and not able to solely respond to the scale, scope, and speed of technological developments. In one recent example, the courts grappled with the issue of whether TfL should have given an operator’s licence to a private hire vehicle service run from an online platform and accessed through an app. The judge commented that “the underlying question of law is one of the interpretations of a statute enacted in 1869, before the invention of the telephone or the motorcar, let alone the internet or the smartphone app” [26]. This is an extreme example, but it demonstrates why law and regulation alone will never provide a complete response to ethical dilemmas. Even more recently, the Court of Appeal confirmed that policies alone could not be used to prevent government officials from using instant messaging applications. Indeed, the fact that there were eight different policies drafted at different times meant that “any attempt to follow all these policies would lead to difficulty” [27]. But guidance can improve the visibility of issues, contribute to valuable knowledge sharing, and, if combined with a collaborative outlook, help to create common cultural norms which may help to avoid ethics washing.
4.3 Data and Data Protection Challenges in the Real Estate Sector
With the normalisation of vast data collection, processing, and sharing within the real estate sector, new challenges have arisen in relation to data and data protection. The 2019 “Liquid Report” produced with the British Property Federation noted that whilst the UK is a global real estate leader, work needs to be done to harness the benefits of the technological age to create “a digitally enabled real estate sector” [27]. These challenges have often been explored from the perspective of data protection alone, and not in consideration of ethics as contextualised with people’s relationships to the physical built environment. In this section, we draw upon examples within the real estate sector focusing on different stakeholder perspectives to illuminate some of these challenges and to what extent they have been tackled.
4.3.1 Smart Homes and the Internet of Things: Consumer Data Within the Sector
The creation of smart homes, contextualised as the application of the Internet of Things (IoT) in the domestic environment, has become a booming part of the real estate sector given the varied insights that such data gathering can bring to developers and proprietors. In theory, a home is “smart” as soon as technology replaces any part of the human function. However, in practice, a home becomes “smart” with a range of different technologies, from remote controls on heating, lighting, and doorbells to sophisticated sensors combined with analytics that record behavioural patterns. These technologies create connections between the home and its inhabitant as well as being able to alert third parties on an automated basis – for example, to facilitate ageing in place or telecare. The data protection concerns related to smart homes have been explored extensively in academic and grey literature [28–30]. These can be broadly summarised as surveillance [31], lack of control [32], third-party (cross-border) data access [33, 34], and data breaches [35]. Whilst existing data protection regulations aim to limit the potential data-related harms by ensuring that certain safeguards are in place related to protecting vulnerable groups such as children [36] as well as limitations on recording biometric information [37], there are wider data-related issues that arise from the complexity of such technologies that raise human rights and equality issues, especially in public housing. For example, for devices such as doorbells, heating controls, energy meters, and integrated sound systems, the main issue is what happens to the devices, as well as their associated contracts, when ownership changes [38].
Consent issues are also raised when residents may not be aware of the smart technologies prior to moving into a residential rental property, raising tensions between the landlord’s preference for maintaining control and monitoring their property and the resident’s privacy [39, 40]. To address some of these challenges, the Law Society is already exploring changes to its standard documents [41]. However, changes within the real estate sector that reflect data protection and ethical challenges remain slow. As the technology that enables smart homes becomes more sophisticated, so do the challenges, especially around consent, data ownership, and transferring contractual responsibility [42]. There may in some cases be human rights and equalities implications, especially when it comes to cameras in “build-to-rent” properties or equivalent residential developments. Data protection by design provides protections by ensuring transparency, accountability, and adjustments that can be made as our understanding of the topic evolves. Whilst the regulatory and policy changes described above represent important first steps to ensuring that residents’ data is protected, the legality of such vast data collection, processing, and sharing does not equate to them feeling safe and secure in their own homes. As a result, more encompassing ethical considerations should be adopted to ensure that residents can not only control as much of the data within their own homes as possible but also feel that their homes will not turn against them.
4.3.2 Smart Cities and Sensing: Citizen Data in the Built Environment
Smart cities (technologically modern urban areas that use different types of electronic methods and sensors to collect specific data) are considered cities of the future that use data to improve citizens’ lives, modernise government services, and accelerate economic development [43]. The technologies and infrastructures that contribute to smart cities include opening up transportation data [44], transforming old payphones to Wi-Fi-enabled digital street units [45], and enhancing the digital provision of government services [46]. Whilst smart cities focus on the network of technologies, these infrastructures are intrinsically linked to real estate and property, where sensors, surveillance cameras, and other devices are embedded into buildings and building materials. Smart cities can be seen as the extension of smart homes, where groups of individuals may benefit from greater connectivity, data, and digital access. Although the potential opportunities resulting from smart cities can significantly help citizens and societies [47–51], there are also data protection and ethical risks related to the wide-scale deployment of technologies [52]. These include risks related to privacy [53], lack of awareness of data gathering by and of citizens [54], and digital oversight and data management [55]. There are also questions related to data ownership, in particular the privatisation of public data and increased data-driven marketing that may reduce trust between real estate developers, governments of smart cities, and citizens [56, 57]. These challenges have led to smart city programmes being halted across the world [58, 59]. To address some of these challenges, in addition to smart city projects being halted altogether, participatory processes have been put in place to ensure that citizens are consulted as part of the development of digitally connected cities [60, 61].
Data stewardship mechanisms and institutions [62] such as data trusts [63], data cooperatives [64], and data commons [65] have also been considered to establish bottom-up (as opposed to top-down) approaches to ensuring that equity is being considered in public communal physical spaces and digital environments. However, data stewardship and governance frameworks remain difficult to implement in practice, requiring policy development and infrastructure considerations beyond data protection alone [66]. Smart cities can transform urban areas and utilise data through the deployment of technology in real estate infrastructures to enrich citizens’ lives. However, this comes with risks related to privacy, mistrust, and lack of accountability. The data of people passing through these smart city developments may be useful for developers, landlords, and governments in learning more about how to utilise land and buildings more efficiently, as enhanced through technology. However, increased efficiency through digital innovation and big data does not equate to greater equity or redistribution of wealth within cities. As a result, for the real estate industry, ethical considerations with regard to the development of smart cities are crucial for ensuring that people feel safe and secure in the properties and public spaces they inhabit.
4.3.3 The Occupier Perspective: The Service Provider Versus the Customer
Within the corporate landscape, many of the same challenges encountered in smart city developments persist. Consider the experience of an employee of an organisation based within a multi-tenanted skyscraper in Canary Wharf or the City of London. The building is secured with the use of CCTV, monitored continuously day and night. Anyone entering and leaving the building is expected to swipe their badges through the main building turnstiles, provided by the landlord. Once employees reach their allocated floor, they swipe again through their company security systems. As they settle into their seats, under the gaze of cameras, plugged into their computers (whether laptop or desktop), they radiate data in a variety of ways. The IP addresses of their various electronics connect them to services, and as they navigate the floors, they pass through a series of sensors that collate anonymised data to enable tracking of occupancy rates or other environmental metrics. Even when they submit requests for services throughout the building, they share personal information. Whilst these instances of data sharing are exacerbated by working within increasingly digitised office spaces, the move towards increased agile working and working from home does not completely eliminate access to such forms of personal data. It is crucial to bear this in mind as boundaries between personal and workspaces become less static. As organisations battle with the changing tides, including the next phase of digital transformation, servicing a diverse workforce and keeping up with the demands of a younger and digitally literate generation, the supplier ecosystem supporting their corporate clients with developing the right real estate strategies to deliver against their business needs adds complexity to the data protection challenge. There are a number of areas that need consideration, outlined below.
Assignment of data ownership within a service provider and customer relationship. Contractually, data ownership tends to be broadly assigned to a customer as part of service mobilisation. It is rare that these negotiations, as well as subsequent Master Services Agreements, include a data model or technical documentation documenting the ownership of data to any level of detail. In practice, this results in a lack of true rigour in oversight of ethical issues relating to the management and usage of data.
Conflict between strategic needs and operational ambitions. The lack of robust governance of data, including true data ownership, along with lacking data literacy levels, is part of the reason for the tension between what the business (and customer) needs in the immediate term and longer-term strategic planning.
Lack of an aligned, comprehensive data strategy. The absence of a data strategy leaves organisations in a perpetual cycle of tactical delivery rather than mindful consideration of the data estate to the level required to ensure consistency in management of the delivery and consumption of data and data products. Most
organisations are considerate of their IT security, and within that they include data and information security, evidenced through the significant investments made in this area. The value of the UK Infosec industry reached £10.1bn in 2022 [67]. However, in practice, the information security sector tends to focus its attention on protecting against foreign attacks [68], rather than concern itself with the manipulation of data by internal teams allowing for opportunities for rules to be bent and boundaries to be squeezed, mostly for the good of the customer or the business.
4.3.4 Summary
In this section, we explored examples of ethical challenges and considerations related to different stakeholder perspectives on data in the context of the real estate sector. When it comes to places and spaces, it is clear that data protection regulations are insufficient with regard to ensuring that people feel safe and protected in the private and public spaces they inhabit. As a result, wider ethical practices should be embedded as part of the real estate sector to ensure that our built environment is equitable for all.
4.4 Overcoming Data and Data Protection Challenges in Practice
In Sects. 4.2 and 4.3, we outlined how the property sector has adapted to technological developments as well as the opportunities and challenges such developments present in the context of data and data protection, respectively. On top of data-driven considerations, property companies are grappling with the increasing significance of ESG+R (Environmental, Social, Governance, and Resilience) issues [69], an increased emphasis on net zero and a circular economy [70], as well as emerging new reporting requirements such as fire safety following the Grenfell Tower disaster [71]. As our examples demonstrated, there are data-related concerns throughout the real estate sector that have been raised by different stakeholders; in fact, there has been a recent call for legislation to mandate data production and management by representatives from the PropTech industry [72]. As such, having a centralised document that is sensitive to societal concerns with regard to data is not only beneficial for avoiding legal actions and reputational damage, but also provides the reassurance of knowing that policies, processes, and documents reflect those concerns and can adapt to future changes. A key example is the “golden thread” proposals emerging now on fire safety data [73]. The Levelling Up and Regeneration Bill is the latest addition to the expanding regulatory framework that applies to the use of data, with wider reforms round the corner. Other examples of ethical considerations in context
of technology include the Association for Computing Machinery Code of Ethics and Professional Conduct [74] as well as the recently published statement on principles for responsible algorithmic systems [75]. However, there is an absence of centralised regulation in the real estate sector that considers good practices in regard to data management and governance. This section outlines some examples of good practices with regard to how data and data protection challenges have been overcome from different stakeholders within the real estate sector from the perspective of technologies.
4.4.1 Good Practices in the Absence of Central Regulation
As the real estate industry’s digital transformation continues at pace, the utilisation of the data generated from users of the industry’s key output – the buildings along with the space and public realm that surround them – is coming into strong focus. This focus is driven via new business initiatives such as the growing social and governance impacts created by ESG+R reporting as well as the greater scrutiny and attention we all have on how our data are being used. The rate of change of digital transformation, as well as the impact of poor data use that this process generates, is one of the biggest risks companies in this real estate sector should be considering. Below, we outline examples from the London Office of Technology and Innovation, Geospatial Commission, and the Open Data Institute that attempt to establish good practices in the absence of central regulation.
London Office of Technology and Innovation (LOTI) Data Ethics Services. A recent, and significant, development in data ethics has come from the London Office of Technology and Innovation with a new Data Ethics Support system for London [76]. The new service is based on three pillars: project facilitation delivered through workshops on data projects and offering practical advice, developing specialist capabilities in data ethics through pilots and the development of a London-wide set of resources, standards, and initiatives. In addition to the service itself, LOTI has also appointed the first-ever Data Ethicist for London [77].
Geospatial Commission: An ABC of Ethics? In 2021, the Geospatial Commission began an engagement exercise looking at the key issues of public concern about location data use, badged as “the UK’s first deliberative consultation on location data ethics” [78]. This exercise demonstrated that public acceptance of the use and reuse of their data was conditional on explicit ethical foundations.
In June 2022, a policy paper summarised how location data could and should be used, so that commercial benefits could be maximised without legal risk whilst retaining public trust [79]. The report identifies three “ethical building blocks” of accountability, bias, and clarity:
• Accountability: Governing location data responsibly, with the appropriate oversight and security
• Bias: Considering and mitigating different types of bias and highlighting the positive benefits of location data
• Clarity: Being clear about how location data will be used and the rights of individuals
Open Data Institute (ODI) Data Ethics Canvas. The ODI’s Data Ethics Canvas represents a practical tool that operationalises how ethics can and should be considered by anyone who collects, shares, or uses data, including those in the real estate sector [80]. The Canvas helps identify and manage ethical issues throughout a project, asking important questions about data use such as “What is your primary purpose for using data in this project?” and “Who could be negatively affected by this project?”. Although not specifically created for the property sector, the Data Ethics Canvas provides a framework to develop ethical guidance that suits any project context, size, or scope.
4.4.2 The RED Foundation Data Ethics Playbook
To address the risks, concerns, and challenges outlined in our paper, as part of The RED Foundation, we published a Data Ethics Playbook [81] that brings together a cross-section of views from regulatory, academia, and real estate data practices to help guide the industry with a set of approaches that can be taken to drive data ethics adoption as we have outlined in this paper [82]. Instead of taking a prescriptive approach to implementing data ethics within a business, the Playbook builds on an innovative approach established by the Chancery Lane Project [83] and presents a set of guidelines that should be tailored to ensure real estate companies can apply data ethics in the most effective way for their requirements as they embark on their data ethics journeys. The creation of the Playbook builds upon the desk-based research partially presented in our paper and draws from academic, industry, and research conversations with different domain experts working in real estate and data through engagement with The RED Foundation, such as in our quarterly RED Foundation Forum hosted in the UK. The Playbook is aimed at being applied across all levels and stakeholder groups involved in real estate including but not limited to:
• All asset classes across commercial and residential real estate
• At a real estate corporation level, across a portfolio of assets or at the individual building level
• Owners, operators, and occupiers of buildings as well as members of the public that interact with those buildings
• Technology hardware and service providers to the industry
The six key principles of the Data Ethics Playbook that all real estate businesses are encouraged to work towards, and which form the basis of the advice, are the following:
1. Accountable: For the data collected and used. This includes taking responsibility for using the data in an appropriate and secure way.
2. Transparent: About what is collected and why. Whilst this cannot be expected for every data point, at a minimum a general data policy should be published for each building and company.
3. Proportionate: Not only should data be collected within legal and technical requirements but also should be proportionate to societal benefit and expectations.
4. Confidential and Private: All activity with data should at all times consider confidentiality and protect privacy, both within necessary legal requirements and according to the expectations of wider society.
5. Lawful: All data should only be used within all relevant local and international laws and regulations.
6. Secure: Security principles should be built in “by design” into all applications and appropriate steps should be taken to keep data secure.
Whilst these principles may appear simple, in practice, such guidance represents a big step forward and revolutionary change given the sector’s lack of consensus on what ethical considerations in real estate entail. Since the publication of The RED Foundation Data Ethics Playbook, the Foundation has already received positive feedback and responses from across the real estate and property sector, including the UK Government Digital Planning team, the Benchmark Initiative, and Data Leaders. An article on the impact of our Playbook was published in the Cambridge University Land Society Magazine [84].
4.4.3 Future Work
At a macro level, with the rise in smart city technologies and planning for the incorporation of technologies into real estate and physical infrastructures, consideration of ethical data practices can help support a more equitable, secure, and trustworthy approach to innovative urban planning and developing the built environment:
• Individuals will have more trust in the smart city development and understand how their data is being used to benefit them and engage in the development of their public and private spaces, aided by transparent processes that encourage open dialogue.
• Communities may be able to derive insights from public data that has been pooled together, with the confidence that such data is being collected, analysed, and shared in lawful and secure ways.
J. Wong et al.
• Developers can develop a clearer understanding of exactly what technologies are being implemented and how they can use the data to increase the efficiency, effectiveness, and profitability of their development.
• Governments, at both a national and regional level, will find it easier to make the most of the significant digital presence of real estate developments to increase the value of public data to citizens and enable long-term planning, including stakeholder and public consultation.
As digital technologies and innovation are increasingly embedded within our built environment, different stakeholders are grappling with new ESG+R, reporting, and data protection requirements. Building upon existing regulation, policy, and best practice that foster more ethical considerations within the built environment, we at The RED Foundation will continue to provide practical recommendations for how data ethics can be operationalised within the real estate sector. The Playbook was created with agile and iterative development in mind. As such, the document will be regularly reviewed by The RED Foundation as well as other partner organisations to maintain the Playbook’s relevance to ongoing developments within the real estate sector, including by adding further case study examples and putting the Playbook into practice. Since our Playbook was published in December 2022, our focus has shifted to finding examples where ethical issues are emerging from the intersection between the built and digital world. In our current iteration, we are collaborating with Etive, which is working with the UK Government and the House Builders Federation on the creation of a single digital identity for people in the house buying and selling process. Our next case study will be Brent Council and its pilot study into the usefulness of sensors to detect mould and damp in Council properties.
At The RED Foundation, we also consider the adoption of data standards [85] as well as data research and academia [86] to ensure the real estate sector benefits from an increased use of data, avoids some of the risks that this presents, and is better placed to serve society. We see this as an opportunity, as well as an obligation, for the real estate sector, along with local and national authorities, to take a proactive approach to ensuring that data ethics research is undertaken and ethical practices are adopted within the built environment that focus on harm reduction and prevention rather than reacting to unethical practices or tragedies.
4.5 Conclusion
The adoption of technologies in homes, the development of smart cities, and the application of data-driven practices in the real estate sector have grown significantly over the past decade. To ensure that citizens’ data is protected, the property sector has had to adhere to data protection regulations and justify the vast amounts of data being collected, processed, and shared. However, data protection alone is insufficient to ensure ethical practices within both private and public
built environments. Whilst attempts have been made by the real estate sector and government to establish regulation, policy, and best practice for incorporating ethical considerations into the urban planning process, there is currently no centralised regulation to put ethics into practice. Without this regulatory steer, the real estate sector and citizens are left grappling for a cohesive solution and are forced to adopt a reactive approach to challenges as they emerge, as opposed to creating solutions that can last for the decades our homes, buildings, and skyscrapers are expected to stand. Building upon currently established practice, The RED Foundation has created a Data Ethics Playbook that outlines key ethical principles to consider within the real estate sector, provides illustrative examples of how they can be operationalised, and creates an agile space for collective experimentation. To address the gap between the digitisation of the real estate sector and citizens’ expectations for transparency, accountability, and trust in the built environment of which they are part, we suggest paths forward for how the property sector, regulators, and policymakers can be more proactive in establishing ethical practices in real estate.
References
1. European Union, “Regulation (EU) 2016/679 of the European Parliament and the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC,” Official Journal of the European Union, vol. L119, pp. 1–88, 2016.
2. United Kingdom, “Data Protection Act,” Act of Parliament, vol. 1, pp. 1–335, 2018.
3. W. Shakespeare, Henry V, Oxford: Oxford University Press, 1599 (2008).
4. Investopedia, “What are the main segments of the real estate sector?,” 1 August 2022. [Online]. Available: https://www.investopedia.com/ask/answers/052715/what-are-mainsegments-real-estate-sector.asp. [Accessed 26 October 2022].
5. B. O’Brien, “Proptech: A disruptive force in real estate?,” 2022. [Online]. Available: https://www2.deloitte.com/lu/en/pages/real-estate/articles/proptech-disruptive-force-realestate.html.
6. The Urban Developer, “What Is Proptech?,” 27 February 2020. [Online]. Available: https://www.theurbandeveloper.com/articles/what-is-proptech. [Accessed 28 October 2022].
7. L. Hartley, “PlanTech Explained,” 22 May 2019. [Online]. Available: https://neighbourlytics.com/blog/plantech-explained. [Accessed 28 October 2022].
8. Royal Institution of Chartered Surveyors, “Real Estate Lifecycle: Training,” 2022. [Online]. Available: https://www.rics.org/uk/events/e-learning/e-learning/real-estate-lifecyclecertification%2D%2D-course-only/. [Accessed 28 October 2022].
9. R. Dagnoni, “Web3: Decentralization, Property and Metaverse,” 23 February 2022. [Online]. Available: Web3: Decentralization, Property and Metaverse. [Accessed 28 October 2022].
10. Forbes Tech Council, “How the pandemic accelerated the proptech/worktech and medtech sectors,” 31 January 2022. [Online]. Available: https://www.forbes.com/sites/forbestechcouncil/2022/01/31/how-the-pandemic-accelerated-the-proptechworktech-andmedtech-sectors/. [Accessed 28 October 2022].
11. S. Sijbrandij, “Hybrid Remote Work Offers the Worst of Both Worlds,” 12 July 2020. [Online]. Available: https://www.wired.com/story/hybrid-remote-work-offers-the-worst-ofboth-worlds/. [Accessed 28 October 2022].
12. Novartis, “Choice with Responsibility: Reimagining how we work,” 29 July 2022. [Online]. Available: https://www.novartis.com/news/choice-responsibility-reimagining-howwe-work. [Accessed 28 October 2022].
13. A. Barnes, “Novartis CEO Says Remote Work Hybrid Should Mean Access to New Talent Pools,” 28 April 2021. [Online]. Available: https://www.biospace.com/article/novartisceo-says-remote-work-hybrid-should-mean-access-to-new-talent-pools/. [Accessed 28 October 2022].
14. The National Archives, “Domesday Book,” 2022. [Online]. Available: https://www.nationalarchives.gov.uk/education/resources/domesday-book/. [Accessed 28 October 2022].
15. HM Land Registry, “Registering land transactions: practice guides,” 25 June 2014. [Online]. Available: https://www.gov.uk/government/collections/registering-land-transactions-practiceguides. [Accessed 28 October 2022].
16. Department for Education, “Department for Education: real-time energy data,” 12 November 2015. [Online]. Available: https://www.gov.uk/government/publications/greeninggovernment-and-transparency-commitments-real-time-energy-data. [Accessed 28 October 2022].
17. Department of Environment, Food & Rural Affairs, “Air quality and emissions statistics,” 14 February 2020. [Online]. Available: https://www.gov.uk/government/collections/air-qualityand-emissions-statistics. [Accessed 28 October 2022].
18. J. Peat, “Biometric data will allow homes of the future to feel like their occupants,” 22 May 2018. [Online]. Available: https://www.thelondoneconomic.com/property/biometric-data-willallow-homes-of-the-future-to-feel-like-their-occupants-89393/. [Accessed 28 October 2022].
19. J. Simpson, “Washing machine will turn detective,” 2 January 2017. [Online]. Available: https://www.thetimes.co.uk/article/washing-machine-will-turn-detective-djq30jdff. [Accessed 28 October 2022].
20. Inside Housing, “How can social housing be smart about technology data use and privacy?,” 2022. [Online]. Available: https://www.insidehousing.co.uk/sponsored/sponsored/how-cansocial-housing-be-smart-about-technology-data-use-and-privacy. [Accessed 28 October 2022].
21. Department for Transport, “Find and use data on public electric vehicle chargepoints,” 11 August 2020. [Online]. Available: https://www.gov.uk/guidance/find-and-use-data-on-publicelectric-vehicle-chargepoints. [Accessed 28 October 2022].
22. United Kingdom, “Levelling-up and Regeneration Bill,” Bill, vol. 169, no. 2022–23, pp. 1–340, 2022.
23. planning.data.gov.uk, “Find planning and housing data that is easy to understand, use and trust,” 2022. [Online]. Available: https://www.planning.data.gov.uk/. [Accessed 28 October 2022].
24. Cabinet Office and Geospatial Commission, “Project update on National Underground Asset Register published,” 6 October 2022. [Online]. Available: https://www.gov.uk/government/news/project-update-on-national-underground-asset-register-published. [Accessed 28 October 2022].
25. UK PropTech Association, “UK PropTech Award 2021,” 18 November 2021. [Online]. Available: https://ukproptech.com/uk-proptech-awards-2021/. [Accessed 28 October 2022].
26. The Good Law Project, R (On the Application Of) v The Prime Minister & Ors, 2022 EWCA Civ 1580.
27. British Property Federation, “The Liquid report: leading the digital transformation of global real estate,” British Property Federation, London, 2019.
28. J. Chen, L. Edwards, L. Urquhart and D. McAuley, “Who is responsible for data processing in smart homes? Reconsidering joint controllership and the household exemption,” International Data Privacy Law, vol. 10, no. 4, pp. 279–293, 2020.
29. J. M. Batalla, A. Vasilakos and M. Gajewski, “Secure Smart Homes: Opportunities and Challenges,” ACM Computing Surveys, vol. 50, no. 5, pp. 1–32, 2018.
30. O. Gkotsopoulou, E. Charalambous, K. Limniotis, P. Quinn, D. Kavallieros, G. Sargsyan, S. Shiaeles and N. Kolokotronis, “Data Protection by Design for cybersecurity systems in a Smart Home environment,” 2019 IEEE Conference on Network Softwarization (NetSoft), pp. 101–109, 2019.
31. K. Hill and S. Mattu, “The House that Spied on Me,” 2018. [Online]. Available: https://gizmodo.com/the-house-that-spied-on-me-1822429852. [Accessed 2022].
32. G. A. Fowler, “Tour Amazon’s dream home, where every appliance is also a spy,” 12 October 2022. [Online]. Available: https://www.washingtonpost.com/technology/interactive/2022/amazon-smart-home/. [Accessed 28 October 2022].
33. Privacy International, “Ad-tech,” 2021. [Online]. Available: https://privacyinternational.org/learn/adtech. [Accessed 2022].
34. Tweet by Chi Onwurah, “On 1st Oct, the govt announced that it would collect, process & store all #smartmeter data, despite assurances over years that households would control access to that data. When I raised this with the Digital SoS today it was clear she had no idea what . . . ,” 20 October 2022. [Online]. Available: https://twitter.com/ChiOnwurah/status/1583122298307248129. [Accessed 28 October 2022].
35. D. Winder, “Confirmed: 2 billion records exposed in massive smart home device breach,” 2 July 2019. [Online]. Available: https://www.forbes.com/sites/daveywinder/2019/07/02/confirmed-2-billion-records-exposed-in-massive-smart-home-device-breach/. [Accessed 28 October 2022].
36. Information Commissioner’s Office, “Introduction to the Age appropriate design code,” 2018. [Online]. Available: https://ico.org.uk/for-organisations/guide-to-data-protection/ico-codes-ofpractice/age-appropriate-design-code/. [Accessed 28 October 2022].
37. Information Commissioner’s Office, “‘Immature biometric technologies could be discriminating against people’ says ICO in warning to organisations,” 26 October 2022. [Online]. Available: https://ico.org.uk/about-the-ico/media-centre/news-and-blogs/2022/10/immature-biometric-technologies-could-be-discriminating-against-people-says-ico-inwarning-to-organisations/. [Accessed 28 October 2022].
38. Osborne Clarke, “Internet of Things and Data Ownership,” 29 September 2016. [Online]. Available: https://www.osborneclarke.com/insights/internet-of-things-and-data-ownership. [Accessed 28 October 2022].
39. G. Birchley, R. Huxtable, M. Murtagh, R. T. Meulen, P. Flach and R. Gooberman-Hill, “Smart homes, private homes? An empirical study of technology researchers’ perceptions of ethical issues in developing smart-home health technologies,” BMC Medical Ethics, vol. 18, no. 1, pp. 1–23, 2017.
40. E. Thorstensen, “Privacy and Future Consent in Smart Homes as Assisted Living Technologies,” ITAP 2018: Human Aspects of IT for the Aged Population. Applications in Health, Assistance, and Entertainment, pp. 415–433, 2018.
41. Law Society, “Have your say on smart devices in our TA10 form,” 28 February 2022. [Online]. Available: https://www.lawsociety.org.uk/topics/property/have-your-say-on-smart-devices-inour-ta10-form. [Accessed 28 October 2022].
42. F. Ustek-Spilda, A. Powell and S. Nemorin, “Engaging with ethics in Internet of Things: Imaginaries in the social milieu of technology developers,” Big Data & Society, vol. 6, no. 2, pp. 1–12, 2019.
43. Information Commissioner’s Office, “Biometrics: foresight,” 26 October 2022. [Online]. Available: https://ico.org.uk/media/about-the-ico/documents/4021971/biometrics-foresightreport.pdf. [Accessed 28 October 2022].
44. Transport for London, “Transport for London Unified API,” 2022. [Online]. Available: https://api.tfl.gov.uk/. [Accessed 28 October 2022].
45. BT, “Birmingham becomes first city in the Midlands to benefit from free ultrafast wi-fi and phone calls,” 22 October 2018. [Online]. Available: https://newsroom.bt.com/birminghambecomes-first-city-in-the-midlands-to-benefit-from-free-ultrafast-wi-fi-and-phone-calls/. [Accessed 28 October 2022].
46. European Commission, “Reclaiming the Smart City: Personal data, trust and the new commons,” 2018. [Online]. Available: https://media.nesta.org.uk/documents/DECODE2018_report-smart-cities.pdf. [Accessed 2022].
47. European Commission, “Decidim,” 2021. [Online]. Available: https://decidim.org/. [Accessed 2022].
48. gE.CO, “gE.CO Living Lab About Page,” 2020. [Online]. Available: https://generativecommons.eu/. [Accessed 2022].
49. The Bristol Approach, “The Bristol Approach homepage,” 2017. [Online]. Available: https://www.bristolapproach.org/. [Accessed 2022].
50. Waag, “Shared Cities, Smart Citizens,” 30 June 2021. [Online]. Available: https://waag.org/en/project/shared-smart-city/. [Accessed 28 October 2022].
51. Urban and Cities Platform of Latin America and the Caribbean, “Urban planning system of Costa Rica,” 2022. [Online]. Available: https://plataformaurbana.cepal.org/en/systems/planning/urban-planning-system-costa-rica. [Accessed 28 October 2022].
52. A. B. Powell, Undoing Optimization, New Haven: Yale University Press, 2021.
53. Privacy International, “Smart cities,” 2022. [Online]. Available: https://privacyinternational.org/learn/smart-cities. [Accessed 28 October 2022].
54. L. Zoonen, “Privacy concerns in smart cities,” Government Information Quarterly, vol. 33, no. 3, pp. 472–480, 2016.
55. C. Jones, “Spotlight Privacy in the smart city,” 26 October 2020. [Online]. Available: https://eurocities.eu/latest/privacy-in-the-smart-city/. [Accessed 28 October 2022].
56. RED Foundation, “Is consent the hallmark of ethical data use in a real estate context,” RED Foundation, London, 2022.
57. E. Ismagilova, L. Hughes, N. P. Rana and Y. K. Dwivedi, “Security, Privacy and Risks Within Smart Cities: Literature Review and Development of a Smart City Interaction Framework,” Information Systems Frontiers, vol. 24, pp. 393–414, 2022.
58. L. Cecco, “Google affiliate Sidewalk Labs abruptly abandons Toronto smart city project,” 2020. [Online]. Available: https://www.theguardian.com/technology/2020/may/07/googlesidewalk-labs-toronto-smart-city-abandoned. [Accessed 2022].
59. M. Coulter, “Alphabet’s Sidewalk Labs has abandoned another US smart city project after reported fights about transparency,” 2021. [Online]. Available: https://www.businessinsider.com/second-sidewalk-labs-smart-city-project-shutters-portland-oregon2021-2?r=US&IR=T. [Accessed 2022].
60. Ada Lovelace Institute, “Participatory data stewardship,” Ada Lovelace Institute, London, 2021.
61. Ada Lovelace Institute, “Exploring legal mechanisms for data stewardship,” Ada Lovelace Institute, London, 2021.
62. Open Data Institute, “What are data institutions and why are they important?,” 29 January 2021. [Online]. Available: https://theodi.org/article/what-are-data-institutions-and-why-are-theyimportant/. [Accessed 28 October 2022].
63. K. O’Hara, “Data Trusts: Ethics, Architecture and Governance for Trustworthy Data Stewardship,” WSI White Paper #1, Southampton, 2019.
64. The Data Economy Lab, “Data Cooperative,” 2021. [Online]. Available: https://tool.thedataeconomylab.com/data-models/1. [Accessed 28 October 2022].
65. T. Bass, A. Sutherland and T. Symons, “Reclaiming the Smart City: Personal Data, Trust and the New Commons,” 28 July 2018. [Online]. Available: https://www.nesta.org.uk/report/reclaiming-smart-city-personal-data-trust-and-new-commons/. [Accessed 28 October 2022].
66. J. Wong, T. Henderson and K. Ball, “Data protection for the common good: Developing a framework for a data protection-focused data commons,” Data & Policy, vol. 4, no. 1, pp. 1–31, 2022.
67. Department of Digital, Culture, Media & Sport, “Cyber security sectoral analysis 2022,” 18 February 2022. [Online]. Available: https://www.gov.uk/government/publications/cybersecurity-sectoral-analysis-2022/cyber-security-sectoral-analysis-2022. [Accessed 28 October 2022].
68. Computer Weekly, “Business priorities: what to protect, monitor and test,” 7 January 2013. [Online]. Available: https://www.computerweekly.com/feature/Business-prioritieswhat-to-protect-monitor-and-test. [Accessed 28 October 2022].
69. CBRE, “ESG and Real Estate: The Top 10 Things Investors Need to Know,” 3 January 2022. [Online]. Available: https://www.cbre.com/insights/reports/esg-and-real-estate-the-top10-things-investors-need-to-know. [Accessed 28 October 2022].
70. Deloitte, “ESG real estate insights 2022,” October 2022. [Online]. Available: https://www2.deloitte.com/global/en/pages/financial-services/articles/esg-real-estate-insights.html. [Accessed 28 October 2022].
71. Home Office and the Ministry of Housing, Communities & Local Government, “Grenfell Tower Inquiry Phase 1 Report: government response,” 21 January 2020. [Online]. Available: https://www.gov.uk/government/publications/grenfell-tower-inquiry-phase-1report-government-response/grenfell-tower-inquiry-phase-1-report-government-response. [Accessed 28 October 2022].
72. Department for Levelling Up, Housing and Communities, “PropTech sector wants to help deliver Housing and Planning ambitions,” 21 October 2022. [Online]. Available: https://dluhcdigital.blog.gov.uk/2022/10/21/proptech-sector-wants-to-help-deliver-housing-andplanning-ambitions/. [Accessed 28 October 2022].
73. Building Regulations Advisory Committee and Ministry of Housing, Communities & Local Government, “Building Regulations Advisory Committee: golden thread report,” 21 July 2021. [Online]. Available: https://www.gov.uk/government/publications/building-regulationsadvisory-committee-golden-thread-report/building-regulations-advisory-committee-goldenthread-report. [Accessed 28 October 2022].
74. Association for Computing Machinery, “ACM Code of Ethics and Professional Conduct,” ACM, 22 June 2018. [Online]. Available: https://www.acm.org/code-of-ethics. [Accessed 16 November 2022].
75. Association for Computing Machinery, “ACM Technology Policy Council Releases Statement on Principles for Responsible Algorithmic Systems,” ACM, 1 November 2022. [Online]. Available: https://www.acm.org/articles/bulletins/2022/november/tpcstatement-responsible-algorithmic-systems. [Accessed 16 November 2022].
76. London Office of Technology and Innovation, “LOTI Launches New Data Ethics Support for London,” 21 October 2022. [Online]. Available: https://loti.london/blog/loti-launches-newdata-ethics-support-for-london/. [Accessed 28 October 2022].
77. S. Wray, “London’s first Data Ethicist appointed,” 14 October 2022. [Online]. Available: https://cities-today.com/londons-first-data-ethicist-appointed/. [Accessed 28 October 2022].
78. Cabinet Office, Geospatial Commission, and Lord True CBE, “Geospatial Commission announce location data ethics project,” 18 March 2021. [Online]. Available: https://www.gov.uk/government/news/geospatial-commission-announce-location-data-ethics-project. [Accessed 28 October 2022].
79. Cabinet Office and Geospatial Commission, “Building public confidence in location data: The ABC of ethical use,” 22 June 2022. [Online]. Available: https://www.gov.uk/government/publications/building-public-confidence-in-location-data-the-abc-of-ethical-use. [Accessed 28 October 2022].
80. Open Data Institute, “The Data Ethics Canvas,” 28 June 2021. [Online]. Available: https://theodi.org/article/the-data-ethics-canvas-2021/. [Accessed 28 October 2022].
81. RED Foundation, “RED Foundation Data Ethics Playbook,” RED Foundation, 15 November 2022. [Online]. Available: https://www.theredfoundation.org/dataethics. [Accessed 16 November 2022].
82. RED Foundation, “Data Ethics playbook – everything you need to know . . . .,” 21 September 2022. [Online]. Available: https://www.theredfoundation.org/post/data-ethics-playbookeverything-you-need-to-know. [Accessed 28 October 2022].
83. Chancery Lane Project, “Chancery Lane Project homepage,” 2022. [Online]. Available: https://chancerylaneproject.org/. [Accessed 31 October 2022].
84. Cambridge University Land Society, “CULS Magazine,” 2022. [Online]. Available: https://www.culandsoc.com/about/culs-magazine/. [Accessed November 2022].
85. RED Foundation, “Data Standards,” 2022. [Online]. Available: https://www.theredfoundation.org/standards. [Accessed 28 October 2022].
86. RED Foundation, “Data Research and Academia,” 2022. [Online]. Available: https://www.theredfoundation.org/researchandacademia. [Accessed 28 October 2022].
Chapter 5
How the Charter of Trust Can Support the Data Protection
Nadia Giusti
5.1 Introduction
The joint communication to the European Parliament and the Council of 13 September 2017 [1] starts from the assumption that “cybersecurity is critical to both our prosperity and our security.” Since then, Europe and its institutions have been committed to improving and strengthening resilience and action in cyberspace: our personal and working lives are more dependent on technology than ever before, and the resulting greater exposure in cyberspace means a larger attack surface. Both the COVID-19 pandemic, which revolutionized our personal habits, and the Russia-Ukraine conflict have caused a considerable increase in cyberattacks, which have become more sophisticated and more numerous and are capable of threatening our economies, the evolution toward a digital single market, and even the very functioning of our democracies, our freedoms, and our values. The ability to defend against, predict, and react to cyber threats without delay will be fundamental for protecting so-called essential services and critical infrastructures and, moreover, for creating the trust necessary for any large-scale use of digital technology, so as to shape a digital single market capable of generating trust and security among citizens.
N. Giusti, Siemens Digital Industry Software, DI SW MOM, Milan, Italy
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
S. Schiffner et al. (eds.), Privacy Symposium 2023, https://doi.org/10.1007/978-3-031-44939-0_5
5.2 Recent Landscape in Cybersecurity: EU Cybersecurity Strategy, Prime Threats, and Investments
5.2.1 The EU Cybersecurity Strategy
On 16 December 2020, the European Commission and the High Representative of the Union for Foreign Affairs and Security Policy presented a new EU Cybersecurity Strategy [24], a key component of Shaping Europe’s Digital Future [25], the Commission’s Recovery Plan for Europe [26], and the Security Union Strategy that covers the period 2020–2025 [27]. Starting from the 2013 EU Cybersecurity Strategy, the EU has developed a coherent and holistic international cyber policy. Building on the progress achieved under the previous strategies, the new strategy aims to ensure a global and open Internet with strong safeguards where there are risks to security and the fundamental rights of people in Europe. The main objective of the strategy is to bolster Europe’s collective resilience against cyber threats and to help ensure that all citizens and businesses can fully benefit from trustworthy and reliable services and digital tools. Moreover, the new Cybersecurity Strategy will allow the EU to step up leadership on international norms and standards in cyberspace, to increase cooperation with international partners, and to promote “a global, open, stable and secure cyberspace.” There are three areas in which the European Commission wants to concentrate its efforts. The first area of action concerns resilience, technological sovereignty, and leadership. Improved resilience in the face of the growing number of cyberattacks is a key point, and the EU’s future security depends on transforming its ability to protect itself against cyber threats such as ransomware, interference in internal democratic processes, disinformation campaigns, and fake news.
In this direction, the European Commission is strongly committed to reforming the rules on the security of network and information systems in order to increase the cyber resilience of critical infrastructure, public or private, such as hospitals, energy grids, railways, public administrations, and similar critical services, and to respond to the growing threats that come with digitalization. As a result of these efforts, the NIS 2 Directive was approved by the European Parliament [28] in November 2022. Also related to resilience is the need for a collective and wide-ranging approach, meaning more robust and effective structures to promote cybersecurity and to respond to cyberattacks in the Member States and in the EU’s own institutions. To improve cooperation, the European Commission assigns to the European Union Agency for Cybersecurity, ENISA, “a strong advisory role on policy development and implementation, including promoting coherence between sectoral initiatives and the NIS Directive and helping to set up Information Sharing and Analysis Centers in critical sectors.” In addition, the creation of a network of Security Operations Centers (SOCs) across the EU, powered by artificial intelligence (AI), will constitute a “cybersecurity shield,” able to detect cyberattacks in advance and to enable
proactive actions, before damage occurs. Additional measures in this area will include support for small and medium-sized businesses (SMEs), attracting and retaining the best cybersecurity talent, and investment in research and innovation. The second area of EU action concerns building the operational capacity to prevent, deter, and respond to cyberattacks. To reach this goal, the European Commission is committed to creating a new Joint Cyber Unit to improve cooperation between Member State authorities and EU bodies. This will be an important step toward completing the European Cybersecurity Crisis Management Framework. Operational capacity also includes tackling cybercrime, a key factor in ensuring cybersecurity, for which cooperation between the different cybersecurity actors and law enforcement is essential. The third area concerns advancing a global and open cyberspace through increased cooperation. Under this strand of action, the European Commission will step up work with international partners to strengthen its standardization strategy, that is, the creation and shaping of international standards in the area of emerging technologies to ensure that “Internet remains global and open, the technologies are human-centric, privacy-focused and that their use is lawful, safe and ethical”; to promote international security and stability in cyberspace; and to protect human rights and fundamental freedoms online. The EU will also form an EU Cyber Diplomacy Network around the world to promote its vision of cyberspace. Prompted by similar concerns, the Biden administration also issued an executive order [29] to improve nationwide cybersecurity in the USA.
5.2.2 Prime Threats in 2021–2022
The ENISA Threat Landscape (ETL) 2022 [2] confirms the trends of 2021 [3] without major changes: ransomware remains one of the most prominent and successful cyber threats of recent years, and not only in Europe. Between 2021 and 2022, cyber threats increased considerably, not only in complexity and number but above all in impact, partly because of the Russia-Ukraine crisis mentioned above, in which cyberwarfare plays a leading role. ENISA’s 2022 Threat Landscape for Ransomware Attacks [4] defines “ransomware” [5] as a “type of attack where threat actors take control of a target’s assets and demand a ransom in exchange for the return of the assets’ availability.” The National Institute of Standards and Technology (NIST) definition [6] of ransomware is more detailed and focuses mainly on the impact of encryption and data theft: “a type of malicious attack where attackers encrypt an organization’s data and demand payment to restore access. In some instances, attackers may also steal an organization’s information and demand an additional payment in return for not disclosing the information to authorities, competitors, or the public.” Three key elements are always present in a ransomware attack: asset, actor, and blackmail. The landscape, however, is diverse and changeable, with multiple extortion techniques and different goals, although a ransomware attack generally
impacts an asset’s confidentiality, integrity, and availability (CIA) through different high-level actions (lock, encrypt, delete, and steal). Phishing is the most used initial attack vector for ransomware attacks; the second is brute-forcing weak Remote Desktop Protocol (RDP) credentials, especially when multifactor authentication (MFA) is not enabled. Social engineering is also used to leverage an employee’s access inside an organization and gain a technical foothold in the network from which further attacks are carried out. Extortion techniques are evolving, too: unlike in the past, a dedicated leak site for individual victims is now hosted on the public Internet, which lets third-party victims whose data was leaked quickly check whether they were affected and spares the threat actor from contacting victims individually, as happened before. Another new trend is for a victim’s data to be published on leak sites without mentioning the company’s name: this approach puts pressure on victims to pay a ransom before their names are made public and allows companies to decide what to do. “Malware” is defined as unauthorized malicious code or malicious logic [7] intended to perform an unauthorized process that will have an adverse impact on the confidentiality, integrity, or availability of a system. Malware types include viruses, worms, Trojan horses, and spyware. Owing to COVID-19 restrictions and the resulting extended home office practices, malware infections on corporate infrastructures fell between 2020 and 2021 [8]; however, this effect was transient, and malware attacks, mainly in the form of crypto-jacking and Internet of Things (IoT) malware, increased again from the end of 2021 [9]. In 2021 and in Q1 2022, the most common malware families included remote access Trojans (RATs), banking Trojans, information stealers, and ransomware.
“Social engineering” covers a broad range of activities that attempt to exploit human error or human behavior in order to gain access to information or services; it uses various forms of manipulation to trick victims into making mistakes or disclosing sensitive or secret information. “Phishing” aims to steal secret information, such as credit card numbers and passwords, through deceptive e-mails; “spear-phishing” is a more sophisticated version of phishing that targets specific individuals or groups within an organization [10]; “whaling” is a spear-phishing attack against high-level users (executives, politicians, etc.); “smishing,” derived from “SMS” and “phishing,” occurs when victims’ financial or personal information is gathered via SMS messages; likewise, “vishing,” derived from “voice” and “phishing,” occurs when victims’ sensitive information is gathered via phone; “business e-mail compromise” (BEC) is a sophisticated fraud targeting businesses and organizations, in which social engineering techniques are used to gain access to an employee’s or executive’s e-mail account in order to initiate fraudulent bank transfers. New trends in social engineering attacks concern the non-fungible token (NFT) market, the use of fake profiles on social media, social media account hijacking, phishing, and impersonation attacks [11]. In the context of the Russia-Ukraine conflict, many war-themed social engineering attacks targeted European governments, civilians, and organizations.
Threats against data are a collection of threats that target data sources with the goal of disclosing and manipulating these data through unauthorized access. These kinds of threats are often the first step in a more sophisticated attack, such as ransomware, Denial of Service (DoS), Distributed Denial of Service (DDoS), Ransom Denial of Service (RDoS), or phishing. Technically speaking, threats against data can be classified mainly into data breaches and data leaks. Art. 4, no. 12 of Regulation (EU) 2016/679, known as the “GDPR,” defines a “personal data breach” as “a breach of security leading to the accidental or unlawful destruction, loss, alteration, unauthorized disclosure of, or access to, personal data transmitted, stored or otherwise processed”; in the classification used here, a “data breach” is a deliberate attack on a system or organization with the intention of stealing data, whereas a “data leak” is an unintentional disclosure of sensitive, confidential, or protected data due to misconfigurations, vulnerabilities, or human errors: it is not an intentional attack. Data breaches and data leaks are the most traditional threats against data, but today, with the growth of machine learning (ML) and artificial intelligence (AI), the manipulation of data also requires attention, because it aims to undermine trust in IT and production systems and, more generally, in our society. The current trend is an increase in data breaches: consequently, personal and sensitive data are easily accessible to malicious actors via online forums and the dark web. Ransomware-related data breaches are gaining importance, and ransomware was one of the top three root causes of compromise during the last year [12]. As reported in ENISA’s research, ransomware is also increasingly used in combined attacks that target the CIA triad of systems and the corresponding data. According to Cloudflare, in Q4 2021, ransom DDoS attacks increased by 29% year-over-year and 175% quarter-on-quarter [13].
Moreover, the ENISA report highlights that although cloud migration has increased continuously in the last few years and is now moving to multicloud strategies, data management and protection are still lagging, and there is a lack of maturity in cloud data security [14] due to limited use of encryption, perceived multicloud complexity, and the rapid growth of enterprise data.
Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks are threats against availability: they aim to prevent users of a system or service from accessing relevant data, services, or other resources, and, in many situations, the cloud is the primary threat vector. DDoS is one of the most critical threats to IT systems, targeting their availability by exhausting resources, which leads to decreased performance, loss of data, and service outages. In 2021–22, while the COVID-19 pandemic still had an important impact on DDoS, the Russia-Ukraine cyberwarfare shaped DDoS threats as never before, with DDoS emerging as a new war strategy. The current trend highlighted by the ENISA report is an increase in the complexity and size of DDoS attacks: attackers are constantly innovating and adopting new techniques, and traditional DDoS is moving toward mobile networks and IoT. Cloudflare, in its DDoS Attack Trends for 2022 Q1 [15], also reported the important role of the Russia-Ukraine war in shaping DDoS threats. Ransom Denial of Service (RDoS) is the new frontier of DoS and DDoS attacks: it mixes different techniques (DDoS, ransomware, identity spoofing), with the goal of identifying vulnerable systems and carrying out activities that culminate in a ransom demand.
Internet threats are also threats against availability; they aim to limit the use of the Internet and the flow of information. These kinds of threats emerged strongly from the Russia-Ukraine conflict: since the invasion of Ukraine, Russia has taken control of Internet infrastructure in order to block access to social media and to control the flow of information.
“Misinformation” and “disinformation” are major threats to democracy and to a free and modern society, and they have further implications from a security and privacy point of view [16]. Based on ENISA’s definition, “misinformation” is an unintentional attack, in which information is shared inadvertently, for example, when a journalist reports wrong information in good faith or by mistake; “disinformation” is an intentional attack that consists of creating or sharing false or misleading information. Disinformation as a method of information warfare dates back to the Cold War, and it returned to prominence in the USA after the 2016 election, when Russia was accused of having interfered with the US election process. The role of disinformation in cyberwar has again been highlighted in the Russia-Ukraine conflict, where cyberwar moved from cyberattacks to disinformation, which has been used by all actors in the war. Today, AI is becoming increasingly central to the creation of disinformation; in particular, deepfake technology is evolving quickly: it is increasingly easy to produce deepfakes using an app and a common smartphone and to generate fake content (audio, video, images, and text) that is almost impossible to distinguish from real content.
Validating information is a major issue in our digital world: how can we be sure that information is authentic when social media, search engines, and news platforms are today the common source of information for many people, and information that generates more views is usually promoted, regardless of whether it has been validated? Can we be sure that information is authentic just because it is shared or re-tweeted by millions of people? The Global Risks Report 2022 of the World Economic Forum [17] highlights how “the growth of deepfakes” and “disinformation-for-hire” are likely to deepen mistrust between societies, business, and government. For example, deepfakes could be used to sway elections or political outcomes. According to Microsoft, threat actors are increasingly using cybersecurity and disinformation attacks in tandem to accomplish their goals [18].
A “supply chain attack” targets the relationship between organizations and their suppliers. According to the ENISA Threat Landscape for supply chain attacks [19], an attack is considered a supply chain attack when it consists of a combination of at least two attacks: a first attack on a supplier, which is then used to attack a target and gain access to its assets, where the target can be the final customer or another supplier. The SolarWinds case [20] was one of the first revelations of this type of attack and showed the potential impact of attacks on the supply chain. Based on research, supply chain compromise was the second most prevalent initial infection vector identified in 2021. In response to these concerns, the European Commission drove the release of the NIS 2 Directive [21], which covers the security of supply chains, and presented its Cybersecurity Strategy [22].
5.2.3 The ENISA NIS Investments 2022 Report
In November 2022, ENISA published the report Network and Information Security (NIS) Investments in the EU [23]. The report investigates how Operators of Essential Services (OES) and Digital Service Providers (DSP) invest in cybersecurity and comply with the objectives of the European Union’s directive on Security of Network and Information Systems (NIS Directive). Most OES/DSPs in the EU indicate the NIS Directive and other regulatory obligations, as well as the threat landscape, as the main factors influencing their NIS budgets. However, the report shows that the proportion of the Information Technology (IT) budget dedicated to Information Security (IS) is lower than in last year’s findings, dropping from 7.7% to 6.7%. As ENISA suggests, this decrease, which comes despite the growing threat of cyberattacks, can be attributed to the composition of the survey sample, with a higher representation of OES from energy and health, but also to the macroeconomic environment, such as the COVID-19 impact on the respective budgets.
5.3 The Charter of Trust
As illustrated in Sect. 5.2, our digital, technological, and hyperconnected world is increasingly exposed to security threats. Both these new threats and the digital transformation of society, intensified by the COVID-19 crisis, have expanded the threat landscape and are posing new challenges, which require adapted and innovative responses and strong safeguards. With this landscape in mind, at the Munich Security Conference (MSC) on 16 February 2018, Siemens and eight other industrial partners were the first to sign [30] the Charter of Trust [31], an unprecedented initiative for greater cybersecurity and a call for binding rules and standards to build trust in cybersecurity and further advance digitalization. “Confidence that the security of data and networked systems is guaranteed is a key element of the digital transformation,” said Siemens President and CEO Joe Kaeser. “That’s why we have to make the digital world more secure and more trustworthy. It’s high time we acted – not just individually but jointly with strong partners who are leaders in their markets. We hope more partners will join us to further strengthen our initiative.”
Siemens initiated the Charter of Trust because daily life is increasingly exposed to malicious cyberattacks, a trend now confirmed by the ENISA reports. Digitalization has transformed nearly every aspect of modern life. Today, billions of devices are connected through the Internet of Things. While this creates great opportunities, it harbors even greater risks if we are unprepared. In line with the EU Cybersecurity Strategy, Siemens’ approach is that cybersecurity is crucial to the success of the digital economy, because people and organizations need to trust that their digital technologies are safe and secure; otherwise, they won’t embrace the digital transformation.
Today, the Charter of Trust’s members have transformed it into a unique initiative of leading global companies and organizations working together to make the digital world of tomorrow safer. In addition to Siemens and the Munich Security Conference, the signatories include AES, Airbus, Allianz, Atos, Bosch, Dell Technologies, Deutsche Post DHL Group, IBM, Infineon Technologies AG, Mitsubishi Heavy Industries, NTT, NXP Semiconductors, SGS, TotalEnergies, TUV SUD, and, most recently, Microsoft, which joined on 10 November 2022. To provide an effective setting for discussing best practices for implementing the Charter’s principles, assessing cyber trends and developments, and working together on specific projects, the Charter of Trust’s industry partners work with the Associated Partner Forum (APF), composed of regulators, research institutes, universities, and working groups.
The Charter of Trust is focused on three primary goals:
1. To protect the data of individuals and business
2. To prevent harm to people, business, and infrastructure
3. To establish a reliable basis where confidence in the digital world can grow
It also identifies ten principles where action is needed, creating for the very first time a common understanding of cybersecurity. Security by default, education and awareness, standardization, and evaluation of risks are essential elements in the prevention and mitigation of security risks and threats. However, the question we want to answer in this paper is the following: are these approaches also effective in the context of data protection?
5.4 How the Charter of Trust Supports Data Protection
In this chapter, we explore a subset of the Charter of Trust’s principles to discuss whether these principles, useful in terms of cybersecurity, are also effective in terms of data protection.

Principle 01: Ownership of cyber and IT security
Description: Anchor the responsibility for cybersecurity at the highest governmental and business levels by designating specific ministries and CISOs. Establish clear measures and targets as well as the right mindset throughout organizations: “It is everyone’s task”

Principle 02: Responsibility throughout the digital supply chain
Description: Companies and governments must establish risk-based rules that ensure adequate protection across all IoT layers with clearly defined and mandatory requirements. The goal of this area is to ensure confidentiality, authenticity, integrity, and availability by setting baseline standards, such as identity and access management, encryption, and continuous protection

Principle 03: Security by default
Description: Adopt the highest appropriate level of security and data protection and ensure that it’s preconfigured into the design of products, functionalities, processes, technologies, operations, architectures, and business models

Principle 04: User-centricity
Description: Serve as a trusted partner throughout a reasonable lifecycle, providing products, systems, and services as well as guidance based on the customer’s cybersecurity needs, impacts, and risks

Principle 05: Innovation and co-creation
Description: Combine domain know-how, and deepen a joint understanding between firms and policymakers of cybersecurity requirements and rules in order to continuously innovate and adapt cybersecurity measures to new threats; drive and encourage, e.g., contractual public-private partnerships

Principle 06: Education
Description: Include dedicated cybersecurity courses in school curricula (as degree courses in universities, professional education, and trainings) in order to lead the transformation of skills and job profiles needed for the future

Principle 07: Cyber-resilience through conformity and certification
Description: Companies and, if necessary, governments ensure cyber-resilient products, systems, services, and processes through conformity assessments including, e.g., verification by independent parties

Principle 08: Transparency and response
Description: Maintain and expand a network of experts who share new insights and information on incidents to foster collective cybersecurity; engage with regulators and other stakeholders on threat intelligence sharing policy and exchange best practices

Principle 09: Regulatory framework
Description: Promote multilateral collaborations in regulation and standardization to set a level playing field matching the global reach of the WTO; inclusion of rules for cybersecurity into free trade agreements (FTAs)

Principle 10: Joint initiatives
Description: Drive joint initiatives including all relevant stakeholders, in order to implement the above principles in the various parts of the digital world without undue delay

5.4.1 The “Default” Concept and the Charter of Trust Principle 03: Security by Default
The choice of defaults in software engineering is not a new concept [32], and the question of appropriate pre-settings for information and communication technology has long kept software developers and architects busy. The concept of “data protection by default,” established in art. 25, par. 2 of the GDPR as an obligation for controllers, is still far from being understood and adopted as a regular principle in software design methods. It mandates that the controller, using appropriate technical and organizational measures, shall ensure that only personal data that are necessary for the purpose are processed; this applies to the amount of personal data collected, the extent of their processing, the period of their storage, and their accessibility. Moreover, the controller shall ensure that by default personal data are not made accessible, without the individual’s intervention, to an indefinite number of natural persons. But selecting pre-defined settings is not trivial, even when security and data protection are kept in mind: default settings in modern systems and services do not always respect the data protection principles and are sometimes decisive for the risk to the rights and freedoms of individuals. Ann Cavoukian, who first introduced “Privacy by Design” [33], already dedicated one of her seven foundational principles to default settings.
The obligation of data protection by default in art. 25, par. 2 of the GDPR is closely interlinked with that of data protection by design in art. 25, par. 1 of the GDPR, under which the controller shall implement appropriate technical and organizational measures designed to implement the data protection principles of the GDPR in an effective manner and integrate the necessary safeguards into the processing of personal data. Data protection by default can be seen as a natural extension of data protection by design in the field of privacy engineering, which means embedding privacy requirements into the design and operation of information systems. Data protection by design and by default is therefore also closely interlinked with art. 32 of the GDPR, “Security of Processing,” another essential GDPR requirement. The two concepts, data protection and security, are closely linked, because when personal data are strongly secured, it is reasonable to expect that they will also enjoy a high level of protection. Of course, tensions between data protection and security are also possible, for example, when a security default requires increased monitoring or surveillance of data subjects, so in many cases a balance between security and data protection is required.
In the Guidelines 4/2019 on data protection by design and by default [34], the European Data Protection Board (EDPB) also highlights the link between data protection and security and how the principle of integrity and confidentiality includes protection against unauthorized or unlawful processing and against accidental loss, destruction, and damage. The Charter of Trust’s security by default principle, which also includes the concept of “by design,” aims to put security first, just as art. 25 of the GDPR puts data protection first; that is, it ensures that security is considered right from the start of the design of a product, a process, or even a business model. The goal is to define, by design and by default, the critical cybersecurity requirements needed to deliver secure products, processes, services, and business models in line with current standards and best practices and to protect data; this is also in accordance with the EDPB’s Guidelines 4/2019 and with data protection requirements. Moreover, the third principle of the Charter of Trust also clarifies the concept of security by default and provides a practical approach: the Charter identified different phases and baseline requirements to support security by default in different areas and to provide guidance on how to embed security into the design of products, operations, and business (the last of these is still in progress). In the last column of the tables, “Data protection considerations,” this work aims to highlight the benefits for data protection as well (Tables 5.1 and 5.2).
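As a minimal sketch of what “data protection by default” under art. 25, par. 2 of the GDPR could look like in software, the following Python fragment models per-account settings whose defaults are the most restrictive option for each of the four dimensions named there (amount of data collected, extent of processing, period of storage, accessibility). The class ProcessingSettings, the function change_settings, the field names, and the 30-day retention value are hypothetical and chosen only for illustration; they are not taken from the GDPR, the EDPB’s Guidelines, or the Charter of Trust.

```python
from dataclasses import dataclass, replace
from datetime import timedelta


@dataclass(frozen=True)
class ProcessingSettings:
    """Hypothetical per-account settings; each default is the most restrictive option."""

    # Amount of personal data collected: only what the service needs to work.
    collected_fields: frozenset = frozenset({"email"})
    # Extent of processing: no optional processing unless the user opts in.
    analytics_enabled: bool = False
    # Period of storage: illustrative value, not a legal requirement.
    retention: timedelta = timedelta(days=30)
    # Accessibility: not visible to an indefinite number of persons by default.
    profile_visibility: str = "private"


def change_settings(current, *, user_confirmed, **changes):
    """Return updated settings, but only after an explicit action by the individual."""
    if not user_confirmed:
        raise PermissionError("settings can only be changed by the user's own action")
    return replace(current, **changes)


defaults = ProcessingSettings()
opted_in = change_settings(defaults, user_confirmed=True, profile_visibility="public")
print(defaults.profile_visibility, opted_in.profile_visibility)
```

The settings object is immutable, so a wider setting can only come into existence through change_settings, which refuses to act without the individual’s intervention; a real system would also need to record that intervention and apply the same logic to every processing purpose.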

Table 5.1 Phase 1: “Products, Functionalities, Technologies” baseline requirements

Baseline: Unique identity
Requirement: Assets shall be uniquely identifiable
Data protection considerations: Identifying an asset is the core element in understanding its data protection risks

Baseline: Secure onboarding
Requirement: When an asset is being onboarded into an environment, the asset shall be able to assert its unique identity
Data protection considerations: This is linked to the capability to inventory the asset in order to identify its data protection risks

Baseline: Secure credentials
Requirement: Universal default, hardcoded, and weak credentials shall not be used
Data protection considerations: The asset’s default credentials should be managed securely throughout the asset’s entire lifecycle. This is also in accordance with the EDPB’s Guidelines 4/2019 and art. 32 of the GDPR

Baseline: Login protection
Requirement: Either the asset or the system will implement account lockout or an authentication back-off timer
Data protection considerations: This is to prevent authentication attacks (such as brute-force log-in). Access control management is also mentioned in the EDPB’s Guidelines 4/2019, and it is also part of the appropriate technical and organizational measures referred to in art. 32 of the GDPR

Baseline: Access control
Requirement: Assets shall include strong authentication mechanisms and have them enabled by default. Authorization shall be used to ensure legitimate use and mediate attempts to access resources in/from a system
Data protection considerations: Strong authentication mechanisms are part of state-of-the-art technologies, such as role-based access control, attribute/context-based access control, and adaptive access control. The state of the art, access control management, and “protection according to the risk” are also mentioned in the EDPB’s Guidelines 4/2019, and they are also part of art. 32 of the GDPR

Baseline: Secure storage
Requirement: Storage for security-sensitive data shall be secured
Data protection considerations: Protection according to the risk, pseudonymization, and anonymization are also mentioned in the EDPB’s Guidelines 4/2019 and are also part of art. 32 of the GDPR

Baseline: Secure communications
Requirement: Sensitive data and system information, including management and control process data, shall be protected while in transit
Data protection considerations: This is a measure to address data protection risks, based on current standards, guidance, and cryptographic protocols, according to local legislation. Protection according to the risk, pseudonymization, and anonymization are mentioned in the EDPB’s Guidelines 4/2019 and are also part of art. 32 of the GDPR

Baseline: Minimize attack surface
Requirement: Security features shall be enabled by default and functionalities that are not required or are insecure shall be disabled by default
Data protection considerations: Measures such as enabling security configurations by default, disabling by default functionalities that are not necessary, and removing by default software services that are not secure are key to reducing the attack surface, and they are also part of art. 32 of the GDPR

Baseline: Secure data deletion
Requirement: Manufacturers shall provide functionality for customers to securely wipe customer data
Data protection considerations: This is part of the measures that guarantee the confidentiality of the previous asset owner’s data

Baseline: Backup feature
Requirement: Relevant assets shall provide a backup feature for data
Data protection considerations: The goal is to ensure availability of asset data, where the level of availability depends on the risk. Backups/logs are also mentioned in the EDPB’s Guidelines 4/2019 and are also part of art. 32 of the GDPR

Baseline: Security documentation
Requirement: Manufacturers shall provide a comprehensive security guide for the asset which details minimal steps and follows security best practices on usability
Data protection considerations: The security documentation should reflect current standards and best practices and should help users maintain the same level of security on the asset. Under “addressing effectiveness,” the EDPB highlights how important it is for the controller to “have documentation of the implemented technical and organizational measures,” based on recitals 74 and 78 and art. 32 of the GDPR

Baseline: Validate input data
Requirement: All input data shall be validated prior to use by the asset
Data protection considerations: Input data need to be validated to avoid security threats and to demonstrate accuracy. Accuracy is a fundamental principle of data protection, defined in art. 5, par. 1, lett. (d) of the GDPR

Baseline: Password changed on first use
Requirement: Relevant assets shall force a password change during the initial setup
Data protection considerations: This is a measure to guarantee that the asset’s data is protected from unauthorized access, also in accordance with the EDPB’s Guidelines 4/2019 and art. 32 of the GDPR

Baseline: Secure updates
Requirement: Assets shall have the ability to securely update and remove/mitigate vulnerabilities and bugs, during their lifecycle, in a timely fashion
Data protection considerations: Vulnerabilities and bugs need to be mitigated in a timely fashion during the asset’s entire lifecycle. The concept of “state of the art,” mentioned in recitals 78 and 83 and art. 25, par. 1 of the GDPR and in the Guidelines 4/2019, reminds controllers and processors that they need to take account of the current progress in technology and stay up to date on technological advances; of how technology can present data protection risks or opportunities to the processing operation; and of how to implement and update the measures and safeguards that secure effective implementation of the principles and rights of data subjects, taking into account the evolving technological landscape. It is also part of art. 32 of the GDPR

Baseline: Telemetry and event monitoring
Requirement: Assets shall implement logging for telemetry and security-related events
Data protection considerations: Logging of information enables detection of anomalous behavior and can provide the necessary visibility for incident handling. Measures to provide default integrity and confidentiality include logs, audit trails, and event monitoring as a routine security control, as reported in the EDPB’s Guidelines 4/2019. These measures can also be considered “appropriate technical security measures” in terms of art. 32 of the GDPR

Baseline: Maintain settings after outage
Requirement: Assets shall maintain settings after power outage
Data protection considerations: Resilience should be built into assets. Measures to provide default integrity and confidentiality include disaster recovery/business continuity requirements to be able to restore the availability of personal data following major incidents, as mentioned in the EDPB’s Guidelines 4/2019. The measure can also be considered an “appropriate technical security measure” in terms of art. 32 of the GDPR

Baseline: Factory reset
Requirement: Assets shall provide a means to return to original factory configuration with all customer data securely removed
Data protection considerations: Assets shall return to their default or nonexistent credentials only after a factory reset that is properly documented and protected. This is part of the data minimization and storage limitation principles, defined in art. 5, par. 1, lett. (c) and lett. (e) of the GDPR: removal of sensitive data and automation of deletion procedures are also among the default data minimization and storage limitation elements in the EDPB’s Guidelines 4/2019 and art. 32 of the GDPR

Baseline: No backdoors
Requirement: No undocumented ways to remotely connect to the asset shall be put in place by the manufacturer
Data protection considerations: The existence of backdoors compromises security and affects trust with users. All ways to connect to the asset should be documented and made available to the customer to ensure transparency. Transparency is mentioned in art. 5, par. 1, lett. (a) of the GDPR as a fundamental principle: its importance is highlighted in recitals 13, 39, 58, 78, and 100 of the GDPR; art. 12 of the GDPR sets conditions on transparency for effective communication. Avoiding backdoors is also considered an “appropriate technical security measure” in terms of art. 32 of the GDPR

Baseline: Conceal password characters
Requirement: Assets shall mask all passwords during input by default
Data protection considerations: This is a measure to prevent password disclosure and unauthorized access; it is also considered one of the “appropriate technical security measures” in terms of art. 32 of the GDPR

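The “Login protection” baseline in Table 5.1 asks for account lockout or an authentication back-off timer. The following Python sketch shows one possible way to combine both ideas: after a configurable number of consecutive failures, further attempts are rejected for a delay that doubles with each additional failure. The class name LoginGuard, its parameters, and the doubling schedule are assumptions made for illustration; the Charter of Trust baseline does not prescribe a specific algorithm or threshold.

```python
import time


class LoginGuard:
    """Tracks failed logins per account and enforces a growing back-off delay."""

    def __init__(self, max_failures=5, base_delay=1.0, max_delay=300.0, clock=time.monotonic):
        self.max_failures = max_failures  # failures tolerated before blocking starts
        self.base_delay = base_delay      # first block duration, in seconds
        self.max_delay = max_delay        # upper bound for the block duration
        self._clock = clock               # injectable time source, useful for tests
        self._failures = {}               # account -> consecutive failed attempts
        self._blocked_until = {}          # account -> time at which attempts resume

    def is_allowed(self, account):
        """Return True if the account may attempt to authenticate now."""
        return self._clock() >= self._blocked_until.get(account, float("-inf"))

    def record_failure(self, account):
        """Count a failed attempt and start or extend the block if needed."""
        count = self._failures.get(account, 0) + 1
        self._failures[account] = count
        if count >= self.max_failures:
            # Exponential back-off: base, 2*base, 4*base, ... capped at max_delay.
            delay = min(self.base_delay * 2 ** (count - self.max_failures), self.max_delay)
            self._blocked_until[account] = self._clock() + delay

    def record_success(self, account):
        """A successful login clears the failure history for the account."""
        self._failures.pop(account, None)
        self._blocked_until.pop(account, None)


guard = LoginGuard(max_failures=3, base_delay=10.0)
for _ in range(3):
    guard.record_failure("alice")
print(guard.is_allowed("alice"))  # False: the third failure starts a 10-second block
```

A caller would check is_allowed before verifying credentials and then call record_failure or record_success. In line with the table’s data protection considerations, such a mechanism limits brute-force log-in attempts; a production system would additionally need persistent storage and protection against locking out legitimate users on purpose.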
5.4.2 A User-Centricity Approach
The fourth principle of the Charter of Trust, user-centricity, focuses on providing products, systems, and services based on the customer’s cybersecurity needs, impacts, and risks. This principle comes from the observation that secure systems have a particularly rich tradition of indifference to the user, whether the user is a security administrator, a programmer, or an end user: security functionalities are often difficult to use, and some aspects of secure systems are hard to make easy to use. To design for the user, system designers should analyze users’ expected behavioral engagement with the system, so that the design can address potential risks, and should consider user needs as a primary design goal from the start of secure system development. The same concerns are valid in the field of data protection and privacy: many current approaches treat data as an impersonal entity and ignore users’ perspectives and needs, and developers very often have different perceptions of what data protection concepts mean, for example, accuracy or data minimization. The GDPR aims to give users control over their personal data, but this is hard in the current digital age, in which the probability of data protection risks for users increases because of the large amount of data collected and shared. A user-centric approach to security will also have a positive impact on data protection and will help to develop an effective data protection framework that embeds security and privacy into systems tailored to the needs and expectations of users and improves protection, usability, and user satisfaction.
5.4.3 Education
The sixth principle of the Charter of Trust aims to provide awareness training to improve skills and knowledge in the field of cybersecurity. Security training is also mandatory for data protection and privacy, in order to understand the state of the art of technologies and the risks associated with personal data processed, stored, or transmitted, to learn how to mitigate security risks, and to define protection measures. Moreover, in art. 4, no. 10, the GDPR explicitly refers to “persons authorized to process personal data

Table 5.2 Phase 2: “Processes, Operations, Architectures” baseline requirements

Baseline: Security management program
Requirement: A security management program based on best practices shall be established and implemented to continuously improve the security posture
Data protection considerations: The organization shall establish, implement, maintain, and continually improve security policies, processes, and procedures for the entire lifecycle of assets in different areas, including business continuity and disaster recovery, identity and access management, threat identification, and mitigation. These topics are also included in the EDPB’s Guidelines 4/2019 to provide default integrity and confidentiality and are also considered “appropriate technical security measures” in terms of art. 32 of the GDPR

Baseline: Risk management process
Requirement: Security risk shall be managed in the organization for critical assets based on risk assessment
Data protection considerations: The organization should define a security risk assessment process to identify, evaluate, and analyze security risks and establish a process for risk treatment. The concept of “state of the art” in art. 25, par. 1 of the GDPR also relates to how technology can present data protection risks or opportunities to the processing operation; art. 32 of the GDPR requires the controller and the processor to implement appropriate technical and organizational security measures: in this context, “appropriate” means based on the “risk” of the processing, which also includes the security risk

Baseline: Human resources security
Requirement: Processes shall be established in human resources to support security management prior to and during onboarding, as well as off-boarding, of personnel
Data protection considerations: This includes pre-employment verification (in accordance with the relevant laws and regulations), compliance of contract terms and conditions with the organization’s security policies, confidentiality agreements, segregation of roles/duties, planning of security awareness training for employees, and processes for job termination. This requirement can help organizations to demonstrate “accountability,” defined in art. 5, par. 2 of the GDPR, and to manage the entire flow of personal data concerning employees and customers (privacy notice, consents, data processing, and dismissal)

N. Giusti
Table 5.2 (continued)

Baseline: Training
Requirement: A minimum level of security education and training on key security issues shall be regularly deployed for employees.
Data protection considerations: Executives should promote security awareness, and the company should regularly provide its employees with appropriate security training, in accordance with the company’s security framework and the respective roles, tasks, and areas of responsibility. This is related to the “state of the art” mentioned in art. 25, par. 1 of the GDPR, and can be considered one of the “appropriate technical security measures” in terms of art. 32 of the GDPR.

Baseline: Asset management
Requirement: Policies and procedures shall be in place for the management of assets throughout their lifecycle, including onboarding, changes, and off-boarding.
Data protection considerations: The purpose is to provide necessary and up-to-date information about assets, which is a prerequisite for many other requirements. The information asset determines the necessary level of protection and the “appropriate technical security measures” mentioned in art. 32 of the GDPR.

Baseline: Identity and access management
Requirement: Access to assets shall be limited to authorized identities only for the time needed and managed based on risk and the principle of least privilege.
Data protection considerations: It includes processes, policies, and a supporting technical infrastructure to manage and operate human and nonhuman access to assets, as well as identification, authentication, and authorization methods for entities accessing the asset, corresponding to the risk level assigned to the asset. It is linked to art. 32 of the GDPR, but also to art. 25 of the GDPR regarding accessibility to personal data. As mentioned in the EDPB’s Guidelines 4/2019, access control management is considered a default element for providing integrity and confidentiality by default.

Baseline: Credentials management
Requirement: Organizations shall have a process of enforcing current security best practices to manage credentials and cryptographic material throughout their entire lifecycle.
Data protection considerations: Passwords, PINs, biometric templates, encryption keys, tokens, or certificates should be managed securely throughout their lifecycle using best security practices. It is linked to art. 32 of the GDPR but also to art. 25 of the GDPR, where security requirements need to be considered as early as possible to provide confidentiality, integrity, and availability of personal or sensitive data.
Table 5.2 (continued)

Baseline: Physical security
Requirement: Physical security shall be in place to protect assets by providing access control and protecting information.
Data protection considerations: Physical security is the protection of personnel, hardware, software, networks, and data from physical actions and events that could cause serious loss or damage to an enterprise. This includes protection from fire, flood, natural disasters, and terrorism. Procedures of physical security include access control, intrusion detection, protection of essential equipment, prevention of unauthorized access, and alarms triggered on deviations. Physical security failures still account for many common security and data breaches, and physical security is part of the requirements set by art. 32 of the GDPR.

Baseline: Security documentation
Requirement: Processes shall be in place to ensure proper and accessible security documentation, including information about capabilities, risks, and mitigation strategies.
Data protection considerations: Documentation for the product or service should include information about security capabilities, threats, known residual risks, and mitigation strategies for those risks, and guidelines should be provided for installation, operation, maintenance, monitoring, and administration, including hardening. In the field of “addressing effectiveness,” the EDPB highlights that it is important for the controller to “have documentation of the implemented technical and organizational measures,” based on recitals 74 and 78 and art. 32 of the GDPR.

Baseline: Continuous monitoring
Requirement: Robust monitoring for critical assets shall be put in place for all relevant events, and logged information shall be protected.
Data protection considerations: During the operation of an asset, processes and procedures should be implemented to ensure effective detection of and reaction to threats. This falls under art. 32 of the GDPR but also under art. 25 of the GDPR, to guarantee integrity and confidentiality by design and default, as highlighted by the EDPB’s Guidelines 4/2019.
Table 5.2 (continued)

Baseline: Vulnerability management
Requirement: A vulnerability management process shall be established for the duration of the support lifecycle of assets, including the collection of vulnerability notifications, proactive monitoring, responding to vulnerabilities, and related communication. Security updates shall be implemented to address vulnerabilities in a timely, transparent, and secure manner throughout the entire asset lifecycle.
Data protection considerations: Processes and procedures should be implemented to monitor and/or regularly check for security updates, consider whether updates are relevant and evaluate whether mitigations are necessary, patch, update, and/or implement mitigations, inform potentially affected users, and maintain the security of the asset. This falls under art. 32 of the GDPR, but also under art. 25 of the GDPR. As mentioned in the EDPB’s Guidelines 4/2019, recital 78 of the GDPR suggests a responsibility on controllers to continually assess whether they are always using the appropriate means of processing and whether the chosen measures counter the existing vulnerabilities.

Baseline: Threat identification and mitigation
Requirement: Procedures and policies shall be in place to monitor, identify, and mitigate threats to assets.
Data protection considerations: These include threat monitoring to identify and assess external and internal threat sources, threat identification and assessment, and threat mitigation. They fall under art. 32 of the GDPR but also under art. 25 of the GDPR to guarantee integrity and confidentiality by design and default, as mentioned in the EDPB’s Guidelines 4/2019.

Baseline: Segmentation
Requirement: Physical and logical segmentation shall be in place to minimize security risks and to protect critical assets.
Data protection considerations: Segmentation (physical or logical) helps to limit fault propagation and to support resilience, but it is also a way to understand where personal or sensitive information is stored and to be sure that only appropriate zones or users have access to one another. This falls under art. 32 of the GDPR but also under art. 25 of the GDPR to guarantee integrity and confidentiality by design and default, as mentioned in the EDPB’s Guidelines 4/2019.

Baseline: Secure development lifecycle
Requirement: Policies and procedures shall be in place for secure development best practices to ensure the integrity of the developed assets and minimize vulnerabilities.
Data protection considerations: These include secure coding guidelines, threat modelling, methods to stay informed of attackers’ capabilities, testing and validation, clearly defined roles and responsibilities for design, development, validation, and acceptance, configuration management, and security documentation. These fall under art. 32 of the GDPR but also under art. 25 of the GDPR to guarantee integrity and confidentiality by design and default, as mentioned in the EDPB’s Guidelines 4/2019.
Table 5.2 (continued)

Baseline: Security incident management
Requirement: Policies and procedures for the management of security incidents shall be established to mitigate risks and minimize damage should an incident occur.
Data protection considerations: These include procedures to manage incidents and to publish information after the occurrence of a security incident. They fall under art. 32, 33, and 34 of the GDPR, which introduces, in certain cases, the requirement to notify a personal data breach to the competent national supervisory authority and to communicate the breach to the individuals whose personal data have been affected. They are also relevant to accuracy, one of the fundamental principles defined in art. 5, par. 1(d) of the GDPR, and to art. 25 of the GDPR to guarantee integrity and confidentiality by design and default, as mentioned in the EDPB’s Guidelines 4/2019.

Baseline: Business continuity and disaster recovery
Requirement: Policies and procedures shall be in place to identify, maintain, and re-establish necessary business operations in a timely manner, including ensuring proper restoration of data and services in case of disruption.
Data protection considerations: The organization should assess the risk of a disaster to critical business activities and develop and test plans to provide protection against loss or unavailability of critical assets. This falls under art. 32 of the GDPR and art. 25 of the GDPR, to guarantee integrity and confidentiality by design and default. The EDPB’s Guidelines 4/2019 explicitly mention disaster recovery/business continuity as a measure to restore the availability of personal data following major incidents.

Baseline: Security auditing
Requirement: Regular and ad hoc internal and external security audits/assessments shall take place to verify compliance with company security policies and relevant regulations.
Data protection considerations: These include regular internal and external security audits based on relevant global standards such as ISO 27001 and IEC 62443, to ensure awareness and mitigation of security risks with dedicated corrective actions. They fall under art. 32 of the GDPR and art. 25 of the GDPR to guarantee integrity and confidentiality by design and default, as mentioned in the EDPB’s Guidelines 4/2019.
under the direct authority of controller or processor,” and art. 29 of the GDPR states that “any person acting under the authority of the controller or of the processor, who has access to personal data, shall not process those data except on instructions from the controller, unless required to do so by Union or Member State law.” It is therefore essential to provide authorized personnel with operating instructions, including their obligations relating to security measures, and with the necessary training.
5.4.4 Transparency and Response
The eighth principle of the Charter of Trust focuses on communication during security incidents and on the exchange of information between stakeholders under a threat intelligence sharing policy. This is a fundamental prerequisite for improving resilience and knowledge of incidents, as also highlighted by the EU Cybersecurity Strategy. A key aspect of incident remediation is notifying customers when incidents impact their data, and trust is created through transparency. Providing prompt, clear, and accurate communication about the incident, with the right level of detail and, if required, the steps to mitigate the risk, is a requirement for security but also for data protection, as mentioned in recitals 85–88 and art. 33–34 of the GDPR. Transparency is a core element, and it is mentioned many times in the GDPR: in recital 58, which requires that “information . . . be concise, easily accessible and easy to understand, and that clear and plain language . . . be used”; in recital 78, where transparency is a method to achieve data protection by design and by default under art. 25; and in art. 12 and 13.
5.5 Conclusion
After 4 years, the Charter of Trust experience can be described as a genuine success story [35]. This is because, instead of issuing vague statements of intent, it sets out specific measures that are first implemented by its own members and partners, in the spirit of “leading by example.” Many other companies are following the activities of the alliance closely, and even policymakers and security authorities are listening to what it has to say. The Charter also had the merit of strengthening the bond of trust between the partners, without which nothing could be achieved in a networked world, and it succeeded in bringing about specific improvements in cybersecurity. Based on the Charter, the partners are involved in other topics, such as the design of a holistic approach to better education in the field of cybersecurity and the foundations for security by default. Currently, there is not enough data to demonstrate the effectiveness of the Charter, for example, in terms of number of threats before and after the application
of the Charter. But it seems clear that the principles of security by design, transparency, adherence to standards, and education are all essential elements in combating the security threats to which we are subjected today, as highlighted by the ENISA report; these elements are also present in the EU Cybersecurity Strategy. The aim of this work was to evaluate the practical aspects of the Charter of Trust and the opportunities of using it in the field of data protection, and not only in cybersecurity. Moreover, it presented the ten principles of the Charter and discussed those aspects that can be considered most relevant for data protection. We can conclude that the application of the Charter’s principles and the respective baselines offers an evident advantage in the field of security but also in data protection. In particular, it offers an ecosystem that ensures confidentiality and integrity by design and by default in the processing of personal data, supports education and awareness, improves transparency, and supports a user-centric approach to data protection based on the needs and requirements of users. The aim is to give users more control over their personal data, to allow them to make informed choices, and to improve their protection in the digital age, where our personal data represents our aspirations, what we believe in, and the moments we cherish most in our lives.
References
1. https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:52017JC0450&from=en
2. https://www.enisa.europa.eu/publications/enisa-threat-landscape-2022
3. https://www.enisa.europa.eu/publications/enisa-threat-landscape-2021
4. https://www.enisa.europa.eu/publications/enisa-threat-landscape-for-ransomware-attacks
5. ENISA Threat Landscape for Ransomware Attacks – Ransomware definition – https://www.enisa.europa.eu/publications/enisa-threat-landscape-for-ransomware-attacks
6. https://csrc.nist.gov/CSRC/media/Publications/nistir/draft/documents/NIST.IR.8374preliminary-draft.pdf
7. https://csrc.nist.gov/glossary/term/malware
8. https://www.enisa.europa.eu/publications/enisa-threat-landscape-2021
9. https://www.sonicwall.com/2022-cyber-threat-report/
10. https://www.trendmicro.com/vinfo/us/security/definition/spear-phishing
11. https://www.welivesecurity.com/2022/05/23/common-nft-scams-how-avoid-them/
12. https://www.verizon.com/business/resources/reports/2022/dbir/2022-data-breachinvestigations-report-dbir.pdf
13. https://blog.cloudflare.com/ddos-attack-trends-for-2021-q4/
14. https://cpl.thalesgroup.com/data-threat-report
15. https://blog.cloudflare.com/ddos-attack-trends-for-2022-q1/
16. https://www3.weforum.org/docs/WEF_The_Global_Risks_Report_2021.pdf
17. https://www.weforum.org/reports/global-risks-report-2022/
18. https://query.prod.cms.rt.microsoft.com/cms/api/am/binary/RWMFIi?id=101738
19. https://www.enisa.europa.eu/publications/threat-landscape-for-supply-chain-attacks
20. https://www.wired.com/story/solarwinds-hack-supply-chain-threats-improvements/
21. https://ec.europa.eu/commission/presscorner/detail/en/IP_22_2985
22. https://digital-strategy.ec.europa.eu/en/library/eus-cybersecurity-strategy-digital-decade-0
23. https://www.enisa.europa.eu/publications/nis-investments-2022
24. https://ec.europa.eu/commission/presscorner/detail/en/ip_20_2391
25. https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fit-digital-age/shaping-europedigital-future_en
26. https://ec.europa.eu/info/strategy/recovery-plan-europe_en
27. https://ec.europa.eu/info/strategy/priorities-2019-2024/promoting-our-european-way-life/european-security-union_en
28. https://www.europarl.europa.eu/news/en/press-room/20221107IPR49608/cybersecurityparliament-adopts-new-law-to-strengthen-eu-wide-resilience
29. https://www.whitehouse.gov/briefing-room/statements-releases/2022/01/19/fact-sheetpresident-biden-signs-national-security-memorandum-to-improve-the-cybersecurity-ofnational-security-department-of-defense-and-intelligence-community-systems/
30. https://www.charteroftrust.com/news/siemens-and-partners-sign-joint-charter-oncybersecurity/
31. https://www.charteroftrust.com/
32. ENISA Recommendations on shaping technology according to GDPR provisions, https://www.enisa.europa.eu/publications/recommendations-on-shaping-technology-according-togdpr-provisions
33. https://www.ipc.on.ca/wp-content/uploads/resources/7foundationalprinciples.pdf
34. EDPB Guidelines 4/2019 on Article 25 Data Protection by Design and by Default, https://edpb.europa.eu/our-work-tools/our-documents/guidelines/guidelines-42019-article-25-dataprotection-design-and_en
35. https://www.siemens.com/global/en/company/stories/research-technologies/cybersecurity/what-makes-the-charter-of-trust-such-successful-model.html
Chapter 6
Operationalizing the European Essential Guarantees in Cross-Border Personal Data Transfers: The Case Studies of China and India
Jan Czarnocki, Eyup Kun, and Flavia Giglio
6.1 Introduction1
Without a doubt, Schrems II’s impact on privacy and data protection is significant.2 The decision addressed the question of whether the General Data Protection Regulation (GDPR) applies to the transfer of personal data by EU-based entities to entities in a third country when public authorities of that third country might process such data for national security purposes. The Court of Justice of the European Union (CJEU) confirmed the applicability of the GDPR, since the latter explicitly requires an assessment of the legislation concerning national security.3 Furthermore,
This publication was prepared in the context of the TACOS (Trustworthy sAfety & seCurity cOntrol for cyber-physical Systems) project, which has received funding from VLAIO ICON cybersecurity (CS-ICON). We wish to thank Plixavra Vogiatzoglou for her invaluable review.
1 The present research builds upon the knowledge on the data protection frameworks of China and India gathered during the study for the report on Government Access to data in third countries, finalized in November 2021 [https://edpb.europa.eu/system/files/2022-01/legalstudy_on_government_access_0.pdf]. The article solely expresses the authors’ personal views and opinions.
2 Case C-311/18 Data Protection Commissioner v. Facebook Ireland Limited, Maximilian Schrems [2020] (Schrems II).
3 Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data and repealing Directive 95/46/EC (General Data Protection Regulation) [2016] OJ L119/1 (GDPR), art. 45(2).
J. Czarnocki · E. Kun · F. Giglio, Imec – KU Leuven – CiTiP (supervisor: Prof. Dr. Peggy Valcke), Leuven, Belgium
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Schiffner et al. (eds.), Privacy Symposium 2023, https://doi.org/10.1007/978-3-031-44939-0_6
Article 4(2) of the Treaty on European Union (TEU), which sets the exclusive competence of Member States in national security matters, only applies to the relations between the EU and its Member States.4 The case revived the debate about the balance between data protection and national security.5 More importantly, the decision prohibited data transfers to the United States (US) based on the Privacy Shield agreement between the EU and the United States.6 Similar to the Schrems I judgment,7 in Schrems II the CJEU ruled that in the United States, there is no independent oversight and effective judicial remedy for EU citizens against discretionary and bulk processing of personal data by the intelligence agencies. Relying on past jurisprudence, Schrems II reaffirms a benchmark and precise conditions for deciding whether third countries provide a level of data protection essentially equivalent to that of the EU, which is a precondition for lawful data transfer.8 Given the high standards set by the EU for privacy and data protection, the ability of most third countries to comply with them is in doubt. In the aftermath of Schrems II, data exporters must assess the risks to the fundamental rights to privacy and data protection posed in the third country where the data importer is located before transferring data outside of the EU. In the absence of an adequacy decision, Schrems II imposes an obligation on data exporters, whether data controllers or processors, to carry out a case-by-case transfer risk assessment when relying on alternative data transfer mechanisms under article 46 of the GDPR, in particular the Standard Contractual Clauses (SCCs).9 The ruling states that the SCCs set contractual obligations that are not binding for public authorities in third countries. Thus, they do not necessarily provide sufficient protection in the context of potential government access to the transferred data. On the one hand, the Court upheld the validity of the SCC Decision10 adopted by the Commission.
4 Schrems II, paras 80–89.
5 Joshua P. Meltzer, “Case Note: After Schrems II: the Need for a US-EU Agreement Balancing Privacy and National Security Goals” [2021] 2 Global Privacy Law Review 83 (note).
6 On 10 July 2023, a new adequacy decision was adopted by the EU Commission to allow transfers of personal data from the EU to the US, under the EU-US Data Privacy Framework. https://commission.europa.eu/law/law-topic/data-protection/international-dimension-dataprotection/eu-us-data-transfers_en
7 Case C-362/14 Maximilian Schrems v. Data Protection Commissioner [2015] (Schrems I).
8 Maria Tzanou, “Schrems I and Schrems II: Assessing the Case for the Extraterritoriality of EU Fundamental Rights” in Federico Fabbrini, Edoardo Celeste, John Quinn (eds), Data Protection Beyond Borders: Transatlantic Perspectives on Extraterritoriality and Sovereignty (Oxford: Hart Publishing 2020).
9 Virgilio Emanule Lobato Cervantes, “The Schrems II Judgment of the Court of Justice Invalidates the EU – U.S. Privacy Shield and Requires ‘Case by Case’ Assessment on the Application of Standard Contractual Clauses (SCCs)” [2020], 6 European Data Protection Law Review 602 (note).
10 Commission Decision of 5 February 2010 on Standard Contractual Clauses for the transfer of personal data to processors established in third countries under Directive 95/46/EC of the European Parliament and of the Council. The Decision was recently repealed by the Commission Implementing Decision (EU) 2021/914 of 4 June 2021 on Standard Contractual Clauses for the transfer of personal data to third countries pursuant to Regulation (EU) 2016/679 of the European Parliament and of the Council.
On the other hand, it underlined that, depending on the legislation and practices in force in the third countries, the guarantees of the SCCs might need to be supplemented by additional measures.11 However, the Court also ruled that data protection authorities are empowered to suspend or prohibit a transfer of data to a third country based on the use of SCCs when, in the light of all the circumstances of the transfer, the clauses cannot be complied with in the third country and the level of protection required by EU law cannot be met. The European Data Protection Board (EDPB) adopted recommendations on supplementary measures in 2020 to help exporters assess third countries’ data protection frameworks and identify appropriate supplementary measures where needed.12 In line with the so-called risk-based approach, data exporters should evaluate the laws and practices of third countries to assess the effectiveness of safeguards against any negative impact of international data transfers on privacy and data protection. In this regard, the legislation on government access to data for national security purposes should be the subject of particular focus. However, given the limited resources of many companies, conducting such an assessment is difficult.13 When gaps in the legal protection are found, exporters may take supplementary measures to strengthen the guarantees of the SCCs and limit the interference with the rights to privacy and data protection.14 These supplementary measures might consist of contractual, organizational, and technical measures. As contractual obligations do not bind third countries’ authorities in the case of a conflict with domestic laws allowing access to data for national security purposes, contractual and organizational measures should in principle be complemented by technical measures to effectively overcome risks arising from government access to data.
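The assessment logic described above can be summarised as a small decision procedure. The following Python sketch is only an illustration under stated assumptions: the names `Measure`, `Outcome`, `TransferContext`, and `assess_transfer` are hypothetical and do not come from the Schrems II judgment or the EDPB recommendations, and the sketch assumes that the exporter relies on SCCs whenever no adequacy decision exists. Whether a given technical measure is effective in practice is a separate, case-by-case question that the sketch does not model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Measure(Enum):
    # The three kinds of supplementary measures named in the text.
    CONTRACTUAL = "contractual"
    ORGANIZATIONAL = "organizational"
    TECHNICAL = "technical"


class Outcome(Enum):
    # Hypothetical labels for the possible results of the assessment.
    ADEQUACY = "rely on the adequacy decision"
    SCCS_SUFFICE = "rely on the SCCs as they stand"
    ADD_TECHNICAL = "complement existing measures with technical measures"
    SCCS_SUPPLEMENTED = "rely on the SCCs plus supplementary measures"


@dataclass
class TransferContext:
    adequacy_decision: bool  # an adequacy decision covers the third country
    gaps_found: bool         # gaps found when assessing its laws and practices
    measures: set = field(default_factory=set)  # supplementary measures adopted


def assess_transfer(ctx: TransferContext) -> Outcome:
    """Illustrative decision flow; assumes SCCs are the transfer tool."""
    if ctx.adequacy_decision:
        return Outcome.ADEQUACY
    if not ctx.gaps_found:
        return Outcome.SCCS_SUFFICE
    # Contractual and organizational measures do not bind third-country
    # authorities, so on their own they are treated as insufficient here.
    if Measure.TECHNICAL not in ctx.measures:
        return Outcome.ADD_TECHNICAL
    return Outcome.SCCS_SUPPLEMENTED


if __name__ == "__main__":
    ctx = TransferContext(
        adequacy_decision=False,
        gaps_found=True,
        measures={Measure.CONTRACTUAL, Measure.ORGANIZATIONAL},
    )
    print(assess_transfer(ctx).value)
```

The sketch returns an enumeration value rather than a yes/no answer because the text describes several distinct situations (adequacy, SCCs alone, SCCs with supplementary measures) rather than a single permission check.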
Some decisions of EU data protection authorities provide examples of the assessment of the appropriateness of supplementary measures after Schrems II. The Austrian Data Protection Authority (DPA) recently handled a case against Google and its processing of personal data in the context of Google Analytics. Google, as one of the data importers impacted by Schrems II, relied on the SCCs as a transfer mechanism and adopted organizational and technical supplementary measures to ensure an adequate level of data protection. With regard to organizational measures, Google committed to notifying data subjects about access requests issued by public authorities (where permissible). It also published a transparency report and adopted a policy for dealing with government requests. The Austrian DPA found these organizational measures ineffective, as they do not prevent the possibility of access by US intelligence services.
11 Schrems II, paras 122–149 (n 1).
12 European Data Protection Board (EDPB), Recommendations 01/2020 on measures that supplement transfer tools to ensure compliance with the EU level of protection of personal data. Version 2.0 [2021].
13 Paul Breitbart, “A Risk-Based Approach to International Data Transfers” [2021] 7 European Data Protection Law Review 539.
14 GDPR, Art. 46.
Google also implemented technical measures, in particular encryption of data at rest in data centres. However, the Austrian DPA found this measure ineffective as well, since Google can access the data in plain text, and US intelligence services might request the cryptographic keys in order to gain access to the data.15 The French DPA reached a similar conclusion in its decision concerning the same importer. The organizational and contractual measures taken by Google are not sufficient to overcome the risks connected to access to personal data by public authorities. As regards the technical measures, and in particular encryption technologies, Google has an obligation, under US law, to turn over imported personal data and any cryptographic keys necessary to access them. As regards the option of using an IP anonymization function, if the anonymization takes place only after the transfer to the US, this measure does not diminish the risk of access by US intelligence services to the transferred data.16 Even the implementation of technical measures to supplement the SCCs is therefore not a definitive solution to the issue of international transfers. To understand whether supplementary measures are necessary in the context of transfers of personal data, a preliminary evaluation of third countries’ legislation and practices is needed. This paper aims to clarify the level of data protection afforded in the context of government access for national security purposes17 in the People’s Republic of China (China/PRC) and the Republic of India (India) and to pinpoint the key issues for future assessments of other legal systems. The legal frameworks of the PRC and India were analysed because of the two countries’ rising geopolitical and economic importance. The evolution of their data governance policies finds its raison d’être in national economic developments and the necessity to balance the national security rationale with business interests in data sharing.
In fact, in both countries, the increase in access to data is perceived as one of the foundations for
15 DSB decision of 22 December 2021 on the complaint presented by noyb – European Center for Digital Rights v. Google LLC, https://noyb.eu/sites/default/files/2022-01/E-DSB%20-%20Google%20Analytics_EN_bk.pdf/, accessed 19 May 2022.
16 CNIL Decision of 10 February 2022 on the complaint presented by noyb v. Google LLC, https://www.cnil.fr/sites/default/files/atoms/files/decision_ordering_to_comply_anonymised_-_google_analytics.pdf, accessed 19 May 2022.
17 In the judgment on the Joined Cases C-511/18, C-512/18, and C-520/18, La Quadrature du Net v. France, the Court of Justice of the European Union first defined the concept of national security: “That responsibility corresponds to the primary interest in protecting the essential functions of the State and the fundamental interests of society and encompasses the prevention and punishment of activities capable of seriously destabilising the fundamental constitutional, political, economic or social structures of a country and, in particular, of directly threatening society, the population or the State itself, such as terrorist activities” (para 135). For the purposes of the present paper, the provisions in Chinese and Indian legislation whose purposes may be encompassed under such definition will be analysed.
domestic business growth.18 Therefore, the possibility of transferring personal data to China and India is of practical importance for economic relations between them and the EU. At the same time, China’s and India’s legal systems have different histories and axiological roots. Understanding the peculiarities of both systems is important to assess their approach to data protection and the regime on government access to personal data. Hence, an analysis of their legal systems provides an insight into different privacy and data protection approaches which, unlike the EU one, are not focused on fundamental rights protection.19 Finally, given the preeminence of government access and national security in the international data protection debate after Schrems II, the paper provides a basis for future comparative data protection research on the EU, China, and India. Any such comparison omitting an analysis of rules governing government access to data and national security would lack essential context. The paper first provides a brief description of the European Essential Guarantees to be afforded in third countries when implementing surveillance measures, as outlined by the relevant European Data Protection Board (EDPB) recommendations.20 Second, the paper describes the regime on government access to data for national security purposes and the data protection frameworks concerning government access in China and India and evaluates them against the EU standards.
6.2 The European Essential Guarantees for Surveillance Measures
The EDPB European Essential Guarantees for surveillance measures were adopted after the Schrems II decision. Such guarantees result from the GDPR,21 the CJEU and the European Court of Human Rights (ECtHR) jurisprudence.22 Compliance with them forms part of assessing the level of protection of privacy and personal data in the third country in the context of access by government agencies responsible for national security. The European Essential Guarantees include both
18 Amba Kak, Samm Sacks, ‘Shifting Narratives and Emergent Trends in Data-Governance Policy. Developments in China, India and EU’ (2021) Paul Tsai China Center policy report 9/2021, https://law.yale.edu/sites/default/files/area/center/china/document/shifting_narratives.pdf, accessed 9 March 2022.
19 For more about the EU data protection rationale see Orla Lynskey, “The Foundations of EU Data Protection Law” Oxford Studies in European Law, (Oxford: Oxford University Press, 2015).
20 European Data Protection Board (EDPB), Recommendations 02/2020 on the European Essential Guarantees for surveillance measures [2020].
21 GDPR, art. 45.
22 According to art. 52(3) of the Charter of Fundamental Rights of the European Union, in so far as the Charter contains rights which correspond to rights guaranteed by the Convention for the Protection of Human Rights and Fundamental Freedoms, the meaning and scope of those rights shall be the same as those laid down by the Convention.
J. Czarnocki et al.
substantive limits and procedural safeguards for the indirect or direct access to personal data by third countries’ security agencies. According to Article 52 of the EU Charter and the consistent jurisprudence of the CJEU, any interference with the right to privacy and data protection must be provided for by law. Clear and precise rules should delineate the scope of interference and indicate “in what circumstances and under which conditions” the measures allowing government access to personal data may be applied so that they are foreseeable to individuals.23 The concept of “law” plays an important role in assessing whether a restriction on a fundamental right rests on a legal basis. The jurisprudence of the ECtHR clarified that personal data access measures provided for by law are foreseeable only where they are accessible to individuals, are formulated with sufficient precision, and do not leave excessive discretion to the public authorities applying them.24 Therefore, excessive vagueness of the legal rules leading to arbitrariness in their application impairs the fulfilment of this requirement.25 According to the CJEU jurisprudence, limitations to data protection should be proportionate to the objectives of general interest they pursue and should apply only in so far as strictly necessary to achieve these objectives.26 In its recommendations, the EDPB specifies that the principle of proportionality requires a balance between the level of interference with fundamental rights and, in the case of national security measures, the seriousness of the threat faced by the country.
As regards the principle of necessity, governments should not access personal data on a general basis but only when necessary for national security in a given case.27 Concerning the procedural safeguards, the EDPB recommendations recall both the ECtHR jurisprudence and the Schrems II judgment in pointing out the importance of “an effective, independent and impartial oversight system that must be provided for either by a judge or by another independent body.” In particular, important factors are independence from the executive and affording the supervisory authority sufficient powers to exercise control. Such control should take the form of prior authorization or, in urgent cases, a subsequent review.28 Finally, Article 47 of the EU Charter of Fundamental Rights and its interpretation in the jurisprudence confirm that third countries should provide individuals with remedies against unlawful access to personal data by state agencies.29 As stated in the Schrems I judgment, this requirement is necessary to “respect the essence of
23 Schrems II, para 176 (n 1).
24 ECtHR, plenary judgment in the case of The Sunday Times v. the United Kingdom, 26 April 1979, para. 49.
25 Ian Brown, Douwe Korff, “Exchanges of Personal Data After the Schrems II Judgment” (2021), Policy Department for Citizens’ Rights and Constitutional Affairs Directorate-General for Internal Policies PE 694.678 – 07/2021, www.europarl.europa.eu/RegData/etudes/STUD/2021/694678/IPOL_STU(2021)694678_EN.pdf, accessed 9 March 2022.
26 Schrems II, paras 174–176.
27 EDPB Recommendations 02/2020 (n 20).
28 ibid.
29 Brown, Korff (n 25).
the fundamental right to effective judicial protection, as enshrined in Article 47 of the Charter.”30 Accordingly, Schrems II reiterated the importance of guaranteeing individuals the possibility to bring legal action before an independent and impartial court to exercise their privacy rights.31 Furthermore, the EDPB recommendations recall the link, established in the CJEU jurisprudence, between the possibility to rely on an effective judicial remedy and the notification of a surveillance measure when the latter is over.32
6.3 Legal Analysis of Government Access to Personal Data in China and India
This section provides an overview and context for examining the legal systems of China and then India and the legal regime on government access to personal data in the context of national security. First, the paper provides relevant information regarding the constitutional law of China and India. Such background is required to clarify incorrect assumptions made when examining the legal system of China and give an account of the evolution of the right to privacy in Indian constitutional law. Afterward, the paper examines the applicable legal framework for personal data processing by the Chinese and Indian governments, emphasizing national security purposes and using the European Essential Guarantees as a benchmark.
6.3.1 Government Access to Personal Data in China
According to Article 1 of the Constitution of the PRC,33 the PRC is a socialist state under a democratic dictatorship led by the Communist Party of China (CCP).34 Following the Second World War, several constitutional amendments were enacted, resulting in the structural unification of the CCP and the state. This unification enabled the CCP to exert influence over the state’s normative system, leading to its dominance in this regard.35 As a result, the CCP was able to shape and control the state’s legal
30 Schrems I, para 187 (n 5).
31 Schrems II, para 194 (n 1).
32 Joined Cases C-511/18, C-512/18 and C-520/18, La Quadrature du Net v. France [2020], para 191.
33 Article 1 of the PRC Constitution 2017, http://english.www.gov.cn/archive/lawsregulations/201911/20/content_WS5ed8856ec6d0b3f0e9499913.html, viewed 21 July 2021.
34 Samantha Hoffman, ‘Engineering Global Consent: The Chinese Communist Party’s Data-Driven Power Expansion’, http://www.aspi.org.au/report/engineering-global-consent-chinesecommunist-partys-data-driven-power-expansion, accessed 9 March 2022.
35 Ling Li, ‘“Rule of Law” in a Party-State: A Conceptual Interpretive Framework of the Constitutional Reality of China’ (2015) 2 Asian Journal of Law and Society 93.
and political frameworks, ensuring that they were consistent with its ideology and objectives. This enabled the party to consolidate its power and establish itself as the dominant force in Chinese politics, with significant implications for the country’s governance at the time and in the years since. Under this system, the rule of law is interpreted as a “rule by law” that operates under the authority of the ruling party.36 All power in China, including judicial and law enforcement power, is concentrated in the National People’s Congress (NPC), which supervises other state organs. The NPC simultaneously acts as the legislative, executive, and judicial branch of the government, meaning that there is no separation of powers. The CCP dominates the government and supervises the NPC, which implements the party’s policies.37 For this reason, the rule of law in China has generally been described as the rule of law with Chinese characteristics.38 The absence of a clear separation of powers, the lack of the supremacy of law, legal uncertainty, and the absence of judicial independence in China’s legal system make it impossible to classify it as a liberal democratic or rule of law system according to Western standards.39 Moreover, the notion of fundamental rights is generally absent in the Chinese legal system, which confirms the assertion above.40 According to Article 40 of the Constitution, no organization or individual may, under any circumstances, infringe on the freedom and privacy of correspondence. An exception is made where necessary to meet the needs of state security or in cases where criminal investigation, public security, or procuratorial organs are permitted to censor correspondence following legal procedures. However, as Creemers notes, the Chinese Constitution does not explicitly mention the concept of privacy.
Instead, the Constitution refers only to the privacy of correspondence.41 The fundamental basis for Chinese privacy law assumes that community stability should prevail over the needs of individual persons.42 Hence, Article 40 of the Constitution is seen as the primary source of authorization for public bodies to access personal data processed by private actors rather than a basis for the fundamental right to privacy.43 Since legislation on government access
36 ibid.
37 Zhizheng Wang, ‘Systematic Government Access to Private-Sector Data in China’, Bulk Collection (Oxford University Press 2017), https://oxford.universitypressscholarship.com/10.1093/oso/9780190685515.001.0001/oso-9780190685515-chapter-11, accessed 9 March 2022.
38 Ignazio Castellucci, ‘Rule of Law with Chinese Characteristics’ (2010) 13 Annual Survey of International & Comparative Law, https://digitalcommons.law.ggu.edu/annlsurvey/vol13/iss1/4
39 Matthieu Burnay, Joëlle Hivonnet and Kolja Raube, “Bridging the EU-China’s Gap on the Rule of Law?” (2016) 14 Asia Europe Journal; Teemu Ruskola, “Law Without Law, or Is ‘Chinese Law’ an Oxymoron?” (2003) 11 William and Mary Bill of Rights Journal 655.
40 Rogier Creemers, “China’s Emerging Data Protection Framework” (November 16, 2021), pp. 1–3. Available at SSRN: https://ssrn.com/abstract=3964684 or https://doi.org/10.2139/ssrn.3964684
41 ibid, p. 2.
42 Tiffany C Li, Jill Bronfman and Zhou Zhou, “Saving Face: Unfolding the Screen of Chinese Privacy Law” (Social Science Research Network 2017) SSRN Scholarly Paper ID 2826087, https://papers.ssrn.com/abstract=2826087, accessed 9 March 2022.
43 Wang (n 37).
to personal data is dispersed and provides public security officials with broad discretion, the PRC’s mass surveillance programmes go unchallenged in China.44 It is generally argued that the Chinese government is not restricted when requesting companies to provide access to personal information. For example, court orders are not required to access data, which further demonstrates that government interests take precedence over citizens’ rights.45
6.3.1.1 Secondary Legislation Analysis in China: Government Access, Oversight, and Data Subject Rights
This section provides a legal analysis of the Chinese laws empowering the government to access data processed by private actors. In addition, the PRC enacted a general personal information protection law, which became effective as of 1 November 2021. Since this data protection law provides certain safeguards and data subject rights against the public actors, the law will be separately discussed in Sect. 6.3.1.3.
Cybersecurity Law: The Necessary Technical Support and Assistance
The Cybersecurity Law of the PRC applies to network operators, i.e. network owners, managers, and network service providers (Article 76 of the Cybersecurity Law). This broad definition covers the entire network system, consisting of computers or other information terminals and supporting equipment that adheres to specific rules and procedures for information gathering, storage, transmission, exchange, and processing (Article 76 of the Cybersecurity Law). According to Article 28 of the Cybersecurity Law, network operators must provide technical support and assistance to national security and public security organs in their efforts to protect the country’s security and investigate criminal activities within the legal framework. This provision requires network operators to work with authorities and provide them with technical assistance.46 This provision does not stipulate any limitations or restrictions to the scope of this technical assistance.47 To clarify, the support and assistance provided by the network operators may entail access
44 Anja Geller, “How Comprehensive Is Chinese Data Protection Law? A Systematisation of Chinese Data Protection Law from a European Perspective” (2020) 69 GRUR International 1191.
45 Emmanuel Pernot-Leplay, “China’s Approach on Data Privacy Law: A Third Way Between the U.S. and the EU?” (Social Science Research Network 2020) SSRN Scholarly Paper ID 3542820, https://papers.ssrn.com/abstract=3542820, accessed 9 March 2022.
46 Cybersecurity Law of the PRC 2017, viewed 28 July 2021, https://www.newamerica.org/cybersecurity-initiative/digichina/blog/translation-cybersecurity-law-peoples-republic-china/
47 Wang (n 37).
to both the communication content and metadata.48 Article 69 of the Cybersecurity Law stipulates the imposition of monetary fines on network operators and directly responsible management personnel for refusal to provide technical support and assistance to public and state security organs. As for data subject rights under the Cybersecurity Law,49 Article 43 grants people the right to request the deletion or correction of their personal information if its processing violates provisions of the law, legislation, or agreements between data subjects and network service providers. Since these responsibilities are imposed on network providers, the corresponding rights can only be invoked against them. There is no specific complaint mechanism to oversee the activities of the state organs.
National Security Law: The Necessary Support and Assistance for National Security
The National Security Law of the PRC, in Article 77, mandates that both individuals and organizations must extend the required support and assistance to the public security organs, state security organs, or other relevant agencies to ensure the protection of national security.50 Article 7 of the National Security Law states that national security must be safeguarded in accordance with the principles of the Chinese Constitution and citizens’ rights must be respected and protected in accordance with the law. However, how individuals can invoke their right to privacy or data protection when dealing with security organizations remains ambiguous. Such ambiguity leaves room for interpretation and raises concerns about potential violations of these rights and about how seriously the law takes citizens’ rights and freedoms, including the right to privacy. Although the National Security Law51 recognizes human rights, it does not specify how they are protected. According to Article 82 of this law, citizens and organizations have the right to initiate complaints regarding “national security efforts” if these activities are unlawful. Article 83 stipulates that extraordinary measures restricting the freedom and rights of citizens shall be bound by actual needs and have to be in accordance with the law. However, the real question is to what extent the processing of personal data might be considered unlawful if personal data is accessed or processed for the tasks provided to national security organs
48 “Costs and Unanswered Questions of China’s New Cybersecurity Regime”, https://iapp.org/news/a/costs-and-unanswered-questions-of-chinas-new-cybersecurity-regime/, accessed 13 March 2022.
49 Cybersecurity Law of the PRC 2017, viewed 28 July 2021, https://www.newamerica.org/cybersecurity-initiative/digichina/blog/translation-cybersecurity-law-peoples-republic-china/
50 National Security Law of the PRC 2015, viewed 28 July 2021, https://www.chinalawtranslate.com/en/2015nsl/#_Toc423592313
51 National Security Law of the PRC 2015, viewed 28 July 2021, https://www.chinalawtranslate.com/en/2015nsl/#_Toc423592313
by national security law. In other words, even though there is a right to initiate complaints regarding national security efforts, this right is designed to address abuse by personnel of the broad discretion granted to the state organs. Thus, this mechanism is not designed to ensure that data protection rights are respected. Where data access has occurred, such a complaint mechanism might not work.
National Intelligence Law: The Necessary Support for Organizations and Citizens
The National Intelligence Law imposes obligations on organizations and citizens to support and cooperate with Chinese national intelligence agencies.52 This law is described as the codification of an expectation that every citizen is responsible for state security.53 More precisely, Article 14 of the National Intelligence Law states that the national intelligence agencies may request companies or citizens to provide the necessary support. This rule also applies to Chinese entities and their subsidiaries in foreign countries.54 Because of the vague scope of the powers given to Chinese intelligence agencies, companies can be requested to give access to personal data and cannot refuse.55 However, there is no specific evidence on whether the Chinese intelligence agencies request data from companies. For example, recent research covering a security and privacy analysis of TikTok and Douyin, both developed by ByteDance, found that it remains unclear whether China requested personal data access for intelligence purposes from both companies.56 Article 19 of the National Intelligence Law states that civilian cooperation must comply with the law and not violate lawful rights and interests, including the prohibition on leaking personal information, but it is unclear how these rights will be protected against potential abuses. Article 27 requires national intelligence agencies to establish complaint channels, but the nature of these mechanisms is not defined, and complaints may be directed to the National Intelligence Agency. Furthermore, Article 31 of the National Intelligence Law establishes that national intelligence agencies violating citizens’ lawful rights or interests will be punished
52 National Intelligence Law of the P.R.C. 2017, viewed 28 July 2021, https://www.chinalawtranslate.com/national-intelligence-law-of-the-p-r-c-2017/?lang=en
53 Fergus Ryan, Audrey Fritz and Daria Impiombato, “Mapping China’s Tech Giants: Reining in China’s Technology Giants”, http://www.aspi.org.au/report/mapping-chinas-technology-giantsreining-chinas-technology-giants, accessed 9 March 2022.
54 “Applicability of Chinese National Intelligence Law to Chinese and Non-Chinese Entities” (Mannheimer Swartling, 23 January 2019), https://www.mannheimerswartling.se/en/publicationsand-newsletter/applicability-of-chinese-national-intelligence-law-to-chinese-and-non-chineseentities/, accessed 9 March 2022.
55 ibid.
56 Pellaeon Lin, ‘TikTok vs Douyin: A Security and Privacy Analysis’ (University of Toronto 2021) Citizen Lab Research Report No. 137, https://citizenlab.ca/2021/03/tiktok-vs-douyin-securityprivacy-analysis/, accessed 9 March 2022.
under the law. However, given that these mechanisms aim to prevent abuse of powers in the context of intelligence activities, they might not be considered safeguards for government data access.
Counter-Espionage Law: The Necessary Support for Counter-Espionage Activities
The Counter-espionage Law of the PRC57 is another act that foresees government data access. Article 3 designates state security organs as the competent authorities responsible for conducting counter-espionage activities, and Article 38 of the same law defines espionage as activities that endanger state security. However, ambiguity arises from the inclusion of “other espionage activities” in Article 38(3), which lacks clarity and specificity regarding the scope of these activities. Consequently, counter-espionage efforts are not explicitly defined. While carrying out counter-espionage operations, state security organs are permitted to use technical investigative measures, but strict formalities must be followed.58 In addition, Article 4(1) of this law stipulates that citizens have to protect national security and the state’s honour and interests and shall not jeopardize them. In light of this, all citizens, enterprises, and organizations are obliged to prevent espionage.59 Chapter III of the Counter-espionage Law further regulates the duties and rights of citizens and organizations. What is particularly striking is that relevant organizations must provide information to the security organs.60 In addition, the law refers to strict formalities and to approval for the use of technical investigative measures by state security organs. Despite the explicit reference to strict formalities in Article 12, the conditions remain unclear. Therefore, these measures most probably extend to personal data, including that of foreigners.61
6.3.1.2 Assessment of the Chinese Legal Framework Against the European Essential Guarantees
In summary, the legislation analysed in the previous section does not meet the substantive limitations foreseen by the European Essential Guarantees, for the following reasons. First, intelligence agencies have widespread powers to access all forms
57 The Counter-espionage Law of the PRC 2014, viewed 28 July 2021, https://www.chinalawtranslate.com/en/anti-espionage/
58 Art. 12 of the Counter-espionage Law.
59 Art. 4(2) of the Counter-espionage Law.
60 Art. 22 of the Counter-espionage Law.
61 “Huawei and the Ambiguity of China’s Intelligence and Counter-Espionage Laws” (The Strategist, 12 September 2018), https://www.aspistrategist.org.au/huawei-and-the-ambiguity-ofchinas-intelligence-and-counter-espionage-laws/, accessed 9 March 2022.
of information appropriate for intelligence operations. The scope of terms such as national security and state security is broad. In addition, investigations and operations do not need to be specific or narrowly defined. Second, even though information access may involve personal data transferred to China for commercial purposes, these laws do not differentiate between personal data and non-personal data for government access. Thus, there are not enough substantive limitations to the government’s access to personal data. Third, the secondary legislation discussed in the previous section imposes on organizations an obligation to provide the necessary support to government authorities, without any specific safeguards or conditions applicable to this support, which might include personal data access. Regarding the procedural safeguards provided by the European Essential Guarantees, although the PRC’s legislation governing data access includes references to “strict approval”, the strict approval itself is not defined. For instance, Article 26 of the National Intelligence Law states that national intelligence organizations must supervise and oversee personnel compliance with laws and discipline. According to this article and the basic structure of the law, the supervision system is developed internally. Similarly, Article 12 of the Counter-espionage Law provides for “strict formalities” and an approval system for employing technological means in the oversight of counter-espionage activities. However, the conditions and methods that apply are unclear. In other words, there is no independent supervisory mechanism in place to review data processing activities and to which data subjects can complain if they believe their data protection rights have been infringed under such laws. Thus, it is unlikely that these requirements meet the European Essential Guarantees.
Regarding other procedural safeguards required by EU standards, such as redress mechanisms and data subject rights, the Chinese legal system does not provide data subjects with effective remedies in cases of violation of their rights due to access to personal data by law enforcement or intelligence agencies.
6.3.1.3 Chinese Personal Information Protection Law: New Regime for the Legality, Oversight Mechanism, Redress, and Data Subject Rights for the Government Data Access?
The objective of Chinese privacy and data protection laws is not necessarily similar to that of privacy and data protection in the EU. While in the EU data protection rights are granted against both state and private actors, Chinese privacy and data protection laws protect individuals only against private actors due to the broad exceptions for public actors. Furthermore, these laws serve the broader aims of the CCP policy of making China a cyber power62 without any restrictions on government data access. For these reasons, the personal data protection legislation that has emerged in China is not
62 Rogier Creemers “China’s Emerging Data Protection Framework” (November 16, 2021). p. 1–5. Available at SSRN: https://ssrn.com/abstract=3964684 or https://doi.org/10.2139/ssrn.3964684
a game-changer for essential guarantee schemes against government data access. This section provides a brief analysis of the data protection law to investigate to what extent the new legislation provides essential safeguards for government data access. The PRC’s long-awaited Personal Information Protection Law (PIPL), the country’s first comprehensive personal data protection legislation, has been in effect since 1 November 2021. Article 1 of the PIPL provides that the law aims to protect personal information and safeguard the free flow of personal information, stimulating reasonable use of data.63 This law applies to both public and private organizations, since Article 72 provides no derogation for particular organizations, including national security organizations. Additionally, Article 33 stipulates that the PIPL applies to the activities of state organs regarding the handling of personal information. However, the specific provisions in Section III of the PIPL apply to them. Chapter I of the PIPL covers the general principles, namely, lawfulness and necessity (Article 5), purpose limitation and data minimization (Article 6), openness and transparency (Article 7), accuracy (Article 8), and the security principle (Article 9). Those general provisions are similar to the principles provided under Article 5 of the GDPR. However, the PIPL refers to laws and administrative rules that provide exceptions to these rules. For this reason, how these principles restrict government data access is questionable considering the broad obligations of organizations to provide data access to the security authorities in the PRC. As for the provisions specifically applicable to state organs in Section III, Article 34 states that the government may handle personal information according to the powers and procedures provided in laws or administrative regulations. Handling personal data may not exceed the scope necessary to carry out their responsibilities.
Even though the law applies to State organs, it has vague and undefined exceptions under Article 35, which provides exceptions for the transparency obligations where a provision in law or administrative regulation allows for such exception. This provision applies to all State organs regardless of their function. Because of the broad powers given to the State organs in the National Intelligence Law and the Counter-espionage Law, those provisions are less likely to limit the discretion of State authorities, as they can provide exceptions within administrative regulations. Chapter IV of the PIPL covers individuals’ rights in personal information handling activities, which can be invoked against State organs. The PIPL provides individual rights, similar to data subject rights, such as “the right to know” and “the right to decide relating personal information and to limit or refuse the handling of personal information” (Article 44). Other data subject rights include the right to access (Article 45), the right to correction (Article 46), and the right to deletion of incorrect or illegally obtained information, as well as various protections and remedies for infringements by personal information handlers (Article 47). While
63 “Translation: Personal Information Protection Law of the People’s Republic of China - Effective Nov. 1, 2021” (DigiChina), https://digichina.stanford.edu/work/translation-personal-informationprotection-law-of-the-peoples-republic-of-china-effective-nov-1-2021/, accessed 9 March 2022.
those rights can be invoked against State organs, they can be restricted by law and administrative regulations (Article 44). Furthermore, there are some vague exceptions for the personal information handling activities of State organs under Articles 44–45. Article 44 states that the rights it provides exist unless laws or administrative regulations stipulate otherwise, which means that those rights can be restricted by other laws or regulations, providing leeway for the government authorities in fulfilling those obligations. Similar to Article 44, Article 45 on the right of access refers to Article 18(1) and Article 35 of the PIPL. Article 18 provides exceptions for the transparency obligations of personal information handlers and refers to laws or administrative regulations that provide for confidentiality. Given the broad powers given to State organs in the laws outlined above, it is less likely that those rights will be invoked against the State organs responsible for public security and national security. The complaint procedures are laid out in Article 68 of the PIPL, while Article 65 grants every individual the right to file a complaint in the event of improper handling of personal information. Article 68 of the PIPL further clarifies that individuals can seek compensation from a PRC court if personal information handlers violate their personal information rights and interests, and such infringement causes any harm. Notably, this provision applies to both private and government organizations. The PIPL’s Chapter VI outlines the mechanisms in place to oversee personal information handling activities, assigning responsibilities to various state departments. The State Cybersecurity and Informatization department is in charge of comprehensive planning and management, while other State Council departments are in charge of protecting and overseeing personal information.
While these departments are organized within the government and have investigative and enforcement tools, there are no established standards for their independence as supervisory mechanisms. If state officials engage in improper personal information handling practices, their superior agencies charged with personal information security must order correction under PIPL Article 68, indicating that the PIPL only establishes an internal oversight mechanism. While these developments represent some progress for data subjects against Chinese state authorities, they fall short of the European Essential Guarantees standards.
6.3.2 Government Access to Personal Data in India The Republic of India is a party to various international agreements that provide guarantees for fundamental rights, including the right to privacy, such as the Universal Declaration of Human Rights (UDHR)64 and the International Covenant
64 Universal Declaration of Human Rights (adopted 10 December 1948) UNGA Res 217 A(III) (UDHR).
J. Czarnocki et al.
on Civil and Political Rights.65 However, although the Constitution of India also recognizes several fundamental rights,66 respect for them has consistently been called into question by reports of human rights organizations over the years.67 The right to privacy is not exempt from this criticism.68 Although the right to privacy is not expressly recognized by the Constitution, the Supreme Court of India acknowledged it as a fundamental right in the groundbreaking Puttaswamy v. Union of India decision in 2017.69 The judgment indicates the prominent role of the Indian judiciary in defining the right to privacy.70 The case originated from the question of the alleged unconstitutionality of the Aadhaar Act.71 The Act governs a national identification card system based on citizens’ biometrics providing access to various public benefits.72 Since its introduction in India, an increasing number of public and private providers have made the use of the Aadhaar number mandatory in order to use their services. The Aadhaar scheme raised many concerns regarding the right to privacy,73 which culminated in the above-mentioned case. In the first of two connected judgments, the Court addressed, as a preliminary matter, the question of the existence of the right to privacy in Indian constitutional law. It argued that the right is implied in Article 21 of the Constitution,74 but it is only enforceable against the government. In order to give
65 International Covenant on Civil and Political Rights (adopted 16 December 1966, entered into force 23 March 1976) 999 UNTS 171 (ICCPR).
66 Constitution of India.
67 Amnesty International, “Report 2020/2021. The state of the world’s human rights” (2021), https://www.amnesty.org/en/wp-content/uploads/2021/06/POL1032022021ENGLISH.pdf, accessed 9 March 2022.
68 Centre for Internet and Society India, Privacy International “The Right to Privacy in India” (2016) Stakeholder Report Universal Periodic Review 2016, 27, www.upr-info.org/sites/default/files/document/india/session_27_-_may_2017/js35_upr27_ind_e_main.pdf, accessed 9 March 2022.
69 Justice K S Puttaswamy and others v. Union of India and others, Writ Petition No. 494 of 2012 (Supreme Court of India, 24/08/2017).
70 Astha Rao, Shipra Sahu, “Right to Privacy and Data Protection in India” (2021), 23 Supremo Amicus Journal, https://supremoamicus.org/wp-content/uploads/2021/02/Shipra-Sahu.pdf, accessed 9 March 2022.
71 Aadhaar (Targeted delivery of financial and other subsidies, benefits and services) Act, 2016.
72 While the Aadhaar Act provides some grounds for government access to personal data, including, amongst others, for national security reasons, the scope of the law is scarcely relevant in the context of international transfer of personal data. In fact, the collection of personal data under the Aadhaar scheme is primarily carried out from citizens of India. Therefore, the following assessment will not dwell on its provisions.
73 Sonal Chhugani, “India’s Aadhaar card violation of Indian citizens right to privacy” (2021) 4 Cardozo International and Comparative Law Review 733.
74 Constitution of India, art. 21.
6 Operationalizing the European Essential Guarantees in Cross-Border. . .
individuals the opportunity to exercise the right against private entities, the Court encouraged the prompt adoption of a comprehensive data protection framework.75 In the follow-up judgment on the constitutionality of the Aadhaar Act, the Supreme Court stated that the right to privacy may only be limited where such limitation is provided by law, pursues legitimate aims of the state, and only in so far as it is necessary and proportionate in a democratic society.76 The legal reasoning of the Court resembles the content of Article 52 of the EU Charter of Fundamental Rights77 and the European Essential Guarantees, as outlined in paragraph 6.2. In 2019, as a result of the heated debate that followed the Puttaswamy judgment, the government proposed the draft Personal Data Protection Bill, an attempt to regulate the processing of personal data by Indian entities, including the bodies of the state.78 However, the proposal was withdrawn in 2022.79 In the same year, a new proposal for a Personal Data Protection Bill was drafted by the Indian government.80 Like the previous proposal, the new Bill attracted the attention of fundamental rights organizations, which expressed concerns about the potential of the law to undermine the protection of privacy and personal data in India, as it enshrines broad grounds for the government to be exempted from the data protection principles laid down in it.81,82
75 Puttaswamy (n 31), para 184.
76 Justice K S Puttaswamy and others v. Union of India and others, Writ Petition No. 494 of 2012 (Supreme Court, 26/09/2018).
77 Charter of Fundamental Rights of the European Union [2000] OJ C364/3, art. 52.
78 Deva Prasad M, Suchita Menon C, “The Personal Data Protection Bill, 2018: India’s regulatory journey toward a comprehensive data protection law” (2020) 28 International Journal of Law and Information Technology 1.
79 Raghavan M., “Are we there yet? The long road to nowhere: the demise of India’s draft Data Protection Bill”, The Future of Privacy Forum blog, 2022, https://fpf.org/blog/are-we-thereyet-the-long-road-to-nowhere-the-demise-of-indias-draft-data-protection-bill/ [last accessed in March 2023].
80 Proposal for a Personal Data Protection Bill 2022.
81 “India: Data Protection Bill Fosters State Surveillance”, Human Rights Watch, 2022, https://www.hrw.org/news/2022/12/23/india-data-protection-bill-fosters-state-surveillance [last accessed in March 2023].
82 At the time of publication of this paper, the Digital Personal Data Protection Bill has been recently adopted by the Indian Parliament (August 2023). Since the research for this paper was already finalized before the adoption of the law, the paper does not address the changes brought by the new piece of legislation on the Indian data protection framework and on the subject matter of this analysis.
6.3.2.1 Secondary Legislation Analysis in India: Government Access, Oversight, and Data Subject Rights
Telegraph Act 1885 There is no general law in India regulating government access to personal data for national security purposes. However, the Telegraph Act 188583 and the Information Technology Act 2000 (IT Act)84 represent the most relevant laws in this regard. Section 5(2) of the Telegraph Act allows officers authorized by the government to intercept or disclose any message from any person transmitted by telegraphs on the grounds of protection of the sovereignty of India, security of the state, friendly relations with foreign states, public order, or prevention of the incitement to the commission of crimes. While the scope of the provision is very broad, its application based on the mentioned grounds is subordinated to the condition that reasons of public emergency or public safety require it. The question around the constitutionality of Section 5(2) was brought before the Supreme Court of India in a landmark 1996 judgment.85 While the Court did not strike down the provision due to the alleged arbitrariness of government interferences in the right to privacy, it identified some safeguards to be respected when applying the interception measures.86 The safeguards were later codified in the Indian Telegraph Rules 1951.87 The rules, adopted pursuant to the Telegraph Act, lay down the procedural law regarding telephone tapping, and provide that the interception orders should only be issued by the highest functionaries in the central or state governments. Such orders should not be in place for more than 180 days, and a Review Committee is in charge of overseeing their compliance with the law.88 The same Review Committee is also responsible for the oversight of surveillance activities pursuant to the IT Act, and its characteristics will be described below.
Information Technology Act 2000, Rules 2009, and Rules 2021 The IT Act considerably broadened the range of cases where the state may resort to interception measures, becoming the legislative cornerstone of surveillance law in India. Adopted in 2000, it contains very broad provisions about collecting any information from any computer source located in the territory of India. Furthermore,
83 Indian Telegraph Act, 1885.
84 Information Technology Act, 2000 (IT Act).
85 People’s Union for Civil Liberties v. Union of India and others (Supreme Court of India, 18/12/1996).
86 Chaitanya Ramachandran, “PUCL v. Union of India Revisited: Why India’s Surveillance Law Must Be Redesigned for the Digital Age” (2014) 7 NUJS Law Review 105.
87 Indian Telegraph Rules, 1951, Section 419A.
88 ibid.
it does not transpose the requirement that reasons of public emergency or public safety justify the interception activities.89 Section 69 of the IT Act regulates government access to any computer source and collection of every piece of information stored in it, based on national security grounds. In particular, sovereignty or integrity of India, defence of India, security of the state, friendly relations with foreign countries or public order, and the prevention of incitement to the commission of any cognizable offence related to these grounds are listed amongst the grounds justifying the access. According to the provision, the government may issue directions to any governmental agency to intercept, monitor, and decrypt the said information. Section 69B of the IT Act also allows the government to authorize any agency to monitor and collect any traffic data and information in any computer source to enhance national cybersecurity. In order to access the information, the government may rely on assistance from private entities. The latter are referred to as “intermediaries” in the IT Act and include a broad range of digital services providers, such as telecom service providers, network service providers, Internet service providers, search engines, and online payment sites.90 Under the IT Act, governmental agencies may issue requests for access to personal information to the intermediaries, which are held criminally liable when they fail to provide the government with such access. In this case, both imprisonment and a fine may be imposed as a punishment for the lack of assistance.91 This provision indicates the tendency, in Indian legislation, to introduce forms of binding collaboration with the private sector for national security purposes.92 The IT Act also imposes criminal liability on any person found to breach any provision enshrined in it, irrespective of his or her nationality and regardless of where the conduct takes place. 
The person in question is held liable as long as the conduct in violation of the Act involves using a computer, computer system, or computer network located in the territory of India.93 The IT Act delegates to the government a broad power to adopt regulations to specify its provisions. In particular, section 69 of the IT Act gives the Indian government the power to clarify the procedures and safeguards to be respected when carrying out the monitoring activities based on that provision. On this legal basis, the executive adopted the Information Technology (Procedure and Safeguards for Interception, Monitoring and Decryption of Information) Rules, also known as Rules 2009.
89 P. Arun, “Penetrative or Embracive? Exploring State, Surveillance and Democracy in India”, in Anthony P. D’Costa, Akin Chakraborty (eds), Changing Contexts and Shifting Roles of the Indian State. Dynamics of Asian Development (Springer Singapore 2019).
90 IT Act (n 78), section 2(w).
91 ibid, sections 69(3)-69B(4).
92 Vrinda Bhandari, Renuka Sane, “Protecting Citizens From the State Post Puttaswamy: Analysing the Privacy Implications of the Justice Srikrishna Committee Report and the Data Protection Bill, 2018” (2018) 14 Socio Legal Review 143.
93 IT Act (n 78), section 75.
Based on the Rules 2009, the Secretary in the Ministry of Home Affairs (as regards the Central Government) and the Secretary in charge of the Home Department (as regards the local government) are allowed to issue an interception order.94 Where unavoidable circumstances or emergencies occur, the Rules provide an exception, and the order may be issued by senior officers of security or law enforcement agencies, with a subsequent ex post authorization by the competent authorities.95 The Rules 2009 identify the measures under Section 69 as a last resort, to be used where no alternative means are available to acquire information.96 Moreover, the direction should not be in place for more than 60 days from its issuance, but the Government can renew it up to a total period of 180 days.97 As regards the retention of the records pertaining to the monitoring activities, the Rules mandate their destruction every 6 months by the governmental agency having the right to access the information, while the private intermediaries should destroy them within 2 months.98 Finally, Rules 2009 set a general prohibition on disclosure of the intercepted information; nevertheless, the governmental agencies are allowed to share it with other security agencies for the investigation of crimes or for judicial proceedings.99 As a consequence of the regulatory power delegated to the executive, other sets of provisions were adopted to expand the grounds allowing government access to personal data. This is the case of the Information Technology (Intermediary Guidelines and Digital Media Ethics Code) Rules, also known as Rules 2021. The Rules set obligations for social media intermediaries in order to combat harmful and illegal content online. 
While they were neither adopted nor ratified by the Parliament, they state that significant social media intermediaries which provide messaging services are obliged to provide the identity of the “first originator” of a message when a judicial order requires it in the context of prevention, detection, investigation, prosecution, or punishment of an offence related to, amongst others, national security and public order. Similarly to the IT Act, the Rules 2021 also establish criminal liability for an intermediary not complying with the mentioned judicial order.100
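The time limits that the Rules 2009 attach to an interception direction, as summarized above (60 days from issuance, renewable up to a total of 180 days), can be illustrated with a minimal sketch. The function and constant names are hypothetical, renewal is simplified to a single yes/no flag, and the reading that a direction is still in place on the final day of the period is an assumption that the text summarized here does not settle.

```python
from datetime import date

# Limits as summarized in the text from the Rules 2009 (n 97).
INITIAL_LIMIT_DAYS = 60   # validity from issuance without renewal
TOTAL_LIMIT_DAYS = 180    # overall ceiling when the direction is renewed


def direction_in_force(issued: date, on: date, renewed: bool = False) -> bool:
    """Return True if a direction issued on `issued` may still be in place on `on`.

    Assumption: the last day of the period still counts as in force.
    """
    limit = TOTAL_LIMIT_DAYS if renewed else INITIAL_LIMIT_DAYS
    elapsed = (on - issued).days
    return 0 <= elapsed <= limit


issued = date(2024, 1, 1)
# 61 days after issuance: lapsed unless the direction was renewed.
print(direction_in_force(issued, date(2024, 3, 2)))
print(direction_in_force(issued, date(2024, 3, 2), renewed=True))
```

The sketch only models the duration of a direction; the separate retention periods for records (6 months for agencies, 2 months for intermediaries) would need their own rule.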
94 Information Technology (Procedure and Safeguards for Interception, Monitoring and Decryption of Information) Rules, 2009 (Rules 2009), section 2(d). 95 ibid, section 3. 96 ibid, section 8. 97 ibid, section 11. 98 ibid, section 23. 99 ibid, section 25(2). 100 Information Technology (Intermediary Guidelines and Digital Media Ethics Code) Rules 2021 (Rules 2021), section 4(2).
6.3.2.2 Assessment of the Indian Legal Framework Against the European Essential Guarantees
When assessing the outlined provisions against the European Essential Guarantees set out in paragraph 6.2, it should be noted that Sections 69 and 69B provide extremely broad grounds to allow an interference with data protection by governmental bodies, expanding the already considerable discretion afforded by the Telegraph Act. Not only does the IT Act include an extensive list of grounds justifying government access, but the grounds possess a degree of vagueness that leaves broad room for discretion to the executive. As pointed out by the report on the Indian data protection framework by a Committee of experts appointed by the government after the Puttaswamy judgment,101 the exemptions to fundamental principles based on national security grounds can undermine the effectiveness of data protection law. In this regard, the report argues that a clear definition of what constitutes national security is crucial to avoid such case-by-case exemptions becoming grounds for systematic access by the government to large data sets processed by private entities. The Indian experts drafting the report argued in favour of adopting the concept of “security of the State”, due to the more extensive interpretative effort made by Indian judicial authorities in defining it, as opposed to the problematic blurriness of the notion of national security. The preference for this legal expression is also in line with the wording used in the IT Act and the Telegraph Act provisions. However, the Report acknowledges that the current legal framework in India does not specify such a concept to limit the discretion of the government, nor does it provide sufficient safeguards in the event of these provisions being applied. It also points out that a distinction between the grounds based on the level of threat posed to the state is not explicit in the provisions.102 As argued by local scholars, the vagueness of the concept of security of the state is not limited to the regime for government access to data. 
The broad discretion left to the executive in assessing the necessity of exceptional measures based on national security is a recurring issue in Indian national security laws.103 Furthermore, the government is empowered by the IT Act and the Telegraph Act to adopt regulations to further specify the provisions. As illustrated above, crucial aspects of the legal framework on government access are left to governmental regulations, such as the regime on data retention and the safeguards to be observed in the interception activities.104 The same regulations may also expand the list of
101 Committee of Experts under the Chairmanship of Justice B.N. Srikrishna, “A Free and Fair Digital Economy. Protecting Privacy, Empowering Indians” (2018), https://www.meity.gov.in/writereaddata/files/Data_Protection_Committee_Report.pdf, accessed 9 March 2022.
102 Ibid.
103 Surabhi Chopra, “National Security Laws in India: The Unraveling of Constitutional Constraints” (2016) 16 Oregon Review of International Law 1.
104 Rules 2009 (n 89).
grounds allowing government access.105 The excessive discretion of the Indian government in applying and specifying the provisions represents a strong argument against the quality of the law of the provisions above. The vagueness of the grounds justifying government access is compounded by the absence of an explicit hierarchy between them. As a consequence, the line is blurred between what might constitute a threat to national security and what amounts to a crime requiring the action of law enforcement authorities.106 Thus, compliance with the requirements of necessity and proportionality should also be questioned. The broad wording of the provisions opens up the possibility for the government to perform generalized access based on the grounds of national security, and it does not help to strike a balance between the seriousness of the interference in fundamental rights and the importance of the pursued objective of public interest. As the IT Act did not import from the Telegraph Act the precondition of a public emergency or public safety ground to implement the measures, the noncompliance with the principles is even more evident. Based on Section 69, in 2018 ten Indian security and intelligence agencies were authorized by the Ministry of Home Affairs to conduct the said interception, monitoring, and decryption activities of every type of information, including texts, messages, and images.107 This authorization is a good example for assessing the practical functioning of the provision. In 2019, the Internet Freedom Foundation, an Indian civil liberties non-profit, and other petitioners challenged the lawfulness of the notification and the constitutionality of Section 69 before the Supreme Court of India. The petition extensively argued that the notification and the provisions under the IT Act do not respect the principles of necessity and proportionality. 
Rules 2009 do not require considering whether means that are less onerous and intrusive in light of the right to privacy exist to achieve a specific government objective, but only whether it is possible to acquire information through the activation of different mechanisms. The underlying assumption is that the only way to pursue the list of general interest objectives is by the massive acquisition of personal information. Besides, neither the provisions nor the notification complies with the principle of purpose limitation, as they do not precisely regulate the use of the acquired information.108 The petition also repeatedly pointed out how the overbreadth of the provisions leads to the risk of arbitrariness, as “any information generated, transmitted or stored in any computer source” is liable to be
105 Rules 2021 (n 95).
106 Committee of Experts under the Chairmanship of Justice B.N. Srikrishna (n 61), 129–130.
107 Order no. D.L.-33004/99 of the Ministry of Home Affairs (Cyber and Information Security Division) of the 20 December 2018.
108 Internet Freedom Foundation and another v. Union of India and others, Writ Petition no. 44 of 2019, para XXXIV.
intercepted, monitored, or decrypted.109 Likewise, the impugned notification was criticized for entitling the ten listed agencies in general to perform the monitoring activities, and not only the authorized officers as per Section 69 of the IT Act. As regards the procedural safeguards in the event of government access to personal data for national security purposes, Section 22 of the Rules 2009 provides that a Review Committee, as set up under Section 419A of the Telegraph Rules 1951, is deputed to review all the interception orders issued pursuant to the IT Act, assessing their compliance with the relevant provisions. The same Committee is responsible for overseeing the application of Section 5(2) of the Telegraph Act. Pursuant to the Telegraph Rules, such a committee is constituted by the Central Government or the State Governments. In both cases, the Committee is composed of high functionaries of the governments. According to the Report on data protection drafted by Indian experts, it has been found that about 7500–9000 such orders are passed by the Central Government every month. The Review Committee has an unrealistic task of reviewing 15,000–18,000 interception orders in every meeting while meeting once in 2 months.110 Besides the questionable effectiveness of the Committee in reviewing the huge number of orders passed by the Central Government, the Report also pointed out that the oversight mechanism relies on an exclusively executive review, and that legislative or judicial oversight would be needed to guarantee appropriate procedural safeguards in the case of government access. 
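The workload figures quoted from the Report can be checked with a short calculation: 7500–9000 orders per month, reviewed once every 2 months, gives 15,000–18,000 orders per meeting. The sketch below reproduces that arithmetic. The eight-hour meeting length is a purely hypothetical assumption added for illustration; the Report figures cited here do not state how long the Committee sits.

```python
# Figures taken from the text: orders per month and meeting interval.
ORDERS_PER_MONTH = (7500, 9000)
MONTHS_BETWEEN_MEETINGS = 2

# Hypothetical assumption, not from the Report: length of one meeting.
ASSUMED_MEETING_HOURS = 8


def orders_per_meeting(orders_per_month: int,
                       months_between_meetings: int = MONTHS_BETWEEN_MEETINGS) -> int:
    """Orders accumulated between two consecutive Review Committee meetings."""
    return orders_per_month * months_between_meetings


def seconds_per_order(orders: int, meeting_hours: float) -> float:
    """Average review time per order if the whole meeting were spent reviewing."""
    return meeting_hours * 3600 / orders


for monthly in ORDERS_PER_MONTH:
    backlog = orders_per_meeting(monthly)
    per_order = seconds_per_order(backlog, ASSUMED_MEETING_HOURS)
    print(f"{monthly} orders/month -> {backlog} per meeting, "
          f"{per_order:.2f} s per order")
```

Under the assumed eight-hour meeting, the Committee would have less than two seconds per order, which illustrates why the Report describes the task as unrealistic.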
The only exception to the lack of independent oversight in the Indian legal system may be found in the Rules 2021, establishing that a judicial order is needed to ask for the disclosure of an individual’s identity.111 The provisions under the Telegraph Act and IT Act leave a disproportionate power in the hands of the government, and they do not guard against the evident conflict of interest of a government only reviewed by other government functionaries. In light of the Puttaswamy judgment, the constitutionality of the lack of independent oversight in the Indian legal system is questionable.112 Concerning the availability to individuals of a judicial remedy against a breach of their right to privacy, it is useful to first recall that the IT Act represented until recently the relevant law regulating the processing of personal data in general.113
109 ibid, para XXI.
110 Committee of Experts under the Chairmanship of Justice B.N. Srikrishna (n 61), 125.
111 Rules 2021 (n 95), Section 4(2).
112 Vrinda Bhandari, Karan Lahiri, “The Surveillance State, Privacy and Criminal Investigation in India: Possible Futures in a Post-Puttaswamy World” (2020) 3 University of Oxford Human Rights Hub Journal 15.
113 At the time of publication of this paper, the Digital Personal Data Protection Bill has been recently adopted by the Indian Parliament (August 2023). The law is destined to replace Section 43A of the IT Act and the IT Rules 2011 in regulating data protection in India. Since the research for this paper was already finalized before the adoption of the law, the paper does not address the changes brought by the new piece of legislation on the Indian data protection framework and on the subject matter of this analysis.
However, both Section 43A and the consequently adopted Rules 2011, which governed the matter, only apply to “body corporates”, intended as private entities.114 When the Rules 2011 mentioned governmental agencies, it was to exempt body corporates from the obligation to obtain consent before processing personal data, in the event of such agencies requesting to access the data for law enforcement purposes.115 Therefore, while the IT Act provided individuals with some guarantees and the possibility to seek compensation when an entity failed to protect personal information,116 such provisions did not apply to government access to personal data, leaving a legislative vacuum around the matter. However, as stated above, the relevance of Section 43A was limited when it comes to the international transfer of data. A residual form of liability might be found in a broad interpretation of Section 45, which sets a penalty for whoever contravenes any rules or regulations under the IT Act, where no penalty has been separately provided. However, the question of the applicability of this provision to the state has not been addressed by Indian judicial authorities.117 A corollary of the absence of a provision holding the state liable for government access is the lack of redress mechanisms for individuals.118 Because the monitoring measures are carried out in secrecy, data subjects are unaware of the measures they are subject to. Furthermore, there is a lack of provisions allowing the aggrieved individuals to approach courts for redress. Therefore, the Indian legal system fails to comply not only with the EU requirements concerning the availability of a judicial remedy to enforce privacy and data protection rights but also with the principles of due process and access to justice that are inherent in the Indian Constitution.119
6.4 Conclusion In the light of the Schrems II decision and the European Essential Guarantees, the first standard for government data access in third countries is to meet the legality requirement under the EU Charter. The analysed Chinese legislation does not provide for any limitation to government access to data. Therefore, it does not
114 IT Act (n 78), Section 2(w).
115 Information Technology (Reasonable Security Practices and Procedures and Sensitive Personal Data or Information) Rules 2011, Section 6.
116 IT Act (n 78), Section 43A.
117 Rajesh Bahuguna R., Relevance of Distinction between Sovereign and Non-Sovereign Functions in governmental Liability In the Field of Cyber Torts: Indian Perspective (2020) 7 Journal of Critical Reviews 4226.
118 The conclusions drawn from this legal analysis do not consider the recent changes introduced by the Digital Personal Data Protection Act, adopted in August 2023 (n. 81), since the research for this paper was already finalized at the time of the adoption.
119 Vrinda Bhandari, Karan Lahiri (n 72).
meet such requirements. The second standard is the necessity and proportionality standard. Oversight mechanisms, redress mechanisms, and data subject rights are crucial to meet the proportionality requirement. However, Chinese oversight mechanisms concerning government access lack any assurance of independence, and redress mechanisms, as well as recognized data subject rights, are limited. All in all, the examination of this secondary legislation reveals that the government has significant leeway in acquiring people’s data. It is even possible to argue that Chinese legislation legitimizes the government’s vast and unlimited access to personal data. It might be argued that the PIPL could be viewed as an effort to improve data protection in the PRC. Those improvements include specific provisions on general data processing principles (legality, proportionality, data minimization, purpose limitation) applicable to all State organs. Individuals are also granted rights in respect of their personal information. However, the oversight mechanism is internally structured within the State organ, with no independence mandated by law. Furthermore, the exceptions provided to data protection principles across the legislation and the whole constitutional and political system in China substantially undermine any restraint of government access to personal data. Following the analysed laws and practices on government access to personal data in India, the substantive EU legality requirement is not met by Indian legislation, due to the vagueness of provisions and the broad discretion left to the government in accessing personal data. The framework on government access shares the use of indefinite concepts with other laws concerning national security, thus showing a trend across Indian legislation. The failure to meet the legality requirement directly affects respect for the principles of necessity and proportionality. 
While Indian constitutional law is familiar with such principles, the relevant legislation establishes a broad range of grounds justifying government access to personal data, without providing a hierarchy amongst the grounds based on the level of threat posed to the essential functions of the state. Therefore, the principles are not met in practice. As regards the oversight mechanisms provided by Indian laws, the bodies deputed to review the cases of government access to personal data lack the essential requirement of independence from the executive. Finally, the lack of a comprehensive framework on data protection leaves a legislative vacuum around the question of liability of the state concerning government access. The analysed laws do not provide specific provisions on state liability, nor do they enshrine any judicial remedies in the event of unlawful access to personal data by the government. While the right to privacy was recognized as a fundamental right in India, the current legal framework on government access to personal data does not meet the European Essential Guarantees. In light of the numerous provisions giving broad discretion to the government and enshrining little oversight over it in matters of national security, the effectiveness of upcoming legislation in enhancing the level of privacy and limiting
the broad power of the executive will need to be assessed, taking into consideration the landscape of Indian data protection in its entirety.120 The analysis of risks posed to privacy and data protection in China and India leads to the conclusion that, where EU exporters wish to transfer personal data to these countries pursuant to SCCs, supplementary measures are necessary to ensure an adequate level of protection. The evaluation of the two countries’ legislation and practices against the European Essential Guarantees highlights how a concrete application of the risk-based approach in the decision to transfer personal data may prove a difficult task for EU exporters, due to the extensive research and legal analysis required in the assessment. Furthermore, once a conclusion is reached about the need for supplementary measures, an evaluation of whether such measures may effectively overcome the gaps in data protection is also required. While such assessment is easier in the case of transfers of data to the United States, due to the extensive analysis carried out in the Schrems II judgment, the question around the appropriateness of supplementary measures for China and India, as well as other third countries, is more challenging to answer. In fact, the absence of a decision of the CJEU leaves room for the interpretation of domestic provisions and potential ways to address the risks to privacy and data protection. The recommendations of the EDPB establish a clear hierarchy among contractual, organizational, and technical measures. Contractual obligations are unlikely to fill the gaps in data protection in the third countries, due to the primacy of domestic laws on national security over them. This hierarchy is confirmed by the recent decisions of the EU DPAs addressing the matter. The decisions in question show how the technical measures might have the potential to effectively supplement the safeguards enshrined in the SCCs. 
However, some national DPAs took a stringent approach in assessing the effectiveness of such measures, due to the high risks for data protection in the US legislation. The evaluation of India and China’s legislation against the European Essential Guarantees shows how serious concerns may be raised about data protection when it comes to government access for national security purposes. In conclusion, while the possibility to perform transfers of personal data to China and India is not excluded, the high risks for data protection resulting from the analysis of their legislations and practices are likely to significantly limit the possibility to transfer data in the current post-Schrems II scenario.
120 At the time of publication of this paper, the Digital Personal Data Protection Bill has recently been adopted by the Indian Parliament (August 2023). However, its adoption is subsequent to the finalization of this research. The impact of the new data protection law on the subject matter should be the object of further legal analysis. (n. 81).
6 Operationalizing the European Essential Guarantees in Cross-Border. . .
6.5 Tables of Relevant Chinese and Indian Laws
Chinese laws on data protection/government access to personal data
The Cybersecurity Law 2017: The law applies to network operators, i.e. network owners, managers, and network service providers. It puts an obligation on network operators to provide technical support and assistance to the public security and national security organs that protect national security and investigate criminal activity. The law does not stipulate any limitations or restrictions to the scope of this technical support and assistance.
National Security Law 2015: The law stipulates that citizens and organizations must provide the necessary support and assistance to public security organs, state security organs, or related organs to protect national security. The national security shall follow the Chinese Constitution and respect and protect citizens’ rights under the law. However, it is unclear how the right to privacy or data protection may be invoked against these security organizations. Although the National Security Law recognizes human rights, it does not specify how they are protected against abuse of power.
National Intelligence Law 2017: The law imposes obligations on organizations and citizens to support and cooperate with Chinese national intelligence agencies. This law expects that every citizen is responsible for state security. For example, based on this law the national intelligence agencies may request companies or citizens to provide the necessary support. This rule applies to Chinese entities and their subsidiaries in foreign countries. Because of the vague scope of the powers given to Chinese intelligence agencies, companies can be requested to give access to personal data and cannot refuse.
The Counter-espionage Law 2014: The law states that while conducting counter-espionage, state security organs can use technical investigative measures subject to strict formalities. The citizens have to protect national security, the state’s honour, and interests and shall not jeopardize them. All citizens, enterprises, and organizations are obliged to prevent espionage. Organizations and citizens must provide information to the security organs.
Personal Information Protection Law 2021: The law provides rules on personal information processing, and some of its concepts in their appearance resemble European data protection law. However, this law protects individuals only against private actors due to the broad exceptions for public actors and without any restrictions on government data access.
Indian laws on data protection/government access to personal data (a)
Telegraph Act 1885: The law allows officers authorized by the government to intercept or disclose any message from any person transmitted by telegraphs on the grounds of protection of the sovereignty of India, security of the state, friendly relations with foreign states, public order, or prevention of the incitement to the commission of crimes.
Information Technology (IT) Act 2000: The law allows government access to any computer source and collection of every piece of information stored in it, on grounds related to sovereignty or integrity of India, defence of India, security of the state, friendly relations with foreign countries or public order, or the prevention of incitement to the commission of any cognizable offence related to these grounds. It also provides for government access to the same information in order to enhance national cybersecurity.
Information Technology (Procedure and Safeguards for Interception, Monitoring and Decryption of Information) Rules, 2009 (Rules 2009): The Rules establish provisions regulating the procedure and safeguards on government interception orders based on the IT Act 2000.
Information Technology (Reasonable Security Practices and Procedures and Sensitive Personal Data or Information) Rules 2011 (b): The Rules, adopted on the basis of the IT Act 2000, regulate the processing of personal data by body corporates.
Information Technology (Intermediary Guidelines and Digital Media Ethics Code) Rules 2021 (Rules 2021): The Rules, adopted on the basis of the IT Act 2000, set obligations for social media intermediaries in order to combat harmful and illegal content online, including the obligation to provide competent authorities with the information about the identity of the “first originator” of certain content, based on a judicial order in the context of prevention, detection, investigation, prosecution, or punishment of criminal offences.
(a) At the time of publication of this paper, the Digital Personal Data Protection Act has recently been adopted by the Indian Parliament (August 2023). Since the research for this paper was already finalized before the adoption of the law, the paper does not analyse it. However, it should be considered as the new relevant legal framework on data protection in India.
(b) The new Digital Personal Data Protection Act, adopted in 2023, is destined to replace this set of Rules (n. 81).
Chapter 7
Enabling Versatile Privacy Interfaces Using Machine-Readable Transparency Information
Elias Grünewald, Johannes M. Halkenhäußer, Nicola Leschke, Johanna Washington, Cristina Paupini, and Frank Pallas
7.1 Introduction
Transparency is a core principle for protecting data subjects’ privacy worldwide and one of the leitmotifs of the usable privacy discipline [36]: data subjects shall be well-informed about the consequences and potential risks of their interactions with systems that process personal data relating to them. Although regulations such as the European General Data Protection Regulation (GDPR) or the California Consumer Privacy Act (CCPA) oblige data controllers to provide transparent information about their data processing practices, traditional ways of doing so, such as written privacy policies, fail to convey relevant details [39]. As a consequence, poorly informed data subjects may, e.g., provide consent to dubious or even malevolent processing of sensitive data. In recent years, a field of usable privacy technologies has emerged [36]. In particular, comprehensive privacy iconographies have been or are being developed. Among them are privacy-focused nutrition labels [28], icons indicating notice & choice mechanisms [38], icons embedded in browser extensions [7], and many more [15, 21]. Still, their evaluation shows that data subjects need recurring contexts, previous experience, and learning to correctly understand their
E. Grünewald · J. M. Halkenhäußer · N. Leschke · F. Pallas: Technische Universität Berlin, Information Systems Engineering, Berlin, Germany
J. Washington: iRights.Lab, Berlin, Germany
C. Paupini: Oslo Metropolitan University, Oslo, Norway
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Schiffner et al. (eds.), Privacy Symposium 2023, https://doi.org/10.1007/978-3-031-44939-0_7
contents [37]. In particular, the interpretation of abstract legal terms and concepts remains challenging. For privacy visualizations and the related human-computer interactions, we identify at least three major requirements. First, these approaches need to fulfill legal requirements, i.e., provide sufficient expressiveness to depict the transparency information obligations from the applicable regulatory framework (e.g., the GDPR or CCPA). Second, the information needs to be provided in a machine-readable format so that it can be processed automatically, including across large-scale data-sharing infrastructures. Article 12(7) GDPR also explicitly states this obligation in the context of privacy icons, for accessibility reasons. Third, the designs are to be evaluated for their effectiveness, as shown for the “Do not sell my information” opt-out mechanism established by the CCPA [18]. Similarly, data protection authorities (DPAs) increasingly observe “dark patterns”, i.e., malformed privacy-related interfaces that make use of, e.g., overloading, misleading information, or decontextualization [9]. Given that data subjects (i) behave differently depending on the context they are using a service in [31, 41, 44], (ii) have heterogeneous personal preferences, and (iii) bring different levels of competence relating to the processing of personal data, there is an urgent need for adaptive privacy interfaces. We therefore argue for adopting the universal design approach, which aims for “services to be usable by all people to the greatest extent possible” [43]. More specifically, it takes into account the multiplicity of ways in which people access and navigate society [27, 32]. Consequently, service providers can guarantee the usability of their product while protecting equality and nondiscrimination at the same time [12].
The intensive scientific discourse notwithstanding, widespread adoption of user-friendly and at the same time legally aligned mechanisms for providing transparency information is still missing. Instead, the abovementioned “dark patterns” dominate the experience of the modern web, and even promising privacy icon sets have not been widely implemented. A major barrier to actual adoption lies in the fact that proposed solutions typically focus on overly specific use-case scenarios and are therefore not transferable across contexts with reasonable costs and efforts. To address this gap, we herein provide the following contributions:
1. A general reference model for exchanging transparency information between a data controller and a data subject based on universal design principles, which can be used to implement and evaluate transparency enhancing technologies.
2. Two open-source implementations of this model that demonstrate its real-world applicability for enabling versatile (i.e., context-, preference-, and competence-adaptive) privacy information guided by universal design principles. More specifically, these process machine-readable transparency information in the form of:
(a) An extendable and layered privacy dashboard for transparency based on participatory user workshops
(b) A privacy preference-focused chatbot and voice assistant powered by conversational artificial intelligence
3. A preliminary evaluation of the proposed model, as materialized in said implementations, including a user study involving 19 data subjects.
We organize this chapter as follows: In Sect. 7.2, we provide some background and related work. Section 7.3 describes a general reference model that can be used to provide transparency information. Afterward, we describe two implementations (in Sect. 7.4) of the reference model, namely, a layered privacy dashboard in Sect. 7.4.1 and an interactive privacy chatbot and voice assistant in Sect. 7.4.2. We evaluate our model in Sect. 7.5. Finally, Sect. 7.6 gives an overview of future work and concludes.
7.2 Background and Related Work
Transparency requirements from privacy regulations such as the GDPR constitute the starting point for our considerations. According to Art. 12 ff. GDPR, for example, data subjects must be informed about the categories of personal data being processed, the legal basis and purposes for processing, storage limitations, potential third parties to whom data are transferred, and details about how to get data access, rectification, or deletion. Moreover, general information about the data controller’s representative and data protection officer has to be accessible. This information must be provided “in a concise, transparent, intelligible and easily accessible form”. The CCPA and other recent (draft) regulations define similar information obligations. Currently, such information is typically provided in privacy policies written in complex, ambiguous, and legalese language. Going beyond such traditional privacy policies, several concrete designs of privacy icons have been proposed over the last decade. For instance, some have been evaluated in e-commerce scenarios [23], others have been proposed for online social networks [26], notice & choice mechanisms have been discussed together with visualizations [6], and, recently, modular and ontology-based icons have been presented [37]. Furthermore, visual nudges to encourage user interaction have been discussed extensively [13]. Notably, such advanced transparency interfaces are not limited to privacy icons. Schaub et al. proposed a privacy notice design space featuring different timing, channel, modality, and control options [39]. For instance, the Polisis project demonstrated how information can be automatically extracted from privacy policies and be presented through different channels [19]. All of these are meant not only to comply with data protection regulations, but also to strengthen the trust relationship between a data controller and a data subject. Hence, data controllers have an incentive to adequately represent their regulatory alignment and beyond. Nevertheless, controllers often lack the expertise to implement such measures on their own and have only general guidelines to rely on (e.g., from supervisory authorities, such as the EDPB). Therefore, we aim to provide an actionable model to implement transparency-enhancing interfaces in practice.
From a technical perspective, we identify several machine-readable privacy policy languages. The most prominent approach is the Platform for Privacy Preferences Project (P3P). However, it is limited in its expressiveness and therefore not suited to meet legislative requirements, which was one of the reasons it was discontinued [5]. Later studies, for instance [29], provide an overview and compare the different languages available. Of particular practical relevance are policy languages and accompanying developer toolkits, such as LPL (Layered Privacy Language) [11], TIRA (Transparency in RESTful Architectures) [17], and TILT (Transparency Information Language and Toolkit) [16], that can capture transparency information from modern, constantly evolving, and inherently complex distributed systems [14]. Moreover, consent or preference languages, such as YaPPL (YaPPL is a Privacy Preference Language) [42], GPC (Global Privacy Control),1 and ADPC (Advanced Data Protection Control) [25], or controversially discussed industry-driven consent management platforms (e.g., IAB Europe Transparency and Consent Framework) [22], are complementary to these, albeit with a different focus: their main goal is providing legitimacy for processing rather than transparency. Until now, wide adoption of these cannot be observed, since many of these technologies lack a holistic view on the privacy interaction. Consequently, we argue for a more informed approach to build upon, which is based on the universal design approach.
In its first conceptualization, back in 1985, universal design referred mainly to issues related to usability that the disabled community experienced, especially in regard to architecture and access to buildings [30]. Gradually, the concept and principles of universal design began to be adopted by different fields and areas of human experience, such as technology, education, policy, and more. For example, [10] elaborates on the integration of universal design in human-computer interaction. In 2006, references to universal design were included in the Convention on the Rights of Persons with Disabilities (CRPD) of the United Nations, requiring the ratifying states to “undertake or promote research and development of universally designed goods, services, equipment and facilities [. . . ] to promote their availability and use” (Art. 4(1)(f)). The convention additionally provides a proper definition of universal design, as “the design of products, environments, programs and services to be usable by all people, to the greatest extent possible, without the need for adaptation or specialized design” (Art. 2). It is worth noting that, since the CRPD has been ratified by nearly all the members of the UN, its definition of universal design holds global relevance. The CRPD highlights the need for designs to be usable by “all people”. Scholars have interpreted it as an unambiguous reference to human diversity and the complexity of human experience, particularly in regard to access and use of ICT [27, 32]. In practice, adopting a universal design approach that takes into account such diversity of needs in the design phase of a product ensures its usability and safeguards equality and nondiscrimination in the process [12].
1 https://globalprivacycontrol.org/.
7.3 General Model for Providing Transparency Information
In this work, we refer to Giannoumis and Stein’s conceptualization of universal design for the information society [12]. A set of four principles is identified with the purpose of shifting the focus from universal design as an outcome to universal design as a process:
• Social Equality: Equality and nondiscrimination should explicitly be focal points in the design of both policy and practice, ensuring democratic access to any solution proposed. In this context, transparency in privacy interfaces and policies is necessary in order to promote social equality in the use of ICT.
• Human Diversity: The complexity of the human experience(s) implies a variety of barriers that can be experienced by technology users in the creation of any product. Taking this aspect into account when addressing transparency in privacy interfaces requires the consideration of versatile interfaces to cater for the diverse needs.
• Usability and Accessibility: It is not enough for a design to be usable if a vulnerable demographic cannot access it, and vice versa. Traditional privacy policies are often difficult to locate for nonexpert data subjects, and the language they adopt is extremely specific to the legal field. Hence, advanced user interface technologies have to be considered.
• Participatory Processes: User participation and the incorporation of user feedback and testing are essential elements for the design of inclusive solutions. Privacy interfaces, thus, also need to include respective preference communication mechanisms (during the development and the operation).
Based on these considerations relating to universal design, we construct a general reference model for providing transparency information between a data controller and a data subject that brings together the so far isolated aspects listed above. We argue that the model of this privacy interaction explicitly needs to recognize the abovementioned principles. We first derive the considerations underlying the model, depicted in Fig. 7.1, and then explain its key ideas.
Distinguishing Presentation from Provision. Addressing the abovementioned principles in the domain of privacy-related transparency information requires presenting transparency information in different forms and modalities, consciously tailored
[Figure 7.1. Provision (data controller): ❶ Data representation and storage (e.g., policy language and database); ❷ Data interpretation and filtering (e.g., natural language processing or RegExp). Presentation (data subject): ❸ Transparency information display (e.g., privacy icons or voice assistant); ❹ Privacy preference enactment (e.g., policy language or consent signal).]
Fig. 7.1 Model of exchanging transparency information between a data controller and a data subject
to the capabilities and needs of different audiences. Visually impaired users may, for instance, need high-contrast or acoustic interfaces, certain groups may require particularly simple language, and so forth. At the same time, the underlying transparency information to be presented does not differ between these cases: the required form may change, while the content does not. In a first step, we therefore consciously distinguish between the presentation of transparency information and its original provision. Of these, provision happens before the presentation and comprises all activities and technical means involved in determining, compiling, and making accessible the information to be presented (the content), independently of the possibly user-specific interface (the form). The subsequent step of presenting this information, in turn, includes all aspects related to (possibly different, needs-adapted) user interfaces, user interaction, etc.
These two main stages can be assigned to different control spheres: Provision commonly happens on the side of the data controller who “determines the purposes and means of the processing”.2 In web-based scenarios, this control sphere is often equivalent to the service provider. Presentation, in turn, happens in the control sphere of the data subject (the user), employing the presentation and interaction capabilities of the client device and/or client software (e.g., web browser, operating system).
Across these two main stages and control spheres, we identify four substages of interaction required for the provision of transparency information in line with the abovementioned principles of universal design. Of these, the first two belong to the main stage of provision, while the latter two constitute the one of presentation:
❶ Data representation and storage. Foremost, to allow for any interaction at all, transparency-related information should be represented in a well-defined, machine-readable format. Such a representation can then be stored in potentially multiple versions, e.g., in a public database. Equally important is the availability of an operational and reliable API or query language to enable interoperability with other services [16]. For such structured representations, we point to the related work on policy languages discussed in Sect. 7.2. The data curation can be carried out by the controller itself, a trusted third party, or even be crowdsourced in a public repository. Having these data collected in a reasonable quality enables all further stages and versatile privacy interfaces, which outweighs the initial collection overhead.
❷ Data interpretation and filtering. Once the information exists in a machine-readable and structured format, and as soon as it can be accessed programmatically, we can start data interpretation and filtering. Some policy languages even include certain predefined data transformations, such as aggregation functions. Depending on the data representation, these processes may comprise tasks such as preprocessing (format conversion), vocabulary matching with existing privacy-related terminology for standardized wording [33], or more complex natural language
2 Responsibilities of potentially existing data processors are subsumed under the liability of the data controller on whose behalf they are acting (Art. 29 GDPR).
processing for translation or named-entity recognition for detecting cross-service data sharing. The necessary operations naturally depend on the data representation format and the compatibility of certain language features. Related work shows the general feasibility of doing so with multiple examples [3]. When transparency information for multiple data controllers is present (e.g., harvested from different APIs or taken from a public corpus), even complex sharing network analysis can be performed [16]. In our model, we consciously differentiate between the raw storage and the actual interpretation. The model is constructed with applicability to as many use-case scenarios as possible in mind. Since data controllers may have different intentions regarding what they want to present and data subjects have different information needs (as pointed out in the human diversity principle of universal design), this step allows for far more individualization than preset solutions.
❸ Transparency information display. Next, after successful interpretation and filtering, the transparency information can be presented to the data subject. Here, we emphasize the privacy notice design space [39] again and underline our aim of versatile privacy interfaces as well as legal expressiveness, machine-readability, and universal design principles. Of these, machine readability serves not only the accessibility dimension, but also the freedom of choice of client-side privacy agents. The display happens in the control sphere of the data subject, who opts for their desired presentation option. Notably, prior work pointed out the urgent need for adequate presentation options for recent technological advances, such as IoT devices processing personal data [e.g., 8, 34]. We point out that multiple display options can and should be offered based on the same structured transparency information. For visually impaired people, for instance, an audio or haptic interface might be considered, while others might prefer textual or visual presentations (see also [45]).
❹ Privacy preference enactment. In cases in which the subject has the option to provide privacy preferences (level of detail, legal background, technological safeguards, etc.), the user interface must be capable of signaling the desired state of presentation. That is why a communication protocol between the user agent (e.g., the browser) and the service provider’s infrastructure is necessary. We note that different approaches to client- and server-initiated communication exist, each with associated challenges. Furthermore, several competing standards differ in privacy signal contents, interpretation, communication, or contextual factors as pointed out by Human et al. [24]. Data controllers implementing any of these protocols usually propagate and store preferences in their infrastructure. As a result, the enactment of privacy preferences (guided by the informational self-determination principle) allows for individualized interfaces meeting the universal design principles and providing effective transparency.
Each of the stages mentioned above in the proposed model comes with legal, technical, and possibly organizational challenges. We therefore aim to structure the discussion about concrete system designs and to deliberately separate the different concerns.
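The separation of provision and presentation across the four stages can be illustrated with a minimal Python sketch. It is not taken from the implementations described in this chapter: the record structure, the names `Processing`, `Preferences`, `interpret`, and `present`, and the sample controller `example-shop` are all hypothetical and chosen only to show that one stored content can feed several presentation forms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processing:
    """One processing activity (hypothetical structure, not a real policy language)."""
    purpose: str
    data_categories: tuple
    legal_basis: str


@dataclass
class Preferences:
    """Stage 4: presentation wishes signaled by the data subject (assumed fields)."""
    detail: str = "short"    # "short" or "full"
    modality: str = "text"   # "text" or "speech"


# Stage 1 (provision): machine-readable representation held by the controller.
STORE = {
    "example-shop": [
        Processing("order fulfilment", ("name", "address"), "contract"),
        Processing("website analytics", ("ip address",), "consent"),
    ]
}


def interpret(controller, prefs):
    """Stage 2 (provision): filter the stored content according to the preferences."""
    items = []
    for p in STORE[controller]:
        item = {"purpose": p.purpose, "data": list(p.data_categories)}
        if prefs.detail == "full":
            item["legal_basis"] = p.legal_basis
        items.append(item)
    return items


def render_text(items):
    """Stage 3 (presentation): textual form."""
    lines = []
    for it in items:
        line = f"- {it['purpose']}: {', '.join(it['data'])}"
        if "legal_basis" in it:
            line += f" (legal basis: {it['legal_basis']})"
        lines.append(line)
    return "\n".join(lines)


def render_speech(items):
    """Stage 3 (presentation): sentences suitable for a voice interface."""
    return " ".join(
        f"For {it['purpose']}, the service uses your {' and '.join(it['data'])}."
        for it in items
    )


RENDERERS = {"text": render_text, "speech": render_speech}


def present(controller, prefs):
    """Same content, different form, selected by the data subject's preferences."""
    return RENDERERS[prefs.modality](interpret(controller, prefs))
```

Calling `present("example-shop", Preferences(modality="speech"))` and `present("example-shop", Preferences(detail="full"))` draws on the same stored records; only the form and the level of detail change, which is the point of keeping the stages apart.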
7.4 Implementation
After having explained the key concepts of our reference model, we now instantiate it by providing two implementations to illustrate its practical applicability and relevance in different contexts.
7.4.1 Layered Privacy Dashboard Including Privacy Icons
In the light of well-known dysfunctionalities of textual, legalese, and all too often incomprehensible privacy policies (including transparency-related statements), it is noteworthy that the GDPR already foresees a different, more intelligible modality for providing transparency information: Article 12(7) GDPR explicitly states that “the information to be provided to data subjects [. . . ] may be provided in combination with standardized icons to give in an easily visible, intelligible and clearly legible manner a meaningful overview of the intended processing”. Following that, Grafenstein et al. give an overview of the visual components pertinent to the realization of a layered privacy dashboard including privacy icons, exemplarily designed for the domain of cookies [13]. The study rests on participatory user workshops (in line with the universal design approach) that determine the information that has to be presented in order to be most effective and useful. They develop a three-layered approach in which information of increasing specificity and intricacy is presented to accommodate various user preferences and competences [13], which addresses the abovementioned universal design goal of human diversity. In principle, this approach is also suitable for conveying transparency information other than that related to cookies, e.g., the categories of personal data being collected or the purposes of their processing. Within this prototypical implementation, transparency is achieved through the visual components of a banner consciously designed to serve the abovementioned information needs gathered in a participatory manner. Following a “layered approach” [1], the first layer displays a basic icon symbolizing the availability of further information (see Fig. 7.2a), also facilitating usability. The first layer can be seen as an entry point, indicating that additional transparency information is available.
In the second layer, icons that represent purposes and data categories specified in the privacy policy of the data controller are displayed. For example, Fig. 7.2b shows the second layer of a controller that processes data for the purpose of improving the website. The third level acts as a comprehensive dashboard that allows the user to interact with the privacy settings and make more granular decisions (see Fig. 7.2c). It consists of multiple pages that reveal, for instance, information about third-party sharing and third-country transfers outside the European Union. The information is illustrated with a network diagram that allows us to intuitively understand how information is passed on to other entities.
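How three layers of increasing detail might be derived from a single machine-readable record can be sketched as follows. This is an illustrative Python sketch only; the record layout, the field names, the sample recipients, and the deliberately incomplete `EU_COUNTRIES` set are assumptions and do not reproduce the TILT schema or the prototype’s code.

```python
# Hypothetical transparency record; field names are assumptions for illustration.
RECORD = {
    "controller": "example.org",
    "purposes": ["improving the website"],
    "data_categories": ["usage data"],
    "recipients": [
        {"name": "Analytics Inc.", "country": "US", "forwards_to": ["Ad Partner Ltd."]},
        {"name": "Hosting GmbH", "country": "DE", "forwards_to": []},
    ],
}

EU_COUNTRIES = {"DE", "FR", "NL"}  # deliberately incomplete, illustration only


def layer1(record):
    """Layer 1: entry point, shown if any transparency information is available."""
    return bool(record["purposes"] or record["data_categories"] or record["recipients"])


def layer2(record):
    """Layer 2: one icon descriptor per purpose and per data category."""
    icons = [("purpose", p) for p in record["purposes"]]
    icons += [("data", c) for c in record["data_categories"]]
    return icons


def layer3(record):
    """Layer 3: edges (source, target, third_country) of the sharing network.

    third_country is None for onward transfers, because this sample record
    does not state where onward recipients are located.
    """
    edges = []
    for r in record["recipients"]:
        edges.append((record["controller"], r["name"], r["country"] not in EU_COUNTRIES))
        for onward in r["forwards_to"]:
            edges.append((r["name"], onward, None))
    return edges
```

The edge list returned by `layer3` is the kind of input a network diagram component would need; rendering it is left to the presentation side.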
Fig. 7.2 Functional prototype of the layered privacy dashboard based on the design of [13] and enabled using machine-readable transparency and consent information encoded in the TILT [16] and YaPPL [42] policy languages. (a) Level 1. Entry point. (b) Level 2. Purpose and risk icon. (c) Level 3. Detailed view on data receivers depicting an exemplary sharing network
Since the proposal [13] was of a purely conceptual nature, our contribution lies in the actual development of a technological underpinning and its concrete implementation. In their study, Grafenstein et al. already determined the effectiveness of their designs based on their evaluations. Consequently, the effectiveness evaluation of the visual representation does not need to be repeated in this work. However, for the approach to be considered a state-of-the-art implementation according to Art. 25 GDPR, we argue that all stages from our model need to be addressed. So far, their work focused mainly on the transparency information display step, while the data controller’s control sphere (provision stages) was not discussed from a technological perspective. Thus, we have developed a working prototype that allows users to visually access and control which data are shared with whom, for which purpose, etc. We cover all relevant steps from our proposed model and incorporate the findings from the original study [13]. To do so, we applied our abovementioned general model as follows:
❶ The underlying transparency information is represented using the Transparency Information Language and Toolkit (TILT) [16], which has the required,
E. Grünewald et al.
GDPR-aligned expressiveness. We assume the transparency information according to Art. 12–15 GDPR to be provided by the data controller. Moreover, by querying an instance of tilt-hub,3 which is a document store that provides several query-capable web APIs, we also have access to the transparency information for potential secondary recipients (onward transfers), as well as an overview of related third parties. Within this setup, TILT is used as the fundamental data representation stage, as presented in our abovementioned model.
❷ Next, data interpretation and filtering are performed within our prototype in the control sphere of the data controller. In particular, we encoded several interpretation rules (e.g., if-then-else statements) that lead to the generation and subsequent display of suitable privacy icons. To this end, the provided transparency information is scanned for key terms that indicate respective risks for the data subject. For now, we interpret the categories of personal data disclosed or the purpose specification given. However, we call for contributions to provide even more sensible content-related interpretations and summaries (e.g., as in privacy nutrition labels [28]).
❸ Afterwards, all information is provided on the controller side, retrieved on the data subject side, and visually displayed on stacked layers following the underlying design proposal [13] for presentation. To remove barriers to adoption, the complete dashboard and banner components are technically encapsulated and based on standard web-development technologies. They are all accessible to any programmer in an open-source code repository.4 Consequently, the banner does not need to be coded anew but can be easily integrated into existing websites. Hence, we minimize the cost of implementation and integration to align with the privacy by design and by default principle (Art. 25 GDPR).
❹ In addition, data subjects can switch off transfers to selected third parties or to whole groups, such as data controllers residing outside the EU. Presentation preferences are stored locally on the client device so that they remain available when the user revisits the service.5
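To illustrate the key-term interpretation of stage ❷, the following is a minimal sketch and not the prototype's actual code. It assumes a simplified, TILT-inspired input with the fields `dataDisclosed`, `category`, and `purposes`; these field names, the icon identifiers, and the key terms in `ICON_RULES` are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Set

# Hypothetical key-term rules: icon identifier -> terms that trigger the icon.
ICON_RULES: Dict[str, List[str]] = {
    "purpose:website-improvement": ["improv", "analytics"],
    "purpose:marketing": ["marketing", "advertis"],
    "risk:sensitive-data": ["health", "biometric", "location"],
}


def derive_icons(document: dict) -> Set[str]:
    """Return every icon whose key terms occur in a data category or purpose."""
    texts: List[str] = []
    for item in document.get("dataDisclosed", []):
        texts.append(item.get("category", ""))
        texts.extend(p.get("purpose", "") for p in item.get("purposes", []))
    haystack = " ".join(texts).lower()

    icons: Set[str] = set()
    for icon, terms in ICON_RULES.items():
        # If-then rule: any matching key term selects the icon for display.
        if any(term in haystack for term in terms):
            icons.add(icon)
    return icons


# Illustrative input only; not taken from a real controller.
example = {
    "dataDisclosed": [
        {"category": "Usage data", "purposes": [{"purpose": "Improving the website"}]},
        {"category": "Location data", "purposes": [{"purpose": "Fraud prevention"}]},
    ]
}

print(sorted(derive_icons(example)))
# ['purpose:website-improvement', 'risk:sensitive-data']
```

Keeping the rules in a plain mapping separates the interpretation logic from the presentation layer, so further rules could be contributed without touching the display code.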
7.4.2 Interactive Privacy Chatbot and Voice Assistant Through Conversational AI
Another approach to making transparency information more accessible for data subjects with different needs is the use of conversational AI [20]. To this end, we developed a Transparency Information Bot (TIBO) that can be accessed in the form of a textual chatbot or as a speech-based virtual assistant (VA). Again, we
3 https://github.com/Transparency-Information-Language/tilt-hub.
4 https://github.com/DaSKITA/privacy-icons.
5 Furthermore, by incorporating the YaPPL privacy preference language [42], we can communicate a respective signal back to the controller if desired.
Fig. 7.3 Architecture of the interactive privacy interface using conversational AI. [Diagram labels: Document storage (tilt-hub), Rasa, Core, Model, Action Server, Web server, Virtual Assistant, Messengers; stage markers 1–4]
provide our complete implementation as open-source software.6 In the following, we discuss our design choices according to the general model.
❶ For this prototype, we again assume transparency information to be provided in the TILT ecosystem, as described for the first implementation. Besides illustrating the advantage that different presentations can be implemented on top of the same information provision, this allows for constant extension and potentially crowdsourcing of transparency information for numerous data controllers. We successfully began such efforts in a public corpus featuring dozens of real-world online services.7
❷ To interpret the transparency information, we implemented a cloud-based infrastructure at the core of TIBO (see Fig. 7.3), which is based on the commonly used Rasa X framework for conversational AI [4]. In particular, there are two core modules in the framework. First, a natural language understanding module (Rasa NLU) receives a message from the data subject and interprets it using NLP and machine learning techniques. Second, Rasa Core uses the interpreted data as input and computes an action (typically a response message) based on the previous conversation history, employing a trained classification model and various dialogue flows. For that, a classification model is continuously trained to choose the best response for a given input from a predefined list of actions. For instance, we provide different dialogue flows depending on the input question.
❸ For the display, we implemented both a web server and a front end that features the TIBO chatbot and a skill for the popular Amazon Alexa platform in the data subject’s control sphere. The chatbot can, similar to the privacy dashboard, be included in existing websites to support conversational privacy dialogues. Through the speech-based interface, TIBO can also be used by visually impaired people or on devices without screens (e.g., IoT devices).
Researching further similar accessibility improvements, guided by the proposed model foundations, is considered impactful future work.
❹ Additionally, the prototype enables data subjects to specify their preferred level of detail for the information provided. They can either ask their questions in natural language or choose from recommended conversation flows. This allows experienced data subjects to investigate the peculiarities of, e.g., data sharing networks in depth, while others are content with a rough overview. Moreover, the conversational AI can continuously be trained to answer recurring information requests precisely. Future work in this stage may support lightweight data subject access requests. These are then supposed to be individualized according to personal preferences, for example, with regard to the level of completeness.
6 https://github.com/DaSKITA/chatbot.
7 https://github.com/Transparency-Information-Language/tilt-corpus.
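The two-step interpretation described in stage ❷ (understanding a message, then choosing an action from a predefined list) can be sketched as follows. This is a deliberately simplified stand-in and does not use the Rasa API: keyword matching replaces the trained NLU and classification models, and the `POLICY` record, its field names, and the intent names are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Simplified stand-in for a transparency record; field names are assumptions.
POLICY = {
    "controller": "Example Service",
    "dataProtectionOfficer": "Jane Doe",
    "categories": ["usage data", "contact data"],
    "thirdParties": ["Analytics Corp"],
}

# Stand-in for the NLU step: keyword lists instead of a trained model.
INTENT_KEYWORDS: Dict[str, List[str]] = {
    "ask_dpo": ["data protection officer"],
    "ask_categories": ["categories", "which data", "what data"],
    "ask_sharing": ["third part", "shared", "sharing"],
}


def interpret(message: str) -> str:
    """Map a user message to an intent name, or 'fallback' if nothing matches."""
    text = message.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "fallback"


# Stand-in for dialogue management: one predefined action per intent.
ACTIONS: Dict[str, Callable[[dict], str]] = {
    "ask_dpo": lambda p: f"The data protection officer is {p['dataProtectionOfficer']}.",
    "ask_categories": lambda p: "Processed categories: " + ", ".join(p["categories"]) + ".",
    "ask_sharing": lambda p: "Data is shared with: " + ", ".join(p["thirdParties"]) + ".",
    "fallback": lambda p: "Sorry, I did not understand the question.",
}


def respond(message: str, policy: dict = POLICY) -> str:
    """Interpret the message and run the matching action on the policy record."""
    return ACTIONS[interpret(message)](policy)


print(respond("Who is the data protection officer?"))
print(respond("Is my data shared with third parties?"))
```

Because the actions only read from the policy record, the same record could serve both the chatbot and the voice assistant, which mirrors the separation of provision and presentation in the general model.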
7.5 Preliminary Evaluation
To demonstrate the general feasibility of the prototypes implemented according to our model and their benefit for data subjects’ actual informedness, both have been subjected to a preliminary evaluation. Given the significantly differing prerequisites (e.g., the layered privacy dashboard uses preexisting visual concepts designed in a participatory process), the evaluation approaches also differ between the two prototypes.
7.5.1 Layered Privacy Dashboard
For the layered privacy dashboard, we first focus on the provision-related stages of the model: The data representation and storage challenges (stage ❶) could be addressed by using a structured policy language. We depict an example of an (imaginary) data controller in Fig. 7.2. In fact, the prototype already supports dozens of real-world services, for which we collected transparency information in our corpus repository, demonstrating the practical viability of our approach. Moreover, the data interpretation and filtering (❷) could be realized with the principle of human diversity, rooted in universal design, in mind. For instance, the detection of third-country transfers or the summary of purposes for the related processing activities could be automated within the prototype based on the users’ specific needs. Clearly, the summary and generation of privacy icons (to be displayed subsequently) relieves data subjects of doing the analysis work manually. In addition, the instantiation of the general model contributes to a more standardized perception of transparency information. As opposed to different information banners on each and every website, the prototype enables a quick overview of the information in an always comparable setting. Altogether, the privacy information banner is designed to allow users to exercise their GDPR rights to transparency. It is based on designs already evaluated in participatory workshops and technically implements these in a practically applicable form, following our general model introduced above. Therefore, we can do without additional user studies. Through our implementation efforts, we contribute a working prototype that can serve as a standardized mechanism for data controllers instead of the all too often deficient “cookie banners” that neither cover all GDPR or European ePrivacy Directive transparency requirements nor have been properly evaluated beforehand. Further experiments have to optimize transparency information display (❸) to adequately address potential misconceptions of personal data at risk. Lastly, privacy preference and default settings have to reflect information asymmetries, power imbalances, and, again, risks associated with the processing of personal data. Discussing several options under the umbrella of our model might help. Applying stage ❹ of our reference model, we allow users to spend a minimal amount of effort to set their privacy preferences once and let the browser remember the decision. For actual integration, the payload encoded in preference requests needs to be transferred back to the controller to propagate and enforce the policy [35].
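The detection of third-country transfers and the preference payload of stage ❹ could be sketched as below. This is an illustrative sketch under stated assumptions: the recipient records, the `excludedRecipients` field, and the function names are hypothetical and do not reproduce the YaPPL or TILT schemas. Following the text, the check uses EU membership; a production system would likely also need to consider the EEA and adequacy decisions.

```python
import json
from typing import Dict, Iterable, List

# ISO 3166-1 alpha-2 codes of the 27 EU member states.
EU_MEMBER_STATES = {
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR", "HU", "IE",
    "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE",
}


def third_country_recipients(recipients: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the recipients whose country is not an EU member state."""
    return [r for r in recipients if r["country"] not in EU_MEMBER_STATES]


def build_preference_payload(
    recipients: List[Dict[str, str]],
    block_third_countries: bool,
    blocked_names: Iterable[str] = (),
) -> Dict[str, List[str]]:
    """Collect individually blocked recipients plus, optionally, all non-EU ones."""
    excluded = set(blocked_names)
    if block_third_countries:
        excluded.update(r["name"] for r in third_country_recipients(recipients))
    # Hypothetical payload shape; a real deployment would follow its preference language.
    return {"excludedRecipients": sorted(excluded)}


# Illustrative recipients only.
recipients = [
    {"name": "Hosting GmbH", "country": "DE"},
    {"name": "Analytics Inc.", "country": "US"},
    {"name": "Ads Ltd.", "country": "GB"},
]

payload = build_preference_payload(recipients, block_third_countries=True)
serialized = json.dumps(payload)  # could be stored client-side and sent back to the controller
print(serialized)
# {"excludedRecipients": ["Ads Ltd.", "Analytics Inc."]}
```

Serializing the decision once allows the client to remember it locally and to return the same payload to the controller for enforcement.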
7.5.2 Interactive Privacy Chatbot and Voice Assistant
Secondly, we continue with the interactive privacy chatbot TIBO. Given that it rests upon basically similar technologies for the provision-related part, we pay particular attention to the presentation-related stages of the model here. For this, we performed a participatory user study, which was carried out as follows:
Study Design The study aims to evaluate the extent to which TIBO is useful for data subjects. In particular, we want to measure the time spent to find the desired information about the privacy-related practices of a service. The secondary goal was to obtain user feedback for the further development of the conversational AI.
Procedure We selected a variant of Unmoderated Remote Usability Testing [2], where the participants have to solve various tasks given to them in an online survey, with and without the assistance of TIBO. These tasks consisted of finding out relevant information, such as the name of the data protection officer, the categories of personal data being processed, or the existence of third-party sharing. The participants were, based on their month of birth, divided into two groups (42 percent/58 percent), which had to find identical transparency information for two different services, namely, a major German media library service (ARD Mediathek) and the online platform of the German Federal Ministry of Justice (former German BMJV). Group A used TIBO for the ARD Mediathek and the traditional privacy policy of the German BMJV, while group B had to solve the same tasks the other way around. The participants were free to choose whether to use the chatbot or the voice assistant. We then measured the time taken for solving the tasks. A shorter processing time of the tasks when using one of the assistants would then indicate that the prototypes fulfill their goal. In addition, we conducted short interviews to discuss further use cases and potentials of TIBO.
Participants Nineteen participants evenly distributed across gender and age groups completed the online survey. With this sample size, results are of limited statistical significance, but nonetheless still provide valuable insights. Almost half of the
Fig. 7.4 Ease of understanding regarding responses of the chatbot and voice assistant, rated by the participants of the user study
participants were female (47 percent). The majority of participants were between 30 and 45 years old (53 percent), followed by those younger than 30 (26 percent) and between 46 and 60 (21 percent). Sixty-nine percent opted for testing the voice assistant first. We asked the participants several questions about their prior experience and demographics while ensuring that they stay anonymous at all times. They were briefed to be included in a scientific evaluation of the tools, and they provided consent for analyzing the given inputs. We adhered to good scientific practice and common ethical standards for such user studies. The participants were recruited from the so-called Living Lab of the CheckMyVA8 project. Results The results show that the use of the assistant enabled the participants to solve their tasks more quickly. The average participant spent 8:28 min to find out the required pieces of information with the assistant, while they needed 10:51 min for the same tasks without the assistant. Independent of their chatbot use, they took slightly longer to complete the tasks for the BMJV than for the ARD Mediathek (BMJV: 9:48 min; ARD Mediathek: 9:03 min). Participants using the chatbot were faster than those using the voice assistant, which evidently depends on the speaking rate of the virtual voice and faster processing of written text than voice. Considering the feedback from the participants, the answers are easy to understand (cf. Fig. 7.4) but there is still potential for increasing the time saved by the assistant compared to manual research, e.g., by improving response times. By and large, however, the results indicate our approach to be practically viable and valuable and to effectively help data subjects in finding and comprehending transparency information. One participant elaborated: “The bot makes it easier for me to find the information. It’s great that the information is in the same format, allowing me to compare”. 
This suggests that the value of TIBO can be increased even further if more services are included, which refers to the provision of more transparency information in stages ❶ and ❷. Furthermore, the results indicate that the assistant fulfills its intended use with regard to effectiveness in display (❸). All participants obtained the desired information correctly with the help of the assistant and even faster than with a manual search. Concerning general usability, the participants noted that the chatbot does not offer the option of interrupting it during its responses. Especially during longer answers, this option should be offered in future versions. In addition, a navigation function should be implemented so that jumping between menu and selection items is possible, extending the preference enactment in stage ❹. Overall, 53 percent of the participants stated they would use the assistant again to obtain privacy-related information, which underlines the general effectiveness of the approach. Future versions of the assistant are currently being evaluated in step with the ongoing development. Overall, the participants envisioned the primary use of TIBO to be checking the privacy policy before they start using a service. For this purpose, many of them imagine it on a website of an independent third party, e.g., a consumer center. One participant sees the chatbot’s greatest potential for children or people with learning disabilities who are not familiar with the usual privacy statements. For these people, TIBO, possibly in easy language, could be a good fit for obtaining comprehensible information. This aligns well with our intention of using conversational AI as a presentation means (in stages ❸ and ❹), which again addresses the universal design principles of increased usability and accessibility.
8 https://checkmyva.de/.
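As a small worked example, the reported average completion times (8:28 min with the assistant, 10:51 min without) can be converted into an absolute and a relative time saving. The helper names below are introduced for illustration only; the figures are the averages stated in the results, and the calculation says nothing about statistical significance.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string into a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds: int) -> str:
    """Convert a number of seconds back into 'minutes:seconds'."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


with_assistant = to_seconds("8:28")       # 508 seconds
without_assistant = to_seconds("10:51")   # 651 seconds

saved = without_assistant - with_assistant   # 143 seconds
relative = saved / without_assistant         # share of the manual search time

print(f"Time saved: {to_mmss(saved)} min ({relative:.0%} of the manual search time)")
# Time saved: 2:23 min (22% of the manual search time)
```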
7.6 Discussion, Conclusion, and Outlook
We proposed a general model for the provision of transparency information while embedding the legal responsibilities and rights, technical challenges, and some cross-cutting aspects induced by diverse fields. All scholars and professionals in the field are invited to take the model as a reference for proposing related systems. Clearly, each proposed stage has to be explored in much more detail. In the same vein, more studies need to explore suitable interpretation and filtering methods. These should in particular incorporate comparable transparency metrics [40]. Following our general model, we proposed two advanced transparency interfaces that enable context-, preference-, and competence-adaptive provision of transparency information based on machine-readable representations. To the best of our knowledge, this is the first comprehensive work on providing transparency information in a structured, machine-readable format across the control spheres of data controller and subject, presenting the information through different channels, but using the same underlying data, and guided by overarching universal design principles. We demonstrated the practical viability through the user study of the chatbot and the virtual assistant while incorporating the already-evaluated UI/UX design of the layered privacy dashboard. We found that conversational AI makes transparency information significantly more accessible, thereby enabling users to make well-informed decisions about the processing of personal data concerning them faster than with traditional means.
Open challenges remain in large-scale integration efforts across diverse user interface technologies and platforms. We already provided two viable proof-of-concept implementations. Evidently, as pointed out above, the design space of an effective privacy notice is large [39]. In this work, we demonstrate the actionable guidance of our reference model and aim for more versatile implementations in that space, replacing existing “dark patterns”. In addition, new and current draft regulations containing transparency provisions, such as the European Data Governance Act, Digital Services Act, or AI Act, are introducing extended transparency requirements. As soon as they enter into force, affected data controllers, in particular large-scale service providers, are obligated to implement comparable transparency measures. Future work includes, among others, participatory learning of privacy preferences and crowdsourced approaches for collecting and providing transparency information (e.g., through privacy agents). Such data could also be used for the wider prospects of advanced transparency and accountability mechanisms for practical privacy engineering and usable privacy. In all the mentioned scenarios, the distinction and logical separation of information provision and presentation is key to a sustainable technological development: System engineers can build efficient data provision formats and communication protocols, while UI/UX designers can conceptualize more usable and accessible interfaces (guided by legal expertise), provided that they agree to stick to the proposed model. This ultimately results in satisfactory components for both data controllers and subjects.
Acknowledgments First, we thank Flora Muscinelli and Michael Gebauer for their work on the development and operations of our prototypes.
Moreover, we thank the DaSKITA9 project team for their valuable support. We also thank Maximilian von Grafenstein and Julie Heumüller for fruitful discussions and their input on the privacy dashboard. In addition, we thank the team of the CheckMyVA project for jointly recruiting the participants of the user study. This research was funded within the project TOUCAN,10 supported under grant no. 01IS17052 by funds of the German Federal Ministry of Education and Research (BMBF) under the Software Campus 2.0 (TU Berlin) program.
References
1. Article 29 Data Protection Working Party. Guidelines on Transparency under Regulation 2016/679. 2017.
2. Carol M. Barnum. “Testing here, there, everywhere”. In: Usability Testing Essentials (Second Edition). 2021, pp. 69–97. ISBN: 978-0-12-816942-1. https://doi.org/10.1016/B978-0-12-816942-1.00003-4.
9 http://tu.berlin/ise/daskita. 10 https://tu.berlin/ise/toucan.
3. Stefan Becher and Armin Gerl. “ConTra Preference Language: Privacy Preference Unification via Privacy Interfaces”. In: Sensors 22.14 (2022), p. 5428.
4. Tom Bocklisch, Joey Faulkner, Nick Pawlowski, and Alan Nichol. “Rasa: Open source language understanding and dialogue management”. In: NIPS Workshop on Conversational AI (2017).
5. Lorrie Faith Cranor. “Necessary but not sufficient: Standardized mechanisms for privacy notice and choice”. In: Journal on Telecommunications and High Technology Law 10 (2012), p. 273.
6. Lorrie Faith Cranor. “P3P: Making privacy policies more useful”. In: IEEE Security & Privacy 1.6 (2003), pp. 50–55. https://doi.org/10.1109/MSECP.2003.1253568.
7. Lorrie Faith Cranor, Praveen Guduru, and Manjula Arjula. “User interfaces for privacy agents”. In: ACM Transactions on Computer-Human Interaction (TOCHI) 13.2 (2006), pp. 135–178.
8. Nigel Davies, Nina Taft, Mahadev Satyanarayanan, Sarah Clinch, and Brandon Amos. “Privacy mediators: Helping IoT cross the chasm”. In: Proceedings of the 17th international workshop on mobile computing systems and applications. 2016, pp. 39–44.
9. European Data Protection Board. Guidelines 3/2022 on Dark patterns in social media platform interfaces: How to recognise and avoid them. 2022.
10. Christian Fuchs and Marianna Obrist. “HCI and Society: Towards a Typology of Universal Design Principles”. In: International Journal of Human-Computer Interaction 26.6 (May 2010), pp. 638–656. https://doi.org/10.1080/10447311003781334.
11. Armin Gerl, Nadia Bennani, Harald Kosch, and Lionel Brunie. “LPL, towards a GDPR-compliant privacy language: formal definition and usage”. In: Transactions on Large-Scale Data-and Knowledge-Centered Systems XXXVII. Bonn: Springer, 2018, pp. 41–80.
12. George Anthony Giannoumis and Michael Ashley Stein. “Conceptualizing Universal Design for the Information Society through a Universal Human Rights Lens”. In: International Human Rights Law Review 8.1 (2019), pp. 38–66. https://doi.org/10.1163/22131035-00801006.
13. Maximilian von Grafenstein, Julie Heumüller, Elias Belgacem, Timo Jakobi, and Patrick Smiesko. “Effective Regulation through Design – Aligning the ePrivacy Regulation with the EU General Data Protection Regulation (GDPR): Tracking Technologies in Personalised Internet Content and the Data Protection by Design Approach”. In: Available at SSRN (2021). https://doi.org/10.2139/ssrn.3945471.
14. Elias Grünewald. “Cloud Native Privacy Engineering through DevPrivOps”. In: Privacy and Identity Management. Between Data Protection and Security. Ed. by Michael Friedewald, Stephan Krenn, Ina Schiering, and Stefan Schiffner. Cham: Springer International Publishing, 2022, pp. 122–141. ISBN: 978-3-030-99100-5.
15. Elias Grünewald and Frank Pallas. “Datensouveränität für Verbraucher:innen: Technische Ansätze durch KI-basierte Transparenz und Auskunft im Kontext der DSGVO”. de. In: Alexander Boden, Timo Jakobi, Gunnar Stevens, Christian Bala (Hgg.): Verbraucherdatenschutz – Technik und Regulation zur Unterstützung des Individuums. 2021, pp. 1–17. ISBN: 978-3-96043-095-7. https://doi.org/10.18418/978-3-96043-095-702.
16. Elias Grünewald and Frank Pallas. “TILT: A GDPR-Aligned Transparency Information Language and Toolkit for Practical Privacy Engineering”. In: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. FAccT ’21. Virtual Event, Canada: Association for Computing Machinery, 2021, pp. 636–646. ISBN: 9781450383097. https://doi.org/10.1145/3442188.3445925.
17. Elias Grünewald, Paul Wille, Frank Pallas, Maria C. Borges, and Max-R. Ulbricht. “TIRA: An OpenAPI Extension and Toolbox for GDPR Transparency in RESTful Architectures”. In: 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). IEEE. 2021, pp. 312–319.
18. Hana Habib, Yixin Zou, Yaxing Yao, Alessandro Acquisti, Lorrie Cranor, Joel Reidenberg, Norman Sadeh, and Florian Schaub. “Toggles, Dollar Signs, and Triangles: How to (In)Effectively Convey Privacy Choices with Icons and Link Texts”. In: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. CHI ’21. Yokohama, Japan: Association for Computing Machinery, 2021. ISBN: 9781450380966. https://doi.org/10.1145/3411764.3445387.
19. Hamza Harkous, Kassem Fawaz, Rémi Lebret, Florian Schaub, Kang G. Shin, and Karl Aberer. “Polisis: Automated Analysis and Presentation of Privacy Policies Using Deep Learning”. In: 27th USENIX Security Symposium (USENIX Security 18). Baltimore, MD: USENIX Association, Aug. 2018, pp. 531–548. ISBN: 978-1-939133-04-5.
20. Hamza Harkous, Kassem Fawaz, Kang G. Shin, and Karl Aberer. “PriBots: Conversational Privacy with Chatbots”. In: Twelfth Symposium on Usable Privacy and Security (SOUPS 2016). Denver, CO: USENIX Association, June 2016.
21. Hans Hedbom. “A survey on transparency tools for enhancing privacy”. In: IFIP Summer School on the Future of Identity in the Information Society. Berlin Heidelberg: Springer, 2008, pp. 67–82.
22. Maximilian Hils, Daniel W Woods, and Rainer Böhme. “Privacy preference signals: Past, present and future”. In: Proceedings on Privacy Enhancing Technologies 2021.4 (2021), pp. 249–269.
23. Leif-Erik Holtz, Katharina Nocun, and Marit Hansen. “Towards Displaying Privacy Information with Icons”. In: Privacy and Identity Management for Life. Ed. by Simone Fischer-Hübner, Penny Duquenoy, Marit Hansen, Ronald Leenes, and Ge Zhang. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 338–348.
24. Soheil Human, Harshvardhan J. Pandit, Victor Pierre Morel, Cristiana Santos, Martin Degeling, Arianna Rossi, Wilhelmina Botes, Vitor Jesus, and Irene Kamara. “Data Protection and Consenting Communication Mechanisms: Current Open Proposals and Challenges”. In: 2022 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). IEEE. 2022.
25. Soheil Human, Max Schrems, Alan Toner, Gerben, and Ben Wagner. Advanced Data Protection Control (ADPC). Sustainable Computing Reports and Specifications. Vienna: WU Vienna University of Economics and Business, June 2021. URL: https://epub.wu.ac.at/8280/.
26. Renato Iannella, Adam Finden, and Stacked Creations. “Privacy awareness: Icons and expression for social networks”. In: Proceedings of the 8th Virtual Goods Workshop and the 6th ODRL Workshop. 2010, pp. 1–15.
27. Rob Imrie. “Universalism, universal design and equitable access to the built environment”. In: Disability and rehabilitation 34.10 (2012), pp. 873–882.
28. Patrick Gage Kelley, Joanna Bresee, Lorrie Faith Cranor, and Robert W Reeder. “A “nutrition label” for privacy”. In: Proceedings of the 5th Symposium on Usable Privacy and Security. 2009, pp. 1–12.
29. Jens Leicht and Maritta Heisel. “A survey on privacy policy languages: Expressiveness concerning data protection regulations”. In: 2019 12th CMI Conference on Cybersecurity and Privacy (CMI). Copenhagen: IEEE, 2019, pp. 1–6. https://doi.org/10.1109/CMI48017.2019.8962144.
30. Ronald Mace. “Universal design: Barrier free environments for everyone”. In: Designers West 33.1 (1985), pp. 147–152.
31. Helen Nissenbaum. “Privacy as contextual integrity”. In: Washington Law Review 79 (2004), pp. 119–158.
32. Elaine Ostroff. “Universal design: an evolving paradigm”. In: Universal design handbook 2 (2011), pp. 34–42.
33. Harshvardhan J. Pandit, Axel Polleres, Bert Bos, Rob Brennan, Bud Bruegger, Fajar J. Ekaputra, Javier D. Fernández, Roghaiyeh Gachpaz Hamed, Elmar Kiesling, Mark Lizar, Eva Schlehahn, Simon Steyskal, and Rigo Wenning. “Creating a Vocabulary for Data Privacy”. In: On the Move to Meaningful Internet Systems: OTM 2019 Conferences. Cham: Springer International Publishing, 2019, pp. 714–730. ISBN: 978-3-030-33246-4.
34. Alfredo J Perez, Sherali Zeadally, and Jonathan Cochran. “A review and an empirical analysis of privacy policy and notices for consumer Internet of things”. In: Security and Privacy 1.3 (2018), p. 15.
35. Paulina Jo Pesch. “Drivers and Obstacles for the Adoption of Consent Management Solutions by Ad-Tech Providers”. In: 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). IEEE. 2021, pp. 269–277.
36. Christian Reuter, Luigi Lo Iacono, and Alexander Benlian. “A quarter century of usable security and privacy research: transparency, tailorability, and the road ahead”. In: Behaviour & Information Technology 41.10 (2022), pp. 1–14. https://doi.org/10.1080/0144929X.2022.2080908.
37. Arianna Rossi and Monica Palmirani. “Can Visual Design Provide Legal Transparency? The Challenges for Successful Implementation of Icons for Data Protection”. In: Design Issues 36.3 (June 2020), pp. 82–96. ISSN: 0747-9360. https://doi.org/10.1162/desi_a_00605.
38. Florian Schaub, Rebecca Balebako, and Lorrie Faith Cranor. “Designing effective privacy notices and controls”. In: IEEE Internet Computing 21.3 (2017), pp. 70–77.
39. Florian Schaub, Rebecca Balebako, Adam L. Durity, and Lorrie Faith Cranor. “A Design Space for Effective Privacy Notices”. In: Eleventh Symposium On Usable Privacy and Security (SOUPS 2015). Ottawa: USENIX Association, July 2015, pp. 1–17. ISBN: 978-1-931971-24-9.
40. Dayana Spagnuelo, Cesare Bartolini, and Gabriele Lenzini. “Metrics for Transparency”. In: Data Privacy Management and Security Assurance. Ed. by Giovanni Livraga, Vicenç Torra, Alessandro Aldini, Fabio Martinelli, and Neeraj Suri. Cham: Springer International Publishing, 2016, pp. 3–18. ISBN: 978-3-319-47072-6.
41. Janice Tsai, Serge Egelman, Lorrie Cranor, and Alessandro Acquisti. “The impact of privacy indicators on search engine browsing patterns”. In: Proceedings of the 5th Symposium on Usable Privacy and Security. 2009.
42. Max-R. Ulbricht and Frank Pallas. “YaPPL – A Lightweight Privacy Preference Language for Legally Sufficient and Automated Consent Provision in IoT Scenarios”. en. In: Data Privacy Management, Cryptocurrencies and Blockchain Technology. Ed. by Joaquin Garcia-Alfaro, Jordi Herrera-Joancomartí, Giovanni Livraga, and Ruben Rios. Lecture Notes in Computer Science. Springer International Publishing, 2018, pp. 329–344. ISBN: 978-3-030-00305-0.
43. United Nations. Convention on the Rights of Persons with Disabilities. 2006.
44. Maximiliane Windl, Anna-Marie Ortloff, Niels Henze, and Valentin Schwind. “Privacy at a Glance: A Process to Learn Modular Privacy Icons During Web Browsing”. In: ACM SIGIR Conference on Human Information Interaction and Retrieval. 2022, pp. 102–112.
45. Christian Zimmermann. “A Categorization of Transparency-Enhancing Technologies”. In: CoRR abs/1507.04914 (2015). arXiv: 1507.04914.
Chapter 8
Processing of Data Relating to Criminal Convictions and Offenses in the Context of Labor Relations in Spain
Beatriz Aguinaga Glaría
8.1 Introduction
Selection and recruitment processes and management of employment relationships require the processing of diverse categories of personal data; this includes ordinary categories, such as identification and contact data; special categories, such as gender or health data; and even sometimes data relating to criminal convictions and offenses. In general, the processing of personal data by employers cannot be the result of an improvised action; instead, it requires a prior exercise of thought to ensure compliance with the Principles of Regulation (EU) 2016/679 of the European Parliament and of the Council of April 27, 2016, on the protection of individuals with regard to the processing of personal data and the free movement of such data and repealing Directive 95/46/EC (from now on, “GDPR”). In Spain, the reason supporting such a statement is that, according to settled case law, the execution of an employment contract does not imply the deprivation for workers of the rights recognized in the Constitution.1 Therefore, the employment relationship linking
1 Among other resolutions: Tribunal Constitucional 94/1984, October 16, Rapporteur: Mr. Antonio Truyol Serra, ECLI:ES:TC:1984:94, Legal Ground No. 3; Tribunal Constitucional 88/1985, July 19, Rapporteur: Mr. Ángel Escudero del Corral, ECLI:ES:TC:1985:88, Legal Ground No. 2; Tribunal Constitucional 98/2000, April 10, Rapporteur: Mr. Fernando Garrido Falla, ECLI:ES:TC:2000:98, Legal Ground No. 6; Tribunal Constitucional 89/2018, September 6, Rapporteur: Mr. Santiago Martínez-Vares García, ECLI:ES:TC:2018:89, Legal Ground No. 3; Tribunal Constitucional 146/2019, November 25, Rapporteur: Mrs. Encarnación Roca Trías, ECLI:ES:TC:2019:146, Legal Ground No. 4.
B. Aguinaga Glaría, Public University of Navarre, Pamplona, Navarra, Spain; e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. S. Schiffner et al. (eds.), Privacy Symposium 2023, https://doi.org/10.1007/978-3-031-44939-0_8
employers and employees cannot justify a violation of the fundamental right to the protection of the latter’s personal data. Despite this, the Spanish Data Protection Authority has received numerous complaints for alleged illegitimate processing of personal data in the employment context.2 Furthermore, employment courts have also ruled in Spain, not infrequently, on the unlawful processing of personal data in the context of employment relationships. Due to recent contentiousness, this paper examines the processing by employers of data relating to criminal convictions and offenses of candidates and employees. In the first half of 2022, the Spanish Data Protection Authority3 and the Spanish Supreme Court4 ruled on this matter, saying that such processing will only be legitimate if a law specifically entitles employers to ask for criminal records. This scenario contrasts with what occurs in some Member States, where employers seem to have other means to legitimize the processing of such data as per their national data protection laws. The origin of this dissimilarity between the Member States lies in Articles 10 and 88 GDPR. On the one hand, Article 10 GDPR, with regard to the processing of personal data relating to criminal convictions and offenses, refers to the legislation of the Member States, stating that such processing may only be carried out if permitted by Union or Member State law providing adequate safeguards for the rights and freedoms of data subjects. On the other hand, Article 88 GDPR, in line with Recital 155 GDPR, also refers to national legislation by empowering the Member States to create rules ensuring the protection of workers’ data in the employment context. In other words, the GDPR refrains from regulating data processing in employment relationships, leaving such regulatory work to national legislators. 
Therefore, despite the attempt to obtain a uniform set of rules on the protection of personal data in Europe, in the case of the processing of data relating to criminal convictions and offenses and the processing of personal data in the field of employment relations, the European legislator chooses to delegate the regulation of such processing to national legislators. Thus, the consequence of the above is the coexistence of different legal regimes in Europe applicable to the processing of personal data related to criminal convictions and offenses in the context of employment. This phenomenon is the reason why, especially in global business groups or corporations with standard policies, conflicts arise around the processing activity studied herein. Precisely for this reason, and to provide clarity to international groups operating in Spain on the lawfulness of this processing activity, this paper studies the regulation applicable to the processing by employers of criminal convictions and
2 Agencia Española de Protección de Datos. Browser, https://www.aepd.es/es/informes-yresoluciones/resoluciones?f%5B0%5D=conceptos_resoluciones%3A3285, last visited on September 8, 2022.
3 Agencia Española de Protección de Datos PS/00267/2020, February 11, 2022.
4 Tribunal Supremo 435/2022, May 12, Rapporteur: Mrs. Mª Luz García Paredes, ECLI:ES:TS:2022:1860.
offenses data. To this end, firstly, this paper will determine the general legal regime applicable to the processing of personal data in the employment context. Thus, given the lack of development of Article 88 GDPR in Spain, this paper will also explain the general rules governing the processing of personal data. Secondly, the paper will address the legal regime applicable to the processing of data foreseen in Article 10 GDPR, drawing parallels with the one applicable to the processing of special categories of data, due to its similarity and shared purpose of protecting the right to dignity, personality, and non-discrimination of the individual. Finally, this research will highlight the wisdom of the Spanish legislator in only contemplating compliance with a legal obligation as an exception to the prohibition contained in Article 10 GDPR, by comparing it with other European regimes.
8.2 The Legal Regime Applicable in Spain to the Processing of Personal Data in the Field of Employment
The strictly personal nature of the employment relationship and its continuation over time generates the need for employers to process the personal data of employees and potential employees [1]. Aware of this reality and because of the existence of national particularities in the field of employment contracts [2], the European legislator introduced Article 88 GDPR, which empowered national legislators to create rules for the protection of candidates’ and employees’ data and the preservation of their legitimate interests and fundamental rights adapted to national regulations in the field of “recruitment, the performance of the employment contract, including discharge of obligations laid down by law or by collective agreements, management, planning and organization of work, equality and diversity in the workplace, health and safety at work, protection of employer’s or customer’s property and for the exercise and enjoyment, on an individual or collective basis, of rights and benefits related to employment, and for the termination of the employment relationship.” However, in Spain, Organic Law 3/2018, of December 5, on Personal Data Protection and guarantee of digital rights (from now on, “LOPDGDD,” as commonly referred to in Spain) does not develop Article 88 GDPR; instead, it regulates the so-called digital rights of workers to protect their privacy [3]. Thus, due to the lack of specific national regulation applicable in Spain to the processing of personal data in the employment context, employers must follow the general provisions contained in Chapter II of the GDPR, which include the principles relating to the processing of personal data, the lawfulness of processing, as well as the particularities applicable to special categories of data and data relating to criminal convictions and offenses.
Based on the foregoing, it is worth taking a closer look at the principles set out in Article 5 GDPR because, apart from serving as a normative and interpretative guideline for the GDPR, they specify the obligations to be followed by legal operators when processing personal data, setting out the rules that determine how personal data must be collected and processed [4].
In this way, starting with Article 5.1.a) GDPR and Recital 40 GDPR, processing of personal data must be subject to a lawful basis that legitimates it. This so-called principle of lawfulness implies the need to support any processing of personal data on a lawful basis listed in Article 6 GDPR, to be determined according to the purpose of the processing. Focusing on employment relations, and in order not to exceed the scope of this paper, the potential lawful bases applicable to the processing carried out by employers would be reduced to the following four: performance of a contract, compliance with a legal obligation, legitimate interest, and, to a minimal extent, consent [5]. It is worth explaining that, although it is not uncommon in the daily practice of companies to base their data processing on consent, it is advisable to avoid such a lawful basis in the field of employment because, in such a context, workers or job seekers are in a position of imbalance of power and even vulnerability toward employers.5 Consequently, almost all consents could be regarded as invalid because they lack the freedom required by Articles 4.11 and 7 GDPR. Thus, since employment relationships hinge on the execution of an employment contract with a natural person, such execution constitutes the most appropriate lawful basis for the processing of personal data necessary for the performance, maintenance, and fulfillment of the employment relationship [6]. In other words, there is a clear need to process personal data in the recruitment phase to enable subsequent hiring and to guarantee the fulfillment of the rights and obligations of the parties resulting from the employment contract. Furthermore, in the field of employment, numerous legal and regulatory provisions are applicable to the employer, which, due to the personal nature of the employment relationship, will involve the processing of personal data.
For this reason, compliance with legal obligations imposed on the employer is another everyday basis for legitimizing processing operations arising from obligations such as social security contributions, prevention of occupational risks, or preservation of equality at work. It is also worth noting legitimate interest as an adequate lawful basis for employers’ processing of personal data. Employers usually resort to this lawful basis when the processing, despite not being required by law or necessary for the performance of the employment contract, is needed for a certain legitimate purpose and complies with the principles of proportionality and minimization [7]. However, this basis of legitimacy requires analyzing whether the purpose or utility pursued by the employer respects all relevant rules and does not oppose prevailing social values (i.e., that it is legitimate) [8], whether there is a way to achieve that purpose without processing the employee’s data, and whether the purpose or utility pursued by the employer, which justifies the processing, prevails over the rights or freedoms
5 Court of Justice of the European Union, Joint Cases C-397/01 to C-403/01, October 5, 2004, Rapporteur: Mr. R. Schintgen, ECLI:EU:C:2004:584, Paragraph No. 82.
of the employee. In the employment context, there are many scenarios in which the employer’s legitimate interest will prevail, as per GDPR. For instance, Recital 47 GDPR identifies legitimate interest as the lawful basis for processing needed to prevent fraud. Also, Recital 48 considers legitimate interest the lawful basis for the transmission of employee data between group entities for internal administrative purposes; and Recital 49 justifies the use of this basis of legitimacy for processing operations strictly necessary and proportionate to ensure the security of the network and the security of information. Back to the principles of processing, Article 5.1.a) GDPR also proclaims the principle of transparency. In the employment context and according to Article 12 GDPR and Recital 39 GDPR, this principle requires employers to provide data subjects with information about the data processing occurring in the employment context. In addition, this information must be given efficiently, succinctly, and in clear and plain language to ensure understanding [9]. That is, through the information provided by the employer, the data subject (in this case, a job candidate or employee) should understand what personal data are to be processed, why, how, by whom, for how long, and what rights they are entitled to [10]. In addition, Article 5.1.a) GDPR also contains the fairness principle, which establishes a rule that prevents an unjustified or negative impact on data subjects through data processing. Moreover, Article 5.1.b) GDPR includes purpose limitation in the list of principles to be respected in the processing of personal data. This principle, applied to the matter at hand, obliges the employer to collect data subjects’ personal data for specified, explicit, and legitimate purposes, i.e., purposes that are not prohibited [11], and prohibits processing such data against the stated purposes.
Furthermore, according to Article 6.4 GDPR, a case-by-case analysis needs to be conducted to determine if the processing is against its initial purpose. Such analysis shall consider elements like the relationship between purposes, the data collection context, the nature of the data, the consequences for the data subjects, and the existence of adequate safeguards. The processing of personal data must also respect the minimization rule set out in Article 5.1.c) GDPR, which seeks to ensure that only personal data whose processing is strictly necessary for achieving the intended purpose are processed. In the employment context, this necessarily entails a preliminary analysis by the employer to determine the purpose sought, the personal data that need to be processed to achieve the intended purpose, the length of time the processing needs to continue, and who needs to have access to the data. In addition, this principle also requires a periodic review of the processing to check that the personal data are still relevant and adequate for the intended purpose and to delete any data that are no longer needed. The principle mentioned above must necessarily be related to the principle of limitation of the storage period of personal data, set out in Article 5.1.e) GDPR, and to the principle of confidentiality, set out in Article 5.1.f) GDPR. This relation is due to the fact that the former seeks to stop the processing of data once the initial collection purpose has ceased, while the latter seeks to ensure that only authorized persons who need to know the personal data to achieve the collection purpose can access them.
Article 5.1.d) GDPR, which sets out the principle of data accuracy, is also to be considered. This principle requires the data controller to adopt a proactive attitude toward achieving a database that is as accurate and up-to-date as possible. Likewise, the principle of integrity seeks to protect this data accuracy by providing data with the correct custody and measures to prevent its loss or damage. Therefore, employers must maintain the data’s quality and security. Finally, Article 5.2 GDPR closes the list of principles with the principle of accountability, the most representative principle within European data protection law. This principle seeks a proactive attitude of the controller through the implementation of technical and organizational measures adopted based on “the nature, scope, context, and purposes of the processing as well as the risks of varying likelihood and severity for the rights and freedoms of natural persons” and seeks not only compliance with the law but also its evidence. In short, in all processing of personal data, employers, as data controllers, must meticulously observe the principles set out above at the design stage of the processing, throughout the processing, and during its termination and shall generate sufficient evidence to demonstrate compliance with such principles. Additionally, due to its relevance, it should be noted that Chapter II introduces two prohibitions: the processing of special categories of data (i.e., data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data intended to identify a natural person uniquely, data concerning health, or data concerning the sex life or sexual orientation of a natural person) and the processing of data concerning criminal convictions and offenses. 
Both prohibitions, set out in Articles 9.1 GDPR and 10 GDPR, respectively, rely on the fact that they refer to individuals’ qualities relating to human dignity, which affect their personality and whose disclosure is likely to entail significant risks to the fundamental rights and freedoms of the individual to whom they refer [12]. Despite the initial prohibition, which includes both direct and indirect collection of such data [13], both articles include a set of exceptions that constitute specific lawful bases for processing the data categories referred to therein. Thus, on the one hand, Article 9.2.b) GDPR lifts the prohibition contained in Article 9.1 GDPR by accepting the processing of special categories of data in the field of employment where there is an obligation or a right which, to be effective, requires such processing. On the other hand, Article 10 GDPR lifts the initial prohibition of processing when conducted under the supervision of public authorities or when required by European Union or Member State legislation.
8.3 The Effects of the Special Nature of Data Concerning Criminal Convictions and Offenses with Respect to Its Processing by Employers in Spain
As indicated in the previous section, the processing of data relating to criminal convictions and offenses is governed by Article 10 GDPR. This Article, without
prejudice to the general authorization to process such data under the supervision of public authorities, refers to the legislation of Member States by stating that such processing may only be carried out “where authorized by EU or Member State law providing appropriate safeguards for the rights and freedoms of data subjects.” In Spain, Article 10 LOPDGDD, almost replicating Article 10 GDPR, states that data relating to criminal convictions and offenses may only be processed when such processing “is covered by a European Union law provision, by this organic law or other legal provisions.” This provision applies unless the purpose for the processing is the prevention, investigation, detection, or prosecution of criminal offenses, the enforcement of criminal penalties, or the collection of client information by lawyers and solicitors to exercise their functions. It is worth highlighting that in Spain the data protection legislation restricts the processing of data relating to criminal convictions and offenses and only allows it in cases where such processing is required by the legislation, with the aim to respect the inherent confidentiality of such data and to avoid violations of the rights and freedoms of data subjects. The confidentiality inherent to criminal convictions and offenses as recognized in the Spanish legal system has its rationale in the desire to reduce, as far as possible, the adverse effects linked to the knowledge of such information, due to the stigma it causes in individuals and the considerable reduction in the possibilities of reintegration inherent in its publicity [14]. 
Thus, as a guarantee of the right to equality and non-discrimination proclaimed in Article 14 of the Spanish Constitution, Article 136.4 of the Criminal Code states that certificates relating to criminal records shall be issued “with the limitations and guarantees provided for in specific rules and in the cases established by law,” and Article 73 of the General Penitentiary Organic Law 1/1979, of 26 September, prohibits social and legal discrimination on the grounds of criminal records. Hence, considering the rules mentioned above, it is not surprising that Articles 5 and 17 of Royal Decree 95/2009, of 6 February, which regulates the system of administrative records to support the Administration of Justice, when configuring the Central Register of Convicted Persons, foresee the records of criminal convictions and offenses as not public and only accessible by judicial bodies, the public prosecutor’s office, and the interested parties themselves. In short, Article 10 LOPDGDD, due to the restricted and confidential nature of data relating to criminal convictions and offenses, only allows for the processing of such data if a European Union provision or a national law entitles the controller to process such data. Therefore, what the Spanish legislation does de facto is to attribute to the processing of this category of personal data a regime of protection substantially equivalent to the one applicable to special categories of data. That is, it establishes a general prohibition and provides for exceptional cases in which that prohibition is lifted [15]. In Ruling 14/2020, the Audiencia Nacional goes further and, as required by Article 10.2 of the Spanish Constitution, understands Article 10 GDPR and Article 10 LOPDGDD in line with Article 6 of the Convention for the Protection of Individuals regarding Automatic Processing of Personal Data, adopted in Strasbourg
on January 28, 1981 (from now on, “108 Convention”), stating that “both Article 10 of the European regulation and the corresponding article of the domestic regulation consider this type of data to be particularly sensitive and therefore the processing thereof shall be subject to a specific discipline.”6 In other words, it expressly recognizes the special nature of information relating to criminal convictions and offenses and requires the adoption of additional safeguards for processing such data. This is the reason why the Spanish legislator has opted to allow its processing only in the event of the existence of a law that requires it. So, following the Spanish courts’ interpretation of Article 10 GDPR, although the EU legislator, unlike in former Directive 95/46/EC, opted not to expressly label criminal convictions and offenses as sensitive data, the extra layer of protection contained in Article 10 GDPR, which mirrors the former regulation, does de facto treat criminal convictions and offenses as sensitive data, all in line with Article 6 of the 108 Convention. In the field of employment, referring to regulations for legitimizing the processing of data relating to criminal convictions and offenses means that employers must consider all the regulations applicable to each profession in order to assess, at the time of the conception of the processing, whether they require the processing of data envisaged in Article 10 GDPR. To this end, as previously said, the law will have to expressly require the absence of a criminal record to execute the employment contract and will have to impose on the employer the obligation to verify this circumstance. Without wishing to be exhaustive, employers are authorized to process such personal data when the employment in question involves dealing with minors, or if the individuals are managers, employees, or agents of companies bound by Law 10/2010, of 28 April, on the prevention of money laundering and the financing of terrorism.
However, employers are not allowed to process such data even in cases in which a specific job requires for its performance an official qualification, membership, or license issued by a public body which, to be obtained, requires not having a criminal record in force. In these cases, under the minimization principle, the employer must only request the authorizations or licenses issued by the public bodies, since those documents are the only documents needed by the employer to satisfy the purpose sought, which is the execution of the employment contract.7 Furthermore, the Social Section of the Audiencia Nacional in Ruling 14/20208 adds that, in relation to this type of processing, the sole request for any reference
6 Audiencia Nacional 14/2020, February 10, Rapporteur: Mr. R. Gallo, ECLI:ES:AN:2020:14, Legal Ground No. 3.
7 Tribunal Supremo 1860/2021, May 12, Rapporteur: Mr. M.L. García, ECLI:ES:TS:2022:2860, Legal Ground No. 4.
8 Audiencia Nacional 14/2020 STS 14/2020, February 10, Rapporteur: Mr. R. Gallo, ECLI:ES:AN:2020:14, Legal Ground No. 3.
thereto constitutes an act of processing. In other words, the mere indication of the existence or absence of a criminal record, either by a certificate of a criminal record or by a self-declaration indicating the lack of such records, must be considered an act of processing subject to the GDPR since, indissolubly, such certificate or declaration is related to the existence of criminal convictions and, therefore, has the status of personal data as per Article 4.1 GDPR.9 This broad interpretation of the concept of data relating to criminal convictions and offenses is in line with the case law of the European Court of Justice on special categories of data, which has held that the effectiveness of the restrictive regime for the processing of special categories of data requires a broad interpretation of the concept, so that any information that directly or indirectly reveals this category of data must also fall within the concept of special data.10 That said, considering that Article 10 GDPR foresees, in essence, a special category of personal data and, therefore, calls for a restrictive processing regime, data relating to criminal convictions and offenses shall be understood to exist not only when a criminal record certificate (positive or negative) is processed but also when any information directly or indirectly reveals this category of personal data. Thus, regardless of how information relating to criminal convictions and offenses of employees is collected, if employers in Spain do not have an express legal provision that supports such processing, it will be considered unlawful and, therefore, detrimental to the right to data protection of employees and to their dignity.
At this point, it is worth mentioning that, in the labor context and in the absence of a rule expressly stating the incompatibility of the job with a criminal record or requiring the employer to verify that employees have no criminal convictions, the mere act of taking such personal data into consideration would be discriminatory and prohibited, based on Articles 4.2.c) and 17 of Royal Legislative Decree 2/2015, of October 23, approving the revised text of the Workers’ Statute Law [16]. These provisions, among others, prohibit discrimination based on social status and, in essence, like the enhanced regime of protection of data relating to criminal convictions and offenses, are based on the right to non-discrimination proclaimed in Article 14 of the Spanish Constitution, which covers any difference in treatment that lacks an objective, reasonable, and proportional justification and which entails social exclusion [17].
9 Spanish Data Protection Authority, Resolution PS/00267/2020, February 11, 2022, Legal Ground No. 1.
10 Court of Justice of the European Union C-184/20, August 1, 2022, Rapporteur: Mr. Ilešič, ECLI:EU:C:2022:601, Paragraph No. 124-128.
8.4 The Processing of Personal Data Relating to Criminal Convictions and Offenses by Employers in Other Member States
The restrictive Spanish approach toward processing criminal convictions and offenses data, also shared by countries like Italy,11 France,12 or Finland,13 differs from the approach taken in countries like Austria or Denmark, among others. In these countries, the data protection law, despite the potentially harmful or even discriminatory effect associated with the processing of data relating to criminal convictions and offenses,14 provides for lifting the prohibition contained in Article 10 GDPR through legitimization mechanisms such as the legitimate interest of the data controller or even the data subject’s consent. As indicated, Austria is one of the countries where the local data protection legislation foresees other lawful bases for processing in addition to legal obligation. In this country, Article 4.3.2 of the Federal Data Protection Act15 recognizes both legal obligation and legitimate interest as valid grounds for the processing of data relating to criminal convictions and offenses, the latter provided that the processing is carried out as per Article 6.1.f) GDPR and safeguards the rights of the data subject. However, as already mentioned above, legitimate interest requires assessing the processing’s necessity, proportionality, and subsidiarity to be able to justify, in the context of employment, the prevalence of the purpose of the processing over the rights or freedoms of the employee. Therefore, consideration should be given to whether there are cases in which, in the absence of “an explicit legal authorization, obligation, or the legal duty of diligence of the controller,” the legitimate interest of the employer could justify the processing of data relating to criminal convictions and offenses of employees.
In other words, considering the factual categorization of such data, it is hard to imagine a scenario in which, in the absence of a rule requiring the employer to ensure the absence of a criminal record of his employees, a diligent weighing of the factors would lead to a favorable outcome for such processing. This is without prejudice to the principle of accountability, according to which the employer should only process the data strictly necessary for recruitment and always give preference to the interests or fundamental rights and freedoms of the data subject, especially
11 Italy. Section 2-g of the Personal Data Protection Code (https://www.gpdp.it/codice).
12 France. Article 46 of the Loi n° 78-17 du 6 janvier 1978 relative à l’informatique, aux fichiers et aux libertés (https://www.cnil.fr/fr/la-loi-informatique-et-libertes#article46).
13 Finland. Section 7 of the Data Protection Law (https://www.finlex.fi/en/laki/kaannokset/2018/en20181050.pdf).
14 Recital 75 Regulation (EU) 2016/679 of the European Parliament and of the Council of April 27, 2016, on the protection of individuals with regard to the processing of personal data and on the free movement of such data and repealing Directive 95/46/EC.
15 Austria. Article 2 § 4 (3)(2) of the Federal Data Protection Law (https://www.ris.bka.gv.at/Dokumente/Erv/ERV_1999_1_165/ERV_1999_1_165.html).
when the processing could give rise to situations of discrimination, as is the case here. The case of Denmark is even more evident since section 8.3 of the Data Protection Act16 includes the consent of the data subject, in addition to the legitimate interest, as the lawful basis for the processing of such data. However, in the context of employment, this provision leaves legitimate interest as the only valid lawful basis for processing data relating to criminal convictions and offenses. As mentioned above, this is because, in employment relations, there is a lack of freedom of employees or potential employees that practically nullifies any consent. Finally, due to the particularity of the regime, it is worth mentioning the German regulation and approach to processing data concerning criminal convictions and offenses by employers. In this country, the Federal Data Protection Act does not develop Article 10 GDPR. However, the German regulation, contrary to what happens in Spain, does develop Article 88 GDPR and enables employers to process the personal data needed to make a hiring decision and for the management and termination of the employment contract, as well as to exercise or fulfill the rights and obligations of employee representation established by law or by collective bargaining agreements or other agreements between the employer and the personnel council.17 According to the Landesbeauftragte für den Datenschutz und die Informationsfreiheit Baden-Württemberg (State Commissioner for Data Protection and Freedom of Information Baden-Württemberg), German law allows processing data relating to employees’ criminal convictions or offenses as long as such data are objectively related to the employment in question. The State Commissioner adds that this relation could be either because of the crimes committed or the legal interests involved [18].
However, this interpretation deviates from the special categorization of data derived from Article 10 GDPR and here supported, especially considering that according to Article 26.3 of the Federal Law, special categories of data should only be processed when necessary to fulfill legal obligations arising from labor and social security law and provided that there is no reason to believe that the data subject has a legitimate interest in not having such processing. Put simply, taking into consideration the factual special nature of personal data foreseen in Article 10 GDPR and, in the absence of an express provision in the German regulation about the processing of data relating to criminal convictions and offenses, it does not seem that the interpretation suggested in the guide mentioned above is respectful to the principles set out in Article 5 GDPR, nor to the protectionist spirit contained in the GDPR. So, following the rationale of the Spanish courts, if personal data referred to in Article 10 GDPR is to be considered of a special category, the use of any other
16 Denmark. Article 8 (3) of the Data Protection Law (https://www.datatilsynet.dk/media/7753/danish-data-protection-act.pdf).
17 Germany. Article 26 of the Federal Data Protection Law (https://www.gesetze-im-internet.de/englisch_bdsg/englisch_bdsg.html).
B. Aguinaga Glaría
mechanism different from a law explicitly supporting such processing does not seem adequate, especially when the controller is an employer. All this leads us to conclude that the most appropriate lawful basis for processing data relating to criminal convictions and offenses in the employment context is a regulation that expressly supports such processing. Otherwise, the configuration of the GDPR, which requires the prevalence of the rights of data subjects and the restriction of discriminatory processing, could, in most cases, render such processing activities unlawful.
8.5 Conclusions
In the context of employment relationships, employers need to process the personal data of employees and potential employees for the performance of the employment contract and the fulfillment of the legal obligations imposed on employers. Since the employment contract does not deprive employees of the fundamental rights recognized to them as individuals, the employers’ processing of personal data must comply with all the applicable regulatory provisions on personal data.
Article 88 GDPR calls on national legislators to develop the rules governing processing operations carried out for employment purposes, so as to accommodate the specific processing needs that may arise in each Member State in that context. However, the Spanish legislator has not developed this provision, leaving a significant gap in Spanish data protection legislation that restricts the capacity of employers to conduct data processing activities. This contrasts with the situation in other countries, such as Germany, where the legislator has developed Article 88 GDPR and provided a specific legal regime for data processing in the employment context adjusted to the needs of employers.
Special categories of personal data are among the personal data that an employer may process to enable and manage the conclusion of employment contracts and the fulfillment of legal obligations. For this reason, the GDPR enables their processing for the fulfillment of obligations and the exercise of specific rights in the field of labor law and social security and protection. However, these types of personal data are subject to a more restrictive protection regime because of the potentially harmful and discriminatory effect associated with their processing.
This effect is also apparent in data relating to criminal convictions and offenses, which is why Article 10 GDPR also provides for a special processing regime, although it must be articulated by the data protection regulations of each Member State. The Spanish legislator developed Article 10 GDPR so that such data may only be processed when the law specifically provides for it. The protection of this category of personal data is aligned with the confidentiality inherent to such data and with the consideration given to them under Convention 108, which serves as an interpretative guideline for the GDPR and labels data relating to criminal convictions and offenses as special data. This rationale is
what has made Spanish courts consider that even information that may indirectly help to infer criminal convictions or offenses also falls under this category. The enhanced protection given to data under Article 10 GDPR rests on the stigma associated with them and their capacity to cause violations of the rights and freedoms of the individuals they refer to. Restricting the processing of data relating to criminal convictions and offenses also operates in Spain as an instrument to guarantee the right to non-discrimination in employment on the grounds of criminal records, as per Article 73 of the General Penitentiary Organic Law, and to guarantee the employee’s right to non-discrimination, as stated in Article 4.2.c) of the Worker’s Statute of Employment.
The restrictive Spanish regime contrasts with the situation in other Member States, where the prohibition contained in Article 10 GDPR can be lifted by lawful bases other than compliance with a legal obligation. Nevertheless, in the field of employment, compliance with a legal obligation seems to be the only appropriate lawful basis for the processing of the data covered by Article 10 GDPR, given the special nature of data relating to criminal convictions and offenses and their potential discriminatory effect, all in line with the principles of Article 5 GDPR and the exception contained in Article 9.2.b) GDPR. The rationale behind this statement is that it is not easy to imagine scenarios where a legal basis other than legal obligation could justify an employer’s processing of such data. In other words, in the field of employment, considering the stigma and discriminatory effect inherent to data relating to criminal convictions and offenses, an employer is unlikely to be able to rely on the employee’s freely given consent for this processing or, in the absence of a legal provision requiring such processing, to show that its legitimate interest prevails over the data subject’s right to data protection.
In short, unless a legal duty of care or obligation requires the processing of data relating to criminal convictions and offenses, the structural inequality inherent to employment relationships and the factual special nature of data relating to criminal convictions and offenses make it difficult for the employer to justify the lawfulness of a processing operation which is potentially harmful to the interests, rights, and fundamental freedoms of data subjects.
References
1. J. Baz Rodríguez, Privacidad y protección de datos de los trabajadores en el entorno digital, Valencia, Spain, Ed. Wolters Kluwer, 2019, p. 78.
2. J. Baz Rodríguez, Privacidad y protección de datos de los trabajadores en el entorno digital, Valencia, Spain, Ed. Wolters Kluwer, 2019, p. 38.
3. A. Del Val Tena, “La protección de los datos personales en los procesos de selección de los trabajadores; en particular, aquellos datos especialmente sensibles,” Documentación Laboral, no. 119, vol. I, p. 102, 2020.
4. A. Palma Ortigosa, “Principios relativos al tratamiento de datos personales,” in Protección de datos, responsabilidad activa y técnicas de garantía, J.P. Murga Fernández (dir.); M.A. Fernández Scagliusi (dir.); M. Espejo Lerdo de Tejada (dir.); S. Lorenzo Cabrera (coord.); A. Palma Ortigosa (coord.), Madrid, Spain, Ed. Reus, 2018, p. 40.
5. Article 29 Working Party, Opinion 2/2017 on data processing at work, Brussels, Belgium, 2017, p. 6.
6. J.L. Goñi Sein, La nueva regulación europea y española de protección de datos y su aplicación al ámbito de la empresa, Albacete, Spain, Ed. Bomarzo, 2018, p. 89.
7. Article 29 Working Party, Opinion 2/2017 on data processing at work, Brussels, Belgium, 2017, p. 25.
8. E. Gil González, El interés legítimo en el tratamiento de datos personales, Madrid, Spain, Ed. Wolters Kluwer Legal & Regulatory, 2022, p. 128.
9. Article 29 Working Party, Guidelines on transparency under Regulation (EU) 2016/679, Belgium, 2018, pp. 7–10.
10. A. Palma Ortigosa, “Principios relativos al tratamiento de datos personales,” in Protección de datos, responsabilidad activa y técnicas de garantía, J.P. Murga Fernández (dir.); M.A. Fernández Scagliusi (dir.); M. Espejo Lerdo de Tejada (dir.); S. Lorenzo Cabrera (coord.); A. Palma Ortigosa (coord.), Madrid, Spain, Ed. Reus, 2018, p. 43.
11. A. Palma Ortigosa, “Principios relativos al tratamiento de datos personales,” in Protección de datos, responsabilidad activa y técnicas de garantía, J.P. Murga Fernández (dir.); M.A. Fernández Scagliusi (dir.); M. Espejo Lerdo de Tejada (dir.); S. Lorenzo Cabrera (coord.); A. Palma Ortigosa (coord.), Madrid, Spain, Ed. Reus, 2018, p. 44.
12. E. Sierra Hernáiz, Las categorías especiales de datos del trabajador. Estudio de los límites y garantías legales para su tratamiento en la relación laboral, Pamplona, Spain, Ed. Aranzadi, 2021, p. 76.
13. J. Cruz Villalón, Protección de datos personales del trabajador en el proceso de contratación: facultades y límites a la actuación del empleador, Albacete, Spain, Ed. Bomarzo, 2019, p. 31.
14. E. Larrauri, “Antecedentes penales,” Eunomía: Revista en Cultura de la Legalidad, no. 8, p. 154, 2015.
15. J. Cruz Villalón, Protección de datos personales del trabajador en el proceso de contratación: facultades y límites a la actuación del empleador, Albacete, Spain, Ed. Bomarzo, 2019, p. 38.
16. J. Cruz Villalón, Protección de datos personales del trabajador en el proceso de contratación: facultades y límites a la actuación del empleador, Albacete, Spain, Ed. Bomarzo, 2019, p. 32.
17. E. Sierra Hernáiz, Las categorías especiales de datos del trabajador. Estudio de los límites y garantías legales para su tratamiento en la relación laboral, Pamplona, Spain, Ed. Aranzadi, 2021, pp. 72–73.
18. S. Brink, Arbeitnehmerdatenschutz: Zwischen wirtschaftlicher Abhängigkeit und informationeller Selbstbestimmung, Baden-Württemberg, Germany, Ed. Landesbeauftragte für den Datenschutz und die Informationsfreiheit, 2020, p. 26.
Index
B Big Data analytics (BDA), 1–14 Built environment, 49–63
C Charter of Trust, 69–89 China, 91–118 Criminal convictions, 139–151 Cyberattacks, 69–71, 74, 75 Cybersecurity, 69–78, 82, 88, 99, 100, 105, 109, 117, 118
D Data ethics, 49–63 Data Governance Act (DGA), 19, 21, 27–28, 30, 34, 37, 39–41, 45 Data protection, xiii, xiv, xv, 1–14, 18, 20, 27, 29, 34, 38, 39, 43, 45, 49, 50, 54–62, 69–89, 91–96, 98–101, 103, 104, 107, 109, 111, 113–122, 131, 140, 145, 147–151 Data protection impact assessment (DPIA), 1–6, 8–14, 28, 43 Data re-use, 33–46, 59 Data Subject Rights (DSR), xiv, 17–30, 99–105, 108–110, 115 Data Subject Rights as a Service (DSRaas), 17–30 Delphi study, 1–14
E Employees, 57, 72, 83, 84, 140–143, 146–151 European Health Data Space (EHDS) proposal, 34, 37, 39–42
F Freedom, 12, 69, 71, 78, 95, 98, 100, 112, 125, 140, 142, 144, 145, 148, 149, 151 Fundamental rights, 70, 92, 95–98, 105–107, 112, 115, 140, 141, 144, 148, 150
G General Data Protection Regulation (GDPR), xiii, xiv, 1, 2, 12, 17–30, 34, 35, 37–41, 43–45, 49, 73, 76, 78–88, 91–93, 95, 104, 119–121, 124, 126–128, 130, 139–151 Government access, 91–118
I India, 91–118
M Membership inference attacks, 36, 37
N National security, 91–97, 99–106, 108–113, 115–117
P Personal data, xiii, xiv, 2, 4, 6, 12, 17–20, 23, 28, 35, 38–40, 42–46, 57, 73, 77, 78, 81–84, 87–89, 91–121, 125, 126, 128, 131, 134, 139–150 Privacy engineering, 78, 134 Privacy impact assessments (PIAs), 2–4, 10, 12, 14 Privacy interfaces, 119–134 Processing, xiv, 1–3, 7, 9, 10, 14, 17–19, 23, 24, 33–35, 38–45, 54, 55, 77, 78, 80, 83, 86, 89, 91–93, 97, 99, 100, 103, 107, 113–115, 117–122, 124–126, 130–132, 134, 139–151 Property, 39, 42, 49–56, 58, 60–63, 141
R Real estate, 49–63 Repurposing machine learning models, 33–46 Right of access (RoA), 18, 19, 21, 23, 24, 27
S Safeguards, 38, 39, 45, 49, 55, 70, 75, 78, 80, 93, 96, 99, 100, 102–104, 108–111, 113, 116, 118, 122, 125, 140, 143, 145, 146, 148 Schrems II, 91–93, 95–97, 114, 116 Secondary use, 34, 35, 37–42, 45 Siemens, 75, 76
T Transfer learning, 34, 36, 45 Transparency, 3, 7, 8, 10, 50, 55, 63, 77, 81, 88, 89, 93, 104, 105, 119–134, 143
U Usable privacy, 119, 134