Data and the Built Environment: A Practical Guide to Building a Better World Using Data (Digital Innovations in Architecture, Engineering and Construction) [2024 ed.] 3031510070, 9783031510076


Table of contents:
Acknowledgements
Contents
About the Authors
List of Figures
List of Tables
1 Purpose and Pedantry
1.1 Why This Book and Why Now?
1.2 Structure of This Book
1.3 Intended Outcomes of This Book
1.4 Building on Established Practice
1.5 Definitions
1.5.1 Physical Things
1.5.2 Data, Information, Knowledge, Insight, Wisdom
1.5.3 Time
1.5.4 Organisations and Friction
1.5.5 Types of Data
1.6 Conclusion
References
2 The Challenge of the Built Environment
2.1 Data and the Built Environment
2.2 What Makes Built Environment Data Different?
2.3 Designing Equity: Planning, Cartography, Geography… and Data
2.4 Measuring Value Through Societal Outcomes
2.4.1 Delivering New Economic Infrastructure to Drive Improved Outcomes for People and Nature
2.4.2 Place-Based Regeneration and Delivery
2.4.3 Addressing the Need for Infrastructure Using a Platform Approach
2.4.4 Retrofitting Existing Buildings to Achieve Net Zero Greenhouse Gas Emissions
2.4.5 Optimising the Environmental Performance of Our Existing Built Environment
2.4.6 A Purpose-Driven Sector
2.5 Building for Extinction: Data and Survival
2.5.1 Lowering the Carbon Impact of Construction
2.5.2 Increasing the Operational Sustainability of the Built Environment
2.5.3 Supporting the Decarbonisation of Transport and Industry
2.6 Safety
2.7 Ownership and Incentives
2.7.1 Public Versus Private
2.7.2 Data As Scorekeeper
2.8 Data Sharing
2.8.1 Sharing Commercial Data
2.8.2 Location Data for Transport Operations
2.9 Interoperability
2.9.1 Delivering Interoperability at the Mega Scale
2.10 Data as a Valued Asset
2.11 Conclusion
References
3 Background Reading
3.1 Relevant Standards
3.1.1 Quality Management (ISO 9001 and BS 99001)
3.1.2 Design and Construction (BS 1192, ISO 19650)
3.1.3 Operations, Maintenance, and Asset Management (BS 8536, PAS 55, ISO 55000)
3.1.4 Heritage and Restoration
3.2 Relevant Publications
3.2.1 Transforming Infrastructure Performance (TIP)—Roadmap 2030 (IPA, 2021)
3.2.2 Government Soft Landings (CDBB, 2019)
3.2.3 Construction Playbook (Cabinet Office, 2020)
3.2.4 Data for the Public Good (National Infrastructure Commission, 2017)
3.2.5 Flourishing Systems (CDBB/CSIC, 2020)
3.2.6 Independent Review of Building Regulations and Fire Safety: Hackitt Review (MHCLG, 2018)
3.3 Relevant Regulation and Legislation
3.3.1 Keeping Staff Safe
3.3.2 Keeping Occupants Safe
3.3.3 Protecting the Natural Environment
3.3.4 Keeping Data Safe
3.3.5 Making Data Transparent
3.3.6 Encouraging Data Best Practice Through Regulation
3.4 Conclusion
References
4 Data and Organisational Culture
4.1 Introduction: Meaning in Data
4.2 Data Culture
4.2.1 What Does It Even Mean to Make Data-Driven Decisions?
4.2.2 Time and Decision-Making
4.2.3 Case Study: Data ‘Friction’ and the Music Industry
4.3 Stakeholder Engagement and Communities of Practice
4.4 Writing a Data Strategy
4.5 Your Data Vision Statement
4.6 Data Principles (and Ethics)
4.7 Data Capabilities
4.8 Use Cases
4.9 Data Outcomes and Benefits
4.9.1 Sectoral Benefits
4.9.2 Organisational Benefits
4.10 Data Roles and Skills
4.11 Conclusion
References
5 Delivering Data Capability
5.1 Data Foundations/Precursors
5.1.1 Data Governance and Data Empowerment
5.1.2 Pragmatic Enterprise Data Architecture
5.1.3 A Practical Philosophy (a Data Dogma)
5.1.4 Ontology (Common Data Models)
5.1.5 How to Build and Use an Ontology for Construction
5.1.6 Starting with an ‘Entry-Level’ Taxonomy or Business Glossary
5.1.7 Search and Classification
5.1.8 Unstructured Data Management
5.2 Delivering Successful Data Projects
5.2.1 Delivering Through Traditional IT Functions
5.2.2 Working with Legacy Practices
5.2.3 Working for Construction Projects
5.2.4 Delivering into Operational IT
5.2.5 Working Through Governance
5.3 Procuring Successful Data Projects
5.3.1 Ensuring Your Scope of Work is Deliverable
5.3.2 Competing and Evaluating
5.3.3 Ways of Working
5.3.4 Managing Ecosystems and Dependencies
5.3.5 Making Effective Use of Terms and Conditions
5.3.6 Procuring Across the Life Cycle of a Service
5.4 Conclusion
References
6 Radical Technologies
6.1 On Complexity
6.2 On Technology
6.3 Digital Construction and Building Information Modelling (BIM)
6.3.1 Definitions
6.3.2 Dimensions and a Data-First Approach
6.3.3 Emerging Technologies: Bridging the Gap Between Digital and Reality
6.3.4 Digital Heritage: Working with old and unique assets
6.3.5 Geospatial Information Systems (GIS)
6.4 Data Analytics and Business Intelligence (BI)
6.4.1 Process
6.4.2 Staff and Stakeholders
6.4.3 Prototyping and Scaling
6.4.4 Infrastructure—Warehouses, Lakes, Mesh
6.4.5 Human Psychology and BI
6.5 Data Science and Artificial Intelligence (AI)
6.5.1 The Sub-four-minute Mile
6.5.2 Definitions
6.6 Emergent Behaviour: Applying the AI Paradigm Shift to the Built Environment
6.6.1 Primer: AI, LLMs, and Software 2.0
6.6.2 To AI or to Automate?
6.6.3 Categorising Our Problems
6.6.4 Specialist Built Environment Models
6.6.5 Predictive Analytics
6.6.6 Data Science on Built Environment Projects
6.7 Information, Everywhere: The Paradigm Shifts of IoT and Cloud Computing
6.7.1 What Is Smart?
6.7.2 The Drawbacks and Ethics of Smart Places
6.8 Digital Rehearsal
6.8.1 Parametric and Generative Design
6.9 Digital Twins
6.9.1 Smart Buildings Versus Digital Twins
6.9.2 Industrial Digital Twins
6.9.3 From Construction to Operations
6.9.4 Architecture
6.9.5 National Digital Twinning
6.9.6 Digital Twinning Across Industrial Sectors
6.10 Conclusion
References
7 How to Be a Data Person
7.1 How to Be a Data Person
7.2 How to Be a Person
7.3 How to Set a Positive Example
7.4 How to Be Conscious of Your Biases
7.5 How to Be Ethical
7.6 How to Be Open Minded and Work with Others
7.7 How to Sell a Narrative
7.8 How to Make Product Love, and Not Be Ruled by It
7.9 How to Take Accountability and How to Lead
7.10 How to Grow Talent
7.11 How to Respect Your Own Time and Your Mind
7.12 How to Learn
7.13 How to Think About Mental Health
7.14 How Not to Take It All Personally
7.15 How to Take Sustainability Seriously
7.16 Epilogue: Of Bytes and Bricks
References

Digital Innovations in Architecture, Engineering and Construction

Ian Gordon · Neil Thompson

Data and the Built Environment: A Practical Guide to Building a Better World Using Data

Digital Innovations in Architecture, Engineering and Construction

Series Editors:
Diogo Ribeiro, Department of Civil Engineering, Polytechnic Institute of Porto, Porto, Portugal
M. Z. Naser, Glenn Department of Civil Engineering, Clemson University, Clemson, SC, USA
Rudi Stouffs, Department of Architecture, National University of Singapore, Singapore, Singapore
Marzia Bolpagni, Northumbria University, Newcastle-upon-Tyne, UK

The Architecture, Engineering and Construction (AEC) industry is experiencing an unprecedented transformation from conventional labor-intensive activities to automation using innovative digital technologies and processes. This new paradigm also requires systemic changes focused on social, economic and sustainability aspects. Within the scope of Industry 4.0, digital technologies are a key factor in interconnecting information between the physical built environment and the digital virtual ecosystem. The most advanced virtual ecosystems allow the built environment to be simulated, enabling real-time data-driven decision-making. This Book Series promotes and expedites the dissemination of recent research, advances, and applications in the field of digital innovations in the AEC industry. Topics of interest include but are not limited to:

– Industrialization: digital fabrication, modularization, cobotics, lean.
– Material innovations: bio-inspired, nano and recycled materials.
– Reality capture: computer vision, photogrammetry, laser scanning, drones.
– Extended reality: augmented, virtual and mixed reality.
– Sustainability and circular building economy.
– Interoperability: building/city information modeling.
– Interactive and adaptive architecture.
– Computational design: data-driven, generative and performance-based design.
– Simulation and analysis: digital twins, virtual cities.
– Data analytics: artificial intelligence, machine/deep learning.
– Health and safety: mobile and wearable devices, QR codes, RFID.
– Big data: GIS, IoT, sensors, cloud computing.
– Smart transactions, cybersecurity, gamification, blockchain.
– Quality and project management, business models, legal perspective.
– Risk and disaster management.


Ian Gordon
Arup, London, UK

Neil Thompson
AtkinsRéalis, London, UK

ISSN 2731-7269   ISSN 2731-7277 (electronic)
Digital Innovations in Architecture, Engineering and Construction
ISBN 978-3-031-51007-6   ISBN 978-3-031-51008-3 (eBook)
https://doi.org/10.1007/978-3-031-51008-3

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2024

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Paper in this product is recyclable.

Acknowledgements

We want to start by thanking the generous people who volunteered their time to help edit and review this book, and in doing so contributed massively to whatever clarity and insight this book musters. If any mistakes or inaccuracies remain, they are our own.

Secondly, there is a thank you to those we cannot mention directly. There are fingerprints all over this book in terms of guidance and memorable career moments. We would like to thank the various communities we have leant on to learn and connect with new knowledge. This includes the carefully crafted posts on LinkedIn by Ian to crowdsource some wisdom to place in this book. For those that interacted, we hope you find your pearls of wisdom sewn into our book.

A special mention from Neil to Ian. Ian is the driving force behind this book; he has poured his soul into these pages, and I have done my best to sprinkle my experience around his prose. Thank you for sharing the journey with me, I am in your debt!

There are a number of individuals who have informed our thinking on data and the built environment, and whose influence is explicit and implicit throughout this book. These include Davin Crowley-Sweet, whose thinking on data capabilities informed and clarified our own; Will Varah, Melissa Zanocco and Mark Enzer, whose work at the IPA and CDBB respectively provoked us to think bigger about data’s role in the built environment; Sarah Hayes, whose writing helped us to see data in a new light; Miranda Sharp and Liam McGee, who helped us to see meaning in data; Jon Jarritt, whose healthy scepticism and wry sense of humour informed our style; and Herman Heynes, who captured the value of data better than anyone.

We would like to thank the following individuals for their extensive help in editing and reviewing this book:

• Dexter Robinson, the first person to make it through the whole manuscript, and a source of invaluable perspective and insight which allowed us to refine our message.
• Daniel Rossiter, the pedant’s pedant, for keeping us consistent.
• Janet Greenwood, for pointing out our copious blind spots.

• Brian Westerman, secret data ninja.
• Dr. Jo White, for being a voice of reason.
• Steven Steer, for his extremely detailed and seer comments.
• The last minute Arup proofreading gang of Samantha Haylen, Lisa Horrocks, Beccie Drake, and Chris Dobbie.
• Jo Lucas, for activating the mycelium.
• Jon Kerbey, for constructive challenge.
• Dr. Chloe Thompson-Booth, for checking our words on psychological issues and encouraging Neil to pull his weight.


About the Authors

Welcome to this journey where data meets the built environment, and innovation intertwines with years of hard-won expertise. In the following pages, you will have the opportunity to delve into the world of advanced analytics, digital integration, and sustainable construction, guided by two of the field’s most ardent professionals—Ian Gordon and Neil Thompson.

Ian Gordon is a leader in Data Advisory and Advanced Analytics at Arup and brings with him a wealth of experience in the realm of the built environment. Ian has applied his extensive knowledge to several critical national infrastructure projects. His commitment to using data for societal benefit is evident in his role at Arup, where he aims to forge better outcomes for society through innovative data techniques. Ian has held data (and data-adjacent) leadership positions at a range of large client organizations. Highlights include stints as Head of Data Architecture and Engineering at National Highways, Project Director for System Integration on Thames Tideway, Interim Head of Performance at Network Rail Southeast Route, and most recently Head of Data at the Houses of Parliament Restoration and Renewal Programme. His areas of expertise include cloud-first data strategy, data architecture, data modeling (including taxonomies and ontologies), data quality, data governance, data valuation, and the application of AI/ML to real-world problems. Ian is also passionate about interoperability and Digital Twins, where he has collaborated with the Centre for Digital Built Britain, the Infrastructure Client Group, Innovate UK, and the Digital Twin Hub.

Neil Thompson is a Fellow of the Institution of Engineering and Technology, a Chartered Construction Manager, and a Member of the Data Management Association. Neil currently serves as the director of Digital Integration & Delivery at AtkinsRéalis, bringing a unique perspective as a Chief Technology Officer in a complex defence portfolio and a member of a major joint venture senior leadership team. Neil is also the chair of the Built Environment for the Institution of Engineering and Technology and an Honorary Associate Professor at the Bartlett School of Sustainable Construction, University College London. His academic background, paired with his practical experience, has positioned him as a leading voice in the industry, contributing significantly to industry reports and policy formation on a global scale. Neil holds a Bachelor of Engineering degree with honours in building services engineering, a Master of Science in Construction Economics and Management from the Bartlett, University College London, and a Certification in Virtual Design and Construction from Stanford, USA.

Together, Ian and Neil provide a comprehensive view of how data, technology, and innovative practices can be harnessed to revolutionize the built environment. Their shared vision and diverse experiences culminate in this book, offering readers insight into the future of infrastructure, technology, and sustainable development. So, buckle up and prepare yourself for a journey through the realms of data, innovation, and the built environment.

List of Figures

Fig. 1.1 Data, information, knowledge, insight pyramid, with added inputs and outputs, and a bit of wisdom sprinkled on top
Fig. 1.2 An illustrated taxonomy of data types
Fig. 1.3 An illustration of the four types of data in practice
Fig. 2.1 A Venn diagram illustrating how some of the different concepts covered in this section overlap
Fig. 2.2 A simple metamodel describing how data and technology deliver outcomes to asset-centric organisations
Fig. 2.3 “Asks” of data to realise environmental, social, and sectoral outcomes
Fig. 3.1 The principal papers and standards informing this chapter
Fig. 3.2 Atkins digital pathway circa 2017
Fig. 3.3 The Digital Pathway–post-workshop output
Fig. 3.4 The Digital Pathway–Business Logic Abstraction
Fig. 3.5 An expanded metamodel of the value delivered to society by data, technology, and physical assets
Fig. 3.6 An illustration of the trade-off between risk mitigation and data sharing
Fig. 4.1 Relationships between a data function and the wider organisation (business)
Fig. 4.2 The goldilocks zone of data strategy
Fig. 4.3 A semi-serious map of decision-making within a built environment organisation
Fig. 4.4 Transactions and latency wave theory
Fig. 4.5 An example four by four data capability matrix
Fig. 4.6 An extension of the ‘data types’ model from Chap. 1, showing how different data types feed into indicative use cases
Fig. 4.7 Indicative data use cases—social, environmental, health safety and quality
Fig. 4.8 Indicative data use cases—project delivery, maintenance, and operations
Fig. 4.9 Indicative data use cases—business support functions
Fig. 4.10 The role data plays in realising positive outcomes for the built environment
Fig. 4.11 Outcomes associated with specifying your data requirements
Fig. 4.12 Outcomes associated with measuring and improving your data quality
Fig. 4.13 Outcomes associated with making your data accessible
Fig. 4.14 Outcomes associated with deriving value from your data
Fig. 4.15 Outcomes associated with ensuring your information security
Fig. 4.16 Indicative data role taxonomy
Fig. 4.17 Mapping of roles to data requirements
Fig. 5.1 AtkinsRéalis—data governance model. Credit to James Trinder and Andrew Cox
Fig. 5.2 Visualisation of an ontology using Protege
Fig. 5.3 Visualisation of an ontology using OntoPop
Fig. 5.4 Delivering successful data projects
Fig. 5.5 Delivering value from data services in the context of an IT department
Fig. 6.1 The economics of digital technology (a rough guide)
Fig. 6.2 Measuring Friction
Fig. 6.3 A presentation slide by the Construction Leadership Council’s BIM2050 Group
Fig. 6.4 Overlapping digital construction technologies
Fig. 6.5 Built environment organisation data use cases
Fig. 6.6 Examples of built environment use cases that require using multiple data types
Fig. 6.7 BI process—outline
Fig. 6.8 BI process—as implemented
Fig. 6.9 Axes of analytical maturity
Fig. 6.10 An analytical maturity journey
Fig. 6.11 Pre-LLM knowledge creation
Fig. 6.12 Post-LLM knowledge creation
Fig. 6.13 Data science techniques through the life cycle of a built environment project
Fig. 6.14 What the people think, what is a smart building?
Fig. 6.15 High level abstraction of Smart Building concepts
Fig. 6.16 Project utility function
Fig. 6.17 Hierarchy of internet platforms over time
Fig. 6.18 An illustration of capabilities in a building context
Fig. 6.19 The BIG BIM opportunity circa 2012
Fig. 6.20 The Digital Twin Ripple
Fig. 6.21 Components of a federated digital twin architecture
Fig. 7.1 Every origami and data project ever…

List of Tables

Table 1.1 Data types and their practices
Table 2.1 The public to private divides in the UK
Table 4.1 Your data questions
Table 6.1 Applications of technology mapped to Cynefin domains
Table 6.2 The numbers
Table 6.3 Knowledge and knowhow with respect to technology applications
Table 6.4 Data types
Table 6.5 Identification of AI snake oil

Chapter 1

Purpose and Pedantry

Abstract Our built environment is the sum of all the stuff that human beings have built and maintained. We all use it every day. It is how we move around, how we feed and shelter ourselves. We use it to make our lives both more rewarding and comfortable. We depend upon it every day and suffer when it fails. As data emerged as a phenomenon, then as a noun, and ultimately a profession, it changed large parts of our lived experience, but left our experience of the built environment untouched. This book seeks to describe if and how we can use data to help make the built environment work better for society. This chapter covers the unique challenges and opportunities presented to data professionals working in the built environment sector. It sets out the structure, purpose, and philosophy of the book, and begins to establish what words like data, information, knowledge, insight, and wisdom mean in the context of the built environment. Your authors are both built environment and data fanatics. We have been lucky enough to work for some of the largest physical asset owners in the UK (and beyond). In doing so we have witnessed the unique challenges and rewards that come from trying to apply data to the benefit of the built environment. We have also drawn upon our extensive networks of subject matter experts across the sector to inform and critique our thinking. We want readers to finish this chapter with a useful overview of the main considerations involved in designing, communicating, and implementing data strategies and solutions in the built environment sector. These considerations include the economic, human, and technological dimensions. During this chapter, we also consider the distinct types of data that built environment data roles must cover, and the different perspectives that readers might bring to this material.


1.1 Why This Book and Why Now?

Data isn’t neutral. Like any other tool, it can cause harm when used with the wrong intentions… how data and digital technology are used in the built environment impacts all of us, and all of us have a role in deciding the outcomes we want. —CDBB, Four Futures, One Choice

This is a love story about the built environment and the data it produces. Like all good love stories, there is a meet-cute, followed by a blossoming romance, followed by betrayal and heartbreak, followed by reunion. Will our couple live happily ever after, or pass like ships in the night?

The purpose of this book is to express clearly and categorically how data, and the digital technologies that use data, can help the built environment sector deliver value to society. Whilst we want to avoid being deliberately over-prescriptive, we wish to give you enough of a methodology such that you can say “I understand what I want to accomplish with data, how I am roughly going to get there, and how it will add value to my work.” We have sought to write in an honest and compelling manner that allows you to understand how the topics covered apply to challenges you might face in this sector.

Data isn’t solely a thing that Information Technology (IT) does. It is a resource that we use as we seek to improve the built environment. Data, and the meaningful information that we create with data, have always been part of the work of built environment professionals, even if data professions have only recently become elevated as distinct roles within the built environment sector. As investment in the collection and application of data increases across the built environment sector, and as more client, contractor, and consultant organisations build dedicated data functions and define their data visions and strategies, it feels like an opportune moment to take stock of what these changes might mean for the built environment and the people who rely upon it.

Our transition into a globalised and interconnected digital economy has seen the recognition of data as a valued asset, and the rise of a new profession–the data manager–to function as its custodian.1 This profession has been at the heart of enabling these new firms (and their intangible value) to scale to unprecedented magnitudes in terms of data flow, storage, and computation.2 In this book we want to bridge the gap between the emergent data managers (represented by the Data Management Association) and the established information management practice of the built environment sector (typified by Nima, previously the UK BIM Alliance).

1. Or at least individuals who would self-identify as data owners. The built environment has a long history of individuals compiling information to make better decisions, beginning with the likes of Dr John Snow, whom we discuss later in this book.
2. Indeed, as Dexter Robinson noted on reviewing this book, one could even argue that data and digital technologies have revealed a value in intangible assets that was always there but was not easy to measure. As he noted, “you could argue that because of data and digital systems intangible assets are now easier to measure and therefore we now put value against them.”


Over the past four decades, we have witnessed a profound and invisible change in the global economy. In 1975 ‘tangible’ assets accounted for over 80% of the value of the S&P 500 stock market index. By 2018 this figure had fallen to less than 20% of the total value (Visual Capitalist, 2020). The most valuable assets in the world are no longer physical things like factories, pipelines, gold, or oil. Whilst such assets are clearly still valuable, they are now dwarfed in value terms by ‘intangible’ assets such as brands, intellectual property, algorithms, market share, acquired rights, and–increasingly–data.

Data is itself an intangible asset. Moreover, data allows us to capture and measure other intangible assets. Branding and intellectual property in the form of TV advertising, logos, and jingles existed long before intangible assets dominated company valuations. Part of the difference is that more of our economy, and our lives, have become digitised. This digitisation in turn generates data that allows us to better identify and quantify activity and economic value.

Over this same period, the built environment sector has both underperformed and over-polluted. Construction alone accounts for 9% of EU GDP (European Commission, n.d.) and 13% of global carbon emissions (Architecture 2030, 2023). As a sector it usually sees only 1% annual productivity gains (McKinsey & Co., 2017), average margins in the developed world of far less than 5% (Turner & Townsend, 2022), rework costs amounting to 5% of the average project budget (Get It Right Initiative, 2020), and average losses of up to 20% of project time on schedule overruns (McKinsey & Co., 2016). The built environment sector, including construction, accounts for almost 40% of global emissions (Architecture 2030, 2023). This is a self-reinforcing state of both financial and environmental stagnation, and one that—to date at least—the use of data has failed to meaningfully address.

The transition to a digital economy has left the built environment sector behind. In addition to anecdotal evidence, we can detect the traces of slow digitisation in the built environment sector’s slow rate of productivity growth compared to other sectors3 and the lack of executive maturity in realising the strategic value of data (Harvey Nash & KPMG, 2018). The built environment will—and should always—focus on tangible assets. But the sector has yet to become ‘data centric’ or ‘data savvy’. Whilst innovative technologies such as the Internet of Things (IoT) offer the possibility of far greater data capture from physical assets, we remain in a period of transition where the marginal cost of data in the built environment often far exceeds the cost of data collection in other sectors.

We are now at liberty to digitise, virtualise, and measure huge parts of our lives: socialising, shopping, exercise, health, finances, and romance. But we can’t (yet) fully virtualise the infrastructure that meets our human needs for shelter, water, heat, transport, power, and (in most instances) care. And our seemingly weightless digital economy relies on huge investments in physical assets like fibre, telecoms, base stations, and data centres.

3. According to McKinsey, “Global labour-productivity growth in construction has averaged only 1% a year over the past two decades, compared with growth of 2.8% for the total world economy and 3.6% in manufacturing” (McKinsey & Co., 2017).
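To make the scale of that divergence concrete, here is a quick back-of-the-envelope calculation. This is our own illustrative sketch, not McKinsey’s; it simply compounds the three quoted annual growth rates over two decades:

    # Compound McKinsey's quoted annual labour-productivity growth rates
    # over 20 years to show the cumulative gap between sectors.
    rates = {"construction": 0.010, "world economy": 0.028, "manufacturing": 0.036}

    for sector, rate in rates.items():
        cumulative = (1 + rate) ** 20 - 1  # total growth over two decades
        print(f"{sector}: +{cumulative:.0%}")

    # construction: +22%
    # world economy: +74%
    # manufacturing: +103%

Compounded over twenty years, a 1% annual gain leaves construction barely a fifth more productive, while manufacturing has more than doubled its output per worker.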


However, we can’t completely digitise physical things out of existence, nor can we wish away the role that the physical and carbon footprint of the built environment plays in driving the climate crisis, loss of biodiversity, damage from natural disasters, and insidious economic inequality. As Jo Da Silva writes, “over the coming decades, the combination of rapid urbanisation and climate change is set to create a ‘perfect storm’ in terms of increasing vulnerability” (Da Silva, 2012). Our built environment, which we construct to shelter and protect us, is conversely one of the greatest hazards when natural disasters (exacerbated in frequency and severity by climate change) strike. That might mean buildings collapsing during earthquakes, urban areas with poor drainage increasing the impact of flooding, and other forms of loss or damage to the infrastructure that we rely upon to support us.

In describing the risks posed by failure to adapt our built environment to emerging climate crises, Da Silva expresses eloquently a theme that will arise throughout this book: the impact of complexity on the societal outcomes provided by the built environment. Complexity gives rise to what Da Silva describes as “cascading failure due to inter-relationships between infrastructure, institutions and ecosystems,” where “our ability to accurately predict and prevent the consequences of a disaster is limited by our ability to understand and model the dependencies and interdependencies between different parts of a system” (Da Silva, 2012).

The built environment is a complex system-of-systems, one that has only become possible due to a massive increase in the quantity and velocity of information exchange between organisations, locations, and people. Whilst this system-of-systems has granted billions of people, particularly in the Western world, an unprecedented standard of living, the merits of its complexity should not be confused with what Nassim Nicholas Taleb describes as ‘anti-fragility’ (Taleb, 2012). Interdependencies created to improve short-term efficiency can remove redundancy and increase the fragility of our built environment systems.

Data has played a crucial role in enabling the hallmarks of our age such as globalisation, distributed just-in-time manufacturing, near-instantaneous market-making and trading, ecommerce, personalised medicine, constant connectivity, social networks, and data harvesting. These have in turn shaped the scale and density of our built environment. What data has yet to accomplish is helping the built environment to sustainably accommodate the repercussions of these developments, or to reduce fragility and increase resiliency. As the EPSRC write, the sector suffers from “slow adoption of proven technologies, large amounts of carbon emissions… and projects that are often over cost and over budget” (EPSRC-NSF, 2023). Modern society has birthed a Lovecraftian ‘Shoggoth’ (New York Times, 2023) of over-development and over-consumption without yet providing us with the tools to limit the ensuing destruction.

This book seeks to describe in detail how we, as data professionals working on the built environment, should seek to make effective use of the intangible ‘data domain.’ But it doesn’t lose sight of how those working in this sector—unlike more mature business to consumer (B2C) and intangible sectors such as retail, finance, and consumer tech—will always be defined by how they care for our physical domain (and the natural world that supports it).
This principle informs how we think about the application of data, including the unique challenges of collecting data on physical assets, storing it in an appropriate manner, and making use of that data to drive our actions in the physical world.

Increasingly, the problems that strike large-scale built environment projects have their roots in data and the systems that use data. The well-documented delays to Crossrail (the Elizabeth Line) resulted not, as one might expect, from the civil engineering works required to dig the enormous tunnels and cathedral-like stations. Rather, according to insiders and reports, the delays had their root in complex information-based problems, including incompatibilities between signalling systems and ‘silo thinking’ across disciplines (Wired, 2019).

Of course, that doesn’t mean that all that is required to solve these kinds of challenges are people with data skills. On the contrary, what the built environment sector needs is people with data skills who are also engineers, project managers, designers, and other systems-based thinkers. Organisations need a critical mass of people who are data aware and digitally minded effecting meaningful change, and they cannot all work for IT or have the word data in their job title. These data-augmented professionals can then add data capabilities to their organisation’s existing subject matter expertise and ways of working. The complexity of projects such as Crossrail demands these cross-cutting capabilities to continue to improve delivery.

In a sector that does not yet always think in terms of data, we often attribute data problems to human failure. Many of the issues that bedevil megaprojects have their origin in complex data processing challenges, including optimism bias in cost estimates and benefits projections (Reuters, 2023), working around bottlenecks in the programme (Construction News, 2020), agreeing a regulated asset base funding model (New Civil Engineer, 2023), or monitoring and managing supplier performance. These examples are all—at their root—questions of how built environment clients capture and make use of the data that is available to them.4

Used well, data can help organisations make decisions in a manner congruent with the complexity of projects and allow them to anticipate, identify, and respond to problems before they become untenable. Used poorly (or disingenuously), data can become a smokescreen between the reality on the ground and what we report outwards. The built environment sector has the added challenge of often arbitrary discontinuities in responsibility between the stages of the asset life cycle. Even the most basic of assets (a house, for example) has become a complex integration project. Done well, the use of data will reduce the information boundaries commonly seen in the built environment between the different asset life cycle stages, and the associated loss of knowledge and creation of rework. It will realise greater sustainability by minimising the number of carbon-intensive real-world activities required and by allowing us to track our use of carbon across our work.

4. As we write, Bent Flyvbjerg and Dan Gardner have released an expertly constructed account of how to get significant and complex things done, combining the behavioural economics of Daniel Kahneman with Flyvbjerg’s own experience and data collected on project delivery. From Flyvbjerg’s point of view, we need to ‘think slowly well’ and make an impact through our data and information management strategies and implementations (Bent Flyvbjerg, 2023).


The built environment exists to meet society’s needs, and data has a sizable and unrealised role to play in improving how well it does so. We hope that this book will appeal to readers looking to make better use of built environment data in their work, whether data is a complementary part of a broader role or the entirety of their role.

Data is a creative rather than prescriptive pursuit. Methods and aesthetics will vary with the skill and inclination of users. One cannot mandate ‘proper’ use of data any more than one can mandate proper uses of any tool. Part of the role of a data professional is to create some norms for the use of data in our sector, and in doing so place boundaries around what constitutes allowable and responsible use of data. As data professionals our role is to improve data practice within our organisations, whether this is defining strategy, recruiting talent, creating shared meaning, deploying infrastructure, or delivering analyses. Sometimes we will work in isolation; other times we will function as an enabler, sowing a field of possibilities for others to harvest.

1.2 Structure of This Book

We want to reckon with both the unique challenges and opportunities presented by the built environment, covering both the established standards that we currently work to, and the emerging thinking around how we can do better. Over the course of this book, we approach the question of improving the outcomes of the built environment through better use of data from several different angles, namely:

• Establishing the problem statement that we are trying to answer for the built environment sector, as well as useful definitions to avoid confusion with data terminology (this chapter).
• Describing the purpose of the built environment, and what makes the application of data to the built environment meaningful, unique, and beneficial to society and the wider natural environment (Chap. 2).
• How the built environment sector currently thinks about the opportunities created by data, and how industry standards inform our data use (Chap. 3).
• How to create a meaningful data vision for a built environment organisation, how to build a positive data culture within a built environment organisation, and how to capture the benefits of data (Chap. 4).
• How to define, procure, and deliver data projects in built environment organisations (Chap. 5).
• Specific technologies that apply data to built environment challenges (Chap. 6).
• What it means to be a data professional working in the built environment sector (Chap. 7).


Broadly the book progresses through three themes:

• Context (this chapter, plus Chaps. 2 and 3): What is data, how does it apply to the built environment, and what can you learn from the existing body of knowledge in this area?
• Meaning (Chaps. 4 and 5): What does data mean to the built environment? What should your data vision be, and how will realising that vision help your organisation to succeed?
• Method (Chaps. 6 and 7): What is the roadmap of investment required to realise this vision on a built environment project or estate? How will you deploy key technologies, and how will you set up your data projects for success? How will your leadership approach impact your success?

To provide context, we focus on looking outwards, to understand the collective knowledge, history, and context of data and the built environment. Built environment organisations do not exist in isolation; the environment itself rarely acknowledges such arbitrary boundaries. We superimpose the human world onto the natural world, whilst interdependencies interweave and bind infrastructure owners. And so, our approach to data must build upon and harmonise with the approach taken by our predecessors and peers. To use data effectively we need to learn from history and from each other. To avoid repeating the mistakes of others we need to understand the evolution of capability across our sector. Once we know what good data practice looks like, we can adapt it to meet the specific requirements and idiosyncrasies of our specific roles and organisations.

1.3 Intended Outcomes of This Book

Fundamentally, our use of data should improve the outcomes that the built environment delivers to its users and society, and reduce its detrimental impact on the natural world. We want you to feel that you have the perspective required to function as a data professional and to understand how to deploy data techniques within and beyond your organisation to improve its impact on the built and natural environments. There are opportunities to use data whatever your rank in your organisational hierarchy, but we particularly want you to feel confident in expressing the value of data to leadership (the so-called C-suite) within your organisation and wider stakeholder environment.

We understand that there are a wide range of different data roles, perspectives, and contexts even within the built environment niche. We have sought to write in a way that can be adapted to fit these different contexts. Your role might be an internal 'client side' one, where you are delivering infrastructure and management systems for your own firm. Or it might be external, where you are advising and consulting others on how best to make the case for data investment or creating a roadmap for implementation. Perhaps you are delivering a project as a supply chain member, whether acting as a single party or as a partner in a JV, consortium, or other large-scale enterprise. You might even think of yourself primarily as an architect, engineer, project manager, surveyor, or asset manager, with little oversight of data at all.


In reading this book, however, you have begun to recognise the opportunity to augment your ways of working with better use of data. Whatever the context of your role, we want to provide you with:

1. An analysis of the challenges that face the built environment sector and how data can help address those challenges. This includes better meeting society's needs whilst reducing the harm done to the natural environment.
2. An overview of the application of data to the built environment, the unique challenges in this space, and the body of knowledge that already exists, as well as the tools you need to create a meaningful vision for your own data practice.
3. An understanding of the organisational challenges that you may encounter when trying to deliver data change into a built environment organisation, particularly challenges of people and process, as well as personal challenges.
4. An overview of the current data and technology landscape, particularly the established and emerging data solutions that are available to improve the construction, maintenance, and operation of physical assets.

This book is not:

1. An academic text. However, as a vital aspect of our evidence-driven approach to curating the contents of this book, we will provide reference points to research that has demonstrated results in practice.
2. A prescriptive text where you need to follow our advice and procedures exactly. Our intention is to share our experiences combined with leading practice that will need adaptation to your specific context.
3. A vendor-specific solution. We strive to provide vendor-agnostic frameworks. We appreciate it is hard to provide practical examples without the specifics of vendor solutions. We will try to keep to functional requirements and focal points to help you hold your technology supply chain to account.

There was some debate as to whether this book should be entitled Data and the Built Environment or Digital and the Built Environment. Often, we use these two terms (data, digital) interchangeably; other times they are vastly different fiefdoms that hardly acknowledge each other. We chose to use the word 'data' because we believe that it is more focused and better defined than digital. In practice the term 'digital' covers a wide range of practices, from online comms and publishing through to specific applications of technology. This book dives into the use of emerging technologies, but always seeks to take a 'built environment first' and 'data first' perspective, where the focus is less on the specificities or functionalities of software solutions, and more on the change that data can enable across the built environment sector.


1.4 Building on Established Practice

Built environment organisations, like people, are unique. It is these organisations that take responsibility for improving and renewing the built environment. This might be through new construction (e.g., projects) or through the on-going operation and maintenance of existing infrastructure. Every project and every organisation will bring unique geographic, architectural, logistical, cultural, and political challenges. And yet they will all share some common features, and this is just as true for their application of data.

Those that work in the built environment often speak of the 'asset life cycle': plan, design, construct/build, operate, maintain, renew, dispose (give or take a few words). There are various flavours of the standard life cycle; for example, the RIBA Plan of Work or Network Rail's PACE (formerly GRIP) stages, as well as broader overlays such as the election cycle, or the five-yearly regulatory cycle that many public sector infrastructure organisations follow. In parallel, those that work in data will talk of the information life cycle: specify, collect, collate, use/analyse, visualise/present, archive, dispose (again, the exact formulation varies). These life cycles, however simplistic, help us to orientate ourselves within our organisations and projects, as well as when we change projects or employers. If you know, for example, that you are leading on a data collation/aggregation solution to support a project in the 'build' stage, then you already have a decent idea of some of the types of data you are working with (e.g., models, schedules, costs) and some of the use cases that you might be asked to meet (e.g., design visualisation, scheduling, 'earned value analysis'), as well as some of the data technologies that might be relevant. These answers will all vary if you move to work on a different asset or data life cycle stage. The point is that whilst every built environment project is different, patterns and conventions do exist and are important to recognise.

Data creates value for the built environment sector only when used to improve the actions that we take in the physical world. It is a question of getting the right data, to the right people, with the right skills, at the right time. So, if data is failing to create value, then there are a few obvious causes to investigate:

• Business and information architecture: has the organisation defined what it wants to accomplish with data? Is there a generally accepted and working theory on how and where data will make a difference to the organisation? Does it know what data it holds, and who should be using it? Is that data easy to access, with appropriate controls around permissions, version control, and other hygiene factors? Are the staff who should be using data aware of this expectation, and have they been trained in how to do so? Have they been sold on why they should adapt their ways of working to make better use of data?


• Data quality: is the data provided to users inaccurate or incomplete? This could be because a lack of data modelling means that the data requirements are not specified, or because a lack of data governance means that it's unclear who is empowered to ensure that the data is fit-for-purpose. Data quality is key to ensuring that organisations trust their own data, and thus their own decision-making processes. There are many whizzy tools (e.g., Collibra, Informatica, or Alation) that allow you to scan your enterprise's entire data estate for data quality issues. However, the quickest fix for data quality is likely to be through business change, namely creating a culture of data ownership, and introducing data stewardship obligations and training to existing roles.
• User interfaces: the more people using data, the better the outcomes we will see, but data can be hard to use or interpret. Often the software used in the built environment sector does not make it easy for people to access and use data in their work. Delivering value from data means understanding what your users are trying to accomplish with data, or where they most need data. We should then seek out user-friendly technologies to make that data available, whether that is a handheld app, a business intelligence (BI) dashboard, or a chatbot. We must also invest in the data pipelines to ensure that the data people have access to is complete, authoritative, and up to date.
• Training and role definition: everyone learns in their own way. Some people like to pick up technology and learn by doing, while others need structured training and a well-defined set of goals or responsibilities. There's no right way of doing things, but organisations that do not define the data roles and skills that they need, and provide training for staff to get there, will be making life harder for themselves.

Solving these issues will go a long way towards enabling better use of data on built environment projects, particularly if complemented by a broader organisational effort to build data culture. Once the basics are in place, there exist a range of established and emerging data-driven solutions that can demonstrably help the built environment to deliver better outcomes for society. These include better approaches to:

• People.
• Process.
• Organisational culture.

As well as technologies such as:

• Digital construction.
• GIS.
• Analytics.
• BI.
• Data science and AI.
• IoT and smart buildings.
• Digital rehearsals.
• Digital twins.

Currently within the sector, adoption is patchy (at best), but the promise is there. What these solutions have in common is that they rely upon, and make use of, data of far greater quantity and quality than is readily accessible to most built environment organisations today. These solutions realise a return on investment through efficiencies in how we interact with the physical world: by reducing unnecessary or redundant use of scarce resources (particularly labour). Examples include:


• Using clash detection to correct errors in design before going to site.
• Leveraging AI and digital rehearsal to optimise the sequencing of site logistics.
• Deploying smart buildings and digital twins to move from reactive to predictive 'just in time' maintenance routines.

What these techniques have in common is that they allow organisations to use human resources more effectively by reducing redundant or abortive work. Construction and maintenance resources are expensive, and most such work requires taking assets out of service. Consequently, data solutions will best realise a return on investment where they result in fewer, shorter, and better-sequenced interventions to physical assets (a toy sketch of this idea follows at the end of this section).

Every organisation presents its own unique challenges (they might be driven by big personalities, small budgets, tight timescales, or conflicting demands) and so a truly cookie-cutter approach will rarely work. Your use of data should draw on a body of practice that is far larger than any one role, project, specialism, or organisation. Your approach to themes such as cloud computing, data modelling, data governance or business intelligence (to name but a few) should build upon effective practices established by others facing similar challenges. This book surveys the broad expanse of data and the built environment, identifying lessons learned from established practice, and highlighting some of the good, the bad, and the ugly. In doing so it seeks to set out a trajectory that can increase the speed and reduce the risk of your use of data.
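To make the 'predictive maintenance' point concrete, here is a minimal sketch of the underlying idea: extrapolating a measured degradation trend to schedule an intervention just before a condition threshold is breached. The readings, threshold, and wear measure are invented for illustration; real predictive maintenance would draw on far richer sensor and inspection data.

```python
# Toy predictive maintenance: fit a linear wear trend and estimate when the
# asset will reach its intervention threshold. All numbers are illustrative.

years = [0, 1, 2, 3, 4]
wear_mm = [0.0, 1.1, 2.0, 3.2, 4.1]   # hypothetical annual wear measurements
INTERVENTION_THRESHOLD_MM = 8.0

# Least-squares slope, i.e., the wear rate in mm/year.
n = len(years)
mean_x = sum(years) / n
mean_y = sum(wear_mm) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(years, wear_mm))
    / sum((x - mean_x) ** 2 for x in years)
)

years_remaining = (INTERVENTION_THRESHOLD_MM - wear_mm[-1]) / slope
print(f"Wear rate ≈ {slope:.2f} mm/year; intervene in ≈ {years_remaining:.1f} years.")
```

Even this crude model illustrates the return-on-investment argument: a data-driven forecast lets an organisation plan one well-timed intervention rather than reacting to a failure or inspecting on an arbitrary fixed cycle.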

1.5 Definitions

Information is not a thing; rather, it is the arrangement of physical things. It is physical order, like what distinguishes different shuffles of a deck of cards. (Cesar Hidalgo, Why Information Grows)

Definitions are core to how we approach data. Used well, data should reduce ambiguities within an organisation by bringing discipline to how you define your terms and measures. Part of the fun of having a data-related role is that you have licence to be a pedant, particularly when it concerns the meanings of words. Some might even argue that this is core to the role. But pedantry is a double-edged sword: we can't expect the organisations that we work for to have a consistent and unambiguous lexicon if we don't first try to meet that standard ourselves. You want your work to be legible to others; we want this book to be legible to you. Clarity and consistency of definitions around language are crucial to avoiding confusion, whether one is selling a vision, setting out a strategy, or making the case for investment in data.


1.5.1 Physical Things

This book is about how we can use data (an intangible, weightless, digital thing) to improve the outcomes of the real world. Throughout the writing process, we reminded ourselves not to stray too far into the data world but to always return to the real, physical world. Data is not a good unto itself. We care about data because of what it can tell us about the real things that enable our lived experience. Specifically, we are predominantly concerned with the physical assets that human beings create in the real world to meet society's needs. Throughout the book, we call these things physical assets. ISO 55000 defines an asset as an "item, thing or entity that has potential or actual value to an organisation." Following ISO 55000, this book dives into how we can use one type of asset (data assets) to improve the outcomes that we realise from another type of asset (physical assets).

When referring to the collection of assets that we work on, we use the phrase 'built environment', as this term works best as a parent term. This parent term may encompass construction, infrastructure, asset management, buildings, property, facilities, operations, and so on. When using a noun to distinguish the built environment from other parts of the economy or professional fields we will use the term 'sector' (rather than 'industry'). When we talk about the work of the data profession, we will tend to use the term 'domain.' Following these definitions, this book is about the intersection of the data domain and the built environment sector.

When distinguishing between the built and natural environments, and between infrastructure and the built environment, we will follow the excellent definitions from Transforming Infrastructure Performance (Varah, 2021):

• Built environment: the human-made or induced surroundings that provide the setting for human activity.
• Natural environment: all living and non-living things occurring naturally (e.g., not human-made).
• Infrastructure: the physical assets and services needed for the operation of human society; a sub-set of the built environment.

Much of this book focuses on the management and interpretation of data to create and use information, knowledge, insight, and even wisdom. None of these activities are goods unto themselves, but things we do to improve decision-making and our interaction with the physical world.

1.5.2 Data, Information, Knowledge, Insight, Wisdom

Human beings have lots of different words for records of things. Reference to the DIKIW (Data, Information, Knowledge, Insight, Wisdom) pyramid (pictured below) allows us to draw a clear distinction between terms. There is often some ambiguity on this front within and between organisations, and so where possible we will seek to rely on formalised definitions from the core standards and literature. Before settling upon definitions for our own purposes, it's worth reviewing how these terms are used more broadly (Fig. 1.1).


Fig. 1.1 Data, information, knowledge, insight pyramid, with added inputs and outputs, and a bit of wisdom sprinkled on top


1.5.2.1 Data

ISO 9001, the international standard for quality management systems, describes data as "facts about an object," where an object can really be any physical or conceptual thing. We find it useful to think of data as raw material, 'potential' information, a series of records that may or may not prove to be meaningful. Data tends to provide value as it becomes aggregated. Consequently, there exist a range of terms to describe different aggregations of data. For example, the Ofgem Data Best Practice Guidance defines a data asset as "any entity that is comprised of data. For example, a database is a data asset that is comprised of data records (Steer, 2021)." In this book we usually refer to data sets rather than data assets.

Although a philosophical point, and illustrated by the triangle above, the sum of all data can never equate to 'full information'. Mark Twain's phrase "there are three types of lies: lies, damned lies, and statistics" suggests that the line from data to wisdom is fraught with bias and other behavioural human factors. In the software engineering space, we only have to observe the outcome of the UK Post Office's Horizon scandal, where individuals were wrongfully imprisoned because of errors in a financial management system. The integration of data into information has a profound impact, and the two are not the same thing.

1.5.2.2 Information

ISO 9001 poetically describes information as "meaningful data". Therefore, to become information, data must undergo interpretation (or at least processing) by an intelligent agent (e.g., a person, or an algorithm/neural network) to imbue it with meaning. The creation of information depends on the existence of intelligent agents, and their ability to interpret data. This is perhaps what Edward Tufte describes as "the sense of the relevant… the ability to identify and detect those things that have consequences beyond themselves (Tufte, 2020)." Information is data with added context that makes it meaningful to intelligent agents. As the Bennett Institute for Public Policy points out, "context matters because the value of data is not related in any simple way to its volume," and so by adding meaning or context to data we create both information and value.

As Cesar Hidalgo explains, information theorists, such as Claude Shannon, saw information as a unit of increased certainty: the more complex or chaotic something is, the more information is required to communicate its physical state (Hidalgo, 2015). This renders information a measure of the reduction in uncertainty or randomness that occurs when we transmit a message from a sender to a receiver (Shannon, 1948). As Hidalgo writes, "in a physical system, information is the opposite of entropy, as it involves uncommon and highly correlated configurations that are difficult to arrive at." Thus, it is statistically possible that one could feed randomly generated code into a computer and accidentally replicate Microsoft Windows, but, without added context or interpretation, it's highly unlikely.5

5 This is my IT equivalent of the infinite number of monkeys writing Shakespeare.

It is worth noting that this elevation of information above meaningless data is a recent development. In fact, what we call data our predecessors might have called information. Cesar Hidalgo describes how early information theorists bred in the world of WWII encryption were of the view that "what travels through the wires or electromagnetic waves is [not] meaning… it is simpler…it is information." This is doubly true if you transmit encrypted information to deliberately hide the embedded meaning from all but select observers. The important distinction here isn't so much the terminology as it is the distinction between ordered phenomena (e.g., electromagnetic communication/storage) and the contextualised, interpreted meaning that stems from an intelligent agent being involved in processing those phenomena. As Hidalgo writes, "we infuse messages with meaning automatically, fooling ourselves to believe that the meaning of a message is carried in the message," but actually "meaning is the interpretation that a knowledge agent… gives to a message." In other words, we structure data into information such that we can then overlay our knowledge onto that information to create meaning (or, insights and wisdom).

This takes us into the upper half of the pyramid, to fleeting properties known as knowledge, insight, and wisdom. These properties are distinct from data and information. We cannot store them away in a book or server in perpetuity; they require the active involvement of an intelligent agent. After the last intelligent agent ceases to be, there will still be data and information, but there will be no more knowledge, insight, or wisdom. We can let the philosophers rest here; the physicists have an interesting discovery of their own: the observation problem. Nature presents information only when we observe natural phenomena; prior to our observation, nature exists in all possible states. Our measurement of nature collapses all possible states into a single state. This is why light can simultaneously behave like a wave and a particle. The point here, as we move on from information and knowledge through to wisdom, is that these are not natural states, but experiences that occur within the minds of intelligent agents.
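Shannon's measure is concrete enough to compute in a few lines. The sketch below is our own illustration (the function and example strings are invented for the purpose): it shows that a highly ordered message carries less information per character than a varied one.

```python
import math
from collections import Counter

def entropy_bits(message: str) -> float:
    """Shannon entropy of a message's character distribution, in bits per character."""
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Only two equally likely symbols: exactly 1.0 bit per character.
print(entropy_bits("abababab"))
# A more varied, harder-to-compress message: roughly 3.5 bits per character.
print(entropy_bits("the built environment"))
```

In Hidalgo's terms, the varied string is the more 'difficult to arrive at' configuration, and so carries more information.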

1.5.2.3 Knowledge

The definitions of and distinctions between data and information are straightforward and objective; those of knowledge, insight, and wisdom less so. Knowledge requires yet more context than information. As Boulding describes it, "knowledge is not something which exists and grows in the abstract… It is a function of human organisms and social organisation (Boulding, 1956)." As we move from data and information to knowledge, we are progressing from meaningful facts about an object towards something more akin to 'an understanding of the action that might be taken in response to meaningful facts about an object.' Zeleny backs up this interpretation, stating "codified (explicit) knowledge is a symbolic description of action, which is information. Tacit knowledge is action itself, i.e., the process of knowing (Zeleny, 2005)." Following Zeleny's definitions, an algorithm is explicit knowledge, and the judgement of an experienced professional is tacit knowledge. Although it is harder to define knowledge, we still think that it is a useful concept to use when describing the outcomes of using data effectively.

In more prosaic terms, as Dan Rossiter of BSI describes it, knowledge has a role in interpreting and making use of information. Dan provides an example: a data point might be the number 172; refined to be information, this might give us someone's height (e.g., 172 cm); and by accumulating knowledge we might acquire the context that this is a (slightly) below-average height for a grown man (in the UK at least). The word knowledge carries a refined air to it compared to lowly information, conjuring up images of scholars perusing the great Victorian libraries of hard-earned knowledge. As Zeleny wrote, "although it is quite natural to say that there is too much information, it would be rather difficult to even imply that there could be too much knowledge." Knowledge is where our philosophers thrive; they are our ultimate sense-makers of information. But knowledge is only to know; it is not to do. Knowledge is passive; insight is where we get activated.
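Dan's height example maps neatly onto code. The toy sketch below is our own rendering of the climb from data to knowledge; the 175 cm reference value is an assumption for illustration, not a sourced statistic.

```python
# Data: a bare record, meaningless without context.
datum = 172

# Information: the same value given meaning through context (subject, attribute, unit).
information = {"subject": "person A", "attribute": "height", "value": datum, "unit": "cm"}

# Knowledge: interpreting information against accumulated context.
# NB: 175 cm is an illustrative reference for average adult male height, not a sourced figure.
REFERENCE_HEIGHT_CM = 175

def interpret(info: dict) -> str:
    """Apply contextual knowledge to information to produce an interpretation."""
    comparison = "below" if info["value"] < REFERENCE_HEIGHT_CM else "above"
    return f"{info['subject']} is slightly {comparison} average height at {info['value']} {info['unit']}."

print(interpret(information))
```

The code can store the data and the information, but the interpretive step only means anything because an intelligent agent chose the reference value and cares about the comparison.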

1.5.2.4 Insight

In the business world the term 'insight' usually means something akin to 'actionable information.' That said, insight is seen as more immediate, recalling scenes of fingers on the buttons of control panels. For the purposes of this book, we do not seek to arrive at a categorical distinction between knowledge and insight. Instead, we keep to a more colloquial usage, closer to the lexicon that you are likely to encounter in a built environment organisation. Where we use knowledge, we tend to denote something akin to theory or best practice, whereas we will use insight to describe the ability to apply knowledge to fit the requirements of a specific organisation and its challenges. As an example, one might have knowledge about data governance best practices (perhaps from reading the DAMA DMBoK), and subsequently, you might engage with stakeholders across your business to obtain insight into how best to apply that knowledge to the idiosyncrasies of your specific organisation. Perhaps, for example, an earlier data governance project failed because it was too rigid in its definition of data domains or didn't identify a broad enough group of data stewards. These insights are unlikely to appear in a textbook but will increase your chances of successfully applying your data governance knowledge to effect change in your organisation.

We will discuss data and information as valued assets later in the book. But it's worth noting that as we move from data up to information, knowledge, and insight, we are adding more input from intelligent agents (e.g., humans or AI). In this way we are realising the potential value from data by using it to fuel these agents. Data itself can only store potential value rather than realised value; arguably insight is the point at which we are able to fully translate that potential value into realised value.

We have previously leant on the philosophers and physicists, but it is now time for the economists. Our rationality in decision-making is where we feel insight most tangibly. The seminal work of Daniel Kahneman (Thinking, Fast and Slow) presents two types of thinking: an instantaneous 'System 1', where we use cognitive rules of thumb to make fast decisions (the basis of our biases), and a 'System 2', where the heavy lifting is done and our cognition is deployed to think about bigger problems… still fraught with bias. For us, insight is defined by how we interact with it. But insight does not equate to truth in our definition. It is, at best, a product of our more refined System 2 thinking.

1.5.2.5 Wisdom

Wisdom is even harder to define clearly than knowledge or insight. Wisdom is a word rarely used in a business context. It is common to have a 'Head of Knowledge Management' or similar, but 'Heads of Wisdom' are almost unheard of. Jo Lucas argues that this is because wisdom is not an attribute that accrues to individuals, but rather to collective endeavours; wisdom inherently requires the collection of multiple different perspectives to exist.6

6 Comments received directly from Jo as she reviewed a draft version of this book.

Whilst we will not use the concept of wisdom much in this book, it is useful to have this idea of a form of knowing that is even more refined than knowledge. As Zeleny puts it, "to attain wisdom, you have to ask why?" So, if data gives us 'facts about an object,' information adds 'meaning,' knowledge adds 'context', and insight 'action', then perhaps it is wisdom that gets us to higher-order concerns like 'purpose' and 'ethics'. In that sense wisdom is not the accumulation of ever more knowledge, but rather the understanding of how to use it for good. As such, wisdom is the goal of this book rather than a topic within it. If we understand how best to use data to improve the built environment, we might have gained some small amount of wisdom.

Sitting at the bottom of the pyramid, data is the raw material underpinning the creation of information, and its subsequent refinement into knowledge, insight, and wisdom. As Jennifer Rowley writes in the 2007 paper 'The Wisdom Hierarchy', "information is defined in terms of data, knowledge in terms of information, and wisdom in terms of knowledge (Rowley et al., 2007)". Whilst this hierarchy is useful, it is an oversimplification and risks implying that data can transform into wisdom without any additional input. In fact, moving from data up to wisdom requires input from cognitive agents (in other words either human intelligence, or the artificial intelligences that we create, and which learn from us, or both). As Richard Carter states, "Information and knowledge process events that have already occurred. Wisdom processes events that may occur… you should focus on AI doing most of the heavy lifting, to progress up the hierarchy from Data to Information to Knowledge. Then humans can focus more on the application of knowledge to determine the course of action." We noted above that we shouldn't exclusively frame the journey up the pyramid from data through to wisdom as an individual pursuit. Indeed, the higher we move up the pyramid, the more difficult it becomes for individuals to further refine and contextualise the subject matter without spreading that cognitive load across a more diverse group of intelligences.

These definitions provide us with a basis for how we talk about data and information in the context of their management. There are some philosophical foundations to build upon. The work of data architects/ontologists/theorists such as the late Dr Matthew West seeks to set out a far more rigorous and scientific set of definitions for the built environment than we could hope to rival in this book. However, these are highly theoretical frameworks, and difficult for practitioners working at the sharp end of projects to usefully apply to the day job. There are also issues with the conventional definitions of data and information presented above. Defining data as 'facts about an object' is problematic because the word 'fact' comes with a fair bit of baggage these days. As we know from our own experiences of fake news and the broader spread of inaccurate or misleading information, data can incorrectly represent the facts, and humans can misinterpret or misuse it. Data is a unit of record relative to a confidence interval: the probability of a thing occurring or being true. As such, it might be safer to define data as 'statements about an object' rather than facts. As for knowledge, insight, and wisdom, notwithstanding the lengthy discussion above, it may be easiest to think of knowledge as the use of information to look backwards (e.g., retrospective reporting), and insight as the use of information to look forward (e.g., anticipation).

The definitions above are general and require development with respect to what management of these resources means in practice. There are differences in management activities between the levels of the pyramid because organisations use these resources in diverse ways.


Rough definitions of management practice at each level might look something like:

1. Data management: the collection and collation of data such that it is available and meaningful to users.
2. Information management: the documentation, storage, and archiving of the institutional memory of a project or organisation.
3. Knowledge management: a process that allows us to process and learn from that institutional memory, bringing added context that allows us to make better use of information. As an aside: traditional knowledge management was primarily the time-consuming process of humans consuming and interpreting (internalising) qualitative information, but AI has the potential to massively speed and scale the creation and dissemination of knowledge.
4. Insight management: the means of taking what we have learnt and applying it to our future endeavours in a manner that delivers better outcomes than we have realised before.
5. Wisdom management: the process that allows us to evaluate our decisions against our broader purpose, bring perspective to our endeavours, understand our own limitations, and ensure that we are working with long-term, holistic benefits in mind.

It's worth noting that within built environment organisations there can be quite a strong distinction made between data, information, and knowledge management. Despite data, information, and knowledge being inextricably linked, it is not unusual for built environment organisations to have completely distinct data, information, and knowledge management teams working in distinct departments. This in part reflects how our sector uses these terms, as well as the influence of regulation: you might find 'information management' rooted in physical records management and document control, 'knowledge management' seen as more of a business change or organisational improvement function, and 'data management' more often associated with IT. So, you end up with, for example, data management sitting within IT under the Chief Information Officer (CIO), information management sitting within the Programme Management Office (PMO) under the Chief Operating Officer (COO), and knowledge management sitting within quality assurance under the Chief Finance Officer (CFO) or Human Resources (HR) or Legal. These divisions are surmountable, but are a far cry from the seamless transition that the DIKIW pyramid might suggest.

Where possible we will seek to use simple terms as stand-ins for long lists of sub-domains. Most incorrectly of all, we will use data as the default descriptor of digital content, and only use terms such as 'information' and 'insight' where drawing that distinction adds considerable meaning. We will also avoid the capitalisation of any of these terms, as they are not proper nouns. It's worth underlining the subjectivity of these decisions. They represent an attempt to improve the legibility of this work, not to make a point. Feel free to disagree.


1.5.3 Time

To work in the built environment is to constantly reckon with time. The 'built' part of the 'built environment' is humanity's temporary chicken scratches on the barn walls of time. The environment itself is more permanent (though as we are discovering, it too may change, sometimes rapidly). As Carlo Rovelli tells us, "The hugely evident distinction between past and future is really statistical—is really due only to our blurred interaction with reality" (Rovelli, 2017). In other words, our relationship to time follows from it being useful to our emotional and mental state. It is a survival mechanism. The built environment is the closest that we humans come to creating permanence for ourselves, but the lifespan of any part of the built environment is finite, usually measured in decades… sometimes less, sometimes far more, but always finite. All our subdivisions below that level, from the distinct parts of the asset life cycle, to projects with their schedules and tasks, require the continuum of time to be meaningful. The w3.org time ontology is a fascinating read, and one that makes several useful distinctions, such as the difference between durations and instants (W3C, 2022). For our purposes, time is an infinite continuum of instants, one that we can subdivide into any number of durations.

To avoid undue confusion, we will treat the assets that constitute the built environment as 'permanent' (even though we know they are not). In doing so we will perpetuate the implicit distinction in the sector between the creation of permanent assets (construction), and the operation and periodic maintenance of those permanent assets (asset management). That said, we expect (and indeed hope) that over the coming years the distinction between construction and asset management will lessen. Creating new built assets has both an immediate and long-term impact on carbon emissions and biodiversity, as well as, often, on the public purse and the consumption of resources required to operate the asset. As our built environment matures, and as we seek to tackle the climate crisis, we will need to seek ways to repurpose and reuse existing assets whilst improving the sustainability of their operation. We believe that this will result in an increasing number of 'Trigger's Broom' (or Ship of Theseus) assets, where repeated refurbishment ultimately replaces the permanent assets in their entirety (or close to their entirety), but never as a single carbon-intensive demolition and construction project. This model will in turn require a step-change in how we capture and manage information associated with the composition and health of our assets throughout their life cycle.
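The instants/durations distinction maps directly onto types most data practitioners already use daily. A minimal sketch follows (our own illustration; the dates are invented, and the W3C ontology itself is expressed in RDF rather than code):

```python
from datetime import datetime, timedelta

# Instants: zero-length positions on the time continuum.
construction_complete = datetime(1968, 3, 1)   # hypothetical asset history
latest_inspection = datetime(2024, 6, 15)

# A duration: the extent between two instants. Asset life cycle stages,
# regulatory periods, and project tasks are all durations of this kind.
time_in_service = latest_inspection - construction_complete
print(f"Asset in service for roughly {time_in_service.days // 365} years.")

# Treating an asset as 'permanent' really means modelling a long, finite duration.
regulatory_cycle = timedelta(days=5 * 365)     # the five-yearly cycle mentioned earlier
next_review = latest_inspection + regulatory_cycle
print(f"Next regulatory review due around {next_review:%Y-%m-%d}.")
```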

1.5.4 Organisations and Friction

Another concept that we have already introduced in the case study above, and will return to throughout this book, is the idea of 'friction.' We are using the term 'friction' as a simplification of the transaction cost economics found in the field of microeconomics. Friction is what occurs when we expend resources enacting decisions.


A good example of economic friction, which occurs in every sector including the built environment, is that of asymmetric information. The renowned economist Kenneth Arrow described how asymmetric information prevents markets such as healthcare from working frictionlessly because "a patient will not typically know exactly what a physician is proposing to do, nor will she be completely informed about what the physician knows (Maskin, 2019)." The relationship between clients and contractors (or IT companies) is not dissimilar to that of the patient and physician. In these circumstances Arrow argues that "nonmarket institutions, such as self-regulation by the medical profession and a code of ethics for physicians, can play an important role in improving market performance." As we describe later in this book, professional bodies, standards, and sectoral thought leadership (read: wisdom) play a similar role in the built environment.

On a more practical level, it is also worth thinking about the diverse types of entities that interact in the built environment sector. These are typically the parties to the contracts required to enact any change to the built environment. These commercial and economic entities include:

1. Organisations, which may in turn subdivide into:
   a. Public infrastructure owner/operators (such as Network Rail, National Highways).
   b. Government departments/regulatory bodies (such as the Department for Transport, or the Office of Road and Rail).
   c. Governmental bodies (such as the Infrastructure and Projects Authority).
   d. Special purpose vehicles (such as Thames Tideway or HS2).
   e. Private infrastructure owners, whether fully private (such as Westfield) or the result of privatisation (such as many power generators).
   f. Investment bodies (such as pension funds or asset managers like Macquarie).
   g. Service providers (such as Amey or Interserve).
   h. Main or 'Tier 1' contractors (such as Balfour Beatty or Skanska).
   i. Consultant designers and engineers (such as AtkinsRealis or Arup).
   j. Subject matter experts and specialist suppliers.
   k. Joint ventures and consortia (usually combinations of h and i created for specific projects like d).
   l. Academic institutions and think tanks.
2. All the above have data roles, which similarly may subdivide into:
   a. Organisational leadership (CEO, executives, board members, CIO, CTO, etc.).
   b. Data leaders (e.g., CDO or Head of…).
   c. Security roles (e.g., CISO, SIRO).
   d. Data professionals (e.g., data scientists, data engineers, data analysts).
   e. Governance roles (e.g., data owner, data steward).
   f. Data-adjacent roles (e.g., information manager, records manager, document controller, performance analyst).
   g. Advisory/consultancy roles (often Technical Directors and the like).
   h. User roles (everyone else).


Part of the story behind why change in the built environment sector can be slow and sporadic is the complexity of the ecosystem of organisations that are, in one way or another, responsible for its creation and upkeep. Distinct types of organisations optimise based on different incentives, as do the individuals and teams within those organisations. Some exist to realise a public good, others to make profit; some prioritise their reputation, others their scale or influence. As a sector we try to align the behaviours and incentives of organisations through contracts, as well as implicit norms and codes of conduct, but often this merely further distorts incentives, or hides behaviours behind a veneer of bureaucracy.

1.5.5 Types of Data

The fact is that true data is never 'just' the data. (Greenfield, 2013)

Whether you are designing, building, maintaining, or operating parts of the built environment, you will use information as a means of deciding what to do, communicating what you are doing, and recording what you have done. You will have to meet the information requirements of your customers or stakeholders, as well as regulatory or legal requirements. Whilst these obligations are different in many ways, what they share is the need to securely control the flow of information within and between organisations, as well as the need for common data definitions to ensure that the correct information is available when required.

Built environment clients have a long history of working with extensive supply chains. Even where these clients have recently employed Chief Data Officers (or similar roles), the established data standards and practices often reflect an accumulated mixture of practice from across the various teams that compose the organisation and its supply chain. Data must naturally work across all parties. Your data practice will reflect the needs of your customers/stakeholders, best practice from the built environment and construction sector, and your wider supply chain. Even where you are starting with a blank canvas you will inherit some ways of working from the experience of your colleagues and stakeholders. Consequently, you need to work to ensure that the components you inherit and originate are mutually compatible. Again, there is a friction between aspiring to do better and going against the established grain. Pick your battles!

One aspect that characterises and distinguishes the data domain of the built environment sector from many other sectors is the breadth of data types that data practice must account for. Working with data on a built environment project can feel like you are living in multiple different decades (or even centuries) simultaneously. Because of the (relative) permanence of the built environment we will often need to use archive records to inform our current work. For example, the Victorians built much of the UK's existing rail and water infrastructure, and organisations such as Network Rail and Thames Water still refer to original drawings from this time to understand their assets.


The engineering profession has continued to produce high-quality drawings for millennia (Bower, 2023), shifting to computer-aided design in the 1980s, appending information to design models (building information modelling) in the 2000s, and making far greater use of image capture (such as high-definition photogrammetry and point clouds) in the 2010s. Semi-structured data such as design models, geospatial models, bulk stores of imagery, and the omnipresent PDF document form a large part of many built environment datasets, particularly for construction.

In parallel to an evolving use of drawings, models, spatial layers, and imagery, since the 1960s many built environment clients have also accumulated a sizable legacy of data systems and associated structured alphanumeric data. This evolution has progressed through paper records, to mainframe databases, to desktop applications, and latterly cloud computing, often with new solutions stacked on top of old solutions. A great example is the UK's train delay recording system, which when we worked with it in the late 2010s was still running an IBM mainframe database from the 1960s, with a 1990s BASIC GUI bolted on top, a late-2000s database used for long-term data storage, and, inevitably, Power BI or equivalent used to visualise the data on top of that. A focus on replacing software, rather than ensuring consistency of data across time, means that it is not unusual for different generations of solutions to use completely different standards for the structuring and organisation of their data. In tandem, the sector has not always been as diligent in preserving alphanumeric data as it has drawings, and it is not unknown for data not to outlast legacy systems, depriving analysts of valuable longitudinal datasets. This is perhaps because built environment clients have often had established means of archiving paper records, but fewer resources or standards for archiving electronic information.

The discontinuities in the data record of the built environment, whether that's changes in format over time or simply the loss of historical data, have a real cost in the present day. Data collection on site remains the most expensive (and sometimes dangerous) means of collecting data on our physical asset base. Where we find ourselves collecting data (e.g., conducting surveys) that at some point we already held (or still hold, but don't trust), this consumes unnecessary resources. Similarly, a lack of longitudinal data can make it harder for us to understand how our assets behave over longer timeframes, and what their failure modes look like. Modelling asset degradation on civil structures, for example, depends on the collection of data on assets stretching over multiple decades at a minimum. When we are managing assets that take decades or centuries to degrade, the inability to maintain data over generations of software can seriously harm our ability to care for our assets.

All of which is an argument for reckoning with the distinct types of data that we use in this sector, understanding the idiosyncrasies and specific requirements of each type, and preparing a holistic approach for data management. For the purposes of this book, we split 'data' into four subtypes:

• Alphanumeric data: text and numbers comfortably structured in tables and fields (e.g., data you could use in a spreadsheet).
• Document data: unstructured data including images with a geometric or geospatial reference; may include some semi-structured data such as HTML or metadata tags.
• Model data: geometric data structured in accordance with building information modelling (BIM) standards (e.g., referenced to a geometric and/or geospatial model per ISO 19650).


• Geospatial data: geometric data structured in accordance with geographic information systems (GIS) standards, composed of shapes (points, lines, polygons) and referenced to a geographical coordinate system (e.g., following ISO 19115).7

These four data types are not highly scientific; rather, they reflect the different working practices that exist in the built environment sector. They broadly align with BS EN 17412-1, which distinguishes between alphanumeric data, geometric data, and documents. BS EN 17412 describes alphanumeric data as "expressed using characters, digits and symbols or tokens," and geometric data as "expressed using shape, size, dimension, and location (British Standards Institution, 2020)." In this context, what we call model data and geospatial data correspond to two subsets of geometric data. We believe that this distinction is useful as it reflects the different use cases, ways of working, standards, and tooling distinguishing BIM and GIS.

A useful way of ensuring that you are meeting the data requirements of your project or organisation is to split the task to reflect these four data types, the integration between those data types, the integration between different parties, and the governance and policies required to ensure order. This results in a task list as follows:

1. Managing alphanumeric data.
2. Managing document data.
3. Managing model data.
4. Managing spatial data.
5. Integrating each of the above where necessary.
6. Integrating with your suppliers and stakeholders.
7. Implementing sustainable means of managing your data and information to support the above.
8. Keeping the whole thing secure.

Tooling varies depending upon what type of data is being managed. Moreover, to integrate distinct types of data one may require specialist software such as Informatica or FME. The diagram below provides a high-level overview of the data domains in a simple format. This type of non-technical diagram might prove useful in explaining to leadership and other non-specialists the components that your organisation needs to have in place to effectively manage the four different data types. Note that there is some subjectivity in this diagram, and it is only intended to function as a simplifying illustration; the built environment consumes such a wide range of different data formats that it is difficult to place all of them comfortably in a small number of categories (Fig. 1.2).

7 One might argue, as at times your authors have, that spatial coordinates are merely a sub-type of numeric data and are therefore not a distinct data type. It's not worth becoming hung up on such distinctions. All data is binary; we are trying to reflect how built environment organisations use distinct types of data in practice. We have elected to include geospatial data as a distinct data type for three reasons. Firstly, there is a distinct body of practice within the built environment sector that uses geospatial information intensively. Secondly, whilst the points/lines/polygons stored in a database may be numeric fields, they must align to geospatial referencing systems (e.g., latitude and longitude, or in the UK the Ordnance Survey (OS) reference system) to qualify as geospatial data. Thirdly, we are wary of becoming overly purist in defining data types because we want to reflect data working practice rather than how bits pass through a CPU.
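To make the integration tasks (items 5 and 6 above) concrete, the sketch below joins an alphanumeric condition table to geospatial asset features via a shared asset identifier. It is a minimal illustration under assumed inputs: the file names, column names, and GeoJSON property structure are invented for the example, and at scale this kind of work is typically done with dedicated tools such as FME rather than hand-rolled scripts.

```python
import json
import pandas as pd

# Alphanumeric data: asset condition scores.
# (Hypothetical CSV with columns: asset_id, condition, last_inspected.)
conditions = pd.read_csv("asset_conditions.csv")

# Geospatial data: asset locations as GeoJSON features.
# (Hypothetical file; each feature is assumed to carry an asset_id property.)
with open("assets.geojson") as f:
    features = json.load(f)["features"]

locations = pd.DataFrame(
    [
        {
            "asset_id": feat["properties"]["asset_id"],
            "geometry_type": feat["geometry"]["type"],
            "coordinates": feat["geometry"]["coordinates"],
        }
        for feat in features
    ]
)

# The join is where the data types meet: condition (alphanumeric)
# becomes mappable (geospatial) through a shared identifier.
assets = conditions.merge(locations, on="asset_id", how="inner")

# Flag poor-condition assets for the GIS team to visualise.
print(assets[assets["condition"] >= 4])
```

The fragile part is rarely the join itself but the shared identifier: unless the alphanumeric and geospatial records agree on what an 'asset' is and how it is keyed, no amount of tooling will integrate them cleanly.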


Fig. 1.2 An illustrated taxonomy of data types

Alphanumeric data, document management, and geospatial data all have widespread application outside of the built environment sector. Model data is more specific to sectors that deal with physical assets, meaning the tooling is developed specifically to meet the needs of the built environment sector, as well as sectors such as manufacturing and extractive industries. Table 1.1 describes, by data type, where you are likely to share practice with other sectors, and what is (relatively) unique to the built environment sector (Fig. 1.3).

Within a built environment organisation many roles (and associated tooling) focus exclusively on one type of data. For example, performance analysts will spend their time making dashboards with alphanumeric data, quality and assurance professionals will live in a world of carefully version-controlled documents and reports, and BIM users will happily tinker with their design models using a variety of specialist tools. However, at an organisational level, to leverage data to deliver better outcomes from the built environment, some level of integration between these data types will be necessary.

Table 1.1 Data types and their practices

Alpha-numeric
Built environment idiosyncrasies: Limited unique requirements, outside the data sources that you need to integrate (e.g., cost, schedule, quality, risk, sustainability). Some edge cases such as SCADA, which generate numeric/binary data but require much closer integration between the specialist hardware and software.
Good practice: Massively scalable Platform-as-a-Service (PaaS) or Software-as-a-Service (SaaS) infrastructure from a major cloud vendor, with added security measures where necessary. Data mirroring, aggregation, and 'marts' to speed up reporting and analysis without compromising BAU functions. Data modelling and data governance to keep track of what data you hold and who's responsible for it.
Poor practice: Data management through spreadsheets (documents) and collation on shared drives or local disks. A lack of centralised architecture, and/or a reliance on 'on prem' servers with limited scalability. Data sharing via email. Data definitions that vary by user.

Document
Built environment idiosyncrasies: The types of documents managed (e.g., drawings, reports, unstructured images) and the level of collaboration and control required across the supply chain are not unique to the built environment. However, not all software solutions, particularly those designed for consumer markets, will meet the standard of data quality, quality management, or information management (e.g., ISOs 8000, 9001, and 19650) required by built environment organisations. It is worth noting that the bulk of our historical record on our assets exists in document form (albeit often digitised documents). This is because most of our physical assets pre-date the widespread adoption and use of digital technologies or machine-readable formats.
Good practice: Many built environment clients run multiple document management systems to meet distinct needs. These systems might loosely couple, with working processes in an attempt to avoid redundancy and duplication. An example might be a client using Microsoft Office 365 for corporate document management, along with more specialist tooling for the collaborative ISO-compliant management of design documents. In general, these are likely to be cloud-based SaaS solutions.
Poor practice: Good practice and poor practice won't necessarily have substantially different architectures. Whilst some mature organisations may use a single document management system, it is not necessarily wrong to use different systems for different use cases (e.g., SharePoint for collaboration, and an Enterprise Document Management System for design artefacts and review). The gap between good and poor practice is more likely to depend on how well organisations define and abide by their information management standards and policies, as well as how 'searchable' and 'findable' they make their libraries of documents to users.

Spatial
Built environment idiosyncrasies: Geospatial models (maps) provide a common reference point for built environment organisations. That may mean a model the size of a country for infrastructure such as roads, rail, waterways, power, or environmental assets. Or it may mean more localised models maintained at a high degree of accuracy, for example detailed topographical maps of sites. There is also the sub-group of schematic maps, most famously Harry Beck's Tube Map, which are commonly used to represent complex networks. These are not geographically accurate but are still models of space.
Good practice: There are many proprietary and open-source geospatial tools, ranging from the simple consumer-facing (OpenStreetMap, Google Maps, Ordnance Survey) to the sophisticated (QGIS, ESRI). These are available using a mix of SaaS solutions and other deployment options. There are also open data initiatives for sharing spatial information, such as the UK's National Underground Asset Register (NUAR).
Poor practice: Not all projects or asset types will require spatial models. This is especially true of assets concentrated in a relatively small spatial area, which might comfortably fit within a BIM environment. However, where organisations do require spatial modelling, the superiority of cloud-based PaaS and SaaS solutions is self-evident at this point. As per the discussion of alphanumeric data above, poor practice looks like a reliance on siloed 'on prem' and local installations of GIS tooling.

Design model (aka BIM model*)
Built environment idiosyncrasies: Object-based design modelling is a sector-specific sub-domain that is not well understood outside of built environment specialists. It has common origins with industrial design tooling, but at this point has branched off into a separate domain of solutions.
Good practice: Considering the complexity and specialisation of this type of software, practitioners rarely have much choice but to use a leading vendor SaaS product (though for some projects, particularly small-scale projects, open-source tooling may be viable).
Poor practice: The software and hosting required to manage 3D design models can be expensive to license and require specialist resources to use. For this reason, organisations may be tempted to take a 'do minimum' approach to BIM, for example by capturing very little design information or outsourcing to the supply chain. Whilst this approach will doubtless realise short-term cost savings, it risks leaving the ultimate asset owner with a dearth of information about the design and construction of their assets.

* We have permitted the term 'BIM model' here as it is a colloquialism for the use of a 3D model. Going forward we want BIM to only be seen as an information management process and not a technology or 'just a 3D model'.

28

1 Purpose and Pedantry

Fig. 1.3 An illustration of the four types of data in practice
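The ‘marts’ pattern mentioned in the alpha-numeric row of Table 1.1 is worth a concrete illustration. The sketch below (Python with pandas) aggregates some hypothetical cost transactions into a small reporting table, so that dashboards read from the mart rather than querying business-as-usual systems directly. All project, table, and column names are invented for illustration; treat this as a sketch of the pattern rather than a reference implementation.

```python
# Minimal sketch of a reporting 'mart': aggregate raw transactions once,
# then point dashboards at the small result table. Names are illustrative.
import pandas as pd

# Hypothetical export of raw cost transactions from a project controls system
transactions = pd.DataFrame({
    "project": ["A12 Upgrade", "A12 Upgrade", "Depot Refit"],
    "work_package": ["Earthworks", "Drainage", "M&E"],
    "period": ["2024-01", "2024-01", "2024-01"],
    "actual_cost": [120_000, 45_000, 80_000],
    "budget_cost": [110_000, 50_000, 75_000],
})

# Build the mart: one row per project per period, with a variance measure
mart = (
    transactions
    .groupby(["project", "period"], as_index=False)
    .agg(actual=("actual_cost", "sum"), budget=("budget_cost", "sum"))
)
mart["variance_pct"] = 100 * (mart["actual"] - mart["budget"]) / mart["budget"]

print(mart)
```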

There are several sector-specific use cases for integrating data of diverse types. Some use cases for integration include:

1. Storing/archiving design information from design models as documents [Model to Documents]. Over the period of design and construction the model will accrue an additional information layer of appended notes, comments, drawings, and pictures. It is commonplace for designers to export the finished design of an asset from model to documents, as these are likely to be accessible to a wider range of users in the future. However, in doing so we lose much of the richness of the model and the information appended to it. Finding better ways of handing over the rich media generated during construction to operations and maintenance roles is an important sectoral challenge.

2. Overlaying alphanumeric information onto design models [Numeric to Models]. Often referred to as 4D BIM (when overlaying time) or 5D BIM (when overlaying cost),8 in practice this means using the spatial (and time) representation that is possible through design models to display analytical information. In its simplest applications this might involve using the design model as a heat map to describe other phenomena (for example, cost of works by area, or safety events by location). More complex applications, such as digital rehearsal, allow users to visualise their schedule of works as a sequence of changes to the design model.

3. Translating design model information into operational information at handover [Models to Numeric], for example extracting alphanumeric information from within the design model to populate an Enterprise Asset Management System (EAMS); a sketch of this kind of extraction appears below. As Item 1 describes, current standard practice in the sector is to ‘de-model’ at the end of construction (e.g., to translate the design model into a more static set of documents). This is because operational staff rarely use design models as a user interface during the stages of the asset life cycle that follow construction. However, as the sector moves towards ‘operational digital twins’, the use of design models (or potentially more simplified schematics) will extend into the later stages of the asset life cycle. In this case the flow of integration becomes one of complementing the design model with additional documentation (e.g., allowing users to access drawings and reports by interacting with the design model or digital twin).

4. Merging model and geospatial information into a single interface. Presenting both model and geospatial data in a single visualisation allows the user to observe their physical assets as they appear in the real world. In doing so users can benefit from the richness of both 3D design information and a broad range of contextual spatial information.

Integration (and interoperability) are a key means of avoiding technical debt and ensuring that we maximise the long-term return from investing in data. It’s what makes modularity and loose coupling possible. If we can get data out of systems, then it will be easier to augment and replace life-expired solutions. Successful integration will always require the alignment of data structures and metadata between solutions. This may not mean retrospectively changing the data structures held within legacy solutions, but it will require a means of translating data between solutions without undue loss or corruption of data.

8 Your authors find that expressing additional functionality of BIM products as ‘dimensions’ (e.g., 8D BIM) is at best confusing, and at worst actively misleading. Where possible we have sought to describe what the technology in question actually does rather than resorting to the dimension metaphor.
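To make Item 3 [Models to Numeric] more tangible, here is a minimal sketch of extracting alphanumeric attributes from an IFC design model using the open-source IfcOpenShell library, flattening them into rows that could seed an asset register or EAMS load file. The file name, the IFC class queried, and the output columns are illustrative assumptions; a real handover pipeline would follow the client’s information requirements and agreed data model.

```python
# Sketch: pulling alphanumeric data out of an IFC design model at handover
# so it can populate an asset management system. Assumes a hypothetical
# 'handover.ifc' file and the open-source ifcopenshell package.
import csv
import ifcopenshell
import ifcopenshell.util.element as element_util

model = ifcopenshell.open("handover.ifc")  # hypothetical handover model

rows = []
for asset in model.by_type("IfcDistributionElement"):  # pumps, valves, etc.
    psets = element_util.get_psets(asset)  # property sets as nested dicts
    rows.append({
        "global_id": asset.GlobalId,
        "ifc_class": asset.is_a(),
        "name": asset.Name,
        # Manufacturer often lives in Pset_ManufacturerTypeInformation,
        # but models vary in practice, hence the defensive lookups
        "manufacturer": psets.get("Pset_ManufacturerTypeInformation", {}).get("Manufacturer"),
    })

# Write a flat file that an EAMS bulk-import job could consume
with open("asset_register_seed.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["global_id", "ifc_class", "name", "manufacturer"])
    writer.writeheader()
    writer.writerows(rows)
```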

1.6 Conclusion

Over the course of this chapter, we have established why it is now vital to consider the role that data plays in the built environment sector and provided an overview of how we will explore that topic in greater detail over the course of this book. We explained that we want to build upon established practice wherever possible. We explored a range of different definitions commonly encountered in our field and described how we will ensure that we use consistent and meaningful terminology throughout this book. Finally, we provided you with an overview of the data types that you will commonly encounter when working on the built environment, as well as a summary of how this might differ from other sectors, and the unique integration challenges that built environment data may present.


Chapter 2

The Challenge of the Built Environment

Abstract In this chapter we begin to establish some of the unique aspects of working with data in the built environment sector. In doing so we describe the problem statement that we are seeking to answer through this book: can data improve the performance and outcomes of the built environment? We look at how the use of data can tie into the aspirations of the sector to create social value and drive sustainability, using the Infrastructure and Projects Authority’s (IPA) Transforming Infrastructure Performance (TIP) Roadmap to 2030 as our rubric. We cover the profound role that better use of data must play in helping the built environment realise net zero, as well as covering pertinent topics for the application of data to the built environment, namely interoperability, data sharing, safety, data ownership, and data valuation.

2.1 Data and the Built Environment

A system must consist of three kinds of things: elements, interconnections, and a function or purpose… It’s easier to learn about a system’s elements than about its interconnections. (Meadows, 2008)

Our governments are bankrupt, the world is on fire, natural environments are disappearing, biodiversity is collapsing; inequality, injustice, and disinformation are rife; over-consumption shows no sign of abating. Humanity finds itself at a juncture: the lives of our descendants, and billions of others, are at stake. These are staggering, enormous, ineffable problems (hyperobjects, in the parlance). It can be difficult, as individuals, to know what small contribution we can make. And it’s tempting, comforting, an understandable mental shortcut, to assume that because we care about data, or digital twins, or data science, that they must have a role to play in facing these challenges. That way we get to feel like we are doing something, to feel noble and righteous whilst simply following our own inclinations (or even worse, a decent wage). There are undoubtedly admirable uses of data, but do they scale, do they persist? There are also, undoubtedly, cases of “datawashing” (i.e., using a veneer of data and analytics to justify or excuse bad, selfish, or vainglorious ideas). Is data part of the answer? Well, one thing’s for sure: we can’t just assume it is and rest on our laurels, we have to prove it (Fig. 2.1).

Outside of the built environment, we have witnessed the application of data in response to catastrophe. The COVID-19 pandemic underscored the crucial role of scientific data collection not just for public health responses, but also for global economic and welfare responses. Over the last few years, and particularly during the pandemic, we watched the role of data collection evolve in various promising ways, including:

• A greater focus on (near) real-time data: The need for timely data was underscored during the pandemic. Governments, institutions, and organisations recognised the importance of real-time reporting for both health-related and economic data to respond quickly and effectively. This led to a surge in the development of tools for collecting and analysing real-time data.

Fig. 2.1 A Venn diagram illustrating how some of the different concepts covered in this section overlap


• Cross-disciplinary data collection: The pandemic showed that health crises can cause economic and social crises. This led to a greater appreciation for cross-disciplinary data collection, connecting the dots between health, economics, education, employment, and society.

• More granular, localised data: Disparities in how the pandemic affected different regions, demographics, populations, and sectors underscored the importance of granular, localised data. This level of detail is critical to designing targeted responses and policies.

• Data privacy and governance: With more extensive data collection, issues around data privacy and governance have become increasingly prominent. Balancing the need for extensive data collection with privacy rights and ethical considerations is a significant focus.

• Global cooperation: The global nature of the pandemic showed the importance of international data sharing. Future progress in global health and economic stability will depend on the continued development of norms and systems that facilitate this kind of cooperation.

• Predictive analytics and modelling: The use of data to predict disease spread and economic impacts has been critical during the pandemic. This has underscored the importance of advanced analytics and modelling, and we can expect these tools to play a big part in future economic and public health strategies.

• Data literacy: With the flood of data during the pandemic, the importance of data literacy has become apparent. Policymakers, the media, and the public all need a greater understanding of how to interpret and use data responsibly.

In short, the role of scientific data collection has grown in visibility and significance. It has become clear that robust, timely, and detailed data, combined with strong data analysis capabilities, are vital for managing crises and improving global economic and welfare performance.

Data is the backbone of renowned economist Thomas Piketty’s suggestions for a more prosperous future (Piketty, 2015). Piketty believes in the power of data and transparency. In his vision, better data on income and wealth could lead to more informed policy decisions and a more just society.

The two centuries since the start of the Industrial Revolution have been characterised by urbanisation and population growth on a grand scale. We have expected our built environment to constantly grow in real terms to meet these changes, and to do so in the absence of mature information and data management systems. As such, humans developed social and economic infrastructure in a disjointed and less than optimal way. We now have a chance to catch up. There are, as we have witnessed, environmental limits to the growth of physical infrastructure. Our role as data people in the built environment is a foundational building block of Piketty’s vision. We are building the data infrastructure that will steer future society, allowing us to better meet society’s needs without exhausting our planet’s natural resources.

This book started with a walk in St. James’s Park in Central London and a grammatically incorrect question: “People have been doing digital to construction for decades now, has it actually made the outcomes of the sector better?” In the end, digital became data, and construction became the built environment, but the question remains. It lingers in the background of every Digital Construction Week conference, every BIM Twitter post, every LinkedIn announcement: are we helping here?

And so, this book is an exploration of this question. Or more specifically, it is an exploration of two contradictory hypotheses. The positive hypothesis is something like:

The use of data techniques can meaningfully improve the extent to which the performance and outcomes of the built environment benefit society now and in the future.1

1 This is not too dissimilar from the UK’s BIM Working Party Strategy Paper (Building Information Modelling (BIM) Working Party, 2011), a key hypothesis of which was: “Government as a client can derive significant improvements in cost, value and carbon performance through the use of open shareable asset information.”

Conversely, the null hypothesis is something like:

The use of data may be necessary due to the prevailing ways of working in the supply chain, but its use will not inherently improve the performance or outcomes of projects, organisations, or the wider built environment sector.

There is a fair amount of circumstantial evidence available to substantiate either hypothesis, particularly the null hypothesis. The sector successfully delivered complex construction projects back when ‘information’ meant pen and paper (or quill and papyrus). Similarly, a profusion of data products and their adoption in the sector has by no means prevented major infrastructure projects from running over budget and behind schedule. Sometimes it feels like data solutions are just an additional cost line, accountable for another 1% of overspend, like a parasitic cottage industry soaking up scarce resources through a haze of hype and salesmanship.

There are, unfortunately, few natural experiments to study here. Rarely do we construct two identical assets simultaneously, using data to aid one and avoiding data entirely on the other. One can’t say with any confidence how the performance of any given project would have differed if they had used data to a greater or lesser extent. Nor are the benchmarks against which projects hold themselves especially scientific. We talk of projects running over or under budget as if the budget itself is somehow scientific. But budgets are rarely scientific; they are a product of politics, flippancy, and optimism bias. Often a project’s budget is the biggest number that could receive approval, rather than an accurate estimate of likely costs. A project’s stated budget isn’t always an upper limit, but rather the first salvo in an on-going negotiation.

What is one to do, aside from throwing our arms up in exasperation? We believe that the lack of top-down macro-level justification shouldn’t convince us that there isn’t a strong case to be made for investing in data. It simply means that there needs to be greater rigour behind the logic of the investments, greater discipline in capturing the benefits of our work, and greater willingness to invest in a manner that pays off in the future.

Work in the built environment, for example construction and infrastructure asset management, can be a risky and low margin proposition (Statista, 2022). These low margins are exacerbated by competitive procurement practices that favour cost over quality and incentivise a ‘race to the bottom.’ As Rachel Rees writes in Building magazine, technology investment is just one area that suffers: “the damage done to the construction sector by the impact of such a price evaluation model is evident: poor safety and quality outcomes; disputes, insolvencies, lack of investment in technology, site wellbeing, dire payment terms and periods… we need to hit the reset button” (Rees, 2023). Tight finances (and limited cashflow) mean that organisations struggle to justify the upfront investment required in data capability, even where they know it will pay dividends in the future, because it will put them “in the red” at the start of a contract. There is also a lack of consistent or developed practice for estimating the benefits of data investment, with many organisations developing bespoke benefits frameworks based upon whatever the bugbear of their organisation happens to be at the time (e.g., cost reduction, risk reduction, operational resilience, regulatory compliance, brand and reputation, etc.).

The irony here is that part of the challenge in justifying investment in data is a lack of rigour in terms of the data used to build benefits cases. Most white collar ‘knowledge’ workers will spend a large part of their working lives sitting in front of a screen, manipulating one form of information or another (be it reports, plans, spreadsheets, or PowerPoint). Intuitively, there are benefits and costs to providing our colleagues with a positive or negative data experience. So how do we substantiate that intuition? We propose that every investment in data should meet the following criteria:

1. Someone with a sufficient understanding of how the organisation operates is asking for the investment and can explain how it will benefit their work in a manner that begins to justify the expenditure.

2. We, as data professionals, can establish a logical, tangible, and deliverable link between the data investment and improved project outcomes. We may not know the exact sequence of steps, but we should at least be able to sketch it out. In data terms, if you can’t specify a schema that would provide the data to answer a question, then you can’t use data to answer that question (a minimal sketch of this idea follows below). We need to avoid business cases for data that, to paraphrase Sidney Harris’ famous cartoon, include a “then a miracle occurs” moment.

3. Finally, you must learn from the past. If the organisation you work for, or the sector you work in, has a history of failing to deliver data projects then you owe it to yourself to seek to understand the causes of failure that you may have overlooked. Such unseen obstacles will range from the purely practical (e.g., “the system doesn’t have an API”,2 or “cyber3 would never sign it off”), to the more philosophical (e.g., even when presented with the ‘correct’ data, staff still clung to spreadsheets and heuristics).

2 Application Programming Interface (API), a means of interacting with a piece of software using another piece of software. APIs are how applications “talk” to each other (e.g., share data) and a foundational part of any open data architecture.
3 Cyber is short for cybersecurity.

It would be difficult to review or summarise the causes of the past performance of data projects in the built environment sector in anything resembling a complete way, but the remainder of this section provides a high-level overview of what makes the application of data to the built environment so fascinating, and so challenging (Fig. 2.2).

Fig. 2.2 A simple metamodel describing how data and technology deliver outcomes to asset-centric organisations
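Item 2’s ‘schema test’ can be made concrete. The sketch below, using Python’s built-in sqlite3 module, shows the minimum schema one might specify before claiming that data can answer a hypothetical benefits question: are our safety-critical assets being inspected promptly? The tables, columns, and sample values are invented for illustration; the point is simply that if you cannot write the schema down, the question is not yet answerable with data.

```python
# Sketch of the 'specify a schema first' test. Tables, columns, and sample
# data are hypothetical illustrations, not a recommended data model.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE asset (
    asset_id    TEXT PRIMARY KEY,
    asset_type  TEXT NOT NULL,      -- e.g., 'bridge', 'tunnel bore'
    criticality INTEGER NOT NULL    -- 1 (low) to 5 (safety-critical)
);
CREATE TABLE inspection (
    inspection_id   TEXT PRIMARY KEY,
    asset_id        TEXT NOT NULL REFERENCES asset(asset_id),
    inspected_on    DATE NOT NULL,
    condition_grade INTEGER         -- agreed scale, e.g., 1 (good) to 5 (poor)
);
""")
conn.execute("INSERT INTO asset VALUES ('TUN-001', 'tunnel bore', 5)")
conn.execute("INSERT INTO inspection VALUES ('INS-100', 'TUN-001', '2023-06-01', 2)")

# With the schema in place, the benefits question becomes an answerable
# query: when was each safety-critical asset last inspected?
query = """
SELECT a.asset_id, MAX(i.inspected_on) AS last_inspected
FROM asset a
LEFT JOIN inspection i ON i.asset_id = a.asset_id
WHERE a.criticality >= 4
GROUP BY a.asset_id;
"""
for row in conn.execute(query):
    print(row)  # ('TUN-001', '2023-06-01')
```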

2.2 What Makes Built Environment Data Different?

Our current abundance of data offers a timely opportunity to digitally transform construction and infrastructure from a lagging sector to a world leading data-enabled dynamic industry… We should think about where behaviours need to be changed and reflect on how technology could transform the way we live to enhance social equality, not simply reproduce existing discrimination. —Dr Anne Kemp OBE (Kemp, 2019)

The premise of this book is that the challenge of using data to improve our built environment is worthy of attention. This is a question that sector-agnostic data texts have not answered and cannot answer. If there were no unique challenges in applying data to the built environment, then one could simply read a few of the many astute books on data architecture, governance, strategy, and science out there and forgo this book entirely. We do not believe this to be the case. There is a lot of cross-over in terms of data domain technical knowledge: as a data professional working on the built environment you will be using many of the same tools and techniques as your peers in other industries. However, we believe that to apply data to the built environment in a meaningful way, we will benefit from first reckoning with what is and is not unique about the sector.

What are the unique challenges in applying data to the built environment? Whilst it will take this entire book to answer that question convincingly, there are a few themes that crop up throughout.

First is the connection to the real world and physical assets. Working in this sector, you are likely to be part of the cycle of collecting information on physical assets and helping to determine what course of action to take in the real world. In this we share practice with other asset-heavy sectors such as manufacturing, agriculture, and even parts of research and healthcare. What distinguishes the built environment is the scale and complexity of its assets. There is no other sector where assets cover such broad geographic areas, or subsume so many component parts, or arguably where so many lives depend directly upon their ongoing performance. This affects the types of data we use, and how we use it, in profound ways. The consequences of action, and inaction, are high. It also often increases the cost and complexity of gathering data in the first place and raises the risks involved with getting it wrong.

Second is the culture of the built environment sector. Our culture flows from our history, both good and bad, and the distinct professions and ways of working that have evolved over hundreds of years. This culture has in turn determined how the sector has, and has not, successfully adopted digital technologies. Just as importantly, the culture of the built environment changes the experience of working there. It’s worth noting here that the sector has been dominated by white privileged men (your authors included). It has not always been a welcoming sector for women, minorities, or other under-represented groups. And these problems are far from resolved. The culture can be extremely hierarchical, slow to change, and often quite risk-averse (due to the down-stream consequences of error that we highlighted above). We write more about how you can help to address these issues as a data professional at the end of the book.

Third, related to the points above, is the historical legacy of the sector. The ways of working that we have inherited from storied institutions such as RIBA, ICE, RICS, and other bodies of the great and good still inform how many people in the sector think about their skills and their career progression. They have created archetypes that many of us working in the sector wittingly or unwittingly adhere to. These ways of working have also created idiosyncratic uses of technology, as well as a long legacy of incomplete adoption of good practice digital processes and tooling from elsewhere. From a technological perspective, the built environment can feel like an island, cut off from the mainland, where the local flora and fauna have evolved in strange and surprising ways.

Fourth are the dynamics and incentives that result when the wide variety of different professional tribes in the sector meets the temporary nature of much work in the sector. As Mo Shana’a said in response to the three points above, “I think the project-based nature of the industry and the number of ‘skills/trades’ required to deliver an asset also contribute to a key challenge in the sector.4 Namely, you’ll always have a mix of organisations delivering an asset for a ‘fixed’ amount of time. This is made even more challenging because you’ll rarely have the same group of participants on two different projects (even if you have the same organisations, you’ll have different people involved). This makes it harder to apply the learnings from one job to the other.”

4 Private conversation, no reference available.

Mo’s point is most obviously true for construction projects, but it is not exclusive to construction.5 Even during the operational and maintenance stage of the asset life cycle, which is not inherently project-based, it has still proven difficult to contract in a manner that ensures that operational knowledge builds up over time and that parties’ incentives align with the interests of society.

5 As Dexter Robinson pointed out when reviewing this book, the same is true of film production, where crews come together in a short space of time and never work together again, yet they still manage to produce blockbusters regularly.

We should be aware of the unique challenges of applying data solutions to the built environment, but we shouldn’t let this blind us to opportunities to learn from other sectors. Some of the built environment’s data needs are unique, but not all of them. As one of your authors found himself writing to his colleagues about a particularly idiosyncratic product, “you know that I love construction-specific solutions, but even I know that master data management (MDM) is not a construction-specific problem.”

Data is data (or should it be ‘data are data’?). The ones and zeroes that comprise data concerning the built environment are no different to those used in any other domain (except, you know, a bit dustier). Much of the underlying infrastructure is the same, particularly now that so much of our data has migrated to global cloud computing platforms. Even most specialist construction-specific software tools reside in one of the big clouds. Nor are the volumes of data generated by the built environment particularly profound. IoT-enabled infrastructure can quietly rack up billions of rows of data; design models, point clouds, and high-definition imagery can add up to a few terabytes. However, the sector rarely comes close to the types of volumes seen in social media, streaming content, banking, or retail.6 This is not to say that there are inherently fewer data points to capture about the built environment, quite the opposite in fact. There is a small infinity of things that are knowable about the built environment and its interactions with the natural world it exists within, and the human world that uses it. However, for now, the cost of data collection, combined with the failure to make the case for costly data storage and processing, has limited the volume of data available to budding data scientists and their ilk.

6 This may soon change; at present the industries that cater to entertaining and financing society accumulate far greater data volumes at a far greater velocity than we collect for the environment that they live within. But one can imagine a world in which real-time telemetry of the built and natural environment generates far more data than other sectors.

If the difference between the built environment and other sectors isn’t the data itself, then perhaps it is the meaning we ascribe to that data. We write later in detail about the use of taxonomies and ontologies. The built environment is full of taxonomies, though we rarely label them as such. This is because the act of building relies on a degree of mental abstraction. Designers, architects, and engineers are people who can imagine physical things that do not yet exist, and how to create them. They are also people who can lay out a logical structure of activities, often consuming vast pools of resources, so as to make those imagined physical things a reality.7

7 Or at least that’s how the sector has historically worked. Proponents of Modern Methods of Construction might argue that much of the intellectual effort in construction stems from the lack of standardisation of components and the lack of automation in design.

Built environment data is sparse but dense. There isn’t always that much of it, but what does exist exudes meaning. If one were to compile a list of critical datasets for infrastructure-owning organisations, many of those datasets would only be a few kilobytes in size. However, the meaning, completeness, and accuracy of those records have a profound impact on the well-being of others. Directly, or indirectly, they could be a matter of life or death.

Many of us, particularly those of us fortunate enough to live in advanced economies, take the safety and functionality of the built environment for granted. But we shouldn’t. The Grenfell tragedy taught us that even in the rich world, pockets of neglect can exist, unseen, unquantified, and unmitigated, and that in an age of record economic inequality and austerity these pockets are most likely to disproportionately endanger disadvantaged groups (Bubola, 2022).

The first truth of the built environment is that nothing lasts forever. Every physical asset that humans create will eventually fail (assuming that we don’t demolish it first). Some fail quickly, others slowly, some predictably, and others at random. The built environment functions despite the inevitability of failure largely because we have learned to use data and information to coordinate our response to failure. And we did so long before anyone accreted terabytes of data. Instead, the safety and functionality of the built environment relied on a few small but key facts about each of our assets: location, composition, date of construction, date of last inspection, maintenance history. We wake up in the morning and expect the lights to switch on, for clean water to come out of the tap; we expect to be able to plan a safe and punctual journey to work and be able to text our friends on the way there (provided we are not driving). All these actions that we take for granted are a triumph of the built environment over failure, of order over entropy, and often thanks to small but critical amounts of data.

Some examples of ‘small but dense’ datasets vital to the built environment include:

• Asset registers (the composition, typology, and location of a portfolio of assets):
– What is this asset, and what other things is it comparable to?
– How and when was it built?
– What materials is the asset composed of?
– Who owns the asset, and who is liable if it fails?
– What is the asset used for?

• Spatial records:
– Where is the asset?
– What is it next to?
– How do I safely access it?
– Is the asset necessary for people/goods/utilities to move from one place to another?

• Inspection records:
– When was the last time someone went and looked at this asset?
– When is it next due for inspection?
– What skills/equipment/measurements do we require to assess its condition?
– Are any parts of the asset physically impossible to inspect?

• Condition assessments:
– How proximate is this asset to failure?
– What kind of failure modes are emergent on the asset?
– What is the consequence of failure?
– Does the asset have any critical components that require attention?
– At what point prior to failure should the asset be taken out of service?

• Maintenance records:
– When did this asset last receive care?
– What remedial actions have we applied to this asset?
– What is the maintenance plan for this asset?

• Risk assessments and health & safety files:
– What are the hazards associated with accessing and maintaining this asset?
– What risks does this asset pose to the public?
– Does this asset contain any hazardous materials (for example, asbestos)?
– Does the asset have any issues contributing to failure (for example, knotweed)?

• As-built information:
– What is the structure and composition of this asset?
– How is it supposed to appear and perform (for example, does it require painting)?
– Who built this asset, and what do we know about their capabilities?
– Is this asset composed of any standardised components?

• Engineering models:
– How is this asset supposed to behave?
– How does this asset respond to usage patterns (for example, loading and unloading)?

• Whole life cost estimates:
– What resources did this asset take to create, and how much will it cost to replace?

• Telemetry/monitoring:
– How is this asset performing right now?
– Are there any trends apparent in the performance of the asset over time?

• Whole life carbon estimates:
– What was the carbon cost of creating this asset?
– What impact does this asset have on the natural environment?
– What is the environmental impact of disposing of or replacing this asset?
– Can this asset be repurposed at the end of its life?
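To underline quite how ‘small but dense’ these records are, the sketch below models a minimal asset register entry as a typed record in Python: a few dozen bytes per asset, yet operationally critical. The field names are illustrative assumptions rather than a standard; a real register would align with the owner’s data model and any applicable information management standards.

```python
# A minimal sketch of a 'small but dense' asset register record.
# Field names and values are illustrative, not drawn from a real register.
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class AssetRecord:
    asset_id: str                     # unique reference, e.g., 'BRG-00417'
    asset_type: str                   # agreed taxonomy term
    latitude: float
    longitude: float
    construction_year: Optional[int]  # often unknown for older assets
    owner: str                        # who is liable if it fails?
    last_inspected: Optional[date]
    condition_grade: Optional[int]    # agreed scale, e.g., 1 (good) to 5 (poor)

# Even a national portfolio of such records is tiny in data terms
record = AssetRecord(
    asset_id="BRG-00417",
    asset_type="underbridge",
    latitude=51.5033,
    longitude=-0.1196,
    construction_year=1896,
    owner="Example Infrastructure Ltd",
    last_inspected=date(2023, 11, 2),
    condition_grade=2,
)
print(record)
```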

As the (relatively) simple questions appended to each of the datasets above imply, it’s not asking much to capture this information about the physical assets that compose our built environment at an individual asset level. It’s not impossible, right? Properly specified, and stripped of supporting images and documentation, you could fit the crucial records of an entire country’s critical national infrastructure on a USB stick. And yet many infrastructure owners have historically failed to accurately capture and maintain this fundamental record of their assets. This is an observation, not a criticism. Whilst it is easy to capture records on individual assets, it becomes far more logistically challenging to capture that level of information across a large, geographically diffuse portfolio of assets and to ensure that it remains accurate and timely. There is a wide array of challenges here, including finding the right workforce, defining consistent data capture methodologies, capturing data accurately, and then storing and interpreting it effectively.

Take the inspection of bridges and tunnels as an example. What does it take to inspect bridges and tunnels for a living? You must be:

1. Qualified to assess the condition and safety of a sizable civil structure that may be up to 200 years old and may include hidden voids, bats, drug paraphernalia, and much else.
2. Willing to spend much of your working life driving around the country and walking to inaccessible structures come rain or shine, often at night when the trains aren’t running. Maybe sometimes you get to fly a drone if you’re lucky (probably not).

There can’t be a large pool of people who meet these criteria. Nor is it necessarily possible for even the most resolute and skilled people to accurately record data in these conditions. Imagine: it’s midnight in a cold, wet, dark tunnel and you are on mobile scaffolding underneath the roof of a Victorian tunnel using your eyesight and a hammer to try to detect and accurately locate delamination of bricks, water ingress, mortar breakdown, and other issues. You’re yelling what you find down to your colleague on the ground, who is trying to quickly scribble notes onto a piece of paper which you will later type into a clunky database back at the office. There exist opportunities to improve this way of working with innovative technologies like LIDAR and high-definition scanning. However, for many of the country’s critical assets, manual data collection remains the means through which we track the condition of critical national infrastructure. Regardless of whether human beings record data directly on site, or review imagery remotely, the process depends on the time of specialist human resources rather than AI.

These challenges in inspecting assets represent a microcosm of the wider challenges discussed throughout this book. The consequences of not collecting and maintaining accurate records on our built environment are direct and indirect, visible and hidden. They include:

• Direct, visible impacts:
– Risk to the lives of members of the public: whilst premature infrastructure failures are mercifully rare, when they do happen (such as Grenfell Tower or the Hatfield Rail Crash) they can have a profound human cost.
– Risk to the lives of staff: poor records in construction contribute to the loss of life in the sector, for example by failing to identify on-site hazards or hidden services.
– Disruption to users: infrastructure failures are a dramatic waste of people’s time. One of your authors spent a year working with the train delay attribution team in the Southeast of England. That experience instilled an appreciation of the amount of chaos and consternation the failure of a single asset can cause if it happens in the wrong place at the wrong time (for example, a points failure outside of Victoria Station shortly before rush hour).

• Indirect, hidden impacts:
– Poor asset information may lead us to respond belatedly to emergent asset failures. Responding to risks of failure before they manifest as failures is almost always cheaper and safer. In December 2015, a routine inspection discovered structural failures in the steelwork of the Forth Road Bridge (a vital road link between Edinburgh and Fife). The bridge remained closed to heavy vehicles for months, causing substantial delays to freight and associated costs. However, these costs paled in comparison to the human and economic costs that would have occurred had the bridge failed under load, as cruelly evidenced by the Genoa Bridge Collapse in 2018.
– Conversely, if we do not know enough about our assets, we may be overly conservative in our maintenance activities (e.g., seeking to maintain assets when they are still in good condition), resulting in excessive costs and/or poorly prioritised work. Without high quality and timely asset data, maintenance succumbs to an adage originally applied to advertising: “half the money I spend on maintenance is wasted, the trouble is I don’t know which half.” The more we know about our assets, and the components and consumables that they use, the more carefully we can align the replacement of assets with the duration over which they are performant and safe.
– Poor planning due to poor information, resulting in redundant or sub-optimal allocation of resources, from simple examples like digging up recently laid tarmac, to more systemic problems such as making major public expenditure decisions on incomplete or misleading return on investment calculations.


Sometimes these kinds of errors happen because we simply don’t collect suitable information; other times the information exists but isn’t shared within or between organisations. When applied to complex infrastructure-owning organisations such as Network Rail or National Highways, we might see departments that are responsible for distinct parts of the asset life cycle (enhancements vs. maintenance) or different asset classes (signals vs. track) making decisions that do not make sense from a whole life cycle or systems thinking perspective. That said, organisations are beginning to recognise the limitations of this dichotomy, structuring their strategic approach around customer outcomes rather than stages in the asset life cycle.

The less we know about our assets, the more likely they are to fail unexpectedly. As discussed above, this can quickly result in substantial knock-on effects as members of the public struggle to get to work or access vital services. In infrastructure jargon, these are referred to as ‘inter-sectoral cascade failures’, where failures collide across the system of systems, like a nuclear fission reaction, to cause a complex and potentially catastrophic event.

We can measure the lost human hours, but it’s harder to calculate the impact on an economy of long-term underperformance of infrastructure. We saw during the COVID-19 pandemic that humans are generally an adaptive bunch, able to work around obstacles that—prior to their occurrence—might have seemed insurmountable. And yet we know that over the long term, at the macro level, the quality of infrastructure has a profound effect on the economic health of cities and countries. Of course, causality flows both ways: rich countries have better infrastructure, which makes them richer. This two-way causality (endogeneity) makes it difficult to prove the extent to which growth precedes infrastructure or vice versa, but a positive reinforcing feedback loop between the two certainly exists (Garsous, 2012). Unsurprisingly, data has a role in helping to answer these questions, as macroeconomists Estache and Garsous write: “because data on the physical stocks of [infrastructure], or their valuation, tends to be scarce, authors have often relied on stocks of public capital or specific sub sectors as proxies for infrastructure.” What these authors’ research demonstrates is that the type of infrastructure and the context of that infrastructure matters. The authors argue that, whilst the data remains incomplete, investment in sectors such as telecoms, transportation, sanitation, and particularly energy seems to result in more statistically significant improvements to growth, as does investment in countries that are less developed economically.

Where built environment data matters, it can really matter. One of your authors became aware of the importance of data quality early in his career as he tried to help a large infrastructure owner figure out how many tunnels it maintained. The low precision, order-of-magnitude answer was always somewhere between five hundred and seven hundred tunnel bores, comprising a relatively small asset portfolio and a trivial amount of information. However, the high accuracy answer was another matter entirely, and depended on meaningfully answering questions such as:

• What’s the difference between a tunnel and a tunnel bore? Does a twin bore tunnel count as one tunnel or two?
• When does an overbridge become a tunnel? If I build a wide bridge, cover it with soil, and run trains under it, have I in fact built a tunnel?
• Similarly, if I have a cut-and-cover tunnel that has occasional short gaps which are open to the elements, does every gap effectively split the tunnel in two?
• What about tunnels that branch in the middle?
• What about tunnels that have stations in the middle: is that one tunnel or two?
• Does an unused and sealed tunnel bore still count as a tunnel? Does it matter if parallel bores are still in use? What about narrower bores for maintenance vehicles or access?
• How does one account for tunnels that you share with other infrastructure owners?

At times it was possible to enjoy the absurdity of our task. Weren’t these all just arbitrary decisions? Was there an objectively correct answer to these questions, or was it more important to decide and stick with it? But then parts of tunnels started failing, in ways that could potentially have hurt lots of people. And it became clear that the categorisation of these assets would determine their maintenance policy, who was responsible for keeping them safe, and what kind of level of investment they would attract. Indeed, in a tautological sense, the assets only became assets once captured and recorded. Any tunnel-shaped objects that existed in the real world but were unrecorded in the database assumed a liminal state. The arbitrariness of the decisions belied the importance of consistency of definitions and the assurance of data quality.

What makes the application of data to the built environment meaningfully different is the impact that it can, in turn, have on wider society, the quality of life that people experience, and the safe enjoyment of public services that we all take for granted. So many of the problems that really bedevil the built environment involve ensuring the accuracy of tiny amounts of critically important data. It is our role as data professionals working on the built environment to help ensure that the data held on our assets reflects the reality on the ground. There are many reasons why data and reality may drift out of sync, and common causes of inaccuracy include:

• Measurement errors: where we simply don’t collect the right data, due to poor tooling, poor training, adverse conditions, or imprecise definitions concerning what should be measured (or all of the above, as per the earlier tunnel inspection example).
• Verification errors: where we may capture the correct data in the moment but enter it into our systems incorrectly. For example, one of your authors remembers working with a dataset on bridges, and wondering how there could be so many staggeringly large structures in the UK that he’d never heard of, before realising that these were predominantly instances where a millimetre measurement was input into a field that was supposed to represent metres. At least in this case it was possible to correct the data (a sketch of a simple automated check for this kind of error appears at the end of this section), but in many cases we design data collection interfaces with an incomplete or poorly defined list of options, meaning that operatives struggle to create a record of the correct values using the available list of drop-down options. Such errors are all but impossible to quickly resolve without needing to collect the same data a second time, often at considerable cost.


• Loss of historical record: where a lack of effective information/records management and/or deprecated or inaccessible file formats lose valuable information that we have gathered in the past… or we simply don’t think to look. This is a desperately self-reinforcing problem; archives only work if people trust that the archive contains valuable information. It takes resources to maintain archives, and it takes effort to search an archive. The fewer resources devoted to maintaining archives, the less ordered they will be, and the harder it will be to find what we are looking for. The harder it is to use an archive, the less likely we are to do so, and therefore the less motivation there is to maintain the archive! As with so many things in the data world, we only realise benefits when people have the foresight to create the systems and processes that will deliver the benefits. There are certainly organisations out there spending money collecting information over and over again, not realising that they could access the same information for a fraction of the cost if they merely invested in effective records management.
• Inconsistent definitions: accuracy depends on definitions, in the sense that one can only say whether something is accurate if one has meaningful definitions to evaluate that accuracy against. Where organisations work with inconsistent definitions, loss of accuracy is inevitable. We can only measure cracks in the soffits of tunnels if we agree on (a) what cracks, soffits, and tunnels are, and (b) what the ‘measurement’ of a crack means (is it width, length, depth, area, all of the above?).

Not all physical assets are of equal importance. Typically, greater data accuracy, higher levels of precision, better verification, longer retention periods, better structured archives, and better data definitions all incur greater costs.8 For less critical assets a degree of inaccuracy may be a cost-effective choice; one may not, for example, need to know the location of streetlights or telecoms base stations to the nearest millimetre. For more critical assets, for example the placement of high voltage cabling or gas mains, there may be both a financial and a safety case for measuring with higher accuracy. In some cases, the need for accuracy may be the consequence of a particular failure mode. For example, we may initially only care to note that concrete is not suffering from spalling,9 but once spalling begins we may wish to know the exact measures so that we can track its progress over time and give ourselves sufficient warning to prevent catastrophic failure.

8 XKCD #2696 has a characteristically fun explanation of precision vs. accuracy.
9 The fracturing of the surface of concrete. In the case of reinforced concrete, spalling may indicate corrosion of the reinforcement itself, resulting in lengthy and expensive repairs (for example, in the case of the Hammersmith Flyover; BBC News, 2012).

Data on the built environment matters because the built environment matters, because the safety and performance of the built environment has an immediate and lasting effect on the wellbeing of everyone. Built environment data isn’t different from a technological perspective; one doesn’t need a special type of hard drive or database to store it. It is the content, the context, the use, and the impact that make built environment data meaningful and worthy of an entire book devoted to its use and its idiosyncrasies. Built environment professions use data to shape our lives, and they did so long before anyone was using the term data. The rest of this chapter digs into some of the ways that built environment data shapes our world and will continue to do so.

This relates to a point we will make throughout this book: there is a difference between the management of data and the management of information. Data is agnostic from a technical management and governance perspective; a data professional can jump across into unconnected industries easily. However, the information management professional is specific to a sector, because information management relates to the transactions within an industrial sector. The complication here is the adjacency of manufacturing and construction: when aligned on an outcome or a complementary digital capability, information management can traverse sectors. In other words, the difference of built environment data has more to do with information management than data management.
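Returning to the verification-error example above (millimetre values entered into a field that expects metres), many such errors can be caught by a simple plausibility check at the point of data entry. The sketch below is illustrative only: the thresholds are assumptions chosen for the example, not engineering guidance, and a real implementation would draw its rules from the asset owner’s data quality standards.

```python
# Sketch of a plausibility check against unit-entry errors, such as a bridge
# span recorded in millimetres in a metres field. Thresholds are assumed.

PLAUSIBLE_SPAN_M = (1.0, 2_000.0)  # assumed plausible bridge span in metres

def check_span_metres(value: float) -> str:
    low, high = PLAUSIBLE_SPAN_M
    if low <= value <= high:
        return "ok"
    if low <= value / 1000 <= high:
        # Implausible as metres, but plausible if it was keyed in millimetres
        return "suspected millimetre entry: flag for verification"
    return "implausible value: reject and re-measure"

for span in (24.5, 24_500, 0.02):
    print(span, "->", check_span_metres(span))
```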

2.3 Designing Equity: Planning, Cartography, Geography… and Data The built environment is the sum of its component parts, the buildings, utilities, services, and infrastructure that sustain society. However, the layout of these component parts also matters as well. To speak only of the components themselves is to overlook the enormous role that the planning of the built environment plays in its function. The shape of the built environment creates the boundaries between the anthropogenic and the natural. There is a beautiful interplay between the natural topography of our world, the paths of rivers, the presence of natural resources, and the slow organic accretion of human activity over time, and the conscious decisions of city planners. The current form of any great city is a fascinating blend of natural resources (e.g., rivers, harbours, plains), human history and politics (e.g., churches, palaces, seats of democracy), happenstance, and the designs and innovations of successive generations of planners and builders. In this way the layout of the built environment has profound impact on the wellbeing and lifestyles of those who live, work, and play within it, who in turn influence how we reshape and reinvent the built environment. The previous section described how capturing data on the composition and condition of the components of the built environment allows us to better design new assets, limit the risk of failure of existing assets, and ensure that they can perform the functions we require of them. The role of data in planning concerns the spatial layout of assets, and their interactions with demographic and sociological measures. The data used in planning can be descriptive (e.g., what, where, distance, time, cost) and normative (e.g., sustainability, aesthetics, equality, liveability, social justice). We witness both noise and bias in decision making in the built environment. The economist Daniel Kahneman helps us to distinguish between noise and bias in our decision-making. Bias refers to predictable deviations from the truth, whereas noise is an unpredictable deviation. If you fire an arrow with bias, then you will


As we have seen, our business cases for investment in social and economic infrastructure feature both noise and bias. This might mean biases (conscious and unconscious) in terms of the social economics of architectural decisions, or the noise in our inconsistent implementation of processes such as planning approval systems. Acts of planning include the layout and size of streets, zoning, transport links, gentrification, and the presence of bike lanes. Too often these acts perpetuate inequalities in society, from the obscene (e.g., driving highways through majority non-white neighbourhoods, perpetuating racism through restrictive covenants) to the subtle (prioritising the needs of car owners, clustering polluting activities in deprived areas, nimbyism). From an ethical perspective these outcomes are deeply troubling. From a data perspective, it is also a case of the measures of success (the KPIs, if you will) favoured by a few driving the outcomes experienced by many. Per the quote attributed to H. Thomas Johnson, "What you measure is all you'll get. What you don't (or can't) measure is lost."10 In planning, as in all uses of data, the data itself is neutral; it is the interpretation of the data that results in subjective outcomes. There is also, inevitably, a role for aesthetic judgement, data be damned! If the aim of your planning decisions is to maximise house prices, minimise journey times for private vehicles, and (consciously or unconsciously) minimise diversity, then your use of data and aesthetic decisions will result in an outcome not dissimilar to the paradigm of the twentieth-century car-centric commuter suburb. Conversely, if the data points that you calibrate development against include sustainability, walkability, demographic diversity, mixed use, and (arguably) liveability, then you may end up with spaces that adhere more closely to New Urbanist designs. As well as deliberate decisions, the history of planning also includes many unintended consequences, where the prevailing theory didn't align with observed reality. Grand theories, like Le Corbusier and his followers' plans to flatten the great cities of Europe and replace them with row upon row of concrete blocks set in parkland, clashed with the complex realities of human preferences, culture, and the interplay of deprivation and gentrification. Modernist developments mere miles apart could have vastly different outcomes that had precious little to do with their design. Think, for example, of the contrasting fates of the Barbican Estate (celebrated, desirable, expensive) and the Heygate Estate in Elephant and Castle (deprived, dangerous, demolished). These two projects are less than two miles apart in central London but resulted in radically different lived experiences for those who dwelt within and around them. Planning processes reflect the measures that we choose. If we prioritise the clean lines and elegant high-density buildings connected by common spaces that characterise brutalism, then we inadvertently create a lived environment that only works under specific circumstances. The preponderance of common space underneath and around the high-rise buildings requires greater maintenance to create attractive spaces for the community to use. Spaces that go unused by residents, for example due to broken amenities, are less likely to be cared for by those same residents.

10 We have been unable to find a source for this quote, which is widely attributed to Johnson.


This in turn breeds further neglect and opens the door to less salubrious uses. In the case of the Barbican, which sits next to London's financial centre and hosts a major cultural institution and a school, this is no issue. Locals and visitors alike flock to the gardens, water features, and maze-like raised passageways that look as good as the day they were built. Transfer this same burden of care to areas with lower income levels and stretched maintenance budgets, however, and ostensibly similar communal spaces can become neglected at best, and dangerous at worst. Those same walkways that in the Barbican beckon you to explore and linger become, in Heygate, grim corridors where one is both isolated and vulnerable to opportunistic crime. It follows that, from a data perspective, Le Corbusier's brutalist design philosophy is neither right nor wrong; rather, the degree to which it succeeds depends on a larger number of data points than the planners, architects, and designers typically considered. A school of thought that is rooted in housing density, ratios of greenspace to residential space, and of course upfront building speed and costs is blind to other measures. The additional contextual measures that had a profound impact on the outcome of development included demographics, crime rates, and the availability of maintenance funding. The same blind spots exist in contemporary trends, for example in the Smart Cities movement. Some in our sector want to use technology to improve the efficiency of urban environments through Smart City technologies such as monitoring and centralised urban control rooms. However, have we defined and measured what people who live in urban environments value, and what the unintended consequences of an intelligent, optimised city might be for its residents? As Adam Greenfield writes in Against the Smart City, "The authorship of an algorithm intended to guide the distribution of civic resources is itself an inherently political act." Concerns about data privacy, data security, and indeed human rights are valid and important when discussing smart places. Indeed, at the time of writing there appears to be an inverse correlation between the level of mass data collection in urban environments and the freedom enjoyed by their citizens. We will never be short of built environment developments that favour form over function, or architectural statements over measures that reflect human needs. This is Ozymandian folly. There is no question that architecture influences society, but at the scale of a city it is subservient to culture, to history, to happenstance. One cannot merely build a 200 km long, 500 m high vanity project in the desert and expect anything like sustainable urbanity to fill such a void. The visual language of planning is geospatial representation: map making. A map is not merely a picture of reality, it is a visualisation of data. Just like a chart, graph, or model, it is the result of conscious decisions on the part of the map maker: decisions about what to show, what not to show, and how to present that information in a spatial context. As Kari S. McLeod argues, "maps are not just analytical tools… they are visual arguments that contain and convey political statements, meaning, and power" (McLeod, 2000). Crude examples of maps as political statements stem from territorial disputes; the colouring and delineation of Taiwan, Crimea, Kashmir, Sudan, even Catalonia, is as much propaganda as it is data point. What is more interesting, at least from the perspective of this book, is where the data presented by maps relates to the outcomes that we seek from the built environment.
If the data that we collect on the built environment reflects the outcomes that we are seeking (e.g., car journey times vs. public transport availability), then maps are a powerful means to make the case for those outcomes. The origin story of maps as data visualisations in support of societal outcomes invariably begins in Victorian London with John Snow and Charles Booth. To the modern observer, Snow's 1854 map of the cholera outbreak on Broad Street in Soho is both crude and unconvincing. It represents the density of cholera cases by stacking bars on top of properties in the map, and then identifies a water pump that, visually at least, sits (more or less) at the centre of the densest area. Truth be told, Snow's map doesn't exactly establish causality, and the modern data analyst would use scaled icons or a heatmap to convey the same information. Snow's map is nonetheless clearly using the overlay of structured numeric data on a map to make a case for action, in this case: take the handle off the pump! Snow's successor, Charles Booth, took Snow's basic idea and improved on it in every way. For a start, the amount of data presented on Booth's maps is greater, both geographically and in terms of population. Secondly, Booth uses colour rather than symbols to represent changes in the data (in his case, poverty levels), which makes his maps both true to the spatial nature of the problem and easier to read. Booth's maps are incredibly beautiful works of art. Finally, unlike Snow's water pump, Booth's maps do not lay out the culprit so transparently. The purpose of the maps isn't so much to provide a simple diagnosis of the cause of urban poverty as to provide overwhelming evidence of its existence, and to support a long text describing the nuances of the problem and making recommendations for its amelioration. Booth's maps use data to make a political statement, and do so in a way that is intuitive, poetic, and undeniable. What Snow's and Booth's maps both do brilliantly is simplify (or abstract) the representation of the world enough to make it easier for the viewer to focus on the data that the map contains. These maps do not depict all the intricacies of the built environment, but rather layer data upon an outline of the urban form (a figure-ground diagram). This ability to use maps (or map-like diagrams) to present a recognisable view of the built environment, but also to abstract sufficiently to increase the signal-to-noise ratio, has made geospatial representations one of the primary means whereby we guide users through the built environment. From the iconic designs of Harry Beck's London Tube Map and the Ordnance Survey Explorer and Landranger maps through to modern applications such as Citymapper or Google Maps, geospatial information has become the default means of mediating ordinary people's interactions with the built environment. Built environment organisations increasingly use maps as a user interface, and there is a rich interplay between solution vendors such as ESRI and Bentley, open data overlays such as the National Underground Asset Register (NUAR), and data visualisation tools. Projects such as Morphocode's Urban Layers visualisation also allow us to explore the history of our urban environment in the same familiar user interface. By presenting data on our built environment, each of these applications influences how we use and develop the built environment. The data remains neutral, but the outcomes do not.
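As a hint of how little is needed to reproduce Snow's basic technique today, the sketch below overlays case counts on an abstract plan. The coordinates and counts are entirely invented; a real analysis would of course use surveyed locations and a proper base map.

```python
import matplotlib.pyplot as plt

# Entirely invented data: (x, y) locations and case counts, plus a single
# pump location, all in arbitrary map units.
cases = [(1.0, 1.0, 3), (2.0, 1.5, 8), (2.2, 2.0, 12), (3.0, 2.5, 5), (1.5, 2.2, 9)]
pump = (2.1, 1.9)

fig, ax = plt.subplots()
# Scale marker area by case count: the modern equivalent of Snow's stacked bars.
ax.scatter([c[0] for c in cases], [c[1] for c in cases],
           s=[c[2] * 40 for c in cases], alpha=0.5, label="cases")
ax.scatter(*pump, marker="s", color="red", label="water pump")
ax.set_title("A Snow-style overlay of case data on an abstract map")
ax.legend()
plt.savefig("snow_sketch.png")
```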

Consider, for example, journey planner apps such as Citymapper. These have substantial capacity to influence travel patterns, both in terms of the options presented and the default options recommended. To their credit, these apps do an excellent job of presenting active and sustainable travel options (public transport, walking, cycling) alongside private car use. On the flip side, they often default to driving, and provide the user with limited data on the carbon impact of their chosen route. Similarly, the layers presented in applications used to plan development, be they natural habitats, heritage, or historical sites, will influence the factors considered during planning. There is not yet a means to automatically test the sustainability impacts (e.g., emissions, noise, visual blight, walkability, liveability) of plans in the digital world before they are enacted in the physical world, but that is exactly the sort of data-driven functionality that would influence how we develop and adapt the built environment to support future generations of its users. In the meantime, data, often presented through maps, will remain an important part of the discourse in terms of how, and why, we plan changes to the layout of our built environment.

2.4 Measuring Value Through Societal Outcomes

To reckon with the value of data to the built environment, we must first reckon with the value of the built environment itself. That value is a mix of the tangible (e.g., the cost to replace the bricks and mortar) and the intangible (e.g., the operational know-how embedded in the organisations that own and run the tangible assets, including their use of data). When National Highways set out to measure the value of their data as a financial asset, they began by assessing the economic value of the road network to the UK economy, before subdividing further into intangible assets, of which data assets were one sub-type (Laney, 2021). Part of the joy of working on built environment projects or for built environment clients is that one rediscovers the marvel of functioning infrastructure. One witnesses the commitment and expertise of the vast teams of people who have dedicated their careers to ensuring that most people can safely get on with their lives, oblivious to the huge amount of effort required to keep things just so. The past two decades have been a fascinating time to work in this sector due to the level of disruption occurring. Many organisations have faced near-perpetual change as the need for digital transformation has become increasingly apparent in the face of austerity, the pandemic, and the climate crisis. Over this same period, the wider world has relentlessly digitised: consumer technology and generative AI have become near miraculous. In turn, the gap to the legacy technology of the built environment sector has become more apparent, as has the scale of the opportunity to bridge that gap. There has been an endless amount of breathless speculation (and salesmanship) concerning the need for digital transformation. However, the thinking that has proven most vital is that which has reckoned with the specific and tangible challenges and opportunities presented by the built environment sector, namely:


1. The centrality of data as part of a functioning built environment,
2. The nature of the built environment as a system of systems connected and regulated by information, and
3. The vital importance of the built environment in meeting societal needs.

The National Infrastructure Commission's Data for the Public Good, published in 2017, underlines these points, arguing that "Data is now as much a critical component of your infrastructure as bricks and mortar. Data is part of the infrastructure and needs maintenance in the same way that physical infrastructure needs maintenance." This publication laid the groundwork for subsequent papers linking the built environment to societal outcomes and wellbeing. The Centre for Digital Built Britain's Flourishing Systems (2020) restates the importance of data as an asset to the built environment, but goes further in recognising that this is because the exchange of information is vital to coordination between the systems that compose the built environment. Flourishing Systems recognises information exchange as the how, and establishes a compelling argument for the why, beautifully stating that "The central ideas in this paper are simple and radical: that the purpose of infrastructure is human flourishing, and that infrastructure is a system of systems." The ideas behind Flourishing Systems led to the creation of the Climate Resilience Demonstrator (CReDo), which functions as a model of how interconnected infrastructure might respond to an external event such as the type of serious flood that is becoming more common as the planet warms. Both Flourishing Systems and CReDo underline that it is naive for us to think of built environment services in isolation. Our interconnected system of systems is currently characterised by low levels of redundancy, leaving us exposed to systemic failure, where the loss of one service (e.g., power, telecoms) cascades to other services. Coordinating the system of systems requires greater levels of data interoperability to continue to meet society's needs (on good days) and maintain resilience (on bad days). The Infrastructure and Projects Authority's 'Transforming Infrastructure Performance: Roadmap to 2030' (TIP Roadmap), published in 2021, builds further upon these preceding papers by describing how improvements to the way the country's infrastructure works will deliver better societal outcomes. This means that projects shouldn't merely aim to come in on time and on budget but should have broader aspirations to "drive improved outcomes for people and nature". TIP acknowledges that data has a role to play in this objective. It is both the language that we use to define and measure outcomes, and a source of insight that allows us to understand how best to meet those outcomes. Data is both an obstacle and an enabler: collecting, or searching for, data is a costly and time-consuming business that needs to be made easier. However, proper use of that data can help us to identify and expend our limited resources more effectively, increasing the extent to which our built environment assets are available, performant, reliable, resilient, and meet the needs of their users. An aspiration to improve the wellbeing of society is an important motivator for many of us working in the built environment.


It can sometimes feel naive or unrealistic after a day spent slogging through emails and meetings. However, there are genuine ways in which working to better leverage data to the benefit of the built environment can in turn contribute to better outcomes for society. The TIP Roadmap identifies five areas of focus that, in the paper's own words, "demonstrate ways in which we need to change the nature of our interventions in the built environment; delivering new economic and social infrastructure, place-based regeneration, and optimising the performance of existing assets." We data professionals in the built environment can use these focus areas as target downstream outcomes. In other words, we won't be able to deliver these outcomes on our own, but we should seek to use data to (directly or indirectly) help our organisations and clients to realise improvements in these focus areas. If this all sounds too abstract or self-aggrandising, the sections below provide tangible examples of ways in which data can help realise each of the TIP focus areas.

2.4.1 Delivering New Economic Infrastructure to Drive Improved Outcomes for People and Nature

The TIP paper argues that today, "the built environment is seen as a series of disconnected construction projects, divorced from the wider system and their explicit purpose to deliver services that improve the lives of people and the natural environment in which we all live." It suggests that we adopt a strategic view of the sector: "Government and industry would have the means to define and understand the outcomes required from new assets, and to understand the integration of and interactions between new assets and the existing system." One might describe this as bringing a portfolio management approach to the government's role in financing outcomes for the built environment. To do so successfully will require understanding the built environment as a system of systems, and construction projects as merely the first step in that system delivering value to society. Taking a broader view of how projects deliver value allows us to better think of the skills and supplier relationships that we need. It also means recognising "the role of data and digital twins in enabling us to move beyond the constraints of individuals, or single systems, to begin to understand the interactions of multiple systems." Data is vital to realising this more joined-up and holistic approach to the built environment, including:
1. Defining and implementing consistent ways of working at organisational, supply chain, and sector scale.
2. Enabling more holistic management and allocation of limited resources.
3. Planning with foresight to ensure that we have the skills, innovations, and industrial capacity required to deliver on our current and projected goals such as housing targets or net zero commitments.
4. Creating insight at a system-of-systems level, allowing us to identify bigger-picture multi-modal changes.


More than a century after the advent of the production line in manufacturing, most investment in the built environment remains comparatively ad hoc and artisanal. Despite the positive influence of international standards (described in the next chapter), practice still varies enormously between countries, asset types, organisations, and even within organisations. Many working relationships (e.g., contracts, delivery bodies) exist only for individual projects, further exacerbating the variance in practice. It's as if, instead of implementing their legendary Toyota Production System, the manufacturer built every model of car in a different location, using a different supply chain, and asked each factory to come up with its own way of working, only to move production to another factory every few years regardless of performance. There are forces in the sector that help to create greater consistency, including:

• Standards (e.g., BS, EN, ISO, internal, informal, etc.).
• Forms of contract.
• Training/accreditation.
• Collaboration and co-creation.
• Knowledge sharing.

And then there are the forces that dispel that consistency, such as:

• Short-term projects.
• Balkanisation of clients and suppliers.
• 'Not built here' syndrome.
• The time lag between practice emerging and becoming codified in the standards and other processes described above.
• The inconsistency between what we profess that we should do, and what we write into contracts.

As data professionals we have a role to play in encouraging the former and discouraging the latter. In practice this means providing organisations with the means to capture and use knowledge from across projects and to build up an institutional memory. At a foundational level this means effective use of document management systems and making it easier for people to securely share information. For more mature clients this might mean making information easier to find through enterprise search, and digesting unstructured information into insight using techniques such as Optical Character Recognition (OCR) and Natural Language Processing (NLP). Data also has a role to play in measuring and enforcing the adoption of consistent standards and ways of working, providing 'management information' that allows leadership to identify areas of non-compliance. No organisation has limitless resources, and most built environment organisations and projects face very real resource constraints. This might mean working within a particular funding limit set by a regulator or client, and/or shortages of specialist resources, and/or the need to work to milestone dates. Data is the language whereby we track how we are performing against these finite resources, by collecting and analysing records of our financial transactions, project costs, scheduled tasks, human capital, inventory, risks, incidents, and other factors. Merely mastering, aggregating, and visualising this data can by itself provide decision-makers with greater clarity on their performance and remove conflicting versions of the truth.
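As a toy illustration of the enterprise search idea mentioned above, the sketch below builds a minimal inverted index over a handful of invented document snippets. A real deployment would sit on a search platform and an OCR/NLP pipeline rather than a Python dictionary.

```python
from collections import defaultdict

# Invented snippets standing in for text extracted (e.g., via OCR) from
# project documents scattered across a document management system.
documents = {
    "site-diary-001": "concrete pour delayed by rain at north abutment",
    "rfi-047": "clarification on rebar spacing for north abutment",
    "handover-note": "as-built drawings issued for bridge deck",
}

# Build an inverted index: each word maps to the documents containing it.
index: dict[str, set[str]] = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.lower().split():
        index[word].add(doc_id)

def search(query: str) -> set[str]:
    """Return the documents containing every word in the query."""
    results = [index.get(w, set()) for w in query.lower().split()]
    return set.intersection(*results) if results else set()

print(search("north abutment"))  # {'site-diary-001', 'rfi-047'}
```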


To this end, TIP, among other publications, recommends the creation of a National Metrics Library for the Built Environment (Dodd et al., 2023). Integrating these different data sets to understand their relationships (for example, the impact of safety on programme delays or cost) can then provide those decision-makers with the analytical insight required to act with confidence. Data has a vital role in helping to define and measure value. The discourse around investment in public infrastructure projects usually centres on tangible benefits, such as the reduction in journey times associated with HS2 and Crossrail, or the reduction in the likelihood of events such as catastrophic floods. The benefits of buildings are in most cases more localised, and sometimes harder to quantify. In the private sector, measures such as occupancy and profit may take priority. In the public sector, we may need to capture data on more subjective factors such as cultural heritage, as well as measurable goods such as safety and accessibility. We can use data to ensure that organisations define and measure values in a consistent manner. Our role is not merely to keep score, but to influence the priorities of our organisations and to build the capability required to help them proactively improve their performance using the types of solutions, applications, and methods described later in this book. As an example, delivering credible digital twins is not merely a matter of deploying sophisticated visualisations and models, but rather a question of what we need to measure and control to efficiently deliver the construction and operation of assets. A construction digital twin will be useful to the extent that it allows us to better model and plan how we deliver the works. An operational digital twin will be useful to the extent that it maintains the 'as-built' information model of the asset, updates that information with (near) real-time IoT data, and uses it to aid in the function of the asset. Moreover, there must be traceability between the investment in digital twins and the value that they generate. In this way, whether we are deploying radical new technologies like digital twins or more established data solutions, we are only as good as the measurable positive influence we have upon the performance of physical assets.
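As a toy version of the dataset integration mentioned above (safety versus programme delay), the sketch below joins two invented tables in pandas; in practice the joins would span finance, scheduling, and safety systems rather than toy data.

```python
import pandas as pd

# Invented records from two separate systems, joined on a shared project ID.
incidents = pd.DataFrame({
    "project_id": ["P1", "P1", "P2", "P3"],
    "safety_incidents": [3, 2, 0, 5],
})
schedule = pd.DataFrame({
    "project_id": ["P1", "P2", "P3"],
    "delay_days": [14, 2, 30],
})

# Aggregate incidents per project, then join onto the schedule data.
per_project = incidents.groupby("project_id", as_index=False).sum()
merged = per_project.merge(schedule, on="project_id")

# A first, crude look at whether safety performance tracks programme delay.
print(merged)
print("correlation:", merged["safety_incidents"].corr(merged["delay_days"]))
```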

2.4.2 Place-Based Regeneration and Delivery

'Strange' cities exist, and for a while even grow quite large, but they are essentially dead ends in urban evolution. (Hillier, Space is the Machine, 2007)

TIP argues that "our interventions in the built environment do not sufficiently recognise the importance of local context", resulting in "ineffective 'one size fits all' approaches."11 Instead, the paper imagines a "holistic and data-enabled understanding of place and space, bringing together robust and accessible data on the local population, economy and environment, with insights and priorities from local stakeholders, to create the foundation for better decision-making and more impactful interventions." There is an interplay between the functions that our built environment performs and the space that it occupies. As the Space Syntax team at UCL, led by Bill Hillier, describes the relationship, "space is internalised into socio-economic processes through which the built environment is created" (Hillier, 2023). No project exists in isolation, nor are projects built in empty space. Take the restoration of an asset like the Palace of Westminster (more commonly known as the Houses of Parliament).12 Whilst it is doubtful that it is the kind of project that the IPA had in mind when writing about the importance of local context and delivering in partnership with communities, the restoration programme beautifully reflects the challenge of tailoring construction interventions to the needs of an idiosyncratic local community. The Palace of Westminster is more than just a building. Sure, it is part of the wider London cityscape, positioned where it is by a confluence of history and geography, and in turn influencing the layout of the city around it. Notably, the Victoria Embankment, Bazalgette's great artery of transport (road, rail), services, and sewage, stops abruptly at the foot of the Elizabeth Tower (aka Big Ben). Those of us who have spent time in London are accustomed to entire blocks of the city disappearing overnight, replaced by hoardings and cranes, with the Grade II listed Victorian frontage still there like a thin skin around the site, held up by steel beams. But such a brazen development would be unthinkable in this case. The Palace is, along with Westminster Abbey, part of a UNESCO World Heritage Site. It is also part of the country's cultural heritage: filled to the brim with heritage artefacts and history, it is the stage on which we have grown accustomed to seeing the dramas of the day play out. The Palace is a place of work and even residence. It is also part of the natural environment, sitting next to a major waterway on what was once marshland, playing host to nesting birds, and consuming electricity, water, and other scarce resources. The 'local community' in this context are specifically the people who work in and around the Palace, including the members of the House of Commons and House of Lords, as well as the hundreds of visitors that tour the site each day, and the many more who use the Palace (and particularly the Elizabeth Tower) as the backdrop for their holiday selfies. And then there is the more diffuse community of everyone affected by the works, be they residents of Westminster, people who work or travel around the Palace, and of course the British taxpayers who will fund any works. The tension between the needs of the Palace's (relatively) small local community and the much larger number of interested parties is at the heart of determining the correct course of action for what will invariably prove a long and expensive task. As the IPA writes, "we can collectively invest too little in places because we lack the analytical and policy tools to understand them properly, whilst disproportionate investment can exacerbate regional inequalities."

11 Superficially, this focus area may seem to contradict the previous point: how can you complain about a lack of consistency in infrastructure projects, and then claim that they are all too similar? It is worth recognising here that the IPA are making a distinction between the what (built environment investments that are sympathetic to local idiosyncrasies) and the how (consistent, efficient, and data-driven ways of working across the industry).
12 One of your authors is currently working on the Houses of Parliament Restoration and Renewal Programme. We have avoided talking about that project too much in this book because, well, that is a book unto itself.
In this context, the Palace suffers from perennial underinvestment as an ageing asset with a large backlog of maintenance work. But it also risks exacerbating regional inequalities, as there is no getting around the fact that this is an asset at the very heart of the country's rich capital city. Balancing the needs and wants of such a diverse (and vocal) group of stakeholders is a task that requires planning and coordination (and thus, data). One unexpected feature of data use by construction projects and built environment clients is that, despite not having what one would traditionally think of as customers (e.g., members of the public buying goods and services), most use Customer Relationship Management (CRM) systems such as Microsoft Dynamics or Salesforce. There is a knack to configuring these types of systems (and their data structures) to work in the context of the built environment, given that their principal use is chasing leads and managing transactions for private companies. However, done well, a CRM can prove invaluable in tracking and documenting interactions with stakeholders. These systems help built environment organisations to respond to complaints, enquiries, and requests for information, and ensure that management and operational staff alike can access the information that they need without needless demands on their time. Data can both establish the context of projects and help us to negotiate the local stakeholder environment. Whilst some built environment projects might seek to create new landmarks in an area (e.g., a new skyscraper or bridge), most will seek to maintain some sense of congruency with the local environment. The use of geospatial and design models can help us to design and build assets that minimise their impact on their environment and fit within their urban and natural context. Similarly, using data to track permissions (consents), stakeholder engagement, and externalities such as noise pollution can help to ensure that a project is meeting its obligations to the local community. Much work is necessary to increase the quality and accessibility of local data. Whilst individual projects often pay to create detailed and accurate datasets of their area of interest, and organisations such as Ordnance Survey sell the base geographic layer, local datasets remain disjointed and inaccessible. As the IPA writes, "data across local planning authorities, government agencies and infrastructure providers is in different formats, often inaccessible, and regularly held in documents rather than stored as data." To create and share the datasets that can better reflect local context requires effective, affordable, and consistent methods in foundational data governance, data improvement, and data engineering (plumbing). Much of this data will prove useful both during the project and after its completion, as the built and natural environments and the local context continue to change within and around the asset.
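The sketch below shows, under our own invented field names, the kind of minimal stakeholder-interaction record one might configure a CRM to hold; real systems like Dynamics or Salesforce would express this through their own entity models.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class StakeholderInteraction:
    """A single logged contact with a stakeholder, CRM-style."""
    stakeholder: str
    channel: str   # e.g., "email", "public meeting", "letter"
    topic: str     # e.g., "noise complaint", "consent enquiry"
    logged_on: date
    resolved: bool = False
    notes: list[str] = field(default_factory=list)

log = [
    StakeholderInteraction("Local resident", "email", "noise complaint", date(2023, 5, 2)),
    StakeholderInteraction("Parish council", "public meeting", "consent enquiry", date(2023, 5, 9), resolved=True),
]

# A simple operational question a CRM answers: what remains open?
open_items = [i for i in log if not i.resolved]
print(f"{len(open_items)} unresolved stakeholder interaction(s)")
```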


2.4.3 Addressing the Need for Infrastructure Using a Platform Approach

The IPA believes that "missed opportunities to leverage scale through the use of standardised, interoperable elements leads to inefficiency in design, production, logistics and assembly," and targets "an increasing use of platform approaches in construction" under the banner of Modern Methods of Construction (MMC). The word 'platform' here is an expansive one, describing not merely a software platform but a novel way of working. TIP describes a shift towards off-site manufacturing of the built environment, one that makes current construction approaches look artisanal by comparison. To quote the paper at length:

The assembly of the asset would then take on aspects of a factory, employing sophisticated digital workflows and increased use of innovative machinery and robotics. Digital models and twins would enable the tracing of data and digital assurance throughout the process, with feedback on performance in use providing insight back to the digital marketplace.

This platform-based approach does not necessarily mean moving to a 'cookie cutter' catalogue of standard assets, although the return on investment will benefit from standardisation wherever possible. Considering the previous point about reflecting local needs, merely building the same standard assets everywhere would be a step backwards. Instead, a useful metaphor might be that the platform approach is akin to swapping a child's playdough for Lego. In both instances, they will be able to create a near-infinite array of shapes and designs, but with Lego they are doing so using standardised components manufactured 'off-site' at scale. Some works will be unavoidably bespoke. But this does not mean that you can't use accurate design models to speed the work. The more you can use data to accurately communicate your requirements to your supply chain, the more cost-effective the work will be, regardless of the materials used. As Dexter Robinson points out, this is less about standardising components—bricks, plasterboard, and beams already come in standard proportions—than about pre-configuring components to fit together (for example, ensuring that components are cut to size and fit together before reaching site). Data is crucial to creating and maintaining component libraries, integrating those components into designs, the logistics of off-site manufacturing, their deployment on-site using robotics, and the 'digital assurance' of the works. It will require data professionals within the built environment to add yet another string to their bow, adopting practices from the manufacturing sector in terms of how they manage product catalogues, inventories, and logistics, and integrate the associated systems. We have already discussed in this book that the built environment has a long historical legacy that still impacts our work today. As Dan Rossiter points out, Modern Methods of Construction restore and extend a standardisation philosophy which once arose from necessity, citing the practice of aligning building dimensions to the size of bricks to avoid the costly practice of cutting partial bricks during construction. Modern methods of construction and data are mutually reinforcing.


This change—from artisanal construction to manufacturing—will require a step change in the definition, consistency, and availability of data across the sector. Data enables MMC by providing a means of capturing accurate models of component libraries, and of the assets to which those components belong, and will require us to expand on the work of data standards such as Uniclass and COBie.
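As a toy sketch of the component-library idea, the snippet below catalogues invented standard components and checks whether a design request can be assembled from them exactly; real libraries would build on classification standards such as Uniclass rather than a Python dictionary.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Component:
    """A standard, off-site manufactured component in the catalogue."""
    code: str
    name: str
    width_mm: int
    height_mm: int

CATALOGUE = {
    "WP-2400": Component("WP-2400", "wall panel", 1200, 2400),
    "WP-2700": Component("WP-2700", "wall panel", 1200, 2700),
    "DR-0900": Component("DR-0900", "door set", 900, 2100),
}

def panels_for_wall(wall_length_mm: int, panel_code: str) -> Optional[int]:
    """Return how many catalogue panels tile a wall exactly, else None."""
    panel = CATALOGUE[panel_code]
    if wall_length_mm % panel.width_mm == 0:
        return wall_length_mm // panel.width_mm
    return None  # the design doesn't fit the standard component

# A 6 m wall tiles exactly with 1200 mm panels; a 6.1 m wall does not.
print(panels_for_wall(6000, "WP-2400"))  # 5
print(panels_for_wall(6100, "WP-2400"))  # None
```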

2.4.4 Retrofitting Existing Buildings to Achieve Net Zero Greenhouse Gas Emissions

The most carbon-intensive part of a building's life cycle—on an emissions-per-year basis—is construction, accounting for around a quarter of the emissions associated with an average building. As such, the greenest option is usually to repurpose an existing asset rather than to build something anew. The authors of TIP understand the pressing need to make better use of our existing building stock, writing:

Buildings account for almost one-third of the UK's total greenhouse gas (GHG) emissions with homes alone responsible for 22%, mainly through heating. Most of the buildings that we will use in 2050 already exist, and most waste heat and money.

Many of the measures required to increase the thermal efficiency of existing buildings are humdrum and common sense: wall and loft insulation, draught excluders, and more thermally efficient windows. There isn't a need for substantial data analysis to meet these building fabric needs, aside perhaps from helping to plan and prioritise their rollout (i.e., a fabric-first approach). However, as we will see later in this book when we discuss the opportunity of IoT and Smart Buildings, data has a significant role to play in helping buildings act more effectively as systems that efficiently meet user needs. Simple examples include using sensors to switch off lighting, HVAC, lifts, escalators, and other services to areas of a building that are not being used (and to quickly switch them back on when demand returns). Your author, long fascinated by skyscrapers, often wondered why, even in the middle of the night, there are inevitably entire floors with their lights on in most large office buildings. Greater responsiveness in how we control and adapt our building systems will be doubly important as climate change leads to extremes in temperature, leading buildings in areas that have typically only required winter heating to also require summer cooling. Without buildings that respond to weather and usage patterns we will continue to overheat in winter and, increasingly, over-cool for the rest of the year. Data is also vital for accurately tracking the carbon footprint of our building stock, defining how to consistently measure performance, and helping us to "move to actual, rather than modelled, performance of buildings through integration with smart metering and other data." As we have seen with technologies such as blockchain and digital currencies, data can help us assign value to immaterial assets. Part of what led people to invest their wealth in digital currencies such as Bitcoin or Ethereum was their apparent persistence and immutability in spite of their immateriality. Carbon calculations and carbon offset credits suffer from a lack of persistence and immutability; their calculations at this stage appear imprecise and subjective.
This lack of compelling data is beginning to undermine the market for carbon offsets (Grimalda, 2023). The apparent benefits of the activities linked to carbon credits may not materialise in the long run (or indeed even in the short run), leaving purchasers wondering what they have actually bought (Chapman & Masie, 2023). This problem will only compound as we start to incentivise people not to take polluting actions, thus trying to assign value to the absence of an activity. Data has a role to play in making our digital representation of carbon as persistent and immutable as our digital currencies. Sustainability must be an overarching concern of construction projects as well as the broader sector. We can no longer consider projects successful if they are not congruent with realising a net zero economy, maintaining biodiversity, protecting from natural disaster, and avoiding harm to disadvantaged groups. This means accounting for embodied carbon and operational carbon emissions, and asking whether there are means to realise the same outcome through reusing existing assets. It also means understanding and using data to model the social and environmental context that individual projects exist within, and measuring their impact. We can no longer consider a built environment organisation's data strategy or investment in data a success if it does not help that organisation to reduce its carbon emissions and negative externalities. Sustainability is the overriding concern of our age, and we address the wider role that data and data professionals must play in fighting the climate crisis later in this book.
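To make the retrofit-versus-new-build logic of this section concrete, here is a back-of-the-envelope comparison; every number in it is an invented illustration, not a benchmark.

```python
def whole_life_carbon(embodied_t: float, annual_operational_t: float, years: int) -> float:
    """Whole-life carbon (tonnes CO2e) = embodied + operational over the period."""
    return embodied_t + annual_operational_t * years

# Invented figures for a 30-year appraisal period.
new_build = whole_life_carbon(embodied_t=1500, annual_operational_t=40, years=30)
retrofit = whole_life_carbon(embodied_t=300, annual_operational_t=55, years=30)

print(f"new build: {new_build:.0f} tCO2e, retrofit: {retrofit:.0f} tCO2e")
# Even with worse operational performance, the retrofit wins here because
# it avoids most of the embodied carbon of construction.
```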

2.4.5 Optimising the Environmental Performance of Our Existing Built Environment

The opening statement from this section of TIP gets to the nub of the problem: "given finite resources, adding to the built environment cannot be our primary means of improving the outcomes we derive from it." This philosophy of making the minimum level of changes to the built environment needed to realise societal outcomes is one that, broadly, escapes the bulk of our sector. Working with assets, it is easy to attach one's ego to doing great (read: big) things and managing great (read: big) teams. Our heroes, such as Brunel and Bazalgette, tend to be those who did just that, and the modern world is full of vanity projects championed by politicians, billionaires, and autocrats. The built environment is resource intensive, particularly during its construction. Data must help us to use what resources we have at our disposal in a more optimal manner. Data promises to allow us to realise better outcomes whilst consuming fewer raw materials, emitting less carbon, and protecting green space. Realising a circular economy is in part a data problem: one of identifying sources of supply and demand for the re-use of resources and creating a marketplace for this exchange, as the sketch below illustrates. Sustainability and parsimony should form a large part of the benefits case or return on investment associated with many data projects, and we should formulate our benefits frameworks with these outcomes in mind.
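A minimal sketch of that matching problem, with invented materials and quantities; a real marketplace would add condition, certification, logistics, and pricing data.

```python
# A toy matching of material supply (from demolition/strip-out) to demand
# (from new designs): the circular-economy marketplace reduced to a loop.
supply = [
    {"material": "steel beam 6m", "qty": 40, "source": "office strip-out"},
    {"material": "brick", "qty": 12000, "source": "warehouse demolition"},
]
demand = [
    {"material": "steel beam 6m", "qty": 25, "project": "school extension"},
    {"material": "brick", "qty": 8000, "project": "housing infill"},
    {"material": "glulam beam", "qty": 10, "project": "library roof"},
]

for need in demand:
    offers = [s for s in supply
              if s["material"] == need["material"] and s["qty"] >= need["qty"]]
    if offers:
        print(f"{need['project']}: source {need['qty']} x {need['material']} "
              f"from {offers[0]['source']}")
    else:
        print(f"{need['project']}: no re-use match for {need['material']}, procure new")
```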

We need to ensure that our organisations can define, capture, and analyse the data required to make an appraisal of our work that reflects not just cost and time, but also carbon, biodiversity, and societal impacts. Collectively we can create accurate and consistent measures of embedded carbon, habitat loss, and other thorny problems. For too long we have swept sustainability under the carpet as 'too hard', or dismissed it as a secondary concern or the preserve of cranks. We assert that data is a valuable asset; well… nature isn't going to value itself! Data also provides a means of continuity between construction and operation. The TIP Roadmap envisions high-quality interoperable information to reduce the impact of the built environment:

Information would be shareable across sector and network boundaries and would be stored and available over the lifetime of the asset. With this understanding we would be able to make better, outcome focussed decisions about how to optimise our infrastructure across the system, and the protection and restoration of nature embedded in our decision making.

The paper clearly sets out quality data and common definitions as a precursor to effective decision-making about the use of resources, stating that "effective optimisation must be founded on quality data as the key enabler for making better decisions. We must increasingly treat data as an asset." It is now entirely feasible to use technologies such as IoT, building management systems, and even digital twins to build or retrofit buildings with instrumentation that will make them easier and cheaper to run and maintain. If we consider the life cycle data needs of an asset when we build or retrofit it, then a completed project will hand over not just an improved physical asset, but also an effective means of collecting information from the asset on an ongoing basis and using that information to inform its operation and maintenance. To do so, we as data professionals must work closely with, and understand the needs of, contractors, sub-contractors, client organisations, and specialists focused on different stages of the asset life cycle. The earlier in a project life cycle that we can agree on common information standards, systems, and ways of working, the more likely we are to succeed in an optimal transition from construction or renewal to effective operation.

2.4.6 A Purpose-Driven Sector

What Data for the Public Good, the work of the Centre for Digital Built Britain, and the Transforming Infrastructure Performance Roadmap to 2030 provide us with is a meta-narrative for a purpose-driven built environment sector with better use of data at its heart. There has always existed an abundance of prose in the sector devoted to how things should be done (and we will cover this in depth as we discuss standards) but comparatively little in terms of why our work matters. This is an important and much-needed contribution, and explains why we have devoted a large part of this chapter to describing the focus areas of TIP.
We hope that elaborating on the contribution that we as data professionals can make to each of the focus areas provides you with a sense of agency in terms of delivering change to this sector, and in turn meeting the broader needs of society and protecting the natural environment. Data is not merely a service provided to an organisation; it should be a catalyst for change. We should see ourselves as agents of change in creating a data-driven, purposeful, and sustainable sector, and nowhere is that change needed more than in the fight against the climate crisis.

2.5 Building for Extinction: Data and Survival

A system based on perpetual growth cannot function without peripheries and externalities. There must always be an extraction zone, from which materials are taken without full payment, and a disposal zone, where costs are dumped in the form of waste and pollution. (Monbiot, 2019)

We used to speak of climate change, but now we speak of a climate crisis. We used to speak of sustainability, but now we speak of survival. As the impacts of our runaway carbon emissions become more tangible and immediate every year, as the gravity of the situation sinks in, and as the window of opportunity to avoid the worst effects closes, our language (like our behaviour) must adapt. There is no shade in the shadow of carbon, and yet, to some extent, the built environment has attracted less criticism than other sectors. Frankly, that needs to change. Addressing the climate crisis without addressing the built environment is akin to fighting a fire by shutting the door and walking away. The built environment defines the Anthropocene, and so naturally it defines anthropogenic climate change. The United Nations Office for Project Services (UNOPS) estimates that infrastructure (broadly defined) accounts for 79% of all greenhouse gas emissions (UNOPS, 2021). This figure is composed of energy (37%), transport (16%), water (5%), waste management (2%), communications (2%), and buildings (17%). To a certain extent what the estimates do and do not include is academic: almost 100% of anthropogenic carbon emissions are associated with physical (induced or built) assets. One might quibble why UNOPS include energy generation in their estimates but omit manufacturing or agriculture, but the fact remains that better use of physical assets has a role to play in decarbonising each of these areas. The data is unequivocal: this is the sector to be in if you want to help address climate change. According to the International Energy Agency (IEA), "buildings and building construction sectors combined are responsible for 30% of total global final energy consumption and 27% of total energy sector emissions" (IEA, 2023a, 2023b). These totals break down further:

1. Burning fossil fuels in buildings accounts for 8% of global emissions.
2. Generating the electricity and heat used in buildings accounts for 19% of emissions.
3. Producing the cement, steel, and aluminium used in buildings accounts for another 6% of global emissions (note, the IEA does not count this towards the 27% emissions figure stated above).
4. The production of cement accounts for 7% of industrial energy use and associated carbon emissions (IEA, 2018).

These figures mean that, by the IEA's estimate, buildings and infrastructure account for 40% of global carbon emissions (Architecture 2030, 2023). Moreover, these figures do not account for the further 37% of global emissions caused by transport (i.e., road, rail, shipping, aviation), and the additional 33% of global emissions caused by industry, both of which are at least in part users of the built environment and infrastructure.13 According to McKinsey, "over the life cycle of a typical building, 76% of emissions come from operations and the remaining 24% from the processing of raw materials used for and construction of new builds. Given that 80% of the predicted building stock for 2050 exists today,14 it stands to reason that the built environment will need to decarbonize not only embodied emissions but also, more importantly, the operational emissions from the existing building stock" (McKinsey & Company, 2021, 2022). Emissions from buildings, transport, and industry all continue to grow, rebounding quickly after the lull of the pandemic (Kluger, 2021). One might argue that the drop in emissions associated with the pandemic demonstrated that, if necessary, society can continue to function with a reduced carbon footprint. The scientific consensus is that emissions need to start dropping, and drop quickly, to avoid the worst effects of the climate crisis (IPCC, 2022). The solutions required to lower the built environment's carbon and environmental impact are almost entirely a combination of changes to the built environment itself and to the data systems that control how we monitor and operate it. Data has a role to play in three different areas of decarbonising the built environment:

1. Lowering the carbon impact of construction (the single biggest emissions event in a building's life cycle),
2. Increasing the operational sustainability of buildings (the largest overall source of emissions),
3. Supporting the decarbonisation of the transport and industry that rely on built environment assets (tripling the potential decarbonising impact of the built environment).

It is worth describing in more detail what data-driven improvement in each of these areas might look like in practice.

13 Admittedly the UNOPS and IEA numbers don't appear to completely align, but however you cut it the contribution of the built environment is enormous.
14 Note: one often sees these kinds of statements. These are not based upon a large dataset of buildings and planned replacement dates, but rather on simple calculations that assume a 30-to-130-year life cycle for buildings. Obviously, reality is far more complicated than that, with a long tail of 'historic' or 'listed' buildings that we expect to maintain indefinitely.


2.5.1 Lowering the Carbon Impact of Construction

A green building is energy efficient, water efficient, has good healthy spaces, and is built responsibly on sites that are not critical natural areas. —Green Building Council of Australia (Green, 2022)

Whilst the individuals who work within the built environment sector may be climate aware, the sector itself has failed to deliver meaningful improvements to its sustainability. Research by Statista suggests that even in mature economies like the UK, construction-related emissions continue to rise (Statista, 2023a, 2023b). The UN Environment Programme and the Global Alliance for Buildings and Construction report that emissions from the building and construction sector reached an all-time high in 2022, and the sector continues to drift further from the trajectory required to decarbonise by 2050 (Global Alliance for Buildings & Construction, 2022). It feels like, as a society, we have accepted that buildings are an inherently resource-intensive pursuit, and don't feel obliged to challenge that assumption. For too long we have seen sustainability as someone else's problem: either the sole responsibility of the environmental or sustainability team, or a question that required novel technologies to address; certainly not something that data people can have an impact upon. But more recently there has been a groundswell of individuals working in and around the built environment sector who are not comfortable with accepting emissions as the cost of doing business. One of your authors' early career ambitions was to be a building services engineer dealing with the performance of built assets. The gap between the designed and the actual performance of buildings has been well known for a long time. The building services profession globally has an awful lot to answer for, but it is not purely its fault. The biggest frustration of any building services engineer is value engineering. Value engineering is a source of many woes in our sector, where the construction supply chain uses it as an excuse to reduce the performance and 'real value' that assets offer to their owners and end users. In many cases 'value engineering' is merely a euphemism for increasing profit at the expense of other considerations, including sustainability. We do have to point out that there are excellent humans in the building services industry. The level of passion and innovation is amazing; much like any profession, there are beacons of light pushing us towards a sustainable future. In general, however, the industry is failing us. It mostly fails us through the attempt to commoditise design and to use technology to deskill the building services engineering profession. An anecdote used regularly in lectures: one of your authors was tasked with calculating the heating load for a space in an airport. A sage engineer placed a sealed envelope on your author's desk and requested that it be opened once the calculation was finished. Your author proceeded to deploy a handful of digital tools to calculate the heating load and, once finished, opened the envelope to find that the engineer's estimate differed from the tools' answer by little more than a rounding error. Now, the moral of the story here could be about the wisdom and experience of engineers, and how we are losing these people too quickly to retirement.
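For readers outside building services, the sketch below shows the general shape of the calculation in question: a steady-state fabric heat loss estimate of the form Q = U × A × ΔT. The U-values, areas, and temperatures are entirely invented for illustration.

```python
# Steady-state fabric heat loss: Q = U * A * dT, summed over building elements.
# U-values (W/m²K) and areas (m²) below are invented for illustration only.
elements = [
    {"name": "external wall", "u_value": 0.35, "area_m2": 220.0},
    {"name": "glazing",       "u_value": 1.60, "area_m2": 60.0},
    {"name": "roof",          "u_value": 0.18, "area_m2": 150.0},
]

inside_c, outside_c = 21.0, -3.0  # design temperatures (°C)
delta_t = inside_c - outside_c

load_w = sum(e["u_value"] * e["area_m2"] * delta_t for e in elements)
print(f"fabric heating load: {load_w / 1000:.1f} kW")
```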


However, the actual moral of this story is that they were both wrong. We have crystallised the knowledge and wisdom of generations of engineers into software, where rules of thumb with correction factors have led to the bloating of solutions. Either way, the building will consume more energy than the designer intended, reducing its sustainability. This is a vital dynamic to understand as a data professional in our sector. We have digitalised the embedded biases and mistakes of previous generations, with business models that have not allowed for the reinvention needed to enable computers to do what they do best. Instead, they are just augmentations of the analogue world. As a data professional, part of your role is to help shift away from these analogue augmentations of technology and to build data systems. These data systems should enable design processes that can actually model the outcomes of the things we build or, even better, find a way to incentivise us not to build anything at all. In 2021 Benedict Evans, in his annual presentation on the macro trends that really matter in tech, produced an elegant slide on the relative search interest of consumers on the internet. The slide illustrated the shift in consumer search sentiment from 2004 to 2021 and tracked two trends: searches for what is cheap, and searches for what is best. His conclusion: from 2004 to around 2010, people searched the internet for what is cheap; from 2010, the sentiment shifted from what is cheapest to what is best. As built environment sector professionals, we are at around 2008. We have laid the groundwork with initiatives like the Infrastructure Client Group's Project 13, an initiative to move our industry away from transactional behaviours to enterprise behaviours. We also see progress led by Professor David Mosey of King's College London on collaborative procurement practice (Department for Levelling Up, Housing and Communities, 2022). However, our behaviours in general still have a long way to go just to get projects built on time, on budget, and to the desired quality without undue adversarial behaviour between firms. We are yet to see the maturity in the supply chain needed to tackle the issues of long-term asset performance. In other words, we struggle to get people to behave collaboratively on the here and now of projects; we can forget about any consideration of what needs addressing long after people have moved on. As a data professional, you will be building the data integrity and body of information required to shift us from a 'what is cheapest' to a 'what is best' culture. You will enable better predictability of performance. Without you this will not be possible; unfortunately, we cannot rely on the altruism of people to give up the gains of today for a benefit in the future. We will achieve this in two ways. Firstly, through data and information: the richer these two components are, the better our knowledge, and ultimately our ability to realise and deploy insight and wisdom. A great resource for structuring this data is the Energy and Carbon Reporting Framework (Centre for Digital Built Britain, 2021). Secondly, through contracts: robust data flows and management will enable contract models to hold the supply chain to account. If a designer, constructor, or manufacturer of goods and services makes a promise on performance, we can hold them to it.
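A minimal sketch of what 'holding a performance promise to account' might look like in data terms, with invented figures: compare metered consumption against the designed target and flag the breach.

```python
# Invented designed-vs-metered energy figures for one year of operation.
promised_kwh_per_m2 = 95.0   # the designer's contractual target
metered_kwh = 1_240_000.0    # from smart metering
floor_area_m2 = 11_500.0

actual_kwh_per_m2 = metered_kwh / floor_area_m2
gap_pct = 100 * (actual_kwh_per_m2 - promised_kwh_per_m2) / promised_kwh_per_m2

print(f"designed: {promised_kwh_per_m2:.1f} kWh/m², actual: {actual_kwh_per_m2:.1f} kWh/m²")
if gap_pct > 5:  # an illustrative contractual tolerance
    print(f"performance gap of {gap_pct:.0f}%: trigger the contract's remedy clause")
```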


One pertinent example of the sector's staff working proactively and collaboratively to address sustainability is Zero Construct (Zero Construct, 2023), a decentralised grassroots organisation of environmentally aware professionals working in the wider built environment sector, including architects, designers, engineers, lawyers, commercial staff, and of course data folks. The beauty of Zero is that whilst there are sustainability and environmental experts amongst the ranks, the responsibility for addressing the climate crisis is everybody's problem, not just the preserve of a select few experts. Nothing better typifies this proactive approach to fighting climate change than the Zero Next group for professionals early in their careers. Their manifesto for construction is a ten-point list that concisely spells out the radical action required to decarbonise construction (Zero Next, 2022). Other reference points that bridge sustainability and the built environment sector include publications by Jo Da Silva of Arup on infrastructure resilience, the Centre for Digital Built Britain (CDBB), particularly the Gemini Principles (Centre for Digital Built Britain, 2018), and the Cambridge Centre for Smart Infrastructure and Construction (CSIC) recommendations regarding infrastructure and the reduction of carbon emissions (EPSRC-NSF, 2022). Combining each of these sources allows us to begin to trace out the key principles at the intersection of data, the built environment, and sustainability. To make our sector truly sustainable we must:

• Capture quality data to enable accurate carbon measurement throughout the asset life cycle.
• Value the carbon impact of our work as much as, or more than, financial costs.
• Identify where we can reduce demand for infrastructure or re-use existing infrastructure before we build new assets or demolish existing assets.
• Design our assets to:
  – Function as carbon sinks rather than carbon sources.
  – Meet the ongoing needs of local communities and broader society.
  – Avoid the over-consumption of materials.
  – Use local materials wherever possible.
  – Act as a 'system of systems' that furthers the UN Sustainable Development Goals.

• Make decisions based upon the impact of our assets across their whole life cycle, including carbon, resilience, and operational performance.
• Proactively use and openly share data and analytics to set industry benchmarks, hold our organisations to account, and ensure that they abide by the principles above.

Taken together, these principles paint a picture of a built environment sector that is quite different from the short-termism and cost-focus that we have grown accustomed to. There are echoes of Transforming Infrastructure Performance here, particularly the attention to the whole life cycle footprint of assets as well as the need for community engagement.
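The first of these principles, quality data for carbon measurement across the life cycle, implies a data model in which every emission (and every saving) is recorded against a life-cycle stage and an accountable owner. The sketch below is one possible shape for such a record; the field names and stages are our own illustrative assumptions, not the schema of any published framework such as the Energy and Carbon Reporting Framework.

```python
# A minimal sketch of a whole-life carbon record for an asset. Field names
# and stages are illustrative assumptions, not a published schema.
from dataclasses import dataclass, field

@dataclass
class CarbonEntry:
    stage: str            # e.g. "product", "construction", "use", "end_of_life"
    source: str           # what produced the emission (or the saving)
    kg_co2e: float        # positive = emission, negative = saving/sink
    owner: str            # who is accountable for this entry

@dataclass
class AssetCarbonRecord:
    asset_id: str
    entries: list[CarbonEntry] = field(default_factory=list)

    def total(self, stage: str | None = None) -> float:
        """Whole-life total, optionally filtered to one life-cycle stage."""
        return sum(e.kg_co2e for e in self.entries
                   if stage is None or e.stage == stage)

record = AssetCarbonRecord("bridge-042")
record.entries += [
    CarbonEntry("product", "structural steel", 180_000, "fabricator"),
    CarbonEntry("construction", "site plant diesel", 25_000, "contractor"),
    CarbonEntry("use", "lighting energy (annualised)", 4_000, "operator"),
    CarbonEntry("use", "verge planting (sequestration)", -1_500, "operator"),
]
print(f"Whole-life total: {record.total():,.0f} kgCO2e")
print(f"Capital (product) carbon: {record.total('product'):,.0f} kgCO2e")
```

The essential design choice is the `owner` field: carbon accounting only drives behaviour if every tonne, emitted or saved, is traceable to someone.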


Arguably, however, the biggest mindset change is towards a sector that does not build new assets by default. This is an entirely new way of thinking that will require us to focus on re-use and recycling and to detach our egos from working on "career-defining" megaprojects. The outcome might be a more circumspect and considered sector, one that is less concerned by how big or how quickly it can create new assets and more concerned with stewardship and making the best use of what it's already got. It's easy to over-generalise here, but one is struck by a younger generation of workers who are less concerned with construction's contribution to GDP, and more concerned with the built environment's impact on the natural environment. Translating the principles above into actions for data professionals would look something like the following list:

1. We will create reliable means of measuring carbon emissions and savings through the asset life cycle. This means valuing carbon data and assigning ownership of both emissions and reductions in emissions in a traceable manner.
2. We will measure the embedded and life cycle carbon emissions of buildings and assess the viability and impact of carbon sink technologies. We will also consider the setting of the asset: it doesn't matter how carbon neutral a building or infrastructure asset is if it is constructed at the expense of unspoiled nature.
3. We will make the case for the importance of the use of data in helping the sector realise societal benefits and fight the climate crisis, and we will create meaningful data roles at all levels of seniority with a clear career progression.
4. We will prioritise our investment in areas where there are carbon benefits. This might mean, for example, optimising deliveries to site, engineering failure out of assets, or intelligent building controls that reduce heating and power usage.
5. We will provide our design staff with a function-and-data-rich design environment that allows them to create designs that are more easily reconfigurable.
6. We will maintain detailed product catalogues of standard components for offsite manufacture and provide designers with the data and tooling required to integrate these components into their designs.
7. We will make it easier for local businesses to work on our projects by lowering the technological and data barriers to entry, for example by providing them with free access to collaboration tools and data standards.
8. We will use data to track waste from our sites and buildings and proactively seek opportunities to reuse materials and minimise landfill.
9. We will use data-driven performance metrics that prioritise more than just cost, and actively encourage decisions with a lower carbon footprint.
10. We will make helping to reduce the carbon footprint of our organisations part of the job description for all of our data professionals and practitioners.

The list above covers a few examples of how a data team might enable their organisation to use data as a means to become more sustainable. The specific steps taken by an organisation will necessarily vary to reflect their place in the asset life cycle and supply chain. However, this mapping between climate goals and data solutions demonstrates that data professionals in the built environment have a role to play in decarbonising construction, and a moral imperative not to sit on the side-lines.
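As a minimal illustration of the first action, measuring embedded carbon can begin as nothing more sophisticated than multiplying a bill of quantities by material carbon factors, and refusing to ignore materials for which no factor exists. The factors below are placeholders; a real implementation would draw on a curated source such as the RICS Built Environment Carbon Database or the ICE database.

```python
# A minimal sketch of measuring embedded carbon from a bill of quantities.
# The carbon factors below are placeholder values for illustration only.

CARBON_FACTORS_KGCO2E_PER_KG = {   # hypothetical values
    "concrete_c30": 0.10,
    "steel_rebar": 1.99,
    "timber_glulam": 0.51,
}

bill_of_quantities_kg = {
    "concrete_c30": 250_000,
    "steel_rebar": 12_000,
    "timber_glulam": 8_000,
}

def embedded_carbon(boq: dict[str, float]) -> float:
    """Sum quantity * factor for every material in the bill of quantities."""
    missing = set(boq) - set(CARBON_FACTORS_KGCO2E_PER_KG)
    if missing:
        # An unmeasured material is a data-quality problem, not a rounding error.
        raise ValueError(f"No carbon factor for: {sorted(missing)}")
    return sum(qty * CARBON_FACTORS_KGCO2E_PER_KG[mat]
               for mat, qty in boq.items())

print(f"Embedded carbon: {embedded_carbon(bill_of_quantities_kg) / 1000:,.1f} tCO2e")
```

Raising an error on missing factors, rather than silently skipping them, is the data professional's equivalent of valuing carbon as much as cost.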


2.5.2 Increasing the Operational Sustainability of the Built Environment

As we destroy nature, we destroy ourselves.—Karen Armstrong

What is the built environment built upon? The natural environment. What lies beyond the edges of the built environment, and within its gaps and crevices? The natural environment. Nor are the boundaries clear: the built environment seeps into and distorts our perceptions of what is natural. The hedgerows and fields of rural England are just as anthropogenic as our central business districts. Neither resembles the unspoilt forests, shrubbery, and marshlands that preceded human involvement. Unlike a skyscraper, however, a hedgerow co-opts rather than obliterates nature. Likewise, the natural world finds its niches amongst the Anthropocene; its power is always there, if oft unnoticed, manifested in what we see as inconveniences: the exploding populations of urban rats, pigeons, and foxes; the relentless invasion of Japanese knotweed in our masonry; the dry rot that stains our walls; and the unexpected ailments that cross between species and cause pandemics. The built environment rests upon the natural environment; it has depleted the natural environment, but it has not subjugated it. Our few remaining unspoilt wildernesses increasingly respond to the changes wrought by our built environment as our actions disrupt the climate and the biome, changing what wildernesses look like without even the need for direct contact. The natural environment will outlast us. We are reliant on it, and therefore we can never fully destroy it without first destroying ourselves. There will be cockroaches in our kitchens whether we are there to swat them or not.

All of which is to say that when we talk of using data to increase the operational sustainability of the built environment, what we are really describing is reducing the extent to which the built environment depletes the natural environment and its biodiversity, without unduly impacting the social value that the built environment creates, or how resilient it is in the face of external events. In this context, the decarbonisation of the built environment is not enough: we must also increase the value that it provides to society, push against the manifestations of pernicious inequality, and in doing so consider how we make our built environment resilient in the context of a natural environment that will become more volatile as climate change progresses. We must also seek to address the harm caused by development to date. There is little scope at this time to undo the historical carbon emissions that are already 'baked into' climate change, but there are opportunities to reduce the footprint of the built environment and to invest in regenerating and rewilding natural environments. This requires taking a broader view of what societal value looks like, allowing us to identify where the natural potential value of a space exceeds what we are realising through its current use as part of the built environment.

When we think of construction, we think of the physical removal of materials from the earth to create our assets: the extraction of rocks, metals, aggregate, sand, and countless other materials from the natural environment and their transformation into


the built environment. The rest of the asset life cycle, the operations and maintenance phase, can be just as deleterious to the natural environment, both directly (e.g., energy consumption) and indirectly (e.g., accommodating carbon-producing activities such as transport, manufacturing, refining, etc.). One of our key objectives in applying data to the built environment should be to remove waste. We humans tend to have an immediacy bias towards the obvious, tangible manifestations of waste: an overflowing rubbish bin, for example, or a running faucet. Data gives us the ability to identify and track the far larger problem of hidden waste, the 90% of the waste iceberg that sits underwater: the carbon impact of actions that needn't have been taken, or the resources spent on redundant or poorly sequenced work. Data investment and change can reduce the impact of the built environment in several ways, including:

1. Data can help us to adapt our usage of assets to better align with patterns of usage: removing the unnecessary lighting, heating, and air conditioning that is often applied to empty or near-empty spaces by creating smart systems that respond to demand, as well as identifying a baseline level of demand that we can meet in the most efficient way possible.
2. Similarly, data can squeeze more juice out of our existing assets by helping us to use what we've already built more intensively. Applied to transport networks, data has the potential to allow more frequent services, reduce congestion on motorways, and shift commuters to mixed modes of travel, mass transit, and active travel. However, there are analogous use cases all over the built environment: squeezing more bits through a fibre optic cable can reduce the need to lay more cable; similarly, creating mixed-use developments and flexible, reconfigurable spaces can squeeze more human uses into a single carbon-intensive asset.
3. Data can also help to remove the waste that stems from mistakes: whether that's the loss of thermal efficiency that stems from a window or door being left open, the loss of fresh water or natural gas from leaking pipes, the energy and fuel consumed by abortive works or poorly timed traffic signals, or the countless resources spent on the trap of induced demand.
4. Finally, by creating a world that is less material, data can obviate the need for carbon-intensive activities such as manufacturing or travel. Whilst cloud computing is by no means emissions free, it is almost certainly more carbon-efficient to share a slide deck electronically in advance of a virtual meeting than it is to print out dozens of slide packs and then fly across a continent to deliver the presentation in person. The same is true of the many commodities removed from circulation by digitisation, including consumer products such as compact discs and rolls of film. We should ask ourselves: what are the analogous materials in the built environment that we can digitise away?

Despite the immediate application of data in these areas, we can also make the mistake of assuming that the use of data is a signal of carbon virtue. Human emissions continue to rise in large part because our response to lower transaction costs (friction) is to consume more. More energy efficient buildings do not reduce emissions if they encourage us to be less careful with our energy use, or to heat and cool our properties in


accordance with our whims rather than our needs. Just because you attach a smart meter or smart sensor to an energy source doesn't inherently make it more efficient; all that technology takes energy to produce, so it starts with a carbon deficit that needs to be recouped through judicious use. Greater broadband capacity might just as easily induce demand for a move from 4K to 8K to 16K TV screens and streaming services, rather than reduce the demand for broadband infrastructure. Virtual meetings only reduce emissions if they are substitutes for physical meetings rather than merely additions to them. And so on and so forth.

Unfortunately, virtue signalling doesn't reduce carbon. Indeed, it is often when we feel virtuous that we allow ourselves indulgences elsewhere. As individuals we might allow ourselves an unnecessary long-haul flight because we feel good about buying a hybrid car. Similarly, as an organisation we might not question the carbon impact of our relentless major projects because we feel good about planting some trees outside our head office. Data is the antidote to greenwashing: the heart might be persuaded by virtue signalling, but the figures don't lie. Data will succeed in reducing the built environment's impact on the natural environment not solely by reducing waste, but also by encouraging the dividend of reduced waste to be handed back to nature through parsimony. As ever, the principal contribution of data to this debate is measurement and analysis. If we can quantify and track the emissions of our assets, then we will be able to distinguish between genuine reductions and mere virtue signalling. It is only by quantifying and measuring the carbon impacts of our actions that we can ensure that we are realising a positive difference rather than merely moving our emissions from one account to another.

Data also has a role to play in helping us to adapt how we operate the built environment as the destabilising effects of climate change become apparent. In the best-case scenario we will see a few degrees Celsius of warming compared to pre-industrial levels based purely upon emissions to date, and at the time of writing we appear to be on a far worse trajectory than the best-case scenario (IPCC, 2022). This means that changes to the behaviour of the natural environment, and particularly the incidence and severity of natural disasters, will force us to adapt. As Jo Da Silva writes:

We can no longer forecast, with a meaningful degree of accuracy, the severity or likelihood of weather-related events over the 50–100-year lifespan of new infrastructures. This challenges the 'predict and prevent' paradigm that has underpinned urban planning and infrastructure investments to date. (Da Silva, 2012)

Da Silva describes an increase in the uncertainty of the operational parameters that our infrastructure will face. We must change our use of data to reflect this uncertainty. In a complex and uncertain system we cannot have fixed rules; instead we must adapt and respond to circumstances as they arise. For example, in the past asset management theory dictated that organisations should understand the degradation profiles and lifespans of their assets and plan their maintenance activities in accordance with predicted failure rates. Increasing uncertainty may move us away from


standardised 'maintenance cycles' and towards greater use of (near) real-time monitoring. Knowing that the volatility and severity of the environment that our assets will face is increasing will incentivise us to better monitor and respond to actual conditions, rather than to theoretical values. Increased volatility makes it more important for us to use data to ensure that we have (near) real-time situational awareness of our physical and human assets and how they are performing and responding to disruption. The return on investment from technologies such as Digital Twins will grow as organisations seek better means not only to use their assets more efficiently and less carbon-intensively, but also to increase their resilience, and the resilience of the system-of-systems, in the face of disruption. All these benefits rely on our having meaningful and accurate data concerning how our natural and built environments behave in the face of uncertainty.
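The shift from fixed maintenance cycles to condition-led intervention can be sketched very simply. The thresholds and telemetry below are hypothetical, but they show how a (near) real-time feed can contradict, and should override, a calendar-based rule.

```python
# A minimal sketch contrasting a fixed maintenance cycle with a
# condition-based trigger. Thresholds and readings are hypothetical.
from datetime import date, timedelta

FIXED_CYCLE = timedelta(days=365)          # the old 'maintenance cycle'
VIBRATION_LIMIT_MM_S = 7.1                 # illustrative alarm threshold

def fixed_schedule_due(last_service: date, today: date) -> bool:
    return today - last_service >= FIXED_CYCLE

def condition_triggered(vibration_mm_s: list[float]) -> bool:
    # Trigger on a sustained exceedance (three most recent readings), not a blip.
    recent = vibration_mm_s[-3:]
    return len(recent) == 3 and all(v > VIBRATION_LIMIT_MM_S for v in recent)

today = date(2024, 3, 1)
last_service = date(2023, 6, 1)
telemetry = [3.2, 4.1, 6.8, 7.4, 7.9, 8.3]   # (near) real-time sensor feed

print("Fixed cycle says service due:", fixed_schedule_due(last_service, today))
print("Condition monitoring says intervene:", condition_triggered(telemetry))
# The fixed cycle would not intervene until the end of May; the telemetry
# says act now, before a volatile environment turns degradation into failure.
```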

2.5.3 Supporting the Decarbonisation of Transport and Industry

As the IEA figures quoted earlier in this chapter demonstrate, the built environment accounts for a sizable share (at least a third) of global emissions, not to mention a big part of our impact on other measures (e.g., societal outcomes, resilience, and biodiversity). This is a huge prize, and one that should surely be higher up the agenda in terms of mitigating climate change. Moreover, we cannot lose sight of the even larger secondary effects that the built environment has on global emissions through the activities that it facilitates in energy, transport, logistics, manufacturing, and other sectors. IT itself is not without a carbon footprint. According to the IEA, the data centres that run all of our cloud services account for around 1.5% of global electricity use (IEA, 2023a, 2023b). Every bit of data that we store and every calculation that we run comes with a carbon price tag, and we need to ensure, at a minimum, that we are doing enough good with our data and software to comfortably offset the carbon impact of running our data infrastructure. As energy intensive technologies such as Bitcoin have woefully demonstrated, just because something is digital doesn't mean it is green. Not only do we need to focus our minds on means to mitigate the built environment's impact on the natural environment, but we also need data-driven ways to measure that impact. We measure anything associated with time or money with utmost precision, but carbon, biodiversity, and other such outcomes remain fuzzy and unquantified.

For consistency, it is worth establishing some boundaries between the built environment, transport, and industry. In the previous chapter we borrowed the following definitions from Transforming Infrastructure Performance (TIP):

• Built environment: the human-made (or induced) surroundings that provide the setting for human activity.


• Natural environment: all living and non-living things occurring naturally (i.e., not human-made or induced).
• Infrastructure: the physical assets and services needed for the operation of human society.

Following these definitions, all infrastructure assets (e.g., roads, railways, power lines, water pipes, telecoms networks) constitute part of the built environment, as do 'social infrastructure' assets (i.e., buildings and public spaces). There are other types of physical assets which are neither built environment assets nor infrastructure assets, but which rely upon infrastructure assets to operate. One such type of physical asset is vehicles (e.g., cars, trains, planes, boats, bicycles), which use infrastructure (e.g., roads, railways, airports, ports, cycle lanes) but which are distinct from the infrastructure itself. Another related type of physical asset is the plant and machinery associated with other parts of the economy: industrial processes (e.g., manufacturing, energy production, refining) as well as commercial processes (e.g., retail, leisure, and logistics). Some of these processes are inherently reliant on assets (e.g., power plants, refineries, golf courses); others are only loosely associated with the built environment (e.g., lots of manufacturing and retail takes place in standardised, big-box, warehouse-type units).

This semantic distinction has a significant impact on how we categorise the carbon impact of infrastructure. For example, the carbon impact of operating the road network is low: powering lights, filling potholes, etc. But the carbon impact of the users of the road network is enormous. And yet whilst the conversation focuses on converting as many vehicles to electricity as possible, there are also numerous data-driven decisions that we can make in terms of road infrastructure (e.g., signal sequencing, speed limits, junction design, availability of charging points) that can profoundly impact the emissions of road users. Transport as a system is bigger than roads alone, and focusing on optimising for the experience of privately owned vehicles can result in perverse outcomes. Rather, a data-driven approach should focus on mobility broadly defined and seek to identify means to encourage people to choose less carbon-intensive ways of travelling and/or to reduce the need to travel. There are a lot of subjective and aesthetic decisions when it comes to transport planning and urban design, many of which are becoming heavily politicised (active travel, EVs, low traffic neighbourhoods, 15-minute cities, urbanism vs. suburbia, etc.). As in much environmental policy, part of the challenge is misalignment of incentives: the pollution comes from privately-owned vehicles, but many solutions lie with the infrastructure owner and the urban planner; the victims of car dependence are many and diffuse, whereas the beneficiaries are immediately obvious. This division between the incentives of road infrastructure owners and road users, where there are massive externalities (external costs/benefits) of the infrastructure owner's actions, is an economically inefficient situation. Data has a role to play in bringing some objectivity to outcomes: regardless of your opinion on car ownership, we can demonstrate how it performs in terms of carbon emissions or journey times compared to the alternatives; is that a price that you are willing to pay?
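A minimal sketch of that kind of objectivity: given per-passenger-kilometre emission factors (the figures below are illustrative placeholders, not official conversion factors), we can rank the carbon cost of the same journey by mode and put a number on the trade-off.

```python
# A minimal sketch of comparing the carbon cost of one journey across modes.
# Emission factors are illustrative placeholders, not official figures.

EMISSIONS_G_CO2E_PER_PASSENGER_KM = {   # hypothetical values
    "private car (single occupant)": 170,
    "bus": 80,
    "rail": 35,
    "cycling": 0,
}

commute_km = 12.0

# Rank modes from lowest to highest carbon for the same trip.
for mode, factor in sorted(EMISSIONS_G_CO2E_PER_PASSENGER_KM.items(),
                           key=lambda kv: kv[1]):
    kg = factor * commute_km / 1000
    print(f"{mode:30s} {kg:6.2f} kgCO2e per {commute_km:.0f} km trip")
```

The output doesn't settle the politics, but it turns an aesthetic argument into a quantified one.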


For decades we have failed to see a market signal strong enough to drive the substantial investment required to decarbonise the built environment, which is often why we have looked to government policy to fill the gap. Government targets and mandates can still be effective, but unlike the rational self-interest that guides the invisible hand of the markets, they require measurement and enforcement. Data fills the gap between obligation and action. The UK government's target to reduce carbon emissions by 68% relative to 1990 levels by 2030, 78% by 2035, and 100% by 2050 is only meaningful if one has an accurate means of measuring carbon emissions (Department for Business, Energy & Industrial Strategy, 2021); it also, obviously, assumes that the government doesn't walk back its commitment. We can only hold our leaders to account if we can quantify where they are failing to meet reasonable targets. The government is not directly responsible for most of the country's emissions, so if it is enshrining emissions reductions in law then it needs to be able to identify actors in the economy that are not pulling their weight. From a systems thinking perspective, the government is not in control of the system of the UK economy or built environment; it can merely seek to influence its inputs and outcomes.

Accurate and complete data and reporting standards are crucial to holding organisations, governments, and financial institutions to account on their net zero pledges. In November 2022, the UN Secretary General wrote that "The problem is that the criteria and benchmarks for these net-zero commitments have varying levels of rigour and loopholes wide enough to drive a diesel truck through… I call on all net-zero voluntary initiatives to accelerate efforts to standardise progress reports, in an open format and via public platforms that feed into the UNFCCC Portal" (Guterres, 2022). Creating a common, widely accessible database of embedded carbon is an unavoidable first step towards creating accountability. In the UK, the Royal Institution of Chartered Surveyors (RICS) Built Environment Carbon Database (BECD) is one promising initiative, currently in limited beta testing (BECD, 2023). As with any data trust approach to crowdsourcing sector data, the proof will be in whether the database can provide participating organisations with sufficient incentive to ensure that the data is accurate, timely, and provides sufficient coverage to support meaningful analyses.

Unsurprisingly, many of the government's priorities for greening the economy rely upon effecting change to the built environment. These priorities seek to realise certain metrics: jobs, investment levels, and reductions measured in megatons of CO2 (Department for Business, Energy & Industrial Strategy, 2020). The Ten Point Plan relies on new wind farms, hydrogen networks, nuclear power, vehicle charging infrastructure, public transit, lower emission buildings, and carbon capture and storage infrastructure. Whilst the white paper does not advocate for the role of data, it would be impossible to plan and deliver such an ambitious investment in the built environment without the support of data-driven planning, design, digital construction, and operational technologies. Case in point: any transition to low or zero emissions vehicles will be 100% reliant on energy infrastructure, be it charging points or hydrogen fuel points. This isn't solely about consumer-facing infrastructure, but the deeper changes to energy


generation and distribution networks required to accommodate the change in demand. This includes investment in the data solutions required to:

• Anticipate and actively flex demand, and load balance electricity networks (particularly given the variability associated with renewables).
• Monitor and maintain pressure in pipelines to avoid leakages, particularly where hydrogen is introduced to the network.
• Account for users consuming energy, storing energy in assets such as cars and at-home batteries, and feeding energy back into the grid (Bloomberg NEF, 2022).

Moving to 100% renewable energy generation, or anything close, will require massive changes to energy generation infrastructure (part of the built environment), energy storage capacity, and the data systems (or digital twins) required to run and balance that infrastructure not just on good days, but in extreme situations where the system of systems is under stress (Bloomberg NEF, 2022).

Data and sustainability are inherently linked. Sustainability is the most pressing issue of our time, and data is one of its key enablers. The term 'sustainability' is itself one of measurement and analysis: it is about making an informed judgement around what level of resource consumption is safe without placing an unsustainable burden on the environment, and then measuring one's consumption of resources against that level. As the CDBB's beautiful publication on sustainability, 'Four Futures, One Choice', states, "Complex, interconnected problems—such as poverty, environmental destruction, air and water pollution, disease and lack of education—require interconnected solutions… there is a need for better data to support these solutions and track progress." In addition to the examples provided in this chapter, there are a wide range of use cases where better use of data on the built environment can benefit not only our immediate organisational needs, but broader environmental and social outcomes (as illustrated in Fig. 2.3).

As data professionals in the built environment, we must not be afraid to take a role in meeting the climate crisis. We cannot afford to simply delegate these challenges to others. We must work on the assumption that our data strategies, our data projects, our data infrastructure, and our consequent investments in data solutions can and should directly and indirectly impact the sustainability of our wider organisations and projects. There is an enormous opportunity to use digital technologies and data to reduce the resource intensity of our work, and to optimise how physical assets meet society's needs. Part of this opportunity lies in using data to create visibility, helping consumers and organisations become aware of the impact of their decisions. We all have a rough sense of what is, and is not, sustainable. But few of us can quantify where the biggest opportunities to become more sustainable lie. This information gap can foster counter-productive behaviours (e.g., big corporations eliminating plastic straws but doing nothing about their carbon footprint; people diligently recycling but increasing the number of flights that they take). We must use data to shine a light, to separate effective action from mere good intentions, greenwashing, and virtue signalling.
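Returning to the point that targets are only meaningful when measured, the sketch below interpolates the UK's legislated milestones into an annual trajectory and tests a measured emissions figure against it. The baseline, the 2020 anchor point, and the measured figures are invented for illustration; only the 2030, 2035, and 2050 milestones come from the target described above.

```python
# A minimal sketch of holding a net zero target to account: interpolate
# legislated reduction milestones (68% by 2030, 78% by 2035, 100% by 2050,
# relative to 1990) and compare measured emissions against the implied
# trajectory. Baseline, 2020 anchor, and measurements are hypothetical.

MILESTONES = {2020: 0.50, 2030: 0.68, 2035: 0.78, 2050: 1.00}  # reduction vs 1990

def target_reduction(year: int) -> float:
    """Linear interpolation between the milestone years."""
    years = sorted(MILESTONES)
    if year <= years[0]:
        return MILESTONES[years[0]]
    for lo, hi in zip(years, years[1:]):
        if lo <= year <= hi:
            frac = (year - lo) / (hi - lo)
            return MILESTONES[lo] + frac * (MILESTONES[hi] - MILESTONES[lo])
    return MILESTONES[years[-1]]

BASELINE_1990_MT = 800.0          # hypothetical baseline, MtCO2e
measured = {2028: 310.0}          # hypothetical measured emissions

for year, actual_mt in measured.items():
    allowed_mt = BASELINE_1990_MT * (1 - target_reduction(year))
    status = "on track" if actual_mt <= allowed_mt else "off track"
    print(f"{year}: measured {actual_mt:.0f} Mt vs trajectory "
          f"{allowed_mt:.0f} Mt: {status}")
```

Trivial as it is, this is the shape of accountability: a published trajectory, a measured figure, and an unambiguous comparison between the two.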


Fig. 2.3 “Asks” of data to realise environmental, social, and sectoral outcomes

2.6 Safety

"But first, do no harm."—Thomas Inman (probably) (British Medical Journal, 2013)

As we have established over the course of this chapter, what makes working with data in the built environment different is the proximity to the physical world. It is the impact that our work can and should have on the well-being of others, the outcomes that we see for society, and the fate of the planet. But you can hurt yourself and others in the physical world. No aspect of the sector better exemplifies both the successes in using data and the ongoing opportunity to improve further than ensuring the health and safety of those who build, operate, maintain, and demolish the built environment.

The quantity, quality, and availability of data on safety incidents (e.g., near misses, accidents, fatalities) vary enormously between organisations and projects. It should unquestionably be one of the most valuable and important data sets captured by any organisation working on physical assets. However, organisations have struggled both to make it easy enough for staff to collect data on incidents in a timely and accurate manner, and to use what data they do capture to inform and improve safety on site. In your author's experience working across a variety of built environment clients,


we have seen both good and woeful practice in this space. There are a few obvious precursors to success:

1. Can staff quickly and easily record near misses and safety incidents on site? Good practice looks like mobile apps and dedicated telephone numbers. Poor practice looks like requiring staff to log into the corporate intranet on a borrowed laptop, navigate to and fill out a clunky form with irrelevant options, and then send it off into the void without any positive feedback. Good practice results in a rich data source of near misses from which we can identify corrective actions. Poor practice results in a data wasteland, with occasional drives to retrospectively make up the numbers.
2. Is the review of safety data, and the learning of lessons, part of the everyday work of the organisation? Good practice looks like visible boards with up-to-date and clear safety figures tracking performance over time. Good practice incorporates reviews of those figures into recurring meetings (e.g., weekly stand-ups or team performance meetings) and identifies and tracks actions. Poor practice looks like relegating safety performance metrics to the appendices of seldom-updated slide decks and word documents for noting (i.e., ignoring) at the end of long governance meetings with packed agendas and few actions. (Usually, those long governance meetings will have commenced with a 'safety moment' where, after an awkward silence, a member of the upper management will share a story that has nothing to do with the workplace, about painting a ceiling or riding their bike or working on their golf swing over the weekend.)
3. Enlightened clients know that improving health, safety, and well-being is part of everyone's job, not just a few specialists. Good practice looks like encouraging all staff to report safety incidents and to consider how their work can (directly or indirectly) contribute to making their workplace or project safer. Poor practice looks like relying on a designated safety manager to function as the organisation's conscience and delegating all thinking on the subject to them.

Just as we saw when considering sustainability, health and safety have also fallen foul of the over-specialisation of construction roles, meaning that "too often, health and safety risk is not their job, they will leave it to the health and safety experts" (3D Repo, 2021). The sector is in a liminal state between good practice and poor practice. On the positive side, the sector frequently acknowledges the importance of safety, and measures of safety tend to improve as a result. On the negative side, construction remains a relatively dangerous profession, we are still a long way from doing no harm, and we have failed to fully exploit the opportunity for data and digital solutions to improve these outcomes. Digitising Health and Safety—a recent white paper from 3D Repo and Atkins—provides a useful overview of the sector's safety posture and where better use of data can help to drive better practice. Early in the paper the authors frame the problem using statistics from the UK's Health and Safety Executive (HSE):

In 2019/20, the UK construction industry killed 40 people, equating to a fatal injury rate of 1.74 per 100,000 workers, almost four times the all-industry rate. And every year, it injures 61,000 people.


The paper argues that to tackle this residual safety risk, "we need to be data-driven rather than just experience-driven." In practice we don't have to decide between experience and data. Tacit knowledge (i.e., experience, heuristics) and explicit knowledge (i.e., data, information) both have their place in addressing safety risk. Doing so requires us to create common data sets that are accessible across our organisations and projects. These common data sets should combine the experience of our own staff (and lessons learned from previous projects) with analysis and insight from rich, timely, spatially-referenced data from our sites. Spatial models and platforms used for BIM are particularly apt tools for mapping safety risk data across a shared reference point.

Organisations function as a social network, ensuring that individuals are not solely responsible for their own wellbeing. We should be looking to capture data in a way that alerts us to circumstances in which our organisations are failing to create that support network, whether it is elevated instances of lone working, shift patterns that do not allow staff to get sufficient rest between shifts, or a lack of due diligence or timeliness in submitting the necessary checks such as risk assessments. There are obvious trade-offs here in terms of data privacy, and the understandable pushback that comes with technologies such as vehicle tracking, health trackers, body cameras, and other wearable technology that could improve safety outcomes but also creep everybody out and hand too much data (and therefore power) to employers. The role of data in improving safety becomes both to preserve a more accurate and complete record of safety incidents and risks than is possible manually, and then to use that record to flag emerging issues and potential areas for improvement.

An example of data-driven safety practice comes from National Highways' A14 upgrade (Catherine Moore, 2021). The A14 team rolled out a handheld mobile app that made it easy for team members to record both safety risks and good practice, so easy in fact that staff submitted thousands of reports. This app was then integrated into the A14's data analytics platform, allowing the modelling of risk factors. Rather than trying to predict issues, the modelling instead focused on identifying factors which contributed to high-risk days, such as worker fatigue, working after bank holidays, and even the presence of archaeologists on site. By simplifying the output of the analysis to what is effectively a traffic light system of risk, the A14 made it easier to communicate the results to staff (e.g., "watch out today, it's a red day, your team might be more fatigued than usual") and contributed to a reduction of safety incidents on site following the pilot. One might argue that you shouldn't need a data analytics platform to demonstrate that long shifts, long weekends, and the presence of untrained staff on site would all contribute to riskier outcomes. But as data professionals we shouldn't be afraid to demonstrate the obvious in a rigorous manner. Just because something is obvious doesn't mean that it's not worth measuring or responding to.

Whilst the term Health and Safety might suggest pedestrian images of slippery floor cones and red tape, it presents one of the most compelling areas to let data science stretch its legs. Opportunities to bring innovative data techniques to improve safety outcomes include:


– Computer vision: using pattern recognition to identify hazards on site, either in digital replicas (e.g., BIM) or in (near) real time on site. Start-ups such as FYLD and PLINX are already able to flag common safety issues such as a lack of protective clothing or barriers.
– Risk assessment: by bringing the visualisation of sequencing to design models of assets, 4D Planning or Digital Rehearsal allows teams to "communicate around a model" and de-risk their work using a digital replica rather than in the real world where accidents can hurt people. As PAS 1192-6 states, "The increased use of digital information and modelling software applications in design development and delivery enhances the designer's ability to anticipate, spot and foresee hazards and risks in the design" (BSI, 2018a, 2018b).
– Knowledge management: often the first place to start when improving practice in the built environment is simply to move users from ad hoc spreadsheets to a structured database. As the National Highways example above demonstrates, the presence of mobile incident capture combined with effective data storage and analytics is enough to bring a new level of rigour to many construction and maintenance organisations. In their white paper, 3D Repo position their Safetibase tool as "a searchable lessons learned database for health and safety risks," which is a prosaic but inarguably useful aim.

In the 1990s the broken windows theory became popular in policing. This was the idea that an environment that tolerates minor crimes (vandalism, for example) will be more likely to invite more serious or violent crimes. Whilst this theory had some troubling and discriminatory outcomes when applied to law enforcement, the idea that small oversights can be an indicator of wider and more serious non-compliance is applicable to safety. As we build more complete, structured, and accurate datasets of safety incidents, it will become easier for us to use data science to identify patterns that don't look right. The indicators can be quite simple; the sketch below illustrates two of them. An unusual ratio of near misses to safety incidents might indicate that issues are going unreported. A lack of timely submission of risk assessments—or copying of content within those assessments—might indicate a team that sees safety management as a box-ticking exercise and is potentially acting carelessly in other areas of project management. Data plays a double role here: without creating a sufficient data set we have nothing to analyse, and without data analysis we are unlikely to be able to conduct enough tests quickly enough to identify problems. There is no point in arriving at safety lessons learned after the opportunity to protect our staff or design away issues off site has passed: insight must be timely and accurate. As Digitising Health and Safety argues:

We need to find ways to harness the knowledge and experience of the experts around the table and make it automatically available when new assets are being designed. We need to harvest and analyse data collected through monitoring, wearable tech and equipment operation and feed it back to the design process (3D Repo, 2021).
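A minimal sketch of the two simple indicators mentioned above, under assumed thresholds: the expected near-miss ratio and the late-submission limit here are hypothetical heuristics, not published standards.

```python
# A minimal sketch (thresholds and data are illustrative assumptions) of two
# simple 'broken windows' indicators: an unusually low ratio of reported near
# misses to incidents, and late risk assessment submissions.

EXPECTED_NEAR_MISSES_PER_INCIDENT = 10   # assumed heuristic, not a standard
LATE_SUBMISSION_LIMIT = 0.2              # flag if >20% of assessments are late

def underreporting_flag(near_misses: int, incidents: int) -> bool:
    """Too few near misses per incident suggests issues go unreported."""
    if incidents == 0:
        return False  # not enough signal to judge
    return near_misses / incidents < EXPECTED_NEAR_MISSES_PER_INCIDENT

def late_submission_flag(on_time: int, late: int) -> bool:
    total = on_time + late
    return total > 0 and late / total > LATE_SUBMISSION_LIMIT

teams = {
    "team A": {"near_misses": 120, "incidents": 8, "on_time": 40, "late": 2},
    "team B": {"near_misses": 6, "incidents": 5, "on_time": 12, "late": 9},
}

for name, d in teams.items():
    flags = []
    if underreporting_flag(d["near_misses"], d["incidents"]):
        flags.append("possible under-reporting")
    if late_submission_flag(d["on_time"], d["late"]):
        flags.append("late risk assessments")
    print(f"{name}: {', '.join(flags) or 'no flags'}")
```

Neither flag proves anything on its own; like a broken window, each is a prompt for a human to look closer.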

PAS 1192-6 describes the process by which organisations identify, use, generalise, and share risk information as the risk information cycle. This cycle has parallels with the Nonaka–Takeuchi model of knowledge creation (Wikipedia, 2023a, 2023b) whereby tacit knowledge (i.e., experience, heuristics) is socialised (e.g., shared),


externalised (e.g., codified into explicit quantitative information), combined (e.g., used with other information to generate insight), and internalised (e.g., communicated to/learned by human actors). Our role as data professionals is to make this process as fast, lossless, and insightful as possible, so that the organisation learns from its mistakes quickly. We shouldn't favour explicit data over tacit knowledge, but rather seek to understand how to create a positive feedback loop in which experience informs analysis, and analysis augments experience. In doing so, we can play a profound role in continuing to reduce the impact that the built environment has upon the well-being of those who build, maintain, operate, and work on it.

2.7 Ownership and Incentives

2.7.1 Public Versus Private

The built environment functions through the interplay of the public and private sectors, with deeply blurred lines between the two. This duality reflects the huge amounts of capital required to build and operate physical infrastructure, the consequent burden on public sector finances, as well as the effect of natural monopolies where it is cost-prohibitive to have competing services. Taking the economist's view, we find three seminal framings: lighthouses, lemons, and the economics of information.

The 1970s was a time of seminal works in economics. Robert Coase's The Lighthouse in Economics (Coase, 1974) uses the lighthouse as a metaphor for the role of the private sector in delivering public goods. Coase suggests that market forces, under certain conditions, can solve problems of public goods provision without government intervention. The lighthouse is the canonical example of a public good because it is non-excludable and non-rivalrous: ships cannot be excluded from its use, and one ship's use of it does not prevent another's. Neither of your authors is a wholesale subscriber to neoliberalism, as holes can easily be found in free-market capitalism. The digital twin of a lighthouse, for example, can easily be made a private and exclusive good. Your role as a data professional is to lead on the ethical use of data and always have the public good at heart.

George Akerlof's The Market for Lemons (Akerlof, 1970) is an incredible piece of work, and influenced one of your authors to become a microeconomist. Akerlof focuses on information asymmetry and transaction costs (friction) in a market. He highlights that a market can degrade if sellers have more information about product or service quality than buyers. 'Lemon' is an American term for a low-quality second-hand car, and the point is that if there are high information asymmetries and friction between buyers and sellers in a market, those wanting to provide higher-quality products and services are forced to leave the market. This leads to a 'race to the bottom' and nothing but lemons left on the marketplace to buy.

Joseph Stiglitz's Information and Economic Analysis (Stiglitz, 1985) follows on from Akerlof, diving into information asymmetry and how it drives the dynamics of


a market. If distributed unequally, information will drive suboptimal outcomes for market participants. Markets then tend towards failure as transactions are inefficient and fail to produce mutually beneficial exchanges. In other words, the market gives rise to phenomena such as adverse selection and moral hazard. Akerlof and Stiglitz's work challenges Coase's faith in the free market and suggests that government intervention is required to regulate and develop policies to prevent otherwise inevitable market failures. In summary: Coase implies that clear, enforceable rights and low transaction costs can lead to efficient market outcomes even in the provision of public goods like construction projects. Akerlof illuminates how asymmetry in information about product quality can result in a market populated by sub-par offerings. Stiglitz expands on this by indicating that such information asymmetry and imperfection often lead to suboptimal market outcomes, wherein market actors, for lack of transparent and complete information, fail to make decisions that lead to Pareto efficiency (a state in which no action can improve one actor's outcome without making another's worse; to be Pareto efficient is to be absolutely efficient).

The balance of responsibility between private firms and government boils down to who owns the risk of getting it wrong. As a data professional, you need to understand how risk flows through a project, and that ultimately the risk is always shouldered by the buyer. No matter how a contract form may dress up the flow of risk into a supply chain, in the public sector the bill is generally picked up by the government. In the context of the construction industry, where contracting performance is frequently under scrutiny, incorporating data and technology can significantly alleviate issues associated with information asymmetry and market imperfections:

• Transparent Information Sharing: Leveraging technology to create platforms that enable transparent sharing of a contractor's past performance, adherence to timelines, and compliance with quality and safety standards could help in addressing the information gap. This is where access to data and data ownership need distinct differentiation.
• Smart Contracting: Implementing smart contracts using blockchain technology (or any distributed form of technology) can potentially minimise transaction costs, ensure adherence to agreed terms, and automate compliance checks and balances, aligning with Coase's perspective of reducing transaction costs for efficient outcomes.
• Quality Assurance through IoT and AI: Deploying IoT devices and AI-driven solutions on construction sites for real-time monitoring of material quality, structural integrity, and adherence to design can significantly minimise information asymmetry related to project quality and safety, addressing the concerns highlighted by Akerlof and Stiglitz.
• Predictive Analysis for Risk Management: Employing predictive analytics to assess and manage risks related to project delays, cost overruns, and resource allocation can enhance decision-making by providing all stakeholders with insightful data, thereby minimising information asymmetry.


• BIM (Building Information Modelling) and Data Management: Utilising BIM processes combined with robust data governance can enhance collaboration among stakeholders by offering a unified, real-time view of the project, ensuring that all parties have equal access to critical project information and thereby mitigating information disparity.

The exact division of responsibility between the public and private sectors varies enormously depending on the country and type of asset in question. Over the years governments have taken widely differing views in terms of what parts of the built environment should operate as private entities, as well as how to structure and manage publicly owned assets. Taking roads as an example, these are almost always made freely available as a public subsidy to private car ownership, with the result being that induced demand tends to rise to meet capacity. At the same time, the full cost of owning and operating a car falls on private consumers, hidden from government policy considerations. These arrangements across our physical asset base drive the flow and allocation of capital within our economies, the assignment of risk, and the investment decisions of individuals, companies, and public bodies. They are also subject to change more frequently than one might expect, as evidenced by the intermittent privatisation and renationalisation of much of the UK's infrastructure.

As a case study, working in the UK heavy rail sector during the first two decades of the twenty-first century one came to accept the prevailing hierarchy of regulator (i.e., the ORR, reporting into the Department for Transport), infrastructure owner (i.e., Network Rail, an arms' length public body), and train/freight operating companies (i.e., a variety of private firms such as Govia or FirstGroup). One could even look to other industries and see the same structure, for example energy (regulator: Ofgem; infrastructure owner: UK Power Networks; distributors: a variety of private companies). However, this model began to fray with the successive failures of the GNER and National Express franchises on the East Coast main line in 2006 and 2009 respectively, resulting in the DfT running a publicly owned Train Operating Company (TOC) in their place under the brand East Coast. Controversy over the profitability and reliability of the private operators remained a political point of contention. The spread of responsibility across government, public and private operators meant that the overall whole life cost of maintaining and operating the UK's rail infrastructure was opaque and rarely considered at a system level. None of the train operators were able (or willing) to weather the massive reduction in passenger numbers in the wake of the COVID-19 pandemic. The UK government has since partially re-nationalised heavy rail operations, with the Department for Transport shouldering all the losses seen by train operators since 2020. This example doesn't seek to comment on the correct level of privatisation versus nationalisation for any sector, but rather to note that the structures are changeable, that they have a profound impact on the incentives actors respond to, and that they consequently impact how, when, and what data is used to inform strategic decisions.

In the UK, real estate, both residential and commercial, has long been dominated by the private sector, and has been almost completely private since the Thatcher government instituted the right to buy under the Housing Act of 1980 (Beckett, 2015). But in most


other parts of the built environment, the public sector plays a much larger role. Public bodies such as Ofgem, Ofwat, Ofcom, and the ORR regulate the UK's critical public-serving infrastructure (i.e., power, water, waste, transport, telecoms). However, most of the spending on infrastructure goes through the private sector, either through fully privatised utility companies, as long-term concessions, or through the sub-contracting of spend by non-departmental government ("arms' length") bodies such as Network Rail, National Highways, or Transport for London. In most cases, the private sector is responsible for the maintenance or even enhancement of physical assets. The private sector may also profit from financing physical assets. The UK has seen a wide range of experiments in moving the dial between the public and private sectors over the last three decades, with both full-blown privatisations as well as more novel vehicles such as public–private partnerships (PPPs), private finance initiatives (PFIs), and tax increment funding (TIF). Successive governments have sought to reduce the impact of infrastructure on the public purse, only to roll back either due to underperformance (e.g., Metronet, Birmingham PFI) or force majeure (such as Covid). In some parts of the built environment, privatisation appears to have gone full circle, with railway infrastructure now effectively re-nationalised. However, some of the less visible forms of privatisation have proven more durable, such as the purchase of the rights to own and/or operate parts of the built environment by foreign pension funds (BBC, 2016), asset management funds (for example, Hong Kong's Cheung Kong group owns UK Power Networks, and Australia's Macquarie Group recently purchased the UK's gas transmission and metering business (National Grid, 2023)), and even state-owned entities from other countries (for example, EDF Energy, which plays a huge role in the UK's power generation network, is almost wholly owned by the French government (EDF Group, 2022)). This means that profit-seeking companies own large swathes of the built environment through their purchases of natural monopolies facilitated by the UK Government. These companies then seek to extract profit and yield on behalf of shareholders. This isn't necessarily a recipe for disaster, but it does mean that from a data perspective, the metrics of what constitutes success might be very different depending on whether you work for the government (e.g., as part of a government department or regulator) or for one of these privately owned companies or funds.

In general, privatisation has proven most durable (from a societal perspective at least) where organisations have a long-term incentive to invest and avoid asset stripping, as well as being set meaningful data-driven targets in terms of the level of service that they must provide to their customers. Due to the long working lives of most physical assets, even contracts of five, ten, or fifteen years represent a small part of an asset's life cycle. Too often franchisees are incentivised to maximise capital expenditure and/or minimise operational costs, leading to a wasteful prioritisation: creating new assets rather than maintaining existing assets. Without contractual cognisance of the whole life cost profile of investments—in part a data problem—private firms seek to minimise investment in maintaining the condition of the asset base because they know that even with minimal maintenance the assets are more likely to fail after


the end of their contract, at which stage someone else (probably the taxpayer) will have to pick up the bill. Even where private firms are adequately incentivised to maintain the quality of the built environment, it can be difficult for public sector clients to provide sufficient information to private sector bidders to ensure that they make allowance for this expenditure during the life cycle of a contract. The public sector’s historic emphasis on finding the lowest cost contractors has at times meant that deliberate or accidental underbidding is the only way to win work, and a race to the bottom ensues that both reduces sector margins below the cost of capital, and in turn, reduces the quality of public services covered by these contracts. The travails of Amey’s Birmingham Highways PFI contract are a particularly galling example of both misaligned incentives and the need for the private sector to eke out a profit margin. As the judgement in the resulting lawsuit stated, “The PFI contract worked perfectly satisfactorily for the first three and a half years. Things only went wrong in 2014 when [Amey] thought up an ingenious new interpretation of the contract, which would have the effect of reducing their workload, alternatively increasing their profit (Bevan Brittan, 2019).” It’s correct to blame Amey for not abiding by the spirit of a contract, but it is worth considering that at that time their profit margin was around 2% (or, in other words, at or below the cost of capital), and so perhaps the temptation to engage in “ingenious interpretations” came in part from desperation. The real losers here are the people who relied on Birmingham’s roads during this time, and who saw issues like broken bollards go unfixed for so long that Amey incurred a £50 million fine (Wadham, 2019). The split between public and private sector responsibility has an enormous impact on the culture of various parts of the built environment, as well as indirectly influencing the incentives for organisations to improve performance through data solutions. Table 2.1 describes the (rough) split after several decades of privatisation and re-nationalisation in the UK. Whilst profit isn’t the motivation behind all private sector behaviours, revenue targets, margin, and other financial considerations drive a lot of behaviour. Public sector organisations, on the other hand, are uniquely fascinating because their intentions and measures of success are rarely as clear as the profit motive of the private sector. Despite accounting for 40%-50% of most developed economies, their ways of working attract less attention and analysis than private companies (Statista, 2023a, 2023b). Your standard Harvard Business Review is rarely about a public-sector utility company. In part, this is because without easy profit or ROI measures the public sector doesn’t attract the same scorekeeping as big corporations. Where data for the public sector do emerge into the public consciousness, it’s usually expenditure figures completely out of context: “[Public body] spends [a large amount of money] on [what]? Insert indignation here.” And yet, if you want to find opportunities to truly deliver ‘value’ from data, and indeed to expand the notion of what ‘value’ even means, the public sector is where many of the most interesting arguments are yet to take place. 
This is in part because rather than inheriting profit as a motivator, public sector clients need to create and perpetuate purpose—often in the form of societal outcomes—and create the means of measuring and encouraging that purpose. The wide range of stakeholders and obligations that public organisations are accountable to lends itself to a balanced scorecard approach, which necessitates data collection on a wider range of quantitative and qualitative outcomes (e.g., expenditure, safety, customer experience, sustainability).

Table 2.1 The public to private divides in the UK ('Hybrid' denotes public contracts to private firms)

Transport—rail
• Public: regulation; infrastructure owners (Network Rail, TfL, TfW, TfGM); some train operating companies (TOCs); some rolling stock
• Hybrid: contractors (construction); some train operating companies (TOCs); freight operating companies (FOCs); rolling stock leasing/financing; stations and concessions; consultancy
• Private: rolling stock manufacturers/leasing; some rolling stock ownership

Transport—road
• Public: department (DfT); regulator (ORR); infrastructure owners (local authorities, National Highways, Transport Scotland)
• Hybrid: contractors (construction); large-scale design and build projects (such as the Lower Thames Crossing); some toll road and toll bridge concessions
• Private: private vehicles; freight vehicles

Transport—aviation
• Public: department (DfT); regulator (CAA)
• Hybrid: airports
• Private: airlines

Utilities—water
• Public: department (DEFRA); regulator (Ofwat)
• Hybrid: water companies
• Private: financing large infrastructure investments (such as Tideway); ownership of some private water companies (such as Macquarie owning Thames Water Ltd.)

Utilities—energy
• Public: regulator (Ofgem); Nuclear Decommissioning Authority
• Hybrid: power distribution
• Private: ownership of privatised national infrastructure (such as National Grid, UKPN); ownership of some power generation (EDF); energy companies

Utilities—telecoms
• Public: regulator (Ofcom)
• Private: ownership of privatised national infrastructure (BT/Openreach); ownership of mobile networks; ownership of cable networks

Real estate
• Public: local authority/social housing
• Hybrid: housing associations; construction of most public buildings (hospitals, schools, etc.)
• Private: private housing; commercial property

As we came towards the end of writing this book, we invited a wide range of different subject matter experts to review the text and provide feedback. Part of the fun of inviting a lot of intelligent people to comment on your work is that some of their comments will open whole new avenues for consideration. Case in point: reviewing the section above, Steven Steer argued that from his perspective the distinction between public and private enterprises was at best secondary to the distinction between central planning and what he called "emergent free market outcomes." Within this framing, Steer sees it as less important whether the organisation that provides a public service is publicly owned or privatised, and more important whether that party has incentives to improve service quality. This in turn echoes the question of natural monopolies that we touched on above. Where a natural monopoly exists—for example, the provision of drinking water—then regardless of whether the utility is publicly or privately owned we will not be able to rely on competition between organisations to drive service improvements. Customers cannot easily change supplier, and so it becomes more important that regulations and transparency are in place to ensure sufficient scrutiny of the utility provider's planning decisions. As Steer wrote in a 2022 paper, "The design approach for the Centrally Planned portion needs a method that creates solutions that meet all stakeholders' needs and it must enable the Free-Market solutions (Steer, 2022)."

Whilst we still believe that public versus private ownership is a useful lens through which to view the sector—particularly where private organisations have profits to extract from public services and shareholders to appease—we do find that Steer's framing helps to explain why privatisation alone does not reliably result in improvements to public services, and often has quite the opposite effect. Where governments privatise a public service, particularly a natural monopoly, without any provisions for spurring competition or ensuring transparency, we shouldn't be surprised to see those organisations perform no better than the public sector organisations that they replace. High-quality open data has a role to play in creating transparency and holding organisations to account. However, Steer cautions against relying on metrics and measures alone: where they contradict the underlying incentives that organisations face, they are unlikely to change outcomes in isolation. If we create contracts that incentivise organisations to minimise short-term cost, then no amount of metrics or measures will be sufficient to shift those contracts towards a sustainable whole life cycle outlook.

2.7.2 Data As Scorekeeper

As data professionals our work should reflect (and subtly influence) the outcomes that are important to the organisations that we work for. The metrics that organisations use influence their behaviour, often by creating incentives for ways of working or outcomes that might not otherwise exist. Working in the built environment one encounters diverse types of metrics that reflect the politics and priorities of the sector, including:

• Legislative metrics: Usually a trickle-down from government policy, these might be top-down measures such as targets around safety, sustainability, diversity, data protection, and financial probity. These measures work best when there are clear legal consequences for non-compliance.

• Regulatory metrics: These measures tend to come from government bodies rather than law (though the powers to set the measures stem from legislation). Good examples include long-running performance measures set by the ORR, such as the public performance measure (PPM) in the rail sector. These measures work best when there are financial consequences for not meeting them (see financial metrics below). Like all measures, they can drive counter-intuitive behaviour (see: 'Parliamentary trains') as well as sizable bureaucracies to track and dispute measures (such as the train delay attribution industry).

• Financial metrics: The most obvious type of metric: we still undoubtedly live in a world where many organisations (and consequently the people who work for them) put profit first.
  • In the private sector, financial metrics scale to motivate at all levels of the hierarchy, from the overall profitability of a business, to P&Ls for business units, to implicit or explicit revenue and margin targets for staff in leadership and sales positions. Financial metrics benefit from being (reasonably) difficult to fake, and often even stimulate competition and collaboration within as well as between organisations. However, there are also obvious limitations: maximising revenue on a time and materials basis will not always deliver the best outcomes for society or the environment, and at its worst can incentivise work for the sake of work.
  • The public sector also makes use of financial metrics, but usually these focus on the responsible expenditure of public money rather than maximising the size of the pot (that said, empire building is hardly unknown). In the UK, procurement regulations and the National Audit Office place strict controls on how public organisations spend their budgets. In practice this often boils down to the diktat "spend your budget, not a penny more not a penny less, and spend it in a year." In addition to strict adherence to budget, the government often encourages public organisations to seek ways to realise efficiency savings (e.g., more for less), though often this translates into value engineering (e.g., less for less).

• Reputational metrics: Public services exist to serve the public, and they know it. This is implicit in the way that we talk about public assets and the indignation many of us feel when they are not as available as we would like (e.g., "my train was delayed again," "the council are digging up my road again," "my local pool/library/school is terrible," etc.). Public clients are sensitive to how the public perceives them, and often very keen to measure this sentiment either directly through surveys or indirectly through techniques like social media sentiment analysis. Reputational damage is often a fuzzy but vital risk that public clients and their suppliers are keen to avoid. One can leverage this fear to encourage them to do things that are in the public interest but which they may not otherwise be incentivised to do.

• Environmental, social, and governance (ESG) metrics: These are common in both the private and public sectors and will increasingly account for part of the workload that data teams pick up. Where financial metrics often encourage the relentless pursuit of growth and activity, ESG metrics could provide a counterbalance, encouraging the re-use of existing assets rather than their relentless replacement.

• Ethical metrics: There are datasets that we should maintain at an organisational, sectoral, or societal level for reasons that transcend the objectives of our day-to-day work or organisation. Such metrics can help to shine a light on where our actions (or inactions) disadvantage groups, and they can help reveal blind spots. Measures like the gender pay gap are important glimpses of inequity even where they are not flattering or do not align to the story that we would like to tell about ourselves. As Caroline Criado Perez relentlessly establishes in Invisible Women, a lack of sex-disaggregated data combined with a default male perspective has harsh repercussions on women's lived experience, health outcomes, and economic value. As Criado Perez puts it, the "gender data gap" is a "gap in perspective," and her book is a catalogue of cases where organisations and political structures "didn't deliberately set out to exclude women. They just didn't think about them (Criado-Perez, 2019)," for example by designing car safety features around the shape of the average man. The same errors of omission are almost certainly true of other disadvantaged groups and protected characteristics. As we apply data to the built environment we must actively seek to think about these groups, to collect data that reflects their experiences, and to identify where physical assets and systems should better meet their needs.

There is a role for data in creating consistent measurements that can apply broadly across multiple organisations. The risk of the balanced scorecard approach is that organisations can define sustainability or customer satisfaction in the manner that is most flattering or that provides them with the most leverage over the direction of the metrics. Whilst it would be self-defeating to try to standardise balanced scorecards across organisations with quite different priorities, these scorecards should nonetheless draw from a standard set of defined terms, measures, and reporting practices. A good example is the Construction Innovation Hub's Delivery Value Scorecard, which considers a project's impact on natural, social, human, and produced (human-made) capital (Construction Innovation Hub, 2022).

The ownership and incentive structures that pertain to the built environment often go unnoticed: they are the waters that we swim in, and usually persist beyond our immediate planning horizon. Nonetheless they undoubtedly drive how the sector uses data. From a Machiavellian perspective, understanding what constitutes success for our superiors makes it easier for us to offer data solutions that help to deliver success in those areas and therefore to secure top-down support. From a more ethical perspective, understanding the incentives of our organisations will help us to ensure that they use data solutions in a manner that is beneficial to society and the natural environment, not just to flatter egos and profits.
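Returning to the gender pay gap mentioned under ethical metrics: once the underlying data is collected, the headline measure itself is simple to compute. Below is a minimal sketch using invented payroll figures; it follows the UK statutory convention of expressing the median hourly pay difference as a percentage of the male median.

```python
from statistics import median

# Hypothetical payroll extract: (employee_id, sex, hourly_pay_gbp).
# In practice this would come from your HR system of record.
payroll = [
    ("E001", "F", 18.20), ("E002", "M", 24.50), ("E003", "F", 21.00),
    ("E004", "M", 19.75), ("E005", "F", 16.40), ("E006", "M", 28.10),
    ("E007", "F", 30.00), ("E008", "M", 22.30),
]

def median_pay_gap(records):
    """Median hourly pay gap as a percentage of the median male rate,
    mirroring the headline measure used in UK gender pay gap reporting."""
    male = median(pay for _, sex, pay in records if sex == "M")
    female = median(pay for _, sex, pay in records if sex == "F")
    return 100 * (male - female) / male

print(f"Median gender pay gap: {median_pay_gap(payroll):.1f}%")
```

The point is that the hard part is rarely the formula; it is ensuring that the disaggregated data exists and is trustworthy in the first place.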

2.8 Data Sharing

As a result of many years of immature data practices, the built environment suffers from a dearth of properly curated longitudinal (time series) data, a lack of cross-disciplinary or cross-organisational datasets, a backlog of under-managed sensitive data, and under-utilised non-sensitive data. There is friction here: it is hard to value something that you don't make use of, and it's hard to justify the liability of storing data if you are not deriving value from it. This can result in organisations under-investing in collecting and curating valuable data, or disposing of potentially useful datasets. These twin issues can make it difficult to properly judge what sensitive data organisations should or should not retain, as well as the appropriate retention period.

The National Infrastructure Commission's 2017 report Data for the Public Good speaks forcefully on the need for better data sharing across the sector. As the paper correctly identified, data sharing will realise a net public good, but too often "cross-boundary approaches to data collection and sharing… [are] at odds with conventional business cases which are focused on the value to the authority making the case (Hayes et al., 2017)." In this sub-optimal scenario, the beneficiaries of data use, namely members of the public, fail to accrue the improved level of service that would come from better data sharing because few public sector organisations or departments are incentivised to share data beyond their own boundaries. To overcome this collective action problem/first-mover disadvantage, the paper recommends the creation of data trusts. The distinguishing feature of a data trust, according to the Open Data Institute, is that an "independent person, group or entity stewarding the data takes on a fiduciary duty (Open Data Institute, 2018)." For organisations wary of the liability and security implications that come with sharing data publicly, a data trust can provide a lower risk means of sharing, and benefiting from, a broader pool of cross-organisation data.

2.8.1 Sharing Commercial Data

Cost information is so crucial to the built environment sector that entire sub-disciplines have evolved specifically to manage cost data. Cost estimators, quantity surveyors, and commercial managers jealously guard their methods and their data. In practice, small low-quality datasets mean that cost estimation is as much art as science. Centralised cost databases, such as the Royal Institution of Chartered Surveyors (RICS) Building Cost Information Service (BCIS), are sparsely populated; others, such as Spon's, provide only indicative pricing, leaving the estimator with a fair amount of guesswork to do.

Overall, the sector would benefit from large, open, detailed databases of actual costs from real projects. With more than two thirds of construction projects coming in over budget, there are clearly lessons to learn in terms of the additional costs that estimators overlook (KPMG, 2015). Imagine, for example, the value of having access to open book costs for every major public sector project. Not only would clients have a better idea of what they are buying, but suppliers would also benefit from being able to sense-check their rates and avoid potentially disastrous instances of the winner's curse (e.g., winning work by submitting an undeliverably low bid).

However, at present there is little incentive for parties to share this information. This might be due to the asymmetric benefits: the first organisation to submit its data loses its commercial advantage without assurance that others will follow suit. Data sharing activities usually require some enlightened organisation to begin sharing before it is customary practice, perhaps even before there is apparent demand for the data. The first organisations to share costs would not benefit from reciprocity from other organisations, but they would be exposing themselves to an erosion of their commercial advantage, or potentially to reputational damage as they reveal where they are spending their money.

It varies by the contract form used, but often the bulk of the cost information sits with the supply chain, with the client organisation only seeing the high-level invoices at the end of the day. This means that even if clients wanted to share cost information, they would not be able to share the useful detailed information. Suppliers, on the other hand, will see few upsides and lots of downsides to sharing their information. Having access to more and better cost information could help suppliers to produce better estimates for their work and reduce the risk of loss-making contracts. However, over the years your authors have been privy to many conversations with suppliers where the prospect of sharing information has come with the fear of revealing margins to both competitors and clients.

The irony here is that built environment margins are often unsustainably low. An emphasis on lowest cost procurement, combined with client organisations who are unaware of their supply chain's actual margins, has at times resulted in a race to the bottom where winning bidders often experience winner's remorse. By protecting their sensitive cost information suppliers are clouding the sector's ability to see this problem in full. By protecting our interests in the short run, we are collectively harming the sector over the long run. What is true of cost information is broadly true of other project controls information such as schedule and risk data.

So how do we resolve this prisoner's dilemma? Some effort has been made to create common pools of cost data by third-party organisations such as the Construction Data Trust. Data science provides us with the tools to synthesise massive quantities of information whilst reducing the risk of exposing the data belonging to individual parties. These tools include simple changes like pseudonymisation/anonymisation and aggregation, but also more sophisticated techniques such as using deep learning to extract insight from large datasets without revealing the source information, or generating synthetic data based upon sensitive source data.
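As a minimal sketch of the simpler techniques just mentioned, the code below pseudonymises supplier identities with a salted hash and publishes only aggregated unit rates once a minimum number of contributors is met. The suppliers, rates, and threshold are all invented for illustration.

```python
import hashlib
from collections import defaultdict
from statistics import mean

SALT = "keep-this-secret"   # held by the data trust, never published
MIN_CONTRIBUTORS = 3        # suppress aggregates drawn from too few parties

def pseudonymise(supplier: str) -> str:
    """Replace a supplier name with a stable, non-reversible token."""
    return hashlib.sha256((SALT + supplier).encode()).hexdigest()[:12]

# Hypothetical submitted rates: (supplier, work item, unit rate in GBP).
submissions = [
    ("Acme Civils", "excavation_m3", 42.0),
    ("Beta Build", "excavation_m3", 47.5),
    ("Gamma Ground", "excavation_m3", 39.0),
    ("Acme Civils", "concrete_m3", 145.0),
    ("Beta Build", "concrete_m3", 150.0),
]

rates = defaultdict(dict)   # work item -> {pseudonym: rate}
for supplier, item, rate in submissions:
    rates[item][pseudonymise(supplier)] = rate

for item, by_supplier in rates.items():
    if len(by_supplier) >= MIN_CONTRIBUTORS:
        print(f"{item}: mean rate £{mean(by_supplier.values()):.2f} "
              f"from {len(by_supplier)} suppliers")
    else:
        print(f"{item}: suppressed (fewer than {MIN_CONTRIBUTORS} contributors)")
```

The suppression rule matters as much as the hashing: an aggregate built from one or two suppliers would effectively reveal their rates regardless of pseudonymisation.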

2.8.2 Location Data for Transport Operations

Understanding how people move around cities and nations and make use of infrastructure is simultaneously one of the most promising and potentially problematic opportunities afforded by (near) real-time data sources. On the one hand, planning and managing transport networks around real-world usage patterns has the potential to unlock unrealised capacity in our networks, help mitigate the impact of disruptions, and aid the response to emergencies. On the other hand, society has not (consciously) consented to such large-scale monitoring of its movements, and the potential for invasion of privacy is enormous.

In this field there has been a paradigm shift in the availability of data over an extremely brief period as smartphones and other internet-connected devices have proliferated. Most of us are recording and reporting our location in (near) real-time, often without realising it. The question is not whether companies track our location: they do. Instead, the question is which organisations have access to our location data, and how they use it. There are a few principal means whereby we leave our location/movement trace, with varying levels of anonymity:

• Anonymous surveys and sensors: The most low-tech and least controversial type of tracking. This includes traffic counters laid across roads, passenger number trackers (scales) on trains, and even good old manual surveys with handheld counters.

• Personally identifiable sensors: There are a wide range of established and (broadly) accepted technologies in this space, including automatic number plate recognition (ANPR) and ID card systems (e.g., building passes, Oyster cards, contactless cards). There are also more controversial technologies, particularly facial recognition, which few democratic societies accept but which repressive/autocratic regimes use to terrifying effect.

• Connected devices tracking: In theory these are services that we opt in to using on our smart devices (e.g., smartphone, smart watch, laptop, tablet, car, etc.). Manufacturers build tracking into devices (for example, internet-connected vehicles can report their movements as well as logging other incidents such as failures, accidents, and loss of traction). Other forms of tracking start based upon the actions of the user, for example using apps such as Google Maps, Waze, Strava, and that one free 'spirit level' app you downloaded for a DIY project, indeed any app that asks for access to location information. Other user actions that reveal location include switching on Bluetooth, connecting to Wi-Fi networks, using contactless technologies, sharing location in messages, geo-tagging social media posts, or simply sharing pictures.

The extent to which the use of personally identifiable data by infrastructure owners is troubling is a subjective question, and depends on both the intent of those wielding the data and one's own personal preferences. The same technologies can be used benignly, for example speeding your access to and egress from a train station by removing the need to show a ticket, or repressively, for example allowing regimes to track dissidents or to make it impossible for certain parts of the population to congregate or travel freely.

Whilst we may believe that we have control over how and where third parties use our location data, there are plenty of back-doors. Few users closely monitor the permissions that they are assigning to applications, nor do they read what rights they are giving companies to resell or otherwise use data collected in this manner. The same will doubtless be true at an organisational level, as we are not always clear on the small print when we purchase software services from vendors. There is a thriving marketplace for location information, and plenty of infrastructure owners purchase aggregated location data from mobile network operators or use services provided by companies such as Inrix or Citi Logik.

That said, the traditional built environment fields of transport planning, infrastructure operations, and timetable planning have yet to fully make use of the volume and granularity of (near) real-time data available. Many transport planning functions still rely on byzantine models based upon software developed by oil and gas companies decades ago, and few control centres have effectively integrated (near) real-time location data into their decision-making processes. This can result in circumstances where customer-facing applications such as Waze provide considerably better (near) real-time information on the performance of the transport network than is available to the people in the control centres operating that same network.

There is a world in which (near) real-time location data massively improves the resiliency of our transport networks without compromising privacy. However, this will require the right kind of legal controls, aggregation, anonymisation, improved adoption of technology, and (crucially) public consent. It will also likely require a more coherent and joined-up approach from the public sector to avoid different government departments and arm's length bodies repeatedly paying for the same information from the same vendors.

In both examples, cost estimation and location data, there are clear advantages, dangers, and muddled incentives in widespread data sharing. These examples are indicative of a wider range of dilemmas faced across the built environment, where the route to interoperability and data sharing remains nascent and unproven. A notable success story that begins to demonstrate the tantalising possibilities is the UK's National Underground Asset Register (NUAR). The report Data for the Public Good sowed the seed, arguing for coordination across organisations working with data and noting that the Geospatial Commission was in an advantageous position to coordinate the sharing of data on underground assets. The Geospatial Commission, under the leadership of Holger Kessler, has delivered on this vision, overcoming a myriad of technical and political obstacles. NUAR is now one of the more fully realised digital representations of the built environment. The NUAR minimum viable product (MVP) covers much of England and Wales and incorporates data from roughly eighty public and private underground asset owners. The goal of the project is to "improve the efficiency with which we supply essential services and… minimise the disruption experienced by… road users (Geospatial Commission, 2023)." As NUAR continues to grow in scope and sophistication we hope that it will continue to demonstrate what the built environment can achieve through iterative development, cross-organisation data sharing, and a collaborative approach to working with data.
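Returning to the aggregation and anonymisation safeguards mentioned above, the sketch below reduces raw device traces to coarse origin-destination counts and suppresses any flow observed for fewer than a threshold number of devices, one simple way to publish useful movement data without exposing individual journeys. The zones, traces, and threshold are all invented for illustration.

```python
from collections import Counter

K_THRESHOLD = 5  # publish a flow only if at least 5 distinct devices made it

# Hypothetical device traces: device_id -> ordered list of coarse zones
# (in practice, zones might be census areas or station catchments).
traces = {
    "d1": ["ZoneA", "ZoneB"], "d2": ["ZoneA", "ZoneB"],
    "d3": ["ZoneA", "ZoneB"], "d4": ["ZoneA", "ZoneB"],
    "d5": ["ZoneA", "ZoneB"], "d6": ["ZoneA", "ZoneC"],
}

# Count each origin-destination pair once per device.
flows = Counter((trace[0], trace[-1]) for trace in traces.values())

for (origin, destination), devices in flows.items():
    if devices >= K_THRESHOLD:
        print(f"{origin} -> {destination}: {devices} devices")
    else:
        print(f"{origin} -> {destination}: suppressed (below k={K_THRESHOLD})")
```

Coarse zones plus a minimum-count rule will not defeat every re-identification attack, but they illustrate the basic trade: the coarser the aggregation, the stronger the privacy and the weaker the operational signal.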

2.9 Interoperability

Data provides us—at least theoretically—with the ability to exchange knowledge across organisational and technological boundaries. Interoperability isn't just about exchanging data between organisations; it is about exchanging meaningful information such that those organisations gain knowledge and can act in a coordinated manner. To accomplish interoperability between two or more organisations we require:

1. A means to securely move data between organisations.
2. An agreed format (syntax) for the data such that it is legible to both organisations.
3. A shared set of definitions (semantics) such that both organisations can interpret the data consistently and accurately.
4. Behavioural alignment (processes and goals) such that the organisations act on the information in a coordinated manner.
5. Legal and contractual alignment.

These criteria mean that interoperability extends across data architecture, solution architecture, and business architecture. Or, phrased another way, it furthers interoperability between humans by way of their software and data. Discussions of interoperability often focus on the purely technological aspects of exchange; for example, the UK's Built Environment Interoperability Group defines interoperability as "the ability of two or more systems to exchange information and to use the information that has been exchanged." However, we know that merely exchanging structured data—whilst a pre-requisite to coordinated outcomes—is not itself sufficient to deliver improvements to the working of the built environment. Interoperable data requires decision-makers who are willing and able to use that data to make coordinated decisions.

The opportunity here is to create interconnected flows of meaningful data that reflect the connections present in the built environment itself. The built environment is a thick web of services: power, water, energy, telecoms, transport, and buildings, many of which are reliant on each other to meet their users' needs. Our organisational structures, and therefore our use of data, pretend that the built environment (and its data) breaks down into, and can be managed as, isolated (and geographically-bounded) problems. In the UK, as an example, there are dozens of infrastructure owners, regulated bodies, and government departments responsible for the country's utilities infrastructure, but no body responsible for ensuring that these services work systemically to meet societal outcomes. A lack of interoperability risks redundant data collection (e.g., multiple organisations paying to collect the same data), data incompatibility (e.g., organisations defining metrics in contradictory ways), and missed opportunities (e.g., organisations making sub-optimal decisions for lack of data that was available elsewhere).

The Climate Resilience Demonstrator (CReDo) is an initiative born out of the Centre for Digital Built Britain and currently run by the Connected Places Catapult. It uses a blend of real and synthetic data to demonstrate the impact on water, power, and telecoms infrastructure of serious floods, and particularly where interdependencies can extend or mitigate the loss of services caused by floods. It shows services failing not just as a direct result of flooding, but because the flooding takes out other infrastructure assets on which those services depend. For example, one might lose a water treatment plant indirectly, because the sub-station that powers it is damaged by flooding. What CReDo does in a compelling way is make the case for systems thinking, where the best investment that an infrastructure owner can make to improve the resilience of their service might be in improving the resilience of another infrastructure owner's assets. We can extend the logic of CReDo to consider a wider range of climate challenges, as well as a broader network/system of asset types. That logic in turn depends upon interoperability, which Kirsten Lamb, Didem Gurdur Broo, and co-authors defined as "the ability for a system… to work with other systems… without the special effort of the user (Gurdur Broo et al., 2020)." True to the name, CReDo provides, in their words, "a practical example of how connected data," (read: interoperability), "can improve climate adaptation and resilience across a system of systems."

Interoperability, as a concept, operates across a range of scales. At the solution level, interoperability is more present than ever. Most modern software products, particularly SaaS products, will come with application programming interfaces (APIs) out of the box. As the name suggests, these allow different software products to securely use each other's data and services. There is also a thriving ecosystem of suppliers who build platforms on top of APIs, allowing a broad collection of software to function effectively as a single business service. Examples of this type of platform include Business Intelligence (BI) tools such as Domo, Tableau, and others; Master Data Management (MDM) and Data Governance platforms such as Informatica or Collibra; and, in the built environment sector, the integration functionality of platforms such as ESRI, Autodesk, ASite/3D Repo, or GeoConnect+.

Whilst we will discuss the practical application of this type of solution-level interoperability later in this book, it is merely a precursor to the level of interoperability that projects like CReDo have in mind. CReDo requires the rapid exchange of meaningful information between organisations at a system of systems level. Use cases for this type of interoperability include disaster resilience, multimodal mobility-as-a-service (MaaS), energy demand management at a regional or national scale, and (presumably) less benign motivations such as wartime mobilisation. Interoperability at this level of sophistication requires not just the physical and digital infrastructure to collect and exchange huge volumes of near-real-time information between organisations, but just as crucially the ability to create shared meaning between those organisations. In more technical terms, this means that interoperability is more than simply an API with an interface specification; it is a question of ensuring that the organisations that form the system of systems share:

A. Semantic interoperability: Common definitions of what terms mean (e.g., what do you mean by a customer or an asset?).
B. Process interoperability:
   a. An understanding of what decisions they can and should inform using these flows of meaningful data.
   b. Clarity around how they should communicate those decisions back to the other parties in the system of systems, therefore generating yet more data.

Publications such as the CDBB's Information Management Framework, as well as existing BIM process documentation such as Exchange Information Requirements (EIRs), establish the beginnings of a decentralised architecture to cater to these flows of information, as well as a common ontological framework to inform semantic interoperability. However, there remains substantial work involved in reconciling the different perspectives, information specifications/needs, decision-making processes, and cultures/mindsets of organisations. Even working through the definition of what constitutes a single asset type (a culvert, for example), and what one may wish to know about an asset such as a culvert across rail, road, and environmental organisations, proves a non-trivial challenge. This is not simply because different organisations have slightly different definitions of the size of an asset that they consider a culvert (rather than a pipe, or a bridge), but also because different organisations care about different implications of the culvert's performance: Is it preventing blockage or overflow? Is it likely to fail and derail a train? Does it provide drainage to my road or risk overspill onto my road?
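Sticking with the culvert example, the sketch below shows one crude form of semantic reconciliation: mapping two organisations' differing field names, units, and size thresholds onto a single shared definition. The schemas, field names, and the one-metre threshold are all invented for illustration; real ontologies, such as those emerging from the Information Management Framework, are far richer.

```python
# Hypothetical source records from two organisations.
rail_asset = {"asset_class": "CULVERT", "span_mm": 1800, "risk": "derailment"}
road_asset = {"type": "drainage_structure", "diameter_m": 0.9, "risk": "flooding"}

SHARED_MIN_CULVERT_SPAN_M = 1.0  # agreed definition: below this it is a 'pipe'

def to_shared_schema(record: dict) -> dict:
    """Map organisation-specific fields onto an agreed shared schema."""
    if "span_mm" in record:                     # rail convention: millimetres
        span_m = record["span_mm"] / 1000
    else:                                       # road convention: metres
        span_m = record["diameter_m"]
    return {
        "asset_type": "culvert" if span_m >= SHARED_MIN_CULVERT_SPAN_M else "pipe",
        "span_m": span_m,
        "concern": record["risk"],
    }

for asset in (rail_asset, road_asset):
    print(to_shared_schema(asset))
```

Note that under the shared definition the road organisation's "drainage_structure" classifies as a pipe, not a culvert: agreeing the thresholds is precisely the semantic work that no amount of API plumbing can do for you.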

96

2 The Challenge of the Built Environment

The challenge of interoperability becomes, for now at least, narrowing down the scope to meet the most immediately beneficial use cases and, as CReDo has done, incrementally scaling up one's infrastructure, definitions, and processes to begin to meet local, regional, and national challenges.
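Returning to CReDo's system-of-systems argument, the toy sketch below propagates a flood through an invented dependency graph to show how a service can fail indirectly, echoing the sub-station-to-water-treatment chain described earlier. The assets and dependencies are hypothetical.

```python
# Hypothetical dependency graph: asset -> assets it depends on.
depends_on = {
    "water_treatment_plant": ["substation_7"],
    "telecoms_mast_3": ["substation_7"],
    "substation_7": [],
    "pumping_station_2": [],
}

def failed_assets(directly_flooded: set) -> set:
    """Propagate failures: an asset fails if flooded or if any dependency fails."""
    failed = set(directly_flooded)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for asset, deps in depends_on.items():
            if asset not in failed and any(d in failed for d in deps):
                failed.add(asset)
                changed = True
    return failed

print(failed_assets({"substation_7"}))
# -> substation_7 fails directly; the water treatment plant and telecoms
#    mast fail indirectly, despite not being flooded themselves.
```

The dependency edges in such a graph typically cross organisational boundaries, which is exactly why this analysis is impossible without cross-organisation data sharing.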

2.9.1 Delivering Interoperability at the Mega Scale

CReDo and NUAR are great examples of a 'bottom-up' approach to building interoperability. As we covered earlier, NUAR is a comprehensive digital map of underground assets such as pipes and cables, developed by the Geospatial Commission in the UK with the aim of boosting the efficiency and safety of underground works. It provides planners and excavators with structured access to both private and public data related to underground assets. NUAR started by building a minimum viable product (MVP) which offered essential but continuously developing functionality, providing an early version of the service to a select group of users. Initially, it encompassed data for specific regions: for example, London, Northeast England, the West Midlands, and Wales. The NUAR MVP was launched on 5 April 2023 by Viscount Camrose in Newcastle, covering specific regions with plans to expand across England, Wales, and Northern Ireland by spring 2024.

NUAR's approach was to limit access to asset owners in specified regions who have agreed to the NUAR Data Distribution Agreement (DDA) and a Data Ingestion Specification (DIS), and to authorised street works contractors working for these asset owners. Access management is done through the NUAR administration portal, which allows asset owner organisations to manage users and invitations based on certain criteria. The task of gaining permission from data owners should not be taken lightly, and is a key role for stakeholder managers.

There are key features of the platform that are noteworthy. Firstly, it is a secure platform: NUAR maintains production security controls, ensuring data protection and permitting role-based data access. It provides an interactive map that allows users to query detailed attribute information about each asset and site. Asset owners can flag assets as 'sensitive' and assign 'enhanced measures' to them. Post-launch, NUAR will continue to evolve: adding new features, refining existing ones, and incorporating additional datasets. Expansion of service coverage and inclusion of additional data are ongoing as the service enters more regions of the UK. The critical task is influencing legislative reform: top-down, the biggest lever is legislation and regulation.

NUAR's MVP marks an essential step in developing a system that can significantly enhance underground work practices, and mitigate the risks associated with such operations, by providing reliable, standardised access to essential underground asset data. This initiative not only prioritises safety and efficiency but also involves users in its developmental process to iteratively refine and enhance its capabilities and functionalities.

Securing cooperation and data sharing between utility owners, operators, and regulators is pivotal to the comprehensive development and functionality of platforms like NUAR and CReDo; from this perspective, their methods of organisation (top-down/bottom-up) are irrelevant. The "Tragedy of the Commons" is a situation in which individuals, acting independently and rationally according to their self-interest, behave contrary to the common good of all contributors and consumers by depleting or spoiling a shared resource. In the context of digital platforms like NUAR and CReDo that aim to facilitate the sharing and management of asset data, it is possible to envision some potential issues:

• Data Overload vs Data Quality: Too many entities contributing data without rigorous quality checks or standardisation could degrade the utility of the platform.
• Disparate Objectives: With both top-down (NUAR) and bottom-up (CReDo) approaches, there might be conflicts in objectives or management principles, which could diffuse the focus and effectiveness of data management efforts.
• Unequal Contributions and Benefits: Some organisations may diligently share high-quality, accurate data while others may not. Yet all entities derive similar benefits, which might demotivate diligent contributors over time.
• Security and Privacy Concerns: Sharing data on a common platform could raise concerns about data misuse or unauthorised access, especially if sensitive information about critical infrastructure is involved.
• Technological and Operational Inconsistencies: Different platforms may employ varying technological solutions and operational workflows. Reconciliation between the systems, or ensuring interoperability, might be challenging and resource-intensive.
• Competitive Disadvantages: Organisations might feel that sharing their data freely could empower their competitors or devalue their proprietary information.

Here are some conceptual steps and considerations to facilitate the integration of national (and maybe international) data systems and remedy the previous points:

1. Legal Framework and Compliance: Develop a legal framework (either by regulation or secondary legislation) that mandates data sharing and collaboration between utility owners/operators and regulators. Ensure that data sharing and system functionalities comply with data protection and privacy regulations.
2. Establishing Trust and Collaboration: Organise forums or roundtable discussions between utility owners/operators and regulators to understand their concerns, requirements, and challenges. Develop use cases demonstrating the benefit of data sharing for enhanced safety, operational efficiency, and risk mitigation.
3. Standardising Data: Establish standards for data formatting, sharing, and updating to ensure consistency and accuracy across all shared information. Implement robust data validation processes to ensure data quality and reliability (a minimal validation sketch follows this list).
4. Data Security and Access Control: Employ robust cybersecurity measures to protect sensitive data and assure stakeholders about the security of their data. Implement role-based access control to ensure that stakeholders have access only to the data that is pertinent to their operations and compliance.
5. Transparent and Traceable Data Usage: Create a transparent system that logs all data access and usage, which can be audited by relevant stakeholders. Implement feedback loops to continuously optimise data usage, accuracy, and reliability.
6. Integrating Regulatory Compliance: Develop features within the platform to facilitate regulatory compliance, such as automated reporting and analytics. Integrate automated alerts for non-compliance or potential risks based on real data (not opinions on PowerPoint slides and spreadsheets).
7. Continuous Engagement: Maintain an ongoing dialogue with stakeholders to ensure that the platform evolves with the changing needs of the industry and regulatory frameworks. Regularly update stakeholders on enhancements, case studies, and impact stories related to the platform.
8. Training and Support: Provide comprehensive training for utility owners, operators, and regulators to maximise the utility and adoption of the platform. Establish a support and helpdesk framework to assist stakeholders in integrating and utilising the platform effectively. It is important to think of this capability as a service, not a mere task-and-finish activity.
9. Policy Advocacy: Engage with policymakers to advocate the importance of a unified platform, highlighting the socio-economic and safety benefits. Showcase successful models and case studies to lobby for supportive policies and regulations.
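As a minimal sketch of the kind of validation implied by point 3 above, the code below checks incoming asset records against a simplified, hypothetical ingestion specification before they are accepted onto a shared platform. The required fields, asset types, and bounding box are invented for illustration; NUAR's actual Data Ingestion Specification is far more detailed.

```python
REQUIRED_FIELDS = {"asset_id", "owner", "asset_type", "location", "last_surveyed"}
KNOWN_ASSET_TYPES = {"pipe", "cable", "duct", "chamber"}

def validate(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means 'accept'."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS - record.keys()]
    if record.get("asset_type") not in KNOWN_ASSET_TYPES:
        problems.append(f"unknown asset_type: {record.get('asset_type')!r}")
    lon, lat = record.get("location", (None, None))
    if lon is None or not (-8.7 <= lon <= 1.8 and 49.8 <= lat <= 60.9):
        problems.append("location outside Great Britain bounding box")
    return problems

record = {"asset_id": "W-0042", "owner": "Anytown Water", "asset_type": "pipe",
          "location": (-1.9, 52.5)}  # note: missing last_surveyed
print(validate(record))  # -> ['missing field: last_surveyed']
```

Rejecting (or quarantining) records at the point of ingestion is what protects a shared platform from the data overload versus data quality failure mode described above.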

Engaging utility regulators and stakeholders in the system will necessitate a strategic approach, focusing on mutual benefit, compliance, data security, and operational efficiency. A collaborative environment, supported by a robust legal and operational framework, will be essential to integrate regulators, utility owners, and operators into a unified platform like NUAR, ensuring safe and efficient underground work practices. There are also commercial design considerations worth including in agreements and incentives; our cheat sheet for asking good questions in stage gate reviews is:

1. Robust Governance: Implementing stringent governance to ensure fair use, quality control, and equitable sharing of benefits. How are we planning to govern data? How do we understand whether we own, steward, or merely federate the data?
2. Standardisation: Establishing strict data standards and protocols to ensure uniformity and quality of shared information. What standards are we planning into the contract, or mandating as a contractual requirement? (Shortly followed by: how, and what happens when new standards emerge?)
3. Secure and Controlled Access: Employing robust cybersecurity and access control measures to protect data integrity and privacy. What is our exposure to liability in case of a breach?
4. Incentivisation: Introducing mechanisms to incentivise data sharing and adherence to platform guidelines. How are we incentivising the appropriate behaviours? Who pays if/when things go wrong?
5. Interoperability: Ensuring technological and operational alignment between different approaches/platforms to facilitate seamless data flow and integration. Who is responsible for integrating the digital systems? What types of agreements are we making with the technology supply chain? Are there any gaps in the 'back-to-back' of service level agreements?
6. Legal and Ethical Frameworks: Implementing a legal framework that mandates adherence to ethical practices and ensures that violators are held accountable. What are the consequences for misuse? Do we have a culture in which people feel safe to report ethical issues?
7. Collaborative Decision Making: Ensuring that decision-making is a collective process, involving representatives from all contributing entities. How are stakeholders engaged? How do we deliver not just artefacts/documents but outcomes?
8. Transparent Operations: Keeping operations transparent to build trust among contributors and to demonstrate the fair and judicious use of shared resources. What is the structure of our solution architecture? How are we providing requirements to the supply chain through the procurement process? Do they include data management protocols, specifics of how data flows, and the explainability of algorithms?
9. User Education: Regular training and updates for stakeholders to maximise the effective and responsible use of the platform. Have we included the cost of training and continual learning in our estimates of people's time?
10. Continuous Improvement: Establishing mechanisms for feedback, reviews, and continuous improvements to ensure the platform stays relevant and beneficial to all users. What is the lifecycle of a decision? How do we avoid the pitfalls of waterfall planning techniques (being too slow, or not having visibility of issues and change)?

It is crucial that the introduction of multiple platforms (NUAR, CReDo, etc.) does not create a fragmented or incompatible landscape for users. Coordination between top-down and bottom-up initiatives might help harness the strengths of both approaches, possibly integrating administrative oversight with grassroots-level insights and flexibility. An overarching regulatory and governance framework that accommodates both approaches could help avert a tragedy of the commons scenario by aligning interests and ensuring fair, secure, and beneficial data sharing. Ultimately, we are seeking the ability to implement interoperability at huge scale because that is the scale required to manage the built environment through data!

2.10 Data as a Valued Asset

Failure to manage data is similar to failure to manage capital. It results in waste and lost opportunity. (DAMA International, 2017)

One major change in the conversation about data over the last decade or so has been that organisations are increasingly willing to refer to data as a 'valued asset' distinct from the value of their physical assets. The late Herman Heyns was one of the first and most vocal advocates for valuing data as an asset. Herman argued that treating data as a valued asset could drive a mindset change within organisations, where "data investment becomes a value creation activity instead of an IT cost (Anmut, 2023)." With Herman's help, National Highways went further and explicitly valued their data, both as a financial figure—effectively naming the replacement cost of their data assets—and as a percentage of the value of the organisation's overall asset base. There are challenges to this approach, particularly around whether one can realise this notional value, but the intent is clear. Since much of the built environment is held by the public sector, or as a long-term investment, the purpose of valuing data assets is rarely to sell that data. Instead, the exercise is more a question of accountancy and insurance: what value does data add to my organisation, and how much would it cost to replace it? As Davin Crowley-Sweet, CDO of National Highways, told Forbes:

The Data Management Association (DAMA) argues for "organisational recognition of data as a business asset, not just an IT concern (DAMA International, 2017)." To a private sector organisation, the value of data refers to the extent to which it can increase profitability. But value has more nuanced meanings in the public sector, as we discussed earlier in this chapter. The value of data is realised indirectly through the outcomes delivered by physical assets; as Data for the Public Good states, "data can help extract more value from every unit of infrastructure output (Hayes et al., 2017)." The Bennett Institute for Public Policy and the Open Data Institute share this focus on societal outcomes in the definition used in their landmark white paper The Value of Data:

Whilst organisations have started to recognise the value of their data, progress has been slower in realising that value across organisations. This is ironic because data—unlike most valuable assets—is non-rival; as The Value of Data puts it, "many people can use the same data at the same time without it being used up." And yet, there are direct and indirect costs to consider: it may not cost you anything, or deprive you of access to your data, if a competitor uses it, but it would be galling if they used your data to beat you in a competition for work.

The built environment is an archipelago of organisations: government departments, arm's length public bodies, 'tier 1' contractors, sub-contractors, specialists and SMEs, consultancies, service providers, financiers, and operators. This means that no one organisation can access the full dataset for even a small subset of the built environment. Some organisations, such as National Highways, have proactively sought to centralise valued data, but at the same time copious quantities of valued information remain locked up in individual suppliers due to a lack of clear ownership, or a desire to maintain commercial advantage. There is a crucial role for data professionals in client organisations to work with legal and commercial colleagues to identify what supplier data genuinely confers meaningful competitive advantage, and where the benefits of sharing data for the greater good sufficiently outweigh the loss incurred by the sharing party that some degree of compensation is possible.

If data is valuable, then it makes sense that the balkanisation of data reduces this value. The Value of Data describes this in economic terms: "when there are positive externalities, and information content comes from aggregation, too little data will be provided for use, from the perspective of society, as it can be difficult for whoever incurs the cost to capture the benefits of it." Part of our role as data professionals and data clients should be to encourage, cajole, and if necessary mandate our colleagues and peers to look beyond their own personal cost to the broader benefit.

As The Economist reports, "estimates for the value of open government data range from less than 0.1% to more than 7% of GDP (The Economist, 2020)." This wide range of values—translating to between £2.8 billion and £198 billion for the UK—reflects both a lack of consensus around how to value data, and the huge variance in the value of data depending on the extent to which it is used. Sarah Hayes, in her 2017 National Infrastructure Commission paper, estimated that data contributes £50 billion per annum to the UK economy (Hayes et al., 2017). The National Highways data valuation exercise described earlier valued that organisation's data at £60 billion, suggesting that the overall value of public sector data in the UK across organisations is many multiples of that number. Part of the reason for the discrepancy is that these exercises are attempting to measure different things: namely, the contribution of data to GDP on an annualised basis versus the one-time replacement cost of data. That said, these are big numbers, and together they suggest that we can realise many billions of pounds of additional benefit from our data by sharing it more freely and making better use of the data that we already capture. Within the built environment sector, various parties are trying to solve this problem, in both the public sector and private SMEs. A good example is the Construction Data Trust, which looks to provide a 'neutral' safe store for data across both public sector clients and their supply chain (Construction Data Trust, 2023).
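The arithmetic behind that £2.8 billion to £198 billion range is straightforward; the sketch below reproduces it using an assumed UK GDP figure of roughly £2.83 trillion (the GDP figure is our assumption for illustration; the point is the sensitivity of the estimate, not its precision).

```python
uk_gdp_gbp = 2.83e12                  # assumed UK GDP, roughly £2.83 trillion
low_share, high_share = 0.001, 0.07   # <0.1% to >7% of GDP (The Economist, 2020)

low_value = uk_gdp_gbp * low_share    # ~£2.8 billion
high_value = uk_gdp_gbp * high_share  # ~£198 billion

print(f"Open government data value: £{low_value/1e9:.1f}bn to £{high_value/1e9:.0f}bn")
```

A seventy-fold spread between the low and high estimates is itself the finding: the value of the same data varies enormously with how widely it is used.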
The value of data also depends on how it is used: "the consequences of data use affect value… data will be more valuable in highly consequential situations." As we have already covered in this chapter, the use cases of data in the built environment skew towards the consequential, whether it is the risk of drilling into buried electrical cables and gas mains, or the risk of catastrophic asset failure.

Built environment organisations possess a wide range of valuable data. In addition to a wealth of data concerning the physical assets that constitute the built environment, clients also need most of the same corporate tools as any other organisation. The data contained within your Finance (ERP), Commercial, Procurement, HR, Comms, IT, and other corporate systems remains a valuable data resource, one that your organisation must suitably protect and use. Whilst 'support' functions like HR and IT may share more in common with their broader professions than with the built environment sector, it would be a mistake to assume that they do not bring any built-environment-specific knowledge. Just as delivering data projects in the built environment provides us with a unique combination of experiences, the same will be true of other professions such as finance, commercial, or comms. Many of your colleagues in these functions will have spent years working for built environment clients and consultancy organisations, and will have learned how to adapt their expertise to the idiosyncratic needs of the sector.

Many of the measures that you consider in construction and operation, such as resources, cost, and time, must ultimately link back to the corporate ERP system as the system of record and accounts. Otherwise, there is always a risk that the data 'truths' of the project diverge from the truths of the balance sheet. This also means that corporate functions will perform many of your data owner and data steward roles, particularly for sensitive and business-critical datasets. As a data professional you must be able to identify value in the data held across your organisation, and particularly where there are opportunities to increase the value of your built environment data by combining it with data held in supporting systems. This might mean, for example, cross-referencing your cost estimates against what's hitting your organisation's bank account, using HR records to better understand the capability and capacity of your organisation, or working with commercial and procurement to ensure that your data requirements are mandated and observed in your contracts.

Because data is non-rival, we realise the greatest value from it by thinking of it as an ecosystem. A healthy ecosystem is one with a wide range of different data types and data sources, both internal to your organisation and external cross-sector data. We realise value from data not necessarily by protecting and hoarding it, but by ensuring that, within the rules of GDPR and security policies, we allow our data to spread and cross-pollinate. We should seek to make data available wherever there is a compelling hypothesis to test or a decision to be supported. In a standard matrix organisational structure, data acts as a horizontal capability across the vertical disciplines/departments of the organisation. This means that we realise the value of data holistically across the organisation. Looking at data value this way aligns it with the preceding discussions concerning societal outcomes and balanced scorecards. The marginal cost of data is often higher in the built environment than in other industries, whilst the use cases for that data are often extremely consequential for the wellbeing of society. This is even more reason to value and make the best use of our data wherever possible, and to not let it go to waste through neglect.
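As a minimal sketch of the cross-referencing described above, the code below reconciles project cost estimates against hypothetical actuals drawn from an ERP system of record, flagging lines that have diverged beyond a tolerance. All cost codes, figures, and the tolerance are invented for illustration.

```python
TOLERANCE = 0.10  # flag lines where actuals diverge from estimate by >10%

# Hypothetical figures: cost code -> value in GBP.
estimates = {"CC-100": 250_000, "CC-200": 90_000, "CC-300": 40_000}
erp_actuals = {"CC-100": 262_000, "CC-200": 131_000}  # CC-300 not yet invoiced

for code, estimate in estimates.items():
    actual = erp_actuals.get(code)
    if actual is None:
        print(f"{code}: no actuals in ERP yet")
        continue
    variance = (actual - estimate) / estimate
    flag = "INVESTIGATE" if abs(variance) > TOLERANCE else "ok"
    print(f"{code}: estimate £{estimate:,}, actual £{actual:,}, "
          f"variance {variance:+.1%} [{flag}]")
```

Even a check this simple keeps the project's data 'truths' anchored to the balance sheet, rather than allowing the two to drift apart unnoticed.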

2.11 Conclusion

You understand that the long-term sustainability of any enterprise or organisation, or ecological niche, is dependent on its ability to be sufficiently diverse and sufficiently efficient—to stay within the window of viability. (Richard Carter, Cognitive Advantage)

When you work on the built environment, your successes and failures have lasting consequences. The sector faces a wide range of challenges, from climate change, to safety, to delivering better outcomes for society, and there is a role for data in each of them. Too often, the sector's use of data has contributed to programmes of work that under-deliver, place undue pressure on the public purse, and unduly inconvenience (or in the worst cases shorten) the lives of the public. Getting to grips with data and the built environment means being willing to learn both from other sectors, and from the tremendous amount of accumulated experience, research, and guidance available from standards bodies, government departments, NGOs, and previous projects. The challenges of the built environment are often unique, but much of our data work should be the application of good practices from elsewhere. We must always work with a view towards the impact that our work will have on society. Only then will we stand a chance of proving our hypothesis: "the use of data techniques can meaningfully improve the performance and outcomes of the built environment."

References

Akerlof, G. A. (1970). The market for "lemons": Quality uncertainty and the market mechanism. The Quarterly Journal of Economics, 84(3), 488–500.
Anmut. (2023). How treating data as an asset benefits your business. Anmut. https://www.anmut.co.uk/treating-data-as-an-asset/
Architecture 2030. (2023). Why the built environment. https://architecture2030.org/why-the-building-sector/
BBC News. (2012). TfL report warned of Hammersmith Flyover collapse risk. BBC News. https://www.bbc.co.uk/news/uk-england-london-20533457
BBC. (2016). London City Airport bought for £2bn by Canadian-led group. BBC. https://www.bbc.co.uk/news/business-35666988
BECD. (2023). Built environment carbon database. BECD.co.uk. https://beta.becd.co.uk
Beckett, A. (2015). The right to buy: The housing crisis that Thatcher built. The Guardian. https://www.theguardian.com/society/2015/aug/26/right-to-buy-margaret-thatcher-david-cameron-housing-crisis
Bevan Brittan. (2019). Highways PFI: The end of the road for Amey and Birmingham City Council. Bevan Brittan. https://www.bevanbrittan.com/insights/articles/2019/highways-pfi-the-end-of-the-road-for-amey-and-birmingham-city-council/
Bloomberg NEF. (2022). Bloomberg NEF. https://about.bnef.com/new-energy-outlook/
British Medical Journal. (2013). "First do no harm" revisited. BMJ.com. https://www.bmj.com/content/347/bmj.f6426
BSI. (2018a). BSI launches new Kitemark for BIM Level 2. BSIgroup.com. https://www.bsigroup.com/en-GB/about-bsi/media-centre/press-releases/2018/october/bsi-launches-new-kitemark-for-bim-level-2/
BSI. (2018b). PAS 1192-6:2018 specification for collaborative sharing and use of structured Health and Safety information using BIM. BSIgroup.com. https://knowledge.bsigroup.com/products/specification-for-collaborative-sharing-and-use-of-structured-health-and-safety-information-using-bim?version=standard
Bubola, E. (2022). New York Times. https://www.nytimes.com/2022/06/14/world/europe/grenfell-fire-london-explainer.html
Building Information Modelling (BIM) Working Party. (2011). Strategy paper for the government construction client group. Constructing Excellence.
Centre for Digital Built Britain. (2018). Gemini principles. cdbb.cam.ac.uk. https://www.cdbb.cam.ac.uk/DFTG/GeminiPrinciples
Centre for Digital Built Britain. (2021). New energy and carbon reporting framework. cdbb.cam.ac.uk. https://www.cdbb.cam.ac.uk/news/new-energy-and-carbon-reporting-framework
Chapman, A., & Masie, D. (2023). Are carbon offsets all they're cracked up to be? Vox. https://www.vox.com/23817575/carbon-offsets-credits-financialization-ecologi-solutions-scam
Coase, R. (1974). The lighthouse in economics. The Journal of Law & Economics, 17(2), 357–376.
Construction Data Trust. (2023). Construction Data Trust. datatrust.construction. http://datatrust.construction
Construction Innovation Hub. (2022). Value toolkit overview. Construction Innovation Hub. https://constructioninnovationhub.org.uk/media/c0wftr01/20220927_hub_valuetoolkit_overview_interactive.pdf
Coyle, D., et al. (2020). The value of data. Bennett Institute for Public Policy. https://www.bennettinstitute.cam.ac.uk/wp-content/uploads/2020/12/Value_of_data_Policy_Implications_Report_26_Feb_ok4noWn.pdf
Criado-Perez, C. (2019). Invisible women. Chatto & Windus.
Da Silva, J. (2012). Shifting agendas—Response to resilience. Arup.com. https://www.arup.com/perspectives/publications/speeches-and-lectures/section/shifting-agendas-response-to-resilience
DAMA International. (2017). DAMA-DMBOK: Data management body of knowledge (2nd ed.). Technics Publications.
Department for Business, Energy & Industrial Strategy. (2020). The ten point plan for a green industrial revolution. gov.uk. https://www.gov.uk/government/publications/the-ten-point-plan-for-a-green-industrial-revolution/
Department for Business, Energy & Industrial Strategy. (2021). UK enshrines new target in law to slash emissions by 78% by 2035. gov.uk. https://www.gov.uk/government/news/uk-enshrines-new-target-in-law-to-slash-emissions-by-78-by-2035
Department for Levelling Up, Housing and Communities. (2022). Collaborative procurement guidance for design and construction to support building safety. gov.uk. https://www.gov.uk/government/publications/collaborative-procurement-guidance-for-design-and-construction-to-support-building-safety
Dodd, P., Nicholson, I., & Jennings, T. (2023). A national metrics library for the built environment: Strategy for implementation. Scottish Futures Trust. https://www.scottishfuturestrust.org.uk/publications/documents/a-national-metrics-library-for-the-built-environment-strategy-for-implementation
EDF Group. (2022). EDF Group 2021. EDFenergy.com. https://www.edfenergy.com/media-centre/news-releases/edf-group-2021-financial-results-highlights-edf-energy-uk
EPSRC-NSF. (2022). The role of funding, financing and emerging technologies in delivering and managing infrastructure for the 21st century. SmartInfrastructure.eng.cam.ac.uk. https://www-smartinfrastructure.eng.cam.ac.uk/files/epsrc-nsf_infrastructure_workshop_report_-_final_digital-uk__0.pdf
Garsous, E. (2012). The impact of infrastructure on growth in developing countries. IFC Economics Notes.
Geospatial Commission. (2023). First UK areas access new digital map of underground pipes and cables. gov.uk. https://www.gov.uk/government/news/first-uk-areas-access-new-digital-map-of-underground-pipes-and-cables
Global Alliance for Buildings and Construction. (2022). 2022 global status report for buildings and construction. GlobalABC.org. https://globalabc.org/news/globalabc-releases-2022-global-status-report-buildings-and-construction
Green, E. (2022). Specifying for green star buildings. ACA.org.au. https://aca.org.au/specifying-for-green-star-buildings/
Grimalda, G. (2023). Many people ask me why I don't carbon offset my travels. Here's why. LinkedIn. https://www.linkedin.com/posts/gianluca-grimalda-98636a28_action-needed-to-make-carbon-offsets-from-activity-7122449115933810688-V341
Gurdur Broo, D., Lamb, K., Ehwi, R., Parn, E., Koronaki, A., Makri, C., & Zomer, T. (2020). Four futures, one choice—options for the digital built Britain of 2040. University of Cambridge Repository.
Guterres, A. (2022). Secretary-General's remarks at launch of report of high-level expert group on net-zero commitments. UN.org. https://www.un.org/sg/en/content/sg/speeches/2022-11-08/secretary-generals-remarks-launch-of-report-of-high-level-expert-group-net-zero-commitments
Hayes, S., et al. (2017). Data for the public good. National Infrastructure Commission. https://nic.org.uk/app/uploads/Data-for-the-Public-Good-NIC-Report.pdf
Hillier, B. (2007). Space is the machine. University College London/Press Syndicate of the University of Cambridge.
Hillier, B. (2023). UCL Space Syntax. https://www.spacesyntax.online/overview-2/
IEA. (2018). Cement technology roadmap plots path to cutting CO2 emissions 24% by 2050. IEA.org. https://www.iea.org/news/cement-technology-roadmap-plots-path-to-cutting-co2-emissions-24-by-2050
IEA. (2023a). Buildings. IEA.org. https://www.iea.org/topics/buildings
IEA. (2023b). Data centres and data transmission networks. IEA.org. https://www.iea.org/energy-system/buildings/data-centres-and-data-transmission-networks
IPCC. (2022). IPCC.ch. https://www.ipcc.ch/report/ar6/wg2/resources/spm-headline-statements/
Kemp, D. A. (2019). Engaging our conscience to explore the future. Infrastructure Intelligence. http://www.infrastructure-intelligence.com/article/jun-2019/engaging-our-conscience-explore-future
Kluger, J. (2021). Time. https://time.com/5943530/covid-19-co2-emissions-climate-change/
KPMG. (2015). Global construction survey 2015. KPMG.com. https://assets.kpmg.com/content/dam/kpmg/pdf/2015/04/2015-global-construction-survey.pdf
Laney, D. B. (2021). Data valuation paves the road to the future for Highways England. Forbes. https://www.forbes.com/sites/douglaslaney/2021/02/01/data-valuation-paves-the-road-to-the-future-for-highways-england/?sh=7f955737612c
McKinsey & Company. (2021). Call for action: Seizing the decarbonization opportunity in construction. McKinsey.com. https://www.mckinsey.com/industries/engineering-construction-and-building-materials/our-insights/call-for-action-seizing-the-decarbonization-opportunity-in-construction
McKinsey & Company. (2022). Accelerating green growth in the built environment. McKinsey.com. https://www.mckinsey.com/capabilities/operations/our-insights/accelerating-green-growth-in-the-built-environment
McLeod, K. S. (2000). Our sense of Snow: The myth of John Snow in medical geography. Social Science & Medicine, 50(7–8), 923–935.
Meadows, D. (2008). Thinking in systems: A primer. Chelsea Green Publishing.
Monbiot, G. (2019). The problem is capitalism. Monbiot.com. https://www.monbiot.com/2019/04/30/the-problem-is-capitalism/
Moore, C. (2021). Highways England to roll out AI app trialled on A14 scheme to identify 'high risk' days. New Civil Engineer. https://www.newcivilengineer.com/latest/highways-england-to-roll-out-ai-app-trialled-on-a14-scheme-to-identify-high-risk-days-12-05-2021/
National Grid. (2023). Completion of sale of majority interest in NGGT and metering. National Grid. https://www.nationalgrid.com/completion-sale-nggt-and-metering
Open Data Institute. (2018). What is a data trust. ODI. https://theodi.org/insights/explainers/what-is-a-data-trust/
Piketty, T. (2015). The economics of inequality. Belknap Press.
Rees, R. (2023). Why are you still participating in the race to the bottom? Building.co.uk. https://www.building.co.uk/building-the-future-commission/why-are-you-still-participating-in-the-race-to-the-bottom/5123577.article
3D Repo. (2021). Digitising health and safety. 3Drepo.com. https://3drepo.com/resources/white-papers/digitising-health-and-safety/
Statista. (2022). Operating profit/loss of Amey UK plc from 2012 to 2018. Statista. https://www.statista.com/statistics/543719/amey-operating-profit-development/
Statista. (2023a). Carbon dioxide emissions from the construction industry in the United Kingdom (UK) from 1990 to 2021. Statista. https://www.statista.com/statistics/486106/co2-emission-from-the-construction-industry-uk/
Statista. (2023b). Total managed expenditure as a share of gross domestic product in the United Kingdom from 1981/82 to 2022/23. Statista. https://www.statista.com/statistics/298478/public-sector-expenditure-as-share-of-gdp-united-kingdom-uk/
Steer, S. (2022). Moving to action: Digitalising our net zero energy future. Energy Systems Catapult. https://esc-prod-admin.383apps.com/wp-content/uploads/2023/05/EDiT-Annex-Digitalisation-Governance-v1.0.pdf
Stiglitz, J. E. (1985). Information and economic analysis: A perspective. Economic Journal, 21–41.
The Economist. (2020). Rethinking how we value data. economist.com. https://www.economist.com/finance-and-economics/2020/02/27/rethinking-how-we-value-data
UNOPS. (2021). Infrastructure for climate action. UNOPS.
Wadham, C. (2019). Amey fined £48m for Birmingham bollard dispute. Construction News. https://www.constructionnews.co.uk/news/knowledge-news/amey-fined-48m-for-birmingham-bollard-dispute-05-04-2019/
Wikipedia. (2023a). Hatfield rail crash. https://en.wikipedia.org/wiki/Hatfield_rail_crash#Cause
Wikipedia. (2023b). SECI model of knowledge dimensions. Wikipedia.org. https://en.wikipedia.org/wiki/SECI_model_of_knowledge_dimensions
Zero Construct. (2023). ZERO playbook. https://app.morta.io/project/48121677-4985-4b8d-aad2-af362ff89e10/process/73187efc-c4e9-4981-9cdf-d785da97fa07
Zero Next. (2022). Zero next manifesto. zeroconstruct.com. https://zeroconstruct.com/zero-next/

Chapter 3

Background Reading

Abstract The intersection of data and the built environment covers two broad and evolving fields, and it can be hard to know where to begin with the established body of knowledge in these areas. In this chapter, we seek to provide an overview of where others have written prescriptively on topics that sit at the intersection of data and the built environment. We want to provide a primer or refresher to readers who may then wish to dig deeper themselves. We have split the chapter into three parts, beginning with standards (both British and international), then moving on to publications (from UK government bodies), and finally touching briefly upon legislation (again, largely from the UK). We close by reflecting on how best to interpret and make use of this body of knowledge when working as a data professional within a built environment organisation.

3.1 Relevant Standards

As data has increased in relevance for the built environment sector, it has crept into the standards to which the sector holds itself. Standards have proven a useful means for sector practitioners to codify their aspirations. In many cases, particularly when it comes to Building Information Modelling (BIM), early adopters have systemised the use of these standards. The efforts of the UK BIM Task Group and, latterly, the Centre for Digital Built Britain have made BIM a household name in the top tier of construction firms around the globe. The bulk of the standards that we will cover are specific to the built environment sector (with some applicability elsewhere, such as manufacturing), but a few, particularly ISO 9001, are applicable regardless of sector (Fig. 3.1).

Fig. 3.1 The principal papers and standards informing this chapter

Whilst there are many standards that should inform our work as data professionals, the four standards that most closely inform the application of data to the life cycle of a built environment asset are ISO 9001 (Quality Management, and its built environment offspring BS 99001), ISO 19650 (BIM), BS 8536 (Operability), and ISO 55000 (Asset Management). As Fig. 3.1 suggests, an organisation that aligns to each of these standards will have a clear and effective means of managing its information life cycle: it will understand what information it requires to build and maintain its asset base, and it will have a process for 'landing' information from design and construction into operation and maintenance. In addition, it will have a range of engineering and asset management capabilities that make effective use of the organisation's data. There are inevitably a wide range of additional standards that have relevance to working with data and the built environment, and this chapter is only intended as a brief introduction to a broader world of standards and other publications.

Before we go further, we want to make one point clear: data and information are not the same thing (we cover this in more detail elsewhere), yet many books, guides, and standards use the two terms interchangeably. We also want to highlight the difference between information management and data management. Information management, in our context, is how documentation traverses contractual boundaries; for example, a designer handing a builder a drawing ready for construction. Data management is more concerned with quality control, ownership, and much smaller 'intra-contractual' transactions.


3.1.1 Quality Management (ISO 9001 and BS 99001)

ISO 9000 (Fundamentals and Vocabulary), 9001 (Requirements), and 9002 (Guidelines) have applicability far beyond the built environment sector. Together they set out a sector-agnostic (though private-sector-orientated) Quality Management System for delivering processes to a consistent level of quality. BS 99001, on the other hand, gives specific guidance for built environment organisations looking to apply the principles of ISO 9001 to their work. Together these standards codify terminology that will be familiar to most people who have worked in a business setting, such as:

• The 'plan-do-check-act' cycle.
• 'Risk-based' thinking.
• 'Evidence-based decision making'.
• 'Relationship management'.
• The importance of staff competency, awareness, engagement, and communication.
• The need for organisations to have a 'customer focus'.
• The joy of audits (yay audits!).
• And the ongoing effort to improve your organisation's products, services, and/or processes via the mantra of continuous improvement.

ISO 9001 also provides us with a useful taxonomy and set of definitions for data terms, which can help to resolve ambiguity between data types, systems, and processes. We noted earlier in this book how ISO 9001 contains useful definitions of 'data' and 'information'. A few other relevant definitions for our purposes include:

• Objective evidence: "data supporting the existence or verity of something."
• Document: "information and the medium on which it is contained."
• Capability: "ability of an object to realise an output."
• Documented information: "information required to be controlled and maintained by an organisation and the medium on which it is contained."

When ISO 9001 describes an Information Management System, it does not mean software, but rather a formally defined collection of processes. Whilst the language applies to most businesses, there is an inherent focus on meeting customer requirements. ISO 9001 defines quality as follows:

[the] degree to which a set of inherent characteristics of an object fulfils requirements.

ISO 9001 then elaborates:

The quality of an organisation's products and services is determined by the ability to satisfy customers and the intended and unintended impact on relevant interested parties. The quality of products and services includes not only their intended function and performance, but also their perceived value and benefit to the customer.

Built environment projects have customers, of course, but their requirements (and in turn 'quality') are often more difficult to define than those of consumers of more standardised products. In many cases the ultimate customers are members of the public who may not even be aware of the project, or may not use the finished product (e.g. better infrastructure) until years after the project takes place. Think of this next time you go to a foreign country and enjoy their smooth roads, punctual rail connections, and extensive network of bicycle lanes.

BS 99001 reinterprets the question of customer satisfaction to reflect the project-based nature of built environment work, counselling us to ensure that "quality is prioritised when considering on-time and on-budget delivery and… action is taken if planned results are not, or will not be, achieved throughout the project life cycle." ISO 9001 also recognises the importance of 'Interested Parties' (stakeholders), which in the case of a built environment organisation will be far more numerous than the immediate customer.

Why is ISO 9001 relevant to data? Because data has a beautifully double-edged relationship to quality:

1. Data is one of the means whereby organisations measure their quality.
2. The quality of data determines its usefulness.

The previous chapter covered some of the performance metrics that function as the internal currency for built environment organisations (in ISO 9000 terms, the 'metrological function'). These metrics will in turn tie back to the definitions of quality that the organisation is using, which in turn should tie back to the organisation's vision and strategic objectives.

Everyone using data in an organisation is reliant upon its quality, or in other words whether it is 'fit for purpose'. It is only when organisations use data that its quality becomes relevant. Similarly, it is often only through use that staff identify issues with data quality. This means that the relationship between data and quality can be positively or negatively reinforcing:

• In the positive cycle, a committed user base for data helps the organisation to identify and correct data quality issues, leading to even greater use of the improved data.
• In the negative cycle, the opposite happens. Poor data quality undermines the organisation's trust and confidence in its data, which in turn reduces the use of that data and the justification for its improvement.

One of our goals in using data should be to incorporate quality management principles into our use of data, to try to ensure that we experience positive rather than negative reinforcement.

ISO 9001's focus on documented information is particularly pertinent to the built environment, as it sets out the underpinnings of the document control and records management professions that are prevalent across the sector's organisations. ISO 9001 sets out requirements for how organisations should collect and version control documented information. This in turn influences the type of metadata requirements detailed by standards such as ISO 19650. Relatively straightforward-sounding requirements in ISO 9001, such as the need to maintain metadata on documented information ensuring that it is "available and suitable for use" with appropriate "control of changes (e.g. version control)", can consume substantial amounts of resource when applied to a built environment organisation that is generating and maintaining tens or hundreds of thousands of documents across the course of a project life cycle. There are of course standards available in this area (notably EN 82045-2 Document Management: metadata elements and information reference model), but often built environment organisations homebrew an approach built upon the default settings of whatever document management software they happen to be using.

This is not to say that the requirements set out under ISO 9001 are wrong or unduly onerous, far from it. But as a data professional who may at some point have to comply with ISO 9001, or who may ask others in your organisation or supply chain to comply with such requirements, you should be aware of their impact in terms of cost and time. Increasingly, software is available that automates all or part of the ISO 9001 requirements (for example, by automatically version-controlling documents). But there remains a disconnect between the metadata standards that an organisation sets out (particularly when seeking to align with ISO 9001, ISO 19650, and EN 82045) and what off-the-shelf tooling such as SharePoint can cater to as standard. Bridging this disconnect takes substantial manual effort on the part of staff or necessitates specialist tooling.

Despite these concerns, ISO 9001 is a force for good. It sets out an important baseline level of capability for an organisation, and a meaningful common vocabulary. Its influence on several of the other, more built-environment-specific standards described below is apparent, and it is unlikely that an organisation can make effective use of data without at least some alignment to the requirements of ISO 9001.
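As a minimal sketch of what 'metadata on documented information' might look like in practice, the Python below models a version-controlled document record of the kind a document management system maintains. The field names are illustrative assumptions loosely inspired by ISO 19650-style metadata, not a reproduction of any standard's schema.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class DocumentRecord:
    """One controlled document, with the metadata needed to keep it
    'available and suitable for use'. Field names are illustrative."""
    document_id: str   # e.g. a project-wide unique reference
    title: str
    originator: str    # organisation that produced the document
    status: str        # e.g. "work in progress", "shared", "published"
    revision: str      # e.g. "P01", "P02", "C01"
    issue_date: date
    history: list = field(default_factory=list)  # prior (revision, status, date) triples

    def revise(self, new_revision: str, new_status: str, new_date: date) -> None:
        """Record a controlled change, keeping the audit trail required
        for 'control of changes (e.g. version control)'."""
        self.history.append((self.revision, self.status, self.issue_date))
        self.revision, self.status, self.issue_date = new_revision, new_status, new_date

# Usage: a drawing moves from 'shared' to 'published' at a new revision.
drawing = DocumentRecord("PRJ-XX-DR-A-0001", "Ground floor plan",
                         "Example Architects Ltd", "shared", "P01", date(2024, 1, 15))
drawing.revise("C01", "published", date(2024, 3, 1))
```

Multiply a record like this by tens of thousands of documents and the resource implications of the standard's requirements become easy to appreciate.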

3.1.2 Design and Construction (BS 1192, ISO 19650)

Data has always played a significant role in the construction of the built environment, even if the term was not common parlance until recently. Megaprojects of the past, from the pyramids of Ancient Egypt onwards, would have had a sizable workforce concerned with planning, surveying, and managing the logistics of moving materials, preparing sites, tracking progress, and ensuring quality. The Ancient Pyramids sit on a perfectly level foundation that is almost perfectly oriented to the points of the compass, which is frankly astounding and evidence of clinical preparation.¹ Whilst the digital environments of modern construction programmes would have been unimaginable to our ancient peers, our data entities (time, cost, quality) would not be unfamiliar. As with the stone tablets and papyrus of the past, the role of data in construction is to maintain a golden thread between decision-making and reality on the ground: to create a shared spatial–temporal awareness of the works that is greater than the capacity of any one worker.

¹ Extended footnote, because it is fascinating: The Pyramid of Giza dates to around 2600 BC. Imagining these enormous works recalls thousands of labourers dragging limestone and granite blocks weighing as much as six tons up ramps to form the structure. But there was also an enormous number of people involved in the administration of the project. The 'Diary of Merer' papyrus contains meticulous notes of the movement and work of a site supervisor on the operation that supplied the pyramids with their stones, and brings to light only a tiny part of the admin associated with constructing this enormous structure. In 1999 Civil Engineering published an inspired article by Craig Smith and others entitled 'Program Management B.C.' The authors of this article take their years of experience working on modern megaprojects and use it to reverse engineer the ancient works. By surmising a work breakdown structure (WBS) for the construction of the Great Pyramid, they then "developed a logic diagram to illustrate the construction sequence that the team found most plausible," used estimators to "produce labour estimates for each of the tasks identified," and finally arrived at "a critical path construction schedule (Civil Engineering, 1999)." One does not build a 147 m high pyramid composed of over two million blocks, with ornate tunnels and chambers inside, merely by choosing a spot and piling rocks on top of each other. The Pyramid needed a perfectly even base, precise orientation, and a seamless external facade. There would have been substantial logistical and safety concerns associated with moving labourers and material at height on temporary works without creating bottlenecks. Moreover, the sequencing of the works, including the quarrying and transportation of stone on a tidal river, as well as the timing and positioning of ancillary works, would have required substantial planning prior to execution. As the article concludes, "We must… stand in awe of their program management techniques, as it is equally certain that they had highly developed administrative and planning skills. The complexity and logistical requirements of this project are simply extraordinary."

The major professions associated with construction (designer, architect, project manager, engineer, surveyor) all trace their origins back to ancient times. That said, they were formalised into the professions that we know today much more recently: the ICE, RIBA, RICS, and APM came into existence in 1818, 1837, 1868, and 1972 respectively, with the APM successfully avoiding royal chartership until 2017. The intensity of data use across these professions has only increased, particularly over the last 50 years. A broad suite of 'Architecture, Engineering, and Construction' (AEC) tools from vendors such as Autodesk and Bentley has developed massively since the 1970s, as early CAD tools became sophisticated construction platforms with links into other platforms such as Enterprise Resource Planning (ERP). More recently, 'Contech' start-ups have multiplied, bringing a range of AI, Internet of Things (IoT), and robotic tooling to complement the entrenched AEC platforms. All this means that construction generates far more data than it ever has before, and (whilst there are always exceptions) today's construction professionals are in general far more comfortable using that data in their work than their predecessors were. This is in part due to the exhortations of the various standards and publications cited in this chapter, but also because the tools required to manipulate and use data (design models, dashboards, even the common spreadsheet) are both more powerful and easier to use than they were in the preceding decades.

ISO 19650 ("Information management using building information modelling") is a collection of standards that define what Building Information Modelling (BIM) means to the sector in the present day. As a brand, 'BIM' has really stuck. Yes, it is an application of technology, but it is broader than that: it is a way of working and has become an entire professional field unto itself. There are many thousands of people working around the world who define themselves as BIM professionals and who subscribe to the requirements of ISO 19650 or its predecessor BS 1192. BIM was rare both in realising this level of adoption and dedication, and in being an application of technology specific to the built environment. A few other applications come close: Enterprise Asset Management Systems (EAMS), Geographic Information Systems (GIS), Internet of Things (IoT), or more recently Digital Twins. However, all these technologies have applications outside the built environment. For two decades now, the application of technology (and therefore data) in the built environment has been synonymous with BIM.

Before we get stuck into ISO 19650, we should acknowledge its forebear, BS 1192. Versions of BS 1192 stretch back to the 1940s as a standard for architectural drawings. In the 1980s, Computer-Aided Design (CAD) products such as AutoCAD (first released in 1982), AllPlan (1984), and MicroStation (1986) began to digitise the production of architectural drawings. BS 1192 changed its title to information management for computer-aided design (CAD) for its 1990 edition, reflecting the emergence of these CAD products, and that edition laid the foundations for what would become BIM. BS 1192-5, published in 1998, introduced the idea of 'sub-models' and the need to relate these models to each other. By its 2007 edition, BS 1192 was subtitled 'Collaborative production of architectural, engineering and construction information,' making it the first true BIM standard, as it built upon its predecessors by including the common data environment.

Let's explore a real example of how BS 1192 was interpreted and deployed in a large organisation. A common misconception of standards is that they tell you precisely what to do. They do not. They merely provide a framework of requirements, and never stray into prescriptive instruction on how to meet them. Figure 3.2 is an example of a 'Digital Pathway'. It acknowledges the transitional nature of the BIM standards: the blue triangle represents the buyer/client/client-agent side activity, and you can see how the pathway flows between the top and bottom tiers (the activity of the supplier/supply chain in green). The detail inside the boxes is not important here. The point is that these standards always need transposing to a specific context. This digital pathway was a tool to help different business units and external clients with their specific BIM implementations. A nugget of advice here: always consider the workflow of any process and write it down. Workflow workshops are tricky to facilitate, but they are the shortest route to gaining buy-in and understanding from key stakeholders.

The Digital Pathway figure (Fig. 3.2) acts like the original design of the periodic table of elements: the first few iterations of the periodic table had spaces left open for elements we knew would eventually fill the gaps. The following diagram (Fig. 3.3) is a post-workshop version of the pathway with the business-specific steps included. The important point here is that business management system level process mapping, done well, requires levels of abstraction. Fig. 3.3 would be too detailed to present to senior stakeholders, so an abstraction is required to enable the process to be used as a storytelling device to communicate what you need to stakeholders (Fig. 3.4).



Fig. 3.2 Atkins digital pathway circa 2017

Fig. 3.3 The Digital Pathway–post-workshop output

These examples are a real illustration of previous iterations of business management system logic. They formed the basis of a business case that enabled the organisation to invest in a programme to deliver one of the first British Standards Institute Kitemarks for Building Information Management (BSI, 2018a, 2018b). Historically, the emergence of the BSI Kitemark for BIM enabled organisations to develop consistent BIM delivery across the whole organisation. For this organisation, it reached its ultimate mission objective of obtaining a Kitemark for the global operation of its BIM-based business management system (AtkinsRealis, 2022). This is not an intentional plug; the point here is that these types of business management systems are grown and cultivated into a global system.



Fig. 3.4 The Digital Pathway–Business Logic Abstraction

Your author has never seen a top-down business management system be as successful as a home-grown one.

As the international successor to BS 1192, ISO 19650 provides an intellectual framework for understanding the requirements and division of responsibilities associated with managing information before, during, and after trigger events (which include construction projects, acquisitions, and maintenance activities). It ties nicely back to ISO 9001 (and, of course, BS 99001), reading at times as a built-environment-specific elaboration of 9001's sector-agnostic principles. ISO 19650 is explicit that organisations should manage information in a manner that aligns to their own operational objectives. The standard centres around a formalised set of information requirements at the level of the organisation (OIR), asset (AIR), project (PIR), and exchange between parties (EIR), as well as information models for the asset (AIM) and project (PIM). The presumption is that if a project completes and maintains these artefacts then the project team will have sufficient information to deliver the works in a collaborative, safe, and efficient manner. The artefacts will also provide sufficient information to answer the questions that the organisation may have of the project (e.g., what new assets am I receiving and how do I operate them?).

ISO 19650 distinguishes between project information (used during the delivery of projects) and asset information (used during the operation of assets). It reminds us that when we run projects we have a duty to create and maintain project information such that, following handover, the operator can make use of it in maintaining the asset. This is a theme that we will also see covered by BS 8536 (Soft Landings). Data appears as a concept in ISO 19650, but usually in a subsidiary role to information (e.g. organisations record their project information using various structured and unstructured data types). The standard acknowledges the role of a Common Data Environment (CDE) to store data, but does not go into detail around how organisations should design or deploy their CDE.



It does, however, advocate for a 'container'-based approach to data: one which segregates information according to its originating organisation, type, sensitivity, and/or subject (e.g. asset hierarchy). This approach allows the CDE to manage many atomic parts of information through their life cycle (changes of status), with clear obligations in terms of metadata and ownership, and it has the added advantage of splitting the volume of information associated with models into manageable chunks. Where ISO 19650 Part 1 sets out roles and responsibilities, Part 2 focuses on the process required to create and maintain useful information models, and Part 3 covers how to deliver that information usefully into the operational stage of an asset.

ISO 19650 is a technology-agnostic standard; it makes no attempt to specify the infrastructure that organisations should use in creating their CDE or associated artefacts. Indeed, as a non-specialist reading the standard, it is often difficult to visualise what form many of the artefacts should take: are these spreadsheets, or documents, or databases? What this means in practice is that the IT implementation required to become an ISO 19650-compliant organisation is open to interpretation. Organisations (or their consultants) must translate carefully between BIM requirements and IT system requirements. Thankfully, there are organisations such as Nima, the UK BIM Framework (UK BIM Alliance, 2023), and buildingSMART (buildingSMART International, 2023) which maintain guidance for organisations looking to adopt BIM, resources such as pre-populated templates, and data standards such as COBie and IFC.

On the whole, ISO 19650 is an important and relevant standard, setting solid aspirations for how built environment organisations should frame their relationship with information. For some, ISO 19650 may read as overly bureaucratic and focused on contractual control. However, it is often the specification of data sharing as part of contracts that provides the earliest, cheapest, and least acrimonious route to appropriate data collection, management, and sharing between parties. At times ISO 19650 labours under the pretence that it is indeed possible to articulate all of a programme's information requirements at any given point in time, and then to persist that unity of requirements across organisational silos and boundaries. Not all client organisations are willing to invest up-front in the data specification required to meet ISO 19650's objectives. Moreover, overly static specifications may prevent a project from iterating its requirements as they emerge through the project life cycle.

That said, the goals of ISO 19650 are admirable, and the standard functions as a useful point of advocacy for the benefits of defining information requirements (and the processes for managing information) early in the life cycle of a project. Anyone who has worked with big data on megaprojects knows that this is an almost impossible goal, both in terms of defining the data needs and maintaining consensus. Consequently, as data professionals we might bring an agile perspective to the admirable goals of ISO 19650. One should aim to create the relevant artefacts, but in a 'wiki'-inspired manner that allows organisations to quickly and collaboratively update their requirements without requiring a substantial bureaucratic function to administer those changes. This may mean that you fall short of ISO 19650 'compliance' in some areas, by virtue of making more pragmatic decisions about how you allocate your limited resources. ISO 19650 is a shining city on a hill towards which we all gracelessly slouch. It succeeds less by mandating 'compliance' to the standard, and more by setting out a paradigmatic view of how information should enable construction. Even so, there is always a risk that as a sector we design our processes and software to adhere to the letter of the standard rather than meeting the needs of our users.
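To illustrate the 'container'-based approach in miniature, the sketch below models an information container moving through CDE life-cycle states of the kind commonly associated with ISO 19650 guidance (work in progress, shared, published, archived). The class names, permitted transitions, and identifiers are illustrative assumptions, not a reproduction of the standard.

```python
from enum import Enum

class ContainerState(Enum):
    """CDE life-cycle states commonly associated with ISO 19650 guidance."""
    WORK_IN_PROGRESS = "work in progress"
    SHARED = "shared"
    PUBLISHED = "published"
    ARCHIVED = "archived"

# Permitted transitions: containers move forward through review gates, and
# anything can be archived. A deliberate simplification of real CDE workflows.
ALLOWED = {
    ContainerState.WORK_IN_PROGRESS: {ContainerState.SHARED, ContainerState.ARCHIVED},
    ContainerState.SHARED: {ContainerState.PUBLISHED, ContainerState.WORK_IN_PROGRESS,
                            ContainerState.ARCHIVED},
    ContainerState.PUBLISHED: {ContainerState.ARCHIVED},
    ContainerState.ARCHIVED: set(),
}

class InformationContainer:
    """An atomic unit of information with originator, classification, and state."""
    def __init__(self, container_id: str, originator: str, classification: str):
        self.container_id = container_id
        self.originator = originator
        self.classification = classification  # e.g. sensitivity or subject
        self.state = ContainerState.WORK_IN_PROGRESS

    def transition(self, new_state: ContainerState) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.container_id}: cannot move from "
                             f"{self.state.value} to {new_state.value}")
        self.state = new_state

# Usage: a model file passes its review gate and is shared with the wider team.
container = InformationContainer("PRJ-XX-M3-Z-0001", "Example Engineers Ltd",
                                 "structural model")
container.transition(ContainerState.SHARED)
```

The value of the state machine is less the code itself than the discipline it encodes: every change of status is explicit, auditable, and attributable to an originator.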

3.1.3 Operations, Maintenance, and Asset Management (BS 8536, PAS 55, ISO 55000)

Built assets only create value through use. Just as in the construction stage, effective operation and maintenance of built assets requires records of the 'spatial–temporal properties' of an asset (e.g., What is it? Where is it? What condition is it in? How is it performing?). Whilst the spatial properties of built assets should be consistent between construction and operation, there will be a whole new set of temporal properties that become relevant when an asset is in operation. These new temporal properties will reflect the usage of the asset (measures relating to the volume and duration of usage, and whether the asset is supporting people, water, electricity, etc.), as well as the condition of the asset (reflecting the tendency of assets to degrade over time). Operation of an asset also requires records of the decisions taken in response to the usage and condition of the asset (e.g. closures, maintenance, replacement, and renewal). As the IAM's An Anatomy of Asset Management phrases it, built environment clients need help when "there is no single, correct source of information about what assets exist, in what condition, providing what function, and where. Data becomes fragmented, out of date and/or not trusted."

All these new requirements mean that the use cases and associated data models² differ significantly between construction and operation, as in many cases do the tools used to manage data. This means that the exchange of data, particularly data within design models, between construction and operations is a notorious source of friction and missed opportunity across the sector, despite the obvious benefits of retaining rich as-built (construction) information throughout the life of an asset. As data professionals we should seek to ensure that the information that our organisations, projects, and suppliers generate during construction is subsequently available and relevant to the users who will be responsible for the future operation of the assets.

² By 'the data models' we mean a logical specification of data requirements. This could be a schema, a data dictionary, an ontology, or all of the above. It is data about data. Whilst organisational data models are more common than they used to be, the default in the industry is that the data models used by various parts of an organisation (or the asset life cycle) are implicit and undocumented. The lack of specification further complicates the movement of data between organisations or from construction into operation. Instead of reconciling two logical data models (to create an interface specification), you are instead reconciling "whatever data is required by organisation A's mess of systems, with whatever data is required by organisation B's mess of systems". By failing to create models that exist independent of whatever systems you happen to be using, you are inadvertently ceding control to software vendors and reducing your agility and resiliency as an organisation.

The divide between those that build and those that operate and maintain plagues the built environment sector. Some degree of specialisation, and consequent division, is natural within a sector that covers so many distinct types of building and infrastructure. Division between construction and operation roles is inevitable. It is driven both by differences in skillsets and by more hidden factors that exacerbate the divide, such as the availability and rules governing capital expenditure (CAPEX) for construction and operational expenditure (OPEX) for operations (and often maintenance). What isn't inevitable is the failure to effectively transfer data, information, and knowledge between these two functions. In many large client organisations, construction and operations are separate divisions, with mutually incompatible incentives, ways of working, data requirements, systems, and the rest. In other cases, organisations outsource the construction of new assets entirely to design and build contractors. This division in responsibility results both in a substantial loss of information across the asset life cycle and in duplication of data collection and management, and it costs the sector dearly in terms of rework.
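Picking up footnote 2's point about system-independent logical models, here is a minimal sketch of a shared model used to validate extracts from any source system. The field names and types are invented for illustration; a real organisation would derive them from its asset information requirements.

```python
# A minimal, system-independent logical model for an asset record.
# Field names and types are illustrative assumptions, not a sector standard.
ASSET_SCHEMA = {
    "asset_id": str,       # unique reference, stable across systems
    "asset_type": str,     # e.g. "pump", "bridge bearing"
    "location": str,       # spatial reference, however the organisation encodes it
    "install_date": str,   # ISO 8601 date string
    "condition_grade": int,
}

def conforms(record: dict) -> list:
    """Return a list of problems; an empty list means the record fits the model."""
    problems = []
    for field_name, expected_type in ASSET_SCHEMA.items():
        if field_name not in record:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(record[field_name], expected_type):
            problems.append(f"wrong type for {field_name}")
    return problems

# Usage: validate an extract from any source system against the shared model,
# instead of reconciling one system's 'mess' directly with another's.
extract = {"asset_id": "PMP-0042", "asset_type": "pump", "location": "Site A",
           "install_date": "2019-06-01", "condition_grade": 2}
print(conforms(extract))   # -> []
```

The design choice here is the point made in the footnote: because the model exists independently of any one system, both the construction and operations sides can test their data against it, rather than against each other's tooling.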

3.1.3.1 BS 8536: Design, Manufacture, and Construction for Operability (aka Soft Landings)

BS 8536 and the related 'Government Soft Landings' publication aim to reduce the inefficiencies arising from divisions in practice across the asset life cycle, particularly during the handover between construction and operation. By focusing on 'whole life' outcomes, the standard seeks to ensure that the design, manufacturing, construction, operation, asset management, and facilities management disciplines can function as a more effective whole. The focus of BS 8536 is far broader than data, extending to cover the entire business process of project handover. However, BS 8536 recognises the role of data and information in this goal, stating that it wishes to improve practice in terms of:

Embedding information management, using building information modelling, in the processes for the delivery and operation of assets… Emphasising the importance of information and data quality… In a manner that maximises the value that the asset owner can subsequently derive from the information and data over the planned life of the asset.

There is satisfying alignment between BS 8536 and ISO 19650 Part 3 ("operational phase of the assets") in both terminology and objective. BS 8536 states that a 'key principle' is that:

The transition from the delivery phase into the operational phase should include the staged and final transfer of project information and data for operational purposes from the Project Information Model (PIM) to the Asset Information Model (AIM).

Despite the formal language, this principle is important, even transformational, because it recommends that construction projects engage with operators and asset managers early in the life of the project to define what information needs to be handed over during and on completion of the project. This can be a challenge for megaprojects, where the timescales to deliver a new asset stretch into decades and there may therefore not be operations and maintenance specialists within the organisation early in the asset life cycle. Too often the information generated in construction is neither accessible nor meaningful to asset management or operations staff. This in turn results in costly (and potentially dangerous) loss of knowledge, inefficient maintenance, and resurveying. BS 8536 creates a clear accountability to avoid this situation, stating:

From the outset of the project, there should be an explicit working approach that requires agreement between the various parties on the work activities and their timing, with the associated information requirements and deliverables.

So, in other words, you ought to plan in advance what you are going to do with all the data that your project will inevitably generate, and make sure that you have spoken to the people who are going to receive it.

Even where BS 8536 is not explicitly covering data, it uses data-driven language. It touches upon the need for effective project controls, project risk management, and tracking of benefits and outcomes. All these functions both generate and use data over the course of the project life cycle. The standard states that, "as far as practicable, a quantitative approach should be taken to measuring performance and value or benefits." The move to identify and adopt key performance indicators (KPIs) or objectives and key results (OKRs) is now widespread in the built environment sector. Often, but not always, this is more a question of organisations wanting stakeholders and the public to perceive them as evidence-based in their decision-making. It is nonetheless an excellent opportunity to quickly show the benefits of better data management in earnest on projects.

BS 8536 cares about, and acknowledges the importance of, information management, in part because it includes the wisdom of people such as Brian Atkins and David Philp, who have worked at the sharp end of project delivery. It is worth quoting Section 4.9.1 in its entirety, because it sets out in categorical terms the objectives of information management for built environment projects:

Information management encompasses the definition and management of an information model from design, through construction, and into operation of the asset. The use of information modelling in general and the creation and management of a project and/or asset information model is seen in the context of the owner's information management system of which its asset management system forms a part. The owner should verify that there is sufficient information technology, human resources, and competence to support information management, including the use and maintenance of an asset information model. The owner should promote a culture that values information and information management and understands how efficient information management supports better project outcomes.

Additionally, BS 8536 touches on many of the same points concerning information management as ISO 19650 and uses consistent terminology, including:



• The need for definition of Organisational Information Requirements (OIR), Asset Information Requirements (AIR), Project Information Requirements (PIR), and Exchange Information Requirements (EIR).
• The corresponding development of an Asset Information Model (AIM) and a Project Information Model (PIM).
• The importance of a Common Data Environment (CDE).
• Information "security, accessibility, transferability, verification and accuracy of data and information, both physically and digitally."

Section 3.4 of the standard goes into detail on the 'work stages' of a project. In doing so it details the changing information needs and artefacts that a project is likely to hold across its life cycle. These needs may form a useful starting point or sense check as we work with the business to identify data use cases. The standard sets out a vision where the "management of information and data should continue through the operational phase" of the asset life cycle, creating a continuity of information where the "defects, maintenance, alterations" of an asset are updated against the "as-constructed information." This is an admirable vision, and one to which we should aspire. BS 8536, along with ISO 19650 Part 3, is a useful reference point exactly because it affirms the BIM philosophy across the full asset life cycle, creating a 'golden thread' between often disparate parts of the sector.
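To make the PIM-to-AIM transfer tangible, here is a minimal sketch. The record structure, the 'operational' flag, and the field names are all invented for illustration; real exchanges would be governed by the project's EIR and by data standards such as COBie.

```python
# Hypothetical Project Information Model (PIM) records at handover.
# 'operational' marks information the asset team asked for in the AIR.
pim_records = [
    {"ref": "AHU-01-OM-MANUAL", "title": "Air handling unit O&M manual", "operational": True},
    {"ref": "AHU-01-COMMISSION", "title": "Commissioning certificate", "operational": True},
    {"ref": "TMP-SCAFFOLD-CALC", "title": "Temporary works calculations", "operational": False},
]

def handover(pim: list) -> list:
    """Stage the operationally relevant subset of the PIM into the AIM,
    stamping provenance so operators know where each record came from."""
    aim = []
    for record in pim:
        if record["operational"]:
            aim.append({**record, "source": "PIM handover"})
    return aim

aim_records = handover(pim_records)
for r in aim_records:
    print(r["ref"], "->", r["title"])
```

The interesting decision is not the filtering itself but who defines the flag: under BS 8536, the operators and asset managers should have agreed, from the outset, what counts as operationally relevant.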

3.1.3.2 Information Management in Practice

The organisations that are best placed to adopt the 'Soft Landings' approach are those that have both the remit and the sophistication to manage assets across their life cycle. Organisations such as Network Rail, National Highways, and National Grid function as the client for the construction of new rail, road, and gas transmission assets respectively, but also have large in-house teams responsible for operating, maintaining, and even decommissioning those assets later in their life cycle. Another example of this 'infrastructure owner/operator' is the Environment Agency. Their report published via the Centre for the Protection of National Infrastructure (CPNI, predecessor to the National Protective Security Authority), How the Environment Agency developed its Information Management Platform, is an accessible and digestible summary of how the agency developed their BIM practice into an approach for the management of information across their assets' life cycle (GIIG, 2023). As their SME Graeme Tappenden says in the report, "We needed to shift from 'document BIM' to having data we could… use right across the asset life cycle."

Despite the use of the word 'platform' in the title, the report makes clear that the way the agency succeeded was through incremental development across a wide range of component capabilities, including data and information requirements, information delivery plans (IDPs), a common data environment, data management infrastructure, and processes to assure information quality. These are all topics covered later in this book, and it is gratifying to see information standards treated with equal importance to more glamorous technology. It doesn't matter how whizzy your data science happens to be: it will only add value to the extent that it makes use of information that is accurate and as broadly useful as possible, and if that information is in turn applied to decision-making across the life of an asset.

3.1.3.3 An Anatomy of Asset Management

For those looking for a plain-language introduction to the field of asset management, the Institute of Asset Management (IAM)'s Asset Management—An Anatomy provides an accessible "entry-point for people seeking to understand asset management" (IAM, 2015). Gone are the strictures of international standards, replaced by a business-focused explanation of the main principles of the profession and why they are valuable. Whilst the anatomy claims that these principles can apply to any and all assets, including "brands, licences, rights-of-way," it is clear from the content that the focus of application remains physical human-made things, as it should be!

Reading the IAM's Anatomy, it becomes clear that the asset management profession sees itself as providing a "systematic, multi-disciplined and auditable decision-making process" for the care of the built environment. As the anatomy states, "it is a response to a sector where fixing things as they break has become too risky, disruptive, and expensive." The anatomy encourages its readers to make design, construction, and maintenance decisions that will prove cost-effective over the life of an asset, not just save money in the short run. Whilst short-sighted asset managers have, at times, used this theory to justify stripping back maintenance in the name of optimisation, it nonetheless stands as an intellectual counterpoint to the privatisation of the built environment using contracts that financially incentivise neglect and asset stripping. Whilst the anatomy doesn't dwell on the application of information and data to asset management, it provides an easier primer than the formal standards and will help data professionals unfamiliar with this aspect of the built environment to better identify and understand the priorities of their maintenance colleagues and where data might be able to assist.

3.1.3.4 PAS 55 & ISO 55000: Asset Management

As with ISO 19650, the ISO 55000 series is in many ways an extension of the principles laid out in ISO 9001, but targeted to a specific audience and use case: in this case, the quality maintenance and operation of physical assets. Also like ISO 19650, ISO 55000 builds upon a previous UK standard, in this case PAS 55, 'Specification for the optimized (sic) management of physical assets'.³

³ Note the use of the word 'optimized' (American spelling) in the title of this standard. This is a word that appears frequently in the asset management context, particularly when people are trying to sell software. However, it is fair to say that most asset management clients are working towards a structured, consistent, and proactive approach to maintenance, not optimisation per se.


The British Standards Institute (BSI) first published PAS 55 in 2004, and updated it in 2008, around the same time that BS 1192 went full 'BIM'. Like BIM, the concept of Infrastructure Asset Management emerged in the 1980s and 1990s (the Institute of Asset Management dates to 1994), before spreading across the sector in the 2000s and 2010s. Post-war management theory and advanced manufacturing techniques, including those of aerospace, both influenced asset management thought leaders. Both BIM and Asset Management applied better information management and digital technology to existing fields in the built environment sector. Just as people had been designing buildings and infrastructure long before BIM and CAD came along, so people had been maintaining infrastructure long before Asset Management became a profession. However, Asset Management brought with it a range of new techniques and terminology, including Enterprise Asset Management Systems (EAMS), Reliability-Centred Maintenance (RCM), and Whole Life(cycle) Costing (WLC/WLCC).

Asset Management was always about more than just the application of technology. Rather, it reflected a mindset change in terms of the management and value of physical assets: the component parts of our built environment. As Dr Penny Burns, one of the pioneers of infrastructure asset management, writes, a big reason for the emergence of the field was that "the full costs of infrastructure were beginning to be identified as the post war boom in building infrastructure transitions to the management of that infrastructure (Burns, 2022)." In other words, it was starting to get expensive to keep the built environment in working order. This emphasis on the fiscal impact of physical assets brought a new rigour to collecting and using data on the costs and timing of maintenance. Penny describes how in 1984 she worked with her colleagues at South Australia's water authority to undertake a 'True Cost study' that established basic facts about their physical assets that were not reflected in their financial records, such as the size and composition of their asset portfolio, its rate of degradation, the ongoing cost of maintenance, and the replacement cost (value) of their asset base. This baseline level of data collection is now a core part of the infrastructure asset management doctrine, and much of the technology associated with the field exists to aid this collection and allow organisations to plan based on this data.

There is an interesting duality across the built environment sector where the philosophies of BIM and Asset Management have evolved in parallel and informed each other, but nonetheless have remained distinct schools of thought with distinct standards. This schism both reflects and perpetuates the silos in the sector where various parts of organisations (or indeed different organisations altogether) are responsible for the construction of assets and for the maintenance/operation of those assets. As we discussed earlier, the organisational divide between the construction and operation stages of the asset life cycle impacts the information used by each profession. Significant quantities of information generated during construction simply do not make it over to the maintenance and operation teams. Or, even worse, organisations pay to generate the same information all over again at additional cost. This is a recognised problem, and one that any project will face as it tries to hand assets (and associated data) back to a maintenance organisation.
ISO 19650 Part 3 begins to describe how you can take a ‘whole asset life cycle’ perspective to data.


However, the truth is that (as it stands) the information paradigms of BIM and Asset Management do not yet align. In broad strokes, data is at the heart of ISO 19650, whereas it is at best an afterthought for ISO 55001. BIM focuses on the large-scale coordination of sizeable quantities of unstructured information; Asset Management is more concerned with accurately capturing limited amounts of structured information to minimise unnecessary expenditure.

The asset management profession sometimes has a weird relationship with data. Attend an asset management conference or event and you will come across a lot of niche proprietary software that claims to do things like 'optimise investment portfolios' or 'deliver predictive maintenance.' Which, fair enough, who doesn't want an optimal investment portfolio and predictive maintenance? And yet there is often little reference to data itself, or to how crucial quality data is to deriving value from these sorts of tools. Instead, vendors present their proprietary solutions as silver bullets, as if the client merely needed to buy a licence and conjure insight from thin air. The data professional knows that this isn't how it works: one needs a solid foundation of quality data to stand a chance of anything optimised or predictive. The Asset Management profession can also at times be unduly doctrinal. Occasionally it can feel like the main concern of built environment clients should be 'deploying an ISO 55000-compliant asset management system' rather than explaining how such an investment of time and money will benefit real humans.

Gripes aside, the principles of asset management make a whole lot of sense and are genuinely important as we seek to squeeze value from the built environment. The profession is valuable in part because it brings rigour and pride to the less glamorous part of the asset life cycle: looking after things once built. At any given moment 99% of the built environment is in operation rather than under construction, so it is in the operation and maintenance of assets that we stand to realise the greatest benefits in terms of cutting cost and carbon and improving performance. Stating the obvious, society only benefits from the operational part of the built environment (and indeed the part under construction is often a nuisance). There are many assets which we may never have the opportunity to rebuild (seats of democracy, for example), but which we must nonetheless seek to retrofit to meet modern safety, performance, and environmental standards. Conversely, as BS 8536 reminds us, we must design and construct buildings in a manner that is mindful of the 'long tail' of operation. There are a multitude of decisions that we can make during construction (think passive ventilation or triple glazed windows) that are far cheaper and less disruptive than subsequent retrofitting and will have a profound effect on the asset once in operation.

Too often the 'upstream' projects forget the downstream needs of asset managers during the rush of the design and construction phases, particularly when it comes to data handover. As we generate data while creating, renewing, or even monitoring assets, we can't take for granted that this data will be immediately useful to our maintenance and operations colleagues following handover. Rather, one needs to proactively work to understand what decisions they will need to make in maintaining and operating the asset.
You will need to collaborate with these colleagues to identify what useful data and tooling you can hand over to them to inform asset management and prevent the need to collect the same data all over again in a different format.


Oftentimes this will require you to understand the subtle differences in specification and tooling used by asset management professionals, for example the importance of accurate asset registers and condition data, the low adoption of geometric models in decision-making, and conversely the broader use of spatial (GIS) representations. Just because a fully realised design model was useful to you when you were building the asset, doesn’t mean that it’s going to be useful (or accessible) to the team who need to maintain it.
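To make this translation concrete, the sketch below shows one way a handover pipeline might reshape a design-model export into the kind of flat asset register an EAMS expects. It is a minimal illustration only: the field names (guid, room_ref, condition_grade, and so on) are invented for this example and do not come from any particular standard or product.

```python
# Hypothetical sketch: reshaping a construction handover export into the
# flat asset register an operations team can actually use. Field names
# are illustrative, not taken from any specific standard or EAMS.
from dataclasses import dataclass

@dataclass
class AssetRecord:
    asset_tag: str                    # unique ID the maintainers will key on
    asset_type: str                   # classification, e.g. "Pump" or "AHU"
    location: str                     # spatial reference, not a geometric model
    manufacturer: str
    condition_grade: str = "Not yet surveyed"   # populated during operation
    maintenance_regime: str = "TBC"             # e.g. an RCM-derived interval

def to_asset_register(model_export: list[dict]) -> list[AssetRecord]:
    """Keep only the attributes the operations team said they need."""
    return [
        AssetRecord(
            asset_tag=row["guid"],
            asset_type=row["category"],
            location=row["room_ref"],
            manufacturer=row.get("manufacturer", "Unknown"),
        )
        for row in model_export
    ]

# One row as it might arrive from a design model export.
export = [{"guid": "P-0001", "category": "Pump", "room_ref": "B1-Plant-03",
           "manufacturer": "Acme"}]
print(to_asset_register(export)[0])
```

The point is not the code but the conversation it forces: every field in the register should exist because someone in operations asked for it.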

3.1.4 Heritage and Restoration

Sectoral standards such as ISO 55000 and ISO 19650 are based on the idea of an asset life cycle (e.g., build, operate, maintain, dispose). The life cycle of built assets varies enormously: from structures and components designed to be temporary (think scaffolding and portacabins), to those that have a predictable lifespan (think pipes and cables), to those that you expect to persist for hundreds of years subject to suitable maintenance (think 'quality' housing and public buildings). The concept of the asset life cycle makes sense for most assets, but it is harder to apply to heritage assets for a variety of obvious reasons:

• You may have inherited these assets from the past without meaningful construction information.
• These assets are often unique, and therefore may not fit comfortably within your existing taxonomies, classifications, or ways of working. It may be unduly expensive to create meaningful digital replicas of these assets.
• You may seek to operate and maintain these assets in a manner that limits their usage to prevent damage or undue change to their character. The purpose of heritage maintenance isn't necessarily to preserve functionality but rather to preserve character.
• You may intend to protect these assets in perpetuity, with no intention to dispose of them, and with a far more conservative approach to risk than with conventional assets. This can result in costs that would appear disproportionate when applied to non-heritage equivalents.

Consequently, the application of data to heritage assets requires extension of your data models, both in terms of the purpose of the objects in question and the types of information that you wish to capture against these assets. Heritage works also bring a degree of complexity to the logistics of construction that not all your existing tooling will be able to represent effectively. Many digital construction tools may be better suited to the construction of new assets on greenfield sites and will not be able to reflect the complexity of heritage sites without augmentation. These challenges are doubly the case for assets that are at once unique cultural artefacts and functional operational assets which should perform to a particular standard (for example, the Palace of Westminster).


Heritage professionals have an abiding commitment to preserving the assets under their stewardship, and this means that they already have a mature understanding of the properties of their assets that their organisations must record and manage. As data professionals, it is incumbent upon us to provide ways of working and tooling that will support our heritage colleagues in undertaking their valuable work.

3.2 Relevant Publications

The built environment has existed since the beginnings of agriculture, and it has slowly and incrementally developed to offer human beings longer, safer, happier, and healthier lives. The history of data as a concept is shorter, but it builds upon a long tradition of applied mathematics that, amongst many other benefits, has enabled us to better design our built environment through inventions such as the load bearing arch. Data as a digital record of information became possible in the mid-twentieth century. Whilst we now rarely think of our digital information as taking physical form (it's up in the cloud, right?), the act of storing and manipulating bits of information to enable computation is an inherently physical act. Early pioneers of computing had to manipulate the material properties of cathode ray tubes, mercury delay lines, and other esoteric materials all in the hopes of controlling the flow of electrons for long enough to make reliable calculations.⁴ Digital storage, memory, and CPUs remain physical assets, albeit physical assets so precise, reliable, and rapid that they allow us to forget about their workings entirely and focus on the digital worlds that they create. In this way the story of humankind is one of increasing density on two fronts:

1. A built environment that slowly at first—and now very rapidly—supports increasing urbanisation and human density.
2. An information environment that slowly at first—and now very rapidly—supports the increasing density of information.

The intersection of data and the built environment is only meaningful because of the intersection of these two trends. All of which serves to remind us that we are standing on the shoulders of giants. As data professionals working in the built environment, we are wise to take counsel—as this book seeks to do—from the intellectual history of both data and the built environment. Just because something has happened doesn't mean it makes sense. In recent years, in the face of both ever greater data capability and ever greater challenges to the built environment (e.g. climate change, inequality, loss of biodiversity, etc.), some remarkable thinking has emerged on the purpose of the built environment in the twenty-first century, and the role that data should play in helping it to realise that purpose. The following sections provide an incomplete (and UK-centric) overview of just a small part of that thinking (Fig. 3.5).

⁴ For an utterly fascinating description of these techniques, refer to Turing's Cathedral by George Dyson (Dyson, 2013).


Fig. 3.5 An expanded metamodel of the value delivered to society by data, technology, and physical assets

3.2.1 Transforming Infrastructure Performance (TIP)—Roadmap 2030 (IPA, 2021)

We have already covered this publication at length as we are massive fanboys, but it is worth once again recognising the role that TIP plays in framing the question of the built environment's contribution to societal outcomes. TIP is important not as a prescriptive policy document, but as a framing of the purpose of the sector. As the title suggests, it seeks to increase the productivity of the UK's infrastructure sector, and to ensure that investment in infrastructure aligns to societal outcomes. Whether or not one considers the built environment to be 'infrastructure' (and, for the record, this author does), it is a collection of physical assets, and it provides a societal outcome. It's hard to describe the TIP Roadmap without simply paraphrasing TIP, but the diagrams that the paper includes⁵ function as particularly effective summaries of the paper, sandwiching as they do the built environment between the natural environment and the UN sustainable development goals. The Built Environment exists within the Natural Environment. You convert (or induce) the Natural Environment into the Built Environment so that it can provide services. Those services in turn produce data which drives decisions. Policy informs those decisions, and in theory policy should seek to meet one or more societal outcomes.

TIP reminds us that the built environment functions as a system of systems. It highlights the role that data can play in helping us to understand and improve how that system of systems functions across organisational boundaries. It is a useful (and inspiring) framework for reflecting on the purpose of the built environment and the organisations that run it, as well as our precarious balance with the natural environment.

⁵ Produced by a small company called Visual Meaning, who are the company for visual roadmaps for infrastructure.

3.2.2 Government Soft Landings (CDBB, 2019)

An influential paper, the 'soft landings' brand is well known across the UK construction sector. The paper succeeds in part because it directly addresses a familiar problem, as the paper states: "the term 'soft landing' is typically used to reflect a smooth transition from construction into handover and close out and then into facility operation." A soft landing includes the "smooth transition of data and information contained in the project information model to that required for facility operation in the form of the asset information model." Not all construction projects have executed a soft landing into operation, and this remains a source of waste and inefficiency. The ability of a construction project to land asset information on handover from construction to operations is a measure of success for the data professionals working on it. One cannot expect to simply intuit the needs of these other parties or presume that a cookie cutter approach will meet their requirements. Landing information requires early and close engagement between parties across the asset life cycle and supply chain. Whilst soft landings may seem like a distant concern when a project is just mobilising, it is during these early days, when you are still at liberty to define your data standards and practices, that you can have the biggest impact on your long-term ability to hand over meaningful data. Government Soft Landings (and the associated British Standard BS 8536 that we covered earlier) is a useful reminder that we must consider data throughout the asset life cycle if we are to deliver working additions to the built environment in a timely, sustainable, and cost-effective manner.

3.2.3 Construction Playbook (Cabinet Office, 2020)

Complementary to the TIP Roadmap but focused on the construction stage of the asset life cycle, the Construction Playbook acts as a primer for public sector construction clients. It documents good practice in how these organisations should specify, procure, contract, and deliver construction projects. The authors of the Playbook want to help projects set themselves up to succeed, with a particular emphasis on how projects specify their needs, procure their suppliers, and arrive at workable contract arrangements.


Whilst the Playbook does not dedicate the same level of attention to data as TIP, it does reference where data and digital technology can help to support these outcomes. For instance, two of the five top enabling actions mentioned in the introduction are "greater sharing of better data" and "adoption of the UK BIM framework." Reading the Playbook is unlikely to teach you much from a data perspective, but it may help to substantiate some of the use cases for data that one is likely to find in construction clients, such as:

• Modern methods of construction: Like TIP, the Playbook emphasises the need for 'Modern Methods of Construction', in other words greater standardisation of components and use of offsite manufacturing and construction. Moving towards Modern Methods of Construction will require built environment clients to be far better at surfacing digital specifications of the components that they are seeking to procure. This will rely on detailed geometric representation of assets (e.g. BIM, 3D design models), and, as the Playbook states, "digitising standards and specifications so that requirements are both human and machine readable" (see the sketch at the end of this section).
• Common data specifications: The Playbook describes a construction sector where data is "often fragmented or not easily accessible." Referencing the UK BIM Framework and Government Soft Landings, it suggests that a project information model should act as the principal means of data transition from construction to operations and encourages "the development and use of consistent structure, rules and language in standards and specifications to facilitate shared understanding and the use of digital and automated solutions".
• Net Zero commitments: The Playbook recommends that "contracting authorities should adopt the most up-to-date whole life carbon assessments and methodology." The sector has yet to establish standard practice in this space. Specifying and capturing data in an auditable and repeatable manner (rather than as a morass of spreadsheets) is a big part of the work required to hold governments and organisations to account.

The Playbook also sets out a range of activities such as portfolio management, risk management, benchmarking, cost management, supplier assessment, and performance-based contracts. All these areas should be common to most construction projects, and all require extensive storage, sharing, and use of data to realise. As a data professional on a project these practices may initially be hidden from you: they will likely be the work of specialists or consultants, using niche unintegrated software (or more likely spreadsheets). Those responsible for this work may not be keen to share that responsibility, and you will need to work to establish credibility and trust as someone who can help rather than simply complicate or criticise.

The Construction Playbook is valuable in that it is a high-profile government publication that recognises the importance of a standardised approach to the activities of construction projects, including the use of data. As the Playbook states, "improving the consistency and quality of data will be transformational in how we can deliver projects and programmes by improving safety, enabling innovation, reducing costs, and supporting more sustainable outcomes." As we will cover later in this book, many of the challenges that come as part of working with data in built environment


organisations have their roots in contracts and procurement. An organisation that can competitively procure strategic partnerships with the most capable and best-placed suppliers available will have an advantage in deploying data capability. Likewise, an organisation that ensures that it retains control of its own data and can enforce data sharing, governance, and quality deep into its supply chain, will be far better placed to realise value from its data. As a data professional in a built environment organisation, you may wish to encourage your colleagues in commercial, legal, and procurement to read this Playbook.
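As promised above, here is a minimal, hypothetical sketch of what a 'human and machine readable' specification might look like in practice. The schema, field names, and threshold values are invented for illustration; a real digitised specification would follow an agreed sector schema.

```python
# Hypothetical machine-readable component specification. A human can read
# the requirements; a machine can check an offered product against them.
door_spec = {
    "component": "Internal fire door",
    "fire_rating_minutes": 30,       # minimum acceptable rating
    "max_leaf_width_mm": 926,        # dimensional constraint
    "acoustic_rating_db": 29,        # minimum acceptable rating
}

def check_product(spec: dict, product: dict) -> list[str]:
    """Return the requirements an offered product fails to meet."""
    failures = []
    if product["fire_rating_minutes"] < spec["fire_rating_minutes"]:
        failures.append("fire rating below specification")
    if product["leaf_width_mm"] > spec["max_leaf_width_mm"]:
        failures.append("leaf too wide for specified opening")
    if product["acoustic_rating_db"] < spec["acoustic_rating_db"]:
        failures.append("acoustic rating below specification")
    return failures

offered = {"fire_rating_minutes": 30, "leaf_width_mm": 926,
           "acoustic_rating_db": 27}
print(check_product(door_spec, offered))  # ['acoustic rating below specification']
```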

3.2.4 Data for the Public Good (National Infrastructure Commission, 2017)

The right paper at the right time, Data for the Public Good paints in broad strokes and strong colours, making the case for the realisation of societal benefit through the application of data to infrastructure. It is unafraid to post big numbers against big problems—for example, data contributes £50 billion a year to the UK economy—whilst keeping its sights set firmly on helping everyday people with their bills, their travel, their environment and so on. The totemic assertions of the paper have become the air we breathe in the data profession:

1. "Data creates value": the availability of high-quality information on infrastructure can directly contribute to the better functioning of that infrastructure.
2. "Data is infrastructure… Data is part of infrastructure and needs maintenance in the same way that physical infrastructure needs maintenance."
3. "Data enables innovation." The paper uses the example of CityMapper, a massively useful application that only really became possible on the back of TfL making its data publicly available.
4. "Doing nothing is a big risk." In other words, there is an opportunity cost to keeping data siloed and under-utilised; in many ways this is the inverse of point 3.
5. We must share data safely and securely… "the model needs to move from keeping all data confidential to minimum levels of commercial confidentiality." Instead of starting with the idea that all organisational data is confidential, start with the notion that data could be open and make the case where necessary for not sharing data that is genuinely confidential.

Whilst these concepts all pre-date Data for the Public Good, it is invigorating to see them asserted so strongly in a publication by a public body, and there is no doubt that this paper helped to spread and legitimise these ideas. This is particularly true of the onus on sharing data. Whilst there have been many worthy attempts within government to encourage data sharing (such as data.gov.uk), and whilst on paper most public sector organisations are committed to data sharing where possible, the truth is that organisations remain very risk averse and reticent when it comes to sharing data. Individuals are rarely incentivised to stick their neck out and share information when


it's easier just to kowtow to cyber security or information management concerns and keep quiet. Throughout Data for the Public Good the paper references the pivotal role of the Centre for Digital Built Britain (CDBB) in piloting and delivering the change that the paper envisions, and in coordinating across key organisations such as the Alan Turing Institute, the Office for National Statistics, and the Open Data Institute. Given that the CDBB lasted barely five years following the paper's publication, reading it in retrospect can feel a bit like reading about a beautiful future that might have been. However, your authors believe that despite the CDBB's untimely demise, the ideas surfaced in Data for the Public Good and later publications (see below) are sufficiently profound and far-reaching that, in one way or another, the consensus will emerge to make them a reality.

3.2.5 Flourishing Systems (CDBB/CSIC, 2020)

This paper condenses the logic of Data for the Public Good into a short, clear, and compelling narrative. Gone are any technical details or technocratic government speak, and the imagery is beautiful. If you're going to share one paper referenced in this book with your friends or family or merely non-data people, Flourishing Systems is the one least likely to bore them immediately!

Flourishing Systems begins with a thesis statement: "The central ideas in this paper are simple and radical: that the purpose of infrastructure is human flourishing, and that infrastructure is a system of systems." It describes the interconnectedness of infrastructure and how interconnectedness contributes to both positive and negative outcomes for society. A flourishing system of systems is one where we manage infrastructure and utilities harmoniously to realise a better, more sustainable standard of living for everyone. The alternative is one where, as per the CReDo project that we covered earlier in the book, interconnectedness reduces resilience and allows issues and vulnerabilities to cascade through the system of systems. Like Data for the Public Good, Flourishing Systems puts data and digital technologies front-and-centre in realising better societal outcomes, stating that "digital transformation is key to unlocking greater value from the built environment." Both papers use similar language about the value of data, and Flourishing Systems urges us to "recognise digital assets, such as data, information, algorithms, and digital twins, as genuine 'assets,' which have value and must be managed effectively and securely. In time, as data and digital assets become valued, data itself will be seen as infrastructure."

Again, Flourishing Systems goes a step further than merely restating the importance of data. As a foundational text for the CDBB's National Digital Twin Programme, the paper asserts that such is the complexity of the built environment's system of systems that we need a whole new way of using data to realise better outcomes. Specifically, Flourishing Systems envisions a network of interoperable Digital Twins exchanging data in accordance with a decentralised information


management framework. The paper defines these terms clearly, and we repurpose these definitions elsewhere in this book:

• Interoperability: the ability of two or more systems to exchange information and to make use of exchanged information.
• Digital Twin: a realistic digital representation of assets, processes, or systems in the built environment; what makes it a twin is the data connection between digital and physical.
• Information Management Framework: a proposed framework to enable effective information management in the built environment. Its intent is to help put the right information in the right hands at the right time to make the right decisions.

Like Data for the Public Good, Flourishing Systems is both influential and ambitious. Whilst that level of ambition may have exceeded what the government of the day was willing to continue to fund, the vision of interconnected digital twins has taken root in the sector, becoming in a way our idealised end state, our "next year in Jerusalem." What is worth pondering (your authors' reflection on this section in editing) is how we can encourage governments to think truly systemically and appreciate the role that the built environment plays in the social outcomes of civilisation. It sounds idealistic, but Flourishing Systems is a seminal piece of work that explores the foundations of this view.

3.2.6 Independent Review of Building Regulations and Fire Safety: Hackitt Review (MHCLG, 2018)

In a better world, the instigation of policy and thinking on the built environment would not rely on tragedy. And yet, sometimes there occurs such an egregious injustice as to create a moral imperative for action. The Grenfell tragedy happened twice. First was the fire itself early on the 14th of June 2017. Second was the prolonged inquiry, which the counsel to the inquiry described as a "merry-go-round of buck-passing" before ultimately concluding that "each and every one of the deaths that occurred in Grenfell Tower, on the 14 June 2017 was avoidable (Booth, 2022)." Whilst it would be callous to summarise the litany of wrongs that led to Grenfell in a single sentence, as the inquiry proceeded it became clear that a contributing factor was a lack of a clear audit trail of design decisions concerning the flammable components used. The information either wasn't available or the actors in question did not make appropriate use of it. The government website on the Hackitt report states:

The Independent Review of Building Regulations and Fire Safety was announced by government in July 2017 following the Grenfell Tower tragedy and was led by Dame Judith Hackitt (Ministry of Housing, Communities & Local Government, 2017).

Its purpose was to make recommendations that will ensure:


• A sufficiently robust regulatory system for the future.
• Residents feel that the buildings they live in are safe and remain so.

It examined building and fire safety regulations and related compliance and enforcement, with a focus on multi-occupancy high-rise residential buildings. Whilst the report, understandably, focuses predominantly on building safety, it nonetheless contains important lessons for data and information management. The introduction from Dame Hackitt calls out the need for "transparency of information and an audit trail all the way through the life cycle of the building." Notably, Chap. 8, titled "Golden thread of building information," begins with the following, damning, statement:

The review heard almost unanimous concern surrounding the ineffective operation of the current rules around the creation, maintenance and handover of building and fire safety information. Where building information is present, it is often incomplete or held in paper form and is not accessible to the people who need to see it.

The report identifies baseline information that should form part of the 'Golden Thread', including:

• Size and height of the building.
• Full material and manufacturer product information.
• Identification of all safety critical layers of protection.
• Design intent and construction methodology.
• Digital data capture of completed buildings using technologies such as laser scanning.
• Structure.
• Fabric.
• Escape and fire compartmentation information.
• Systems in operation; and
• Permanent fixtures and fittings.

However, it also makes it clear that organisations must collate additional information to enable the maintenance and effective handover of the building through the asset life cycle, particularly from construction to operation. There are obviously parallels here with the ‘Soft Landings’ approach. The Hackitt report recommends the continued adoption of BIM across the sector, as part of what it describes as “the wider move towards transparency and integrity of information.” Whilst most built environment projects are not concerned with a ‘high risk residential building’, we are nonetheless contributing to the places in which people live, travel, play, recuperate, and work. As data professionals we have a duty of care to ensure that we supply those in our organisations who make design and safety decisions with the information that they need so that they can do so with a clear conscience and in accordance with safety regulations. We also have an obligation to maintain an audit trail of those decisions so that their work is open to review and scrutiny, and so that a meaningful body of knowledge transfers to those charged with ensuring that the assets remain safe.
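To ground the idea, here is a minimal sketch (with invented field names, not any statutory schema) of how a golden thread record might pair structured building information with an append-only audit trail of design decisions.

```python
# Hypothetical golden thread record: structured building facts plus an
# append-only log of design/safety decisions, so the trail stays open to
# review and scrutiny throughout the asset life cycle.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class BuildingSafetyRecord:
    building_id: str
    height_m: float
    storeys: int
    decisions: list[dict] = field(default_factory=list)  # the audit trail

    def record_decision(self, who: str, what: str, rationale: str) -> None:
        """Append a decision; nothing is ever overwritten or deleted."""
        self.decisions.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "decided_by": who,
            "decision": what,
            "rationale": rationale,
        })

record = BuildingSafetyRecord("TOWER-01", height_m=67.3, storeys=24)
record.record_decision(
    who="Principal Designer",
    what="External wall insulation product selected",
    rationale="Non-combustible (class A1) classification evidenced by test report",
)
print(record.decisions[0]["decision"])
```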


Fig. 3.6 An illustration of the trade-off between risk mitigation and data sharing

As our organisations become increasingly aware of the importance of legislation like GDPR, the commercial advantage of data, and the need for effective cyber security protections, we as data professionals will find ourselves increasingly discussing the privacy and security implications of using data. From a data protection or cyber security perspective, the only zero risk option is to not collect, store, use, or share any data, and sometimes it can feel like this is the default position of our colleagues working in these roles. Part of our role as data professionals is to identify and highlight where the benefit of sharing and using data exceeds the risks associated with doing so. Whilst we shouldn’t have to choose between security and data, there will always be trade-offs where a pragmatic assessment of risk is important. Grenfell casts a long shadow, and it reminds us that the consequences of not collecting and sharing crucial information on our work can be far graver than the associated privacy, commercial, or security risks (Fig. 3.6).

3.3 Relevant Regulation and Legislation

One would like to think that the built environment sector does not require legislation to do the right thing. Legal obligation is a far less inspiring motivation than concern for the wellbeing of society. And yet, malpractice and injustices do occur, and legislation should ensure at least a baseline level of accountability across most organisations. As a data professional you may not have a great deal of legal accountability yourself (except, perhaps, under GDPR). However, you will benefit from an awareness of the legislation that informs the work of those around you, particularly as there is often an obligation for organisations to maintain a record of their compliance with legislation. The goal of this section isn't to get overly bogged down in legalese, but to provide


you with a rapid-fire overview of legislation that might be relevant to your work, and potentially even another useful lever to pull in pursuit of better outcomes through data. It never hurts to have the law on your side.

3.3.1 Keeping Staff Safe

We described in the previous chapter that data on health and safety risks and incidents is an important aspect of built environment data. In the UK, the Health and Safety at Work Act (HSWA) 1974 established the legal obligation for organisations to provide staff with a safe working environment, to keep records of health and safety incidents, and to report certain incidents up to the Health and Safety Executive (HSE). Twenty-five years later, the Management of Health and Safety at Work Regulations (MHSWR) 1999 expanded on the reporting obligations required by the HSWA. These more detailed reporting obligations included specific reporting of incidents resulting in lost time from work, death, and serious injury, as well as workplace illnesses such as those caused by vibration (for example from hand tools) or poor air quality (such as asthma), and serious accidents (such as collapsing plant or scaffolding, explosions, or gas leaks).

Both the HSWA and MHSWR have widespread applicability beyond the built environment. By contrast, the Construction (Design & Management) Regulations (CDM 1994/2007/2015) are, as the name suggests, specifically targeted at the design and construction professions. Whilst many of the obligations under the CDM Regulations align closely to those set out by the HSWA and MHSWR (e.g. recording incidents, conducting risk assessments, training staff, reporting up to the HSE), there are also obligations specific to design and construction, such as the nomination of legal roles including Principal Designer and Principal Contractor, and the creation of a Construction Phase Plan.

As data professionals, part of our role is to help our wider organisations comply with these regulations. The intent of the legislation described above is not merely to generate paperwork, but to spur organisations to take accountability for the wellbeing of their staff. Making it easier for our organisations to record, analyse, and learn from safety incidents is one of the most direct ways in which we use data to improve how the built environment operates.
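As an illustration of what 'making it easier to record, analyse, and learn' can mean in practice, the sketch below holds safety incidents as structured records rather than free text. The categories and escalation threshold are invented for the example and are not legal guidance.

```python
# Hypothetical structured incident records. Once incidents are data rather
# than prose, escalation rules and trend analysis become trivial queries.
from collections import Counter

incidents = [
    {"site": "A", "category": "lost_time", "days_lost": 4},
    {"site": "A", "category": "near_miss", "days_lost": 0},
    {"site": "B", "category": "lost_time", "days_lost": 12},
]

# Which incidents meet an (illustrative) threshold for onward reporting?
reportable = [i for i in incidents
              if i["category"] == "lost_time" and i["days_lost"] > 7]
print(len(reportable), "incident(s) to escalate")

# Incident counts per site, ready for a dashboard or a safety review.
print(Counter(i["site"] for i in incidents))
```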

3.3.2 Keeping Occupants Safe

We have a moral imperative not only to design and construct buildings in a manner that does not needlessly endanger our staff, but also to ensure that once built these assets do not needlessly endanger those who live, work, and otherwise rely on them. There are a wide range of safety standards across the built environment tailored to classes of asset. Often, unfortunately, amendments or additions to these regulations


follow tragedy. In the UK, the Building Regulations (1965 onwards), the Fire Safety Order (2005), and more recently the Building Safety Act (2022) establish minimum safety standards for buildings. Regulated infrastructure, such as road, rail, or utilities, sees standards set at a variety of levels including:

• Safety regulations (such as the Railway Safety Regulations),
• Regulatory performance requirements set by regulators (such as Ofgem, Ofwat, or the ORR), which often track compliance with safety recommendations, and,
• Engineering standards such as National Highways' Design Manual for Roads and Bridges (DMRB).

As with safety reporting, our goal as data professionals should be to help our organisations meaningfully comply, and demonstrate compliance, with these standards. Assurance of compliant design and construction is a major use case for digital construction and asset management data solutions such as Building Information Modelling (BIM), site data capture, and enterprise asset management systems (EAMS). By providing staff with a digital replica of our assets, and tracking their composition and condition, we can allow our colleagues and clients to better assure the safety of their assets while reducing the amount of time spent by staff in hazardous locations (such as in plant rooms, on roofs, or under bridges), and the amount of time that assets spend closed due to inspection or maintenance. This in turn makes it easier for organisations to continually assess that their assets are safely available to those that need to use them.

3.3.3 Protecting the Natural Environment

To the extent that we built the built environment, we built it upon the natural environment. The built environment is a machine that converts resources from the natural environment (space, building materials, water, energy) into valuable outcomes for society (accommodation, transport, heat, power, sanitation, etc.). The built environment also degrades the capacity of the natural environment, both depleting natural resources such as fresh water and discharging waste products into the same reserves, often at a level that the natural environment cannot absorb. Whilst this is a global problem that requires solutions (and changes of behaviour) far beyond what legislation by individual states can accomplish, it is nonetheless the case that most developed countries maintain some regulations that seek to limit the damage done to the natural environment. In the UK, these regulations include the Environmental Protection Act (1990), the Water Resources Act (1991), the Environmental Information Regulations (2004), the Contaminated Land (England) Regulations (2006), the Waste (England and Wales) Regulations (2011), and the Energy Performance of Buildings Regulations (2012). Depending upon the type of assets for which your organisation is responsible, you may help capture, analyse, and report on data pertaining to energy consumption, waste generation, water usage, waste discharge, and land contamination, as well as the risks and mitigations associated with these measures.


Increasingly, organisations are setting out their own Environmental, Social, and Governance (ESG) targets which go beyond their legislative obligations. Undoubtedly many of these goals are simply greenwashing; others may genuinely change the impact that organisations have on the natural environment. The most effective ESG goals will be data-driven, and those in turn may require organisations to capture far more granular data on their impact than they would otherwise. For example, to credibly establish that they are carbon neutral, an organisation will need to measure not only the carbon emissions of their assets, but also those of their supply chain, as well as the embedded carbon associated with their assets, and the veracity of any carbon offset initiatives that they invest in. This is an emerging and important data niche, and one where practitioners are still shaping best practice. Similarly, ESG targets pertaining to accessibility, diversity, inclusion, and transparency have moved organisations in the built environment to start to quantify and record metrics in these overlooked areas.
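The arithmetic behind such a claim is simple; the hard part is capturing trustworthy inputs. The sketch below, using invented figures and categories, shows the aggregation that a credible carbon-neutral claim implies.

```python
# Hypothetical carbon position. Every input below represents a serious
# data-capture exercise: metered energy, audited supplier returns,
# amortised embedded carbon, and offsets whose veracity has been verified.
emissions_tco2e = {
    "operational_assets": 12_400.0,   # measured energy use of the estate
    "supply_chain": 48_900.0,         # reported by suppliers, then audited
    "embedded_in_assets": 31_200.0,   # materials, amortised over asset life
}
verified_offsets_tco2e = 5_000.0      # only offsets with audited veracity

net_position = sum(emissions_tco2e.values()) - verified_offsets_tco2e
print(f"Net position: {net_position:,.0f} tCO2e")
print("Carbon neutral claim credible:", net_position <= 0)
```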

3.3.4 Keeping Data Safe

The European Union's General Data Protection Regulation (GDPR) 2016 had a massive impact upon the data industry and how organisations across the world interact with data on individuals. In establishing a digital right to privacy, GDPR, and similar legislation in other countries, moved data privacy from being a moral question to a legal obligation. However, the handling of data, particularly as it pertains to human beings, has always required ethical considerations. As the DAMA states, "because data represents people, data management professionals should recognise that there are ethical (as well as legal) reasons to protect data and ensure that it is not misused (DAMA International, 2017)."

GDPR added a new set of responsibilities for data professionals, tasking them with recording where their organisations were using personal data and ensuring that the use was for legitimate purposes and limited to what was necessary in relation to those purposes. It increased the emphasis on organisations maintaining an information asset register (IAR) and a record of processing activities (ROPA), conducting data protection impact assessments (DPIAs), and ensuring consistent and reasonable data retention periods.

In the wake of the Cambridge Analytica scandal, much of the attention on data privacy and personal data rightly focused on large business-to-consumer (B2C) technology firms. The level of personal data captured and used by built environment organisations pales in comparison to that managed by Meta (Facebook) or Alphabet (Google) or even other public sector bodies (such as HMRC). However, there is still enough to warrant attention. Common sources of personal data include job applications, public consultations, land registry information, and other forms of stakeholder engagement. Some organisations that are more consumer-focused will capture more information on user profiles, including train operating companies (ticket sales,


reward schemes, train services such as Wi-Fi), toll road operators (licence information, bank account details), and utilities (billing information, consumption data). There are also many services that sell user profile information that may link to user habits like travel patterns, and even detailed census information can identify individuals when combined with other data sources. Whilst few of us share the intimate details of our lives with our water companies in the same way that we might on social media, it is nonetheless entirely possible to build a detailed picture of an individual’s interactions with the built environment. However, just because something can be done doesn’t mean that it should be done, and part of your role as a data professional should be keeping an eye out for individuals or teams in your organisation that are acting in ethically dubious ways (often without realising that is what they are doing). It is sometimes important to remind colleagues that just because something isn’t technically illegal doesn’t mean that it’s not creepy or apt to cause reputational damage.
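As a hypothetical illustration (the field names are a simplification, not the statutory wording), a record of processing activities entry can be held as structured data, so that obligations such as retention periods become checkable rather than aspirational.

```python
# Hypothetical ROPA entry plus a retention check. Holding these records as
# data lets an organisation audit itself continuously, not just annually.
from datetime import date, timedelta

ropa_entry = {
    "activity": "Public consultation responses",
    "personal_data": ["name", "email", "postcode"],
    "lawful_basis": "public task",
    "purpose": "Assess community impact of a proposed scheme",
    "retention_years": 2,
    "collected_on": date(2022, 3, 1),
}

def retention_expired(entry: dict, today: date) -> bool:
    """Flag records held beyond their stated retention period."""
    expiry = entry["collected_on"] + timedelta(days=365 * entry["retention_years"])
    return today > expiry

print(retention_expired(ropa_entry, date.today()))
```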

3.3.5 Making Data Transparent

Just as GDPR has added to the job role of data professionals, in the UK public sector the Freedom of Information Act 2000 (FOIA) has made the provision of data to members of the public a part of many built environment organisations' workload. There is similar legislation in many other democratic countries, including the US, Canada, India, Australia, New Zealand, and the EU.⁶ The FOIA allows members of the public to make reasonable requests for information from public sector bodies, and places strict limits around the amount of time that those bodies have to respond. It also limits the circumstances in which organisations may decline to provide information. Whilst GDPR and FOI requests may not always sit under the purview of a CDO or data team—though often they do—they will spur improvement in data governance and data management within these organisations.

FOI request rates vary enormously depending upon the public body. In the built environment, FOI levels are often correlated with controversy (an expensive, environmentally destructive project will likely receive—and deserve—a commensurate level of FOI requests) or public inconvenience (for example, queries related to tickets issued by cameras on the road network). Any organisation that tries to triage high levels of FOI requests without (a) clear governance around who is responsible for determining what information can and should be shared, and (b) means for collating that information in a fast and repeatable manner, will quickly find itself breaching FOI timescales, sharing inaccurate or inappropriate information, and expending a disproportionate level of resources in the process. Happily, one of the quickest means to reduce the burden of FOI requests is for public organisations to proactively share data using services such as Data.gov.uk or via a dedicated API. However, this is not an option that enough built environment organisations choose to exercise, and so the role of FOIA (and similar legislation) in helping to ensure transparency remains secure for now.

Your authors are supporters of this concept. Information of public interest and information used in the operation of government should be held to account just as politicians and members of public service are. When people move on, all that is left is their digital emissions in the form of data and documents. As a parting thought, your author found it interesting (via The Rest is Politics podcast, 2023) that Tony Blair thinks the FOIA has not met its policy objective. It has weaponised public information for journalists and now influences the behaviour of those in government (for example, senior politicians relying on encrypted platforms like WhatsApp for government communications). This highlights an important social consequence of data governance: it has a profound impact in the long run.

⁶ The Freedom of Information Act in the United States, the Access to Information Act in Canada, the Right to Information Act in India, the Freedom of Information Act in Australia, the Official Information Act in New Zealand, and Regulation (EC) No 1049/2001 of the European Parliament and of the Council of 30 May 2001.

3.3.6 Encouraging Data Best Practice Through Regulation

Regulators can and should play a constructive role in challenging organisations within their purview to improve their use and management of data. An example of where a UK regulator has challenged its sector to up its data game is Ofgem's Data Best Practice Guidance and Digitalisation Strategy and Action Plan Guidance (Ofgem, 2021). Published following consultation with the energy companies, these guidance documents carry weight as part of Ofgem's licensing conditions. The Data Best Practice Guidance benefits from its brevity, (relative) simplicity, universal applicability, and pragmatic view of the maturity and needs of the sector. It establishes a set of principles that focus on data governance, specification of metadata, discoverability, user requirements, data quality, interoperability, and data sharing. By focusing on these foundational concerns, rather than challenging energy companies to invest in innovative data solutions, the guidance will help ensure that the sector invests in the 'brilliant basics' of data management. As the guidance states, it aims "to ensure data is treated as an asset and used effectively for the benefit of customers, stakeholders and the Public Interest (Steer, 2021)." By choosing principles that are not specific to the energy sector, Ofgem created a set of rules that, if implemented, could help to underpin interoperability between sub-sectors of infrastructure, public services, and the built environment, whilst providing greater clarity and direction to their own sector.
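As a sketch of what those 'brilliant basics' can look like in practice, here is a hypothetical metadata record for a dataset. The fields loosely echo the guidance's focus areas (governance, metadata, quality, sharing) but are invented for this example.

```python
# Hypothetical dataset metadata record. Publishing records like this to a
# searchable catalogue is what makes data discoverable before anyone has
# to raise a request for an extract.
dataset_metadata = {
    "title": "Substation asset condition survey",
    "description": "Annual condition grades for primary substations",
    "owner": "Asset Information Manager",   # governance: a named owner
    "update_frequency": "annual",
    "schema": {"substation_id": "string", "condition_grade": "integer 1-5"},
    "quality_notes": "3% of records missing grades for 2023",
    "access": "presumed shareable unless a restriction is justified",
}

for key, value in dataset_metadata.items():
    print(f"{key}: {value}")
```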


3.4 Conclusion

For those of us with a penchant for data, and a desire to help the physical world to work better for everyone, these are heady days. It feels like, after decades of defining success by the tons of concrete poured, there is a collective aspiration to aim higher, work differently, and heed the pressing call to action in the form of the climate crisis. There is also an increasingly coherent narrative concerning the role that data plays in this change. After many years of gestation, the core ISO standards for the built environment now speak confidently to the whole data life cycle. Though it was prematurely shuttered, the Centre for Digital Built Britain and its associated network of public sector organisations, academia, vendors, consultants, and specialist big thinkers manifested a vision of the future for the built environment that uses data to deliver better outcomes for society whilst reducing the impact on the natural environment that hosts it. Even legislation, though it remains messy, recognises the value of data and the need for transparency, and compels built environment organisations to quantify and measure how they are fulfilling their duty of care both to their staff and to the people who make use of their assets.

The component parts (standards, theory, and regulations) are in place to compel change. What is lacking most now is the body of real-world practice that evidences the theory. The task now isn't merely to advocate that organisations use data for the public good but to make it so. It isn't to imagine flourishing systems, but to foster them. It isn't to recognise that climate change is happening around us, but to actively reduce the built environment's contribution to the crisis in measurable and absolute terms. Data is the vehicle for change and the scorecard by which we measure progress. As data professionals we have a seat at the table; future generations will judge us based upon what we do now.

References

AtkinsRealis. (2022). Atkins achieves global BIM certification from BSI. atkinsrealis.com. https://www.atkinsrealis.com/en/media/trade-releases/2022/2022-02-14-uk
Booth, R. (2022). 'Every death was avoidable': Grenfell tower inquiry closes after 400 days. The Guardian. https://www.theguardian.com/uk-news/2022/nov/10/every-death-was-avoidable-grenfell-tower-inquiry-closes-after-400-days
BSI. (2018a). BSI launches new Kitemark for BIM Level 2. BSIgroup.com. https://www.bsigroup.com/en-GB/about-bsi/media-centre/press-releases/2018/october/bsi-launches-new-kitemark-for-bim-level-2/
BSI. (2018b). PAS 1192-6:2018 Specification for collaborative sharing and use of structured Health and Safety information using BIM. BSIgroup.com. https://knowledge.bsigroup.com/products/specification-for-collaborative-sharing-and-use-of-structured-health-and-safety-information-using-bim?version=standard
buildingSMART International. (2023). technical.buildingsmart.org. https://technical.buildingsmart.org/standards/ifc/


Burns, D. P. (2022). Asset management as a quest. Talking Infrastructure. https://talkinginfrastructure.com/wp-content/uploads/2022/03/AMQuest-Ch1.pdf
Civil Engineering. (1999). Programme management B.C. web.archive.org. https://web.archive.org/web/20070608101037/http://www.pubs.asce.org/ceonline/0699feat.html
DAMA International. (2017). DAMA-DMBOK: Data management body of knowledge (2nd ed.). Technics Publications.
Dyson, G. (2013). Turing's cathedral. Penguin.
GIIG. (2023). How the environment agency developed its information management platform. ukbimframework.org. https://www.cpni.gov.uk/system/files/documents/ab/e8/220125cpnieaimp-case-study07.pdf
IAM. (2015). Asset management—An anatomy. IAM. https://www.theiam.org/knowledge-library/asset-management-an-anatomy/
Ministry of Housing, Communities & Local Government. (2017). Independent review of building regulations and fire safety. gov.uk. https://www.gov.uk/government/news/independent-review-of-building-regulations-and-fire-safety
Ofgem. (2021). Ofgem.gov.uk. https://www.ofgem.gov.uk/publications/decision-data-best-practice-guidance-and-digitalisation-strategy-and-action-plan-guidance
Steer, S. (2021). Ofgem.gov.uk. https://www.ofgem.gov.uk/sites/default/files/2021-11/Data_Best_Practice_Guidance_v1.pdf
UK BIM Alliance. (2023). UK BIM framework. ukbimalliance.co.uk. https://ukbimalliance.co.uk/resources/

Chapter 4

Data and Organisational Culture

Abstract In this chapter we begin to address how to make the case for data within your own organisation. This means wrestling with what it means to have a positive/receptive data culture in your organisation, as well as the psychology of individual decision-making. We then touch upon working across a complex organisation, and how to foster collaboration with stakeholders and communities of practice. We describe the purpose and formulation of a data-enabled organisational strategy, data vision statement, data principles, and identify the data capabilities required to deliver these. We then dive into example use cases and benefits statements that one might encounter in a built environment organisation, before finishing with a description of the data roles and skills that you are likely to require to deliver and embed your data capabilities.

4.1 Introduction: Meaning in Data

When we use data, we are acting as an intelligent agent. As per our definitions in Chap. 1, data and information can exist in perpetuity in the universe, but knowledge and wisdom require the active participation of intelligent agents like us. Without us, or at least our algorithms, knowledge and wisdom disappear. Culture and mindset are further manifestations of intelligence that are impossible to record as information. Indeed, whilst knowledge only requires one person, culture requires multiple interacting people.

In Chaps. 2 and 3 we described the historical and cultural factors that make working as a data professional amongst built environment organisations a unique and challenging experience. Outside of the built environment sector, many of the companies, organisations, and industries that have sprung up over the past few decades have the advantage of being data first. In the built environment sector, we do not enjoy a blank canvas. The culture of the organisations that we work with might be one of engineering excellence or delivery first or perhaps even safety by design, but it sure ain't data savvy. Our job as data professionals in this context is not to subvert the prevailing culture, but to prove how being data-driven can complement a culture of


engineering excellence or ruddy-minded delivery. That said, to the extent that we have resources to build our own teams (and co-opt the resources of others), we can create sub-cultures within our organisations where data always has a seat at the table.

4.2 Data Culture

Data exists for a reason, for people to use it to gain insight or make decisions. If you don't know who is using, or could be using, your data, then why do you have it?—Lisa Allen and Deborah Yates (Allen & Yates, 2021)

The value of data is a function of how and when people use it, and to what end. The success of your users defines your success as a data professional. Adoption is everything. As the DAMA writes, "to benefit from data as a corporate asset, the organisational culture must learn to value data and data management activities (DAMA International, 2017)." Adoption is (unfortunately) driven by culture, not technology. The popular idiom is "culture eats strategy for breakfast," but it could just as easily be "culture can frustrate even the most considered attempts to realise technological coherence." Admittedly, that's not as catchy.

Building a data team is an opportunity to start to change an organisation's culture. As we covered earlier, we can use data to better measure how our organisations are performing in terms of inclusion and diversity. Moreover, as we will cover later in this book, building a new data team is an opportunity to bring diversity of thought into an organisation. Data exists in part to help us tell stories and to explain reality to ourselves; as such, an effective data culture is one that can create compelling fact-based narratives for an organisation, and these will be rooted in that organisation's culture and its staff's mindset.

As we have explored already, the built environment sector is characterised by a wide range of (often siloed) specialist professions, a relentless focus on delivery, and high-risk, low-margin contracts. This can make it difficult for data teams to get a look in. Those unfamiliar or sceptical can perceive data as an expensive distraction or an ineffable luxury. Without addressing the culture of an organisation, colleagues may interpret our well-meaning attempts to offer data solutions as attempts to remove the autonomy or pet spreadsheets of individual teams or SMEs. Building enterprise-wide data platforms to enable self-service business intelligence sounds great until:

• Elsa from cost estimating realises that it might force her to give up her jealously protected ready reckoner spreadsheet that allows her to quickly come up with rough-order-of-magnitude estimates based upon an amalgam of judgement and past projects; and,
• Anna from the Performance team clocks that this means that IT will stop providing her with unquestioning access to SQL storage on the local data centre; and,


• Olaf, a personal report-generator/vanity analyst 1 to the COO groks that he will no longer be able to point a dashboard at any half-baked spreadsheet he likes and have the resulting visualisations accepted as gospel truth. In addition to the risk that staff like these feel disenfranchised by data solutions that replace their current ways of working, they may also identify their value at work with these tasks that they are doing and consequently worry about becoming redundant. Most people show up to work with good intentions, or at least with a view to realising some form of fulfilment. Consequently, data projects (and indeed any sort of IT intervention) that seek to standardise or modernise existing ways of working can quickly become interpreted as blocking valuable work rather than enabling it. With the right combination of honest communication, targeted training, incremental delivery, and benefits realisation we can alter the perception of data investment from threat to enabler. Data culture is the macro-level climate that sits above these microlevel frictions. Even organisations with mature data cultures will have disagreements around what their organisation can and should do, but these disputes are less likely to become contentious and toxic if there is productive two-way communication between stakeholders and broad agreement on the aspirations and direction of the organisation’s data roadmap. If, as a data professional, you don’t recognise the importance of cultural interventions like communications, stakeholder engagement, and business change—or if you can’t communicate your vision succinctly and tangibly—then at best you are likely to drive staff towards building their own shadow IT, and at worst you might have a full-scale rebellion on your hands. Data is a team sport. Or more accurately it’s a multi-team sport. It’s not enough simply to have a data team, however talented. You also need a wider organisation that is ready and able to work with you and to make use of data. The magic happens where the business and the data function meet. This is where we can explore use cases, share ownership, and willingly and consistently collaborate. Figure 4.1 shows where an organisation and its data function might contribute collectively to realising data outcomes. Different organisations have different balances of responsibility between business data users and dedicated data functions. In most cases, the wider business will have some say in identifying their data needs and filling data roles like Data Steward or Data Owner. Moreover, with the popularisation of data as a career accelerator, and the rise of low and no-code software, it is likely that there will be many staff members in your organisation with data skills who do not work within your data function. Employees might have acquired certain skills in previous roles, as part of trying to resolve certain challenges at work, through outside interests (e.g., hobbies) or from having sought formal training. It is not unusual to come across people with a narrow but deep expertise in one aspect of working with data, for example SQL data mining, GIS, or dashboarding. Where your organisation uses products from a large vendor like Microsoft, many analytics tools might be


rolled out to staff by default (PowerBI, Excel, Data Fabric, VBA Macros, PowerPivot, PowerApps, Access databases, etc.), and therefore we shouldn’t be surprised when pockets of expertise in these tools pop up across the business.

Some data professionals might see data capabilities outside their purview as a threat to their authority, and let’s be honest: channelled in the wrong direction they can be. But they are also a potential force multiplier, and an opportunity to access a far bigger pool of resources across an organisation than is possible with the relatively small data functions that most built environment organisations possess. By establishing themselves as a centre of excellence, the central hub surrounded by beautifully aligned spokes across the business, a data professional with humble resources might just stand a chance of building some momentum at an organisational level.

As data folks we have usually self-selected as people who are excited about technology. Moreover, the low maturity of built environment clients means that data teams are often part of the IT function rather than a business function (where else would the techy stuff go?). In some cases, the first challenge for a data team is to establish their right to exist as a distinct concern from a traditional Information Technology function. Organisations that are not accustomed to data terminology may struggle to distinguish between the systems that contain data and the data itself. This is doubly the case in the built environment, where technology usually exists to meet the needs of specialised professions such as architecture, cost management, or scheduling. As DAMA writes, “to become data-centric, organisations need to think


differently and recognise that managing data is different from managing IT” (DAMA International, 2017). An excessive focus on IT technicalities and architecture can mean that data strategy and propositions gestate in an environment that is heavily tech focused. So, in writing your data strategy, explaining how data enables your organisational strategy, or identifying the objectives and priorities of your data team, it is tempting to indulge in the specifics of data: how it is to be modelled and managed, and the questions of technology and toolsets. You might devote page after page, slide after slide, to the specifics of how you will ensure that your colleagues have access to the right data, at the right time, using the right tooling. Once the data infrastructure is in place, you’ll be able to sit back and watch your newly empowered business reap the benefits, right? You’ll have time to do some of the more ambitious projects on your list, like finally building your digital twin, or reconciling your enterprise data architecture with your business glossary. And yet, alas, there is no shortage of instances where people have had access to timely fit-for-purpose data on modern tooling and have still stuck fast to their own homebrew spreadsheets. As data professionals we won’t be able to cement our legacy solely on the back of providing the right technology solutions.

We have already touched upon the fear of change. To embrace change, people need to feel that they are in a safe and supportive environment. It needs to be OK for them to struggle, to ask for help. Consequently, a positive data culture, in this context, means creating an environment where people feel safe and empowered to influence and make use of what you are creating on their behalf. They also need to genuinely believe that last part: that you are working on their behalf. There are many reasons why people might not take advantage of data services, including:

• A fear of a loss of control.
• A lack of training in how to make use of the tools.
• A lack of availability or willingness to change their ways of working.
• A concern that the tooling will not do what they want it to do.
• “Not built here” syndrome.
• A lack of accurate and timely data via official channels.
• Self-sufficiency through existing shadow IT workarounds and shortcuts (e.g., “if I need an update, I’ll just call the supplier”).
• A profound suspicion and distrust of IT, reinforced/justified by prior experience.

As Carl Jung taught us, “whatever you resist, persists.” (There is no definitive source for this exact phrase in his writings or lectures, but the concept is consistent with Jung’s ideas on the shadow and the importance of embracing and integrating all aspects of the self.) So we need to be empathetic to the challenges listed above. Maslow’s hierarchy of needs is a useful reference point for how you go about addressing and channelling resistance towards a positive culture that is willing and able to embrace data (Maslow, 1943). The logic of Maslow’s hierarchy is that meeting basic needs (physiology, safety) is a precursor to meeting more sophisticated needs (cognition, self-actualisation). The basic needs


are noticeable in their deficiency (e.g., “I feel hungry” or “I feel unsafe”), whilst the sophisticated needs are a route to growth (e.g., “I understand this new skill” or “I appreciate the beauty of this object”). Data is inherently cognitive, and in many cases also aesthetic. Decision-making is a big part of how we experience cognition and self-actualisation.

In a business context your biggest challenge might be justifying data investment in a way that aligns with your organisation’s metrics (time savings, cost savings, reduced risk, better products, customer satisfaction, etc.). These objective justifications are undoubtedly important, but a data strategy also needs to feel right. This is because most challenges in adoption, or even simply obtaining funding, are rooted in more foundational needs such as:

• “I don’t feel safe to admit that I don’t understand how to use this tool” or,
• “I don’t feel like I belong to this group and feel like they are criticising how I work today” or,
• “I feel like I am being sold to” or,
• “This feels too ambitious, more like science-fiction than something that we could actually use.”

Your data practice must feel proportionate and realistic to your organisation and their work, otherwise you risk triggering system rejection (Fig. 4.2).

Fig. 4.2 The goldilocks zone of data strategy

Who amongst us has not tuned out of a presentation, not because the content was logically wrong, but because we felt threatened by the person presenting, or by the implications of what they were saying? Respecting culture means in part respecting the psychology of human beings, and the need to spend time empathising with your users and establishing rapport. To ensure that data services are adoptable and adopted you must first ensure that the people behind the data services are accessible and relatable. You must avoid presenting new services as a ‘done deal’ (a fait accompli), or overly denigrating


what they replace. Instead, you can establish empathy and rapport by ‘co-creating’ with your users. Few people respond well to the suggestion that their current ways of working are wrong or impractical (even if, deep down, they know it is true). Nor will people feel safe to engage and learn if they feel that we will question their credibility if they don’t immediately and intuitively know how to do data or be data driven.

The built environment sector brings more than its fair share of big egos and specialist skill sets, and data is rarely at the top of the pecking order. Nor should it be. As we established in Chaps. 1 and 2, the built environment existed and served humanity long before anyone was talking about data. Engineers built amazing structures long before the data profession showed up with our databases and apps. As data professionals we must remember that whilst data is our life, our identity (sad), our reason for getting out of bed in the morning, for most of our colleagues data is but a small part of their role: at best an enabler, at worst a nuisance. Consequently, our users will see any need to change their processes to fit with data systems as the tail wagging the dog. As we have described, you must approach your colleagues with empathy and respect, bringing a compelling vision for the future and for how they will directly benefit from your work. And to create a positive data culture you must always create a safe space for honesty and vulnerability.

Part of what makes working with data in a built environment context so fun is that there isn’t always an established blueprint or correct answer to meet all our users’ needs. As a community of data professionals and data users we all face challenges on the road towards realising value from data. Data professionals must strike a fine balance between projecting confidence and appearing stubborn, between believing in their work and refusing to acknowledge risks or failures. There is plenty of hubris in this space: people who try to build sophisticated digital twins or buy expensive software without establishing a case for change, user needs, or even foundational data management practice. As a profession we need to understand and address our users’ basic needs before trying to blind them with science. In doing so we can begin to make the small part of each user’s role that relates to data one of the things that most excites them about coming to work. That is where positive data culture is born.

4.2.1 What Does It Even Mean to Make Data-Driven Decisions?

If one pays sufficient attention, one can notice that there’s no subject in the middle of experience — there is only experience. And everything you experience simply arises on its own… when you look within, there’s no trace of an internal commanding officer, autonomously issuing decisions. There’s only mental activity, flowing on. (Burkeman, 2021)

Working with data in the built environment requires a degree of doublethink. One is forever chasing a utopian near future where everything is data driven. And yet,


we carry the accumulated baggage of so many struggles to land even the smallest change. During the long dark Teams-time of the soul, you wonder: are your systems, staff, and organisations ready to realise value from data? Are you anywhere near rational enough to create value from data? Or are you stuck in a cave of heuristics laughing at the data-driven shadows on the wall?

As data professionals, it is important that we cultivate a working theory as to exactly how we derive value from data. One that isn’t naïve, and which accounts for reality. We live and work in a world that is as much characterised by data loss, neglect, and misinterpretation as data leveraged to create value. Accounting for reality will help calibrate your data strategy, inform how you structure your organisations around data, and fundamentally help you to get stuff done.

Cal Newport’s book “A World Without Email” describes in painstaking detail how an inefficient way of working (e.g., loads of emails and asynchronous messages, all the time) became the default way of working for the white-collar world (Newport, 2021). This way of working—Cal calls it the ‘hyperactive hive mind’—massively reduces our ability to think by forcing constant context shifts into our daily labour (sound familiar?).


Our dependence on email (and other kinds of asynchronous communication) has undoubtedly consumed many lifetimes’ worth of work and ruined the mental health of many others. As Megan Carnegie writes for Wired magazine, “18 percent of British workers say the spread of digital tools makes it harder to switch off, while the same figure finds them anxiety-inducing” (Carnegie, 2021). It is a cautionary tale, one that warns us to consider the hidden costs of our default ways of working. It reminds us that just because we are using innovative technology doesn’t mean that we are working effectively.

Cal’s most important insight is that the downsides of technology are usually rooted in the psychology of the user base, not the technology itself. This insight is at the crux of why we don’t always see value from data. We can see this acutely in the built environment, where widely adopted digital technologies such as email and document management systems have replaced prior paper-based bureaucracies without reducing the overall bureaucratic burden. If anything, the lower time burden of communication means devoting even more time to arcane governance: “now that it’s so easy to meet remotely, we can afford to add a fourth layer to our governance process.”

Cal’s suggestion is not merely that you pay lip service to ‘culture’ and ‘change.’ Rather, he is urging us to consider that our failings are often the product of our individual and collective subconscious and societal consensus. This is like Daniel Kahneman’s idea of mistaken competence, where we judge people’s performance based upon the outward signs of competence: confidence, title, hours worked, and reports filed, rather than outcomes accomplished. It is easy for digital tooling to lead to our conflating productivity with the number of emails sent or documents written.

Ikujiro Nonaka’s theory distinguishes between ‘tacit knowledge’ (e.g., brain thoughts), ‘explicit knowledge’ (e.g., data), and knowledge creation (e.g., the combination of the two) (Nonaka, 1997). These distinctions feel vital today. Tacit knowledge still matters: you cannot replace tacit knowledge with data, you can only complement it with data. The question of data-driven decision-making will always balance tacit and explicit knowledge. The difference now is that what we mean by knowledge is changing. The term ‘explicit knowledge’ implies a level of legibility and transparency that big data and neural networks have rendered impossible. Likewise, the more we learn about how human beings make decisions (Kahneman, 2012), the less the designation of knowledge (tacit or otherwise) seems to apply. There comes a point in every data project where we need to reckon with the moment Jo Lucas describes as “when machine logic meets the unpredictable, messy, ever-changing irrationalities of humans and our relationships with each other” (Lucas, 2022). If we are creatures solely driven by a haphazard set of priors and social defaults that we have unconsciously hard-wired into our brains, can we really integrate ‘knowledge’ at all?


We, in the data profession, build products for people that we believe to be rational actors. When we seek to enable ‘data-driven decisions’ we define decisions as conscious choices rather than rationalised impulses. This is why data strategies fail: not because we can’t find or manage data, but because once we have it, we are often structurally and conceptually incapable of taking advantage of it. Soon, it may be that generative AI, the successors to ChatGPT, render decision-making as simple as having a conversation. But even then, we need to know what questions to ask, or even that we ought to be asking questions in the first place rather than generating endless progress reports and dashboards.

Decision-making is interaction with the external world. In the built environment this is the response to some motivating factor: the roads are too congested, the asset is failing, the budget is lacking. Usually, these motivating factors stem from the built environment butting against the unavoidable constraints of space, time, and resources. To understand how and where we fail to take advantage of data, we need to map out a sequence of events: from external stimulus through to a decision through to the enactment of that decision as a response. Something happens, we decide how to fix it, and we enact that decision, usually with humans in the loop at every stage. As Richard Carter writes,

Humans process information before they decide to act in the physical domain. Therefore, by shaping the information domain you are influencing people’s beliefs and understanding of their world. From that comprehension, humans act in the physical domain.

Our infrastructure and built environment are the product of—and reliant upon—the actions of innumerable people, technologies, and collective fictions (and factions) such as organisations, governments, and regulators. Nobody owns the full picture, and we cannot solve our problems in isolation. The built environment functions as a system of systems, and we are all agents acting within those systems. Figure 4.3 is a caricature of the stimulus/response cycle that has created and maintained our built environment throughout the Anthropocene. It highlights some of the areas where our good intentions fail us, and the scope of our ambition if we are truly seeking to derive value from data.

Fig. 4.3 A semi-serious map of decision-making within a built environment organisation

Public sector organisations, particularly those that concern themselves with the built environment, do not always respond rapidly to changing circumstances. It’s a long journey from a stimulus through to action, one that must reckon with epistemological, technological, psychological, organisational, and cultural limitations. There exists information loss and opportunity for error at every stage in the data-to-decision process: this is the gap between observing reality and effecting change in the physical world. It is the Genoa Bridge, the levees upriver of New Orleans, Rana Plaza (Foreign & Commonwealth Office, 2014), and Grenfell. Our ability to create and influence the built environment, to translate stimulus and observation into timely, ethical, data-driven decisions and action, all stems from our ability to maintain some semblance of shared meaning across the decision-making process. This is harder than it looks; information tends towards entropy, particularly in the bowels of large organisations. The loss of information throughout the cycle from stimulus to response (illustrated by the arrow thickness in the diagram) is a loss of


shared meaning, and if our quantum of shared meaning drops below a certain vital threshold then the decision-making process fails (either visibly, or worse, by silently churning out misdirection). If we are looking to ensure that we create shared meaning across our organisation and our sector, then we need to rid ourselves of the expectation of a background level of common knowledge. Teams that aren’t predicated on shared meaning must actively build it. This is one of the reasons that diverse teams tend to outperform others: they can’t and don’t assume shared meaning. Another area where diversity helps is that different perspectives are more likely to spot information loss, and more likely to try new ways to address it. The built environment suffers tremendously from groupthink. Similar people from similar backgrounds rotate between client roles and consultancy, pitching similar ways of addressing similar problems (your authors are part of this problem). To a hammer, everything looks like a nail, and to a GIS/BIM/Primavera person most problems boil down to a map/model/Gantt chart. In this way, the information loss of a few people can become a structural problem across the sector, designed into our software, our processes, our ways of working, and our ways of thinking.

Elaborating on the decision-making image above, we can enumerate some of the areas where our decision-making processes suffer from information loss and therefore don’t always work as intended:

1. Time lag between stimulus (events) and observation: for example, organisations collect data in real time, but it may sit unused until a later point in time when humans or algorithms process it.

2. Time lag between observation and action: for example, where the data flags an anomaly or issue warranting attention, which managers then add to a backlog of problems for resolution at a future date.

3. The inaccurate and incomplete nature of observation: where our methods for measuring events are not sufficiently timely or granular to record the true nature of those events, leading us to misinterpret reality.

4. Errors introduced by computerised manipulation, aggregation, and abstraction of data. This could be the result of coding errors, loss during data transmittal or storage, or over-fitting new data to a predetermined model.

5. Errors introduced by user interfaces placed between people and the built environment. The way that we configure our software modulates our experience of data. A system where every signal is as loud as any other would be difficult to parse, like a Word document without formatting. Information architecture and user interface design necessitate prioritising some content over other content, and very few users change any UI settings from their default. This, in turn, means that even the best design will influence what information we respond to, and consequently what decisions we take. Sometimes this influence is explicit (for example, an email marked as urgent), other times it is hidden (for example, what are the consequences of a table that shows only the top 20 rows, and how that table is sorted by default, or of using a visualisation method that one’s audience aren’t familiar with and therefore cannot interpret).


6. Interpretation of information based upon biases, inaccurate priors, and unchallenged intuition: this is the thorniest part of the entire process: the human brain, and the illusion of rationality. Decision-making without an informed and conscious logical framework. We need to start conceiving of ourselves as lossy filters for information, not trusty repositories. The only things that we do reliably are misremember, misinterpret, and revert to our priors. Any effective system of data-driven decision-making must reckon with our innate humanity and design fail-safes around it.

7. Additional time lag, compromise, misinterpretation, and complexity added by the ‘decision-ratification’ process from individuals to organisations. Where decisions require ratification by a wider organisation, it always becomes the most time-consuming part of the stimulus-to-response loop. Decisions made in minutes take days, weeks, or months to ratify. Decision-ratification is also a source of further ambiguity, as it relies on individuals to accurately communicate information to the wider organisation that they work within, and for members of that organisation to in turn accurately interpret that information, and constructively add their own input. Our ratification processes multiply the opportunities for ambiguity, and introduce further biases, inaccurate priors, unchallenged heuristics, and decision-making without a conscious logical framework.

8. Misinterpretation of organisational decisions by the individuals expected to execute on those decisions. When we act on the physical environment we usually do so through intermediaries: the maintenance or facilities management department, or more likely a contracting organisation. This adds another layer of interpretation and ambiguity into the process.

Information loss occurs across our digital, human, and organisational resources throughout our decision-making processes. The nature of the loss varies, but loss itself is ubiquitous. Programmes that promise ‘digital transformation’ often seek to emphasise digital resources over human ones, but as we see from the list above there are shortcomings associated with each type of actor in the decision-making process. To compensate for information loss, we need to carefully engineer and review the logic of our data systems, we need to challenge our priors and biases when making decisions, and perhaps most importantly we need to create a decision-making culture in our organisations that leverages and augments the faculties of individuals rather than seeking to replace them.

Paying attention to how decision-makers operate within your organisation is just as important as investing in the data used to make decisions. We can’t just invest in data science; we must also invest in decision science. Richard Carter describes decision science as the science of “making optimal choices from the information that is currently available” (Carter, 2021, “Cognitive Advantage: How Artificial Intelligence Is Changing the Rules for Winning in Business and Government”, Mayhill). We must also seek to use human and artificial intelligence in a manner that plays to their respective strengths. This is particularly important in a built environment context where there is always the need to interact with the physical world at both the start (observation) and end (action) of the decision-making process. Artificial


intelligence and robotics still struggle to interact with the physical world with the nuance that humans do, so it is likely that the built environment will make more use of human resources (fleshware) than other fields. Referring back to the hierarchy diagram used in Chap. 1, we should seek to leverage the speed and reliability of artificial intelligence to generate knowledge from data, and then in turn leverage the intuition of humans (aided by AI decision support) to turn knowledge into insight and wisdom.

Too often, data hubris leads us to expect success and for transformation to be the default. Instead, our goal should be to design resilience around the inevitable information loss and resulting failures in our data collection and decision-making processes. Later in this book we will describe the built environment technologies and data solutions that seek to address information loss across the asset life cycle. In doing so we will identify how we can start to lay out a roadmap for how to create applications that allow our organisations to make effective use of data in their decision-making.
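To make the compounding nature of this loss concrete, consider the toy model below: a minimal sketch in Python in which each stage of the stimulus-to-response loop is treated as a lossy filter. The stage names follow the list above, but the fidelity figures are invented for illustration, not measurements of any real organisation.

# A toy model of information loss across the stimulus-to-response loop.
# Each stage passes on only a fraction of the meaning it receives; the
# fidelity figures below are illustrative assumptions.

STAGES = [
    ("observation of stimulus", 0.90),      # sensors and surveys miss detail
    ("aggregation and storage", 0.95),      # coding errors, over-fitting
    ("user interface / reporting", 0.90),   # defaults hide some signals
    ("individual interpretation", 0.80),    # biases and inaccurate priors
    ("organisational ratification", 0.75),  # compromise, ambiguity, delay
    ("execution via intermediaries", 0.85), # contractors reinterpret intent
]

def shared_meaning(stages):
    """Multiply per-stage fidelity to estimate how much meaning survives."""
    remaining = 1.0
    for name, fidelity in stages:
        remaining *= fidelity
        print(f"after {name:<30} {remaining:.0%} of meaning remains")
    return remaining

shared_meaning(STAGES)  # ends at roughly 39%

Even with every stage 75–95% faithful, barely two-fifths of the original signal survives the loop: the thinning arrows of Fig. 4.3 expressed numerically.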

4.2.2 Time and Decision-Making

Time may seem like an esoteric subject to cover for a book about data. But time can make all the difference in terms of how useful a piece of data is. The framework we propose here will help us to distinguish between ‘right time’ data and ‘real time’ or (near) real-time data.

Figure 4.4 is an output from our research in the field of microeconomics, and it explores the concept of latency in a transaction.4 It is based upon light cones found in physics; these describe the path of light emanating out into spacetime from a single flash. We have adopted the light cone as a metaphor to represent the data and information required to deploy knowledge and wisdom in a timely manner. In this context, we are only concerned with the transactions of data.

Fig. 4.4 Transactions and latency wave theory

Let’s break Fig. 4.4 into three main zones. There is the first zone: yesterday. Yesterday always travels continuously behind the present. Then there is the second zone: now (or the present moment). Now is the absolute value of now, where time equals ‘absolute zero’; to observe the world in the ‘real now’ would require zero friction in the tools we use. Finally, we have the third zone: tomorrow, the future, the space ahead of the now for which we are yet to collect data. The data of yesterday converges into the now whilst tomorrow is always emerging.5

4 That said, empire building is hardly unknown.
5 For example, the DAMA DMBOK for data management.

The two cones serve an important illustrative purpose. Imagine the amount of data you would need to calculate where you had been three seconds ago. It’s less data than one would require to work out where you were four years ago! The area of the cone for any given moment in the past is representative of all the data required for



you to understand what was happening at that moment. As the past recedes behind us, uncertainty increases. The forward-facing cone is the same concept but heading into the future. The amount of data you need to sense where you will be in three seconds is less than the amount you would need to attempt to predict where you might be in four years’ time.6 Again, as we project further from the present moment, uncertainty increases.

In both directions the areas within the cones represent the amount of data potentially available to you. The extent to which we can access that data (e.g., its visibility) depends on how quickly we can collect the data, transform it, make it useful, and act upon it. In the forward-facing cone, we need to think of data as possibilities.

6 Extended footnote, because it is fascinating: The Pyramid of Giza dates to around 2600 BC. Imagining these enormous works recalls thousands of labourers dragging limestone and granite blocks weighing as much as six tons up ramps to form the structure. But there was also an enormous number of people involved in the administration of the project. The ‘Diary of Merer’ papyrus contains meticulous notes of the movement and work of a site supervisor working on the operation that supplied the pyramids with their stones, and brings to light only a tiny part of the admin associated with constructing this enormous structure. In 1999 Civil Engineering published an inspired article by Craig Smith and others entitled ‘Program Management B.C.’ The authors of this article take their years of experience working on modern megaprojects and use it to reverse engineer the ancient works. By surmising a work breakdown structure (WBS) for the construction of the Great Pyramid, they then “developed a logic diagram to illustrate the construction sequence that the team found most plausible,” used estimators to “produce labour estimates for each of the tasks identified,” and finally arrived at “a critical path construction schedule” (Civil Engineering, 1999). One does not build a 147 m high Pyramid composed of over two million blocks with ornate tunnels and chambers inside merely by choosing a spot and piling rocks on top of each other. The Pyramid needed a perfectly even base, precise orientation of points, and a seamless external facade. There would be substantial logistical and safety concerns associated with moving labourers and material at height on temporary works without creating bottlenecks. Moreover, the sequencing of the works, including the quarrying and transportation of stone on a tidal river, as well as the timing and positioning of ancillary works, would have required substantial planning prior to execution. As the article concludes, “We must... stand in awe of their program management techniques, as it is equally certain that they had highly developed administrative and planning skills. The complexity and logistical requirements of this project are simply extraordinary.”


The past is based upon collected data; the future is a probabilistic model of simulated data. Or as Phoebe Bridgers sings, “the future’s unwritten, the past is a corridor” (Bridgers, 2020).

This is where the looping arch joins the story. Any transaction has a cost; in our context of data, the transaction cost is the latency caused by collecting, storing, computing, and utilising the data. None of these processes are instantaneous. This means that what we call relative t = 0 lags your absolute t = 0. The time it takes to overcome the friction is how much the former lags the latter. The yellow arch represents this lag along the axis of time.

A splendid example is tax returns. HMRC collects taxes in an annual cycle. This cycle, and the allowance of an entire year to report, is due to the historical cost of its transactions. It takes effort for individuals to collate their manual and analogue accounts, convert them into a report, and submit that to HMRC for processing. The government’s t = 0 is around 365 days behind the monetary transactions taking place right now. Today, people who pay their taxes monthly in the UK via the Pay as You Earn (PAYE) scheme report on a much faster cadence: once a month, providing a more frequent snapshot of tax revenues for the government. This has become possible due to the digitalisation of the tax system and a unification of data standards.

This is a vital concept for anyone trying to substantiate a business case for investment into any IT or data management initiative. You need to know how laggy your process outputs are and how much data you need to compute to obtain a snapshot of what reality is at a given moment. How close you want your relative t = 0 to sit to absolute t = 0 will be a function of two factors. The first is the risk and complexity of the task you’re trying to undertake. The second is the quantity of resources, including budget, you have at your disposal.

Let’s take two example scenarios: the facilities management of a school and the operation of a fighter jet. School buildings are rarely lavishly funded and can rarely afford a system that can measure the responsiveness of their building physics (temperature, moisture, etc.) in thousandths of a second or fractions of a degree centigrade. Why would they need to do that? The environment of most schools isn’t that chaotic (at least not the physical environment), and a shift of one degree in half a second is not going to put anyone or anything at risk. They could afford a t = 0 that is about 10 to 30 min from absolute zero. All that the facilities manager would be interested in is their building’s response to weather or any type of emergency like an evacuation or a gas leak. However, a fighter jet has the resources of a nation behind it and requires a fast and complex nexus of systems to measure its environment. Fighter jets are expensive assets and training a pilot is an expensive endeavour. The telemetry systems of a jet must work quickly to respond to its rapidly changing environment. Its t = 0 delta to absolute zero has to be as small as possible.

When working with data we frequently make trade-offs between fidelity, accuracy, timeliness, and cost. The greater the detail we demand, the higher the cost. Updating your data warehouse overnight might cost a fraction of setting up near real-time updates. For some users, such as high-speed traders, it may be worth paying the


premium; for others it won’t be. We need to understand the specific informational needs of our organisation, now and in the future, and calibrate our investment in data accordingly. Even if you have infinite resources, that doesn’t give you the licence to measure with impunity. We also have a moral obligation as data management professionals to consider the footprint of our transactions on our environment: the embodied carbon of the telemetry systems and the operating carbon emissions of the data storage and compute.
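The school-versus-jet comparison can be sketched as a rough heuristic. The code below is a minimal illustration rather than a real methodology: the risk scores, budgets, baseline, and scaling factors are all invented assumptions, but it shows how risk pushes the tolerable gap between relative and absolute t = 0 down, while resources determine how far down you can actually afford to go.

# A rough heuristic for an acceptable data latency budget, trading off
# risk against available resources. All figures here are illustrative
# assumptions for the two scenarios discussed above.

from dataclasses import dataclass

@dataclass
class Scenario:
    name: str
    risk: float           # 0.0 (benign) to 1.0 (safety-critical)
    annual_budget: float  # funds available for telemetry and compute

def acceptable_latency_seconds(case: Scenario) -> float:
    """Higher risk shrinks the latency budget; bigger budgets buy it down."""
    baseline = 30 * 60  # 30 minutes, per the school example above
    risk_factor = max(0.001, 1.0 - case.risk)
    affordability = min(1.0, 1_000_000 / max(case.annual_budget, 1.0))
    return baseline * risk_factor * affordability

school = Scenario("school facilities management", risk=0.2, annual_budget=20_000)
jet = Scenario("fighter jet telemetry", risk=0.99, annual_budget=500_000_000)

for case in (school, jet):
    print(f"{case.name}: ~{acceptable_latency_seconds(case):.2f}s latency budget")
# -> roughly 24 minutes for the school, tens of milliseconds for the jet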

4.2.3 Case Study: Data ‘Friction’ and the Music Industry

An important dimension of our philosophy is the economic dynamics at play as our sector’s data creeps deeper into the built environment’s business models and procurement requirements. We believe that the management systems required to deliver robust and useful data will reconfigure the friction (resource consumption) of our business transactions. We say reconfigure because when we observe how technology has reduced friction in other sectors of the economy, we also see that new types of friction emerge.

The existence of transaction friction is the basis of why business models emerge. If transaction friction were zero, we would all be plucking our every need from the immediate environment around us. Since we can’t, we have a reason to pay others for goods and services, including the organisations that attend to our built environment.

An example of this in action is the music industry, where the distribution of music has lost all of its friction. However, there is new friction where there was none: access to the music. Today we can access any music from anywhere for a fee. The concept of the physical ownership of music is now outdated (setting aside the nostalgia of the ‘vinyl revival’). The data systems developed by the likes of Napster, iTunes and Spotify did not happen without design. Their business models were thought out and implemented in concert with the technical solution (except for Napster, where the technology clearly came first!). Moreover, the ones in that list that focused on customers and users have been the most successful. In these data-driven business models, customers, users and products can be quite different things.

The distinction between ‘customers’ and ‘users’ and ‘stakeholders’ is also a common source of consternation in the built environment, as many ‘customers’ benefit indirectly from public infrastructure and may not even be aware of having benefited. The nature of infrastructure is that we only notice our reliance on it when it fails. This is as true for energy and sanitation as it is for cloud services and cellular communications, and as it will be for AI and (maybe) blockchain.

What the example of the music industry demonstrates to us is that contemporary data technologies can radically transform our relationship to a particular type of


consumption or experience. As Neil Postman wrote back in 1992, “Technological change is not additive; it is ecological. A new technology does not merely add something; it changes everything.” It’s not the technology that is used, it’s how it is used; and, back to our earlier biological analogy, a mere change of one molecule of our DNA can have a profound impact on the organism it’s trying to build.

Sometimes it’s hard to envision this level of technological change ever happening in the built environment. Moving from consuming music as vinyl, cassettes, or CDs to consuming music as streaming data is a big change, but you are taking something that was always relatively immaterial (recorded music), and converting it from having a small physical footprint (e.g., bits of plastic) to an even smaller physical footprint (e.g., data streamed from a distant data centre via a device). The built environment, as the name suggests, will never digitise to the same extent. There are physical barriers: we simply can’t fuel ourselves on digital water and food or heat our homes with digital energy. Moreover, as the example of musical mediums shows, people become emotionally attached to the physicality of their surroundings. In the same way that vinyl records are seeing a resurgence during our digital era (Robinson, 2022), so many attempts to improve the ‘efficiency’ of living spaces attract residents’ ire for destroying the character or social capital of those spaces (Gayle, 2021).

We must address a critical point here: the discussion of access to music as a service (like Spotify) is a moot point for our sector. The neat story arc from Napster in 1999 to iTunes in 2001 neglects a key factor. A digital distribution platform for selling digital music is useless if you cannot digitally record music. We must head back to 1979 to witness the first digital recording of an album. This is an important consideration for our sector. We must remember we are in the digital recording of music phase of our sector’s transformation, not the digital distribution of music phase. This is not something we should beat ourselves up about. Ironically, we have the opportunity to be like Apple. Apple is a notoriously successful second mover. They were not the first movers for mp3 players, mobile phones, or digital watches. But they did nail them!

We don’t want to labour the point here too much, but there is a cultural aspect to consider too. The advent of digital music recording and production has profoundly changed the culture of making music. Digital music technology made recording and producing music cheaper and faster. It democratised access and enabled megastars to emerge from laptops in their bedrooms. However, there is a hidden cultural cost here, and it resonates with our own digital transformation journey. Digital music production has massively lowered the barriers to entry for musicians, opened new means of creative expression, and has made it easier to fix shortcomings through innovations like autotune and beat correction. If we are not careful, we will be similarly forgiving of sloppy engineering of our natural and built environments. As we begin to fathom the task of protecting the planet, we cannot afford a grand experiment in which lowering technological barriers accidentally accelerates negative impacts on the sustainability of our outcomes. We must enable talented and skilled people with technology to maximise our chances of protecting nature and aligning our built environment harmoniously with our ecology. In doing so, we should take advantage of knowledge gained in other


sectors that have moved to digitise faster than we have. The opportunity for digitising the built environment will be one of attempting to reduce friction: within organisations, between organisations, and between the built and natural environments. In practice this means increasing the ‘availability’ of infrastructure (e.g., minimising downtime by minimising or eliminating the need for maintenance), and reducing friction (e.g., by allowing the consumption of infrastructure to be as seamless as possible). Examples of this in practice include technologies such as Oyster Cards, Smart Meters, and equipment like SCADA and Building Management systems that in effect reduce the friction between the physical and information domains through automation.

In linking the physical and information domains through technologies such as IoT we are expanding our ‘network of networks.’ We are moving from a built environment composed of disparate assets in the physical world, towards an interconnected and resilient system of systems. This was the fundamental concept behind the ‘National Digital Twin Programme’, for example. Metcalfe’s Law (more commonly referred to as ‘the network effect’) states that “the value of a network goes up as the square of the number of users” (Fisk, 2020). Infrastructure and the built environment have always had a huge user base; pretty much 100% of people consume some sort of infrastructure. But it has not in the past functioned as a network. At best it has been a collection of isolated networks (the railway network, for example). The digitisation of the built environment is the networking of the built environment; it is what will drive its value in the coming century, as we come under ever greater pressure to reduce the financial and environmental implications of providing services to people.
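A back-of-the-envelope sketch shows why that networking matters. In the snippet below, the value-per-connection constant and the user counts are arbitrary assumptions, but the ratio illustrates Metcalfe’s point about isolated versus interconnected networks.

# Metcalfe's Law: network value scales with the square of the user count.
# k is an arbitrary value-per-connection constant, for illustration only.

def network_value(users: int, k: float = 1.0) -> float:
    return k * users ** 2

# Ten isolated networks of 1,000 users each, versus one interconnected
# network of 10,000 users: the single network is ten times more valuable.
isolated = 10 * network_value(1_000)    # 10 * 1,000^2 = 10,000,000
interconnected = network_value(10_000)  # 10,000^2     = 100,000,000
print(interconnected / isolated)        # -> 10.0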

4.3 Stakeholder Engagement and Communities of Practice

Data is a moveable feast. One never truly finishes a data strategy or digital transformation; you just get it to a point where it is broadly on point and set a reminder to revisit it in twelve months. Likewise, you can’t just deploy software and walk away from it. The pursuit of data is one of continuous improvement and adapting to meet the new and novel needs of our users. To work in a complex multi-disciplinary sector like the built environment is to be confronted with a wealth of knowledge, experience, and perspective, one that is frequently refreshed with fresh faces, changing organisations, and new roles. We cannot ignore this rich heritage of professional bodies and other tribes; these are the networks that link individual specialists into communities that span and supersede organisational boundaries.

Stakeholder engagement is a broad title for the group of activities that we undertake to ensure that our data practice meets actual user and organisational needs. Stakeholder engagement is a constant activity, not a box for ticking. As you engage formally and informally across your organisation, you are starting a two-way process where you learn from them, and they learn from you. Early meetings will focus more on understanding each other’s roles and talking about talking about data. Only after


you establish shared meaning will you be able to cover more complex topics like data strategy, roadmaps, or the details of projects. There will often be suspicion early on, particularly if there is established practice in the business that doesn’t correspond to your roadmap (or your organisation’s enterprise architecture). You may not be starting with a blank canvas. Many organisations have been around long enough to have seen data strategies and other digital investments come and go with much fanfare and little tangible benefit.

As a data professional, you will need to take a proactive and incremental approach to engaging with your stakeholders, such as colleagues, peers, customers, and suppliers. The key is to build credibility; too much vision and you might be perceived as detached from your organisation’s real problems, too little vision and you might come across as dull and administrative. You should seek to ensure that you are always providing a suitable means for your stakeholders to give feedback, including a point of escalation or a neutral party so that they can share issues that they may not be comfortable picking up with you directly.

You need to ensure that you are not moving too fast for your stakeholders, starting too big, or leaving them behind with technical jargon. Too often, scoping out initial work with stakeholders starts with a blank canvas (an empty Miro board, for example), which can be overwhelming to people. It can be hard to know where to begin without prompts. Instead, consider starting your engagement with a ‘straw person’ that introduces some simple use cases related to the stakeholders’ day-to-day work and encourages them to explore and extrapolate from there. This might include the stakeholders telling you that you have misinterpreted their role or their needs. If your project/organisation has a business engagement or business relationship management function or similar, then make effective use of them to inform your preparations so that you understand your stakeholders’ role, purpose, motivations, and pain points going into the meeting.

There is an art to dividing and conquering your stakeholder population. And often it is worth plugging into existing comms and engagement functions whilst at the same time working your informal connections. One approach is to expand the scope of stakeholder engagement top-down and inside-out, namely:

1. Begin with your internal IT function to ensure that you agree on fundamental principles before engaging the business.
2. Introduce the leadership and key stakeholders to your big-ticket items such as your strategy, roadmap, or a key project, and ask them to guide and prioritise further engagement with the relevant teams.
3. Where possible, ask those stakeholders for some level of sponsorship of their parts of your work.
4. Work down through the organisation seeking input to an evolving draft of your data strategy or roadmap.
5. Identify champions and communities of practice and keep them enthused and feeling special. Similarly, try to weed out the Luddites who are just going to waste your precious time.


6. Loop back to stages 1–5 as your work develops to reconfirm your assumptions and conclusions. Stay alert to feedback and lessons learned.

Some inclusive messages one might want to land during stakeholder engagement include:

• You are creating a data strategy/vision/roadmap for the entire project/organisation, not simply for the IT department.
• This means that staff from across the project/organisation should feel included and involved in ‘co-creating’ a strategy and developing use cases.
• Contributing to the data strategy or roadmap shouldn’t require staff to have a deep appreciation of data or technology. Indeed, where possible you should seek to ensure that the conversation around data use cases remains technology/software agnostic and avoid jargon wherever possible.
• Instead, you will focus on the decisions that your staff make in their day-to-day work, and how better access to information and insight could help enable those decisions.

Parts of your organisation will have datasets and data types that are key to their work. Understanding the data landscape of your organisation—potentially as part of your approach to Data Governance—is important to ensuring that you are talking to stakeholder groups about their data pain points. As you expand your engagement, you will increasingly need to speak to people who may not think of themselves as users of data. This is where reframing the question of ‘data’ as being about ‘decision making’ and not technology becomes vitally important. Everyone in your organisation will use one data source or another in the course of their work; it’s hard not to! Usually, the staff members performing the day-to-day functions have the clearest idea of how to improve processes, user interfaces, data quality and all the rest, albeit they may not be able to express it in IT terms. They might even have their ears closer to the ground in terms of emerging products and techniques in their field.

A crucial output of your stakeholder engagement should be to capture these use cases from the business. Your data practice can then seek to answer how to deliver those use cases in a manner that makes use of a core set of ‘patterns’. In most cases, use cases should help to answer one or more of a standard set of data capability questions (based upon a set of questions originally formulated by Davin Crowley-Sweet), such as:

1. What data do staff need to access, and does the organisation hold it?
2. Is the data that the organisation holds assured to a level of quality and granularity that is sufficient for decision making? Do I have means of assuring data quality?
3. Can staff find and make use of the data? Is there a single source of truth? Is the data stored and presented in a way that staff find legible and meaningful?
4. Are staff able to use data to inform their decision-making? Does the organisation derive value from it? Can staff generate insight from the data?
5. Is the organisation’s data secure from loss or breach? And is it controlled and governed?
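One lightweight way to keep captured use cases tied to these five questions is to record them in a consistent structure, as in the sketch below. The field names and the worked example are illustrative assumptions rather than a prescribed schema.

# A minimal sketch of a use-case record mapped to the five capability
# questions above. Field names and the example are illustrative, not
# a prescribed schema.

from dataclasses import dataclass, field

CAPABILITY_QUESTIONS = {
    1: "Do we hold the data staff need?",
    2: "Is the data assured to sufficient quality and granularity?",
    3: "Can staff find and make sense of the data?",
    4: "Can staff derive insight and value from the data?",
    5: "Is the data secure, controlled, and governed?",
}

@dataclass
class UseCase:
    title: str
    stakeholder: str
    decision_supported: str  # the day-to-day decision this use case enables
    questions_addressed: list = field(default_factory=list)

    def summary(self) -> str:
        asks = "; ".join(CAPABILITY_QUESTIONS[q] for q in self.questions_addressed)
        return f"{self.title} ({self.stakeholder}): {asks}"

# A hypothetical use case captured in a workshop with a cost estimator.
example = UseCase(
    title="Rough-order-of-magnitude cost estimates",
    stakeholder="Cost estimating",
    decision_supported="Early-stage budget setting",
    questions_addressed=[1, 2, 3],
)
print(example.summary())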


Stakeholders will engage with defining data capabilities in their own idiosyncratic ways. Some will see an opportunity to improve their ways of working and volunteer to be involved in the details of crafting your approach (at least for their favoured area), while others may join a single workshop and feel like they have said all that they can. You should look to group your stakeholders into different cohorts that warrant different levels of ongoing engagement:

• Workshop participant: someone who is engaged in facilitated workshops, and may suggest or contribute to a use case, but who is not involved with your data strategy or roadmap on an ongoing basis.
• Working group member or community member: a champion, someone who is willing and able to represent their part of the business and ensure that your data strategy or roadmap meets their needs.
• Sponsor: a senior member of staff who can support the objectives of your data strategy and encourage staff members to participate.
• Informed but not engaged: the more uninterested or recalcitrant user, who should be kept in the loop, but is not willing or able to contribute constructively to defining your work.

Good data professionals know how to meet people where they are and will tailor their engagement to meet colleagues at the appropriate level. Whilst you may focus much of the detailed engagement effort on your working group and your advocates, you should always ensure that you loop back to the wider engagement so that your strategy, roadmap, and projects continue to represent the needs of the whole of the wider organisation.

To succeed, the method of delivering organisational change through data must align with functions within and beyond IT, including:

• Enterprise Architecture: your method must align and agree with the wider design of your organisation’s digital infrastructure. That is, if your organisation has such designs. If there is no formalised architecture, then the data function might take it upon themselves to originate some, or failing that, to work with any implicit design parameters or consensus within the organisation or IT department.
• Business Relationship Management or Business Engagement: you want the business to perceive your data practice as fulfilling their needs and priorities, not just those of the IT function. You must also ensure that the business understand the need for your proposed investments, and where possible are able to act as advocates and sponsors.
• Finance: your data strategy must propose investments that deliver business benefits and meaningful return on investment.
• Commercial and Procurement, Legal: you need to translate your data strategy into expectations for your supply chain, both those delivering digital capability on your behalf, and those involved in delivering services to the wider organisation/project. It must be deliverable in a manner that makes best use of the funding available, and is commercially controlled.


• Subject matter experts (designers, asset managers, operators, project controls, sustainability, etc.): it must enable the many talented specialists in your organisation/project to undertake their work effectively.

Teams and communities are where change happens in large organisations and where we can begin to influence organisational culture. Knowledge work necessitates the dynamic creation and dissolution of teams of people to realise specific purposes, often in a manner that runs counter to traditional notions of governance and hierarchy. As we work through our stakeholder environment, we should be looking for opportunities to translate groups of supporters into communities of practice. Nurturing communities of practice can be an opportunity to sow the seeds of positive data culture within a subset of your organisation. To quote Etienne and Beverly Wenger-Trayner (via a GDS blog), communities of practice are “groups of people who share a concern or a passion for something they do and learn how to do it better as they interact regularly” (Busuttil, 2016). If we create new communities then it is advisable to base these around existing business problems with a data edge, for example, working groups concerning BIM, asset management, information management, or data sharing.

As covered earlier in this book, data is everywhere in the built environment sector, and it certainly isn’t restricted to the Information Technology (IT) department. Across your business you will find a range of users, and you must meet them where they are. Some of your colleagues will know far more about some types of data and technology than you do, and you need to learn to use rather than patronise their expertise. Other users will be brimming with enthusiasm, if not expertise, and these future data champions will be keen to start working on real-world problems. You must learn to foster that enthusiasm, even when progress proves slower and more difficult than you would like. Many users won’t really understand what this data thing is all about; they may even be actively cynical, or at least confident that it doesn’t apply to them. Not everyone needs to be a data evangelist; you need to find out what matters to your colleagues and how data can help, not arrive with solutions looking for problems.

Underpinning an effective approach to stakeholders and communities of practice is a focus on creating shared meaning, using words (and more importantly visuals) that are accessible and meaningful to all, and creating a vision for data owned by the entire organisation rather than isolated in a particular function.

4.4 Writing a Data Strategy

Your data strategy is where you set out the vision for unleashing the power of data for your business. Be ambitious. (Allen & Yates, 2021)

It is increasingly common for organisations to have a dedicated data strategy, and your role might require you to write or contribute to one. A data strategy


is a good opportunity for you to document your data vision and data principles. The purpose of a data strategy is to set out the underlying narrative of your project or organisation’s use of data. This should set out why you need to effectively manage data for your project or organisation to succeed and resolve ambiguities in terms of what you are investing in, when you are making these investments, how you expect the investments to generate value, and who is responsible for realising that value. It is easy for a data strategy to exist as if removed from the real world. A strategy should not breathlessly describe the possibilities of data without substantiating how to realise those possibilities. Asserting the possibilities of data rarely creates value. Nor are the uses of data that attract the most attention necessarily the uses that make a difference. The most valuable uses of data are often quite prosaic, sometimes users may not regard them as data-related at all (for example, reviewing a report on safety incidents on a site). Data is not an end to itself, it is valuable as a catalyst for conversation, truth telling, record-keeping, accountability, continuous improvement, and decision-making. Your data strategy should refrain from hype and bloviation and should seek to illustrate aspirations using examples that will be tangible, relevant, and compelling to your audience. A data strategy should not mistake technology for change. The purpose of technology is what it does. Any technology rollout that fails to deliver adoption, use, or understanding creates no value. From the perspective of a data strategy, technology (and more broadly ‘digital’) is a means of making data accessible and meaningful to users. Technology alone cannot improve the quality of the data that you hold, nor cannot it force people to use data that they do not wish or know how to use. A strategy should not treat data as a monolithic or homogenous thing. Part of the beauty of data in the built environment is the incredible variety of data sources and data types required to deliver the finished product. One must remember that data is plural and takes different forms, particularly in the built environment where so much information exists in documents, images, sketches, plans and drawings. The approaches which render alphanumeric information useful to users will not always work for documents or models. It’s only by uniting each form of data that you can support the asset life cycle and project life cycle in their entirety. A built environment data strategy should describe a workable approach to deriving value from alphanumeric data, documents, models, spatial, and indeed data types that do not fit comfortably in any one of those categories. Above all your data strategy must be meaningful to people who are not data professionals. Preparing an effective data strategy cannot occur in isolation from the business, but rather through extensive consultation within the business, your stakeholders and suppliers. You should write in plain language. Whilst it will never inspire everyone, your strategy should allow your colleagues to quickly understand how data can help them help themselves to accomplish the things that they care about accomplishing. Data is only one of the specialisms required to enable your organisation’s success. 
You should ensure that your data strategy reflects and references other relevant strategies that your organisation has created, such as Digital, Cyber, Information Governance, Quality, BIM, Asset Management, AI, Ethics or Sustainability. It is by acting as a focal point for data, and connecting into this wider ecosystem of strategies, that a data strategy can influence the aspirations of an organisation and realise value.

Not all built environment organisations have a data or digital strategy, and even where such strategies exist, they are not reliably in the public domain. A couple of good examples that are publicly available come from the Open Data Institute (ODI) and UK Power Networks (UKPN). The ODI Data Strategy is notable for its brevity, functioning as a short and effective primer on the data that the organisation holds, its principles for using data, and the value it intends to realise from its data (Open Data Institute, 2021). UKPN's Digitalisation Strategy and Action Plan (UKPN, 2023) stands out for its clarity of thought, its intuitive information architecture, and its careful use of graphics and diagrams. Rather than publish the strategy as a single monolithic document, UKPN's Head of Enterprise Data Management, Matt Webb, chose to publish it as a public-facing 'micro-site'. This allows UKPN to provide updates as the strategy evolves, creating a living document that can grow and change with the organisation and the circumstances it faces. These examples demonstrate that effective data strategies can take many forms, and that it is worth considering the means of communication, not just the content.

4.5 Your Data Vision Statement

Before we get stuck into how to arrive at a suitable vision statement: a caveat. At this point in the twenty-first century the terminology of digital and data has sufficiently saturated the zeitgeist that many people who are far from data specialists can competently parrot the tech-utopian, knock-off Silicon Valley drivel that so often passes for vision today. One example is AOL's self-anointed Digital Prophet David 'Shingy' Shing, and his Bob & David parody Shangy, who utters lines like "Nature, renewal, recycle, it's a life-cycle man… digital!" The point is, a great vision statement only gets you so far; it's what you put into practice that matters. There are plenty of board members and executives out there who talk a good game on the importance of Digital Transformation whilst slashing data budgets behind the scenes. So, expectations set suitably low, let's talk about why it's still worth thinking about a vision statement.

A data vision statement should capture the difference that you hope your organisation will realise through its use of data. It is the 'why' of your data strategy. It's what gets you out of bed in the morning, it's the moment of inspiration whilst you are brushing your teeth, it's where you start in your 'elevator pitch', it's what you awkwardly try to explain to strangers at social gatherings. There's no point having a vision that only makes sense to experts, or only appeals to a small group of people. Let the nerds argue about architecture; this is your opportunity to answer the "what's in it for me?" question for your organisation. You need to challenge yourself to understand the needs of users across your organisation. If you choose to use terms like 'informed client' or 'data savvy organisation' or 'future-proof strategy', you need to be prepared to articulate what these ideas mean in practice, and why anyone should care.

The purpose of a data vision statement is to provide a quick, memorable summary of what you are looking to accomplish for your project or organisation through the implementation of your data strategy, roadmap, and investments. A data vision statement need not be categorical or complete; it is a conversation starter, not a conversation ender, a statement of intent rather than a methodology detailing how to realise that intent. For your vision statement to be useful, you should look to meet some criteria, such as:

• Brevity: We're talking one sentence, rather than a paragraph.
• Alignment: Align your data vision to the overall objectives of your organisation or project. You don't want to be pulling in a different direction from the rest of your colleagues.
• Relevance: It sounds obvious, but the vision statement for data needs to relate to data, in other words an outcome that data can help accomplish. This is where your data vision statement might be an expansion of your organisational vision statement. For example, if your organisation's vision is something like creating great places for people to live, your data vision statement might elaborate on how data contributes to that outcome.
• Specificity: You need to avoid platitudes, idioms, and fuzzy language. Digital!
• Inspiration: Choose words that are accessible and meaningful to everyone working in your organisation. Avoid jargon. You want words which are exciting to as broad a group of people as possible. It should be something worth getting out of bed for.
• Ambition: It's easy to become blinded by the details of the everyday. Whilst your goal for this week might be to get your business case approved, or to get a query to run 10% faster, your vision should step away from these technicalities. Ask yourself: why is it important that we succeed?
• Realism: That said, you're also not going to solve world hunger, and you don't want to promise the organisation something that data cannot deliver.

It is hard to write a vision statement that meets all these criteria, and there are clearly trade-offs, for example between brevity and specificity, or ambition and realism. Some examples of data vision statements include:

• "Unlocking the value of data across the economy." This statement, one of five data 'missions' set out in the UK Government's National Data Strategy, scores high in terms of ambition and brevity, but perhaps lacks specificity (Department for Digital, Culture, Media & Sport, 2020).
• "To use data standards to improve public services for the public through stronger policies, analysis, and insights" (Central Digital and Data Office, 2021). From the Data Standards Authority's strategy, there is some specificity here, and it certainly feels relevant to the authority's work. The sentence structure is slightly clunky, particularly the bit about 'public services for the public', but overall it feels fitting and realistic.
• "Our vision is to realise the full value of information by empowering our people, connecting ourselves and building trust with our suppliers, stakeholders, and customers. By providing information that can be trusted and valued by all, we can fulfil our purpose of connecting the country through better journeys." This vision statement from Highways England back in 2019 has a lot going for it. It aligns to the overall corporate strategy, it enumerates specific areas of attention, and it is both ambitious and realistic. It is perhaps slightly too long, but again that is the trade-off with specificity.

A good place to start when you are defining your vision is to understand what your project or organisation already publicly declares about itself, for example in annual reports or strategic plans. What are your organisation's published objectives and values? What commitments has it made externally? What does it actively sell about itself? Whilst it is tempting to try to cover a wide range of outcomes in your data vision statement, there is a risk that this dilutes your message. Instead, you need to abstract to the unique capability that data provides your organisation's vision and objectives. The capabilities that a data strategy provides are usually consistent across the organisation; they just manifest themselves in ways that reflect the sub-cultures of individual teams, departments, and professions. The variation across an organisation doesn't come from the data capabilities themselves, but rather from how they are used. If you step into the lobby of the CIA building you will see the following biblical quotation:

And ye shall know the truth, and the truth shall make you free.

Return to the definitions from ISO 9000: data as "facts about an object", and information as "meaningful data". This implies that the value of data stems from its ability to provide decision-makers with the meaningful facts pertaining to your project or organisation. These facts can relate to the status of your assets (for example: how many? what material? where are they?), to the status of your work (for example: are you on schedule? are you working to budget?), or to the business processes that allow the latter to address the former (e.g. who is responsible for this mess? who signed it off in the first place?). As data professionals we seek to provide our colleagues with the agency that comes from having a common source of truth, a common base of facts. The use of data should help them to answer questions objectively and consistently. We can enable that objectivity and consistency both by ensuring the specification and quality of the data that we manage, and by providing reliable and intuitive means to interpret and make use of those facts.

The focus of built environment clients will always, rightly, be on the assets, their heritage, and their users. However important you may think data is, it can and should only ever be an enabler of the bigger goals of your organisation. Whilst you do not want your data vision statement to change frequently, it is ok for it to be a work in progress. It is possible for an organisation to outgrow a vision, and you should continue to solicit feedback on the statement as part of your user engagement. If it becomes apparent that the statement lacks saliency, or is overly technical, you can and should revise it accordingly. So don't be a Shangy; but also don't worry too much. If you consult with your team, trusted advisors, and closest stakeholders, and ultimately if you express what motivates and matters to you, then you are unlikely to be too wide of the mark.

Once you have coined your data vision statement, don't be afraid to use it. There is no shortage of bureaucracy and governance in the public sector and the built environment, so consequently there are plenty of opportunities to cite your data vision statement in business cases, project plans, policies, standards, and strategies. Above all else, keep it in the back of your mind: it is a rubric against which you can judge your actions, and the progress that you make.

4.6 Data Principles (and Ethics)

If the data vision statement acts as a north star for a data team, then data principles exist to ensure that we realise our vision in a manner that does not compromise our values (or, if we lack values, societal norms). Anyone who has paid even passing attention to the changing role of data in society over the last few decades will have noticed that the use of data is not without its downsides. There are certainly plenty of instances of organisations and governments using data unethically, particularly to constrain people's civil liberties. But just as significant a threat comes from the unintended consequences of data: the morally neutral or unwittingly problematic use cases. The road to realising value from data has become littered with good intentions gone rotten. Your data principles should act as a code of ethics for your organisation, and as a sense check if what you are doing ever feels dubious.

For a while it was possible for the built environment to sit out the data ethics debate. Infrastructure was regarded as morally neutral, as a public service. This was only ever the privileged view of people unaffected by the downsides of infrastructure. There is, for example, a long history of road construction having disproportionate negative impacts on minority communities (Karas, 2015), to which most middle-class suburbanites would have been blind. Now, though, like much of modern life, the built environment and infrastructure are both highly politicised. In the UK the debate on active travel and low traffic neighbourhoods (LTNs), for example, has seen both sides frequently resort to data and statistics to reinforce their points. To their proponents, LTNs are a remedy for noise and pollution, and support increasingly popular active travel options. To their opponents, LTNs benefit a privileged few whilst increasing the journey times of most drivers. Both sides contend that the silent majority supports their position. It's easy to dismiss such debates as petty local controversies, but the use of data in these emotive and political debates increases the moral imperative to share data openly, and to try to share independent ethical interpretations of that data.


The question of the ethical use of data often centres on the exploitation of data by AI. Dr. David Leslie of the Alan Turing Institute has written particularly well on this topic, grounding a framework for ethical AI on the principles of bioethics: "respecting the autonomy of the individual, protecting people from harm, looking after the well-being of others, and treating all individuals equitably and justly" (Leslie, 2019). Whilst AI ethics is an extremely interesting topic, and will have some relevance, the data risks faced by those working on the built environment are often more humdrum. There are several scenarios where your use of data could positively or negatively impact your organisation or project, your stakeholders, and the public. The following paragraphs describe a few such scenarios, distinguishing between the positive statement (opportunity) and the negative statement (risk).

Opportunity: Acting as a responsible custodian of sensitive data: keeping it secure, using it ethically, and disposing of it at the right time.

Risk: Holding more sensitive data than you need, or holding it for longer than is necessary.

There is no foolproof IT system, nor any way of storing data that eliminates all risks of breaches or leaks.8 Even applying the strongest encryption standards that we have today, it is entirely possible that a quantum computer of sufficient power will be able to quickly decrypt that data. Often, the need to store sensitive data is unavoidable; it is a side effect of storing useful data. Sensitive data includes both personally identifiable information (for example, customer identifiers, and information held on staff and suppliers such as payroll and health and wellbeing records) and data that in the wrong hands could be used to negatively impact your organisation or your stakeholders (for example, commercially sensitive information such as costs, rates, and contracts, intellectual property, or physical security provisions). Although holding sensitive data is often unavoidable, every additional data point held adds to the potential harm of data loss, and so increases the risk profile. Consequently, one should only collect and store additional sensitive data where there is a justifiable need. Moreover, one shouldn't store that data for longer than is strictly necessary, and one must have defined processes in place for handing back, archiving, or deleting this data when it is no longer required. Your retention policy must be accessible and enable staff to easily understand how long your organisation should store data (particularly sensitive data), and the appropriate storage method. Effective maintenance of an up-to-date data catalogue, and the rollout of a data governance framework, are crucial to ensuring that your organisation manages its complete data landscape, and that accountability is clear.

8 At least not one which also renders the information contained recoverable.

Supply chains in the built environment sector are wide and deep, with external staff often far outnumbering those working directly for clients. This means that whatever provisions you establish in your organisation for managing sensitive data, you will need to extend these same provisions to your supply chain when they handle data on your behalf. Clearly documented policies, and a diligent supplier assurance process, will ensure that you are not merely hiding data risk by pushing it down into your supply chain.
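To make the retention point concrete, here is a minimal sketch (in Python) of how a retention rule might be checked against a simple data catalogue. The catalogue structure, field names, and retention periods are illustrative assumptions, not a reference implementation.

    # A minimal sketch of a retention check against a simple in-memory
    # data catalogue. Field names and retention periods are assumptions.
    from datetime import date, timedelta

    catalogue = [
        {"name": "site_access_logs", "sensitive": True,
         "collected": date(2021, 3, 1), "retention_days": 365},
        {"name": "asset_survey_2023", "sensitive": False,
         "collected": date(2023, 6, 15), "retention_days": 3650},
    ]

    def overdue_for_disposal(record, today=None):
        """Return True if a record has exceeded its retention period."""
        today = today or date.today()
        expiry = record["collected"] + timedelta(days=record["retention_days"])
        return today > expiry

    for record in catalogue:
        if overdue_for_disposal(record):
            # In practice this would trigger a review, archive, or deletion
            # workflow rather than an automatic delete.
            print(f"Review for disposal: {record['name']}")

Even a check as simple as this makes a retention policy executable rather than aspirational: the rules live alongside the catalogue, and exceptions surface routinely rather than in an annual audit.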

Opportunity: Collecting data with the timeliness and accuracy required to learn from the past and operate safely and effectively.

Risk: Inefficient operations as a consequence of failing to collect or use data.

As the paragraphs above describe, needless collection of data (particularly sensitive data) is an unnecessary risk. However, the inverse is also true. Organisations that fail to collect operational data will be unable to learn from their past, or to identify changing circumstances and emerging problems. As an example, health and safety systems work on the assumption that one can learn from past incidents, particularly near misses, and use the insight that this provides to mitigate future incidents. Failing to collect and use data for this purpose is an error of omission, and not a particularly uncommon one. It is a failing that not only undermines the vitality of an organisation but could also result in legal jeopardy if a plaintiff can demonstrate that they have come to harm because of something that an organisation should have known.

Accidents, by their very nature, occur in the absence of foresight. In a Minority Report style world of perfect information (or anticipation) we could in theory avoid most built environment accidents. Hindsight is 20:20; foresight is not. Had, for example, Railtrack (and latterly Network Rail) maintained better information on the condition of its rails and applied its own policy more thoroughly, then maybe they could have remedied the faults that caused the crashes at Hatfield and Potters Bar in 2000 and 2002, respectively. In the past organisations did not have access to remote monitoring, and therefore relied on manual visual inspections of assets to assess their condition and risk. Even with the advent of real-time monitoring and IoT, it is not yet practical or affordable to monitor all assets in real time, or to respond to the enormous number of potential failures that would flag. However, organisations in the built environment have an obligation to demonstrate that they have kept the risk profile "As Low as Reasonably Practicable" (ALARP), to borrow a term from the UK's Health and Safety at Work Act 1974 (Health & Safety Executive, 1974). Moreover, the definition of what is reasonably practicable will change with what is technologically feasible, and so what constitutes acceptable practice today may appear wilfully negligent in the future.

Opportunity: Using resources in a manner that is respectful of your organisation's funds and the environment.

Risk: Failing to manage data in a cost-effective and sustainable manner.

Holding data doesn't just come with a risk profile; it comes with a cost and carbon profile too. Merely using the Cloud doesn't make data storage free or carbon neutral. There are a wide variety of ways in which you could spend more than is necessary in collecting and managing your data, and a range of failure modes that could increase the cost of your data handling. There are direct costs, such as procuring and configuring your storage systems in a way that uses resources at a higher price point than is necessary. There are the consequences of data breaches and data loss, both in terms of potential fines and the cost of collecting data a second time after a loss.

And there is the cost to the environment of unnecessarily consuming data storage and computation resources. Many organisations have issues with the duplication of documents, holding copies of the same information in multiple locations. This is a costly problem in terms of the person-hours consumed in searching for and confirming the latest versions of documents, as well as creating the potential for divergent sources of truth. The International Data Corporation (IDC) and McKinsey Global Institute estimate that searching for and gathering information accounts for 19% of knowledge workers' time, and that better use of collaborative technologies could free up 20 to 25% of their working time (McKinsey & Company, 2012). It is also worth considering that enterprise data systems tend to have multiple back-ups for the purposes of availability and disaster recovery, so a single one-megabyte file may have a long tail of many more megabytes as it is copied, backed up, and mirrored across multiple systems and servers.
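The humdrum end of this problem is surprisingly tractable. The sketch below, for example, finds byte-identical copies of files by hashing their contents; the folder path is hypothetical, and a real document estate would also need to handle near-duplicates and version histories.

    # A minimal sketch of duplicate-document detection by content hash.
    # The folder path is hypothetical; near-duplicates and versioning
    # are out of scope for this illustration.
    import hashlib
    from collections import defaultdict
    from pathlib import Path

    def find_duplicates(root):
        """Group files under `root` by the SHA-256 hash of their contents."""
        by_hash = defaultdict(list)
        for path in Path(root).rglob("*"):
            if path.is_file():
                digest = hashlib.sha256(path.read_bytes()).hexdigest()
                by_hash[digest].append(path)
        # Only hashes shared by more than one file indicate duplicates.
        return {h: paths for h, paths in by_hash.items() if len(paths) > 1}

    for digest, paths in find_duplicates("project_documents").items():
        print(f"{len(paths)} copies: {[str(p) for p in paths]}")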

There are also intangible losses arising from a failure to make use of your information effectively. It's obviously impossible to prove that an organisation has made optimal use of its information; there is always room for more analyses and insight. Nonetheless, failing to make data-driven decisions can and will result in a less cost-effective outcome for the organisation and the shareholder or taxpayer. There is an insidious tax on ignorance, from repeating work, to making poor use of people's time, to needlessly inconveniencing or harming people.

Opportunity: Upskilling and empowering your colleagues through inspiring data projects.

Risk: Causing undue impact upon staff within your organisation because of the changes associated with data projects.

Change is challenging. Even those of us who like to think that we embrace change can be blindsided by it from time to time. Change can be particularly confronting when it happens quickly and is the result of a trend or force with which we are unfamiliar. The rise in the importance of data to all our working lives is both rapid and unfamiliar to many people. One can imagine that there are many skilled and competent people out there who put off learning IT skills earlier in their career and feel like they can't catch up now. This may partly explain why data projects can engender fervent scepticism and even anger in users. Often investments in data are "done to" rather than "done with" the business, changing people's work overnight with scant training or support. A data project will not succeed if it leaves people disenfranchised or threatened. As data professionals we need to be careful in our communications to ensure that both the changes associated with a project, and its desired outcomes, are as broadly understandable (and, frankly, relatable) as possible.

Creating a list of data principles for your organisation is a good way to normalise what the ethical use of data looks like in your organisation. The scenarios described above are not a complete list of the consequences that we should consider when setting out our data principles. The data principles of any given organisation should reflect its unique combination of opportunities and risks, as well as the values and behaviours of that organisation. Data principles should be:


• Self-explanatory and clear to a non-technical audience.
• Normative, setting out positive aspirations and behaviours.
• Specific, without being prescriptive.

It is worth testing your principles against external examples, to avoid reinventing the wheel. This might mean cross-checking your data principles against those of similar organisations, or against the Centre for Digital Built Britain's Gemini Principles. An example set of principles might include statements such as those listed below. These statements have been informed not just by the Gemini Principles, but also by Highways England's excellent work in defining their data principles.

• Security: Data is a valuable resource that we will keep safe and secure from accidents and attacks.
• Cost-effectiveness: Managing data has a cost that we should understand and account for. Equally, we will value data as an asset.
• Safety: Our priority is to use data to help make the built environment a safer place for its occupants. We will remain mindful that using data can have unintended and negative consequences.
• Continuous improvement: Our use of data can always improve, and we will always look to augment our capability.
• Innovation: We will adapt our services to support emerging needs and technologies.
• Visibility: We will work in the open and share data wherever possible.
• Collaboration: We will work with our wider organisation to create ownership of our data strategy and priorities.
• User focus: The value of data is only realised when it's used to help people make decisions.
• Ownership: We are responsible for helping our colleagues to take responsibility for managing sensitive or valuable data.
• Accessibility: We will ensure that our data services are inclusive and easy to use.
• Data protection: We will conscientiously look after our customer and stakeholder data.
• Public good: We must actively seek and deliver against use cases that will provide measurable value to our organisation, our stakeholders, and the public.

Once you have drafted your data principles, don't be ashamed to share them, and to seek buy-in and feedback from your wider organisation. Principles are important to the extent that we value them over more passing concerns, particularly when making tough decisions. There are many ways to make easy money in the built environment by reducing value to society, building unsustainable edifices, and compromising privacy and sensitivity; data vision statements and data principles exist to provide data professionals and their organisations with something better to aspire to.


4.7 Data Capabilities

If our data vision and principles provide us with the why behind our organisations' use of data, then data capabilities begin to flesh out the how, or at least the categories of how. When trying to map out how your organisation will use data it is easy to end up in the weeds. There are a multitude of design philosophies and platforms that you can use to realise your data vision. We will get into the details of technological solutions later in this book, but first you need to be able to speak in abstracted, plain English terms about what you are trying to accomplish. Capabilities in this context need to meet three crucial criteria:

• They are expressed in plain English, so that you don't need to be a data specialist to get the gist.
• They are abstracted away from specific technologies.
• They are abstracted away from specific use cases.

So, a poorly expressed capability that violates these criteria would be: "our Comms department needs a Salesforce Customer Relationship Management (CRM) system to track complaints." To be clear, we have nothing against Comms teams, or Salesforce, or CRM systems, but this capability suffers from both jargon and a complete lack of abstraction.

Often it is easiest to think of data capabilities as the answers to data questions. For example, you might decide that you want to provide your project or organisation with the capability to confidently answer five 'Data Questions'. Note: these questions draw from best practice observed with the UK's National Highways data team.

1. Is my data secure?
2. Are my data needs defined? (Or, what data do I need?)
3. Is my data of sufficient quality? (Or, is it fit-for-purpose?)
4. Is my data available to users? (Or, how can I access it?)
5. Is my data delivering value? (Or, how do I realise value from it?)9

Across four 'Data Domains':

1. Alphanumeric.
2. Documents and records.
3. Models and spatial.
4. Integration and interoperability.

Combining these four high-level data questions with the four data domains provides us with a four-by-four matrix. We will call this matrix the plain of possibility (PoP); it provides a landscape for what a built environment organisation might seek to accomplish with data. The PoP is useful because every investment in data that you make can fit into one or more of the slots on that grid, allowing you to understand the range of capability that you could ultimately realise as a mature organisation (Fig. 4.5). Note: feel free to create your own matrix if you prefer.

9 Here, and in many other areas of this book, we have drawn upon experience gained working for Davin Crowley-Sweet (currently long-serving CDO of National Highways). Davin initially framed versions of questions 1–4, and these were so good that they have informed our thinking in this area; we have borrowed and built upon these questions with respect.

Fig. 4.5 An example four-by-four data capability matrix

Having established that the PoP is a holistic view of the application of data capability to different data types, it becomes a whole lot easier to build out a roadmap. This is because the roadmap becomes the logical and prioritised process of filling in the PoP grid for a specific organisation. Mapping our current capabilities versus target capabilities, with RAG status to show maturity, provides one of the simplest possible maps of current and future states. You may then wish to work with your enterprise architects to develop a more comprehensive architecture to support this overview.

If you can convincingly answer your data questions, then there is a good chance that you will be well-placed to meet your data vision and abide by your data principles. Table 4.1 elaborates on the data questions, and maps these to specific data capabilities. It also adds a fifth question pertaining to data security. Understanding the data capabilities that you are aiming to realise, and phrasing them as prosaically and clearly (even laconically) as possible, will improve your ability to communicate clearly and concisely with your stakeholders. As you conduct stakeholder engagement, you can use these questions, and associated capabilities, as a framework to begin to elucidate use cases. Leading questions might include:

• Where is the data that you use most often stored, and how easy is it to access?
• How do you know that you can trust the quality of this data?
• How do you use that data to help inform or speed up your work?
• What data would you like to have but cannot currently access?
• Are there cases where you currently struggle to access or make use of data even though you see the possible value?


Table 4.1 Your data questions, mapped to data architecture

1. What data do I need? And what do I already have?
Capability: The ability to specify data requirements, and to understand your existing data landscape.
Associated technical considerations: data governance; data catalogue; user stories/use cases; data modelling, data dictionary, reference data; business glossary; interface agreements.

2. Is my data of sufficient quality? Is it fit-for-purpose?
Capability: The ability to assess and improve the quality of data.
Associated technical considerations: data condition assessment; data profiling; data quality remediation; data augmentation.

3. Is my data available to users? How can they access it?
Capability: The ability to make data securely accessible to qualified staff and stakeholders using a common data platform.
Associated technical considerations: data management; data transformation; data platforms (warehousing, data lakes, etc.); search and findability; APIs.

4. Is my data delivering value?
Capability: The ability to analyse information from the common data platform using suitable tooling.
Associated technical considerations: business intelligence; visualisation and analytics; decision-support tools; data science, LLMs, ML/AI.

5. Is it secure and responsibly managed?
Capability: The ability to limit access to your data to individuals with suitable permissions.
Associated technical considerations: information security; data governance; data protection; record of processing activity; legal compliance.
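As an aside, even a throwaway script can make the PoP tangible during workshops. The sketch below records current and target maturity for a few slots of the matrix using a RAG status; the labels and statuses are invented for illustration, not drawn from any real assessment.

    # A minimal sketch of a PoP (plain of possibility) matrix with RAG
    # maturity statuses. Labels and statuses are invented for illustration.
    questions = ["Defined", "Quality", "Available", "Value"]
    domains = ["Alphanumeric", "Documents", "Models/spatial", "Integration"]

    # Current and target maturity per (question, domain) slot.
    pop = {
        ("Defined", "Alphanumeric"): ("Amber", "Green"),
        ("Quality", "Documents"): ("Red", "Amber"),
        ("Available", "Models/spatial"): ("Red", "Green"),
        # ...remaining slots filled in as the assessment proceeds.
    }

    # A simple gap report: slots where current maturity lags the target.
    for q in questions:
        for d in domains:
            current, target = pop.get((q, d), ("Unassessed", "Unassessed"))
            if current != target:
                print(f"{q} x {d}: {current} -> {target}")

Mapping stakeholder needs back onto the same sixteen slots keeps the roadmap and the engagement evidence in one consistent frame.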


In this way your conversations with stakeholders can deep-dive into the specific pain points and requirements of individual teams and roles, while still maintaining traceability back to the data questions. You will easily be able to abstract and summarise these conversations back to your data questions and data types, mapping your users' needs to the PoP matrix. This consistency will help you to identify commonality across organisational silos and prioritise the areas where there is the most benefit from investment. Whilst different teams in your organisation will inevitably use data for quite different purposes, mapping back to these abstracted capabilities should help ensure some consistency in terms of how they define, access, and analyse data (e.g., consistent enterprise-level data platforms and architectural coherence).

4.8 Use Cases

You must always start with planning what you require based on your users' needs.
—Lisa Allen and Deborah Yates, The Little Book of Data

Too often, the priorities for investment in data stem solely from the data team's view of what a business 'needs' or 'should want'. This book occasionally falls into that trap as well. So, it is important that you check your priorities… "Who's actually asking for this investment?" Business sponsorship of data investment is a great first step towards success. That said, it is not the only precursor. As a data professional you are amongst the pre-eminent data experts in your organisation. Consequently, you will occasionally have licence to follow your own judgement. Even so, it is worth asking yourself: is there no support for this idea because people haven't yet seen the value, or because the value does not exist?

As we covered earlier, an effective data function often sits independent of an IT function. This relationship can result in the data team acting as intermediaries between IT and the wider organisation. Part of the value that a data function brings to a business with an established IT function is the ability to abstract away from systems and legacy technology. By doing so the data team can add value through:

• Challenge: Providing independent assurance of use cases from the business.
• Consolidation/Convergence: Seeking to abstract from the specificities of business use cases to wider organisation-level needs.
• Context: Showing business users the art of the possible based upon best practice from elsewhere.
• Clarification: Understanding the business needs in data terms (e.g., what questions are they trying to answer?) rather than system terms (e.g., what bits of kit will they need?).
• Consistency: Meeting a range of business needs with a common approach to data capability.

Figure 4.6 represents how a common data platform could sit at the centre of a wide range of business use cases.


Fig. 4.6 An extension of the ‘data types’ model from Chap. 1, showing how different data types feed into indicative use cases

The 'Data Questions' (and associated capabilities) listed in the previous section provide us with a useful perspective on the use cases that you may expect to see from your organisation. Whilst this shouldn't be used to dictate needs to the business, it should function as a useful means of abstraction and gap analysis. Figures 4.7, 4.8 and 4.9 show the kind of use cases that might exist across the organisational structure of a typical built environment client. These are illustrative rather than authoritative, intended to help you identify the kind of questions you could be asking your colleagues in these areas.

Data Specification: What data do I need? Or what data will I need?

Data requirements often boil down to "what am I going to get in trouble for not knowing?" Business functions will be concerned with questions that the auditors might ask, designers and engineers will focus on establishing the golden thread of design decisions, whilst functions such as safety or maintenance will be more concerned with controlling the risk of accidents and asset failure.


Fig. 4.7 Indicative data use cases—social, environmental, health, safety and quality

Having established a data specification baseline, an enlightened, mature organisation might consider whether there are data sets that it does not currently capture, but which could help to deliver improved performance in the future. Any ambitions that an organisation has in terms of emerging technologies such as Digital Twins, Data Science, AI, Digital Rehearsal and so on will need the support of a matching data specification, which will be more extensive than the organisation's current baseline. An organisation's data specifications should evolve over time, and effective data architecture (for example, data models) should provide a means for data professionals within that organisation to maintain and adapt specifications over time.

Data Quality: Is my data of sufficient quality? Is my data fit-for-purpose?

As you know, data quality is a function of accuracy, completeness, and timeliness. However, it's often very difficult to know whether the data that you are using is incomplete (to know that your data set is incomplete, you would need access to another, more complete data set!). Moreover, a data set can be accurate, complete, and timely, and still provide far too little or far too much data to enable decision-making. In the former case you are providing staff with an illusion of simplicity; in the latter case you are spending too much money collecting and managing an excess of data. This is why "fit-for-purpose" is often an easier way to think about data quality.

Fig. 4.8 Indicative data use cases—project delivery, maintenance, and operations

Whether or not a data set is fit-for-purpose is something that one will discover in the field through feedback. In other words, if your colleagues are making decisions based upon a data set, and in practice they are discovering consequential exceptions or inaccuracies in the data, then it is probably not a fit-for-purpose data set. One might call this sampling, but obviously if you are sampling in a manner where you only identify mistakes retrospectively then you are likely to incur substantial operational costs in the process. Your author remembers working for an organisation that would routinely discover that the location data of its assets was inaccurate because inspectors would show up at the recorded location of an asset only to discover that it wasn't there. This is obviously a time-consuming, expensive, and potentially dangerous means of discovering that a data set is not fit-for-purpose. Similarly, another organisation would routinely find inconsistencies in the data contained in reports only when staff presented those reports to upper management. Cases like this indicate that the staff closest to the data lack the tools or training to fix data quality issues at source. Improving data quality is rarely about running big 'data improvement' exercises, but more about creating a culture where staff are empowered and able to flag (and correct) data quality issues at source.
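In that spirit, checks that flag unfit records before anyone is dispatched to site can be very simple indeed. The sketch below validates asset location records against basic completeness and plausibility rules; the record structure and the bounding box (roughly Great Britain) are assumptions for illustration.

    # A minimal sketch of a fit-for-purpose check on asset location
    # records. Field names and the bounding box are assumptions.
    records = [
        {"asset_id": "BR-0042", "lat": 51.501, "lon": -0.142},
        {"asset_id": "BR-0043", "lat": None, "lon": -1.257},   # incomplete
        {"asset_id": "BR-0044", "lat": 0.0, "lon": 0.0},       # implausible
    ]

    def location_issues(record):
        """Return a list of data quality issues for one asset record."""
        issues = []
        lat, lon = record["lat"], record["lon"]
        if lat is None or lon is None:
            issues.append("missing coordinates")
        # Rough bounding box for Great Britain; (0, 0) is a common default
        # that usually signals an unset field rather than a real site.
        elif not (49.8 <= lat <= 60.9 and -8.7 <= lon <= 1.8):
            issues.append("coordinates outside expected region")
        return issues

    for record in records:
        for issue in location_issues(record):
            print(f"{record['asset_id']}: {issue}")

Run at the point of data entry, checks like this move the discovery of unfit data from the roadside back to the source system.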


Fig. 4.9 Indicative data use cases—business support functions

Data Accessibility: Is my data available to my users?

There are several aspects to the accessibility of data, ranging from how quick it is to find and retrieve (availability, findability, ease of access), to how interpretable the data is (compatibility, usability, interpretability, documentation). We know that searching for and accessing data is one of the most resource-costly non-value-add activities in white-collar work (Beyer & Holtzblatt, 1997). We also know that if employees do not have easy access to information, then they are likely to start hoarding it in their own siloed repositories (Kanter, 1983).

The built environment still conducts the bulk of its reporting and analysis using spreadsheets. This is either because self-service business intelligence (BI) capability does not exist in organisations, or because that capability does not cover the wide range of data sources used in reports. It may even reflect a sense amongst staff members that they do not have control over the data (and therefore the conclusions of reports) if they are not able to manipulate the source information in spreadsheets. However, extensive manipulation of data in spreadsheets prior to the creation of reports and analyses leads to divergent and competing sources of truth with inconsistent data definitions. The only means by which an organisation can create a collective understanding of performance is through the creation of shared, accessible, and standardised data sets, and adherence to an agreed enterprise data model. To the extent that organisations can realise this vision, and force/encourage users to abide by it, they can credibly say that their data is accessible.

Data Use: Am I able to realise value from my data?

We will cover outcomes and benefits in the next section, providing examples of the type of value statements that one would expect to realise from data. However, it's worth remembering in this context that use cases for realising value from data, and the tangible benefits of data, may not be the same thing. The use cases described above are about creating a sense of agency and ownership of data across an organisation. This is as much a question of culture and empowerment as it is one of capturing financial benefits. As data professionals we should aim to create organisations where staff feel that:

• They know what data they require.
• They can define the quality measures that data should abide by, and are able to flag problems.
• They enjoy easy and unrestricted access to meaningful data in a format they can use; and
• They can incorporate the use of that data into their work and their collaboration with others.

It is unlikely that we would realise these outcomes without also realising tangible financial benefit, but it is possible. Remember that when it comes to use cases, rather than formal benefits frameworks, value is often in the eye of the beholder.

4.9 Data Outcomes and Benefits

4.9.1 Sectoral Benefits

There is a genuine lack of quantified evidence across the built environment sector to support investment in data. Too often this leaves us reliant upon unsubstantiated assertions, rather than allowing us to articulate and quantify the value of investment in data. This is not to say that the benefits do not exist. Sometimes organisations lack the skills to conduct benefits capture, or simply do not prioritise it; other times the complexity of delivering assets into the real world makes it difficult to identify and capture benefits. Construction projects and the infrastructure that they create tend to be large and (relatively) unique, and each performs a different function and exists within a different ecosystem of suppliers and stakeholders. Similarly, there is rarely a counterfactual or control group that allows one to compare data-enabled and data-agnostic approaches. This isn't merely an academic issue; it unfortunately makes it harder for us to make the case for investment within our own organisations. This is particularly the case for new or low-maturity organisations, and emerging technologies.


We discussed earlier in this book how publications such as Data for the Public Good, Transforming Infrastructure Performance, and Flourishing Systems have made a direct link between the functioning of our built environment and the realisation of societal outcomes. Transforming Infrastructure Performance, for example, specifically links the built environment, and the use of data to improve the outcomes of the built environment, to the UN Sustainable Development Goals. However, these links remain qualitative rather than quantitative, aspirational rather than realised. As Fig. 4.10 illustrates, data has a role to play in helping the built environment to realise a range of environmental, societal, and sectoral outcomes across the asset life cycle. It does so by providing organisations and individuals with the situational awareness required to decide how and where to intervene on the physical assets. In this context data realises benefits indirectly: by allowing us to reduce the footprint of our activities, by allowing us to identify and eliminate unnecessary work, and by allowing us to better target and execute the work that we do undertake.

Fig. 4.10 The role data plays in realising positive outcomes for the built environment

As data professionals it can be great to find ourselves with the opportunity to work from a blank slate, to deploy new capabilities into an organisation without needing to worry about technical debt or legacy infrastructure. Even where a project is starting from nothing, the individuals employed by that project will still bring baggage and biases informed by prior experience. However, if we need to justify investment, and we lack a record of accomplishment ourselves, how do we make the case? And how do we identify an appropriate level of investment? One feels that in the absence of quantified benchmarks the sector will skew towards underspend. Deloitte tells us that on average IT spend is around 3.6% of organisational revenue, with banking hitting almost 8% and construction a lowly 1.7% (Kark, 2020). By this simple metric, construction is underspending. However, as is clear when we consider the range of activities that fall under Digital Construction, the dividing line between IT and the wider organisation becomes fuzzier. Do technologies like BIM or Digital Twins constitute IT spend, or spend on broader organisational capability, and where do training and business change sit? Given the relatively high cost of embedding these technologies, the part of the organisation to which finance assigns them can skew 'IT' spend as a percentage of business revenue enormously. Investing in data is both riskier and more expensive than traditional core IT (e.g., servers and laptops). Occasionally disingenuous or uninformed parties will try to use core IT spend as a benchmark for a digital construction or data science function, which tends to make data and digital investment appear disproportionately expensive. It should be self-evident that one cannot usefully compare an organisation that only needs IT to hand out laptops to one where IT aspires to realise value from digital twins or predictive maintenance.

What would be useful in this context is a database of quantified benefits across the sector. What benefits did Crossrail realise? How about the Thames Tideway? How about the London Olympics? When the government published its BIM strategy back in 2011 it argued that adoption of BIM alone could cut the cost of capital projects by 15–20%. Was this ever substantiated? Has any project documented realising these kinds of savings? Given the complexity and lack of counterfactuals, can one even quantify such a difference? Was it all just a sales pitch?

Unsurprisingly, the UK's Centre for Digital Built Britain (CDBB) published some of the best work in this space. Take, for example, their awkwardly titled reports BIM Level 2 Benefits Measurement (PwC, 2018) and The Value of Information Management in the Construction and Infrastructure Sector (KPMG & Atkins, 2021). Both reports attempt to take a summary view across several case studies. The KPMG report covers "several case studies identifying cost savings ranging from 1.6% to 18%." The PwC report studied two projects in depth (to demonstrate their benefits assessment methodology) and concluded that the adoption of Level 2 BIM realised between a 3 and 5% reduction in construction costs and (surprisingly) around a 6% reduction in operational costs. Unfortunately, these reports don't assess benefits in terms of time, quality, risk, sustainability, or safety. It is possible that investing in digital capabilities such as BIM makes projects more expensive, but recoups that investment in the long run by delivering better, safer, more sustainable assets. All these efforts, unfortunately, suffer from a combination of (a) a small underlying data set, (b) a lack of granularity in the data (and/or reliance on subjective measures), and (c) the potential for sample bias in terms of which projects become case studies in the first instance. They are useful reference points, but they suffer from a scarcity of data, and if anything they demonstrate just how much work remains to be done in this space.
There is a risk that if we continue to fail to capture data, we will be unable to continue to make the case for investment. Similarly, if we fail to measure benefits with sufficient granularity, particularly for projects that don't succeed, we may miss the subtle differences that distinguish good intentions from meaningful results.

4.9.2 Organisational Benefits

Should the outcome of a data strategy simply be the realisation of an organisation's strategic objectives? Well, yes and no. Of course data outcomes should align to an organisation's strategic objectives, but data will always be a small part of building and operating the built environment. Investment in data needs to stand on its own two digital feet, to wash its own face. To prove that we are creating value in the short and medium terms we must challenge ourselves to identify and track the direct benefits of our work.

The benefits of digital technology can sometimes be hard to quantify. The culture of an organisation will influence how it perceives investment in data. Data-centric organisations may see investment in data as a good unto itself, as self-evidently worthwhile without the need for lengthy justification. Other organisations, perhaps those with a history of failed data projects, may come to see investment in data as an ineffable black hole, one that swallows money and other resources and delivers merely sound and fury signifying nothing. No matter the mood music in an organisation, there is a valuable discipline in arriving at tangible, measurable outcomes and benefits statements.

As a starting point it makes sense that your outcomes would flow from the capabilities that you are creating (our 'big five' questions), and the use cases that you are realising against those capabilities. Moreover, to win over hearts and minds in the built environment sector it makes sense to root your benefits in the most common currencies of that sector, such as safety/wellbeing, risk, cost, time, quality, and sustainability. Figures 4.11, 4.12, 4.13, 4.14 and 4.15 visually map our 'big five' capability questions to safety, risk, cost, time, quality, and wellbeing outcomes. From these you can start to list out the data benefits that one could track as an organisation. Again, these are examples written for a generic built environment client (perhaps with a bit of a skew to construction); the specifics will vary based upon your organisation's focus and level of maturity. Why not have a go yourself?

Fig. 4.11 Outcomes associated with specifying your data requirements

Indicative benefit statements for data specification (what data do I need?) in [Action][Problem][Result] format:

1. [By defining what data we need] [We will avoid storing or migrating data without identified users] [allowing us to reduce our storage costs].
2. [By defining what data we need] [We will design systems that provide users with the information that they expect] [de-risking our solution development and speeding user adoption].
3. [By defining what data we need] [We will track where we manage sensitive and personal information] [reducing the risk of harm to our staff and stakeholders through data loss].

Fig. 4.12 Outcomes associated with measuring and improving your data quality

Indicative benefit statements for data quality (is my data fit-for-purpose?):

1. [By ensuring that our data is fit for purpose] [We will provide staff with accurate design and construction information] [reducing the risk of unexpected changes on site].
2. [By ensuring that our data is fit for purpose] [We will enable staff to provide an audit trail to support their decisions] [improving delivery and reducing legal/compliance risk].
3. [By ensuring that our data is fit for purpose] [We will reduce/avoid abortive work, whether that means collecting the same data multiple times or being unable to carry out scheduled works due to foreseeable obstacles] [meaning that we can reduce the cost and time required for our works].

Indicative benefit statements for data accessibility (how can I access our data?):

1. [By making our data accessible to trusted staff and suppliers] [We will reduce the time that our staff spend searching for, manipulating, and otherwise 'manually handling' data] [and increase the amount of time available to them for analysis and decision-making].
2. [By making our data accessible to trusted staff and suppliers] [We will create a single record of our performance metrics and hazards] [allowing our staff to reduce disputes and subjectivity, and proactively remediate organisational risks].

3. [By making our data accessible to trusted staff and suppliers] [We will enable the collaborative design of our assets, creating a sharable master record of the building design and associated hazards] [reducing abortive work and safety incidents on site].

Fig. 4.13 Outcomes associated with making your data accessible

Indicative benefit statements for the realisation of value from data:

1. [By enabling proactive analysis of delays, cost overruns, safety incidents, and other leading indicators] [We will be able to identify where we need to take remedial action to course-correct our work] [and therefore mitigate the risk of cost and time overruns, or safety incidents].

2. [By deploying Digital Rehearsal capability] [We will be able to test the delivery of our works, allowing staff to work through sequencing and safety challenges in a digital environment rather than on site] [this will de-risk our work, allowing us to mitigate cost, time, and safety risks].
3. [By using automated rich data collection on site] [We will be able to compare 'as built' observations on site to our design models] [allowing us to proactively correct errors in a timely manner and reduce the need for costly remedial works].

Fig. 4.14 Outcomes associated with deriving value from your data

The benefit statements in this area will expand as your data science capability develops and as you prototype initiatives across your organisation. Whilst the benefit statements for other capabilities are relatively predictable and static, the emphasis on rapid prototyping by a data science function means that its benefit statements will frequently change and (hopefully) expand.

Fig. 4.15 Outcomes associated with ensuring your information security


The benefit statements for the secure management of data:

1. [By securely managing our data] [We reduce the risk of data breaches] [and the cost, wellbeing, and reputational damage that these events can cause].
2. [By maintaining a timely and accurate information asset register, including assigning data owners] [We will ensure that we are able to track the data assets that we hold, and who is responsible for them] [meaning that we are better able to track our information risk exposure, and respond to incidents].

The formality of benefits tracking depends in part on the culture and fastidiousness of your organisation. In some organisations it may be sufficient that you can articulate a clear narrative on where your work is creating benefit; others may mandate detailed spreadsheets and external audits. Whilst it can be tempting to avoid the diligence of formal benefits capture, you never know when it might come in useful. It is common for investment in data to attract challenge, particularly when budgets are tightening or when leadership changes, so having quantifiable data to back up your claims can be useful. Formal quantification can also help to avoid miscalculations; for example, where two projects contribute to an outcome it can be easy to inadvertently double-count benefits. Where your organisation already has an established means of capturing benefits you would be well advised to plug into this rather than inventing a standalone benefits framework for data. This will provide you with some common currency within your organisation and introduce a degree of independence to the evaluation. Whilst the diligence of benefits capture generally provides a useful challenge, organisations can also go too far in the other direction. One of your authors worked on a large infrastructure programme with a sizeable contingent of consultants tasked with capturing benefits. At times this led to a culture where claiming future benefits to justify the programme became more important than delivering change. One must bear in mind that benefits occur in the real world, not merely on paper.
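If you do track benefits formally, it can help to capture the statements as structured data so that they can be reviewed, aggregated, and audited consistently. The sketch below models the [Action][Problem][Result] format used above; the fields and the example entry are illustrative, not a prescribed framework.

    # A minimal sketch of benefit statements captured as structured data,
    # following the [Action][Problem][Result] format used above. The
    # fields and example values are illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class BenefitStatement:
        capability: str   # which data question/capability it supports
        action: str       # [Action]: the investment or behaviour
        problem: str      # [Problem]: the failure mode it addresses
        result: str       # [Result]: the tracked outcome
        measure: str      # how the organisation will evidence the result

    example = BenefitStatement(
        capability="Data specification",
        action="By defining what data we need",
        problem="we will avoid storing or migrating data without identified users",
        result="allowing us to reduce our storage costs",
        measure="storage spend per project, reviewed quarterly",
    )

    print(f"[{example.action}] [{example.problem}] [{example.result}]")

A register like this also makes double-counting easier to spot, because each statement names the capability and the measure it draws upon.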

4.10 Data Roles and Skills

Recruiting and retaining data expertise, and upskilling existing employees, are some of the most time-consuming and challenging aspects of working with data. The competition for data skills has grown enormously over the last decade, and in the post-pandemic job market data has been a particularly in-demand skill in a generally overheating labour market. As the built environment sector skews public sector and low margin, wages are not always competitive, and it can be difficult to compete for talent. This may be because the money simply isn't available, or because organisations find it hard to stomach the relative salaries of data staff compared to other specialisms or managerial staff. In the face of a tight jobs market for data talent, organisations might be tempted to develop their own.


Data teams in built environment organisations often repurpose staff who started out in one of the sector's other professions—engineering or planning, for example—before deliberately, or accidentally, acquiring data skills or falling into data roles. Your authors fall into this category. In the UK there is a wealth of apprenticeship schemes, funded under the Apprenticeship Levy, that aim to make upskilling existing employees relatively easy and affordable.

Whether or not a role within an organisation is a 'data role' is not a binary question. To successfully make use of data, built environment organisations need to build capability across the board, from providing all employees with baseline IT and data management skills, through to 'centres of excellence' for data best practice. An effective organisation will have a data posture that meets the needs of each role within its organisational structure. Not all professionals within an organisation will see data skills as relevant to their career, or congruent with their professional identity. This can mean staff who are unreceptive to upskilling, or even actively hostile to new ways of working. Conversely, many built environment organisations can be slow to adopt or exploit emerging technologies, and ambitious data recruits may feel disenchanted with the slow pace of progress in the sector.

One of your authors remembers working at a low-margin organisation that provided services to public sector clients. Broadly speaking this organisation paid staff according to seniority, and seniority was almost entirely down to how large a team you managed. Inserting data scientists into this environment meant justifying why one data nerd, usually a few years out of university, should earn as much as someone with twenty years of experience managing a team of tens or even hundreds. This was a difficult case to make—at least without alienating people—and had an impact on the attitude that the rest of the business brought to the data team, which could often be coloured with an understandable defensiveness or jealousy.

When we recruit data professionals and use data services we are tapping into broader pools of talent and skills from across industries. You might find yourself using an open-source coding language bankrolled by a social media firm, a data governance tool first developed for big finance, and competing for talent with online retailers, oil & gas firms, and insurance companies. To succeed, you need to do more than simply recruit a small number of specialists and hope for the best. Instead, as data professionals in the built environment we need to play to our strengths and beg/steal/borrow practices from other industries (not to mention GitHub Copilot).

While few people enjoy feeling underpaid, most are not solely motivated by money or status. Part of the attraction of the built environment sector—at least to some candidates—will be its connection to the real world and tangible outcomes. Working in a data role can sometimes feel highly abstract and disconnected from those outcomes. Providing the data team with a sense of connection to the physical assets can help to "make it real," making site visits and lunch & learns more than just recreational. The built environment offers us an opportunity to connect our work more directly to societal outcomes, whether we are building new infrastructure, maintaining the existing asset stock, or improving how our system of systems functions.


This contribution to society provides an exciting narrative for many of us, and can even make the difference when trying to recruit talent.[10] Moreover, if we attract candidates in part due to the purpose provided by our work, we are also more likely to retain those candidates and reduce staff turnover.

To many in our sector the data world can feel akin to standing next to a motorway watching cars speed by; we can see these other people racing to their destination, but we have no idea how to build or start our own vehicle. In this context we need to use the on-ramps developed in other industries to help get us up to speed and onto the highway quickly and easily. This starts with helping nearly everyone in the organisation understand the value of data to their work, and how they can learn to make better use of it. Once we have engendered pockets of advocacy in our organisation, we can then use on-ramps such as online courses, apprenticeships, and communities of practice to provide people with a means to translate enthusiasm into practice.

Colleagues and stakeholders across your organisation can help or hinder your data strategy in equal measure, depending upon the extent to which they understand and agree with your desired outcomes and the role that they play within the organisation. For all their many virtues, compared to technology humans are expensive, unreliable, can't operate anywhere close to 24/7, and need to be convinced that it's in their interest to do objectively reasonable things. To win over an organisation of any reasonable size you will need to subdivide it into meaningful groups mapped to distinct user personas. This in turn requires an understanding of the breakdown of role types, and roles, in your organisation: one that allows you to tailor your communications, expectations, and training to meet the needs of diverse groups.

The user personas that exist across your organisation will, as ever, depend upon its idiosyncrasies, and you will need to engage with your stakeholders to suss out the types of groups that will be most useful to you. However, as a straw person you can start with a simple taxonomy such as the list of four below, and then add further granularity as required:

1. Leadership: The people who set the strategic direction and control the purse strings. You need to help your leadership to understand how investment in data can benefit the organisation, otherwise you won't have the resources you need to realise any kind of change.

2. Data do-ers: The colleagues who design and run the data tooling. Whether they sit within a dedicated data team, or are champions spread out across the business, you need them to be skilled and to feel empowered.

3. Users: Nearly everyone else in your organisation. These are the colleagues who will use your data tooling to create everyday BAU value. Without adoption by users, everything you do as a data leader is meaningless.

[10] One might also argue that whilst the built environment has many flaws, it is an ethical pursuit. Conversely, the industries that recruit data skills most heavily (social media, retail, finance, extractive industries, defence, gambling) are not always engaged in making the world a better place. They use data to reinforce and accelerate cycles of overconsumption that leave society, and the planet, worse off.


4. IT: Your colleagues who design, operate, and maintain the underlying architecture and computer stuff. Unfortunately, it's really difficult to run data services in isolation; you will always have a reliance on IT (so you had better make friends).

Having established your personas, you then need to map the roles from your organisation against them. This can help you identify which roles you need to bring in house, which you can rely on your supply chain for, and which stakeholders require the most attention. Some roles may already exist in your organisation; you may need to make the case to create others. Thankfully, there is a lot of external best practice that can and should inform your approach to identifying and building roles and skills, including:

• SFIA (Skills for the Information Age) Framework: An international framework of job roles and competency levels targeted, as the name suggests, at roles related to information technology.
• DDAT (Data, Digital and Technology) Professional Capability Framework: The UK government's skills framework for digital roles.
• CDBB (Centre for Digital Built Britain) Skills for the Future: A more detailed approach from the CDBB, centred on the skills required for emerging technologies such as Digital Twins.
• The Data Management Association (DAMA) Data Management Body of Knowledge, 2nd Edition (DMBOK2), which includes sections on both Data Governance responsibilities and the types of organisational roles required to deliver data management.

Between these four resources it is possible to arrive at a useful taxonomy of the data (and data-related) roles that you might need (see Fig. 4.16). One thing that really stands out is the sheer number and diversity of IT roles that you are reliant upon to deliver data services. Consequently, you need to try to ensure that your organisation's IT resourcing enables your intended data outcomes (or at least flag your dependencies and warn your leadership when it doesn't). Another thing to note is that we haven't tried to map every job title in the organisation against 'data users'; rather, we recommend mapping organisational roles into a smaller group of user personas such as 'data owner', 'data steward', 'data analyst', etc.

Despite our begrudging reliance on humans, they are also what makes work beautiful and enjoyable. It's in your best interest to ensure that they are given the support and training that they need to succeed. That said, there will be limits to your agency as a data professional. If you find yourself in a position where neither the leadership nor the IT function is able or willing to support your strategy, then this will prove a difficult and frustrating environment to work in. Understanding from a diagram like Fig. 4.16 which roles you should be accountable for, which will report to others but require your support, which you depend upon, and which are stakeholders that you should seek to influence will help you to understand where your accountability and responsibility lie, and where you will need to influence others in order to succeed.


Fig. 4.16 Indicative data role taxonomy

The exact boundaries of the data function vary from organisation to organisation, and it may be useful to ask the following questions of each role to understand how it relates to you:

• Is all or part of this role responsible for delivering an aspect of our data strategy or roadmap?
• Will the organisation employ this role in-house, or buy it in from suppliers?
• Do you need this role to provide advocacy or support to enable your data strategy or roadmap?
• Is this role a user of data services?
• Is this role identified as a data owner or steward under your data governance strategy?
• Do you need these roles to possess skills or undergo training to govern your data or to enable your data strategy?

Having answered these questions, you can then arrive at a mapping against each role showing where you need to define responsibilities, skills, and training to support your data strategy, as in the sketch below.
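By way of illustration only (every role name, flag, and training need here is hypothetical), the output of those questions can be captured in a structure as simple as the following, which then becomes easy to filter and report on:

```python
# Hypothetical mapping of organisational roles to data personas, recording
# the answers to the questions above for each role.
role_mapping = [
    {
        "role": "Head of Finance",
        "persona": "Data Owner",
        "in_house": True,
        "delivers_strategy": False,
        "advocacy_needed": True,
        "training": ["data governance awareness"],
    },
    {
        "role": "Bridge Inspector",
        "persona": "Data Producer",
        "in_house": True,
        "delivers_strategy": False,
        "advocacy_needed": False,
        "training": ["data quality basics"],
    },
    {
        "role": "Data Engineer",
        "persona": "Data Do-er",
        "in_house": False,  # bought in from the supply chain
        "delivers_strategy": True,
        "advocacy_needed": False,
        "training": [],
    },
]

# For example: list everyone whose training needs defining to enable the strategy.
for entry in role_mapping:
    for course in entry["training"]:
        print(f"{entry['role']} ({entry['persona']}): needs '{course}'")
```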


Fig. 4.17 Mapping of roles to data requirements

Figure 4.17 presents an organisation that aims to hold all its core data roles in house, but to operate a federated model in terms of how those roles are spread across the organisation. The data team will build and manage a data platform, and act as a centre for Subject Matter Expert (SME) skills in data manipulation and data science, whilst activities such as data ownership, data collection, and data analysis 'federate' across the organisation. The goal here is to ensure that everybody in the organisation has a relationship with data, and that staff members can access and analyse data to inform their decisions.

Helping your colleagues to understand the role that they can play in governing and using data is crucial to ensuring that your organisation benefits from data. As covered in the stakeholder engagement section above, it is important that your data strategy and associated data roles are as meaningful as possible to as large a segment of your organisation as possible. Roles such as Data Consumer and Data Producer warrant attention as they likely apply to the largest number of colleagues in your organisation. Defining these roles is an opportunity to define the base level of awareness and training that staff need. Often this kind of training is minimal and predominantly concerned with legal and regulatory obligations such as information security. But it's important to make the positive case for data. We want to empower staff to start using sources of trusted data, for example to create and share dashboards rather than fiddling with offline spreadsheets.

It would be naive to expect everyone in your organisation to read, digest, and understand your data strategy in its entirety, or to sign up for data training. However, you should seek to find accessible multimedia means to explain vital elements of your data strategy to a large audience, and again provide easy first steps for them to transition into better use of data. Some data roles, such as Data Analyst or Data Owner, will warrant more targeted specialist training, but the bulk of the training that you do should focus on data awareness and data fluency. Given the choice of moving one person from VBA to Python, or ten people from Excel to BI dashboarding, you may find that your organisation benefits far more from the latter.


In delivering this widespread awareness you should seek to blur the boundaries between internal comms, engagement, and training, creating opportunities for staff to learn, to provide feedback on your data strategy and data services, and to share their own successes. A data strategy or data training doesn't need to be dry and text-based; data is an opportunity to flex the corporate mould, to be passionate and relatable.

4.11 Conclusion

For a data professional in the built environment sector, it is not enough merely to be inspiring. Sure, it's important to have a vision, to work with a destination in mind and to sell that destination to others. However, you also need to be able to make data tangible to people and show results. Building data culture means giving people agency, helping them to 'do data' themselves. If you don't tackle culture, then it will block you at every opportunity. If you don't make it easy and empowering for people to use data in their work, then they won't. If you don't make people feel like they belong, then they will disengage and drift away.

The data capabilities described in this chapter are deliberately straightforward, deliberately interpretable, and deliberately broad. Capabilities like this help us to express our use cases, our benefits, and even the skills that we need without becoming bogged down in the specifics of platforms and software (we're saving that for later chapters). Your data strategy, whether written or implicit, is your greatest canvas… make sure you paint with brilliant, clear, vivid colours.

References

Allen, L., & Yates, D. (2021). The little book of data: First steps to your data transformation. Independently Published.
Beyer, H., & Holtzblatt, K. (1997). Contextual design: Defining customer-centered systems. Morgan Kaufmann Publishers.
Bridgers, P. (2020). Garden song [Recorded by P. Bridgers].
Burkeman, O. (2021, April 27). The clockwork universe: Is free will an illusion? The Guardian. https://www.theguardian.com/news/2021/apr/27/the-clockwork-universe-is-free-will-an-illusion
Busuttil, J. (2016, April 25). Communities of people inspiring each other. Government Digital Service. https://gds.blog.gov.uk/2016/04/25/communities-of-people-inspiring-each-other/
Carnegie, M. (2021, September 10). We finally know the true toll of all those bad Slacks. Wired. https://www.wired.co.uk/article/time-wasted-slack-microsoft-teams
Carter, R. (2021). Cognitive advantage: How Artificial Intelligence is changing the rules for winning in business and government. Mayhill Publishing.
Central Digital and Data Office. (2021, December 21). Data standards authority strategy 2020 to 2023. https://www.gov.uk/guidance/data-standards-authority-strategy-2020-to-2023
Civil Engineering. (1999, June). Programme Management B.C. https://web.archive.org/web/20070608101037/http://www.pubs.asce.org/ceonline/0699feat.html
DAMA International. (2017). DAMA-DMBOK: Data management body of knowledge (2nd ed.). Technics Publications.


Department for Digital, Culture, Media & Sport. (2020, December 9). National Data Strategy. https://www.gov.uk/government/publications/uk-national-data-strategy/national-data-strategy
Fisk, P. (2020, February 17). Metcalfe's Law explains how the value of networks grows exponentially. https://www.peterfisk.com/2020/02/metcalfes-law-explains-how-the-value-of-networks-grow-exponentially-there-are-5-types-of-network-effects/
Foreign & Commonwealth Office. (2014, April 10). The Rana Plaza disaster. https://www.gov.uk/government/case-studies/the-rana-plaza-disaster
Gayle, D. (2021, August 7). Plans for 190 flats on London Latin Village site scrapped after protests. The Guardian. https://www.theguardian.com/uk-news/2021/aug/07/plans-for-190-flats-on-london-latin-village-site-scrapped-after-protests
Health and Safety Executive. (1974). Health and safety at work etc. Act. https://www.hse.gov.uk/legislation/hswa.htm
Kahneman, D. (2012). Thinking, fast and slow. Penguin.
Kanter, R. M. (1983). The change masters: Innovation and entrepreneurship in the American corporation. Simon and Schuster.
Karas, D. (2015). Highway to inequity: The disparate impact of the interstate highway system on poor and minority communities in American cities. New visions for public affairs (Vol. 7).
Kark, K. (2020, January 7). Deloitte. https://www2.deloitte.com/us/en/insights/focus/cio-insider-business-insights/tech-finance-technology-investment-budgeting-processes.html
KPMG & Atkins. (2021, June). The value of Information Management in the construction and infrastructure sector. CDBB. https://www.cdbb.cam.ac.uk/news/value-information-management-construction-and-infrastructure-sector
Leslie, D. (2019, June). Understanding artificial intelligence ethics and safety. The Alan Turing Institute. https://www.turing.ac.uk/sites/default/files/2019-06/understanding_artificial_intelligence_ethics_and_safety.pdf
Lucas, J. (2022, March 4). Human-machine interoperability: What can we learn from the invention of the washing machine? https://www.linkedin.com/pulse/human-machine-interoperability-what-can-we-learn-from-jo-lucas/
Maslow, A. (1943). A theory of human motivation. Psychological Review, 50(4), 370–396.
McKinsey & Company. (2012, July 1). The social economy: Unlocking value and productivity through social technologies. McKinsey. https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/the-social-economy/
Newport, C. (2021). A world without email: Reimagining work in an age of communication overload. Portfolio Penguin.
Nonaka, I. (1997, November 11). Organizational knowledge creation. http://www.uky.edu/~gmswan3/575/nonaka.pdf
Open Data Institute. (2021, September 26). ODI data strategy. https://www.theodi.org/about-the-odi/odi-data-strategy/
PwC. (2018, March). BIM level 2 benefits measurement. https://www.cdbb.cam.ac.uk/files/4.pwcbmmapplicationreport_0.pdf
Robinson, E. (2022, December 28). Taylor Swift's Midnights is the first album to sell better on vinyl than CD since the 1980s. NME. https://www.nme.com/news/music/taylor-swifts-midnights-is-the-first-album-to-sell-better-on-vinyl-than-cd-since-the-1980s-3372071
UKPN. (2023). Digitalisation strategy and action plan. https://digitalisation.ukpowernetworks.co.uk/download-our-digital-strategy

Chapter 5

Delivering Data Capability

Abstract In the previous chapter, we explored the value case for applying data to the built environment, and the cultural change required for organisations to realise that value. We now dive into the practicalities of delivering better data outcomes within organisations. In doing so we touch upon three themes. Firstly, we cover the foundational data capabilities (governance, architecture, and data modelling). Secondly, we discuss how to deliver data projects within an established organisation, and how to work alongside a core IT function. Finally, we dig into how best to procure data projects to maximise their chance of success.

5.1 Data Foundations/Precursors

It is easy in the data and digital world to become swept away by the hype train. Sometimes the world of eight-dimensional interconnected blockchain digital twins can feel so tantalisingly close that it's tempting to throw oneself (and one's organisation) into the arms of whichever vendor pitches the shiniest toys. However, those of us who have taken the hype train for a spin usually end up with one sobering conclusion: none of the radical technologies described in the next chapter—no matter how innovative or well-designed—will deliver meaningful improvements to an organisation that has not put the foundations of good data management in place. It's a case of "garbage in, garbage out", as our American colleagues like to say. What technologies such as digital construction, business intelligence, digital rehearsal, digital twins, data science, and smart places have in common is that they all subsist on a strict diet of timely, complete, accurate, and structured data. Time and time again organisations fail in their data transformation not because they are unwilling or unable to procure and deploy data tools, but because they haven't figured out what data they possess, who is responsible for owning or stewarding it, or how to structure, store, share, and maintain it. This is not a book about data management best practice; for that, please refer to the Data Management Association's (DAMA) Data Management Body of Knowledge (DMBOK) or similar publications.


However, before we cover the specifics of the built environment applications of data, we want to touch briefly on how built environment organisations should set themselves up to successfully manage data.

5.1.1 Data Governance and Data Empowerment

The word 'governance' comes loaded with a fair amount of baggage. It suggests 'spiky chair' meetings where a group of authority figures scrutinise hapless underlings. Applied to data, governance can prevent people from accessing the information that they need to do their work, deliberately keeping people in the dark, suppressing innovation, and burdening data users (and indeed non-data users) with unexpected low-value-add obligations. Data Governance is a double-edged sword. On the one hand it acts as a line of defence, helping organisations retain control of the use of their data. On the other hand it is, as DAMA write, a means to "enable an organisation to manage data as an asset" and thus to derive value from that asset (DAMA International, 2017). Data Governance should not be confused with an organisation's obligations under specific legislation, such as GDPR, although effective data governance should make complying with such legislation easier.

Whilst 'Data Governance' is the standard terminology for the processes for protecting and using data, 'Data Empowerment' might be a more accurate and positive headline. Empowerment means providing colleagues with the clearly defined obligations and training that they need to both protect and make use of data, and then trusting them to do so. Putting aside the terminology for a moment, you must remember that data is a team effort. This team needs to demonstrate both the positive opportunities that come from leveraging data, and the risks and liabilities that result from that data being poorly managed. These are two sides of the same coin: one cannot use data without taking responsibility for that data, and data users should feel equally valued whether they are creating new insights for their organisations or protecting the same organisation from data breaches or poor data quality. We need data governance and information security to stop dreadful things from happening, such as data loss, data breaches, ineffective ways of working, duplicate or conflicting information, or collecting the same data multiple times. In doing so we earn the right to make good stuff happen: knowledge creation, insight, whizzy charts, automation, cohesion.

Data users within an organisation should share a common vision and purpose, and in many cases they should share common infrastructure and ways of working. As we have already covered, a large part of an extended data team might not have specific data roles, but rather act as a diaspora within the organisation of data-savvy decision makers, data analysts, data champions, and data owners and stewards. We wrote in a previous chapter about building a meaningful data culture so that an organisation has the mindset, skillset, and toolset that it needs to realise value from data. A dedicated data function will rarely deliver enough change in isolation to realise strategic value in an organisation.


Similarly, without a team of informed and empowered data owners and stewards across your business, you won't have the scope or the subject matter expertise to ensure that you avoid conceding data penalties.

What does this mean in practice? It doesn't have to be complicated. We're not going to dive into the nitty gritty of Data Governance in this book; that's what the DAMA DMBOK is for. But at its simplest you need to define and roll out two roles: Data Owners and Data Stewards. To be clear, these are not new standalone roles; they are responsibilities that belong to the existing staff members best placed to execute them in a way that complements their existing duties. Staff should have time allocated to discharge these new responsibilities, but acting as a Data Owner or Data Steward should not be anywhere near a full-time task, and ideally should be possible in a manner that gels with the tasks that these staff already perform. A common misinterpretation is that adopting data governance represents additional work. We would argue instead that data governance responsibility exists by default: if it is not picked up by staff across an organisation, then it will fall to a centralised information management team (or simply remain an organisational risk). Whilst organisations will require some information management expertise to implement a data governance framework, excessive centralisation removes responsibility and agency from business users, and turns data governance into an unduly administrative task rather than a vital way of working.

Data Owners are the first line of data defence. A Data Owner will usually be a relatively senior individual, and will be assigned responsibility for one or more datasets for which their team are the principal users. For example, a 'Head of' in Finance may be the Data Owner for the ERP system, or a 'Head of' BIM may be the Data Owner of the BIM system. The Data Owner's principal role is to determine who should and should not have access to the datasets that they own, as well as advising on how their organisation should and should not make use of those datasets. In this way the Data Owner does not originate Information Management policy, but has some discretion over how to apply it in practice. Data Owners will also be the customer for any works to augment, change, or improve the type and quality of data captured as part of their assigned datasets. This will be particularly relevant where you change or update systems, or where you undertake data quality improvement exercises. The Data Owner's remit extends not just to their immediate team, but also to how the wider organisation and supply chain use data from the datasets that they own. This is particularly relevant where reporting and analytics require aggregating data from multiple datasets. The Data Owner will retain ownership of their data even when it leaves the source system and moves somewhere else, for example where the organisation copies it into a cloud data analytics platform.

Where Data Owners manage access to and use of data, Data Stewards are responsible for managing the quality of datasets. Data Stewards will usually report up to a Data Owner, and the Data Owner may choose to delegate some of their authority to the Data Steward (for example, approving internal access requests). It is the prerogative of the Data Owner to define what data quality looks like (e.g., we track complete costs for all of our projects) and for the Data Stewards to coax the data towards that benchmark.
Data Stewards really start to add value where data users (including the stewards themselves) begin to identify problems with the data. Depending upon their role and their technical ability, a Data Steward may directly fix data quality issues, or may function as the go-between from the business to IT or SMEs in managing requests for data quality improvements. Data Stewards will likely be best placed to define—at least in non-technical terms—what constitutes quality data, and how best to measure that quality.

As a practical example, an infrastructure operator used to record the length of their bridges in their asset management system in metres. Occasionally an overzealous inspector would enter the length of a bridge in millimetres rather than metres, resulting in that bridge immediately becoming (on paper at least) one of the longest bridges in the country, and causing issues further downstream when people tried to estimate costs for activities such as painting based upon square meterage. In this context a Data Steward might not be the person who identified the issue, but they would likely be the person responsible for ensuring that it was fixed. In the case of a localised problem where the correct answer was obvious, this might mean accessing the system and manually correcting values. But it may also mean taking responsibility for more systematic changes, for example working with IT to add validation rules into the logic of the database (say, that no bridge in the portfolio is longer than 2,000 m), or changing the specification of bridge inspections so that the measurement rules were more clearly advertised. Data Stewards take delegated responsibility from the Data Owners; they ensure the maintenance of data to enable users' work, remediating data quality issues where necessary. Whilst moving responsibility for these roles from IT to the wider organisation might initially feel like an imposition, the hope is that, once suitably empowered, colleagues will take pride in being able to manage their own data, and will be glad not to have to rely on others to make changes on their behalf.

As a data professional it is your responsibility to:

1. Clearly define the Data Owner and Data Steward roles.
2. Take your Data Catalogue and/or Information Asset Register (your list of datasets) and work with your organisation to determine who is best placed to act as Data Owner and Data Steward for each dataset.
3. Ensure that your Data Owners and Data Stewards are aware of their roles, aware of the importance of those roles, have the capacity to undertake them, and have the training that they require to do so properly.

Data Governance is necessary to ensure that you make the most of your data in a safe and controlled manner. Done well it should be empowering, allowing your Data Vision/Data Strategy to be something that the entire organisation has a role in realising. Whilst many organisations still run their data governance and associated artefacts through spreadsheets, there are increasingly off-the-shelf products such as Collibra, Ataccama, Informatica, Alation, and others that can act as a 'single source of truth' for data governance, and extend your capability into automatically identifying and fixing errors in your underlying datasets.
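To make the bridge-length example above concrete, here is a minimal sketch of the kind of validation rule a Data Steward might specify and IT might implement. It is written in Python with pandas, and all of the column names and values are hypothetical:

```python
import pandas as pd

# Hypothetical extract from an asset management system. The 24,500 value is
# a bridge length entered in millimetres rather than metres.
bridges = pd.DataFrame({
    "bridge_id": ["BR001", "BR002", "BR003"],
    "length_m": [45.0, 24_500.0, 120.0],
})

# Validation rule agreed with the Data Steward: no bridge in this portfolio
# is longer than 2,000 m, so anything above that threshold is flagged for
# review rather than flowing downstream into square-meterage cost estimates.
MAX_PLAUSIBLE_LENGTH_M = 2_000

suspect = bridges[bridges["length_m"] > MAX_PLAUSIBLE_LENGTH_M]
print(suspect)  # BR002 is flagged: most likely entered in mm, not m
```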

Data Governance is increasingly recognised as an important part of creating usable data solutions for built environment organisations. Buried in the annexes of the previously cited IPA publication 'Transforming Infrastructure Performance: Roadmap to 2030' (TIP to friends) is an Information Management Mandate. Written as an update to the UK BIM Framework, the mandate's requirements include that any UK client organisation should "have a digital mechanism for defining its information requirements… properly specify its information requirements, and their satisfactory delivery… recognising that the information it procures and holds is an important asset with value… and apply the same level of governance and rigour to the maintenance of its information" (Varah, 2021). Over the course of these few short words the mandate establishes Information Management as a fundamental obligation of responsible client organisations. As the mandate imposes the responsibility for managing information on client organisations, so those organisations must distribute that responsibility across their staff. Data governance is the means to do so.

Data management and data governance are not the same thing. Governance frames data management, providing the rules for the game; data management is where the game is played. It is important to understand whether you are governing or managing data, and the best way to tell is this: if you are using the data, then you are managing the data. An analogy here is the difference between the implementation of a BIM Kitemark and working according to the Kitemark.[1] The designer of a business management system is distinct from a user of that system. The manager of a project using the business management system is a user. The former is in the business of governance; the latter is in the business of management.

[1] One of your authors implemented one of the first business management systems aligned to a Kitemark, and then went on to be the first to scale it globally. But that's just boasting.

Figure 5.1 is a real example of an adaptation of DAMA's Data Governance and Data Management model (Fig. 15 of the DMBOK, 2nd ed.) applied to a large global firm. The model in the DAMA DMBOK is linear; the version below creates a circular relationship between the oversight and execution of data governance. It also introduces a third dynamic: infrastructure. The first two, governance and management, are found in the DAMA model. In this firm, governance is a separate central team which holds the business to account for its data management. This example is for a project-based organisation, where the management of data (by volume) is generally not through corporate systems (like finance and HR); the bulk of the data volume sits with the project teams as they handle client and supply chain data to deliver the project works. This is where the third addition comes in: infrastructure. This is the IT organisation, and it is where this model combines the governance of both corporate systems and project technology into one estate.

Fig. 5.1 AtkinsRealis—data governance model. Credit to James Trinder and Andrew Cox

There is an urgent need to shift the frame of governance for data and information management. In most organisations we experience a divide between the 'IT organisation' (or IT department) and what we call the project technology team. Generally, as a function of the fragmentation in the market of technology vendors, we find three main areas of technology implementation, namely:

1. The enterprise/corporate IT. This is where finance, HR, and business-related management systems are procured, configured, and generally mandated for employees and supply chains to use. They are generally treated as a fixed cost of operations.[2]

2. The specialist division/awkward department that uses very specific software. These teams sometimes manage to procure systems outside of the central IT estate, often referred to as shadow IT. These specialist tools can come with a hefty hidden operating cost, though some, such as shielding simulation software in the nuclear industry, are a valid reimbursable cost to a client.

3. Project technology. This is the elephant in the room. Due to the complexity of modern infrastructure delivery, project technology can generate independent IT teams completely outside the jurisdiction of the central IT team. An example here is the provision of IT for joint ventures, especially incorporated JVs (joint ventures where a new company is created between two generally competing firms, like Mott MacDonald and AtkinsRealis). Alternatively, a client or a supplier that sits further up the supply chain might mandate the use of a tool or system that IT hasn't procured and doesn't support (for example, you work as the designer for Balfour Beatty and they mandate the use of Business Collaborator).

[2] Although it's worth noting that the subscription models of vendors that, for example, charge for the amount of storage and compute consumed instead of a flat licence fee are now a difficult variable cost for IT teams to manage.

We have reached a point of project complexity where these tribes can no longer exist independently. We must deploy robust data governance to ensure we are working within important guardrails, such as data-related laws and legislation. There are ethical factors too, but at the simplest level there is a commercial driver in terms of economies of scale and the double or triple handling of vendors. The governance of solution architecture and of data and information management workflows does require a clear cut between oversight and execution. Oversight is required from a point of integrity; there are laws to follow and ethics to keep to. Morality is not something that can be written into laws, but as we enter the new age of generative AI, and as we deploy more intrusive sensors into our built environment, we may find ourselves at the pointy end of regulation that will aim to protect the digital citizen as much as the physical one.

There is another major issue to highlight, and it is our final pitch here for why you need to sign up for DAMA and establish your data governance ASAP. Delivering infrastructure, whether social or economic, is as much an IT project as a 'traditional' building project. IT projects are notoriously expensive, late, and ineffective. Add that to our reputation for being over time and over budget, and you have yourself a recipe for disaster. We unfortunately cannot offer advice here on how to marry these two; we only know that they need to coalesce for us to deliver digitally enabled products and services to the built environment sector. An important point to make here, speaking from our experience, is that this is not about a simple copy-and-paste of management techniques. We know waterfall project management is not effective at building buildings or software. However, one thing is for sure: research and experience suggest that agile software development techniques can't build bridges either. A blend of the two sets of techniques is required.

People experience what we call 'product love.' Product love is what makes common data environments feel more like football teams than substitutable tools. It is a real force: in behavioural economics terms it is known as the endowment effect, whereby people 'stick with what they know' and perceive themselves to have invested emotionally in a particular product. This is the most important function of data governance: breaking the negative impacts of product love, and pointing people and their work towards an optimal outcome, rather than letting them become obstructive out of product love for an obsolete tool.


5.1.2 Pragmatic Enterprise Data Architecture

The hope is to allow a pool of information to develop, grow, and evolve. For this to be possible, the method of storage must not place its own restraints on the information. —Tim Berners-Lee (as quoted in Tufte, Seeing with Fresh Eyes, 2020)

Over the course of this book, we have reflected on the many and various uses of data in the built environment sector. There are professions that are tightly coupled to the software that they use: geospatial information systems, building information modelling, programme planning and controls (or PMO). These couplings have evolved to the extent that their practitioners may not self-identify as users of IT or data services, and may not see IT involvement as an enabler. Rather, they are GIS or BIM professionals who use GIS or BIM to do GIS or BIM. The technology becomes at once a noun and a verb, a system and a purpose. The same is true, to a lesser extent, of other niches such as construction cost management, enterprise asset management, and whole life cost modelling, and will inevitably become true of profession/software duos such as digital twins or digital rehearsals. An IT or data team must not make the mistake of treating specialists as uninformed 'basic' users, but nor can we avoid recognising that there are often common requirements sat behind the idiosyncrasies of particular use cases and tools. As a data professional in this sector, you will face no end of idiosyncratic 'asks' from the business, and you will in turn need to identify which use cases can be standardised, and which genuinely require standalone software. Many client organisations have come undone trying to create a bespoke solution for a business ask from the ground up, when really all that the users required (or could reasonably justify) was a small amount of configuration of a commercial off-the-shelf (COTS) product.

However consequential the data that built environment organisations hold, their means of modelling and storing it likely rely on tooling and standards that would not be unfamiliar in other sectors. Even the most domain-specific taxonomies or schemas—such as UniClass or COBie—sensibly leave the heavy lifting of data management to open formats such as XML, CSV, RDF, or XLS that would not be unfamiliar to a data professional joining from another domain. Similarly, asset-centric products such as Enterprise Asset Management Systems (EAMS) are often user interfaces built on top of standard components like SQL or Oracle databases. So, our use of data is sector-specific in topic rather than tooling. Built environment organisations still have the same CSV files, SQL databases, and Python scripts running on the same public clouds as every other sector; it's only the properties recorded by the data (and perhaps the user interfaces) that are specific to the assets they manage. The technologies mentioned above, GIS and BIM, are perhaps exceptions to this rule, although this means higher barriers to entry for these specialist tools. Even where built environment-specific proprietary ecosystems have evolved, there are parallels in other sectors like product design, aviation, and manufacturing, and likely a lot of commonalities with other products in the 'back end' of the solutions.
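To labour the point: beneath the sector-specific subject matter, a built environment dataset is usually a standard file handled with standard tooling. A minimal sketch (the file name and column names here are hypothetical):

```python
import pandas as pd

# A hypothetical asset register: the format (CSV) and the tooling (pandas)
# are identical to any other sector; only the column meanings are specific
# to the built environment.
assets = pd.read_csv("asset_register.csv")  # e.g. columns: asset_id, type, location, install_date

# Standard, sector-agnostic operations answer sector-specific questions,
# such as how many assets of each type the organisation maintains.
print(assets.groupby("type")["asset_id"].count())
```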


As data professionals we need to become proficient at abstracting away from particular use cases. We need to be able to analyse and generalise the specific asks of the business to understand where their needs are (relatively) unique and where we are best off adopting practice from elsewhere (for example, the use of the public cloud, or SaaS/COTS products). It is worth distinguishing between the layers of modern architecture (e.g., storage, logic, user interfaces). Often users ask for a ‘full stack’ solution where all they really need/want is configuration of a user interface. This isn’t helped by vendors who often see locking a customer’s data into a proprietary tool as a great means of depriving their clients of architectural flexibility and securing a steady revenue stream for the foreseeable future. Being pragmatic means understanding what is and is not unique about your organisation’s needs. You will usually, but not always, best serve the built environment and its data by piggybacking on capabilities developed and used by other sectors, and by avoiding expensive customisation or the accumulation of a panoply of bespoke and disconnected tooling.

5.1.3 A Practical Philosophy (a Data Dogma)

Sometimes working as a data professional means following your intuition, and we have written this section in a deliberately flippant and light-hearted manner. In addition to your vision, principles, and capability, here are a few accumulated hints, tips, and pieces of guidance that will help you to deliver meaningful data services to your organisation. We intend this dogma to function as light-hearted #workhacks rather than binding obligations. But there is a serious point here: if you want to act as an informed and effective organisation, it is important that you provide suitable and consistent guidance and advice to the business and your suppliers. Day-to-day pragmatic (even intuitive) rules of thumb are just as important to realising sage consistency as highfalutin' principles. The following points provide a practical day-to-day philosophy for a data team, but again you should add, remove, and tailor to fit your organisation.

Technical Dogma:

1. Remember that information wants to be free (but that you need to put some safeguards in place before releasing it into the wild).
2. Big IT projects with all-or-nothing deadlines tend to fail. Where possible favour incremental development and deployment over Big Bangs, and Agile over Waterfall.
3. Respond to 'pull' from the organisation/client/end user/customer over 'push' from IT, consultants, or software vendors. Find problems looking for solutions, not solutions looking for problems, and avoid self-licking ice cream cones.
4. Abide by the 80:20 rule. Avoid throwing disproportionate resources at perfection when good is great enough.
5. The customer is always right about what they want, but not always about how to do it. Get them talking about outcomes rather than solutions.


6. Procure competitively, but treat your suppliers well, and seek strategic partnerships. Competition gets you value; being a good client gets you loyalty. Likewise, if you are a supplier, partner where it allows you to offer something better to your clients.
7. Don't forget that your organisation must own its data: where possible use it to create an 'even playing field' across your supply chain and your stakeholders. Don't use data as a diplomatic carrot or abandon it entirely to third parties.
8. Avoid outsourcing core data capability to your supply chain. Use suppliers for specialist capability that you don't need day-in-day-out.
9. Where possible use Public Cloud storage over On-Prem storage over Local storage.
10. Data storage: make use of Platform as a Service (PaaS) solutions over Infrastructure as a Service (IaaS) solutions over Software as a Service (SaaS) solutions. This is because data storage should be as cheap, reliable, and non-bespoke as possible.
11. Applications and user interfaces: make use of SaaS over PaaS over IaaS. This is because applications and user interfaces are expensive and difficult to get right. A vendor that can spread their development cost over a large user base by offering a SaaS product is more likely to be able to build and maintain a quality product that affordably meets most of your requirements.
12. Software: Commercial Off-the-Shelf (COTS) over Bespoke over highly configured COTS over customised COTS. Again, this is where identifying the small percentage of your requirements that are unique is important; those are where expenditure on bespoke products and customisation is worthwhile. You should try to meet every other use case by purchasing products with an existing user base, and once you have done so you should avoid customising those products to the point where they effectively become bespoke and unsupported by the vendor.
13. Architecture: Enterprise over Tactical. In other words, look to deploy capability incrementally, but always have the big-picture end state in mind as the target that you are building up to.

Behavioural Dogma:

1. Data works best alongside IT, but it is not part of IT. The mindset required to run an IT department (service tickets, waterfall delivery, highly structured processes, long complicated projects) doesn't always gel with the creativity of data. That said, you are likely to need IT to support the tooling that underpins your work. As such, you need to work closely with and win over IT, but retain your independence and agile ways of working.
2. Consolidation: Data investment is expensive, but it accrues economies of scale. The more people that use a particular platform, the greater the benefits, and the more pull there will be for the on-going maintenance and continuous improvement of that solution. For this reason, you should seek to use a common toolset over point solutions.


3. Usability: nothing matters if no-one can use the damn thing. Seek to support teams and individual users as they look to switch to working on your platforms, and ensure that you are connecting to other services so that your platforms act as parts of a broader ecosystem.
4. Identify your champions and your SMEs, bring them along the journey with you, give them some ownership and agency, and support them when they hit obstacles.
5. Documentation is important, diagrams doubly so. If someone can't explain something to you visually, then they probably don't understand it.
6. Flag risks diligently and in a forum that allows your organisation to react at the right level (preferably before they happen!).
7. Change control is your friend; don't be afraid to stop work when projects drift.
8. Use bureaucracy and governance to your advantage; don't avoid it, but don't take it too seriously. It can be a source of useful challenge, or of needless delay, but resisting it is rarely an option. Instead, use governance as an opportunity to set the narrative and win over senior sponsors to your work.
9. Only ask permission for things that you cannot do yourself.
10. Work in the open, seek peer review, share your challenges.
11. Ask questions; learn from your peers, your colleagues, other businesses, other projects. Be on the look-out for ideas and insights that you have missed.
12. It's OK to not be an expert in everything.
13. Have fun and be grateful for the opportunities that arise.
14. Your time is yours, own it. Switch off Outlook and Teams. Decline meetings. Make time to focus and to learn.
15. When in doubt, pragmatism is always better than dogma (even this dogma). Assess, act, and repeat.

5.1.4 Ontology (Common Data Models)

The semantic graph is critical, because all our collective intelligence draws on it to make good decisions. —Richard Carter, Cognitive Advantage

Context can be thought of as data's representational system; such a system includes a common vocabulary and a set of relationships between components. —DMBOK2 (page 19)

The work of turning abstract concepts into physical reality is eased by the introduction of standard subdivisions: a means of decomposing what can be impossibly large megaprojects. We have standard ways of dividing work into portfolios, projects, stages, stories, epics, features, and tasks. We also have standard ways of dividing structures into their component parts, and standard ways of dividing resources and skills into specialities and inventories. It is only by subdividing and standardising that building and maintaining structures designed to last for centuries becomes anything other than completely overwhelming and mentally fatiguing.


A taxonomy or ontology is a means to aggregate, to map out commonality between concepts/entities, and to standardise. It is an explicit or implicit artefact that provides us with a common lexicon, and the means to look at parts (subsets) of a wider project or asset system. This shared meaning reduces ambiguity, in turn allowing us to reduce the risk of missing the kind of crucial details that can result in safety and performance issues on the ground.

We are using academic terms such as taxonomy or ontology to summarise a much more organic process. Staff within built environment organisations may not speak of taxonomies or ontologies much, but they implicitly (and sometimes explicitly) create lexicons of shared meaning to speed up their work. When one first starts work in the built environment sector one quickly encounters the dense lexicon (or lexicons) of terms and acronyms. Part of this lexicon involves remembering the standard breakdown structures used to subdivide assets (What's a soffit? What's an abutment?) or subdivide time (What's a task? What's a milestone?). This is how implicit taxonomies manifest themselves in practice. Does the fact that only the nerdiest data nerds are talking about taxonomies make them meaningless academic abstractions? We don't think so. As Steve Whitla, who has devoted his working life to creating shared meaning in organisations, writes: "[a] group can't be performing its song and listening to it at the same time. There must be a third party to reflect it back." Therein lies the role of the taxonomist and the taxonomy, or the ontologist and the ontology.

Much, if not all, of the data that we collect on the built environment will relate back to a taxonomy. This is how language and meaning work; we define terms in relation to other terms. For example, to define a task that forms part of a construction project we probably need to be able to associate it with (a) a taxonomy of activities, (b) a taxonomy of assets, (c) a taxonomy of locations or space, and (d) a taxonomy of resources, in turn related to taxonomies of risk, benefits, costs, and so on. This process, whereby we use our taxonomies to anchor terms in relation to other terms, is how we derive meaning from raw data. Taxonomies associate our data with predefined and meaningful classifications (or, phrased another way, definitions). We can't define what something is without also defining what it is not. If we can make our data meaningful, elevating it into information, then we have gone a long way towards making that data useful to the built environment.

Shared meaning creates common purpose. We have written at length about how complex construction projects and built environment organisations employ a wide range of different specialisms, each with their related-but-distinct lexicons. As a sector we have come close to optimising within those specialisms. To create a built environment that will continue to meet society's needs, and to continue to realise marginal gains, we need to become better at working across specialisms, to create shared meaning and common data.

Many of the applications of data described in the previous chapter require the standardisation and aggregation of data across an organisation's IT ecosystem to work effectively. This might mean combining data on cost, risk, schedule, quality, safety, and carbon to create a single source of truth for BI and reporting. One might then integrate that data with a Common Data Environment to enable 4D planning and Digital Rehearsals.


Finally, one might incorporate IoT feeds from the asset base, combined with a representation of the organisation's business logic, to create a Digital Twin. Implicit in these use cases is the need for a common data model to ensure that we can reconcile our siloed datasets into that single source of truth without loss of meaning. Modelling concepts as an ontology (or a knowledge graph) is one means of creating such a common data model, acting as a record of the entities that an organisation holds data on, and of how those entities relate to one another. An ontology can then act as the 'top level' data model, recording and defining the core data entities in your organisation. As the organisation implements data solutions, this top-level data model will inform the logical and physical data models associated with specific datasets and software, respectively.

The discipline here is in ensuring that we match our investment in data with a commitment to maintain the accuracy of our ontology. At one large infrastructure client, the data team sought to ensure that a reasonable allowance for updating the relevant parts of the ontology and logical data model formed part of any investment in new applications that would collect, store, or transform data. This 'ontology tax' wasn't always a popular choice with projects, but it helped to ensure that the ontology remained an accurate reflection of the organisation's data. This approach aligns well with the roles created under a data governance framework, where part of the role of a Data Steward should be to ensure that the taxonomy or ontology continues to reflect the logic of the datasets that they manage.

It is important to create common models with the business, not simply for the business. It is imperative to test models against the heuristics, tacit knowledge, and experience that colleagues possess. Doing so presents an opportunity to quickly improve the data models, as well as to demystify data modelling and make it accessible to subject matter experts. Presenting subsets of data models in a format that users will recognise and understand—for example as a mind map—helps to make the exercise feel less esoteric. Figure 5.2 is an example of a (relatively) simple ontology (knowledge graph) for a highways network. Ontologies can become complicated as you add entities and relationships, but there are useful tools such as OntoPop or Graphite that you can use to simplify how you visualise them, allowing you to review only the relevant parts with your stakeholders (Fig. 5.3).

In summary, the benefits of using an ontology model include creating consistent terminology for an organisation (and potentially its supply chain). Taxonomies and ontologies are both human-readable and machine-readable, meaning that they can facilitate both discussion and interoperability. They can highlight how the organisation thinks about data across organisational boundaries and siloes, identifying data overlap and commonality between different business areas. And they can provide a clear overarching structure which will help to ensure that logical and physical models align to a common set of data concepts, and that libraries of information have a logical structure and are easily searchable.

It is worth noting that the tooling available for ontology development isn't a particularly mature market segment. One leading tool is Web Protégé (and its desktop equivalent, Protégé), a free tool developed and supported by academics at Stanford University.


Fig. 5.2 Visualisation of an ontology using Protégé

Fig. 5.3 Visualisation of an ontology using OntoPop

There are more modern tools for ontology development and visualisation. OntoPop (ontopop.com) is a promising open-source visualisation tool first developed at National Highways by one of your authors, whilst OntoKai (a proprietary tool by Kaiasm) is a sophisticated ontology curation tool, and Graphite Concept Manager performs many of the same functions as Web Protégé but in a shinier SaaS product.
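To make this less abstract, here is a minimal sketch (assuming Python and the rdflib library) of how one might begin to express a highways ontology in code. The namespace, class names, and 'crosses' relationship are illustrative assumptions rather than extracts from any real model; the serialised output is the kind of file that tools such as Protégé can open for review.

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS

EX = Namespace("http://example.org/highways#")

g = Graph()
g.bind("ex", EX)

# Declare a handful of entity classes for a highways network.
for cls in ("Asset", "Carriageway", "Structure", "Bridge"):
    g.add((EX[cls], RDF.type, OWL.Class))

# Entity hierarchy: a Bridge is a Structure, which is an Asset.
g.add((EX.Carriageway, RDFS.subClassOf, EX.Asset))
g.add((EX.Structure, RDFS.subClassOf, EX.Asset))
g.add((EX.Bridge, RDFS.subClassOf, EX.Structure))

# A named relationship between two entity types.
g.add((EX.crosses, RDF.type, OWL.ObjectProperty))
g.add((EX.crosses, RDFS.domain, EX.Bridge))
g.add((EX.crosses, RDFS.range, EX.Carriageway))
g.add((EX.crosses, RDFS.label, Literal("crosses")))

# Serialise to Turtle for review with stakeholders.
print(g.serialize(format="turtle"))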


5.1.5 How to Build and Use an Ontology for Construction

Any sort of classification of concepts is going to be, at least in part, subjective and arbitrary. Whilst you can use existing top-level ontologies for inspiration and structure, in practice these rarely express concepts in a manner that is meaningful to business users. (Multiple upper ontologies exist, with different structures and setups, including BFO (Basic Formal Ontology), GFO (General Formal Ontology), BORO, DOLCE, COSMO, and more. Different project needs may be better suited to one upper ontology over another: for example, end-user ease of use versus correctness versus succinctness. An organisation's 'domain ontology' might develop by using an upper ontology as its foundational building blocks, fitting the domain-specific information within the structure of the selected upper ontology. Note that upper ontologies give a structure for the entity hierarchies only, not for the relationships that connect them.)

An ontology should establish a common language that an organisation can use when describing their data (and what that data in turn describes). The form should be sufficiently self-evident and meaningful to facilitate conversations with the people who use that data daily (and their bosses). When defining entities, simplicity and flexibility are key. It is worth starting with universal or abstracted concepts that provide an overview of the organisation's key concerns. These concepts will be your top-level 'parent entities,' and you will then break those parent entities down into whatever child entities you are likely to need when describing the specifics of individual use cases and linking entities out to specific IT systems. There are many freely available top-level ontologies that provide you with a structure for a new ontology, but to the uninitiated these can be highly technical and difficult to interpret. So there is nothing wrong with starting out by mapping concepts in a way that is meaningful to you. You may start with parent entities such as:

• Time: a temporal position on a timeline.
• Resource: an object with intrinsic value.
• Concept: an abstract shared idea.

Like primary colours, you can then use these three parent entities to create the first generation of child entities. The more colours and shades you produce, the more verisimilitude you can create in your work. However, you should remember that you must always be able to refer back to a more abstracted representation, to avoid your common language requiring excessive time to understand. Useful child entities might include:

• Asset: a physical resource with value, child of Resource.
• Person: an instance of Homo sapiens, child of Resource.
• Organisation: a temporal group of Persons with a shared legal entity (and purpose).
• Duration: an interval between two points in time.
• Activity: an action performed by a Person over a particular duration.
• Cost: financial assets committed when a Person or Organisation performs activities over time.
• Role: a position held by one or more Persons that entitles them to perform an activity.
• Construction: an Activity, with a Cost, performed by Persons with Roles, which in turn creates new Assets.

And so on, for as long as is necessary to describe the organisation's uses of data without undue complexity. As the ontology expands through layers of child entities, it is important to also define the relationships between your entities. These combinations of two entities with a vector between them are known as triples, and they will be the building blocks of your data logic. Examples might include:

• [Activity] [is completed by] [Person]
• [Cost] [is accrued by] [Organisation]
• [Cost] [is calculated using] [Time]
• [Construction] [is delivered over a] [Duration]
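Because triples are so regular, even a few lines of illustrative Python can store and query them. The toy sketch below hard-codes the example triples above; it is not how you would implement a production triplestore, but it shows why a structure that a colleague can read aloud can also be traversed by a machine.

# Each triple is (subject, predicate, object); '?' acts as a wildcard.
TRIPLES = [
    ("Activity", "is completed by", "Person"),
    ("Cost", "is accrued by", "Organisation"),
    ("Cost", "is calculated using", "Time"),
    ("Construction", "is delivered over a", "Duration"),
]

def match(subject="?", predicate="?", obj="?"):
    """Return every triple matching the given pattern."""
    return [(s, p, o) for (s, p, o) in TRIPLES
            if subject in ("?", s) and predicate in ("?", p) and obj in ("?", o)]

# Which entities does Cost relate to, and how?
print(match(subject="Cost"))
# [('Cost', 'is accrued by', 'Organisation'), ('Cost', 'is calculated using', 'Time')]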

The use of triples allows us to define the logic of our data models, which in turn determines how we present data using software. If you cannot define your logic or outcomes in terms of triples, then you have no assurance that you can implement that logic in a product or derive value from it. For example, if you wish to combine spatial data with programme data (e.g., integrating Primavera with your BIM CDE), you should be able to define the logic of that integration using triples before you implement it in code. Carefully constructed triples have the added benefit of being both human-readable and machine-readable.

Finally, once you have defined your entities and their relationships, you can start to catalogue the instances of those entities and relationships. For example, if you have defined entities for dataset, software, and role, you can in turn catalogue the instances of dataset (Finance, CRM, design models), software (ERP, document management, file sharing), and role (BIM Manager, Project Manager, Information Controller). You can then define the sensitivity of your software using the instances of relationships between dataset and software, and the rules of your role-based access control (RBAC) as instances of relationships between dataset, software, and role.

Conforming data to an ontology can help to ensure that the datasets across an organisation 'speak the same language', facilitating wider integration by data engineers and exploration of data by users. Ideally all projects building new software solutions or databases should conform to the client organisation's ontology. In practice this means checking that there are entities in the ontology that reflect the data held within the system, and that there are relationships within the ontology that reflect how the organisation uses that data. A central data modelling capability can support software projects in reviewing and (if necessary) updating the ontology. Similarly, new software solutions or databases should have a documented logical data model that follows a method such as crow's foot notation. Where a project is an evolution of an existing system, the project should check that the ontology and data models exist and still accurately reflect the system, updating or originating them where necessary to maintain accuracy.

Modelling data logic in this way isn't a purely academic exercise: it is work that allows us to begin to define the data that should be held in systems (for example, a cost management system), how that data should be presented in the user interfaces of those systems, and, most importantly, the data that must be common across systems (e.g., asset breakdown structure, roles, calendar, etc.).
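To make the idea of cataloguing instances concrete, the deliberately simplified sketch below chains two instance-level relationships (which software holds which dataset, and which role may access which dataset) to derive the systems a given role should see. All names are hypothetical, and a real implementation would live in a data governance or identity platform rather than a script.

# Instances of relationships: which software holds which dataset...
holds = {
    ("ERP", "Finance"),
    ("Document management", "Design models"),
    ("File sharing", "CRM"),
}

# ...and which role may access which dataset (the RBAC rules).
may_access = {
    ("BIM Manager", "Design models"),
    ("Project Manager", "Finance"),
    ("Information Controller", "Design models"),
    ("Information Controller", "CRM"),
}

def systems_visible_to(role: str) -> set[str]:
    """Chain the two relationships to derive a role's permitted systems."""
    permitted = {dataset for (r, dataset) in may_access if r == role}
    return {system for (system, dataset) in holds if dataset in permitted}

print(systems_visible_to("Information Controller"))
# {'Document management', 'File sharing'}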

5.1.6 Starting with an 'Entry-Level' Taxonomy or Business Glossary

If a full-blown ontology sounds too onerous then a first step might be a taxonomy or business glossary. The purpose of an entry-level taxonomy is to function as an agreed, visible, and accessible list of core terms, their definitions, and their hierarchical relationships across an organisation or project. Unlike a full-blown ontology, it does not have philosophical concepts at the top level. It should function as a catalogue rather than an encyclopaedia.

The entry-level taxonomy should capture and define the top few hundred concepts used by the business in their existing explicit or implicit data specifications. It should flag existing taxonomies (such as project/work/cost breakdown structures) and anchor these to one or more entities in the taxonomy. In this way it can signpost where other, more detailed, data definitions exist.

Due to its relatively small scale, it should be possible to handcraft the entry-level taxonomy without reliance on automated tooling. To gain traction within the organisation, without consuming undue resources, the entry-level taxonomy should be:

1. Reasonably visually appealing,
2. Reasonably easy to maintain and update,
3. Reasonably easy to share,
4. Reasonably clear to non-technical users.

An example of a format/process that might meet these requirements is a simple visualisation of an .rdf file output as a .pdf file. This would allow a super user or users to update a centrally hosted .rdf file and then share the .pdf with colleagues (a minimal code sketch of building such a file follows the list below). The entry-level taxonomy will have direct and indirect uses, but will have the greater impact indirectly, through its use to inform other systems. Use cases that we want to be able to target with the entry-level taxonomy include:


1. A means of presenting/sharing high-level data entities and definitions with interested parties to answer the question "what does our data landscape look like?" (customer: CDO, CIO, Exec).
2. Creation and maintenance of a 'business glossary' of terms in enterprise document management systems and intranet sites to improve folder structures and search results (customer: Information Management, Document Control, Knowledge Management, Quality Assurance).
3. Informing data engineering transformations, and the terms used in models created for business intelligence/self-service reporting (customers: Data Engineers, Performance Analysts, Project Controls, PMO).
4. Forming the basis for the Entities and Reference Data Library input into Data Governance software by an Information Management team (customer: Data Governance/Information Management team).
5. Acting as the 'top level' of data architecture (e.g., any conceptual/logical/physical data models should be traceable back to entities in the taxonomy), and tying into future business architecture (e.g., the terminology used in business architecture should be referenceable back to the taxonomy and/or the taxonomy should update to reflect the terminology of our business architecture) (customer: Enterprise Data Architects).

Investing time in creating artefacts like taxonomies and ontologies can be an act of signposting an organisation to itself. This should in turn help staff to navigate the structure and function of a complex organisation and find the information and connections needed to do their work.
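As a sketch of the handcrafted format suggested above, the following snippet (assuming Python and rdflib) records a few terms, their plain-English definitions, and one hierarchical relationship using the SKOS vocabulary, then writes the centrally hosted .rdf file; the terms and file name are placeholders, and the visual .pdf rendering would come from separate tooling.

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

TAX = Namespace("http://example.org/taxonomy#")

g = Graph()
g.bind("skos", SKOS)
g.bind("tax", TAX)

# A handful of core terms with plain-English definitions.
terms = {
    "Asset": "A physical resource with value.",
    "Bridge": "A structure carrying a route over an obstacle.",
    "Milestone": "A scheduled point of progress on a project.",
}
for name, definition in terms.items():
    g.add((TAX[name], RDF.type, SKOS.Concept))
    g.add((TAX[name], SKOS.prefLabel, Literal(name, lang="en")))
    g.add((TAX[name], SKOS.definition, Literal(definition, lang="en")))

# Hierarchical relationship: Bridge sits under Asset.
g.add((TAX.Bridge, SKOS.broader, TAX.Asset))

# Write the centrally hosted .rdf file for the super user to maintain.
g.serialize(destination="entry_level_taxonomy.rdf", format="xml")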

5.1.7 Search and Classification

The average information worker spends an estimated 28 percent of the workweek managing email and nearly 20 percent looking for internal information or tracking down colleagues who can help with specific tasks.—McKinsey

Search is one of the simplest use cases to describe, but often one that is sorely lacking in practice: the ability to find reliable information. Any built environment organisation will be both a custodian and a user of information. They maintain a record of their assets and rely on that same record to do their work. As an employee of any built environment organisation, your ability to work effectively depends, at least in part, on your ability to find, use, and store information. And gosh, do we make it hard sometimes! Common information management activities include:

• Controlling the dissemination and version control of vital and/or sensitive documents and other information types.
• Recording information, for example through surveys or sensors.
• Exchanging information and co-creating information with colleagues, suppliers, stakeholders, and customers.
• Generating new records as we progress through project activities such as design and options development.
• Purchasing/accessing/integrating with new sources of information.
• Receiving and signing off contractual deliverables.
• Recording new hazards, risks, and safety incidents.
• Handing 'as built' information back to the operations and maintenance function of an organisation, or to a client organisation.

All these activities rely on fit-for-purpose information repositories which make it easy for colleagues to find, access, use, update, and archive information. Where these information management capabilities are underdeveloped, organisations are in effect asking staff to spend an excessive amount of time trying to find information to enable their work. And let's be clear: information management capabilities are almost always underdeveloped. The extent to which we are missing out, particularly when we work for built environment organisations, is in part hidden by our low expectations from years of adjusting to suboptimal solutions. We take it for granted that we can ask Google or ChatGPT to tell us the most intimate details of any random celebrity's love life instantaneously, but if we want to find the latest version of an important document at work then we resign ourselves to spending a good part of our workday crawling through folders and inboxes or (shudder) speaking to one of our colleagues. It might seem like an extreme statement, but we would be willing to wager that there isn't a single sizeable built environment client organisation that provides its staff with comprehensive enterprise search capability (e.g., the ability to search across systems) using natural language queries, or that curates a single-source-of-truth repository that maintains a consistent master set of organisational documents. This is the exact capability that we take for granted when we 'Google' something, but which sounds like science fiction when applied to a work context.

Without sophisticated search tooling, making files 'searchable' means ensuring that there is sufficient and complete metadata (data about data) recorded against each file to create indices and folder structures that allow for easy navigation. Metadata can be a resource-intensive pursuit, and often requires us to deliberately slow down the work of our colleagues with prompts for metadata fields when all they want to do is store a file or document somewhere. Maintaining and adapting folder structures to meet the needs of complex organisations is also time consuming. Just as a library or bookshop functions when there are staff on hand to make sure that books make it to the correct aisle and shelf, so the storage locations in an organisation's environment require proactive management. In this way folder structures are a manifestation of metadata: in other words, it is the metadata that lets you know that you have placed the record in the correct location. In the case of a bookshop this is a physical location; in an information management context it is the folder structure. This is very much the status quo on most built environment projects at present, with enterprise search capability entirely dependent upon metadata and hierarchical folders. A big drawback here is that one person's logical structure is another person's confusing mess. Few of us organise our home libraries using the Dewey decimal system, but there is little room for personal preference in organisational metadata systems. Contrast this with the logic of contemporary search algorithms, which weight search results based upon more dynamic and responsive criteria such as references from other sources and usage metrics.

Of course, the comparison between search engines and corporate intranets is a little misleading. As search expert, and general clever clogs, Liam McGee explains: we value search engines for their ability to traverse a vast range of content, but we rarely expect them to go particularly deep (when's the last time you even looked at the second page of search results?). By contrast, we secretly hope that our organisation's intranet will provide a deep and authoritative trove of information, all the way down to the various iterations of documents specific to our personal work. None of us expect this level of service from Google (not yet at least). To meet this hope, the graph of information used for intranet search would need to be both smaller and denser than the equivalent graph for a search engine. In this case it is likely that the intra-organisation graph requires manual curation, where this would be unthinkable for a more general search engine.

Metadata is a useful discipline: it helps keep our records structured and suitably labelled, and it will always be part of a search strategy (albeit a part that we should try to automate wherever possible). However, metadata alone does not provide a positive user experience. Most users do not understand how to filter by metadata. And why should they? We've all been Googling and Binging for 20+ years now (and more recently talking to Siri and Alexa, and now Bard and Copilot and ChatGPT). However, as we have described, the seamless user experience that we take for granted when using Google doesn't naturally extend into specialist subject matter like the specifics of a built environment project. This is because the Google algorithm works by building an extremely broad, but necessarily abstracted, model of terms. To provide a similar level of experience within your own domain you need a confluence of techniques, namely:

• Switching on cloud search services (such as Elasticsearch on Azure) which more closely follow the search-engine-style experience, whilst keeping the user interface unchanged.
• Building out an indexing and insights service, based upon a combination of Natural Language Processing (or NLP, a form of machine learning) and ontology building (human-created, subject-matter-specific logic).
• Prompting users to input a deliberately short but vital set of metadata parameters when they first save documents in an organisational folder structure.
• Where possible, looking to integrate services so that it is possible to search across multiple stores of information and/or enforce consistent folder structures and naming conventions across different platforms.

The advantage of taking a cloud-based, ontology-driven approach to search is that you can re-use your model across use cases, for example your BIM CDE, SharePoint/Google Docs or equivalent, contract management, and document transmittal tooling.

It is unrealistic for most built environment clients to move to a single storage location for all data: there is just too much variety in data types and use cases, and too much legacy technology. But effective use of well-established techniques (information management, metadata, taxonomies/ontologies) and newer technologies (NLP, cloud platforms, data governance tools) holds the prospect of finally knitting together data across an organisation's estate.
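In production this capability would come from a search platform (Elasticsearch, a cloud cognitive search service, or similar), but the underlying mechanics are worth demystifying. The toy sketch below builds an inverted index over two hypothetical documents and combines free-text search with a metadata filter: the two ingredients that the techniques above scale up.

from collections import defaultdict

# A toy document store: each record is metadata plus extracted text.
documents = {
    "DOC-001": {"title": "Bridge inspection report", "discipline": "Structures",
                "text": "annual inspection of the abutment and deck joints"},
    "DOC-002": {"title": "Drainage design note", "discipline": "Drainage",
                "text": "proposed realignment of the carriageway drainage"},
}

# Build an inverted index: token -> set of document ids.
index = defaultdict(set)
for doc_id, doc in documents.items():
    for token in (doc["title"] + " " + doc["text"]).lower().split():
        index[token].add(doc_id)

def search(query, discipline=None):
    """Keyword search, optionally filtered by a metadata field."""
    tokens = query.lower().split()
    hits = set.intersection(*(index[t] for t in tokens)) if tokens else set()
    if discipline is not None:
        hits = {d for d in hits if documents[d]["discipline"] == discipline}
    return hits

print(search("inspection deck"))                  # {'DOC-001'}
print(search("drainage", discipline="Drainage"))  # {'DOC-002'}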

5.1.8 Unstructured Data Management

Whilst most of this section has focussed on alphanumeric data, effective data management is equally important for document, model, and spatial data. Documents function as a vital conduit of information and store of knowledge, whilst design models and other spatial representations act as a vital visual index during construction of assets and (sometimes) beyond handover into operations and maintenance.

In our definition of 'document data' earlier in this book we included the distinct types of 'unstructured data' files that a construction project will manage, including reports (Word, PDF), design documents (drawings, sketches), and imagery (photos, videos, point clouds). These files will range dramatically in both format and size. What distinguishes document data is that it is not inherently structured or machine-readable. The data does not follow a defined tabular format (as per alphanumeric data) or necessarily align to a spatial or geometric referencing system. Whilst it is possible to extract both alphanumeric and model data from document data, document data management capability exists to manage document data in its raw format.

Document management brings with it a range of standard ways of working informed less by IT and more by fields such as quality and assurance, records management, and librarianship. Consequently, in managing document data you may seek to comply with external standards, including ISO 9001, 10008, 14001, 19650, 27001, and 45001, all of which have requirements for document control and information management. Use cases for which an organisation might require compliant data management include:

• Exchange of information with customers/stakeholders.
• Exchange of information with suppliers (notably designers and survey partners).
• Internal storage, review, and assurance of documents.

Whilst ideally an organisation should have a single systematic process for managing documents, often they end up using different document management software to meet different use cases. For example, business requirements might be based upon clerical ways of working where a standard tool like Microsoft SharePoint is the preferred means of information management and exchange. However, compliant document management for sizable construction projects requires capability above and beyond what SharePoint can provide. This means that managing document data may require both corporate Microsoft tools like SharePoint, Teams, and Office as well as a Digital Construction Common Data Environment (CDE) provided by a specialist vendor. In this case, integration and clear ways of working are crucial to avoid duplication of information across systems.

3D design model data is the most sector-specific of our data types. Organisations from every sector use alphanumeric and document data, but typically only organisations concerned with physical assets use model data. Consequently, BIM platform offerings are less numerous, more specialised, more expensive, more proprietary, and generally less open than the wide range of solutions for numbers and documents. However, the market is increasingly offering integration solutions for BIM products, and organisations shouldn't allow their 3D models to become a separate information silo from the rest of their data.
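Much of compliant document management reduces to enforcing agreed conventions at scale. As a hedged illustration, the sketch below audits a folder of project documents against an ISO 19650-style container naming pattern; the exact pattern and folder path are assumptions, and in practice both would come from your project's information standard.

import re
from pathlib import Path

# Illustrative ISO 19650-style pattern:
# Project-Originator-Volume-Level-Type-Role-Number, e.g. PRJ1-ACME-XX-00-DR-S-0001
NAMING_RULE = re.compile(
    r"^[A-Z0-9]{2,6}-[A-Z0-9]{2,6}-[A-Z0-9]{2}-[A-Z0-9]{2}-[A-Z]{2}-[A-Z]-\d{4}$"
)

def audit_store(root):
    """Return the files in a document store that break the naming convention."""
    return [str(p) for p in Path(root).rglob("*")
            if p.is_file() and not NAMING_RULE.match(p.stem)]

for offender in audit_store("./project_documents"):
    print("Non-compliant file name:", offender)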

5.2 Delivering Successful Data Projects

One of the themes of this book is that, for built environment organisations at least, investment in data presents both a massive opportunity and a considerable risk.

There is the positive case: the world is digitising; every sector is digitising. The ability to value and leverage data is integral to the success of organisations. The value of data will increase for the foreseeable future as widespread adoption of AI and automation drives new use cases. Marginal costs will fall; advanced capabilities will become low-code and commodified. The baseline against which we judge organisations—even humble construction projects—will continuously move towards greater data adoption.

And then there is the negative case: investment in data capability remains expensive, and is not always successful. The more data you hold—particularly when it is personal data—the greater your risk exposure to data loss and breaches. Data is costly by default; it needs to create value to warrant its management. Data projects can fail at any point in their life cycle. They may fail in development and never deliver any product. They may fail to attract a user base and never achieve adoption or realise value. In some cases, they may fail long after 'go-live' through sheer neglect!

So, we're damned if we do and we're damned if we don't? Well, not quite. Whilst the tide of technological progress will continue unabated regardless of our decisions, we can at least take responsibility for maximising the likelihood that our own data investments bear fruit. Whilst no data project is without risk of failure, one can bring method to the madness. The best approach to delivering data projects may vary by organisational context, but for data projects in most built environment organisations adhering to something like the following principles is likely to help:

• Use the customer: take time to understand what they value, and then relentlessly focus on realising that outcome. Keep them involved throughout the process, including helping them to adopt the eventual product(s).
• Be deliberately boring: where possible focus on methods and solutions proven elsewhere, particularly in similar but more mature industries such as manufacturing, aviation, or logistics.
• De-risk through design: remove obstacles during the design stage to ensure affordability and deliverability.
• Understand the processes (or lack thereof) of your IT department and bake those into your planning. Engage early with potential blockers (for example, cyber security) and encourage them to help you de-risk your design.
• Be agile: focus on the MVP but think life cycle.
• Seek outside expertise for sense-checking.
• Work in the open.

Following principles like these focuses data projects on realising incremental value and approaching tasks with a humble and collaborative mindset. When challenges and setbacks occur, and they will, an incremental project with clear sponsorship from the wider organisation will be more able to respond and adapt, and is more likely to have a clear value proposition that enables it to continue (Fig. 5.4).

Fig. 5.4 Delivering successful data projects

5.2.1 Delivering Through Traditional IT Functions

Working with or as part of an IT function within a built environment organisation is equal parts fascinating and frustrating. In other sectors it is common to have a CDO on the board with a standalone data team. Whilst this is not unheard of within built environment organisations, it is far from the default. What is almost always true, however, is that there is a sizable, somewhat old-fashioned, and occasionally overwhelmed IT department behind the scenes.

As a data professional you may find yourself working as part of an IT department, leading a standalone function outside of IT (perhaps with a dotted line to the CIO/CTO), or working within another function (for example, reporting to the CFO or COO). But you will almost always have some dependence on an IT department that isn't prepared to move at the kind of speed you want to move at. There are a variety of reasons why there may be a disconnect between your data aspirations and what IT can offer you. IT may be unfamiliar with modern data techniques, or too averse to risk to use them, or waiting on approval from a cyber security team who see anything in the cloud as a liability. There may not be contracts in place to bring in the specialists required to design and deploy the infrastructure needed to host data solutions, you may not have access to sufficient budget, or data simply might not be high enough up the list of priorities for an organisation struggling to meet other business demands or drowning in legacy IT. Whatever the source of tension with (or within) IT, how you manage this tension will be critical to your success in delivering value through data (Fig. 5.5).

Fig. 5.5 Delivering value from data services in the context of an IT department

As data capabilities and solutions have become more common and prominent in the sector, they have started to work their way into existing organisation structures. Built environment organisations increasingly recognise data as a business need distinct from core IT, and data teams have sprung up in all but the most constrained or immature organisations. However, the level of funding, independence, agency, seniority, and resourcing afforded to data professionals varies enormously. How you approach and work with IT will necessarily vary depending on the politics of your organisation, your relative place in the pecking order, and the degree of independence and autonomy that the organisation grants to its data functions (or factions). There will almost always be some dependency on IT to provide hosting, security, and support, but some data teams will languish in their isolated and unloved sandboxes whilst others will be empowered to fundamentally change the enterprise architecture of their organisation.


5.2.2 Working with Legacy Practices

Public sector organisations rarely go out of business in the way that a private company might. Rather, they change form and remit according to the political priorities of the day. This can mean that whilst the name on the door might vary from one inoffensive title to another, the underlying organisation, including its IT, muddles through largely unchanged. Whilst this is reassuring from an operational perspective (one cannot expect the public to stop consuming public services whilst a government organisation goes through restructuring), it means that investment in data occurs in the context of an organisation with substantial history and technical debt (consider, for example, the transition from the Highways Agency to Highways England in 2015, and then to National Highways in 2021).

At a large infrastructure organisation, change was the only constant. The organisation's size and regionally federated structure made it too unwieldy for HQ to ever fully control the use of emerging technologies. This provided opportunities for data-savvy teams within the organisation and its supply chain to pick up and exploit emerging use cases. This might look like individual asset disciplines creating their own asset management tools, regions developing their own operational decision-support tools, or suppliers using open data feeds to sell self-contained web-based data analytics tools back into the organisation. IT's job became less about developing new capabilities, and more about retrospectively finding ways to support shadow IT tooling as it popped up across the organisation. In this way, data capabilities became scattered across the organisation and its extensive consultancy supply chain. On the one hand this process was organic and value-driven; on the other hand it was chaotic, of varying quality, created an information security risk, and was needlessly redundant.

At the HQ there was a large central IT department, a few multi-million-pound standalone digital transformation programmes, and specialist teams dedicated to GIS, portfolio planning, analysis, enterprise asset management, innovation, network modelling, and other specialist use cases. These HQ teams saw plenty of friction between them, often driven by the lack of capacity for core IT (who were battling a range of legacy issues) to bring novel solutions into service. Sometimes the organisation had a CDO, other times a CDIO or CIO, sometimes all of the above, but even within the HQ there was patchy centralisation of data capability.

The federated nature of the organisation compounded this complexity. As power and funding moved from HQ into the regional offices, each region began to grow its own in-house data capability. This capability was, initially at least, rarely labelled 'data', but often involved large teams of analysts and data engineers working in 'performance' teams. Whilst the regional offices were, in theory at least, still obliged to use the same central IT infrastructure, there was in practice little control over what the supply chain developed on their behalf. While IT locked down corporate hardware, cloud technologies made it possible for suppliers to provide staff with access to a wide range of off-the-shelf and bespoke web-based services.


The example above is an 'operational/asset management' organisation, i.e., an organisation primarily focused on the management of existing assets. By their nature, these organisations tend to have long shelf lives (even if the names and management change from time to time). This gives them time to accumulate legacy IT issues, specialist teams, deep supply chains, and lots and lots of software and politics. When one of your authors found himself enlisted to help spin off part of an infrastructure network to a regional government authority, he was handed a spreadsheet listing 600+ IT systems and asked which ones the new operator would want to continue using. The list included plug-ins for Microsoft Excel, highly bespoke mainframe databases, instances of customised enterprise software, specialist engineering applications sold by one-man-band companies, and everything in between. In such a complex legacy environment there was perilously little space to talk about data concepts or data flows. Instead, data sets were implicitly associated with systems. So, for example, to know where the organisation held all the data for a particular asset type one needed to know the names of the systems used to hold that data (usually silly acronyms). This unfortunately encouraged the new operator team to start with a blank-slate system and simplified data model, even at the expense of leaving valuable historic data behind following the split.

Your author witnessed a similar systems-over-data phenomenon at another large infrastructure organisation. Here the biggest challenge to launching an enterprise data platform that would deliver data aggregation for reporting at scale wasn't technological, financial, or even political. Rather, the biggest pushback came from the cottage industry of internal staff that had sprung up to deliver 'analytics' by manually extracting data from source systems and manipulating it in spreadsheets or unsupported databases. The problem wasn't so much that these individuals' work couldn't be massively improved through automation, but that these human agents in the machine had become the arbiters of what data was 'correct' and available to inform analysis. Sure, their work was more resource-intensive and prone to error than an automated pipeline, but their labour costs were baked into the organisation (and therefore, for all intents and purposes, free), and their dashboards came with a reassuringly human seal of approval. It's easy to get angry at a corporate database if it gives you the wrong answer; it's harder to get angry at Bob the in-house analyst, or to even realise that Bob is misleading you.

Your author's experience of fighting against what one consultant memorably called the 'mandraulic' approach to data management and data analytics continued into a large-scale construction project. Here, by the time your author arrived, every use of data that the project could conceive of occurred within disconnected spreadsheets or through outsourcing to the supply chain, with the substantial costs conveniently hidden in payroll and much broader contracts. In such an organisation, arguments for upfront investment in better, more scalable, less fallible means of managing and deriving value from data face an uphill struggle. The cost of the new investment is there, in black and white, whilst the BAU of a large organisation hides the cost of existing inefficient ways of working. And so inevitably the challenge comes: "why do we need this when we are doing just fine without it?"


5.2.3 Working for Construction Projects

Organisations or teams focused on the design and construction of new assets appear following agreement of the financial and legal framework for a project (for example, following a Development Consent Order (DCO)). These may be arm's-length divisions within large infrastructure owners (for example, National Highways' Lower Thames Crossing programme), or standalone legal entities that will only hand over the asset to the infrastructure owner on completion (for example, Thames Tideway). These organisations usually have the advantage of a blank sheet of paper and no legacy IT. However, large construction projects bring their own challenges, including:

– A wide range of stakeholders and users' needs to meet.
– Use cases that don't all fit within the traditional remit or competency of IT, including many of the applications that we discuss in the next chapter such as BIM, Smart Buildings, and Digital Twins.
– The need for rapid mobilisation to support the project on Day 0, followed by frequent changes in organisational needs and structure as the project life cycle progresses from planning, to design, to construction, and ultimately handover.
– Unclear division of responsibilities between the client organisation and the supply chain including, crucially, what data each owes the other.
– The requirement to hand information back to the end customer in whatever format (usually an old-fashioned legacy IT format) they dictate.

All of which means that working on these projects can be frantic and frustrating, particularly in the early days. The role of the client organisation varies across projects, as does the role of IT within those organisations: thin clients seek to outsource almost all non-governance work to suppliers (including data and IT), whilst informed/intelligent clients take on the risk of mastering data and hosting systems themselves. Responsibility for technology will inevitably end up split between IT, the supply chain, and specialist teams such as BIM and PMO. Your role as a data professional on a project like this will often involve trying to keep the peace between competing customers and trying to ensure that the data landscape doesn't become ridiculously siloed, as well as helping IT to support the construction SMEs without teaching them how to 'suck eggs'.

5.2.4 Delivering into Operational IT

Working as a data professional in this environment you may find reduced capacity within the IT department to help you. This can be because most of their time and resources are consumed by maintaining the legacy environment. Moreover, your IT department may see today's novel solutions as tomorrow's legacy headache. This does not necessarily imply a resistance to change. Rather, it is an understandable aversion to the workload that comes from having to paper over the cracks of solutions that are either not enterprise-ready, or for which the enterprise is not ready. Extensive and onerous processes for transitioning services from 'Dev' into 'Prod' don't come from a desire to be obstructive, but from a position of having been burned before. A preference for enterprise-grade solutions (rigorously tested, scalable, secure, and converged around strategic platforms) may be beneficial for core enterprise services, but will tend to result in a strategy and governance processes that implicitly (or sometimes explicitly) discourage small, incremental changes. Any agile data approach that favours rapid prototyping will collide with governance processes designed for large waterfall projects without much (any) allowance for learning by doing (or, as the tech bros say, failing fast). Consequently, you might find that your plans to create data capability elicit responses such as:

• We've already got Oracle/SAP/Salesforce/ServiceNow/insert other enterprise software with rudimentary dashboarding functionality… can't we just use that?
• You can't run a project without a dedicated Business Analyst, Project Manager, Enterprise Architect, Business Change Consultant, Cyber Security Manager (keep adding roles until the project becomes unaffordable).
• I know you're working agile, but tell me what your milestones are, and where is your Gantt chart? Exactly what are you going to deliver and when?
• I can't give you or your team rights to use Python or install packages without every line of code being reviewed by our cyber security team; you'll then have to request 'elevated rights' every 15 minutes to access the platform (approval SLA 8 hours).
• Every time you want to do something in a VM you will need to raise a 'small change' request and wait 8 weeks for approval.
• IT support looked at your data sandbox architecture and said that they'll need £XXX,XXX per year to support it.
• We can't let you use real production data in your environment, it's just too much of a security risk.

This kind of reception can be incredibly dispiriting and slow your work down to a crawl. What's worse, your business customers won't really understand or care about these esoteric challenges. As far as they are concerned you are part of IT, therefore any delays are your own fault, and you are part of the problem.

There are always reasons not to do things. A degree of risk aversion is probably desirable when one is making changes to the fabric of our world. As a data professional in the built environment, you will often inherit the governance structures of non-data projects. Moreover, there are data-specific risks that are new and potentially threatening to organisations across the built environment sector. The importance of cyber security functions has increased in parallel with that of data functions, and their incentives and risk tolerances do not always align. Whilst more enlightened cyber colleagues may recognise that failing to deploy any data capabilities will hardly improve our 'security posture,' you will encounter many who simply do not have the inclination or incentive to approve things. This is doubly the case where an organisation has outsourced its cyber capability.

Contractors and consultants may not want the liabilities that come from a pragmatic approach to risk, nor do they suffer in the way in-house staff might from the consequences of failing to develop capability. If they stymie your transition to the cloud, well, that's fine: they weren't going to use it anyway, and they won't be around to see the organisation limp on with spreadsheets and shadow IT.

Thankfully, there are a few hidden benefits. First, pretty much everyone working with data in our sector faces the same challenges, and it's our resilience and ability to wade through the treacle that makes us valuable to our organisations. It is also what stops others in your organisation from treading on your toes too much. Yes, shadow IT will exist, but the scrutiny of IT will keep it small scale. Your ambitions to be able to fire up whatever data capabilities you need at any given time might be wishful thinking, but you should seek to establish a working relationship with IT that gives your data team the ability to work effectively. In the land of the blind the one-eyed man is king. Likewise, if nobody can access the tooling required to write code or analyse data without asking IT for permission, then there will be limitations to what others can accomplish without involving your data function.

Second, the other side to the ponderous pace of change is that once you land a change in this kind of environment it tends to stick. The sector is full of analytical products that remain in use for multiple decades because replacing them is so time consuming. Similarly, once you establish a data engineering or data science operating model it will be hard to challenge, because colleagues will know that you have tirelessly negotiated it through the entire governance process.

Third, you will find allies. It's not just people in the business who are sick of IT; plenty of people in IT aspire to better as well. In your authors' experience, even the most ponderous public sector IT department will claim to want to be more agile. If you can sell your function as an opportunity to transform or modernise or accelerate or streamline or otherwise speed the pace of delivery, then you may find yourself with both top-down and grassroots support for what you are doing. If your data team is unlucky enough to be part of a large IT department then you may find that you are the only part of the department with explicit blessing to disrupt and challenge the status quo.

As a data professional in the built environment sector, you will find yourself straddling the worlds of data, engineering, and IT. You will be part of IT but also a step removed, part of an asset operator but also an internal service or overhead. You will be part critical friend, part disruptor. To survive you need to know enough about both engineering/asset management and technology to empathise with the challenge of running underfunded BAU IT and engineering functions. But you also need sufficient distance from the status quo to be able to imagine better ways of working. It takes patience, it takes coaching, it takes time spent thinking about who you are and how you want to succeed (see 'How to be a Data Person' later in this book), but it is possible. Often the worst thing you can do is to pretend that IT doesn't exist, or that you know better than your engineering colleagues, to try to work completely outside your organisation's established ways of working, or to dismiss or ignore their requirements. As exciting as data technologies are today, they will eventually become as BAU as ERP.

If data projects are successful, they become part of the IT environment, where the focus turns from building to continuous improvement and creating a reliable managed service. As these core technologies mature you won't necessarily want to manage them forever. Unless you have some serious political capital (and serious technical nous), you will have to roll up your sleeves and go and talk to your operational IT colleagues to understand what makes them tick, as well as checking with your wider organisation to make sure that you are delivering what they need.

5.2.5 Working Through Governance

It is not unusual for the project delivery methodologies that built environment organisations adopt to resemble the methodologies that they use for their core business. For example, one of your authors worked at an organisation where the business case template was the same whether you were buying a software licence or building a multi-billion-pound new asset. This can translate into a general wariness of, and/or unfamiliarity with, incremental agile methodologies. Built assets tend to have the property of not functioning at all until completion. Whilst it might be entirely appropriate to rapidly prototype a Minimum Viable Product (MVP) version of a new data solution, it's hard to imagine a workable MVP of a bridge or a building. Whilst there are now a fair few agile dev teams building capability across the built environment sector, it often remains the case that for solutions to enter widespread use they need to pass through some stringent waterfall stage gates, doubly so if the products have safety or cyber security implications.

As data professionals this can be grating when you just want to prove some innovative ideas. However, what your organisation is probably really asking you to do is to continuously design risk out of your projects. With this mentality in mind, you can still take an agile approach, whilst ensuring that you bake into your backlog epics that meet the governance processes of your organisation. As we established earlier in this book, you must work closely with your business customers to understand and reflect their requirements, as well as develop your understanding of what practice you should learn from the wider sector. In doing so you ensure that your proposed work meets customer needs, and you eliminate non-viable options (e.g., too expensive, too complicated, not fit-for-purpose). You will also likely need to engage early with your enterprise architecture and solution architecture colleagues to ensure that, by the time you are ready to start building, you have defined and designed enough to receive their support. Allowing time to define and design will help you to progress from your conceptual design to an agreed, affordable, and deliverable detailed design, and hopefully win you the begrudging approval of your IT colleagues. Meeting IT halfway in this manner will mean that as you progress into build and deploy you are able to ensure that you are meeting user requirements, and that you have 'designed out' as many risks as possible before commencing build, and are therefore able to work within the agreed budget and schedule. It will also give you top cover should issues arise in the delivery of your work.

5.3 Procuring Successful Data Projects

Perhaps you found yourself working with data for the built environment because some inspiring moment persuaded you of the difference that digital technology can make to the workings of the world. However, excitement about technology can only get you so far. Eventually, one needs to reckon with the complexities of delivering change across large organisations. The more enlightened amongst us may then move on to consider the human element, culture: business engagement, communications, change, and such. You can't accomplish much with technology if people won't use it.

These themes, technology and culture, are typically what we use to explain our successes and our failures. Why did the data project succeed? It must have been a great application of technology. Why did it fail? We didn't pay sufficient heed to the culture. However, there is another factor that can undermine even the most enlightened approach to technology and culture: contracts and procurement (C&P).

Very few public sector organisations can deliver lasting data change without support from vendors, consultancies, and system integrators. Even private sector and profit-driven enterprises will rarely have all the skills that they require in-house. An organisation's approach to procuring and contracting sets parameters around its investments in data. These activities occur before a single bit of data is stored, or a single line of code is written, and if you don't get them right at the start of your work then you are seriously undermining your ability to deliver. It's like building one's data house on shifting sands.

Even sensible data professionals can miss the importance of contracts and procurement. Too often they dismiss procurement as a procedural exercise, boring administration, or box-ticking. We contend that C&P is anything but. Procurement, and the contracts that follow, establish the most fundamental parameters of a project or service, including:

• The agreed scope of the work, timescales, and measures of success.
• The supplier best placed to do the work (or at least the supplier that wrote the most compelling bid).
• The pricing mechanism, rates, or costs for the work.
• The incentives and risks accepted by each party.
• The means of governing delivery and responding to change.
• How the parties will work through any unexpected difficulties, and the process for stopping the works if necessary.
• The ways of working together collaboratively, and keeping parties informed of progress.


People, and organisations composed of people, respond to incentives. You still meet old-fashioned folk who make statements like "a man's word is his bond" and such; people who believe that the world runs on a timeless blend of charisma and rapport. That may be true in certain cases, but game theory shows us that in the face of misaligned incentives such bonds are fickle and prone to dissolve. And the alternative is even worse: organisations that approach procurement as a baited trap, luring suppliers into bidding at the lowest possible price, and then using punishing contracts to leave them squirming on the hook with little concern for the viability of their business.

The need for formal agreements is doubly important for data projects. Data products belong to the realm of immaterial information. Consequently, success can be hard to quantify, and poor work can be easy to hide. Few organisations will be able to execute on a data strategy without some reliance on third parties. This reliance on outsourced specialist data suppliers means that there is an asymmetry of information between the client seeking outcomes and the suppliers cutting code. A client organisation may, for example, lack the in-house skills to properly assure or audit the work of suppliers. Contracts exist, in part, to acknowledge and compensate for this imbalance.

It is, of course, easier to enumerate the failures of procurements past than it is to build working models for the future. One is always fighting the previous war, overcompensating one way or another. There is no silver bullet here that guarantees a happy client and a profitable (but not too profitable) supply chain, all whilst delivering value to taxpayers. But this is not cause for despair: there are areas where a degree of consideration, foresight, and mindfulness can go a long way towards removing the worst outcomes. The following sections describe these areas, namely:

• Ensuring your scope of work is deliverable.
• Fostering healthy competition.
• Selecting suitable evaluation criteria and a payment approach.
• Agreeing ways of working up front, including how you will iteratively assure the work.
• Managing ecosystems and dependencies on other suppliers.
• Making effective use of terms and conditions (Ts & Cs).
• Procuring across the life cycle of a service.

We will dive into each of these concerns in turn.

5.3.1 Ensuring Your Scope of Work is Deliverable

One can procure a holistic data transformation, or sub-divide it endlessly. One can divide across time (like the GDS phases: discovery, alpha, beta) or across domains/specialities (architecture, infrastructure, design, UI, business change, testing). Subdividing work in some way is necessary for most projects of any scale. An organisation might divide work to ensure that there are appropriate opportunities to evaluate success part way through a project and to re-procure where necessary, or because the requisite skills do not exist in a single supplier or consortium. Moreover, an organisation might not be ready to fund the entire life cycle of a data transformation, or may be wary of placing all of its eggs in one basket. Similarly, public sector procurement rules put limitations on how organisations procure work, (usually) preventing organisations from writing blank cheques that suppliers can cash indefinitely into the future (rightly so). Sub-division need not mean originating entirely new procurements and contracts (though it can). Methodologies such as agile scrum allow work within a project to be iteratively sub-divided into discrete tasks with clear ownership and resource requirements, without necessitating constant re-contracting or variations.

When considering how to subdivide work into deliverable scopes, one should not think solely in terms of "what will suppliers bid for?" (the answer: just about anything), but also in terms of what will be a coherent and deliverable piece of work. You need to be able to attach measurable outcomes and tangible success criteria to a supplier's scope to define what it is that you are buying. How will you recognise when work is complete to a sufficient level of quality? The agile term 'definition of done' is useful here. If your project's definition of done requires multiple paragraphs to articulate, then you may wish to revisit what you intend to accomplish with the work, or consider further sub-division. Conversely, one can also over-divide work to the point where individual tasks become trivial, or the interdependencies between tasks become harder to manage than the tasks themselves. Well-scoped and sized projects should give suppliers' locomotives enough track to build up a head of steam, but not so much that they disappear over the horizon. The truth is that most organisations have a minimum and maximum size of work that they can metabolise at any one time. Scopes of work that are too small are likely to soak up more effort in governance than they deliver in benefits. This is true on both the client and supplier side (it's easy for clients to forget that credible suppliers also have standards and governance to follow). Projects that are too large (in terms of scope, budget, or time) are liable to try to deliver more change than an organisation can absorb without impacting business-as-usual, or, worse, to become divorced from the business altogether.

A corollary to defining a deliverable scope is arriving at a meaningful way of signing off on delivery. On data projects the client will rarely have the time or skills required to independently read each line of code. If they did, why wouldn't they have just done the work themselves?! In most cases, client assurance will be at the "does it appear to do what I asked for it to do" level, rather than forensic detail. Most client organisations are not homogeneous units. What satisfies the IT department may not satisfy the business, and vice versa, and what satisfies management may not satisfy front-line workers. Procuring suppliers, and agreeing contracts, are your prime opportunities to define how you wish to be involved as a client, to set out how and when you will give your approval, and what should happen if issues emerge after sign-off. It is also an opportunity to head off internal politics, by ensuring that your stakeholders and end users are on the hook for accepting deliverables.
Ideally, final sign-off of a data project comes after client involvement from day 0, where the client is involved in shaping the project and collaboratively resolving challenges. The client shouldn't necessarily steer the ship (they may, for example, lack the technical expertise, or demand over-customisation of the solution) but they should be in the room whenever tough compromises have to be made. When it comes to intangible data work, it is just too easy for crucial details to be missed if the client only ever sees the 'finished product.' Traditionally, suppliers might have baulked at clients seeing early versions of solutions, but the agile philosophy has begun to change that. Agile ceremonies like 'show and tell' and 'retros' give the client the opportunity to become part of the delivery team, without giving them licence to ride roughshod over the development process. On conventional 'waterfall' projects, the client should be involved in requirements gathering, traceability of requirements, and user acceptance testing (UAT). Agreement of deliverables should be a continuous and incremental process throughout the life of a project, rather than a 'once and done' exercise.

We're aiming here for a compromise between the client being completely hands-off and the client checking every last detail. With the right level of involvement, the client should have confidence in the deliverables because they will have a tangible sense of what they can do differently because of the work. Client confidence shouldn't come from a demo or some screenshots on a slide deck, but from having seen the solution develop, and from having had the opportunity to test and influence it as it developed.

5.3.2 Competing and Evaluating

Procurement is, at its best, a hotly contested but fair fight. Contestants should leave the battlefield bruised and bloodied, but with a renewed sense of their strengths and their weaknesses. Winner, loser, or referee, each party should end a competition knowing more about themselves and their capabilities than when they started. Which is why it is such a shame when we treat competition like ticking a box. It's like skipping a race and only watching the podium ceremony.

Evaluation criteria are how a client determines which supplier, or consortium of suppliers, can best deliver a piece of data work. The criteria should reflect the client's understanding of their work and what they want to accomplish from it, including dependencies, risks, stakeholders, and the level of novelty or uncertainty, as well as their values as an organisation. When using frameworks, the client weighs cost against quality: in other words, the percentage of marks associated with the supplier's commercial bid versus the percentage of marks associated with their written submission. Cost and quality are sometimes subdivided to include factors such as 'cultural fit,' but the principal division between money and nous remains.

For all but the most routine projects, defining a scope of work and then looking for the lowest possible price from the market is a recipe for disaster. The 'winner's curse' means that in a straight fight based purely on cost, the winning bidder is likely to have underestimated the work relative to the competition. This 'race to the bottom' approach has delivered short-term cost savings, but a long-term erosion of the viability of the UK's built environment. This is doubly the case when paired with privatisation/outsourcing contracts that are far shorter than the life cycle of the assets and therefore incentivise suppliers to sweat those assets. Starting a project with a supplier who is in the red on day 0 and can only realise profit by cutting corners and raising variations is going to make creating a viable product incredibly difficult, and is likely to bog down the work almost immediately in 'change control' as the supplier tries to 'vary' their way back to profit.

Whilst nobody wants to waste taxpayers' money, you may need to worry less than you think about securing a decent price. The formulation of public sector procurement processes means that even where price is a small part of the evaluation criteria (for example, 20% of overall marks) it can still play an outsized role in determining who wins the work. Suppliers know this and will sharpen their pencils accordingly.

The payment approach (the means of arriving at an agreed price for the works) is one of the most consequential parameters of any procurement. Typically, contracts will either be fixed price (i.e., client and supplier agree the cost up front) or time and materials (i.e., client and supplier agree a rate card, and the agreed price then depends upon the level of resourcing required to complete the task). There are variations on these themes, including hybrid approaches like capped time and materials. There are also additional, rarely used incentive mechanisms such as pain/gain shares. Moreover, the mechanisms for off-the-shelf software and cloud services tend to be different again, with cost varying by the number of users and the amount of storage and computing power consumed.

When procuring data solutions, particularly where these involve a large amount of business change or user-centred design, the goal is rarely to eliminate all risks during procurement. Rather, as a customer you should seek to select a partner that has explained clearly how they will address the challenges associated with this kind of work, that understands your desired outcomes, and that offers a price point that feels reasonable and a rate card that offers you some room to manoeuvre should additional work be required. As a data professional you can rarely buy in services and forget about them, and so it's important to have a bit of skin in the game yourself (e.g., a compelling reason to show up to sprint planning and retrospectives).
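To see why even a 20% price weighting can still decide the winner, it helps to run the numbers. The sketch below assumes one common (but by no means universal) scoring formula, in which the lowest bid receives full price marks and other bids are scored pro rata; the suppliers and figures are invented for illustration.

    # Toy tender evaluation. Assumes a "lowest bid scores full marks,
    # others pro rata" price formula; real frameworks vary, and all
    # names and figures here are invented.
    bids = {
        # supplier: (price in GBP, quality score out of 100)
        "Supplier A": (500_000, 85),
        "Supplier B": (300_000, 80),
    }
    PRICE_WEIGHT, QUALITY_WEIGHT = 0.20, 0.80
    lowest = min(price for price, _ in bids.values())

    for supplier, (price, quality) in bids.items():
        price_score = (lowest / price) * 100      # lowest bid gets 100 marks
        total = PRICE_WEIGHT * price_score + QUALITY_WEIGHT * quality
        print(f"{supplier}: price {price_score:.0f}, quality {quality}, total {total:.1f}")
    # Supplier A: price 60, quality 85, total 80.0
    # Supplier B: price 100, quality 80, total 84.0

Here a five-point quality lead is overturned by a sharper price, even though price is worth only a fifth of the marks.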

5.3.3 Ways of Working

Once you have procured a credible, affordable, and capable supplier, you still need to learn to work together. The language of public sector contracts (fixed costs, milestone payments, change control, deadlines) can be at odds with the best means of delivering data projects, namely: incremental, exploratory, collaborative, agile. Even when your procurement exercise delivers a winning tender that is scoped and priced in a manner that works for both parties, a project can still suffer from the terms of the contract forcing the supplier into counterproductive behaviours that are focused on meeting the terms of the contract rather than the emergent needs of the project. If you've got to get something well-defined delivered to tight timescales and budget, then a fixed price waterfall-style approach may work. But if, as is usually the case in data projects, you're looking for a supplier with the expertise to help you work through a complex problem, then a time and materials contract built around agile ways of working is much more likely to allow you both to respond to whatever you discover together along the way.

Agile ways of working still have a way to go to gain acceptance amongst built environment clients. This is in large part because the built environment naturally models its processes around its core work types (design, engineering, building, fit out, etc.), which do not lend themselves to an agile approach. Seen through this lens, the lack of concrete milestones associated with an agile method can make data projects seem risky at best, irresponsible or frivolous at worst. Your authors have at times had to throw their toys out of the pram to resist demands for slide packs filled with Gantt charts and milestones. The mindset change needed here is going from "agile doesn't guarantee a specific output" to "agile guarantees that whatever happens, you will get something."

The term 'minimum viable product' is particularly useful in this context. Almost everyone who has worked in or around technology will have encountered a 'big bang' technology project. These are usually summarised as "on [insert date here] you need to switch off [system 1] and switch on [system 2], which is going to be [much/slightly] [better/worse/cheaper] than [system 1]." Big bang projects are almost universally disasters, usually because they require an organisation to digest a complete change in ways of working overnight. Taking a 'minimum viable product' approach to your data projects allows you as the client, and your supplier, to establish a small, controlled, working pattern (a pilot or prototype, if you will) that you can incrementally scale up and productionise over time. The added benefit here is that if the project stops for unforeseen reasons, you always have some sort of working product to fall back upon. This is not true of 'big bang' projects. By virtue of their 'all or nothing' nature, these projects can face cancellation even after accruing millions of pounds of abortive work, without any working solutions to salvage.

5.3.4 Managing Ecosystems and Dependencies

Aligning incentives across two different organisations (client and supplier) so that they can work with common cause is a delicate balancing act. And the task becomes even harder on big projects where multiple suppliers, subcontractors, SMEs, and stakeholders deliver related and mutually dependent pieces of work in parallel. As the number of parties participating on a project rises, the number of relationships between parties rises even faster (it's maths, I've checked). For example, there is one relationship between two parties, three possible relationships between three parties, and ten possible relationships between five parties. Not all these relationships will be formalised in contracts or agreements, but they all need managing. Supplier 1 and Supplier 2 will only have contracts with the client, not with each other, but if the relationship between them breaks down it may impede the delivery of the project.
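The "it's maths" boast is just the handshake formula: between n parties there are n(n-1)/2 possible pairwise relationships. A two-line check in Python makes the growth visible:

    from math import comb

    # Number of possible pairwise relationships between n parties: n(n-1)/2
    for n in (2, 3, 5, 10):
        print(f"{n} parties -> {comb(n, 2)} possible relationships")
    # 2 -> 1, 3 -> 3, 5 -> 10, 10 -> 45

By ten parties you are already managing 45 potential relationships, most of them invisible to the contract.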


More relationships mean more opportunity for misaligned incentives, conflict, and counterproductive behaviours. They mean more time spent communicating, reporting, escalating, cascading, and placating, and more complexity in planning, tracking, and executing work. What's more, the relationships between suppliers (including subcontractors) will work very differently to the relationships between suppliers and their clients. The client may not have a full picture of any antagonism lurking under the surface until it is too late. Even within individual organisations (supplier or client) there will in practice be further layers of internal dynamics, such as teams that are functionally separate entities. Different teams working to different P&Ls within an organisation can sometimes be as competitive and counterproductive as they would be if they were in entirely separate competing organisations, and the same is true of the dense layers of subcontractors and independent SMEs found on many large IT projects. Another element to consider is whether your supply chain might seek to create consortia to respond to your tenders, which in theory should reduce (or at least obscure) any internal tension between parties.

None of which is to say that one shouldn't use multiple suppliers where that is necessary or useful; quite the opposite. Sticking to one supplier might, for example, deny you the ability to find the specialist skills you need for parts of your work, or result in you paying a premium for certain skills. But as a client you need to walk a fine line between placing all of your eggs in one basket and creating complex and uncomfortable dynamics between suppliers. When procuring work competitively you are never in control of which parties bid for your work, nor should you be. A degree of conflict, or friction, or at least disagreement, is inevitable (and indeed necessary) in any sort of human relationship, doubly so when pride and money are at stake. Proactive clients will recognise these dynamics up front, before mobilising work. On a contractual level, this might mean insisting on formal or informal collaboration agreements between parties. On an interpersonal level, it will mean investing in creating the right kind of forums where suppliers can speak openly to each other and resolve issues, potentially without you in the room.

Awareness that cross-party dependencies are inherent risks should help you to recognise where you are creating unnecessary (i.e., non-value-adding) dependencies as you scope your work packages. Your goal should be, at a minimum, to achieve continuity in the way that you divide work across suppliers, so as to avoid unnecessary dependencies. Where dependencies are necessary or desirable, the earlier and more explicitly you can flag them in both your tenders and the resulting contracts, the more prepared your suppliers will be, and the less likely those dependencies are to derail or unduly complicate delivery. Your authors have certainly been guilty in the past of trying to be too clever in splitting projects across layers of suppliers. Delivering a project with suppliers means designing a workable operating model, just as you would with internal resources. It also means being aware not only of the work at hand, but also of the 'business as usual' environment that you are deploying into, including any 'managed service' contracts that you might have in place for BAU operation and maintenance of data services.


5.3.5 Making Effective Use of Terms and Conditions

Contracts are not simply mechanisms by which we formalise offer and acceptance; they are also a means of risk transfer. When procuring work, what goes into the final contract is just as important as the choice of supplier with whom one contracts. After the arduous journey of scoping a project, writing the business case, wading through governance, and competitive tendering, it can be tempting to rush the contract stage. Contracts are confusing, boring, and usually the last thing standing between you and actually getting stuck into the work itself. Impatience is natural, but don't give in to it!

Be cognisant that contract terms are often a means whereby clients inadvertently reduce competition. A contract which insists, for example, on costly indemnities, insurance, parent company guarantees, or certifications to superfluous standards will likely favour larger suppliers over their smaller competitors. The client may not be aware of having lost out on viable options, as smaller suppliers may simply consider the terms too onerous and choose not to bid. Similarly, do not be surprised if, following procurement, your preferred supplier tries to negotiate away contract terms in a way that makes their bid appear less competitive than it did on paper. There may be entirely valid reasons for a client to insist on terms, and for suppliers to insist on exemptions, but be aware that this isn't simply paperwork: this is you actively shifting the balance of risk resulting from the contract.

5.3.6 Procuring Across the Life Cycle of a Service

Built well, data services can provide value for decades. Whilst there are always new techniques and technologies, your authors' experience of working in data and the built environment has been as much about making best use of whatever data and data solutions organisations already have in situ as it has been about building the hot new thing. Data projects can be expensive and complex, and they often bring a higher rate of failure than other investments. Hence this extended foray into contracts and procurement, often the original sin of failed data projects.

Moreover, the design and development of data solutions is just the first part of their life cycle. An amazing piece of software, or a database, or an app will be a wasted investment if the organisation that financed its development hands it over to a service provider that lacks the capabilities or incentives to operate and maintain it. This is a pertinent risk for data services, because the suppliers of the traditional 'IT tower' contracts that organisations use to maintain their core IT services (think 'end user compute', service desk, ERP, etc.) are not necessarily the organisations best placed to help grow and improve a data service or a data platform. Data services need time and planning to grow user adoption, become part of an organisation's decision-making processes, incrementally develop new functionality, and adapt to meet emerging business needs. Contracts for the BAU operation of a data service or platform, or a portfolio of data services, are not a means of maintaining a static thing. They need to incorporate mechanisms and funding for investment in the continuous improvement of the service. They will also likely require a higher level of 'end user' support than more traditional IT services, because the service desk will in some cases be helping to train users, upskill them, and change their established ways of working as they learn and adopt these novel solutions.

This does not mean that the same specialist parties that build your data services should also be the parties that maintain them. There are usually efficiencies in having a single supplier manage a broad portfolio of similar services. But simply tossing your shiny new data investments over the fence to your IT service desk and moving on to the next project is not going to give your organisation the support it needs to take advantage of these transformational services. In fact, it is likely that you will need a range of different skills to deliver any significant investment in data, particularly in low-maturity client organisations. These skills will include development and support, of course, but also user-centred design, business change, comms and engagement, and perhaps even some data engineering and data science to accelerate value realisation.

Tenders are one of the most visible expressions of a client's scruples and objectives. If you are a public sector client, the record of your procurement exercises will often be in the public domain indefinitely. It is an indication of your maturity and aspirations as a client, and of the degree to which you respect both your suppliers and the need to realise value for taxpayers' money. Done well, finding partners to work with can be one of the most interesting and fulfilling parts of the job. Done poorly, it can open your organisation to reputational and legal jeopardy. The road to data hell is paved with poorly considered procurement, and with solutions looking for problems.

5.4 Conclusion

Congratulations, you have successfully made it through (or skipped to the end of) the driest chapter in this book. The subtitle of this chapter could easily be "boring but important stuff." Data architecture, project delivery, procurement: these are some of the dull but necessary precursors to success, to realising the fun stuff, to making your data vision a reality. Unless you work for an early-stage start-up or as a lone maverick, it is incredibly unlikely that you will ever be able to deliver your data solutions in isolation, without having to rely on colleagues in your wider organisation, particularly the various parts of IT, as well as contracts and procurement. That means that every data solution you develop, and every data procurement that you run, will have to pass through the scrutiny of others. You can resist this sad reality (trust me, I've tried) or you can accept it and get on with the work. Occasionally, you will even find that the expertise your colleagues hold will stop you from doing something really quite silly or unduly expensive. Hopefully, over the course of this chapter we have provided you with some signposts of where the pitfalls may lie, and of how to minimise the pain of governance and process. OK, on to the fun stuff!



Chapter 6

Radical Technologies

Abstract Over the course of this chapter, we provide an overview of a range of technologies that deliver the application of data to the built environment. We first use complexity theory to frame the technologies described in this chapter in terms of whether they are best suited to tackling complex or merely complicated problem sets. We then discuss key types of technology that one can use to deliver digital change in built environment organisations including Building Information Modelling (BIM), Business Intelligence (BI), Data Science and Artificial Intelligence (AI), Smart Buildings and Internet of Things (IoT), Digital Rehearsals and automated parametric design, Digital Twins, and Generative AI.

6.1 On Complexity

On paper, the organisations that we work for are orderly and logical; we've got the process maps to prove it! In practice, the experience of working across a broad organisation with a wide range of incentives, agendas, quirks, and priorities can feel chaotic. The task of moving from chaos to order can feel difficult, perhaps even futile. And complexity only increases as we scale from individual organisations to the vast network that allows the built environment to function. Jo da Silva warns us of the possibility of "cascading failure due to inter-relationships between infrastructure, institutions and ecosystems." If we want to prevent cascading failure, or indeed if we want to use data to help the built environment realise better outcomes for society, then we must reckon with complexity.

We often use words like ordered, complex, and chaotic in a colloquial and informal sense. If we want to reckon more keenly with the challenges that we face in realising value from data on the built environment, then it is worth using the formal definitions (and the resulting insight) developed by the exponents of complexity theory. To do this we will lean heavily on the work of Dave Snowden and his Cynefin framework, as well as Richard Carter's book on decision-making, Cognitive Advantage.



Snowden posits that there are five kinds of contexts (domains) for decision-making, and suggests that if you are aware of which decision-making domain you are in, then you will know how best to act to accomplish your desired outcomes. The paragraphs below detail the five domains that Snowden identifies. We're not going to try to rehash Snowden's work; if you are interested in knowing more, he has a quick and affordable introductory course. However, it's useful to understand how these domains manifest themselves when working with data in the built environment sector, and where you might find yourself across different domains.

• Obvious: in this domain the answers to your problems are predictable and demonstrate clear cause and effect. For any given problem there will be an optimum strategy. These might be cases in which there is a clear and relatively straightforward policy that we need to adhere to; for example, we must provide staff with the ability to report safety incidents, and we must record the cost of our projects in some sort of centralised system.

• Complicated: in this domain the answers to your problems remain predictable and causal, but require the application of expertise or intelligence to identify. A large part of the work of built environment professionals exists in the complicated domain. As a species we have learned to create enormously complicated structures (bridges, tunnels, skyscrapers, transport and utility networks) that, by and large, perform in a predictable way provided that they are built, operated, and maintained in a manner that abides by expert judgement. As data professionals, we face complicated problems when we apply our expertise to meet the clear goals of our organisations or customers. For example, we might provide a data model to underpin a particular business report or dashboard where there is a clear specification, and the art is in implementing the underlying processing to meet the customer's functional and non-functional requirements. This might mean undertaking difficult but still predictable tasks like refactoring code so that it runs faster, or specifying the data model and functions so that we can calculate a metric accurately.

• Complex: problems in this domain have constraints, or loose and evolving rules. Think of the flight patterns of large groups of birds: there are binding physical rules governing what each bird can and can't do at any given moment (they can't teleport), but acting as a group their patterns of flight are extremely complex and difficult to predict based upon past behaviour. In this domain there is no procedure, however sophisticated, that can guarantee success. As a rule, problems that require working with and/or coercing humans tend to skew complex rather than complicated. As described above, the structures that we build and maintain are complicated, but the organisations and broader industries that are responsible for that building and maintenance activity are complex. Winning support for and delivering a data strategy across a large and constantly changing organisation that is itself part of a wider sector is likely to be a complex task (albeit one that may be composed of many complicated tasks). Working as a data professional in a complex environment requires us to use our intuition and adapt to signals. This is not because our intuition is always going to be correct, but because it is only by 'probing' a complex domain and observing the outcomes of our actions that we can obtain feedback and adjust or respond accordingly. It is your authors' belief that deploying any sort of transformational data capability, such as a Smart Building or a Digital Twin, requires a data professional to deliver complicated projects in a complex environment.

• Chaotic: in this domain the logic of your problems is unpredictable, random, and unconstrained; in other words, there is no observable logic. In a chaotic situation the only guides are intuition and responsiveness. Chaos usually only manifests itself in a built environment context due to some external factor such as extreme weather, natural disaster, terrorist attack, or tragedy. The nature of chaotic situations is such that there is rarely time for data-driven decision-making in the moment of crisis, and data professionals are rarely on the front lines dealing with these situations (cyber and ransomware attacks may be the exception to this rule). That said, analysis of data following an event can and should help inform changes, procedures, and heuristics that reduce the likelihood of such events in the future and/or mitigate the harm that they cause.

• Confusion: if you don't know which domain you are in, then congratulations, you are confused! Confusion isn't necessarily a bad thing, provided one is aware of being confused. Too often we go through our professional lives on autopilot. We follow our calendars, the opinions of others, and whatever documents the consensus within our organisation revolves around. We do not ask ourselves what kind of problems we are trying to solve, and where the limitations in our ability to solve those problems might lie. In doing so we risk channelling our energy in the wrong direction, overinvesting in solutions to ordered problems, and underinvesting in seeing the bigger complex problems that demand our attention. Your authors have made this mistake plenty of times; being the technology-focused nerds that we are, we have too often acted on the basis that the principal problems we face in our work are complicated when in fact they are complex. This has led us to spend too much of our energy trying to perfect the architecture of our data solutions, and not enough energy engaging with the complex stakeholder environment that determines whether those architectures find funding or a user base.

Snowden's classification of domains is useful because it moves us away from overly simplistic models of order and chaos and challenges us to think with greater precision about the type of problems that we are trying to solve. Whilst we find ourselves in each of these domains from time to time, it is perhaps most important from a data strategy perspective to understand whether you are operating in a domain that is merely complicated, or truly complex. This is because, whilst complicated and complex sit close to each other on the Cynefin framework, they necessitate quite different decision-making behaviours. Complicated problems demand adherence to a defined ruleset; complex problems demand responsive, innovative experimentation and agility. As you consider how to apply the use of data to a project, perhaps take a moment to understand whether the systems that you are working with qualify as ordered, complicated, complex, or chaotic (or, perhaps more likely, which aspects of your work fall into each domain).


In taking the time to make this one distinction, you will better understand how to change your approach to suit each type of system, and the tools that you will need to use to realise your desired outcomes.

Another author who writes about complexity, with a particular emphasis on decision-making, is Richard J. Carter. Carter describes complexity as "the state of a system where your knowability of it and determinability of it is such that you have uncertainty about your understanding of it." In this context, 'knowability' is the extent to which you can measure something, and 'determinability' is the extent to which you can explain the causality of something. A game of chess is 'knowable' because you can see where all the pieces are on the board, and 'determinable' because you can understand the sequence of events that leads to one party or the other winning. A game of poker (Texas Hold'em) is less knowable because you cannot see all the cards in play, and less determinable because there is the unpredictable human element of bluffing. In this sense you can say that poker is a more complex game than chess. This is not the same thing as saying poker is a more 'complicated' game than chess: it would probably take longer to explain chess to a new player than poker, but once you know the rules of chess you can understand why the game proceeds in the way that it does, whereas with poker it will be harder to explain retrospectively why a bluff did or didn't work.

We started this book with a problem statement of whether and how better use of data can help us to realise better outcomes for the built environment. One might argue that, to the extent that the built environment sector has not fully capitalised on data, it is in part because the problem is less knowable and determinable than it would appear from the outside. The lack of knowability stems in part from our failure to adequately define and measure crucial aspects of the built environment, and from the amount of vital knowledge locked away in contractual and organisational silos. The lack of determinability stems from a drastic lack of information on past projects, and from a culture where organisations obfuscate rather than dissect and learn from failures.

Of course, these distinctions are relevant beyond the specifics of data. As Snowden states, "for the last 50 or 60 years we have assumed that all systems are ordered, and with that comes major error." The built environment is full of task-orientated, delivery-driven people. Consequently, the narrative on such projects often mistakenly approaches the complex as if it were ordered. The default of many construction projects is to try to contain complexity by creating extremely detailed, interconnected Gantt charts, and then, when the projects go over budget or schedule, to try to explain away complexity by seeking out the 'dependencies' and 'bottlenecks' on the plan. The value of these sorts of visualisations in diagnosing the causes of problems is debatable. The causes of success and failure on a project of any scale are the product of a complex system (an ecology, if you will) of human interfaces. The built environment is resource intensive, and any project of note will involve a multitude of intelligent actors, each with their own propensity to irrational behaviours and optimism bias.
We cannot realistically predict or control the actions of numerous actors solely through sequential plans, risk logs, or hierarchical reporting structures.


Not all the problems that you will seek to solve with data are complex, but some of them will be. Solving non-complex (ordered) problems with methods suited to complex problems is inefficient. Similarly, attempting to solve complex problems with methods that do not address complexity is futile. Snowden's Cynefin framework provides us with a breakdown of the distinct types of systems that one is likely to encounter when operating in the real world, and of how your decision-making should adapt depending on which type of system you are working with. Snowden's work suggests that the methods you adopt to influence systems should reflect whether they are 'obvious', 'complicated', 'complex', or 'chaotic'. In the words of Snowden, your aim when dealing with complex systems is to "manage the emergence of beneficial coherence within attractors within boundaries," or in other words, to create a controlled (sandbox) environment where you can find ways of doing things that work, and do more of them.

The work of Carter and Snowden gives us a useful framework for understanding the level of complexity associated with our use cases. Table 6.1 plots the broad kinds of applications described in this chapter against Carter's matrix of 'knowability' and 'determinability' and the domains of the Cynefin model. The subsequent sections of this chapter go through each of these applications in order of ascending complexity, allowing us to build from simple foundations up to complex use cases.

Just as each technology application differs in terms of the domain of problems that it is best suited to addressing, so each also differs in terms of its relationship to the component parts of the WKID (Wisdom, Knowledge, Information, Data) pyramid that we covered in Chap. 1. Technologies such as BIM and BI that are suited to the obvious or complicated domains (i.e., non-complex problems) tend to follow a programmatic logic that converts data to information and subsequently knowledge. These solutions create value by helping organisations to structure and communicate their data such that staff can work their way up the pyramid to the point where they have sufficient knowledge or wisdom to solve the problems that they face. More bleeding-edge technologies, however, particularly those associated with data science and AI, might instead short-cut the WKID pyramid and seek to go straight from data to wisdom, bypassing information and knowledge altogether. Working on the basis of probabilities rather than logic, these technologies can generate valuable answers without requiring the curation of information. A suitably trained neural network can win at almost any rule-based game without any concept that it is playing a game, or even an awareness of the rules of the game. By processing vast quantities of data, these technologies replace information and knowledge with correlation and reinforcement. In doing so they can provide us with the processing power and flexibility needed to respond to emergent and complex scenarios. However, they may not be suitable for processes, such as building design, that require a clear and auditable 'golden thread' of evidence-based decisions.
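As an illustration of that last point, the sketch below uses a simple epsilon-greedy 'bandit' learner rather than a neural network (a deliberately minimal stand-in of our own devising, with invented payoff numbers). It converges on the most rewarding action purely from observed outcomes, with no model of why that action works: the 'correlation and reinforcement' trade-off described above, in miniature.

    import random

    # Minimal epsilon-greedy learner: it discovers which action pays best
    # purely from observed rewards, with no model of the underlying rules.
    # The payoff probabilities are invented and hidden from the learner.
    true_payoffs = {"A": 0.3, "B": 0.5, "C": 0.8}
    estimates = {action: 0.0 for action in true_payoffs}
    counts = {action: 0 for action in true_payoffs}

    random.seed(42)
    for _ in range(10_000):
        if random.random() < 0.1:                  # explore 10% of the time
            action = random.choice(list(true_payoffs))
        else:                                      # otherwise exploit the best estimate
            action = max(estimates, key=estimates.get)
        reward = 1 if random.random() < true_payoffs[action] else 0
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    print(estimates)   # the estimate for "C" converges towards 0.8

Note that the learner can tell you that "C" pays best, but it cannot tell you why; hence the caution about auditable golden threads.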

Table 6.1 Applications of technology mapped to Cynefin domains

Information management, digital construction (BIM), and GIS
• Cynefin domain: Obvious (except for business adoption, which is a complex problem).
• Knowability: Very high. Not all organisations adequately manage their records, but this is due to neglect rather than a lack of knowability. We have the means to measure and record most aspects of the asset life cycle with great accuracy. There are few unknowns.
• Determinability: Very high. IM, BIM, and GIS focus on 'what is' rather than 'why' or 'what could be'. They are means of structuring and recording objective reality, records, and space. Where they add value is by providing a 'single source of truth' and creating a 'golden thread' of decisions.

Business intelligence (BI)
• Cynefin domain: Obvious/complicated.
• Knowability: High. Business intelligence covers the essential reporting and analysis functions of an organisation. It predominantly uses data that the organisation captures about itself, including structured datasets and defined metrics. In some cases, BI will help to reveal known unknowns: where the organisation could capture additional information and metrics.
• Determinability: High. Organisations typically use BI to identify variance in performance (usually underperformance) against defined metrics, and to analyse the causes. Typically, the level of causation is highly deterministic (for example: "You are behind schedule because the following tasks were delayed…"). BI straddles both obvious and complicated domains because the same datasets will produce outputs aimed at diverse levels of expertise.

IoT/smart buildings
• Cynefin domain: Obvious/complicated.
• Knowability: High. IoT provides us with the instrumentation and data feeds required to track many of the functions of a building. Operators can source additional information directly from building occupants (for example, fault reports or satisfaction surveys).
• Determinability: High/medium. The determinability will vary depending upon the purpose of the 'smart' infrastructure. It will excel at evidencing how best to maintain a desired environmental condition (for example, noise levels, temperature, etc.). Some use cases will require expert interpretation. There are, however, limitations on the extent to which it can interpret or control the complex behaviours of building occupants (for example, overcrowding, disruptive acts, vandalism).

Digital Rehearsal
• Cynefin domain: Complicated/complex.
• Knowability: Medium. Digital Rehearsal takes the detailed representation of physical assets provided by BIM and seeks to understand how these interact with time (sequencing of activities) and human actors (roles on site). This means that you will need to incorporate some level of uncertainty around data inputs such as labour productivity, time required for tasks, conflicting priorities, and the use of resources.
• Determinability: High/medium. Digital Rehearsal software relies on a mixture of deterministic rules and human judgement to model the sequencing of activities. It can be extremely granular and highly automated, but the logic of the rehearsal is usually apparent and explicable. It will not, however, always calibrate perfectly with reality. Whether the rehearsal is complicated or complex will depend upon the level of uncertainty and emergent behaviour accommodated within the rehearsal.

Digital Twins, Smart Places
• Cynefin domain: Complicated/complex.
• Knowability: Medium. Much like Digital Rehearsal, the knowability of the problems that Digital Twins can solve will vary. Sometimes you will use Digital Twins merely as a real-time representation of the performance of an asset. These Digital Twins will share a lot of the properties of your BI applications, albeit with more sophisticated logic and visualisation. More complex Digital Twins will seek to model and predict emergent behaviours. This is particularly true where the Twin represents not just the physical properties of a system, but also the (less knowable) decision-making processes that inform the operation of that asset.
• Determinability: Medium. The level of uncertainty and emergent behaviour that you represent in your Digital Twins will depend upon the systems that they model (as well as the technical limitations of the platforms that you use to build the Twins). Digital Twins of complex systems, organisations, and environments/ecosystems will need the flexibility to accommodate not just uncertainty, but also changing hypotheses concerning the logic/behaviour of the system itself. This is likewise true of the application of 'smart' technologies over large human spaces (e.g., Smart Places).

Predictive analytics, data science, and AI
• Cynefin domain: Complicated/complex/chaotic.
• Knowability: High/medium. Predictive analytics and data science tend to make use of existing 'knowable' data collected by the organisation. They are particularly good at finding patterns in large unstructured datasets. This will reduce the need for us to compensate for a lack of knowability through heuristics or assumptions.
• Determinability: Medium/low. Data science developed in part to provide us with the tools to test the 'emergent practice' and 'novel practice' required to work within complex and chaotic systems, respectively. Data science techniques such as neural networks allow us to develop emergent practice algorithmically in response to input data and defined goals. This means that data science (as opposed to AI/ML more generally) will realise the greatest benefits over conventional analytic techniques when applied to complex or chaotic systems. However, it also means that the outcomes of data science techniques are not always easy to explain in simple cause-and-effect terms.


6.2 On Technology

A billion hours ago, modern Homo sapiens emerged. A billion minutes ago, Christianity began. A billion seconds ago, the IBM PC was released. A billion Google searches ago … was this morning. (Hal Varian, 2013)

Considering complexity is important but, admittedly, it is difficult to jump from this narrative to the dynamics of technology today. How do we best anticipate where current technology trends are headed, and how might we make best use of them? We can't predict the future, but perhaps we can surf the mega-trends of technology.

"…technological change is not additive; it is ecological. A new technology does not merely add something; it changes everything." (Neil Postman, talking about his book Technopoly: The Surrender of Culture to Technology, in 1992)

Take a moment to let Postman's quote sink in. The change that technology provides is not additive. Truly new technology is not a bolt-on; it changes the nature of the game. This is why we start with why, not what. If we know our direction of travel, and if we have created a culture that is responsive to change, then we will be well-positioned to respond to unexpected technological trends.[1]

First, let's break 'digital' technology into its component parts:

• Computation.
• Storage.
• Bandwidth.

Computation: This is the horsepower of technology: how much work can we execute within a given time using a machine? Computational speed generally refers to the speed of the computer's engine (the processing of software commands and data). From the processing unit in your fridge through to the tensor processing units used by Google to compute neural networks in the application of AI, these engines are everywhere and are the beating heart of our modern digital economy. Gordon Moore suggested in 1965 that the number of transistors on a microchip doubles about every two years. He was so right that it became Moore's Law. One of your authors wrote in 2012 (as part of the BIM2050 group's Built Environment 2050 report) that we would eventually see a slowdown of Moore's Law, and that has proven to be the case.

[1] That said, this section is not designed to walk you through the industrial complex of technology, or any historical narrative of technology in a general context. Economic historians like Joel Mokyr, Robert Allen, and most prominently Thomas Piketty do a much better job of that.


see a slowdown of Moore’s law and that has proven to be the case. The size of the transistors is reaching their physical limit, approaching the scale where quantum effects are significant enough to reduce reliability. However, the world of quantum computing is just around the corner and another explosion of computing power is surely ahead of us, one that could quickly undermine established ways of working in the digital space such as conventional encryption. It is also worth remembering that computational performance is not only a function of the capacity of the hardware, but also the efficiency of the code, whether that is a change in methodology (for example using neural networks) or an improvement in quality (perhaps thanks to an AI coding ‘co-pilot’). Storage: This is the memory of our digital economy; digital storage has become a cultural phenomenon. Today we each generate an enormous data trail; it feels like we are collectively trying to store every moment of our lives (this is of course a dubious privilege that only accrues to part of the world’s population). The 4 Vs of big data (originally attributed to IBM) characterise the dynamics of storage, they are: • Volume: The vast amounts of data that are generated every second. Traditional database systems are often ill-equipped to handle such large volumes of data. • Velocity: The speed at which new data is generated and the speed at which it needs to be processed. This could be high-frequency trading data, social media posts, sensor data, and similar sources that produce a constant stream of data every millisecond of every day. • Variety: The different types of data. This can include structured data (like databases), unstructured data (like text and imagery), and semi-structured data (like JSON or XML files). • Veracity: The quality of the data. Data accomplishes nothing through volume, velocity, and variety alone. It is only when we can rely on its accuracy and trustworthiness that we can begin to use it to create information, knowledge, or insight. Whilst storage has fallen enormously in cost over the last few decades, it is still generally true that the faster you wish to access data, the more expensive it is to store. A key concept to grasp here is the continuum of data storage. At one end of the continuum is bulk data storage, vast but slow to access, perhaps stored on tape. At the other end of the continuum is data that is rapidly accessible for computation, for example the (relatively) small amount of data stored in a laptops’ random-access memory (RAM). This is another limiting factor of computation time. There is no point having a huge engine (a fast processor) if you cannot pump fuel (stored data) to the injectors fast enough. There is a balance to be struck. Edge computing will be a key feature of built environment computing platforms. As machine learning capabilities move from the racks in the cloud to the chipsets in edge hardware, we will see a rapid deployment of AI at the local level.


Bandwidth: This is the nervous system of our compute and storage, and hopefully a relatively simple dynamic to get our heads around. Most of us are old enough to recall the struggles of slow internet connections. Bandwidth is generally expressed as a data rate. Your AOL dial-up connection might have offered you 56 kbit/s (that's 56,000 bits per second), enough to chat on your favourite Spice Girls message board, but not much else. By contrast, Netflix (at the time of writing) requires 3 Mbit/s for high-definition TV at 720 pixel resolution and 15 Mbit/s for ultra-high definition (UHD). 15 Mbit/s is 15,000,000 bits per second: roughly 270 times more data transmitted per second than the good old-fashioned 56 kbit/s dial-up. Emerging research at University College London allows for the transfer of data at 1,000 GB/s. As the volume of data we produce increases, and as we do more computation with it, the bandwidth that we require will increase further still.

The economic trifecta of technology: The impact of technology is the cumulative impact of exponential improvements in each of these three component parts. To predict the future of technology is to try to interpret the use cases that continued improvement in these three components will enable. Recent advances in AI are in large part a result of more computational power, better computational techniques (neural networks), fine-tuned hardware (GPUs), greater availability of stored data, and sufficient bandwidth to allow instantaneous communication with users around the world.

Figure 6.1 shows a very broad cost estimation, for illustrative purposes only, sourced from a triangulation of sources. The purpose here is to understand the rough rate of change; prices are not adjusted for inflation or purchasing power parity. Figure 6.1 shows the rate of change in the cost of computation, storage, and bandwidth. The graph uses some mathematical magic to visualise the data: it is a logarithmic scale using 1980 as a reference point (1980 = 100%). It shows that the costs of data stored, transmitted, and processed have all fallen by many orders of magnitude over the last four decades, with the biggest changes in computation[2] (Table 6.2).

[2] A side note here, and some wisdom from the world of economics: never accept statistics from anyone who does not express them as a ratio. Any value that isn't expressed as, for example, a cost per person is to be treated with instant suspicion. It's hard to give a pithy rule that applies to every situation, but always question lonely values!

[Fig. 6.1 The economics of digital technology (a rough guide): the cost of compute ($/MIPS), storage ($/MB), and bandwidth ($/MB/s), 1980–2010, indexed to 1980 = 100% and plotted on a logarithmic scale]
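For the curious, the 'mathematical magic' is nothing more than indexing each series to its 1980 value and plotting it on a logarithmic axis. A sketch of the method follows; the cost numbers below are rough, invented orders of magnitude for illustration, not the book's underlying dataset.

    import matplotlib.pyplot as plt

    # Index each cost series to its 1980 value (1980 = 1.0, i.e. 100%)
    # and plot on a log axis. Figures are illustrative only.
    years = [1980, 1990, 2000, 2010]
    costs = {
        "Compute ($/MIPS)":   [1_000_000, 20, 0.05, 0.001],
        "Storage ($/MB)":     [50, 2, 0.001, 0.00002],
        "Bandwidth ($/MB/s)": [500, 250, 1, 0.001],
    }

    for label, series in costs.items():
        indexed = [value / series[0] for value in series]  # normalise to 1980
        plt.plot(years, indexed, label=label)

    plt.yscale("log")
    plt.ylabel("Cost relative to 1980 (log scale)")
    plt.legend()
    plt.show()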


Table 6.2 The numbers

1980s
• Computation (cost of a million instructions per second, MIPS[3]): IBM's 3081 could deliver 1 MIPS for several million dollars at the start of the decade; by the end of the decade a single MIPS was down to a couple of thousand dollars.
• Storage (cost per byte of persistent storage): Again dominated by IBM, where hard drives were in the tens of megabytes and cost around $50 per megabyte.
• Bandwidth (cost per byte per second): The data for this period is tricky because the internet mostly utilised existing telephony infrastructure, and prices were not charged per byte rate. Prices were around $10 per hour for a 56 kbit/s connection. A very rough sum suggests around $0.50 per megabyte transferred. That would mean your UHD Netflix movie is going to cost you about $2,500 per hour (if a suitably fast connection existed).

1990s
• Computation: The Pentium ran at about 100 MIPS at a cost below $2,000, and we see the single MIPS fall to below $20. By the end of the decade, the MIPS falls to a couple of dollars.
• Storage: Early on, hard drive storage reaches the hundreds of megabytes, costing around $2 per megabyte. Ending the decade, we see gigabyte-sized drives and the price of a megabyte drop below $1, where a typical 1 GB drive costs $100 or less.
• Bandwidth: Dial-up is still limited to its theoretical limit of 56 kbit/s, but its cost has dropped to the point where your UHD movie would stream at about $15 per hour (although it is still not possible to stream at the required rate!).

2000s
• Computation: We see the arrival of multi-core technology; a single MIPS is now down to pennies.
• Storage: Storage density rapidly improves; early on we see drives in the tens of GB, now costing $0.10 per GB. By the end of the decade, we see terabyte-sized drives and the cost per gigabyte levelling out.
• Bandwidth: Broadband arrives; at the start of the decade we see speeds of up to 1 Mbit/s, rising to 50 Mbit/s by the end of the decade. Your UHD movie is now costing you about $3.50 per hour.

2010s
• Computation: Processors get faster and smaller with the advent of smartphones and the rapid deployment of data centres. This drives the MIPS to a fraction of a penny.
• Storage: The start of the decade sees the introduction of solid-state drives at consumer prices, costing about $1 per gigabyte and dropping to $0.10 per gigabyte by the start of the 2020s. The price of traditional disk hard drives falls to around $0.02 per gigabyte.
• Bandwidth: Fibre-optic networks arrive, and 100 Mbit/s to 1 Gbit/s connections become available. Meanwhile we see the development of 4G signals that can achieve 10 Mbit/s, but at a much higher cost than wired internet. Heading into the 2020s we see 5G and gigabit fibre being rolled out to most city centres around the globe. Your Netflix movie is now costing you around $0.02 per hour.
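The streaming-cost figures in the table are back-of-envelope estimates, and easy to reproduce. A sketch using the table's own rough 1980s prices (and noting, as the table does, that a 56 kbit/s line could never actually carry a 15 Mbit/s stream):

    # Scale the table's 1980s dial-up price up to UHD streaming bandwidth.
    # Assumes ~$10/hour for 56 kbit/s and ~15 Mbit/s for UHD, per the table.
    dialup_rate = 56_000            # bits per second
    dialup_cost_per_hour = 10.0     # dollars
    uhd_rate = 15_000_000           # bits per second

    ratio = uhd_rate / dialup_rate                 # ~268x the bandwidth
    cost_per_hour = ratio * dialup_cost_per_hour   # scale the hourly price
    print(f"{ratio:.0f}x dial-up -> ~${cost_per_hour:,.0f} per hour")
    # 268x dial-up -> ~$2,679 per hour

That lands in the same ballpark as the table's '$2,500 per hour' figure.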

The trifecta aspect of these three dimensions is well illustrated by our Netflix movie comparison. A movie streaming service must be commercially competitive against movie theatres and home video (VHS tapes, DVDs, and Blu-ray).

[3] Not to be confused with the Multi-directional Impact Protection System, which you should have on your cycling helmet.


We have seen the economics shift in our own sector. In the past (up to around 2010) building sites and design offices were rarely digitally connected, even where higher speed wired internet connections existed. It took too long and cost too much for a building site to be physically connected to the internet. However, the improvements in mobile internet technology and Wi-Fi mean that the days of the disconnected building site and frontline worker are (largely) behind us.

One step deeper is the distribution of software. The tipping point in the economics of these three factors brought about (relatively) affordable, ubiquitous cloud computing; this in turn enabled software companies to move away from selling boxes of their product. Selling boxes of software was inefficient and insecure, leading to rampant piracy. For example, software companies used to ship, alongside the CDs containing their software, a USB dongle that contained anti-piracy technology to prevent copies of the software being created (and revenue lost). Without the ability to 'patch' software via remote updates, users were exposed to bugs and security issues. Whizz forward to today and we no longer see discrete updates to some software platforms. Rather, their code and updates are deployed instantly, possibly multiple times a day. Indeed, vendors even 'A/B' test multiple versions of their services simultaneously. This illustrates an important point: technological change is not additive, there are no bolt-ons, it has changed everything.

The balance of the economics we have just discussed changes the level of friction experienced by customers and internal users of technology alike.[4] The following is adapted from Milgrom and Roberts' "Economics, Organization and Management: United States Edition" (Milgrom & Roberts, 1992). There are five characteristics of a transaction (and its cost), in our context 'friction'. These characteristics can be used as continuums, where different and sometimes opposing business models sit at either end: one end of each continuum being high friction, the other low (Fig. 6.2).

[4] Managers of firms generally fail to see that they need to place the same focus on internal friction as on addressing and reducing the friction that their customers experience. For example, we would love to see managers be leaders of technology. Yes, technology needs to be led by everyone, but without a responsible leader at the top of the organisation, technology fails to achieve any meaningful implementation beyond the deployment of pet tools for small user groups. Despite all the market disruption resulting from the internet, managers of firms still fail to grasp that the market friction they focus on and build business models around is only part of the picture. On the other side of the same coin are the frictions of the internal organisation: the internal economies of firms and their supply chains. These are no different and behave the same; they are exposed to the same economics of technology. But the management of firms, especially in the construction sector, often focuses on business-to-business friction and neglects their internal customers: the experience of their employees and supply chain partners. How entertainment is distributed, how internet browsers are updated, and how engineering is conducted are all influenced by these levels of market friction. Whether you employ software engineers or civil engineers, it is likely that a large part of your organisation's focus is on building better experiences for customers, despite the costs of internal friction. Your authors have witnessed first-hand, in every type of firm from government to small business, that the user experience of the staff who work for you is just as critical to success as the experience of your customers.


Fig. 6.2 Measuring Friction

• The specificity of the project and the desired outcomes of the client/owner/buyer (for example, the development of a factory is more specific than the development of a house on a housing estate).
• The frequency of communication within a project and the duration or period over which it is procured (for example, the development of a new hospital is less frequent than the maintenance of a hospital).
• The complexity of the project and the uncertainty about what performance will be required (for example, the development of a nuclear power plant is far more complex than building a school).
• The difficulty of measuring the performance of the project as a collection of transactions (for example, it is harder to measure the performance/quality of plant that is not directly owned, much like a second-hand car).
• The connectedness of the project to dependent infrastructure (logistics and the internet, for example) and stakeholders (a low-connectivity project might be a highway project with little connectivity to the internet and hardly any interfaces with other infrastructure, such as underground services).
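To make the continuums concrete, here is a minimal sketch in Python of scoring a transaction against the five characteristics. The field names, the 0-to-1 scale, and the idea of averaging into a single 'friction' score are our own illustrative assumptions, not Milgrom and Roberts' method:

```python
from dataclasses import dataclass, fields

@dataclass
class TransactionFriction:
    # Each characteristic scored from 0 (low friction) to 1 (high friction).
    specificity: float             # how bespoke the project and its outcomes are
    infrequency: float             # how rarely this type of transaction occurs
    complexity: float              # uncertainty about the performance required
    measurement_difficulty: float  # how hard performance is to measure
    connectedness: float           # dependence on other infrastructure/stakeholders

    def score(self) -> float:
        """Naive, unweighted average across the five continuums."""
        values = [getattr(self, f.name) for f in fields(self)]
        return sum(values) / len(values)

# Purely illustrative scores: a nuclear power plant versus a house on an estate.
nuclear_plant = TransactionFriction(0.9, 0.95, 0.95, 0.8, 0.9)
estate_house = TransactionFriction(0.2, 0.1, 0.2, 0.3, 0.4)
print(f"Nuclear plant friction: {nuclear_plant.score():.2f}")
print(f"Estate house friction:  {estate_house.score():.2f}")
```

Scoring a coffee purchase against a nuclear power plant in this way makes the difference in friction, and hence in viable business models, immediately visible.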


A pure transaction cost view (our friction) does come with a caveat. The outcome performance of a project can't be fully expressed as the sum of its production and transaction costs, where the former depends on technology and the latter on the organisation of transactions. This makes the separation of production costs and transaction costs difficult, and is why we find it difficult to calculate the return on investments in information technologies in general. Whenever any of these characteristics of friction changes, an equal pressure to change the business model or procurement strategy is applied. Measure your current role against these characteristics, then compare them to an adjacent role. Are they the same? How do they differ? Which one is more valuable? Next time you buy a coffee, take these characteristics and think about the overall cost of the transaction. What's frequent? Where does the money go? How much time does it cost you? By recognising the complexity of the problems we are trying to solve, and by leveraging the continuous advancement in the components of computing, we can begin to deploy solutions that catch up with the legacy of the built environment that we have created.

There is a bonus factor for built environment professionals that we cannot ignore: robots. The economics of robots follow a similar trend to the dynamics of 'digital' technologies found above. Robots were deployed in complex environments such as manufacturing from the 1960s, starting in the automotive industry. As the price of producing and implementing robots fell, and as they became safer to work around (large robotic arms are very dangerous), we saw the arrival of collaborative robotics. Combining our economics above with these machines, we have seen robots enter healthcare. Robots will enter our sector through two routes. First is the mechanisation of offsite manufacturing, similar to our automotive cousins. Second will be direct deployment on site as edge devices on our distributed construction computing capability. This route can be visualised if you imagine that we started with hand tools and moved on to powered versions of them, such as pneumatic and power drills; the next step is to mechanise not only the tool but our limbs, through either a mobile collaborative robot (an arm on wheels or legs) or some form of exoskeleton. All of these things we already see in the market today. But much like the mainframe computer back in the 1980s, they are only available to large industry. However, we are on the path to consumer-level robotics for construction.

Alright, enough of the meta-narrative, it's time we focus on the what and the how. It's time to make digital change real by appraising the kind of investments in data technology that will empower our colleagues and organisations, substantiate our data strategy, and help us to realise our data vision. The technologies that are generating the greatest hype in the sector at present sit at the convergence of several larger trends such as cloud computing, IoT, and AI. What they have in common is harnessing an improvement in the immediacy or timeliness of data, both in the speed and volumes of data capture and, in theory, the speed of translating data to action. As the EPSRC writes, "emerging technologies are being increasingly applied in infrastructure systems; they increase the timeliness and reliability of information about the current state of infrastructure" (EPSRC-NSF, 2023). As T.S. Eliot taught us, "between the idea and the reality, between the motion and the act, falls the Shadow." To build our kingdom of data, we will first need to reckon with that galling gap between intention and action, "between the potency and the creation."

In this part of the book, we dive into a broad family of technologies that could collectively fall under terms such as "Digital Construction" and "Smart Buildings." Products, offerings, and techniques are constantly changing in this space, and this is likely to be the part of the book that ages most quickly. Nonetheless, we hope that by addressing the intention of these technologies we will capture, at this moment in time, how the built environment sector is seeking to realise better outcomes through data in the short to medium term. As a data professional, part of your role will almost certainly be to translate abstract concepts like predictive analytics or digital twins into tangible investments that can be made in the present. Often this means being able to trace a thread between the foundational capabilities that your organisation needs in the immediate term and the longer-term aspirations of the organisation. It can be useful to channel the desire for shiny toys like digital twins into the less sexy underpinning capabilities such as data governance, data aggregation, and data quality. This progression of investment from prosaic to transformative is what you might refer to as a roadmap. Data professionals live and die by their ability to articulate credible roadmaps.

6.3 Digital Construction and Building Information Modelling (BIM)

Building Information Modelling (BIM) is not a technology but a process which provides the ability to author, exchange, assure and subsequently use and re-use trusted information (data) to the benefit of all those involved in an asset's lifecycle. (BIM Interoperability Expert Group, 2020)

Building Information Modelling (BIM to its friends) was always going to loom large in this book. Whilst the sector's record of delivering value from investment in BIM is patchy at best, it is nonetheless epoch-defining software (or process, depending on whom you ask). BIM is notable as perhaps the first time that a major new type of software, with (partially) competing global vendors (notably Bentley and Autodesk), emerged specifically to serve the requirements of the built environment. Sure, before the concept of BIM there was CAD, GIS, scheduling tools, and the like, but generally this was tooling that a subset of the built environment user base (e.g., designers and architects) shared with a range of users outside the built environment sector (for example, in manufacturing, product design, or extractive industries). BIM was and is notable for both its specificity and, perhaps more importantly, its broad reach across construction (and latterly operations and maintenance) professions.

In the wild, BIM is as much a philosophy as it is a type of software. Pioneers such as Dr Mark Bew, Iain Miskimmin, and Dr Anne Kemp wrote the BIM remit into national and international standards. It is worth noting that the framing of the original BIM Task Group for the UK Government was about enabling the government to facilitate value for money in the wake of the great financial crisis of 2008. Figure 6.3 is from an old slide deck, circa 2012, from the Construction Industry Council's BIM2050 group (which one of your authors co-founded with David Philp, Head of BIM for the Cabinet Office at the time). Looking back at it now, on the doorstep of 2025, it's interesting to see how far we have come. The point of sharing this is to show the original narrative. There was never a thing called BIM in isolation; it had levels of maturity and articulated a journey for both government and the supply chain. Governments from around the world welcomed the BIM Task Group (later the Centre for Digital Built Britain) via an international programme to bring about international coordination. The crowning glory of this story is coordinating the transition of the founding standards (the BS 1192 suite) into the international standards (ISO 19650).

Fig. 6.3 A presentation slide by the Construction Leadership Council’s BIM2050 Group


However, in the transition from the BS 1192 suite to the ISO we have lost some nuance from the original BIM concept, but that is probably a price worth paying to have the world on board with standardising information management practices.

This is now ancient history. BIM sought to shape the sector to its will. If you were a project manager or an engineer or a facilities manager and you didn't think that BIM applied to you then you were wrong, and there was BSI-endorsed paperwork to prove it. Of course, there were, initially at least, only a handful of people who got to decide what was, and wasn't, BIM. For the most part the built environment sector continued to generate data as sketches, documents, spreadsheets, and reports as it always had, with neither BIM nor GIS becoming the primary index of built environment data as its pioneers might have envisioned. As BIM became both desirable and indeed obligatory—with declarations such as the UK Government's 'BIM Level 2' mandate—this small group of people who wrote the standards came to exert a huge influence on the sector. It also reinforced the sense that BIM was unique and special, even though the underlying technologies and approaches (design modelling, document management, data aggregation and visualisation) were hardly unique to the built environment.

However, all that is in the past. We are now well into the second decade of widespread BIM adoption (at least in the UK). The extended family of BIM adopters is now sufficiently large, and the tooling sufficiently matured and off the shelf, that we can take stock of what (if anything) we want to accomplish with BIM and how it enables our broader aspirations (Digital Construction, Smart Buildings, Digital Twins, Digital Rehearsal). We can now define what proportionate application of BIM looks like on our project or for our organisation, without an obligation to meet any specific standards or arbitrary thresholds. It is important to highlight a current wave of activity around the maturing of BIM from Building Information Modelling to dropping the B and settling on information management. This has been the objective of the UK BIM Framework and a motivation behind changing the name of the UK BIM Alliance to Nima.

Your authors believe that BIM and related Digital Construction techniques have a significant role to play in delivering built environment projects on time, on budget, safely, and to a suitable level of quality. However, we also believe that we must work to lower the barriers to entry and flatten the learning curve for BIM, and in doing so make it accessible to a much larger proportion of staff within built environment organisations. In large part this means understanding how to surface a common set of data through a range of different user interfaces designed to cohere with different user personas. In covering BIM, we will seek to answer these related questions:

• How and where built environment projects/organisations are likely to derive the greatest value from BIM and related digital construction technologies.
• What these same organisations can do to ensure that these investments find widespread adoption beyond their traditional specialist user base, and in doing so tackle the fundamental information management and information sharing challenges faced by the sector.

There is room in this discussion for honesty and truth-telling. One should approach the hype and marketing associated with Digital Construction with appropriate scepticism. Enormous value is possible from investment in this area, but only if you make those investments in a circumspect and user-focused manner. Digital Construction comes with extensive literature and supporting material, including lengthy British and International standards (BS 1192 and ISO 19650 respectively), data standards such as IFC, COBie, and UniClass, and the work of the BIM Task Group, Construction Innovation Hub, and the UK BIM Alliance (now Nima). Across its six parts, ISO 19650 sets out the value proposition of BIM and describes the roles, artefacts, and processes required to realise that value. Current industrial practice here culminates in the UK BIM Framework, and at the time of writing one of your authors is working on a new UK Government mandate for 'Data and Information Management.' The framing of the mandate may change, as some may point out that DIM is not a great acronym, but the essence is clear. Governments around the globe will need to continue their support to enable a data-driven built environment sector. Your authors believe there is fresh motivation to do so, and that socio-economic and environmental factors (cash and carbon) will be motivating topics once again.

One of the key insights that this body of knowledge brings to the built environment is that we should aspire to a single body of information across our asset and project life cycles. By defining our Organisational Information Requirements (OIRs) we are in effect undertaking business architecture, seeking to identify what we need to know to successfully operate our organisation and the assets that it manages. In this context the information generated during projects (Project Information Requirements) and exchanged with suppliers (Exchange Information Requirements) are subsets of, and contributors to, the body of information managed by the organisation. We can then create a single source of truth through a Common Data Environment and Federated Information Model. Alongside information management and data standards, this is where there is the greatest opportunity for BIM specialists to work alongside their IT department to create reliable, well-architected digital infrastructure to support their organisation's work. By recognising this bigger picture of information, we can seek to ensure that there is consistency and reuse of information across life cycle stages, rather than in siloes. Such a distinction may appear extremely obvious, but it stands in sharp contrast to a history of built environment organisations thinking of information as project-specific, rarely defining their organisational information needs, and treating the information that operations and maintenance users inherit as a subset of the construction information (e.g., "we only make use of 10% of the BIM information after handover").

What ISO 19650 doesn't do is provide much tangible guidance in terms of how a CDE or BIM environment should be built or used (the non-functional requirements, if you will). In many ways ISO 19650 is a generic information management approach


applied to physical assets. The standard leaves all the specifics to the client organisation, supply chain, and the 'Construction Tech' sector. As such, you should look to ISO 19650 for high-level guidance, but you still need to reckon with how to "do BIM" in a manner that meets your organisational requirements.

When creating data capabilities, it often helps to start small, and BIM is no exception to this rule. In the initial stages of a project, your goal for Digital Construction might be to arrive at a 'minimum viable product' that minimises up-front costs and provides a test bed for users to try out functionality and ways of working. Starting small with a core group of super users will allow you to configure your tools and mature your user stories, making you more able to rapidly 'scale up' as-and-when required. As you grow your BIM capability you will need to make decisions such as:

• What are the component parts of our BIM process tooling (e.g., document storage, metadata/attribution storage, visualisation, design, commenting)?
• Which user personas will benefit from access to and training on all or part of our BIM tooling (noting that licences and training for such software can often be expensive)? Are we going to try to make use of BIM outside of a core group of designers and specialists?
• How do we introduce inexperienced users to BIM in an accessible manner without drowning them in jargon?
• How will we ensure that our BIM capability interfaces effectively with our broader IT ecosystem? For example, when should we expect users to move documents from the corporate environment (e.g., Microsoft Office) to the CDE? Are IT able and willing to support specialist BIM software? How does BIM fit with other specialist tooling like GIS, Enterprise Asset Management Systems (EAMS), and operational systems?
• As we create this capability, how does it affect our working relationship with our suppliers/customers/stakeholders? Do we expect them to buy licences for the same tooling? Does it change what and how data moves between us? Do we own a centralised platform, or federate working practice across the supply chain?

Whilst there are many people who will happily tell you what you should do in this space, there isn't one right answer to these questions. As a data professional, part of your role is to work across the distinct parts of your organisation to understand their aspiration and capacity. In part you are looking for the weakest link. Due in part to savvy marketing by vendors and consultants, very few built environment organisations these days will publicly acknowledge anything short of a desire for transformational digital change in their approach to digital construction. However, on the ground, shortcomings will arise that organisations cannot realistically overcome quickly. These might include:

• A lack of capability/funding/accountability for deploying or supporting the required software platforms.
• A lack of skills/training/willingness/incentive/bandwidth for staff to learn how to adopt these tools into their ways of working.
• Insufficient quality data to populate the tooling such that it is useful.


• An inability to trickle down requirements to the supply chain, perhaps due to a lack of information requirements in the agreed contracts (e.g., "You want a COBie spreadsheet? Well, it isn't in the contract so it's going to cost you.").
• Security concerns around the storage of information in vendor cloud offerings, or a reticence to share information freely across organisational boundaries.

None of these challenges is insurmountable. However, they do mean that, depending on the maturity of your organisation, Digital Construction tooling such as BIM may be the purview of a small group of specialists to begin with, until you can demonstrate value to your organisation and/or have the opportunity to change your existing arrangements with suppliers and stakeholders.

6.3.1 Definitions

There is a surplus of terminology in the Digital Construction space, much of which has limited value to non-specialists. Additionally, every organisation has its own local lexicon, which can further obscure what you are trying to accomplish. To try to keep things simple we suggest the following terminology:

• Digital Construction: Following The B1M's definition, "Digital construction is the use and application of digital tools to improve the process of delivering and operating the built environment." Such is the success of the BIM agenda—in the UK at least—that practitioners sometimes use the terms Digital Construction and BIM interchangeably. However, we are using the term 'Digital Construction' to describe a broader field of technologies.
• Building Information Modelling (BIM): ISO 19650 defines BIM as the "use of a shared digital representation of a built asset to facilitate design, construction and operation processes to form a reliable basis for decisions." This definition emphasises the digital representation of the asset, and downplays the extent of asset information modelling, document management, and other information practices associated with BIM. We believe that BIM is about much more than digital representations.
• Common Data Environment (CDE): ISO 19650 defines a CDE as an "agreed source of information for any given project or asset, for collecting, managing and disseminating each information container through a managed process." It is worth noting that the CDE may not be a single piece of software, as you may use different tools to manage model data, document data, and numeric data. Similarly, the CDE should not be confused with design tools such as Revit. Instead, the CDE should function as a repository of information, where that information is in turn updated using other tools, of which Revit is an example.


You may find that your organisation uses the above terminology in a slightly different manner. Perhaps there is already a document management or data warehousing solution in place that users refer to as the 'common data environment', in which case it might be unwise or confusing to lay claim to that same terminology for your BIM CDE. Perhaps there is a 'Digital Construction' team who are predominantly focused on drones and robotics. Again, it's not worth getting into fights about nomenclature; the goal is to establish a consistent terminology for your organisation. Ideally this terminology doesn't unnecessarily diverge from the terminology used by your customers/stakeholders/suppliers. But given the choice, it is more important that your colleagues within your organisation, particularly the leadership, have a tangible sense in their minds of what is meant by BIM or CDE or Digital Construction. From the authors' experience, the distance between the views of suppliers and their customers on what constitutes a CDE is detrimental to the gains advertised by evangelists and marketers of these tools.

A failing of BIM, its standards, and the surrounding culture is that it can be too focused on the bureaucracy of project and programme management and not the data flows required to answer questions, inform, and record decisions. A practical example from industrial practice is where construction processes have been required to integrate with manufacturing processes (in the context of a multibillion-pound high-value manufacturing programme). When these two industries are held side by side the contrast is stark. On one side there is BIM: the design and management of the facility and its surrounding infrastructure. As we zoom in, we find Product Lifecycle Management (PLM) and Model Based Systems Engineering (MBSE). Conceptually, these systems are complementary. In practical application, however, they are on completely different planets. The juxtaposition of these sectors demonstrates a key difference between information and data management. PLM and MBSE tend to be concerned with data objects and object-level attributes, whereas BIM and construction information management are more concerned with the flow of documentation that abstracts objects into mostly 2D drawings (despite the fidelity of, and investment in, detailed 3D models) and relevant documentation.

Our summary here is that we want to encourage you to see the difference between data and information. They are both vital. Let's drop the B from BIM and bring life to a new phase in the maturity of our sector. If we can do that, it opens up the possibility of building a bridge between digital construction and data management professionals. To drive the point home: data management professionals exist in every firm and industrial sector. They play a role in our medical records, our credit card payments, and the air traffic control systems in the aviation sector. If you are able to think beyond the transactional nature of the information management standards and systems found in construction, you will be able to establish a fruitful relationship with a new breed of professional roaming the digital economy: the data manager.


Fig. 6.4 Overlapping digital construction technologies

Digital Construction is—in our usage at least—a broad domain, of which BIM is a subset. Digital Construction also includes techniques such as photogrammetry, digital rehearsal, AR and VR, and is related to domains such as Modern Methods of Construction (off-site manufacturing), Smart Buildings, IoT, and Digital Twins. Figure 6.4 presents a view of how these different concepts relate to each other and overlap. As per our previous caveats, there is room to disagree here, and we are looking to be informative rather than prescriptive. What this diagram demonstrates well, we believe, is the messiness in the application of these concepts. Take Digital Twins, for example. To some folks a Digital Twin is just a re-branded BIM model (BIM+ if you will). To others, a Digital Twin need not have any 3D visualisation or design data at all; it might instead be a sophisticated building management system, closer to SCADA than BIM. Likewise, if you collect data on the construction of your asset in accordance with a BIM data standard like COBie or IFC, but never get round to visualising that data in a design model, are you still doing BIM? This messiness is why it is important to confirm shared meaning when working with others within and beyond your organisation, to avoid expensive mistakes when you try to work together.

That said, there are clear distinctions, particularly between disciplines and tools/techniques (or, if you prefer, between knowledge and knowhow). BIM and design modelling are not interchangeable: the former is a rigorous means of capturing information during the design and construction of built assets; the latter is a type of projection. The peers of BIM are other disciplines such as Asset Management, Cost Management, Project Controls, etc. The peers of 3D visualisation are other means of displaying information such as 2D projections, charts, sketches, maps, and the like. It just so happens that 3D visualisation is uniquely well suited to presenting the data collected by BIM processes. Any project adopting BIM will at a minimum capture a 3D representation of the assets in question, along with some basic level of attribution of those assets (preferably using a common standard such as UniClass or COBie). Table 6.3 extends this distinction. That said, the truth (to the extent that there is a 'truth') is perhaps more poly-hierarchical than these simple divisions suggest.

Table 6.3 Knowledge and knowhow with respect to technology applications

| Discipline (knowledge) | Technique/tool (knowhow) |
| --- | --- |
| Digital construction | Photogrammetry, AR, VR, digital rehearsal, logistics, simulations, field data capture, contact management, AI-augmentation, generative design, etc. |
| GIS | GPS, drones, earth observation technology, large-scale data models (with geographic context), maps (of all types) |
| Building information modelling (BIM), a sub-discipline of digital construction (above) | Design modelling, clash detection, site data capture and handover, information management and document control, information requirements |
| Digital Twins/smart places/intelligent asset management, a sub-discipline of digital construction (above) and IoT (below) | Building management systems, operational controls, SCADA, simulation, design modelling, multi-dimensional BIM, proactive/predictive asset management |
| Internet of things | Sensors, monitoring, mesh nets, edge computing, cloud computing |
| Modern methods of construction | 3D printing, off-site manufacturing, waste reduction, standardised component libraries |

6.3.2 Dimensions and a Data-First Approach

As both BIM and 3D visualisation have become more commonplace in the built environment, opportunities have emerged to present additional layers of information using the 3D visualisation of assets in space as the principal user interface. This is understandable: 3D digital replicas are an intuitive way to present spatial information, usually far more intuitive than traditional charts and tables. That said, it is not without its downsides. By framing information in a replica of physical space we lose the power of abstraction and simplicity that comes with more conventional data visualisation techniques. For example, one may have a BIM model that captures and presents not only the spatial geometry of the assets in question, but also the construction of those assets or how they change over time (e.g., simulating the process of construction), overlaying other datasets (dimensions) related to scheduling logic, cost, carbon, safety, etc.
Typically, we present changes through time by moving from a static design model to a moving image that allows the user to move forwards and backwards in time. In other cases, we might use techniques such as a heatmap (usually at a zoomed-out view of the asset) or add flags to specific locations on the model itself (for example, marking hazards) which the user can click to obtain additional information. As these types of functionality have emerged it has become more common to refer to multi-dimensional BIM, e.g., "4D, 5D, 6D, 7D, 8D BIM", with a higher number of dimensions suggesting a greater level of data integration and visualisation (and price tag). The pedants amongst us will point out that this is inherently a mixed metaphor; the word dimension means quite different things: either it refers to the three dimensions of space plus time, or it refers to additional layers of contextual information. Additional terminology, such as Digital Rehearsals and Digital Twins, only adds to the confusion. However, the dimensions terminology has stuck despite this obvious flaw. As Henry Fenby-Taylor (notable Cyber-Physical Infrastructure thinker and former Head of Information Management at the Centre for Digital Built Britain) put it: "The 'dimensions' just define domains of functionality… between 2011 and 2015 everything was BIM, it began to suck up every piece of digital innovation in the sector… it all got rebranded, so the academics are left chasing definitions developed by consultants who aren't computer scientists."

What all this lexical noise obscures is the fundamentally viable idea of combining information from the project controls (cost, time, risk) and design professions to bring explicability of outcomes to complex projects. There will always be a tangible difference between seeing a bottleneck on a Gantt chart and seeing it manifest as a sequence of moving parts in space and time. Similarly, it will be far easier to explore data on safety or logistics issues, and other design hazards, when overlaid on a representation of the asset. As such, the question that a built environment client should seek to answer is not "how many dimensions of BIM do I need?" but rather "how do I use BIM and 3D visualisation to help me better predict, control, and measure the performance of my works?" In general, one does not derive value from BIM and Digital Construction simply by accumulating dimensions and data layers on your model. Some data, such as summary statistics or longitudinal trend analysis, is unlikely to benefit from 3D visualisation. The challenge, as a data professional, is to work with your organisation to identify the crucial dimensions (such as geometry, time/schedule, cost, carbon, and other non-financial metrics), to ensure that you have the data specifications required to translate these accurately to the spatial model, and the visualisation capabilities to make that translation meaningful to end users.

Taking a data-first approach to Digital Construction can help us to step away from the specifics of software, and arguments about dimensions, and instead focus on the logic of the questions asked by our users. Whichever 'dimensions' one chooses to represent in a 3D environment, you will need a fit-for-purpose data set to populate that dimension, and fit-for-purpose logic (model) for how that dimension relates to the other dimensions. As an example, if you wish to add the concept of 'time' or 'schedule' to your BIM model to arrive at a '4D BIM model' or a 'Digital Rehearsal', you need a source of information on time (e.g., your project plan in Primavera), and some logic for how your time dimension affects your spatial model (e.g., tasks in the project plan are associated with zero, one, or more assets in the design model, and executing tasks can add, remove, move, or change those assets over time in accordance with some defined physical rules). Phrased in knowledge graph terms: to think logically about data in the built environment—and to derive value from combining data with a spatial model—one must define the entities (or concepts) about which you wish to collect data. Your entities must be simple, logical, and mutually exclusive. You can then use your entities and the relationships between those entities to define the data that you wish to collect, and the analyses that you wish to conduct. Ultimately this will flesh out the type of functionality proposed by multi-dimensional BIM, providing a far clearer idea of the valuable questions that you are trying to answer, and of the data architecture, engineering, and analysis required to get you there.
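To make that task-to-asset logic tangible, here is a minimal sketch in Python. The class names, fields, and the 'add'/'remove' effects are illustrative assumptions of our own; a real implementation would read tasks from your planning software and assets from your design model rather than hard-coding them:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Asset:
    asset_id: str        # unique reference, e.g., from the design model
    classification: str  # e.g., a UniClass or COBie classification code

@dataclass
class Task:
    task_id: str
    start: date
    finish: date
    # Zero, one, or more assets affected by the task, with the effect
    # ('add' or 'remove'), per the logic described above.
    effects: list[tuple[str, Asset]] = field(default_factory=list)

def assets_visible_on(tasks: list[Task], day: date) -> set[str]:
    """Replay completed tasks up to `day` and return the asset ids present."""
    visible: set[str] = set()
    for task in sorted(tasks, key=lambda t: t.finish):
        if task.finish <= day:
            for effect, asset in task.effects:
                if effect == "add":
                    visible.add(asset.asset_id)
                elif effect == "remove":
                    visible.discard(asset.asset_id)
    return visible

wall = Asset("W-001", "EF_25_10")  # illustrative id and code
tasks = [
    Task("T1", date(2024, 1, 1), date(2024, 2, 1), [("add", wall)]),
    Task("T2", date(2024, 3, 1), date(2024, 4, 1), [("remove", wall)]),
]
print(assets_visible_on(tasks, date(2024, 2, 15)))  # {'W-001'}
print(assets_visible_on(tasks, date(2024, 5, 1)))   # set()
```

Replaying a schedule in this way, even against a toy model, forces you to confront exactly the questions of logic and data quality that a '4D' tool will ask of your real project plan.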

A data-first approach is in keeping with ISO 19650 Part 2, which states that project organisations should establish project information:

• Requirements (what).
• Delivery milestones (when).
• Standards (what we will share).
• Production methods and procedures (how).
• Reference information (validation rules).
• Common Data Environment (storage).
• Protocol (governance).

ISO 19650 Part 2 provides a wealth of guidance that will help to populate the functional and non-functional requirements for any CDE development. However, it does not offer guidance on which specific data sources should form the CDE, or their structure. For guidance in this area, you will need to refer to existing top-level ontologies (such as BFO or ISO 15926) and industry-standard data models (such as COBie, UniClass, or IFC). Whilst you should seek to ensure that you align to the principles of ISO 19650, use a top-level ontology, and borrow eagerly from industry-standard data models, you may not be able to remove the need to originate some project-specific logic to (a) glue all of these borrowed components together, and (b) reflect the specific goals and idiosyncrasies of your project/organisation.

At the start of this section, we set out to answer the following questions:

• How and where built environment projects/organisations are likely to derive the greatest value from BIM and related digital construction technologies.
• What these same organisations can do to ensure that these investments find widespread adoption beyond their traditional specialist user base, and in doing so tackle the fundamental information management and information sharing challenges faced by the sector.

An effective way to start answering these questions is to define the entities that drive decision-making in your organisation, where you can reasonably expect to store, manage, and analyse these using Digital Construction tooling. You can then prioritise investment in a manner that seeks to address the business' most valuable use cases first. Which is to say, "if a lot of users care about X, then prioritise bringing X data into your model." Working with a Business Relationship Management or business engagement function here will help to ensure that you capture your use cases through engagement with your stakeholders.
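As a toy illustration of that prioritisation logic (the use cases, user counts, and entity names are entirely our own invention), you might rank candidate entities by the number of users whose use cases depend on them:

```python
# Each use case records how many users it serves and which logical
# entities it requires. All values here are illustrative assumptions.
use_cases = {
    "progress reporting": {"users": 120, "entities": {"Task", "Asset"}},
    "cost control":       {"users": 45,  "entities": {"Cost", "Task"}},
    "safety briefings":   {"users": 200, "entities": {"Hazard", "Asset"}},
}

# Tally the users reached by each entity across all use cases.
entity_priority: dict[str, int] = {}
for case in use_cases.values():
    for entity in case["entities"]:
        entity_priority[entity] = entity_priority.get(entity, 0) + case["users"]

# Highest-value entities first: these are the data sets to bring into
# your model (and your CDE) before the rest.
for entity, users in sorted(entity_priority.items(), key=lambda kv: -kv[1]):
    print(f"{entity}: reaches {users} users")
```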

Figure 6.5 demonstrates an example of how you can link your logical entities (blue and green boxes) to data standards (yellow) that you issue to your supply chain, to the systems that you use as an organisation (orange and purple), and ultimately to your use cases. Figure 6.5 is a simplification of the detailed logic required to ensure that you are clear on the data required to meet each use case.

Fig. 6.5 Built environment organisation data use cases

Creating these kinds of high-level data models, supported by more detailed logical data models, helps to ensure that:


1. You surface data to users in a manner that reflects the language and logic that they use in their work, rather than the sometimes arbitrary or generic terms used by your software.
2. You understand the data requirements of each use case, and the data aggregation required to meet those use cases.
3. You can specify the movement of information between systems in a manner that reflects a consistent business logic (e.g., avoiding conflicts between your different data standards and data sets).
4. You can specify the movement of data between your organisation and stakeholder/supplier organisations in a manner that ensures that the data you receive is fit-for-purpose.
5. You understand the sensitivity and source of the data that you use, so that you can apply appropriate access control and security measures.

Note that items 3 and 4 above are closely related to your approach to managing document information, which is a core part of delivering successful Digital Construction but also has wider application across the business for reporting, audit, quality management, and assurance. To realise a data-driven approach to Digital Construction, you will benefit from investing in common and accessible definitions that you share as an organisation (e.g., your Common Data Model). A Common Data Model requires the following types of investments (artefacts in bold):

• A high-level map of the data sets that you manage and how they relate to each other (e.g., your Ontology or Conceptual Data Model), reviewed extensively with stakeholders across the business to ensure the model reflects their understanding of 'how things work'.
• Connecting your Ontology to your Information Asset Register and Data Governance Framework, to ensure that you are clear on who is responsible for each data set and your expectations of their responsibilities.
• Using your Ontology to inform the prioritised creation of Logical Data Models, which describe the information contained within relevant data sets/source systems (e.g., schedule data in planning software, cost data in a cost management system, geometric data in BIM), and a Reference Data Library and Interface Specifications, which describe how data from different data sets can be integrated into a single source of truth within the CDE.
• Implementing the aggregation of your data into the CDE (using an extract, load, transform (ELT) process) with the transformation logic defined in your Interface Specifications. This can in turn serve as the data source for other Digital Construction tooling such as Digital Rehearsal and multi-dimensional BIM/planning.

You should seek to develop your Common Data Model and Common Data Environment in parallel. The creation of your Common Data Model will inform the structure of data in your systems, including BIM and project controls tooling. It is important to remember that, for all their sophistication, Digital Construction tools are fundamentally an information delivery mechanism. It is easier for us to consume information about the current and future states of a project if we can view those states on our computer screen in a way that accurately mimics their real or proposed layout in the physical world. Through this optical illusion that makes the pixels presented on our flat screen appear three-dimensional, we create models that recreate our real-world spatial awareness. What we then accomplish using that optical illusion depends in large part on what data we have at our disposal, and how meaningfully we can visualise it. Emerging Digital Construction technologies provide us with yet richer data capture and visualisation capabilities, further blurring the line between what is real and what is digital replica.
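To ground the ELT step above, here is a minimal sketch using pandas. The table layouts, column names, and the shared asset identifier are our own illustrative assumptions; in practice the extracts would come from your planning and cost systems, and the join logic from your Interface Specifications:

```python
import pandas as pd

# Extract: raw exports from (hypothetical) source systems, loaded into
# the CDE's staging area before transformation.
schedule = pd.DataFrame({
    "asset_id": ["W-001", "W-002"],
    "planned_finish": ["2024-02-01", "2024-06-01"],
})
cost = pd.DataFrame({
    "asset_id": ["W-001", "W-002"],
    "budget_gbp": [120_000, 85_000],
})

# Transform: apply the Interface Specification's logic - here simply
# conforming types and joining on the shared asset identifier.
schedule["planned_finish"] = pd.to_datetime(schedule["planned_finish"])
single_source = schedule.merge(cost, on="asset_id", how="outer")

# Load: the conformed table becomes the single source of truth for
# downstream tooling (e.g., multi-dimensional BIM or digital rehearsal).
print(single_source)
```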

6.3.3 Emerging Technologies: Bridging the Gap Between Digital and Reality

Built assets last a long time (usually). Construction projects take a long time. And technology isn't waiting around. If you are starting work with an infrastructure owner or construction project today, chances are that by the time you've accomplished what you set out to do, the parameters of technology will have changed massively, perhaps beyond recognition. So it pays to keep an eye out for what's on the horizon, and there are a number of emerging technologies that will likely come into play in Digital Construction over the lifespan of projects starting out today. The majority of these are bleeding over from other industries, in particular business-to-consumer (B2C) use cases such as computer games and entertainment, but will begin to see application elsewhere as the technology matures and reduces in cost. It has been true for a long time that the technology in your pocket or your living room is a generation ahead of what's available on site. Consumer markets are generally larger and more profitable than built environment projects, and usually require lower levels of accuracy and functionality. This makes it much more likely that innovation in the built environment means taking consumer technology and adapting it for use on site than the other way around. But it doesn't mean that you should start recording TikTok videos in your portacabin.

Whilst not all of these technologies are 'data' solutions per se, they are all means of collecting and using data, and they all allow us to more tightly couple our physical and digital worlds, helping us to mimic the real world more quickly and accurately in our digital replicas. Three such types of technology that blur the boundaries between the digital and physical worlds are:

• Site data collection: mobile apps, computer vision, pattern recognition, automated site capture, drones.
• Overlaying data on site: augmented reality (AR).
• Interactive digital replicas: virtual reality (VR), the metaverse, Web3.


These technologies reduce the friction in moving data between the physical realm and the digital realm, and vice versa. It’s worth discussing each direction of data flow in turn.

6.3.3.1 Physical to Digital

Data collection will always be a substantial cost associated with a construction or maintenance project. Even on greenfield sites one must conduct investigations into the land, soil, groundwater, flood hazards, natural environment, protected species, and the like. On work that involves renewing or replacing existing parts of the built environment, particularly where they remain in use, data collection to inform designs can become prohibitively expensive. Once work has commenced on site, the construction verification process involves repeated and detailed inspection of the work completed and manual comparison back to the source design. Under the status quo, site surveys and site data capture remain a challenge for a wide range of prosaic reasons, including:

• There is a wide range of different phenomena that one can capture about a piece of land, a building, or an asset: everything from soil composition and atmospheric conditions to usage patterns, structural strength, ambient light, and noise levels. This means that specialist equipment is often required to capture information. In the past, through a combination of vendor canniness and client immaturity, the data captured from such equipment has sat with the vendor. In these circumstances the client—who, let's remember, paid for the data collection—is given full or partial extracts from the vendor's database. Such ways of working can make it extremely difficult for a client to consolidate diverse sources of site information into a CDE or equivalent, or to ensure that the resulting data conforms to any sort of common data model or specification.
• Site surveys and site data capture often rely on a fragmented supply chain, with inconsistent ways of working and skill sets. This can make it particularly challenging to establish common data standards or consistent use of tooling.
• On certain projects there will be sensitive data captured on site, for example information about security infrastructure. Clients do not always have established practice for storing data centrally and controlling access to sensitive information. Consequently, it is easy to end up managing sensitive data and non-sensitive data in mutually incompatible ways.
• The file formats associated with site data capture can be quite substantial, and not all organisations have a means to safely and securely transfer images from site devices into their CDE. Where these means rely on physical media (such as USB sticks) they often run afoul of organisational policies to protect against the loss of data. Similarly, organisations are not always aware of the difference between lossy and lossless storage, or of how to effectively index large numbers of images. Organisations have been known to lose volumes of information to corruption, compression, or simple neglect.


• Through all of this it is worth bearing in mind that capturing information on site, particularly when using human operatives, is not without risk. Inspecting assets, whether dormant or in use, can expose operatives to a range of hazards, including hazardous substances, working at height, slips, trips and falls, exposure to traffic, even just members of the public with conspiracy theories about 5G antennas. In the wind and the rain, wearing PPE, it can be difficult to use conventional touchscreens and other devices. There is a reason that data capture using paper and pen has persisted for so long. The user interfaces that staff use on site require careful attention to ensure that they are easy and safe to use.

Whether capturing data from a site directly through your organisation, or via suppliers, it is vital that you agree the Information Management requirements, to ensure that this costly, and potentially risky, activity is worthwhile. Effective Information Management requirements will ensure that you can:

• Open files: retain an ongoing in-house capability to receive, search, store, retrieve, and review the information that you receive from surveys. This will reduce the risk of expensive re-surveying costs.
• Sign off deliverables: ensure that the deliverables from site are fit-for-purpose, contain the required information in the required format, and can update your digital records (for example your BIM model).
• Improve decision-making: augmenting your existing BIM model or digital replica with information gathered on site can aid the identification and management of hazards and risks, inform changes to the planning and design of assets, and ensure that the as-built assets conform to the design as well as quality standards. A rich dataset of this nature will provide much of the spatial and attribute data that could inform longer-term investments such as multi-dimensional BIM, Digital Twins, and predictive maintenance.
• Demonstrate compliance: site data capture should form part of maintaining an auditable record of the design and construction decisions made by the organisation (aka the Golden Thread), as well as providing evidence for other regulatory obligations.

On-site data collection will be the source of a large part of the information used within your Common Data Environment. By reducing the manual effort (and manual fallibility) associated with site data collection, Digital Construction technologies can both improve the flow of data into your design environment and connect site operatives more closely to the design. Again, building upon consumer innovations is crucial here. As an example, modern smartphones have advantages over the professional photographic equipment used a decade or two ago. Whilst the cameras on smartphones still lag contemporary digital cameras (DSLRs), they capture far more detailed imagery (measured in megapixels) than the best cameras did 20 years ago, at a fraction of the cost. What's more, thanks to the popularity of consumer image processing products (think Instagram and Snapchat filters, and Google image translation), smartphones now come with sophisticated graphics processing units (GPUs) built in, enabling mobile applications to run sophisticated computer vision functions that, for example, identify objects or text in images. If the capabilities of the phone itself are insufficient, widespread high-speed connection to the Cloud through broadband wireless and 5G means that we can upload unstructured high-resolution data from the site to the cloud for processing. Even where we require dedicated hardware, for example high-definition cameras or LIDAR sensors, this equipment still shares much of its underlying components and software with consumer technology, and thus benefits from many of the same advances in accuracy, affordability, connectivity, and computer vision.
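As a minimal sketch of the kind of computer vision function described above, the following assumes the OpenCV and pytesseract libraries are installed and that a hypothetical site photograph of an asset identification plate exists at the given path:

```python
import cv2
import pytesseract

# Load a (hypothetical) site photograph of an asset identification plate.
image = cv2.imread("asset_plate.jpg")

# Basic pre-processing: greyscale, then Otsu thresholding to make the
# printed text stand out from the background.
grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(grey, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Optical character recognition on the cleaned-up image.
text = pytesseract.image_to_string(binary)
print(text)  # e.g., an asset tag that can be matched back to records in the CDE
```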

6.3.3.2 Digital to Physical

The user interface between operatives working on site and design information has never been seamless. Traditionally, plans were available on paper, and supervisors were responsible for ensuring the delivery of work according to those plans. Following construction, project staff annotated the plans to reflect changes made on site, and then archived these 'as built' drawings for future reference. Whilst BIM allows us to reduce the number of problems encountered on site through better design and clash detection, it does not on its own provide a better mechanism to guide works on site or ensure completion according to design. To fill this gap, we need the ability to:

• Provide design information to operatives on site safely using screens, be that tablet or headset. Headsets have the obvious advantage of not reducing the operative's movements and field of vision in the way that carrying a tablet around would (example vendors: Microsoft, Meta), allowing the technology to give users the impression that data visualisation is overlaid onto their field of vision (aka augmented reality5).
• Take advantage of built-in device cameras to recognise where the operative is and overlay the design onto their screen using augmented reality.
• Use accurate information contained within the design to enable construction of large parts of the works off-site in a factory environment (e.g., modern methods of construction), using additive manufacturing/3D printing and/or components built from standardised/modular parts.
• Use sensors, either within operatives' devices or using standalone hardware, to automatically take measurements on site and identify divergences from design (see the sketch at the end of this subsection).
• Provide a means for colleagues to explore and interact with the design in a 3D environment, to reduce the necessity to visit site and to simulate scenarios.
• Translate this improved awareness of works on site into smart contracts that respond to the level and quality of work undertaken.

Taken in isolation, each of these technologies has a relatively limited set of use cases and may struggle to realise value or adoption. However, paired with a client with sophisticated data management, digital construction, and IoT capabilities, these technologies offer the possibility of pushing the value of digital capabilities from planning and design towards the safe and efficient construction of assets on site (or in the factory). They also have continued application once assets are constructed, offering the potential to bridge the gap between construction and operations.

5 The Economist describes 'extended reality' as "a category which includes both fully immersive virtual reality (VR) and augmented reality (AR), in which computer imagery is superimposed onto users' view of the world around them." These technologies provide access to the 'metaverse' (e.g., virtual reality platforms that can accommodate large numbers of simultaneous users), which may in turn use Web3 technologies such as blockchain and decentralised currencies to provide value to digital assets. It is unclear how widely adopted VR or AR will become over the next couple of decades. Similarly, use cases for the metaverse and Web3 focus on consumers rather than infrastructure. They may remain niche distractions, or one or both may become hardware/platforms as ubiquitous as smartphones and associated app stores are today. However, there are potential use cases for augmented reality in construction, as it may allow us to overlay a richness of information on top of the built environment. The most likely outcome, in the short-to-medium term at least, is that Extended Reality will help to bridge the gap between digital representations of your work (BIM models) and reality on the ground.
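To illustrate the sensor-based divergence check from the list above: a minimal sketch comparing as-built survey points against design coordinates. The asset ids, coordinates, and the 25 mm tolerance are purely illustrative assumptions:

```python
import math

# Design coordinates for assets (illustrative ids; units in metres).
design = {"COL-01": (10.000, 5.000, 0.000), "COL-02": (15.000, 5.000, 0.000)}
# As-built positions measured on site (e.g., by total station or LIDAR).
as_built = {"COL-01": (10.004, 5.001, 0.002), "COL-02": (15.062, 5.010, 0.001)}

TOLERANCE_M = 0.025  # assumed construction tolerance of 25 mm

for asset_id, designed in design.items():
    measured = as_built[asset_id]
    deviation = math.dist(designed, measured)  # straight-line distance
    status = "OK" if deviation <= TOLERANCE_M else "DIVERGENT"
    print(f"{asset_id}: deviation {deviation * 1000:.1f} mm -> {status}")
```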

6.3.4 Digital Heritage: Working with Old and Unique Assets

Whilst this chapter refers frequently to 'construction,' it is worth remembering that built environment projects sometimes focus on the restoration of ageing assets, not construction and enhancement. Assets managed by heritage organisations are a completely different class, with practice sitting somewhere between the built environment and museums. Conventional built assets are usually, to a greater or lesser extent, unique in the specifics of their location, design, and use. However, unlike heritage assets, we can restore conventional assets using complementary replacement materials without significantly harming their character. With a bit of nous that your authors unfortunately lack, you can fix most issues with your home using products readily available at your local DIY shop. With the passage of time, assets become sufficiently divergent from current working practice, and sufficiently rare, that restoration becomes far more specialist. At their finest, heritage assets are quite literally irreplaceable. Whilst restoration will inevitably involve careful use of modern materials, careless application can result in considerable damage to the cultural heritage embodied in these assets, as evidenced by several well-documented botched restorations (Parkinson, 2020).

Heritage projects share select use cases with Digital Construction, such as the need to catalogue and store large volumes of high-quality imagery. However, they bring unique requirements, such as alignment to the Spectrum data standards for collections management (Collections Trust, 2022). To properly care for heritage assets, you need to put in place the following types of capabilities:

• Applying unique references to assets within a collection.
• Tracking the location of those assets using a consistent spatial referencing system.


• Recording the relevant attributes of those assets, including their condition, and tracking those attributes over time.
• Associating media with assets, for example imagery to record the location and condition of assets prior to the works, and storing media without loss (for example, avoiding image compression).
• Allowing colleagues to easily search the records described above and open associated media.

Whilst we have established that heritage assets are in many ways a special case, that doesn't necessarily mean that they cannot benefit from Digital Construction technology. The capabilities described above map to a broader need to manage numeric, document, and imagery information in a consistent and fit-for-purpose manner. As referenced above, a heritage function's requirements to reference assets in space, add attribution, and link to high-quality imagery are like the activities undertaken by a digital design/BIM function. Historic England writes in their BIM for Heritage (2017), "benefits can be expected in construction projects on historic assets… in terms of spatial coordination (through integration of geospatial datasets and clash detection) and conservation planning (through improved visualisation, analysis and options appraisal)" (Antonopoulou, 2017). Where your organisation incorporates both heritage and BIM functions, there is an opportunity to share solutions (for example, tooling for attribution, document management, spatial referencing, and image management), which will likely bring cost savings to your organisation. However, you must do so in a manner that is respectful of the specific data requirements of heritage functions and organisations.
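A minimal sketch of the record-keeping capabilities listed above: unique references, consistent spatial referencing, attribute history, and associated lossless media. The field names are our own invention for illustration, not the Spectrum standard's:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ConditionRecord:
    recorded_on: date
    condition: str  # e.g., 'good', 'fair', 'poor'

@dataclass
class HeritageAsset:
    reference: str   # unique reference within the collection
    easting: float   # consistent spatial referencing system
    northing: float
    condition_history: list[ConditionRecord] = field(default_factory=list)
    media: list[str] = field(default_factory=list)  # paths to lossless imagery

stone_carving = HeritageAsset("COLL-0042", 530123.0, 179456.0)
stone_carving.condition_history.append(ConditionRecord(date(2024, 3, 1), "fair"))
stone_carving.media.append("surveys/2024-03/COLL-0042.tiff")  # lossless TIFF
```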

6.3.5 Geographic Information Systems (GIS)

One cannot write about Digital Construction and BIM without touching upon GIS. At once a technique, a toolset, and a profession, GIS helps us to visualise and analyse the spatial extents of the natural and built environment. It is particularly useful when applied to networks of assets that cover a large geographic area such as roads, railways, water, gas, telecoms, and electric utilities. Upon reflection, the GIS and SCADA/controls professions can be considered the quietly brilliant pioneers of data and information management in the sector. The bottom line with GIS is scale: the GIS professional can collect, query, and action data at a huge spatial scale. They are our closest connection to data management at the scale we see across the digital economy. In our experience it is the GIS discipline that has already married the backbone of information and data management to offer insights at the scale it requires. A simple practical example is Google Maps, the most frequently used GIS system on the planet.

However, in the built environment context, there are better examples that must influence your thinking as a data professional. Consider risk registers on large infrastructure projects. Without GIS, a risk register is a list, generally found and managed within a spreadsheet. Many projects hold risks that


have the potential to put people in harm's way. A mere (and quite frankly boring) spreadsheet cannot render risks in the light that they require. Humans cannot be relied upon to use their imaginations to bring a risk item to life. If you place that risk on a map and manage it through a GIS platform, you have the beginnings of a story. The ambiguity of a risk changes. You see where it is, how far it is from other things, and how, say, five small risks grouped together in space could add up to a serious risk with fatal consequences.

Maps are a form of data visualisation that is ubiquitous in our modern lives, and feels intuitive to most users, whether lay persons or technical experts. For that reason alone, GIS will form part of any data professional's toolkit when working with built environment organisations. In certain circles the mere fact of being able to project data—any data really—on a map will do you a lot of favours. We should be concerned with what further possibilities GIS offers as a means of presenting data, and how it interacts with—or changes how we use—the other applications described in this chapter.

When we introduced data types earlier in this book, we placed them into relatively self-contained boxes. The simplifying assumption was that different data types required different types of storage, different types of visualisation, and lent themselves to distinct analytical use cases. Table 6.4 provides an example of what this might look like in practice.

Table 6.4 Data types

Data type     | Storage                                      | Visualisation                                            | Analytics
Documents     | Electronic document management system (EDMS) | Folder structures                                        | Do I have a complete and auditable paper trail for my work?
Alpha-numeric | Data lake/data warehouse                     | Business intelligence (e.g., charts, graphs, dashboards) | Can I track the key performance measures of my work and respond to emerging trends?
Models        | Collaborative BIM environment                | 3D design models                                         | Can I coordinate the design and layout of my assets within a limited area?
Spatial       | Spatial database                             | Maps, maps, maps, with layers, layers, layers            | Can I coordinate the management and utilisation of my assets across a large area?

In reality there is much more overlap between data types than the table above suggests. This is in part because many built environment use cases necessitate using multiple different data types, and in part because software vendors have an incentive to expand their platforms to meet as many customer requirements as possible. An example might be using a GIS or visualisation tool to make it easier for users to find documented information. Folder structures are not always the most user-friendly way of making documents available to people. No matter how logically organised they might be, folders lack the kind of visual cues that make interfaces

intuitive to users. By using space as an index for your information, you can allow users to explore a map or 3D representation of your assets and click on components or locations to view the associated documentation. It won't work for every use case, as not every document will have an obvious associated location, but it can make it easier for operational users to find documents associated with a particular site or task. This can be a particularly useful approach for documentation specifically associated with locations, for example the presence of environmental or safety hazards in particular spots or areas.

As Fig. 6.6 shows, we can combine alphanumeric, model, and spatial data to provide our users with far more context than is possible in isolation. Most problems in the built environment have performance (e.g., numeric), asset (e.g., model), and location (e.g., spatial) elements. We cannot measure or analyse phenomena such as patterns of asset usage, air or noise pollution, coordination of resources, or the impact of external events such as flooding, without understanding the spatial context in which they arise. As such there will always be a role for GIS in managing large, geographically dispersed portfolios of assets.

Fig. 6.6 Examples of built environment use cases that require using multiple data types

The combination of alphanumeric and spatial data can be a particularly potent means to visualise and explain complex macro-level phenomena such as climate change, globalisation, biodiversity, or environmental phenomena like fires and flooding. ArcGIS' StoryMaps website is full of visually impressive explanations of such topics that resonate with more immediacy than mere tables and charts (ESRI, 2023). They also reveal GIS to be a useful way of visualising changes over time. By maintaining a consistent background layer (e.g., the map) one can show changes to the spread of spatial phenomena (such as growing urban areas) over time. This is a technique with application in both natural and built environment projects: for example, civil engineering projects such as the creation of earthworks, where work happens slowly and over a large area, or rewilding projects, where the extent and progress may not be easy to gauge from the ground alone.

However, integrating data types requires greater maturity in terms of data architecture and system integration. Combining numeric, model, and spatial data involves creating common definitions and data pipelines between specialist (often proprietary) tooling such as BIM and GIS products from vendors such as Autodesk, Bentley, or ESRI. As is a common theme throughout this book, having solid data architecture and data management in place is a precursor to realising value from GIS.
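To make the risk register example above concrete, the short sketch below plots a hypothetical project risk register on an interactive map so that spatial clusters of risk become visible. It is a minimal illustration only: the file name, column names, and severity scale are all assumptions, and it uses the open-source pandas and folium packages rather than any particular enterprise GIS product.

    import pandas as pd
    import folium
    from folium.plugins import MarkerCluster

    # Hypothetical risk register exported from a spreadsheet, with columns:
    # risk_id, description, severity (1-5), lat, lon.
    risks = pd.read_csv("risk_register.csv")

    # Centre the map on the risks themselves.
    m = folium.Map(location=[risks["lat"].mean(), risks["lon"].mean()],
                   zoom_start=12)

    # Clustering nearby markers makes groups of small risks visually obvious.
    cluster = MarkerCluster().add_to(m)
    for _, r in risks.iterrows():
        folium.Marker(
            location=[r["lat"], r["lon"]],
            popup=f"{r['risk_id']}: {r['description']} (severity {r['severity']})",
            icon=folium.Icon(color="red" if r["severity"] >= 4 else "orange"),
        ).add_to(cluster)

    m.save("risk_map.html")  # an interactive map anyone can open in a browser

Even this trivial map begins to tell the 'story' described above: five small risks that look unremarkable as spreadsheet rows become conspicuous when they cluster around the same junction.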

6.4 Data Analytics and Business Intelligence (BI)

A key motivation is to reserve human time and effort to those activities that we are best suited to; ingenuity, wisdom and compassion.—Richard Carter, Cognitive Advantage (Carter, 2021)

To build a data-driven organisation, you need to ensure that individuals can access and derive value from trusted, fit-for-purpose information. Business Intelligence and self-service analytics (henceforth BI) add value in several ways, including:

• Making meaningful data securely accessible to everyone within an organisation. The ability to structure and analyse that data to create information, knowledge and wisdom is dependent upon the skills and experience of staff members. As such, the more that we can provide them with straightforward ways to work with data in a structured and repeatable manner, without relying on manual manipulation, the more likely staff are to make use of approved reporting data sources when faced with a decision.
• Standard models and reports allow users to 'self-serve' trusted information without extensive work manipulating data. Done well, this reduces the amount of staff time consumed creating reports and measures, reduces the likelihood of error, and allows the broader organisation to take advantage of the specialist analytical skills of the business intelligence and data teams who build those models.
• The use of standard models and reports reduces the divergence in 'truth' across the organisation. It makes it possible for the organisation to agree performance metrics with defined data sources and calculation methods.

To make effective use of BI you must address the following topics:

• Process: The process that the business follows from data question to answer.


• Life cycle and scaling: What the life cycle of a business intelligence solution looks like, and the process for piloting, prototyping, and ultimately scaling up successful initiatives.
• Architecture and infrastructure: The data engineering, architecture, and platform infrastructure required to support business intelligence, and how this varies across the life cycle.
• Human capital: The psychology of how people do (or don't) use data to make decisions, and how you can nudge staff towards data-driven decision-making.

The structure of this section addresses these four topics in order.

6.4.1 Process

Business Intelligence is synonymous with dashboards, but the humble dashboard is merely the most visible manifestation of a much broader data management and business engagement process. The diagram below summarises the steps involved in running an effective business intelligence function. Figure 6.7 is an abstraction of the data management components of a defined solution. To deliver value through BI you would need to complement this data management process with a means to capture user requirements and translate these into data flows. The central boxes on the diagram show the path of data (and ultimately intelligence) through the process (Fig. 6.8).

6.4.1.1 Data Sources

Data Sources will obviously include IT systems (for example Oracle, or Primavera P6), but will also include offline records, spreadsheets, people, suppliers, and IoT devices.

Fig. 6.7 BI process—outline


Fig. 6.8 BI process—as implemented

6.4.1.2 Data Collection

Data collection methods will vary depending upon the type of data sources used. For sophisticated systems and devices, the goal should be connection to data via an API, but for less mature or smaller-scale use cases manual extraction of data (for example via CSV files) will not be uncommon. As an organisation you should apply some prioritisation logic to which use cases you do and do not seek to automate. Automation can be expensive. It may not be worth automating a data flow unless it consumes several hours of labour per week (on average) to conduct manually, unless automation can avoid timing issues (for example, a process that would otherwise require someone to work evenings or weekends), or unless manual processing brings an unacceptable risk of errors.

Where it is necessary to extract information from unstructured datasets—such as PDF reports and archive documents—data collection may rely on intensive manual processing, and tooling such as Optical Character Recognition (OCR) and Natural Language Processing (NLP) could bring sizeable improvements to the scale of data collection that is possible. That said, at present there are few solutions trained to work with built environment data, in part because limited data sharing means that the potential training data set remains proprietary and disaggregated.

When it comes to human beings, the success or failure of data collection will often depend on user interface design. Easy-to-use interfaces, with built-in data validation, will solicit more and better data from human beings than clunky ones. As an example, a large infrastructure operator's safety reporting system used to start by requesting that users choose from an immense list of office locations before reporting a near miss or safety event. A user-focused interface would ensure that staff were able to record details of the safety incident first (perhaps even allowing options like recording a voice memo or uploading an image). If location information was vital, a modern user interface would use the device location or offer a shorter list of 'nearby offices' to choose from. Each of these steps will reduce the 'friction' felt


by the user, making it more likely that they will finish submitting their data, and that they will come back to submit more data in the future.

It is worth noting that much of our data will come to us via the supply chain, and this is often the weakest link in the process. If you do not specify the type and format of data that you expect to receive from your suppliers, then you will likely receive the bare minimum: semi-structured spreadsheets and PDFs. This transfers the burden of data extraction back to your own staff and reduces the likelihood that they will put that data to effective use.
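The prioritisation logic described above often comes down to a back-of-envelope payback calculation. The sketch below shows one way to frame it; every figure in it is an assumption for illustration, not a benchmark.

    # Rough automation break-even, with assumed figures.
    hours_per_week_manual = 3      # analyst time spent on the manual data flow
    loaded_rate = 60               # cost of an hour of that time (currency per hour)
    build_cost = 8_000             # one-off cost to automate the flow
    upkeep_per_year = 500          # ongoing maintenance of the automation

    annual_saving = hours_per_week_manual * loaded_rate * 46  # working weeks
    payback_years = build_cost / (annual_saving - upkeep_per_year)
    print(f"Payback in roughly {payback_years:.1f} years")

Error risk and timing constraints (the evening and weekend working mentioned above) do not appear in this arithmetic, which is precisely why they deserve explicit weight in any real prioritisation exercise.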

6.4.1.3 Data Aggregation

Data aggregation, as the name suggests, is a necessary part of the process where your business intelligence uses data from multiple sources. It is generally considered poor practice to run analyses directly on source information due to the risk of slowing live processes or even corrupting source data. The approach to data aggregation must reflect the type and quantity of data in question, and the methods used to transform and analyse it. In all cases you should have a means to undo mistakes. For simple use cases it may be sufficient to hold the raw data in a ‘backup copy’ of a single spreadsheet or file, whereas a modern cloud data platform will often push data through many layers of ‘curation’ to ensure that it is always possible to ‘roll back’ to the original raw data.
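As a minimal sketch of that layered approach, the snippet below uses local folders to stand in for the zones of a data platform: the raw zone is written once and never edited, so any curation mistake can be undone by rebuilding from raw. The folder names and cleaning steps are illustrative assumptions, not a prescribed design.

    from pathlib import Path
    import shutil
    import pandas as pd

    RAW = Path("lake/raw")          # immutable copies, exactly as received
    CURATED = Path("lake/curated")  # cleaned and typed, rebuildable at any time
    for zone in (RAW, CURATED):
        zone.mkdir(parents=True, exist_ok=True)

    def land(source_file: str) -> Path:
        """Copy the source file into the raw zone untouched (the 'backup copy')."""
        target = RAW / Path(source_file).name
        shutil.copy(source_file, target)
        return target

    def curate(raw_file: Path) -> Path:
        """Derive a cleaned copy; raw is never modified, so we can always roll back."""
        df = pd.read_csv(raw_file)
        df = df.drop_duplicates().dropna(how="all")
        target = CURATED / raw_file.name
        df.to_csv(target, index=False)
        return target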

6.4.1.4 Data Transformation

Data transformation is the process whereby you ready your data for analysis. Reasons for transformation include:

• Joining data from diverse sources by using (or creating) 'common keys'. In a built environment organisation this will usually mean ensuring that different datasets refer to some common taxonomy like an asset hierarchy, work breakdown structure, cost breakdown structure, project codes, room ID, or coordinates.
• Aligning your source data to your defined organisational data models. This may be to make the product of your analyses easier for people to understand by removing system-specific table and field names. Or it may be because there are errors in how your systems structure data that you have not yet remedied at source.
• Reducing the quantity of information pushed through to the analysis stage. For example, if the analysis only requires daily averages, but the data is available hourly, then you may wish to generate a daily summary table from the hourly figures.
• Identifying and remediating or removing outliers and errors that would skew your analyses.


• Calculating measures/metrics/KPIs from your data. Often considered part of data analysis, metrics are really a transformation of data in accordance with a predefined formula.

Several of these steps appear in the short sketch below.
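This is a minimal pandas sketch of the transformations listed above: joining on a common key, renaming a system-specific field, trimming outliers, summarising hourly data to daily, and calculating a metric from a predefined formula. The file names, column names, and the 7.1 mm/s alert threshold are all assumptions for illustration.

    import pandas as pd

    readings = pd.read_csv("sensor_readings.csv", parse_dates=["timestamp"])
    assets = pd.read_csv("asset_register.csv")

    # Join on a common key from the asset hierarchy.
    df = readings.merge(assets[["asset_id", "asset_name", "region"]],
                        on="asset_id", how="left")

    # Align to organisational naming by replacing a system-specific field name.
    df = df.rename(columns={"val_raw": "vibration_mm_s"})

    # Remove obvious errors and extreme outliers.
    df = df[df["vibration_mm_s"].between(0, df["vibration_mm_s"].quantile(0.99))]

    # Reduce hourly readings to a daily summary table.
    daily = (df.set_index("timestamp")
               .groupby("asset_id")["vibration_mm_s"]
               .resample("D").mean()
               .reset_index())

    # Calculate a metric in accordance with a predefined formula:
    # the share of days each asset spends above an alert threshold.
    daily["in_alert"] = daily["vibration_mm_s"] > 7.1
    kpi = daily.groupby("asset_id")["in_alert"].mean().rename("pct_days_in_alert")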

6.4.1.5 Data Analysis

Data analysis is, conceptually at least, the stage at which you generate information (or intelligence) from your data. The type of analysis that you use will need to match the use case driving that analysis, with due acknowledgement that the forms of analysis further down the list are likely to be both more expensive and more data intensive.

• Descriptive analysis: creating a time-series plot or location heatmap so it is easier for consumers to identify patterns. This could mean tracking a metric such as 'Earned Value' over time.
• Exploratory analysis: includes techniques like running a regression model to identify the correlation between variables. For example, construction projects may use regression analysis to better understand factors that contribute to unwanted incidents such as delays or safety events (a minimal sketch follows this list).
• Inferential analysis: taking data from a representative sample and using it to extrapolate broader patterns. For example, you might compare a detailed sample of project cost estimates against actual costs and use this to provide a benchmark in terms of whether you are typically underestimating or overestimating costs, and where you are most likely to make errors.
• Predictive analysis: using data to identify trends and extrapolating these into the future. Construction projects do this all the time, often quite mechanistically through dependencies in Gantt charts. Data analysis can help us to better predict future performance and risks to measures such as cost, schedule, and safety.
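As a minimal sketch of the exploratory case, the snippet below fits an ordinary least squares model relating schedule overrun to a few candidate drivers. The dataset and column names are assumptions; the pattern is what matters, and it scales from a few dozen projects in a CSV file to a full portfolio.

    import pandas as pd
    import statsmodels.api as sm

    # Hypothetical project-level dataset: one row per completed project.
    projects = pd.read_csv("project_outcomes.csv")

    # Candidate explanatory variables for schedule overrun.
    X = sm.add_constant(projects[["design_changes",
                                  "weather_delay_days",
                                  "supplier_count"]])
    model = sm.OLS(projects["overrun_days"], X).fit()

    # Coefficients and p-values hint at which factors merit closer attention.
    print(model.summary())

Correlation is not causation, of course: a model like this earns its keep by prompting better questions, not by settling them.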

6.4.1.6 Data Visualisation

Data visualisation is, as the name suggests, how most people encounter the outputs of data analysis. It is also, crucially, a user interface: no longer merely surfacing static graphs to users, but also providing them with the tools to conduct parameterised 'self-service' analysis of their own. For example, creating dashboards that allow users to 'drill through' the data means that each individual dashboard you create can enable a level of analysis on the part of the viewer. Effective data visualisation can be the difference between a message landing and an organisation making a data-driven decision, or not. 'Self-service analysis' can also reduce the pressure on the data analysis teams by allowing users to make minor changes to how charts are presented without having to consume the time of the analysts themselves.

The centrality of visualisation to the business intelligence process means that consistency of presentation and uniformity of aesthetics (as well as accessibility concerns) are vital to the effective use of the function. Staff within the organisation should know,


without needing to ask, whether they are using an officially sanctioned business intelligence output or a shadow BI fudge. There are entire textbooks written on how to make data visualisation meaningful, appealing, accessible, and truthful. User-centred design and user experience considerations are important if you expect users to engage with and manipulate data rather than just absorb it, and to avoid alienating users with colour blindness, dyslexia, poor sight, or other conditions.

Many built environment organisations, particularly those in the public sector, maintain labyrinthine and hierarchical governance and reporting structures. Whilst these structures are not much fun to work within, they can be an open door in terms of integrating better data practices into the functions of an otherwise data-illiterate organisation. Where an organisation does not manage data effectively, it is not unusual for divergent versions of reality to emerge and come into conflict as part of the reporting process. Positioning data and business intelligence as a means for management to ensure that their meetings can rely on a common source of truth for reporting may be a compelling and high-profile use case in many such organisations.

6.4.1.7 Use Cases

It is worth noting that whilst the earlier diagrams read left-to-right (data to outputs), your BI team will usually think right-to-left (outputs back to data requirements). As they are amongst the most embedded data specialists in the business, BI analysts need to follow the 'pull' of the business, rather than 'pushing' pre-defined analyses. Usually this is best accomplished iteratively: starting out by showing stakeholders some preliminary insights and inviting feedback and questions, and then revising and repeating. People suffer from 'blank page syndrome' and find it easier to articulate what they want when they have a benchmark example in front of them.

This is an area where the broader data team should look to complement the BI team. The BI team are your scouts on the front line: as they identify areas of demand from the business, you should look to 'productionise' those use cases that are most frequently requested. Fundamentally, analysis without use cases is a solution looking for a problem. Whilst you should never prejudge what you are going to discover in your analyses, you should have some notion of where the value lies. Otherwise, you may not appropriately prioritise the use of scarce resources.

6.4.1.8 Outcomes

If use cases are the little (micro, tactical) why, outcomes are the big (macro, strategic) why. Outcomes should be broad brush, reflecting a traceable connection between the analyses you conduct and the strategic objectives of the organisation. It is likely that for every outcome there will be numerous use cases describing the analysis required on each step of the journey towards the realisation of that outcome.


6.4.2 Staff and Stakeholders

As we covered earlier in the book when we discussed the limits of human decision-making, all data analyses and business intelligence realise value only through cognitive agents acting in the real world (at least for the time being). Data analysis exists to inform and influence the decision-making of cognitive agents. Data, used well or used poorly, can impact the strategic direction of your work. The hope is that by correctly designing, scaling, and maturing your ways of working you can ensure that staff have access to the information and insight that they need to ensure the success of your works.

Business Intelligence is an evolving capability, from migrating users out of long-treasured spreadsheets, to more sophisticated low-code/no-code tooling that promises to provide non-specialist users with analytical powers previously reserved for data scientists. The underlying process of data management through to outcomes does not change, but where you see increasing demand and data loads you should look to 'productionise' and 'automate' parts of the process using cloud technologies to reduce the manual handling and associated resource-intensity of the data management.

6.4.3 Prototyping and Scaling

Delivering insight through business intelligence is not a 'one-size-fits-all' pursuit. Projects are not always predictable, nor are stakeholders. Over the course of the project or asset life cycles, there will be some questions that an organisation needs to be able to answer with regularity and predictability (for example, annual expenditure, headcount), some that pertain to specific project stages (for example, design progression, earned value), and many other ad hoc queries arising circumstantially. To provide the tools and analyses that the organisation requires you need to be able to triage demand, identify common requirements, design fit-for-purpose solutions, and ensure that the services of the IT and Business Intelligence functions are complementary.

There is, within infrastructure-owning organisations, a tendency for IT and analysis/performance management functions to have developed in isolation. Decades ago, there may have been relatively little overlap between the teams that kept tabs on the performance of transport networks or utilities, and the IT departments within those same organisations. Nowadays, the importance of data and digital in everyone's work means that these teams must compromise and cooperate to succeed.

There is a continuum of business intelligence, from comparatively simple projects that can be executed on an ad hoc basis, to those which would demand specialist IT support even to scope out. Whilst every analysis will be a little bit different, you can trace how complicated an analytical project is likely to be in terms of a few axes shown on Fig. 6.9, namely:

• Repeatability: how often are you going to want to run this analysis? Is the analysis consistent or does it change?
• User base: how many people will use this analysis?
• Data intensity: how much data will this analysis consume?
• Maturity: are you still prototyping this analysis, or is it part of a mature business process such as a defined KPI? Is the data structure defined and controlled, or evolving and changing? Maturity might also include concerns such as traceability or auditability: to what extent does the analysis need to be reviewed?

Fig. 6.9 Axes of analytical maturity

In general, the more repeatable, data-intensive, and mature an analytical process becomes, and the larger the user base grows, the more 'enterprise grade' you will want your analyses to be. This applies both to the infrastructure that the analyses use and to the level of auditing and documentation associated with their development. One cannot expect your analyses to come into being, fully formed, as 'enterprise grade' solutions. Instead, the life cycle of a successful analytical solution will see it transition from small-scale and tactical, to large-scale and mature. Note the word successful. Many analyses will never need to be 'scaled up' because they will not prove to have enduring value beyond the prototype stage. This is fine: business needs are transitory, and it would be foolish and wasteful to 'gold plate' every solution before it's had time to prove its worth. To this end, it is useful to spell out a high-level prototyping process that you expect analytical use cases to follow. This should draw from external practice such as GDS' agile service manual (Government Digital Service, 2023), or Alteryx's Data Science Life Cycle (Sydney, 2019). Organisations should evaluate the benefit of use cases against a standard set of benefits categories.


However, business intelligence use cases, as they are already business-driven, may require less formal assessment than data science use cases.

As data professionals we should seek to involve ourselves in the development of analytical solutions across our organisation at the right time in their evolution. There are many fascinating use cases for analysts and data scientists to uncover within built environment organisations. You should provide your colleagues with the tools that they need to experiment and try out ideas, and keep an eye on their progress. If you see something promising you should be ready to swoop in and use your data specialists to bring some horsepower to the problem. This will typically mean 'light touch' involvement in early self-service use cases, with more intensive involvement in aiding successful solutions to scale up. You should take a particular interest in ensuring that, by the time analytical solutions are in intensive use across the organisation, they are (a) properly documented, (b) automated wherever possible, (c) running on 'enterprise grade' infrastructure that aligns to your approach to data management, and (d) not reinventing the wheel architecturally (in other words, making use of a common data platform and shared services).

That said, large parts of your business may not be ready to adopt BI tools of their own volition. So, you may need to sponsor some early demonstrator projects to create that first spark of data imagination amongst your colleagues.

6.4.4 Infrastructure—Warehouses, Lakes, Mesh

A data-driven organisation requires a 'decision-centric architecture'. An organisation's data infrastructure must be able to adapt to meet problems of varying scale, repeatability, and consumer maturity. Building too much infrastructure and process around small-scale, infrequent, unsophisticated questions is like using a sledgehammer to crack a nut. Conversely, you will likely have some core reporting requirements that necessitate frequent, large-scale data processing. You should seek to 'productionise' these core requirements, taking advantage of enterprise-grade cloud infrastructure that meaningfully reduces the manual labour associated with delivering these use cases, whilst making the information more widely accessible and reducing the risk of error.

The diagram below shows how you might vary your infrastructure in accordance with the repeatability and scale of your analyses. It sets out three broad infrastructure paradigms (note: we focus on Microsoft products as they tend to dominate the public sector office environment, but there are alternatives available for each of these use cases):

• Microsoft (Office) Apps: Business Intelligence delivered through the default Microsoft Office toolset. Data analysts boast proficiency with Excel, SharePoint, and PowerBI. They usually share this toolset with the wider organisation (e.g., there is nothing stopping other business users from opening up PowerBI and making


their own dashboards). Consequently, business intelligence is necessarily small-scale and labour intensive, and often lacks an audit trail. Nonetheless, the specialists in the team can still deliver substantial value to their organisation, particularly by tailoring their work to the needs of their organisation's leadership.
• Power Platform: Where use cases are sufficiently compelling and complicated to warrant scaling up, and particularly where users need to interact with and/or input additional data, there may be a need to make use of Microsoft's Power Platform. Transition from Office to Power Platform is not all or nothing, as this tooling still plays nicely with Excel, SharePoint, and particularly PowerBI. Whilst Power Platform is a cloud-native, scalable means to manage data and automate processes, it fills a niche for highly configurable form-based data interfaces previously occupied by Microsoft Access. Many of us will have interacted with Access databases and associated VBA-macro-driven forms as a stopgap means to interact with catalogues of data. Similarly, Power Platform can work as a useful middle ground where your use cases have a larger user base, or demand more data storage, manipulation, or transformation than is practicable using a spreadsheet, but where the use case does not warrant investment in enterprise-grade solutions.
• Public cloud platform-as-a-service (PaaS): the cloud makes it possible to make use of far greater volumes of data, far more quickly, than is possible using one's laptop. Such is the importance of the cloud in the way most modern organisations work that even paradigms 1 and 2 will take place in the cloud. However, where the cloud really comes into its own is in large-scale data processing. PaaS solutions—such as Microsoft's Fabric or Databricks' Delta Lake—allow organisations to deploy almost unlimited storage and processing without installing any software or hardware. Not every use case or analysis will justify making use of public cloud PaaS. For one, it can be more expensive to deploy cloud PaaS services, at least up front. This is largely because cloud PaaS requires far greater data competency and technical discipline, which in turn often leaves organisations reliant upon specialist support and consultancy. Moreover, unlike Excel, and to a lesser extent Power Platform, staff have less opportunity to 'fettle' with the data processing once it is implemented as PaaS. Both factors mean that the public cloud PaaS paradigm will only be cost-effective and user-friendly for use cases involving mature analytical solutions with large user bases.

Ironically, whilst each of the three paradigms above differs substantially from the others, they all focus on the data aggregation, transformation, and analysis stages of the BI process. Consequently, it is likely that they will all culminate in the visualisation of data using PowerBI, Tableau, Qlik, or equivalent. It is testimony to the flexibility of products like PowerBI that they are just as capable of consuming terabytes of data from a cloud database as they are kilobytes of data from a spreadsheet. If analysts make the right aggregation and transformation decisions along the way, then the end users may not even notice the difference. As the 'organisation/solution maturity' arrow on Fig. 6.10 conveys, a solution (and indeed the organisation as a whole) may progress through the three paradigms during its lifetime and 'scale up'.


Fig. 6.10 An analytical maturity journey

There will always be different architectural patterns suited to distinct types and scales of problems. The purpose of enterprise data management is to ensure that fit-for-purpose data is securely available to users when they require it. Data management is only valuable because of the actions and decisions that it allows users to take. Likewise, BI and analytics are only possible and reliable when built upon a foundation of fit-for-purpose data management.

Alphanumeric data (structured numbers and text) are the lifeblood of any large organisation. They are what allow us to make plans, keep track of progress, control cost and risk, and cascade information. All built environment organisations will use alphanumeric data. Some disciplines, such as cost managers, project controls/PMO, or performance analysts, will use it intensively; others will do so more sporadically. As a rule, built environment organisations have struggled to move away from an ad hoc spreadsheet-and-email-based approach to managing their information. However, the more mature organisations in the sector are investing heavily in creating trusted and common datasets from which users can 'self-serve' their reporting requirements.

Cloud computing, combined with the easy deployment of technologies such as data lakes, data warehouses, and data mesh, and their integration into common BI tools, means that organisations can create scalable enterprise data solutions more quickly and inexpensively than ever. However, the challenges facing built environment organisations include the upfront cost of deploying such environments, migration away from legacy systems, and ensuring adoption of these new ways of working. Even with the readily apparent limitations in terms of the sharing and accuracy of data, users still


value the sense of control and flexibility associated with spreadsheets, as well as, in some cases, the ability to edit the official truth before reporting upwards.

In a previous role, your author designed and deployed an enterprise data architecture platform, built from standard components from a major cloud vendor, which he unimaginatively named Data as a Service (or DaaS for short). This platform, which has since grown rapidly, provided a single place for staff to mirror data from operational systems for the purposes of reporting, business intelligence, analytics, and data science. The key technical capabilities of systems like this include:

• Providing business users with access to pre-modelled and verified data aggregations from a data warehouse or data mart, allowing them to quickly access and start using meaningful data from one or more systems in an easy-to-use BI tool.
• Granting data scientists access to larger unstructured raw data sources from a data lake, along with the analytical tooling required to build and test models.
• Means to signpost an organisation's data landscape to users, be that via an ontology, a data catalogue, a business glossary, or more conventional data models. These techniques should help to ensure that the platform presents users with data using business-friendly terminology, and that it is clear who is responsible for each dataset.
• Pipelines for automating the upload of data from business systems.
• Affordable and massively scalable data storage, with built-in security and disaster recovery.
• Tooling to support and continuously improve the platform, to automatically deploy and switch off resources, and to monitor usage and cost.
• Means to securely share data within and outside the organisation using enterprise tools such as Active Directory, role-based access control (RBAC), single sign-on, and APIs.

By taking a modular platform approach, organisations can build for longevity. Making use of 'platform as a service' offerings means that the tooling should continue to improve over its life cycle as the cloud services mature. Modular and containerised architecture means that when better options appear, organisations can switch out components without incurring additional development or migration costs. Using widely adopted white-label tooling increases the likelihood that organisations can support and run the platform in-house, and/or access a wide talent pool of individuals on a permanent or temporary basis to deliver improvements.

Built environment organisations stand to benefit from the flexibility that modular platforms promise, as they often face meaningful constraints in terms of development budgets, salary caps, governance processes, and information security policy. Whilst there are no silver bullets in this context, data professionals seeking to deploy capability quickly will stand a better chance if they advocate for a platform approach that uses widely adopted and broadly supported technology, and interfaces nicely with productivity tools already in use within the organisation.


6.4.5 Human Psychology and BI

You can lead a horse to water, but you can't make it drink.—Old English proverb

In business intelligence, adoption is everything. You can provide your staff and stakeholders with best practice business intelligence, but if they don't believe, acknowledge, use, or act on the intelligence then the organisation will derive no value from the exercise. There are at least two aspects to this problem: (1) organisational maturity, and (2) individual psychology.

Much of this book addresses the former aspect, organisational maturity. The presumption is that if the organisation implements the right level of data governance, data management, business intelligence, user engagement and training, etc., then there will be nothing stopping the organisation from making effective use of data, from being 'data driven'.

The question of individual psychology receives far less attention, but that does not mean that it is less important. Perhaps this is because the problem feels less accessible to our working theory of how organisations work. Like the economics profession before us, we tend to assume that the organisations we work for are composed of 'rational actors'. In practice, human decision-making is not a predictable linear process. Are we creatures solely driven by a haphazard set of priors that we have unconsciously hard-wired into our brains? Can one really integrate 'knowledge' at all? Is what we call decision-making merely retrospective rationalisation of the subconscious? And most importantly, what does that mean for BI!?

There is a discrepancy between how we think that organisational decision-making should work, and how it works in practice. Earlier in this book we identified some of the causes of poor decision-making in the built environment sector, including:

• Time lag between events and observation.
• Time lag between observation and action.
• Inaccuracy/incompleteness/probabilistic nature of observation.
• Errors introduced by computerised manipulation, aggregation, and abstraction of data.
• Errors introduced by user interfaces placed between people and the built environment.
• Interpretation of information based upon biases, inaccurate priors, and unchallenged intuition.
• Decision-making without an informed and conscious logical framework.
• Additional time lag, compromise, misinterpretation, and complexity added by the 'decision-ratification' process from individuals to organisations.
• Misinterpretation of organisational decisions by the individuals expected to execute on those decisions.

The causes listed above are firmly rooted in human behaviour, and we cannot address them with a quick bug-fix or hardware upgrade. Indeed, these causes are


usually subconscious, and there is no way of entirely controlling for or designing out individual bias, miscommunication, misinterpretation, or organisational dynamics. Similarly, built environment organisations still harbour some ways of working that are a hangover from a 'command and control' era. This era rarely valued dissent and diversity of opinion, and regarded 'governance' and 'leadership' as merely opportunities to judge rather than to collaboratively solve problems.

But you can build some 'fail safes' around these unavoidable shortcomings. As individuals, we are effective decision-makers in contexts where we've had the time and discipline to consciously create a logical framework for ourselves. Where we've practised, sought feedback, questioned ourselves, learned from our mistakes, learned from others, identified and challenged our biases, and constantly updated our priors. Rather than seek to definitively use data to 'answer' the strategic challenges that we face, we should instead seek to use business intelligence as a test bed for ideas, a means of incrementally arriving at insight and direction. In this context organisational governance and decision-ratification processes can function as a structured means of obtaining more data to inform decisions over time. Implementing this model would require organisations to embrace a far more nuanced relationship to risk, and far more agile processes for creating, evaluating, and growing or killing minimum viable products (MVPs). But the payoff could be an organisation that can respond to complex environments and emergent patterns, and which can add value quickly, rather than simply acting to enforce consensus and remove perceived risk.

Business intelligence cannot deliver 'data-driven decision-making' in isolation, but it can do a lot of the heavy lifting. It is important to your data strategy because data analysts can and will function as the chief data interpreters of the organisation; they will often be the first to respond analytically to emerging situations and may find themselves acting as the 'conscience' of the organisation. Your data strategy cannot succeed without them: the capabilities on your roadmap depend on their adoption. BI provides a step towards agreeing on the 'objective truth' for your organisation by creating and consistently using trusted datasets.

6.5 Data Science and Artificial Intelligence (AI)

6.5.1 The Sub-four-minute Mile

We rational humans have largely given up on numerology. Rather than ascribe significance to nice round numbers, we recognise that numbers are arbitrary measures. The magic of numbers isn't in the numbers themselves, but in what they allow us to measure and model about the world. And so, it's interesting to identify areas where numbers still hold significance for their own sake. One of these areas is in sports, where thresholds can become


significant by virtue of their being nice round numbers. Think of running a sub-two-hour marathon, for example. Running a marathon in 1 h 59 min and running a marathon in 2 h 1 min are both incredible feats of athleticism, and arguably the former is only slightly more impressive than the latter. Succeed at the latter and you'll probably win a few races; succeed at the former and you'll be in the history books, and schoolchildren will learn your name 100 years from now. This is a shame, because running, like most disciplines, is a story of incremental progress. Successive runners push the benchmark forward slightly, and when someone runs the first sub-two-hour marathon, it will be at least in part due to the work done by, and the lessons learned from, that runner's predecessors.

One runner who benefited enormously from being the first to cross an arbitrary numerological boundary was Roger Bannister. On May 6th, 1954, he became the first person (recorded, at least) to run a mile in under four minutes. Before him, no-one; after him, plenty, and in quick succession too. As the Harvard Business Review documents, "just 46 days after Bannister's feat, John Landy, an Australian runner, broke the barrier again, with a time of 3 min 58 s. Then, just a year later, three runners broke the four-minute barrier in a single race (Taylor, 2018)."

There are two explanations for why others repeated Bannister's accomplishment so quickly. HBR suggests that Bannister's success changed other runners' expectations: it made them realise that running such a time was indeed possible, and this in turn gave them a psychological edge when they attempted the same feat themselves. There is probably some truth in this narrative. Our narrative is simpler, but not necessarily contradictory. We reckon that someone was bound to do it at some point around the time that Bannister did; he just happened to get there first. The record for the mile had dropped from 4 min 14 s in 1915, to 4 min 6 s in 1931, and in 1942 a Swedish runner named Gunder Hägg had pushed the record down to 4 min 1.4 s, less than 1% slower than Bannister's historic time. It was going to happen, just as the sub-two-hour marathon will likely happen in the coming decades.

Why this tangent into running records? Well, for one thing it makes a nice interlude from incessant talk about data. But also because the four-minute mile is a useful metaphor for the application of data science to the built environment. Like running sub-four-minute miles in the mid-1950s, to those that weren't paying attention it may have seemed to come out of nowhere. At some point, perhaps in the late 2000s, nobody in the built environment was doing data science. And then shortly afterwards, anyone who had any pretensions of athleticism (or in this case data savviness) was at it. Nowadays, the idea of applying data science to the built environment is everywhere: many clients have data science teams, data science is one of the fastest growing professions/disciplines in the consultancies that work for those clients, and every software vendor worth their salt is integrating AI features into their offerings. The only thing that is perhaps lacking, for the built environment sector at least, is that iconic moment, akin to Bannister's mile, which catalyses the mind and proves that what many thought impossible is truly possible.


The story of data science use by built environment organisations follows a familiar pattern of adapting techniques from other sectors and understanding how to apply them to unique challenges. Unlike organisations at the forefront of big data, we are not creating our own languages and libraries from scratch. Rather, in this sector, innovation takes the form of having the imagination and lateral thinking required to identify which built environment problems data science can solve. Examples of promising uses of data science in the built environment include:

• Identifying the leading indicators of cost and schedule overruns on complex projects.
• Optimising the scheduling of works to make best use of closures of infrastructure (think road closures or railway possessions).
• Predicting asset failure based upon monitoring and condition data.
• Improving load balancing on complex networks, whether that load is gas, or water, or even human beings.
• Automating image recognition across large datasets, for example images of assets such as rails, paving, bridges, or tunnels taken from train or vehicle-mounted cameras, or even samples from environmental monitoring.
• Natural language processing and other types of sentiment analysis to keep track of customer/stakeholder/staff sentiment.
• Automated/generative design of new assets, and the synthesis of large volumes of historic records into fresh insights.

Data science is potentially your best opportunity to tailor the outputs of your team to the specifics of your organisation. Whilst it is often foolhardy to build custom or bespoke software, due to the likelihood that a highly developed SaaS offering already exists to meet most needs, data science still thrives on writing code (for now at least) and the agile development of prototypes. It is the most creative part of the data profession, and in the right circumstances can often be the fastest moving. Used well, data science prototypes can show your organisation the art of the possible and help to justify investment in the more prosaic data management capabilities that make the whizzy exciting stuff possible!

6.5.2 Definitions

Let's begin by clarifying a few of the terms that we are going to use in this section, as they are often (incorrectly) used interchangeably:

• Artificial Intelligence (AI): Artificial intelligence leverages computers and machines to mimic the problem-solving and decision-making capabilities of the human mind (from IBM).
• Machine Learning/Deep Learning: subsets of Artificial Intelligence that train AI algorithms using large datasets and/or extensive automated training.


• Data Science: Data science combines the scientific method, mathematics and statistics, specialised programming, advanced analytics, AI, and even storytelling to uncover and explain the business insights buried in data (again, from IBM).
• Predictive Analytics: Predictive analytics is a branch of advanced analytics that makes predictions about future outcomes using historical data combined with statistical modelling, data mining techniques and machine learning (once again, from IBM; they've got all the best definitions).

Similarly, it's worth making the following role definitions:

• Data Engineer: someone who collates and structures data.
• Data Analyst: someone who uses data for reporting and business intelligence (BI).
• Data Scientist: someone who applies models or machine learning to data.

Using the definitions above, we can think about what distinguishes the discipline of data science from data analysis. This is important because many built environment organisations have long-established analytics practices (sometimes given names like 'performance' teams). Consequently, when the time comes to invest in data science capabilities there can be confusion and friction, in people's minds at least, between the remit of the existing analytics practice and the new data science practice.

Admittedly, there is often a blurred line between data analysts and scientists. Adept analysts will sometimes make use of data science techniques, and vice versa. However, the principal difference is the creation of models. Data analysis seeks to describe reality; data science seeks to model reality. The distinction here is that a piece of data analysis can report on a phenomenon without needing to create a theory as to how that phenomenon works. A data analyst might describe the change in an asset's condition, but you would need a data scientist to create a statistical model that predicts how that asset's condition will change in the future, or a machine learning model that consumes sensor data for similar assets to create an early warning system. Data analysis is a valuable practice, but it is one of describing what has happened, rather than anticipating what will happen.

Data science is a more specialist and flexible pursuit, and an adept data scientist can use a range of techniques to realise value. These techniques include machine learning and artificial intelligence, both of which are at times used interchangeably with data science, but which are just tools within the data scientist's toolbox. Some problems will benefit from AI; others will be better suited to statistical techniques such as regression analysis. The data scientist's job is bigger than any one technique: it is to design models of parts of reality, and to use those models to help us understand how we can influence reality.
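The asset condition example above can be made concrete in a few lines. The figures below are invented for illustration; the contrast between the two halves is the point: the first describes what has happened, the second fits a deliberately crude model and uses it to anticipate what will happen.

    import numpy as np

    # Hypothetical annual condition scores for one asset (100 = as new).
    years = np.array([0, 1, 2, 3, 4, 5])
    condition = np.array([100, 97, 93, 90, 84, 79])

    # Data analysis: describe the change that has already happened.
    print("Mean annual deterioration:", np.diff(condition).mean())

    # Data science: model the deterioration and predict the future.
    slope, intercept = np.polyfit(years, condition, 1)   # simple linear model
    threshold = 60                                       # assumed intervention trigger
    year_at_threshold = (threshold - intercept) / slope
    print(f"Condition predicted to reach {threshold} in year {year_at_threshold:.1f}")

A production early warning system would of course use a richer model and data from many similar assets, but the division of labour is the same.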


6.6 Emergent Behaviour: Applying the AI Paradigm Shift to the Built Environment

The next revolution will be the coalescence of programmable machines into systems beyond programmable control.—George Dyson (Dyson, Analogia: The Emergence of Technology Beyond Programmable Control, 2020)

First, a quick, basic, and probably slightly incorrect glossary of terms:

• Neural networks: a form of computing, used as the architecture for AI models, inspired by the structure of neurons in the brain. These take input data and filter it through layers of weighted, interconnected neurons to arrive at an output.
• Gradient descent: a popular mathematical technique for finding 'local minima' in multidimensional space, used to adjust the weightings of parameters in a neural network to minimise the 'loss function' (error). (A toy example follows this list.)
• Backpropagation: a process used to improve the performance of neural networks by feeding errors backwards through the neural connections; this is where gradient descent adjusts the weights of the neuron connections (parameters).
• Large language models (LLMs): a subset of neural network AIs trained on enormous quantities of text to predict the next token (word or part of a word) in a sentence in a manner that simulates realistic written text.
• Transformer: the type of modelling technique used in modern LLMs that allows the LLM to ingest large strings of text and predict the next token in the string.
• Diffusion models: an application of neural networks that creates recognisable forms from a random starting point. We can use these for a range of applications, notably generating images based upon text prompts.
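To give the gradient descent entry a little substance, here is a toy example on a one-parameter 'network' with loss L(w) = (w - 3)^2. Real training does exactly this, just over billions of parameters at once, with backpropagation supplying each parameter's gradient.

    # Toy gradient descent on the loss L(w) = (w - 3)**2, minimised at w = 3.
    w = 0.0        # initial parameter value (a single 'weight')
    lr = 0.1       # learning rate: the size of each downhill step
    for _ in range(50):
        grad = 2 * (w - 3)   # dL/dw, written by hand here; backpropagation
                             # computes this automatically in a real network
        w -= lr * grad       # step in the direction that reduces the loss
    print(round(w, 4))       # converges towards 3.0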

6.6.1 Primer: AI, LLMs, and Software 2.0

Like Moore's Law before it, the growth in AI capability is non-linear, driven both by changes in methodology (such as the use of neural network transformer models) and by far greater availability of the raw materials of AI: training data and processing power. Recent developments in artificial intelligence have been nothing short of profound, and to the casual observer they have come out of nowhere. Even in the time between originally writing this paragraph and coming back a few months later to edit it, the frontier of what AI is capable of has shifted markedly. AI models are self-reinforcing (to a point at least; there will be limitations in terms of both processing power and training data). It is as if on Monday your child learned to draw stick figures, on Tuesday they learned how to shade 3D bodies in various positions, and by Wednesday they were painting in the style of a renowned artist.

Even if one understands this intellectually, the way that rapid improvement in AI manifests itself can still take you by surprise. We humans struggle to account for exponential progress; we have grown accustomed to our own fitful, erratic means


of learning, not complex systems where all feedback is immediately channelled into improvement. This makes it difficult for us to anticipate the changes that AI will bring and how we can use them in our work on the built environment. We know that increasingly powerful generations of AI will appear over the coming few years, but it is hard to imagine how those capabilities will manifest themselves when we have not even fully appreciated what we can accomplish with today's tools such as GPT-4, Bard, and Copilot. Our blindness makes it challenging to identify how AI can add value to the tangible problems that construction and built environment projects face, and which intractable problems we labour under now will be trivially easy to solve tomorrow.

Whilst the capabilities of artificial intelligence evolve constantly, the underlying techniques are—broadly, and over the short term—constant. In the same way that today's computers use the same logic gates and mathematical principles as 1950s machines (and indeed mathematical theories prior to the invention of the computer), just far faster, so many of the advances in AI that we have seen over the last few years have come from the scale of processing that is now possible, rather than a change in the underlying techniques (neural networks were, for example, first articulated in the 1940s). It is a gross over-simplification, but for non-specialists the history of AI is about two massive changes:

• The move from 'complex logic' programmes to neural networks (late 1980s).
• The sudden availability of almost unlimited training data (high-speed internet) and computing power (cloud computing) (mid-2000s until today).

The major change in AI logic occurred in the late 1980s as researchers shifted from rules-based approaches to statistical/Bayesian neural networks. However, it took until the availability of huge quantities of both computing power and training data (via the cloud and the internet respectively) for neural networks to realise the level of progress that we see today (Press, 2016).

At the forefront of neural network development (and hype) today are Large Language Models, or LLMs, which, as the name suggests, generate text in response to text inputs. The capability of these models is, broadly, a function of:

1. The number of parameters it has (e.g., how large the neural network is, currently measured in tens or hundreds of billions of parameters).
2. The size of its vocabulary (e.g., how many distinct tokens or text/integer pairs it understands and uses in its responses).
3. The volume of data used to train the model (e.g., the amount of input data the model builders converted into tokens and fed into the model). Researchers have argued that it is the volume of training data, rather than the size of models, which best predicts model performance (Nostalgebraist, 2022), and indeed that running large models on modest datasets is simply a waste of computational resources.
4. The quality and diversity of training data. The training data for most LLMs at present comes from open source and 'scraped' content from the internet. This

294

5. 6. 7.

8.

6 Radical Technologies

corpus is great for allowing AI to replicate how human being converse, but it is also almost certainly repetitious and filled with inaccuracy. Hopefully in the future we will be able to include higher quality training material such as the huge quantities of academic research, standards, and papers generated by individuals working in the built environment sector over the last two centuries. The length of the training process (currently measured in 1000s of GPUs over multiple months). The number of tokens it can process in parallel (e.g., how many words it considers in context when trying to predict the next word). The amount of supervised training and reinforced learning applied to the base model to make it more human friendly (e.g., the amount of human effort expended to curate the model’s responses, and the amount of computational power used to train the model to better meet implicit human expectations) (Karpathy, 2023). The addition of randomness (or ‘temperature’) in how it chooses the most likely next word. As Stephen Wolfram writes, “for some reason…if we always pick the highest-ranked word, we’ll typically get a very “flat” essay, that never seems to “show any creativity” (Wolfram, 2023).
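The role of temperature in the final item can be made concrete with a few lines of Python. This is a minimal sketch of the principle rather than the sampling code of any real LLM, and the four-word vocabulary and scores are invented:

```python
import numpy as np

def sample_next_token(logits, temperature=0.8):
    """Sample a next-token index from raw model scores (logits)."""
    scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
    probs = np.exp(scaled - scaled.max())   # subtract max for numerical stability
    probs /= probs.sum()
    return np.random.default_rng().choice(len(probs), p=probs)

# A toy four-word vocabulary with invented scores for the next word.
vocab = ["bridge", "tunnel", "viaduct", "culvert"]
logits = [2.0, 1.0, 0.5, -1.0]

print(vocab[sample_next_token(logits, temperature=0.2)])  # almost always 'bridge'
print(vocab[sample_next_token(logits, temperature=1.5)])  # far more varied
```

A low temperature sharpens the probability distribution towards the single most likely word (Wolfram's "flat" essays); a higher temperature flattens it, admitting less likely words and hence more apparent creativity.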

Over the past two years generative AI and LLMs have moved from being a niche experimental toy that one had to apply for access to, to something readily available for a small monthly charge. LLMs are increasingly available as plug-ins to other applications, and increasingly integrate into the cloud architecture that built environment organisations use for their enterprise IT.

It was the development of a simple but ingenious method—transformers, first realised by Google in 2017—that allowed the development of large language models (LLMs) that can process massive quantities of input text in parallel (rather than sequentially). This meant that computation could be parallelised, allowing AI firms to train models on enormous corpuses faster than was previously possible. These models included improved attention algorithms, which meant that they could understand the relative contextual importance of input text, as well as clever means to refine the outputs to make them more amenable to human consumption. With models such as OpenAI's GPT models, Google's Bard, and Meta's LLaMA now available to consumers and organisations alike, it is possible to generate enormous quantities of quality text using simple natural language prompts. Whilst the user interface of these models (the 'assistant' component) makes it feel like they are responding to your questions, they are in fact merely predicting the next word (token) in a series of tokens; in this case a conversation composed of user prompts and model responses.

Having generated your text, you can also illustrate that text with tools such as DALL-E, Midjourney, or Stable Diffusion, which combine LLMs (to parse human prompts) with different kinds of 'generative' neural network, such as diffusion models and Generative Adversarial Networks (GANs). Diffusion models essentially start with random static and quickly refine it, using their neural network to iteratively sculpt images that align with the model's understanding of what the concepts in the prompt look like. Combining LLM, diffusion, and GAN models, it is now possible to 'write' and 'illustrate' entire novels with just a few prompts.6 Soon the same will be true of video, as diffusion models move on from static images to entire videos.

But that is just the tip of the AI iceberg; something more foundational is happening here. For LLMs to generate convincing text they require training on massive quantities of human-generated text. This training essentially involves the model challenging itself to guess the next word in billions of text samples, iteratively updating the weightings of its neurons to improve its accuracy. Training data includes high quality datasets such as Wikipedia, which is ~80 GB in size, as well as broader web-scrapes of internet pages, which can be upwards of a TB in size. Whilst these volumes may seem relatively small, it is worth noting that this is only text data, not images or video, which tend to have comparatively larger file sizes. For reference, the complete works of Shakespeare is about 5 MB, or about one 20,000th of the size of Wikipedia, or one 600,000th of the size of an internet crawl dataset used by an LLM.

As well as using these libraries of written text, firms also train the same models on existing human-generated code. OpenAI, Microsoft, and Amazon have trained their transformer models on open-source software repositories such as GitHub (as well as coding message boards such as Stack Overflow) to offer services like GitHub Copilot X and Amazon CodeWhisperer. One can prompt these tools to generate working code, unit tests, and code explanations. They can also suggest changes and highlight potential errors to human programmers, and proactively identify security issues.

This is a profound change. Since its inception in the 1940s with the work of Turing and von Neumann, computer programming has changed our world. For that entire time the creation of code has been a one-way street: humans create the prompts, computers execute. At least this is true at a superficial level; in fact, computer programmes such as viruses and compilers have been writing code for decades, but always according to rules originally set out by humans. However, Turing's initial insight was that code and information (data) are essentially interchangeable, and now we stand on the cusp of the ultimate illustration of that argument: artificial intelligence (code, in other words) that can not only generate code, but can do so without instruction from human beings, using multi-billion parameter neural networks that are not explicable to human beings. This will result in code that works, but which humans may not necessarily be able to explain. This is the transition that Andrej Karpathy, a founding member of OpenAI, described as the move from "Software 1.0… explicit instructions to the computer written by a programmer" to "Software 2.0… written in much more abstract, human unfriendly language, such as the weights of a neural network" (Karpathy, Software 2.0, 2017).

6 Your authors haven't gone that far, but we have increasingly used these tools to prompt our own thought processes and check our own logic. They have also functioned as a useful motivator: at times it has felt like we need to finish this book before the need for human authorship fades entirely, and that in doing so we must be sufficiently profound and creative to rival our neural network peers. We will leave it to the reader to evaluate our success.


6.6.2 To AI or to Automate?

The shift from human to AI code might help to remedy some of the built environment's specific challenges:

• Data collection and quality: In Karpathy's conception, the task of programming will soon be one of curating and cleansing a suitable training data set and expressing desired outcomes. This could begin to drive better behaviours across built environment organisations, where collecting quality data and agreeing outcomes is often more challenging (and sometimes ethically contentious) than in other industries. By using off-the-shelf or easily modifiable open-source models for our analysis, built environment organisations may be able to limit their investment in data science and instead focus investment on creating high quality datasets that will be valuable to both human and AI users.
• Edge computing and sensors: Karpathy also points out that one of the biggest advantages of neural network-based AI is that it is composed of simple building blocks (e.g., neurons and weights). This means that processors, particularly GPUs, can run neural networks with relatively low power consumption. As Karpathy writes, "the world will change when low-powered intelligence becomes pervasive around us." There are obvious applications here for built environment data gathering, where one of the challenges is providing sufficient power and bandwidth for edge computing chips deployed on assets across a large area.
• Built environment-specific lexicon: Another insight courtesy of Karpathy is that because LLMs contain such a broad 'general' understanding of how language works, it is easier to fine-tune them for specific areas of focus. In other words, because they have developed an understanding of how language (or at least the English language) works across the internet, they implicitly understand specific subtleties. In Karpathy's example this means that you can fine-tune an LLM to identify positive and negative sentiment without needing a large training data set of text with labelled sentiments (see the sketch after this list). Similarly, it means that LLMs will also likely have an implicit understanding of local meanings and lexicons. So, in the same way that you can ask GPT-4 to provide you with a response written in a strong Scots dialect, you can also fine-tune it to provide input in a way that corresponds with the lexicon of construction professionals.
• Accelerated development of domain-specific models: building on the previous point, Karpathy highlights that "it is easier to discriminate than to generate." This is the logic of applying supervised learning to base models (rather than creating new specialist models from scratch), where a small amount of human review can translate into versions of the models that outperform the original generic base model. By building on highly developed base models, built environment organisations can more easily and cheaply create new models suited to their use cases, for example reviewing technical documentation or advising asset managers.
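To make the lexicon point concrete, one possible shape of such a classifier is sketched below using the OpenAI Python client. The model name, system prompt, and snag reports are placeholders of our own invention rather than a tested recipe:

```python
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable

SYSTEM_PROMPT = (
    "You are an experienced construction site manager. Classify each snag "
    "report as LOW, MEDIUM, or HIGH severity and reply with the label only."
)

# Invented snag reports for illustration.
reports = [
    "Minor paint scuffing to plasterboard wall, level 2 corridor.",
    "Water ingress around window head in block B after heavy rainfall.",
    "Fire stopping missing around service penetrations in riser 3.",
]

for report in reports:
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder; substitute whichever model you license
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": report},
        ],
    )
    print(report, "->", response.choices[0].message.content)
```

Note that no labelled training set is involved: the model's general understanding of language, plus a role-setting prompt, does the work that a bespoke classifier would once have required.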


The possibilities are exciting, but the built environment is rarely a priority for tech giants. So how should built environment organisations begin to realise these benefits? And how do we play a constructive role as subject matter experts?

The forefront of artificial intelligence is largely limited to a few giant corporations, elite academic institutions, and well-funded Bay Area start-ups. We are all 'non-specialists' in this context, even if we have the word data in our job title. Working in resource-constrained built environment organisations, our approach to AI should be to benefit from the products produced by the tech giants, rather than to expect to fully understand or rival them. Thankfully, it is increasingly possible to use open-source libraries published by big tech firms to run powerful models without needing to look under the hood. Consequently, it is more important that we understand the emerging trends in AI and data science than that we fettle with their source code.

In his brilliant presentation "How to recognize AI snake oil," Arvind Narayanan distinguishes between three distinct types of problem to which practitioners have sought to apply AI, and the differing levels of success that AI has had at cracking them (Narayanan, 2019) (Table 6.5). Narayanan's distinctions show us that we are most likely to see a return on investment where we automate perception rather than prediction. Stephen Wolfram makes a similar distinction in his GPT explainer, and it is worth quoting at length:

There's an ultimate trade-off between capability and trainability: the more you want a system to make "true use" of its computational capabilities, the more it's going to show computational irreducibility, and the less it's going to be trainable. And the more it's fundamentally trainable, the less it's going to be able to do sophisticated computation (Wolfram, 2023).

What Wolfram is saying here goes to the heart of what LLMs are and are not. Fundamentally, LLMs work by identifying patterns, not by executing mathematical logic. An LLM might give you the correct answer to a problem not because it did the correct computation, but because of a pattern it identified in its corpus, which includes the solutions to the same or similar problems. In this sense LLMs are statistical rather than computational products. Wolfram's distinction reveals a bias in how we humans conceive of the relative difficulty of problems. As he writes, "what we should conclude is that tasks—like writing essays—that we humans could do, but we didn't think computers could do, are actually in some sense computationally easier than we thought (Wolfram, 2023)." We need to take these distinctions on board if we want to identify what kind of problems neural networks can help the built environment to address.

It is worth noting at this stage that neural networks are only one kind of AI, and there are many problems facing the built environment that are best addressed using other techniques, including:

• Non-neural network mathematical models, such as multivariate regression analysis.
• Raw computation, such as the kind of problems used in encryption or proof-of-work blockchains, and fraud identification.
• Automation: the use of tools such as Power Automate or UiPath to replace human workload with the rapid execution by computers of a defined sequence of tasks.

Table 6.5 Identification of AI snake oil

Problem type: Perception—"Genuine rapid progress"
Suitability of AI: Where "AI is already at or beyond human accuracy" because "there is no uncertainty or ambiguity." For example, "given two images of faces, there's ground truth about whether or not they represent the same person."
Built environment examples: Processing of unstructured data, such as archive documents and drawings, into machine-readable data. Adding edge computing logic to asset sensors, for example detecting patterns of movement of asset components (girders, joints, etc.) that indicate that an asset is performing outside of tolerance. Using pattern detection to scan sites to identify safety issues, errors/snags, improper site set-up, or usage patterns such as passenger flow.

Problem type: Automating judgement—"Imperfect but improving"
Suitability of AI: "Far from perfect, but improving"; tasks where "humans have some heuristic in our minds" and "AI will never be perfect at these tasks because they involve judgement and reasonable people can disagree about the correct decision."
Built environment examples: Identifying non-compliance in project data, which may indicate poor cost/schedule/risk management. Predicting failure or reduced performance of an asset based upon suitably accurate monitoring data.

Problem type: Predicting social outcomes—"Fundamentally dubious"
Suitability of AI: "Fundamentally dubious" and reliant upon applying AI to problems that "are hard because we can't predict the future."
Built environment examples: More broadly, identifying whether a project will complete on time or to budget, particularly at an early stage. Anything involving true complexity, where human actors or political uncertainty introduce noise.

For use cases where there is a clear process that we simply need to execute at speed, automation rather than AI is likely to be the answer. This might include many of the bulk tasks that are common in built environment organisations, such as processing invoices, progress reporting, reconciling schedules, and labelling and filing documents.

Conversely, for use cases that stray into 'predicting social outcomes' there is a lack of evidence that AI can address these types of problems. In an uncertain world, there are obvious risks associated with predicting the future based upon patterns recognised in a historical data set. Anyone selling these types of solutions risks fundamentally misleading the organisations that


they work for, or at least consuming resources that would realise greater value if applied to more prosaic use cases.

6.6.3 Categorising Our Problems

Narayanan's categories grant us a useful rule of thumb for prioritising our application of AI. In the data hype-cycle, new and ground-breaking ideas often garner more attention than established and foundational techniques. However, working in the built environment you are unlikely to be employed by an organisation that has sufficient maturity or capability to consistently make use of cutting-edge techniques; nor is it likely to have fully exploited more rudimentary or established data science practice. In recent roles your authors have found that it is easier to build credibility by meeting more of an organisation's immediate needs than by going for moon-shot technologies. This might mean making data available for self-service analytics, providing means to easily catalogue and navigate large repositories of documents and images, or automating existing spreadsheet-based workflows. Whilst hardly the stuff of data science dreams, these kinds of use cases will show early value, and help familiarise business users with data science techniques that might otherwise seem esoteric and ineffable.

One of the big advantages of LLMs in this context is that they are easier to use in a way that complements existing ways of working than other AI technologies. Rather than moving users to a whole new set of tools for managing data, they can continue to use their spreadsheets and dashboards whilst receiving guidance from an LLM (in much the same way that many of us sense-check our assumptions using a search engine).

It is worth exploring the following two questions to triage our problem statements and identify the right kind of technique:

• Is this problem simple enough to solve with the application of hard-coded logic? Many of our challenges in the built environment don't really need an intelligent agent at all. These might be areas where we benefit from automation rather than intelligence. Data quality is one such area. Built environment organisations manage sizeable business-critical databases and datasets: asset registers, document repositories, BIM CDEs, ERP systems, and so on. Often these datasets are sufficiently large, and of inadequate enough quality, that manually identifying and fixing errors is prohibitive. But it is possible to define data quality rules that we can in turn code as algorithms to automatically search through our datasets and identify errors (a minimal sketch follows this list). This is a 100% rule-based application that can realise massive benefits to an organisation without having to go anywhere near AI.
• If the problem is too complex for hard coding, can I approach it using neural networks? Do I have a sufficiently large training data set, or another means of training the AI?
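As promised in the first bullet, here is a minimal sketch of rule-based data quality checking using pandas. The asset register, field names, and rules are invented for the example; the point is that nothing here involves training or inference:

```python
import pandas as pd

# A toy asset register; real registers run to hundreds of thousands of rows.
assets = pd.DataFrame({
    "asset_id":     ["BR-001", "BR-002", None, "BR-004"],
    "asset_type":   ["bridge", "bridge", "culvert", "brdge"],  # note the typo
    "install_year": [1978, 2035, 1995, 2001],
})

VALID_TYPES = {"bridge", "culvert", "tunnel", "retaining wall"}
CURRENT_YEAR = 2024

# Each rule is pure hard-coded logic: no training data, no model, no AI.
rules = {
    "missing asset_id":       assets["asset_id"].isna(),
    "unknown asset_type":     ~assets["asset_type"].isin(VALID_TYPES),
    "install_year in future": assets["install_year"] > CURRENT_YEAR,
}

for name, failed in rules.items():
    print(f"{name}: {int(failed.sum())} record(s)")
    print(assets[failed].to_string(), end="\n\n")
```

Run nightly against the real repositories, a library of such rules becomes an automated data quality audit at a scale no human reviewer could match.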


Where the construction sector has most often struggled is on the second question. We know the types of problems that we want to solve, and we know that they are too complex to hard code, but we don't always have sufficient high-quality training data to allow us to leverage AI to solve them. This can be a chicken-and-egg situation: we struggle to justify the cost of the instrumentation and data collection required to fuel AI use cases, which in turn means we have no proof that the data collection is worthwhile. Similarly, we may struggle to justify the resources required to curate datasets to the point where they are sufficiently accurate to function as training fodder. Again, this is an area where pre-trained core LLMs may offer advantages, as they reduce the volume of high-quality data that organisations must collate themselves to start generating useful insight.

Examples of where AI may apply include:

• Site data collection and validation: using AI-trained sensors and edge computing to monitor sites or assets and to report issues back to infrastructure operators, including:
– Structural health monitoring: identifying the early warning signs of asset failure,
– Predictive maintenance: identifying where equipment or assets are overloaded or behaving out-of-tolerance in a way that may suggest imminent failure (a minimal sketch follows this list),
– Safety management: using feature detection to identify a lack of appropriate safety provisions on site.
• Traffic management/incident response: recommending strategies to respond to disruption in networks, and/or providing users with alternative means of travel (see: Mobility-as-a-Service (MaaS)).
• Load balancing on utility networks: ensuring that networks such as gas pipelines can maximise capacity and minimise leakage.
• Data extraction/pattern recognition: finding valuable information in archives, and synthesising/summarising that information to make it easily accessible to humans.
• Risk management: highlighting salient features that indicate a project is operating at an elevated risk profile. This could include leading indicators of failure, such as incomplete documentation, or features that increase the level of risk, such as certain working patterns, weather, or events.
• Knowledge creation: LLMs provide us with the means to gather qualitative feedback across a large body of practitioners in a way that is interactive and intuitive.
• Digital twins: responsive models that can improve the resilience of systems of systems.

Each of these use cases will require access to a substantial amount of high-quality instrumentation and/or records, and would require substantial calibration to align with the judgement of human experts.
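As a sketch of the predictive maintenance bullet above, the example below flags out-of-tolerance sensor readings using a rolling z-score. The deflection data is simulated, and a real deployment would need calibration against engineering judgement:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Simulated hourly deflection readings (mm) from a bridge sensor, with a
# slow drift added at the end to mimic emerging structural distress.
readings = pd.Series(rng.normal(loc=3.0, scale=0.2, size=500))
readings.iloc[450:] += np.linspace(0.0, 1.5, 50)

window = 72  # roughly three days of hourly readings
baseline = readings.rolling(window).mean()
spread = readings.rolling(window).std()
zscore = (readings - baseline) / spread

# Flag readings more than three standard deviations from recent behaviour.
alerts = zscore[zscore.abs() > 3.0]
print(f"First out-of-tolerance reading at hour {alerts.index.min()}")
```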


6.6.4 Specialist Built Environment Models

It is no coincidence that amongst the first use cases of neural networks to gain publicity was identifying cats and dogs. The training data for cat and dog recognition algorithms is effectively unlimited, because people are enamoured with their pets and willingly upload images of them to the internet without prompting. They are also happy to curate such images (for example, removing pictures of other animals that sneak in, or misidentified objects). The same is not true of sewage pipes, masonry walls, or plaster soffits. Can we as data professionals make the case for substantial improvements in the granularity and accuracy of data collection, particularly the instrumentation of assets? Moreover, can we find cost-effective means of calibrating the outputs of AI tools against the sector's extensive existing engineering judgement? By meeting these two criteria, can we build AI applications that operate at sufficient scale and with sufficient accuracy to effectively augment our limited human resources?

When contrasting widely available information (cats, dogs) with less freely available information (engineering data, asset-specific data) we stumble into the distinction between general knowledge and specialist expertise. This is where the training dataset for LLMs becomes pertinent. In his talk State of GPT, Karpathy reminds us that in its hunt for the next best token to continue any string of tokens, an LLM like GPT-4 isn't looking for the correct answer but rather the likely answer. This is because the training dataset (e.g., the entire internet) used to train the model includes both correct and incorrect answers to problems. Karpathy describes ways of correcting for the models' indifference to truth, which are not unlike how we use technical experts within our own professions. These techniques include "prompt engineering" the model to use expertise (e.g., "imagine you are a leading expert in the field"), asking it to show its working, and pointing it towards additional contextual information.

This last point is going to be crucial in how built environment firms derive value from LLMs. Whilst training these models on the internet will provide a fair amount of superficial information on engineering concepts, there is a lot of specialist engineering knowledge that is not freely available on the internet. As one pseudonymous writer noted, "The entire available quantity of data in highly specialised domains… is woefully tiny, compared to the gains that would be possible if much more such data were available" (Nostalgebraist, 2022). By providing specialist engineering context to existing LLMs, in the form of access to engineering standards, textbooks, and trade publications, we can train SME LLMs (what Karpathy calls 'Retrieval-Augmented LLMs'). Like real SMEs, these SME LLMs will have the advantages of both a broad understanding of language and general facts, and specific training and targeted prompts concerning the behaviour of physical assets in the real world. Such engineering-focused branches of existing LLMs are likely to prove the most fruitful in providing reliable and accurate decision-support to professionals working on the built environment.
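The retrieval-augmented pattern itself is simple enough to sketch. The toy below uses crude word overlap as a stand-in for embedding similarity, and the document titles and question are invented; a production system would use a proper embedding model and vector store:

```python
def similarity(a: str, b: str) -> float:
    """Crude word-overlap score, standing in for embedding similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

# A small 'library' of specialist engineering material (titles invented).
documents = [
    "guidance on fatigue assessment of steel highway bridges",
    "report on masonry arch bridge inspection and repair",
    "O&M manual for rotary screw air compressors",
]

def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the k documents most relevant to the question."""
    ranked = sorted(documents, key=lambda d: similarity(question, d), reverse=True)
    return ranked[:k]

question = "how should I assess fatigue cracking on a steel bridge deck"
context = "\n".join(retrieve(question))

# The retrieved passages are prepended to the prompt so that the LLM's
# answer is grounded in material it was never trained on.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

The design choice is the key point: rather than retraining a model on scarce specialist data, we retrieve the relevant specialist text at query time and let the general-purpose model reason over it.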


In his description of how GPT models work, the renowned mathematician Stephen Wolfram frames the advantages and disadvantages of neural networks in general, and large language models specifically. In Wolfram's words, "The tasks we're typically trying to get neural nets to do are "human-like" ones—and neural nets can capture quite general "human-like processes" (Wolfram, 2023). One implication of this insight is that, by virtue of their human-friendly natural language interface, LLMs may be well-placed to interact with human beings in a way that accelerates Nonaka's knowledge creation cycle. Figures 6.11 and 6.12 describe how we might use LLMs not only to better process our explicit knowledge, but also to ease the friction between human-embodied tacit knowledge and computer-embodied explicit knowledge through natural language internalisation and externalisation. AI will succeed here not by replacing human engineers, but by allowing those engineers to process far greater quantities of information and sense-check their decisions.

Fig. 6.11 Pre-LLM knowledge creation

Fig. 6.12 Post-LLM knowledge creation


Many of us have spent our careers trying to collate and improve vast troves of quantitative data on the presumption that this is the first step towards generating insight. The human-like capabilities of LLMs demonstrate that there are also considerable gains in harnessing qualitative data, and that we favour quantitative over qualitative at our peril.

As built environment organisations have belatedly realised the value of data to their operations, they have invested copious time and resources in centralising their data management. The putative advantages are clear:

• If I'm a client and I own all the data, I can better track my projects and my supply chain.
• If I'm a supplier and I own all the data, I can better meet my client's requirements and out-compete my rivals.
• If I'm a vendor and all the data is on my platform, I can provide a common reference point to my users, plus secure a steady stream of revenue, because getting data out of a system is always a faff.

It is possible, but far from assured, that LLMs provide us with an alternative to endless fights over who owns the platform. If we can use LLMs to consolidate and share insight, whilst leaving the data wherever it happens to reside, it might be possible to create knowledge across the sector's organisational silos. LLMs also bring the bonus of obfuscating the source or sources of any insight, removing some concerns that organisations have in terms of exposing their data, intellectual property, or their own failings. Perhaps the next step towards the long-awaited data commons for the built environment isn't a massive database, or even decentralised data sharing, but actually a neural network.

6.6.5 Predictive Analytics

The last decade or so has seen substantial expenditure on the application of data science to construction and the built environment. When one of your authors joined in the creation of a small 'data-driven asset management' team at a consultancy back in 2012, the data-driven part was usually a combination of Visual Basic and SQL Server, maybe some object-oriented programming in C# if you were feeling fancy. In the intervening decade, storied engineering consultancies such as Atkins, Arup, Arcadis, AECOM, and other organisations that begin with 'A' have all invested heavily in creating, consolidating, and growing their data science and digital transformation practices.

Much of their investment has centred on predictive analytics: the pursuit of using data to anticipate when an asset is likely to fail, and to identify where there is capacity to use assets more intensively. It makes sense that predictive analytics is one of the leading applications of data science in the built environment sector for a few reasons, namely:


• The strategic priorities of construction and the built environment are almost always inherently linked to time. The time taken to create an asset. The time over which an asset is out of service for maintenance. The time delays caused by unexpected asset failure. The time consumed moving resources around the country in response to failure. Anticipating failure before it occurs, whether that is asset failure, scheduling conflicts, or safety risks, can reduce the time impact of our work, which directly translates into cost savings.
• Whilst the techniques used in predictive analytics may be foreign to many working in our field, the logic of these techniques is relatively close to the type of heuristics that engineers and analysts currently use (failure modes, state models, leading indicators, etc.). Sometimes the simplest and most effective application of predictive analytics is codifying and automating the heuristics of engineering SMEs.
• Similarly, whilst predictive analytics can become complicated, it builds upon the reporting and trend analysis that is already common within Business Intelligence and Project Controls functions across the sector.
• Unlike consumer-focused data science products like sentiment analysis, chatbots, and recommendation algorithms, predictive analytics relates directly to the core function of the built environment: creating, operating, and maintaining physical assets in a safe, timely, and cost-effective manner.

Relevant applications of predictive analytics to the built environment focus on anticipating time, cost, quality, and safety issues before they occur. An organisation might begin to identify the leading indicators of cost and time overruns on a project by isolating a range of early warning signs such as:

• Seemingly innocuous errors that point to an underlying lack of discipline, such as incomplete project documentation or a lack of risk assessment.
• Small instances of underperformance that may accumulate into much larger, more pervasive problems.
• Changes in resource profile, variations in productivity across days, increases in lost time due to sickness, and increases in re-work or an uptick in near misses/safety incidents.
• Gaps in the data, or discrepancies between reported and actual progress, that imply a 'good news' culture.

These factors might become apparent through statistical analysis of the project history, or of similar projects. We can evaluate candidate contributing factors against real data to understand the extent to which they are predictive of failure. Similarly, whilst it is difficult to predict individual safety incidents, it is possible to identify the leading indicators of a rising risk of safety incidents, including:

• Indicators of a poor safety culture, including incomplete or late risk assessments, poor site set-up, low staff sentiment, rising staff sick days and absences, tight timescales, limited supplier assurance, or a failure to capture an appropriate level of near miss reports.
• Repeated flagging of 'near misses' caused by similar issues, perhaps identified by running natural language processing (NLP) across safety documentation.


• Statistical analysis of the project history, or of similar projects, to identify prior contributing factors to safety incidents. For example, the National Highways Complex Infrastructure Programme found that shift patterns, weekends, weather, and the presence of less construction-familiar staff (such as archaeologists) all contributed to making sites more likely to see incidents.

However, the most common application of predictive analytics is anticipating the failure of physical assets. Where instrumentation is in place to monitor assets, it is increasingly possible to observe the precursors to failure in the data on asset state and performance. Much of the equipment and assets that compose the built environment operate within a set of tolerances. When a load passes over a bridge, we expect the deck of the bridge to deflect under the weight of that load. An amount of deflection ensures that the bridge can effectively distribute the weight, but too much deflection might suggest that the structure is weakening and approaching a point of failure. Similarly, mechanical and electrical equipment can drift out of operational parameters as faults develop. The equipment might perform actions more slowly (think of the doors shuddering closed on a creaky old lift) or get stuck in a reboot cycle.

It is usually safer, cheaper, and less disruptive to maintain an asset or replace a piece of equipment prior to failure than to react to failure. This is doubly the case across networks of buried assets, such as gas distribution, where dangerous or polluting asset failures can occur with little trace above ground. In general, the higher the consequence of failure, the greater the difficulty of identifying failure manually, and the greater the cost of remedying failure, the stronger the case for both the instrumentation of assets and the processing of data to identify the early signs of emergent failure. In theory, any organisation that manages expensive or critical assets can provide a better level of service and reduce operating costs by monitoring assets remotely and responding to signals that function as precursors to failure.

The challenge in calibrating such systems is ensuring that they accurately predict failure with sufficient time to respond (e.g., avoiding false negatives) whilst also accurately distinguishing instances where signals fall out of tolerance but failure is not imminent (e.g., avoiding false positives). Predictive maintenance systems that do not meet these criteria can leave organisations tolerating a higher risk of failure than they realise, and/or consuming too many resources responding to alarms from assets that are not actually near failure. Work on systems that meet these criteria is highly competitive, potentially lucrative, and very much ongoing.
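The false positive/false negative trade-off can be illustrated with simulated anomaly scores: as the alarm threshold rises, precision improves but recall falls. The numbers below are invented for the purpose:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated anomaly scores: 950 healthy assets and 50 that went on to fail.
failed = np.concatenate([np.zeros(950), np.ones(50)])
scores = np.concatenate([
    rng.normal(0.30, 0.15, 950),  # healthy assets mostly score low
    rng.normal(0.70, 0.15, 50),   # failing assets mostly score high
]).clip(0.0, 1.0)

for threshold in (0.4, 0.5, 0.6, 0.7):
    flagged = scores >= threshold
    precision = failed[flagged].mean()    # proportion of alarms that were real
    recall = flagged[failed == 1].mean()  # proportion of failures caught
    print(f"threshold={threshold:.1f}  "
          f"precision={precision:.2f}  recall={recall:.2f}")
```

A low threshold catches nearly every failure but floods the maintenance team with false alarms; a high threshold keeps alarms credible but lets failures through. Choosing the operating point is an engineering and commercial decision, not a purely statistical one.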

6.6.6 Data Science on Built Environment Projects

Data science should augment rather than replace the existing functions performed by teams within an organisation. This means applying data science techniques to create insight that informs existing activities such as scheduling, planning, modelling, maintenance, operations, and hazard management.


The advantage that data science brings is speed and scale. Machine learning can allow us to stress test schedules against a wider range of scenarios, respond to usage and environmental patterns across large asset portfolios and complex networks, simulate the sequencing of work across space and time, and automatically compare 'as-built' work delivered on site against digital designs. Data science is an act of rapid prototyping, failing fast, and targeting early business value. Figure 6.13 describes some of the opportunities for the application of data science on built environment projects, broken down by project stage (plan, design, construct, operate) and function.

Time, cost, quality, safety, and risk are the lifeblood of construction and maintenance projects; likewise, availability and efficiency are the preserve of operational assets. There exists extensive established practice on how to measure and capture metrics to enable continuous improvement in the delivery of projects and performance of assets (Jardine, 2007). There are also numerous data specifications for these data types (e.g., cost breakdown structures, work breakdown structures, etc.), meaning that the data quality in this area is often better than elsewhere.

Fig. 6.13 Data science techniques through the life cycle of a built environment project
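Much of that established practice rests on simple, well-defined arithmetic over standardised data structures. The standard earned value calculations below are one example (the figures are invented):

```python
# Planned value (PV), earned value (EV), and actual cost (AC) for a work
# package, in £k; figures invented for illustration.
pv, ev, ac = 1200.0, 1050.0, 1300.0

spi = ev / pv  # schedule performance index: < 1.0 means behind schedule
cpi = ev / ac  # cost performance index:     < 1.0 means over budget

budget_at_completion = 5000.0
estimate_at_completion = budget_at_completion / cpi  # a common simple forecast

print(f"SPI = {spi:.2f}, CPI = {cpi:.2f}")
print(f"Estimate at completion = £{estimate_at_completion:,.0f}k")
```

Because these measures are already standardised, they make a natural substrate for the machine learning applications described above: the hard part is rarely the formula, but the quality of the PV, EV, and AC data feeding it.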


The combination of these factors makes project controls and performance improvement obvious areas to seek early value from data science. Many large engineering and construction firms and clients are investing in using data science to complement and augment the work of existing practices such as planning, cost management, safety management, or construction logistics. Applications include interrogating large volumes of data to identify leading indicators of cost overruns and delays, such as poorly sequenced plans, lack of key information, low engagement in quality assurance processes, increasing numbers of incidents, and so on. Whilst algorithms may not always be able to analyse this information as well as experienced professionals, they can quickly parse large volumes of information and flag potential problems for human review.

The application of data science to project controls has many crossovers into design. Having established accurate and meaningful data on time, cost, quality, safety, and risk, we can then apply this data to 4D (geometric plus time) representations of our work. Techniques in this space—variously called 4D/5D BIM, Digital Rehearsal, or Construction Simulation—combine project controls data with asset data to provide a greater understanding of the areas of the work that carry risk of delay, cost rises, and safety incidents. By visualising project data across space and time, these techniques make it easier to talk through the sequencing of activities, and to visually identify bottlenecks otherwise hidden in the lines of a Gantt chart.

Most construction projects have an intuitive sense of where their 'critical path' is likely to lie. Anecdotally, Tideway needed some enterprising engineering to avoid serious delays from the complexities of works underneath Blackfriars Bridge (Ferguson, 2020), whereas on Crossrail, Bond Street station drifted so far 'right' on the programme that it eventually opened after the rest of the central part of the line (Weinfass, 2021). The combination of numeric and model data in Digital Rehearsals enables a project to work through solutions to these problems in a digital environment, and to establish an effective plan that de-risks the execution of works in the real world. The benefits of such technologies are apparent in the construction of a wide range of time-, cost-, and space-constrained projects, as evidenced by the track record of 4D specialists such as Freeform (Bowles, 2023). Digital Rehearsal allows more efficient and repeatable construction sequencing, as well as acting as a communication tool and a means to engage site operatives in the planning process.

There are a range of other use cases where the combination of data, AI, and in some cases specialist hardware can realise meaningful benefits on the ground. Some examples called out in the diagram include:

• Using sensors and AI computer vision to check works against plan as works proceed.
• Creating data-driven smart contracts.
• Reviewing invoices to identify indicators of fraud.
• Smoothing the handover process from construction into operation.
• Creating the data underpinnings of IoT, Digital Twins, and Smart Buildings.

Data science doesn't come cheap, but it is a bargain compared to the cost of project overruns and poor estimates. The UK's most valuable startup, the Fintech


firm Revolut, founded in 2015, reached a valuation of £24 billion in 2021 (Timmins, 2021). By comparison, the estimated cost of the UK's High Speed 2 rail project increased from £33 billion in 2010, to £55 billion in 2015, to estimates of between £72 billion and £98 billion in 2021, before one considers reduced scope and delayed timescales (BBC News, 2021). Whilst HS2 is unique in its sheer scale, almost all major construction projects in the UK in recent years, including Crossrail, Tideway, and Sellafield, have faced cost overruns and schedule delays (Horgan, 2021). Of course, each of these projects employed a wide variety of data roles. As such, one might argue that pouring yet more data capability into such projects merely increases their cost further. And yet we believe that if you work in the public sector, you owe it to the taxpayer to try to use data science and related techniques to address this sector-wide problem of cost and delay. We must continue to seek ways to use statistical analysis, modelling, and artificial intelligence to reduce the uncertainty associated with construction projects, to provide users with the ability to view and understand data, and to begin to create a positive and proactive feedback loop between physical assets and the decision-makers responsible for creating and using them.

6.7 Information, Everywhere: The Paradigm Shifts of IoT and Cloud Computing

When systems designed to hide their inherent complexity from the end user fail, they fail all at once and completely, in a way that makes recovery from the failure difficult.—Adam Greenfield, 'Against the Smart City' (Greenfield, 2013)

Internet of Things (IoT) is a loose term describing the deployment of sensors and edge computing to create a diagnostic, real-time data connection to physical objects. IoT appears alongside concepts such as 'Digital Twins', 'Smart Buildings', and the 'Fourth Industrial Revolution (4IR)'. As is often the case with cross-sector semi-technical hype terms, these labels obscure the underlying opportunity. In short: it is now both possible and affordable to have a rich, near real-time understanding of the usage, performance, and condition of a complex built asset or set of built assets.

Whilst instrumentation and monitoring have use cases across the asset life cycle, IoT might help solve the puzzle of how to transition Digital Twins out of the construction stage and into operation. It is the flow of information that will enable users, facilities managers, and asset managers to understand fundamental facts about the asset, including occupation levels, environmental factors (e.g., temperature, humidity, air quality), asset condition (e.g., structural strength, loads), and events (e.g., fires, floods, asset failure). Paired with cloud data storage and data science techniques, IoT provides us with the opportunity to operate assets with far greater situational awareness. Ideally, it moves us from a paradigm where asset owners react to usage patterns and failures, to one where they proactively anticipate issues, adjust the behaviour of their buildings or infrastructure accordingly, and intervene on the assets where necessary so that they can continue to safely meet demand.
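In data terms, the usual pattern is a stream of telemetry published by devices and consumed by monitoring logic. A minimal sketch using the paho-mqtt Python library (1.x call style) follows; the broker address, topic, and tolerance are our own inventions:

```python
import paho.mqtt.client as mqtt

DEFLECTION_LIMIT_MM = 4.5  # invented tolerance for illustration

def on_message(client, userdata, message):
    reading = float(message.payload.decode())
    if reading > DEFLECTION_LIMIT_MM:
        # A real deployment would raise a work order or alarm, not print.
        print(f"ALERT {message.topic}: {reading} mm exceeds tolerance")

client = mqtt.Client()
client.on_message = on_message
client.connect("broker.example.org")           # hypothetical broker address
client.subscribe("bridges/BR-001/deflection")  # hypothetical topic
client.loop_forever()
```

The significant design choice is publish/subscribe rather than polling: thousands of low-power devices can push readings as they occur, and the consuming logic can be changed without touching the devices themselves.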


6.7.1 What Is Smart?

A Smart Building is one with sufficient IoT and data infrastructure for operators to understand and respond to the performance of the asset in near real-time. The question of how to deliver a Smart Building or a Smart Place is important because underpinning that debate is a much broader question about the legacy of construction works in general. Smart Building capability—whether standalone or integrated into an existing set of operational systems—will figure prominently in the transition from the 'build' phase of an asset to the 'operate' phase of the asset life cycle. It is an opportunity to hand over a usable, responsive, and information-rich asset from construction to operation, as opposed to merely handing over an operation manual. Potentially it is possible to retrofit Smart Building technology onto existing or heritage assets to allow them to meet increasingly stringent sustainability targets or to accommodate new uses. The ability of data and digital technologies to make such retrofits effective is an important part of reducing the carbon impact of the built environment, because repurposing buildings tends to have a dramatically lower carbon footprint than demolition and replacement.

For all their putative benefits, appending the word 'Smart' to things without really defining what constitutes 'smartness' does result in a degree of ambiguity. As leading Smart Building proponents have argued, "'Smart' is an opaque subject for the sector – there is little agreement regarding what matters in a smart building: how to approach creating a smart building, how to work with the supply chain, or even what 'smart' itself means (Barendrech & Barker, 2021)." Your authors' own market research (via a poll on LinkedIn) indicated that there is still a variety of definitions of the level of sophistication required for an asset to be deemed 'smart.' Many respondents thought that any building equipped with IoT devices is a Smart Building. However, most expected the integration of IoT into control systems and the use of predictive analytics before a building truly warranted the 'Smart' designation. Another, more user-focused, interpretation is that the specific technologies used may be less important than the decisions that the technology enables. We would argue that a building becomes smart when its owners/operators are able to ensure that it responds to the emerging demands placed upon it (Fig. 6.14).

Consensus on the definition of Smart Buildings and Smart Places is less mature than that of BIM, for example. There are few International Standards targeted at Smart Buildings, and fewer organisations publish specific Smart Buildings strategies relative to BIM or Data strategies. Broadly, Smart Buildings are what happens at the confluence of a few trends:

1. The application of IoT to buildings, including both telemetry on building systems and some means of assessing occupancy and usage patterns.
2. Cloud data management at a level that allows the large-scale storage and analysis of telemetry, monitoring, and usage data.
3. Creation of sufficient data models (knowledge graphs) to codify the logic of the building (and potentially its users).
4. Data science and predictive analytics that can take advantage of items 1–3 above.


Fig. 6.14 What the people think: what is a smart building?

5. A maturation of control systems, moving beyond SCADA into predictive automation via Building Management Systems (BMS).
6. Visualisation of buildings as a system (BIM).

Figure 6.15 summarises the component parts from a technological perspective.

Fig. 6.15 High level abstraction of Smart Building concepts

A recurring theme in the literature is that Smart Buildings do not exist solely to improve the operation of the buildings, but also to improve how they respond to their human occupants. As EY write, "optimally, the smart building allows owners


and operators to better understand occupant needs and behaviours (Ernst & Young, 2017)." In that same report, EY identified four principal benefits of Smart Buildings:

• Creating a better customer experience, for example targeting areas of high occupancy with a greater frequency of cleaning and staffing to deliver a "condition-based service."
• Enhancing energy efficiency, for example automatically switching off heating, cooling, and lighting in unoccupied meeting rooms.
• Making operational improvements, for example using heat sensors in rooms to detect HVAC failures.
• Establishing a more sustainable environment; Arup write that "Smart buildings are a critical component in the race to achieve net zero operations (Stych, 2022)."

Large property owners are moving to make Smart Buildings a core part of their value proposition, one that is inextricable from their financial, customer service, and sustainability commitments and objectives. A good example is British Land's deployment of their BL:Connect smart building system at 100 Liverpool Street, a commercial building in the City of London. Fiona Sawkill, British Land's Head of Digital Placemaking, describes how "some 9,000 assets – including light bulbs, lifts, fan coil units, escalators, sensors and valves – are feeding valuable data into BL:Connect on how spaces are being used and what conditions are like (Sawkill, 2022)." Sabine Lam, the Googler in charge of the 'Building Operating System' for Google's worldwide campuses, explained her mission statement as "to maximise the value of Google real estate investment by securely connecting spaces to enable data-driven execution and tailored user experience (Lam, 2021)."

There are also restoration projects looking to bring smart infrastructure to historic buildings. The University of Glasgow's Smart Campus Digital Masterplan (written by Mott MacDonald) covers their campus of over 320 buildings, including several listed buildings (Pearce et al., 2019). The University describes its aspiration to "embed the use of smart, contemporary technology into the daily lives of students and staff to enhance the way they interact with their environment." Their report describes a range of outcomes, from physical security and cost savings through to the augmentation of the physical environment with digital infrastructure. Their roadmap identifies short, medium, and long-term initiatives, including:

• "Leverage data analytics to identify health & wellbeing gaps."
• "Create a sandbox area… to enable innovation."
• "Develop a digital twin with scenario planning functionality."

Making the case for an investment in Smart Buildings will likely require you to tie the desired outcomes to your organisation's objectives. This might mean enhancing user experience, improving building energy performance, keeping occupants safe, or improving staff wellbeing. Deloitte describe seeking to combine "the physical assets within the building, the digital assets that create a fabric throughout the connected space, and finally the use cases that are enabled by the marriage of physical and digital assets (Michalik, 2018)." When deploying Smart Building technology, clients may be seeking capabilities such as:


• Visualising data to enable informed operational and life cycle decisions.
• Future-proofing building systems such that they can meet changing business demands.
• Creating a physically and digitally secure environment.
• Improving the energy efficiency and occupancy management of a building through smart room booking and facilities management that responds to actual usage patterns.
• Enabling proactive or predictive operations and maintenance.
• Enhancing the accessibility and usability of built assets.

Creating Smart Building functionality from scratch can be a complex and expensive task, particularly where integration between systems proliferates. However, there are opportunities to work with existing COTS solutions such as Building Management Systems (BMS) and occupancy management software, complementing these existing platforms with IoT sensors, and integrating only where necessary and useful into a Smart Buildings COTS product. Making use of 'out of the box' functionality in this way will have a lower upfront cost and a higher likelihood of success, particularly where it draws from methods proven by other client organisations. That said, not every use case will have an off-the-shelf product ready and waiting for it.

As data professionals we must also be cognisant of our users' privacy and level of comfort with data capture. Where possible, we should make our users aware of what, if any, personal data we collect, why, and for how long. Data collection that may feel fine in a working environment may not be appropriate in a public space or domestic setting, or vice versa. Smart Building technologies risk crossing the 'creepiness' threshold: nobody wants to feel digitally stalked by their work, their campus, their gym, or their smart speaker. A client's approach to Smart Buildings should reflect their information security culture. That might mean using products like threshold sensors (which only count numbers of people, rather than identifying individuals) instead of individual desk sensors, or using IoT devices that run on LoRaWAN to avoid integrating into the corporate Wi-Fi network.

The outcome of using this technology should be one that leaves occupants feeling able to make better use of their building, whether through easier access to buildings/rooms/lockers, help with wayfinding, assistance with visitors, or better accessibility support. Smart Buildings may also blur into Digital Twin use cases, helping building managers to understand the impacts of near real-time changes to the configuration of their assets.

Building 'smart' functionality into BMS, lighting, HVAC, and other M&E equipment is not without cost, and nor is integrating with tools like occupancy management and room booking. It will require clients to change how they specify the hardware and tooling that they buy, so that data is available by default following handover. It also means bringing rigour to the handover process from construction to operation, as this is the principal opportunity to ensure that projects properly commission equipment and that documentation such as O&M manuals includes the information that the building manager will need in the future. Even better, vendors could build this logic into the user interfaces of the digital systems that control the Smart Building. Creating well-specified and easily accessible data feeds as we install


or renew equipment is key to realising efficiencies from Smart Buildings, allowing clients to 'right size' their office space, reduce energy consumption, target services such as hospitality and cleaning, and more easily identify and repair faults.

Implicit in the Smart Buildings philosophy is a profound ability to understand the configuration of a client's asset base, to measure its on-going performance, and to help occupants make better use of their environment. The issue of incomplete handover from construction to operations has plagued this sector, which is why the soft landings literature exists. Often vast quantities of information are discarded or archived at handover because operational teams simply aren't able to make use of BIM models, imagery, sensors, schematics, manuals, and all the other information generated during construction. Or worse, the parties responsible for construction fail to collate this information so that it is available at handover.

In operation, Smart Buildings are going to produce and consume massive quantities of data. Without effective data modelling, quality, storage, analysis, and (crucially) realistic and logical use cases, a Smart Building will consume considerable resources without creating any value, at which point the 'smart' technologies will life-expire long before the building does and the capability will be mothballed. The data underpinnings of a Smart Building follow from many of the topics covered in this book, namely:

• A detailed understanding of the built asset and its instrumentation, realised through investment in Digital Construction.
• Hyper-scalable data storage, realised through data management using modern cloud-based data architecture.
• A logical map (ontology) of how the asset behaves, and how human actors interact with the asset. This will be the means to translate your Smart Building use cases into data solutions, as documented in your Common Data Model.
• The analytical models required to understand and proactively respond to changes in the building and its usage, realised through sophisticated data science capabilities.
• The hardware, instrumentation, and integration required to observe, capture, and control the aspects of the building and its use that are relevant to your Smart Building use cases.

By delivering these foundational activities, the realisation of a Smart Building becomes a question of integration rather than development from first principles. Realising a Smart Building demands that the end client receives, and makes use of, a full life cycle of information pertaining to the asset. It does not mean training every member of the maintenance team in 'BIM.' It does mean creating an accurate representation of the asset base using tooling that they understand, and ensuring that sufficient monitoring of that asset base is in place to track its on-going performance and usage patterns. Information Management and data specifications have a vital role to play here, as does pragmatism and common sense.
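To ground the analytical models point in the simplest possible example, the sketch below implements the unoccupied-room energy rule described earlier in this section. The room events, timings, and threshold are invented, and a real system would act on the BMS rather than print:

```python
from datetime import datetime, timedelta

# Hypothetical occupancy readings for one meeting room: (timestamp, people).
events = [
    (datetime(2024, 5, 1, 9, 0), 6),
    (datetime(2024, 5, 1, 10, 0), 0),
    (datetime(2024, 5, 1, 10, 5), 0),
    (datetime(2024, 5, 1, 10, 20), 0),
]

SETBACK_AFTER = timedelta(minutes=15)
empty_since = None

for timestamp, people in events:
    if people > 0:
        empty_since = None            # room in use; reset the empty timer
    else:
        empty_since = empty_since or timestamp
        if timestamp - empty_since >= SETBACK_AFTER:
            # A real system would call the BMS API here, not print.
            print(f"{timestamp}: empty for 15+ minutes, setting HVAC back")
```

Most Smart Building value is made of rules this mundane; the hard part is the data plumbing, commissioning, and ontology that make such rules trustworthy at the scale of thousands of rooms.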


6.7.2 The Drawbacks and Ethics of Smart Places

Smart Buildings and Places are not without their critics. We quoted Adam Greenfield's book 'Against the Smart City' at the start of this section. It is an acerbic defence of the primacy of culture and organic, messy life over the optimisation of 'Smart' technologies (or, more accurately, over the sales pitches of companies selling 'Smart' technologies to 'augment' the built environment). Whilst Greenfield's focus is on the premise of Smart Cities, many of his critiques are applicable to individual buildings as well. Indeed, his suggestion that "smart cities are over specified" might be even more applicable to Smart Buildings. Anyone who has had to interact with a building which mediates key parts of the user experience through dated 'smart' technologies will attest to this. We've all visited buildings where a 'high tech' fit-out has aged poorly: ageing and barely responsive touch screens mediating access to lifts, coffee, or meeting rooms; security barriers that appear to have a vendetta against the human race; or video conference equipment that appears to be built from repurposed 1980s fax machines.

Smart buildings are fragile in the sense that they give people strictly mediated means of interacting with the physical asset, which in turn do not change with advances in technology or user expectations. The refresh cycle on buildings is far slower than the refresh cycle on consumer technology, and smart features may seek to solve problems where analogue means would suffice. Inserting technology into buildings changes user expectations, and should only be done where there is a reasonable expectation that it can be kept relevant. There is a risk that by harnessing fast-moving technology to slow-moving infrastructure we unwittingly reduce the life cycle of our assets. Facilities managers working to tight budgets can neglect components that require frequent upgrades or replacement, for example air duct filters. Cash-rich construction projects that embed smart technologies into assets need to help ensure that:

(A) Those that operate the assets have the training, skills, and funding required to keep the smart technology operational.
(B) It is possible to update the firmware of the smart technologies to avoid redundancy and security risks.
(C) The smart technology is not embedded in the asset in such a way that makes it prohibitively expensive to replace with new equipment in the future.
(D) Where the smart technology does fail, it doesn't do so in a way that locks users out of the functionality of the assets (e.g., we can still use the meeting room even when the room booking system is down, or the road even when the smart motorway equipment is out of order).

Few people are complaining that their Victorian terraced houses aren't high tech enough; is the same true of offices? Anyone who has worked in an office during the age of videoconferencing will know that even the most lavish and modern equipment can be temperamental and require on-going support for the average worker to use. In fact, it is often the most sophisticated and expensive set-ups that are hardest for the average employee to make reliable use of.


As Greenfield writes, "in the end, the problem with overspecification is simply that it leads to a curiously static conception of the future: once we get there, we stay there." As data professionals we have the opportunity to look beyond gizmos to the underlying data feeds that our buildings and places produce. If we specify these with the future in mind, and without becoming creepy, then there is a good chance that we will be able to continue to make use of that data to mediate and improve the user experience of our assets.

We must also be mindful of the dark side of data collection and 'smart' spaces. What would it mean, in practice, for a city to make use of 90% of its residents' information? A good person to ask might be Joseph Bradley, Head of Technology and Digital at Neom, who in 2021 promised journalists that "where smart cities currently make use of about 1% of available data, Neom will be injected with 90% of the communities' information" (via ZDNet). It's tempting to dismiss statements like this as mere hype. But it's worth evaluating such ideas at face value, because they reveal the different values that people bring to the 'Smart Cities' concept.

Most of us, at this point, have moved beyond the naive idea that having third parties use as much of our data as possible is a good thing. Cambridge Analytica, creepy advertising, and cyber fraud have encouraged us to hide our virtual identities behind privacy controls and VPNs. We are careful with our data trail not because we have much to hide, but because we want to avoid that pernicious sense of manipulation: by social media, by online retailers, by fraudsters. Do we want our urban environment manipulating us as well? "You were late to work today, Ian, perhaps you need an electric bike so that you can get there faster? Perhaps you would like to subscribe to our cycling plus service so that it allows you to use the premium cycle lanes?"

Of course, context matters. The city that Bradley is talking about isn't just any city. It's a high-tech dystopia bankrolled by the despotic rulers of a country with one of the worst human rights records in the world. What would it mean for a city to use 90% of its residents' data in a country where political protest is illegal, and where the government routinely uses torture and capital punishment to suppress dissent? How do I feel about my city always tracking my whereabouts when there is no right to free speech or free assembly? How does my perspective on data collection change if I am a woman, or a foreign labourer, or if I am gay, and I live in a city where freedom varies so profoundly by gender and visa status, where sex outside of marriage is illegal, and where LGBT+ relationships are prohibited?

As Data for the Public Good pointed out back in 2017, "data protection is fundamental to the development and successful deployment of smart city models and functions (Hayes et al., 2017)." If the leading adopters of smart technologies turn out to be autocratic nation states, and the word 'smart' becomes a euphemism for surveillance, then this warning might read retrospectively as something of an understatement.

There are many ways to make use of people's data. We would all love to live somewhere that anticipates our needs, keeps our fridge stocked, our children entertained, and summons personalised autonomous vehicles to hurry us between destinations without us even asking (or, you know, at least a nice bicycle). Unfortunately, a city that knows exactly where you are and what you are doing at all times requires an immense amount of trust and rigid safeguards, is probably rigid and over-specified, and creates a massive cyber security vulnerability. A smart city is, primarily, a political means of control. Be incredibly careful what you wish for, and don't believe people who imply that using more data and technology is always a good thing.

6.8 Digital Rehearsal

Things fall apart; the centre cannot hold; Mere anarchy is loosed upon the world… And what rough beast, its hour come round at last, Slouches towards Bethlehem to be born? —The Second Coming, Yeats

Digital Rehearsals are another expression of the 'multi-dimensional' BIM 3D design model, albeit one that has become a hype term in its own right. The bottom line up top here: let's put the multiple dimensions of models to bed. The physics department is not happy that the cost consultant has declared that the 5th dimension is money. We must keep in mind that these (insert your favourite dimension of BIM here) abstractions are unhelpful and confusing for the layperson. They also create opportunities for vendors to perpetually find another 'D' to sell, confusing stakeholders in corporate and project leadership roles (the folks that ultimately sign off the business case to buy the tooling that you actually need to do your work). Arguing over what is the 8th dimension of a 'BIM model' is a waste of time and effort. We have 3D models, we have time, and the rest is metadata. That is all. It is far more productive to wrap everything together under the banner of Digital Rehearsal and crack on with building good quality social and economic infrastructure.

Working on the built environment brings us face-to-face with the laws of physics, and with our own limitations. Time is linear, the entropy of systems increases, things fall apart, and once broken can be difficult, expensive, or impossible to fix. The early days of a project are often characterised by a profound desire to make something happen. Careers and reputations depend on getting a green light to break ground. Costs, schedules, and specifications are all romanced until they are politically palatable, where the numbers are 'right' in the sense that they will obtain approval, even if they are no longer accurate or feasible. And then, like Yeats's rough beast, reality slouches back home: mistakes happen, risks materialise, optimism bias unwinds, the unknown unknowns make themselves known, and the chickens come home to roost. At the start of a project we wonder at the possibilities in our grasp; at the end we wonder what we could have done differently (and then we don't tell anyone, because we work in a sector where few people ever admit their failures, your authors included!).


The relative permanence of the built environment is part of what makes working on it inspiring, and in some weird, sentimental way an honour. The (relative) irreversibility of changes to the built environment is part of what makes it terrifying. This terror probably goes a long way towards explaining the sector's obsession with planning. Augmenting an organisation's real (or perceived) ability to plan is a strong benefit for investments in digital technology. Enter Digital Rehearsal, an evolution of BIM software that, as Mott MacDonald describes it, "enables work sequences to be planned, tested, revised and perfected in virtual reality before being put into practice (MacDonald, 2020)."

Constraints define construction projects. Moreover, constraints are not always bad. Keeping the built environment running is a far-reaching and endless task that employs millions of people world-wide. Subdividing that unknowably large task into distinct organisations and projects with defined scopes and responsibilities makes work more tangible, deliverable, and finite. In agile terms, we need a 'definition of done'. Built environment projects inherit their constraints early in their life cycle: long before they pour any concrete, projects have a budget, a timeframe, and a set of deliverables. Estimation of such constraints is as much an art as a science, and usually involves compromise and 'value engineering' to be palatable enough to win approval. Consequently, projects often grate against their constraints, particularly as their scope inevitably evolves through exposure to reality.7 In this context, Digital Rehearsal is about optimising work around a given set of constraints. As we have already touched upon, such constraints are likely to include:

Space: projects usually take place upon a defined 'site' or sequence of sites; work within the site is part of the project, work outside the site is not. The physical attributes of the space, and the existing assets within and neighbouring that space, constrain what you can accomplish within it. All other things being equal, the more constrained a site, and the more valuable the stuff sitting on its periphery, the slower and more expensive a project tends to become. This is doubly true in cases where part of the site remains open during the works. Kings Cross Station, for example, remained in service to hundreds of thousands of customers pretty much throughout the period of its renovation. When space is at a premium the logistics of moving materials, plant, and assets to and from site becomes a vital challenge. In this context, BIM models are important not just as a representation of the works, but as a means to coordinate 'fitting' the works into the site. Doing so requires a 'fourth dimension' to the model, time, allowing users to visualise the sequencing of events and the manoeuvring of materials and plant around tight spaces. This level of functionality in turn requires both specialist visualisation capabilities over and above traditional BIM software, and the data to make those visualisations meaningful. There's no point visualising the movement of a piling rig or beam on site if you don't know their dimensions, for example.

7 The exceptions that prove this rule are the vanity projects of sovereign wealth funds and autocratic countries which, in the absence of constraints, tend towards lunacy.
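To make the 'fourth dimension' concrete, here is a minimal sketch, in Python with invented field names, of the kind of record a 4D model attaches to each element or resource: geometry, a time window, and everything else as metadata rather than another 'D'.

```python
from dataclasses import dataclass, field

@dataclass
class ScheduledElement:
    """A '4D' view of a model element: geometry, a time window, and metadata.

    Everything beyond shape and time (cost, carbon, supplier, and so on) is
    simply metadata hung off the element, not another 'dimension' of BIM.
    """
    element_id: str
    bounding_box: tuple[float, float, float]  # simplified envelope (x, y, z) in metres
    start_day: int                            # first day on site
    end_day: int                              # last day on site
    metadata: dict = field(default_factory=dict)

# A hypothetical crane occupying part of the site for weeks 1-6.
crane = ScheduledElement(
    "crane-01", (10.0, 10.0, 45.0), start_day=5, end_day=30,
    metadata={"supplier": "ExampleLift Ltd", "lead_time_days": 20},
)
```

With records like this, 'fitting' the works into the site in space and time becomes a data query rather than a debate.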


Time/Schedule: most built environment projects occur with an end use in mind. These can be grand events, such as the London Olympics, external factors such as increased demand for transport or bandwidth, or simply the need to return assets into service on Monday morning. In the UK, every Christmas and Easter the mainline rail operator Network Rail manages to squeeze a near-miraculous amount of work into the two or three days during which the network lies dormant. Returning the asset into use to ferry commuters to work can ensure that the operator avoids millions of pounds in fines (not to mention a flurry of complaints on social media). So almost every built environment project works to some deadline or set of milestones, and realising those milestones means coordinating a complicated set of interdependent tasks and specialisms. Time overruns (and resulting cost overruns) often occur. The handover between different tasks, where a delay to one task has a cascading impact on the sequence of tasks that depend on it, can give even small delays a long tail of disruption. Delays result in idle resources, abortive site visits, and, in the worst-case scenarios, abandoned work. Just as we can use digital representations of our works to coordinate the movement of resources in space, we can do the same to better plan the sequencing of works across time. The sector has a visual language of time typified by the ubiquitous Gantt chart, but increasingly Digital Rehearsal visualisation tools can bring the sequencing of work to life in a way that makes it easier to visualise, plan, coordinate, and de-risk works by breaking out of the chart and into a dynamic 3D representation.

Scope: every project exists to deliver a finite set of changes (physical or otherwise) to a finite set of assets. This is true of the largest megaproject, down to a lightbulb replacement. Work outside of scope is not part of the project. Finite scopes help us understand what we should, and should not, plan for. We can then subdivide our scope into defined tasks, assign those tasks to roles, and in doing so create responsibilities. Sounds simple, right? However, breaking down complex, interconnected works into mutually exclusive, deliverable tasks requires a deep understanding of the construction process. By and large this work is still done almost exclusively by highly skilled, and well-compensated, construction programme managers and planners. A megaproject lives and dies by its Work Breakdown Structure (WBS), and this is an incredibly important data structure that rarely falls under the purview of any CDO or data team. Increasingly, however, construction technology solutions are muscling into this space. While we are unlikely to delegate sign-off of our construction schedules to an AI in the near future, there are lots of parts of the work that algorithms can help with, from sense-checking programmes to make sure that they are not asking teams to be in two places at once (or two teams to be in the same place at the same time), to simulating a range of different construction sequences to recommend where the most efficient options are likely to be found; a minimal sketch of such a sense-check follows. Again, the more we rely on digital solutions to guide our decision making, the more important it will become to capture the right granularity and accuracy of data to inform those decisions. Our historical reliance on heuristics has allowed us to fill in data gaps with judgement, but this approach may prove misleading the more we rely on algorithms.
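The sense-check mentioned above is simple enough to sketch. The following hedged Python example, with invented task, team, and zone names, flags the two cases called out in the text: one team in two places at once, and two teams in the same place at the same time.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    name: str
    team: str
    zone: str   # where on site the task happens (hypothetical zoning)
    start: int  # day number, inclusive
    end: int    # day number, exclusive

def overlaps(a: Task, b: Task) -> bool:
    """True if two tasks overlap in time."""
    return a.start < b.end and b.start < a.end

def logistical_clashes(tasks: list[Task]) -> list[tuple[Task, Task]]:
    """Pairs of tasks where one team is in two places at once,
    or two teams occupy the same zone at the same time."""
    clashes = []
    for i, a in enumerate(tasks):
        for b in tasks[i + 1:]:
            if overlaps(a, b) and (a.team == b.team or a.zone == b.zone):
                clashes.append((a, b))
    return clashes

plan = [
    Task("pour slab", "team-a", "zone-1", start=1, end=5),
    Task("fix rebar", "team-a", "zone-2", start=3, end=6),   # team-a double-booked
    Task("scaffold", "team-b", "zone-1", start=2, end=4),    # zone-1 double-occupied
]
for a, b in logistical_clashes(plan):
    print(f"Clash: {a.name} vs {b.name}")
```

Real planning engines do far more (resource levelling, access routes, plant movements), but even this naive pairwise check captures the category of error that Digital Rehearsal is designed to surface before anyone reaches site.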


Budget: ahh money, ever the true arbiter of what is and is not possible. Let's face it, budget is the constraint that is most likely to get you into trouble. It will figure prominently in the headlines if your project goes skew-whiff. The media, members of the public, and politicians may not care terribly if your project takes a little bit longer than planned to complete, but the moment it overspends, expect the headlines and the half-baked opinions to flow. Your work, big or small, will undoubtedly have an agreed budget envelope from an agreed funding source. That figure may well have etched itself into your memory. Any digital solution that can meaningfully reduce costs, or reduce the risk of overspend, stands a decent chance of receiving funding. Thankfully, expenditure is a bit of a lagging indicator. If we can use quality data and techniques such as Digital Rehearsal to better manage the constraints described above (physical movement in space, scheduling and sequencing, breaking down scope into meaningful tasks), then there is a fair chance that our expenditure will reduce commensurately. Budget and expenditure are not the same thing: one is a prediction (or a limit), the other is an event that arises. Data and digital should also help to inform the creation of budgets upfront so that they are less likely to be completely inaccurate when tested against reality.

Quality: the timeless adage in construction goes "cost, time, quality: pick two." Of those three, quality is clearly the most subjective. One might argue that it's hard to define quality, but you know it when you see it. Often we assess compliance with standards (internally defined, or from organisations such as BSI) as a proxy for quality. Standards place a wide range of constraints on a project, from delivery method to how the product functions to what is built. Some quality checks can be automated, for example the spacing of fire protection equipment; many others will remain subject to human judgement for the foreseeable future.

Resources: much earlier in this book we talked about the sheer variety of feudal specialisms that exist across the built environment. This variety means that projects aren't merely constrained by the sheer volume of labour, but also by the availability of specific skills. You may have all the operatives you could ever ask for, but if you are missing that crucial Certified Widget Wrangler at the vital moment on your project then there may be nothing practically or legally that you can do to avoid delay. The same will be true of certain types of plant, particularly if you are doing something difficult like digging a tunnel or piling in a crowded urban environment. Consequently, resource management is an important part of the logistics of built environment projects, and another area where the ability to plan, manage, and forecast the availability of human and material resources using digital solutions is crucial.

Contracts: related to the item above, built environment clients contract out a large part of the work that takes place on sizeable projects to external organisations. This can be to secure both quantity and specialism of resources, and it allows client organisations to de-risk delivery. In this context, contracts are the agreement that an organisation will employ a group of people to perform a set of tasks on the project's behalf. Organisations performing tasks without a contract are not part of the project (no matter what their marketing material might tell you). Contracts place constraints on projects not merely financially, but by requiring client organisations to commit to a certain specification of work in advance. The degree to which a contract has fixed scope varies with the type of contract, but even flexible 'time and materials' contracts may include assumptions that are expensive to change. This can be the case for data sharing, for example, where contractors may only need to hand over the data that the writers of the contract specified.

Having defined our constraints, we can then seek to optimise within them. In an economic sense we are trying to maximise the utility of the project for a given set of constraints. Written as a formula, the 'project utility' function might look something like this… (Fig. 6.16).

Fig. 6.16 Project utility function

The case for Digital Rehearsal is, broadly, that it is easier to maximise project utility by rehearsing the construction process prior to going to site. In the same way that BIM models allow us to identify and resolve physical 'clashes' in the design before going to site (for example, two pipes occupying the same space), Digital Rehearsal could allow us to identify and resolve logistical clashes in the sequencing of our work (for example, two teams scheduled to work on the same issue simultaneously, or a plant movement that isn't physically possible). Jonathan Munkley, Digital Director at WSP, describes how the application of Digital Rehearsal to the construction of data centres allowed his team to codify and test heuristics that previously lived in the heads of a few prized experts. By modelling the movements of plant such as heavy lifting equipment through space and time, his team were able to correct for pinch points in the programme. Jonathan says that one of the unexpected benefits of visualising these movements in a 4D model was that it helped construction teams to discuss and de-risk their work in advance of going to site. By leveraging visualisation, the models provided a focal point that stimulated discussion about risks and helped teams to intuitively understand their work.
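For readers who like to see the idea written down, one illustrative (not authoritative) way to express the 'project utility' function invoked above is as a constrained optimisation; Fig. 6.16 gives the authors' own rendering, and the constraint set below simply restates the constraints enumerated in this section.

```latex
% Choose the plan that maximises utility U subject to each constraint c_i.
\max_{\mathrm{plan}} \; U(\mathrm{plan})
\quad \text{subject to} \quad
g_i(\mathrm{plan}) \le c_i,
\quad i \in \{\text{space, time, scope, budget, quality, resources, contracts}\}
```

Digital Rehearsal, on this reading, is a cheap way of evaluating many candidate plans against the constraint set before committing any of them to concrete.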

6.8.1 Parametric and Generative Design

Now it would be very remarkable if any system existing in the real world could be exactly represented by any simple model... "Is the model true?" If "truth" is to be the "whole truth" the answer must be "No". The only question of interest is "Is the model illuminating and useful?"—George Box


Parametric design and generative design are not the same thing. They are closely related, but they differ in how they are deployed in technology and in how they drive a design outcome. Parametric design is user-input driven and is well characterised by the family function found in Autodesk's Revit. A designer sets up the required parameters and the rules for how those parameter sets behave (for a curtain wall system, say), and the parametric system provides a single output. It is merely creating an answer to a single equation, much like trying to find the trough of a curve by manually pumping different parameters into the line equation. Finding the optimum is laborious. Generative design is a different game. Much like using calculus to find the attributes of a curve, it deploys algorithms to find form optimally. The approach originated in the world of design scripting and Bentley's Generative Components; with today's AI capabilities, the designer's role is less about parametrically fumbling towards a best-guess optimal solution. The designer is now presented with a plethora of possible solutions to fine tune.

Almost all designs follow a set of parameters, or rules. Those rules might be set by design standards and/or reflect what projects can accomplish with a given quantity of resources and time. This means that creating digital plans for our works can allow us to accelerate design and delivery by automating the application of some or all of these parameters to our work. Anyone who has taken a long car journey lately will have noticed that roads are, for the most part, quite repetitive. There are, of course, complexities in designing road networks: junctions, drainage, bridges, earthworks. But there are also miles of highways that fly by with metronomic pulses of signs and gantries. It was this realisation, that large parts of highway design follow a defined set of patterns, that led National Highways to develop their Rapid Engineering Model (REM). By codifying the 'Design Manual for Roads and Bridges (DMRB)' as a set of logical rules, REM allowed National Highways to automate the early design of motorways by reducing the need for the manual addition of repetitive assets to the model. Instead of paying a designer to add gantries every few hundred metres, why not do so algorithmically? This frees up human time to review and adjust the models to reflect the parts of the design that require careful expert judgement.

The example of REM demonstrates the application of human-generated standards to a project. But it is also possible, with the right kind of input data and heuristics, to determine what is and is not practically possible, or optimal, within the resources and time available. Construction simulation software created by companies such as Alice Technologies combines design models, programme plans, and a logic engine to allow the rules-based simulation and optimisation of construction projects. For example, Alice takes the design of a high-rise building, breaks it down into its component parts (foundation, steelwork, formwork, M&E, fit-out), reflects the logical sequencing of those activities (e.g., foundation comes first, fit-out comes last), and associates resources and productivity rates with each task. This then allows Monte Carlo analysis of the sequencing of tasks and resources to identify possible efficiencies in the plan. By running these simulations prior to commencing the work, we can help to reassure ourselves that (a) we are not working in an inherently inefficient manner, and (b) we are able to identify and resolve potential bottlenecks in how we approach the work.

Parametric design is an emerging approach, one which has the potential to massively reduce the time and cost required to design physical assets. However, the application of parametric design to refurbishing existing buildings or designing more complex systems is as yet unproven. Some types of infrastructure are characterised by clear standards and repeatable elements; others are not. This does not mean that parametric design has nothing to offer heritage assets, or complex M&E projects, or signalling systems, etc., but it does mean that the techniques employed need to mature before they can deliver value to projects of this kind.

Digital Rehearsal and automated parametric design are applications of modelling. Every Digital Rehearsal will necessarily be a simplification of the real world. Digital Rehearsal will only add value subject to the following constraints:

• The problem statement that you are trying to answer is sufficiently well defined that you can create a useful abstraction of the real world.
• Your model of the physical and organisational logic of your project is sufficiently true to real world conditions to be instructive.

The data centre example described earlier meets these constraints. The focus of the model was on the movement of plant on the construction site. The focus on plant movements, combined with a reasonably developed design model of the planned construction, meant that the model could usefully inform construction without needing to include all of the other complexities of the project (for example, ground conditions, weather, contracting strategy, financial constraints, etc.). However, if the team building the data centre did not have a design model, or lacked a specification of the plant in question, then the Digital Rehearsal would not have added value.

Whilst the data requirements of a Digital Rehearsal will vary depending on the use case, there is a minimum level of information required for Digital Rehearsals to add value. In other words, the conditions that impact the works need to be predictable. Crucial data sources may include:

• A design model of the project, its assets, and the space that they occupy, as well as constraints concerning the site (for example, entrances, exits, routes for delivery, transport links, physical boundaries). This is likely to come from your BIM CDE.
• Any physical limitations of the asset, for example doorway sizes, loading constraints, aesthetics, heritage assets, or other site uses.
• Any timing limitations, for example working hours, noise limits, delivery times, or traffic management.
• Constraints concerning the sequencing of the project: where it is possible to reorder tasks, and where it is not. This is likely to come from your project plan (e.g., Primavera P6) plus some codification of planners' heuristics.


• The resources required for different tasks: the specification of human resources, plant, and materials, including relative attributes like power consumption, shifts, noise levels, and lead times. This needs to include their productivity, in other words what the resources can accomplish within a given amount of time.

As this list shows, even a limited Digital Rehearsal will require a project to collect, cleanse, master, manage, make accessible, and manipulate a considerable range of information sources. To be perfectly honest, this is more than most projects reliably capture at this time, meaning that any investment in Digital Rehearsal will require a proportionate investment in data quality and data management. Where these data sources are incomplete, or of inadequate quality, any application of Digital Rehearsal risks being immediately undermined. As such, Digital Rehearsal as a capability depends upon a project first developing some of the other applications and capabilities covered in this book and in publications like the DAMA DMBOK2 and ISO 19650, including BIM, Information and Records Management, Data Modelling, and Data Management. Relevant use cases for Digital Rehearsal include:

• The movement of suppliers, teams, materials, and plant on site, as well as the logistics of their storage on and off site, and their delivery to site.
• The sequencing of activities on site, particularly activities such as M&E/services where there are likely to be both access limitations and dependencies on access to space and availability of specialist resources.
• Where projects are working around other operational assets (for example, refurbishments to operational train stations like Kings Cross), the coordination of activities such that vital services remain available and uninterrupted.
• The identification of risks and hazards on site, and their mitigation, as well as plans in the case of emergency.
• The rehearsal of complex or intricate works, for example the movement and installation of large components manufactured off site.

Your authors believe that the case for Digital Rehearsal is sufficiently compelling that realising this capability should form part of the justification of enabling activities including Digital Construction/BIM, Data Modelling, and Data Management. However, it is a complex capability, and one that may well be out of the reach of less digitally and organisationally mature organisations. Digital Rehearsal is not solely a technological or data challenge. It is a capability that client organisations must deploy in a manner that allows staff to make use of it and trust the results. This means that deriving value from Digital Rehearsal will mean understanding how to integrate it into our ways of working. Integration may include the use of visualisation tools in how we plan our sequencing of work, test the logic of our programmes, and brief our site operatives. Digital Construction is an inherently cross-silo activity; success will require cooperation, co-creation, and data sharing across functions such as Engineering, Design, PMO, IT, Asset Management, Operations, and the supply chain.
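To make the simulation idea tangible, here is a deliberately crude sketch in the spirit of the Alice-style Monte Carlo analysis described earlier. The task names and duration ranges are invented; a real engine would also permute sequencing and resources rather than sampling durations for a fixed linear plan.

```python
import random

def p80_duration(tasks: dict[str, tuple[float, float]], runs: int = 10_000) -> float:
    """Monte Carlo estimate of total duration for a linear sequence of tasks,
    each with an (optimistic, pessimistic) duration range in days.

    Returns the P80 duration: 80% of simulated runs finish by this point.
    """
    totals = []
    for _ in range(runs):
        totals.append(sum(random.uniform(lo, hi) for lo, hi in tasks.values()))
    totals.sort()
    return totals[int(0.8 * runs)]

plan = {"foundation": (20, 35), "steelwork": (30, 50), "fit-out": (40, 70)}
print(f"P80 duration: {p80_duration(plan):.0f} days")
```

Even this toy version makes the underlying point: the quality of the answer depends entirely on the quality of the duration and dependency data fed into it.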


It is worth noting that Digital Rehearsal is a relatively immature field, with a small number of solutions, few standards or common definitions, and a lot of development left to do. Investment in Digital Rehearsal currently commits a client to a fair amount of time and resources, and generally to a highly manual process of extracting data from point solutions and integrating it into a specialist visualisation platform. However, we expect major AEC vendors such as Bentley, Autodesk, and others to invest heavily in this space in the coming years as a precursor to Digital Twins (following section). The built environment sector must work with vendors to prototype functionality, and to understand the data specification and capture required to power that functionality. This will help to ensure that once both the technology and your organisation hit the right level of maturity, you are able to take advantage of it.

There is a health warning to conclude with here. Design philosophy will be shaken by the dynamics between parametric and generative design. The example above of REM is a parametric implementation of existing standards. It will only deploy according to the input rule sets. It cannot 'think' beyond the constraints of the input standards. As you can imagine, the standards for highway design are based on general rules of thumb and the outputs of incremental human learning over time. Generative design will enable a bespoke approach down to the molecular performance of the steel. A generatively designed gantry will be an optimum and unique shape compared to the parametric gantry. Take a moment to surf images of generative chair designs. You will see an organic-looking structure to these chairs, and a purist implementation of generative design will see such organic structures applied to any structure that exploits its surroundings and the unique properties of its construction material, which would be some form of 3D-printed metal or concrete. From a data management perspective, parametric and generative design are very similar in nature in terms of managing data quality and flow. Both require very tight definitions and controls of parameters, such as units and scales. Any imprecision in the data, its units, or its scaling will spell disaster for the desired product. This is the life-and-death nature of our future industry as we move away from over-engineered rules of thumb (born of safety factors and general uncertainties).

It is worth mentioning the economics of modern methods of construction, speaking from the experience of your authors, who have worked on, designed, and constructed design-for-manufacturing-and-assembly/modern-methods-of-construction projects. You must be aware, as a data professional in these areas, how important your role is. It costs more money to get it wrong in real life on the building site than to get it wrong in cyberspace. The economics of Modern Methods of Construction (MMC), or off-site manufacturing, mean a more aggressive spend rate at the start of the project, as you tend to do more detailed work up-front. To flow data into a manufacturing process, you must have produced design at the right level of detail. Machines do not deal in ambiguity; they do precisely as you tell them. There is no 'working it out on site' when it comes to building assemblies off-site in a factory, or even when deploying a flying factory on-site.
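To illustrate the kind of rigour involved, here is a small, hypothetical sketch of unit discipline on data bound for a factory. The units table, component fields, and values are all invented; the point is the behaviour of failing loudly rather than guessing.

```python
from dataclasses import dataclass

# Every dimension carries an explicit unit and is normalised to millimetres
# before release to the factory. Machines do not deal in ambiguity.
ALLOWED_UNITS = {"mm": 1.0, "m": 1000.0}

@dataclass(frozen=True)
class Dimension:
    value: float
    unit: str

    def to_mm(self) -> float:
        if self.unit not in ALLOWED_UNITS:
            raise ValueError(f"Unknown unit {self.unit!r}: refusing to guess")
        return self.value * ALLOWED_UNITS[self.unit]

def release_to_factory(panel: dict[str, Dimension]) -> dict[str, float]:
    """Normalise a panel specification to millimetres, rejecting any
    ambiguity rather than letting imprecision reach the production line."""
    return {name: dim.to_mm() for name, dim in panel.items()}

panel = {"width": Dimension(2.4, "m"), "height": Dimension(2700, "mm")}
print(release_to_factory(panel))  # {'width': 2400.0, 'height': 2700.0}
```

Trivial as it looks, this is exactly the class of control (explicit units, explicit scales, hard failure on ambiguity) that keeps a manufacturing data flow safe.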


The upfront investment in manufacturing, and the difficulty of making subsequent changes, mean that getting it wrong on site is far costlier for MMC projects: the cost of error is 5× to 10× that of traditional methods. This is why your work is vital in ensuring that the data is robustly governed and controlled.

6.9 Digital Twins

The critical centre of the cognitive enterprise should consist of an accessible and simulated digital version of itself and its operating environment.—Richard Carter, Cognitive Advantage

Our relationship with the built environment is almost always characterised by latency (lag) between observation, processing, decision-making, and action. We rarely have access to a real-time picture of our assets, and even when we do there is usually a long planning timeline for responding to emerging issues. Instead, we spend time and money gathering information about the condition of our assets, and then respond by planning 'interventions' out into the future. There are obviously exceptions to this rule: building management systems and SCADA systems, for example. However, for most built assets the lead time between capturing information and acting on it is measured in months, years, or even decades. Often this means that we do not schedule interventions (e.g., to repair, refurbish, renew, or replace assets) at the 'optimum' moment, or that assets fail before intervention, increasing the likelihood of service outages, delays, or even accidents. At the same time, we lack a system-wide, detailed, and accurate view of how we use assets, how they are performing, how they respond to changes in demand patterns, and where there are 'downstream' dependencies between different assets or asset types. Almost all services in the modern built environment require a huge sequence of assets in order to function. A road trip, a train journey, switching on a tap, or boiling a kettle all require this 'system of systems' to function safely, efficiently, and reliably.

In the Smart Buildings section we described how property owners like British Land can use sensors and monitoring to help their buildings respond quickly to occupancy patterns. The logical extension of this approach is modelling how a complex building or campus acts as a system of systems. Walking through the basement or service areas of a modern building makes one acutely aware of the intricacy of the services keeping it operational. Often we lack a map of exactly which users each system serves, the consequences of system failure, which 'upstream' services they are in turn reliant upon, and where we do and do not have resilience. Digital Twins present (in theory at least) an opportunity to use a confluence of digital technologies (IoT, cloud computing, design modelling, simulation, and AI) to help us respond to the state of our physical assets (condition, usage patterns, faults, external events) with greater speed and more predictable outcomes. As the DT Hub's Digital Twin Toolkit articulates, the use cases for Digital Twins include:

a. Strategy & planning: providing context for investment or configuration decisions and modelling the range of possible outcomes.


b. Asset management: modelling how the performance of assets could change depending upon operational parameters such as maintenance levels, environmental conditions, and usage patterns.
c. Assurance: ensuring the safe operation of assets and systems of assets, and potentially other societal outcomes such as sustainability or social equity.

Crucially, these use cases are all scalable, meaning that what you successfully model for a single asset or organisation can create additional value when combined with models for other assets and organisations; a toy 'what if' model in this vein follows the figure below. The promise of Digital Twins is therefore not merely in replicating or evolving existing practices such as transport planning or whole-life cost modelling, but in creating a more holistic understanding of how our built environment works as a system. Intuitively, it is only through such 'systems thinking' approaches that we can truly address problems of the scale and complexity of the climate crisis.

In sitting down to establish where you are with Digital Twins, we recommend starting with the Institution of Engineering and Technology's maturity spectrum (The IET, 2019). The implementation of Digital Twins is not a binary act; it depends on factors found in the IET's Digital Twin Maturity spectrum. It is useful to imagine digital twins as an extension of the internet. If we look at how other sectors have transformed their business into web-enabled services, you can see that there was no single route through that journey. Figure 6.17 is a concept that your authors use to explain that digital twins (in the built environment context) are merely an extension of the internet. The first-generation platform of the internet connected a few million people, mostly through mainframe computers; in the second platform hundreds of millions of people connected through personal computers. In the third platform most of the world connected to the internet through mobile phones, and the fourth platform, which began with the internet of things (things being mostly consumer products), will soon become the internet of the built environment and therefore the platform we need to build real, interconnected digital twins.

Fig. 6.17 Hierarchy of internet platforms over time
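Use case (b) above is the easiest to make concrete. Below is a toy 'what if' simulation of asset condition under different maintenance regimes; the decay and uplift constants are invented for illustration, where a real twin would fit them from telemetry and works-order history.

```python
def condition_forecast(initial: float, years: int, decay_per_year: float,
                       maintenance_uplift: float, interventions: set[int]) -> list[float]:
    """Toy 'what if' model: asset condition (0-100) decays each year and
    recovers when a maintenance intervention is scheduled."""
    condition, path = initial, []
    for year in range(1, years + 1):
        condition -= decay_per_year
        if year in interventions:
            condition = min(100.0, condition + maintenance_uplift)
        path.append(round(condition, 1))
    return path

# Compare two maintenance regimes for the same (hypothetical) asset.
print(condition_forecast(90, 10, 6.0, 15.0, interventions={4, 8}))
print(condition_forecast(90, 10, 6.0, 15.0, interventions=set()))
```

The value of a twin lies in asking this question across thousands of interdependent assets at once, rather than one spreadsheet at a time.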


"But NASA doesn't need the internet to build digital twins?" you may ask. For localised, small models (small in comparison to the entirety of the built environment), that is true. But if you break down the key ingredients needed to enable an interconnected set of digital twins at the scale of our built environment, networks are required, and our internet of things is the best foundation for them. For the built environment data professional, we need to steer you towards thinking at the level of interconnected systems of systems. We must also avoid the pitfall of a tension between BIM and Digital Twins: these are means, and the ends are the same. The focus needs to be on clarity of purpose and on how we can build upon existing technological platforms and paradigms.

6.9.1 Smart Buildings Versus Digital Twins

There is inevitably some overlap between the concept of a Smart Building and the concept of a Digital Twin. Both rely upon minimising the latency in the collection of information about an asset using IoT, sensors, monitoring, and cloud computing. Earlier we defined a Smart Building as follows:

A Smart Building is one for which there is sufficient IoT and data infrastructure for operators to understand and respond to the performance of the asset in near real-time.

Our definition of a Digital Twin is as follows:

A Digital Twin is a digital representation of a physical thing (and the logic of its operation) that one can query.

Our definition aligns with the more technical definition of a Digital Twin from the Digital Twin Toolkit (Hayes, 2021), which covers three elements:

1. "A computer model as a digital representation of a physical asset"
2. "…where the physical asset may exist or may not yet exist."
3. With "A dynamic data connection between the physical twin and digital twin."

You can see from these definitions that the Digital Twin concept is broader than the Smart Building concept. One can have a Digital Twin of a rock, but a rock cannot become a Smart Building. So, is a Smart Building the same as a Digital Twin of a building? In a sense, yes. Or perhaps it's more accurate to say that a Smart Building, and a Digital Twin of the same building, will share a lot of common infrastructure. It would not make sense, for example, to have one set of IoT for the Smart Building and another for the Digital Twin; the same is true for much of the data management infrastructure. As Fig. 6.18 sets out, all Smart Buildings are Digital Twins, but not all Digital Twins are Smart Buildings. Digital Twin use cases extend beyond the efficient operation of assets into the broader processes whereby organisations plan what assets they need, how they will build them, and how their operation fits with the broader organisational context and strategy.


Fig. 6.18 An illustration of capabilities in a building context
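To ground the idea of a twin as 'a digital representation that one can query', here is a minimal, hypothetical sketch in Python. The asset ID, telemetry fields, and vibration threshold are all invented; the point is the pairing of a dynamic data connection (update) with queryable logic (query).

```python
class DigitalTwin:
    """A digital representation of a physical thing that one can query.

    The 'dynamic data connection' is reduced here to an update() call fed
    by whatever telemetry the physical twin produces.
    """

    def __init__(self, asset_id: str, logic: dict):
        self.asset_id = asset_id
        self.state: dict = {}
        self.logic = logic  # e.g. thresholds encoding how the asset behaves

    def update(self, telemetry: dict) -> None:
        """Apply the latest readings from the physical asset."""
        self.state.update(telemetry)

    def query(self, question: str):
        """Answer questions about the asset using its state and its logic."""
        if question == "needs_maintenance":
            return self.state.get("vibration_mm_s", 0) > self.logic["vibration_limit"]
        raise NotImplementedError(question)

pump = DigitalTwin("pump-07", logic={"vibration_limit": 4.5})
pump.update({"vibration_mm_s": 5.1, "temperature_c": 61})
print(pump.query("needs_maintenance"))  # True
```

A rock could be given exactly this treatment; a Smart Building is simply the case where the asset, the telemetry, and the operational response all belong to one building.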

6.9.2 Industrial Digital Twins

To put it another way: built environment professionals were working in ignorant bliss towards a digital future for the sector using BIM as a foundation. Then, all of a sudden, the manufacturing sector took a sideswipe and threw Digital Twins into the mix. Closer integration with the manufacturing sector is key, and a stream of construction reports, starting with Simon in 1944 and running through Latham in 1994, Egan in 1998, and more recently Wolstenholme in 2009 and Farmer in 2019 (BIMonomics, 2015), all point to a manufacturing-led approach as the way to realise better outcomes.

In 2018 Boeing proposed a Model Based Engineering (MBE) Diamond that attempted to pull together systems engineering (and the Vee Model) with a lifecycle view of digital twin delivery and use. One of your authors, working within a major construction firm around 2012, witnessed the application of systems engineering first hand in the built environment context. Dr Chris Millard, who joined the construction sector from the automotive industry, set out a clear vision of how manufacturing and BIM can align (Fig. 6.19). The main goal was 'to build the plan and then to build to that plan', ultimately enabling Balfour Beatty to be as predictable in construction as leading firms in the aviation and automotive industries. We know, 10 years on, that this vision is far from a business-as-usual reality.

Fig. 6.19 The BIG BIM opportunity circa 2012

Your author wanted to continue this tradition of pulling the built environment and manufacturing together and, when setting out the vision of how Digital Twins and the built environment combine, created the diagram below. Taking inspiration from the Vee, Boeing's Diamond, and previous experience of the Balfour Beatty systems engineering approach, we have the Digital Twin Ripple (Fig. 6.20).

Fig. 6.20 The Digital Twin Ripple

The overall lifecycle is described as the digital capability to design, test the design, assemble the design, and then integrate it into operations. The key conceptual shift here is the starting point. Many design processes start with a business logic layer, something along the lines of the business case. This process starts with the end in mind: the existing asset. Regardless of whether you have a greenfield or brownfield asset to deliver, there is always an existing condition to integrate with. This diagram is internally known as the 'Digital Twin Ripple' and has been instrumental in structuring delivery in a systems mindset and breaking the problem into the quadrants of:

• Design
• Test
• Assemble
• Integrate

This also puts a particular debate to bed, from our point of view. NASA made digital twins to reduce the cost of design and operations. Digital Twins span the whole lifecycle of an asset; even when most of the asset only exists as a virtual design, it still integrates into an existing and real context. The key digital capability in this framework is the ability to certify, via the use of the twin, that a solution is ready for manufacturing. Remember, it's far cheaper to get it wrong in cyberspace than it is to get it wrong in real life!

6.9.3 From Construction to Operations

Digital Twins require a real-time or near real-time connection to the physical thing that they represent, and should surface functionality that aids the decision-making associated with all or part of the life cycle of that physical thing and/or its operation in the real world. This definition helps us to distinguish between the concept of a Digital Twin and the more established practice of BIM. The key differences are:

• Digital Twins can and should be part of the construction phase, for example as a means to create awareness of project progress, site risks, or logistics. However, the concept really comes into its own with a focus on the operation of existing physical assets (e.g., the 99%+ of assets that are not currently under construction). Within your organisation (and the wider sector), there is often a loss of data capability as projects move from construction to operations. Operators have typically been unable to exploit BIM products to their full potential, in part because vendors rarely design BIM products with operators and asset managers in mind. By designing your construction models as nascent Digital Twins you can define the data and logic required to operate an asset at the start of the life cycle and ensure that the models you create during construction have operational value.

• The emphasis on being able to query Digital Twins is important. A Digital Twin should not be a static representation of an asset; it should reflect the logic of that asset in operation. This means that Digital Twins need to expose not just the material properties of an asset (e.g., location, dimensions, materials, etc.) but also the business logic governing that asset (e.g., how you as the infrastructure owner can intervene on that asset to change how it performs). This will allow Digital Twins to enable better organisational decision-making through simulation and 'what if' scenarios, as well as to act as a store of institutional knowledge on how assets and the organisation work in practice.

• To realise the two points above, the data schema underpinning Digital Twins is necessarily more complex, and more focused on relationships than on properties. BIM data standards such as COBie or Uniclass focus on the hierarchies of assets and their properties (e.g., "span belongs to bridge and is made of steel"). Emergent Digital Twin data models (often in ontology form) capture not just the properties of assets but how they relate to their wider environmental and operational context (e.g., "span is corroded by road salts, damaged by vehicle incursions, is maintained when the flange has 20%+ corrosion, and supports a flow of 50,000 vehicles per day travelling on the M25 as well as a broadband internet cable, and would cause significant safety and KPI impact in the event of failure"). Creating these data models demands the creation and maintenance of a deep 'knowledge graph' of the organisation; a toy sketch of this distinction follows below. This is a new and somewhat untested skill in the built environment context, at least outside of oil & gas.

There is a lot of existing writing on what digital twins are, and on how organisations should design and deploy them. In many cases the dogma and evangelism have both outpaced the technical capability of the sector. Where possible it is worth seeking to align with publications by the Centre for Digital Built Britain's National Digital Twin programme and its successor organisations such as the DT Hub and the Apollo Protocol. This will help to ensure that your investment in Digital Twins remains grounded in a community of practice that is vendor agnostic and ethically coherent.
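The difference between a property-centric schema and a relationship-centric one can be sketched in a few lines. The triples below paraphrase the bridge-span example above; the identifiers and predicates are invented for illustration.

```python
# A knowledge graph stores relationships, not just properties.
# Each fact is a (subject, predicate, object) triple.
triples = [
    ("span-12", "is_part_of", "bridge-M25-044"),
    ("span-12", "made_of", "steel"),
    ("span-12", "corroded_by", "road-salt"),
    ("span-12", "supports", "broadband-cable-9"),
    ("span-12", "maintained_when", "flange-corrosion>20%"),
]

def related(subject: str, graph=triples) -> dict[str, list[str]]:
    """Everything the graph knows about a subject, grouped by relationship."""
    out: dict[str, list[str]] = {}
    for s, predicate, obj in graph:
        if s == subject:
            out.setdefault(predicate, []).append(obj)
    return out

print(related("span-12"))
```

A hierarchy-and-properties standard answers "what is this asset?"; a graph like this also answers "what happens to everything else if it fails?", which is the question operational twins exist to answer.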

6.9.4 Architecture

The use of Digital Twins will likely vary substantially across large built environment organisations. The asset life cycle (e.g., plan, design, build, operate/maintain, dispose) demands the involvement of a changing cast of teams with different skill sets, ways of working, contractual arrangements, and supply chains. This should not, however, result in disconnected or siloed use of Digital Twins across your organisation. If you end up with several standalone Digital Twins for distinct aspects of your organisation, then you are likely to miss the greater benefit of understanding how your infrastructure behaves throughout its life cycle and across organisational functions. That said, the way that a Digital Twin appears through user interfaces may very well vary across your organisation to reflect different use cases and ways of working. This separation of data and logic from user interface reflects a 'federated' model: one where parts of the business can design and develop Digital Twins to meet their use cases, but which ensures adherence to common data standards and business logic. The diagram below describes, in a highly abstract sense, the common 'schema' and 'data' layers that could support federated Digital Twins in the application layer. Key to this architecture is the use of open platforms wherever possible, whether those are open data standards, open-source tools, or solutions shared with other Digital Twin owners. We are most likely to realise the benefits of Digital Twins where we avoid delivery of those twins becoming entirely dependent upon proprietary solutions and walled gardens. We should learn lessons from the successes and failures of BIM in this respect. Ultimately, digitally mature organisations should want to own their own destiny in this space, and to remain free to make best use of a wide range of different products as market capability grows (Fig. 6.21).

Fig. 6.21 Components of a federated digital twin architecture

As the Digital Twin Toolkit states, "Digital Twins rely on a base level of information management maturity and connectivity." Developing and deriving value from Digital Twins is one of the least tested and most complex applications of data covered in this book. Consequently, it is likely that any built environment organisation that successfully derives value from Digital Twins will have already developed capability in other areas covered here, including data modelling, data governance, business intelligence, IoT, Digital Construction, and Enterprise Asset Management, as well as extensive business change, organisational development, and user-centred design.
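As a toy illustration of the federated pattern (one shared data and schema layer, separate application-layer views), consider the following sketch. The asset record and field names are invented; the design point is that each function sees its own slice of a single, commonly governed record rather than maintaining its own copy.

```python
# One shared schema/data layer, many application-layer views.
shared_data = {
    "chiller-02": {"type": "chiller", "condition": 0.72,
                   "last_service": "2023-11-02", "power_kw": 18.4},
}

def asset_manager_view(asset_id: str) -> dict:
    """The slice of the common record that asset planning cares about."""
    record = shared_data[asset_id]
    return {"condition": record["condition"], "last_service": record["last_service"]}

def operations_view(asset_id: str) -> dict:
    """The slice that day-to-day operations cares about."""
    record = shared_data[asset_id]
    return {"power_kw": record["power_kw"]}

print(asset_manager_view("chiller-02"))
print(operations_view("chiller-02"))
```

Because both views read from the same governed record, a correction made once (say, to the service date) is immediately true everywhere, which is the whole argument for federation over standalone twins.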

6.9.5 National Digital Twinning

We have referenced the work of Sarah Hayes and the report Data for the Public Good repeatedly throughout this book, as it has functioned as a source of inspiration and informed much of the (more enlightened) discourse of the sector, particularly in the UK, since publication. The paper recommended work that would fall under the remit of the Centre for Digital Built Britain (CDBB), which ran from 2017 to 2022. The National Twinning movement was crystallised by the Gemini Papers (CDBB, 2022), the swansong of the CDBB programme before it closed its doors in March 2022. The Data for the Public Good paper identified the opportunity for the UK to develop "a digital twin of the national infrastructure system" and advocated for a Digital Twin pilot project. During its period of operation, the CDBB led the National Digital Twin Programme (NDTp) and the pilot project CReDo.

The National Digital Twin Programme, during its time under the Centre for Digital Built Britain led by Mark Enzer, took the National Digital Twin idea and turned it into a compelling vision of the built environment as a flourishing system of systems: an ecosystem of connected digital twins. This concept builds upon the pilot project recommendations of Data for the Public Good, where demonstrators or pilot projects are the means of incrementally developing towards a National Digital Twin from the bottom up. The crucial point is that these digital twins must interconnect and share data where there is a reason to do so. Connecting these digital twins requires data sharing infrastructure.

Following the dissolution of the CDBB, elements of this work continued, including CReDo, under the auspices of the Digital Twin Hub (currently managed by the Connected Places Catapult). The National Digital Twin Programme itself continues as a Department for Business and Trade initiative led by Alexandra Luck. A few other large-scale digital twin projects have emerged in the UK focusing on areas of concern (such as interoperability), sub-sectors (such as energy), and locations (for example, the Isle of Wight). The UK is, of course, not alone in proposing or developing digital twins, but along with Singapore it arguably leads the dialogue on large-scale, cross-sectoral Digital Twins. The idea, however, remains hypothetical and unproven. The scale of investment required to realise anything resembling a digital twin at a national level would be considerable, and at present beyond the budget of any one organisation or government department. The appeal of the idea, in a techno-utopian sense, is obvious. As Data for the Public Good states, "there is limited understanding about the way the UK's infrastructure works as a system," and this is hardly a problem exclusive to the UK.

From an insider perspective on the birth of the National Digital Twin programme, it was interesting to see a tension between those who work in the world of 'internet engineering', representing the computer science fraternity, and those in the industry-specific information and data management space. The first task of the technical team in the original National Digital Twin team was to reconcile this tension. That was achieved through the publication of the Pathway towards an Information Management Framework (CDBB, 2020). This is the core of what we need, and you should dedicate any spare time you have to supporting and building a culture of integration between the tech industry and the technical professionals in our sector (not just the information and data management professionals).

6.9.6 Digital Twinning Across Industrial Sectors

The launch of the Apollo Protocol (IET, 2021) was another latent outcome of the CDBB mission. In the final year of its programme, CDBB set out to understand how interoperability between industrial sectors could work (IET, 2022), focusing on the interface between the built environment and manufacturing sectors. The Apollo Protocol mission is to answer the following question: how can the built environment and manufacturing, supported by the technology sector, break the sectoral divide and unlock the benefits of digital twins?


The Apollo team conducted a series of workshops and interviews with key industry figures and found four main areas of focus.8

Value Chains: There is a need to coordinate the value chains of data between the manufacturing and built environment sectors. There are potentially billions of individual value chains, and consolidating them can multiply the value available, generating new markets and business opportunities. The Apollo Protocol could coordinate these value chains between industrial sectors.

Circularity: There are micro circular models for specific services, such as light bulbs and jet engines. There is an opportunity to build a circular model between manufacturing and the built environment, enabling waste from the built environment to become a raw input into manufacturing, and vice versa.

Performance: If you combine a circular business ecosystem with simplified value chains for data, you open up a new capability: the understanding of true performance. Currently the built environment and the manufacturing sector do not digitally communicate, so the performance of manufactured products out in the wild of the built environment is unknown. If we are to take seriously our obligation to save the planet from catastrophe, we must take performance data seriously (a simple sketch of such a feedback loop follows this list).

Human Capital: The built environment and manufacturing sectors are in competition for the same human and material resources. Data scientists, information managers, and data managers are in scarce supply. The sectors must work in a smarter way to develop joint capabilities and ensure that the race for resources is not a zero-sum game. We cannot deliver the previous three (value chains, circularity, and performance) without talented people, and without them there is no way we can understand the true performance of our built environment.
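As a hedged illustration of the 'true performance' point above, the numbers and product below are entirely invented; the sketch only shows the kind of calculation that becomes possible once in-service data flows back from the built environment to the manufacturer.

    # Hypothetical in-service energy readings for an air handling unit,
    # reported back from a building to its manufacturer.
    designed_kwh_per_day = 120.0  # assumed design-stage performance figure
    observed_kwh_per_day = [131.2, 128.7, 140.1, 135.6]  # illustrative readings

    average_observed = sum(observed_kwh_per_day) / len(observed_kwh_per_day)
    gap = (average_observed - designed_kwh_per_day) / designed_kwh_per_day

    print(f"Average observed consumption: {average_observed:.1f} kWh/day")
    print(f"Performance gap versus design: {gap:.1%}")

Without that data flow, the manufacturer never learns that its product runs roughly a tenth over its design figure in the wild, and the performance gap persists unexamined.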

6.10 Conclusion

Making data available and useful through software, platforms, and other established and emerging technologies is the most visible aspect of working with data. It is often how we as data professionals frame personal and organisational success for ourselves. The technologies described in this chapter are all exciting and perhaps even transformational. However, we hope that the rest of this book has demonstrated that the best—and perhaps only—way to derive value from these technologies is to ensure that the precursors to success are in place first, including:

1. Understanding the problems that you want to solve with data and technology.
2. Learning from the thinking and practice of others.
3. Fostering a positive culture around data in your organisation and communicating a common direction.

8 This is your author's own expression of the results as a lead author of the inaugural report.


4. Understanding how your organisation and the individuals within it use data in their work.
5. Deploying the foundational data and project management infrastructure required to successfully procure and deliver digital technologies.
6. Reckoning with the complexity of your task and understanding how your chosen technology will meet that complexity.

The answer to an organisational challenge is rarely BIM or GIS or BI or AI in isolation, but rather the development of an intelligent organisation that understands the tools that are relevant to its purpose and how to deploy them. Many of your challenges as a data professional in the built environment will be ones that other projects, organisations, or sectors have already tackled. On well-worn paths, novelty rarely leads to efficiency, and there is a lot that one can accomplish by being circumspect and 'deliberately boring'. Innovation is as much about borrowing practice from others as it is about reinventing the wheel. As Dr. Joanna White likes to say, "one person's innovation is another person's BAU." Sometimes the best places to look to borrow practice are adjacent sectors, and many of the built environment's recent advances in fields such as asset management or modern methods of construction take inspiration from adjacent sectors such as oil and gas, aviation, and manufacturing. Similarly, many sector-agnostic data techniques, such as those documented in the Data Management Association (DAMA)'s Body of Knowledge (DMBOK), are eminently applicable to built environment organisations.

That said, we know that there are unique sources of complexity in the built environment. We don't yet have a paint-by-numbers approach to delivering value from data, and we won't be able to borrow all our answers from other sectors. With the availability of sensors, cloud computing, AI, and the emergence of viable Digital Rehearsal and Digital Twin platforms, we have the opportunity to care for our assets, and ensure their performance, with a degree of timeliness, precision, and foresight unlike anything we have wielded before. Between the idea and the reality falls the shadow… time to get our torches out.

References

Antonopoulou, S. (2017). BIM for heritage. Historic England. https://historicengland.org.uk/images-books/publications/bim-for-heritage/heag-154-bim-for-heritage/
Barendrech, A., & Barker, J. (2021, December 20). Wired Score. https://wiredscore.com/blog/2021/12/20/smart-building-thought-leadership-series/
BBC News. (2021, November 18). https://www.bbc.co.uk/news/uk-16473296
BIMonomics. (2015, January 25). ...Construction reports, in briefs. https://bimonomics.wordpress.com/2015/01/25/construction-reports-in-briefs/
BIM Interoperability Expert Group. (2020, March). CDBB. https://www.cdbb.cam.ac.uk/files/cih_bim_interoperability_expert_group_report_april_2020_final_wm_removed.pdf
Bowles, J. (2023). Freeform LinkedIn profile. https://www.linkedin.com/in/freeform/recent-activity/all/


Carter, R. (2021). Cognitive advantage: How artificial intelligence is changing the rules for winning in business and government. Mayhill Publishing.
CDBB. (2020). The pathway towards an Information Management Framework—A 'Commons' for digital built Britain. CDBB. https://www.cdbb.cam.ac.uk/news/pathway-towards-IMF
CDBB. (2022, March 10). The Gemini papers. https://www.cdbb.cam.ac.uk/news/gemini-papers
Collections Trust. (2022). Introduction to Spectrum. https://collectionstrust.org.uk/spectrum/spectrum-5/
Dyson, G. (2020). Analogia: The emergence of technology beyond programmable control. Farrar, Straus & Giroux.
EPSRC-NSF. (2023). The role of funding, financing and emerging technologies in delivering and managing infrastructure for the 21st century. EPSRC-NSF.
Ernst & Young. (2017, October 1). https://www.ey.com/en_gl/real-estate-hospitality-construction/smart-buildings-the-four-pillars-of-a-smart-building-strategy
ESRI. (2023). ArcGIS StoryMaps. https://doc.arcgis.com/en/arcgis-storymaps/gallery/
Sydney, F. (2019, May 16). The data science lifecycle. Alteryx. https://community.alteryx.com/t5/Data-Science/The-Data-Science-Lifecycle/ba-p/408625
Ferguson, H. (2020, March). Ingenia. https://www.ingenia.org.uk/ingenia/issue-82/tideway
Greenfield, A. (2013). Against the smart city. Do Projects.
Government Digital Service. (2023). Agile delivery. https://www.gov.uk/service-manual/agile-delivery
Hayes, S. (2021, March 15). Digital Twin toolkit. Digital Twin Hub. https://digitaltwinhub.co.uk/files/file/62-digital-twin-toolkit/
Hayes, S., et al. (2017, July 19). Data for the public good. National Infrastructure Commission. https://nic.org.uk/app/uploads/Data-for-the-Public-Good-NIC-Report.pdf
Horgan, R. (2021, July 8). New Civil Engineer. https://www.newcivilengineer.com/latest/crossrails-funding-black-hole-rises-to-150m-08-07-2021/
IET. (2021, October 20). New Digital Twins project looks at language barriers. https://www.theiet.org/media/press-releases/press-releases-2021/press-releases-2021-october-december/20-october-2021-new-digital-twins-project-looks-at-language-barriers/
IET. (2022). The Apollo Protocol: Unifying digital twins across sectors. https://www.theiet.org/impact-society/factfiles/built-environment-factfiles/the-apollo-protocol-unifying-digital-twins-across-sectors/
Jardine, S. (2007, January). https://www.pwc.co.uk/assets/pdf/pwc-cps-risk-construction.pdf
Karpathy, A. (2017, November 11). Software 2.0. Medium. https://karpathy.medium.com/software-2-0-a64152b37c35
Karpathy, A. (2023, May 23). State of GPT. https://karpathy.ai/stateofgpt.pdf
Lam, S. (2021, June 24). Sabine Lam unpacks Google's Building Operating System (BOS) program. (J. Dice, Interviewer) Nexus Labs. https://www.nexuslabs.online/content/056-sabine-lam-unpacks-google-s-building-operating-system-bos-program
MacDonald, M. (2020, August 6). Digital Rehearsal could be the new normal. https://www.mottmac.com/views/digital-rehearsal-will-brace-construction-for-the-new-normal
Michalik, J. (2018, December 13). https://www2.deloitte.com/uk/en/insights/topics/digital-transformation/smart-buildings-people-centered-digital-workplace.html
Milgrom, P., & Roberts, J. (1992). Economics, organization and management: United States edition. Pearson.
Narayanan, A. (2019). How to recognize AI snake oil. Princeton University. https://www.cs.princeton.edu/~arvindn/talks/MIT-STS-AI-snakeoil.pdf
Nostalgebraist. (2022, July 31). Chinchilla's wild implications. https://www.lesswrong.com/posts/6Fpvch8RR29qLEWNH/chinchilla-s-wild-implications
Parkinson, H. J. (2020, June 24). It's a botch-up! Monkey Christ and the worst art repairs of all time. The Guardian. https://www.theguardian.com/artanddesign/2020/jun/24/monkey-christ-worst-art-repairs-of-all-time


Pearce, C., Robertson, J., El Hajj, P., Mantogiannis, V., Fisher, P., & Perry, V. (2019, November). Glasgow University. https://www.gla.ac.uk/media/Media_702108_smxx.pdf
Press, G. (2016, December 30). A very short history of artificial intelligence (AI). Forbes. https://www.forbes.com/sites/gilpress/2016/12/30/a-very-short-history-of-artificial-intelligence-ai/?sh=36e0105a6fba
Sawkill, F. (2022, June 15). British Land. https://www.britishland.com/sustainability/our-views/smart-100-liverpool-street
Stych, M. (2022). https://www.arup.com/services/buildings/smart-buildings
Taylor, B. (2018, March 9). What breaking the 4-minute mile taught us about the limits of conventional thinking. HBR. https://hbr.org/2018/03/what-breaking-the-4-minute-mile-taught-us-about-the-limits-of-conventional-thinking
The IET. (2019, October 17). Digital Twins for the built environment. https://www.theiet.org/impact-society/factfiles/built-environment-factfiles/digital-twins-for-the-built-environment/
Timmins, B. (2021, July 15). BBC News. https://www.bbc.co.uk/news/business-57854969
Weinfass, I. (2021, August 11). Construction News. https://www.constructionnews.co.uk/civils/crossrails-bond-street-station-faces-further-delays-11-08-2021/
Wolfram, S. (2023, February 14). What is ChatGPT doing... and why does it work? https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/

Chapter 7

How to Be a Data Person

Abstract In this concluding chapter of the book, we take a light-hearted look at the personal characteristics required to act as a resilient, effective, and ethical data professional in the built environment sector.

7.1 How to Be a Data Person

Helping a built environment organisation to make better use of its data shouldn't be a soulless task. We may live in a world of 1s and 0s, but we are still flesh and blood people (fleshware/wetware). In all the work, it's important that we don't lose our sense of self. Our work is better for the personhood and the humanity that we bring to it.

There is no one way of being a data person: how could there be? However, one theme that repeats across this chapter is that working on data in the built environment sector often means acting as a force for change in a conservative, risk-averse, profit-focused, and challenging sector. The sector has changed substantially over the last few decades, but just looking at the low bar of some of the successes (for example, creating a safer working environment for women) shows how far it had to go in the first place.

Please heed this. We are not some strange backwards industry. We systematically innovate every day with very little recognition of our impact. We generally build incredible spaces and engineer useful things. The Elizabeth Line was built under London with minimal disruption to the population above. Your authors have been on site countless times and seen the problem-solving skills in full swing. This means we have an enormous capacity for change and innovation. The issue comes down to two main factors. The first is to recognise negativity: you (as a professional in the built environment) are told frequently that you're a laggard and slow to digitalise. Reject that. The second is where we place our energy: our biggest enemies are toxic culture and short-sighted contracts. Don't let these things wear you down. The grass is no greener in any other industry. If anything, we have pulled ourselves up the curve quickly as technology has caught up with our needs and complexity. You innovate faster than most.


Being a data person doesn't just mean being someone who understands technology (or who pretends to understand technology). Throughout this book your authors have tried to tie the work that we do back to meaningful outcomes for society. We want this section to highlight some of the ways in which you can make a positive change within and beyond your organisations through your work, as well as how to prevent it wearing you down. It is part practical guide, part self-help. But we don't pretend that it's easy. As a leader of a new and emerging profession in a sector with hundreds of years of accumulated practice, you can sometimes be sidelined or ignored entirely. Conversely, your organisation might sometimes present you as a shiny talismanic unicorn able to deliver a whole bunch of unrealistic expectations with a casual swish of your rainbow mane. Neither of these is a comfortable position to occupy.

It is also worth noting, courtesy of feedback from Janet Greenwood, that the extent to which you feel like an outsider as a data person will also depend upon how much you look like, or act like, those around you. Neither the data nor built environment professions boast proportionate representation of women and minorities. A National Institute of Building Sciences survey from the US shows that within the built environment sector around two-thirds of the workforce are male, over 80% of the workforce are white, and two-thirds of women reported discrimination based on their gender (National Institute of Building Sciences, 2023). Figures from PwC for the UK show that the construction, technology, and real estate sectors have amongst the highest gender pay gaps, each clocking in at or above 20% (Hatter & Dowling, 2019). Fixing that problem will, unfortunately, necessitate people working in environments where they are under-represented. This is a challenge that will be invisible to many but one which we cannot ignore.

As an aside, there is an interesting data challenge in quantifying this diversity challenge. A superficial reading of figures from the UK Government's Ethnicity Facts and Figures site risks understating the size of the challenge. These figures show, for example, that around 6.5% of the UK's ethnic minority population work in construction, compared to 7% of the UK's white British population, a relatively small gap (Gov.uk Ethnicity Facts and Figures, 2022a, 2022b). Similarly, data on seniority show that 10.3% of the ethnic minority population are employed as 'Managers, Directors and Senior Officials' compared to 10.7% of the white population (Gov.uk Ethnicity Facts and Figures, 2022a, 2022b). On their own, these statistics would suggest that the representation problem is relatively small, and that the scarcity of ethnic minorities reflects their smaller proportion of the population. Crucially, however, what these figures don't provide is a breakdown by both sector and seniority.

Across our work-lives the "bring your whole self to work" mantra is starting to wear a little thin. Most of us have figured out that it's okay, and indeed necessary, to project a different persona at work than at home (Garrone, 2017). One learns to play up one's strengths, and to downplay or hide one's vulnerabilities. Likewise, some of us have learned that in certain organisations being yourself is very much not OK, regardless of what the corporate values might say. Adapting your style to match the work culture in which you find yourself doesn't make you a sociopath.
Likewise, leaving an organisation that doesn't allow you to work according to your values doesn't make you a failure. In most built environment organisations, you probably can't work without a filter, but neither can you deny your personality and your vulnerabilities entirely.

7.2 How to Be a Person

As a data person working in the built environment sector, you will likely find yourself acting as an agent of change. One day every organisation will be data-savvy, and running a data function will just be business as usual. But for now, every day can feel like a battle, as you try to coerce, inspire, or otherwise encourage your colleagues in IT, the business, and the supply chain into transforming how they work. This pressure and frustration, and the visibility of your role (see next section), can take a toll on your wellbeing and your life outside of work.

Let's start with the basics: you are not your job; you are a person. You have value whether you're changing the world or failing miserably. You have needs; you can't work every hour of every day. What's more, you're good at your work not because of the hours that you put in, but because of the ideas that you have. Your brain is a muscle just like any other: it gets tired; it needs to recuperate. Show it some respect.

Be wary of the cult of work. Words like productivity, hustle, entrepreneurialism, and grind have double meanings. Look, there's nothing wrong with working hard, but you don't exist solely to be productive. People who are monomaniacally focused on their own productivity rarely stop to notice the world around them, to appreciate what matters beyond the task at hand. This advice could apply to any profession but feels particularly relevant to data work. Sure, we are using the quantification of phenomena to improve the world, but that's different from saying that we need to quantify ourselves. Once we begin to see ourselves simply as a productivity function, and cease to give ourselves time to reflect and recuperate, we are not going to have the fortitude necessary to work in a challenging environment such as with public sector built environment clients. Indeed, is it even possible to quantify something as complex as a human life without falling foul of some existential version of Goodhart's Law?

Remember that aesthetic appeal is an important part of delivering data projects, and aesthetics cannot exist without some external sense of beauty. Beauty is in the eye of the beholder, and the beholder cannot behold if they are instead beholden to an unforgiving schedule. We are here to make the world function better, more beautifully, more aesthetically, not simply faster. The work of information designers such as Edward Tufte, author of "The Visual Display of Quantitative Information", and Giorgia Lupi and Stefanie Posavec, authors of "Dear Data", is vital in this context as it reminds us not to overlook the importance of form. As Lupi writes, "the more ubiquitous data becomes, the more we need to experiment with how to make it unique, contextual, intimate" (Lupi, 2017).


7.3 How to Set a Positive Example

Bringing better use of data to the built environment involves persuading people to change. They will resist; they will observe; they will criticise behind your back. But hey, what do you expect? You are establishing a new pattern. There is the opportunity for this new pattern to be unencumbered by the mistakes and biases of the past. Equally, there is every likelihood of new mistakes.

The biggest change we can make to our sector is through the people that we recruit. In growing our data teams, we can create groups of people who look, sound, think, and act differently from the norm. This isn't simply a matter of counterbalancing the preponderance of certain demographics in the sector (though that is part of it), but also of bringing in people from backgrounds which have not always been actively sought by built environment organisations.

The professionalisation of construction, asset management, and other sector disciplines is undoubtedly a good thing. It has created a massive body of common knowledge that is (relatively) open to all. By creating pride in a profession, these bodies have appealed to people's sense of purpose, and generally improved standards and consistency of work. However, one side effect of professional bodies like the ICE, IET, RICS, and RIBA, and the many specialist qualifications required to work on site or in any mid-to-senior level position, is that they create barriers to entry for people from other fields. We should resist erecting such barriers, where practical, when recruiting data teams. Data is data, and computers are computers. There should be few blockers or restrictions preventing someone who has worked with data in another sector from applying those same skills successfully to a built environment project. Similarly, unlike some of the more established fields, it is possible to learn to use data quickly, during work hours, and without formal qualification. We have the opportunity, as we recruit, to look for problem solvers, communicators, and iconoclasts without necessarily requiring a particular set of exclusive CV signifiers.

Sure, as we have covered throughout this book, there are many idiosyncrasies to the data used by the built environment sector. However, this is a matter of familiarisation, not a skill set. When recruiting into a data team on a built environment project it may be wiser to look for candidates with a palpable interest in the field, rather than prior experience. Indeed, talented data people will rarely come with experience of working in built environment organisations. Something as simple as "I used to build models of buildings in Lego," or "I like to visit old buildings in my free time," might be a useful indicator of whether someone is likely to apply themselves to the sector and stick with it through inevitable challenges.


7.4 How to Be Conscious of Your Biases

Thinking about data all day can delude you into thinking that you are as rational and objective as a computer. This is a mistake. We are not rational creatures, no matter how much we might rationalise our actions retrospectively (American Psychological Association, 2023). The title of this subsection is not 'how to avoid biases' but 'how to be conscious of your biases.' The biases are a given; they are baked into you as a human being. It's what you do (and don't do) with your biases that counts.

As described above, recruitment is one of the areas of your work where you can have an impact on the culture of an organisation. It's also an area where biases, conscious or unconscious, can have a distorting outcome. Recruitment is hard because it's inherently difficult to judge how someone will perform in a role—or gel with a team—based solely upon a CV and an hour or two of conversation. Faced with this tough assignment, it's understandable that we often default to "does this feel like someone I can get on with?" We've all done it. Which is not to say that we should deliberately set out to hire a bunch of completely incompatible individuals, but rather that we need to be more conscious of our tendency towards homogeneity, and the impact that this can have on the dynamics and resilience of our teams and the sector. Whilst rapport and charisma aren't necessarily bad metrics when recruiting, particularly for roles that are 'customer-facing', we need to be conscious of how using these criteria alone will distort the outcomes of our recruitment. Starting with the obvious: just because someone communicates differently to you, perhaps because they are from a different background, doesn't necessarily mean that they are communicating badly.

Getting a varied group of people through the door and into your organisation is merely the beginning of addressing biases rather than the end. A diverse team will be less likely to suffer from groupthink or false harmony, but it is far from immune. Empowering your team means giving everyone space to succeed, fostering constructive challenge, and avoiding the default of favouring the loudest or 'most confident' voice in any debate. Part of your role as a leader is to create a space where everyone feels comfortable sharing their suggestions and insight. Breaking down the hierarchical thinking ingrained in our organisations and workplaces needs to start at the top. You will need to be consistent in creating space and soliciting input so that your team trust that they can genuinely offer constructive criticism of your plans, share vulnerabilities and misgivings, ask for help, or point out where something feels wrong, without you becoming judgmental or defensive. This change in dynamics takes time, and it might require you to take the first step: to show some vulnerability and authenticity, or admit where you have been wrong, to set the tone for the rest of the team.


7.5 How to Be Ethical

Being ethical in a work context often means reckoning with your own shortcomings, and those of the organisations that you work for. The built environment has a long and inglorious track record of treading over the rights of the poor and disenfranchised, and there are plenty of projects to this day which prioritise vanity over morality. Whilst we are not seeking to set out any higher moral code in this book, we are aware that there are ethical considerations in any built environment organisation and in any data role, ranging from the transactional to the existential. A hierarchy of ethical considerations would include:

• Errors of temptation: allowing flattery, bribes, or other rewards (for example, vendor offers of gifts and prizes) to influence your decisions.
• Errors of omission: failing to make best use of the resources at your disposal by taking shortcuts or neglecting proper due diligence (for example, letting contracts without ensuring sufficient competition or value for money); failing to establish a compelling argument for the opportunity cost of your work.
• Errors of outcome: investing your time and energy in work that does not deliver a desirable societal outcome (for example, working to maximise the profits of exploitative industries, and/or those which contribute to the degradation of the earth's natural resources).
• Errors of bias: as per the previous section, deliberately or accidentally creating products that unduly favour one group of people over another, particularly where that group is otherwise structurally disadvantaged. Most built environment assets have some sort of negative externalities (noise, pollution, health hazards, crowds, congestion, disruption), and too often these externalities concentrate in disadvantaged areas.

There obviously isn't one right answer to the question of ethics, but there are certainly plenty of wrong answers. The most important thing is to question yourself. Few people set out to be unethical, but we all have a rational self-interest in conflating what is good for us and our careers with what is ethical.

If we believe that data has an outsized role to play in the future of our built environment, then the ethics of how we use data will become increasingly important. We covered earlier that the opportunities that data presents also bring new risks and liabilities. The more data we capture, the greater the risk of misuse. The more complex our models, the greater the risk of bias. The more predictive our models, the greater our obligation to use their power fairly and equitably. In the conclusion to his paper 'Understanding artificial intelligence ethics and safety' (Leslie, 2019), Dr. David Leslie poses the question, "how will the values and motivations that are currently driving the gathering energies of technological advancement in artificial intelligence come both to influence that future society's ways of life and to transform the identities of its warm-blooded subjects?" In helping to determine how and where to use AI to the benefit of the built environment and society, we are playing some small part in answering that question.


To the extent that there is money in the built environment sector, the roles that are best for society are not always the best compensated. It's easy to adjust our ethics, retrospectively, to suit our circumstances, without even noticing that we have done so, particularly when there is money involved. When that money is coming from autocrats and extractive industries, it's easy to rationalise away one's complicity in the degradation of civil rights and the natural environment. Cognitive dissonance can arise where projects funded by unethical sources are cloaked in the terminology of a 'woke', climate-conscious, and socially conscious society. 'Greenwashing' isn't a new thing, but it will likely become more prevalent as environmental credentials cease to be an 'optional' part of project planning. Trust your common sense here: if a project looks too good to be true, it probably is. As we covered earlier when discussing sustainability, the alternative to which we should compare built environment investments is not building something different: it is building nothing at all. By this standard, projects need to justify not just their cost and emissions in operation, but also their upfront cost and embedded carbon.

Ethical conflicts will inevitably arise, so it's important to establish what is ethical to you beforehand: to establish the impact that you want to have in the world independent of who's paying the bills. It is easier, for example, to decide that combatting human-made climate change is a priority before the job offers arrive from BP, Shell, and Aramco, rather than afterwards. This is a particular challenge if you find yourself working in consultancy roles where you may have little influence over the decisions made by other parts of your organisation.

As you work to determine what is and is not ethical, it is important to remember to learn from others. There is a rich human history of philosophical and religious thinking on morality and ethics which we can learn from, but it's also worth using the people in your life as a sounding board. Trust your own moral compass but make time to test it against others'.

Finally, intuition plays a role here too. Our modern lives are full of Twitter (X) feuds and manufactured conflict. There are self-interested hucksters willing to publicly defend nearly every conceivable position. But in the built environment we have the advantage of being rooted in the real world. As the aphorism goes, "if it looks like a duck, and it swims like a duck, and it quacks like a duck, then it's probably a duck." There will always be people who will try to convince you that extracting coal and oil without limit, building white elephants in the desert, and moving energy-intensive raw materials around the world is in everybody's best interest. But you know intuitively what a fair, sustainable, and ethical society looks like. So question whether you are actively contributing to realising it.


7.6 How to Be Open Minded and Work with Others

Working with the data means you… are cutting across the silos in the organisation and therefore potentially messing around in everyone's backyard.—Caroline Carruthers and Peter Jackson (Carruthers & Jackson, 2020)

Data is a team sport, in which there are competing priorities and many ways to achieve an outcome. Data capabilities will not flourish in isolation, at least not in a way that realises value for any sizeable organisation. An effective data team will likely have blurred edges: there will be roles which are part of the immediate team, and a larger body of colleagues, suppliers, consultants, and vendors who, to some extent, share your objectives. Let's be honest, the boundaries of a data team won't even fall entirely within your organisation, as valuable knowledge sharing occurs within networks, and even amongst professionals who otherwise compete against one another. Where does the data team end, and the rest of the business begin? If I am acting as a data steward, or building dashboards, or managing information, or writing data requirements into a contract, am I not at the very least part of the wider data community? If I am a consultant helping build capability for an organisation, should I not be able to share in the goals of the organisation, at least temporarily?

You may have realised, or perhaps you will soon realise, that your power or influence as a data professional is best expressed through empowering others to be more effective. Working in an organisation where staff across the business are enthusiastic about using data may well prove to be a greater indicator of success than the size or expertise of your data team. Even if, sometimes, that enthusiasm expresses itself in some ill-conceived experiments, questionable 'innovations,' or shadow IT.

There is a subtle distinction between delivering a data project or solution, and the technology that powers that solution. Your colleagues within an organisation may conflate buying a product, building a model, or trying a new technique with delivering data change. In doing so they miss the vast amount of business analysis, training, change, and other work required to deliver a successful data project. There is a trade-off here between doing things properly and allowing others to learn from their mistakes. Between creating a culture that embraces creativity, and one that adheres to a strict hierarchy of responsibility. Between enforcing architectural/logical coherence and following the pull of the business. Between respecting the specialist skills of your data team and helping the business learn to meet their own needs (Fig. 7.1).

As a data professional you've got to pick your battles. In one memorable meeting your author attended, a CDO said to a group of senior stakeholders, "I wish, just once, I could come to a meeting about data and have you guys turn to me and say, 'as the CDO, what do you think we should do?'" That meeting concerned a long-term multi-million-pound investment in a data platform: it was an extremely pertinent moment to assert the data team's expertise and the need to cut down on "random acts of data" by the wider business. Other red lines might include improper use of sensitive information, or questions of data ownership.


Fig. 7.1 Every origami and data project ever…

But not every battle is worth fighting, and a data team that only says "No. You can't. You're stupid," will not be able to co-create with the wider business (and will likely be perceived as an obstacle to be avoided). There is a tension between leading with your expertise and allowing your wider organisation to find solutions to their problems. Most organisations need some hierarchy of responsibility for how they manage data, what tools they use, how they procure data services, and so on. But within these parameters, the IT department or data team should encourage the wider organisation to push whatever tooling you have agreed to the limit. Staff across the organisation need to feel heard and to trust that you acknowledge, appreciate, and listen to their suggestions. It can be annoying when colleagues return from a conference convinced that some new piece of software will solve the world's problems. But every now and then your 'non-data', 'non-IT' colleagues will surprise you with a genius idea that you would never have thought of. And in the meantime, you can delight in seeing them make use of the tools and platforms that you have created on their behalf.

It's possible that the members of the executive or board of your organisation may not have had to think too deeply about data in the course of their careers. A prerequisite to leadership in built environment organisations is often to have led large teams of people and/or driven down costs. Knowing your way around a spreadsheet or a financial report might help on that journey, but it's not a career path that necessarily requires you to be 'data savvy'. Consequently, until a new generation of CDOs-turned-CEOs emerges, it's likely that your role as a data professional will involve much simplifying and explaining. A big part of your job will be trying to help your leadership understand "what's in it for them," and why it's all so damn expensive!

Let's be honest, when all you want to do is start building data capability and changing the world, having to 'bring the business along with you' can be a tad frustrating. Sometimes it might feel that if you put half as much effort into building things as you do into managing stakeholders and simplifying narratives, you might actually accomplish something. But in these instances, it's useful to remember that consensus building and setting the narrative are foundational parts of your role. More than anything, it is what separates you from the many technical experts who moan about culture but don't fix it, who never try to understand the human causes of their problems or how to address them.

You can't usually choose your leadership, and sometimes you get unlucky. It's tough at the top, and leadership changes often result in changes in culture and priorities. However, you can choose how you respond to change, and how you collaborate with leaders. It takes resilience to keep trying despite setbacks. And it also takes judgement to know when to quit. But remember that senior leaders must check the actions of their organisation, so challenge comes with the territory, and shouldn't be taken personally or even necessarily interpreted as a bad sign. Finding a work culture that works for you will make an enormous difference to your wellbeing, morale, and personal effectiveness. Whilst it can be difficult to assess from outside an organisation, when you find a place that supports you and helps you to grow, it can be worth prioritising over rank, salary, and ego.

7.7 How to Sell a Narrative

Working as a data professional will require you to act as an evangelist or salesperson for data within your organisation. This is doubly true in the built environment sector, which does not always see the benefits of investing in data capabilities that are so immediately apparent to you. You may have experienced this when you meet people outside of work and they ask the standard "so what do you do?" question. Perhaps you excitedly respond along the lines of "I work with data in construction," only to elicit confused stares. "Oh," they might say, "Is there data in construction? Are you, like, counting bricks or something?"

Just as it can be hard to explain our profession to people at children's birthday parties, it can also be difficult within our professional lives. Success and failure in data depends as much on establishing a clear and consistent narrative describing your value proposition as it does on delivery. Wait, that was too corporate! Even the phrase 'value proposition' abstracts away from reality and can come across as inauthentic. Basically: why should your organisation give you the authority/resources/budget/endorsement to invest in data or to encourage adoption? How will it tangibly benefit the organisation? Without a good narrative, it is likely that the response you get in your working life will be closer to 'confused layperson at a dinner party' than 'informed colleague who understands your value.' If this prevents you from accessing resources, or if it means that when you do deploy solutions there is no push for adoption from outside your team, then it will scuttle your chances of realising change before you begin.

Sometimes people who work in technical specialist positions, such as data roles, are uncomfortable thinking of themselves as salespersons. Your authors fall into this category. Sales is the domain of extroverts in flash suits driving up and down the M4 in Audi A4s (or driving down the A4 in BMW M4s). It can feel threatening, inauthentic, perhaps even sacrilegious, as if the value of data should be apparent to everyone a priori, and having to persuade people of its value somehow cheapens it. Most of all, sales just sounds tiring! It's much more fun to sit in front of your computer writing code and drawing architectures than it is to go out glad-handing people.

So, we've got bad news and good news for you. The unwelcome news is that whatever data position you hold, whether you are a consultant, or client-side, or working for a software house, part of your job will always be selling. This will mean selling the value of what you do, and the vision for how it will make your organisation, your customer, or your sector better. The good news is that if you care about something, you won't have to try to sell, and you certainly won't have to do a hard sell; you can be authentic and still persuade people. The difference between success and failure won't be whether you are a 'natural' salesperson; it will come down to how you define and refine your message, and how you deliver that message as a compelling narrative.

A big part of getting your narrative right is understanding what matters to your audience, as well as the level at which they can engage with your material. The ability to empathise with others—particularly the desires, fears, and insecurities that underpin their behaviour—is crucial to delivering change in any large organisation. Very few people, outside of the IT department, will agree to a project because it is 'the architecturally correct' solution, or uses the latest cloud tooling. A big part of being a data professional, therefore, is translating those technological advantages into outcomes that the business cares about. Better architecture might mean less downtime, faster response times, easier search, high-resolution pictures, etc. Each of these instances of better user experience will contribute to the organisation's perception of your team and the value that you deliver. Where possible, these 'warm and fuzzies' should also align to realising the values and goals of the business (or, in corporate speak, aligning to the business architecture).

As we craft our narratives, we should prioritise simplicity and relatability. What does our audience need to know about data, and what does it mean to their lived experience? Humans can rarely internalise more than two or three facts or statements at a time, and they have an attention span that is far shorter than our standard hour-long meetings would suggest. Walls of slide decks in size 10 font exist to prove that work has been done, not to persuade people of anything. The narratives that sink in will rarely take more than thirty seconds to summarise. Our job, as leaders, is to settle on a clear, accessible, meaningful pitch. Organisations will test, criticise, and even tear apart our pitch, and we will learn and build it up again. We then find ways of repeating and elaborating on that pitch until it starts to sink in and change the way that people relate to data. It is relentless work being an evangelist for data and taking accountability for driving change.


7.8 How to Make Product Love, and Not Be Ruled by It

Kudos to Julie Hirons for the term 'product love.' Applications, platforms, and technology in general can be like sports teams: no matter how good or bad they are, many products have devoted fans. All products sit on the continuum of product love. This is an important psychological phenomenon: in the world of behavioural economics, it is referred to as a form of the sunk cost fallacy. This is when a person is reluctant to leave a course of action (in this case, being a user of a platform) because they have invested heavily in it, even when moving on would be more beneficial.

Julie's concept of product love has two major competing forces. The first force is the accumulation of product love, achieved in a few ways. There is the emotional energy of learning the product as you connect the functions of the tool to the day job, which in turn can increase your value. That value is both monetary (because, in general, if you train in a specialist tool you can demand more money) and political (because you are more useful to senior figures when you look cool and make others look cool too). If you reach the lofty heights of superuser status as a recognised member of an elite group, you will be showered with incentives from the vendors. The second force is the erosion of that value. The tool becomes out of date, or the market for users of that tool becomes saturated. But you haven't got the time to learn a new tool and, besides, you have invested so much into this tool, why give up on it now?

Your author wants to share their own specific experience of product love and cover a key moment that every technical person has in their career. Your authors have been lucky enough to be guided by some great mentors and mentees, and this advice should not be an exclusive thing: we want everyone to know that it's okay. Your author sat down at a PC in a work setting for the first time in the early 2000s. The first application they opened was the Revit Technology Corporation's hottest new technology, Revit. They were hooked. The tool was cool: you could easily make models of buildings, and they quickly designed their first house extension for a customer. However, as many people experience in our sector, if a big project ends abruptly and you were last in, you're generally going to be first out. Crashing out of what they thought was a normal way to design things in the construction sector, they found their next job: falsework design. The reality of 2D AutoCAD and printing drawings to count things with highlighters only strengthened the product love for Revit. They dedicated their career to getting back on the Revit horse. Designing in 3D and getting the computer to add things up was cool and, in all honesty, less effort. They also had a guitar addiction that needed feeding. By this time Autodesk had acquired Revit, and your author was moving up the ranks in the world of Revit users. Reaching the pinnacle in the 2010s, they were invited to join the elite 'Gunslingers', where Autodesk hand-picked hardcore Revit users to travel to the USA to test Revit prior to release.

The point of this story is that your author's professional and personal identity evolved around being a superuser of Revit. Being in their mid-20s and getting sent around the world to test software, teach the tool, and ultimately see designs become real things was an incredible experience, but they knew it could not last. Product love is addictive: the vendors know that, and they design for it. You will build up contacts and get the star treatment but, like any rock star, it won't last forever, and it's better that you're aware of that and plan for it. Your author spent a few years coming to terms with the fact that the 'Revit Rock Star' moment had come to an end, but it was not THE end.

Another honourable mention for this section is David Miller of Miller Architects. Your author, in a bizarre turn of events, ended up utilising their offices as a temporary home when joining a major contractor. In that time, David mentioned something that stuck. David shared his experience of moving from being an architect to an owner of an architecture firm. He described the strangeness of moving 'off the tools' and into the world of leadership and management. This exchange alerted your author to the need to move on. They still have a deep love for Revit and its community. They have attended weddings and birthdays, and mourned the deaths of community members. Product love is a serious, and deeply personal, business.

To be a data person you must be aware of product love. Be aware that you love products and that other people love other products. The behavioural dynamics are complicated, and we highly recommend finding a behaviour change course to address them. Your author attended University College London's Centre for Behaviour Change course on changing behaviour in low resource settings and learnt a great deal about the tools and techniques found in modern psychology. If you are not able to find an appropriate course, do at least explore the behaviour change wheel by Michie, Atkins and West (Michie et al., 2011). The reason for not providing more in this book about dealing with behaviour change is that the resources from UCL's Centre for Behaviour Change are the cutting-edge practice, and we will not replicate them here. You would not want your medical doctor to be diagnosing from a couple of paragraphs in an adjacent book about built environment data. The message here is: to deconstruct product love, get structured learning in behaviour change. It will be worth the effort and every penny.

As for building product love, focus on user experience design (UX) and get into the world of gamification. How game designers make games addictive is an ethical quagmire. However, game design can be a positive force in your quest to build product love. Here we recommend the Octalysis Framework (Chou, 2023). It's not a formal science but it does provide some useful ideas on how to build tools that engage people. You will also learn that gamification has nothing to do with 3D models or playing computer games, as many seem to think.

To close on product love: we must empathise with users of all tools. They love them. Their livelihoods and personas are generally defined by them and, just like with football teams, pointing out that they suck will only offend them and lead them to double down on their love. Your product love is something to identify and manage, but do not let it rule you or your career.


It's worth mentioning the context in which Julie Hirons coined 'product love.' Julie, at the time your author heard her use the phrase, worked in corporate IT and managed 'common data environment' products. She was previously a trainer of technical tools and expertly brokered the relationship between IT and the users. It was there we discovered that common data environments like Business Collaborator, ProjectWise, and Asite (to name a very few) were butting up against things like BIM360 and 3DRepo (at the time of writing, 3DRepo had been acquired by Asite). There was a product love battle between the document control community and the engineers who wanted to surf 3D models in the cloud. The goal is to be an agent of change. Julie was an exemplary agent, displaying the ability to recognise the behavioural dynamics of users and coining the phrase 'product love' to describe them. Identifying that requires emotional and political intelligence, which comes with both training and experience. "Sometimes you have to break eggs to make omelettes"—John Eynon (BIM+, 2021).

7.9 How to Take Accountability and How to Lead

The main difference between responsibility and accountability is that responsibility can be shared, while accountability cannot.—Carruthers and Jackson, 2018

In time, every post tends to be occupied by an employee who is incompetent to carry out its duties.—Peter, 1970

In the built environment sector, first comes the technical nous, then comes the responsibility, then comes the accountability. We have treated 'how to take accountability' and 'how to lead' as one question because we believe that accountability is fundamental to leadership. That is, at least, the type of leadership that data projects in the built environment demand. As we have already covered at length, the built environment sector suffers from a few tired ways of working that the data profession provides us with an opportunity to challenge, disrupt, and change. One of these tired conventions is the 'command and control' style of leadership, where taking accountability is confused with micromanaging, domineering, and (at its worst) bullying. The irony here is that the type of built environment programmes and data projects that we have described in this book are far too large and complex for any single individual to own them all. Sure, every megaproject has its charismatic leader (usually male and late middle-aged), but it takes the collective effort of hundreds, even thousands, to "turn the ship around", to quote the title of Captain David Marquet's book on leadership. Marquet writes a compelling case study of his time as a submarine captain, and his realisation that by providing teams with autonomy (rather than barking orders at them), he was able to transform his teams from underperforming to highly capable.

In our formulation, leadership and accountability are about:

• empowering your team to act with confidence,
• celebrating their successes, and
• having their back when they fail for the right reasons.

We use the word 'act' deliberately. Delivery is the marker of a successful data team. Yes, it is important to have vision. Yes, it is vital to engage with stakeholders, to speculate, envision, plan, strategise, evangelise, and all the rest. But the application of data to the built environment has never really lacked leaders who can talk a good game. What it has often lacked is delivery and realisation of benefits. Again, this is where accountability comes in. If at first you don't succeed, there will always be someone else to blame: the leadership for lacking vision, finance for not giving you the money, cybersecurity for being intransigent, procurement for being slow, staff for being flaky, suppliers for being incompetent or litigious, users for being close-minded. Threading the needle and getting the best out of people is tough, but it's what taking accountability looks like; it's what makes you a leader, and it's what the built environment needs.

7.10 How to Grow Talent Talented data people are rare and expensive in any sector, doubly so for the built environment sector. We compete for talent with far better resourced sectors: big tech, finance, insurance, etc. So growing talent will be a key part of your resourcing and the delivery of your data strategy. Helping people advance in their careers is one of the most satisfying parts of any leadership position. This is doubly the case where, as we have described above, you are simultaneously changing the mindset of your organisation and sector. However, it’s easy to become selfish and territorial. The people that you help to grow are not ’your’ people, their success will only ever be in small part down to you, and there is every likelihood that at some point they will move on to their next role. The pace with which people change roles will vary by person, by organisation, and in response to wider job market trends. But as much as it hurts, we need to celebrate our protégés moving on to bigger and better challenges. If they never did then that might suggest you are failing them as a mentor and not providing them with the opportunity to grow, or perhaps your organisation simply isn’t providing them with the opportunities they need to hone their skill set. The same is true for you, by the way. It’s important to consider what might appeal to an ambitious and talented person about working for your organisation. Whilst data roles are increasingly well compensated—by public sector standards at least—we will never be able to compete with tech giants and the like. This means that people will often work for you because you can offer them more opportunities to grow, greater autonomy or responsibility, and a more profound sense of purpose. You should get comfortable with the idea that people will use roles in your team as a stepping-stone to seniority and pay elsewhere.

If it means that you get a few good years from somebody you otherwise couldn't afford, that's a net gain for everyone. So how does one grow talent? Much as with our principles for taking accountability and leading: give talented people responsibility, autonomy, and frequent constructive support, and make it okay for them to fail.

7.11 How to Respect Your Own Time and Your Mind

One of your authors recently managed an otherwise talented and committed individual who frequently complained of having to attend too many meetings and not having enough time to complete their work. About nine months into the role, your author had the following conversation:

Author: "What did you think of that meeting invite I sent you?"
Employee: "What meeting invite?"
Author: "Oh, the one from earlier today."
Employee: "I haven't seen that yet."
Author: "But you've already accepted it."
Employee: "Ah, that's because I set my inbox to automatically accept all meeting invites."
Author: "WTF? That's a terrible idea! Please switch that off immediately!"

When we let other people fill our calendars, it is easy to find that our time is being monopolised by others. There is the (relatively) simple question of managing one's own calendar, but beneath that is the deeper question of what constitutes productivity. As a knowledge worker, your time is your value. If you cede control over your time, you cede your ability to deliver. Built environment organisations often have ways of working rooted in public sector bureaucracy. This can mean a highly hierarchical organisational structure, lengthy governance processes, innumerable and interminable 'working groups' (talking shops), and, crucially, if you are not careful, a calendar filled from 0800 to 1800 (and beyond) with back-to-back meetings, check-ins, boards, governance groups, working groups, planning sessions, stand-ups, and retrospectives, ad nauseam. Intellectually, we know that having unstructured time during the working week for reflection, reading, writing, learning, and contemplation will make us better and more fulfilled knowledge workers. In practice, the moment we step away from our laptops we feel 'unproductive.' This is doubly true if your peers are all talking about their incredible workloads and long hours. But just because it's the default culture doesn't make it right (regardless of the social norms displayed in some professions and organisations). 'Work smart, not hard' isn't just a slogan; it's pretty much the only way to sustainably succeed as a knowledge worker. This isn't solely because relentlessly transactional workloads are draining and lead to burnout, but also because a knowledge worker who never spends time learning anything or obtaining context will eventually become outmoded.

At best, they cease to be valuable to their organisation; at worst, they start to make counter-productive decisions. The value of unstructured time, or at least time with the inbox and IMs switched off, is well documented and rarely honoured. Which leads us nicely to our next topic…

7.12 How to Learn

Learning comes in many forms, but all of them require mental space for focused attention. This does not mean that you need to lock yourself away in a library for years. The built environment and data are both domains that reward 'learning by doing' and 'learning from others.' However, even learning in the moment benefits from a conscious decision to reflect retrospectively. This is literally how the brain works: creating, reinforcing, and pruning neural connections. Creating moments to learn might mean setting aside time in your diary for reading and writing, formal study, hands-on training, working on side projects, attending conferences, and learning from others (regardless of your respective positions in the 'hierarchy'). As a data professional, your greatest asset is the people that you surround yourself with. Such is the breadth of both the data and built environment domains that there will always be specialists in your team, amongst your peers, or working for your suppliers or customers, from whom you can learn. These specialists might include the architects, coders, and developers designing, creating, or deploying your software solutions; the engineers and asset managers using those solutions; or the consultants and business change professionals helping you to realise value from your investments. Even people who appear less experienced or 'specialist' than you can bring a fresh perspective or spot things that you have missed. The challenge is to create space for active listening. Often, the job of a data professional can feel like "wading through treacle", as one of our stakeholders recently described it. When bureaucracy, governance, and routine reporting consume our time, it is easy to lose any opportunity to learn. If your relationship with your team is merely asking for updates against arbitrary milestones, you might meet your short-term goals, but you're not going to learn much along the way. One of the real advantages of agile delivery methods for data professionals is the practice they encourage of show-and-tells and retrospectives. Used correctly, these ceremonies provide the opportunity for everyone involved in a project not merely to stay abreast of 'progress,' but to witness the work delivered, ask questions, and offer suggestions for improvement. Similarly, if you have time to be involved, sprint planning with an experienced team of developers can provide real insight into 'how the sausage is made.' Of course, you shouldn't approach every relationship with specialists and peers through the lens of agile ceremonies. But there is almost always an opportunity to spend a little (preferably unstructured) time with people, to be curious, and to understand more about their work and how they do it. In our experience, these hands-on, one-to-one interactions are comfortably the easiest way to make time for learning as part of your everyday schedule, without ever picking up a book or paying for expensive training.

Here are some opportunities for learning, and what they're good for:

• Learning from others (bosses, peers, mentors, suppliers)
Positives: no upfront commitment of time. Likely to respond to your immediate priorities and challenges.
Negatives: risks depriving you of the bigger picture and of exposure to best practice beyond your immediate environment. Limited by the gaps in the knowledge of the people around you.

• Conferences
Positives: an effective way to quickly survey your wider sector and expand your network. Also an opportunity to raise your own profile if you speak or join a panel.
Negatives: time consuming and potentially expensive. Often more likely to expose you to sales pitches than to new knowledge.

• Professional bodies
Positives: professional bodies have long been the gatekeepers of knowledge and accreditation in the built environment sector. They guard the treasured definition of what it means to be an architect, engineer, quantity surveyor, and so on. With their large memberships they can support you with networking, events, mentoring, and libraries. This can be particularly useful early in your career.
Negatives: there is necessarily a degree of conservatism in these bodies. By defining what an engineer is, one also defines what an engineer is not. This can mean that the purview of these organisations changes only slowly and struggles to keep pace with digital innovation. Discussion of technology will often be at the application level rather than the data level.

• Training
Positives: setting aside time to learn about a particular topic in detail, in a dedicated setting, can be a targeted way to pick up a new skill.
Negatives: you need to be clear on the outcome you are looking for from training, and how you will tie it back to your work. Training courses can lack an immediate connection to your work, and you may have forgotten most of the specifics by the time an opportunity arises to put your skills to beneficial use.

• Reading
Positives: as a rule, we should all read more. Whether you are reading textbooks or simply browsing your LinkedIn feed, it is rarely a bad thing to expose yourself to the thoughts and perspectives of others.

Negatives: much like training, the challenges with reading are relevance and retention. It can be hard to proactively find the content that will benefit your work in the short or medium term, or to curate the trolls and windbags out of your LinkedIn feed. One approach is to try to accomplish two things with your reading: firstly, to follow up the references that you pick up in your work (someone mentioned a standard? why not have a peruse of it); and secondly, to build a broader, less focused appreciation of your context, one that is conducive to your own reflection and writing (below).

• Reflection and writing
Positives: it can sound very self-indulgent to suggest that you can be your own teacher. However, reflection is an important part of growing as a person, and the modern work-life balance suffers for not making space for us to parse our experiences. To paraphrase Socrates (yes, that dude from Bill & Ted), "the unexamined working week is not worth working." The built environment culture is one that often prioritises action over contemplation, where memories of the day gone by may be drunk away in the pub. But however you can fit it into your working week, whether during your commute, during dedicated focus time, or merely by staring out of the window between Teams calls, reflection will reward the reflector. (Author's note: commuting by bike is particularly good for this, provided you don't get too distracted racing other people.)
Negatives: it is probably possible to navel-gaze a bit too much, and self-reflection probably shouldn't be done without also seeking input and wisdom from others using the options described above.

Those of us who are adept at learning will use a mix of the methods described above, like a balanced diet of knowledge. Don't be too hard on yourself if you struggle to find the time. Little and often is the best way to approach learning, and even ten minutes a day spent well will quickly provide you with a broad range of context.

7.13 How to Think About Mental Health

Growing up, many of us learned to think of mental health as binary: either you were 'sane' or 'crazy', like The Shining or A Beautiful Mind. Later, I saw people that I loved suffer from depression and anxiety, and in time I suffered myself. The truth is that mental health isn't binary; it's a continuum. It unquestionably affects people that you work with, no matter how 'together' they may present, and one way or another it will affect your life and your work too. This isn't a problem; it's totally fine and only indicative of your status as a human being. But certain work environments can cultivate a sense that such imperfections are unwelcome. This is true of any work culture that values the outward presentation of confidence and reliability over authenticity and creativity. Neither the built environment sector nor the data profession avoids this kind of insipid presenteeism.

There may be an undue emphasis on 'firefighting', 'delivery', and working long hours. Rather than acknowledging the impact this can have on the emotional wellbeing of individuals, there is pressure to 'push on' and 'keep going.' Teams may try to cope by creating a culture of 'banter,' but over time this may prevent a team from being honest about their emotions. In building data teams, we sometimes suffer from not placing enough importance on soft skills[1] or empathy. Of course, we need to be productive, and we do have bottom lines and KPIs to answer to. However, your personal performance, including your annual reviews with your line manager or your customers, is fundamentally influenced by the wellbeing of your team and supply chain. People are not assets to sweat; they are humans. Authenticity and vulnerability are characteristics that you will have the opportunity to demonstrate every day. Displaying these traits need not mean that you won't be taken seriously or will be seen to lack conviction or vision; quite the opposite. Making space for people to express where they need help is necessary to lead a highly functioning team. It is important to be aware of your vulnerabilities and how you respond to pressure or stress. That might mean melancholy, depression, anxiety, impostor syndrome, frustration, anger, or something else entirely. You can start to identify how these manifest themselves in and out of work. (Personally, I experience a mix of anxiety and mean-spirited snarkiness.) You can share with people you trust and enlist allies to help you if you wobble. Talking through your problems isn't a luxury; therapy and coaching aren't just for people who are struggling. We all share a common human need to be understood, to be validated, and to connect with others. The theme here, just as with learning, is creating space for reflection: not shying away from our mistakes but embracing them as opportunities to grow.

7.14 How Not to Take It All Personally

Working on the built environment almost always means working—directly or indirectly—for large, complex organisations. It goes with the territory: there simply aren't many 'mom and pop' infrastructure companies, given the range and quantity of skills required to build, operate, and maintain infrastructure safely and securely. If you work in a (relatively) unconsolidated sector, for example local water distribution, you might be part of an infrastructure owner and supply chain of a few thousand people. If you work for a big engineering consultancy, you might have a few thousand or tens of thousands of colleagues (and tens of thousands of people working for your competitors).

[1] When seeking a review from the 'double doctor wife,' a clinical psychologist, she asked what we meant by 'soft skills': were we buying into the idea of empathy and emotional intelligence as 'soft,' as in 'to be soft,' with the negative connotations of that phrase? Your authors were oblivious to this and had assumed that 'soft skills' was a universal term. It does point to an interesting dynamic in the construction sector in general: we have come on leaps and bounds in our emotional intelligence, but our use of language still hints at a need to be 'hard' to get on.

If you work in a more centralised or nationalised sector—railways or power distribution, for example—you'll likely be part of an ecosystem of hundreds of thousands or millions of people. The bigger and more complex an organisation becomes, the more centres of power emerge within it, including the competing leaderships of clients and suppliers, the influence of external regulatory and standards bodies, the decisions of government departments and the courts, and the protections of trade unions and professional bodies. All of which serves to illustrate that, no matter your job title, you will never have carte blanche to realise change in this sector. This is not to say that change is not possible, but that it occurs slowly, through a dance of conflict and consensus, with setbacks and reversals along the way. A successful data project that realises user adoption and business value in this context is all the more impressive for it. But in leading these projects it can be easy to take the inevitable challenges and delays that this sector throws up quite personally. It is possible to care too much: to stake too much of your own credibility and self-worth on the governance, bureaucracy, and politics of an unforgiving grind. This is not to say that you shouldn't be accountable for the outcome of your work. Rather, we are arguing that the functioning of built environment organisations almost guarantees petty tyrants and arbitrary setbacks, and that you should see these as part of the journey rather than a failing on your part. They are obstacles for you to tolerate, not tortures devised for you alone. Your success isn't solely dependent on how much you care. In fact, maintaining a pragmatic detachment from your work might make you more effective. There's a reason that railway operators don't employ many trainspotters: they care a bit too much about the details; they are too emotionally invested. Data people can fall into this trap as well. Because data isn't always part of the culture, it can sometimes be seen as a bit of a luxury. If your colleagues perceive you as a 'data-spotter' rather than a data professional, then the leadership of your organisation may start to assume that you are recommending investment in data to fulfil your own priorities and whims rather than because it is in the interest of the organisation. Not taking your work too personally means understanding that your job is to present your organisation with options, to explain the benefits and risks of those options clearly, perhaps even to make a recommendation, but then to step back and let the organisation process the choice. Nine times out of ten they will make the 'right' call. Where they don't, it's not necessarily your fault, nor is it a reason to give up hope. It's just a sign that you need to try a different tack. That said, there may come moments when you realise that you simply aren't in an organisation or project that is ready to use what you can offer. In these situations, resilience and persistence can transmute into something closer to Stockholm syndrome. You only get one career (or at least one working life), and there is an opportunity cost to your time. So be honest with yourself. If one thing is true in all the complexity and madness of the built environment, it is that few decisions are permanent: there is always a new emergency or problem to solve. On the other side of this dynamic are budgets and contracts.
Another reason not to take it all personally is that many people are driven by externalities.


For example, in our experience, people in a work environment are generally operating within a brief and some sort of 'commercial environment.' In other words, people are driven by how the money flows. This is written from a cynical point of view, and has been subject to several rewrites, but it is hard to find another way to express it: at the end of nearly every conflict around scope or the general distribution of responsibilities, in the experience of your authors, lies the existence (or absence) of budget. That, in turn, is a function of poorly designed or poorly aligned incentives, which is economist jargon for poor contracts. The point is that people almost never set out to do a bad job. If you feel you are being singled out or personally attacked, try to follow the money first. Then work out how you can navigate the realities that money brings.

Here is a real-life example. The job was to set up a multi-billion-pound enterprise tasked with delivering a vital piece of critical national infrastructure. The programme had the ambition to deliver using a 'Project 13 enterprise' (Infrastructure Client Group, 2023), which requires supply chain members to behave like progressive partners rather than avaricious, 'traditional,' transactional suppliers. The client procured each of its partners independently, with different forms of contract specifically negotiated with each partner. Confusion and battles over scope and budget ensued. To spare you the harrowing story of the stress and strain of trying to navigate that, let's jump to what was done to remedy it.

The first step was to focus on what could be done within the specific contracts. This may seem counter-intuitive; the aim of this initial focus was to build credibility. It does run the risk of creating a perception of working within a silo, or even of empire-building. However, good work speaks for itself, and 'don't tell me, show me' tends to win over any form of shallow political manoeuvring. It will feel slow, and you will be criticised for taking too much time. You are better off thinking slow and acting fast.

The next step, fresh out of Covey's The 7 Habits of Highly Effective People (Covey, 1989), is to seek first to understand, then to be understood. This is where the analysis of contract forms, and of who has budget to do what, is conducted. You could do this first, but you can only really do it once you know the intimate details of what your scope is and how it is contracted. What we found was that a particular partner had not planned any provision for IT systems in their budget. They had created a 'business case by business case' governance model with the client that resulted in a misaligned level of effort: they spent more time writing business cases and navigating red tape than delivering value.

A slight tangent here. An impactful lecture that one of your authors attended (Chen-Yu Chang, UCL, 2014) presented an analogy for governance structures. Here's a visualisation for you: imagine you are in your kitchen, and you have the choice of two knives with which to prepare your food, a fruit knife and a meat cleaver. Now imagine trying to chop 100 oranges with a meat cleaver. Tiresome, right? It might feel swift at first, but by the 20th orange you will be tired. It's the sledgehammer to crack a nut.


The same goes for the fruit knife: hacking at a side of beef will not swiftly deliver a joint for your roast dinner. We appreciate the laboured analogy, but it should give you a visceral sense of how governance structures must align with their intended purpose. You must use the right knife for the right job. (Unfortunately, the analogy breaks down if we extend it to the organisational level of the kitchen: you're either an amateur cook, where improper knife use merely puts you at risk of injury, or you're swiftly ejected from a professional kitchen.)

Back to the real-life scenario: the governance purposes and structures were misaligned. Conducting a governance process that costs hundreds of thousands of pounds in people's time to sign off a piece of software that costs £10,000 (in a technology stack of nearly 100 tools) is, quite frankly, a monumental waste of taxpayers' money. We co-developed a commercial model with the partners that gave the client transparency of budget spend and gave the partners a means of performance measurement; ultimately, it created a more collaborative and much nicer environment to work in. People were less defensive and political in their behaviour. The model was nothing special: it was transparent to each partner and equitable, articulating specific line items in the budget to which the client could apply scope. Interestingly, with a pain/gain incentive structure between the partners, it was in everyone's interest to declare the budget and find ways of consolidating any overlaps in scope (as opposed to desperately protecting those overlaps through poor behaviour). Follow the money and take a lead in aligning incentives.
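For readers unfamiliar with pain/gain mechanisms, here is a minimal sketch of the underlying arithmetic in Python. The 50/50 share ratio and the cost figures are illustrative assumptions for this sketch, not the commercial model from the example above, and real mechanisms typically add caps, collars, and shared pools across partners.

def pain_gain_share(target_cost: float, actual_cost: float,
                    partner_share: float = 0.5) -> float:
    """Partner's share of the cost variance under a simple linear
    pain/gain mechanism: positive = gain, negative = pain."""
    saving = target_cost - actual_cost  # positive if under budget
    return saving * partner_share

# A partner that beats a hypothetical £10m target cost by £1m earns
# £500k of 'gain'; overspending by £1m costs them £500k of 'pain'.
# Either way, hiding budget or hoarding scope no longer pays.
print(pain_gain_share(10_000_000, 9_000_000))   # 500000.0
print(pain_gain_share(10_000_000, 11_000_000))  # -500000.0

However the detail is dressed up, the incentive logic is the same: everyone's reward moves with the shared outcome, so transparency becomes the rational choice.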

7.15 How to Take Sustainability Seriously

Our house is falling apart… It will take fierce determination to act now, to lay the foundations when we may not know all the details about how to shape the ceiling.
—Greta Thunberg

Speaking of emergencies—and permanence—there is only one topic on which we could end. Over the course of this book, we have examined the intersection of data and the built environment at increasing levels of granularity. We started at the macro level, describing the challenges facing our entire sector. Then we worked at the organisational level, covering the cultural and technological impact of data on built environment organisations. Finally, we looked at the micro level: the actions and characteristics of an individual data professional such as yourself. Sustainability has been a theme throughout. We have described the massive impact that our sector has on the natural environment, including its substantial carbon footprint, and the actions that individual organisations can take to realise sustainability through their use of data. Now, as we turn to our individual responsibilities, it can seem quite abstract and daunting. What could I, as a data professional, possibly do to address this massive, epoch-defining hyper-object that is the climate emergency?


The day-to-day of working in data and the built environment can feel completely divorced from climate action. As we get stuck into the grind of delivering data change, the big picture becomes hidden behind meetings and sprints, business cases and procurement, data models and designs. It can be hard to identify the moments when you can positively turn the dial on sustainability. It is perhaps useful to take a step back and break down how we can have an influence into four categories:

• Housekeeping: avoiding waste in our own work.
• Direction-setting: how our data strategy and solutions influence sustainability.
• Advocacy: where we can involve ourselves in wider sectoral change.
• Personal choice: avoiding working on projects that do not align with your values, and thus depriving them of your talents.

What makes working with data in the built environment sector interesting is the same thing that makes delivering a sustainable built environment interesting: there are few easy answers, and there isn't a cookie-cutter template for action (at least not one that works). This has not been a book about sustainability, except that the climate crisis is so omnipresent that everything written about the built environment is, implicitly or explicitly, about sustainability. Even writing that fails to acknowledge the climate crisis, or that attempts to greenwash it away, is implicitly a document of our inability to address the emergency. As Greta said elsewhere in the speech quoted above, "if our house was falling apart the media wouldn't be writing about anything else." If data and the built environment has a meta-narrative, it is that society will judge us by the outcomes that our work delivers (or neglects to deliver). Through that lens, if you are not figuring out how to integrate sustainable outcomes into your work, then you are almost certainly under-serving society with your talent. We may feel powerless in the face of such an overwhelming crisis, but it is only by making use of whatever small platforms we have that we can start to collectively course-correct humanity. In practice this means acting wherever we can, including:

• Genuinely incorporating carbon into your business cases and procurement frameworks, crucially considering 'do nothing' and 'repurposing' options in plans, and the carbon benefits that these bring (a minimal sketch of what this comparison might look like follows at the end of this section).
• Proactively seeking to use data services, such as cloud computing, that are demonstrably carbon neutral, as well as proactively managing demand and switching off unnecessary software and services.
• Exploring with your organisation how better data can reduce both cost and emissions, for example by streamlining logistics, reducing reliance on fossil-fuel-burning plant, reducing rework on site, allowing materials to move to and from site in minimally polluting ways (for example, river barges), and integrating sustainable materials into our projects (for example, low-carbon cement).
• Using data to accurately forecast and monitor the carbon impact of your assets, and to identify how to realise the greatest reductions in emissions. Smart places and IoT give us the opportunity to create physical assets that only generate emissions when actively used and which are otherwise dormant and carbon neutral.


• Similarly, ensuring that we maintain data on the downstream (hidden) emissions of our assets, whether that is embodied carbon in materials or the emissions of our supply chain. Only by specifying and mandating data on these emissions can we hold our wider ecosystem to account.
• Moving high-emission activities from the physical to the digital world. This could mean using teleconferencing instead of physical meetings or, more ambitiously, rehearsing and improving the efficiency of our work in digital simulations rather than in the real world.

Of course, this list of possibilities is just scratching the surface. We are yet to find, or deliver upon, the real answers to helping the built environment rapidly reverse climate change. It is both terrifying and exciting, but we've got to start somewhere. We spoke about Neil Postman's quote earlier: technology is not additive, it is ecological. Technology does not merely change something, it changes everything… Sustainability is not additive either. Technology can no longer be used to excuse reduced quality in our work or to subsidise the poor planning, design, construction, and operation of our built environment. We must use data and evidence, instead of PowerPoints and opinions, in our endeavour to pull our built environment into harmony with our natural environment. We are the first generation to experience the outcomes of climate change. We may well be the last able to do something about it.
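Before we close, and to make the first action in the list above concrete, here is a minimal sketch, in Python, of how carbon might sit alongside capital cost when comparing business case options. The option names, the cost and carbon figures, and the shadow carbon price are all hypothetical assumptions for illustration, not data from any real scheme.

# Hypothetical comparison of business case options on cost and carbon.
# All figures, including the shadow carbon price, are illustrative only.

SHADOW_CARBON_PRICE = 250  # assumed £ per tonne CO2e

options = {
    "do nothing": {"capex": 0, "tonnes_co2e": 0},
    "repurpose": {"capex": 12_000_000, "tonnes_co2e": 3_000},
    "new build": {"capex": 30_000_000, "tonnes_co2e": 18_000},
}

for name, option in options.items():
    # Price the embodied/operational carbon and add it to capital cost.
    carbon_cost = option["tonnes_co2e"] * SHADOW_CARBON_PRICE
    total = option["capex"] + carbon_cost
    print(f"{name:>10}: capex £{option['capex']:>11,} "
          f"+ carbon £{carbon_cost:>9,} = £{total:>11,}")

Even a toy model like this forces the 'do nothing' and 'repurposing' options onto the same page as the new build, and makes the carbon cost of building visible alongside the financial one.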

7.16 Epilogue: Of Bytes and Bricks

The world is on fire. Our governments are broke. Disinformation is rife. And at the intersection of these challenges: data and the built environment. Society needs a built environment that meets its needs equitably and sustainably, one which rapidly atones for the sins of the past. Will data get us there? Or is it simply an expensive distraction? We've been mulling it over. Perhaps we all have. Train delays, power cuts, potholes, flaky broadband; we pause, and we wonder: why? There is so much accumulated infrastructure, seemingly enough to meet our every need. But it sure doesn't take much for the cracks to show, literally or metaphorically. In our lifetimes we have seen intangible assets become the dominant assets of our economy. Our consumer products allow us to generate and use staggering amounts of data. AI has gone from an eccentric distraction to borderline miraculous. Yet our experience of applying these rapidly maturing technologies to the built environment is underwhelming. The complexities of our sector lie not in technology but in people, professions, processes, culture, and the primacy of physical assets. Digital technology may have transformed music, socialising, news, entertainment, and romance into ubiquitous and weightless things.


And yet bridges, buildings, pipes, and cables today aren't far removed from the bridges, buildings, pipes, and cables of decades ago. No matter what version of the iPhone we carry in our pockets, we humans still have the same basic needs of shelter, food, water, and warmth. After two years of reflection, reading, and writing on this topic, we must admit that we haven't quite found the pithy meta-narrative that we sought to describe data's role in our sector. In lieu of a snappy tagline, the best that we can offer are diagnoses. In isolation they are not so dissimilar from the challenges faced by other industries. Together, they describe the unique combination of history, culture, and technological limitations that makes applying data to the built environment a unique and profound task.

Serving society: The built environment exists to meet society's needs. Publications such as Data for the Public Good, Flourishing Systems, and the TIP Roadmap to 2030 remind us that we should be seeking nothing less than public wellbeing, resilience, equity, and sustainability. In this context, data is not and cannot be neutral. We must use it to help the built environment meet this larger purpose.

Carbon impact: There is an opportunity cost to all investment in the built environment, one measured in carbon. The direct and indirect impact of the built environment on global emissions is enormous and under-acknowledged. Everyone knows that plastic straws are bad for the environment, but fewer people are aware of the disproportionate footprint of cement. Sustainable solutions do exist, and data has a vital role to play in keeping score, making the sector's carbon footprint more visible, and reducing wasted resources.

Complexity: To work with the built environment is to reckon with complexity. Complexity rarely stems from the assets or the technology, but from the human, societal, and political context. Data solutions that do not reckon with this complexity are unlikely to realise lasting change.

Data footprint: The built environment relies on an incredibly broad range of data: structured alphanumeric records, documents, models, maps, imagery, and datasets that vary in scale and importance. Deriving value from data on the built environment demands a more comprehensive approach, for which there is not always an off-the-shelf solution.

Connected domains: To realise value, our digital domain must act in aid of the physical domain. It's cheaper and safer to make mistakes in models than in real life. However, our decisions are only as good as the connections between the two domains. If our digital domain bears no resemblance to our physical domain, then we are deluding ourselves.

Tribalism and specialism: The sector has accumulated a wide range of professions tracing their origins back to Victorian old boys' clubs. These professions cross our organisational boundaries, and we have built these siloes into our culture. Data is our opportunity to create shared meaning, common definitions, and interoperability, and to begin to optimise across, rather than within, siloes.

Bias towards building: Creating and improving our built environment is what inspires us: we see building as a morally good thing. Delivering projects is also what brings money into our organisations and what gets us promoted or rewarded.


And yet, as we have seen repeatedly, there is an opportunity cost to building, in terms of both money and carbon, and the numbers do not always stack up.

Tacit knowledge: We build, maintain, and operate the built environment through the coordinated labour of human actors. This means that our success in applying data to the built environment relies upon our ability to improve the flow of knowledge between individuals and computers.

Keeping these diagnoses in mind has helped us to think about how the built environment can benefit from emerging data-intensive technologies. Whilst it is obvious that massive neural networks, as a pertinent example, now have profound capabilities, it is less obvious how those capabilities can help us to better serve society or reduce emissions. The diagnoses above start to offer clues. We know that the built environment comes with a data footprint of material spanning hundreds of years. This is far more information than any one human could ever feasibly internalise, but it is a trivial amount of information for AI. We now have access to a technology that can synthesise our entire corpus and explain it back to us as usable, tacit knowledge. In the past we relied on graphs and dashboards; now we can just chat! AI has the potential to quickly integrate and cascade knowledge across the sector. The connection of digital and physical domains allows AI to make sense of information coming from the physical world, whether that is real-time IoT feeds or reports submitted by specialists. This promises a decision-making (or decision-augmenting) engine that can reckon with the systemic complexity of our built environment. The history of blindly assigning decision-making to inscrutable AI isn't great; by assessing this technology in a manner that is mindful of the built environment's idiosyncrasies, we are more likely to arrive at solutions that deliver value. Imagine yourself as an automotive engineer of the 1950s: the first motorway has opened, and there are no speed limits or seat belts. Now look at the social outcomes of automotive engineering: congestion, civilisation designed around cars, pollution, and frequent deaths. We cannot afford for the built environment's data professionals to sleepwalk society into the same outcome.

Investing in data has the potential to transform the way we approach the built environment and meet the ever-evolving needs of society. However, we can only achieve this if those using data remain aware of the unique challenges posed by the built environment. Data provides us with a comprehensive understanding of the impact of our work, both positive and negative, allowing us to be accountable and to make decisions that benefit society and the natural world, rather than solely focusing on personal gain or career advancement. With data, we gain a broader perspective, enabling us to recognise our biases and explore new ways of building, maintaining, and operating. Data will have an impact on our sector regardless; our meta-narrative should be ensuring that its impact is positive, not corrosive.

We started this book by describing it as "a love story about the built environment and the data it produces", and we asked, "will our couple live happily ever after, or pass like ships in the night?" It's fair to say that there has been a lot of flirting, but perhaps we're still waiting for our movie-script ending, that moment of unity and consummation. Sometimes our protagonists seem fated to be together, other times irreconcilably different.


There is the beauty and nuance of data and its deceptive intangibility; the crude materiality of our physical assets and the gargantuan forms of our infrastructure; and, of course, the humans working between those two domains. What we do know is that the story must eventually draw to an end. Perhaps as a tragedy, where we came desperately close but ultimately failed to use data to make our built environment work better for the people within it and the natural environment on which it rests. Perhaps as a feel-good redemptive arc, where we finally start to help people and the natural environment thrive and undo the sins of the past. Let's take what lessons we can from our work to date, and write the rest of the story together.

References

American Psychological Association. (2023). Rationalization. https://dictionary.apa.org/rationalization
BIM+. (2021, July 2). John Eynon RIP. https://www.bimplus.co.uk/john-eynon-rip/
Carruthers, C., & Jackson, P. (2020). The Chief Data Officer's playbook. Facet Publishing.
Chou, Y.-k. (2023). The Octalysis framework for gamification & behavioural design. https://yukaichou.com/gamification-examples/octalysis-complete-gamification-framework/
Civil Engineering. (1999, June). Programme management B.C. https://web.archive.org/web/20070608101037/http://www.pubs.asce.org/ceonline/0699feat.html
Covey, S. R. (1989). The 7 habits of highly effective people. Free Press.
Garrone, E. (2017, May 18). BBC Worklife. https://www.bbc.com/worklife/article/20170518-why-were-different-people-at-work-and-at-home
Gov.uk Ethnicity Facts and Figures. (2022a, July 27). https://www.ethnicity-facts-figures.service.gov.uk/work-pay-and-benefits/employment/employment-by-sector/latest
Gov.uk Ethnicity Facts and Figures. (2022b, July 27). https://www.ethnicity-facts-figures.service.gov.uk/work-pay-and-benefits/employment/employment-by-occupation/latest
Hatter, L., & Dowling, S. (2019). Spotlight on real estate. PwC. https://www.pwc.co.uk/human-resource-services/assets/pdfs/spotlight-on-real-estate.pdf
Infrastructure Client Group. (2023). Welcome to the Project 13. https://www.project13.info/
Leslie, D. (2019, June). Understanding artificial intelligence ethics and safety. The Alan Turing Institute. https://www.turing.ac.uk/sites/default/files/2019-06/understanding_artificial_intelligence_ethics_and_safety.pdf
Lupi, G. (2017, January 30). Data humanism: The revolution will be visualized. https://www.printmag.com/article/data-humanism-future-of-data-visualization/
Michie, S., van Stralen, M., & West, R. (2011). The behaviour change wheel: A new method for characterising and designing behaviour change interventions. Implementation Science, 6, 42.
National Institute of Building Sciences. (2023, June 5). https://www.nibs.org/news/new-built-environment-workforce-report-63-say-its-important-increase-diversity