English Pages 238 [230] Year 2022
Paul Beynon-Davies
Information Modelling A Pragmatic Approach
Paul Beynon-Davies Cardiff Business School Cardiff University Cardiff, UK
ISBN 978-3-030-98804-3
ISBN 978-3-030-98805-0 (eBook)
https://doi.org/10.1007/978-3-030-98805-0
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
I have been engaging with issues of data and its use and effects for over 40 years. During all this time, I have always cringed when people from all sectors of society, economy and polity use the terms data and information as synonyms. Surely, I have always thought, there must be a clearer way in which we can both define and utilise the notion of data as well as the notion of information. In recent times, I have been able to provide more focus to resolving my long-standing unease, and to do this, I have had to ask myself some strange questions, such as why we have data in the first place, what data does for us in practice and why people often experience problems with data, even in the age of mass communication and ubiquitous technology.
In a recent book of mine (Beynon-Davies 2021b), I used the following allegory to highlight what I think is the central source of this ambiguity surrounding data and information. This allegory has been told many times in many different quarters, and it goes something like this. Two young fish are swimming along when they meet an older fish. The older fish greets them while passing by saying, ‘morning, how’s the water?’. Having swum a little further on, one young fish pauses, turns to the other young fish and asks, ‘what’s water?’.
Both data and information are a bit like the young fish’s water—they are an inherent and important part of our surround-world. But because they are mundane and accepted, we all tend to assume that we understand what data is and how it relates to information. For many people, answers to the questions I pose about data seem obvious—we have data because it provides information for us, information enables us to make better decisions and the problems we experience with data are purely down to a lack of good organisation of data or poor processes of data collection. But let us pause for a moment. Are these accepted characterisations true?
Data are represented in data structures, and data structures can certainly be used to inform but about what? Data structures can be used as collective memory of what has happened in some domain or what is happening, but they can also be used to make things happen in the future. All data structures in some sense mis-inform as well as inform, because in the very nature of creating a memory trace of something or someone, the maker of the data structure makes a decision about what is significant to represent and, as a consequence, what is not. Hence, data structures are not only memory traces—they are also deliberate acts of forgetting. Sometimes, data
structures are deliberately created to mis-inform in the sense they may be designed either explicitly or implicitly to portray a particular worldview, and such a worldview may be open to question by various groups and individuals in society. This means that data structures in many settings inherently carry with them the ‘politics’ of their creation. Hence, data structures and the way they are made should not always be seen to be inevitably beneficial because they can be used frankly for some very evil purposes. Many data structures are also not always useful in the sense that the making of such records serves to disable human performance in areas such as decision-making, as well as support such performance. These issues I have with the common sense understanding of data and information transpose over into an unease I have always encountered with information modelling or, as it used to be called, data modelling (another example of how we tend to treat data and information as somewhat the same thing). I have used information modelling many times within practical work in my engagement with industry and the public sector. However, it was only when I was required to teach this technique to students that I experienced difficulties in explaining to novices how I, as an experienced practitioner of the technique, arrived at a particular information model for a certain set of circumstances. To help resolve these difficulties both for myself and my students, I began investigating a better way of approaching information modelling. This led me in the 1990s (Beynon-Davies 1992) to publish a paper which explored the relationship of information modelling to semiotics—the doctrine of signs. Since that time, I have used the notion of a sign to better position notions of data and information and explain more clearly how they relate together. In essence, data are differences made in some substance by some actor. 
Information, in contrast, is an accomplishment made by some actor in his or her encounter with data. For information to exist, it is therefore necessary to have data—information is a set of differences which make a difference to some actor. But data is not ever-present in the world waiting to be ‘collected’ by the actor. Data involves the explicit creation of structures by actors, and in doing so, such actors make decisions not only about what to represent but how to represent it. Once such structures are created, there is no guarantee that information will be accomplished by other actors in their encounter with these data structures. For the relationship between data and information to be achieved, there must be a common ontology shared between actors, which enables them to accomplish information with certain data structures. Such an ontology amounts to a set of conventions established amongst a community of actors about what certain data structures communicate. This more accurate and nuanced understanding of how data relates to information is not just a theoretical exercise—it has practical consequences. It has led us to develop a much more productive way of doing a number of things with information modelling. First, it enables us to explain more clearly the purpose of information modelling—what this technique is actually meant to achieve. Second, it allows us to
explain in a much more straightforward manner the key principles of this technique to novice users. Third, and finally, it enables us to provide much more productive guidance on how to undertake this technique in practice. But don’t take my word on this. To see if I am true to my word on these three points, read on.
Rhondda, South Wales, UK 2022
Paul Beynon-Davies
Contents
1 Introduction
  1.1 Aim and Scope
  1.2 Approach
  1.3 Key Strengths
  1.4 Figures, Examples, Exercises and Solutions
  1.5 Book Contents
  1.6 Conclusion
2 What Is Information?
  2.1 Introduction
  2.2 Information Situations
  2.3 What Information Is and Is Not
  2.4 The Stands for Relation
  2.5 Identifiers
  2.6 Descriptors
  2.7 Communicative Acts
  2.8 Patterns of Information Situations
  2.9 Physical and Institutional (Social) Ontology
  2.10 Conclusion
  2.11 Summary
3 Why Model Information?
  3.1 Introduction
  3.2 A Short History of Information Modelling
  3.3 The Notion of a Model
  3.4 Information Models and Reality
  3.5 What Are Information Models for?
  3.6 Investigating the Ontology of Domains
  3.7 Conversations for Action
  3.8 Visualising Patterns of Information Situations
  3.9 Documenting a Pattern of Information Situations
  3.10 Conclusion
  3.11 Summary
4 Information Modelling from First Principles
  4.1 Introduction
  4.2 Objects and Identifiers
  4.3 Classification and Instantiation
  4.4 Attribution
  4.5 Valuing an Object and Forming an Object Class
  4.6 Association
  4.7 Constraints upon Association
  4.8 Generalisation and Specialisation
  4.9 Generalisation Hierarchies and Lattices
  4.10 Aggregation and Decomposition
  4.11 Institutional Ontology as a Sign Lattice
  4.12 Conclusion
  4.13 Summary
5 Visualising an Information Model
  5.1 Introduction
  5.2 Why Visualise?
  5.3 Notations for an Information Model Diagram
  5.4 Visualising Classes
  5.5 Visualising Relationships of Association
  5.6 Visualising Attributes
  5.7 Visualising Constraints upon Association
  5.8 Visualising Generalisation
  5.9 Visualising Aggregation
  5.10 Institutional Facts to an Information Model Diagram
  5.11 Conclusion
  5.12 Summary
6 Composing an Information Model from Institutional Facts
  6.1 Introduction
  6.2 A Pattern of Information Situations
  6.3 Unpacking the Content of Messages
  6.4 Generating Institutional Facts
  6.5 Validating an Information Model
  6.6 Revising Information Models
  6.7 Conclusion
  6.8 Summary
7 Practical Issues in Information Modelling
  7.1 Introduction
  7.2 Class, Attribute or Relationship
  7.3 Repeating Attributes
  7.4 One-to-One Relationships
  7.5 When to Generalise and Aggregate
  7.6 Strong and Weak Classes
  7.7 Recursive and Ternary Relationships
  7.8 Modelling Time
  7.9 Connection Traps
  7.10 Information Model Patterns
  7.11 Conclusion
  7.12 Summary
8 Information Modelling and Data Systems
  8.1 Introduction
  8.2 Data and Information
  8.3 Data Structures
  8.4 The Ontological Status of Data Structures
  8.5 Data Models
  8.6 The Relational Data Model
  8.7 Normalisation
  8.8 Turning an Information Model into a Relational Schema
  8.9 Visualising Data Structures
  8.10 Identifiers and Candidate Keys
  8.11 Determinancy Diagrams
  8.12 Conclusion
  8.13 Summary
9 Information Modelling in Context
  9.1 Introduction
  9.2 The Place of Information Modelling Within Business Analysis and Design
  9.3 Data and Metadata
  9.4 The World Wide Web and Metadata
  9.5 Information Modelling and XML
  9.6 The Semantic Web
  9.7 Big Data
  9.8 The Notion of a Data Science
  9.9 Conclusion
  9.10 Summary
10 Exercises
  10.1 Introduction
  10.2 Information Classes
  10.3 Classification
  10.4 Relationships
  10.5 Attributes
  10.6 Identifiers
  10.7 Constraints on Relationships of Association
  10.8 Generalisation
  10.9 Generalisation Hierarchies
  10.10 Aggregation
  10.11 Visual Notation
  10.12 Strong and Weak Classes
  10.13 Recursive and Ternary Relationships
  10.14 Composing an Information Model
  10.15 Modelling Time
  10.16 Connection Traps
Appendix: Solutions to Exercises
  A.1 Introduction
  A.2 Information Classes
  A.3 Classification
  A.4 Relationships
  A.5 Attributes
  A.6 Identifiers
  A.7 Constraints on Relationships of Association
  A.8 Generalisation
  A.9 Generalisation Hierarchies
  A.10 Aggregation
  A.11 Visual Notation
  A.12 Strong and Weak Classes
  A.13 Recursive and Ternary Relationships
  A.14 Composing an Information Model
  A.15 Modelling Time
  A.16 Connection Traps
Bibliography
1 Introduction
1.1 Aim and Scope
The aim of this book is to provide a tutorial introduction to information modelling for use on undergraduate and postgraduate modules in information systems, information technology and computer science and even digitally focused modules within business and management. The book will also be of relevance to practitioners looking for a fresh and innovative approach to the design of data systems.
Traditionally, information modelling has been important to technologists tasked with creating data systems of various forms. More recently, it has influenced practices in other areas such as building construction and architecture. It is increasingly relevant as a way of understanding the active role that data plays within business and management and of promoting the planning of business activity around the proper design and management of data systems. Information modelling has been around for some decades, but its importance continues to grow, as the following points show:
• This technique is still very important to the contemporary designer of data systems of many forms.
• The technique is also of much use to the modern data analyst/data scientist in establishing the proper context for data analytics.
• Most contemporary academic courses in computer science, information technology and information systems worldwide cover information modelling somewhere in their curriculum.
• The Association for Computing Machinery places information modelling within its guide curricula for software engineering and information systems.
• The British Computer Society offers a qualification in data analysis and places information modelling at the heart of this endeavour.
• The Data Management Association (DAMA) is a professional association for data managers and includes information modelling within its professional body of knowledge.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
P. Beynon-Davies, Information Modelling, https://doi.org/10.1007/978-3-030-98805-0_1
The book takes a fresh and innovative approach to information modelling based in the author’s previous research and consultancy work. This approach considers information modelling to be an exercise in building an account of the communicative practice important to a group of actors attempting to work in a coordinated manner within some institutional context. This means that the standard constructs of information modelling are located clearly within a solid theoretical background of what we refer to as information situations.
1.2 Approach
Information modelling has been an important technique within the armoury of the business analyst for over four decades. It began with the work of Chen in the 1970s, developed through the work of a range of people working on the so-called semantic data models in the 1980s and settled into the object-oriented frameworks of the 1990s through class modelling. Since that time, the technique has stabilised to form a major part of the toolkit of the contemporary business analyst.
However, there is a key problem with information modelling, which was cogently summed up by David Hay way back in the 1990s. He summarised the central problem faced by practitioners of information modelling as follows: ‘learning the basics of a modelling technique is not the same as learning how to use and apply it . . . [Information] modelling is particularly complex to learn, because it requires the modeller to gain insights into an organization’s nature that do not come easily’ (Hay 1996). Part of the attraction of information modelling, at least at its core, is that it uses relatively few constructs. These constructs are also readily imparted to newcomers to the technique. But students and practitioners attempting to model with this simple approach find it extremely difficult to apply effectively when engaging with actual instances of institutional action.
I make the claim within this book that many of the problems experienced with the conduct of information modelling in practice are due to a misconceived notion of the proper context for information modelling, which I refer to as information situations—situations in which information is accomplished by institutional actors. Within this book, I propose and demonstrate a way of thinking about the nature of information models in relation to information situations which helps resolve many of the difficulties experienced with conducting information modelling in practice.
This leads me to describe an innovative approach to composing an information model which does justice to this different way of thinking about the relationship between an information model, communicative competence and institutional reality. I locate the basis for many of the practical problems experienced with this popular business analysis and design technique in the pragmatics of information models. Pragmatics as a term has at least two senses appropriate to understanding the nature of information models. In one sense, pragmatics is used to denote a sub-field of linguistics and semiotics, particularly concerned with the relationship between signs and their use within context. It is in this sense that we build a theory of information
situations which does justice to a vast amount of literature in this area. In another sense, pragmatics denotes the application of pragmatism—that branch of philosophy associated originally with the work of American scholars Peirce, James, Dewey and Mead. The key principles of pragmatism are that human concepts are defined by their consequences, truth is embodied in practical outcome and learning is controlled inquiry, in which rational thought is interspersed with action. Although there is no direct relationship between pragmatics as a linguistic endeavour and pragmatism as a philosophical orientation, there is evident common ground in the positioning of both knowledge and reality in the centrality of action. I locate problems experienced with practical information modelling in a misconceived understanding of the relationship between the constructs of an information model and their proper context—namely, institutional reality. This is an issue of pragmatics. But I also wish to focus upon the nature of information models as a way-station to institutional action. In this sense, I consider the pragmatic consequences of an information model to be critical to both its design and use.
To understand any modelling technique, we need to understand three things: constructs, notation and principles of application. All three are described for information modelling in a tutorial manner within this book.
The book begins by establishing the bedrock for the student by discussing the nature of information in terms of a theory of information situations. This leads to a discussion of why it is considered important to model information. An account is then created of the key constructs of information modelling—classes, attributes, association relationships, generalisation and aggregation—in terms of this bedrock theory.
Various ways of visualising an information model are then discussed, followed by an account of our innovative way of composing an information model in practice based around an analysis of communicative patterns evident within some current or future domain of organisational action. This leads to a discussion of translating an information model into a design for some data system. As we shall see, this can be undertaken in both a top-down and a bottom-up manner. We then address certain practical issues that arise in the conduct of information modelling. Finally, we discuss the positioning of information modelling within certain areas that influence the modern digital landscape—that of metadata and the semantic web and the developing disciplines of data analytics and data science. Accompanying chapters provide a set of closely integrated exercises and sample solutions.
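For readers who think more readily in code, the key constructs named above (classes, attributes, association relationships, generalisation and aggregation) can be given a rough shape as follows. This is only a minimal sketch under stated assumptions and is not a notation used in this book: the class names Person, Student, Module and Course, their attributes and the helper build_example are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    """A class with attributes; person_id plays the role of an identifier."""
    person_id: str
    name: str


@dataclass
class Student(Person):
    """Generalisation: Student is treated as a specialisation of Person."""
    enrolled_year: int = 2022


@dataclass
class Module:
    """Association: a module is linked to the students who take it."""
    module_code: str
    title: str
    students: List[Student] = field(default_factory=list)


@dataclass
class Course:
    """Aggregation: a course is regarded as being made up of modules."""
    course_code: str
    modules: List[Module] = field(default_factory=list)


def build_example() -> Course:
    """Build one small, hypothetical instance of each construct."""
    alice = Student(person_id="P1", name="Alice", enrolled_year=2022)
    modelling = Module(module_code="M1", title="Information Modelling")
    modelling.students.append(alice)  # associate a student with a module
    return Course(course_code="C1", modules=[modelling])
```

Calling build_example() returns a Course that aggregates one Module, which in turn is associated with one Student. A sketch of this kind shows only the structural side of the constructs; it says nothing about the information situations from which, on the account given in this book, such a model should be derived.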
1.3 Key Strengths
Compared to existing literature in this area, this book has a number of key strengths. First, no prerequisite knowledge is assumed on the part of the reader. Students and practitioners are tutored in the development of information modelling from first principles. The book covers all the core principles of both entity-relationship diagramming and class diagramming—the two major approaches to information modelling.
As we have mentioned, problems with information modelling experienced with traditional approaches are the result of a misconceived notion of the relationship between the constructs of an information model and institutional reality. It is my belief that spending some time unpacking the nature of both data and information up-front for the student and providing some solid theoretical basis for this distinction is critical to getting students to engage effectively with the intricacies of information modelling. Therefore, unlike existing texts in this area, which tend to be largely atheoretical, this book builds a coherent account of information modelling based in strong theory. This theory is introduced in an informal manner and through a number of practical examples of information situations relevant to a range of institutional settings.
There is nothing as practical as a good theory. This book therefore provides solid guidance on how to produce information models in practice. The text promotes a practical approach to information modelling based around the analysis of communicative practice within delimited domains of organisation. Numerous examples are peppered throughout the book to illustrate constructs and their application. Detailed exercises in information modelling with solutions are also provided.
The author has over 30 years of experience in the field both in teaching the subject and in applying information modelling in practice. He has published a range of texts which impinge upon data and the design of data systems—his textbook on database systems went through three editions. The current text on information modelling forms a companion volume to his existing texts on Business information systems (3rd edition—2020), Business analysis and design (2021a) and Data and society (2021b).
1.4 Figures, Examples, Exercises and Solutions
It is important to recognise that although information modelling is not necessarily a visualisation technique, some form of visualisation is normally expected in its application. Therefore, given the nature of the subject matter, the book contains a substantial number of figures. All figures are drawn by the author. Numerous in-text examples of the concepts of information modelling and their application are included throughout the text. A separate chapter is devoted to a range of exercises which the reader can use to test understanding and application of the technique. A corresponding chapter of solutions is also provided to support learning.
1.5 Book Contents
In this section, we provide a quick overview of each substantive chapter within the book. The chapters are designed to be read in sequence. The early chapters build an account of information modelling from the bedrock of a theory of information situations. Later chapters discuss a number of practical issues concerned with the application of the business analysis and design technique. The conclusion
demonstrates a larger context for the application and importance of information modelling.
Chapter 2: What Is Information?
The central claim of this book is that to conduct information modelling effectively, both the student and practitioner need to understand the nature of information. Within this opening chapter, we build a theory of information in terms of situations in which information is accomplished by actors. This enables us to conclude that any successful attempt at information modelling must begin with a close understanding and analysis of the information situations pertinent to the domain in focus. This domain may be an existing domain of communicative action, or it may be an entirely new domain of communicative action.
Chapter 3: Why Model Information?
This chapter considers what a model is and how models relate to notions of institutional reality. Traditional approaches to information modelling, as we shall see, regard the relationship between an information model and reality as one in which reality is made up of things with properties and an information model is composed of formal statements which correspond to objective facts about such things. We shall show that this conception leads to certain problems with the conduct of information modelling. This leads us to present a contrasting account which we believe offers a more sophisticated and accurate representation of the relationship between an information model and reality. We shall show how our framing of an information model provides a better way of considering not only the true purpose of an information model but also how to approach the investigation of institutional domains.
Chapter 4: Information Modelling from First Principles
In this chapter, we build an account of the major constructs of information modelling using our theory of information situations as its bedrock. We start with the notion of an object referred to through an identifier. This leads us to consider the process of classification, which involves grouping objects that share common characteristics into an information class. Information classes are defined in terms of attributes held to be common amongst a group of objects, but they are also defined in terms of their relationships of association with other classes. Such relationships of association are further defined in terms of certain constraints, known as cardinality and optionality. We then look at two important processes of further abstraction sometimes considered important to modelling institutional ontology with classes—that of generalisation and aggregation.
Chapter 5: Visualising an Information Model
Within Chaps. 2, 3 and 4, we compose an information model using the canonical form of a series of binary relations, and we use such binary relations to represent a set of institutional facts about the content of communication relevant to the domain in question. However, information modelling originally developed as a diagramming
technique meant to aid the work of analysts and designers of data systems of various forms. Within this chapter, we demonstrate various ways of building a visualisation of an information model from an established set of institutional facts.
Chapter 6: Composing an Information Model
Within this book, we view an information model as a model of important aspects of institutional ontology—a model of what actors within some domain deem to exist, how they communicate about such things and how they use such communication to coordinate joint activity. This way of thinking about both the content and the purpose of information modelling allows us to develop a clear way of composing an information model which does justice to some institutional ontology under investigation. Within this chapter, we demonstrate how to build information models either from an analysis of the instrumental communicative practices within some domain or by designing a set of communicative practices for some new domain of action.
Chapter 7: Practical Issues in Information Modelling
In this chapter, we examine a number of practical issues associated with the conduct of information modelling and how these may be resolved. We first consider the issue of interpretive flexibility—the fact that the modeller may choose to model the same thing as a class, attribute or relationship depending upon the institutional context under consideration. The same flexibility applies in the case of using generalisation and aggregation within information modelling. Then, we consider the distinction between strong and weak classes and notions of ternary and recursive relationships. This leads to a discussion of how to include time within an information model and the important problem of connection traps and how to avoid them.
Chapter 8: Information Models and Data Systems
Information modelling is typically directed at the design of some data system. The architecture of some data system is defined in terms of some data model, of which one of the most popular is that of the relational data model. The design of some relational database, which is referred to as a schema, is best understood through a visualisation technique known as dependency diagramming. This technique offers a straightforward route for conducting a process important to the design of a relational schema known as normalisation.
Chapter 9: Information Modelling in Context
Within this chapter, we consider the context of information modelling in a number of different senses. First, we consider how information modelling fits within the larger practice of business analysis and design. Second, we consider how information modelling has relevance not only to modelling data but also to the modelling of metadata. This leads us to discuss the way in which information modelling is relevant within the design of Web infrastructure. Third and finally, we consider how an understanding of information modelling is important to building a more
nuanced approach to big data as well as to the more overarching and emerging discipline of data science.
1.6 Conclusion
The late novelist Ursula Le Guin in her quartet of fantasy novels (Le Guin 1993) described the world of Earthsea in which magic is a reality. Magic is enacted by key actors in this world, namely, wizards. Such actors spend many decades in learning how to accomplish magic through the use of special words, and the use of such words by these actors allows them to manipulate things in the world of Earthsea. But we as actors in the worlds we build are also very much reliant upon the words we use. In fact, our use of words is not entirely remote from the wizard’s use of words in Earthsea. This is because words are key examples of signs and signs always have two faces. Our use of words as signs allows us to describe our institutional worlds but also to reflect upon these realities. But when we use words, we also construct major aspects of the reality we are describing. This means that we all engage in sign-magic on a daily basis. As we shall see, this book is very much about words or more generally signs. This is because information modelling by its very nature focuses upon the use of signs by actors within institutions of many different forms to get things done.
2 What Is Information?
2.1 Introduction
To model information, we need first to understand as clearly as we can the nature of information. In other words, to build any coherent account of information modelling and how to do it properly, we first have to know what information is and what it is not. This is not as easy as it seems. Part of the problem is that information as a concept is normally taken for granted by the disciplines which most deal with it, such as computer science, information systems, information science and information technology. This attitude of mundane acceptance has also migrated without questioning into the newer areas of big data, data analytics and data science (Chap. 9). Another part of the problem is that the concept of information, as we have argued elsewhere (Beynon-Davies 2013), has many different connotations in a multitude of different literatures. So, to help steer a clearer path through this conceptual murk, we have developed a theory of those situations in which information is clearly present. By using this theory as our guide, we can clearly see what information is and what information is not. This rendering of information situations provides us with a number of advantages over other literature. First, it provides us with a much clearer understanding of the context for information modelling in the sense that we can clearly identify why we are doing information modelling and for what purpose. Second, it allows us to build a much better account of the core constructs of information modelling. Third and finally, it provides for us a much more productive route for describing the proper conduct of information modelling. Using our approach, the newcomer to information modelling can clearly understand how to compose an information model from an analysis of the communicative competence appropriate to some domain of institutional action. Figure 2.1 is an attempt to visualise our theory, developed from a range of the author’s previous work (Beynon-Davies 2021b).
The eminent biologist Gregory Bateson (1972) usefully defined information as ‘. . . any difference that makes a difference’. Information, as we shall see, is the difference or set of differences that an encounter with some structure makes to an actor. Information is an accomplishment made by an actor or actors and is very much bound up, as we shall see, with instrumental and formal communication used by actors within institutional settings to get things done. In this chapter, we provide a discussion of our theory’s key components and what such components mean in terms of information and demonstrate how an understanding of information situations provides a more resilient route into information modelling than that offered within traditional accounts of the subject.
Fig. 2.1 A model of information situations (diagram labels: actors A1, A2; structures S1, S2; transformations T1, T2; message M1; domains: Articulation, Communication, Coordination; Environment)
2.2 Information Situations
Figure 2.1 is an illustration of what we refer to as information situations—situations in which information is accomplished. We contend that such situations always consist of a number of essential components—actors, structures, messages and actions—all taking place within some environment. For the information modeller, the environment of an information situation is typically some domain of institutional action. Let us consider these components in more detail.
Actors
We use the term actor in a deliberately abstract way here to denote anything that can act. Actors transform their environment in some way and include not only humans but also other animals, machines and certain classes of artefact. Two human actors are indicated in Fig. 2.1, which we have labelled as A1 and A2. However, as we know, systems of information technology also form key actors within institutional settings.
Structures
Structures are things within the environment of actors which undergo a certain form of transformation. A structure is brought into existence by particular actors by
making differences within some substance evident in the environment. Not all structures are equal within information situations. Within this book, we focus upon structures explicitly produced and used to communicate things between two or more actors. This type of structure is known as a data structure. Records of all forms are data structures, as are lists, registers, ledgers and so on. Within Fig. 2.1, a data structure S1 is transformed by transformation T1, whereas transformation T2 is undertaken upon some undefined physical structure S2.
Messages
Data structures are transformed through acts of articulation by actors. Through the articulation, such as (T1), of data structures, such as (S1), messages can be conveyed as signals between one actor and another. This signalling of messages is the essence of communication. One actor creates or effects some articulation of a structure, and one or more other actors sense or read the changes made to the structure. Through this process, two or more actors commune—they arrive at a common understanding of something. In Fig. 2.1, M1 constitutes the message transmitted by actor A1 to actor A2 through the articulation of structure S1.
Actions
So, it should be evident that not one but three types of inter-related or coupled action are illustrated and labelled in Fig. 2.1. There is first the act which involves articulation of some data structure by some actor. Then there is the act of communing through this structure between one actor and another—of collectively agreeing as to what structures stand for—this is the essence of communication. And, finally, there is usually a responsive action on the part of the receiving or sensing actor, which may be to articulate some further structure within the environment. This domain of action we refer to as coordination because most of the structures that we focus upon within this book (data structures) are transformed with the intent of coordinating the joint activity of multiple actors working within some environment.
Environment
Information situations typically occur in a repetitive manner within delimited settings which we refer to as institutional domains or domains for short. An institutional domain is an environment constructed or reproduced from patterns of actions performed repeatedly by actors working to achieve joint activity in the fulfilment of established goals. An institutional domain may be the whole of or a coherent part of some private, public or voluntary sector organisation. Or it may refer to something larger such as systems of government, social care or policing within the nation-state.
Sequence
There is a necessary sequencing of action in any information situation—certain actions always occur before other actions. There is also a necessary temporal delay or lag between various forms of action. Articulation must always occur before communication, and communication must always occur before coordination. There is also a necessary lag between an actor articulating some data structure, that data
structure communicating something to some other actor and that actor coordinating their activity. The lag or delay between articulation, communication and coordination may be a matter of seconds, but it can also be a matter of hours, days and even weeks.
Example: Emergency Response
Now consider just one information situation from one particular institutional domain—that of medical emergency response. This is an institutional domain composed of many different information situations which we shall examine in some detail throughout this book. The information situation illustrated in Fig. 2.2 provides ‘flesh’ to the component elements from the more abstract Fig. 2.1. We must remember that this is an abstraction of only one situation extracted from a pattern of actual information situations that occur every hour of every day amongst working medical emergency organisations within the UK, which is used as our key example in this book. However, information situations such as this having similar characteristics occur between actors attempting to accomplish information in this manner throughout the world. The dotted arrows indicated upon Fig. 2.2 represent the distinct sequence of action evident in any information situation: a data structure must be articulated before it can be used as a message to some other actor and before this actor can take appropriate further coordinated action. Within this information situation, one actor, an ambulance dispatcher, creates a data structure known as a dispatch message, which is entered into the incident system of emergency response. This dispatch message as a signal is transmitted to an ambulance station where it is read (sensed) a few seconds later by another actor, an ambulance driver. This ambulance driver effects certain action in response to this message, namely, to drive his ambulance to the incident as indicated in the message. This whole pattern is likely to be enacted in a few minutes, but in times of high demand, there may be an appreciable delay of hours between the receipt of the message and the action of driving to the incident. ◄
Example: Online Grocery
Or consider another information situation from a different institutional setting or domain, this time in the private sector. Figure 2.3 illustrates an information situation repeatedly exercised within a current area of commercial activity—that of supermarket retail over the Internet and Web—sometimes referred to as online grocery (Beynon-Davies 2017). Here, a customer creates an online grocery order on some digital commerce website. This order communicates to some grocery operator some time later a request to pack and deliver the indicated groceries to a specified location on a specified date. Again, at a further point in time, this communication triggers a grocery delivery to the customer by another actor, namely, a delivery driver. ◄
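The necessary sequencing described above (articulation before communication, communication before coordination) can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the theory itself: the class name InformationSituation, its method names and the numeric stage field are assumptions introduced here, and the sample values are taken from the emergency response example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class InformationSituation:
    """Hypothetical sketch: one pass through articulation, communication, coordination."""

    domain: str                      # the environment (institutional domain)
    message: Optional[str] = None    # M1, set once a data structure is articulated
    stage: int = 0                   # 0 none, 1 articulated, 2 communicated, 3 coordinated
    log: List[Tuple[str, str, str]] = field(default_factory=list)

    def articulate(self, actor: str, structure: str, content: str) -> None:
        # T1: an actor makes differences in a data structure (S1)
        self.message = content
        self.stage = 1
        self.log.append(("articulation", actor, structure))

    def communicate(self, receiver: str) -> str:
        # The receiving actor senses (reads) the articulated structure
        if self.stage < 1:
            raise RuntimeError("articulation must occur before communication")
        self.stage = 2
        self.log.append(("communication", receiver, self.message))
        return self.message

    def coordinate(self, actor: str, action: str) -> None:
        # T2: the receiving actor responds with some further action
        if self.stage < 2:
            raise RuntimeError("communication must occur before coordination")
        self.stage = 3
        self.log.append(("coordination", actor, action))


situation = InformationSituation(domain="emergency response")
situation.articulate("dispatcher", "dispatch message",
                     "DIRECT[Go to incident X at location Y]")
received = situation.communicate("ambulance driver")
situation.coordinate("ambulance driver", "drive ambulance to incident")
print([stage for stage, _, _ in situation.log])
# prints ['articulation', 'communication', 'coordination']
```

The sketch deliberately ignores the lag between the three forms of action; it only enforces their order, so calling coordinate on a fresh situation raises an error.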
Fig. 2.2 An information situation from emergency response (diagram labels: A1: dispatcher; T1: create dispatch message; S1: dispatch message; M1: DIRECT[Go to incident X at location Y]; A2: ambulance driver; T2: drive S2 to incident Y; S2: emergency ambulance; domains: Articulation, Communication, Coordination; environment: Emergency response)
Fig. 2.3 An information situation from online grocery (diagram labels: A1: customer; T1: create grocery order; S1: grocery order; M1: DIRECT[Deliver groceries S2 to household Y on date Z]; A2: delivery driver; T2: deliver groceries S2; domains: Articulation, Communication, Coordination; environment: Online grocery)
Exercise: Driverless Cars
Now consider an information situation in the near future where driverless electric cars may be shared amongst a pool of passengers in a large urban area such as metropolitan London. In this future information situation, a carpool passenger places a car transport request to the shared car indicating pickup and drop-off locations. This request communicates not to a human actor but to the car itself, causing it to automatically schedule the journey into its movements and execute the trip in the directed fashion. Try to visualise the component elements of this information situation in the manner of Figs. 2.2 and 2.3. In other words, identify the actors involved, the data structure articulated, the message communicated and the coordinated action taken.
2.3 What Information Is and Is Not
Our model of an information situation may at first glance seem rather abstract, but it is useful for highlighting a number of different ways in which information is defined in various literatures. Let us first review these different ideas of what information is. We shall then show how these different conceptions of information relate to various parts of our model of information situations.
Information as Stuff
One particularly common conception of information is one in which information is portrayed as fundamental ‘stuff’ which helps any physical system maintain organisation (Stonier 1994). As such, information is believed to be an objective phenomenon, independent of and the same for all actors. This conception underlies the classic approach to information, evident in the theory of Shannon (1949). Within
‘information theory’, information lies in the signal which conveys the message and is associated with the degree of order (negentropy) evident in the signal.
Example
In this perspective, as far as our model of information situations is concerned, M1 as a message transmitted as a signal ‘contains’ information. The ordered set of differences comprising this signal are taken to be the very stuff of information. ◄
Information as Interpretation
Another particularly dominant perspective is to conceptualise information as the act of interpretation of some signal by some actor. In this sense, information is not inherently ‘contained’ in the message itself. Instead, it is seen to be created within an act of sense-making conducted upon the message by an individual actor (Boland 1987). In this guise, information is seen as a subjective phenomenon, bound to some actor. Here, information is associated with some notion of inner processing undertaken within the mind or psyche of actors—a process of information.
Example
In this perspective, information requires an actor such as A2 to assign some meaning to the set of differences evident in message M1. The consequence of this is that the same set of differences made in some substance may be interpreted as meaning different things to actor A3 than to actor A2. ◄
Information as Intentionality
More recently, information has been considered an inter-subjective phenomenon, reliant on the ‘negotiation’ of collective or shared intentionality (Searle 1983). As such, information is considered an inter-subjective accomplishment amongst groups or communities of actors. Here, information is related to the shared ways in which actors build an ‘aboutness’ between sensed structures evident in the environment and mental states.
Example
In our model of information situations, such collective intentionality involves the aboutness between structures in the environment, such as S1, and some internal state which causes the actor to emit a certain message, such as M1. In turn, message M1 becomes a state of the world which causes some mental state in all receiving actors, causing them to effect certain actions, such as T2. ◄
Information Does Not Exist
Finally, we should mention the most radical position which proposes that information does not exist—it is a null concept. Stimulated by the work of Maturana and Varela (1987) and their idea of an autopoietic (self-producing or self-organising)
system, this viewpoint maintains that information is merely a convenience imposed by observers upon situations of behavioural coordination through structural coupling. In this sense, we observe patterns of order in some situation, such as certain patterns of messages and actions. But such patterns merely correspond to invariances between the actions of certain actors in relation to the environment. We impose upon such patterning the convenient idea of information being ‘conveyed’ or ‘communicated’ as a useful way of accounting for the behavioural coordination which corresponds to such invariances.
Example
According to this view, the observer of some situation perceives that whenever actor A1 does something, such as transform the data structure S1, actor A2 does something in response, namely, transform the structure S2. The observer infers the presence of some information through the evident patterning of the behaviour of these actors. In other words, there is an observed invariance between the presence of S1 and the occurrence of T2. ◄
Information Arises Within Situations in Which All Elements Are Present
The key consequence we take from our theory of information situations is that an accurate account of information must encompass all four viewpoints, but in one entangled whole. Information is objective because it relies upon the materiality of signals. In other words, a signal is always made up of a set of physical differences made in some substance, and such differences can be objectively observed by all. Information is subjective because it is built from the interpretation of structures by actors. This means that the differences made to some structure may be interpreted differently by different actors. Information may be inter-subjective when it amounts to the outward expression of a collective intentionality which associates certain states of the environment with certain mental states. This means that information only becomes possible when actors collectively agree that certain things stand for certain other things. Finally, in terms of each of these conceptions taken independently, information may not exist. We take this to mean that information is not a substantive concept solely reliant upon any component part of some information situation. Instead, it is better to propose that information must have all the elements of the situation illustrated in Fig. 2.1 present to be deemed to exist. Information is always a phenomenon which emerges from the continuous exercise or accomplishment of some pattern of actors, messages and actions all working within some environment.
Exercise
The next time you see a news report which uses a phrase such as ‘the information tells us that’, try to step back and think how the term information is being used in this context.
2.4 The Stands for Relation
So, let us place our cards on the table. We make the central claim within this book that to understand how to conduct information modelling within effective practice, the modeller needs to understand how the constructs of this approach relate directly to information situations as we have described them. As we shall see, information modelling primarily focuses upon one crucial part of an information situation, namely, the messages generated and transmitted within information situations. Information modelling also focuses mainly upon the content of such messages. Finally, information modelling does not regard all messages made by actors as important. Instead, it tends to focus upon messages that attempt to get things done by actors within delimited institutional settings. Saying that information modelling focuses upon the content of messages does not mean that we can ignore other elements of information situations. Information situations always form the proper context for information modelling work, and a proper understanding of information situations by the modeller is critical to the effective construction of an information model. However, given such understanding, information modelling itself makes no attempt to cover and represent all the elements contained in information situations. To simplify somewhat, information modelling is particularly concerned with how messages are used to build intersubjective agreement (collective intentionality) about the meaning of things amongst a group of actors. We shall see in turn that an information model primarily concerns itself with a limited part of messages, which we refer to as the content of messages. The content of messages consists of the signs used to stand for things of importance within some institutional setting to actors within such a setting. An information model is an attempt to map the patterns of such content evident within some delimited domain of human and machine action.
Information, as we have seen, fundamentally relates to acts of communication between people and between people and machines. But information is always directed at achieving coordinated action. The informed person makes decisions about appropriate action in particular situations. An information model is an attempt to represent a limited but important part of the patterns of information situations evident in some delimited domain of institutional action. Typically, an information model builds a representation of what actors normally communicate about in pursuit of instrumental action. This model, as we shall see, is built in terms of certain constructs which enable us to identify and describe the things communicated about. To identify and describe such things, we utilise the constructs of classes, attributes and relationships between classes. Classes, attributes and relationships are key examples of signs. Signs are important because it is through the application of signs that we as actors make sense (impart meaning) about the world. But we don’t just make individual sense. Through signs, we accomplish collective or inter-subjective meaning—we make sense between ourselves by achieving a common understanding of the aboutness of things—how one thing is about another thing. Through such collective meaning,
we coordinate our joint activity. All this, of course, is the essence of information situations—the linkage between structures, messages, actions and actors. But let us become a little more analytic in our understanding of signs. According to the American philosopher Charles Sanders Peirce (Atkin 2016), a sign is (Α) some thing (Β) that stands to somebody (Γ) for some other thing. The three signs utilised here to segment out parts of the definition of a sign correspond to the first three letters of the Greek alphabet (alpha, beta and gamma).
Example
Consider the simplest of signs—the pointing finger. The pointing finger is a classic example of a sign. Indeed, it is an embodied sign: a sign produced by manipulation or transformation of the human body. The pointing finger is a sign because it stands for something else. In this case, it directs our gaze to some other thing. A human smile is another common example of an embodied sign—the smile tells us something of the inner state of the actor making the smile. A smile is taken to stand for an inner state such as an emotion of happiness or joy. ◄
In the first example, a pointing finger is the thing that stands for the thing being pointed at to both the actor producing the gesture and to the person observing the gesture. In creating this collective stands for relation, we produce or accomplish meaning. Hence, to read written Greek, we need to agree on the meaning associated with the signs of the alphabet—what each sign stands for. The name for this collective building of such meaning amongst a community of actors is an ontology.
Exercise
Think of some other forms of embodied sign such as a clenched fist. If someone waves a clenched fist at you, what is this embodied sign meant typically to stand for?
2.5 Identifiers
Within information modelling, two of the most important types of signs are identifiers and descriptors. More commonly used terms for descriptors are properties and attributes. When signs refer to something, they are said to be identifiers for things. Alternatively, signs may describe something, in which case they are descriptors—properties or attributes of something. We use the term thing here in an entirely neutral way to stand for anything that can be referred to or described by actors within some institutional setting. Let us first examine the use of signs as identifiers (we shall look at descriptors, properties or attributes later). So, identifiers are signs which refer to something. More precisely, an identifier is anything which can be taken to refer to some other thing
across time and space to multiple actors. Referring is a critical function within communication which allows a sender to specify one and only one thing to which the sign within a message applies while also providing the means for a receiver to identify the thing from the sign/message. Identifiers are particularly useful within acts of communication because they can refer to some instance of a thing without actually the need to describe it. They can also refer to this instance across many different information situations. Example
For instance, personal names such as ‘John Smith’ are typical identifiers, while a definite description of this person might consist of the phrase ‘the man with red hair and a pronounced limp’. Red hair and a pronounced limp can be taken to be descriptors of the person. ◄ Example
Consider a sign important to many institutional settings—the passport number as a personal identifier. Each country in the world is able to create its own form for such an identifier, and each passport identifier or number refers to one person: [ REFERS TO ] To take just one example of the use of such an identifier, within the UK, a passport number currently consists of nine digits: For instance: [109999555 REFERS TO John Smith] ◄ Note that we cannot actually represent or record as a fact the relationship between a physical thing and a sign directly. We actually have to use other signs as proxies. The fact we have just listed in relation to a passport number actually relates two identifiers. One is a ‘natural’ identifier and consists of a personal name; one is a ‘surrogate’ identifier, created by a particular institution (in this case, the UK Passport Office on behalf of HM Government) to uniquely refer to a certain person. Both natural and surrogate identifiers can refer to some thing, but surrogate identifiers enforce the uniqueness of reference across information situations. Example
Hence, the surrogate identifier 109999555 will always refer to one and only one British citizen. The natural identifier John Smith is sufficient to refer to this person in many contexts. However, in certain situations, the referring function will break down, potentially because there is likely to be more than one person named John Smith in the UK. ◄
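The difference between the two kinds of identifier can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the register, the function names and the second passport number (208888444) are invented here and do not come from any real passport system.

```python
# Hypothetical register mapping a surrogate identifier (passport number)
# to a natural identifier (personal name). The second entry is invented
# purely to show two people sharing one name.
passport_register = {
    "109999555": "John Smith",
    "208888444": "John Smith",
}


def resolve_surrogate(number):
    """A surrogate identifier refers to at most one person."""
    return passport_register.get(number)


def resolve_natural(name):
    """A natural identifier may match several people; return every candidate."""
    return sorted(
        number for number, person in passport_register.items() if person == name
    )


print(resolve_surrogate("109999555"))
print(resolve_natural("John Smith"))
```

Looking up the passport number yields a single person, whereas looking up the name yields more than one candidate, which is the situation in which the referring function of a natural identifier breaks down.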
So, an identifier such as a passport number is a sign—an identifier which refers to an instance of a person. More generally, for Charles Sanders Peirce, any sign is a rule taking the form: [X stands for Y to Z in C] Some thing X stands for some other thing Y to some actor Z within some institutional context C. The terms X, Y, Z and C in this rule are placeholders, meaning that they refer to some as yet unspecified value of something. We can make this rule work for us by substituting specific terms into each of the placeholders. Example
So, the term 109999555 is an identifier (X) if it stands for a specific person John Smith (Y) to a specific actor or role such as a border control officer (Z) within some institutional context such as the domain of British citizenship (C). ◄ The implementation of any stands for relation is a convention. The relation between some thing and what it is taken to stand for relies purely upon the weight of precedent and is unlikely to emerge without the presence of such precedent—the idea that something occurs because it always has occurred. This means that most signs are not fixed but arbitrary in the sense that there is no inherent linkage between the terms X and Y to actors such as Z in the rule above. The stands for relation is merely accepted to be the case amongst some community of actors because it has always been the case within this institutional domain. Example
The fact that the spoken and written word Paul is taken to stand for me the author as a person is purely arbitrary in the sense that prior to my being named on a birth certificate, there was no formal identifier for me as a person or citizen of the nation-state into which I was born. I might have been identified with an entirely different name, such as Jack or Rhys. However, at the time of registering my birth, a convention was established through an act of communication known as a declaration—that from hereon, the constitutive rule X (Paul) shall be taken to stand for Y (me) to all actors (Z) in all institutional domains (C). ◄ But note that there is a certain magic happening with the use of signs such as identifiers within institutional settings. The assignment of identifiers to things serves not only to identify such things to the institution concerned; they bring these things into existence for the institution or more precisely to actors communicating within this institutional domain. Hence, the assignment of identifiers to things serves to help define an important part of the so-called ontology of the institution—its notion of what things exist or to put it another way what reality is. So rules of the form X
stands for Y to Z in C are best seen as constitutive rules in the sense that they serve to constitute (produce and reproduce) the ontology of some institutional domain—they create and recreate the very notion of what things are important within this domain to actors communing with each other (Searle 1995). Exercise You the reader will have a personal name, which is a natural identifier referring to you. But, try to think about all the other identifiers used to refer to you, and jot these down. For instance, if you are a student, what identifies you to the university? It is probably more likely to be a code rather than a personal name. What about identifiers used to refer to you by other organisations you interact with such as banks, supermarkets and so on?
2.6 Descriptors
As we have mentioned, things are not only referred to by identifiers; they are described by descriptors, sometimes called designators. Information modellers tend to refer to descriptors as properties or attributes of some thing, so we shall adopt this naming practice from hereon. Example
A passport as an identity token does not just, of course, contain details of the passport identifier which refers to a particular person. The passport number as the main identifier is not the only sign used on a passport. Hence, when a particular passport as a data structure is issued, it serves to declare a whole series of what we shall call institutional facts about the person, such as:
[109999555 GIVEN NAME Joe]
[109999555 SURNAME Bloggs]
[109999555 DATE OF BIRTH 15/03/1957]
[109999555 SEX male]
[109999555 NATIONALITY British]
◄
The relations between signs in this example are all matters of description, designation or attribution. In other words, they all attribute particular values to an identified person. Attributing such values to an identified thing serves to describe that thing.
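One way to picture attribution is to treat each institutional fact listed above as a triple of identifier, attribute and value. The following Python sketch makes that reading concrete; the `Fact` type and the `describe` function are hypothetical names introduced for illustration, not constructs defined in this book.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One institutional fact: [identifier ATTRIBUTE value]."""
    identifier: str  # sign that refers to the thing
    attribute: str   # name of the descriptor
    value: str       # value attributed to the identified thing


# The facts declared by the passport in the example above.
facts = [
    Fact("109999555", "GIVEN NAME", "Joe"),
    Fact("109999555", "SURNAME", "Bloggs"),
    Fact("109999555", "DATE OF BIRTH", "15/03/1957"),
    Fact("109999555", "SEX", "male"),
    Fact("109999555", "NATIONALITY", "British"),
]


def describe(identifier, known_facts):
    """Collect every value attributed to the thing an identifier refers to."""
    return {f.attribute: f.value for f in known_facts if f.identifier == identifier}


print(describe("109999555", facts))
```

The identifier does the referring and the attribute-value pairs do the describing; an identifier with no recorded facts simply yields an empty description.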
Exercise A census of all the persons resident is normally taken in countries such as the UK every 10 years. See if you can find out both how citizens are identified and what descriptors are normally attributed to persons upon the national census.
2.7 Communicative Acts
One critical definition of the term to commune is that it involves the interchange of thoughts and feelings between actors. In terms of our model of information situations, an information model primarily concerns itself with what we refer to as the communication domain—the realm in which messages are used to commune between two or more actors. Example
Consider a simple example of an information situation in which an act of communication (a communicative act) takes place. Person A looks across at Person B, who is at the opposite end of a room. She holds up a hand and points a finger upwards, clenching her other fingers in a fist. What will B take this to mean? Will he take it perhaps as an insult, a command to provide one of something or a message that there is something stuck on the ceiling? ◄ In forming the shape of the pointing finger, person A is making a set of differences with a certain substance, namely, a certain part of her body. Person B encounters this data structure but must make a decision as to what he thinks is the most appropriate meaning to assign to this act performed by person A. When this decision is made, then it makes a difference to actor B in the sense that their future action will depend on this accomplishment. Perhaps, they will later meet at an agreed place at 1 o’clock. The most important element or component of an information situation as far as information modelling is concerned is the message. But there are actually two major parts to any message formed within an information situation: the content of the message and the intent of the message. The intent of a message establishes what the actor is trying to achieve with the message, that is, the purpose of the message. In contrast, the content of the message consists of the things identified and described by the signs which make up the message. In the example of the pointing finger, the content of the message is the pointing finger as a sign which is taken to stand for a meeting at 1 o’clock. The intent of this message is what is referred to by the philosopher John Searle (1970) as a directive. Through this message, actor A is requesting or directing actor B to meet her at 1 o’clock.
Example
Consider the message in Fig. 2.2—DIRECT[Go to incident X at location Y]. The intent of the message here is to direct the actor receiving the message to do the things described in the message. The content of the message is composed of an emergency incident identified by X and located at a location identified by Y. ◄ Example
A communicative act from a manufacturing domain is illustrated in Fig. 2.4. Two critical actors within this domain are an inbound logistics controller and an inbound logistics operator. Many communicative acts are enacted by these actors within the daily business of their work. One such communicative act is illustrated here as a speech bubble. The inbound logistics controller is the sender of a message and the inbound logistics operator the receiver of this message. The elements within the square brackets consist of the content of this message (the things identified or described): in this case, a delivery and where it is to be placed. The keyword DIRECT refers to the intent or purpose of the message. Here, the inbound logistics controller is directing that the inbound logistics operator do something indicated by the message itself, namely, ensure that a delivery is checked after being moved to a manufacturing bay. ◄
So, information modelling is interested in acts of communication undertaken by actors within institutional domains, but it is not interested in all communication enacted within these settings. Information modelling is interested specifically in communication that gets things done. In other words, it is interested in communication that helps people, machines and other artefacts coordinate joint activity.
Fig. 2.4 A communicative act. Inbound logistics controller (sender) to inbound logistics operator (receiver): DIRECT[Delivery X needs to be unloaded for checking to bay Z]
Example
You would not normally include within an information model a representation of the greetings between business actors, but you would possibly wish to represent aspects of communication relevant to the establishment and holding of business meetings. One particularly important piece of communication here will be the date and time at which the meeting is to be held as well as the place where it is to be held. Without this piece of communication, various business actors would not be able to coordinate their joint attendance at this meeting. ◄ The American philosopher John Searle (1970) calls such communication which gets things done speech acts. We shall refer to them as communicative acts, because as we know much communication within contemporary institutional domains is conducted not through human speech but via a range of different media, such as the creation of electronic records and the transmission of electronic documents, electronic mail, SMS messages and so on. Communicative acts that get things done are instrumental communicative acts. A communicative act is instrumental because it is designed by the sender of the message to influence the action of the receiver of the message. Examples
For example, within the domain of emergency response, if a caller asserts that a medical emergency has taken place, then another actor, a call-taker, is expected to take certain action, such as to alert yet another actor, a dispatcher, to dispatch an ambulance. Or, if a manager at a company offering technical courses instructs a booking clerk to prepare a schedule for a certain course, then both expect that this activity will be undertaken. ◄ John Searle (1970) maintains that there are five major ways in which people influence other people through communication—assertives, directives, commissives, declaratives and expressives. Each of these terms describes a different purpose for the message. Examples of these five different types of communicative act are provided in Fig. 2.5. These examples all relate to the case of medical emergency response. Let us look at each type of communicative act in turn. Assertives are communicative acts that explain how things are in a particular part of the domain being communicated about, such as reports of business activity. Such acts express the truth of the content of a message on the part of the sender of the message. For instance, within various business organisations, different assertives will be made on a regular basis using verbs such as report, confirm, deny, etc.
Fig. 2.5 Forms of communicative act
ASSERTIVE: Call-taker to paramedic dispatcher, ASSERT[Medical emergency X is of form Y]
DIRECTIVE: Call-taker to caller, DIRECT[Medical actions X, Y and Z need to be taken]
COMMISSIVE: Call-taker to caller, COMMIT[An ambulance will be with you in X minutes]
DECLARATIVE: Ambulance driver to dispatcher, DECLARE[Incident X is now closed]
EXPRESSIVE: Control centre manager to call-taker, EXPRESS[I am happy/unhappy with your annual performance]
Example
Within emergency response, a call-taker might communicate to a dispatcher that a medical incident is of a certain degree of seriousness, in the sense of whether it is regarded as life-threatening or not. ◄ Directives are an attempt to influence receiver action through some message. Directives consist of any act of communication in which some direct response is required from the receiver of the message. Directives are communicative acts that express how one actor would like another actor to behave. They represent the sender’s attempt to get the receiver of a message to perform or take an action. Within business organisations, directives are evident in the use of certain verbs such as request, suggest, summon, recommend and prohibit. Example
So, within emergency response, a call-taker might issue certain advice about medical actions that need to be taken by the caller. ◄ Commissives commit a sender of some message to the future course of action detailed in the message. So actors may commit themselves or others to something happening in the future. Commissives are communicative acts that express how I as an actor intend to behave. Such communicative acts commit a speaker or sender to some future course of action. They are communicative acts that represent a speaker’s intention to perform an action at some time in the future. For instance, within business, promises, guarantees, acceptances and refusals are all examples of commissives. Example
Hence, within emergency response, a call-taker might promise a caller that an ambulance will be with them in a certain period of time. ◄ Declaratives are messages that change the state of some domain through the communication itself. The main difference between an assertive and a declarative is that when something is declared to be the case, it cannot be undone. For example, a judge may use words such as ‘I sentence you to. . .’ or your boss may utter the words ‘You are fired. . .’. Example
Within emergency response, an ambulance driver might declare that an incident is closed. This change of state means that the dispatcher releases the ambulance and its crew back into the pattern of action. ◄
People frequently motivate others through their expressions. Expressives are communicative acts that represent the speaker’s psychological state, feelings or emotions towards some proposition or state of affairs. Expressives represent the sender’s state, feeling or emotion about something. An evaluation of something or someone by somebody normally involves the use of expressives. Example
Within emergency response, a manager might express satisfaction or dissatisfaction with some person’s performance. ◄
Exercise Try to classify the following acts of communication as assertives, directives or commissives:
• An employee reports on business activity to a manager.
• A person’s presence is requested at a particular business meeting.
• A guarantee document guarantees some action in the case of some problem.
• A person sends an email denying his or her participation in a particular affair.
• A business plan suggests a course of action.
• An agreement ensures that joint actions will be taken between participating actors.
The importance of this distinction between intent and content is that different messages may have the same content but different intent. Example
For instance, assume within the domain of some manufacturing company that the content of a message is [Product X, Location Y]. The content of this message identifies a particular product and a certain production location. Now consider two different communicative acts that utilise the same identifiers for X and Y. ASSERT[Product X, Location Y] is an assertive message. It probably asserts the belief of a particular production worker or perhaps a production ICT system that a given product is currently placed at a given production location. DIRECT [Product X, Location Y] is a directive message. Here, a given actor, such as a production supervisor perhaps, is requesting another actor, such as a fork-lift truck driver, to move an indicated product to a designated production location. ◄
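The separation of intent from content can be sketched as a small data structure. The Python below is a minimal illustration, assuming a message is simply a pairing of one of Searle's five intents with a tuple of content signs; the names `Intent` and `Message` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    """The five purposes a message may have, named by the keywords used in the text."""
    ASSERT = "assertive"
    DIRECT = "directive"
    COMMIT = "commissive"
    DECLARE = "declarative"
    EXPRESS = "expressive"


@dataclass(frozen=True)
class Message:
    intent: Intent   # what the sender is trying to achieve
    content: tuple   # the signs identifying or describing things

    def __str__(self):
        return f"{self.intent.name}[{', '.join(self.content)}]"


content = ("Product X", "Location Y")
report = Message(Intent.ASSERT, content)  # a belief about where the product is
order = Message(Intent.DIRECT, content)   # a request to move the product there

print(report)
print(order)
```

The two messages share exactly the same content yet are different messages, because the intent is part of what each one is.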
2.8 Patterns of Information Situations
The point we wish to make here is that within any particular domain of institutional action, information situations do not of course exist in isolation; they form patterns. Consider Fig. 2.6 which represents an extract from a larger pattern of communicative acts relevant to the domain of emergency response. The sequence between the two communicative acts here is represented by a dotted arrow. The pattern begins when telephone operators take an emergency call. The caller’s area code or closest mobile phone cell is identified from the call, which is then routed to the ambulance control centre. At the control centre, a call-taker matches the call number with a physical address using a computerised map (or gazetteer) of the area covered by the service. These two communicative acts will be enacted at various times by many different callers and call-takers, but the essential features of these acts of communication will remain the same. Also, these acts form a clear sequence because the accomplishment of information in the first communicative act must always precede that of the second communicative act. In other words, the call-taker cannot look up the precise location of the emergency without first taking some details of this from the caller. Communicating that an emergency has taken place at a certain location and involves a certain person is an assertion. The caller is communicating the belief that the content of his or her message is true. In contrast, when a call-taker uses the interface of the electronic gazetteer to enter the details of the emergency, she is requesting a response from this IT system. As such, this communicative act is a directive. As we shall see in a later chapter, there are many examples of communicative acts evident within a wider pattern of information situations that serves to constitute the domain of emergency response. For instance, dispatchers are regularly instructing an ambulance driver to ‘go to this location and attend this emergency incident’.
Sometimes, this act will involve a radio message. Most often, it will involve an electronic message transmitted to an ambulance resource, received on some dashboard display and read by the ambulance driver. Or consider another example in which ambulance drivers are expected to assert to dispatchers when they have ‘arrived at the allocated incident’. This communicative act will consist merely of selecting an option on the ambulance’s dashboard display, which transmits a signal back to the incident IT system that in turn updates the display of the dispatcher. Finally, consider the case of paramedics communicating to a dispatcher detail not only of ‘the patient’s condition’ but also of ‘the treatment administered’. This is likely to comprise a complex and asynchronous dialogue conducted as a series of radio messages between paramedic and dispatcher.
Fig. 2.6 Two communicative acts from a wider pattern of communicative acts
Notification of emergency: Caller to call-taker, ASSERT[A medical emergency has taken place at location X on person Y]
Identifying locations: Call-taker to gazetteer, DIRECT[Find caller and emergency location]; the panel also shows ASSERT[Caller X and emergency location Y]
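The idea that communicative acts form a sequence, with one act necessarily preceding another, can be sketched as an ordered list with a simple precedence check. This is an illustrative assumption about how such a pattern might be represented, not a notation used in this book; the tuple layout and the function `may_enact` are invented for the example, while the act names follow Fig. 2.6.

```python
# Each act is (name, intent, sender, receiver), listed in the order
# in which the pattern requires the acts to be accomplished.
pattern = [
    ("Notification of emergency", "ASSERT", "Caller", "Call-taker"),
    ("Identifying locations", "DIRECT", "Call-taker", "Gazetteer"),
]


def may_enact(act_name, completed):
    """An act may be enacted only once every earlier act in the pattern is complete."""
    names = [name for name, *_ in pattern]
    position = names.index(act_name)
    return all(earlier in completed for earlier in names[:position])


print(may_enact("Identifying locations", set()))
print(may_enact("Identifying locations", {"Notification of emergency"}))
```

The check mirrors the point made about the gazetteer: the call-taker cannot direct the system to find a location until the caller's assertion has been received.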
2.9 Physical and Institutional (Social) Ontology
We have stated a number of times within this chapter that information modelling can be seen as an exercise in building a model of important aspects of some institutional ontology. We have also used the term ontology a couple of times within this chapter and the term institution many times. In doing this, we have relied upon your, the reader’s, conventional understanding of such terms. But, what more precisely do we mean by these terms and why is information modelling an attempt to model aspects of institutional ontology? The term ontology derives from the ancient Greek ontos for being and logos for study of. Ontology, as a branch of philosophy, is a theory of reality, being or what things are seen to exist. More recently, the term has been used within computer science and cognitive science to refer to a model for representing the world or more readily some specific domain within the world. Generally speaking, there are two types of things that make up or are seen to exist within any ontology: physical things and institutional things (Searle 2010). Physical things are things like people, mountains and rivers; institutional things are things like bills, payments and contracts. We take the position, first promoted by the philosopher Charles Sanders Peirce, that we cannot experience reality directly; we always experience reality through the mediating layer of signs. Hence, you will note that to refer to both physical things and institutional things, we have to use signs, such as the words that name them in the lists above. However, even though we need signs to both identify and describe both types of things, the main difference between physical things and institutional things is that physical things exist independent of the signs that actors use to refer to or describe them. In contrast, institutional things only exist because actors collectively agree that these things can be referred to and described in certain ways. Institutional things have no existence independent from the actors that signify them.
Example
A physical thing such as a mountain will still exist even if actors choose not to refer to it as such or describe it. Hence, the specific physical structure in North Wales or the physical structure in the West of Scotland will continue to exist even if we did not refer to these things as Snowdon and Ben Nevis. But the piece of paper or plastic in your hand with numerous bits of graphic printed upon it will
only persist as a piece of paper or plastic until we collectively agree as a group of actors that this structure refers to a sum of money. When this collective agreement about what this structure stands for comes into play, then this structure becomes a ten pounds sterling banknote or a ten dollar bill. ◄ What actors deem to exist in their physical environment we might refer to as physical ontology. What actors deem to exist within the domain of their institution we might refer to as institutional or, more generally, social ontology. In a way, ontology provides to actors a way of making sense of facts about reality and through this to build or constitute reality. Given that there are two different types of things, then we would expect that two different types of facts arise from physical and institutional ontology—physical facts and institutional facts. Physical or brute facts are observer-independent. They exist independent of humans as observers of things. Brute facts are so-called because they are matters of ‘brute’ physics, chemistry and biology. Within a brute fact, the status of the thing (or things) referred to has an existence independent of institutions—even of the institution of language—although the expression of such facts relies upon systems of signs. Example
An example of a brute fact is that the sun is 93 million miles from the earth. In contrast, institutional facts are matters of culture and convention. They exist only within the context of human institutions, such as that the European Journal of Information Systems is considered a quality journal amongst the information systems academic community. ◄ Example
In terms of medical emergency response, brute facts constitute the ontology of physical things such as human beings and their medical conditions, the physical configurations of ambulances and medical equipment as well as the geographical layout of the area covered by the emergency service. Hence, it is a brute fact that an ambulance station can physically hold no more than five ambulances and that this ambulance station is situated 12.5 km from its nearest general hospital. ◄ Institutional facts are by their very nature dependent upon human institutions. They rely upon a declaration by certain actors that certain things are true. Institutional facts are matters of culture and convention and, as such, are observer-relative. Within an institutional fact, the status of the thing (or things) referred to depends upon a collective acceptance by the actors concerned that the thing has a certain function. This means that institutional facts exist only within the context of human institutions and are brought into existence through collective declaration of the conventional meaning of things by actors within such institutions. Institutional
facts are important because they serve to constitute the institutional domain itself as a social ontology. Example
Hence, the following institutional facts might be established within the communicative action evident in a response to an emergency incident:
• Ambulance resource 423 has been dispatched to incident 120453. . .
• Ambulance resource 423 consists of two crew members D46 and P54 and equipment 24346 and 32895. . .
• Crew member P54 is named Jane Bloggs and is a paramedic. . .
All these facts rely upon collective acceptance of the meaning of key terms, such as resource, incident, crew member and P54, utilised by actors in communicative acts. The meaning of such terms is not just a matter of reference and attribution, but is constituted through the effects they have upon action. ◄ Example
Consider the different institutional domain of manufacturing and a thing familiar within the institutional context of manufacturing: a stillage. Stillages are physical things and as such have an existence independent of the institution. In other words, they can be described in terms of brute facts such as a stillage is a steel box being approximately 1 m in depth, height and width. These brute facts can be confirmed by any observer of such objects, making such facts observer-independent. But what is the function of a stillage? A stillage may be a physical structure, but these physical structures are assigned a status within the institution concerned. A stillage is used to store various stages of finished product (‘stock’) within the context of the manufacturing plant. We might even frame the constitutive rule in this case as being: [A stillage (X) stands for a unit of stock (Y) to a group of actors (Z) within the manufacturing plant (C)]. ◄ Both physical facts and institutional facts are built using signs. Signs, as the primary constructs of communication, rely upon a shared ontology amongst a group of actors: the context within which a group of symbols is used in continuous communication by a social group or groups. Hence, a shared ontology is a necessary pre-condition for joint communication and effectively frames or controls such communication.
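The constitutive rule [X stands for Y to Z in C], used for the stillage here and for the passport number earlier in the chapter, can be read as a record with four placeholders. The Python sketch below is one hypothetical way of writing that down; the class name `Sign` and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sign:
    """A sign as a rule with four placeholders: X stands for Y to Z in C."""
    x: str  # the thing that stands for something else
    y: str  # the thing it is taken to stand for
    z: str  # the actor or actors for whom the relation holds
    c: str  # the institutional context in which it holds

    def rule(self):
        return f"[{self.x} stands for {self.y} to {self.z} in {self.c}]"


# Substituting specific terms for the placeholders, as the text does.
stillage = Sign("A stillage", "a unit of stock",
                "a group of actors", "the manufacturing plant")
passport_number = Sign("109999555", "John Smith",
                       "a border control officer", "the domain of British citizenship")

print(stillage.rule())
print(passport_number.rule())
```

Changing Z or C while keeping X fixed gives a different sign, which reflects the point that the stands for relation is a convention held by particular actors in a particular domain.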
Example
Plant biology relies upon a common ontology first established by Carl Linnaeus in which plants are given names, typically in Latin. Hence, a daffodil is named in this taxonomy as narcissus. This name, along with descriptors of its key features, is then used by plant biologists around the world to identify and describe this species of plant. ◄ An ontology provides a shared vocabulary, which can be used to ‘model’ a domain of organisation in terms of the type of objects or concepts that exist, as well as their properties and relations. We can use the term ontology to denote a common set of representations used by a group of communicants by which they transfer both content and intent from one actor to another. Ontologies are not fixed; they are continuing and communicative accomplishments and in effect emerge within any system of communication. Ontologies are important because they help support practical action. Example
Consider the case of medicine and its use of an ontology to get things done. When you visit a general practitioner (GP), she will use a number of ontologies to help her diagnose your illness, decide upon your treatment and record your prescription. For instance, a conversation about your symptoms will lead the GP to propose a number of possible illnesses you may be suffering from. Each such illness will have a generic name in a standardised, structured vocabulary of medical terms. These terms will have relationships with a range of possible medical treatments which will also have standard terminology. Medical treatments may involve prescribing certain drugs. Such drugs will be given a generic name within a formulary. This document lists not only the generic names but also possible proprietary names, the normal uses of the drug and likely side effects of drug use. ◄ Example
A key example of the way in which ontologies change is the continuing debate over gender identity. This is also a key example of the way in which ‘signs have politics’ (Beynon-Davies 2021b). Gender identity is typically defined as a personal and internal perception of oneself, which usually translates linguistically into the use of some gender category to classify someone. The consequence of this is that a person’s gender class may differ from their biological sex. One way of thinking about this is that classifying a person as having a certain sex is a brute fact. It is a biological fact whether one is born male or female. However, how a person chooses to define themselves in terms of gender is an institutional fact— reliant upon the collective agreement not only of the person but of a community of actors that this person should be referred to in certain ways. ◄
Exercise Consider a university domain. Name three things which everybody within the university would regard as physical things. Then name three things which are definitely institutional things.
2.10 Conclusion
We started this chapter with the claim that any modelling of information must start with an understanding of what information is and what it is not. This led us to make the case for the need to properly understand the component elements of information situations—situations in which information is accomplished. We made the claim that any situation in which information is clearly present must always consist of actors, (data) structures, messages and actions, all taking place within some domain of institutional action. The consequence of this is that any successful attempt at information modelling must begin with a close understanding and analysis of the information situations pertinent to the domain in focus. This domain may be an existing domain of communicative action, or it may be an entirely new domain of communicative action. In the next chapter, we provide an introduction to the constructs of information modelling and demonstrate clearly how these constructs relate to elements from our model of information situations discussed in this chapter.
2.11 Summary
• Information is always accomplished within information situations by actors. • An information situation is made up of actors, structures, messages and action all taking place within some institutional domain. • Information situations are situations in which information is accomplished by actors. Information emerges from the continuous exercise or accomplishment of some pattern of actors, messages and actions all working within some environment. • The proper context for information modelling is the pattern of information situations relevant to some institutional domain. The modeller needs to have an effective understanding of the information situations of relevance to build effective models. • Information modelling focuses primarily upon the messages transmitted between actors within information situations. Any message has both content and intent. An information model primarily concerns itself with the content of messages.
• The content of messages transmitted between actors consists of a series of signs used to stand for things of interest to actors within the institutional domain under consideration.
• Two types of signs are particularly important to document within an information model: identifiers and descriptors. When signs refer to something, they are said to be identifiers for things. Alternatively, signs may describe something, in which case they are descriptors.
• However, to properly understand the content of messages and how this content should be unpacked in an information model, the information modeller must understand the way in which such messages are repeatedly used for instrumental action within the institution in question. This demands an understanding of the purpose or intent of messages created by actors.
• There are five major types of purpose associated with messages transmitted for instrumental communication. Assertives communicate the belief by the sender that the content of the message is true. Directives are an attempt to influence receiver action through some message. Commissives commit a sender of some message to the future course of action detailed in the message. Declaratives are messages that change the state of some domain of organisation through the communication itself. Expressives represent the sender’s state, feeling or emotion about something.
• Hence, to effectively compose an information model, the modeller must first analyse the existing pattern of communicative actions relevant to some institutional domain. Alternatively, the information modeller must design a new pattern of communicative actions for some proposed domain of activity.
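The distinctions summarised above (a message has content and intent, and its content is made up of signs acting as identifiers or descriptors) can be sketched as a small data structure. This is only an illustrative sketch and not a notation used in this book: the class names `Sign`, `Message` and `Intent`, the actor names and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Intent(Enum):
    """The five types of message purpose listed in the summary."""
    ASSERTIVE = "assertive"
    DIRECTIVE = "directive"
    COMMISSIVE = "commissive"
    DECLARATIVE = "declarative"
    EXPRESSIVE = "expressive"


@dataclass(frozen=True)
class Sign:
    """A sign in the content of a message (hypothetical structure)."""
    name: str
    value: str
    role: str  # either "identifier" or "descriptor"


@dataclass(frozen=True)
class Message:
    """A message has both intent and content (a series of signs)."""
    sender: str
    receiver: str
    intent: Intent
    content: Tuple[Sign, ...]

    def identifiers(self):
        # Signs that refer to some thing of interest.
        return [s.name for s in self.content if s.role == "identifier"]

    def descriptors(self):
        # Signs that describe some thing of interest.
        return [s.name for s in self.content if s.role == "descriptor"]


# Hypothetical message: a call-taker asserts something about an incident.
msg = Message(
    sender="call-taker",
    receiver="dispatcher",
    intent=Intent.ASSERTIVE,
    content=(
        Sign("incident_no", "I-001", "identifier"),
        Sign("location", "High Street", "descriptor"),
    ),
)
print(msg.intent.value, msg.identifiers(), msg.descriptors())
```

An information model, on this reading, would document the kinds of sign found in `content`, while the analysis of `intent` helps the modeller understand how such content is used.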
3
Why Model Information?
3.1
Introduction
Information modelling, as we have seen, is a technique employed primarily by the business analyst and the business designer. However, whereas most literature on information modelling covers the approach of building such models in fine detail, it does not adequately address the important prior activities of engagement with and investigation of institutional domains. It also does not adequately address the important question as to why we model in the first place and why information models are important. Within the current chapter, we attempt to address this gap. We first consider the idea of a model and how this construct relates to notions of institutional reality. This leads us to consider the deficiencies inherent in the way in which traditional approaches to information modelling regard the relationship between an information model and reality. We then consider the relationship between models and reality promoted in this book, which we believe offers a more sophisticated and accurate representation of this relationship. We shall show how our framing of an information model provides a better way of considering not only the true purpose of an information model but also how to approach the investigation of institutional domains. But first, let us start with a brief history of information modelling.
3.2
A Short History of Information Modelling
Information modelling has quite a long and established history within the discipline of business analysis and design. Interestingly, it originated not as a modelling technique but was developed in attempts to build more expressive architectures or data models for the database systems of the 1970s. It is for this reason that historically, when considered an analysis and design technique, it was largely known as data modelling rather than information modelling.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 P. Beynon-Davies, Information Modelling, https://doi.org/10.1007/978-3-030-98805-0_3
During this period, two prominent architectures for managing data, known as the hierarchical and network data models, dominated. However, in 1970, Ted Codd published his landmark paper on the relational data model (Codd 1970), which was set to influence the design of database systems for many decades hence. During the 1970s and 1980s, many alternative data models to the relational data model were published which became known as the semantic data models (SDMs), because they attempted to provide formalisms for building more expressive meaning (semantics) into database systems. One of the most cited of such SDMs was the entity-relationship model (sometimes referred to as the entity-relationship-attribute model) proposed by P. P. S. Chen in the mid-1970s (Chen 1976). Although developed originally as an alternative to the relational data model, IT developers soon latched on to the usefulness of the graphic approach provided by Chen to express his data model and adapted it to the purposes of database design. A number of extensions to the approach were proposed over the years, particularly the inclusion of abstraction mechanisms such as generalisation and aggregation. This led many to propose the use of what became known as extended entity relationship modelling as the appropriate way of conducting what is known as conceptual modelling, as opposed to logical or physical modelling for database systems.
In early 2000, the Object Management Group adopted many of the features of extended entity relationship modelling within its specification for class diagrams as part of the Unified Modelling Language (UML). Although various updates have been made to UML as an approach since that time, the conventions of class diagramming have remained consistent for well over a decade. As we shall see in Chap. 5, there are certain subtle differences between class modelling and extended entity relationship modelling, but the core of these techniques is essentially the same and covers all the constructs discussed in Chap. 4.
We have made the explicit decision to refer to the technique in focus as information modelling rather than data modelling or database modelling within this book. This is a direct consequence of our discussion in Chap. 2, where we considered the distinction and the relationship between data and information. Information modelling is concerned with patterns of information situations evident within delimited domains of institutional action. Information modelling focuses primarily upon the messages transmitted between actors within information situations. An information model concerns itself with the content of such messages, and these messages consist of a series of signs used to stand for things of interest to actors within the institutional domain under consideration. Any sign is some thing that stands to somebody for some other thing. The thing which stands for some thing is typically some form of data structure. But information modelling is also interested in what things are being referred to or designated by data items, elements and structures. So information modelling must model more than just structures of data; it must model how such structures cause actors to accomplish certain things within institutional settings.
3.3
The Notion of a Model
In Chap. 2, we considered what information is and is not. Clearly, to understand the basis of information modelling properly, the other term we need to unpack here is that of a model. So what is a model? All models are abstractions. Abstraction is necessarily a process of filtering: of ignoring certain things that we feel are unimportant and including certain other things which we regard as important. But what is the status of the things we choose to ignore or include? One view of models is that they are objective constructs which abstract from certain agreed features of some reality and represent these features in some standard form.
Example
Hence, natural scientific theories, such as Darwin’s theory of evolution or Einstein’s theory of relativity, can be seen as models in this light. They all are attempts to describe and predict features of the material or physical world, which is the same for everyone. ◄
Another view is that models are subjective constructs, dependent upon a person’s position and how they view reality. According to social scientists, within the social world we all potentially may work with different models of reality.
Example
For instance, it is possible to argue that managers always work with models of their business. But managerial models are rarely explicit models. They are subjective models which managers use to make sense of their own situation of organisation. These models are tested in the realm of management decision-making and the outcomes resulting from such decision-making. ◄
The approach promoted in this book treats models as tools which are reliant upon signs and which are directed at achieving collective agreement amongst a community of actors in the fulfilment of joint actions. In this sense, models are both objective and subjective, or we should more properly say inter-objective and intersubjective. Human activity is clearly undertaken by human actors, motivated towards the solution of some purpose and mediated by tools in collaboration with others. In this perspective, human beings never interact with the ‘world’ directly. Instead, such interaction is always mediated through use of tools. Tools may be physical or technical tools such as hammers or computers as well as psychological or social tools which help people form activity themselves or with others, such as symbols or maps. Both physical tools and psychological tools are mediators which help change the structure of activity. Tools therefore shape the way in which humans both perceive and interact with ‘reality’ since they reflect the experiences of other people
who have tried to solve similar problems. Any model in this perspective is therefore a ‘tool’ for debating or negotiating about the nature of some reality: the aim being to achieve mutual understanding and joint action. The critical tool of concern in this book is that of a sign. Models are constructed using systems of signs, and for signs to work, there must be some collective agreement about what the signs are, what they mean and how such meanings shape activity. Within this book, we use the idea of a model as a way of negotiating collective belief as to either how things are in some domain or how we, as a collective or community of actors, might like things to be in this domain.
Exercise
Consider a map such as the map of the London underground or the Paris Metro. In what way is such a map a model? Who are the community of actors that use such a model and for what purpose?
Models need sign systems in the sense that models are created through signs and effectively act as an external communicative resource amongst a group of actors. This means that all such representations are models and all models are forms of representation. Natural languages, such as English and Welsh, are clearly the richest of sign systems with which we model or represent our ‘world’. For analysis and design of business organisation, more restricted and formalised sign systems are typically used. This is the reason that visualisation (Chap. 5) is much used as a means of presenting such models.
Any model can also be used in a number of different ways in relation to the dimension of time. Models can be built of realities in the past, present and future. An information model, for instance, can be developed as a model of current ontology: what people currently communicate about in terms of classes, attributes and relationships within some delimited institutional domain. This type of information model we refer to as an AS-IS information model.
Alternatively, we can develop an ontology of some future communicative pattern. Here, we are designing some new way of working with an associated understanding of the communicative practice that will be needed to support coordinated activity. Such a model we refer to as an AS-IF information model. Finally, we might use an information model to specify how communication will be structured and represented within data systems. This we refer to as a TO-BE information model. Models can also vary in terms of their level of abstraction. Within the database systems area, for instance, and as already mentioned, a distinction is frequently made between conceptual, logical and physical models. In this sense, an information model would normally be seen as a conceptual model since it documents, at a high level, the things of interest that actors within some domain need to communicate about. In contrast, a logical model translates these things into the constructs appropriate to the architecture of some data system. Finally, a physical model refers to the implementation of data structures within some actual data system. In Chap. 8, we
shall illustrate how to turn an information model (conceptual model) into the design for some relational database (logical model). This design can then be implemented within some relational database management system (physical model).
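The three levels of abstraction can be illustrated with a deliberately small sketch. It is not the translation procedure of Chap. 8, only an assumption-laden illustration: the `ConceptualClass` structure, the `to_logical` helper, the `Incident` class and its attribute names are hypothetical, every attribute is naively typed as text, and SQLite (via Python's standard `sqlite3` module) stands in for the relational database management system.

```python
import sqlite3
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ConceptualClass:
    """Conceptual level: a thing of interest with its identifier and descriptors."""
    name: str
    identifier: str
    descriptors: Tuple[str, ...]


def to_logical(cls: ConceptualClass) -> str:
    """Logical level: express the class as a relational table definition."""
    columns = [f"{cls.identifier} TEXT PRIMARY KEY"]
    columns += [f"{d} TEXT" for d in cls.descriptors]
    return f"CREATE TABLE {cls.name} ({', '.join(columns)})"


# Hypothetical conceptual class taken from the emergency response domain.
incident = ConceptualClass("Incident", "incident_no", ("location", "category"))
ddl = to_logical(incident)

# Physical level: implement the design in an actual (in-memory) database
# and exercise insertion, update, retrieval and deletion.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
conn.execute(
    "INSERT INTO Incident VALUES (?, ?, ?)",
    ("I-001", "High Street", "unclassified"),
)
conn.execute(
    "UPDATE Incident SET category = ? WHERE incident_no = ?",
    ("emergency", "I-001"),
)
rows = conn.execute(
    "SELECT incident_no, location, category FROM Incident"
).fetchall()
conn.execute("DELETE FROM Incident WHERE incident_no = ?", ("I-001",))
remaining = conn.execute("SELECT COUNT(*) FROM Incident").fetchone()[0]
conn.close()

print(ddl)
print(rows, remaining)
```

The point of the sketch is only the separation of concerns: the conceptual class says nothing about tables or storage, the generated statement commits to a relational architecture, and the running database commits to one particular system.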
3.4
Information Models and Reality
Let us explain our view of information models more clearly by contrasting it with the conventional view of what information models are. To do this, we need to examine and expand upon the differing notions of what reality is. Traditionally, most current practices of information modelling either explicitly or implicitly utilise a view of reality consistent with that evident in the work of the philosopher Mario Bunge. Bunge’s theory of reality proposes that the world is made of concrete things that possess properties and that properties can be conceived of as functions that map a thing onto some value. Yair Wand and Ron Weber (1990, 1995) utilised elements of Bunge’s conception of reality in their proposals for a way of conceptually modelling information systems. Wand and Weber (1990) built upon Bunge’s conception to propose that the things and their properties relevant to some domain may be modelled directly as constructs within an information model. They further proposed that classes can be modelled as things with common properties, whereas associations are suitable for modelling binding mutual properties shared between classes (Wand and Weber 1995). So, in terms of what has become known as the Bunge-Wand-Weber conception of ontology, things, whether physical or institutional, comprise an organised collection which is objective, meaning that these things are the same for everyone. The practice of information modelling is then seen to be a process by which things are identified and represented as statements in some formal ‘language’, and these statements are taken to correspond to objective facts about some domain of reality. We have referred to this framing of information models and reality elsewhere as the conventional view of information modelling (Beynon-Davies 2018). Given our discussion in Chap. 2, this way of thinking about an information model and its relationship to reality might at first glance seem entirely sensible.
It is certainly evident in background assumptions made within many existing texts on information modelling. However, practitioners of information modelling often encounter substantial problems when they attempt to perform information modelling in practice (Bodart et al. 2001) using this view of reality. Novices in information modelling experience considerable difficulty in turning problem descriptions into the abstract representation of some information model (Batra and Davis 1992). The ‘quality’ of information models prepared by both novice and expert alike is frequently poor (Moody and Shanks 2003). Also, various stakeholders in the domains being modelled often have difficulty understanding information models. Even one of the founding fathers of the conventional worldview of information modelling—Ron Weber (Weber 2003)—has questioned the typical practices by which information modellers tend to model artefacts such as orders and invoices rather than the underlying phenomena on which such artefacts rely. More recently, issues have
been raised in relation either to the lack of theoretical foundation (Siau 2003) or to the meta-modelling assumptions underlying conventional information modelling (Eriksson et al. 2013). This echoes an emerging critique of the ontological foundations of information modelling. Allen and March (2006) argue that Bunge’s ontology is concerned with representing the world of material things that exist independent of human interpretation. It has little concern with the world of human intentions and meaning. This has led some to contrast the conventional view of information modelling with a worldview more sensitive to a view of institutional ontology based in communicative competence (Klein and Hirschheim 1984; Lyytinen 1985).
It is quite easy to demonstrate some of the weaknesses of the conventional view of information modelling based in the Bunge-Wand-Weber ontology by considering a simple problem in information modelling taken from the domain of medical emergency response, which we have considered already in Chap. 2.
Example
Suppose you are given the task of representing an emergency incident in an information model. The conventional worldview assumes that identifying and describing an emergency incident can be done through representing a series of objective facts, much in the same way that the physical existence of an emergency ambulance vehicle is an objective fact. However, the existence of an emergency incident is not an objective fact; it is a social or an institutional fact reliant upon acts of communication enacted by actors within the domain in question. A call made by a person reporting some happening to a call-taker within the control room of some ambulance service only becomes classified as an emergency incident through a process of triage which involves various actors interpreting the severity of the medical condition of the people reported about and communicating the appropriate classification to various actors such as paramedics, dispatchers and ambulance drivers. This means that an incident only becomes an emergency to the institution of emergency response, or more precisely to actors working within this institution, when it is classified as such by certain actors given the institutional authority to declare this status. ◄
Exercise
Consider a higher education institution such as a university. Explain why the grades awarded to student assessments are not objective facts but institutional facts. Explain also which actors are involved in the production of these facts and where classification is applied in the process of grading.
Therefore, within this book, we promote an alternative view of information modelling which arises from our focus upon communicative competence. This means that we focus upon how actors within some domain communicate currently about the things of interest to them or wish to communicate in the future about
certain things. As we have seen, this view proposes that when actors engage in the process of identifying and describing things, they are engaging in social acts that help constitute an institutional reality which is inter-subjective. But institutional reality is always built upon a physical reality which is inter-objective. This view further suggests a framing of an information model as a specification of the structure of terms used within acts of communication relevant to some institutional domain. These terms, as signs, must refer to and describe both institutional things and physical things.
Exercise
Within a university setting, name three things that are commonly communicated about between lecturers and students. Are these physical things or institutional things?
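The emergency incident example earlier in this section can be restated as a short sketch, offered only as one possible reading of it. A reported call carries no classification of its own; it acquires one when an actor holding institutional authority declares it. The names `ReportedCall`, `declare` and `AUTHORISED_ROLES`, and the choice of which roles hold authority, are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumption: which roles may declare a classification is invented here;
# the text says only that certain actors are given this authority.
AUTHORISED_ROLES = {"call-taker", "dispatcher"}


@dataclass
class ReportedCall:
    """A call reporting some happening; not yet an emergency incident."""
    call_id: str
    description: str
    classification: str = "unclassified"
    declarations: List[Tuple[str, str]] = field(default_factory=list)

    def declare(self, actor_role: str, classification: str) -> None:
        """A declarative act: it changes institutional status if authorised."""
        if actor_role not in AUTHORISED_ROLES:
            raise PermissionError(
                f"{actor_role} has no authority to classify calls"
            )
        self.classification = classification
        # Record who declared what, since the fact rests on this act.
        self.declarations.append((actor_role, classification))


call = ReportedCall("C-17", "person collapsed in street")

# The caller's own view does not make the call an emergency incident.
try:
    call.declare("caller", "emergency")
except PermissionError:
    pass
status_before = call.classification

# Triage by an authorised actor constitutes the institutional fact.
call.declare("call-taker", "emergency")
status_after = call.classification
print(status_before, "->", status_after)
```

The design choice worth noting is that the record of who made the declaration is kept alongside the classification, because on this view the classification has no standing apart from the communicative act that produced it.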
3.5
What Are Information Models for?
It should be evident from our brief account of the history of information modelling that the primary purpose of this technique throughout its application has been as an analysis and design technique which aids in the development of data systems of various forms. We use the term data system to refer to an organised collection of data structures and associated processes of articulation that may be used to operate upon such data structures (Chap. 8). The classic example of a data system is that of a relational database system. A relational database system uses the data structure of a table or tuple, operated upon by processes of insertion, deletion, update and retrieval. We shall show how to translate an information model into a schema for a relational database in Chap. 8.
But the concept of a data system is much wider than that of a database system. For instance, and as we shall see, there is a close relationship between the constructs of an information model and the constructs underlying the current infrastructure of the World Wide Web. In such a realm, information modelling is useful within exercises of metadata modelling, not only in terms of modelling the elements of XML schemas but more widely in proposals to build greater semantics into the World Wide Web. We shall examine this idea of a much wider context for information modelling in Chap. 9.
There is an even wider purpose to information modelling than the design of specific data systems or the metadata concerned with such systems. Because of its usefulness for modelling aspects of institutional ontology, information modelling has a much wider range of application within business, the public sector and the voluntary sector. In previous work (Beynon-Davies 2021a, b), we have made the case for thinking of organisations of all forms in the modern world as being ‘scaffolded’ through patterns of information situations.
The close coupling of data to action means that information modelling is useful in many organisational situations
for coming to a collective agreement about not only how things are but how we might want them to be. Let us consider just three areas where information modelling is important in this wider sense.
Statistics is a branch of mathematics devoted to the collection, analysis, interpretation and presentation of masses of numerical data. Statistics is also now seen to be an important part of the infrastructure of data science. If you talk to any good statistician, you will establish the necessary truth that the analysis of data in any form must be based upon a firm understanding of the ways in which data structures are made. Any statistic is only as good as the data it is built upon, and indeed, it is impossible to interpret any aggregate measure generated through statistics properly without understanding the making of data structures which scaffold this analysis. This means, of course, that the whole notion of conventional statistics implicitly relies upon a form of information modelling. We would argue that the design of data sets used by statisticians as well as the investigatory techniques of questionnaires can be much improved through effective information modelling.
The management of large-scale construction projects is now facilitated by so-called building information models. A building information model is a digital representation of the physical and functional characteristics of a building from inception through to design, construction and use. In a sense, such a building information model represents an agreed model of the things of interest to numerous actors such as architects, planners, builders and administrators. The building information model acts as a collective communicative resource important to all actors engaging with this artefact.
Finally, we should mention that data as implemented in data systems is not only a critical resource to specific institutions but also acts as important infrastructure across institutions.
In this sense, information models are important to what is known as data administration, which attempts to develop and execute policies for data definition, control and protection. Data administration is the attempt to impose order upon the diverse data structures articulated currently across an institution while also planning the data required for future action. To achieve this, data administrators implement standards for the definition and storage of data. Administrators also create and monitor practices that define and control access to data resources. They ensure the integrity of the data resource and that it is secured from threats. This means implementing procedures to ensure that the organisation complies with any legislation concerning data privacy. Finally, data administrators encourage sharing of data across applications and promote the idea that data as a resource is independent of IT applications and their users.
3.6
Investigating the Ontology of Domains
Given that information modelling is an attempt to represent important elements of the communicative practice within some institutional domain, the question remains—how do we engage practically with such communicative practice? In
other words, how do we try to make sense of what people currently communicate about or how do we envision what actors will need to communicate about?
Information modelling is a specialism within the wider discipline of practice known as business analysis or business design. The business analyst tries to make sense of existing domains of organisation, while the business designer envisions future domains of organisation (Beynon-Davies 2021a). Investigation within business analysis normally occurs in short periods of immersion within such domains, typically using some combination of investigation techniques. Various forms of representation are then constructed to communicate common understanding between the business analyst/designer and various actors with a stake in the domain under consideration. An information model is one such form of representation.
There are a number of established ways of making sense of both existing and envisaged patterns of information situations within some domain of organisation. These ways include conversation, observation and participation. There is also the analysis of existing data structures such as records or documents.
The investigation work of the business analyst is typically conducted as a systematic conversation between the business analyst and so-called stakeholders in some problem situation (Beynon-Davies 2021a). The focus of business analysis and design is typically upon some situation within institutional life which is regarded as in some way problematic; hence the term problem situation. People who are interested in change to the situation are referred to as stakeholders in the problem situation. The purpose of systematic conversation is to build some common ground between the analyst and such stakeholders about the ‘shape’ of the problem situation. This common ground may either constitute an understanding of how things currently are or an understanding of how stakeholders would like things to be.
As a form of investigation, the business analyst/designer and stakeholders can engage in conversations on an individual or a group basis. When conversations are led and controlled by the business analyst with specific, named individuals, they are referred to as interviews. Interviews are directed conversations, designed to achieve specified goals. When conversations are led and controlled by the business analyst with a representative group of stakeholders, they are likely to be referred to as a focus group, collaborative meeting or design workshop. Interviews are clearly not everyday conversations—they are systematic and directed conversations organised typically around pairs of questions and answers. Having said this, the degree of formality can differ between interviews. Informal or unstructured interviews are those in which questions are formulated by the business analyst within the flow of the interview itself. Formal or structured interviews are those in which a structure or protocol is devised prior to the interview and used by the business analyst to drive the flow of conversation. The consequence of this is that the actual questions asked within an unstructured interview will differ from one interview to the next. In contrast, a structured interview will deliberately involve asking the same questions of different actors. Unlike an interview which is one-to-one communication between the business analyst and a stakeholder, a focus group is a one-to-many communication between the business analyst and a range of stakeholders. A focus group is a discourse in
which a group of people are asked a series of questions about their perceptions, opinions, beliefs and attitudes towards something or some situation. These questions are asked in an open group setting where participants should be encouraged to talk freely with other group members and in doing so may formulate joint responses to the questions asked. The group nature of this form of investigation is seen to be important to generating consensus views of something or some situation.
Interviews and focus groups are particularly good means of investigation where the objective is to develop some common basis of understanding about some problem situation. Meetings and workshops are vehicles particularly for decision-making in areas such as the prioritisation of issues to be addressed or requirements for a new information model. Within business analysis, workshops are typically vehicles for joint design of problem solutions. Workshops constitute sessions in which the business analyst and representatives of stakeholder groups get together in a structured situation to formulate thinking about either an existing domain of organisation or a new domain of organisation.
One of the best ways to understand a set of practices is to engage in such practices yourself; this is what is meant by participation. There is nothing like practically attempting to ‘walk in someone else’s shoes’ for appreciating what is actually involved in doing some aspect of work. The key problem with participation as an investigative technique is that it is likely to take time.
In contrast, observation usually involves being present in work settings but not directly participating in the pattern of action. Instead, the analyst will be involved in recording the detailed work behaviour of people and machines. One way to manage the observation of work is through shadowing, that is, following a particular worker around and observing all the tasks performed by this worker in the activity system in question.
Another way of managing observation for business analysis systematically is to walk through a pattern of action with the people doing the job. Ideally, this should be done a number of times with different workgroups to tease out any differences in practices across business units.
Most of the investigation techniques discussed so far involve the business analyst engaging with a domain through its human actors. However, as a direct consequence of our theory of information situations, it is useful to think of artefacts as acting, at least in a limited sense. Therefore, it is particularly important that the business analyst engages with such artefacts to help make sense of either current or envisaged domains.
Within Chap. 2, we hinted that data utilised within some domain of organisation helps constitute institutional facts about this domain. Such data comes in many different forms. For instance, documents are a valuable resource in most organisations. Such documents may consist of paper forms used in work, reports generated from ICT systems or design documents of various forms. Documents are particularly important for understanding the structure of data used in the support of work performance.
Data structures act as an institution’s collective memory. Hence, sampling the records used in support of some domain of action and analysing such records is
particularly important for understanding what actors within the domain feel it is important to remember. Records also act as key resources for communication between multiple actors, sometimes remote in time and space. Records are typically built for a particular defined purpose and are important to understand for the way in which they establish purpose and performance in some domain of organisation.

In recent times, business data has become even more significant than in the past. The amount of business data represented in organisational ICT systems of various types has grown astronomically. Various technological approaches are now available to analyse large data sets which suggest patterns of action worthy of further investigation. Collectively, this approach has become known as ‘big data’ (Chap. 9). Technologies such as data warehousing, data mining and data analytics are being used by business analysts to determine hidden patterns, which will probably need to be explored and detailed further using other methods of investigation such as observation, workshops and interviews.
3.7 Conversations for Action
In terms of information modelling, we are not interested in all conversations; we are interested in conversations for action. A conversation for action is one in which two or more actors accomplish information with the purpose of coordinating their activity. Therefore, one particularly important resource for the information modeller is the set of conversations for action that may be collected from the domain under investigation.

A conversation is typically made up of a sequence of adjacent communicative acts (Clark 1996). In other words, one actor articulates an utterance, and another actor articulates an utterance in response; this leads to a further pair of communicative acts and so on. Pairs of communicative acts within a conversation tend to follow conventional patterns in which the articulation of a certain utterance generates a preferred response.

One of the most typical patterns of adjacent communicative acts is the question-answer pattern, such as ‘How many production units do you have?’ ‘Eight’. This is actually a type of communicative act which we called a directive in Chap. 2 followed by an assertive. Such a question-answer pattern forms the basis of structured conversations such as the interview.

Another pair is the assertion-agreement pattern or assertion-disagreement pattern, such as ‘There is clearly a problem with stock flow’. ‘Yes, I think you are right’ or ‘There is clearly a problem with production scheduling’. ‘No, I think it has more to do with the way we manage stock’. This pattern is composed of an assertive followed by an assertive and is the basis of much group decision-making.

There is also the summons-response pattern, such as ‘I’d like to talk to you about this issue on Tuesday’. ‘Yes, that should be fine’. This is a directive followed by a commitment and is the basis of communication used for control purposes within organisations.
Finally, there are two types of paired acts of communication that express the inner state of actors involved in the communication. There is the typical thanks-acknowledgement pattern such as ‘Thank you for your contribution to this effort’. ‘No problem, I enjoyed it’ and the apology-acceptance pattern such as ‘I am sorry I raised this issue so abruptly’. ‘Your apology is accepted’.

Example
Consider the conversation between a caller and a call-taker within the control room of an emergency response service. At the control centre, many different callers make calls to the call-takers working in shift patterns. These calls occur many times over a 24-hour period, 365 days a year. The call-takers are left unconstrained in holding a conversation with the caller, but there are certain important things that the call-taker must gather from the caller during the conversation. The conversation has an instrumental purpose: to accomplish sufficient information so that decisions can be made not only as to whether to dispatch an ambulance but where to and with what resources. Consider one instance of a call made to the control centre:

Call-taker: You are through to the ambulance service ... how may I help you?
Caller: Please, can you send an ambulance ... my mother has fallen out of bed and cannot get up off the floor.
Call-taker: I see; are you calling from the house your mother has fallen in?
Caller: Yes ... she is groaning on the floor now ... can you hurry please?
Call-taker: I first need to take some details ... can you tell me the name of your mother and the address you are at?
Caller: Elsie Phillips and we are at 25, Halethorpe Road.
Call-taker: Do you know the postcode for the property?
Caller: No, sorry ...
Call-taker: No problem, I can find it on my map ... yes we have the address ... can I have your name please?
Caller: I’m Joe Phillips and Elsie is my mother ...
Call-taker: OK. Is your mother conscious Joe?
Caller: Yes.
Call-taker: Are you able to speak to her Joe?
Caller: Yes, she says her left hip is very painful and she cannot get up.
Call-taker: Can she tell you how long she has been on the floor Joe?
Caller: She says that she fell out of bed last night ... so, she must have been lying on the floor for some hours ...
Call-taker: OK, try not to move her Joe but make her as comfortable as possible by perhaps putting a cushion under her head and putting a blanket over her ... an ambulance will get to you soon ...
Caller: I will do ... can you tell me how long the ambulance will be please?
Call-taker: It should be with you in a matter of minutes Joe ... but I’ll keep you updated on this number ...
Caller: Thank you ...
◄

This conversation consists largely of adjacent pairs of questions and answers or directives and assertives. The call-taker directs the caller to assert certain things about the situation. Through this instrumental pattern of communication, the call-taker is trying to establish a number of institutional and physical facts about the possible emergency incident, namely, who is calling, where the incident has taken place, who is involved in the incident and what is their likely medical condition. This will enable the call-taker to hold a further conversation for action with a paramedic dispatcher who will decide upon the appropriate level of response to the incident and may get back in touch with the caller to engage in further conversation which will direct the caller to do certain things until the ambulance arrives.
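The reading of this call as adjacent pairs of directives and assertives can be made concrete with a small sketch. It is an illustration only: the `Turn` record, the act labels and the hand-assigned tags are assumptions introduced here and are not a notation defined in the text. The code tags a few lightly abridged turns from the transcript and picks out the question-answer pairs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Turn:
    speaker: str    # role enacted: "call-taker" or "caller"
    act: str        # act type assigned by hand: "directive" or "assertive"
    utterance: str

# A few turns from the transcript (abridged); the act tags are our own reading.
turns = [
    Turn("call-taker", "directive", "Can you tell me the name of your mother and the address you are at?"),
    Turn("caller", "assertive", "Elsie Phillips and we are at 25, Halethorpe Road."),
    Turn("call-taker", "directive", "Can I have your name please?"),
    Turn("caller", "assertive", "I'm Joe Phillips and Elsie is my mother."),
    Turn("call-taker", "directive", "Is your mother conscious?"),
    Turn("caller", "assertive", "Yes."),
]

def adjacency_pairs(turns):
    """Pair each turn with the following turn when the speaker changes."""
    return [(a, b) for a, b in zip(turns, turns[1:]) if a.speaker != b.speaker]

def question_answer_pairs(turns):
    """Keep only directive -> assertive pairs (the question-answer pattern)."""
    return [(a, b) for a, b in adjacency_pairs(turns)
            if a.act == "directive" and b.act == "assertive"]

pairs = question_answer_pairs(turns)
for question, answer in pairs:
    print(f"{question.utterance!r} -> {answer.utterance!r}")
```

Each pair printed corresponds to one fact the call-taker is trying to establish, such as who is involved and where the incident has taken place.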
3.8 Visualising Patterns of Information Situations
Within this chapter, we have made much use of the idea of a pattern of information situations, first introduced in Chap. 2. A pattern is any regular set of differences (Bateson 1972) which is reproduced across more than one situation. The idea of pattern is central to many disciplines. For instance, the American architect Christopher Alexander (1964) proposed that architectural design is based on a number of archetypal patterns which encapsulate fundamental principles of building design. This idea has had much influence within other disciplines such as software engineering where design patterns are proposed as general solutions to programming problems. Hay even produced a set of patterns for common information models found within business (Hay 1996). In terms of any pattern, such as a pattern of information situations, it is important to make the distinction between a pattern in principle and a pattern in practice. A pattern in principle is the ideal or schematic form of a pattern and consists of roles undertaking commonplace action. A pattern in practice consists of definitive actions undertaken by specific actors in specific places and at specific times. The term scenario is often used to refer to some representation of a pattern in practice, consisting of actual and observed patterns of action performed by particular actors within some domain. From the analysis of various scenarios, the business analyst may devise a pattern in principle which abstracts the common features of actors and actions. This distinction between patterns in principle and patterns in practice applies to overall patterns of information situations as well as to individual patterns of articulation, communication and coordination. Example
The conversation we have seen between a caller and a call-taker is a pattern of communication in practice. By studying recorded conversations as patterns of
communication in practice, it becomes evident that certain common features are evident in all conversations made between persons enacting the roles of callers and call-takers within the domain of emergency response. These common features can be used to form a pattern of communication in principle. ◄ But how should we represent such patterns of communication in principle? In previous work, we have found comics useful as a means of visualising patterns of information situations (Beynon-Davies 2021a) in general or patterns of communication in particular. This is for a number of reasons. Comics are highly visual, and we know that ‘a picture paints a thousand words’. We deliberately use only a few constructs within a pattern comic to help us think differently about organisation. Comics are deliberately freeform in nature—you can add to the core constructs of comics with ease. Most people, with little prior training, find it reasonably straightforward to ‘read’ a comic. We use comics to focus upon patterns of action by actors. In other words, we put actors and action at the centre of our representations of institutional domains. Finally, such comics can be used not only to document common understanding about what people do or think they do; they can be used to document ways of improving some domain. We refer to these visualisations as pattern comics or business pattern comics. The technique is deliberately informal and open-ended but is typically used to visualise some system of action. To do this, we need ways of describing actors taking action within a defined chronology of events, and for this purpose, the set of elements illustrated in Fig. 3.1 is useful. A typical comic is made up of a series of panels, with each panel consisting of one or more cells. The finite set of descriptive states for the domain in question is visualised as a finite collection of comic cells, each cell typically describing one state of action within the overall business pattern. 
The sequencing of cells is represented as dotted arrows linking cells. Therefore, each cell is generally used to represent a snapshot of action (event) within an overall plot, and a linked series of such cells is used to narrate the storyline. Human actors are represented by stickpersons or named mannequins within comic cells. Machine actors such as ICT systems or artefacts such as data structures are represented by appropriate icons. When actors are represented, speech bubbles (to indicate external communication) and thought bubbles (to indicate internal communication) are attached to pictured characters—particularly within patterns of communicative action. Captions are also attached in a more free-form way to cells and are used to convey additional message content over and above that conveyed by visualisation. Typically, pattern comics are primarily used as an analysis tool—as a way of making sense of what is going on currently within some domain. In this mode, we have tended to use comics as a way of representing observed action. The comic is drawn and then used as a focus for discussion with representative actors from the pattern of action being analysed. This serves to validate and verify observations. It also serves to establish common ground about ways of doing amongst a community of actors. The documentation of routine work in practice as one or more pattern comics is then used as a resource for the production of a representation of the routine in principle. This comprises an abstraction of observed action. Again, this
representation of the routine in principle can be validated and verified with participating actors. Through this process, it becomes possible to use such comics as a high-level representation of the situation AS-IS within some domain of organisation.

Fig. 3.1 Elements of a pattern comic. [Figure: an annotated sample comic showing a panel, cells, roles, an artefact, captions, speech and thought bubbles, a decision point with options, and connector symbols. The annotations read as follows.]
• A comic is made up of one or more comic panels.
• Each panel is made up of more than one cell.
• Cells can represent acts of articulation, communication or coordination.
• Cells are generally used to represent action performed by identifiable actors.
• Actors can be humans, machines or artefacts.
• Speech bubbles are used to represent external dialogue.
• Thought bubbles are used to represent internal dialogue.
• Captions are sometimes used to provide a third person accounting of the context for the action in a cell.
• The chronology of the narrative is established through sequencing of cells.
• Decision points can be used to indicate choices undertaken by actors to change the flow of action.
• Connector symbols are used to indicate both the start and end of a particular pattern and to connect between patterns.
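The elements of a pattern comic described in this section (panels, cells, roles, bubbles, captions and the sequencing of cells) can also be held as a simple data structure. The sketch below is one possible encoding and not part of the technique itself: the class names, field names and the `storyline` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Bubble:
    role: str    # role of the actor the bubble is attached to
    kind: str    # "speech" (external) or "thought" (internal)
    text: str

@dataclass
class Cell:
    number: int
    act: str                      # "articulation", "communication" or "coordination"
    roles: List[str]
    bubbles: List[Bubble] = field(default_factory=list)
    caption: Optional[str] = None
    next_cells: List[int] = field(default_factory=list)  # the dotted arrows

@dataclass
class Panel:
    cells: List[Cell]

@dataclass
class Comic:
    title: str
    panels: List[Panel]

    def storyline(self, start: int) -> List[int]:
        """Follow the first outgoing arrow from each cell to narrate one path."""
        by_number = {c.number: c for p in self.panels for c in p.cells}
        path, seen, current = [], set(), start
        while current in by_number and current not in seen:
            seen.add(current)
            path.append(current)
            successors = by_number[current].next_cells
            if not successors:
                break
            current = successors[0]
        return path

# Two illustrative cells based on the emergency call example.
comic = Comic(
    title="Emergency call (illustrative)",
    panels=[Panel(cells=[
        Cell(1, "communication", ["caller", "call-taker"],
             bubbles=[Bubble("caller", "speech", "Please, can you send an ambulance")],
             next_cells=[2]),
        Cell(2, "communication", ["call-taker", "caller"],
             bubbles=[Bubble("call-taker", "speech", "Can you tell me the address you are at?")],
             caption="Call-taker gathers details"),
    ])],
)
print(comic.storyline(1))
```

Holding roles rather than named individuals in each cell keeps the structure at the level of a pattern in principle; a scenario would fill the same cells with particular actors.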
3.9 Documenting a Pattern of Information Situations
We suggested in Chap. 2 that the primary purpose of any investigation of an institutional domain for the purposes of information modelling is to understand and document patterns of information situations that serve to constitute the domain in question. Let us demonstrate what this means in terms of a domain in which a piece of information technology is utilised in caring for the elderly in their own homes. We shall show how we derive a pattern in principle of information situations for this domain and then use this to develop a set of institutional facts which identifies and describes the key things of interest to actors within this domain. These facts can then be used to compose an information model and build a visualisation of this model. We shall merely introduce the process of composition here, which we cover in much more detail in Chap. 6. Elderly persons falling in their own homes is one of the most common reasons they get admitted to hospital. The use of personal alarms with associated telecare systems is increasingly important to the care of the elderly in their own homes. A personal alarm is a small device with a single button worn on the wrist or around the neck of the person at all times. When the button is pressed, a monitoring system is alerted, and help is sent. To investigate such a domain, the investigator might interview some of the key stakeholders within the domain, such as service providers, carers and the elderly people themselves. The investigator might also visit the home of a number of elderly persons and see the operation of personal alarms in action. She might also participate as a call handler at the control room of the service provider. A focus group of carers and elderly persons might gather much useful insight into the needs of these stakeholders. Finally, a workshop might help stakeholders within the service providers to determine areas where the system of personal alarms might be improved in the future. 
Figure 3.2 documents the pattern of information situations relevant to this domain based upon a close analysis of how this technology is used in practice. This pattern consists of an assemblage of acts of articulation, communication and coordination undertaken by multiple actors. The sequence of actions in narrative form is as follows (numbers relate to those on the figure). An elderly person makes a telephone call to the service provider (1). This act of articulation communicates to the service provider a directive giving the details of the person to be registered. The next step is that the service provider holds a follow-up call with the elderly person/carer (3). This call is used to discover the list of contacts to be contacted in the case of an alarm being raised (4). These acts of communication trigger an act of coordinated action (5), namely, that an installer installs the fixed position receiver for the alarm within the home of the elderly person and trains the elderly person/carer in the use of the alarm. The elderly person wears the alarm, and the button is pressed at some point (6). This triggers a signal which is received by a call handler in a service control centre (7) and asserts to this person that an incident has occurred (8). The call handler then contacts, by fixed or mobile telephone call, the first contact on the nominated list of the person (9). This call requests the contact to commit to visiting the elderly person and checking his or her status (10). If no response is obtained from the contact or the contact is unable to commit to checking on the elderly person (11), then a further call is made to the next contact on the list. If the contact visits the elderly person (12), the call handler makes another call (13) to confirm this (14) with the contact. If the visit is confirmed, then the pattern ends. If no such confirmation is obtained, then a call is made to emergency response (16). Such a call is also made in the case of the call handler running out of contacts to contact (17). The emergency call asserts that a possible medical emergency has occurred for the elderly person (18). Ambulance crew then attend the person (19). If the elderly person has any health issues (20), then the patient is taken to the nearest general hospital for assessment (21). Otherwise, the pattern comes to an end.

Fig. 3.2 A pattern of information situations for personal alarms. [Figure: a pattern comic of 21 numbered cells. Roles shown: elderly person, carer, service provider, installer, call handler, service IT system, contact, ambulance control, ambulance crew, paramedic and A&E staff. Decision points shown: ‘Commitment to visit?’, ‘Visit confirmed’, ‘End of contacts?’ and ‘Any health issues?’. Communicative acts shown: DIRECT[I would like to order a personal alarm for person X]; DIRECT[I would like contacts X, Y and Z added to my list]; ASSERT[An incident has occurred on person X at location Y]; DIRECT[A visit needs to be made to person X at location Y]; COMMIT[I shall/am unable to visit person X]; DIRECT[Can you confirm you have visited person X]; ASSERT[I have visited person X]; ASSERT[A possible medical emergency has occurred with person X at location Y]; and the short directives DIRECT[Contact commits], DIRECT[Visit made], DIRECT[End of list] and DIRECT[Health issues].]

The key actors in this pattern are therefore elderly persons, their carers, the service providers and people to be contacted in an emergency, who may potentially include members of the emergency response service such as ambulance drivers and paramedics. We should also not forget that the ICT system of the service provider is a key actor in this pattern. Clearly, most of the indicated actions upon Fig. 3.2 are communicative acts (Chap. 2). Communication occurs to enable installation of equipment (personal alarms and receivers).
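The response part of this narrative, from the alarm signal to either a confirmed visit or an emergency call, amounts to a small escalation procedure. The sketch below is a simplified reading of that narrative rather than a specification of any real telecare system: the function name, the two callback parameters and the wording of the log entries are all assumptions made for illustration.

```python
from typing import Callable, List

def respond_to_alarm(
    contacts: List[str],
    commits_to_visit: Callable[[str], bool],
    confirms_visit: Callable[[str], bool],
) -> List[str]:
    """Return a log of the actions taken once an alarm signal is received."""
    log = ["assert incident"]                          # signal received (7-8)
    for contact in contacts:
        log.append(f"call {contact}")                  # contact call (9-10)
        if not commits_to_visit(contact):              # no commitment (11)
            continue                                   # try the next contact
        log.append(f"confirmation call to {contact}")  # confirm visit (13-14)
        if confirms_visit(contact):
            log.append("pattern ends")
        else:
            log.append("call emergency response")      # no confirmation (16)
        return log
    log.append("call emergency response")              # contacts exhausted (17)
    return log

# Hypothetical contact list: the first contact cannot commit, the second visits.
log = respond_to_alarm(
    ["neighbour", "daughter"],
    commits_to_visit=lambda c: c == "daughter",
    confirms_visit=lambda c: True,
)
print(log)
```

Passing the two decisions in as functions keeps the sketch at the level of a pattern in principle: any particular scenario supplies its own answers at each decision point.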
Communication is important to ensure coordination of actors in response to an emergency. The diamonds upon this pattern comic are used to indicate decision points that change the flow of action through the pattern. What is also documented within this pattern is the way in which communication happens between actors. This consists mainly of fixed line telephone or mobile calls as well as electronic signals transmitted to the service provider IT system. This pattern of information situations provides for the investigator a solid basis for constructing an information model which documents what is communicated about in the current situation. This is clearly an AS-IS information model. We shall examine in some detail the constructs of an information model in the next chapter, and then we shall consider how to compose an information model in Chap. 6 from a close understanding of information situations. This involves the investigator unpacking the content of communicative acts and translating the things of interest into classes, relationships and attributes. Here, we introduce in overview both the constructs and the key principles of this composition process. For instance, consider the communicative act within cell 8 upon Fig. 3.2 which asserts that [An incident has occurred to person X at location Y]. This communicative act in principle abstracts from a range of communicative acts in practice. From this communicative act in principle, we can infer a number of things of interest, namely, an incident, a registered person and a location. These are what we refer to as information classes.
We can also infer a number of associated institutional facts from this communicative act, namely, that persons and locations must be uniquely identifiable to the service provider’s IT system and the call handler that uses it. We can probably say that an incident is timestamped in the sense that the signal sent from a personal alarm will be registered at a particular time and date. We can also assume that a person is associated with a designated location and that an incident occurs at a certain location to a registered person. This means we can write the following list of institutional facts in a base form of notation known as binary relations.

[P101 REFERS TO ]
[L101 REFERS TO ]
[P101 ISA Registered person]
[L101 ISA Registered location]
[Registered person located-at Registered location]
[Registered location locates Registered person]
[Incident involves Registered person]
[Registered person involved-in Incident]
[Incident HASA time]
[Incident HASA date]

The first two facts here define two identifiers used in this domain and indicate that they refer to instances of things or objects within this domain. The third and fourth facts classify the things so identified as a registered person and a registered location. The fifth to eighth facts establish relationships between a location, an incident and a person registered with the service provider. Finally, the last two facts state that an incident class can be described in terms of the date and time it is recorded. Using such institutional facts, we have a starting basis for forming elements of the information model. We can add to this understanding by examining each communicative act and expanding upon institutional facts felt important for supporting coordinated action within this domain.

Exercise
Take one of the other communicative acts detailed on Fig. 3.2, and try to generate further institutional facts from it.
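The binary relations listed in this section lend themselves to a direct encoding as triples, from which candidate elements of an information model can be grouped. The sketch below is a hedged illustration: the `compose` function and the rule that any relation other than ISA and HASA counts as a relationship are assumptions made here, and the composition process proper is the subject of Chap. 6. The two REFERS TO facts are left out because they point at objects in the domain rather than at other terms.

```python
from collections import defaultdict

# Institutional facts as (subject, relation, object) triples.
facts = [
    ("P101", "ISA", "Registered person"),
    ("L101", "ISA", "Registered location"),
    ("Registered person", "located-at", "Registered location"),
    ("Registered location", "locates", "Registered person"),
    ("Incident", "involves", "Registered person"),
    ("Registered person", "involved-in", "Incident"),
    ("Incident", "HASA", "time"),
    ("Incident", "HASA", "date"),
]

def compose(facts):
    """Group facts into candidate classes, attributes and relationships."""
    classes = set()
    attributes = defaultdict(list)
    relationships = []
    for subject, relation, obj in facts:
        if relation == "ISA":
            classes.add(obj)                  # the object names a class
        elif relation == "HASA":
            classes.add(subject)
            attributes[subject].append(obj)   # the object names an attribute
        else:
            classes.update((subject, obj))    # both ends are classes
            relationships.append((subject, relation, obj))
    return classes, dict(attributes), relationships

classes, attributes, relationships = compose(facts)
print(sorted(classes))
print(attributes)
print(len(relationships))
```

Facts generated for the exercise above can be appended to `facts` in the same form and passed through `compose` again to see which classes, attributes and relationships they add.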
3.10 Conclusion
Within the current chapter, we have examined why models are important and how such models relate to reality. We have proposed that information models are a special type of model focused on an understanding of the communicative competence of actors within some domain of institutional action. Such understanding may
be obtained by engaging in many different forms of investigation, such as interviews, observation and participation. In Chap. 4, we move from issues of investigation to a discussion of the key constructs used in the representation of information models. We set our coverage of these constructs firmly within the theory of information situations which was introduced in Chap. 2. Business analysis and design tend to utilise visualisation as a means of communicating understanding, not only between business analysts and technical personnel but also to the stakeholders in the domain in question. This is why in Chap. 5 we describe how to turn an understanding of the institutional facts pertinent to some domain into an information model diagram.
3.11 Summary
• Information modelling has been used as a practical business analysis and design technique for at least 40 years.
• Information models are clearly models. A model is a way of negotiating collective belief as to either how things are in some domain or how we, as a collective, might like things to be in this domain.
• Most current practices of information modelling either explicitly or implicitly utilise a view of reality as being made of concrete things that possess properties. Information modelling is then seen to be a process by which things are represented as statements in some formal ‘language’ and that these statements correspond to objective facts about some domain of reality.
• We promote an alternative worldview of information modelling which focuses upon how actors within some domain communicate about the things of interest to them. According to this worldview, an information model is a specification of the structure of terms used within acts of communication relevant to some institutional domain. These terms may refer to and describe both institutional things and physical things.
• Information modelling has been used primarily as an analysis and design technique which aids in the development of data systems of various forms. But information models have a much larger range of application.
• Information modelling relies upon a prior investigation of some institutional domain. Such investigation may comprise interviews with key actors, participation in institutional activity, observation of such activity or analysis of data structures used to support such activity.
• We suggest that the primary purpose of any investigation of an institutional domain for the purposes of information modelling is to understand and document the patterns of information situations that serve to constitute the domain in question. From such patterns, an understanding of the institutional facts important to actors can be gleaned, and an information model can be composed.
4 Information Modelling from First Principles
4.1 Introduction
In this chapter, we shall cover most of the constructs relevant to contemporary information modelling. But we shall build an account of these constructs from first principles, using the theory of information situations established in Chap. 2. We shall portray the use of these constructs in combination as a way of modelling important aspects of what we shall refer to as institutional ontology. We start with the notion of an object referred to through an identifier. This leads us to consider the process of classification, which involves grouping objects that share common characteristics into an information class. Information classes are defined in terms of attributes held to be common amongst a group of objects, but they are also defined in terms of their relationships of association with other classes. Such relationships of association are further defined in terms of certain constraints, known as cardinality and optionality. We then look at two important processes of further abstraction sometimes considered important to modelling institutional ontology with classes—that of generalisation and aggregation. Generalisation can be considered the process of extracting from one or more information classes the description of a more general class. Generalisation is used to build a class hierarchy of super- and sub-classes. Aggregation is an abstraction in which a relationship between objects is considered a higher-level object. An aggregation relationship relates a whole to its parts.
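As a preview of how the constructs named above fit together, they can be sketched as a handful of record types. This is an informal illustration under stated assumptions and not the notation developed in this chapter: the class and field names, the encoding of cardinality as "one" or "many", optionality as a boolean, and the example classes `Person`, `Contact` and `Contact list` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class InformationClass:
    name: str
    attributes: List[str] = field(default_factory=list)
    superclass: Optional[str] = None                 # generalisation: the more general class
    parts: List[str] = field(default_factory=list)   # aggregation: classes that are its parts

@dataclass
class Association:
    source: str
    name: str
    target: str
    cardinality: str   # assumed encoding: "one" or "many" targets per source
    optional: bool     # assumed encoding: may a source have no target at all?

def all_attributes(name: str, model: Dict[str, InformationClass]) -> List[str]:
    """Collect the attributes of a class and of every class above it."""
    collected: List[str] = []
    current: Optional[str] = name
    while current is not None:
        cls = model[current]
        collected = cls.attributes + collected   # most general attributes first
        current = cls.superclass
    return collected

# Hypothetical model fragment for illustration only.
model = {
    "Person": InformationClass("Person", ["name"]),
    "Registered person": InformationClass(
        "Registered person", ["alarm number"], superclass="Person"),
    "Contact": InformationClass("Contact", ["telephone number"], superclass="Person"),
    "Contact list": InformationClass("Contact list", parts=["Contact"]),
}
associations = [
    Association("Registered person", "nominates", "Contact list", "one", False),
]
print(all_attributes("Registered person", model))
```

In this reading, a sub-class carries the attributes of its super-class as well as its own, while an aggregate such as the contact list is described through the classes that make up its parts.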
4.2 Objects and Identifiers
We used the term ‘thing’ many times in both Chaps. 2 and 3, when we were discussing the issue of ontology. The common English term ‘thing’ is typically used in a very general way to represent any unit of existence or more precisely something a set of actors within some institutional domain takes to exist. Information modellers
would refer to such things as objects and think of the abstraction of such objects as classes of object or object classes. So, objects are the component units of some ontology—some set of beliefs common to a set of actors about what reality is. An object is an instance of some thing of interest to two or more actors within some domain. Objects, as we have seen, may be physical things such as customers, products, houses and cars—all these objects have a material form. But objects may also be events such as a house sale, a customer order, a customer payment or a car service—these events are all timestamped happenings. Finally, objects may be purely institutional things, such as orders, sales, contracts and deeds. Institutional objects may take a material existence such as a paper form or an electronic record, but the objects themselves do not depend on their material form as such. Instead, they rely upon a collective acceptance amongst a group of actors that these objects are deemed to exist. A fundamental characteristic of any object is that it must be distinguishable from other objects by a community of actors. Actors within some institutional domain must be able to sever some physical or conceptual space and say what is a certain object and what is not that object. In this sense, we might define an object as some aspect of a domain which can be distinguished from other aspects of the domain: something that makes a difference to actors. To differentiate one object from another, we typically assign an identifier to the object (Chap. 2), and to effectively discriminate objects, each identifier ideally should be unique within the domain in question. Example
So, assume the domain of interest is a manufacturing plant. We might have a list of identifiers for objects of interest to this domain as follows:

[5342]
[6634]
[9982]
...

As identifiers, these signs refer to some distinct object (physical, institutional or an event), as we discussed in Chap. 3:

[5342 REFERS TO ]
[6634 REFERS TO ]
[9982 REFERS TO ]
...
◄

Through assigning an identifier to some object in this manner as a form of reference, we bring that object into existence for some institutional ontology. Such
rules of reference serve to constitute major aspects of the ontology for the actors communicating within this domain. Example
Take the example of a large and dispersed institution such as the UK National Health Service. The objects of particular interest to actors working within this institutional domain are patients. Such patients are typically referred to by a common surrogate identifier, known as an NHS number. An NHS number is a 10-digit number such as 485 777 3456 and serves to uniquely refer to one and only one patient of this institution throughout their life. ◄

You will note in this example that we replaced the term stands for with refers to. As you will see, within information modelling, we formalise or specialise a number of distinct stands for relations in this manner. In the next section, for instance, we shall examine the ISA relation which is critical to understanding processes of classification and instantiation.

Exercise
Conduct a small investigation of a domain known to you. Determine what identifiers are used on a regular basis. How many of these identifiers are surrogate identifiers, that is, artificial identifiers created purely to uniquely refer to objects? If they are not surrogate identifiers, do the identifiers allow actors to readily discriminate between the objects referred to or are there any problems of unique identification?
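The requirement stated earlier in this section, that each identifier should refer to one and only one object within a domain, can be sketched as a small registry. The code is a minimal illustration and not a description of how any real system assigns identifiers: the `IdentifierRegistry` class and the descriptions given to the three plant identifiers are hypothetical.

```python
class IdentifierRegistry:
    """Holds REFERS TO facts for one domain: identifier -> object."""

    def __init__(self):
        self._refers_to = {}

    def assign(self, identifier, obj):
        """Record that an identifier refers to an object, refusing re-use."""
        if identifier in self._refers_to:
            raise ValueError(f"identifier {identifier!r} already refers to an object")
        self._refers_to[identifier] = obj

    def referent(self, identifier):
        """Return the object an identifier refers to."""
        return self._refers_to[identifier]

# Hypothetical objects for the manufacturing plant identifiers.
plant = IdentifierRegistry()
plant.assign("5342", "a machine on the shop floor (physical)")
plant.assign("6634", "a production order (institutional)")
plant.assign("9982", "a machine breakdown (event)")

try:
    plant.assign("5342", "a second machine")
except ValueError as error:
    print(error)

print(plant.referent("6634"))
```

A surrogate identifier such as an NHS number plays the role of the dictionary key here: it carries no description of the patient and exists purely to pick out one object.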
4.3 Classification and Instantiation
Bowker and Leigh-Star (1999), in their important exploration of the use of categories within standards established for professional practice, make the point that classification helps order human interaction. For them, ‘to classify is human’. Classification schemes form an important part of the data infrastructure underlying much human activity, and, as such, they are frequently invisible or tacit to actors performing such activity. They are ready-at-hand parts of the way in which actors approach objects. It should therefore come as no surprise to find that classification, as the process of assigning classes to phenomena, is a critical part of information modelling. So, let us look at what classification means in terms of institutional ontology.
Exercise Conduct a brief investigation of a classification scheme such as the British National Formulary or the Dewey Decimal System. What are these classification schemes used for and by whom?
If you remember from an earlier section, a constitutive rule is a rule which implements the very notion of a sign and always has the form:
[X stands for Y to Z in C]
where (X) is some thing which stands for some other thing (Y) in some institutional context (C) to some actor or group of actors (Z). Identifiers are the types of sign we introduced in Chap. 2 and which we have dealt with so far in this chapter. But identifiers of course do not describe. For this, constitutive rules need to work within a process which Charles Sanders Peirce refers to as infinite semiosis, semiosis being the process of sign-use. This process is theoretically infinite because one sign may stand for another sign, which in turn stands for another sign, and so on. In other words:
[A stands for B; B stands for C; C stands for D ...]
The process of infinite semiosis is particularly evident in the way in which actors use signs to abstract. The idea of classification or instantiation is a key example of abstraction and involves the definition of a series of common properties applicable to a group of objects. As a constitutive rule, classification can be expressed as:
[X ISA Y to Z in C]
The relation ISA (Brachman 1983) here may be taken as yet another special type of stands for relation. Within this rule, X is normally a placeholder for some identifier of an object, while Y is a class or category to which the thing identified by X belongs. C denotes the institutional context or domain in which this particular classification rule holds for some actor Z.
Example
So, we might instantiate the Product class (assign some instances to it) using the identifiers previously listed within the domain of a manufacturing plant in the following manner: [5342 ISA Product] [6634 ISA Product] [9982 ISA Product] ... ◄
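The ISA facts above can be read in two directions, from object to class and from class to object. The following is a minimal sketch of that idea, assuming the facts are held as (identifier, class) pairs; the names `isa_facts`, `classes_of` and `instances_of` are illustrative and not part of the notation, and the actor Z and context C are left implicit.

```python
# Institutional facts of the form [X ISA Y], held as (identifier, class) pairs.
# The pair representation is an assumption made for illustration only.
isa_facts = [
    ("5342", "Product"),
    ("6634", "Product"),
    ("9982", "Product"),
]


def classes_of(identifier, facts):
    """Object to class: the classes that an identified object belongs to."""
    return {cls for obj, cls in facts if obj == identifier}


def instances_of(cls, facts):
    """Class to object: the identifiers listed as instances of a class."""
    return {obj for obj, c in facts if c == cls}


print(classes_of("9982", isa_facts))
print(sorted(instances_of("Product", isa_facts)))
```

Keeping the facts as plain pairs means the same list serves both readings, which mirrors the way a single ISA fact can be viewed from either side.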
The Y term in the constitutive rule we have just seen is an object class, or class for short, and forms an abstraction of a group of instances or objects. This means that there are normally many objects that correspond to an object class, as is the case with our example of the class product. If there are not many instances of something within the domain in question, it is probably not worth actors making and using the abstraction of an object class within communication. Generally, a class or more accurately an information class may be defined as some ‘thing’ which actors within some institutional domain recognise as important and communicate about currently or wish to communicate about on a regular basis. Other terms used to refer to such categories for things of interest are entity and entity type. To communicate about such things, the group of actors must be able to distinguish instances of some class from instances of some other class. Therefore, a class is an abstraction from the complexities of some domain. When we speak of a class, we normally speak of some aspect of the domain which can be distinguished from other aspects of the domain, again, something that makes a difference. But now we are working at a higher level of abstraction than an object. An object class is a sign which stands for an object, or more likely a set of such objects, and serves to categorise or classify such objects. Example
Take, for instance, a university as an institutional domain. Universities need to communicate about a number of things to help in the activities of teaching and learning. These things include students, lecturers, courses and modules. In this example, all these things would be valid information classes. ◄ Note that different institutional domains will have different things of interest depending upon the perspectives of actors within the domain. Hence, they will need to have a different set of information classes to communicate about. Example
Therefore, a university will be interested in students, lecturers, courses and modules as information classes. An insurance company will be interested in a totally different set of classes such as customers, policies and claims. A manufacturing company will be interested in deliveries, products, jobs and dispatches. ◄ An information class, of course, is also a clear example of a sign. When actually writing of a product such as a galvanised steel lintel or a person such as a lecturer or an event such as a business visit, we are inherently using signs as classes. As we indicated, to speak or write, or generally to communicate, about some object, we
need an identifier to refer to the specific thing we are referring to or identifying. In such terms, an identifier, as we have seen, is one of the most important of signs. Example
Paul Beynon-Davies is a natural identifier for me: it singles out or refers to me as an object of interest. Lecturer, consultant, academic and author are all signs for classes which apply to me: they are designators for certain concepts that encapsulate a certain space of objects. Or alternatively, Module might be a class, whereas Information modelling might be an instance or object of the Module class. ◄ What we use such classes for is to chunk up the world so that we can communicate about it. Classes are categories that enable actors to discriminate between things and describe such things. Hence, when we write: [9982 ISA Product] We are defining an object (identified as 9982) as being a member of the class product. In one direction, from object to class (9982 to product), we are classifying an object as being an instance of a class: this is the process of classification. In the other direction (product to 9982), we are engaging in instantiation: instantiating (making an instance of) a given class, by listing an object that is encompassed by or covered by the class. Example
Within the university domain, Lecturer or Professor may be a class, whereas Paul Beynon-Davies is an instance of this class. Paul Beynon-Davies is an object. Or alternatively Module might be a class, whereas Business Analysis might be an instance or object of the Module class. In the case of a manufacturing organisation, L1200 will be a specific instance of a steel product (class). In the case of the emergency ambulance service, Jane Smith might be an instance of the class patient. ◄ Exercise Gather a small range of actual communications in a domain familiar to you. For instance, collect a range of similar emails sent to you in some domain of organisation. In terms of this collection, think about what is regularly being communicated about. What is being classified or categorised in such repetitive communication? What instances can be abstracted into what information classes?
4.4 Attribution
It is clear from the previous discussion that an object class is an abstraction of the common features of a group of objects. Such features are defined in terms of relationships between the class and its properties or attributes. This is the process of attribution, which involves the use of signs to describe objects. A class is also defined in terms of its relationships with other classes. This is the process of association, which refers to how we tend to communicate about certain things through signs always in relation to other signs denoting other things. Let us first concentrate on the process of attribution. Example
In a university domain, we normally define a class such as Module or Student because we wish to communicate about such things and eventually to record some data about the occurrences of these things. To do this, we use the properties or attributes of a class. For instance, students have names, addresses and telephone numbers; modules have titles and credit points. ◄ A class is given shape through its properties or attributes. When we define the properties of some class, we engage in a process of attribution. Attribution is the process of defining a class in terms of its properties or attributes. Example
Consider the institutional domain of manufacturing again. Within this domain, manufacturing products are key things of interest that are communicated about on a regular basis. Take one class of manufacturing product, perhaps steel lintels. This product is a class defined by its attributes or properties such as product length and product weight. ◄ The constitutive rule for attribution might be written as: [X HASA Y to Z in C] where X is a class and Y is an attribute of the class within the institutional domain C. HASA is thus another special type of stands for relation which allows us to define a class in terms of a listing of its attributes. Example
For example: [Product HASA Product Length] [Product HASA Product Weight] ◄
This way of defining a class through its attributes, properties or features is referred to as an intensional definition of the class. Classes are by their very nature interesting things because information classes are normally used to define logical groupings of data, otherwise known as a data structure. So, as we shall see in Chap. 8, the classes defined upon some information model will normally turn into the data structures used by some data system. Hence, one rule of thumb or heuristic to apply in identifying suitable classes appropriate for a given institutional domain is the following: If you need to store and access data about many properties or attributes of some thing, then that thing is likely to be an information class.
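An intensional definition of this kind can be sketched by grouping HASA facts under the class they describe. This is a minimal illustration under stated assumptions: the (class, attribute) pair representation and the names `hasa_facts` and `intension` are hypothetical, chosen only to show how a listing of attributes begins to resemble a data structure.

```python
from collections import defaultdict

# Facts of the form [X HASA Y], held as (class, attribute) pairs.
# The pair representation is assumed for illustration only.
hasa_facts = [
    ("Product", "Product Length"),
    ("Product", "Product Weight"),
]


def intension(facts):
    """Group attributes under the class they describe, keeping listing order."""
    attributes = defaultdict(list)
    for cls, attribute in facts:
        attributes[cls].append(attribute)
    return dict(attributes)


print(intension(hasa_facts))
```

The resulting mapping from a class to its attribute list is a rough analogue of the column headings that a data structure for that class might later carry.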
4.5 Valuing an Object and Forming an Object Class
The opposite of classification, as we have seen, is instantiation. Classification involves determining the group of properties common to a set of objects. Instantiation means defining an individual object by assigning values to the properties of a class. Implicitly when discussing instantiation, we are starting to make a connection here between the objects and classes on an information model and the data structures of a data system. Any data structure can be seen to be made up of a number of data elements, and each data element is made up of a number of data items. A datum, a unit of data, is used to represent a fact relevant to some institutional domain. Typically, a datum is formed by making a data item correspond to the attribute of some class and assigning some value to this data item. This means that a data element is typically used to collect together a set of cognate attributes of some class and through so doing builds an instantiation of the class—it represents an object of the defined object class. Example
So, we can build a data element or object for the product class by listing its attributes—product weight and product length—and assigning a value to these attributes, such as: [5342 Product Length 10] [5342 Product Weight 20] ... ◄ This means that the entire listing of objects as data elements serves to form a complete data structure. And this data structure serves to represent, through a complete listing of objects, the object class. This way of defining an object class in terms of its objects is said to be an extensional definition of an object class.
Example
Hence, we might provide a complete extensional definition for our product class by building a list such as: [5342 Product Length 10] [5342 Product Weight 20] [6634 Product Length 20] [6634 Product Weight 40] [9982 Product Length 60] [9982 Product Weight 60] ... ◄ Another way of putting this is that values assigned to attributes are used to distinguish one instance or object of a class from another. Example
To distinguish one instance of a student from another, we give them a different name, address and so on. Or more likely to distinguish between objects with the least effort, we assign a unique identifier to each student—typically a surrogate identifier. ◄
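The step from individual valued facts to data elements, and from data elements to an extensional definition of a class, can be sketched as follows. This is a hedged illustration that assumes each fact is held as an (identifier, attribute, value) triple; the names `value_facts` and `data_elements` are hypothetical, and the values are those listed for the product class earlier in this section.

```python
# Facts of the form [identifier attribute value] for the product class.
# The triple representation is an assumption made for illustration.
value_facts = [
    ("5342", "Product Length", 10),
    ("5342", "Product Weight", 20),
    ("6634", "Product Length", 20),
    ("6634", "Product Weight", 40),
    ("9982", "Product Length", 60),
    ("9982", "Product Weight", 60),
]


def data_elements(facts):
    """Collect the valued attributes of each object into one data element."""
    elements = {}
    for identifier, attribute, value in facts:
        elements.setdefault(identifier, {})[attribute] = value
    return elements


# The full listing of data elements is the extensional definition of the class.
product_structure = data_elements(value_facts)
for identifier, element in product_structure.items():
    print(identifier, element)
```

Each key of `product_structure` plays the part of an identifier, and each inner dictionary plays the part of a data element made up of data items.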
4.6 Association
However, classes are not only defined by their attributes but also in terms of their associations with other classes. An association is typically a defined relationship between two distinct object classes—this is said to be a binary relationship. Example
In analysing an institutional domain, we might express the fact that a customer places a sales order or a supplier handles a purchase order. Or alternatively, within a university domain, we might wish to represent the fact that students enrol on modules and that lecturers or professors teach modules. In these phrases, customer, supplier, sales order, purchase order, student, module and lecturer/professor are information classes. Places, handles, enrols and teach(es) are signs we might use for relationships of association between these classes. ◄ Example
Let us examine an example we have seen before from our manufacturing domain. Within this domain, there are likely to be associations between a stillage (a container for a set of products) and a manufacturing location. We first need to
define stillage and location as classes with objects; we have of course already defined the product class. We refer to stillage objects through a stillage code and location objects through a production location code. For example: [26641 ISA Stillage] [26643 ISA Stillage] [24536 ISA Stillage] ... [PL0102 ISA Location] [PL0103 ISA Location] [PL0104 ISA Location] ... ◄ We then need to build a series of associations between these three classes. This means associating the product class with the stillage class and the stillage class with a location class. Example
For instance: [Stillage CONTAINS Product] [Stillage LOCATED AT Location] [Stillage MOVE TO Location] ◄ The terms CONTAINS, LOCATED AT and MOVE TO within these binary relations here are signs we use to refer to linkages between objects in the class stillage and objects in the class product, as well as objects in the class stillage and objects in the class location. In other words, we can define relationships of association by extension through building lists of named pairs of object identifiers. Example
Hence, we might have a contains list such as: [26641 CONTAINS 5342] [26643 CONTAINS 6634] [24536 CONTAINS 9982] ... Next, we might build a stock location list, such as: [26641 LOCATED AT PL0102] [26643 LOCATED AT PL0102]
[24536 LOCATED AT PL0102] ... or a stock movement list: [26641 MOVE TO PL0103] [26643 MOVE TO PL0103] [24536 MOVE TO PL0104] ... ◄ It should be noted that since there are two classes involved in an association relationship, we can call a relationship by different names depending on the direction of naming. Another way of thinking about this is that each class plays a distinct role within any association and that each role can be given a different name. Example
The association between a stillage and a product is named as CONTAINS if we make Stillage the first term in the triple of some institutional fact. The first term, which is known as the subject in a triple, defines the class whose role is being played. [Stillage CONTAINS Product] If we make Product the first term, then the name of the relationship will need to subtly change to read or be communicated properly. In doing this, we are naming the role being played by Product in this relationship: [Product CONTAINED IN Stillage] ◄ Hence, any one association relationship can have two potential names, each name being a role of the class playing out in the relationship. It is also noteworthy that two classes, such as stillage and location, can be associated together by more than one association relationship. The classes will play different roles in each of the association relationships involved. Example
The classes House and Person can be related by ownership and/or by occupation. Hence, we might express this in the following manner: [House OWNED BY Person] [Person OWNS House] [House OCCUPIED BY Person] [Person OCCUPIES House] ◄
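The idea that one association can be named from either side can be sketched with triples of the form (subject, role, object). This is a minimal illustration under stated assumptions: the lookup table `INVERSE_ROLE` and the function `inverse` are hypothetical helpers, and the role names are those used in the examples of this section.

```python
# Each role name is paired with the name used when the triple is read
# from the other class. The table itself is an illustrative assumption.
INVERSE_ROLE = {
    "CONTAINS": "CONTAINED IN",
    "CONTAINED IN": "CONTAINS",
    "OWNS": "OWNED BY",
    "OWNED BY": "OWNS",
    "OCCUPIES": "OCCUPIED BY",
    "OCCUPIED BY": "OCCUPIES",
}


def inverse(fact):
    """Restate a (subject, role, object) triple with the object as subject."""
    subject, role, obj = fact
    return (obj, INVERSE_ROLE[role], subject)


contains_list = [
    ("26641", "CONTAINS", "5342"),
    ("26643", "CONTAINS", "6634"),
    ("24536", "CONTAINS", "9982"),
]

for fact in contains_list:
    print(fact, "reads in reverse as", inverse(fact))
```

Applying `inverse` twice returns the original triple, which reflects the point that both names describe one and the same association.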
This helps explain why in constructing institutional ontology, we need a layer of abstraction over and above the layer of institutional facts. The abstraction layer provides context to the institutional facts and represents the collective understanding or acceptance of the facts by actors within the domain. This is the essence of institutional ontology and the reason we try to make the important parts of this explicit through an information model. Example
For instance, within our manufacturing domain, it is impossible for actors to be informed by the fact [26641 LOCATED AT PL0102] without a collective understanding that 26641 refers to a stillage, PL0102 refers to a production location and the term LOCATED AT stands for an association between the two objects. ◄ Not every class upon an information model will be related to every other class. In theory, having identified a set of say 6 classes, up to 15 association relationships could exist between these classes. In practice, it will usually be quite obvious that many classes are quite unrelated. Furthermore, the goal of information modelling is to document only direct relationships of association: that is, association relationships between two classes, with no intervening class. Example
Direct relationships exist between the classes Parent and Child and between Child and School. The relationship between Parent and School is indirect; it exists only by virtue of the Child class. ◄ Exercise Each trailer arrives from a customer and might be loaded with a number of different types of steel product. Each batch of such products is therefore labelled with a unique order number. As a whole, each trailer is given its own delivery advice note detailing all associated batches on the trailer. Identify classes and relationships of association from this short snippet of an interview conducted with a manufacturing organisation.
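The figure of 15 quoted earlier for a set of 6 classes is the number of distinct pairs of classes, n(n - 1)/2 with n = 6. A short sketch of that arithmetic follows; the six class names are taken from examples in this chapter purely as labels, and the function name `possible_pairs` is hypothetical.

```python
from itertools import combinations

# Six class names used only as labels for the counting exercise.
classes = ["Customer", "Supplier", "Sales order",
           "Purchase order", "Product", "Stillage"]


def possible_pairs(names):
    """List every unordered pair of distinct classes."""
    return list(combinations(names, 2))


pairs = possible_pairs(classes)
n = len(classes)
print(len(pairs), n * (n - 1) // 2)
```

The count covers pairs of classes only. Since two classes may be linked by more than one association, it is an upper bound on associated pairs, not on named relationships.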
4.7 Constraints upon Association
To each relationship of association, we can add two types of business rule or constraint, which express for the modeller how a given institutional domain works currently or should work with its associated information classes. One type
of rule is known as a cardinality rule, while the other type of rule is known as an optionality rule. Cardinality establishes how many instances of one class are related to how many instances of another class. Any association relationship may be typed as either a one-to-one (1:1), one-to-many (1:M) or many-to-many (M:N) relationship. If we state that the relationship is one to one, then one instance of a class is always associated with one instance of the other class. Specifying a relationship as one to many means that one instance of a class is associated with more than one instance of the other class. If we state that the relationship is many to many, then many instances of one class are associated with many instances of another class. Example
In terms of the cardinality of the places relationship between customer and sales order, we ask ourselves the question: how many sales orders can be placed by one customer and how many customers appear on a particular sales order? If the answer to any of these questions is many, we say that the cardinality of that class in the relationship is many; if not, it is one. Hence, in the case of customer places sales order, customer is likely to have a cardinality of one and sales order a cardinality of many. ◄ The concept of cardinality can best be understood by using an occurrence or instance diagram. These diagrams are based on mathematical visualisations known as Venn diagrams and illustrate how occurrences or instances of information classes inter-relate. The circles or ovals on the diagrams are meant to represent sets of instances. Each information class therefore is represented as a set of instances, and each instance/object is given a unique identifier. The relationship of association between two information classes also comprises a set (drawn as circles or ovals with dotted lines on the diagram) and includes the set of associations drawn between instances of both classes. Example
Consider Fig. 4.1 which illustrates the cardinality between two classes appropriate to a university setting. Three instances of a lecturer class are identified as well as three instances of an academic department or school. The line drawn between ‘Computer Science’ and ‘234’ indicates that the lecturer with the identifier 234 is employed by the Computer Science department of this university. Note that the cardinality of the association relationship in Fig. 4.1 is one to many (1:M). The department ‘Computer Science’, for instance, has two lecturers or professors associated with it. Hence, we can express the facts of this case as: [Lecturer EMPLOYED BY Department] [Department EMPLOYS Lecturer]
Fig. 4.1 Instance diagram: one-to-many relationship (sets shown: Lecturer with instances 234, 237 and 123; Department with instances Computer science, Biology and Business; association EMPLOYS)
[EMPLOYED BY Cardinality one] [EMPLOYS Cardinality many] In contrast, in Fig. 4.2, Lecturer to Student is a many-to-many (M:N) relationship. Lecturer 237, for example, teaches students 34698 and 37798. This is expressed as: [Lecturer TEACHES Student] [Student TAUGHT BY Lecturer] [TEACHES Cardinality many] [TAUGHT BY Cardinality many] ◄ In defining the cardinality of an association relationship, we are actually making two assertions about the domain we are modelling. In essence, the information modeller is selecting between four possible options for cardinality that might apply to any one association relationship. Example
In terms of any two information classes, there are at least four ways in which cardinality might be expressed. So, in terms of the situation between Lecturer/Professor and Module, as far as teaching is concerned, we might choose one of four cardinality rules.
Fig. 4.2 Instance diagram: many-to-many relationship (sets shown: Lecturer with instances 234, 237 and 123; Student with instances 34698, 37798, 34888 and 24988; association TEACHES, read in reverse as TAUGHT BY)
(1:1) A lecturer may teach at most one module and a module is taught by at most one lecturer: [TEACHES cardinality one]; [TAUGHT BY cardinality one].
(1:M) A lecturer may teach many modules but a particular module is taught by at most one lecturer: [TEACHES cardinality many]; [TAUGHT BY cardinality one].
(M:1) A lecturer teaches at most one module but a particular module may be taught by many lecturers: [TEACHES cardinality one]; [TAUGHT BY cardinality many].
(M:N) A lecturer may teach many modules and a module may be taught by many lecturers: [TEACHES cardinality many]; [TAUGHT BY cardinality many]. ◄
In contrast to cardinality, optionality establishes whether all instances of a class must participate in a relationship or not. Hence, each class participating in a relationship is either mandatory or optional in that relationship. If a class is
mandatory in the relationship, then all instances of that class must participate in the relationship. If a class is optional in a relationship, then at least one instance of the class need not participate in the relationship. Example
Hence, in the case of customer places sales order, the optionality is mandatory both for customer and sales order in the places relationship. This means that we make two further assertions about the business situation: [Customer PLACES Sales order] [Sales order PLACED BY Customer] A customer must place at least one sales order to constitute being a customer of the company—[PLACES optionality mandatory]. A sales order must always be associated with an existing customer— [PLACED BY optionality mandatory]. ◄ Optionality is also best illuminated through an instance diagram. Example
The class Lecturer has mandatory participation in the relationship illustrated in Fig. 4.1, while Department has optional participation. Biology, for instance, is not associated with any lecturers or professors currently. In Fig. 4.2, Lecturer is optional in the teaches relationship, as there is at least one lecturer not teaching any students; Student is mandatory, indicating that all students have to be taught by some lecturer. [EMPLOYED BY optionality Mandatory] [EMPLOYS optionality Optional] [TEACHES optionality Optional] [TAUGHT BY optionality Mandatory] ◄
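The way an instance diagram suggests cardinality and optionality can be sketched by counting how often each instance appears in a list of associated pairs. This is a hedged illustration: the text states only that lecturer 234 is employed by Computer Science, that Computer Science has two lecturers and that Biology has none, so the placement of lecturers 237 and 123 below is an assumption. The function name `describe_role` is hypothetical, and a snapshot of instances can only suggest a rule, since cardinality and optionality are assertions about how the domain works.

```python
lecturers = ["234", "237", "123"]
departments = ["Computer Science", "Biology", "Business"]

# (lecturer, department) pairs. Only the first pair is stated in the text;
# the other two are assumed so that the counts match the description.
employed_by = [
    ("234", "Computer Science"),
    ("237", "Computer Science"),  # assumption
    ("123", "Business"),          # assumption
]
# The same association read from the department side.
employs = [(dept, lect) for lect, dept in employed_by]


def describe_role(subjects, pairs):
    """Return (cardinality, optionality) for the role played by the subjects."""
    counts = {subject: 0 for subject in subjects}
    for subject, _ in pairs:
        counts[subject] += 1
    cardinality = "many" if any(n > 1 for n in counts.values()) else "one"
    optionality = "mandatory" if all(n >= 1 for n in counts.values()) else "optional"
    return cardinality, optionality


print("EMPLOYED BY", describe_role(lecturers, employed_by))
print("EMPLOYS", describe_role(departments, employs))
```

Under these assumptions the counts agree with the rules stated for Fig. 4.1: EMPLOYED BY has cardinality one and is mandatory, while EMPLOYS has cardinality many and is optional.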
4.8 Generalisation and Specialisation
Much information modelling can be conducted solely with the constructs of classes, attributes and relationships of association. The original technique of entity-relationship or entity-relationship-attribute diagramming can be undertaken just with these constructs. However, over the last couple of decades, it has become important to add two other relationships of abstraction to an information model, where it is deemed necessary.
So far, we have only moved one step up in the process of semiosis by classifying some object or attributing properties to an object or associating one class with another class. We next consider moving further up the hierarchy of semiosis through the process of generalisation. Generalisation is normally used in tandem with classification to build an abstraction hierarchy. Classification, as we have seen, involves grouping objects that share common characteristics (attributes and relationships) into an information class. The main difference between classification and generalisation is that while classification relates an object class with its objects, generalisation relates an object class with another object class, and this object class is at a higher level of abstraction. In this manner, generalisation can be considered as the process of extracting from one or more information classes the description of a more general class. The special constitutive rule for generalisation here is expressed as: [X AKO Y to Z in C] where X is an object class, described as the sub-class, and Y is its super-class, meaning it is a more general or abstract class than X. The AKO (short for a kind of) relationship represents a generalisation relationship or its opposite, specialisation. In one direction, from sub-class to super-class, we are generalising from one level of abstraction to another. Example
Hence, when we state that: [Lintel AKO Product] [Crash barrier AKO Product] ... We are expressing two sub-classes of the product class, namely, lintel and crash barrier. ◄ In the other direction, from super-class to sub-class, we are reducing the level of abstraction or specialising a class. Example
Within the institutional domain of financial trading, Stock and Share might be seen as sub-classes or specialisations of a Security class. Likewise, Debenture and VariableStock might be considered sub-classes or specialisations of Stock. ◄ Generalisation, through hierarchies, can be used to provide a more economical representation of some institutional domain than would be available by merely using the construct of an information class. The important point here is that sub-classes
inherit the properties and relationships of their super-class. The analogy being made is between the transfer of traits through genes amongst organisms and the transfer of properties down through a hierarchy of classes through specialisation. Example
In terms of our example manufacturing domain, we know that a product class can be specialised as a lintel or a crash barrier. Hence, declaring a lintel to be a kind of product means that we can assume that it has a weight and length and also that it is stored in a stillage at a production location and moved between production locations within the manufacturing plant. ◄ Generalisation is particularly important to many professional practices that involve the standardised naming of things. For instance, it is critical to taxonomy, the science of identifying and naming species of organism. Taxonomy is an important sub-discipline of biology where the taxonomic scheme of biological organisms is organised hierarchically in terms of domain, kingdom, phylum, class, order, family, genus and species. This amounts to a formalised hierarchy of signs and allows biologists across the world to communicate effectively. Most libraries also use taxonomy for organising the storage and retrieval of publications. For instance, the Dewey Decimal scheme, much used in libraries worldwide, organises publications into ten main classes. Each main class is then expanded into ten divisions. And, each division is then expanded into ten sections. Many applications of the concept of generalisation do not fall into neat hierarchies. In such cases, we speak of a generalisation lattice. In other words, a given object class may be a sub-class of more than one super-class. Example
Within the stock market, a MarketMaker class could be said to be a sub-class of both an Investor class and a FinancialIntermediary class. ◄
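Inheritance down a generalisation hierarchy or lattice can be sketched by walking AKO facts upwards and collecting the attributes found on the way. This is a minimal illustration under stated assumptions: the dictionary layout and the names `AKO`, `HASA`, `superclasses_of` and `attributes_of` are hypothetical, and only the attributes already given for Product in this chapter are used.

```python
# [X AKO Y] facts: each sub-class maps to a list of super-classes, so a
# class with several super-classes (a lattice) can be expressed as well.
AKO = {
    "Lintel": ["Product"],
    "Crash barrier": ["Product"],
    "MarketMaker": ["Investor", "FinancialIntermediary"],
}

# [X HASA Y] facts for the one class whose attributes the text lists.
HASA = {"Product": ["Product Length", "Product Weight"]}


def superclasses_of(cls):
    """Every super-class reachable from cls by following AKO upwards."""
    found = set()
    for parent in AKO.get(cls, []):
        found.add(parent)
        found |= superclasses_of(parent)
    return found


def attributes_of(cls):
    """Own attributes followed by those inherited from all super-classes."""
    attributes = list(HASA.get(cls, []))
    for parent in sorted(superclasses_of(cls)):
        for attribute in HASA.get(parent, []):
            if attribute not in attributes:
                attributes.append(attribute)
    return attributes


print(attributes_of("Lintel"))
print(sorted(superclasses_of("MarketMaker")))
```

Because the attributes are declared once on Product and reached through AKO, nothing needs to be repeated for Lintel or Crash barrier, which is the economy of representation that generalisation offers.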
4.9 Generalisation Hierarchies and Lattices
In formal terms, generalisation relationships are transitive, irreflexive and anti-symmetric:
• Transitive. If A is a kind of B and B is a kind of C, then A is a kind of C.
• Irreflexive. A is not a kind of A.
• Anti-symmetric. If A is a kind of B, then B is not a kind of A.
Example
Transitive—If programmers are computing staff and computing staff are employees, then programmers are employees. Irreflexive—Employees are not a kind of employee. Anti-symmetric—If computing staff are a kind of employee, then employees are not a kind of computing staff. ◄ Since classification relationships define links between objects and object classes, it does not make sense to talk of the transitivity or symmetry of these relationships. In terms of generalisation hierarchies, it is sometimes useful to make a distinction between partial and covering sub-classes. In terms of some information class, if its sub-classes are partial, then other sub-classes can be included for the super-class. If sub-classes are covering, then no further sub-classes are permitted. Example
If we regard Broker and MarketMaker as partial sub-classes of FinancialIntermediary, then other sub-classes are possible. If these sub-classes are covering, then Brokers and MarketMakers would be the only types of FinancialIntermediary permitted on the stock market. In terms of our manufacturing example, it is unlikely that lintel and crash barrier are the only types of product produced by the company. Hence, this generalisation relationship would be described as partial. ◄ Disjoint sub-classes do not overlap. However, we can conceive of situations where the concepts referred to by information classes do overlap. If all sub-classes in an information model are disjoint, we have a strict hierarchy of classes. If some are overlapping, we have a lattice structure. Example
Share and Stock are disjoint sub-classes of Security. A Security cannot be both a share and a stock. Broker and MarketMaker are two overlapping sub-classes of financial intermediary since market makers can act as brokers. ◄
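The three formal properties of generalisation given at the start of this section can be checked mechanically over a set of AKO facts. The following is a minimal sketch, assuming the facts are held as (sub-class, super-class) pairs; the function names are hypothetical, and the facts are those of the programmer example used earlier in this section.

```python
# Stated [X AKO Y] facts, held as (sub-class, super-class) pairs.
ako_facts = {
    ("Programmer", "Computing staff"),
    ("Computing staff", "Employee"),
}


def transitive_closure(facts):
    """Add every fact implied by transitivity until nothing new appears."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in list(closed):
            for c, d in list(closed):
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed


def is_irreflexive(facts):
    """No class is a kind of itself."""
    return all(a != b for a, b in facts)


def is_antisymmetric(facts):
    """If A is a kind of B, then B is not a kind of A (as stated in the text)."""
    return all((b, a) not in facts for a, b in facts)


closed = transitive_closure(ako_facts)
print(sorted(closed))
print(is_irreflexive(closed), is_antisymmetric(closed))
```

Running the checks on the closure, not only on the stated facts, is a design choice: a cycle such as A AKO B, B AKO C, C AKO A would pass both checks on the stated facts alone yet fail them once the implied facts are added.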
4.10
Aggregation and Decomposition
We can build a substantial part of some ontology with classification, attribution, association and generalisation. However, there is one more constitutive rule that can be useful in certain circumstances for building institutional ontology—this is aggregation or its opposite decomposition. The constitutive rule here is:
[X PART OF Y to Z in C] in which X is a class which is part of a wider whole class Y in some domain C. An aggregation relationship occurs between a whole and its parts and is an abstraction in which a relationship between objects is considered a higher-level object. This makes it possible to focus on the aggregate while suppressing lower-level detail. Example
For example, in terms of the financial domain, we might define a financial portfolio class that aggregates together all the financial products making up a given customer’s interaction with the financial company. In such terms, a financial portfolio class can be considered an aggregate of securities, insurance policies and savings accounts. Likewise, a country can be considered an aggregate of regions which are aggregates of counties which are aggregates of districts and so on. In the case of the health service, a patient history can be considered as a collection or an aggregate of diagnoses, prescriptions and treatments. ◄
Hence, aggregation relationships compose an object out of an assembly or aggregation of other objects. When we state that:
[Railway station PART OF railway]
[Railway line PART OF railway]
we are declaring that railways are composed of an aggregation of railway stations and railway lines. The opposite of aggregation is decomposition, that is, the process of decomposing an object class into its constituent parts. But given that we can build aggregation as well as generalisation hierarchies, what is the difference between the two? It is possible to distinguish between aggregation and generalisation in the following way. If two classes are defined in terms of a generalisation relationship, then both sub-class and super-class effectively refer to the same physical or institutional thing, the same group of objects. The super-class is merely a higher-level abstraction of the thing than its sub-class, and both instances of the sub-class and super-class will be referred to by the same identifier. In contrast, within an aggregation relationship, the aggregate, the whole, is different from any of its parts. The aggregate is merely a useful container for collecting together a set of cognate classes, instances of which will all have different identifiers.
Example
A lintel is the same thing as a product, and a stock is the same thing as a security. However, a financial portfolio is different from an insurance policy, and a country
is different from a county. A railway is different from a railway line, and a patient history is different from a patient treatment. ◄
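The contrast between the two kinds of hierarchy can be sketched in code. This is an illustration only, assuming Python dataclasses as a stand-in for information classes; the attribute names and identifier values (such as `product_code`, `FP001` and `IP77`) are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Product:
    product_code: str            # identifier shared by every kind of product

@dataclass
class Lintel(Product):           # generalisation: a lintel is a product
    length_mm: int = 0

@dataclass
class InsurancePolicy:
    policy_no: str               # a part carries its own identifier

@dataclass
class SavingsAccount:
    account_no: str              # a part carries its own identifier

@dataclass
class FinancialPortfolio:        # aggregation: a portfolio collects parts
    portfolio_no: str
    parts: list = field(default_factory=list)

lintel = Lintel(product_code="PL0102", length_mm=1200)
portfolio = FinancialPortfolio("FP001", [InsurancePolicy("IP77"), SavingsAccount("SA42")])

print(isinstance(lintel, Product))             # True: same thing, same identifier
print(isinstance(portfolio, InsurancePolicy))  # False: the whole differs from its parts
```

The lintel is referred to by the one product code whether it is viewed as a lintel or as a product, whereas the portfolio and each of its parts are identified separately.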
4.11
Institutional Ontology as a Sign Lattice
So, let us review where we have got to. In the previous chapter, we made the case for thinking of information modelling as an attempt to build a partial model of institutional ontology, a model focused upon the things identified and described by a group of institutional actors within communicative practice. A lattice of signs helps provide a concrete way of thinking about the notion of institutional ontology and is constructed from objects and classes as well as relationships of attribution, association, generalisation and aggregation. Objects and classes are signs we use not only to identify and describe things of interest within some institutional domain; they also prescribe what can exist within this domain to institutional actors. When we declare a lintel to be a kind of product and that is a product, we not only identify a product as being of a certain type; we expect through inheritance for it to be described in terms of its length and weight. But, as we indicated in a previous section, identifying and describing a product in this way brings this thing into existence for the domain. As far as institutional ontology is concerned, a thing does not exist until it can be signified or named by actors within the domain. But information classes do not exist in isolation. They exist in a complex lattice consisting of other related signs. The way in which a certain sign has the potential to inform actors is down to its relationships with other signs within the lattice structure. Hence, as we have seen, a stillage only makes sense as a container of product which can be stored at production locations and moved between such locations. The lattice also establishes that products may be lintels or crash barriers and both of these classes are part of the wider aggregate of a product line. We shall look in some detail at how to visualise an information model in Chap. 5 and consider a number of different conventions for doing this. 
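The idea that a sign informs through its relationships with other signs can be illustrated with a small sketch. It assumes the lattice is held as a list of labelled links; the link list, the use of HASA for attribution and the function `attributes_of` are assumptions made for illustration, not the book's own implementation.

```python
# Hypothetical links for the manufacturing lattice described above.
LINKS = [
    ("Product", "HASA", "productLength"),
    ("Product", "HASA", "productWeight"),
    ("Lintel", "AKO", "Product"),
    ("Crash barrier", "AKO", "Product"),
    ("Lintel", "PART OF", "Product line"),
    ("Crash barrier", "PART OF", "Product line"),
    ("Stillage", "CONTAINS", "Product"),
    ("Stillage", "LOCATED-AT", "Location"),
]

def attributes_of(cls, links):
    """Collect attributes declared on cls and inherited through AKO links."""
    found = [o for s, r, o in links if s == cls and r == "HASA"]
    for s, r, o in links:
        if s == cls and r == "AKO":
            found.extend(attributes_of(o, links))
    return found

print(attributes_of("Lintel", LINKS))    # ['productLength', 'productWeight']
print(attributes_of("Stillage", LINKS))  # []
```

Lintel declares no attributes of its own, yet it is described by length and weight because of its AKO link to Product.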
Here, we just provide a stepping point from the discussion of the current chapter to ways of visualising information models. One simple way of visualising the sign lattice appropriate to some delimited institutional domain is to simply position appropriate terms for classes upon the page and link these terms together with lines or arrows labelled with the appropriate construct—HASA, AKO and PART OF—and appropriate labels for relationships of association, such as CONTAINS, LOCATED-AT and MOVE-TO. Figure 4.3 provides an example of such a simple visualisation for aspects of the institutional ontology of the manufacturing domain we have discussed in the current chapter. This form of representation is similar to something known as a semantic net and has proven popular not only in areas of artificial intelligence such as machine learning but also in attempts to develop an architecture for the so-called semantic web. In Chap. 5, we shall cover other more involved and standard conventions for
Fig. 4.3 A sign lattice (diagram linking the object classes Product line, Product, Stillage, Location, Lintel and Crash barrier, the attributes Product weight and Product length, and sample instances such as PL0102, PL0103 and PL0104, with links labelled ISA and LOCATED AT)
visualising the sign lattice appropriate to some institutional ontology—such as the form of visualisation in Fig. 4.4. This form of visualisation is particularly directed at the design of data systems (Chap. 8). Although we have taken great pains within this chapter to unravel the ways in which institutional ontology is built from first principles, we should remember that actors taking action within domains, such as manufacturing, emergency response and higher education, do not think and act with such a formal notion of ontology. Instead, they acquire the elements of such ontology through socialisation into the domain and utilise such ontology as an accepted and unexamined part of their surround-world—their ready-at-hand appreciation of the significance of objects. This means that a domain actor’s ontological understanding is very much entangled with their use of signs to identify and describe things and through this process of semiosis to act in terms of such things. As we have seen in Chap. 3, we arrive at a sign lattice, such as the simple one displayed in Fig. 4.3, by investigating and representing patterns of communication
Fig. 4.4 An information model (diagram of the classes Location, Stillage, Product line and Product, with Product carrying the attributes productCode, productWeight and productLength; Lintel and Crash barrier are shown as (disjoint, complete) sub-classes of Product; association labels are LOCATES, RECEIVES, LOCATED AT, MOVE TO, CONTAINS and CONTAINED IN)
appropriate to some institutional domain. This understanding is then used to unpack the content of communication and to compose an information model from a close understanding of the purpose of such content. We shall examine some of the ways of composing an information model in Chap. 6. If our aim is to build an information model of some existing domain, then we can build an information model from the bottom-up or the top-down. Through intensive investigation of some domain, we can traverse information situations from the
scaffolding of existing data structures through to communicative acts and the coordinated activities that rely on such practices. Or, we can reverse the investigation of information situations by starting with what people do in the domain and then by close study of communicative acts come to an understanding of what people identify and describe. If the information model is being built for an entirely new domain of action, then we first need to design the patterns of information situations appropriate to some new area of work. Once this is achieved, then we have a concrete basis for building an information model.
4.12
Conclusion
In this chapter, we have spent some time considering how the core constructs of information modelling—objects, classes, attributes and relationships—relate to our model of information situations discussed in Chap. 2. We have started to portray information modelling as an attempt to build a partial model of institutional ontology, a model focused upon the things identified and described by a group of institutional actors within communicative practice. In the next chapter, we consider the relationship between such a model and reality in greater detail and consider how we begin to investigate the basis for such an information model. In doing this, we shall highlight the key differences between the approach to information modelling promoted in this book and traditional approaches to information modelling, alluded to in the introductory chapter.
4.13
Summary
• An object is some thing a set of actors within some institutional domain takes to exist.
• When we group a set of similar objects together and provide a category for such a group, we classify such objects. Objects are then said to instantiate the class. An information class is a sign which stands for an object, or more likely a set of such objects, and serves to categorise or classify such objects.
• One information class can be associated with another information class. Association relationships are characterised by two sets of rules: cardinality rules and optionality rules.
• Cardinality defines how many instances of one class are related to how many instances of another class.
• Optionality establishes whether all instances of a class must participate in a relationship or not.
• An information class is characterised by a number of properties or attributes. One or more attributes of the class are chosen to be identifiers for the class.
• A class may be a sub-class of another class, in which case it is related to that class through a relationship of generalisation.
• A class may be part of a container class, in which case it is related to that class through a relationship of aggregation.
• A lattice of object classes provides a concrete way of thinking about the notion of institutional ontology and is constructed from objects and classes as well as relationships of attribution, association, generalisation and aggregation.
5
Visualising an Information Model
5.1
Introduction
In Chap. 4, we considered information modelling purely in terms of its major constructs. These core constructs consist of classes, attributes and relationships of association. Additional constructs include relationships of generalisation and aggregation. The modeller can use these constructs to form a representation of important aspects of communicative competence relevant to some institutional domain. Within Chap. 4, we considered a canonical form for the representation of an information model in which a series of constitutive rules written as binary relations is used as the means for capturing the essence of some institutional ontology. However, as we indicated in Chap. 3, information modelling originally developed as a diagramming technique meant to aid the work of analysts and designers of data systems of various forms, particularly relational database systems. Diagramming is used because, rather than trying to understand and capture what is going on or what people would like to happen in words, business analysts tend to use pictures of various forms. Visualisation is not only used to build a collective record of some experience; it can also be used to facilitate creative thinking or to improve analysis of some problem.
5.2
Why Visualise?
As we have seen in previous chapters, it is perfectly possible to build an information model as a written definition of the physical and institutional facts appropriate to some domain. However, information modellers, just like business analysts in general (Beynon-Davies 2021a), prefer to build visualisations of information models. We use the term visualisation here not to refer to the process of forming a mental image of something but to the process of building a diagram or graphical representation of something. # The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 P. Beynon-Davies, Information Modelling, https://doi.org/10.1007/978-3-030-98805-0_5
Analysts like to visualise problems and designers like to visualise solutions because there are some key advantages to visualisation, bound up with adages such as ‘a picture is worth a thousand words’ (Tufte 1990). By this adage is meant that a diagram of something can frequently cover more of some problem situation than a written description. Generally, more can be represented on one visual than in many pages of writing. Visualisations also appear to be especially good at representing the complexity of some situation, particularly situations in which there are many interactions between things. Visualisations also tend to be more engaging than written text. This makes it easier to communicate common understandings of things amongst actors with diverse backgrounds and perspectives. Since visuals are by their very nature highly visible, this helps encourage group working. Finally, visualisations tend to be easier to change than written descriptions or specifications. This makes them extremely suitable for building prototypes as models of possible solutions to problems. It is for these reasons that in this chapter we look at this issue of visualisation in some detail and show how to diagram an information model using one of many possible visual notations. Exercise To gain a feeling for some of the advantages of visualisation, try this exercise. Suppose you have to explain to somebody the complex route to be taken from point A where you are currently situated to point B, a location many miles distant. Try to do this in two ways. In the first way, you verbally describe the route to be taken. In the second way, you draw them a rough route map on a sheet of paper. Which of these two ways of representing a problem situation is likely to be the most effective and why?
5.3
Notations for an Information Model Diagram
As we explained in Chap. 3, ways of conducting information modelling have remained relatively stable for over three decades. To isolate and express the accepted elements of an information model separate from issues of visualisation, we have implicitly used at a number of points what are known as binary relations, which were originally proposed in the work of Richard Frost (1982, 1983). We have used binary relations as what is known as a canonical form for expressing an information model. A canonical form is a basic or standard form for representing something, but which can be translated easily into other forms. A binary relation, as we have seen, can be considered a triple of items, in which the first item is termed the subject, the second the relation and the third the object. The theory of binary relations is useful because it can be shown that many representational formalisms, familiar within information modelling, can be reconstructed from these simple, atomic forms (Frost 1983).
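A binary relation held as a subject, relation, object triple maps directly onto a simple data structure. The following is a minimal sketch, assuming facts are stored in a list and queried by pattern; the `Fact` type and the `match` function are hypothetical names, and the sample facts are drawn from earlier examples in the book.

```python
from typing import NamedTuple

class Fact(NamedTuple):
    subject: str
    relation: str
    object: str

FACTS = [
    Fact("Lintel", "AKO", "Product"),
    Fact("Crash barrier", "AKO", "Product"),
    Fact("Railway station", "PART OF", "Railway"),
    Fact("Railway line", "PART OF", "Railway"),
]

def match(facts, subject=None, relation=None, obj=None):
    """Return the facts matching every field given; None acts as a wildcard."""
    return [
        f for f in facts
        if (subject is None or f.subject == subject)
        and (relation is None or f.relation == relation)
        and (obj is None or f.object == obj)
    ]

# Which classes are parts of a railway?
print([f.subject for f in match(FACTS, relation="PART OF", obj="Railway")])
# ['Railway station', 'Railway line']
```

Because every fact has the same three-part shape, one query function is enough to ask about generalisation, aggregation or any other relation.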
Fig. 5.1 Variation in information modelling notation (six alternative diagrams of the same association between Course and Module, labelled offers and offered-on, with constraints written variously as 1 and M, as 1:1 and 0:M, and as 1..1 and 1..*)
Information models are usually mapped out as diagrams, but there is unfortunately no standard notation for diagramming information models. A number of notations used in practice are illustrated in Fig. 5.1. Each of the diagrams in this figure specifies the same subset of an ontology associated with an academic institution, which we might represent as a series of institutional facts in the following manner:
[Course OFFERS Module] - a course offers modules
[Module OFFERED-ON Course] - a module is offered on courses
[OFFERED-ON cardinality One] - a module is offered on one and only one course
[OFFERED-ON optionality Optional] - a module does not need to be offered on a given course
[OFFERS cardinality Many] - a course offers a number of modules
[OFFERS optionality Mandatory] - a course must offer at least one module
In the following sections, we use one of these notations to consider issues of visualisation in more detail. The resulting visualisation can be easily translated into a visualisation using the other notations. The first notation illustrated in Fig. 5.1 is the one used within this book, primarily because it is the one closest to the original notation proposed by Chen and it is the
one easiest to draw quickly on paper. The last notation in Fig. 5.1 is that proposed by the Unified Modelling Language (UML) for class diagramming. But to reiterate, the reader should not worry about the use of any one notation as each of these notations can be readily translated into any of the other notations.
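To illustrate why translation between notations is straightforward, the sketch below derives two textual styles from the same institutional facts. It rests on an assumption that is not stated in this form in the text: optionality supplies a lower bound (Optional gives 0, Mandatory gives 1) and cardinality supplies an upper bound (One gives 1, Many gives M or *). The function names are hypothetical, and the sketch says nothing about which end of a line a given notation draws each pair on.

```python
FACTS = [
    ("Course", "OFFERS", "Module"),
    ("Module", "OFFERED-ON", "Course"),
    ("OFFERED-ON", "cardinality", "One"),
    ("OFFERED-ON", "optionality", "Optional"),
    ("OFFERS", "cardinality", "Many"),
    ("OFFERS", "optionality", "Mandatory"),
]

def constraint(facts, label, kind):
    """Look up the cardinality or optionality recorded for a relationship label."""
    return next(o for s, r, o in facts if s == label and r == kind)

def bounds(facts, label):
    """Assumed mapping: optionality gives the lower bound, cardinality the upper."""
    lower = "0" if constraint(facts, label, "optionality") == "Optional" else "1"
    upper = "1" if constraint(facts, label, "cardinality") == "One" else "many"
    return lower, upper

def render(facts, label, style):
    """Write the same pair of bounds in a colon style or a UML-like style."""
    lower, upper = bounds(facts, label)
    if style == "colon":
        return f"{lower}:{'M' if upper == 'many' else upper}"
    if style == "uml":
        return f"{lower}..{'*' if upper == 'many' else upper}"
    raise ValueError(f"unknown style: {style}")

for label in ("OFFERS", "OFFERED-ON"):
    print(label, render(FACTS, label, "colon"), render(FACTS, label, "uml"))
# OFFERS 1:M 1..*
# OFFERED-ON 0:1 0..1
```

Only the `render` function knows about notation; the facts themselves stay the same whichever style is chosen.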
5.4
Visualising Classes
An information class is typically represented upon an information model diagram by a rectangular box in which is written a meaningful name for the class. Note that it is conventional to denote an information class with a singular noun. This is because, as we discussed in Chap. 4, a class represents a category of something. There is only one example of a category, but a category is used to cover many instances of objects. One way to think of this graphic is that the class is setting a boundary around objects relevant to this class. It is dividing up the world into those things included inside the box and hence instances of the class and those things outside the box which represent all other things within the institutional domain in question. Example
We speak of an order and not of orders, a patient and not of patients. Figure 5.2 provides some more examples of information classes from different organisational domains. ◄ Example
Draw the likely information classes from the following description: The stock market is a market for the purchase and sale of securities. Securities come in two major forms: stocks and shares. A stock, sometimes known as a gilt-edged security or gilt, is a security with an associated interest rate. The most important type of stock is the government bond. Shares are a type of security which pay no interest, but pay a dividend to shareholders at regular intervals. Shares are normally issued by companies to raise capital. ◄
5.5
Visualising Relationships of Association
An association relationship between classes is represented by drawing a line between the relevant boxes on the diagram. In many notations, labels are placed on the relationship lines and are typically used as a way of resolving ambiguity. It must be acknowledged that it is frequently difficult to think of meaningful labels in this manner for relationships and sometimes including labels for relationships is cumbersome to represent on a diagram. Most relationships are best represented by verbs. Verbs however usually imply some direction. Hence, the relationship between person and grade might be read as
Fig. 5.2 Example information classes (class boxes for Module, Lecturer, Product, Patient, Incident, Location, Customer, Supplier, Student, Order, Payment and Sale)
person is graded by grade in one direction and grade grades person in the opposite direction. Example
Figure 5.3 illustrates a number of relationships, some labelled and some unlabelled, between information classes. ◄ Example
Identify further classes and relationships of association from the following description: Persons or institutions which deal in securities on the stock market are known as financial intermediaries. There are two main types of financial intermediary: brokers and market makers. Securities are bought from certain registered market makers by investors. A purchase of a security is known as a deal. ◄ Within the notation proposed by the Unified Modelling Language (UML), modellers are encouraged to assign role names to classes involved in a relationship of association, as illustrated in Fig. 5.4. As a name, a role must clearly be a noun which adds a certain confusion to the situation and departs from standard practice in information modelling. We have adopted a compromise position in this book, in which two labels may be included upon a relationship line but these labels are standard verbs. Where the size of the diagram is particularly large, then these labels are often omitted.
Fig. 5.3 Sample relationships of association (Incident involves Patient; Incident located-at Location, with site-of in the opposite direction; Lecturer teaches Module; Student enrolled Course; Customer places Order; an unlabelled relationship between Product and Order)
Fig. 5.4 An example of the use of role names (an association between Order and Product carrying the role names product order and ordered product)
5.6
Visualising Attributes
Upon an information model, attributes may be represented by adding their names to the appropriate class box. Attributes are enclosed within the class box itself to represent the way in which they add detail to the description of an information class. However, when an information model becomes full with a large number of
information classes, the attributes associated with particular classes are likely to be left off an information model diagram. Instead, they will be included within an accompanying document to the diagram. Example
Figure 5.5 provides some examples of attributes appropriate to a number of different information classes. The chosen identifiers for each class are underlined. ◄ Exercise Draw the relevant classes with their attributes from the following description: Each market maker will define the state of each type of share it holds in terms of two prices: the offer price and the bid price. The offer price is the price a market maker is willing to sell a share: the price at which an investor will buy. The bid price is the price a market maker is willing to pay for a share: the price at which an investor can sell to him. The difference between the two prices is known as the market makers’ ‘spread’. Different market makers will quote different spreads on shares depending on the state of their book.
Fig. 5.5 Example attributes
Incident: incidentNo, incidentDescription, incidentCategory, incidentStatus
Lecturer: employeeNo, lecturerName, lecturerStatus
Patient: patientNo, patientName, patientAge, patientCondition, patientMedicalHistory
Module: moduleCode, moduleName, credits
5.7
Visualising Constraints upon Association
The cardinality and optionality characteristics of a given relationship of association amount to constraints upon the behaviour of the two classes involved in this association. There are a number of competing notational devices available for portraying the cardinality of an association relationship. A popular and convenient way to represent cardinality is by drawing a crow’s foot on the many end of an association relationship. A crow’s foot is so-called because it looks like the foot of a bird, such as a crow. We assume that the default participation of a class in an association relationship is mandatory. If the participation is optional, we add a circle (an ‘O’ for optional) alongside the relevant class. Hence, if no ‘O’ is present, we assume that the optionality of a class is mandatory. If we want to be certain of our definition, we can use a strike symbol (a line drawn perpendicular through the relationship line) to indicate mandatory status. Example
Figure 5.6 provides a number of examples of relationships with the cardinality and optionality of classes defined for these relationships. ◄ Exercise Draw the cardinality and optionality appropriate for the following domain description: To conduct a deal, an investor issues a broker with an order specification. An investor may place many orders with a broker but may not place any orders with a particular broker. A broker may handle many orders but may not handle deals with certain investors. UML treats the two concepts of cardinality and optionality, somewhat confusingly, through the single idea of multiplicity. An example of this visual notation is illustrated in Fig. 5.7. A specification of multiplicity is placed at each end of an association relationship and consists of a lower bound and upper bound separated by two dots. The lower bound is the minimum value that can be taken by instances or objects of a class, while the upper bound is the maximum value taken. An asterisk is used to indicate an unspecified number of many instances as an upper bound or lower bound. Hence, the multiplicity of a given class could be expressed as 0..3 meaning that there cannot be more than three associated objects in this relationship but there may be none. In Fig. 5.7, the multiplicity 1..1 indicates that there is one and only one object of this class associated with the other class in this relationship, whereas the multiplicity 1..* indicates that there is at least one but possibly many objects associated for this class within the relationship.
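The lower bound and upper bound reading of multiplicity can be sketched as a small parser and checker. This is an illustration under stated assumptions: the function names are hypothetical, the sketch handles only a numeric lower bound with either a numeric or an asterisk upper bound, and the mapping back to cardinality and optionality in `describe` is an interpretation rather than part of UML itself.

```python
def parse_multiplicity(text):
    """Split 'lower..upper' into a (lower, upper) pair; '*' becomes None (unbounded)."""
    lower_text, upper_text = text.split("..")
    lower = int(lower_text)
    upper = None if upper_text == "*" else int(upper_text)
    return lower, upper

def allows(text, count):
    """Return True if 'count' associated objects satisfy the multiplicity."""
    lower, upper = parse_multiplicity(text)
    return count >= lower and (upper is None or count <= upper)

def describe(text):
    """Assumed reading: lower bound 0 is Optional; upper bound 1 is One."""
    lower, upper = parse_multiplicity(text)
    optionality = "Optional" if lower == 0 else "Mandatory"
    cardinality = "One" if upper == 1 else "Many"
    return optionality, cardinality

print(allows("0..3", 0), allows("0..3", 4))  # True False
print(allows("1..*", 0), allows("1..*", 25)) # False True
print(describe("1..1"))                      # ('Mandatory', 'One')
print(describe("1..*"))                      # ('Mandatory', 'Many')
```

The 0..3 case shows what multiplicity adds to the One or Many distinction: a specific ceiling on the number of associated objects.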
Fig. 5.6 Example relationships (the associations Incident and Patient, Incident and Location, Lecturer teaches Module, Student enrolled Course, Customer places Order, and Order and Product, drawn with cardinality and optionality symbols)
Fig. 5.7 Multiplicity upon an association relationship (an association between Order and Product carrying the multiplicities 1..1 and 1..*)
5.8
Visualising Generalisation
A generalisation relationship is indicated on an information model diagram by a line drawn between sub-class and super-class with a triangle placed at the head of the line next to the super-class. This is the UML notation for generalisation. Disjoint generalisation is represented by the labels disjoint or overlapping expressed in brackets and placed next to the triangle. Partial generalisation may be indicated by the keywords incomplete or complete expressed in a similar way.
Example
Figure 5.8 illustrates two examples of the diagramming of generalisation relationships. The first diagrams the facts that both stock and share are sub-classes of a financial security. Stock and share are also disjoint and complete sub-classes, meaning that a security must be either a stock or a share. The second diagram defines the facts that a broker and a market maker are sub-classes of a financial intermediary. These sub-classes are overlapping and incomplete, meaning that a broker can also be a market maker and vice versa. There are also other types of financial intermediary besides a broker and market maker. ◄
Fig. 5.8 Generalisation (two diagrams: Security with Stock and Share as (disjoint, complete) sub-classes; FinancialIntermediary with Broker and MarketMaker as (overlapping, incomplete) sub-classes)
Exercise Draw a generalisation hierarchy for the following case: Stocks can have variable interest rates or fixed interest rates. Fixed interest rate stock is sometimes called debenture or loan capital. Similarly, shares can offer fixed or variable dividends. Fixed dividend shares are sometimes known as preference capital; variable dividend shares are known as equity capital.
5.9
Visualising Aggregation
Graphically, we may depict aggregation as a series of lines or a forked line between the whole and its parts. A diamond is also placed next to the aggregate class. Example
Figure 5.9 illustrates a sample aggregation relationship, indicating that a financial portfolio class is made up of a collection of other classes. ◄ Exercise A patient record is a classic example of an aggregate consisting of personal details, health conditions, treatments, medicines, allergies and past reactions to medicines, scans, X-ray results and lifestyle information such as if the patient drinks and smokes. Try to draw a visualisation of this aggregate.
Fig. 5.9 Aggregation (FinancialPortfolio shown as an aggregate of Stock, Share, InsurancePolicy and SavingsAccount)
5.10
Institutional Facts to an Information Model Diagram
In our discussion in the current chapter and in Chap. 4, we have introduced rather informally the idea that it is relatively straightforward to move from a set of institutional facts established for some domain to an information model diagram. Let us demonstrate this process more completely here in terms of an extended example. Suppose that we have established through some form of investigation (Chap. 3) the following institutional facts held important to a certain manufacturing domain. The first set of facts establishes the set of information classes held important by actors working within this domain. It is therefore convenient to list them as a set of binary relations as follows:
[Delivery advice ISA Object Class]
[Dispatch advice ISA Object Class]
[Customer ISA Object Class]
[Delivery item ISA Object Class]
[Dispatch item ISA Object Class]
[Product item ISA Object Class]
[Job ISA Object Class]
[Production run ISA Object Class]
[Production Schedule ISA Object Class]
In a sense, the class Object class is a meta-class here, that is, a class which is normally implicit rather than explicit upon an information model diagram. We next need to establish which class is associated with which other class.
A possible set of association relationships is listed as follows, with each relationship named twice for consistency to indicate the direction of each relationship or the role played by a class in the relationship:
[Delivery advice DETAILS Delivery item]
[Delivery item DETAILED-UPON Delivery advice]
[Dispatch advice LISTS Dispatch item]
[Dispatch item LISTED-UPON Dispatch advice]
[Customer CREATES Delivery advice]
[Delivery advice CREATED-BY Customer]
[Customer RECEIVES Dispatch advice]
[Dispatch advice RECEIVED-BY Customer]
[Product item APPEARS-DELIVERY Delivery item]
[Product item APPEARS-DISPATCH Dispatch item]
[Dispatch item NAMES-DISPATCH Product item]
[Delivery item NAMES-DELIVERY Product item]
[Delivery item PROCESSED-AS Job]
[Job PROCESSES Delivery item]
[Dispatch item COMPLETED-AS Job]
[Job COMPLETES Dispatch item]
[Product item MANUFACTURES-AS Job]
[Job MANUFACTURES Product item]
[Production run HANDLES Job]
[Job HANDLED-BY Production run]
[Production schedule DETAILS-RUN Production run]
[Production run DETAILED-ON Production Schedule]
The cardinality of each relationship can then be established by indicating whether a given class has at least one or many instances involved in the detailed relationship. Hence, we might specify the cardinality in the following manner:
[DETAILS cardinality One]
[DETAILED-UPON cardinality Many]
[LISTS cardinality One]
[LISTED-UPON cardinality Many]
[CREATES cardinality One]
[CREATED-BY cardinality Many]
[RECEIVES cardinality One]
[RECEIVED-BY cardinality Many]
[APPEARS-DELIVERY cardinality One]
[APPEARS-DISPATCH cardinality One]
[NAMES-DISPATCH cardinality Many]
[NAMES-DELIVERY cardinality Many]
[PROCESSED-AS cardinality One]
[PROCESSES cardinality Many]
[COMPLETED-AS cardinality One]
[COMPLETES cardinality One]
[MANUFACTURES-AS cardinality One]
[MANUFACTURES cardinality One]
[HANDLES cardinality One]
[HANDLED-BY cardinality Many]
[DETAILS-RUN cardinality One]
[DETAILED-ON cardinality Many]
Note that each relationship label should be unique within any one information model to enable us to unambiguously assign constraints upon the relationship. Also, we have not defined each relationship as an association explicitly, such as [HANDLES ISA Association]. Instead, we have assumed, as in the case of an Object class, that the Association class is a meta-class used implicitly to classify each relationship of association. Finally, the optionality of each relationship needs also to be listed in the following manner:
[DETAILS optionality Mandatory]
[DETAILED-UPON optionality Mandatory]
[LISTS optionality Mandatory]
[LISTED-UPON optionality Mandatory]
[CREATES optionality Mandatory]
[CREATED-BY optionality Mandatory]
[RECEIVES optionality Mandatory]
[RECEIVED-BY optionality Mandatory]
[APPEARS-DELIVERY optionality Optional]
[APPEARS-DISPATCH optionality Optional]
[NAMES-DISPATCH optionality Mandatory]
[NAMES-DELIVERY optionality Mandatory]
[PROCESSED-AS optionality Mandatory]
[PROCESSES optionality Mandatory]
[COMPLETED-AS optionality Mandatory]
[COMPLETES optionality Mandatory]
[MANUFACTURES-AS optionality Optional]
[MANUFACTURES optionality Mandatory]
[HANDLES optionality Mandatory]
[HANDLED-BY optionality Mandatory]
[DETAILS-RUN optionality Mandatory]
[DETAILED-ON optionality Mandatory]

Given that we have established the relevant institutional facts for the domain, it is a relatively simple process to produce a diagram from this using the conventions discussed in this chapter. First, produce the labelled boxes for each information class. Second, draw and label the lines to represent associations between information classes. Third, add cardinality to the relationships. Fourth, add optionality to the relationships. A completed information model diagram which corresponds to the facts established for this domain is illustrated in Fig. 5.10. Note that we have not listed any attributes for the information classes upon this diagram. This might form part of a more detailed investigation and diagramming effort.

Clearly, this diagram does not indicate any generalisation or aggregation. This would be entirely possible by, for instance, making a generalisation hierarchy out of advices, such that a Delivery advice and Dispatch advice are considered sub-classes of an Advice super-class. However, as a general rule, we suggest that the information modeller should only include such relationships of abstraction where they are deemed necessary to illuminate aspects of communication within the domain under investigation.

Example
Consider the case where this manufacturing company deals with two types of customer which it refers to as a major and minor customer. Major customers make repeat orders with the company and produce their own delivery advices; minor customers make irregular orders with the company and do not produce their own delivery advices. In this situation, we might choose to model these institutional facts as a generalisation hierarchy and visualise this situation as in Fig. 5.11:
[Major customer AKO Customer]
[Minor customer AKO Customer] ◄

Fig. 5.10 An information model diagram for a manufacturing domain
Fig. 5.11 Generalisation in the manufacturing domain (Customer with sub-classes Major customer and Minor customer)
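The four diagramming steps described in this section (boxes, lines, cardinality, optionality), together with the generalisation facts of the example, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `Fact` tuple, the `compose_diagram` function and the arrow notation of the printed output are hypothetical devices introduced here, not notation from this book, and only a small subset of the manufacturing facts is used.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One bracketed fact, e.g. [Customer CREATES Delivery advice]."""
    subject: str
    predicate: str
    obj: str


# A small subset of the manufacturing facts (illustrative, not the full set).
FACTS = [
    Fact("Customer", "CREATES", "Delivery advice"),
    Fact("Delivery advice", "CREATED-BY", "Customer"),
    Fact("CREATES", "cardinality", "One"),
    Fact("CREATED-BY", "cardinality", "Many"),
    Fact("CREATES", "optionality", "Mandatory"),
    Fact("CREATED-BY", "optionality", "Mandatory"),
    Fact("Major customer", "AKO", "Customer"),
    Fact("Minor customer", "AKO", "Customer"),
]

CONSTRAINTS = ("cardinality", "optionality")


def compose_diagram(facts):
    """Return (boxes, lines): class boxes and annotated relationship lines."""
    # Collect the constraints attached to each relationship label.
    constraints = {}
    for fact in facts:
        if fact.predicate in CONSTRAINTS:
            constraints.setdefault(fact.subject, {})[fact.predicate] = fact.obj

    boxes, lines = [], []
    for fact in facts:
        if fact.predicate in CONSTRAINTS:
            continue
        # Step 1: one labelled box per information class.
        for name in (fact.subject, fact.obj):
            if name not in boxes:
                boxes.append(name)
        if fact.predicate == "AKO":
            # Generalisation: the sub-class points at its super-class.
            lines.append(f"{fact.subject} --|> {fact.obj}")
        else:
            # Steps 2 to 4: a labelled line plus cardinality and optionality.
            found = constraints.get(fact.predicate, {})
            card = found.get("cardinality", "unspecified")
            opt = found.get("optionality", "unspecified")
            lines.append(
                f"{fact.subject} --{fact.predicate} [{card}, {opt}]--> {fact.obj}"
            )
    return boxes, lines


boxes, lines = compose_diagram(FACTS)
for line in lines:
    print(line)
```

The textual lines stand in for the drawn diagram; a real tool would hand the same boxes and lines to a drawing library. Keeping constraints keyed by relationship label relies on the requirement that each label is unique within one information model.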
5.11
Conclusion
Although the constructs of an information model are relatively standard, there are many different ways of visualising an information model. Within this chapter, we have considered one of the many ways of visualising an information model through a diagram. Visualisations are important for a number of reasons, such as encouraging group working through being highly visible and compact. We have begun to suggest in this chapter that a way of building an information model diagram is from an established set of physical and institutional facts appropriate to the domain in question. This we refer to as the process of composing an information model. To compose something, we make or form some representation from its perceived constituent elements. In the next chapter, we consider this approach of composition in much more detail.
5.12
Summary
• Three basic constructs are used in information modelling as a business analysis technique: classes, relationships and attributes. Relationships can be relationships of association, generalisation or aggregation.
• An information class is typically represented upon a diagram as a labelled box.
• A relationship of association is typically indicated upon a diagram as a line drawn between related classes. Cardinality constraints are typically indicated by notating the many end of the relationship with some graphic such as a crow’s foot. The optional end of some relationship is indicated by some other notation upon the relationship line such as a circle.
• Attributes can be represented as labels nested within the class box upon the diagram. For large information models, the attributes are typically left off the diagram.
• A generalisation relationship is indicated upon an information model diagram by a line drawn between sub-class and super-class with a triangle placed at the head of the line next to the super-class.
• We may depict aggregation as a series of lines or a forked line between the whole and its parts. A diamond is also placed next to the aggregate class.
• From a set of institutional facts established for some domain, it is relatively straightforward to compose an information model diagram.
6
Composing an Information Model from Institutional Facts
6.1
Introduction
The key question faced by any information modeller is, where do I start? As we have indicated in previous chapters, this is a question not addressed adequately by conventional literature. The main reason for this is that such literature does not work with any established theoretical understanding of the ‘material’ that information modelling deals with (Chap. 2), as well as what information modelling is attempting to do with this ‘material’ (Chap. 3). Both issues arise from the fact that conventional approaches to information modelling work with a narrow and unproductive conception of the institutional domains with which information models engage. As we have seen, within the current book, we have promoted the idea of considering an information model as a model of important aspects of institutional ontology. Institutional ontology is what actors within some domain deem to exist, how they communicate about such things and how they use such communication to coordinate joint activity. Institutional ontology provides to such actors a way of making sense of both physical and institutional facts about reality and through this to construct and reconstruct this reality. This is why we have proposed that an information model must necessarily be focused as a model upon the patterns of instrumental communication relevant to some domain of institutional action. This way of thinking about both the content and the purpose of information modelling allows us to develop a clear way of composing an information model which does justice to some institutional ontology under investigation. Within the current chapter, we shall demonstrate how to build information models from an analysis of the instrumental communicative practices within some domain. The approach can also be readily adapted to designing an information model for some new domain of action. The steps of this approach, which are described in more detail within this chapter, are as follows:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 P. Beynon-Davies, Information Modelling, https://doi.org/10.1007/978-3-030-98805-0_6
• Develop a model of the pattern of information situations under consideration. This may be a model of some existing or as-is pattern or a model of some possible or as-if pattern.
• Unpack the content of the messages used within this communicative pattern into the constructs of an information model: classes, attributes and relationships.
• Generate a set of binary relations which adequately reflects the content of communicative acts as a series of abstractions of relevant physical and institutional facts.
• Form the set of physical and institutional facts into a complete information model and produce a visualisation of this model using one of the many possible notations.
• Check the validity and consistency of this model with actors or stakeholders within the domain in question.
• Revise the information model if necessary.
6.2
A Pattern of Information Situations
Within Chap. 2, we described the proper context for an information model as being some domain of institutional action. More precisely, an information model cannot be built without some detailed understanding of the set of communicative practices undertaken by actors within the domain under consideration currently or likely to be undertaken in some new domain of institutional action. One way of building a communicative pattern for an existing situation is to observe and collect together a series of actual communicative practices within the domain under consideration. Hence, in the case of medical emergency response, one might collect samples of emergency calls, dialogue within the control centre and communications between dispatchers and ambulance resources. Such communicative practice might then be analysed using one or more established approaches for dealing with extended stretches of conversation such as content analysis or discourse analysis. However, in line with the general way in which business analysts typically approach requirements elicitation, one would normally expect a communicative pattern to be built from a series of extended and unstructured interviews or workshops with domain stakeholders. We discussed such investigation techniques in Chap. 3. Elicitation of whatever form is a way-station to making sense of the communicative situation under investigation. To do this, we need some understanding of not only what is communicated and by whom but the sequencing of communication and how this relates to coordination of activity within the domain. The representation of communicative acts within some communicative pattern, as well as the chronology of such acts, is necessarily a narrative abstraction of the communicative situation—a story of who communicates about what with whom and in what sequence. 
Each speech bubble placed upon the visualisation of a communication pattern is an abstraction of a range of actual communications we might find while investigating some domain, such as an ambulance control centre and its associated ambulance crews.

Example
Hence, the content [A medical emergency has taken place at location X on person Y] is an abstraction of the communicative practices or utterances between callers and call-takers within ambulance control. ◄

Consider the emergency ambulance case as a pattern of information situations first discussed in Chap. 2. This pattern can be written as a narrative in something like the following manner.

1. The lifecycle begins when telephone operators take an emergency call. The caller’s area code or closest mobile phone cell is identified from the call, which is then routed to the ambulance control centre.
2. At the control centre, a call-taker matches the call number with a physical address using a computerised map (or gazetteer) of the area covered by the service. The call-taker asks a pre-established series of questions of the caller(s), prompted by a set of rules embedded in the incident system.
3. Most ambulance services in the UK now institute a process of ‘triage’ to enable prioritisation of response to incidents. Calls are classified as category A (life-threatening), category B (serious but not life-threatening) or category C (does not require emergency response). On this basis, further decisions are made about the dispatch of resources to such incidents, taking account of two national targets set for response times to category A and B calls. Within the UK, ambulance services are required to reach 75% of category A calls within 8 min and 95% of category B calls within 19 min. For category C calls, patients are referred to other healthcare providers or transferred to a paramedic who will offer medical advice.
4. Assuming a call is categorised as either A or B, an emergency incident is declared and the location entered in an incident management system by the call-taker. A dispatcher will start to listen in to the call at this point.
5. The task of the dispatcher is to assess the most appropriate resource to send to the incident using a screen indicating a plan designed to maximise the efficient use of resources (known as the system status management or SSM plan), a screen listing the status of all resources and a screen which plots the current location of such resources against a computerised map. The SSM plan is an attempt to dynamically deploy resources around the area covered by an ambulance service according to demand patterns established for day and time, geographical area and clinical urgency.
6. Using this technology and her knowledge of the local area, the dispatcher selects and assigns a resource to the emergency incident. The dispatcher uses a radio message to inform the crew about the location of the incident (including a map grid reference) and reported details of the patient’s condition.
7. While the dispatcher is conducting this task, the call-taker will be giving pre-arrival advice to the caller. In certain extreme cases, the call-taker will remain in continuous communication with the caller until the ambulance arrives at incident.
8. Following receipt of an incident alert from the control room, and once mobile, a member of ambulance crew presses a button on their communication set to indicate departure. Crews are guided by satellite navigation to the incident location, supplemented with radio communication from the control room.
9. Upon arrival, a member of crew presses an arrive button on the communication set. A paramedic then administers any immediate treatment required at the scene and communicates the medical condition of the patient back to ambulance control.
10. The dispatcher will enter details of the patient condition and the treatment administered into the incident system. If the patient condition is sufficiently serious, the dispatcher will request of a general patient admissions system to suggest possible hospitals to admit the patient based upon the patient condition, the location of the incident and the location of hospitals. If the patient condition is deemed non-serious, then the ambulance resource makes itself available for further allocation.
11. In the case of further treatment being required, the dispatcher will select an admitting hospital and communicate the patient condition and likely time of arrival to the emergency department of this hospital. The admitting hospital is indicated to the ambulance crew by the dispatcher. When the patient is deemed ready, she is moved into the ambulance and prepared for departure. A crew member then presses a leave scene button.
12. Upon arrival at the general hospital, an at hospital button is pressed.
13. As soon as a cubicle is available in the emergency department, the patient is admitted.
14. Finally, the crew presses a clear button which declares that they are available to be allocated as a resource again.
From a narrative such as the one provided here, it is possible to identify a range of communicative acts appropriate to this institutional domain. Such communicative acts are essential to the coordination of the activity of numerous actors such as callers, call-takers, dispatchers, ambulance drivers and paramedics. The investigator will need to abstract the detail of these communicative acts from a close understanding of the communicative practice taking place in the control room, the ambulance, the incident site and the accident and emergency department of the general hospital. This will probably not only involve interviews with key stakeholders but close observation and possible participation in such settings. As introduced in Chap. 3, we have found it useful in other work to visualise this pattern (Beynon-Davies 2021a) as illustrated in Fig. 6.1. Each element of the narrative is numbered to correspond to an appropriate element of the visualisation in this figure.
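One way to keep such a narrative and its visualisation aligned is to record each communicative act as a small structured record of who communicates what to whom, with what intent and at which step. The following Python sketch is only an illustration under stated assumptions: the `CommunicativeAct` and `Intent` names and the `narrate` helper are hypothetical, the intent labels are those that appear on the messages in Fig. 6.1, and the sender and receiver of each act are one reading of the narrative.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    """Intent labels as they appear on the messages in Fig. 6.1."""
    ASSERT = "ASSERT"
    DIRECT = "DIRECT"
    COMMIT = "COMMIT"
    DECLARE = "DECLARE"


@dataclass(frozen=True)
class CommunicativeAct:
    step: int        # number of the narrative element
    name: str        # label of the act on the pattern
    sender: str      # actor producing the message (a reading of the narrative)
    receiver: str    # actor receiving the message (a reading of the narrative)
    intent: Intent
    content: str

    def message(self) -> str:
        """Render the act in the INTENT[content] style used on the pattern."""
        return f"{self.intent.value}[{self.content}]"


# Two acts transcribed from the narrative; deliberately listed out of order.
PATTERN = [
    CommunicativeAct(6, "Instruct resource", "Dispatcher", "Ambulance driver",
                     Intent.DIRECT, "Go to location X and attend incident Y"),
    CommunicativeAct(1, "Notification of emergency", "Caller", "Call taker",
                     Intent.ASSERT,
                     "A medical emergency has taken place at location X on person Y"),
]


def narrate(pattern):
    """List the acts in sequence: who communicates what to whom."""
    ordered = sorted(pattern, key=lambda act: act.step)
    return [f"{a.step}. {a.sender} -> {a.receiver}: {a.message()}" for a in ordered]


for line in narrate(PATTERN):
    print(line)
```

Sorting on the step number preserves the chronology of the pattern regardless of the order in which acts were elicited.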
6.3
Unpacking the Content of Messages
Assuming that a communicative pattern can be built from our analysis of some existing situation or from a design of some new situation, then the next step is to take each act of communication in turn and unpack the content of messages. This cumulative content can then be converted into the constructs of an information model, namely, classes, attributes and relationships. Example
For example, take the first communicative act from the pattern illustrated in Fig. 6.1 and presented in greater focus within Fig. 6.2.
Fig. 6.1 Emergency response as a system of communication (numbered communicative acts: 1 Notification of emergency; 2 Identifying locations; 3 Deciding upon response; 4 Declare incident; 5 Decide upon resource; 6 Instruct resource; 7 Communicate category A/B actions; 8 Assert incident departure; 9 Assert incident arrival; 10 Declare treatment administered; 11 Assert departure to hospital; 12 Assert arrival at hospital; 13 Assert patient admitted; 14 Declare incident closed)
Fig. 6.2 Unpacking a communicative act (Communicative act: Notification of emergency; Intent and content: ASSERT[A medical emergency has taken place at location X on person Y]; Actors: Caller, Call taker)
The content of this communicative act is:
[A medical emergency has taken place at location X on person Y]
This content is an abstraction of the key elements taken from an analysis of the range of emergency calls taken by ambulance control. As such, it identifies the key things or objects of interest that serve to trigger further actions by actors such as call-takers, dispatchers and ambulance crew. This content can be unpacked as a series of institutional facts using the constructs of classes, attributes and relationships as described in Chap. 4. Hence, such facts which are immediately apparent from this content include:
[Medical emergency OCCURS-AT Location]
[Medical emergency INVOLVES Person]
[Location SITE-OF Medical emergency]
[Person INVOLVED-IN Medical emergency] ◄

Within this specification of the communicative act, Medical emergency, Location and Person are likely information classes. OCCURS-AT and INVOLVES are two relationships of association with their reverse names or roles being SITE-OF and INVOLVED-IN. The relationship of association [Medical emergency INVOLVES Person] establishes the context of the relationship between the named person and the specific medical emergency. The association relationship [Medical emergency OCCURS-AT Location] specifies the relationship between the particular medical emergency and an established map location.

But the communicative act within this pattern forms only the starting point for an analysis of the wider information situation within which it occurs. Probably in structured conversation with key stakeholders, a number of other institutional facts will become apparent which serve to provide greater depth to the model of information. For a start, it is likely that we need to add to our information model some reference to both actors involved in the communicative act: Caller and Call-taker. We will probably also wish to record data relating to the medium by which the communication occurred. In other words, in this case, we need to include a class Emergency call. In total, this adds the following institutional facts to our specification:
[Caller MAKES Emergency call]
[Emergency call MADE BY Caller]
[Call-taker HANDLES Emergency call]
[Emergency call HANDLED BY Call-taker]

This adds three further classes (Caller, Call-taker and Emergency call) and two further relationships, each named in both directions, to our information model. But, knowing that we have a set of information classes means that we also know that there will be a range of identifiers needed as attributes of these classes. Hence:
[Medical emergency REFERENCE incidentNo]
[Person REFERENCE personNo]
[Location REFERENCE locationRef]
[Emergency call REFERENCE callNo]
[Caller REFERENCE callerID]
[Call-taker REFERENCE calltakerID]

As we have seen in Chap. 4, the relation REFERENCE is a special form of relation which relates a class to an identifier. From further investigation, we might further infer that we would wish to record other attributes of certain classes, such as that an emergency call has a start time and end time and that a person of concern has a name, sex and age.
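The observation that every information class calls for an identifier lends itself to a simple mechanical check. The sketch below is a minimal illustration, assuming facts are held as (subject, predicate, object) tuples; the function name `classes_without_identifier` is hypothetical, and the identifier for Call-taker is deliberately left out of the sample data so that the check has something to report.

```python
# Facts transcribed from this section as (subject, predicate, object) tuples.
# The REFERENCE fact for Call-taker is omitted on purpose for the demonstration.
FACTS = [
    ("Medical emergency", "OCCURS-AT", "Location"),
    ("Medical emergency", "INVOLVES", "Person"),
    ("Caller", "MAKES", "Emergency call"),
    ("Call-taker", "HANDLES", "Emergency call"),
    ("Medical emergency", "REFERENCE", "incidentNo"),
    ("Person", "REFERENCE", "personNo"),
    ("Location", "REFERENCE", "locationRef"),
    ("Emergency call", "REFERENCE", "callNo"),
    ("Caller", "REFERENCE", "callerID"),
]


def classes_without_identifier(facts):
    """Return the classes that take part in the facts but have no identifier."""
    identified = {subj for subj, pred, _ in facts if pred == "REFERENCE"}
    classes = set()
    for subj, pred, obj in facts:
        if pred in ("REFERENCE", "HASA"):
            # The object is an identifier or attribute, not a class.
            classes.add(subj)
        else:
            # An association relates two classes.
            classes.update((subj, obj))
    return sorted(classes - identified)


print(classes_without_identifier(FACTS))
```

A check of this kind does not replace conversation with stakeholders; it only flags where the set of facts gathered so far is incomplete.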
This makes the total set of institutional facts analysed so far:
[Medical emergency OCCURS AT Location]
[Medical emergency INVOLVES Person of concern]
[Location SITE OF Medical emergency]
[Person of concern INVOLVED IN Medical emergency]
[Caller MAKES Emergency call]
[Emergency call MADE BY Caller]
[Call-taker HANDLES Emergency call]
[Emergency call HANDLED BY Call-taker]
[Emergency call DESCRIBES Medical emergency]
[Medical emergency DESCRIBED BY Emergency call]
[Emergency call CALLED FROM Location]
[Location BASE OF Emergency call]
[Medical emergency REFERENCE incidentNo]
[Person of concern REFERENCE personNo]
[Location REFERENCE locationRef]
[Emergency call REFERENCE callNo]
[Caller REFERENCE callerID]
[Call-taker REFERENCE calltakerID]
[Person of concern HASA name]
[Person of concern HASA sex]
[Person of concern HASA age]
[Emergency call HASA start time]
[Emergency call HASA end time]

Using the procedure for translating a series of institutional facts into a visualisation discussed in Chap. 5, we can produce a first-pass information model as illustrated in Fig. 6.3. Note that in making this visualisation, we have assumed an appropriate cardinality and optionality for each of the relationships of association on our model. These assumptions will need to be confirmed in structured conversation with key stakeholders and if necessary revised.

Fig. 6.3 A first-pass information model of emergency response as a visualisation
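Because each association is named in both directions, a first-pass set of facts can be reviewed mechanically for repeated facts and for associations that lack their reverse name before it is taken back to stakeholders. The following is a rough sketch under stated assumptions: the `INVERSES` table and the `review` function are hypothetical, and the sample data deliberately contains one duplicate and one association without its inverse.

```python
from collections import Counter

# Forward name -> reverse name, as paired in the facts of this section.
INVERSES = {
    "OCCURS AT": "SITE OF",
    "INVOLVES": "INVOLVED IN",
    "MAKES": "MADE BY",
    "HANDLES": "HANDLED BY",
    "DESCRIBES": "DESCRIBED BY",
    "CALLED FROM": "BASE OF",
}

ASSOCIATIONS = [
    ("Medical emergency", "OCCURS AT", "Location"),
    ("Location", "SITE OF", "Medical emergency"),
    ("Caller", "MAKES", "Emergency call"),
    ("Emergency call", "MADE BY", "Caller"),
    ("Call-taker", "HANDLES", "Emergency call"),
    ("Emergency call", "HANDLED BY", "Call-taker"),
    ("Call-taker", "HANDLES", "Emergency call"),    # duplicate, on purpose
    ("Emergency call", "CALLED FROM", "Location"),  # inverse omitted, on purpose
]


def review(associations):
    """Return (duplicates, missing_inverses) for a list of association facts."""
    counts = Counter(associations)
    duplicates = sorted(fact for fact, n in counts.items() if n > 1)
    present = set(associations)
    missing = sorted(
        (obj, INVERSES[pred], subj)
        for subj, pred, obj in present
        if pred in INVERSES and (obj, INVERSES[pred], subj) not in present
    )
    return duplicates, missing


duplicates, missing = review(ASSOCIATIONS)
print("Duplicates:", duplicates)
print("Missing inverses:", missing)
```

The findings are prompts for revision rather than errors in themselves: a missing inverse may simply mean that the reverse role has not yet been named with stakeholders.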
Table 6.1 Actors, acts and terms within the communicative pattern of emergency response (columns: Actors, Communicative acts, Terms; rows numbered 1 to 14)
Actors: Caller, Call taker, Gazetteer, Paramedic dispatcher, Incident system, Resource system, SSM plan, Dispatcher, Ambulance driver, Paramedic, Admissions system, Admitting nurse
Communicative acts:
ASSERT[A medical emergency has taken place at location X on person Y]
DIRECT[Find caller and emergency location]
ASSERT[Caller X and emergency location Y]
ASSERT[Emergency X is of this form]
DIRECT[Emergency is category A/B/C]
?[Call category]
DIRECT[These actions need to be taken]
DECLARE[Incident X has occurred at location Y on patient Z]
ASSERT[Available resources at locations]
DIRECT[Resource type X needs to be sent to incident Y]
DIRECT[Go to location X and attend incident Y]
COMMIT[We are leaving to respond to incident X]
COMMIT[An ambulance will respond within X minutes]
DIRECT[Medical actions X need to be taken before ambulance arrives]
ASSERT[Departing to incident X]
ASSERT[Arrived at incident X]
ASSERT[Patient X has condition Y and received treatment Z]
DECLARE[Patient has condition X and received treatment Y]
ASSERT[Possible hospitals]
?[Hospital admission]
DIRECT[Take patient X to admitting hospital Y]
ASSERT[Leaving for hospital X]
ASSERT[Arrived at hospital X]
ASSERT[This is patient X with condition Y]
DECLARE[Patient X is admitted to hospital Y]
DECLARE[Incident X is now closed and resource Y is available for dispatch]
Terms: Emergency call, Classify emergency, Declare incident, Direct type of resource, Direct incident, Leaving for incident, Commit to respond, Take medical actions, Departing to incident, Arrived at incident, Assert condition, Declare condition, Leaving for hospital, Arrived at hospital, Handover patient, Incident closed, Medical emergency, Emergency incident, Emergency category, Category A, Category B, Category C, Caller, Call-taker, Person, Patient, Patient condition, Treatment administered, Patient admission, Location, Gazetteer, Incident system, Resource system, SSM plan, Ambulance resource, Resource type, Resource available, Response time, Dispatcher, Paramedic dispatcher, Paramedic, Ambulance driver, Hospital, Admitting hospital, Admitting nurse, Admissions system
Exercise Generate the institutional facts that define the cardinality and optionality of information classes in the various relationships of association detailed in Fig. 6.3.
6.4
Generating Institutional Facts
The process described in the previous section needs to be undertaken for each of the information situations represented in the communicative pattern under consideration. In this manner, the modeller will generate a complete set of terms used within and about the communicative acts represented upon a communicative pattern. Table 6.1 extracts such terms and presents them alongside the actors and acts in which such terms appear in Fig. 6.1. As you can see, most of the terms consist of things identified and described, such as emergency incident, patient, ambulance resource and location. But the table also includes appropriate terms for communicative actors that participate in communicative acts: dispatcher, caller, paramedic and incident system. Finally, the modeller needs to include terms for certain critical communicative acts themselves that are likely to be referred to and described: emergency call, arrived at incident or admit patient.

As we have seen, to form binary relations as representations of the facts appropriate to this institutional domain, we need to make decisions as to whether the terms identified in Table 6.1 are used to identify or describe things of interest in the communicative context under consideration. Most of the terms present within Table 6.1 would serve to classify and thus to identify things of interest. Thus, we can infer the presence of institutional identifiers for classes such as:
[Patient REFERENCE nhsNo]
[Medical emergency REFERENCE emergencyNo]

Classifying terms also relate to terms which serve primarily to designate or describe. Hence, patient condition serves to describe the diagnosed medical condition of an identified patient, while treatment administered designates a course of medical intervention undertaken upon a patient. These are represented as relations of attribution, such as:
[Patient HASA Medical condition]
[Patient HASA Medical treatment]

Finally, the modeller needs to decide whether things referred to by classifying terms co-occur within the communicative pattern under consideration. When there is evidence of terms relating to other classifying terms, then we have instances of association. These are represented by free-ranging predicates, such as:
[Call-taker HANDLES Emergency call]
[Emergency call CALLED FROM Location]
[Emergency incident INVOLVES Patient]

This means that we make sense of the communicative pattern of medical emergency response in terms of the list of binary relations represented in Table 6.2. Note that the terms forming classes, attributes and associations are defined here on first occurrence in the pattern and not repeated in the table. Of course, we need to perform this translation for each communicative act visualised on a communicative pattern and collate the various classes, attributes and relationships together to form a complete information model which adequately describes the communicative practice in this domain. Such a model is illustrated in Fig. 6.4. Note that we have left off the labels for relationships in this diagram, merely because of issues of space.
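The three decisions described in this section (identifier, attribution or association) map directly onto the predicate of each binary relation, so the sorting of facts into the Class, Attribute and Association columns of Table 6.2 can be sketched mechanically. The following Python sketch is only illustrative: `parse_fact`, `classify` and `tabulate` are hypothetical names, and the parser assumes that the predicate is the first run of upper-case words following at least one subject word, which holds for the facts in this chapter but is not a general rule.

```python
def parse_fact(text):
    """Split '[Subject PREDICATE Object]' into (subject, predicate, object).

    Assumption: the predicate is the first run of upper-case words that
    follows at least one subject word.
    """
    words = text.strip().strip("[]").split()
    start = next(i for i in range(1, len(words)) if words[i].isupper())
    end = start
    while end < len(words) and words[end].isupper():
        end += 1
    return (" ".join(words[:start]),
            " ".join(words[start:end]),
            " ".join(words[end:]))


def classify(fact):
    """REFERENCE identifies a class, HASA describes it, anything else associates."""
    _, predicate, _ = fact
    if predicate == "REFERENCE":
        return "Class"
    if predicate == "HASA":
        return "Attribute"
    return "Association"


def tabulate(texts):
    """Group bracketed facts under the three column headings."""
    columns = {"Class": [], "Attribute": [], "Association": []}
    for text in texts:
        fact = parse_fact(text)
        columns[classify(fact)].append(fact)
    return columns


FACTS = [
    "[Patient REFERENCE nhsNo]",
    "[Patient HASA Medical condition]",
    "[Call-taker HANDLES Emergency call]",
    "[Emergency call CALLED FROM Location]",
    "[Emergency incident INVOLVES Patient]",
]

columns = tabulate(FACTS)
for heading, facts in columns.items():
    print(heading, facts)
```

The deciding of whether a term identifies, describes or associates remains a judgement made with domain actors; the sketch only organises the facts once that judgement has been recorded in the predicate.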
6.5
Validating an Information Model
Clearly, one information model may be better than another in modelling the institutional reality under consideration. Traditionally, the quality of an information model would be judged in terms of features or facets such as accuracy, completeness, simplicity and elegance. Accuracy is considered in terms of how closely the model represents the reality. Completeness is considered in terms of whether or not the model completely covers the reality being considered. Simplicity refers to the use of the minimum number of constructs required to model the domain. Finally, elegance refers to the degree to which an information model is easily understood both by business and technical actors. We would argue that the notion of information model quality is very much bound up with the issue of validity. It is important to re-confirm the validity of an information model with domain actors, because the ‘quality’ of any information model can only be established through acts of sense-making. In other words, rather than thinking of an information model in terms of ‘accuracy’ or ‘completeness’, the modeller needs to ask—does my representation do justice to the patterns of communicative action in the domain? In this sense, an information model is necessarily a pragmatic construct—it focuses upon and is always oriented towards action. At first glance, the information model illustrated in Fig. 6.4 may appear overcomplex, because it attempts to encapsulate all, not part of, the communicative context illustrated in Fig. 6.1. The classes and associations on such an information model might appear to represent undisputed things of interest for actors within this domain. In practice, classes such as patient, medical emergency and emergency incident act as signs which help ‘scaffold’ this institutional order. What constitutes or should constitute a patient and what constitutes a true emergency and thus a valid
emergency incident is a continuous source of sense-making for participating actors within the domain of medical emergency response. As we have already mentioned, for instance, an emergency call only becomes a medical emergency and consequently an emergency incident through the ways in which actors such as paramedic dispatchers triage events. An emergency call only becomes the institutional fact of an emergency incident if it is deemed sufficiently ‘serious’ to warrant dispatch of an ambulance.
One of the main practical advantages of this approach to composing an information model is that such a model displays greater flexibility to accommodate change to institutional action. For example, in 2005, the UK government recommended that targets set for responding to emergency calls should be measured consistently across the UK. It suggested that the clock should start ticking when an emergency call is connected to the control centre and not when the call-taker declared an emergency incident, which is what most UK ambulance services had been measuring. Following adoption of this subtle recommendation, UK ambulance services spent years re-configuring their IT systems, because on average the difference between connecting a call and identifying an incident is as much as 1 min. The information model in Fig. 6.4 distinguishes between an emergency call and an emergency incident and thus easily accommodates this change to the measurement of performance. Indeed, as suggested in Table 6.2, the information model has the potential to log every critical state communicated about within the lifecycle of an incident. More interestingly, statistics collected on the practice of emergency response reveal that while 30% of calls are categorised as life-threatening by call-takers and ambulance dispatchers, only 10% of such incidents turn out to be life-threatening in nature.
Table 6.2 Binary relations pertinent to the terms in Table 6.1 (rows 1 to 14 combined)
Class
[Emergency call REFERENCE callNo]
[Person REFERENCE name]
[Medical emergency REFERENCE emergencyNo]
[Caller REFERENCE callerNo]
[Call-taker REFERENCE handlerID]
[Location REFERENCE locationID]
[Gazetteer REFERENCE versionID]
[Paramedic dispatcher REFERENCE practitionerID]
[Category C response REFERENCE responseID]
[Classify emergency REFERENCE eventID]
[Take medical action REFERENCE eventID]
[Emergency incident REFERENCE incidentID]
[Incident system REFERENCE versionID]
[Patient REFERENCE nhsNo]
[Declare incident REFERENCE eventID]
[Ambulance resource REFERENCE resourceID]
[Resource system REFERENCE versionID]
[SSM plan REFERENCE versionID]
[Direct resource REFERENCE eventID]
[Availability REFERENCE eventID]
[Dispatcher REFERENCE dispatcherID]
[Ambulance driver REFERENCE driverID]
[Leaving for incident REFERENCE eventID]
[Direct incident REFERENCE eventID]
[Commit to respond REFERENCE eventID]
[Departing to incident REFERENCE eventID]
[Arrived at incident REFERENCE eventID]
[Assert condition and treatment REFERENCE eventID]
[Hospital REFERENCE hospitalID]
[Declare condition and treatment REFERENCE eventID]
[Leaving for hospital REFERENCE eventID]
[Direct hospital REFERENCE eventID]
[Arrived at hospital REFERENCE eventID]
[Admitting nurse REFERENCE nurseID]
[Patient handover REFERENCE eventID]
[Patient admission REFERENCE eventID]
[Incident closed REFERENCE eventID]
[Resource available REFERENCE eventID]
Attribute
[Medical emergency HASA emergencyDescription]
[Person HASA personDescription]
[Caller HASA callerDescription]
[Call-taker HASA calltakerDescription]
[Location HASA locationDescription]
[Location HASA coordinate X]
[Location HASA coordinate Y]
[Medical emergency HASA emergencyCategory]
[Classify emergency HASA eventDateTime]
[Take medical action HASA eventDateTime]
[Patient HASA DateOfBirth]
[Patient HASA Sex]
[Patient HASA Name]
[Declare incident HASA eventDateTime]
[Emergency incident HASA startTime]
[Ambulance resource HASA resourceType]
[Direct resource HASA eventDateTime]
[Availability HASA eventDateTime]
[Leaving for incident HASA eventDateTime]
[Direct incident HASA eventDateTime]
[Commit to respond HASA responseTime]
[Commit to respond HASA eventDateTime]
[Departing to incident HASA eventDateTime]
[Arrived at incident HASA eventDateTime]
[Assert condition and treatment HASA eventDateTime]
[Patient HASA patientCondition]
[Patient HASA treatmentAdministered]
[Declare condition and treatment HASA eventDateTime]
[Leaving for hospital HASA eventDateTime]
[Direct hospital HASA eventDateTime]
[Arrived at hospital HASA eventDateTime]
[Patient handover HASA eventDateTime]
[Patient admission HASA eventDateTime]
[Incident closed HASA eventDateTime]
[Resource available HASA eventDateTime]
Association
[Caller MAKES Emergency call]
[Person INVOLVED IN Medical emergency]
[Emergency call CALLED FROM Location]
[Medical emergency OCCURS AT Location]
[Location IDENTIFIED IN Gazetteer]
[Paramedic dispatcher CLASSIFY EMERGENCY Medical emergency]
[Medical emergency BECOMES Category C response]
[Medical emergency BECOMES Emergency incident]
[Call-taker ISSUES Category C response]
[Call-taker TAKE MEDICAL ACTION Caller]
[Emergency incident OCCURS AT Location]
[Call-taker HANDLES Emergency call]
[Emergency call ABOUT Medical emergency]
[Call-taker DECLARE INCIDENT Emergency incident]
[Emergency incident INVOLVES Patient]
[Person BECOMES Patient]
[Emergency incident RECORDED IN Incident system]
[Ambulance resource CURRENTLY AT Location]
[Paramedic dispatcher DIRECT RESOURCE Ambulance resource]
[Resource system AVAILABILITY Ambulance resource]
[Ambulance resource SSM PLAN Location]
[Ambulance driver LEAVING FOR INCIDENT Dispatcher]
[Dispatcher DIRECT INCIDENT Ambulance driver]
[Call-taker COMMIT TO RESPOND Caller]
[Ambulance driver DEPARTING TO INCIDENT Dispatcher]
[Ambulance driver ARRIVED AT INCIDENT Dispatcher]
[Paramedic dispatcher ASSERT CONDITION AND TREATMENT Dispatcher]
[Dispatcher DECLARE CONDITION AND TREATMENT Incident system]
[Ambulance driver LEAVING FOR HOSPITAL Dispatcher]
[Dispatcher DIRECT HOSPITAL Ambulance driver]
[Ambulance driver ARRIVED AT HOSPITAL Dispatcher]
[Paramedic PATIENT HANDOVER Admitting nurse]
[Admitting nurse WORKS AT Hospital]
[Admitting nurse PATIENT ADMISSION Patient]
[Hospital PATIENT ADMISSION Patient]
[Patient admission RECORDED IN Admissions system]
[Ambulance driver INCIDENT CLOSED Dispatcher]
[Dispatcher RESOURCE AVAILABLE Availability]
Also, 77% of all emergency calls result in a journey to a local hospital, but only 40% of these patients are eventually admitted for treatment. There are complex reasons for this situation. Nevertheless, various ambulance units have attempted to make changes to such breakdowns in practice, to meet the implicit intentions expressed in such measurement. For instance, some have begun to re-configure their IT systems to collect a patient summary containing not only important medical data about the patient but also a history of interaction with the ambulance service. This inherently amounts to a re-configuration of the notion of what patient and incident mean to emergency response. It is hoped that records based upon such re-configuration will not only allow call-takers to refine the process of triaging patients and incidents but also better signal to an ambulance crew what to expect at incidents and consequently how better to perform. It should be evident on the information model in Fig. 6.4 that a clear distinction is drawn between a person described within the context of an emergency call and an eventual patient responded to. This might enable a more nuanced history of interaction with a service, which might better inform changes to practice.
Fig. 6.4 An information model for emergency response (diagram not reproduced; its key distinguishes classes and associations, cardinality of one or many, and optional or mandatory participation)
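The change in how response time is measured, discussed above, can be illustrated with a short sketch. It assumes that each critical state of an incident is logged with its own eventDateTime, as Table 6.2 suggests. The class `IncidentTimeline`, the function `response_time` and the timestamps are hypothetical and purely illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IncidentTimeline:
    """Hypothetical record of eventDateTime values logged for one incident."""
    call_connected: datetime       # emergency call reaches the control centre
    incident_declared: datetime    # call-taker declares an emergency incident
    arrived_at_incident: datetime  # ambulance arrives at the incident


def response_time(timeline: IncidentTimeline, from_call: bool) -> timedelta:
    """Measure response time from call connection or from incident declaration."""
    start = timeline.call_connected if from_call else timeline.incident_declared
    return timeline.arrived_at_incident - start


# Illustrative timestamps only, not real performance data.
timeline = IncidentTimeline(
    call_connected=datetime(2005, 6, 1, 10, 0),
    incident_declared=datetime(2005, 6, 1, 10, 1),
    arrived_at_incident=datetime(2005, 6, 1, 10, 9),
)

old_measure = response_time(timeline, from_call=False)
new_measure = response_time(timeline, from_call=True)
print(old_measure, new_measure, new_measure - old_measure)
```

Because the call and the incident are recorded as distinct things, moving the start of the clock is a change to one expression rather than to the structure of the data.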
The example of emergency response used here is based on analysing an existing situation and producing an information model of this domain. But information modelling is equally relevant to the design of new domains of institutional activity. The process of composition then becomes one of envisaging some possible or as-if pattern of information situations in sufficient detail that it becomes possible to unpack the content of the messages used within the new communicative pattern and translate these into the constructs of an information model.
6.6 Revising Information Models
There is an important consequence of the view of information models promoted in this book. If information models are attempts to model institutional ontology and institutional ontology relies upon patterns of information situations, then information models never stand still. Any domain of institutional action will continuously experience changes to the way in which data structures are articulated, messages are communicated and activity is coordinated. Because an information model must continually reflect changes to institutional action, it is therefore essential that an information model is continuously revisited and revised where necessary. Example
Consider the issue of marriage as an institutional fact. Recently, in the UK and in a number of other countries, the definition of what constitutes marriage has changed. Prior to this change, marriage could only occur between two persons of different sex, one person being male and the other female. Hence, we might have modelled marriage as an association between a male class and a female class. Now, same-sex marriage is acknowledged in law. This requires us to change our model to perhaps something like a unary relationship on a person class, where a person is a generalisation (a super-class) of a male or a female. ◄
Exercise Draw an information model for the original definition of marriage and the new institutional definition of marriage.
Another way of putting this is that information models are important to data administration within organisations of all forms. Data administration is generally seen to be that function concerned with the management, planning and documentation of the data resource of some organisation and as such is seen to be important to the effective control, security, integrity and sharing of data both within some organisation and between organisations. The key driver towards data administration has been that data, like capital, personnel, etc., should be treated as a manageable resource. In other words, data are seen as a critical commodity in an organisation’s attempt not only to operate effectively but also to adapt to its changing environment. However, there are key problems with managing data as ‘commodity’. For instance, a number of units, departments or services within the organisation might collect data on similar things of interest but in radically different ways. Some units may collect data but have no clear idea why they collect this data. Certain other organisational actors may believe that there are notable gaps in the data collected by the organisation. Where data are collected, they may be inconsistent, untimely or irrelevant. This means that users of such data may feel it is too unreliable to be useful or may receive data too late for it to be useful. More worryingly, decision-makers within the organisation may receive conflicting data from different sources within the organisation. Example
Consider why data is such a critical resource for a university in terms of its operations such as teaching. Without data, such as what students it has, which students are taking which modules and what grades have been achieved by students, a university is unable to operate effectively in teaching, grading and awarding students. But what if different university departments or schools maintain their own distinctive collections of data about students with their own distinctive definitions for such data structures? What if data is frequently missing, incomplete or out of date? Also, what if there is incomplete knowledge amongst both administrative and academic staff as to what data is collected and where it is kept? In such situations, various staff within the university may spend a substantial amount of their time resolving problems with such data. Such situations demonstrate the key need for some systematic way of managing the data resource on an organisation-wide rather than a unit-wide basis. ◄
Data administration is hence an attempt to develop some order from the potential chaos in which data structures are articulated across an institution. Data administration also involves planning the data required for future action. Hence, data administration concerns itself with a number of themes associated with data definition and use. In terms of data definition, administrators implement standards for the definition of data and attempt to control the media for the recording and communication of such definitions. Administrators also implement data control practices that define and police access to data resources. They also attempt to ensure the integrity of data and that it is secured from threats. This also means implementing procedures to ensure that the organisation complies with any legislation concerning data privacy.
Finally, data administrators encourage sharing of data across applications and promote the idea that data as a resource is independent of IT applications and their users. To achieve such goals, traditionally, data administration is conducted through the development of data dictionaries, which attempt to encapsulate the metadata of the organisation, that is, data about data (more about metadata in Chap. 9). Alternatively, data administrators may seek to develop corporate or enterprise information models.
Prefixing the term information modelling with the word corporate would tend to suggest an elevation of this practice to the level of the whole organisation. A corporate information model should form a map of institutional ontology of the whole or a substantial part of an organisation. This differs from an application information model, which is produced to support a specific organisational function or IT development project. Data administrators then attempt to use these ‘maps’ to control the design and use of the data resource of the organisation, typically by enforcing levels of standardisation of data structures across organisational units in order to ensure better data integration and consequent data sharing both within and between organisational units.
Information models are an important tool in the armoury of the data administrator. An information model, as we have seen in previous chapters, provides understanding of the things of interest to organisational actors. Such things or objects have to be identified in a consistent manner. A data administrator should ensure that identifiers for objects like products, people, invoices and orders are designed to have three key features. First, an identifier, as a matter of definition, should be uniquely associated with one and only one object, and every object, such as an instance of a product, should have one and only one identifier. Second, an identifier should be assigned immediately on creation of an object. Third, an identifier should not contain any details about the object it identifies; it should serve the sole function of identifying an object. The reason for this is that so-called non-mnemonic identifiers maintain stability over time. Example
Consider the case where a company uses a three-digit code to identify its products. The first digit is used to indicate the warehouse where the product is stored. Now suppose the company decides to change its warehousing practice and moves all products of a particular type from one warehouse to another. This will necessitate changing all the product codes for the products moved. ◄ A natural consequence of this discussion is that it is important to administer and control the assignment and use of identifiers in organisations. It is also usually not good practice to rely on identifiers supplied by external agencies. Example
Suppose an organisation uses the delivery advice number from its supplier to identify different deliveries. If the supplier inadvertently sends two separate deliveries with the same advice number, then the organisation’s internal information systems are likely to suffer. ◄
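The guidance on identifiers above can be sketched as follows. This is a minimal illustration under stated assumptions: the `ProductRegistry` class, its methods and the warehouse codes are hypothetical, and a simple counter stands in for whatever identifier scheme an organisation actually administers.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Product:
    product_id: int   # non-mnemonic: carries no details about the product
    description: str
    warehouse: str    # an ordinary attribute, free to change


class ProductRegistry:
    """Hypothetical registry that controls the assignment of identifiers."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._products = {}

    def create(self, description: str, warehouse: str) -> Product:
        # The identifier is assigned immediately on creation of the object.
        product = Product(next(self._ids), description, warehouse)
        self._products[product.product_id] = product
        return product

    def move(self, product_id: int, warehouse: str) -> None:
        # Changing the warehouse leaves the identifier untouched.
        self._products[product_id].warehouse = warehouse


registry = ProductRegistry()
lintel = registry.create("Steel lintel", warehouse="W1")
bracket = registry.create("Steel bracket", warehouse="W1")
registry.move(lintel.product_id, "W2")
print(lintel)
```

Holding the warehouse as an attribute rather than encoding it in the product code means that a change in warehousing practice requires no change to any identifier. In the same spirit, a supplier's advice number could be stored as an attribute of a delivery while the organisation assigns its own identifier.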
6.7 Conclusion
As we mentioned in Chap. 3, learning the principles of a visualisation technique is not the same as applying it. This is the reason that in this chapter we have covered in some detail the key things involved in the composition of an information model. We have particularly emphasised how a good understanding of the context or pattern of information situations relevant to some institutional domain is a necessary prerequisite for information modelling. We have also emphasised that an information model, by its very nature, is a continuously moving beast and as such is critically important to the administration of the data infrastructure within and between organisations. In the next chapter, we consider a number of issues associated with the practice of building information models, such as when to model something as a class, attribute or relationship.
6.8 Summary
• To compose an information model, we first need a model of the pattern of information situations under consideration. This may be a model of some existing or as-is pattern or a model of some possible or as-if pattern.
• We then need to unpack the content of the messages used within this communicative pattern into the constructs of an information model: classes, attributes and relationships.
• This enables us to generate a set of binary relations which adequately reflects the content as a series of institutional facts.
• The institutional facts are then used to produce a visualisation of this model using one of the many possible notations.
• The validity and consistency of the information model are checked with actors within the domain in question.
• The adequacy of the information model in reflecting communicative practice is continuously examined, and the information model is revised if necessary.
• Because information models must continually reflect changes to institutional action, they are continuously revisited and revised where necessary. This makes information models important to data administration within organisations of all forms.
7 Practical Issues in Information Modelling
7.1 Introduction
The coverage of information modelling in previous chapters has largely focused upon the theoretical aspects of applying this technique. In contrast, the current chapter examines a number of practical issues associated with the conduct of information modelling and how these may be resolved. We first consider the issue of interpretive flexibility—the fact that the modeller may choose to model the same thing as a class, attribute or relationship depending upon the institutional context under consideration. The same flexibility applies in the case of using generalisation and aggregation within information modelling. Then we consider the distinction between strong and weak classes and notions of ternary and recursive relationships of association. This leads to a discussion of how to include time within an information model and the important problem of connection traps and how to avoid them.
7.2 Class, Attribute or Relationship
In Chap. 3, we spent some time making the case for thinking about an information model in a specific way. Within this book, we have promoted the idea of a model as a way of negotiating collective belief as to either how things are in some domain or how we, as a collective or community of actors, might like things to be in this domain. A key question faced by any information modeller is how do you know that something should be modelled as a class, attribute or relationship? When trying to model some domain, it is difficult to know where to start and what constructs to use. In other words, what things are or should be of interest and what constructs should be used to model such things? The modeller hence usually has a degree of interpretive flexibility in deciding which construct is most appropriate.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 P. Beynon-Davies, Information Modelling, https://doi.org/10.1007/978-3-030-98805-0_7
Example
Suppose you are given the task of building an information model to represent the social convention of marriage. The common difficulty experienced here is whether to represent marriage as a class, a relationship between classes or an attribute of a class. Marriage might be represented merely as marital status attributed to a person class. Alternatively, marriage might be seen as an association between two persons of whatever gender. ◄ The answer to the first question of what things are or should be of interest relies upon a close understanding of the communicative practice appropriate to the domain in question. In other words, the modeller must develop a close appreciation of what actors either currently communicate about or wish to communicate about. The answer to the second question of what constructs should be used to model such things must depend upon an appreciation of how certain signs are used by certain actors within the domain in question and for what purpose. In other words, the answer to such a question depends upon the institutional context within which signs such as marriage are being used by actors. This means that what helps guide the construction of an information model must be a clear understanding of the communicative pattern relevant for the domain in question. Example
For the institutional reality which is the domain of emergency response, marriage is probably only important as a means of signalling to control, ambulance and hospital staff that there is an important next of kin to communicate with. Hence, modelling marriage as an attribute (marital status) of a patient might be sufficient for the patterns of action evident in this domain. For some tax authority, marriage is of interest because it may affect the tax position of certain citizens—hence, it is probably appropriate to model this situation as a relationship, perhaps a recursive relationship with a citizen class. Finally, for a marriage registry, the event of marriage itself is the significant thing of interest. Hence, it is appropriate to represent marriage within this institutional context as a class in itself, with its own attributes and relationships. ◄ What we are actually saying here is that for any sign to be meaningful, it must have practical consequences. This is actually a maxim of the philosophy of pragmatism first promulgated by the American polymath Charles Sanders Peirce (Bacon 2012). For Peirce, to understand the nature of signs, we must always apply the pragmatic test of whether the making of a set of differences to some substance makes a further difference in turn to some actor (Bateson 1972). In other words, we must judge any potential sign in terms of its consequences, whether it has any practical bearing on some situation, in terms of how actors act in such situations.
Example
For example, we can only judge how to model a sign such as marriage in terms of what it communicates to actors within some domain. We also need to judge the result of such ‘communication’, in turn, in terms of differences made to the consequent behaviour of actors, perhaps changes to their work activity. Within emergency response, the attribute marriage or marital status directs ambulance and hospital staff to communicate with a nominated next of kin. Within some tax authority, the relationship of marriage will direct staff and systems to modify the taxation of particular citizens. Within a marriage registry, the class of marriage needs to be recorded in great detail to coordinate the work of staff before, during and after this life event. ◄
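The three ways of modelling marriage discussed in this section can be placed side by side in a short sketch. The class and attribute names below (`Patient`, `Citizen`, `Marriage`, `marital_status`, `spouse` and so on) are hypothetical choices for illustration, as are the sample values; the point is only that the same sign takes a different construct in each institutional context.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


# Emergency response: marriage as an attribute (marital status) of a patient.
@dataclass
class Patient:
    name: str
    marital_status: str


# Tax authority: marriage as a recursive relationship on a citizen class.
@dataclass(eq=False)
class Citizen:
    name: str
    spouse: Optional["Citizen"] = field(default=None, repr=False)


def marry(a: Citizen, b: Citizen) -> None:
    """Link two citizens to each other through the recursive relationship."""
    a.spouse, b.spouse = b, a


# Marriage registry: marriage as a class with its own attributes.
@dataclass
class Marriage:
    marriage_id: int
    partner_1: str
    partner_2: str
    ceremony_date: date
    venue: str


patient = Patient("A. Jones", marital_status="married")
ann, bob = Citizen("Ann"), Citizen("Bob")
marry(ann, bob)
record = Marriage(1, "Ann", "Bob", date(2021, 5, 1), "Town hall")
print(patient, ann.spouse.name, record.venue)
```

Which of the three is appropriate depends on what the sign directs actors to do in the domain, not on any property of marriage itself.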
7.3 Repeating Attributes
A related problem is that frequently we may wish to communicate many things about some apparent class. In practice, this means that we may have to model many properties or attributes of this thing. In such cases, it is normally fruitful to examine whether the class is actually one class or is better represented as a series of related classes. Some proponents of information modelling argue that you should try to keep the number of attributes associated with any class within the bounds of 7 plus or minus 2 attributes; this is a cognitive limit established for human short-term memory. One heuristic or rule of thumb you can apply to achieve this is to look for attributes which repeat in terms of any one instance of a class. In effect, this means looking for attributes which have what we shall refer to in Chap. 8 as a non-functional dependency with the identifier for a class. When this occurs, we need to fragment out these repeating attributes into a separate class. Example
Suppose you are working within human relations or human resources and you spend much of your time communicating about employees. The tendency here would be to build one class Employee and assign all the attributes communicated about employees to this one class. But suppose you find that this makes Employee a class with 40 attributes. This might direct you to consider whether all the attributes are directly attributable to an employee or whether they are best modelled as related classes of a person. For instance, an employee may hold down a number of distinct roles having defined durations of employment with the company. Each duration needs to be recorded in terms of a role number and role name along with its start date and end date. This means that for any one employee identified by a staffNo, there are likely to be a number of role numbers, role names, start dates and end dates. Rather than placing these attributes within one
Employee class, it is probably best to put these attributes in a separate class such as an Employment class and associate them back to the Employee class. ◄
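The fragmentation of repeating attributes described in this example can be sketched as two related classes. The field names follow the attributes mentioned in the text (staffNo, role number, role name, start date, end date), while the sample values and the use of `None` for an open-ended role are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Employment:
    """One period in a role; these attributes repeat per employee."""
    role_no: int
    role_name: str
    start_date: date
    end_date: Optional[date] = None  # None while the role is still held


@dataclass
class Employee:
    """Attributes that hold once per employee, identified by staff_no."""
    staff_no: int
    name: str
    employments: List[Employment] = field(default_factory=list)


employee = Employee(staff_no=101, name="S. Patel")
employee.employments.append(
    Employment(1, "Analyst", date(2018, 1, 8), date(2020, 6, 30))
)
employee.employments.append(Employment(2, "Team leader", date(2020, 7, 1)))

current = [e.role_name for e in employee.employments if e.end_date is None]
print(current)
```

Each employee now carries a small set of single-valued attributes, and the one-to-many association to Employment holds the values that repeat.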
7.4 One-to-One Relationships
Within Chap. 4, we defined relationships of association largely as one-to-many or many-to-many relationships. However, it is possible to model such associations as one-to-one relationships. It must be said that one-to-one relationships of association are not used very frequently within information modelling, but there are two situations in which they sometimes prove useful. The first of these is when the thing referred to by a class undergoes a change of state during its lifecycle. Example
Remember that within the domain of emergency response, a medical emergency reported to the ambulance control centre has to be classified in terms of its medical severity. If the medical emergency is classified as one of the first two categories (A and B), then this medical emergency becomes an emergency incident to this institution and triggers a set of consequent actions, such as sending an ambulance to the incident location. In such a case, we need to record this change of state, and we do so by relating the medical emergency class to an emergency incident class through a one-to-one relationship as in Fig. 7.1. The same goes for the transition from a person reported to a call-taker to a patient of the ambulance service. ◄ The second case where one-to-one relationships may prove useful is when we wish to communicate about many things attributed to a class. Generally, we might use a one-to-one relationship to name collections of attributes that apply to the same thing.
Fig. 7.1 One-to-one relationships (Medical emergency becomes Emergency incident; Person becomes Patient; Clothing customer related to Food customer)
Example
Suppose a company runs both a high-street clothing retail operation and a supermarket chain. The company offers a loyalty card to its customers and wishes to reward customers that utilise both arms of their business. However, because the number of attributes they wish to record about both types of customer exceeds 30 attributes, they decide to have a clothing customer class and a food customer class on their information model (Fig. 7.1) related by a one-to-one relationship. ◄
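The first use of one-to-one relationships in this section, recording a change of state, can be sketched as follows. This is an illustration under assumptions: the `classify` function, the field names and the sample values are hypothetical, and only the rule that categories A and B become an emergency incident is taken from the text.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class EmergencyIncident:
    incident_id: int
    start_time: datetime


@dataclass
class MedicalEmergency:
    emergency_no: int
    category: Optional[str] = None                # set when classified
    incident: Optional[EmergencyIncident] = None  # at most one incident


def classify(emergency, category, incident_id, when):
    """Record the category; categories A and B become an emergency incident."""
    emergency.category = category
    if category in ("A", "B") and emergency.incident is None:
        emergency.incident = EmergencyIncident(incident_id, when)
    return emergency.incident


urgent = MedicalEmergency(emergency_no=1)
minor = MedicalEmergency(emergency_no=2)
classify(urgent, "A", incident_id=500, when=datetime(2022, 1, 1, 9, 30))
classify(minor, "C", incident_id=501, when=datetime(2022, 1, 1, 9, 45))
print(urgent.incident, minor.incident)
```

The optional link from a medical emergency to at most one incident is what makes the relationship one-to-one: the emergency exists before the incident, and the incident appears only once the change of state has been declared.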
7.5 When to Generalise and Aggregate
As we have seen in Chap. 6, information models can be built largely without using any of the relationships of abstraction we discussed in Chap. 3. So the key question is frequently: when should you use generalisation and aggregation within an information model? Generalisation is actually another useful strategy to use within modelling when we wish to communicate about many properties of some thing. Frequently, we may find that many such properties only apply to specific instances of a class rather than to all instances of a class. The set of properties that applies to the specific set of instances should then be used to form a sub-class of the wider class. Example
Suppose we have a Product class and we find that salesmen wish to communicate about many properties of the products they sell such as product code, product description, price, length, weight and so on. However, on closer examination, we find that certain properties such as load bearing capacity are used within communication by salesmen and customers only for certain instances of manufactured steel products—namely, lintels. In this case, it makes sense to split off these properties into a separate sub-class of the super-class. ◄ Aggregation is a form of abstraction that relates some whole to its parts. Unlike generalisation however, the parts are distinct objects from the whole. Within information models, aggregation is particularly used for handling problems of representing component-assembly issues, member-collection issues or place-area issues (Winston et al. 1987). In component-assembly problems, an object is divided into its components in terms of some organised pattern or structure. Each component part has a distinct function which can be separated from the whole.
Example
A handle is part of a cup. Wheels are part of cars. Chapters are parts of books. ◄ In member-collection problems, each part in this type of relationship is not the same as the whole and does not play a specific function in terms of the whole. The parts can be clearly separated out from the whole. Membership in the collection is determined purely on the basis of spatial proximity or social connection. Example
A tree is part of a forest. A juror is part of a jury. The ship is part of the fleet. ◄ Place-area aggregates are used to relate areas to places or locations within them. Places are not parts by virtue of any functional contribution to the whole. Every location shares common features with all other areas. But no location can be separated from the area of which it is a part. Example
The Everglades are part of Florida. An oasis is part of a desert. ◄
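The two forms of abstraction discussed above can be sketched in code. The following is a minimal illustration only, assuming Python dataclasses as the notation; the class names Lintel, Book and Chapter and all attribute names and values are our own assumptions, chosen to echo the examples in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Product:
    # Properties that apply to every product instance.
    product_code: str
    description: str
    price: float


@dataclass
class Lintel(Product):
    # Generalisation: a Lintel ISA Product, with one extra property
    # that only makes sense for this sub-class (hypothetical unit: kN).
    load_bearing_capacity_kn: float = 0.0


@dataclass
class Chapter:
    title: str


@dataclass
class Book:
    # Aggregation (component-assembly): chapters are distinct
    # objects that are parts of the book.
    title: str
    chapters: List[Chapter] = field(default_factory=list)


lintel = Lintel("L100", "Steel lintel", 45.0, load_bearing_capacity_kn=30.0)
bolt = Product("P1", "Bolt", 1.0)
book = Book("Information Modelling", [Chapter("Introduction")])
```

Note the difference in kind: the lintel is itself a product, whereas a chapter is not a book but a part of one.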
7.6
Strong and Weak Classes
Information classes are not created equal—within any information model, there are likely to be strong classes and weak classes. An information class is said to be a strong class if the existence of its instances or objects does not depend on the prior existence of the instances of some other class. In contrast, the objects of a weak class depend on the existence of the instances of some other class within the domain considered. Example
In a university domain, the classes Module and Student are both strong classes because the existence of a given module or student does not depend on the prior existence of any other thing. However, the class Assessment is likely to be a weak class since each assessment depends on the prior existence of both a student and a module. ◄
Identifying weak information classes is important because instances of such classes cannot be uniquely identified by attributes of the class itself. They must acquire some identifying properties from the strong classes with which they are associated. Example
A student will be identified by a studentNo and a module by a moduleCode. An assessment may then be identified by a compound identifier made up of a studentNo and a moduleCode. [moduleCode REFERS TO Module] [studentNo REFERS TO Student] [moduleCode AND studentNo REFERS TO Assessment] ◄ From the example, it should be evident that weak classes are particularly relevant to many-to-many relationships of association. Some approaches to information modelling recommend decomposing any many-to-many relationship within an information model into two one-to-many relationships. The relationship is then promoted to a link class. A link class is a weak class because it cross-refers between instances of one class and the instances of another class. Example
A Student will normally be assessed a number of times on a given Module, and a Module will assess many students. In Fig. 7.2, we introduce an assessment link class to connect students with modules. Note that the many ends of the relationships always appear at the link class. Student and Module are both strong classes in this relationship. Assessment is a weak class because it relies on the prior existence of both Student and Module. ◄
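The dependence of a weak class on its strong classes can be made concrete with a small sketch. This is an illustration under our own assumptions rather than a prescribed implementation: the dictionaries, the function record_assessment and the sample values are all hypothetical.

```python
from typing import Dict, Tuple

# Strong classes: instances are identified by their own attributes.
students: Dict[str, str] = {"S1": "Ann", "S2": "Bob"}   # studentNo -> name
modules: Dict[str, str] = {"M1": "Databases"}            # moduleCode -> title

# Weak (link) class: identified by the compound (studentNo, moduleCode).
assessments: Dict[Tuple[str, str], int] = {}


def record_assessment(student_no: str, module_code: str, grade: int) -> None:
    """Add an assessment only if both strong-class instances already exist."""
    if student_no not in students:
        raise KeyError(f"unknown student {student_no}")
    if module_code not in modules:
        raise KeyError(f"unknown module {module_code}")
    assessments[(student_no, module_code)] = grade


record_assessment("S1", "M1", 54)
```

Because the sketch follows the compound identifier given in the text, it can hold only one assessment per student and module; recording several would need a further identifying property, such as a date.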
7.7
Recursive and Ternary Relationships
In conventional information model diagrams, the relationships are all binary, that is, we diagram two information classes and a relationship or a set of relationships between these information classes. It is possible however for association relationships to be unary. In other words, a relationship may involve only a single information class. Unary relationships are frequently described as being recursive in that they relate classes of the same type.
Fig. 7.2 Decomposing a many-to-many relationship [figure: Module (moduleCode, ..) and Student (studentNo, ..) related by assesses / assessed by; in the decomposed form the link class Assessment (studentNo, moduleCode, ..) sits between them]
Example
Figure 7.3 details a recursive relationship, labelled prerequisite of and prerequisite for, which applies to the class Module. A module may have a number of prerequisite modules; a given module may also be a prerequisite for numerous other modules. This makes the cardinality of both ends of this relationship many. A module does not need a prerequisite and a prerequisite does not need to have any postrequisite modules. This makes both ends of the relationship optional. ◄ As well as relating a class to itself, we may also find uncommon situations in which three or more classes are related together, a so-called ternary relationship.
Fig. 7.3 Unary relationship [figure: the class Module related to itself by prerequisite for / prerequisite of]
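The recursive prerequisite relationship in the example above can be sketched as a mapping from the class Module onto itself. The module names and function names here are hypothetical, and the sketch assumes that the prerequisite structure contains no cycles of interest (the traversal tolerates them but does not report them).

```python
from typing import Dict, Set

# prerequisites[m] holds the modules that must be taken before m.
prerequisites: Dict[str, Set[str]] = {
    "Databases": {"Programming"},
    "Data Warehousing": {"Databases", "Statistics"},
    "Programming": set(),   # optional end: no prerequisite needed
    "Statistics": set(),
}


def postrequisites(module: str) -> Set[str]:
    """Read the same relationship from its other end."""
    return {m for m, pre in prerequisites.items() if module in pre}


def all_prerequisites(module: str) -> Set[str]:
    """Follow the recursive relationship transitively."""
    found: Set[str] = set()
    stack = list(prerequisites.get(module, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(prerequisites.get(current, ()))
    return found
```

Both ends are many (a set of modules on each side) and both are optional (either set may be empty), which matches the cardinality and optionality described in the text.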
Fig. 7.4 Ternary relationship [figure: the relationship Skill-used connects the classes Employee, Project and Skill]
Because of their complexity, ternary relationships are only used when they cannot be decomposed into a series of binary relationships. Example
The relationship skill used in Fig. 7.4 associates the classes Employee, Skill and Project. This ternary relationship is necessary because, if we attempt to fragment skill used into binary relationships, we lose valuable information. Hence, if we had only a relationship between Employee and Project, one between Project and Skill and one between Skill and Employee, we would not be able to determine on which project a particular person utilised a particular skill. ◄
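The loss of information can be demonstrated with a few invented facts. In this sketch (the employees, skills and projects are our own assumptions), we project the ternary relationship onto three binary relationships and then try to rebuild it.

```python
# Each fact in the ternary relationship names all three participants.
skill_used = {
    ("Ann", "Python", "Billing"),
    ("Ann", "SQL", "Payroll"),
    ("Bob", "SQL", "Billing"),
}

# Project the ternary relationship onto three binary relationships.
employee_skill = {(e, s) for e, s, p in skill_used}
employee_project = {(e, p) for e, s, p in skill_used}
skill_project = {(s, p) for e, s, p in skill_used}

# Recombine: keep every triple that is consistent with all three.
rebuilt = {
    (e, s, p)
    for e, s in employee_skill
    for e2, p in employee_project
    if e == e2 and (s, p) in skill_project
}

# Triples the binary relationships cannot rule out, although
# they were never asserted in the original relationship.
spurious = rebuilt - skill_used
```

Here the binary relationships suggest that Ann used SQL on the Billing project: she has the skill, she works on the project and the skill is used on the project. Only the ternary relationship records that she did not.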
7.8
Modelling Time
In most institutional settings, actors are interested in things which we might generally call events. Events are happenings and as such occur at a particular time and date, so the classes that represent them must be timestamped in some way. Hence, in many situations, some way of handling both past and future time as well as present time must be utilised within the analysis and design of communicative acts. Example
Consider the case where staff at a university want to communicate about the courses the university runs as well as details of the students who enrol on such courses. If we are only interested in current enrolment, then the information model in Fig. 7.5a is sufficient.
Fig. 7.5 Information model with time [figure: (a) Course (courseCode, ..) related to Student (studentNo, ..) by enrols / enrolled on; (b) the link class Enrolment (studentNo, courseCode, enrolmentDate) placed between Course and Student]
However, a more realistic situation is approached when the university decides it wishes to build institutional facts about past enrolments for management decision-making. Staff now wish to record which students have enrolled in which courses over a period of say 5 years. This means that we now make the relationship between course and student a many-to-many relationship as in Fig. 7.5b. This structure however is equally capable of handling future events. Suppose the university wishes to extend the use of its records to schedule future courses. The only modification we need to make to our information model is to make course and student optional in the appropriate relationships. In other words, we wish to allow course details to be recorded prior to places being filled. Likewise, we wish to record details of students prior to their enrolment on a particular course. ◄
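A timestamped link class of this kind can be sketched as follows. The attribute names follow Fig. 7.5, but the sample records, dates and function names are hypothetical.

```python
from datetime import date
from typing import List, NamedTuple


class Enrolment(NamedTuple):
    student_no: str
    course_code: str
    enrolment_date: date


enrolments: List[Enrolment] = [
    Enrolment("S1", "C1", date(2019, 9, 23)),
    Enrolment("S1", "C2", date(2021, 9, 20)),
    Enrolment("S2", "C2", date(2021, 9, 20)),
    Enrolment("S2", "C3", date(2030, 9, 23)),   # a scheduled, future enrolment
]


def enrolled_between(start: date, end: date) -> List[Enrolment]:
    """Return enrolments dated within the closed interval [start, end]."""
    return [e for e in enrolments if start <= e.enrolment_date <= end]


def courses_for(student_no: str) -> List[str]:
    """All courses a student has been, is or will be enrolled on."""
    return [e.course_code for e in enrolments if e.student_no == student_no]
```

The same structure serves past, present and future enrolments; only the value of enrolment_date distinguishes them.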
7.9
Connection Traps
One key advantage of visualising an information model is that it becomes easier to identify a number of potential problems with the navigation around such models. Figure 7.6a, b illustrates a number of potential pitfalls in information modelling. These pitfalls are known as connection traps (Howe 1981) because they may trick the modeller into making invalid assumptions about the connection between information classes.
Fig. 7.6 Connection traps. (a) Fan trap, (b) chasm trap [figure: each panel shows an information model over the classes Faculty, Department and Staff together with an instance diagram of numbered instances]
The first type of connection trap to consider is known as a fan trap because it may occur when two one-to-many (1:M) relationships fan out from the same information class. Example
In the information model in Fig. 7.6a, the following assertions are made about some academic domain:
• A faculty has many departments.
• Every department belongs to at most one faculty.
• A faculty has many staff.
• A member of staff belongs to at most one faculty.
The business analyst assumed that this way of building an information model was sufficient to tell him which staff belong to which department. As we see from the associated instance diagram however, this assumption is incorrect. Although we can tell from the information model which staff belong to which faculty and
which department belongs to which faculty, the link between staff and departments is ambiguous. Figure 7.6b illustrates a representation for the same domain which overcomes the fan trap. The instance diagram indicates that the query of which staff work for which departments is clearly answerable from this revised model. ◄ The second kind of connection trap is known as a chasm trap because the model suggests that a relationship exists between all instances of two information classes when this is not the case. Example
The revised information model in Fig. 7.6b may be subject to a further problem. What if we have within our university staff who are employed by faculties rather than by departments? In other words, the participation of staff in the relationship with departments is optional. Our information model would give us an incorrect answer as it assumes that all staff must be employed by departments. To avoid this misinterpretation, we have to introduce an additional relationship into our diagram between staff and faculty (see Fig. 7.7) and clearly indicate that both relationships are optional. Note that what defines a fan trap or a chasm trap is determined by the business rules applicable to a particular domain. Figure 7.6b, for instance, would be perfectly reasonable as a representation of some domains where the business rules prohibit staff from being employed directly by a faculty. ◄
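Both traps can be reproduced with a handful of invented instances. In the sketch below, all names and values are hypothetical; the first two mappings correspond to a model in which departments and staff each fan out from faculty, and the third to the revised model in which staff are linked to departments.

```python
# Fan trap: departments and staff are each linked only to a faculty.
department_faculty = {"Accounting": "Business", "Marketing": "Business"}
staff_faculty = {"Jones": "Business", "Patel": "Business"}


def candidate_departments(staff: str) -> set:
    """Departments the fan-trap model cannot rule out for a staff member."""
    faculty = staff_faculty[staff]
    return {d for d, f in department_faculty.items() if f == faculty}


# Revised model: Staff -> Department -> Faculty removes the ambiguity...
staff_department = {"Jones": "Accounting", "Patel": "Marketing"}

# ...but opens a chasm for staff employed directly by a faculty.
staff_department["Evans"] = None   # Evans belongs to no department


def faculty_of(staff: str):
    """Navigate Staff -> Department -> Faculty; None means the path is broken."""
    department = staff_department.get(staff)
    return department_faculty.get(department)
```

In the fan-trap model, candidate_departments cannot narrow Jones down to a single department. In the revised model, faculty_of finds no faculty for Evans, which is the chasm; adding a direct, optional relationship between staff and faculty would close it.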
Fig. 7.7 Faculty staff information model [figure: an information model over the classes Faculty, Department and Staff with an instance diagram of numbered instances]
7.10
Information Model Patterns
We proposed in Chap. 2 that what serves to scaffold the very idea of an institutional domain is the pattern of information situations present within this domain. But the idea of a pattern has another useful facet—the idea that a pattern in whole or in part may be applicable to more than one institutional domain. In other words, it is more than likely that we might observe common ways of doing things (of articulating, communicating and coordinating things) between separate domains. From such commonalities, we might develop a pattern which applies across domains. This idea is very much the foundation of benchmarking and reuse. A benchmark was originally a mark cut in a wall or pillar of some building and was used as a reference point to take measurements. In business terms, a benchmark now typically refers to some way of doing things which is regarded as in some way exemplary. In such a sense, benchmarking refers to the idea of comparing some pattern of activity, information or data within one’s own domain with that from some other domain which is perceived to engage in best practice. Example
For instance, the UK is divided into a number of local authority areas for administrative purposes. Local authorities are tasked with delivering a number of public sector services to citizens living within the boundaries of the local authority. Such services include waste collection, social services and primary and secondary education. One would expect that there are commonalities between the way in which one authority delivers services and the ways in which other local authorities operate. One would therefore expect that the communication required to support service delivery as well as the data structures needed to support communication would have common patterns across a range of local authorities. ◄ One key consequence of the idea of common patterns is that we should be able to reuse certain patterns within processes of analysis and design. In other words, we should be able to use a pattern derived from observing commonalities as a template to establish an appropriate pattern for activity, information and data in some other domain. In the 1990s, David Hay (1996) proposed a range of information model patterns (he called them data model patterns) for a number of different institutional domains such as inventory, work orders, contracts, accounts, laboratory testing, material requirements planning, process manufacturing and documents management. Michael Blaha (2010) has published a more recent incarnation of this idea in which he refers to information model patterns as archetypes. This should come as no surprise in the sense that business practices in areas such as trade between one organisation and another are relatively standardised across countries. This is just another way of saying that information modelling is useful in clarifying conventions associated with information situations across domains.
Example
Take the idea of a sales order which of course underpins economic exchange in capitalist economies. It involves communication about a number of things important to establishing the contractual basis of economic exchange. A sales order takes place between two parties—the seller of the goods and the buyer of the goods. A sales order also typically details a product sold and the quantity of such product. This description details the key elements of a core or generic pattern of the following form: [Buyer CREATES Sales order] [Seller RECEIVES Sales order] [Sales order CONTAINS Order line] [Order line DETAILS Product] [Order line HASA qty] We would expect an information model based on some elaboration of this ontology to apply across a range of retail situations. ◄ Example
Take another example of an information model pattern, pertaining to institutional facts associated with a contract (Blaha 2010). A contract is an agreement between two or more actors, typically for the supply of products. There are many different types of contract associated with different forms of product, such as foodstuffs, securities or even services. An abstract pattern for the common core elements of a contract is illustrated in Fig. 7.8. ◄
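To show how a pattern might be reused as a template, the sales order pattern given earlier can be transcribed almost line for line. This is a sketch under our own assumptions: Python dataclasses stand in for information classes, and the attribute names, the method total_quantity and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Product:
    product_code: str
    description: str


@dataclass
class OrderLine:
    product: Product     # [Order line DETAILS Product]
    qty: int             # [Order line HASA qty]


@dataclass
class SalesOrder:
    buyer: str           # [Buyer CREATES Sales order]
    seller: str          # [Seller RECEIVES Sales order]
    # [Sales order CONTAINS Order line]
    lines: List[OrderLine] = field(default_factory=list)

    def total_quantity(self) -> int:
        """Sum the quantities over all order lines."""
        return sum(line.qty for line in self.lines)


order = SalesOrder(buyer="Acme Builders", seller="Steel Supplies Ltd")
order.lines.append(OrderLine(Product("L100", "Steel lintel"), qty=12))
order.lines.append(OrderLine(Product("B200", "Steel beam"), qty=3))
```

An elaboration for a particular retail domain would add further attributes and classes, but we would expect this core to remain recognisable.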
7.11
Conclusion
As with any modelling technique, proficiency in information modelling comes only from practising it in many different settings. Within the current chapter, we have considered a number of issues which impinge upon the proper practice of information modelling. Such issues include which construct to use and when, how to handle unary and ternary relationships and how to avoid connection traps. We have also proposed the idea of an information model pattern that can be promoted as a benchmarking tool or reused within design work. The next chapter focuses upon the primary objective of information modelling, which is to design a data system of some form.
Fig. 7.8 A contract pattern as an information model [figure: the classes Contract type, Actor, Contract, Contract item and Product]
7.12
Summary
• A key question faced by any information modeller is how do you know that something should be modelled as a class, attribute or relationship? The answer to this question depends upon an appreciation of how certain signs are used by certain actors within the domain in question and for what purpose.
• When we wish to communicate many things about some apparent class, it is often fruitful to examine whether the class is actually one class or is better represented as a series of related classes.
• One-to-one relationships are useful in situations when the thing referred to by a class undergoes a change of state during its lifecycle. They are also useful when we wish to communicate about many things attributed to a class.
• Generalisation is another useful strategy to use within modelling when we wish to communicate about many properties of some thing. Aggregation is particularly used for handling problems of representing component-assembly issues, member-collection issues or place-area issues.
• It is useful to identify strong and weak classes within an information model. An information class is said to be a strong class if the existence of its instances does not depend on the prior existence of the instances of some other class. In contrast, the instances of a weak class depend on the existence of instances of some other class within the domain considered.
• On rare occasions, it may be important to include a unary or a ternary relationship in an information model.
• When the business analyst needs to include time within an information model, it inherently means using many-to-many relationships between the classes which are time-dependent.
• It is sometimes useful to decompose all many-to-many relationships in an information model into two one-to-many relationships.
• The business analyst must ensure that her information model does not suffer from connection traps such as fan traps and chasm traps. Instance diagrams are a useful visual aid for spotting such traps.
8
Information Modelling and Data Systems
8.1
Introduction
Information modelling is typically directed at the design of some data system. We use the term data system, rather than database system, to indicate that information modelling has a wider range of application than is ordinarily assumed. We first review the nature of data and contrast this with our conception of information provided in earlier chapters. This leads us to define more precisely the concept of a data structure, which lies at the heart of any data system. The architecture of a data system, as we shall see, is defined in terms of some data model. This leads us to consider one of the most popular contemporary data models, the relational data model, which we introduce in a somewhat unusual manner through the commonplace data structure of a list. The design of a relational database, which is referred to as a schema, is best understood through a visualisation technique known as dependency diagramming. This technique offers a straightforward route for conducting a process important to the design of a relational schema known as normalisation.
8.2
Data and Information
Our theory of information is the bedrock for our consideration of not only information modelling but the data systems that this activity is generally directed at. This is because the notion of an information situation helps clarify the important distinction between data and information. As we determined in Chap. 2, information is any difference that makes a difference to some actor. More accurately, in terms of our theory of information situations, information is any set of differences which makes further differences in the psyche (mind) of some actor or group of actors. The crucial distinction here is that data are differences made in some substance or medium by some actor or actors. Information, in contrast, is an accomplishment made by actors through their encounter with data.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 P. Beynon-Davies, Information Modelling, https://doi.org/10.1007/978-3-030-98805-0_8
Example
Consider human speech. Speech is physically a pressure wave, or more precisely a series of pressure waves, that travels through a solid, liquid or gas. In this form, differences correspond to the peaks and troughs of the soundwave, and this variation corresponds to what is known as the modulation of the signal. ◄ Certain differences made in some substance can be coded to form symbols. In Chap. 4, we defined a sign as some thing that stands to somebody for some other thing. Or, in terms of a constitutive rule: [X stands for Y to Z in C] In this definition, X is the symbol which stands for some thing (Y) to some actor (Z) within some institutional context (C). The important point about this definition is that the stands for relationship between a symbol and what it refers to is typically an arbitrary one. It relies merely upon precedent set amongst a community of actors: that collectively a community of actors accept that some thing will be taken to stand for or count as something else. There are a number of consequences arising from the arbitrary nature of symbols. First, a certain set of differences may hold significance for one actor but not for another. In other words, the stands for relation is not the same for everyone. Example
Consider the spoken word croeso (pronounced croyso). The units of speech making up the spoken word are known as phonemes and comprise the smallest significant set of differences made in the spoken word. These differences made by speaking the word croeso only hold significance for someone communicating within the institutional context of the Welsh language. To make sense of these differences to English speakers, we have to make another stands for relation, namely, that croeso stands for welcome. ◄ Second, the same set of differences may inform two different actors differently. In other words, different actors may accomplish a different stands for relation with the same symbol. Example
Consider ‘54’ as a set of differences—a sequence of graphical elements. Through convention established in our society, the graphic ‘5’ and the graphic ‘4’ are taken as digits. But what does the sequence of these two digits stand for? To one actor, perhaps within the institutional setting of a university, it stands for a number or more particularly a grade awarded to a student for an assessment. To another
actor, perhaps working within a manufacturing setting, the sequence stands for a product code and identifies a particular type of widget to this actor. ◄ Third, a set of differences only becomes significant to a group of actors through convention. Clearly, this is inherent in our definition of data as a set of differences created with the purpose of communicating something. This means that actors must collectively agree that a certain set of differences when used repeatedly will always stand for the same thing for them. Without this collective agreement, the set of differences would not form a symbol. Example
Without conventions, communication would be impossible. The group of such conventions of communication is sometimes confusingly referred to as a code. For example, Morse code was invented by Samuel Morse to facilitate communication over the newly invented telegraph system. The code establishes the conventions that certain sequences of dots and dashes stand for the letters of the alphabet. Once telegraph operators accepted this group of conventions internationally, remote communication across the world over the telegraph became possible. ◄ So, Fig. 8.1 illustrates the component elements of any sign within which a symbol achieves significance. First, there is the symbol: the set of differences made in some substance. Second, there is the object: the thing referred to or described. Third, there is the concept: the differences made to some actor through engagement with data. Fourth and finally, there is always the actor at the centre of the process of sign-use or semiosis. All four of these elements must be present for any sign to exist. This conception of the sign allows us to precisely distinguish between data and information. Data is the set of differences made in some substance, that is, the symbol. Information is accomplished by some actor in making the connection between the thing referred to (the object) and what that thing is taken to mean (the concept). So, data is formed from differences made within a substance. Symbols are coded from such differences and formed into larger entities known as data structures, that is, structures for data. Such data structures may act as messages conveyed as signals, or they may be used to record details of things and so build collective memory of those things.
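The dependence of a symbol on convention can be illustrated with a simple lookup. This is a deliberately reduced sketch of the rule [X stands for Y to Z in C]: it collapses the actor Z and the context C into a single context label, and the table entries and the function stands_for are our own assumptions based on the examples above.

```python
from typing import Dict, Optional, Tuple

# (symbol, institutional context) -> what the symbol stands for.
conventions: Dict[Tuple[str, str], str] = {
    ("54", "university"): "grade awarded for an assessment",
    ("54", "manufacturing"): "product code for a type of widget",
    ("croeso", "Welsh language"): "welcome",
}


def stands_for(symbol: str, context: str) -> Optional[str]:
    """Return what a symbol stands for in a context, or None when no
    convention exists and the differences carry no significance."""
    return conventions.get((symbol, context))
```

The same data ('54') informs differently in two contexts, and 'croeso' informs nobody outside the context in which its convention holds.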
8.3
Data Structures
Within our theory of information situations introduced in Chap. 2, a key part is played by structures sensed and effected by actors while engaging in their surrounding environment. We indicated there that the information modeller inherently focuses upon certain types of structure that are used to communicate things between two or more actors. This type of structure is known as a data
Fig. 8.1 The component elements of a sign [figure: the labels THING, Concept, Symbol and Object arranged around the Actor]
structure, and we shall use this term to refer to a conventional set of differences used to communicate things between a group of actors. Within computer science, data structure is a term used broadly to refer to some systematic form for organising data (Tsichritzis and Lochovsky 1982). This concept is clearly central to the interests of all the information disciplines (information science, information management, information systems, computer science). Much of the infrastructure of information and communication technology, for instance, is clearly taken up with the mechanics of data structures, particularly as it pertains to applications within business and government. However, although much research and development continues to be devoted to finding better ways of storing, retrieving and manipulating data structures, this concept is only rarely examined critically within the information disciplines. By this we mean that the data structure is treated largely as a technological artefact, helping to support but somewhat isolated from considerations of institutional order. As such, data structures appear to form part of the accepted and unexplored background to the conduct of investigation and explanation in these disciplines. Although data structures are rarely given much thought, the data formed in such structures are critically important both to organisations and to individuals, in the sense that much organisational and individual action is reliant upon data structures.
Example
As a citizen of a modern nation-state, your biography is not only recorded but lived through data structures. Your birth is marked with a birth certificate, which enrols you as a citizen of the state. You pass various education exams and are issued with certificates which qualify you to do certain things. You learn to drive and apply for a driver’s licence. You purchase a car and you apply for a vehicle licence. You undertake gainful employment and get recorded in employment, national insurance and taxation records, which require you to do certain things like pay income tax. You decide to travel but must prove your citizenship by applying for a passport. You perhaps get married and are issued with a marriage certificate and may have children issued with their own birth certificates. All these data structures change your institutional status and in turn your rights and responsibilities. You may at some point fall seriously ill and need to access data structures such as your NHS record or national insurance record to access healthcare and welfare benefits. When you retire, crucial records held about your public and private pension scheme will determine the income available to you. Finally, your death triggers a death certificate, which is used by your descendants to resolve issues of probate (inheritance). ◄ Data structures may be paper forms, letters, documents and memos. They may be electronic tables in a database or electronic documents held on some data server or even emails, texts and social media messages. However, all data structures have a common core of features. In very general terms, any data structure can be seen to be a hierarchical construct. A data structure is made up of a series of data elements which in turn are made up of a series of data items. 
For instance, in a physical filing cabinet, a drawer of the cabinet might form the data structure, while a hanging section placed in the drawer might be the data element, while an individual paper form placed in a section might be the data item. Or in an electronic database, a table would be the data structure, while a row of the table would be a data element, and an individual attribute of a row would comprise a data item. Example
At its most basic, a list corresponds to a set of elements: an assembly of distinct ‘things’, considered as a thing in its own right. The list as a data structure consists of a set of list items. Each of these elements can take many different forms, but let us take a form we are familiar with—that of a binary relation. As we saw in Chap. 4, a binary relation consists of a triple of data items, in which the first data item is termed the subject, the second the relation and the third an object. So, consider a simple passenger list for an airline taking the following form:
[109999555 REFERS TO John Smith] [105599544 REFERS TO Anwar Prakash] [103399565 REFERS TO Zu Cheng] .... This list consists of a series of binary relations, which we are already familiar with. The first data item in each list item is the subject and, in this case, constitutes a UK passport number. The last data item is a natural identifier for a person—a personal name. Both data items are related or predicated through the REFERS TO relation. This predicate effectively implements what we called identification—it associates a given surrogate identifier with some natural identifier for the person. ◄ The sense that a data structure is a particular way of organising data is clearly an abstraction—a set of principles for both storing and accessing data. In certain literature, this abstraction is sometimes referred to as an abstract data type. But data structures such as lists are clearly instantiated—given form. In this sense, a specific instance of a list, such as a product list, passenger list or picking list, is also a data structure. In the concrete, a data structure is used to represent things and through such representation to help constitute institutional order. Within the discussion that follows, we shall utilise the term data structure both to refer to an abstraction and to an instantiation. Exercise Take the data structure of a birth certificate. Determine what the data elements and data items of this data structure correspond to. Then think about the performativity of a birth certificate—what does a birth certificate actually do in institutional terms?
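The passenger list above also illustrates the hierarchy of data structure, data element and data item. A minimal sketch, assuming a Python list of named tuples as the concrete form (the names ListItem and identify are our own):

```python
from typing import List, NamedTuple, Optional


class ListItem(NamedTuple):
    # One data element, made up of three data items.
    subject: str     # surrogate identifier (a passport number in the text)
    relation: str    # the predicate relating subject and object
    object: str      # natural identifier (a personal name)


# The data structure: a list of binary relations.
passenger_list: List[ListItem] = [
    ListItem("109999555", "REFERS TO", "John Smith"),
    ListItem("105599544", "REFERS TO", "Anwar Prakash"),
    ListItem("103399565", "REFERS TO", "Zu Cheng"),
]


def identify(passport_no: str) -> Optional[str]:
    """Return the name a passport number refers to, if it is on the list."""
    for item in passenger_list:
        if item.subject == passport_no and item.relation == "REFERS TO":
            return item.object
    return None
```

The class ListItem corresponds to the list as an abstraction, while passenger_list is one instantiation of it.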
8.4
The Ontological Status of Data Structures
Within this book, we want to establish the key principle that information modelling is important because we should never take data structures for granted. Ontology, as we have seen in Chap. 4, is a theory of reality, being or what things are seen to exist. Within this book, much as we have done previously with the concept of information and the practice of information modelling, we want to reverse the conventional ontology associated with data structures. In other words, we want to question the conventional ways of thinking about why data structures exist. As we intimated in Chap. 3, conventionally, and as conceived in the dominant literature, a data structure is viewed purely as a technological artefact. In this view, data structures, their elements or their items are taken to represent propositions about
things in some institutional reality. The institutional reality is also assumed to be observer-independent, meaning that it is the same for all actors. Example
Hence, in a manufacturing setting, a picking-item, which relates a given identifier for a shipping item with a given identifier for a truck, serves as a proposition about these things to workers within the institutional reality of a supply chain. ◄ Within formal logic, data elements or data items as propositions may take only one of two values, namely, true or false. We either assert the truth of a given proposition by writing a data element or data item to the data structure, or we retract a given proposition by deleting the corresponding data element or data item from the data structure. This implies that the state of a data structure at any given time consists of true statements about the real-world domain it represents, such as which shipping items have been loaded onto which trucks. This so-called correspondence view of truth implies that there is a necessary separation between institutional reality and data structures. It also implies that a data item as an externalisation or representation is taken to correspond to some real-world thing or more likely a set of things important to actors within some institutional reality. As a consequence of our theory of information situations, we argue that we need to reverse this conventional ontology associated with data structures. Data structures are not separate from institutional reality; they are very much entangled within it. In fact, data structures are constitutive of such realities in the sense that they ‘scaffold’ action and inter-action between actors working within and between institutions. Data structures are not only forms of structure; they serve to inform institutional actors and often prescribe or proscribe action on the part of such actors. Data structures are important to the instituting of facts about things and through this process are critical to the production and reproduction of institutional order (Beynon-Davies 2016). Example
Consider a list of passport numbers again. A member of the UK Passports Office can use this list to declare British citizens. In doing so, such actors are inherently using the identifiers in this list to instantiate an information class in the following manner:
[109999555 ISA British citizen]
[105599544 ISA British citizen]
[103399565 ISA British citizen]
...
Passports and passport identifiers were originally designed to enable the declaration of citizenship and as such to enrol persons into the institutional domain of international travel. But such tokens and identifiers are now used in
many other situations relating not only to government and its agencies but to interaction with private sector institutions. For instance, a member of the UK Borders Agency can use a list item from the list above to authenticate a person. In other words, a fact from this list asserts that the individual is who they say they are. On this basis, a person with a passport number is permitted to travel to nominated countries. ◄ So, data structures are not only informative, they are also performative—they get things done. This we have referred to elsewhere as the performativity of data structures. Example
The performativity of lists is evident in a number of English terms: blacklisting, shortlisting, whitelisting and even watchlisting. These terms all refer to commonplace activities driven by the data structure of the list. When shortlisting a group of people for an interview or for a prize, the data structure, the shortlist, serves to initiate action such as calling someone to interview. Blacklists may be used as data structures shared between financial institutions to prevent persons who have reneged on their debts from obtaining credit. The whitelist has been used particularly by trades unions to refer to people held by the union to be suitable for employment within their protected trade. Finally, a watchlist is a list of persons or things that some institution deems should be watched for possible action in the future. Whitelists, shortlists and watchlists are clearly data structures which prescribe what should happen to those persons or things identified on the list. A blacklist, in contrast, proscribes persons or things identified on the list from certain happenings. ◄
8.5 Data Models
So, data structures, elements and items are used to represent instances of things of interest to a particular institutional domain. As we have seen in previous chapters, data structures are particularly used to represent what the philosopher John Searle refers to as institutional facts about some domain. Institutional facts are matters of convention which serve to define or constitute the institution in which they are used. Any model of data must be an abstraction from such instances of institutional facts. In the case of data structures, the business analyst needs to abstract from actual instances of records, lists, etc. certain commonalities of structure. We refer to such an abstraction as a data model. A data model primarily describes structures of data and how these relate together. As we have seen, data structures consist of data elements which in turn consist of data items. Data items act as placeholders or ‘containers’ that take or are assigned values. Valued items, elements or structures are used to
represent things of interest to some domain: to build institutional facts which actually serve to constitute the domain of interest. Example
Take a historical example. With the rise of the modern office in the nineteenth century, the technology of filing cabinets, paper folders and paper forms was invented and used to organise data. Within this data system, a typical record consists of a series of data items or fields which serve to represent an instance of something of interest to the organisation. For instance, a business organisation might create a typical record, consisting of a paper form, for each of its customers, with fields such as customer name, customer address and customer telephone number. Such a form might then be placed in a suspension folder for easy access. Records (such as paper forms), in turn, are typically collected into the data structure of a file—consisting perhaps of a filing cabinet and representing some association between these data elements. For instance, a customer file assembles a series of customer records. Various different customer files might be created, with a specific criterion used to decide which record goes in which file: customers located in different areas of the country or handled by different account managers, for instance. ◄ This data model of files, records and fields served administrative organisations effectively for over 200 years. More recently, another model for managing data as electronic records has gained dominance. This is generally known as the relational data model.
8.6 The Relational Data Model
It is useful to approach the relational data model through the idea of a list, a data structure which we have already met. At its most basic, a list corresponds to a set of elements: a collection of distinct objects, considered as an object in its own right. There are two ways of describing or specifying the members of a set. One way is by intensional definition, using a rule such as A is the set of colours of the French flag. The second way is by extension, that is, listing each member of the set. An extensional definition within mathematics is denoted by enclosing the list of members in curly brackets, such as A = {blue, white, red}. Most lists used for modern institutional purposes are actually ordered sets known as sequences or tuples, implying that both the elements of the list and the position of the elements in the list are significant; hence, a tuple is different from another tuple that holds the same elements in a different order. Within institutional contexts, simple lists consist of an ordered collection of data items, typically, as we have seen, identifiers for persons, things or events. More complex lists consist of ordered collections of data elements (such as records) or even data structures (such as files).
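The difference between a set and a tuple can be made concrete with a minimal sketch in Python. The flag colours come from the text above; the rule for small even numbers and the two orderings of the colours are assumptions introduced purely for illustration.

```python
# Extensional definition: list every member explicitly.
flag_colours = {"blue", "white", "red"}

# Intensional definition: give a rule that picks out the members.
# (The even-number rule is an illustrative assumption, not from the text.)
small_evens = {n for n in range(10) if n % 2 == 0}

# A set holds distinct members and ignores order, so repeating
# or reordering members does not produce a different set.
same_set = flag_colours == {"red", "white", "blue", "blue"}

# A tuple is ordered, so the position of each element is significant.
# (Both orderings are illustrative assumptions.)
one_ordering = ("blue", "white", "red")
other_ordering = ("red", "white", "blue")
same_tuple = one_ordering == other_ordering

print(same_set)             # True
print(same_tuple)           # False
print(sorted(small_evens))  # [0, 2, 4, 6, 8]
```

The two comparisons show why a list used for institutional purposes is usually better thought of as a tuple than as a set: reordering its elements changes what it says.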
Dispatch advice: Goronwy Galvanising
Advice No.: 101    Date: 22/01/1988    Customer name: Blackwalls

Order No.   Description   Product code   Item length   Order Qty   Batch weight   Returned Qty   Returned weight
13/1193G    Lintels       UL150          1500          20          145            20             150
44/2404G    Lintels       UL1500         15000         20          1450           20             1460
70/2517P    Lintels       UL135          1350          20          130            20             135
23/2474P    Lintels       UL120          1200          16          80             14             82

Driver:            Received by:

Fig. 8.2 A dispatch advice
Exercise Consider another data structure of interest to modern organisations: the electronic mail or email. Analyse an email as a data structure. What constitutes its elements and what are its likely data items?
The mathematician Ted Codd (1970) had the key insight of mapping aspects of this theory of sets, particularly the idea of tuples, onto that of files, records and fields. Codd proposed mapping the data structure of a file onto that of a mathematical relation, being a set of tuples. This data structure fundamentally underlies the data management systems used within mainstream digital computing systems. The relational data model uses the data structure of the table, which consists of multiple data elements or rows, each of which in turn consists of a series of data items or columns. Consider a typical business form such as the dispatch advice in Fig. 8.2 which might be used by a manufacturing organisation. The data on this form can be stored in a relational database using the two data structures illustrated in Fig. 8.3.
Exercise Primary keys act as identifiers. How does the idea of a primary key relate to the notion of an object identifier such as a person identifier?
The table named Dispatch notes in Fig. 8.3 consists of four data items (dispatchNo, dispatchDate, customerCode and Instructions) and three data elements corresponding to three rows in the table, one for each dispatch note that arrives with a dispatch of steel product to a customer.
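Codd's mapping of a file onto a relation, a set of tuples, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three rows are those of the Dispatch notes table in Fig. 8.3, the empty Instructions column is represented as None, and dispatch note 104 is a hypothetical row invented to show a proposition being asserted and then retracted.

```python
from typing import NamedTuple, Optional


class DispatchNote(NamedTuple):
    """One row (data element) made up of four columns (data items)."""
    dispatchNo: int
    dispatchDate: str
    customerCode: str
    instructions: Optional[str]  # blank in Fig. 8.3, so None here (assumption)


# The table as a mathematical relation: a set of tuples.
dispatch_notes = {
    DispatchNote(101, "22/01/2012", "BLW", None),
    DispatchNote(102, "25/02/2012", "TCO", None),
    DispatchNote(103, "10/03/2012", "BLW", None),
}

# Asserting a proposition: write a new tuple to the relation.
new_note = DispatchNote(104, "12/03/2012", "TCO", None)  # hypothetical row
dispatch_notes.add(new_note)

# Writing the same tuple again changes nothing: a set holds no duplicates.
dispatch_notes.add(new_note)
size_after_assert = len(dispatch_notes)

# Retracting the proposition: delete the tuple from the relation.
dispatch_notes.discard(new_note)

print(size_after_assert)    # 4
print(len(dispatch_notes))  # 3
```

Because the relation is a set, the order of rows carries no meaning, whereas the order of items within each tuple does.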
Dispatch notes
Dispatch No.   Dispatch date   Customer code   Instructions
101            22/01/2012      BLW
102            25/02/2012      TCO
103            10/03/2012      BLW

Dispatch items
Dispatch No.   Sales order No.   Customer product code   Dispatch quantity
101            13/1193G          UL150                   20
101            44/2404G          UL1500                  20
101            70/2517P          UL135                   20
101            23/2474P          UL120                   14

Fig. 8.3 A simple relational database
Each row in a table is identified by values in one or more columns of the table, called the table’s primary key. To act properly as an identifier, the values of a primary key must be unique and not null. In other words, every row must have a value for each data item in the primary key (rather than a null value), and no two rows may share the same primary key value. For instance, in the Dispatch notes table, the Dispatch No. data item is the only item guaranteed to have both these properties. It is therefore the most suitable candidate for a primary key for this table.
Values in columns may also act as links to data contained in other tables. Such columns are called foreign keys. A value for a foreign key must either be the value of some primary key elsewhere in the database or be null. When the value for a foreign key is null, the row concerned is not related to any row in the referenced table. In Fig. 8.3, the primary key of the Dispatch items table is actually composed of two data items: Dispatch No. and Sales Order No. Both these data items individually are in fact foreign keys to two other tables in the database. Dispatch No. acts as a foreign key back to the primary key of the Dispatch notes table. Sales Order No. acts as a foreign key back to a Sales orders table. The values of these two foreign keys can never be null because we must always know which dispatch note and sales order a particular dispatch item relates to.
This leads to some clarity in the difference between a data model and an information model. A data model describes patterns of structure amongst the data utilised in some domain. An information model is an attempt to develop some representation of the collective accomplishment a group of actors make with data. In other words, an information model attempts to document what data structures are used for: what things they communicate to actors within the domain.
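Returning to the keys described above, the following is a minimal sketch of how the two tables of Fig. 8.3 might be declared, using Python's built-in sqlite3 module. The table and column names, and the choice of SQLite, are assumptions made for illustration. The Sales orders table is not shown in the figure, so Sales Order No. is declared here only as part of the primary key and not as a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE dispatch_notes (
    dispatch_no   INTEGER PRIMARY KEY,
    dispatch_date TEXT NOT NULL,
    customer_code TEXT NOT NULL,
    instructions  TEXT
);
CREATE TABLE dispatch_items (
    dispatch_no       INTEGER NOT NULL REFERENCES dispatch_notes(dispatch_no),
    sales_order_no    TEXT    NOT NULL,
    product_code      TEXT    NOT NULL,
    dispatch_quantity INTEGER NOT NULL,
    PRIMARY KEY (dispatch_no, sales_order_no)
);
""")

conn.executemany("INSERT INTO dispatch_notes VALUES (?, ?, ?, ?)", [
    (101, "22/01/2012", "BLW", None),
    (102, "25/02/2012", "TCO", None),
    (103, "10/03/2012", "BLW", None),
])
conn.executemany("INSERT INTO dispatch_items VALUES (?, ?, ?, ?)", [
    (101, "13/1193G", "UL150", 20),
    (101, "44/2404G", "UL1500", 20),
    (101, "70/2517P", "UL135", 20),
    (101, "23/2474P", "UL120", 14),
])


def rejected(sql, params):
    """Return True if the database refuses the row for breaking a constraint."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


# A primary key value must be unique: dispatch note 101 already exists.
duplicate_key = rejected(
    "INSERT INTO dispatch_notes VALUES (?, ?, ?, ?)",
    (101, "01/04/2012", "TCO", None))  # hypothetical row

# A foreign key value must match an existing primary key: there is no note 999.
dangling_reference = rejected(
    "INSERT INTO dispatch_items VALUES (?, ?, ?, ?)",
    (999, "13/1193G", "UL150", 5))  # hypothetical row

# No part of the composite primary key may be null.
null_key = rejected(
    "INSERT INTO dispatch_items VALUES (?, ?, ?, ?)",
    (101, None, "UL150", 5))  # hypothetical row

item_count = conn.execute(
    "SELECT COUNT(*) FROM dispatch_items WHERE dispatch_no = ?", (101,)
).fetchone()[0]

print(duplicate_key, dangling_reference, null_key)  # True True True
print(item_count)                                   # 4
```

All three hypothetical rows are refused, so the contents of the tables remain exactly as in Fig. 8.3.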
We should also distinguish between a data model, such as the relational data model, and a data schema or data system model, such as a relational schema. The
relational data model is an architecture for data, whereas a relational schema specifies the data structures that will be used for some specific application of a data system. There is a convenient shorthand for expressing the data structures in a schema for a relational database—sometimes known as the bracketing notation. The shorthand is as follows: