304 29 10MB
English Pages [212] Year 1997
Deremu
Peco
s
essence
Co
O
Compucunc
r THE
ESSENCE
OF
THE ESSENCE OF COMPUTING SERIES Published titles The Essence of The Essence of The Essence of The Essence of Forthcoming The Essence The Essence The Essence The Essence The Essence
Program Design Discrete Mathematics Logic Programming Using C++
titles of Artificial intelligence of Databases of Human-Computer Interaction of Z of Compilers
LHe ESSENCE
OTF
PROGRAM
DESIGN
Douglas Bell lan Morrey and
John Pugh
Ut Prentice Hall NEWYORK LONDON MADRID SINGAPORE
TOKYO SYDNEY TORONTO PARIS MUNICH MEXICOCITY
First published 1997 by Prentice Hall Europe — Campus 400, Maylands Avenue Hemel Hempstead
UK
Hertfordshire, HP2 7EZ
A division of Simon & Schuster International Group~ © Prentice Hall 1997
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form, or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior permission, in writing, from the publisher. Typeset in 10/12 pt Times by Photoprint, Torquay Printed and bound in Great Britain by Hartnolls Limited, Bodmin, Cornwall
Library of Congress Cataloging-in-Publication Data
Available from the publisher British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library ISBN 0-13-367806-7
lel eos 4:5
01
00
99
98 97
This book is for our children: Alice, Tom, Beth, Chloe and Gareth.
They made the effort worthwhile.
4
.
=
-
*
"
Serema
Users’.
Copy - SCé
Contents
Foreword Acknowledgements Preface
x Xll Xili
1 Introduction What is design? 1.1 Why do program design? 1.2. 1.3. A variety of methods 1.4 The range of application areas 1.5 Assessing methods Summary 1.6
1 1 2 3 DS 4 6
2 Structured programming
7
2.1 2.2 2.3 2.4 2.5 2.6 29 2.8
Introduction Arguments against goto Arguments in favour of goto Selecting control structures What is structured programming? Summary Exercises References and further reading
3 Functional decomposition 3.1.
Introduction
3.2 3.3.
An example — a video game More about the method
3.4 3.5 3.6 3.7 3.8 3.9 3.10
What about data? Alternative solutions The place of functional decomposition Discussion Summary Exercises References and further reading
7 8 12 14 16 19 19 2)
22 De 98 Da
28 28 29 30 SZ 32 ou,
viii
CONTENTS
4 The Michael Jackson program design method Introduction A simple example Processing input files Physical and logical structures Processing several input and output streams Serial file processing and correspondence Structure clashes . Discussion and evaluation Summary Exercises 4.11 References and further reading 5 Data flow design Introduction An analogy First examples A second example Rationale for the method Another example — monitoring a plant Discussion and evaluation Summary Exercises onl References and further reading 6 Object-oriented design Introduction OOP concepts Objects and encapsulation Polymorphism and dynamic binding Classes Inheritance and subclassing Class libraries Summary of OOP concepts Introduction to OO design CRC design Refining a design Designing is programming and programming is designing OO design methodologies Discussion and evaluation Summary Exercises References and further reading
CONTENTS _ ix
7 Formal methods
Introduction Specification Design Implementation Discussion Summary Exercises References and further reading
119 £9 122 128 133 134 137 137 138
Design notations Introduction Pseudo-code (PDL) Flowcharts Nassi—Shneiderman charts Data flow diagrams Structure charts Class hierarchy diagrams Class—responsibility—collaborators (CRC) lists State transition diagrams Petri nets Summary References and further reading
Design guidelines Introduction Why modularity? Module size Program complexity Global data is harmful Information hiding and abstract data types Coupling and cohesion Coupling Cohesion Shared modules and bottom-up design Summary Exercises References and further reading
10 Review 10.1 Top-down or bottom-up? 10.2 Data or actions? 10.3 The three dimensions of programming 10.4 Programming as modelling
182
x
CONTENTS
10.5 10.6 10.7 10.8 10.9 10.10 Index
The formal approach Design methods — the reality Assessing design methods Discussion Summary References and further reading
Foreword
As the consulting editor for the Essence of Computing Series it is my role to encourage the production of well-focused, high-quality textbooks at prices which students can afford. Since most computing courses are modular in structure, we aim to produce books which will cover the essential material for
a typical module. I want to maintain a consistent style for the series so that whenever you pick up an Essence book you know what to expect. For example, each book contains important features such as end-of-chapter summaries and exercises and a glossary of terms, if appropriate. Of course, the quality of the series depends crucially on the skills of its authors and all the books are written by lecturers who have honed their material in the classroom. Each book in the series takes a pragmatic approach and emphasises practical examples and case studies. Our aim is that each book will become essential reading material for students attending core modules in computing. However, we expect students to want to go beyond the Essence books and so all books contain guidance on further reading and related work. Program design is one of the core activities in computing and this book provides a comprehensive review of a wide variety of program design techniques. It does not advocate any single method but presents a balanced comparison of many methods, giving guidance on their applicability, strengths and weaknesses. This book provides a broad view of the field, suitable for a
first course in program design, or one can read individual chapters to get an introduction to a specific technique. Ray WELLAND Department of Computing Science University of Glasgow (e-mail: ray @dcs.gla.ac.uk)
Acknowledgements
Many thanks to: Jacqueline Harbor and Derek Moseley at Prentice Hall Europe for their encouragement and patience; Ray Welland for his helpful comments and suggestions for improvements to the text.
Preface
What is program design? Program design is about the way that the programmer creates a structure for his or her program. This is separate from the process of writing down programming language instructions. Everyone who has ever written a program knows that there is a stage during programming where you try to sort out whether you need a loop or a test. This is the process where you try to get the important parts of the program in the right order. These are significantly different problems from the detail of making sure that the brackets and semi-colons are in the right places. Program design is about getting the structure right.
Who is this book for? This book is aimed at readers who have had some experience of programming — people who have used one programming language and who have written several small- to medium-sized programs. The language used may be Ada, BASIC, C, C++, COBOL, Pascal, Smalltalk or whatever (in strictly alphabeti-
cal order so as not to show favouritism). This book does not use any particular programming language — or indeed any programming language at all. During programming you will have grappled with the task of trying to ensure that the logic of the program is correct (program design) as well as the task of making sure that the program is grammatically correct (coding). You will have used one design approach. You may have used an explicit method, possibly with some authoritative name. Alternatively, you may have used your creativity and logic in an intuitive, unconscious way to create a design for your programs. Now you want to know something about other approaches to program design.
What this book contains This book describes the varied range of design methods
available to the
programmer. We explain how to use each method. Then we assess each method
xiv
PREFACE
and explain how good or bad it is. Then, at the end of the book, we compare all the different design approaches.
The stages of programming There is much more to programming than writing down instructions in the programming language. There are two major steps in developing a program: 1. 2.
specification implementation
Specification is sometimes known as requirements engineering. It means finding out from the future users of the program what they want from the program. This specification of requirements has to be documented as clearly as possible. Requirements engineering is not addressed in this book. It is common to distinguish the following stages of the implementation phase of programming: 1. 2. 3.
design coding testing
4.
documentation
These are often carried out one after the other. What these steps mean is as follows:
1.
Design is the act of examining the specification and somehow deriving a structure for the program. This stage is the subject of this book. 2. Coding is writing down the instructions in the particular programming language that is being used. 3. Testing is trying to ensure that the program works correctly. 4. Documentation is creating information that will tell someone else how the program works — so that they can fix it! Arguably it is design that is the crucial element in programming. If the design is not right, then all subsequent stages will be flawed, and, particularly, the program will not work (or will not do what the user intended). This list of stages for program development is the classical pattern. But there are other major approaches to development. One example is formal program development, using the rigour of mathematics. Another approach is prototyping. This book discusses these and other methods.
How to read this book This book can be used in several different ways. you can read it from beginning to end. If you technique or notation, you can dip into the book are primarily interested in comparing methods,
If you are brave and energetic, just want to know about one at the appropriate place. If you look at the last chapter.
CHAPTER
1
Introduction
1.1 What is design? Design has been carried out for thousands of years in many areas of human endeavour. We design: @ @ @ @
domestic transport buildings machines
artefacts like beds, cookers and washing machines systems like cars and railway networks like houses and cathedrals like power stations and oil refineries
Historically these different areas of design have developed independently. Each has its own group of experts and its own approaches and methods. The way that someone approaches the design of a washing machine is very different to the way in which someone designs a bridge over ariver. What is common about design in all the different areas? e@
It is creative — before there is nothing, afterwards there is something.
@ @
It uses science when and as necessary. It leads to something that is useful, or pleasing, or both.
What is program design? You, the reader, know what a computer program is, and have probably written some programs. You know that getting the program to work correctly is challenging and is affected greatly by the structure of the program. Program design is both a noun and a verb. To go about program design is to perform the act of devising the structure for a program. A program design (the noun) is the structure of that program. Program design starts with a specification of what the program is to do and ends with the structure of the program.
A program has structure at different levels of detail. The overall structure
consists of a number of functions, procedures or subprograms. The precise name of these units differs from one programming language to another. There is a more detailed structure within each of these units. This structure is expressed in terms of the variables and the actions. The actions consist of combinations of sequences, selections (if statements) and repetitions (loops). Some people deny the existence of design. They say ‘it’s obvious’ what a program should look like and that therefore there is no need to go through any
2
INTRODUCTION
process called design. We disagree. The process of design may be invisible or intuitive, but it is there.
How does the design of computer programs differ from the design of other things? @ There are methods for program design (explained in this book!). @ The product of program design has no concrete existence. We distinguish two types of program design. Architectural or high-level design is about the grand-scale structure of a program. The product of this stage is a large-scale or architectural structure for the software, expressed in terms of modules and their interconnections. Detailed or low-level design is concerned with the detailed structure of each of such modules. Also, low-level design is concerned with the detailed design of small- to medium-sized programs.
1.2 Why do program design? There are several reasons for claiming that program design is important: @ @ @
reliability cost maintenance
Programs are complex. If you compare the complexity of a program with some other designed object — a car, a washing machine, even a computer — the program wins hands down. In consequence, it is very difficult to construct programs that work without failing and that do what is expected of them. There are lots of ways of trying to ensure that a program works reliably, but a key method is design. If a program is designed properly, it should work properly. Another consequence of the complexity of programs is the fact that it takes a long time to write programs and therefore, if someone is being paid to do it, it costs a lot of money. Currently most of the time taken up in developing a program is spent in testing, and particularly debugging. However, if a program is designed right, it will be easier to test and debug. Most useful programs
are modified,
once
completed,
and then for some
considerable time afterwards. This is called program maintenance. Many professional programmers spend a lot of time maintaining programs. It is a demanding and difficult job because they have to find out how the program works, which parts to change and then how to alter the program. A program that has been well designed is easier to modify. Arguably, the act of designing a program is one of the most important stages in program development. It ranks in importance with establishing precisely what it is that the user wants — the drawing up of the specification for the program. If the design is good, then all the ensuing stages will go well — coding, testing, debugging and adapting the program to meet changing needs.
THE RANGE OF APPLICATION AREAS
3
The promise is that if we use a systematic design approach we will create a product that is reliable, meets its needs and is adaptable. If the design approach is poor, then all subsequent stages will probably be difficult and time consuming. What will emerge is a product that does not satisfy its requirements, is unreliable and unadaptable.
1.3 A variety of methods If you talk with people about program design, you will find that there are a number of evangelists about. Such people believe that there is one single method that solves all problems. Moreover, all other methods should be discarded as historical anachronisms. The favoured method used to be “structured design’, but more recently ‘object-oriented design’ has become the flavour of the decade. In this book, we do not have favourites. We try to explain a number of methods and assess them objectively. We believe that each and every method has a place. Some methods are applicable to particular types of application. For example, data structure design is very useful in applications where serial file processing is an important ingredient of the problem.
1.4 THE RANGE OF APPLICATION AREAS Nowadays computers permeate all our lives. But for the purposes of software development, it is useful to identify various application areas. Each area has its own characteristics, which we now review.
Information processing This is the traditional area of data processing that used to be associated with massive mainframes, but which is now more associated with networks of PCs.
The main characteristic of these systems is that they are dominated by file processing. Often the processing that needs to be done on the files is minimal — instead the main work is either searching through files and updating or extracting information.
Knowledge-based systems and artificial intelligence These are fairly novel applications of computing. To meet these applications, information is stored in complex and novel ways. The processing of the information is also distinctive. Both these elements have meant that conventional approaches to design have proved inadequate and therefore new methods are under development.
4
INTRODUCTION
Event-based systems When you press a button on the screen of a GUI (Graphical User Interface) screen, a piece of software has to respond to the demand. This reactive approach is very different from a traditional program, which is proactive.
Numerical programs Numerical programs are characterised ‘by large amounts of computational power using sometimes complex algorithms. The input and output requirements of such programs are usually rather minimal.
Real-time and embedded systems The control systems in a fly-by-wire aircraft must be highly reliable and must respond within predictable times. There is a high degree of parallelism (things that have to be done at the same time). This places particularly stringent demands on the software. Some people argue that nowadays there is no valid distinction to be made between these different categories of program. They cite computer systems, like those in a petrol station, that carry out a whole range of tasks. The computer controls the petrol pumps (real-time embedded system). But the shop also sells sweets and groceries (information system) and provides a convenient interface for the operators (event-based software). The computer also communicates
with
head
office
about
stock
levels
and
sales
(communications
software). We argue that these different types of software are still present within a large, complex software system and that it may well be appropriate to use a variety of design approaches to deal with this diversity.
1.5 Assessing methods The theme of this book is that there are a number of different approaches to program design. It would be easy if there were one single method that outshines all the rest. Unfortunately,
this is not so; different methods
have different
strengths. In order to assess and compare different methods, we use a set of criteria, which we will now discuss. Later, when we look at each method, we will use
this same checklist for discussion.
Special features and strengths Each method discussed in this book has some special feature or claim to fame that distinguishes it from the other methods.
ASSESSING METHODS
5
Weaknesses Most design methods have their Achilles’ heel. For example, it may be that the design method fails to help in any way with particular aspects or stages in the design process. Philosophy/perspective While all programs consist of a symbiosis of data and actions, different approaches to design tend to emphasise either one or the other, at least during some of the stages of design. Each program design method has some theoretical basis or emphasis which underpins it. In this book we articulate this perspective or philosophical basis. Systematic? A systematic method may consist of small, simple steps and therefore be easy to use. As a consequence, it may create good documentation. On the other hand, a highly prescriptive method may stifle creativity and insight. Appropriate applications
Some design methods are particularly successful when applied to particular application areas. Inappropriate applications
Some design methods are of limited use if someone tries to apply them within certain application areas. Top-down? The dominant design method of today is top-down design. The top-down approach to design or to any task means starting with the overall objective that
is to be achieved. This overall task is broken down into a series of simpler tasks. In turn these are broken down into even simpler tasks, and so on. Top-down program design starts by identifying the main features of the program. This topmost element uses some defined, but as yet unconstructed elements. These are the next candidates for construction. They in turn make use of lower-level elements. And so on. The program grows as a tree from its root at the top towards its branches and leaves at the bottom. The main ingredients of the program are considered first; consideration of detail is postponed. For a number of years, this approach to problem solving and design has been regarded as the only tenable approach. There are other approaches, and they have their roles. Good for small-scale design? Small-scale design is applicable to programs that range in length from 1 page or 50 lines up to about 40 pages or 2000 lines.
6
INTRODUCTION
Good for large-scale design? code. When Large programs are programs over about 40 pages or 2000 lines of xity programs are of this size or bigger, the problems of size and comple complex become dramatically different. Some methods are useful in this situation, whereas others cannot cope.
Tool support
Whereas traditionally, programmers used only a text editor and a compiler, most programmers today make the most use of software packages to help them in their work. A software tool can massively assist in developing a program and providing the documentation for subsequent maintenance. Some of the software that is used today goes by the grandiose name of a CASE tool — a ComputerAided Software Engineering tool.
1.6 Summary @ © e @
®
Program design is devising the overall structure of a program. Design is important to achieve high reliability, low cost and good maintainability. A number of different design methods are currently available. The application areas of computers can be classified into a number of areas — real-time, information processing, etc. Criteria can be established for assessing a program design method.
CHAPTER
2
Structured programming
2.1
Introduction
Structured programming forms the basis for all current design approaches. This chapter reviews the principles and practice surrounding structured programming. It seeks to answer questions like: what is the essence of structured programming? What is the role of the goto statement? Why are certain control structures favoured? Structured programming is now part and parcel of most approaches to programming. It is widely accepted that it is not only the best, but the only way to carry out programming. This was not always the case. At one time there was a great debate about structured programming — what it meant and whether it was a good idea. One view of structured programming is that it holds that programs should only be built from three components: sequences (normally written in the order in which the statements are to be executed), selections (normally written as if...then...else), and repetitions (written as while...do). The goto statement is,
by implication, banned. In this chapter we begin by examining the controversy about the goto statement. The outcome of the argument is that goto statements are an irrelevancy; the argument is about something else — good program structure. We go on to explore the significant principles of structured programming. There are various subsidiary questions. If the three structures of structured programming are diagrammed as flowcharts (Figure 2.1), the following characteristics become clear: 1.
The constructs have only one entry and exit.
2.
None of the constructs consists of more than three boxes.
If we visualise any one of the three constructs as they are used, then a third characteristic is evident:
3.
The entry is at the start and the exit is at the end.
Why is it that these characteristics are important? Why are other constructs that have the same characteristics (Figure 2.2) ruled out? We now go on to look at these questions.
8
[oo
STRUCTURED PROGRAMMING
Figure 2.1
Figure 2.2
2.2 Arguments against goto goto statements are unnecessary A theorem proved by Bohm and Jacopini (1966) tells us that any program written using goto statements can be transformed into an equivalent program
that uses only the structured constructs. The converted program will, in general, need additional data items that are used as flags to control the actions of the program. Indeed the new program may look rather contrived; nonetheless, it can be done. On the face of it, therefore, there is no need for programs with goto statements in them. Note, as an interesting side issue, that the theorem does not tell us how to transform the unstructured program, only that it can be done.
ARGUMENTS AGAINST GOTO
9
Experimental evidence Structured programming is well established and widely regarded as the best approach to programming. You might think, therefore, that there would be clear evidence from real software projects that it is beneficial. But this is not so; there are no convincing results from real projects — largely because a carefully controlled experiment would be difficult and expensive to mount. It would be necessary to develop a particular piece of software two ways — once using structured programming and again using ‘unstructured’ programming. All other variables, like the expertise of the programmers, would have to be held constant. The two versions of the software could be compared according to criteria like development time and number of errors. Regrettably, there are no results of this type. However, experimenters have carried out small-scale studies comparing how easily people can understand and debug small structured programs compared with unstructured ones. In a typical experiment, each of a group of subjects is presented with the listing of a program, written in a structured way, and asked a series of questions that are designed to assess their comprehension of the program. The accuracy of the replies and the time taken to reply are both measured. These are measures of the ease with which the program could be debugged or changed. A second group of subjects are given copies of the same program, re-written in an unstructured way. The accuracy and response times of the two groups are compared. The results of these experiments generally indicate that structured programs are superior to unstructured ones. The results of empirical studies are reviewed in the literature given at the end of the chapter. In a review published in 1984, long after the dust had settled on the structured programming debate, Vessey and Weber concluded that ‘the evidence supporting [structured programming] is weak’. This conclusion largely stems from the difficulty of carrying out experiments that give trustworthy results.
Clarity and expressive power Compare the following two equivalent program fragments:
10
STRUCTURED PROGRAMMING
As we read down the first program fragment, we are not immediately sure what are the roles of the label and goto. It could take us some time to read and study the program in order to discover that they are being used to create the repetition of a piece of code. This is made immediately obvious by the while statement in the second program. Worse still, there is a remaining doubt in the first program that there may be another goto, aimed at this same label from some other point in the program. The facilities of a programming language should allow people to describe what they want to do in a meaningful way. If we examine a typical program written using goto we see that the goto statements are used for a variety of purposes, e.g.: ®@
to avoid a piece of code (which is to be executed in different
@ ® @
circumstances) to perform repetition to exit from the middle of a loop to invoke a shared piece of code
When we see a goto, there are few clues that allow us to decide the purpose for which the goto is being used. The alternative is, of course, a unique language construct
for
use
in each
of these
different
circumstances.
These
are,
respectively: if...then...else while...do or repeat...until exit procedure call
It is as if the goto is too primitive an instruction — like a machine instruction to load a register — that can be used in a whole variety of circumstances, but does not clearly convey its meaning in any of them. In summary, the goto lacks expressive power and it is therefore difficult to understand the logic of a program that is written using a lot of goto statements. When we look at a piece of coding, words like while and if give us a strong clue as to what is intended; goto does not.
How many pencils? Suppose we want to read a program in order to understand it, by tracing through it as if we were a computer executing it. Suppose we have available a supply of pencils (or fingers) to help us. The pencils will be used as markers, to be placed on the program listing to point to places of interest. If we are following a simple sequence, then we will only need one pencil to keep track of our position. If we encounter a procedure call, we need two pencils — one to leave at the point of the call (in order to know where to return) and another to proceed into the procedure.
ARGUMENTS AGAINST GOTO
11
If we encounter a while statement or a for loop, then we need an integer, a counter, to keep count of the number of times we have repeated the loop. To summarise, if the program has been written in a structured way, we need: 1. 2. 3.
One pencil to point to the current position. One pencil to point to each procedure call that has been executed but not returned from. A counter for every uncompleted loop.
This may seem like a lot of equipment, but consider the alternative of a program that contains a lot of goto statements. As before, we will need to indicate the position of the current statement. Next, we need a pencil to point at every goto that has been executed. But now, whereas in the structured program we can remove a pencil whenever we return from a procedure, finish a loop or complete an if statement, we can never dispense with pencils; instead we need ever more. The increased number of pencils reflects the increased complexity of the goto program. The real problem becomes evident when we want to refresh our memory about what happened before we arrived at the current point in the program. In ihe program without goto statements we simply look back up the program text. In the unstructured program, when we look backwards we are defeated as soon as we reach a label, because we have no way of knowing how we reached It.
Ease of reading (static and dynamic structures) In the Western world we are used to reading from left to right and top to bottom. To have to begin by reading forwards and then to have to go backwards in the text is rather unnatural; it is simpler if we can always continue onwards. This is an important feature of a structured program; it can always be read from top to bottom — provided it has no procedures. The exception to this rule arises in comprehending a while loop, during which repeated references back to the terminating condition at the start of the loop are necessary. Programs are essentially dynamic beings that exhibit a flow of control, while the program listing is a static piece of text. To ease understanding, the problem is to bring the two into harmony — to have the static text closely reflect the dynamic execution. In a structured program, the flow of control is always down the page, which exactly corresponds to the way that text is normally read.
Proving programs correct To prove all programs formally correct is not a practical proposition with present-day techniques. Nonetheless there are some lessons that can be learned from proving.
12
STRUCTURED PROGRAMMING
In one technique of program proving, assertions are made at strategic points in the program. An assertion is a statement of what things are true at that point in the program. More exactly, an assertion describes the relationships that hold between data items that the program acts upon. An assertion at the start and end of a piece of program are called the input and output assertions respectively. Proving consists of rigorously demonstrating that if the input assertion is true, then the action of the program will lead to the output assertion being true. A structured program consists solely.of components that have a single entry and a single exit point. This considerably aids the process of reasoning about the effect of the program. In contrast, it is usually impossible to isolate singleentry, single-exit structures within a program with goto statements in it. Even when formal proof techniques are not being used, but where an informal study of the program is being made, the single-entry and single-exit character of structured programs aids checking and understanding.
2.3 Arguments in favour of goto Deskilling The goto statement is one tool amongst many provided by the programming language. To take it away from the programmer is to deprive him or her of a tool that can be validly used in certain circumstances. Consider a craftsperson who is an expert at making delicate objects from wood. Suppose that we remove from that person a tool that we consider to be inappropriate, say an axe. The skill of making a discriminating selection amongst
the available
tools
is: reduced,
because
the choice
is narrower.
Furthermore, the skill in using the tool is no longer required. (Remember, however, that there may be occasional circumstances in which an axe is the most suitable tool.)
Exceptions Often a program is required to detect an error and to take some special action to deal with it. Suppose that such an error is detected many levels down in a chain of procedure calls. One way of handling the error is to create an additional parameter associated with each procedure call. This approach can become very unwieldy as procedures receive and merely pass on parameters that they do not need to act on. An alternative is for the procedure detecting the error simply to goto a suitable place in the program where the error can be dealt with. This can result in a significant simplification to the program. The essence of the argumen t is that an exceptional situation in the program demands an exceptional solutio n — a goto.
ARGUMENTS IN FAVOUR OF GOTO
13
Some programming languages, such as Ada, have solved this problem using a special mechanism for handling exceptional situations.
Program performance On occasions it is possible to write a program that will run more quickly using goto statements. An example is a program to search a table a for an item x: fori :=1 to tablesize do if a(i) = x then goto found endif endfor notfound:
found:
The nearest we can get to writing this as a structured program is to write: i:=1 while i tablesize then
endif
which requires an additional test. Although both programs achieve the same end — finding (or not finding) the desired item — the steps taken by the two
thereis programs differ. The first (unstructured) program takes fewer steps, and
is no way of writing a structured program that works in the same way. Thus it ones. possible to write goto programs that run more quickly than structured Naturalness
n is Take the table-searching program above. Arguably the unstructured solutio most the best in the sense that it is the solution that solves the problem
14
STRUCTURED PROGRAMMING
naturally. Any transformation of this program, or any other solution, is a distortion of the natural solution. In other words, getting rid of goto statements in existing programs (as can always be done) will sometimes needlessly destroy a good program. The trouble is deciding what is ‘natural’. It surely differs from person to person, depending on individual psychology and cultural experiences. So it is a rather subjective judgement.
2.4 Selecting control structures Rather than take part in a parochial debate about the merits of a particular control structure, let us take a constructive approach. If we had a free choice and a blank piece of paper, what control structures would we choose to use in
programming? Perhaps our first consideration should be to establish the principles that govern the selection. Let us examine the following criteria: standardisation abstraction expressive power orthogonality minimality
We shall see that some of these conflict with each other. Note also that in our examination we are confining ourselves to sequential, imperative programming, in contrast to concurrent or declarative programming (as in logic or functional programming). Standardisation
Domestic appliances exhibit enormous variety, and yet all plug into a standard socket. Similarly, it is desirable to build software from components that all exhibit the same external interface. The simplest interface comprises one entry point, at the start, and one exit point at the end. This has the strength of being consistent with the essence of sequential programming. It also conforms with the important idea of calling a procedure. We are used to the idea of calling a procedure as a sequential step and returning from it to the next instruction in sequence. (We do not, for example, expect to supply a label as a parameter to which control is returned.)
Abstraction
Abstraction is probably the most important idea in structured programming. The human mind cannot devise or understand the whole of a compl ex system.
SELECTING CONTROL STRUCTURES
15
Instead we can only understand one part of the system at a time. Nonetheless it is vital to understand the whole system. The only way to resolve these contradictory statements is to be able to perceive the overall structure in a digestible way. The solution is to use abstraction; the system must be described in a notation that allows subsystems to be seen as black boxes whose task is readily understood, but whose detail is invisible. In programming, the procedure has long been a mechanism that fulfils this role. Other constructs that possess the same single-entry at the start, single-exit at the end, property are if...then...else.... and while...do... .
Expressive power In discussing the arguments against the goto statement, we saw that the goto is too primitive. It has more to do with describing what a machine will do than what a programmer intends. Instead we look at the range of structures on offer, on the one hand tempted to seize every mechanism available, while on the other conscious of the need for minimality. Certainly we need some mechanism for repetition, and either a while statement or recursion will suffice. Many languages provide both, with the addition of repeat...until. Arguably we also require a statement to carry out a choice of actions following a test. The if...then...else construct fulfils this requirement, but others are equally valid, including the case statement. Again, we are torn between expressive power and minimality.
Orthogonality When designing a set of facilities, a good design principle is to create features that are as different from each other as possible. If this is so, we can more
easily satisfy the goal of a minimum number of functions, while at the same time ensuring that the facilities are sufficiently powerful for all our needs.
Minimality The principle of minimality curbs our tendency to include too many facilities.
One of the uses of Bohm and Jacopini’s theorem is that we know that the three control structures are sufficient. (Strictly, a construct for iteration, such as a while, is also unnecessary, because any loop that can be written iteratively can also, in theory, be achieved using only recursion.) Consider the flowcharts of various control structures. Sequence has one box, while has two, and if has
three boxes. There are other control structures that involve only three or fewer boxes; but from amongst them all, these are the minimal feasible set.
16
STRUCTURED
PROGRAMMING
2.5 What is structured programming? It is easy to become engrossed in the arguments about the goto statement, but is this the central issue of structured programming? Can a program that is written using only the three structures claim to be well structured? The answer is no; it is possible to create a bad program structure using the three structures, just as it is possible (with greater difficulty) to create a good structure that uses goto statements. To illustrate why this is so, consider a badly structured program that has been written using many goto statements. If we now transform this into a program that uses the three structures, we still have a badly structured program, since we have done nothing significant to improve it. As a second example, consider a program to search a table for a required item. Two alternative solutions, one structured, the other not, were compared earlier. However, arguably, neither of these is the best solution. Here is another,
in which the item being sought is first placed at the end of the table: a(tablesize + 1) :=x i:=1 while a(i) not = x do i:=it+l endwhile
if 1 = tablesize + 1 then else endif
This is arguably the best of the solutions because it is less complex (the condition in the while statement is simpler) and would execute more quickly on a conventional computer (for the same reason that there is only one condition to test). This example illustrates that the use of the approved structures does not necessarily guarantee the best design. A structured program is essentially one that can be understood easily, and the most useful tool in understanding is abstraction. Abstraction is concerned with identifying the major elements of what is being studied, and ignoring detail. Such is the complexity of software that we have to do this in order to stand a chance of understanding it. Abstraction can only be achieved if the control flow constructs are used in a disciplined way, so that part of structured programming is the avoidan ce of goto statements. For example, consider the program shown in Figure 2.3. We can draw boxes around components as shown. Because it is built from
limited control structures, we can view the program at all levels of detail as
WHAT IS STRUCTURED PROGRAMMING?
17
Figure 2.3
consisting of abstract components that have only one entry and one exit. If we examine any subsystem of the program, it is totally contained — the boxes do not overlap. If we look in detail at its contents, it does not suddenly sprout connections with other subsystems. When we uncover the nested contents of a traditional Russian wooden doll we do not expect suddenly to encounter a doll that affects any of those we have already seen. (Structured programs are, of course, more complex than these dolls; it is as if, when we open a doll, not just one, but several more are revealed.)
Suppose that in the above program we inserted a goto as in Figure 2.4. We have now ruined the structure, since we can no longer view the program at different levels of abstraction. As an analogy compare the problems of understanding aplate of spaghetti as compared with a plate of lasagne. In order to understand the spaghetti, we have
can to understand it all; we cannot employ abstraction. With the lasagne, we
18
STRUCTURED PROGRAMMING
Wma then goto label
Figure 2.4
peel off layers, one by one, uncovering the interesting detail of success ive layers and understanding each separately from the others.
Notice, though, that throughout our discussion we have concen trated almost
exclusively on control structures and have neglected references to data. Although a program that uses the three structures may have abstraction of control, it may well have global data. This seriously weakens the abstraction, since at any level of detail all of the data has to be considered. Thus control abstraction is only one side of the coin; the other side is data abstraction — an important theme that is developed later in this book. The idea of abstraction is a powerful tool in constructi ng software that is understandable. But, by itself, it is inadequate. In order to create programs that are well structured, we need a systematic method that takes us from the statement of the problem to the structure of the progr am. There are a variety of methods, discussed in later chapters. Most modern methods are ‘top-down’. They exploit the idea of abstraction in the following way. Initially an overall,
EXERCISES
19
abstract description of the problem is constructed. This is refined or broken down into several more detailed components. These, in turn, are refined again, and so on until the level of detail is adequate.
2.6 Summary We have recounted the arguments about the pros and cons of goto statements. The arguments against the goto are: @ @ @ @
@ @
It is unnecessary. The experimental evidence. The lack of expressive power. The difficulty of finding out where you came from in a program constructed using goto statements. The need to read backwards in a program with goto statements in it. The difficulty of proving the correctness of programs that use goto statements.
The arguments for the goto are:
@ ® ®@ @
To ban it is to It has a use in goto statements perform well. It is sometimes
deskill programming. exceptional circumstances. are sometimes necessary in order to make programs ‘natural’ to use goto statements.
Much of the debate about structured programming has been conducted in this way. Throughout the arguments, the need for clarity in programs is the dominant idea. The avoidance of goto statements does not necessarily lead to a structured program. The central idea of structured programming is abstraction. Indeed, we can define structured programming as: ‘the systematic use of (control) abstraction in programming’. (Abstraction can be applied not just to control, but to data as well. This is the subject of other chapters.) Abstraction requires not only the use of suitable control structures but also a systematic design method.
NN
2.7 Exercises Review the arguments for and against goto statements and criticise their errs validity. 2 State in your own words what structured programming is. Imagine that you meet a friend who has a personal computer and is used to 1
20
3
STRUCTURED PROGRAMMING
programming using goto statements. Explain in simple, clear language what structured programming is. A while statement can be used instead of an if statement, because the statement: if bool then s endif can be re-written as:
while bool do s bool := false endwhile
4
Show how an if...then...else construct can similarly be re-written using while statements. If the if statement is superfluous, why have it? Convert the following into a structured program. 1 := start loop: if x = ali] then goto found endif if i= end then goto notFound endif i:=itl goto loop
notFound: write “not found’ action] goto end found: write ‘found’ action2 end:
Compare and contrast the two programs. 5S Recursion can be used to accomplish repetition, as in the following example of a procedure to skip spaces in a stream of input information: procedure SkipSpaces read (char) if char = space then SkipSpaces endif endprocedure
6
Convert this example to a procedure that uses a while statement for repetition. Compare and contrast the two solutions. What is the role of recursion in the programmer’s toolkit? Argue for and against providing the following constructs in a programming language: case repeat...until until...do
op
ee
REFERENCES AND FURTHER
READING
21
2.8 References and further reading The theorem that establishes that any program can be written without goto statements is described in: Bohm C and Jacopini G, ‘Flow diagrams, Turing machines and languages with only two formation rules’, Communications of the ACM, Vol. 9, No. 5, pp. 366-71, May 1966. The famous article that launched the debate about structured programming is: Dijkstra E W, ‘Go to statement considered harmful’, Communications of the ACM, Vol. 11, No. 3, pp.147-8, March 1968.
The first textbook on structured programming was: Dahl O J, Dijkstra E W and Hoare C A R, Structured Programming, Academic Press, New York, 1972.
The following book contains a review of the small-scale experiments that have been conducted and that broadly show that programs without goto statements are superior: Shneiderman B, Software Psychology: Human Factors in Computer and Information Systems, Winthrop, Cambridge, MA, 1980. A more recent survey is given in: Vessey I and Weber R, ‘Research on structured programming: an empiricist’s evaluation’, JEEE Transactions on Software Engineering, Vol. SE-10, No. 4, pp.
397-407, July 1984. The following article is an amusing spoof about a suggested ‘come from’ statement. It is an interesting exercise to analyse precisely why this statement is ridiculous. Clark R L, ‘A linguistic contribution to goto-less programming’, Datamation, 1968. Reprinted in Communications of the ACM, Vol. 27, No. 4, April 1984. A vigorous exposition of the argument that structured programming is deskilling is given in: Kraft P, Programmers and Managers, Springer Verlag, New York, 1977.
CHAPTER 3
,
Functional decomposition
3.1
Introduction
As the word functional suggests, this program design method focuses on the functions, or actions, that the program has to carry out. To use the method we first write down, in English, a single-line statement of
what the program has to do. For example, suppose we want to write a program to direct a robot to make a cup of instant coffee. We could begin by writing: make a cup of coffee Then we express the problem solution in terms of a sequence of simpler actions: boil water get a cup put coffee in cup add water add milk add sugar to taste
Next we take each of these statements and write down what it does in greater detail. This process of decomposition or refinement is continued until the solution is expressed in sufficient detail. Usually this is when the solution corresponds to the statements provided in the programming language to be used. A statement that is to be broken down may be regarded as a call on a procedure (paragraph, subroutine). The set of more elementary steps into which it is decomposed can be viewed as the body of the procedure. The language that is used to express the solution is called pseudo-code (because it is similar in some ways to programming language code), or program design language (PDL). It consists of sequences of (English) sentence s, each beginning with a verb. It also involves the approved constructs of structur ed programming — if...then...else for selection or comparison and while for repetition. For example, the statement ‘boil water’ above might be refined
to:
AN EXAMPLE - A VIDEO GAME
23
switch on kettle
while water not boiling do whistle a tune endwhile switch off kettle
Again we might refine ‘add sugar to taste’ as: if sugar required then put sugar in cup stir endif We can restrict ourselves just to while and if or we can make use of the repeat, for and case constructs.
3.2 An example - a video game In order to illustrate the use of functional decomposition we will design the software
for a simple video game,
called
‘breakout’.
This account
can be
skipped if you feel that you have a grasp of the method. The game makes use of a keyboard and a crude visual display that can display characters (but no other graphics) at any position on the screen. The screen looks like this: SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
s
s
s
s
SWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWSs WWWWWWWWWWWWwwws
SWWWWWWWWWWWWWWWWwWWww
WWWWWWWWWWWWwws
SWWWWWWWWWWWWWWWWWwWw s
s s
:
oO
s
s s
s
s
g
s
s s
bbbb
Ss s
It portrays a wall, built from ‘w’ characters, a bat (“b’ characters) and a ball (the letter ‘o’). The ball bounces around the screen, moving up, down or diagonally, changing direction conventionally when it hits the wall, the sides or the bat. The player can move the bat left or right by keying ‘I’ or ‘r’ on the keyboard. The objective is to make the ball knock down the wall and so break out.
24
FUNCTIONAL DECOMPOSITION
The ball is out of play and vanishes when it passes the bat. The player iS supplied with a succession of four balls. When these have been used, the best score ever attained is displayed. This completes the specification of the game. Let us refer to places on the screen byapair of (x, y) coordinates and assume that the software can assign a character value to a position on the screen by executing a statement like: screen (17, 22) := ‘‘b”’
Similarly suppose that the value at any position can be tested using a statement like: if screen (x, y) = “‘w’’ then... We will refer to the coordinates of the ball as Xball and Yball. We begin with the overall logic: video game draw sides draw bat at home position while someone wants to play do draw wall play a game display biggest score endwhile
We have relegated detail to statements like: draw sides and we can study and check the design without worrying about detail. We would probably realise that essential steps are missing so that we might amend the above to: video game draw sides
draw bat at home position best score := 0
while someone wants to play do
score :=0 draw wall play a game if score > best score then best score := score endif display best score endwhile
AN EXAMPLE — A VIDEO GAME
25
Now that we are satisfied with this, we choose one of the statements and write down what it does in more detail. An obvious candidate is:
play a game balls left := 4 while balls left > 0 do
play a ball subtract one from balls left endwhile and then:
play a ball create a new ball
while ball is in play do
check ball position move ball move bat endwhile erase ball
Let us now work on the detail of moving the bat:
move bat if key = ‘‘l’’ then move left endif if key= ‘‘r’’ then move right endif which requires: move left
erase old bat if bat position >left wall + 1 then bat position := bat position — 1 endif draw bat and:
move right
erase old bat if bat position < right wall — bat length then bat position := bat position + 1 endif draw bat
26
FUNCTIONAL DECOMPOSITION
This completes a fair amount of the detail concerned with moving the bat. We can leave further detail for the time being and turn to consider more of the detail of moving the ball. Moving the ball is easy: move ball erase ball
Xball := Xball + Xstep Yball := Yball + Ystep screen (Xball, Yball) := ‘‘o’’
but we have to be careful to check whether it has hit something: check ball position check out of play check sides check bat check brick
We will finish the account of this development with just two more pieces: check sides
if ball at either side then Xstep := — Xstep endif if ball at end boundary then Ystep := — Ystep endif check brick
if screen (Xball + Xstep, Yball + Ystep) is a brick then screen (Xball + Xstep, Yball + Ystep) := blank add one to score Ystep := — Ystep subtract one from bricks left if bricks left = 0 then draw wall endif endif
There is still some way to go with this particular design, but this much gives the flavour of the design and of the method. Notice how, throughout, we can check
MORE ABOUT THE METHOD
27
the design without having to take account of a lot of unnecessary detail. Notice the informality of the statements used; we can simply make up new actions as necessary, but still within the discipline of the control structures sequence, if and while.
3.3 More about the method One way of visualising this method is to see that at any stage of decomposition the solution is expressed in terms of operations that are assumed to be available and provided by an abstract (or virtual) machine, like a very high-level programming language. This is the idea of ‘levels of abstraction’. In order to design and understand a solution at any one level of detail it is not necessary to understand anything about how the levels beneath work. We know what they do, but not how they do it. Thus the human mind has a much better chance of understanding the solution and therefore of getting it right. One approach to functional decomposition is to refine a program design level by level. The program structure can be visualised as an upside-down tree — always with a very flat top that grows downwards uniformly. This is the ‘breadth-first’ approach to design; it matches the strict, abstract-machine view of program design, described above. An alternative approach is to pursue one of the branches of the tree, leaving the others stunted until later. (This is called ‘depth first’.) Our motive for doing this might be to explore what seems to be adifficult part of the program. We might thereby hope to gain insights that guide us in refining the rest of the
structure. The informality of a pseudo-code design makes it impractical to verify its correctness in a formal way. However, the use of abstraction facilitates
informal checking. Have all the right steps been included and in the right order? Is the algorithm complete, or is something missing? Because each piece of pseudo-code is written according to structured programming, it has only one entry and exit. It is therefore easier to check. This idea of trying to ignore the detail of lower levels of the design is easier said than done and takes some nerve. Ina sense it is a real test of whether we believe program design is worthwhile and whether we can resist the temptation to rush on to coding pieces of the program. During problem solving, the human
mind undoubtedly makes wide-ranging leaps over lots of aspects of a problem. So it is debatable whether people can actually think strictly in this manner. Whatever the claimed design method, it is more likely, perhaps, that people design a program using some personal, semi-intuitive method and afterwards tidy up the design so that it looks as though it was created in a systematic way.
28
FUNCTIONAL
DECOMPOSITION
3.4 What about data? What is the role of data in the method? It does rather play second fiddle to the actions that have to be taken. The making of the cup of coffee illustrates that by concentrating on what the program should do, considerations of data are ignored, or rather postponed. The idea is that the data emerges during the decomposition as it becomes evident what is necessary and what needs to be done with it. We defer decisions about pieces of data and the structure of information until we have a very clear knowledge of what operations have to be performed on them. When that stage is reached we can design data which is tailored precisely to the operations. There is a possible problem here. Suppose a program is written in this way and later needs modifying. What happens if some new operation on the data is required? Remember that the data was designed with the old operations in mind, so it may not be in a convenient form for the new.
Another question about functional decomposition is whether it is actually possible to delay thinking about data in this way.
3.5 Alternative solutions A major characteristic of functional decomposition is its flexibility, or (put another way) its lack of guidance. This can be seen by looking at an example. Consider a program that is to input 80 character lines of text, consisting of words and spaces, and is required to convert them into 120 character lines, containing as many words as will fit. We might begin like this: initialise while more input lines do input a line process a line endwhile clean up Or we could start:
initialise while more input information do assemble an output line output the line endwhile clean up
THE PLACE OF FUNCTIONAL DECOMPOSITION
29
Or a third start might be: initialise extract first word while more words do
dispose of word extract next word endwhile
clean up Each of these three solutions could be refined to arrive at a different program. According to functional decomposition they are all reasonable. The method gives us no guidance as to which of the three is best. The only way to find out is to refine all three solutions (along with any others) to a level of detail at which we can make a reasoned choice between them. The choice might be made on the basis of clarity and simplicity, performance or some other criteria. But functional decomposition, by itself, gives no guidance for selecting the best solution. Because there is always more than one solution to a problem — more than one design that meets the specification — the user of functional decomposition is duty bound not just to find one solution, but to find the best solution. The
criteria for the ‘best’ solution will differ from program to program — sometimes it will be maximum clarity, sometimes it will be optimum performance. If in pursuing one design it turns out to be unsatisfactory, we can re-trace the design level by level, considering alternatives at each level. It will often prove desirable to replace whole subtrees in this manner. This process is largely under control because we know which part of the structure we are looking at, and it has only one entry and exit point.
3.6 The place of functional decomposition Arguably functional decomposition was the first truly systematic method for program design. It is associated with the names Dijkstra and Wirth. It evolved hand in hand with the ideas of structured programming in the 1960s. Since then other methods that claim to involve structured programming have been devised, so that nowadays functional decomposition is just one variety of structured programming. There is often a lot of confusion about the terminology surrounding structured programming and top-down methods. Let us try to clarify the situation. Functional decomposition is a ‘top-down’ method, since it starts with the overall task of the program. But it is not unique in this respect. Functional decomposition is also called refinement of function.
‘stepwise refinement’,
though it is, of course,
30
FUNCTIONAL DECOMPOSITION
Unlike some other methods, this one has not been packaged and marketed. Perhaps this is because it originated in academic circles and perhaps because it is a method that is not so well defined as others. There is little hard evidence of how much it is used, though it may well be widely used under different
names.
3.7 Discussion The flexibility of functional decomposition means that it can be used in the design of a program for any type of application; it is generally applicable. But because it concentrates on the actions that the program has to take, it is perhaps most useful for problems in which the procedural steps are clearly evident. One such area is numerical computation, like a program to calculate a square root. Another is the control of a sequential process, like a washing machine. Functional decomposition can be used to design the detailed, low-level structure of a program, or it can also be used to design the high-level or architectural structure. Thus it is applicable to both small- and large-scale software. Overall, how good is this method? If we want a completely well-defined method that we can use almost without thinking, then it is inadequate, since its
use requires considerable skill. On the other hand it is an excellent approach if we want a method that guides our thinking but allows us plenty of scope for creativity. In a sense, therefore, the method is not as advanced as some. For
example, Jackson’s method takes the programmer from a description of the structure of the data or files that the program is to act upon, via a number of fairly precise steps, to the program design. By contrast, the use of functional decomposition encourages (some would say necessitates) the use of creativity and imagination in devising program structures. Its use also needs careful judgement in selecting the best structure from amongst any alternatives found. In other words it tends to foster interest in and enjoyment of programming. In summary, functional decomposition is a general-purpose method for program design, based around structured programming, but in allowing the development of alternative designs for the same problem it poses several
unanswered questions: 1. 2.
3.
Where do we get the inspiration for alternative designs? How do we choose between designs?
How do we know that we have arrived at the best possible design?
We have to look to other sources of ideas to answer these questions. These issues have led some to say that functional decomposition is not really a serious method.
DISCUSSION | 31
Evaluation Special features and strengths Provides a powerful approach and notation for program design. Allows full creativity and imagination.
Weaknesses Leaves much to the creativity of the programmer. Philosophy/perspective Top-down, with an emphasis on actions. Systematic? This is not a systematic method. It does not involve well-defined However, it does proceed top-down, making full use of abstraction.
steps.
Appropriate applications Functional decomposition concentrates on actions. So it is most appropriate where these are obvious and central. These are, in fact, most applications —
including event-driven, mathematical and process control.
Inappropriate applications Because this method concentrates on actions and ignores data, it is probably a poor tool in designing programs that process data or files that are rich in
structure. Top-down? Yes — this method is centrally top-down.
Good for small-scale design? Yes, functional decomposition is very suitable for the design of small- to medium-sized programs.
Good for large-scale design? Yes; Tool support available. A This method is an informal method, and no commercial tools are
tools standard text editor is all that is really needed. It is possible to construct that check the syntax of pseudo-code and prettyfiers to convert pseudo-code into a standard and readable layout.
32
FUNCTIONAL
DECOMPOSITION
3.8 Summary Functional decomposition proceeds by starting with a single, grand statement of what the piece of software is to do. This is next re-written as a series of simpler steps using pseudo-code (program design language) as a notation. Pseudo-code consists of English imperative sentences written as sequences, with if statements or with while statements. The designs are refined (re-written as more primitive steps) until the required level of detail is achieved. The method makes direct use of the power of abstraction provided by structured programming, while requiring significant creativity and judgement to be employed. It is applicable to the full range of computer application areas.
3.9 Exercises 1 2
Complete the design of the video game program. Use functional decomposition to design the software for each of the following: (a) (b) (c) (d) (e) (f)
a microcomputer-controlled washing machine a system to control a power station software to calculate and print an organisation’s payroll a general-purpose operating system or DBMS a system to monitor and control medical equipment a mathematical calculation
3.10 References and further reading Arguably the most important book about structured programming is: Dahl O J, Dijkstra E W and Hoare C A R, Structured Programming, Academic Press, London, 1972. Unfortunately, the above doesn’t spell out the method as clearly as it might. A slightly more comprehensive treatment is in: Wirth N, ‘Program development by stepwise refinement’, Communications of the ACM,
Vol. 14, No. 4, April 1971.
CHAPTER 4
The Michael Jackson program design method
4.1
introduction
Starting with the specification of a program, this method, via a series of steps, leads to a detailed design, expressed in pseudo-code. The method is variously called the Michael Jackson program design method (after the name of its inventor), Jackson structured programming (JSP), and data structure design. The basic idea behind the Michael Jackson method is that the structure of a program should match the structure of the file or files that the program is going to act on. To get a feel for how this can be done, let us look at a few simple examples. First, suppose we want a program to input a set of numbers, terminated by a negative number, and to calculate and output their sum. Here’s some sample input data:
29 67 93 55-10 With some experience of programming, we can, of course, immediately visualise the structure of this program. Its main feature is a while loop. But another way of looking at the situation is to see that because there is a repetition in the input data, there should be a corresponding repetition in the program. We have gone from the input data structure to the program structure. Consider a program that is to print a report. The report is to appear on a number of pages. Each page has a heading, a series of ordinary lines and a summary line (Figure 4.1). Forget, for the time being, the structure of any input data. Again, with some experience of programming, we can visualise that we will need in the program statements to print a heading, print a detail line and so on. But we can also see that within the program we will need a loop to deal with the printing of a number of pages. We will also need a loop to deal with printing the set of detail lines on each page. The statement that prints the page heading would precede the loop that prints the detail lines, and so on. This program structure that we
are describing matches the structure of the report. What we have done is to
derive the structure of the program from what we know about the structure of the report.
34
THE MICHAEL JACKSON PROGRAM DESIGN METHOD
Page break
Heading
}Ordinary lines Summary line
Figure 4.1
We will return to these examples later, showing how the Jackson method treats them.
4.2 A simple example Let us consider the design of a program to draw the following on a VDU screen: * xk
*
KkKKK*
KKKK*
week
The first step in the method is to analyse and describe the structure of the information that the program is to process (in this case produce). The product of this step is called a data structure diagram. The diagram for the picture is given in Figure 4.2. In English, this reads:
@
©
The picture consists of the top half followed by the middle, followed by the bottom half. The top half consists of a line which is repeated. The bottom half consists of a line which is repeated.
In general, the diagrammatic notation has the following meaning: Consists of: A line drawn downwards below a box means ‘consists of’. Thus Figure 4.3 shows that A consists of B. Sequence: Boxes at the same level denote a sequence. Figure 4.4 shows that A consists of B followed by C.
A SIMPLE EXAMPLE
Figure 4.2
2 Figure 4.3
Figure 4.4
35
36
THE MICHAEL JACKSON PROGRAM DESIGN METHOD
Figure 4.5
Repetition: An asterisk in a box signifies zero or more occurrences of the component. Figure 4.5 shows that A consists of B repeated zero or more times. Notice that the data structure hierarchical fashion.
is decomposed
(or refined) in a top-down,
Having now described the data structure, the next step is to convert it into a
program structure. This is easy, because, remember, the program structure must correspond to the data structure. So all we have to do is to write ‘process’ in every box of the data structure diagram. We thereby obtain a ‘program structure diagram’. For our program this is shown in Figure 4.6. A program structure diagram like this is interpreted as follows: @
The program as a whole (represented by the box at the top) consists of (lines leading downwards) a sequence of operations (boxes alongside one another).
Process
picture
Process
Process
top half
middle
Process
bottom
half
Process
Process
line
line
Figure 4.6
A SIMPLE EXAMPLE
@
37
Sometimes a program component is to be repeatedly executed (an asterisk in the box).
The next step is to write down operations that the program will os well-defined part of the method these actions should be. For the
(in any order) a list of all the elementary have to carry out. This is probably the least — it does not tell us how to determine what program we are working on they are:
write n asterisks write a blank line
write s spaces increment s decrement s increment n decrement n Oke Oe initialise n and s For
later
reference,
we
number
the
operations,
but
the
ordering
is not
significant. Next each of these operations is placed in its appropriate position in the program structure diagram. For example, operation 2 needs to be done once, for the middle of the picture. It is therefore associated with the box labelled ‘process middle’. Similarly, operation 1 is associated with the component ‘process line’ (Figure 4.7). This act of associating operations with positions is not automatic; instead, as
indicated, judgement has to be employed.
Process
picture
(8)
Process
Process
top half
middle
Process ee
half
Process
Process -
Figure 4.7
38
THE MICHAEL JACKSON PROGRAM DESIGN METHOD
Now comes the final step of transforming the program structure diagram into pseudo-code. The method has its own notation for pseudo-code, called schematic logic, which is essentially the same as other program design languages. Expressed in our notation, the structure of our example program is: initialise n and s while more lines do
write s spaces write n asterisks decrement s increment n endwhile write blank line
initialise n and s while more lines do
write s spaces write n asterisks decrement n increment s endwhile
To derive this pseudo-code from the diagram, start at the top of the diagram and write down its elementary operations. Then indent the code across the page and go down alevel on the diagram. Now write down the operations and structures present at this level. Repeat, indenting each time we go down alevel on the diagram. We have now arrived at a program design, capable translated into most conventional programming languages. To sum up, the steps we have taken are:
1. 2. 3. 4. 5.
Draw a diagram showing the structure of the file. Derive the corresponding program structure diagram. Write down the elementary operations that the program will have to carry out. Place the operations on the program structure diagram. Derive the schematic logic of the program. ee
ee
Exercises
1
of being readily
Design a program to draw a pine tree like this: *
*KK
ORK
ee
ee
A SIMPLE EXAMPLE
2
Devise an algorithm (pseudo-code) for a program to display a multiplication table, such as young children use. For example, the table for numbers up to6 is:
i! 2 3 4 5 6
3
39
1
2
3
4
5
1 D 3 4 5 6
2 4 6 8 LO) Ib
3 6 9 2 sy NGS
4 5 8 MO V2 gee Onee2 Oem ZO ZS BE BO).
6 6 Saez Ics SO SN
The algorithm should produce a table of any size, specified by an integer input from the keyboard. Design a program to create this pattern of asterisks:
wkkK*
** x*k* wkkKK* * x
*
kk* kkk
4
5
with a number of repetitions input initially from the keyboard. Design a program to create the following pattern on a VDU screen:
with a number of repetitions input as an integer from the keyboard. JSP only makes use of while; it ignores for. Discussion: why do you think this might be?
re
EEE
40
THE MICHAEL JACKSON
PROGRAM
DESIGN METHOD
4.3 Processing input files Let us tackle this problem: Design a program to input and add up any number of numbers, ending with a negative number. The data structure diagram is given in Figure 4.8, and the corresponding program structure diagram is given inFigure 4.9.
Numbers
Negative number
* Number
Figure 4.8
Process numbers
Process
Process
negative
body
Process
number
*
number
Figure 4.9
PROCESSING INPUT FILES
41
We can write down the elementary operations as: read number add number to sum initialise sum ee print sum
attach them to the program structure diagram and derive the pseudo-code design: sum :=0 read number while number not negative do add number to sum read number endwhile print sum So far there is nothing new here; we have used the method as before, except for
one thing. JSP did not help us to realise that we would need an initial read operation before the loop, followed by another read at the end of each loop. JSP gave us the structure or skeleton in which we could place the elementary operations — it did not give us the fine detail. Here now is a new ingredient of JSP, illustrated by this problem:
A serial file consists of records. Each record describes a person. Design a program to count the number of males and the number of females. The data structure diagram is given in Figure 4.10. What is new here is the boxes with the letter ‘o’ in them to indicate alternatives. Such boxes are written alongside each other, and there may be any number of alternatives. We now derive the program structure diagram as before. After writing down operations and assigning them to the program structure diagram, we arrive at the following pseudo-code design: open file initialise counts while not end of file do read record if record = male then increment male-count else increment female-count endif endwhile print counts close file
42
THE MICHAEL JACKSON PROGRAM DESIGN METHOD
Figure 4.10
realising that the boxes statements.
with alternatives
in them
become
if...then...else
4.4 Physical and logical structures When we go about describing data we don’t always necessarily describe it exactly as it first looks. Sometimes we take a view that is different from its actual (physical) structure in order to view the data according to how we need to process it (the logical structure).
The physical structure of the data describes the structure without considering the processing requirements of the problem (i.e. without considering the particular use to which the data may be put). The logical structure of the data describes the structure of the data in which the processing requirements have been taken into consideration. The following example is used to illustrate the difference between physical and logical structures. A file consists of a series of observation records; each record has the following format:
date max-temp min-temp observer place What we can do is:
1.
Draw a data structure chart describing the above (physical) structure.
PHYSICAL AND LOGICAL STRUCTURES
2.
43
Draw data structure charts describing the logical structure when the processing requirements are: (a) To count the number of days with a max-temp of 24 degrees centigrade. (b) (c)
To count the number of observations made by Jones. To count the number of days the max-temp observed by Jones lies in the range from 18 to 25 degrees centigrade.
Here are the diagrams. Note the different logical views, describing only that which is relevant to the processing requirement (Figures 4.11—4.14).
Observations file
Physical structure
Figure 4.11
Observations
file
Logical structure for count days with max = 24°C
Figure 4.12
44
THE MICHAEL JACKSON PROGRAM DESIGN METHOD
Observations file Count observations
made by Jones * Record
Figure 4.13
Observations file
3
Count days when max observed by Jones was in range 18—25°C
Record
Figure 4.14
Exercises
1
Complete the designs of the three data analysis progr ams, using the data struct ure diagrams given above.
PROCESSING SEVERAL INPUT AND OUTPUT STREAMS
2
3 4
5
45
Design a program to count the number of words in a piece of text which ends with the word ‘amen’. Assume the existence of a procedure to input the next word. Do not count the word ‘amen’ in the total. Design a program to calculate the average word length in a piece of text. The text ends with an asterisk. Design a program to count the percentage occurrence of the letter ‘e’ in a piece of text. The text ends with an asterisk. Design a program that acts as a calculator by accepting strings like this:
B47 34 — 7 6. and outputting the answer. The strings consist of any number of integers and operators (+,
6
7
8
—, *, /), evaluated left to right. Assume a perfect
typist, who makes no mistakes. Again, same as above but the operators are the words plus, minus, multiply, divide. (No error checking.) A file describes issues and receipts of stock. Each record describes either an issue or a receipt and includes the stock number and amount. The file is sorted so that transactions describing the same stock item are grouped together. Count the number of stock items and the number of receipts in the file. Design a program to count the number of blocks of information received down a communication line. The program reads one character or byte of information at a time. A block starts with the character F4 (hex), followed by any number of data bytes, followed by F2 (hex). The stream of blocks is terminated by the character F1 (hex). ee
ee
4.5 Processing several input and output streams a single So far we have just looked at programs that process a single input or
streams. output stream. Now we turn to the more common situation of multiple all of the The method is basically the same, except that we will have to describe streams, and make the program structure reflect them all.
An interactive system a screen and Let us consider a rudimentary stock control system that uses ially similar keyboard for interactive transactions. This is a system that is essent s and commerce to many on-line information systems used in banking, airline ing screen generally. Suppose that when the system is switched on, the follow is displayed:
46
THE MICHAEL JACKSON PROGRAM DESIGN METHOD
| |
STOCK CONTROL
| CREATE
EEMi
ta. este tnete
6 3
| delete
item
2.4... 2sese
ce 4
Pisetecstock i. Oia. aes larecedverstock.. 2.1. .uo
ie Jeet
| |
=| |
|
Reqieiy: nV emimrare te 25 olete 50
|
HERO TUCO ne rs sear Metres Sistah ei ots 6
|
| Most of the screen is filled with output, with scope for the user to input a choice from the menu. What we shall set out to do is to draw a single data structure diagram that describes everything that happens on the screen — both the input and the output (Figure 4.15).
The diagram describes the serial stream of data that is output to the terminal, followed by the (serial) stream of input that the user inputs as a response. The two are complementary, and can easily be incorporated into the same, single diagram. We are ignoring, for the moment, the structure of any of the processing concerned with each of the individual menu choices. Note that the diagram implies that there will be a heading, followed by choices, followed by a prompt, followed by a wrong response. So it looks as if the diagram says that there will always be a wrong response. However, the diagram goes on to say that a wrong response consists of zero or more invalid replies — and thus it does allow for an initial correct response. We now go on to write down the elementary operations: clear screen display heading display choices display user prompt accept user response display error message call issue procedure call receive procedure call create procedure ee Oo 10. call delete procedure 11. call query procedure
Notice that we are choosing to postpone consideration of the detailed handling of each transaction type. (We shall do this in a moment.) This is consistent with the idea of restricting our design to the area of curre nt concern — the menu screen. We assume the existence of various routines or subprograms, which we will design later. Now we attach the Operations to the program structure
PROCESSING SEVERAL INPUT AND OUTPUT STREAMS
47
Dialogue
*
Screen
Heading
Choices
Prompt
Response
Invalid
Valid
*
o
Wrong
Junk
Etc.
2
Error message
Figure 4.15
diagram (we’ll omit this step because it’s quite straightforward) and derive the pseudo-code:
while user choice not equal to end do clear screen display heading display choices display prompt accept user choice while user choice is wrong do display error message accept user choice
48
DESIGN METHOD
THE MICHAEL JACKSON PROGRAM
endwhile case user choice 1 : call issue procedure 2 : call receive procedure 3 : call create procedure 4 : call delete procedure 5 : call query procedure end case endwhile
‘
Now we have the final design — but there is one small defect. The overall loop that terminates when the user responds by typing the option 6 may well never be done at all, if by chance the choice has been initialised to 6. The sure remedy is to initialise it to something that will ensure that the loop is done at least once: make user choice = space Let’s go on to look at another part of the stock control system — the part that deals with the create screen. This is sometimes called ‘form filling’:
CREATE SOS Ke mlm
AS Cio
NEW STOCK
ITEM
Sty aaa anceear eee ocr
Ps @ Tiaras sie 5 dees
KG Grate sO
aes Berea ocak el aa
confirm correct another
? (y/n)
new stock item?
eee
eee eee ee ee eee
a)(a) (6 NUMBER We say that directory is of type NAME ->NUMBER. NAME is called the source of the function, and NUMBER is the target. Any function of this type
DESIGN
129
can take as its argument a member of NAME, and return as its result a member of NUMBER.
However, our particular directory is partial in the sense that it
| is only defined for some of the elements of NAME, since not everyone will have a telephone number. The subset of NAME for which directory is defined is called the domain of the function. For example, suppose NAME is the set {Bob, Carol, Ted, Alice, ...}, and NUMBER is the set {12345, 56789, 45678, ...}. Then directory might be defined explicitly as follows: directory(Bob) = 56789 directory(Alice) = 12345 That is, if we apply directory to Bob, 56789 is returned, and applying it to Alice returns 12345. Nobody else is on the phone, so directory(Carol), directory(Ted), etc., are undefined. The domain of directory (written dom directory) is the set {Bob, Alice}, which is clearly a subset of NAME.
Similarly, the range of directory (written ran directory) is the set {12345, 56789}, a subset of NUMBER. Alternatively, the directory function can be defined by explicitly enumerating (as a set of maplets) how elements of NAME are mapped to elements of NUMBER: directory = {Bob +> 56789, Alice +> 12345}
Functions express ‘many-to-one’ relationships. That is, if we model directory as a function, then several people will be permitted to have the same telephone number, but an individual can have no more than one telephone. A function is called total (rather than partial) if its domain is equal to its source. For example, consider the square function, which squares its integer argument. The type of square could be described thus: square: N > N N is the set {0, 1, 2, 3, ...} of natural numbers, and is one of several basic _ types provided by Z. square is total (indicated by the arrow without the bar) because it can be applied to any element of N, not just a subset. Now we can begin to think about a concrete data structure to represent the
components of our system state, shelved and loaned. Two arrays (which we’ll
to be a call Cshelved and Cloaned, the C standing for ‘concrete’) would seem
natural representation: Cshelved, Cloaned : array [1.. ] of CATNO
For example, suppose shelved was the set {cl, c2, c4} and loaned was {c3, c5}. Then Cshelved and Cloaned would look like this:
130
FORMAL METHODS
Cshelved
Cloaned
FBP aA NO WO
The purpose of the variables ns and nl is to mark the indices of the last entries in the arrays Cshelved and Cloaned respectively. For simplicity, we assume the arrays are infinite, but of course we shall only be using afinite part of them, defined by ns and nl. The rest of the arrays are filled with some default value of no interest to us. These arrays are quite easy to model in Z. As usual, we record the the concrete model in a schema: CLibrary
Cshelved, Cloaned : N, ~ CATNO ns, nl: N
Vi,j : 1l.ns @ i#j => Cshelved(i) # Cshelved(j) Vi,j : 1..nl @ i#j = Cloaned(i) # Cloaned(j) Vi: l.ns @ (Vj: 1..nl @ Cshelved(i) # Cloaned(j))
N, is another basic type provided by Z, standing for the set {1, 2, 3, ...}. The arrays Cshelved and Cloaned are modelled as total functions, and ns
and nl are natural numbers. So the example arrays above would be modelled by: Cshelved= {1 +> cl, 2 ++ c2, 3 +> c4, ...} Cloaned= {1 +> c3, 21> ¢5,. ..} ns =3 nl=2
Note that the programming language expression Cshelved[2], which accesses the second element of the array called Cshelved, is equivalent to the function application Cshelved(2) in our concrete model. What does the invariant in CLibrary say? All three lines use a ‘universal quantification’ expression, which has the following general form:
DESIGN
131
V declarations @ predicate This can be read as ‘for all these declared variables, the following predicate holds’. So the first line of the invariant says that if we choose any two different values of i and j in the range 1 to ns, then Cshelved(i) and Cshelved(j) will also be different. This is to ensure that there will be no repeated catalogue numbers in Cshelved — a fact that can be taken for granted in our abstract model, because sets can’t have repeated elements. The second line makes a similar assertion about Cloaned. The third line restates concretely the invariant from our original abstract version of the specification — that Cshelved and Cloaned cannot have any elements in common. Notice how this line already points towards an implementation as a nested for loop. The abstract model recorded in Library and the concrete model recorded in CLibrary are clearly related. We describe this relationship formally by means of a ‘retrieve’ schema, which states how the abstract model can be retrieved
from the concrete model: __ Retrieve
Library CLibrary shelved = ran (1 .. ns cl, 2+> c2,
3
4, 2 -})
= (cl, C2, C4}
1 .. 3 is the same as the set {1, 2, 3}.
Now we are in a position to write the new, concrete Borrow operation:
1382
FORMAL METHODS
CBorrow
ACLibrary book? : CATNO
fi: 1 .. ns @ book? = Cshelved(i) ns'=ns — 1 Cshelved' = squash({i} book?}
ACLibrary gives us access representation of the library, borrow. The first line of the with that catalogue number quantification’ expression:
to the before and after states of the concrete and, as before, book? is the book we wish to predicate establishes that there is indeed a book in the Cshelved array. It uses an ‘existential
4 declaration @ predicate which can be read as ‘there exists at least one value of the declared variable such that the predicate holds’. Thus the predicate is true only if there is a value of i in the range 1 to ns such that book? is equal to Cshelved(i). For example, if: Cshelved
=.{1 > cl, 2:
¢2, 3 =
e454}
ns = 3 and book? = c2, then the value of i which satisfies the predicate is 2. The < symbol is the domain. subtract operator. The expression:
sf describes the function which results from subtracting the set s from the domain of the function f. So, for example, if:
Cshelved= {1 +> cl, 2+>¢2,3
> c¢4,...}
i=2 then
Cshelved'
= squash({i} < Cshelved) = squash({2} < {1 +> cl, 21> ¢2,3 > 4, ...}) = squash({1 => €1,°3 +> ¢4, . . 3) After the domain restriction, the resulting function models an array with a gap, or a missing value, in position 2. Fortunately, Z provides us with a squash operator, which re-sequences the indices: squash({1'r> cl, 31> ¢c4,.. VS {lS icl, 0 C4. se)
So Cshelved' describes the array which results from shuffling the elemen ts in
order to close the gap.
IMPLEMENTATION
133
Finally, we have to add the borrowed book to Cloaned. Here, we make use of the fact that a function is a set of maplets, and we can therefore legitimately use set operators such as U to create new functions from old. For example, if: Cloaned= {1 > Fi) book? = ¢c2
c3, 21> 5, ...}
then nli— 3
Cloaned'
=Cloaned U {nl' +> book?} = 1it. 3 22 FCS, oh leah = ftloe>.03:, 25> C553 > C2, nad
e2}
Intuitively we can see that CBorrow is a correct refinement of Borrow. But this isn’t enough — we must formally prove that this is the case. To do this, we must show that: @
®
The abstract precondition of Borrow is equivalent to the concrete precondition of CBorrow — or in other words, whenever Borrow can be legally invoked in some abstract state, CBorrow can be invoked in the corresponding concrete state. The final states produced by Borrow and CBorrow are equivalent. In other words, any concrete state that can result from invoking CBorrow could also be produced (in its abstract form) by Borrow.
We would have to show this ‘operation correctness’ for all the specified operations, i.e. including our (as yet unwritten) refinement of the Return operation. Such proofs are non-trivial and outside the scope of this book. Suffice it to say that once these checks are complete, we can have confidence
that the concrete model is indeed equivalent to the original abstract model, and we can move towards an implementation based upon it. The reader may be becoming uneasy about the lengthy (and difficult) design procedure we have described, and the obvious practical difficulties it would
entail in the real world. Such uneasiness is well founded; we shall return to
these concerns in Section 7.5.
7.4 Implementation At the end of the design phase we should have a specification whose state is expressed in terms of data types which have a direct counterpart in the collection of data structures provided by the target programming language. The
1384
FORMAL METHODS
next task is to turn the operation specifications into program code, either by constructing algorithms by inspiration (much as programs are traditionally written from natural language specifications) and then formally verifying the equivalence of specification and code; or, better, by transforming the concrete specification gradually into code using transformation rules which are guaranteed to preserve the correctness of the original specification. This latter approach is known as operation refinement, and has the additional benefit of ensuring that the code is produced in small, manageable chunks rather than as a monolithic whole, thus minimising the potential for error. From a given concrete specification, an operation refinement system has to be able to introduce the usual programming constructs (such as_ blocks, procedures, local variables, assignment statements, and selection and iteration
clauses) in order to implement the specified system. Most current refinement systems are based upon expressing algorithms in a so-called guarded command language, which is slightly different from the more familiar programming languages such as C++ and Pascal. The details of this approach are outside the scope of this book (which is primarily about design rather than implementation). In fact, the reader may find it difficult to see how the concrete
specification CBorrow could be immediately transformed into code, although most practising programmers could very easily produce (by inspiration) code which implemented the functionality of the operation, based on an understanding of its meaning rather than its precise structure. Refinement is actually an iterative process, in that there may be many concrete specifications produced en route to the least abstract one that is used to generate the implementation. Thus CBorrow would need further refinement to take it closer and closer to a version which mirrored typical iterative array accessing constructs. The use of the word ‘design’ in the above is perhaps slightly different from its conventional use in a non-formal approach to software development. In formal methods, the design phase is defined to be that process which produces a concrete design from an abstract specification. This is the phase we have concentrated on here. Operation refinement is then used to produce an implementation from the concrete design — a process analogous to coding from a design produced by one of the other methods in this book.
7.5 Discussion What are the benefits to be gained from expressing a specification mathematically rather than in English? The first thing to note is that the abstract specification makes no recommendations as to appropriate data structures for the implementation. Although the library was originally modelled as two disjoint sets, the programmer who bypasses the design (refinement) stage may choose to implement it using sequential or random access files, as a linked data structure, as a simple two-dimensional array, or in some other way. The choice
DISCUSSION
135
is open and unconstrained. In this respect, the notation has forced abstraction upon us; this is an important aspect of formal specification. In addition, formal specifications provide a sound basis for inspection and analysis during regular review meetings. Some devotees also argue that formality releases a specifier from any emotional attachment to his or her work, and this relaxes participants who are called upon to defend their designs. We can think of the specification as a contract between those who wish to use the program and those who have to construct it. For the user, the contract states the properties which he or she would like the program to exhibit; for the implementor, it states the behaviour of the program which must be built. In addition, a validator may act as the representative of potential users — his or her concern is that the specification properly embodies the requirements; a verifier’s task would be to show the correctness of the implementation with respect to the specification. For the purposes of this chapter, it is enough to reiterate that a formal specification is more concise and less open to misunderstandings than one written in a natural language — although clearly a sound understanding of discrete mathematics is a necessity for those intending to use the specification. Moreover, the respectability of mathematics as a precise notation with a long and well-established underlying theory inevitably inspires confidence in its rigour and expressive power. One view is that the formal specification can merely replace the traditional English specification, in which case the programmer’s task is the same as it always was: to use the specification as a description of the problem (a statement as to what is required) and to construct a realisation (an implementation prescribing how it should be solved). The program which implements the specification may then be constructed in the usual way by relying on the programmer’s skill, experience and understanding of the specification. Alternatively, as we saw in Sections 7.3 and 7.4, the program may be systematically refined from the specification using correctness-preserving rules and procedures. It is this systematic approach to the entire development process, and the:flexibility allowed, that has made formal methods attractive for real industrial applications. Formal specification has now been established in the computing industry as a viable and productive way to improve the effectiveness of the software development process, and to increase a user’s confidence in the correctness of the software produced. The phases which involve formal proof, while offering
obvious potential benefits, are currently too labour intensive to be widely used. However, there are other techniques for specification validation which are more
accessible to the less mathematically
inclined developer. For example, a
specification can be animated by re-writing it in a suitable executable language
(thereby producing a rapid prototype), and using this executable version to explore the specification’s behaviour, possibly with the sponsor’s participation. Special-purpose tools are currently being developed to support this activity. If formal methods are to fulfil their promise, the construction of such integrated
136
FORMAL METHODS
tools is considered vital to the eventual acceptance of a formal software development scenario. Evaluation Special features and strengths Can produce provably correct software — programs that we know will fulfil their
sponsor’s requirements because they have been systematically derived from validated formal specifications using correctness-preserving procedures. Weaknesses The above is somewhat idealistic. In practice, the design process is very labour intensive, and often difficult even for mathematicians.
Philosophy/perspective Based on the premise that mathematics is a much more precise and unambiguous notation for describing program specifications than a natural language such as English; and that proving is better than testing, in that a program can be shown to be correct for all possible inputs rather than for a carefully chosen subset. Systematic?
Data refinement is essentially systematic in that there is a prescribed procedure to be followed, but effective design still requires experience and inspiration in choosing an appropriate concrete representation. Appropriate applications
All applications, but particularly beneficial in sensitive domains in which program correctness is crucial — such as situations which are potentially life threatening if software errors occur. Inappropriate applications
Applications exhibiting concurrency. Event-driven systems and graphical user interfaces. Top-down?
Formal specifications can certainly be constructed in a top-down manner, and the design phase which follows is inextricably tied to that initial disciplined development. Good for large-scale design? Yes.
EXERCISES
137
Good for small-scale design? Yes. Tool support CASE tools such as prototyping and animation systems, formal specification language editors, and proof support tools are in various stages of development in the computing industry and in academia.
7.6 Summary @ e @ e@ e@
@
Software specifications should admit only a statement of what the problem is, not how to solve it. A specification is a description; a program is a realisation; a verification is a justification. Formal specifications allow the possibility of proving programs correct. Z takes a model-based approach to formal specification. A formal specification can be validated using mathematical reasoning. A formal method for software design uses a correctness-preserving procedure called refinement to transform a formal specification into an implementation. Ideally, integrated tools should support the software development phases of specification, validation, data refinement and implementation.
@
Formal specification has been established as a viable and productive way to improve the effectiveness of the software development process.
I
7.7
Exercises
4
‘Write a sorting program.’ In what way is this inadequate as a specification? A necessary aspect of specifications is discussing them with potential users. The average user may not understand mathematics. Isn’t this a disadvantage? Why might a formal specification still be worth producing? List some non-functional requirements of a computer system to manage a library. For example, how many books should it be able to accommodate? Enhance the library specification by specifying the operation of buying a
5
between a book and the person who has borrowed it? Write CReturn, the refinement of the Return operation.
1 2
3
new library book to add to stock. How might we model the relationship
1388
6
FORMAL METHODS
What would you say were the most important attributes a specification (formal or otherwise) should have for a programmer who (bypassing refinement) has to implement it?
7.8 References and further reading A readable introduction to the mathematics needed for an understanding of formal methods is given in: Woodcock
J and Loomes
M, Software Engineering Mathematics,
Pitman,
London,
1988. This book gives an overview of several approaches to formal specification, using an electronic diary system as a vehicle for comparison: Cohen B, Harwood W T and Jackson M I, The Specification of Complex Systems, Addison-Wesley, Wokingham,
1986.
Not all books on formal specification are easy to read. This is one of the more approachable texts: Wordsworth J B, Software Development with Z, Addison-Wesley, Wokingham, 1992; A comparison of prototyping techniques based on formal methods, together with a description of a practical prototyping system (called EPROS) is given in: Hekmatpour S and Ince D, Software Prototyping, Formal Methods and VDM, AddisonWesley, Wokingham,
1988.
CHAPTER
8
Design notations
8.1
Introduction
The purpose twofold: 1. 2.
of collecting
different design notations
into one
chapter is
To provide a reference summary to the notations. To disconnect the notations from individual design methods.
Some notations can be used outside a particular design method and, indeed, some notations (structure charts, for example) are used within more than one design approach. So it can be helpful to know about notations independently of methods. Remember, also, that a notation is not by itself a method — it is merely a way of assisting the design process and of documenting the end product. Perhaps the most obvious medium in which to express a program design is a natural language such as English — something everyone understands. But we are trying here to express procedural information about how the completed software should perform its task, and consequently precision is vital. There are too many ways in which English could be misused in describing even a simple procedure. So we must resort to notations which constrain us to write only well-structured and unambiguous process descriptions.
8.2 Pseudo-code (PDL) Pseudo-code is an informal textual language for use in designing programs. It
is sometimes called program design language (PDL) or structured English.
An example of the design of a program to search a file, expressed in pseudo-
code, is:
open file read a record while not record sought do read a record endwhile display the record close file
140
DESIGN NOTATIONS
Pseudo-code consists of the following ingredients:
sequences repetitions, usually expressed using while ... do ... endwhile selections, usually expressed using if... then ... endif or if... then... else ... endif procedures Sequences are expressed as a series of actions like this: input a number add to total display on screen
The actions are English sentences, denoting (other than clarity) to govern these sentences. language. Similarly, the tests that appear in loops English, rather than in the strict grammar of
actions. There are no guidelines Thus pseudo-code is an informal and selections are expressed in a programming language, as in:
while not a full stop do input character endwhile Full use of indentation is recommended in PDL. Some people like to write the special words (if, then, while, etc.) in capitals or emphasise them in some other way. Evaluation Special features and strengths Textual and informal. Similar in grammar and meaning to most programming languages. Extensible to include any desired facilities: for example, do ... until. Weaknesses In its present form, unable to describe data.
Philosophy/perspective
That a program consists of a sequential algorithm. Systematic? PDL is purely a notation; it is not a method. Thus it can be used in a systematic manner; equally it can be used in a disorganised or hacking fashion to create a program design.
FLOWCHARTS
144
Appropriate applications
PDL is highly suitable for applications where the actions of the program are important. It is a notation that describes actions. It does not attempt to describe data. Inappropriate applications Applications where the data element modelling.
of the problem is strong, e.g. data
Top-down? Again, PDL is purely a notation; it is not a method. It certainly can be used topdown, as in the method called functional decomposition, described in Chapter 3. Equally, the notation can be used in another way that is not top-down, e.g. object-oriented design.
Good for large-scale design? Naming problems can arise in large-scale designs, and this problem can be exacerbated by the lack of tools which are sensitive to naming conventions. Good for small-scale design? Yes, Tool support Any text editor can be used to create and edit PDL, but PDL involves informal English, which obviously cannot be checked or interpreted. A software tool can
check that matching endif and endwhile statements are present. A tool can also format PDL with indentation to improve its readability.
References and further reading Caine Stephen H and Gordon E Kent, ‘PDL — atool for software design’, reprinted in Tutorial on Software Design Techniques, 4th edition, Peter Freeman and Anthony I Wasserman (eds), IEEE Computer Society, Silver Spring, MD, 1983.
8.3 Flowcharts Flowcharts are one of the earliest notations for program design. They provide a diagrammatic notation for carrying out design and for documenting a program. A flowchart consists of boxes connected by lines. The lines explicitly show t the flow of control through the diagram, and therefore through the resultan program. Shown in Figure 8.1 is a flowchart for a program to input two numbers, add them together and display the result.
142
DESIGN NOTATIONS
Figure 8.1
To read the flowchart, you start at the ‘start’ box and follow the line. Arrows are placed on the lines to confirm the direction to be taken. The meaning of the differently-shaped boxes is as shown in Figure 8.2. There is a final type of box, the decision or test box, illustrated in the flowchart in Figure 8.3 for a program to input two numbers and output the greater of the two. A program design that has repetition in it looks like that shown in Figure
8.4. Describing a loop like this uses a test box and makes use of the explicit flow of control back to the start of the loop. There is no explicit notation for a loop
in flowcharting.
Ce
a
Start and stop
Input/output
Processing
Figure 8.2
FLOWCHARTS
Output a
Figure 8.3
Figure 8.4
143
144
DESIGN NOTATIONS
Figure 8.5
Abstraction can be described simply by writing the high-level action in a processing box, as in Figure 8.5. The main processing step in this diagram, the calculation of the square root, can be diagrammed separately.
Evaluation Special features and strengths
Flowcharts are graphical and explicitly show control flow.
Weaknesses It is very easy to draw a flowchart that violates the discipline of structured programming. Such a flowchart cannot be understood very easily and cannot be translated into several of the modern programming languages. Surprisingly, there is no built-in notation to describe repetition. A loop must be constructed as shown in the above example using a test box. Philosophy/perspective Flowcharts emphasise control flow. Systematic?
There is no systematic method associated with flowcharting.
NASSI-SHNEIDERMAN CHARTS
145
Appropriate applications
Applications where control flow is paramount. Inappropriate applications Applications where data is important, e.g. information systems.
Top-down? Top-down design can be performed if flowcharts are used carefully. Good for small-scale design? Flowcharts have traditionally been used for small-scale program design. Good for large-scale design? Flowcharts do not have the expressive power to describe the large-scale structure of software. Tool support No commercial packages are available specifically for flowcharts, but many drawing packages allow flowcharts to be drawn, saved, edited and printed.
References and further reading Ramsey H R, Atwood M E and Van Doren J R, ‘Flowcharts versus program design languages: an experimental comparison’, Communications of the ACM, Vol. 26, No. 6, pp.445-9.
) 8.4 Nassi-Shneiderman
charts
Nassi-Shneiderman charts are a ‘structured’ form of flowchart. Where flowcharts allow diagrams to be drawn that violate the rules of structured programming, Nassi-Shneiderman charts ensure that a design conforms to structured programming. In its simplest form, a Nassi-Shneiderman diagram describes a sequence of actions. For example, the sequence: open a file read a record display on the screen
is represented Each action in a box to informally, in
in a Nassi-Shneiderman chart as shown in Figure 8.6. is enclosed in a box and the complete sequence is also enclosed show that it is a complete program. Actions are described English.
146
DESIGN NOTATIONS
Open file Read record
Display record
Figure 8.6
If an action is too long or complex to be described in a box, then an overall statement of the action is placed in the box. The detail can be described in another separate diagram. Thus abstraction is provided for. Repetition is described by enclosing the sequence to be repeated in a bracketing box, as in this example (Figure 8.7): open file while not end of file do read record
display record endwhile close file Selection, or choice of action, is represented as follows (Figure 8.8): Open file Not end of file Read record
Display record Close file
Figure 8.7
Read record
ke
=Male
count
_ Closefile ea
Figure 8.8
:
NASSI-SHNEIDERMAN CHARTS
147
read record if male then add one to count of males else add one to count of females endif close file
Evaluation Special features and strengths Graphical, structured. Weaknesses Unable to describe data.
Philosophy/perspective That a program consists of a sequential algorithm that follows the ideas of structured programming.
Abstraction Nassi-Shneiderman charts provide for abstraction by allowing a high-level statement of some operation to be written in an action box. Systematic? Nassi—Shneiderman charts are purely a notation; they are not a method. Thus they can be used in a systematic manner, and their use does tend to encourage a systematic approach. Appropriate applications Nassi-Shneiderman charts are highly
suitable for applications where the actions of the program are important. It is a notation that describes actions. It does not attempt to describe data. Inappropriate applications Applications where the data element modelling.
of the problem is strong, e.g. data
Top-down? Again, Nassi-Shneiderman charts are purely a notation and not a method. They certainly can be used top-down. Equally, the notation can be used in a way that is not top-down.
Good for large-scale design?
No. Nassi-Shneiderman charts have the same limitations as flowcharts.
Good for smail-scale design? iY eS:
148
DESIGN NOTATIONS
Keyboard
Raw
Squared
numbers
numbers
Screen
Figure 8.9
Tool support None available.
References and further reading Nassi I and Shneiderman B, ‘Flowchart techniques for structured programming’, SIGPLAN Notices of the ACM, Vol. 8, No. 8, pp. 12-26, August 1973. Yoder C M and Schrag M L, ‘Nassi-Shneiderman charts: an alternative to flowcharts
for design’, Proceedings of the ACM SIGSOFT/SIGMETRICS Software Quality and Assurance Workshop, November 1978, pp. 386-93. Reprinted in Tutorial on Software Design Techniques, 4th edition, Peter Freeman and Anthony I Wasserman (eds), IEEE Computer Society, Silver Spring, MD,
1983.
8.5 Data flow diagrams A data flow diagram is a graphical notation for describing the design of a program. Such a diagram describes the flow(s) of data through a system, together with the actions that are carried out on it. The simplest data flow diagram is illustrated in Figure 8.9. A program is required to input numbers from a keyboard, square them and display the squared values on the screen. There is a single input flow of data and asingle output flow. There is a single action that is to be performed. You will see that inside every ‘bubble’ or processing box, there is text describing
its action.
These
bubbles
are sometimes
called transformations,
because each of them transforms one data flow into another. Every line or arc has an arrow on it, showing the direction of flow of the data. Every line should be labelled with the nature of the data. A more complex example is shown in Figure 8.10, where a computer is being used to control the barrier on a car park exit machine. Most of a data flow diagram consists of bubbles and arcs. If a static store of data (like afile) is present in the system, it is diagrammed as in Figure 8.11, which shows a program to display the contents ofa file on the screen. In all but the simplest pieces of software, it is usual first to draw an overall diagram of the data flows. Some of the bubbles will represent complex actions. Such a bubble is a high-level or abstract view of an action, and can be re-drawn in more detail as a separate diagram.
DATA FLOW DIAGRAMS
Display message
149
Error
message
Error signal Card Payment
Figure 8.10
File
Screen
Figure 8.11
Evaluation Special features and strengths Emphasises data flow and the actions that are carried out on the flows. Provides a notation for describing the large-scale structure of software.
Weaknesses
Ignores the question of how the transformations are sequenced and controlled (though ignoring this issue at an early stage in design might be regarded as desirable). Philosophy/perspective The flow of data in a program.
Systematic?
use Data flow diagrams are a notation, not a method. Some design methods
data flow diagrams. But none show how to create the diagrams in a systematic way. Appropriate applications All.
150
DESIGN NOTATIONS
Inappropriate applications None. Top-down?
Data flow diagrams are not hierarchical. But they can be used in a top-down manner by first identifying the main transformations in the software and later breaking down the major actions into more detailed data flow diagrams. Good for small-scale design? No. Good for large-scale design? Yes. Tool support
Commercial packages are available to construct, store, display and edit data flow diagrams.
References and further reading Page-Jones M, The Practical Guide to Structured Systems Design, 2nd edition, Prentice Hall, 1988.
8.6 Structure charts Data flow diagrams describe the transformation of data and the order in which these transformations are to be carried out. They do not describe the overall control the completed system should exhibit. The structure chart is a notation for documenting control information. A structure chart consists of boxes and arcs. Each box represents a module — that is, a procedure or operation carried out by the system — and the arcs show the call hierarchy of the modules. For example, shown in Figure 8.12 is a structure chart for a program that inputs data, converts it into some other form, and then outputs the result. The Process module is in overall control. It first calls the Input module, which in turn calls two modules — first one to input the raw data, and then another to convert it. Finally, having received the converted data, the Process module calls the Output module to display the result. Labels can be added to the arcs to show the data which is passed between modules. Thus in the structure chart in Figure 8.13, module a passes data d, to
module b, which then passes data d, back to a. Module a then sends d, to
module
c. There
expressive power.
are other
extensions
to the notation
which
enhance
its
STRUCTURE CHARTS
151
Process
Input P
raw data
Convert
Figure 8.12
Figure 8.13
Structure charts are very closely tied to the ideas of data flow design (see Chapter 5). They provide a graphical representation of the call hierarchy of a collection of modules, each module corresponding to a process in a data flow diagram.
Evaluation Special features and strengths l structure Provides a graphical description of the high-level hierarchical contro . Has wide of, and communication between, the modules in a software system applicability.
Weaknesses not show the Requires some intuition and creativity in construction. Does internal structure of individual modules.
152
DESIGN NOTATIONS
Philosophy/perspective Seeks to convert information flow into program structure, based on the idea that an initial ‘global hierarchy’ of modules can be subsequently refined into more and more detail.
Systematic? Can be derived from data flow diagrams. The process for doing this is fairly methodical, and guidelines and strategies exist which minimise the complexity of module interactions. Such strategies are inspired by the notions of cohesion and coupling (see Chapter 9). Appropriate applications All. Inappropriate applications None. Top-down?
Yes. A structure chart is intrinsically a hierarchical structure. Good for large-scale design? Yes; Good for small-scale design? No. The emphasis is on architectural design. Tool support
Yes. Graphics tools exist for drawing and editing data flow diagrams and structure charts.
References and further reading Page-Jones M, The Practical Guide to Structured Systems Design, 2nd edition, Prentice Hall, 1988.
8.7 Class hierarchy diagrams This diagram describes some of the information about a program that is object oriented. The diagram shows the class hierarchy of the program. It is a treestructured diagram in which subclasses are shown below superclasses. A subclass inherits methods from its superclass. The example (Figure 8.14) shows the class hierarchy of the types in the C programming language.
CLASS HIERARCHY DIAGRAMS
153
Composite types
Numeric
Figure 8.14
This type of diagram should not be confused with a structure chart (which > shows which modules use each other when the program is running). The class hierarchy diagram shows static information about the relationship between
classes. The CRC (Class—Responsibility—Collaborators) list (see Section 8.8) is
more akin to a structure chart in the type of information that it conveys.
Evaluation
Special features and strengths Graphically shows the structure. Weaknesses is hidden. None, within its remit. However, some of the detail of the classes
Philosophy/perspective re. Reflects a program that has an object-oriented (OO) structu
154
DESIGN NOTATIONS
Systematic?
No. A diagram like this is derived from a process that is rather ad hoc. Appropriate applications
Any that can be structured using OO techniques. This tends to be most modern applications. Inappropriate applications Programs that have not been designed according to OO principles. Top-down?
The diagram may not be derived top-down, but it does describe a hierarchy (a top-down structure). Good for large-scale design? Yes, excellent.
Good for small-scale design? No, the notation supports programs that consist of a number of classes and describes the program at the high level of abstraction of classes. Tool support Any package that supports the creation and editing of box diagrams is appropriate. Tools can be written to check the consistency between the program text and the hierarchy diagram. A tool could be written to scan program text and convert it into the class hierarchy diagram.
References and further reading Booch
Grady,
Object-Oriented
Design
Cummings, Redwood City, CA, 1994.
with Applications,
2nd edition, Benjamin/
8.8 Class-responsibility-collaborators (CRC) lists These are used to describe how classes in object-oriented programming (OOP) use each other at run-time. There is one list per class in the program . It describes for the class: @
the class name
@ @
the function of each of the methods provided by the class the names of classes used by the class
In other words, it describes what the class supplies and what it needs.
CLASS-RESPONSIBILITY-COLLABORATORS (CRC) LISTS
155
Class name: car
Responsibilities
Collaborators
* Gas pedal
Engine
¢ Brake pedal
Wheels
¢ Steering wheel
Transmission
° Display & instruments | Body
Figure 8.15
Figure 8.15 shows a CRC list for a motor car. It uses Engine, Wheels, etc.,
and it provides as services to its users brake pedal, gas pedal, steering wheel, etc.
Evaluation Special features and strengths Textual and informal. Can be used by a group of designers in conference.
Weaknesses Informal — no computer verification or consistency checking possible. Philosophy/perspective That a program consists of classes and objects. Systematic?
No — the creation of these lists arises from a fairly ad hoc process.
Appropriate applications
All, provided that they are structured according to OO design. o Inappropriate applications es. principl OO None — except those programs not structured according to
Top-down? Certainly not. The relationships in an OO program hierarchical and therefore neither is this notation.
are
generally
not
156
DESIGN NOTATIONS
Good for large-scale design? Yes. OO design, and therefore this notation, is particularly suited to large programs. | Good for small-scale design? No. The assumption is that the program is sufficiently large to merit construction from a number of classes (and therefore objects). Tool support Any text editor.
References and further reading Beck K and Cunningham W, ‘A laboratory for teaching object-oriented thinking’, ACM Sigplan Notices, pp. 1-6, October 1989.
8.9 State transition diagrams State transition diagrams are a notation for describing the way in which systems change their state depending on the inputs they receive. For example, a simple digital watch may have a number of modes of operation — it could be used to display the time, the date, as a stopwatch, as an alarm, and so on. It may have several buttons which change its current mode. Once ina particular mode, the same buttons may have different effects. A fragment of a state transition diagram describing such a watch is shown in Figure 8.16. Each mode (state) of the watch is drawn as a circle, and the directed arcs from state to state are labelled with the input that causes that change of state. The watch’s normal mode is a simple time of day display. Pushing button 1 puts the watch into its stopwatch mode, showing a zeroed display. Pushing Display time Push 3 Display date
Figure 8.16
STATE TRANSITION DIAGRAMS
157
button 2 starts and stops the stopwatch. When stopped, the watch displays the current elapsed time. Pushing 1 in this mode re-starts the watch, and pushing 2 returns the watch to its zeroed stopwatch state. Other button presses from the display time mode lead to other facilities: displaying the date, setting the alarm, etc. The particular attraction of this notation is its ability to express how the same input can have different effects depending on the history of previous activities. For this reason, state transition diagrams are particularly useful for documenting the sequence of events which are necessary to respond to particular requests in a graphical user interface. Once again, there are many variations on the basic notation described here. Evaluation Special features and strengths Graphical,
concise, easy to understand
notation which is straightforward to
validate and to convert into code. Can be used as both a specification and a design tool.
Weaknesses Requires some intuition and creativity to decide the necessary states and transitions. The notation’s very flexibility is open to abuse. For example, nondeterminism is allowed — two or more arcs labelled with the same input may (in theory) emanate from the same state. Philosophy/perspective Based on the theoretical concept of ‘finite state machines’, and the idea that a
program must produce a sequence of changes of machine states in order to compute a result. Systematic? No particular methodical strategy associated input/transition pairs.
exists
for identifying
and
states
their
Appropriate applications
State transition diagrams are in widespread use, particularly for specifying and designing menu-driven systems. Inappropriate applications
i
an Those applications for which the identification of states and transitions is are inappropriate model. For example, scientific number-crunching applications ill suited to this approach.
158
DESIGN NOTATIONS
Top-down?
Not really — although a high-level design can be constructed subsequently refined.
first and
Good for large-scale design? No. The number of states and transitions can grow very rapidly. Good for small-scale design? NGS:
Tool support
Graphics tools exist for the construction, transition diagrams.
is
execution
and editing of state
References and further reading Jacob R J K, ‘A state transition diagram language for visual programming’, IEEE
Computer, Vol. 18, August 1985. Linger R C, ‘The management of software engineering. Part III: software design practices’, IBM Systems Journal, Vol. 19, No. 4, 1980.
8.10
Petri nets
State transition diagrams such as that for the digital watch above do not allow for the possibility of two buttons being pressed simultaneously. Such concurrent processes require a notation capable of describing the synchronisation necessary to ensure that, for example, one activity may have to wait for another to complete its processing before it can start its own. Petri nets are a graphical notation for expressing the required timing of interactions between different components of a system which exhibits concurrency. Like state transition diagrams, a Petri net is a modelling tool. Shown in Figure 8.17 is a simple example which models the activity of borrowing a book from a library. A Petri net is composed of places (drawn as circles) and transitions (drawn as bars). P, Po, P3, P4 and p, are places, and t,, t, and t, are transitions. p, and P3 are input places of t,, and p, is an output place of t,. Petri nets are marked by putting tokens on places in the net. Once anet is marked, a transition can fire if all its input places contain at least one token. The result of firing a transition is to take one token from each of its input places and place one token on each of its output places. Suppose someone arrives at the library desk to borrow a book. We can represent this event by placing a token on P,. This immediately enables t,,
which fires, removing the token from p, and placing tokens on p, and p,. The
PETRI NETS
159
Person arrives at
library desk
supplies date
issued
Person leaves
library Figure 8.17
person’s chosen book and library card are now awaiting librarian, who supplies the due-back date by placing a token t,, which fires, removing the tokens from p,, p, and p; and p,. The book is issued, t, fires, removing the token from p,
inspection by the on p,. This enables placing a token on and placing tokens
on p, and p;. The borrower leaves the library, and the due-back date is recycled
ready for the next borrower. There is a well-developed theory underpinning the use of Petri nets for modelling concurrent systems, which allows for a formal analysis of the behaviour of a given net. For example, is a given marking (a particular configuration of tokens on places) actually reachable from an initial marking? By formally answering such questions (i.e. by mathematical proof), we are making assertions about the system being modelled, in advance of any implementation. And, as always, the earlier we can demonstrate the correctness of the design, the more chance we have of minimising potential errors in the completed system.
Evaluation Special features and strengths which A powerful graphical notation for specifying and designing systems
exhibit concurrency
and whose
execution
must
take account
of timing
160
DESIGN NOTATIONS
considerations. Systems modelled as Petri nets can be mathematically analysed in order to make predictions about their behaviour.
Weaknesses Requires some
experience
and intuition
to determine
an appropriate
net
structure.
Philosophy/perspective A Petri net is a mathematical model. By making assertions about the behaviour of the net, we are indirectly making similar assertions about the system it models. Systematic? Two alternative approaches have been suggested. In the first, a system is designed using conventional techniques, and is then modelled as a Petri net.
The latter is analysed, and if errors are discovered, the design is re-visited and appropriately modified. The modified design is then remodelled, analysed, and SO On in an iterative cycle until the design is error free. This final design can then be implemented. In the second approach, both the design and the modelling are carried out in terms of Petri nets, and the final stage is to convert a final, error-free Petri net into a working system.
Appropriate applications Real-time and concurrent systems. Inappropriate applications Sequential systems.
Top-down? Not really — although subsequently refined.
a high-level
design can
be constructed
first and
Good for large-scale design? No. The interconnections in the net can grow very rapidly.
Good for small-scale design? Yes; Tool support
Graphical tools exist for constructing, editing and analysing Petri nets.
References and further reading Peterson J L, Petri-net Theory and the Modelling of Systems, Prentice Hall, 1981.
REFERENCES AND FURTHER READING
161
8.11 Summary The above is by no means an exhaustive list of design notations. For example, decision tables provide a tabular representation of actions to be associated with particular conditions. Booch diagrams are a notation for expressing objectoriented designs. Moreover, many of the notations we have considered exist in several variations. For example, control structure diagrams add graphical symbols to pseudo-code or source code to give a diagrammatic representation of a design. Statecharts are an extension of state transition diagrams which allow for the expression of concurrency. And there are many more. Is there a ‘best’ design notation? The answer to this must be no. The fact is,
as we have seen, a particular notation is often best suited to a particular type of application. So if a design notation is to be chosen for a program in aparticular domain, then it is perhaps best to focus first on the notation (or notations) associated with that domain, and then use other criteria to refine the choice. Is
the notation easy to understand and review? Is there a natural transformation from design to implementation? Since the design/implementation process is invariably iterative, is a design expressed in the notation easy to enhance and maintain while preserving its correspondence with the program? And, as aiways, such considerations would inevitably be balanced against a programmer’s previous experience and preferences and predilections for a particular favourite notation.
8.12 References and further reading Booch G, Object-Oriented Design, Benjamin-Cummings, Redwood City, CA, 1990. Cross J H, Sheppard S V and Carlisle W H, ‘Control structure diagrams for Ada’, Journal of Pascal, Ada and Modula-2, October 1990. Harel D, ‘On visual formalisms’, Communications of the ACM, Vol. 31 22N0:49; 1988. Hurley R B, Decision Tables in Software Engineering, Van Nostrand Reinhold, London, 1983.
CHAPTER 9
Design guidelines
9.1
Introduction
This book describes a number of methods for program design. Several methods contend for the role of the single ideal method but no method can claim to lead the designer to an ideal program structure. Given this situation, a number of important guidelines have emerged. Their role is to supplement the design methods in providing guidance during the process of design. In addition, they enable us to make judgements on the quality of a piece of software that has been designed. These guidelines are largely independent of particular design methods. Many of the guidelines described in this chapter centre around the idea of modularity. Modularity is one of the key issues in program design. It is the unobtainable holy grail. Modularity is to do with the structure of software. This structure is the end product of all of the major current design methods such as functional decomposition, object-oriented design and data structure design. If we were asked to design an automobile, we would probably design it in terms of several subsystems — engine, transmission, brakes, chassis, etc. Let us call these subsystems modules. In identifying these particular components for an automobile we have selected items that are as independent of each other as possible. This is the essence of good modularity. The guidelines we shall describe in this chapter help to answer questions like: @ @ @
How big should a module be? How complex is this module? How can we minimise interactions between modules?
Before we embark on these questions, we should identify what a ‘modul e’ and ‘modularity’ are. Usually this is dictated by practical considerations , such as the facilities provided in the available programming language and operating system. There is a variety of mechanisms for splitting software into independent modules, or, expressed another way, grouping together items that have some mutual affinity. In various programming languages, a module is as follows:
WHY MODULARITY?
163
COBOL - subprogram BASIC — subroutine FORTRAN -—subroutine Pascal — procedure or function Modula-2 — module C — function C++ — an object Ada — procedure or package Thus a module is a fairly independent piece of program that typically has a name and some instructions and some data of its own. A module is invoked, or called, from some other module and, similarly, uses (calls) other modules.
In this chapter we use the term module in the most general way to encompass any current or future mechanism for dividing software into manageable portions.
9.2 Why modularity? The scenario is that of software that consists of thousands or even hundreds of thousands of lines of code. The complexity of such systems can easily be overwhelming. Some means of coping with the complexity are essential. In essence, the desire for modularity is about trying to construct software from pieces that are as independent of each other as possible. Ideally, each module should be self-contained, and have as few references as possible to other modules. This aim has the following consequences for nearly all stages of software development.
Architectural design This is the step during which the large-scale structure of software is determined. It is therefore critical for creating good modularity. A design approach that leads to poor modularity will lead to dire consequences later on.
Module design If the architectural design is modular, then the design of individual modules will be easy. Each module will have a single well-defined purpose, with few, clear connections with other modules.
Debugging It is during debugging that modularity comes into its own. If the structure is
modular, it should be easier to identify which particular module is responsible
164
DESIGN GUIDELINES
for the observed fault. Similarly, the correction to a single module should not produce ‘knock-on’ effects.
Testing Testing a large system made up of a large number of modules is a difficult and time-consuming task. It is virtually impossible to test the system as a complete whole, and usually it is carried out in a piecemeal fashion. Thus the substructure of the system is crucial. Maintenance
Maintenance means fixing bugs and enhancing a system to meet changed user needs. This activity consumes enormous amounts of software developers’ time. Again, modularity is crucial. The ideal is to make a change to a single module, with the confidence that no other modules will be affected. Too often, obvious
or subtle interconnections between modules make the process of maintenance a nightmare. There are two other important aspects of modularity, as follows.
Independent development Most pieces of software are implemented by a team of people, often over months or years. Normally each module is developed by a single person. It is vital that interfaces between modules are clear and few. Software reuse
A major current trend is to reuse software from an earlier project. This avoids re-inventing the wheel, and can, potentially, save enormous effort. However, a module cannot easily be reused if it is connected in some complex way to other modules in the previous system. A heart transplant from one human being to another would be impossible if there were too many arteries, veins, nerves, etc.
to be severed and re-connected.
9.3 Module size How big should a software module be? Take any piece of software. It could be constructed in two radically different ways — once with small module s and again with large modules. Assuming that the software has a hierarc hical structure, we can diagram the two alternatives as trees (Figure 9.1).
MODULE SIZE
165
Figure 9.1
If the modules are large, there will only be a few of them, and therefore there will only be a few connections between them. We have a structure which is a tree with few branches and a few, very big leaves. The complexity of the interconnections is minimal, but the complexity of each module is high. If the modules are small, there will be many modules and therefore there will
be many connections between them. We have a tree with many branches and many small leaves. The smaller the modules the easier an individual module should be to comprehend. But if the modules are small, we run the risk of being ~ overwhelmed by the proliferation of interconnections between them. The question is: which of the two structures is the better? A common point of view is that a module should occupy no more than a page of coding (about 40 to 50 lines). This suggestion takes account of the difficulty of understanding logic that spills over from one page to another. A more extreme view is that a module should normally take up about seven lines or less, and in no circumstances more than nine. Those who argue for the ‘magic number’ seven do so on the basis of experimental results from psychology. Research indicates that the human brain is capable of comprehending only about seven things (or concepts) at once. This does not mean that we
can remember only seven things; clearly we can remember many more. But we can only retain in short-term memory, and study as a complete, related set of objects, a few things. The number of objects ranges from about five to nine,
depending on the individual and the objects under study. The implication is that
if we wish to understand completely a piece of program or a piece of pseudo-
code it should only be of this length. Relating lines of code to concepts may be
166
DESIGN GUIDELINES
oversimplifying the psychological basis for these ideas, but the analogy can be helpful. We shall pursue this further later in the chapter. We have already met an objection to the idea of having only a few statements in a module. By having a few statements we are only increasing the number of modules. So all we are doing is to decrease complexity in one way (the number of statements in a module) at the cost of increased complexity in another way (the number of modules). So we gain nothing overall. The argument for small modules is~as follows. We start by asking how a piece of software should be studied. When developing or when reading a program we do not look at the whole program at once. Instead we focus attention on only a single module at a time. Hopefully we can study the logic of that one module in isolation from others. We use the power of abstraction, so that we understand what other modules do without the need to understand how they do it. When we have completed an examination of one module in its entirety we turn our attention to another. At no time should we need to comprehend two or more program modules at the same time. So, we conclude, it is the complexity of individual modules that matters. Small modules can give rise to slower programs because of the increased overhead of subroutine linkage. But nowadays a programmer’s time can cost significantly more than a computer’s time. The question here is whether it is more important for a program to be easy to understand or whether it is more important for it to run quickly. These requirements may well conflict and only individual circumstances can resolve the issue. It may well be better, however,
first to design, code and test a piece of software using small modules. Then, if performance is important, particular procedures that are invoked frequently can be re-written in the bodies of those modules
that use them. It is, however,
unlikely that subroutine linkage will adversely affect the performance of a program. Similarly, it is unlikely that encoding procedures in-line will give rise to significant improvement. Rather, studies have shown that programs spend most of their time (about 50%) executing a small fraction (about 10%) of the code. It is the optimisation of these small parts that will give rise to the best results. Clearly a count of the number of lines is too crude a measure of the size of a module. A seven-line module containing several if statements is more complex than seven sequential statements. The next section pursues this question.
9.4 Program complexity In the early days of programming, main memory was small and processors
were slow. It was considered normal to try hard to make programs efficient.
One effect of this was that programmers
often used tricks. Nowadays
the
situation is rather different — the pressure is on to reduce the development time
PROGRAM COMPLEXITY
167
of programs and ease the burden of maintenance. So the emphasis is on writing programs that are clear and simple, and therefore easy to check, understand and modify. What are the arguments for simplicity?
e@ @
A simple program is more likely to work quickly and then go on working after it is put into service. The programmer who (inevitably) has to locate the bug in our program or else modify it to meet a changed specification will respect us and thank us the more if our design is easy to follow, rather than riddled with clever but incomprehensible tricks.
If we look at the world of design engineering, a good engineer insists on maintaining a complete understanding and control over every aspect of the project. The more difficult the project the more firmly the insistence on simplicity — without it no-one can understand what is going on. Programmers have frequently been accused of exhibiting the exact opposite characteristic; they deliberately avoid simple solutions and gain satisfaction from the complexities of their designs. Perhaps programmers should try to emulate the approach of traditional engineers. So, many designers and programmers today strive to make their programs as clear and simple as possible. A programmer finishes a program and is satisfied both that it works correctly and that it is clearly written. But how do we know that it is clear? Is a shorter program necessarily simpler than a longer one (that achieves the same end), or is a heavily nested program simpler than an equivalent program without nesting? People tend to hold strong opinions on questions like these; hard evidence and objective argument are rare. Arguably what we perceive as clarity or complexity is an issue for psychology. It is concerned with how the brain works. We cannot establish a measure of complexity — for example, the number of statements in a program — without investigating how closely such a measure corresponds with programmers’ experiences. We now describe one attempt to establish a meaningful measure of complexity. An aim of such work is to guide programmers in selecting clear program structures and rejecting unclear structures, either during design or afterwards. The approach taken is to hypothesise about what factors affect program complexity. For example, we might conjecture that program length, the number of alternative paths through the program and the number of references to data might all affect complexity. We could perhaps invent a formula that allows us to calculate the overall complexity of a program from these constituent factors. up The next step.is to verify the hypothesis. How well does the formula match
with reality? What correlation is there between the complexity as computed
understand from the formula and, for example, the time it takes to write or to
the program?
168
DESIGN GUIDELINES
Amongst several attempts to measure complexity is McCabe’s cyclomatic complexity. McCabe (1976) asserted that complexity does not depend on the number of statements in a module. Instead it depends only on the decision structure of the program — the number of if, while, etc., statements. To calculate the cyclomatic complexity of a program, count the number of conditions and add one. For example, the program fragment: ifa=b then c:=d else e:=f endif p:=q
has a complexity of 2. This corresponds to the number of independent paths through the program. Similarly a while and a repeat each count one towards the complexity count. Compound conditions like: if a>b
and c>d then
count two because
this if statement
could be re-written
as two, nested if
statements. Note that the simplest program, consisting only of a simple sequence of statements, has a cyclomatic complexity of 1 — however long it is. If we had two algorithms that solve the same problem we could use the measure to select the simpler. McCabe suggests that if the cyclomatic complexity of a module is greater than 10, then it is too complex. In such a
case, it should either be re-written or else be broken down into several, smaller
modules. Cyclomatic complexity is a useful attempt to quantify complexity, and it is claimed that it has been successfully applied. It is, however, open to several criticisms as follows. First, why is the value of 10 adopted as the limit? This figure for the maximum allowed complexity is completely arbitrary and unscientific. Second, the measure makes no allowance for the sheer length of a module,
so that a one-page module (with no decisions) is rated as equally complex as a thousand-page module (with no decisions). Third, the measure depends only on control flow, ignoring, for exampl e, references to data. One program might only act upon a few items of data, while another might involve operations on a variety of complex objects. (Indirect references to data, say via pointers, are an extreme case.) Finally, there is no evidence to correlate fully McCabe’s measu re with the complexity of a module as perceived by human beings.
GLOBAL DATA IS HARMFUL
169
9.5 Global data is harmful Just as the infamous
goto statement
was
discredited,
so current ideas of
software engineering regard global data as harmful. Before we discuss the arguments, let us define some terms. By global data we mean data that can be widely used throughout a piece of software, and is widely accessible to the modules in the system. In an individual computer program, global data is often declared at the top of the program. By the term local data, we mean data that can only be used within specific procedures; access is closely controlled. In a program, local data is often declared within the procedure that uses it. For any particular piece of software, the designer has the choice of making data global or local. (Of course, data can also be passed around the software by parameter passing.) Here is the argument about global data. Suppose that three modules named A, B, C access some global data as indicated in Figure 9.2. Suppose that we have to study module A in order, say, to make a change to it. Suppose that modules A and B both access a piece of global data named X. Then in order to understand
A we have to understand
the role of X. But, now, in order to
understand X we have to examine B. So we end up having to study a second module (B), when we only wanted to understand one. But the story gets worse. Suppose that modules B and C share some data. Then to understand B fully we have to understand C, and so on. We see that in order to comprehend any module that uses global data we have to understand all the modules that use it.
In general, local data is preferable because: @ @
®@
It is easier to read and study an individual procedure because it is clear what data the procedure is using. It is easier to remove a procedure to use in a new program, because it is a self-contained package. The global data (if any) is easier to read and understand, because it has been reduced in size.
Global data
x
Figure 9.2
170
DESIGN GUIDELINES
So, in general, the amount of global data should be minimised (or abolished) and the local data maximised. However, rules such as this should never be followed blindly — there are always occasions when it is appropriate to break them. For example, in some circumstances it may be preferable to use a global variable rather than pass an enormous structure as a parameter.
§.6 Information hiding and abstract data types Information hiding, data hiding or encapsulation, as it is sometimes called, is a modular way of structuring software. The idea is that for each data structure (or file structure), all of the following: @ @
the structure itself the statements that access the structure
@
the statements that modify it
should be part of just a single module. A piece of encapsulated data cannot be accessed directly. It can only be accessed via one of the procedures associated with the data. Such a collection of data and procedures is called an abstract data type. The classic illustration of the use of information hiding is the stack. Procedures are provided to initialise the stack, to push an item onto the stack top and to pop an item from the top. (Optionally, a procedure is provided in order to test whether the stack is empty.) Access to the stack is only via these procedures. Given this specification, the implementor of the stack has freedom to store it as an array, a linked list, or whatever. The user of the stack need
neither know, nor care, how the stack is implemented. Any change to the representation of the stack has no effect on the users (apart, perhaps, from its performance). The procedures that act upon the stack are conceptually, but not necessarily textually, grouped together with the stack itself. Information hiding meets three aims: 1. Changeability: If a design decision is changed, such asafile structure, changes are confined to as few modules as possible and, preferably, to just a single module. 2. Independent development: When a system is being implemented by a team of programmers, the interfaces between the modules should be as simple as possible. Information hiding means that the interfaces are calls on subprograms which, it is argued, are simpler than accesse s to shared data or file structures. 3. Comprehensibility: For the purposes of design, checking, testing and maintenance it is vital to understand individual modules indepe ndently of others. As we have seen, global and shared data weake n our ability to understand software. Information hiding simply elimi nates these threats.
INFORMATION HIDING AND ABSTRACT DATA TYPES.
171
As another example, suppose that we have to write a text-formatting program. Suppose we find it necessary to assemble a line to be printed by placing in it characters, one by one, until the line is full. We could allow the program to access a buffer directly, e.g. when it needed to place characters in the buffer. We might well end up with a program in which pieces of code are dotted around everywhere. Remembering that putting a character in a buffer will involve updating an index or counter, we might have many references to the same collection of data items. There is plenty of scope for error; a classic mistake would be to reverse the actions of incrementing the index and inserting a character. An alternative structure would be to create three procedures: 1. 2. 3. Now
Initialise the buffer. Place a character in the buffer (printing if it is full). Termination (print any remaining line). none
of the information
about the line — its length, name,
associated
pointer — need be referred to anywhere else in the program. Any change to the representation of the buffer, or to the length of the line, is totally confined to the three procedures. Some programming languages (Ada, C++, Modula-2) disallow any references to this module other than calls to the three procedures. This further helps to prevent errors. Clearly the facilities of the programming language can greatly help to structure software according to information hiding. An additional facility would be for the module to initialise itself, thus eliminating the need for calling the first procedure. As a further example, consider an information system, e.g. an airline seat reservation system, that responds to inputs from an interactive terminal. The commands might be: reserve a seat cancel a reservation
enquire about availability
One view of the program structure would be to see it as two modules: 1. 2.
one that interacts with the user via the terminal, and one that stores and retrieves information from the database.
Each can be seen as a module that encapsulates data, providing access to that data via procedure calls. We might go on to conceive of a third, controlling,
module that makes calls on each of the other two, one after the other. In summary, the principle of information hiding means that, at the end of the design process, any data structure or file is accessed only via certain welldefined, specific paragraphs, procedures or subprograms. Unfortunately, however, not all programming languages allow convenient information hiding.
172
DESIGN GUIDELINES
From small beginnings, the principle of information hiding has blossomed into a major concept in program design and software engineering. It has not only affected language design, but led to completely different views of programming, like object-oriented programming (see Chapter 6).
9.7 Coupling and cohesion . The ideas of coupling and cohesion are a terminology and aclassification scheme for describing the interactions between modules. The essence of the approach is that a piece of software should be divided into modules in such a way that there is a minimum of interaction between modules (low coupling) and, conversely, a high degree of interaction within a module (high cohesion). If this can be achieved, then an individual module can be designed, coded,
tested or amended without the undue complications of having to deal with other modules. Another beneficial effect is that when an error occurs in a module, the
spread of damage to other modules may The diagrams in Figure 9.3 illustrate structure of a piece of software, in each each program there are 20 interactions accesses to data items. In the left-hand
be limited. the ideas. Each diagram shows the case built from four modules. Within — these could be procedure calls or diagram there are many interactions
between modules, but comparatively few within modules. In contrast, in the
right-hand diagram there are few interactions between modules and many interactions within modules. The left-hand program has strong coupling and weak cohesion. The right-hand program has weak coupling and strong cohesion. Coupling and cohesion are opposite sides of the same coin, in that strong cohesion will tend to create weak coupling, and vice versa.
Figure 9.3
COUPLING
173
9.8 Coupling We are familiar with the idea of one module making a procedure call on another, but what other types of interaction (coupling) are there between modules? Which types are good and which bad? First, an important aspect of the interaction between modules is its ‘size’. The fewer the number of objects that connect modules, the better. If modules
share common data, it should be minimised. Few parameters should be passed between modules in procedure calls. It has been suggested that no more than about two to four parameters should be used. Deceit should not be practised by grouping together several parameters into a record and then using the record as a single parameter. What about the nature of the interaction between modules? We can distinguish the following ways in which modules interact. They are listed in an order that goes from strongly coupled (least desirable) to weakly coupled (most desirable).
Altering another module’s code. Branching to or calling a place other than at the normal entry point. Accessing data within another module. Shared or global data. Procedure call with a switch as a parameter. Procedure call with pure data parameters. RSP SU Sh Re Passing a serial data stream from one module to another. Let us examine each of these in turn.
1. Altering another module’s code This is a rather weird type of interaction and the only programming language that normally allows it is assembler. Remember, however, that in COBOL the alter statement allows a program essentially to modify its own code. The problem with this form of interaction is that a bug in one module, the modifying module, appears as a symptom in another, the one being modified.
2. Entering at the side door In this type of interaction, one module calls or branches to another at a place other than the normal entry point of the module. Again, this is impossible in most languages, except assembler and BASIC. In COBOL, however, a program
There is can be constructed from sections, within which are paragraphs. within nothing to prevent one section from performing or going to a paragraph
another section. Again, in some dialects of FORTRAN, a module can be made to have several entry points by the use of the ENTRY statement.
174
DESIGN GUIDELINES
The objection to this form of interaction is part of the argument in favour of structured programming (Chapter 2). It is only by using modules that have a single entry (at the start) and one exit (at the end) that we can use the power of abstraction to design and understand large programs.
3. Modifying data within another module For one module to alter another’s data seems rather less harmful than changing another’s code. However, the objection is the same and the coupling is strong because a fault that appears in one module may be caused by another.
4. Shared or global data Shared data is any data that two or more modules have access to. Global data is usually used to mean a collection of data that is accessible to a large number of — perhaps all — modules. The facility to access data in this way is present in nearly all widely used programming languages. There are several reasons why it is undesirable. First, adding new data to the shared data may cause a name clash with an existing private (local) data item within any one of the modules.
Second,
if the shared data is used to hold
parameters then such items will be subject to the arguments above about altering data within another module. But the main argument against shared and global data has been discussed above — it makes software much more difficult to understand.
5. A procedure call with a parameter that is a switch We have seen that both shared data and unusual transfers of control result in strong coupling between modules. The solution is, of course, to use procedure calls with their attendant parameters. Even so, it is possible to worsen the coupling by passing as a parameter not pure data but an element of control. An example is where a module is passed an indicator telling the procedure which action to take from amongst a number of available actions. (This indicator is sometimes called a switch.) Here is an example of a procedure call on a general-purpose input-output procedure: call io (command, device, buffer, length)
The parameter ‘command’ has values 0, 1, 2, etc., that specify whether the operation is a read, write, open, etc. This is undesirable simply because it is unnecessarily complicated. Such a module can be split into several, each catrying out a single action. As an alternative to calling a single module and ee it a switch, we can instead call the individual appropriate module, like
this:
call read (device, buffer, length)
COHESION
175
We have eliminated a parameter from the interaction and at the same time created well-defined modules, each with a specific function. This contrasts with a single, multi-purpose module. understand and maintain.
Arguably
this modularisation
is easier to
6. Procedure calls with parameters that are pure data Here we have a form of coupling that is nearly ideal. The modules interact in a well-defined manner, suffering none of the weaknesses discussed in the schemes above. In particular, it is quite clear what information is being communicated between modules. Remember, though, that for weak coupling the number of parameters should be few.
7. Passing a serial data stream The weakest (best) coupling is achieved without any transfer of control between modules. This is where one module passes a serial stream of data to another. We can visualise this by imagining that one module outputs information as if to a file, and the second module reads it, again as if from a
file. The important ingredient is that the outputting module has no access to the data, once it has released it. This type of interaction is not widely available in most programming systems and is only to be found in languages that allow software to be constructed from concurrent tasks that communicate by message passing, e.g. Ada. Similarly, in the UNIX system, programs can communicate via ‘pipes’ which again are serial data streams. That this form of coupling is best can be shown using information theory. The conclusion from this discussion on coupling is that the weakest (best) coupling is to be achieved by using modules that communicate by either (1) procedure calls with a small number of data parameters, or (2) passing a serial stream of data from one to the other. One of the problems of this analysis of types of coupling between modules is that, while it gives us a good qualitative feel for interaction between modules, it gives us no quantitative measure of the
interaction.
9.9 Cohesion Cohesion is about unity. How do we group actions together in the best way? How do we group data together in the best way? In the marriage between data and actions, which is dominant and which is subservient, or are they equal partners? Cohesion describes the nature of the interactions within a software module. A scheme has been drawn up for classifying the various types of cohesion. These range from low cohesion (undesirable) at the top of the list, to
high cohesion (desirable) at the bottom of the list. Some of these types of
176
DESIGN GUIDELINES
cohesion are now only of historical interest; current design methods ensure that they just don’t arise. The list of categories is: 1. 2. 3. 4. 5.
coincidental logical temporal communicational functional
,
We will now look in turn at each of the different types of cohesion. In each case our analysis will be based on a statement of what a module will do. We will see that if a module does a mixture of things, then it has poor cohesion. On the other hand, if a module carries out one specific action, then it has good
cohesion. 1. Coincidental cohesion
In coincidental cohesion the components are in the module purely by coincidence. There is no relationship between the components; their coexistence is purely arbitrary. This type of modularity would arise if someone had taken an existing module and arbitrarily chopped it up into modules, say each of one page in length. It would then be impossible to write down a meaningful statement of what each module accomplishes.
2. Logical cohesion In logical cohesion the module performs a set of logically similar functions. As an example, during the design of a piece of software we could identify all of the output activities of the system and then combine them into a single module whose function could be described as: output anything Such a module is clearly multi-functional. It performs any of a range of (output) operations such as:
output text to screen output line to printer output record to file On the face of it such a module is rational, even logical. It seems as if an act of housekeeping has been carried out to collect together logically related activities. Another example of a logically cohesive module is one that is described
by:
calculate
and which carries out any of a range of mathematical calculations (log, sine,
cosine, etc.).
COHESION
177
The problem with a logically cohesive module is that it is multi-functional; it carries out any of a menu of actions, rather than one single well-defined action. It is unnecessarily complex. If we need to modify any one ingredient within the module, we will find it hard to ignore the other elements.
3. Temporal cohesion In temporal cohesion the module performs a set of functions whose only relationship is that they have to be carried out at the same time. The classical example isa set of initialisation operations. Thus a module that carried out the following collection of actions: clear screen
open file Initialise total
would exhibit temporal cohesion. A sequence of initialisation actions like the one above is such a common feature of most programs and systems that it is hard to see how to avoid it. But as we can see from our example, the ingredients within an initialisation module are not related to each other at all. The solution is to make the initialisation module call other, specialised modules. In the above example the initialisation module would be improved if it consisted of the sequence of calls: initialise terminal initialise files initialise calculation
4. Communicational cohesion
In communicational cohesion functions acting on common data are grouped together. For example, a module that displays and logs temperature is carrying out two different actions on the temperature data. A similar example is a module that formats and prints total price. Thus a communicationally cohesive module is described by several verbs and one noun. The weakness of such a module is, again, that it is unnecessarily
complex. 5. Functional cohesion
Functional cohesion is the best type of cohesion. A module with functional cohesion performs a single, well-defined action on a single subject. Thus a sentence that accurately describes the purpose of the module will have only one
verb and a single object that is acted upon by the verb. Here are examples of such modules:
178
DESIGN GUIDELINES
calculate average print result input transaction open valve obtain date
As with the ideas of coupling, if we find that the modules in our software exhibit poor cohesion, the concepts of cohesion do not provide us with a systematic method for improving our structure — they merely tell us how our structure is poor. Another problem with the classification scheme is that it is sometimes very difficult to identify which type of cohesion is present — after all, there does seem to be a very fine line dividing communicational and sequential cohesion. Finally, cohesion is a qualitative assessment of quality, rather than a quantitative measure or metric.
9.10 Shared modules and bottom-up design Experienced programmers are used to the idea of designing subprograms that can be used in a variety of situations, thus saving both effort and main store.
How does this fit in with modern ideas of design? One line of reasoning is as follows. Suppose that we have arrived at a structure in which a lower-level module appears as part of two or more higher-level modules. Expressed in terms of a tree, this is a leaf growing from two or more branches. Two questions arise. First, how was the module designed? Suppose that the module was designed top-down. If a module is shared, from which one of its parents did it emerge? Surely you can’t design two things at the same time? So the shared module must be unsatisfactory; it can’t be completely suitable in more than one context. The second problem concerns testing and debugging. If a fault occurs in a shared module it is not clear how it was called or performed — it might have been along any of a number of branches. So location and correction of the fault is made more difficult. Because of these problems, it can be argued that if any shared modules are to be employed then they should arise only from the following, bottom-up design method. When first thinking about the solution to a problem, ideas arise in the mind. For example, in a particular program, perhaps the use of tables (arrays) strongly suggests itself. Suppose that the available programming language is COBOL. Now, although COBOL supports the idea of tables, it falls short of providing ready-made facilities to add, subtract, multiply and compare tables. Why not augment COBOL’s facilities and implement a set of generalpurpose table manipulation routines? This would make COBOL more suitable for solving the immediate problem, and the routines might well come in useful in the future. We see that the design of these routines has not emerged from a strictly top-down approach, but from bottom-up considerations — starting with an available facility and trying to make it more convenient.
SUMMARY
179
The specification of what shareable, general-purpose subprograms do and
how they are used should be completely independent of any particular context in which the routines might be used. The same should be true of how the routines work, and of the testing they are subjected to. To emphasise their generality, such routines should be placed in a common system library. Examples of such pieces of software are graphics routines, input-output routines, filing system routines, and mathematical routines. There are a number of bottom-up design methods in existence. Here we look at one, most commonly seen method. We start with a picture of a machine (that provides various low-level facilities), and some problem (that we can visualise at a high level). The machine provides features that are primitive, sometimes complex, and not at all in harmony with the problem. We start therefore by taking the bare machine and adding to it a layer of software that we believe will enhance the machine and make it look more convenient for the problem we wish to solve. We have now created a more powerful, abstract machine, built
on top of the bare machine. We repeat the process, designing and building another, higher-level layer on top of the existing layers, and continue in this way until we have constructed a virtual machine that matches the problem well. Hopefully we can now solve the problem using a simple, clear algorithm that is expressed in terms of the facilities we have provided. The reader will see that bottom-up design requires an act of faith on the part of the designer, that suitable facilities have been devised. There is no guarantee that the features that are implemented will turn out to be useful.
9.11
Summary
In this chapter we have discussed a whole range of considerations that can be taken into account during the design of a piece of software. These guidelines help us in two ways:
1.
2.
They us to They piece
help us decide what to do during the act of design, trying to lead programs that are clear, simple and flexible. provide us with criteria for assessing the structure of an existing of software.
Restricting module size is one crude way of reducing complexity. An extreme view is to restrict all modules to no more than seven statements. Various quantitative measures of complexity have been suggested. One
metric is McCabe’s cyclomatic complexity measure, which is based upon the
number of decisions within the module. The principle of information hiding holds that data should be inaccessible ng other than by means of the procedures that are specially provided for accessi the data.
180
DESIGN GUIDELINES
Coupling and cohesion are terms that describe the character of the interaction between modules and within modules, respectively. Coupling and cohesion are complementary. Strong coupling and weak cohesion are bad; weak coupling and strong cohesion are good. Coupling can be categorised into a number of types, ranging from one module altering another’s code to modules communicating by exchanging serial data streams. Types of cohesion can be classified within six categories. Thus coupling and cohesion provide a terminology and a qualitative analysis of modularity. ; Finally, modules that are used in more than one place should not arise from a top-down design process, but instead from a bottom-up approach. Notice that these ideas can be grouped into two areas: 1.
Those that deal with interactions within modules — length, cyclomatic complexity, cohesion.
2.
Those that deal with interactions between modules — information hiding,
coupling, shared modules.
9.12 1 2 3
4
5
6
7
Exercises
What is modularity and why is it important? Argue for and against restricting modules to about seven statements. Write down two different programs to search a table for a desired element. Calculate the cyclomatic complexity of each and hence compare them from the point of view of clarity. What factors do you think affect program complexity? What is wrong with McCabe’s approach to calculating complexity? Devise a scheme for calculating a satisfactory complexity measure. Study any program design that you know about. How are the modules coupled? What forms of cohesion are present? Is information hiding in use? Can the structure be improved? Is there any correspondence between: (a) any one form of cohesion and information hiding? (b) any form of coupling and information hiding?
(If you have read Chapter 3 on functional decomposition) Does functional decomposition tend to lead to modules of a partic ular
of cohesion? If so, which?
8
9
form
(If you have read Chapter 3 on functional decomposition) In functional decomposition, the modules are functionally independen t but they may act upon shared data. Is functional decomposition compa tible with information hiding? (If you have read Chapter 4 on the Jackson method) Does the Michael Jackson method lead to a program structure that exhibi ts any particular
REFERENCES AND FURTHER READING
10
181
types of coupling and cohesion? How does information hiding relate to, or contrast with, the Michael Jackson method?
Consider a programming language with which you are familiar. What types of coupling are allowed? What types are not permitted?
9.13
References and further reading
This is the paper that suggests the small capacity of the human brain when comprehending a set of items as a complete whole: Miller G A, “The magical number seven, plus or minus two; limits on our capacity for processing information’, The Psychological Review, Vol. 63, No. 2, pp. 81-97, March 1956. McCabe’s cyclomatic complexity measure is described in: McCabe T J, ‘A complexity measure’, JEEE Transactions on Software Engineering, Vol. SE-2, No. 4, December
1976.
Two other measures of program complexity are suggested in: Woodward M R, Hennell M A and Hedley D, ‘A measure of control flow complexity in program text’, IEEE Transactions on Software Engineering, Vol. SE-5, pp. 45—S0, January 1979. Halstead M H, Elements of Software Science, Elsevier, New York, 1977.
This paper introduced the idea of information hiding: Parnas D L, ‘On the criteria to be used in decomposing
systems into modules’,
Communications of the ACM, Vol. 15, pp. 1053-8, December
1972
and is reprinted in: Freeman P and Wasserman AI, Tutorial on Software Design Techniques, 2nd edition, IEEE Press, New York, 1985. This article castigates global data. It therefore relates to the idea of information hiding and to common data coupling: Wulf W and Shaw M, ‘Global variables considered harmful’, ACM Sigplan Notices, ' February 1973.
The ideas of coupling and cohesion are discussed at length in this book. There is also treatment of the issue of the optimal size of a module: Yourdon Edward and Constantine Larry L, Structured Design, Prentice Hall, Englewood Cliffs, NJ, 1979.
of This gives a more recent (and perhaps more readable) presentation of the ideas coupling and cohesion: Press, New Page-Jones M, The Practical Guide to Structured Systems Design, Yourdon York, 1980. The issue of shared modules is discussed in: 1975. Jackson, M A, Principles of Program Design, Academic Press, London,
CHAPTER
10
Review
10.1
Top-down or bottom-up?
The dominant design method of today is top-down design. It starts by identifying the main features of the program. This topmost element uses some defined, but as yet unconstructed, elements. These are the next candidates for
construction. They in turn make use of lower-level elements. And so on. The program grows as a tree from its root at the top towards its branches and leaves at the bottom. The main ingredients of the program are considered first; consideration of detail is postponed. Bottom-up design is the exact opposite of top-down. It starts with the lowestlevel elements. The programmer speculates about what low-level elements will be useful in the final program, and constructs them one by one. Next the lowlevel elements are used by elements constructed at the next level up. And so on. The tree grows from the bottom. These days, bottom-up design is often criticised because it starts with the detail rather than the main tasks of the program. It’s like starting to design a bridge by thinking about how big the rivets should be, or planning a meal by deciding whether to use sea salt or ordinary salt. But, undoubtedly, there is a role for bottom-up design. Suppose we wanted to construct a program to play chess. We could well envisage a number of functions that would be very useful. For example: display the board check to see whether there is a piece on a particular square check to see whether a particular move is legal
and these functions would be at the bottom of the tree; they would be constructed in a bottom-up fashion. Some people argue that very often it is difficult to start either at the top or at
the bottom of the program. Instead, some middle part of the program looks like an obvious candidate to start with. The tree starts in the middle. It grows sideways, upwards and downwards. This is called middle-out design. It does
not enjoy a respectable status, but nonetheless practice.
it is undoubtedly
used in
THE THREE DIMENSIONS OF PROGRAMMING
183
The spectrum of design methods surveyed so far looks like this: bottom-up — middle-out — top-down
10.2
Data or actions?
There is another dimension to design. This is to do with whether to base design on action or on data. Functional decomposition is a top-down method that concentrates
on the functions
(or actions) that the program is to perform.
Design starts with the overall main actions that the program is to take. The more detailed actions are constructed in the top-down fashion. Data is only introduced as a by-product, as and when necessary. At the opposite end of the spectrum is design based upon the structure of the data that the program is to process. This is data structure design, Jackson structured programming or JSP. It starts by analysing the structure of the input and output data that the program is to process. The structure of the program is then derived from the structure of the data in a series of well-defined steps. In between the two poles of functional decomposition and data structure design lies object-oriented design. In object-oriented design, the symbiotic relationship between data and action is recognised. Data, together with the operations that act upon it, are bundled together into groups called objects. Design proceeds by identifying the objects in the program and the relationships between the objects. Because of the equal role played by data and action, object-oriented design is arguably a holistic approach to design. The spectrum of approaches to design that views different design approaches according to the relationship of data to action is:
functional decomposition — object-oriented — data structure
10.3 The three dimensions of programming The act of program design is about taking a user’s requirements and somehow meeting those requirements by creating a program. Arguably the task of a program design method is to capture and model the important aspects of a set of requirements. The user’s requirements are usually (but not always!) written in the form of a specification. One way to see program development is that the requirements are somehow converted or transformed into a program design. The particular way in which the conversion is carried out is what distinguishes one method from another.
184
REVIEW
Now a conventional, procedural, program consists of an amalgam of three ingredients: 1. 2. 3.
Some data (numbers, characters, arrays, more complex structures) Some actions upon data (e.g. calculations, input, output, comparison) A control of the timing of the actions (sequences, repetitions, selections)
The timing of the actions controls the order in which actions are carried out, whether they are repeated or not, and whether they are conditionally enacted. Every program must have all these three elements. Somehow the programmer — or the program design method — has got to obtain all of this information. Some of these three ingredients will usually be present in the problem to be solved, the specification. However, often some of this information is not present in the problem, but instead has to be invented by the
programmer. Some approaches to program design boast that they address all of these three dimensions. More realistically, some approaches to program design concentrate on only one of the above three dimensions, to the exclusion of the others. Such a method will tend to be successful in applications where the importance of the dimension is paramount. We now review several application areas of computing and identify, for each area, which attributes are important:
@
@
@
In real-time, or process control, applications, there is often a complex relationship between events; the crucial element is when the events take place. Data is simple or virtually non-existent. In a mathematical or scientific program, the data is often simple, but the algorithmic complexity is enormous. In business applications, formerly called data processing, the data is rich in structure, but the processing requirements are simple.
Looking at some of the design approaches described in this book:
@
@
@
Functional decomposition concentrates on actions and the sequencing of actions. Data is largely ignored. Data structure design concentrates heavily on data. Actions are obtained by brainstorming. It is assumed that the sequencing of actions will mirror the structure of the data. Thus the sequencing is derived from the data structure. In object-oriented design, the aim is to model as much as possible in the real world that is being modelled. This includes both the data and the actions that are taking place.
Thus different design approaches capture and emphasise different aspects of the problem to be solved.
PROGRAMMING AS MODELLING
185
10.4 Programming as modelling Some approaches to program design attempt to ‘model’ some aspect of the
situation being programmed. We will now look at three modelling approaches to devising the structure of programs. The three approaches have in common that they claim to ‘model’ the situation being programmed. First, consider a program to control an automatic washing machine. Written in pseudo-code, part of it might well look like this: open wait start wait
water valve until tub is full motor for 5 minutes
This program models the actions of a washing machine. The most appropriate design approach is probably functional decomposition. Next consider a program to add up a list of numbers (say prices on a shopping list) typed in at the computer keyboard. The numbers are repeated and therefore we will need a corresponding iteration in the program. Hence the program looks like: make sum zero repeat
input number add to sum until there are no more numbers
display sum Here the program models the structure of the input data. The most appropriate design approach is probably the Michael Jackson method. Looking back at the two programs and the two approaches to constructing them, while they both attempt to model a real-world situation, one uses actions as the starting point, but the other uses data structure. The first program is like many in process control and in mathematics, where there is little structure to the input or output information. The second is like many programs in
information processing, where the data is rich in structure. Indeed in a system
with a database, the data is absolutely paramount, with the programs that access it playing a very subservient role. More commonly these days programs are not obviously in either of the above categories. Consider almost any program that deals with interactive input from a computer terminal. For example, suppose that a user has at any time a
menu of options available, any of which can be selected using the mouse or cursor keys. Here we don’t have any obvious structure to the data and neither do we know what action should be done next. Instead we will probably construct a program that looks like this:
186
REVIEW
repeat when the user makes a selection case user choice save option : do save operation print option : do print operation delete option : do delete operation quit option : do quit operation end case forever Here the case statement is like a multi-way switch that causes the one appropriate action to be selected from amongst the available list. What is the approach to modelling used in this situation? Arguably it is modelling based on events. We identify whatever events may affect the system and happen within the system and provide pieces of program that can deal with each of them. Examples of events are: A user selects an option from a menu. A user points at an object on the screen and presses a mouse button. An intruder crosses a light beam. A message arrives down a telecommunications line. In a modern computer system with windows, menus, buttons and a pointing
device (like a mouse) there may bea large number of different possible choices from which the user may select at any time. This event-based modelling is the third and increasingly important approach to program structure. The most appropriate design approach is probably object-oriented design. Let us distinguish between this and action-based programming. A task-based program commands and is active; an event-based program is responsive and reactive. It consists of chunks of code that are simply lying around waiting for something to happen. What of the distinction between data-oriented and event-oriented modelling? The data-oriented approach can only be used when the structure of the data is well defined (as in the list of numbers example above), whereas the eventoriented approach can cope with unpredictable sequences of input (as in the menu example above, where the user can select items in any order). In all three approaches to program design — data, actions, events — a realworld situation is modelled or simulated by the program; the structure of the program mirrors the structure of some crucial aspect of the problem to be solved.
10.5 The formal approach Program design is about trying to write programs that are correct . But what does that word ‘correct’ mean? Conventionally it means ‘fully tested’ — but
DESIGN METHODS — THE REALITY
187
then it is usually impossible to test a program with all the possible inputs it could receive (although of course there are systematic methods which seek to impose a discipline on testing procedures). Formal methods recognise that a program can only be shown (or proven) to be correct with respect to its specification, and if mathematics is to be applied to the correctness problem, then the specification must (like the program) be expressed in a formalised syntax. Formal methods try to introduce into software development rigorous principles and practices which have long been applied to traditional engineering endeavours such as bridge building and aeronautical engine manufacture. They aim not only to produce more reliable software, but to do it more cheaply. Unfortunately many people (even programmers) are frightened of mathematics, and clearly a reasonably high level of mathematical sophistication is necessary for the successful application of these methods. However, not all of the concerns of software developers can be directly addressed by resorting to formalisation. Capturing non-functional requirements is problematical, as are project management issues such as quality control and cost estimation. Nevertheless, mathematical reasoning will be increasingly used in other stages of the software life cycle, and software tools to support these activities will continue to be developed. This all-pervasive role of formal notations and mathematical reasoning is one of the prime tenets of formal methods devotees. But the old adage of a craftsman only being as good as his tools is perhaps even more apposite here than in conventional software development.
10.6 Design methods - the reality Having looked at the respectable design methods, let us turn to the disreputable ones. If you talk to professional programmers, the likelihood is that they will say that they use one of the respectable program design methods mentioned above. Nowadays there is tremendous pressure to use a ‘proper’ design method. The truth is, however,
different. A number of studies have been carried out to
discover how professional programmers really carry out design. This work really started back in the late 1960s when people like Peter Naur tried to observe their own thought processes as they carried out a program design (Naur carry 1972). Later studies usually proceed by observing programmers as they are out the design task, or else by getting them to speak aloud what they thinking as they go about design. In this section we review the results of such studies (Hoch 1990). (It should be pointed out that the observational methods must used to obtain these results are not without problems and so some caution
be used in interpreting the results.) t The first revelation is that there is an enormous diversity amongs enormous programmers as to the design strategies they adopt. There is also an
188
REVIEW
diversity amongst the designs that programmers will produce to solve the same problem. Different programmers use different methods and construct different designs. The second revelation is perhaps even more surprising — programmers do not
use top-down design. This is, of course, heresy; the top-down approach is currently the dominant method. This evidence must be qualified — it seems that professional programmers do use top-down, but only when they understand all aspects of the situation very thoroughly. If a programmer has a mastery of the language and if the problem seems to be very simple (to the programmer), then the programmer may use top-down. In other words, programmers use top-down only very rarely. If, as it seems, programmers do not use top-down, what do they do? It appears that what they commonly do is to break the problem down into smaller problems — but not in a top-down way. They select fragments of the problem for consideration. They do this by using a whole range of personal strategies. At the end of the day, when they have completed their design, they then legitimise it by documenting it according to one of the credible design notations. Typically, they pretend that it was done top-down. It seems that there are other ways that competent programmers carry out design. They reuse parts of old programs, like parts of scrap cars. They remember a program that they may have written some time ago that is in some way similar to the new program. They retrieve the listing and copy those parts that are useful. Another similar approach is to use memories of old programs. Experienced programmers build up Aladdin’s caves of memories of the designs that they have created. No survey of design approaches would be complete without a mention of hacking. This term has two distinct meanings. One meaning describes the act of getting admittance to a secure computer system in order to steal money or secrets or to cause mayhem. The other meaning describes one style of programming. Hacking is plunging into a solution to a problem without any design whatsoever. A hacker takes the program specification and immediately Starts to write down programming language instructions. Probably the hacker will not even pause to write them down — they will immediately start to key in instructions to the computer. Hacking makes use of intuition, creativity and individuality. It dates from the early days of programming, when programming was regarded as an individual creative act and when there were no wellestablished design methods. Nowadays hacking is often frowned upon as being unsystematic. So hacking is either famous or notorious, depending on your point of view. LISP programmers have long championed a program design strateg y which lies somewhere between hacking and the disciplined approach advocated in this book. Perhaps this is because the application areas in which LISP is used (such as artificial intelligence) demand an exploratory approach in which programs
may be written in order to try to demonstrate a theoretical premis e. Thus AI
DESIGN METHODS — THE REALITY
189
tries to solve ‘ill-formed problems’ — it is difficult to determine when (or if) we have solved such problems, because they are only partially understood. Moreover, LISP programmers regularly embark on the design and construction of programs they don’t know how to write. This kind of ‘exploratory programming’ clearly requires a great deal of help from the language and its programming environment, anda flexibility in the way in which ideas can be expressed which is not found in languages like Pascal and C. For example, in LISP we can use variables without declaring their type, or define functions which can take arbitrary numbers of arguments. We can define and use functions which call other functions that haven’t been written yet. We can edit, test and debug a program incrementally even while the program is running. And, very importantly, LISP blurs the distinction between program and data — which is the reason for LISP’s much maligned bracketed syntax. It is this dynamic, rather than static, approach to program construction which enables and supports exploratory programming. Erik Sandewall (1978) has described this method of program development as structured growth: an initial program with a pure and simple structure is written, tested, and then allowed to grow by increasing the ambition of its modules. The growth can occur both horizontally, through the addition of more facilities, and vertically through a deepening of existing facilities.
Sandewall argues against the moralistic view that this could be considered as hacking under another name,
on the grounds that ‘even if some
kinds of
program changes are dangerous and/or bad, that does not prove that all of them are’. He also makes the point that stepwise refinement and structured growth are variants of top-down design, in that both refine a program’s component parts from an initial high-level view of its overall structure. Finally, there is evidence that men and women carry out design differently (Turkle 1984). Men, it seems, approach the computer as a slave that has to be controlled, dominated and brought under the will of the programmer. It may have to be wrestled with or struggled with in order that it does the programmer’s bidding. In contrast, women approach the computer as a machine that has to be accommodated to, something that has quirks to come to terms with. The designer proceeds by trying something and, if it is not successful, trying something else. The design emerges from the process of negotiation. No-one would argue that all men and all women adopt these different approaches to design. And certainly, the authors would not argue that any differences between men’s and women’s approaches to design are inherited at
birth. Assuming that most professional programmers are required by the organisado? tion that employs them to use a systematic design method, what do they using their We speculate that professional programmers first create a design one own personal method. They then legitimise it by casting it into the shape of
190
REVIEW
of the approved methods. This may involve producing the charts and/or pseudo-code documentation that accompany proper use of the method.
10.7 Assessing design methods Which method is best? In this book, we have used a set of criteria to assess the different methods. The criteria are: p
@ @ @ @ @ @ @ @ @ @
Special features and strengths Weaknesses Philosophy/perspective Systematic? Appropriate applications Inappropriate applications Top-down? Good for small-scale design? Good for large-scale design? Tool support
These are qualitative factors, useful in making judgements about methods. What is not available is good quantitative data — data that says ‘if you use method X, you will get 20% fewer errors and you will do it 50% quicker than method Y’. If you look at the literature, you will find that there have been no
convincing studies that answer this question. Indeed, we believe that even the benefits of structured programming have not yet been proved. For example, Iris Vessey (Vessey and Weber 1984), having reviewed the empirical studies, concluded that ‘In general, the evidence supporting [structured] programming precepts is weak.’ There is a variety of approaches. We speculate that different design methods may be better for different programming problems. No single method can claim to be the best overall.
10.8
Discussion
We have seen that there is a variety of different, respectable approaches to program design. We have also seen that professional programmers don’t use any of these methods in practice. So what can we conclude? There are several options. We could ignore the evidence, bury our heads in the sand and pretend that programmers do use the approved methods. We could admit that training and education has not been succes sful and call for more and better training and education.
REFERENCES AND FURTHER READING
191
We could conclude that programming is in its infancy and that in a few decades programmers will use systematic methods. We could speculate that as the effects of training permeate the industry, we shall see these methods used
in earnest. Equally we could embrace the reality that programmers use individual ad hoc methods, recognising that programming is a challenging and highly creative activity. If we are happy with the products that programmers produce, why worry? As long as programs can be understood and maintained easily, it does not matter how they were actually designed. We should, perhaps, regard a program design method as merely a way of understanding a program that has already been written. Finally, program design is a small-scale activity with, perhaps, a low degree of complexity. If programmers do not use the approved methods, we can only speculate about whether and how designers of bigger systems use systematic methods, like SSADM for example. There is evidence (Friedman and Cornford 1989) that they also don’t use the approved methods. Our suggestions are: @
@
@
To programmers: keep on doing what you have always done. Be content in knowing that no-one else uses the approved methods either. Make sure you document your programs so that they look as though they were designed systematically. To project managers: recognise that programmers do not actually use the proper methods. Be sure to insist that they create the appropriate documentation to allow maintenance. To educators and trainers: recognise that students will not use the approved methods. Don’t try to force them into unnatural behaviour. Tell them the truth. Realise that students need to know about the methods so that they can retroactively document them.
10.9 Summary The evidence suggests that professional programmers use individual, personal methods rather than any of the approved methods. There are several possible reactions to this unsettling news.
40.10 References and further reading Friedman A L and Cornford D S, Computer Systems Development: History, Organiza tion and Implementation, John Wiley, 1989. 1990. Hoc Jean Marie (ed.), Psychology of Programming, Academic Press, London,
192
REVIEW
Naur P, ‘An experiment in program development’, BIT, Vol. 12, No. 3, pp. 347-65,
1972. Reprinted in Computing: A Human Activity, ACM Press, Addison-Wesley, Reading, MA, 1992.
Sandewall E, ‘Programming in an interactive environment: the LISP experience’, ACM Computing Surveys, Vol. 10, No. 1, pp. 35-71, March 1978. Reprinted in Barstow D
R, Shrobe
H
E and
Sandewall
E, Interactive Programming
Environments,
McGraw-Hill, 1984. Turkle S, The Second Self: computers and the human spirit, Granada, London, 1984. . Vessey I and Weber R, ‘Research on structured programming: an empiricist’s evaluation’, IEEE Transactions on Software Engineering, Vol. SE-10, No. 4, July
1984.
Index
abstract data type, 170 abstraction, 27, 144 Ada, 74 architectural design, 2 assessing methods, 4 Booch diagram, 161
bottom-up, 182 bottom-up design, 178 changeability,
global data, 169 goto, 8, 12
high-level design, 2 information hiding, 88, 170 inheritance, 94 instance, 92 interactive system, 45 inversion, 60
170
class, 92
Jackson structured programming, 33
class hierarchy diagram, 95, 152
JSP, 33
class responsibility collaborators, 103, 154
comprehensibility, 170
large-scale design, 6 LISP, 188
concrete design, 121
local data, 169
cohesion, 81, 172, 175
concurrent processes, 158 control structure diagram, 161 control structures, 14 cost, 2 coupling, 81, 172-3 cyclomatic complexity, 168
data flow design, 67 data flow diagram, 68, 148 data hiding, 170 data invariant, 124
data refinement, 121 data store, 72 data structure, 183
data structure design, 33, 184 data structure diagram, 34 decision tables, 161 decomposition, 22
design, | design methods, 187 detailed design, 2 dynamic binding, 90 encapsulation, 87, 170
flowcharts, 7, 141
formal approach, 186 formal method,
120
functional decomposition, 22, 29, 183-5
low-level design, 2 maintenance, 2
message passing, 88 method, 88, 89 Michael Jackson method, 185 Michael Jackson program design method, 33 middle-out design, 182 model-view-controller, 111 modelling, 185 modularity, 77, 81, 162, 163
module size, 164 multiple inheritance, 99 Nassi—Shneiderman charts, 145
object, 88 object-oriented, 183 object-oriented design, 87, 100, 184, 186 operation refinement, 121, 134
PDE 39, Petri net, 158
physical and logical structures, 42 polymorphism, 90 postcondition, 126 precondition, 126 program complexity, 166 program design language, 22, 139
194
INDEX
program structure diagram, 36 proof obligation, 128
protocol, 89
static binding, 90 structure chart, 70, 150 structure clashes, 56
proximity, 61
structure diagram, 70 structured analysis, 83
pseudo-code, 22, 37, 139
structured design, 80
refine, 73 refinement, 22, 121
prototyping, 112
reliability, 2
structured structured structured subclass,
English, 139 programming, 7, 16 systems analysis and design, 83 94
requirements definition, 121
superclass, 94
requirements specification, 121
system state, 124
reusability, 111
schema, 124 schema calculus, 122 schema inclusion, 125 schematic logic, 37
serial file processing, 50 simulation, 100
single inheritance, 99 small-scale design, 5
tool support, 6 top-down, 5, 182 top-down design, 188 transformations, UNIX, 60, 74 use-cases, 107
specification, 120, 121, 122 SSADM, 83
validating, 127 VDM, 121 verification, 120
state transition diagram, 156 statechart, 161
LZ, MPA N22
software specification, 121
148
essence
O
c COM
UIE
THE
ESSENCE
OF
PROGRAM
a
\
The Prentice Hall Essence of Computing Series provides a concise, practical and uniform introduction to the core components of an undergraduate computer science degree. Acknowledging recent changes within higher education, this approach uses a variety of pedagogical tools — case-studies, worked examples and self-test questions — to underpin the student's learning. Details of all titles in the series can be found on page ii.
Aimed at those who have written several small-to-medium-sized programs, The Essence of Program Design provides a complete introduction to the varied range of design methods that are available. Each method is analysed and assessed, and comparisons are drawn to allow the programmer to ascertain which method is most suitable for the task in hand. Program design is crucial to all successful programming: if the design of a program is incorrect, then all subsequent stages of program development will be flawed. This is why The Essence of Program Design is essential reading for all undergraduate and novice programmers. Key features: Compares, contrasts and assesses the most widely used design methods, including JSP, object-oriented design, dataflow design and formal methods ®
Totally language independent
Packed with learning aids, including exercises, examples an reading lists
HN LO82)