ZB 2000: Formal Specification and Development in Z and B: First International Conference of B and Z Users York, UK, August 29 - September 2, 2000 Proceedings (Lecture Notes in Computer Science, 1878) 3540679448, 9783540679448

This book constitutes the refereed proceedings of the First International Conference of B and Z Users, ZB 2000, held in


English Pages 510 [524] Year 2000


Table of contents :
ZB 2000: Formal Specification and Development in Z and B
Preface
Programme and Organizing Committees
Table of Contents
Meeting the Challenge of Object-Oriented Programming
A Formal Mapping between UML Models and Object-Z Specifications
Introduction
UML
UML Class Diagrams
A Formal Definition of UML Class Constructs
Object-Z
A Metamodel of Object-Z
A Formal Description for the Object-Z Metamodel
The Semantics of Object-Z
The Semantics of Attributes
The Semantics of Types
The Semantics of Classes
The Semantics of Object-Z Specifications
A Formal Mapping between UML Class Constructs and Object-Z Constructs
UML Classes
UML Associations
UML Association Classes
UML Generalizations
Mapping A UML Class Diagram to an Object-Z Specification
Mapping the UML Class Diagram to its Semantic Models
Related Work
Conclusion
References
A Generic Process to Refine a B Specification into a Relational Database Implementation
Introduction
Overview of the B Refinement Process and the Relational Model
Overview of the B Refinement Process
Overview of the Relational Model
Description of our Approach
From UML Class Diagrams to B Specifications
Refinement of B Specifications into Relational Database Implementations
Related Work
Refinement Rules
Transition to the First Normal Form
From an Object-Based Model to a Value-based Model
Transformation of Associations
Extension of Partial Functions
From a Functional Representation to a Set Representation
Mapping into SQL Tables
Description of the Mapping
Discussion
Proof of the Re
Conclusions and Future Works
References
Recursive Schema Definitions in Object-Z
Introduction
Recursion in Object-Z
Fixed Point Definitions
Initial State Schemas
Operation Schemas
Interpreting Recursive Definitions
Initial State Schemas
Operation Schemas
Tree Example Revisited
Initial State Schema
Insert Operation
Conclusion
References
On Mutually Recursive Free Types in Z
Introduction
Traditional Free Types
Example Free Types
General Form of a Free Type
Constraints Abbreviated by a Free Type
Finiteness
Consistency of Constraints
Formulation as Inference Rules
Use of Inference Rules in an Example Proof
Single Disjointness Constraint
Mutually Recursive Free Types
Example
General Form
Constraints
Constraints for the Example
Consistency
An Inductive Proof Involving the Example
A Second Example of Mutual Recursion
Conclusions
References
Reasoning Inductively about Z Specifications via Unification
Motivation
Z and the CADiZ Proof System
Unification in CADiZ
Application to Inductive Reasoning
Free Types
The Basic Strategy
Using Syntactic Orderings
Selection Strategies
Conclusions
References
Reconciling Axiomatic and Model-Based Specifications Using the B Method
Introduction
Specification: Mathematical Model vs. Axiomatic
Example: A Stack
The Stack Axioms
An Axiomatic Specification of a Stack
A Model-Based Specification of a Stack
Reconciling the Axiomatic and Model-Based Specifications
Reconciliation through Implementation
Instantiating the Abstract Constants
The Reconciliation
An Example with State
The StackVAR Machine
The StackVARSeq Machine
The StackVAR Implementation
A Framework
Reasoning about Operations
Summary
References
Compositional Structuring in the B-Method: A Logical Viewpoint of the Static Context
Introduction
Background: The Structuring Primitives of B
Incremental Specification: The INCLUDES and EXTENDS Primitives
Sharing Specification Text: The USES Primitive
Reference-Only Sharing: The SEES Clause
Layered Implementation: The IMPORTS Primitive
Internal Consistency of Abstract Machine Specifications
Logical Background
Incremental Specification
Extending Specifications
Sharing Components
Layered Implementation (IMPORTS)
Conclusion
References
Automatic Construction of Validated B Components from Structured Developments
Introduction
Component, Composition and Refinement in B
B-Components
B Composition
The Proposed Tool
Extraction Algorithms
Component Extraction from a Direct Refinement
Component Extraction through a Chain of Refinements
Component Extraction from a Structured Refinement
Conclusion
Specification Structure
Stating New Properties
Calculus of Component Relations
Perspectives
References
Playing with Abstraction and Refinement for Managing Features Interactions
Introduction
Feature Interaction Problem
Abstract Models for Services
The Refinement-As-Composition Principle
Modelling CF Call Forwarding
Modelling TCS Terminating Call Screening
Modelling OCS Originating Call Screening
Analysis of OCS and TCS
Combining CF, TCS, OCS
Concluding Remarks and Future Works
References
A Formal Architecture for the 3APL Agent Programming Language
Introduction
The 3APL Programming Language
3APL Types
Beliefs
Actions
Goals, Contexts, and Front Contexts
Practical Reasoning Rules
3APL Agents
Agents and Mental State
Initial Agent State
3APL Agent Operation
Applying Practical Reasoning Rules
Goal Execution
Conclusions
References
Substitutions
Application of Substitutions
Composition of Substitutions
Unification
How to Drive a B Machine
Introduction
Overview of CSP
The Language
Semantics
Specification
A Simple Coupling between B and CSP Loops
Developing a Control Executive
Consistency of a CSP Control Executive and a B Abstract System
Reviewing Divergence Freedom
Conditions for Deadlock Freedom
Determining Guards Using PAIRS
Verification of Deadlock Freedom Consistency
A Coupling for Terminating Loops
Extended Syntax
Acceptable Deadlock
Modified Condition for Deadlock Freedom
Verifying Deadlock Freedom Consistency for Terminating Loops
Allowing Channels in Loops
Further Extended Control Syntax
Preserving Consistency with New Syntax
Example with Divergence and Deadlock Freedom
Discussion
References
Deriving Software Specifications from Event Based Models
Introduction
Description of the Proposed Method
Specifying Sequential Programs
Specifying Concurrent and Distributed Systems
Deriving Program Specifications from System Specifications
Using the B Method and Generalized Substitutions
The Flight Warning System (FWS)
System Description
Formal Specification of the FWS System
FWS. Environment Specification:
Component Specifications:
Implementing the Derived Modules
Conclusion
References
Reformulate Dynamic Properties during B Refinement and Forget Variants and Loop Invariants
Introduction
Some Dynamic Property Refinements
Operational Description
Temporal Property Expression
About Refinement
Reformulated Dynamic Property Verification
Refinement and Dynamic Properties
Weak Sufficient Conditions
Strong Sufficient Conditions
Tools
Related and Future Works
References
Proofs
Type-Constrained Generics for Z
Introduction
Implicit Instantiation - Schema Negation
Type Compatibility - Schema Conjunction
Syntactic Overloading - Schema Logical Operations
Schema Axiomatic Definitions
The Scope of Generic Schemas
Projection
Further Schema Operations - Natural Composition
Heterogeneous State Transitions
Removing Decorations - Schema Precondition
Schema Override
Turning an Operation Schema into a Relation
Schema Composition
Type-Correctness and Associativity of Schema Composition
Schema Piping
Rules of Refinement
Homogeneous State Transitions - Δ and Ξ
Homogeneous Operation Schemas - Recognising the State
Schema Iteration
Implementation
Conclusions
References
Typechecking Z
Introduction
Types
Requirements on the Typechecker
Schemas
Browsing
Draft Standard Z
Type-Constrained Generics
Specification of the Typechecker
Notations
Type Inference Rules
Type Inference System
Implicit Instantiations
Schemas
Browsing
Draft Standard Z
Undecoration Expressions
Type-Constrained Generics
Diagnosing Type Errors
Conclusions
References
Guards, Preconditions, and Refinement in Z
Introduction
Guards and Preconditions in Z
Example
Classical Precondition and Guarded Interpretation
Refinement
Combining Guards and Preconditions
A Syntax for Using Generalised Guards
Operations with Guards and Preconditions
Regions of Before States
Three Valued Interpretation
Semantical Description of the Regions
Meaning of Refinement
Operation Refinement
Rules for Operation Refinement
Example
Generalisation of Traditional Refinement Rules
Related and Further Work
Strulo's Work
The (R,A)-Calculus
Hehner and Hoare's Predicative Approach to Programming
Refinement Rules for Required Non-determinism
Conclusion and Future Work
References
Retrenchment, Refinement, and Simulation
Introduction
Some Inadequacies of Refinement
Retrenchment
Stepwise Simulation
Modulated Refinement and Simulation
Simple Simulable Retrenchment
The Mobile Radio Example Revisited
Conclusions
References
Performing Algorithmic Refinement before Data Refinement in B
Introduction
Algorithmic Refinement before Data Refinement
System of Interest
Case Study 1 - Development of Make_Ballot
Make_Ballot - Algorithmic Refinement followed by Data Refinement
Make_Ballot - Data Refinement followed by Algorithmic Refinement
Case Study 2 - Development of Pre_Process
C++ STL Multisets
Pre_Process - Algorithmic Refinement followed by Data Refinement
Pre_Process - Data Refinement before Algorithmic Refinement
Dealing with Procedural Refinement in the B-Toolkit
Conclusions
References
Program Development and Specification Refinement in the Schema Calculus
Introduction
A First Example
The Specification
The Development
An Interpretation of Operation Schemas
Preconditions and Postconditions
Operation Schema Calculus
Refinement Inequations
Types and Programs
An Example Using Promotion
Specification
Refinement
Further Work and Conclusions
References
Are Smart Cards the Ideal Domain for Applying Formal Methods?
Introduction
Needs of Security and Formalism
The Small and Secure System
The Certification Process
The Complexity is Increasing
Reducing the Cost of the Test
The Constraints
Development Overhead
Industrial Constraints
Cultural Resistance
Need of a Methodology
Conclusions
References
Formal Methods for Industrial Products
Introduction
Overview of the Application
Special Features
Security Models and Proofs
Segregation with Communication
Modelling Consequences
Segregation and Multi-promotion
Modelling the Functionality
Determinism
Imposed Determinism
Determinism and Refinement
Determinism and Traces
Functional and Non-functional Properties
Two Security Models
Differently Segregated
Resulting Specification Structure
Abstract Security Policy Model, SP
Virtual Machine Model, VM
Concrete Hardware Model, HW
Resulting Proof Structure
Proof Tree
Proof Sizes
HW has SP Segregation Property
HW has SP functional properties
Results
Design of the Virtual Machine
Identification of Communication Channels
Proof Detected an Error
Lessons Learned
Model Structure Versus Proof Structure
Presentation
Providing Further Justification
Elegant Mathematical Results may not Help
Summary
References
Conjectures with fuzz
Non-generic Conjectures
Generic Conjectures
An Execution Architecture for GSL
Introduction
Reversible Computation
Virtual Machine Architecture, Outline
"Executing" Non-deterministic Choice
The Command
Random Choice and Morgan's pGSL
Abstract Command Language
ACLA Details
Indirect Threaded Code
Multiple Code Fields
Reversing Branch Instructions
The ACLA Multi-tasker
Evaluating the Feasibility of an Event
Conclusions
References
A Computation Model for Z Based on Concurrent Constraint Resolution
Introduction
The MZ Calculus: Syntax and Informal Semantics
Mapping Z to MZ
Abstraction and Application
Predicates
Schema Calculus
Types and Genericity
Semantics
Computation Model
Domains
Reduction
Resolution
Implementation
What Can be Computed?
The Positive Answer
The Negative Answer (And What Can be Enhanced)
Conclusion and Related Work
References
Analysis of Compiled Code: A Prototype Formal Model
Introduction
Background
Program Analysis
Formal Model
Expressing Higher-Order Properties in Z
Specification of Program Analyses
Structure of the Paper
Processor Model
Register and Memory Model
Processor State
Instruction Set
Instruction Execution
CPU Execution Cycle
Behavioural Model
Discussion
Basic Blocks Abstraction
Introduction
Representing Basic Blocks
Instantaneous Basic Block Decompositions
Correct Basic Block Decompositions
Discussion
Concluding Remarks
Limitations of the Processor Model
Recent Work
Future Research
References
Zzzzzzzzzzzzzzzzzzzzzzzzzz
Segregation with Communication
Introduction
Motivating the Problem
Trace Definition of Segregation
A Segregated System
Using Segregation
Multiple Models
Multi-promotion
Unwinding Theorem
Multiple Models
Traces from the Computational Model
The Computational Model
Including Initialisation and Finalisation
Input--Output Traces
Event Traces
Mapping from State-and-Operations to Traces
Multi-promotion
A Reminder of Single Promotion
Introducing Multi-promotion
Defined Communication Channels
Unwinding Theorem
Strength of Segregation
Property not Preserved by Refinement
Conclusions
References
Closure Induction in a Z-Like Language
Motivation
A Typed Language and Its Semantics
The Syntax of Expressions
The Semantics of Expressions
Inductive Reasoning
Closure Induction
Definedness Rules
Conclusion
Proof of the Soundness Theorem for Closure Induction
References
Fuzzy Concepts and Formal Methods: A Fuzzy Logic Toolkit for Z
Introduction
Fuzzy Sets
Motivation
A Possible Fuzzy Set Representation in Z
The Toolkit Summary
Some Basic Definitions
Some Set Measures
Some Set Operators
Fuzziness, Set Equality and Set Inclusion
Set Modifiers and Fuzzy Numbers
Fuzzy Relations
Range and Domain for a Fuzzy Relation
Range and Domain Restrictions (and Anti-restriction) for Fuzzy Relations
The max-min Relational Composition Operator for Fuzzy Relations
A Fuzzy Relational Image for Fuzzy Relations
Fuzzy Functions
Alternative Notation and Definitions
Conclusion
References
Author Index

Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis and J. van Leeuwen

1878

Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Singapore Tokyo

Jonathan P. Bowen Steve Dunne Andy Galloway Steve King (Eds.)

ZB 2000: Formal Specification and Development in Z and B First International Conference of B and Z Users York, UK, August 29 - September 2, 2000 Proceedings


Series Editors
Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands

Volume Editors
Jonathan P. Bowen
South Bank University, SCISM, Centre for Applied Formal Methods, Borough Road, London SE1 0AA, UK
E-mail: [email protected]

Steve Dunne
The University of Teesside, School of Computing and Mathematics, Borough Road, Middlesbrough TS1 3BA, UK
E-mail: [email protected]

Andy Galloway, Steve King
University of York, Department of Computer Science, Heslington, York YO10 5DD, UK
E-mail: {andyg, king}@cs.york.ac.uk

Cataloging-in-Publication Data applied for

Die Deutsche Bibliothek - CIP-Einheitsaufnahme
ZB 2000 : formal specification and development in Z and B ; proceedings / First International Conference of B and Z Users, York, UK, August 28 - September 2, 2000. Jonathan P. Bowen . . . (ed.). Berlin ; Heidelberg ; New York ; Barcelona ; Hong Kong ; London ; Milan ; Paris ; Singapore ; Tokyo : Springer, 2000 (Lecture notes in computer science ; Vol. 1878) ISBN 3-540-67944-8

CR Subject Classification (1998): D.2, F.3.1, D.1, I.1.3, G.2, F.4.1
ISSN 0302-9743
ISBN 3-540-67944-8 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.
Springer-Verlag Berlin Heidelberg New York a member of BertelsmannSpringer Science+Business Media GmbH © Springer-Verlag Berlin Heidelberg 2000 Printed in Germany Typesetting: Camera-ready by author, data conversion by PTP-Berlin, Stefan Sossna Printed on acid-free paper SPIN: 10722395 06/3142 543210

These proceedings are dedicated to the memory of Philippe Facon

Preface

These proceedings record the papers presented at the first International Conference of B and Z Users (ZB 2000), held in the historic city of York in the north of England. B and Z are two important formal methods that share a common conceptual origin; each is widely used in both academia and industry for the specification and verification of both hardware and software systems. Jean-Raymond Abrial is the founder of both approaches, which share a common mathematical basis of set theory. Z was designed mainly for formal specification of computer-based systems. Subsequently, B was designed to aid in the formal development from a specification to a program. B has tool support for this process. Both approaches aim to avoid errors that are otherwise typically discovered and removed, often more expensively, at the testing stage or, worse, remain until after delivery to the customer. In ZB 2000 the B and Z communities came together to hold a joint conference that simultaneously incorporated the 12th International Conference of Z Users (formerly the Z User Meeting) and the 3rd International Conference on the B Method. Although organized as an integral event, editorial control of the joint conference remained vested in two separate but cooperating programme committees that respectively determined its B and Z aspects, but in a coordinated manner.1 In particular, a joint meeting was held in March 2000, hosted by South Bank University in London. At this meeting, the programme committees met separately and together to select the papers for the conference. The committees are especially grateful to Janet Aldway and Stella Jimoh of the School of Computing, Information Systems and Mathematics (SCISM) for aiding in the organization and smooth running of this two-day meeting. 
1 An indication of which Programme Committee accepted a paper is given in the Table of Contents.

The conference benefited from the contributions of a range of distinguished invited speakers drawn from both industry and academia, who addressed significant recent industrial applications of the two methods, as well as important academic advances serving to enhance their potency and widen their applicability. Our invited speakers for ZB 2000 were drawn from France and the United Kingdom, reflecting the roots of B and Z: David Everett (Platform 7, UK), Jean-Louis Lanet (GemPlus Research Laboratory, France), Dominique Méry (Université Henri Poincaré, Nancy, France), and Mike Spivey (Oxford University Computer Laboratory, UK). Besides its formal sessions the conference featured tool demonstrations, publishers’ displays, special tutorials, and other meetings. The conference was also held in conjunction with the IEEE ICFEM 2000 International Conference on Formal Engineering Methods, held at York during the week after ZB 2000. The co-location of the two conferences was designed to enable some delegates, especially those from abroad, to attend both events at reduced expense.

The location of ZB 2000 at the University of York reflected important work in the area of formal methods at the university, including Z and B. In particular, members of the Department of Computer Science at the University of York had been very active in the establishment of an international ISO Z standard which was nearing completion.

The ZB 2000 conference was jointly organized by the Z User Group (ZUG) and the International B Conference Steering Committee (APCB). The conference was sponsored by Praxis Critical Systems, Daimler-Chrysler AG, IBM, Rolls-Royce plc, and BAE SYSTEMS plc. It was also supported by BCS-FACS. We are grateful to all those who have contributed to the success of the conference. On-line information concerning the conference is available under the following Uniform Resource Locator (URL):

http://www.cs.york.ac.uk/zb2000/

This also provides links to further on-line resources concerning the Z notation and B Method. We hope that all participants and other interested readers enjoy these proceedings.

August 2000

Jonathan Bowen Steve Dunne Andy Galloway Steve King

Programme and Organizing Committees

The following people were members of the ZB 2000 Z programme committee:

Chair: Jonathan Bowen, South Bank University, London, UK
Co-chair: Sam Valentine, University of York, UK
Ali Abdallah, South Bank University, London, UK
Paolo Ciancarini, University of Bologna, Italy
Neville Dean, Anglia Polytechnic University, UK
John Derrick, The University of Kent at Canterbury, UK
Andy Evans, University of York, UK
Andreas Fett, Daimler-Chrysler Research Berlin, Germany
David Garlan, Carnegie-Mellon University, USA
Wolfgang Grieskamp, Technical University of Berlin, Germany
Henri Habrias, University of Nantes, France
Jonathan Hammond, Praxis Critical Systems, UK
Ian Hayes, University of Queensland, Australia
Mike Hinchey, University of Nebraska at Omaha, USA, and University of Skövde, Sweden
Mark d’Inverno, University of Westminster, UK
Jonathan Jacky, University of Washington, USA
Randolph Johnson, National Security Agency, USA
Steve King, University of York, UK
Kevin Lano, King’s College, London, UK
Shaoying Liu, Hiroshima City University, Japan
Jean-Francois Monin, France Télécom R&D, France
Fiona Polack, University of York, UK
Norah Power, University of Limerick, Ireland
Mark Saaltink, ORA, Ottawa, Canada
Thomas Santen, Technical University of Berlin, Germany, and GMD FIRST, Berlin, Germany
Alf Smith, DERA Malvern, UK
Susan Stepney, Logica, UK
David Till, City University, London, UK
Jim Woodcock, Oxford University, UK
John Wordsworth, IBM Hursley UK Laboratories, UK


The following served on the ZB 2000 B programme committee:

Chair: Steve Dunne, University of Teesside, UK
Co-chair: Andy Galloway, University of York, UK
Christian Attiogbé, University of Nantes, France
Richard Banach, University of Manchester, UK
Marc Benveniste, STMicroelectronics, France
Didier Bert, IMAG, France
Juan Bicarregui, CLRC Rutherford Appleton Laboratory, UK
Pierre Bieber, CERT, France
Michael Butler, University of Southampton, UK
Jeremy Dick, Quality Systems and Software, UK
Mark Frappier, University of Sherbrooke, Canada
Jeremy Jacob, University of York, UK
Brian Matthews, CLRC Rutherford Appleton Laboratory, UK
Luis-Fernando Mejia, Alstom, France
Jean-Marc Meynadier, Matra Transport, France
Louis Mussat, Service Central de la Sécurité des Systèmes d’Information, France
Marie-Laure Potet, IMAG, France
Ken Robinson, The University of New South Wales, Australia
Steve Schneider, Royal Holloway, UK
Emil Sekerinski, McMaster University, Canada
Bill Stoddart, University of Teesside, UK
Marina Waldén, Åbo Akademi, Finland

At the University of York, the following helped with the local organization in various capacities:

General Chair: John McDermid
Administration: Ginny Wilson
Co-chair/Publicity: Sam Valentine
Co-chair/Publicity: Andy Galloway
Treasurer: Jonathan Moffett
Submissions: Steve King
Local Arrangements: Darren Buttle
Exhibitions/Sponsorship: Fiona Polack
Tool Demonstrations: Ian Toyn
Proceedings: David Hull
Website: James Blow

We are especially grateful to the above for their efforts in ensuring the success of the conference.

External Referees

We are grateful to the following people who aided the programme committees in the reviewing of papers, providing additional specialist expertise:

Pascal André, Institut National Polytechnique, Côte d’Ivoire
Peter Breuer, Universidad Carlos III de Madrid, Spain
Michael Cebulla, Technische Universität Berlin, Germany
Francis Klay, France Télécom R&D, France
Jean-Yves Lafaye, Université de La Rochelle, France
Michael Luck, Warwick University, UK
Ulrich Ultes-Nitsche, University of Southampton, UK

Sponsors ZB 2000 greatly benefited from the cooperation and sponsorship of the following organizations:

BAE SYSTEMS plc Daimler-Chrysler AG IBM Praxis Critical Systems Rolls-Royce plc

Tutorial Programme

The following tutorials were scheduled on the day before the main conference (August 29, 2000):

An Introduction to Object-Z
  Graeme Smith, University of Queensland, Australia

A machine-independent approach to real-time refinement
  Ian Hayes, University of Queensland, Australia

The B-Method
  Ib Sørensen and Ken Robinson, B-Core Ltd, UK and University of New South Wales, Australia

Table of Contents

Meeting the Challenge of Object-Oriented Programming
  Mike Spivey (Invited Speaker)

A Formal Mapping between UML Models and Object-Z Specifications
  Soon-Kyeong Kim and David Carrington (Z)

A Generic Process to Refine a B Specification into a Relational Database Implementation
  Régine Laleau and Amel Mammar (B)

Recursive Schema Definitions in Object-Z
  Graeme Smith (Z)

On Mutually Recursive Free Types in Z
  I. Toyn, S.H. Valentine, and D.A. Duffy (Z)

Reasoning Inductively about Z Specifications via Unification
  David A. Duffy and Ian Toyn (Z)

Reconciling Axiomatic and Model-Based Specifications Using the B Method
  Ken Robinson (B)

Compositional Structuring in the B-Method: A Logical Viewpoint of the Static Context
  Theo Dimitrakos, Juan Bicarregui, Brian Matthews, and Tom Maibaum (B)

Automatic Construction of Validated B Components from Structured Developments
  Pierre Bontron and Marie-Laure Potet (B)

Playing with Abstraction and Refinement for Managing Features Interactions
  Dominique Cansell and Dominique Méry (Invited Speaker)

A Formal Architecture for the 3APL Agent Programming Language
  Mark d’Inverno, Koen Hindriks, and Michael Luck (Z)

How to drive a B Machine
  Helen Treharne and Steve Schneider (B)

Deriving Software Specifications from Event Based Models
  Nestor Lopez, Marianne Simonot and Veronique Viguié Donzeau-Gouge (B)

Reformulate Dynamic Properties during B Refinement and Forget Variants and Loop Invariants
  F. Bellegarde, C. Darlot, J. Julliand, and O. Kouchnarenko (B)

Type-Constrained Generics for Z
  Samuel H. Valentine, Ian Toyn, Susan Stepney, and Steve King (Z)

Typechecking Z
  Ian Toyn, Samuel H. Valentine, Susan Stepney, and Steve King (Z)

Guards, Preconditions, and Refinement in Z
  Ralph Miarka, Eerke Boiten and John Derrick (Z)

Retrenchment, Refinement and Simulation
  R. Banach and M. Poppleton (B)

Performing Algorithmic Refinement before Data Refinement in B
  Michael Butler and Mairead Meagher (B)

Program Development and Specification Refinement in the Schema Calculus
  Martin C. Henson and Steve Reeves (Z)

Are Smart Cards the Ideal Domain for Applying Formal Methods?
  Jean-Louis Lanet (Invited Speaker)

Formal Methods for Industrial Products
  Susan Stepney and David Cooper (Z)

An Execution Architecture for GSL
  Bill Stoddart (B)

A Computation Model for Z Based On Concurrent Constraint Resolution
  Wolfgang Grieskamp (Z)

Analysis of Compiled Code: A Prototype Formal Model
  R.D. Arthan (Z)

Zzzzzzzzzzzzzzzzzzzzzzzzzz
  David Everett (Invited Speaker)

Segregation with Communication
  David Cooper and Susan Stepney (Z)

Closure Induction in a Z-Like Language
  David A. Duffy and Jürgen Giesl (Z)

Fuzzy Concepts and Formal Methods: A Fuzzy Logic Toolkit for Z
  Chris Matthews and Paul A. Swatman (Z)

Author Index

Meeting the Challenge of Object-Oriented Programming Mike Spivey Oxford University Computing Laboratory Wolfson Building, Parks Rd, Oxford, OX1 3QD, UK [email protected]

Abstract. Object-oriented programming allows programs to be built from components, and makes it possible, with care, to produce libraries of components that can be used in many programs. In this talk, I will review some of the forms of composition and patterns of interaction that are used in OOP, and examine the question of whether mathematical specifications can be given to the interfaces between components.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, p. 1, 2000. © Springer-Verlag Berlin Heidelberg 2000

A Formal Mapping between UML Models and Object-Z Specifications Soon-Kyeong Kim and David Carrington Department of Computer Science and Electrical Engineering The University of Queensland, Brisbane, 4072, Australia Email: [email protected], [email protected]

Abstract. This paper presents a precise and descriptive semantics for core modeling concepts in Object-Z and a formal description for UML class constructs. Given the formal descriptions, it also provides a formal semantic mapping between the two languages at the meta-level, which makes our translation more systematic. Any verification of UML models can take place on their corresponding Object-Z specifications using reasoning techniques provided for Object-Z. With this approach, we provide not only a precise semantic basis for UML but also a sound mechanism for reasoning about UML models.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 2–21, 2000. © Springer-Verlag Berlin Heidelberg 2000

1 Introduction

When defining a specification notation, the most commonly used technique is to first define the abstract syntax of the language and then to describe its semantics. In the case of UML [9], the language is defined in terms of its metamodel. In the metamodel, individual modeling constructs that exist in UML are described from three distinct views using different notations. For example, the abstract syntax of modeling constructs in UML is described using class diagrams. The static and dynamic semantics of the modeling constructs are expressed in a combination of OCL [9] and English. Using graphical notation to define the abstract syntax of the language makes it easier to understand the language structure. However, the English description of the semantics is not precise enough for rigorous analysis of UML models. Moreover, UML does not currently provide a sound mechanism for reasoning about its models. In this sense, UML is not yet a precise modeling technique.

Our goal is to provide a formal basis for the syntactic structures and semantics of UML modeling constructs and to provide a sound mechanism for reasoning about UML models. To achieve this goal, we first give a formal description for UML modeling constructs using Object-Z classes. Second, we translate UML modeling constructs to Object-Z constructs for a rigorous analysis of UML models (readers may refer to [11] for reasoning techniques for Object-Z specifications). One of our major concerns in this work is to make the translation process as formal as possible. Without a systematic or formal mapping, the translation approach is not practical, and verifying the translation is also difficult. To achieve a formal mapping between two different languages, it is essential that both languages have a precise description of their syntax and semantics.

Like UML, the syntax of Object-Z [2, 12] is well defined. Major contributions to the semantics of Object-Z are provided by Griffiths [7] and Dong [1]. Griffiths in particular provides a semantic basis for some basic concepts in Object-Z: object identity, class, and object. However, the semantics described in Z are complex and difficult to understand. As a consequence, we define an abstract metamodel for Object-Z using an architecture similar to the one in which the UML metamodel is defined. In the metamodel, the abstract syntax and semantics of core modeling constructs in Object-Z are grouped together into Object-Z classes. We also use UML class diagrams to show the structure of Object-Z modeling constructs. Given the formal description for UML constructs and Object-Z constructs, the UML constructs are translated to Object-Z constructs. The translation process is described formally in terms of mapping functions. In this paper, we restrict the scope of our work to the UML class constructs and class diagrams.

UML has been accepted as a standard OO modeling notation by OMG [9] and is already popular in industry. Hence, it is very important that UML should have a precise semantics for its notation and should provide a systematic mechanism for verifying its models. Using a well-defined object-oriented formal specification notation like Object-Z (where most of the fundamental concepts in object-orientation such as object identity, class, inheritance and polymorphism are supported by the specification technique itself) to specify the UML:

• gives a rigorous way of exploring the concepts embodied within the UML,
• makes formal reasoning of UML models possible, and
• provides a precise basis for mapping between different specification languages.
From the Object-Z point of view, defining a metamodel for Object-Z enhances Object-Z as a specification language. In particular, the metamodel should provide a precise basis for mapping between Object-Z and other object-oriented specification techniques which have been defined in a similar metamodel architecture such as UML. Also the translation approach should contribute to the use of UML for developing Object-Z specifications. Within this scheme, at the early conceptual modeling stage, UML diagrams can be used to understand the structure and behavior of a system. Then the properties captured in the diagrams are translated to Object-Z specifications. The developed Object-Z specifications are enhanced by adding detailed properties required for the system that cannot be represented by using the UML notation alone. Finally a rigorous analysis takes place on the enhanced Object-Z specifications. The structure of the rest of this paper is as follows. Section 2 gives a formal description of UML class constructs using Object-Z classes. This work builds on our previous work [8] on formalizing UML class diagrams using Object-Z. Section 3 presents a metamodel for Object-Z. The metamodel is described using a UML class diagram and Object-Z classes. Section 4 describes the semantics of Object-Z modeling constructs and a mapping from the modeling constructs to their meaning. Section 5 presents a formal mapping from UML class constructs to Object-Z constructs with an example. Section 6 discusses related work. Finally, section 7 concludes and discusses further work.


S.-K. Kim and D. Carrington

2 UML

In this section, we give an example class diagram and provide a formal description for the UML class constructs using Object-Z classes. Based on this description, class diagrams are formally described.

[Figure: UML class diagram. Classes: Account (acNo : Int, balance : Int; (+)withdraw(an : Int, amount : Int), (+)deposit(an : Int, amount : Int)), Customer (name : String, address : String), Transaction (date : Date, time : Time), CheckingAccount (credit_limit : Int), CheckBook, Check. Role names: customer, account, check account, book, check. Multiplicities: *, *, 0..1, 1..20, 1, 20..50.]

Fig. 1. A UML class diagram for the bank system.

2.1 UML Class Diagrams

Figure 1 is an example class diagram that models a bank system. The diagram represents most UML class constructs, namely class, association, composition association, association class and generalization. It also meets all the rules described in the following section for well-formed class diagrams. The diagram contains two major entities in the system: Customer and Account. Each class has its own attributes and operations (the + and - symbols prefixed to the attribute and operation names represent the visibility of the attributes and operations). Class Account is specialized into CheckingAccount using UML generalization (the unfilled triangle in Fig. 1 indicates that class Account is the superclass). CheckingAccount is associated with class CheckBook. The multiplicity constraint 1..20 means that an instance of CheckingAccount maps to at least one and at most twenty instances of CheckBook. CheckBook has a composition relationship with class Check (a composition relationship is represented by attaching a filled diamond symbol to the composite class). An association class Transaction represents a relationship between classes Customer and Account and has its own attributes.

2.2 A Formal Definition of UML Class Constructs

Classes: A class in UML is a descriptor of a set of objects with common properties in terms of structure, behavior, and relationship [9]. A class has a name, attributes and operations. An attribute has a name, a visibility, a type, and a multiplicity. An operation has a name, a visibility and parameters. Each parameter of an operation has a name and a given type. Prior to formalizing classes, we define a given set, Name, from which the names of all classes, attributes, operations, operation parameters, associations and roles are drawn.

[Name]

An Object-Z class UMLType is a meta type, from which all possible types in UML such as object types, basic types (integer and string) and so on can be derived. Each type has a name and contains a collection of its own features: attributes and operations. Thus, a circled c (©), which models a containment relationship in Object-Z [1], is attached to the types of attributes and operations.

┌─ UMLType ─────────────────────────────
│ ┌─────────────────────────────────────
│ │ name : Name
│ │ attributes : 𝔽 UMLAttribute©
│ │ operations : 𝔽 UMLOperation©
│ └─────────────────────────────────────
└───────────────────────────────────────

Attributes and parameters are defined as follows. Variable multiplicity in UMLAttribute describes the possible number of data values for the attribute that may be held by an instance. Visibility in UML can be private, public, or protected.

VisibilityKind ::= private | public | protected

┌─ UMLAttribute ────────────────────────
│ ┌─────────────────────────────────────
│ │ name : Name
│ │ type : ↓UMLType
│ │ visibility : VisibilityKind
│ │ multiplicity : ℙ ℕ
│ └─────────────────────────────────────
└───────────────────────────────────────

┌─ UMLParameter ────────────────────────
│ ┌─────────────────────────────────────
│ │ name : Name
│ │ type : ↓UMLType
│ └─────────────────────────────────────
└───────────────────────────────────────

Within an operation, parameter names should be unique.

┌─ UMLOperation ────────────────────────
│ ┌─────────────────────────────────────
│ │ name : Name
│ │ visibility : VisibilityKind
│ │ parameters : seq UMLParameter©
│ ├─────────────────────────────────────
│ │ ∀ p1, p2 : ran parameters • p1.name = p2.name ⇒ p1 = p2
│ └─────────────────────────────────────
└───────────────────────────────────────

With these classes, we define an Object-Z class UMLClass as follows. Since a class is a type, it inherits from UMLType. Attribute names defined in a class should be different and operations should have different signatures. The class invariant formalizes these properties.

┌─ UMLClass ────────────────────────────
│ UMLType
│ ┌─────────────────────────────────────
│ │ ∀ a1, a2 : attributes • a1.name = a2.name ⇒ a1 = a2
│ │ ∀ op1, op2 : operations •
│ │   (op1.name = op2.name ∧ #op1.parameters = #op2.parameters ∧
│ │    ∀ i : 1 .. #op1.parameters •
│ │      op1.parameters(i).name = op2.parameters(i).name ∧
│ │      op1.parameters(i).type = op2.parameters(i).type) ⇒ op1 = op2
│ └─────────────────────────────────────
└───────────────────────────────────────
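The invariants of UMLOperation and UMLClass can be read as executable checks. The following Python sketch is only an illustration under simplifying assumptions: types are represented by plain strings rather than references to UMLType objects, and the helper names (operation_ok, same_signature, class_ok) are hypothetical and do not appear in the paper.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class UMLParameter:
    name: str
    type: str  # assumption: a type is identified by its name


@dataclass(frozen=True)
class UMLAttribute:
    name: str
    type: str
    visibility: str = "private"


@dataclass(frozen=True)
class UMLOperation:
    name: str
    parameters: tuple = ()  # sequence of UMLParameter
    visibility: str = "public"


def operation_ok(op):
    """UMLOperation invariant: distinct parameters have distinct names."""
    distinct = set(op.parameters)
    return len({p.name for p in distinct}) == len(distinct)


def same_signature(op1, op2):
    """Same name, same arity, and pairwise equal parameter names and types."""
    return (op1.name == op2.name
            and len(op1.parameters) == len(op2.parameters)
            and all(p.name == q.name and p.type == q.type
                    for p, q in zip(op1.parameters, op2.parameters)))


def class_ok(attributes, operations):
    """UMLClass invariant: unique attribute names, distinct operation signatures."""
    attrs = set(attributes)
    if len({a.name for a in attrs}) != len(attrs):
        return False
    return not any(same_signature(o1, o2)
                   for o1, o2 in combinations(set(operations), 2))


# Class Account from Fig. 1.
an, amount = UMLParameter("an", "Int"), UMLParameter("amount", "Int")
withdraw = UMLOperation("withdraw", (an, amount))
deposit = UMLOperation("deposit", (an, amount))
account_attrs = {UMLAttribute("acNo", "Int"),
                 UMLAttribute("balance", "Int", "public")}
print(class_ok(account_attrs, {withdraw, deposit}))
```

Two operations with identical names and parameter lists are rejected even when they differ in visibility, which mirrors the implication in the class invariant.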


Associations: In UML, relationships between classes are represented as associations. In most cases, associations in a class diagram are binary, that is, between exactly two classes. Moreover, aggregation and composition are always binary relationships. For these reasons, only binary associations are considered in this paper. A binary association has an association name and two association ends. Each association end has a role name, a multiplicity constraint, a class to which the association end is attached, an attribute to describe the navigability, and an attribute to describe whether the relationship is aggregation or composition.

The Object-Z class AssociationEnd is a formal description of association ends. It has a role name, a multiplicity constraint, an attached class and attributes for describing aggregation and navigability. The multiplicity constraint describes a range of nonnegative integers denoting the allowable cardinalities for instances of the class attached to the other end. The variable aggregation can take the values none, aggregate, or composite. The variable navigability can be true or false. The constraints in the predicate state that a multiplicity cannot be {0} and that, for composition, the multiplicity of the composite end can be no more than one.

AggregationKind ::= none | aggregate | composite

┌─ AssociationEnd ──────────────────────
│ ┌─────────────────────────────────────
│ │ rolename : Name
│ │ multiplicity : ℙ ℕ
│ │ attachedClass : ↓UMLClass
│ │ aggregation : AggregationKind
│ │ navigability : Boolean
│ ├─────────────────────────────────────
│ │ multiplicity ≠ {0}
│ │ aggregation = composite ⇒ multiplicity ∈ {{0, 1}, {1}}
│ └─────────────────────────────────────
└───────────────────────────────────────

A binary association has a name and exactly two association ends. An Object-Z class UMLAssociation is a formal description of binary associations.
┌─ UMLAssociation ──────────────────────
│ ┌─────────────────────────────────────
│ │ name : Name
│ │ e1, e2 : AssociationEnd©
│ ├─────────────────────────────────────
│ │ e1.rolename ≠ e2.rolename
│ │ e1.aggregation ∈ {aggregate, composite} ⇒ e2.aggregation = none
│ │ e1.rolename ∉ {a : e2.attachedClass.attributes • a.name}
│ │ e2.rolename ∉ {a : e1.attachedClass.attributes • a.name}
│ │ ∀ a1, a2 : UMLAssociation | a1 ≠ a2 •
│ │   {a1.e1.attachedClass, a1.e2.attachedClass} =
│ │   {a2.e1.attachedClass, a2.e2.attachedClass} ⇒ a1.name ≠ a2.name
│ └─────────────────────────────────────
└───────────────────────────────────────

The constraints in the predicate state the core properties of association:
• Each role name must be different.
• For aggregation and composition, there should be an aggregate or a composite end; the other end is therefore a part and should have the aggregation value none. We assume that e1 is the composite or aggregate.
• For an association or an association class, the role name at an association end should be different from the attribute names of the class attached to the other end.
• An association name should be unique in the combination of its attached classes.
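The well-formedness rules for association ends and binary associations listed above can be sketched as simple predicates. The Python below is an illustrative sketch, not the authors' formalization: a class is reduced to its name and attribute names, multiplicities are finite sets of integers, and all function names are hypothetical. The example ends are read from Fig. 1, assuming the role "check account" (0..1) sits at the CheckingAccount end and "book" (1..20) at the CheckBook end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UMLClass:
    name: str
    attribute_names: frozenset = frozenset()


@dataclass(frozen=True)
class AssociationEnd:
    rolename: str
    multiplicity: frozenset          # allowed cardinalities
    attached_class: UMLClass
    aggregation: str = "none"        # "none" | "aggregate" | "composite"
    navigable: bool = True


def end_ok(e):
    """AssociationEnd invariant."""
    if e.multiplicity == frozenset({0}):
        return False
    if e.aggregation == "composite":
        return e.multiplicity in (frozenset({0, 1}), frozenset({1}))
    return True


def association_ok(e1, e2):
    """Rules local to one UMLAssociation (e1 is the aggregate/composite end, if any)."""
    return (end_ok(e1) and end_ok(e2)
            and e1.rolename != e2.rolename
            and (e1.aggregation == "none" or e2.aggregation == "none")
            and e1.rolename not in e2.attached_class.attribute_names
            and e2.rolename not in e1.attached_class.attribute_names)


def names_unique(associations):
    """Distinct associations over the same pair of classes need distinct names.

    associations: iterable of (name, e1, e2) triples, one per association.
    """
    seen = set()
    for name, e1, e2 in associations:
        key = (name, frozenset({e1.attached_class, e2.attached_class}))
        if key in seen:
            return False
        seen.add(key)
    return True


checking = UMLClass("CheckingAccount", frozenset({"credit_limit"}))
checkbook = UMLClass("CheckBook")
end_account = AssociationEnd("check account", frozenset({0, 1}), checking)
end_book = AssociationEnd("book", frozenset(range(1, 21)), checkbook)
print(association_ok(end_account, end_book))
```

The name-uniqueness rule quantifies over all associations, so it is kept as a separate function over a collection instead of being folded into association_ok.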

Association classes: An association class inherits from a class and an association. We define an Object-Z class UMLAssocClass inheriting from UMLClass and UMLAssociation.

┌─ UMLAssocClass ───────────────────────
│ UMLClass
│ UMLAssociation
│ ┌─────────────────────────────────────
│ │ e1.aggregation = none ∧ e2.aggregation = none
│ │ self ∉ {e1.attachedClass, e2.attachedClass}
│ │ {a : attributes • a.name} ∩ {e1.rolename, e2.rolename} = ∅
│ └─────────────────────────────────────
└───────────────────────────────────────

The constraints describe well-formedness rules for association classes:
• the aggregation value of both association ends is none,
• an association class cannot be defined between itself and something else, and
• the role names and the attribute names do not overlap.

Generalizations: In UML, a generalization describes a taxonomic relationship between objects, in which objects of the superclass have general information and objects of the subclasses have more specific information [9]. We define this relationship with an Object-Z class named UMLGeneralization. In the class, two variables, super and sub, are declared to represent the superclass and the subclass involved in a generalization. The constraint prohibits any circular inheritance.

┌─ UMLGeneralization ───────────────────
│ ┌─────────────────────────────────────
│ │ super : ↓UMLClass
│ │ sub : ↓UMLClass
│ ├─────────────────────────────────────
│ │ {g : UMLGeneralization • (g.super, g.sub)}⁺ ∩ id(↓UMLClass) = ∅
│ └─────────────────────────────────────
└───────────────────────────────────────

Class diagrams: A UML class diagram is a collection of classes, including association classes, together with associations and generalizations between these classes. Classes should have unique names within the class diagram. The following Object-Z class is a formal description of UML class diagrams.
┌─ UMLClassDiagram ─────────────────────
│ ┌─────────────────────────────────────
│ │ class : 𝔽 ↓UMLClass
│ │ assoc : 𝔽 UMLAssociation
│ │ assocCls : 𝔽 UMLAssocClass
│ │ gen : 𝔽 UMLGeneralization
│ ├─────────────────────────────────────
│ │ ∀ c1, c2 : class • c1.name = c2.name ⇒ c1 = c2
│ │ ⋃{a : assoc • {a.e1.attachedClass, a.e2.attachedClass}} ⊆ class
│ │ ⋃{g : gen • {g.super, g.sub}} ⊆ class
│ │ assocCls ⊆ class
│ └─────────────────────────────────────
└───────────────────────────────────────


The constraints describe that:
• Classes that are involved in associations or association classes should be classes in the diagram.
• Classes involved in generalizations should be classes in the diagram.
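The circular-inheritance constraint of UMLGeneralization and the closure constraints of UMLClassDiagram can be illustrated with a small Python sketch. It assumes, for brevity, that classes are identified by their names and that associations and generalizations are given as pairs of class names; the function names are hypothetical.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given set of pairs."""
    closure = set(pairs)
    while True:
        new = {(a, d)
               for (a, b) in closure
               for (c, d) in closure
               if b == c} - closure
        if not new:
            return closure
        closure |= new


def generalizations_acyclic(gens):
    """gens: set of (super, sub) pairs; no class may be its own ancestor."""
    return all(a != b for (a, b) in transitive_closure(gens))


def diagram_ok(classes, assocs, gens):
    """classes: list of class names; assocs, gens: sets of class-name pairs."""
    names_unique = len(classes) == len(set(classes))
    used = {c for pair in (assocs | gens) for c in pair}
    return names_unique and used <= set(classes) and generalizations_acyclic(gens)


# The bank example of Fig. 1.
bank_classes = ["Customer", "Account", "CheckingAccount",
                "CheckBook", "Check", "Transaction"]
bank_assocs = {("Customer", "Account"),
               ("CheckingAccount", "CheckBook"),
               ("CheckBook", "Check")}
bank_gens = {("Account", "CheckingAccount")}
print(diagram_ok(bank_classes, bank_assocs, bank_gens))
```

Intersecting the transitive closure with the identity relation, as the schema does, amounts to checking that the closure contains no pair (c, c).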

3 Object-Z

Object-Z is an object-oriented extension to Z [13] designed specifically to facilitate specification in an object-oriented style. In this section, we define a metamodel of Object-Z using a UML class diagram and give a formal description for the metamodel.

3.1 A Metamodel of Object-Z

Figure 2 is a UML class diagram showing core modeling constructs in Object-Z and their relationships (we add OZ to the names of the model constructs to distinguish them from the UML modeling constructs described in section 2). OZType is a top-level metaclass from which all possible types in Object-Z can be drawn. Classes are a kind of type in Object-Z, so OZClass inherits from OZType. All other types in Object-Z, such as given types, power set types, Cartesian types, schema types and so on [2], are abstracted as a metaclass called OtherTypes.

[Figure: UML class diagram. Classes: OZSpecification, OZClass, OZType (name : Name), OtherTypes, OZAttribute (name : Name, visibility : VisibilityKind, multiplicity : Integer, navigability : NavigabilityKind, relationship : RelationshipKind), OZOperation (name : Name, visibility : VisibilityKind), Parameter (name : Name). Role names: type, system, class, superclass.]

Fig. 2. A class diagram showing the structure of modeling constructs in Object-Z

Classes: In Object-Z, classes are the major modeling construct for specifying a system. A Class is a template for objects that have common behaviors. A Class has a set of attributes and a set of operations. Each attribute has a name, a type, a visibility and some other features. Each operation has a name, a visibility and a set of parameters, each of which also has a name and a type. Syntactically, visible attributes and operations are listed in the visibility list.


Inheritance and Instantiation: Classes in Object-Z can be used to define other classes by inheritance. A class can inherit from several classes (multiple inheritance). Also, classes can be instantiated in other classes as attributes. In Object-Z, instantiation is used as a mechanism for modeling relationships between objects, which in UML are modeled using a separate modeling construct, Association. Objects which instantiate other classes as their attributes can refer to the objects of the instantiated classes. The values of these attributes are object-identities of the referenced objects.

Object-Z specifications: In Object-Z, a specification is usually developed bottom-up. Once the behaviors of individual objects are modeled in terms of classes, the whole system is modeled by composing the individual classes from the system's point of view in terms of a system class. Syntactically, there is no notation to distinguish the system class from other classes in Object-Z. However, the intention of the system class differs from that of other classes: the system class captures the behavior of objects as a group, and the relationships and interactions between the objects. In the metamodel, an Object-Z specification has a system class and a set of classes.

3.2 A Formal Description for the Object-Z Metamodel

We give a formal definition for the modeling constructs in the metamodel using Object-Z classes. Prior to this, we extend the semantics of type Name to include the names of all classes, attributes, operations and operation parameters in Object-Z. The following Object-Z class OZType is a formal description of metaclass OZType. In the metamodel, OZType is an abstract class from which all possible types in Object-Z can be derived. We assume that metaclass OtherTypes is also formalized by an Object-Z class of that name.
┌─ OZType ──────────────────────────────
│ ┌─────────────────────────────────────
│ │ name : Name
│ └─────────────────────────────────────
└───────────────────────────────────────

Attributes: The Object-Z class OZAttribute is a formal description of attributes. Each attribute has a name, a type, and a multiplicity constraining the number of values that the attribute may hold. It also has an attribute, relationship, to represent whether this attribute models a relationship between objects. As in UML, relationships between objects can be common reference relationships, or shared or unshared containment relationships. For this, we define an enumeration type, RelationshipKind, which can have relNone, reference, sharedContainment and unsharedContainment as its values. The value relNone represents pure attributes of a class. When an attribute models a relationship, the attribute navigability represents the direction of the relationship (although the navigability of a relationship is modeled implicitly in Object-Z). Visibility in Object-Z can be public or private.

RelationshipKind ::= relNone | reference | sharedContainment | unsharedContainment
NavigabilityKind ::= navNone | bi | one


┌─ OZAttribute ─────────────────────────
│ ┌─────────────────────────────────────
│ │ name : Name
│ │ type : ↓OZType
│ │ visibility : VisibilityKind
│ │ multiplicity : ℙ ℕ
│ │ relationship : RelationshipKind
│ │ navigability : NavigabilityKind
│ └─────────────────────────────────────
└───────────────────────────────────────

Parameters and Operations: We formalize OZParameter and OZOperation in the same way as OZAttribute.

┌─ OZParameter ─────────────────────────
│ ┌─────────────────────────────────────
│ │ name : Name
│ │ type : ↓OZType
│ └─────────────────────────────────────
└───────────────────────────────────────

┌─ OZOperation ─────────────────────────
│ ┌─────────────────────────────────────
│ │ name : Name
│ │ visibility : VisibilityKind
│ │ parameters : seq OZParameter©
│ └─────────────────────────────────────
└───────────────────────────────────────

Classes: Now we are in a position to formalize Object-Z classes. An Object-Z class named OZClass is a formal description for classes in Object-Z. Since classes are a kind of type, OZClass inherits from OZType. The attribute superclass maintains inheritance information of classes. Each class has its own attributes and operations defining static and dynamic behaviors of its instances. Circular inheritance is not allowed. Attribute and operation names should be unique within a class. These properties are specified in the predicate of OZClass. Functions directSuperclass and allSuperclass return direct superclasses of a class and all inherited superclasses of a class, respectively.

│ directSuperclass : OZClass → ℙ OZClass
│ allSuperclass : OZClass → ℙ OZClass
├───────────────────────────────────────
│ ∀ oc : OZClass •
│   directSuperclass(oc) = oc.superclass
│   allSuperclass(oc) = directSuperclass(oc) ∪
│     ⋃{sco : directSuperclass(oc) • allSuperclass(sco)}

┌─ OZClass ─────────────────────────────
│ OZType
│ ┌─────────────────────────────────────
│ │ superclass : 𝔽 OZClass
│ │ attributes : 𝔽 OZAttribute©
│ │ operations : 𝔽 OZOperation©
│ ├─────────────────────────────────────
│ │ self ∉ allSuperclass(self)
│ │ ∀ a1, a2 : attributes • a1.name = a2.name ⇒ a1 = a2
│ │ ∀ op1, op2 : operations • op1.name = op2.name ⇒ op1 = op2
│ └─────────────────────────────────────
└───────────────────────────────────────

Object-Z specifications: In class OZSpecification, an Object-Z specification is composed of a system class and a set of classes. If two Object-Z specifications contain the same system class, they model the system from an identical view. Thus, the two specifications can be considered as identical. Class names should be unique within the Object-Z specification in which they are used. These properties are specified in the predicate of OZSpecification.


┌─ OZSpecification ─────────────────────
│ ┌─────────────────────────────────────
│ │ system : OZClass
│ │ class : 𝔽 OZClass
│ ├─────────────────────────────────────
│ │ system ∈ class
│ │ ∀ c1, c2 : class • c1.name = c2.name ⇒ c1 = c2
│ │ ∀ s1, s2 : OZSpecification • s1.system = s2.system ⇒ s1 = s2
│ └─────────────────────────────────────
└───────────────────────────────────────
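The recursive definition of allSuperclass and the invariants that use it can be sketched in Python. This is an illustrative approximation under stated assumptions: classes are identified by name, the inheritance structure is a dictionary from a class name to the names of its direct superclasses, and the function names are hypothetical. The traversal is iterative so that it terminates even on a (malformed) circular hierarchy, which is exactly the case the OZClass invariant rules out.

```python
def all_superclasses(cls, direct):
    """All direct and inherited superclasses of cls.

    direct: dict mapping a class name to the set of its direct superclass names.
    """
    result, stack = set(), list(direct.get(cls, ()))
    while stack:
        sc = stack.pop()
        if sc not in result:
            result.add(sc)
            stack.extend(direct.get(sc, ()))
    return result


def inheritance_ok(cls, direct):
    """OZClass invariant: a class is not among its own superclasses."""
    return cls not in all_superclasses(cls, direct)


def specification_ok(system, class_names):
    """OZSpecification invariants local to one specification.

    class_names: list of class names, so that duplicates can be detected.
    """
    return system in class_names and len(class_names) == len(set(class_names))


hierarchy = {"CheckingAccount": {"Account"}, "Account": set()}
print(all_superclasses("CheckingAccount", hierarchy))
print(inheritance_ok("CheckingAccount", hierarchy))
```

The third OZSpecification invariant, which identifies specifications that share a system class, ranges over all specifications and is therefore not part of this per-specification check.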

4 The Semantics of Object-Z

In the previous section, we formalized the structure of core modeling constructs in Object-Z. In this section, we define the semantics of these constructs and map them to their semantics. Figure 3 is a UML class diagram showing the semantics of static modeling constructs in Object-Z (we do not include the semantics of dynamic modeling constructs such as operations in this paper).

[Figure: UML class diagram. Classes: OZAttribute, AttributeLink, Instance, Object, OZType (name : Name), OZClass, OZSpecification, OZSpecMeaning. Role names: attribute, type, semantics, class, value, slot, system, instance.]

Fig. 3. A UML class diagram showing the semantics of core modeling constructs in Object-Z

4.1 The Semantics of Attributes

An attribute is a named slot within a class that describes a range of values that instances of the class can have for that attribute. In the metamodel, an attribute is associated with a set of AttributeLinks. An AttributeLink is a named slot in an object, which holds the value of an attribute (see the relationships between OZAttribute, AttributeLink and Object in Fig. 3). In Object-Z, attributes may hold information about relationships between objects. In this case, the attribute values are the object-identities of the referenced objects. A formal description of AttributeLink is as follows.

┌─ AttributeLink ───────────────────────
│ ┌─────────────────────────────────────
│ │ attribute : OZAttribute
│ │ value : ↓Instance
│ └─────────────────────────────────────
└───────────────────────────────────────


Mapping attributes to their meaning: The following function maps an attribute to its meaning.

│ mapOZAttributeToMeaning : OZAttribute → ℙ AttributeLink
├───────────────────────────────────────
│ ∀ a : OZAttribute •
│   mapOZAttributeToMeaning(a) = {al : AttributeLink | al.attribute = a}

4.2 The Semantics of Types

The semantics of a type is a set of instances of that type. A formal description of Instance is as follows. Types can map to their meaning in the same way as attributes.

┌─ Instance ────────────────────────────
│ ┌─────────────────────────────────────
│ │ type : ↓OZType
│ └─────────────────────────────────────
└───────────────────────────────────────

4.3 The Semantics of Classes

Instances of a class are objects. In Object-Z, an object has a type that declares its structure and behavior. It has a set of attribute values including the object-identities of the other instances that it may reference. Unlike UML, inheritance in Object-Z does not imply that instances of a subclass are also instances of its superclasses. Instead, to allow polymorphism for object types, Object-Z has a notation ↓, which can be attached to a type. We define the type of an attribute which has a down-arrow attached to the type as a PolyType (see below for the definition of PolyType). When the down-arrow symbol is attached to a type, the type corresponds to its own instances and instances of all its subtypes.

Prior to giving a formal description for objects, we define two functions, descendents and allDescendents, that return children of a class and all descendents of a class, respectively. We also define another two functions, containedObject and allContainedObject, that return objects directly or indirectly contained by an object, respectively.
│ PolyType : ℙ ↓OZType
├───────────────────────────────────────
│ PolyType ⊆ OtherTypes

│ descendents : ↓OZType ⇸ ℙ ↓OZType
│ allDescendents : ↓OZType ⇸ ℙ ↓OZType
├───────────────────────────────────────
│ ∀ c : OZClass •
│   descendents(c) = {cc : OZClass | c ∈ cc.superclass}
│   allDescendents(c) = descendents(c) ∪
│     ⋃{cc : descendents(c) • allDescendents(cc)}

│ containedObject : Object ⇸ ℙ Object
│ allContainedObject : Object ⇸ ℙ Object
├───────────────────────────────────────
│ ∀ o : Object •
│   containedObject(o) = {s : o.slot | s.value ∈ Object ∧
│     (s.attribute.relationship = unsharedContainment ∨
│      s.attribute.relationship = sharedContainment) • s.value}
│   allContainedObject(o) = containedObject(o) ∪
│     ⋃{co : containedObject(o) • allContainedObject(co)}


A formal description of Object is as follows.

┌─ Object ──────────────────────────────
│ Instance
│ ┌─────────────────────────────────────
│ │ slot : ℙ AttributeLink©
│ ├─────────────────────────────────────
│ │ type ∈ OZClass
│ │ {s : slot • s.attribute} = type.attributes
│ │ ∀ a : type.attributes •
│ │   #{s : slot | s.attribute = a} ∈ a.multiplicity
│ │ ∀ s : slot •
│ │   s.attribute.type ∈ PolyType ⇒
│ │     s.value ∈ {i : Instance | i.type ∈ {s.attribute.type}
│ │       ∪ allDescendents(s.attribute.type)}
│ │   s.value ∈ Object ∧ s.attribute.navigability = bi ⇒
│ │     ∃ is : s.value.slot • is.value = self
│ │ self ∉ allContainedObject(self)
│ │ #{o : Object | ∃ so : o.slot | so.attribute.relationship = unsharedContainment ∧
│ │     so.value = self} ≤ 1
│ └─────────────────────────────────────
└───────────────────────────────────────

The invariants defined in the predicate state that:
• The attributes of slots must be equal to the attributes of the type.
• The number of attribute links that map to an attribute should satisfy the multiplicity constraint given for that attribute.
• If the type of a slot is a PolyType, the value of the slot corresponds to the instances of the type or subclasses of the type.
• If an attribute slot of an object represents a bi-directional relationship, the referenced object (which is the value of the attribute slot) should have a slot that has the referencing object as its value. This assures links between objects.
• When an object has a containment relationship with other objects, any circular or self-containment should be prohibited.
• For all objects, there can be only one composite object for them.

Mapping classes to their meaning: The following function maps a class to its meaning.

│ mapOZClassToMeaning : OZClass → ℙ Object
├───────────────────────────────────────
│ ∀ c : OZClass •
│   mapOZClassToMeaning(c) = {o : Object | o.type = c}

4.4 The Semantics of Object-Z Specifications

Semantically, an Object-Z specification consists of a system object and a collection of objects, which can be derived from the classes in the specification.
┌─ OZSpecMeaning ───────────────────────
│ ┌─────────────────────────────────────
│ │ system : Object
│ │ instance : 𝔽 Object
│ ├─────────────────────────────────────
│ │ system ∈ instance
│ │ ∀ i : instance •
│ │   ∀ s : i.slot | s.attribute.type ∈ OZClass • s.value ∈ instance
│ └─────────────────────────────────────
└───────────────────────────────────────


The invariants stated in the predicate denote that:
• The system object is also an instance in the system.
• All objects linked with other objects are instances within the system.

Mapping Object-Z specifications to their meaning: Based on the formal definition given for the structure and semantics of Object-Z specifications, we define a function that extracts the meaning of an Object-Z specification from its structure.

│ mapOZSpecToMeaning : OZSpecification → ℙ OZSpecMeaning
├───────────────────────────────────────
│ ∀ s : OZSpecification •
│   mapOZSpecToMeaning(s) = {sm : OZSpecMeaning |
│     s.system = sm.system.type ∧
│     {i : sm.instance • i.type} ⊆ s.class}

To be a semantic model for a given Object-Z specification, the type of the system object should be the system class in the Object-Z specification. All instances defined in the semantic model are instances of the classes defined in the Object-Z specification. The mapping function defined above describes all these properties.
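The conditions under which a collection of objects is a semantic model of a specification can be checked mechanically. The Python sketch below is a simplified illustration: classes are identified by name, an object's slots map attribute names to lists of referenced objects (only object-valued slots are modeled), and object identity is approximated with Python's id(). The class name Bank and the slot names are hypothetical, as is the function name is_semantic_model.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    type: str                                   # name of the object's class
    slots: dict = field(default_factory=dict)   # attribute name -> list of Obj


def is_semantic_model(system_class, class_names, system_obj, instances):
    """Check the OZSpecMeaning invariants and the mapOZSpecToMeaning conditions."""
    ids = {id(o) for o in instances}
    return (
        # the system object is also an instance in the system
        id(system_obj) in ids
        # the type of the system object is the system class
        and system_obj.type == system_class
        # every instance belongs to a class of the specification
        and all(o.type in class_names for o in instances)
        # every referenced object is itself an instance in the system
        and all(id(v) in ids
                for o in instances
                for values in o.slots.values()
                for v in values)
    )


account = Obj("Account")
customer = Obj("Customer", {"account": [account]})
bank = Obj("Bank", {"customers": [customer], "accounts": [account]})
classes = {"Bank", "Customer", "Account"}
print(is_semantic_model("Bank", classes, bank, [bank, customer, account]))
```

Dropping account from the instance list makes the check fail, because customer and bank would then reference an object outside the system.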

5 A Formal Mapping between UML Class Constructs and Object-Z Constructs

In this section, we describe how to map UML class constructs to Object-Z constructs using the bank example. The mapping is based on the formal definitions given for the constructs in sections 2, 3 and 4, and the UML semantics [9]. All translation rules are also described formally.

5.1 UML Classes

Semantically, a UML class in isolation is a type, from which all possible objects of that class can be drawn. With this semantics, each class in a UML class diagram is translated to an Object-Z class.

Transformation rules for classes: A formal description for mapping a UML class to an Object-Z class is given by function mapUMLClassToOZ, which takes a UML class and returns the corresponding Object-Z classes. The UML class name is used as the Object-Z class name. All attributes of the UML class are declared as attributes in the state schema of the corresponding Object-Z class. Also, each operation in the UML class is translated to an operation schema. In UML, types of attributes are a language-dependent specification of the implementation types and may be suppressed [9]. Types of attributes in Object-Z are language-independent specification types and cannot be omitted. Operation parameters are similar. In our work, we do not provide detailed transformation rules regarding attribute types and operation parameter types. Instead, we define an abstract function, convType, that maps a UML type to an Object-Z type. Visibility and multiplicity features are mapped to those of Object-Z.
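The transformation rules for classes can be sketched as a constructive translation. Note the difference from the formal definition: mapUMLClassToOZ characterizes a set of admissible Object-Z classes, whereas the Python below builds one minimal candidate. Only attributes are translated here; operations would follow the same pattern. The table inside conv_type is a hypothetical stand-in for the abstract function convType, which the paper deliberately leaves unspecified, and all Python names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UMLAttribute:
    name: str
    type: str
    visibility: str
    multiplicity: frozenset


@dataclass(frozen=True)
class OZAttribute:
    name: str
    type: str
    visibility: str
    multiplicity: frozenset
    relationship: str = "relNone"    # pure attribute, not a relationship
    navigability: str = "navNone"


def conv_type(uml_type):
    """Hypothetical type conversion; raises KeyError for unknown types."""
    return {"Int": "Integer", "String": "Text"}[uml_type]


def map_attribute(ua):
    """Name, visibility and multiplicity carry over; the type goes through conv_type."""
    return OZAttribute(ua.name, conv_type(ua.type), ua.visibility, ua.multiplicity)


def map_uml_class(name, attributes):
    """Return (Object-Z class name, set of OZAttribute, visibility list)."""
    oz_attrs = {map_attribute(a) for a in attributes}
    visible = sorted(a.name for a in oz_attrs if a.visibility == "public")
    return name, oz_attrs, visible


one = frozenset({1})
account_attrs = {UMLAttribute("acNo", "Int", "private", one),
                 UMLAttribute("balance", "Int", "public", one)}
oz_name, oz_attrs, oz_visible = map_uml_class("Account", account_attrs)
print(oz_name, oz_visible)
```

Attributes whose visibility is public end up in the visibility list, matching the way class Account is presented in the bank example.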


│ convType : ↓UMLType → ↓OZType
│ mapUMLClassToOZ : UMLClass → ℙ OZClass
├───────────────────────────────────────
│ ∀ uc : UMLClass •
│   mapUMLClassToOZ(uc) = {oc : OZClass | uc.name = oc.name ∧
│     ∀ ua : uc.attributes •
│       ∃ oa : oc.attributes •
│         oa.name = ua.name ∧ oa.type = convType(ua.type) ∧
│         oa.visibility = ua.visibility ∧ oa.multiplicity = ua.multiplicity ∧
│         oa.relationship = relNone ∧ oa.navigability = navNone
│     ∀ uo : uc.operations •
│       ∃ oo : oc.operations •
│         oo.name = uo.name ∧ oo.visibility = uo.visibility
│         ∀ up : ran uo.parameters •
│           ∃ op : ran oo.parameters •
│             op.name = up.name ∧ op.type = convType(up.type)}

The following Object-Z class is a formal description of class Account in the bank example in Figure 1. Attributes and operations that have the value public for their visibility are listed in the visibility list. Operation parameters are defined as inputs.

[Object-Z class Account. Visibility list: balance, Withdraw, Deposit. State attributes: acNo, balance. Operation schemas: Withdraw and Deposit, each declaring the inputs an? and amount? (declared types not recoverable).]

5.2 UML Associations

An association provides a mechanism for communication between objects of the classes it associates. Objects can communicate with each other via the links between them. An association can be directed with an arrow attached to one association end. In this case, only objects of the class to which the arrow is not attached can reference objects of the other class, to which the arrow is attached. As described in section 3, Object-Z uses instantiation as a mechanism for communication between objects. That is, a class can be instantiated within other classes as an attribute. The values of these attributes are the object identities of objects of the instantiated classes. In this way, an object can reference other objects via the object identities that it holds. We use this mechanism to model UML associations.
Transformation rules for associations: For each class attached to an association, if the association end attached to the opposite class is navigable, the association is defined as an attribute within the corresponding Object-Z class. The attribute type is a power set of the corresponding Object-Z class of its opposite class. A formal description for these rules is as follows. We first extend type OtherTypes to include a power set type PType and define a function powerT, which maps a type to a power set type.


S.-K. Kim and D. Carrington

We also define auxiliary functions, convRelationship and convNavigability that map UML relationships and kinds of UML navigability to those of Object-Z respectively.

  PType : ℙ ↓OZType
  ----------
  PType ⊆ OtherTypes

  powerT : ↓OZType → PType

  convRelationship : (AggregationKind × AggregationKind) → RelationshipKind
  ----------
  ∀ a1, a2 : AggregationKind; oa : RelationshipKind •
    convRelationship(a1, a2) = oa ⇔
      (a1 = a2 = none ⇒ oa = reference) ∧
      (a1 = aggregate ⇒ oa = sharedContainment) ∧
      (a1 = composite ⇒ oa = unsharedContainment)

  convNavigability : Boolean ↣ NavigabilityKind
  ----------
  convNavigability(true) = bi ∧ convNavigability(false) = one

Function mapAssocToOZ takes an association and returns the corresponding Object-Z classes of the UML classes that the association relates.

  mapAssocToOZ : UMLAssociation ⇸ ℙ (OZClass × OZClass)
  ----------
  ∀ a : UMLAssociation •
    mapAssocToOZ(a) = {soc, toc : OZClass |
      soc ∈ mapUMLClassToOZ(a.e1.attachedClass) ∧
      toc ∈ mapUMLClassToOZ(a.e2.attachedClass) ∧
      (a.e2.navigability = true ⇒
        (∃ sa : soc.attributes •
          sa.name = a.e2.rolename ∧ sa.type = powerT(toc) ∧
          sa.multiplicity = a.e2.multiplicity ∧
          sa.navigability = convNavigability(a.e1.navigability) ∧
          sa.relationship = convRelationship(a.e1.aggregation, a.e2.aggregation))) ∧
      (a.e1.navigability = true ⇒
        (∃ ta : toc.attributes •
          ta.name = a.e1.rolename ∧ ta.type = powerT(soc) ∧
          ta.multiplicity = a.e1.multiplicity ∧
          ta.navigability = convNavigability(a.e2.navigability) ∧
          ta.relationship = convRelationship(a.e2.aggregation, a.e1.aggregation)))}

In the bank example, the association between CheckingAccount and CheckBook is modeled according to the rules described above. In this case, the association is bidirectional, so the association is modeled as attributes within their corresponding Object-Z classes. The multiplicity of attribute book in class CheckingAccount has values {1..20}, which means that an instance of CheckingAccount maps to at least one instance of class CheckBook and at most twenty instances of CheckBook. The first constraint in the predicate of CheckingAccount formalizes this multiplicity constraint.
The second constraint in the predicate of CheckingAccount assures actual links between instances of CheckBook and instances of CheckingAccount. In the same way, attributes checkaccount and check in class CheckBook are restricted in their set sizes. In the case of attribute check, its relationship is defined as unsharedContainment (we assume that class Check is also formalized with an Object-Z class named Check). Thus the notation ©, which models unshared containment relationships in Object-Z [1], is added to the attribute type. Also a constraint assuring actual links between instances of the two classes is added in the predicate.

  CheckingAccount
    …
    credit_Limit : ℕ
    book : ℙ CheckBook
    ----------
    1 ≤ #book ≤ 20
    ∀ b : book • self ∈ b.checkaccount

  CheckBook
    checkaccount : ℙ CheckingAccount
    check : ℙ Check©
    ----------
    [set-size constraints on checkaccount and check]
    ∀ ca : checkaccount • self ∈ ca.book
    ∀ c : check • self ∈ c.checkbook

5.3 UML Association Classes

An association class is the case where an association has class-like properties, as well as association-like properties.

Transformation rules for association classes: The class-like properties of an association class are formalized as an Object-Z class as described earlier. Since an instance of an association class should map to a pair of instances of its associated classes, the association properties are formalized as two additional attributes within the Object-Z class that formalizes the association class. The types of these two attributes are the Object-Z classes corresponding to the classes that the association class relates. For each of the associated classes, an attribute is defined with the type of the corresponding Object-Z association class. With this rule, instances of the associated classes can reference each other via instances of the association class that link them. Function mapAssocClassToOZ is a formal description of these rules. The function takes an association class and returns a set of the three associated Object-Z classes of the association class.
  mapAssocClassToOZ : UMLAssocClass ⇸ ℙ (OZClass × (OZClass × OZClass))
  ----------
  ∀ ac : UMLAssocClass •
    mapAssocClassToOZ(ac) = {oc, soc, toc : OZClass |
      oc ∈ mapUMLClassToOZ(ac) ∧
      soc ∈ mapUMLClassToOZ(ac.e1.attachedClass) ∧
      toc ∈ mapUMLClassToOZ(ac.e2.attachedClass) ∧
      (∃ as : oc.attributes •
        as.name = ac.e2.rolename ∧ as.type = toc ∧
        as.navigability = convNavigability(ac.e2.navigability) ∧
        as.relationship = convRelationship(ac.e1.aggregation, ac.e2.aggregation)) ∧
      (∃ at : oc.attributes •
        at.name = ac.e1.rolename ∧ at.type = soc ∧
        at.navigability = convNavigability(ac.e1.navigability) ∧
        at.relationship = convRelationship(ac.e2.aggregation, ac.e1.aggregation)) ∧
      (ac.e2.navigability = true ⇒
        (∃ sa : soc.attributes •
          sa.name = ac.e2.rolename ∧ sa.type = powerT(oc) ∧
          sa.multiplicity = ac.e2.multiplicity ∧
          sa.navigability = convNavigability(ac.e1.navigability) ∧
          sa.relationship = convRelationship(ac.e1.aggregation, ac.e2.aggregation))) ∧
      (ac.e1.navigability = true ⇒
        (∃ ta : toc.attributes •
          ta.name = ac.e1.rolename ∧ ta.type = powerT(oc) ∧
          ta.multiplicity = ac.e1.multiplicity ∧
          ta.navigability = convNavigability(ac.e2.navigability) ∧
          ta.relationship = convRelationship(ac.e2.aggregation, ac.e1.aggregation)))
      • (oc, (soc, toc))}

In the bank example, association class Transaction is modeled as an Object-Z class named Transaction. Within this class, the association is modeled as two attributes: account and customer. Because the association is bidirectional, constraints mapping links between instances of the three classes are added in the predicates of the classes. Since the multiplicity is unlimited, no further constraints are given for the set size.

  Transaction
    date : Date
    time : Time
    account : Account
    customer : Customer
    ----------
    self ∈ account.customer
    self ∈ customer.account

  Account
    …
    customer : ℙ Transaction
    ----------
    ∀ t : customer • t.account = self

  Customer
    …
    account : ℙ Transaction
    ----------
    ∀ t : account • t.customer = self

5.4 UML Generalizations

Generalization represents inheritance information between classes. When a class inherits from other classes, the inherited classes must be included as superclasses within the corresponding Object-Z class of the inheriting class. Function mapGenToOZ is a formal description for this rule.
  mapGenToOZ : UMLGeneralization ⇸ ℙ (OZClass × OZClass)
  ----------
  ∀ g : UMLGeneralization •
    mapGenToOZ(g) = {supoc, suboc : OZClass |
      supoc ∈ mapUMLClassToOZ(g.super) ∧
      suboc ∈ mapUMLClassToOZ(g.sub) ∧ supoc ∈ suboc.superclass}

In the bank example, CheckingAccount is refined as follows:

  CheckingAccount
    Account
    …

5.5 Mapping a UML Class Diagram to an Object-Z Specification

We are now in a position to map the UML class diagram as a whole to an Object-Z specification. The individual class constructs that appear in a class diagram (classes, associations, association classes and generalizations) map to their corresponding Object-Z constructs using the predefined mapping functions.
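The transformation rules for associations (Section 5.2) and generalizations (Section 5.4) can also be read operationally. The following Python sketch is only an illustration under simplifying assumptions: the names AssocEnd, OZClass, map_association and map_generalization are hypothetical, attribute types are kept as plain strings, and the {1..1} multiplicity given to checkaccount is assumed for the example rather than taken from the bank model.

```python
from dataclasses import dataclass, field


@dataclass
class AssocEnd:
    attached_class: str        # name of the UML class attached to this end
    rolename: str
    multiplicity: tuple        # (lower, upper) bounds
    navigable: bool = True


@dataclass
class OZClass:
    name: str
    attributes: dict = field(default_factory=dict)    # name -> (type, multiplicity)
    superclasses: list = field(default_factory=list)


def map_association(classes, e1, e2):
    """For each end: if the opposite end is navigable, the class attached to
    this end receives a power-set attribute named after the opposite role."""
    for own, opposite in ((e1, e2), (e2, e1)):
        if opposite.navigable:
            classes[own.attached_class].attributes[opposite.rolename] = (
                "P " + opposite.attached_class, opposite.multiplicity)


def map_generalization(classes, sub, sup):
    """The inherited class is recorded as a superclass of the inheriting class."""
    classes[sub].superclasses.append(sup)


# Hypothetical fragment of the bank example.
classes = {n: OZClass(n) for n in ("Account", "CheckingAccount", "CheckBook")}
map_association(
    classes,
    AssocEnd("CheckingAccount", "checkaccount", (1, 1)),
    AssocEnd("CheckBook", "book", (1, 20)),
)
map_generalization(classes, "CheckingAccount", "Account")
```

With both ends navigable, each class receives an attribute typed by the power set of the opposite class; setting navigable=False on one end suppresses the attribute in the class at the other end, which mirrors the directed-association case.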


When a class is interpreted as a component within a class diagram, the class represents a set of existing objects of that class at a certain point in time. We formalize this semantics of classes by instantiating their corresponding Object-Z classes as sets within the system class of the Object-Z specification that is developed for the class diagram in which the classes are declared. When a class is inherited by other classes, the type of the set that represents the existing instances of the class is a PolyType, which maps the type to the class and all its subclasses. With this translation, the semantics of generalizations in UML, in which instances of a subclass are also instances of its superclasses, is described formally. A formal description for these rules is as follows. We first define an auxiliary function polyT, which maps an OZType to a PolyType. Function mapUMLClassDiagramToOZSpec maps a UML class diagram to its corresponding Object-Z specification. Since the mapping functions defined for individual UML class constructs return a set of the corresponding Object-Z classes of the UML constructs, more predicates restricting the Object-Z classes to those which map to the UML class constructs defined in a particular class diagram are added. 
  polyT : ↓OZType ↣ PolyType
  mapUMLClassDiagramToOZSpec : UMLClassDiagram ⇸ OZSpecification
  ----------
  ∀ d : UMLClassDiagram; s : OZSpecification •
    mapUMLClassDiagramToOZSpec(d) = s ⇔
      {c : d.class • c.name} = {c : s.class • c.name} ∧ #d.class = #s.class ∧
      (∀ c : d.class • ∃ oc : s.class • oc ∈ mapUMLClassToOZ(c)) ∧
      (∀ a : d.assoc • ∃ sc, tc : s.class • (sc, tc) ∈ mapAssocToOZ(a)) ∧
      (∀ ac : d.assocCls • ∃ oc, sc, tc : s.class • (oc, (sc, tc)) ∈ mapAssocClassToOZ(ac)) ∧
      (∀ g : d.gen • ∃ spc, sbc : s.class • (spc, sbc) ∈ mapGenToOZ(g)) ∧

      [Each Object-Z class is instantiated as a set in the system class]
      (∀ oc : s.class •
        ({soc : s.class | oc ∈ soc.superclass} = ∅ ⇒
          (∃ at : s.system.attributes • at.name = oc.name ∧ at.type = powerT(oc))) ∧
        ({soc : s.class | oc ∈ soc.superclass} ≠ ∅ ⇒
          (∃ at : s.system.attributes • at.name = oc.name ∧
            at.type = (polyT ⨾ powerT)(oc))) ∧

        [An Object-Z class only has attributes of its corresponding UML class and
         attributes that formalize associations with which the UML class is related.]
        (∃ uc : d.class | oc.name = uc.name •
          ∀ oat : oc.attributes | oat.name ∉ {uat : uc.attributes • uat.name} •
            (∃ ua : d.assoc •
              ∀ soc, toc : OZClass | (soc, toc) ∈ mapAssocToOZ(ua) ∧
                {soc, toc} ⊆ s.class ∧ oc ∈ {soc, toc} •
                (oc = soc ⇒ oat ∈ soc.attributes) ∧
                (oc = toc ⇒ oat ∈ toc.attributes)) ∨
            (∃ uac : d.assocCls •
              ∀ ooc, soc, toc : OZClass | (ooc, (soc, toc)) ∈ mapAssocClassToOZ(uac) ∧
                {ooc, soc, toc} ⊆ s.class ∧ oc ∈ {ooc, soc, toc} •
                (oc = ooc ⇒ oat ∈ ooc.attributes) ∧
                (oc = soc ⇒ oat ∈ soc.attributes) ∧
                (oc = toc ⇒ oat ∈ toc.attributes))))

In the bank example, class BankSystem is the system class. Within this class, all classes in the bank system are instantiated as sets. The type of attribute account is a PolyType. Thus, the notation ↓, which models the polymorphism for object types in Object-Z, is added to the attribute type. Since instances referenced by other instances should be existing instances of their types, constraints describing this property are added in the predicate.

  BankSystem
    account : ℙ ↓Account
    checkingaccount : ℙ CheckingAccount
    checkbook : ℙ CheckBook
    customer : ℙ Customer
    transaction : ℙ Transaction
    ----------
    checkingaccount ⊆ account
    ∀ ca : checkingaccount • ca.book ⊆ checkbook
    ∀ c : checkbook • c.checkaccount ⊆ checkingaccount
    ∀ a : account • a.customer ⊆ transaction
    ∀ c : customer • c.account ⊆ transaction
    ∀ t : transaction • t.account ∈ account ∧ t.customer ∈ customer
    …

5.6 Mapping the UML Class Diagram to its Semantic Models

In the previous sections, individual UML class constructs are translated to Object-Z constructs based on their semantics. Then, the entire class diagram is mapped to an Object-Z specification by using the predefined mapping functions for the class constructs. We now define a function that maps the class diagram to its semantic models. This function is defined using the two predefined functions, mapUMLClassDiagramToOZSpec and mapOZSpecToMeaning as follows:

  mapUMLClassDiagramToMeaning : UMLClassDiagram ⇸ ℙ OZSpecMeaning
  ----------
  mapUMLClassDiagramToMeaning =
    mapUMLClassDiagramToOZSpec ⨾ mapOZSpecToMeaning
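The predicate of the system class requires that every referenced instance is an existing instance of its type (for example, that each book of a checking account belongs to the set checkbook). As a rough illustration, the check can be sketched in Python as below. The function name unresolved_links and the dictionary encoding of the system state are assumptions made for this sketch; they are not part of the Object-Z specification.

```python
def unresolved_links(system, references):
    """Report references that point outside the sets held by the system class.

    system:     maps a system-class attribute (e.g. "checkbook") to the set
                of existing object identities.
    references: maps (owner_set, owner_id, attribute) to a pair
                (target_set, referenced_ids).
    Returns a list of (owner_id, attribute, missing_ids) triples.
    """
    problems = []
    for (owner_set, owner, attr), (target_set, referenced) in references.items():
        missing = referenced - system[target_set]
        if owner in system[owner_set] and missing:
            problems.append((owner, attr, sorted(missing)))
    return problems


# Hypothetical state: ca1 refers to a check book cb2 that does not exist.
system = {"checkingaccount": {"ca1"}, "checkbook": {"cb1"}}
references = {("checkingaccount", "ca1", "book"): ("checkbook", {"cb1", "cb2"})}
problems = unresolved_links(system, references)
```

An empty result corresponds to a state that satisfies the subset constraints of the system class.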

6 Related Work

Many people have proposed translating an informal OO model to a formal specification. Since our research involves UML and Object-Z, we only discuss other work that uses one of these notations. For example, France et al. [6] present a translation of a UML class diagram to a Z specification. Dupuy et al. [3] develop an Object-Z specification from an object model in OMT [10]. However, none of this work provides a formal basis for the languages used, and translation rules are described informally at the model level. In contrast, our work provides a formal definition for the languages used. Given the formal definition, we provide a formal mapping at the meta-level, which makes our translation more systematic. Evans et al. [5] formalize part of the UML class constructs using Z. Our work extends their work to most of the UML class constructs. Recently, the pUML group [4] focused on defining the semantics of core modeling concepts in UML. At this abstract level, they use OCL to specify the semantics of model elements in much the same way as the UML semantics are specified. Our work is a lower-level formalization that provides a formal basis for core modeling constructs in UML using a formal specification technique.


7 Conclusion

This paper has presented a formal mapping from UML models to Object-Z specifications. To achieve the formal mapping, it defines a metamodel for Object-Z and gives a formal definition for the UML. Given the formal definition for both languages, a semantic mapping between the two has been described. Defining a formal model for UML modeling constructs enhances the precision of UML as a specification language. Translating UML models to Object-Z specifications provides a sound mechanism for rigorous semantic analysis of the UML models. Moreover, defining a metamodel for Object-Z not only improves the Object-Z definition but also provides a precise basis for mapping between Object-Z and other specification languages. Future work will extend to other modeling constructs and diagrams in UML and Object-Z.

Acknowledgements. The authors would like to thank Graeme Smith and David Leadbetter for many helpful discussions.

References

[1] J. S. Dong and R. Duke. The Geometry of Object Containment, Object-Oriented Systems, vol. 2(1), pp. 41-63, Chapman & Hall, 1995.
[2] R. Duke, G. Rose, and G. Smith. Object-Z: A specification language advocated for the description of standards, Computer Standards & Interfaces, vol. 17, pp. 511-533, 1995.
[3] S. Dupuy, Y. Ledru, and M. Chabre-Peccoud. Integrating OMT and Object-Z, Proceedings of BCS FACS/EROS ROOM Workshop, 1997.
[4] A. Evans and S. Kent. Core Meta-Modeling Semantics of UML: The pUML Approach, Proc. 2nd IEEE conference on UML: UML'99, LNCS, No 1723, pp. 140-155, 1999.
[5] R. B. France, A. Evans, K. Lano, and B. Rumpe. Developing the UML as a Formal Modeling Notation, Computer Standards and Interfaces, No 19, pp. 325-334, 1998.
[6] R. B. France, J.-M. Bruel, M. M. Larrondo-Petrie, and M. Shroff. Exploring the Semantics of UML type structures with Z, Proc. 2nd IFIP conference, Formal Methods for Open Object-Based Distributed Systems (FMOODS'97), pp. 247-260, Chapman and Hall, 1997.
[7] A. Griffiths. A formal semantics to support modular reasoning in Object-Z, PhD Thesis, The University of Queensland, Australia, 1998.
[8] S.-K. Kim and D. Carrington. Formalizing the UML class diagram using Object-Z, Proc. 2nd IEEE conference on UML: UML'99, LNCS, No 1723, pp. 83-98, 1999.
[9] Object Management Group. OMG Unified Modeling Language Specification, version 1.3, 1999, http://www.omg.org
[10] J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-oriented modeling and design, Prentice-Hall, 1991.
[11] G. Smith. Extending W for Object-Z, ZUM'95: The Z Formal Specification Notation, pp. 276-295, Springer, 1995.
[12] G. Smith. The Object-Z Specification Language. Advances in Formal Methods. Kluwer Academic Publishers, 2000.
[13] J. M. Spivey. The Z Notation: A Reference Manual, Prentice Hall, 2nd edition, 1992.

A Generic Process to Refine a B Specification into a Relational Database Implementation

Régine Laleau and Amel Mammar

CEDRIC-IIE(CNAM), 18 allée Jean Rostand, 91025 Evry, France
{laleau, mammar}@iie.cnam.fr

Abstract. In this paper, an approach for refining B abstract specifications describing data-intensive applications into relational database implementations is presented. Using the refinement process of the B method, a set of generic refinement rules is described that takes into account both data and operations. The last step consists of mapping the final refined component into a relational database implementation. The different rules have been checked with the AtelierB prover. The aim of the work is to automate the refinement steps. This is possible thanks to the genericity of the rules. The approach is illustrated through a running example.

Keywords: B method, Refinement Process, Relational Database Implementation.

1 Introduction

The derivation of database programs from formal specifications is a well-known but incompletely solved problem. Most previous work in the area was restricted to the derivation of the database structure, mainly because among the different specification languages used in classical systems analysis and design methods (such as UML or OMT) only those describing the system data can be considered as formal. Another frequent mistake has been to consider that once the derivation of the database structure is achieved, the production of transactions is straightforward. However, it is not obvious, since a major feature of database applications is the great number of integrity constraints that must be verified at all times. Thus programs must be designed such that the constraints are satisfied, in order to guarantee the consistency of the database. This can result in specifying elaborate transactions. Our solution consists of using the B formal method to design database applications. The system is first specified at an abstract level. We have proposed an approach to combine semi-formal methods (UML and OMT) with the B specification language [1] to facilitate this task. The result is a complete description of the system (both data and operations) that can be proved to be consistent. The following phase is to use the B refinement process in order to derive database programs. Up to now, we have only considered the development of relational database [7] implementations. In this paper we describe the process that produces the core of a relational implementation: database structure with its main constraints and update SQL statements generated with respect to the constraints. Generating more elaborate transactions is ongoing work.

Up to now, proof activity in a formal development is manual and costly. In order to extend the use of formal methods outside the domain of critical systems, automatic provers need to be developed, as suggested in [4]. This is only possible if provers are specialized for particular domains where programs share common characteristics. An important feature of our work is the provision of a generic refinement process usable for any relational database specification. Generic rules are defined for each step of the refinement and proved. This should cut down on the number of proofs required and logically enable the development of an automatic refiner.

The remainder of this paper is organized as follows: in Section 2, a brief overview of the B refinement process and a description of relevant characteristics of the SQL relational language are given. In Section 3 a concise summary of the global method is presented. Section 4 gives a brief report on different related works. Section 5 presents the main features of the refinement process and describes the most interesting rules that allow an abstract B model to be refined into a concrete B model. More details can be found in [14]. The last step, presented in Section 6, consists of the mapping of the concrete model to a relational database implementation. Section 7 briefly describes through an example how the soundness of the refinement rules can be checked. Lastly, conclusions are presented in Section 8.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 22–41, 2000. © Springer-Verlag Berlin Heidelberg 2000

2 Overview of the B Refinement Process and the Relational Model

2.1 Overview of the B Refinement Process

Refinement is the process of transforming an abstract specification into a less abstract one. These transformations operate first on data by adding new variables or replacing the existing ones by others which are supposed to be more concrete (closer to a target implementation). Of course, any transformation on variables automatically involves transformations on operations in order to obtain a consistent specification. A refinement can operate on an abstract machine or another refinement component. The last step of the refinement process produces an implementation component which is to be used as the basis for translation to executable code. Both specification and refinement give rise to proof obligations. Specification proofs ensure that operations preserve the invariant, whereas refinement proofs ensure the correctness of a refined component with respect to its initial component. In order to prove the correctness of refinements, we used the relational semantics of substitutions based on the definition of the two predicates Trm(S) and Prd(S) which are defined in the B-Book:

  Trm(S) ≙ [S](x = x)       (1)

  Prd(S) ≙ ¬[S](x′ ≠ x)     (2)


Intuitively:
- Trm(S) gives the necessary and sufficient condition for the good execution of S,
- Prd(S) gives the link between the values of the variable x before (denoted x) and after (denoted x′) the execution of S.

With this definition, the correctness of refinement can be expressed as follows. Let S and T be two substitutions that act respectively on the sets of variables P and Q. T refines S according to the gluing invariant J(P,Q), denoted S ⊑J(P,Q) T, iff:

  (i) ∃u, v. J(u, v).
  (ii) ∀X, P, Q. (Trm(S) ∧ J(P, Q)) → Trm(T).
  (iii) ∀X, P, Q, X′, Q′. (Trm(S) ∧ J(P, Q) ∧ Prd(T)) → ∃P′. (Prd(S) ∧ J(P′, Q′)).

where:
- X is the set of variables that occur in S and not in P or Q,
- X′, P′, Q′ are respectively the sets of values of variables X, P and Q after the execution of S and T.

Condition (i) means that the gluing invariant must be satisfiable (it is not a contradiction). Condition (ii) means that for each possible interpretation of X and P that ensures the good execution of S, the corresponding set Q (that satisfies the gluing invariant) ensures the execution of T. Condition (iii) means that for each possible result Q′ of T, there must exist a corresponding result P′ of S such that P′ and Q′ verify the gluing invariant J.

2.2 Overview of the Relational Model

The relational model was defined by Codd [6]. Relations, which we call tables following SQL usage to avoid confusion with the B concept, are specified as sets of tuples. A tuple is an element of a cartesian product. The formal definition of a table contains two parts: the table intension and the table extension. The intension is a tuple type which defines the table attributes, each of which must be of a valid type. The extension is the set of instances of the tuple type which exist at a given moment. We use SQL-92 syntax [16]. Integrity constraints can be defined on tables:

– NOT NULL constraint: defined on an attribute (or a group of attributes), it specifies that the attribute must be valued.
– UNIQUE constraint: defined on an attribute (or a group of attributes), it specifies that each attribute value is unique in the table extension.
– Key constraints: each table has at least one key, which is an attribute (or a group of attributes). One of them is defined as the PRIMARY KEY of the table. The others are defined by specifying NOT NULL and UNIQUE constraints.
– Referential constraint: defined on an attribute (or a group of attributes), denoted R1, of a table T1 towards a key of a table T2, denoted K2. It specifies that the set of R1 attribute values is included in the set of K2 attribute values.

Other constraints, introduced by the keyword CHECK, can be defined. They represent predicates that must be verified by one or a group of tables.
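The effect of these constraints can be observed on a small database. The sketch below uses Python's standard sqlite3 module purely for illustration; SQLite is an assumption of this example (it is not the SQL-92 system targeted by the paper, and it enforces referential constraints only after PRAGMA foreign_keys = ON). The tables T1 and T2 and the helper rejected are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: referential constraints are checked only when enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE T2 (
    k2    INTEGER PRIMARY KEY,           -- key K2 of table T2
    label TEXT NOT NULL UNIQUE           -- a second key: NOT NULL + UNIQUE
);
CREATE TABLE T1 (
    k1  INTEGER PRIMARY KEY,
    r1  INTEGER NOT NULL REFERENCES T2(k2),  -- referential constraint R1 -> K2
    qty INTEGER CHECK (qty > 0)              -- CHECK predicate
);
""")
conn.execute("INSERT INTO T2 VALUES (1, 'a')")
conn.execute("INSERT INTO T1 VALUES (10, 1, 5)")


def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False
```

For instance, rejected("INSERT INTO T1 VALUES (11, 99, 1)") reports a violation because 99 is not a value of K2, whereas an insertion that references the existing key 1 is accepted.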

3 Description of our Approach

The aim of our project is to automatically derive relational database implementations from abstract specifications described with UML notations [18] (object diagram, state diagrams and collaboration diagrams). Moreover each implementation can be proved to be consistent with the corresponding abstract specification. The functional architecture is represented by the following figure:

[Figure: UML Diagrams → Translation Phase → Abstract Model (B Language) → Refinement Phase → Concrete Model (B Language) → Translation Phase → Relational Database Structure + Programs, with Proof labels on the B stages]

Fig. 1. Formal mapping of a UML conceptual model into a relational implementation using the B method

In order to map a UML conceptual model into a B abstract model, precise rules on the use of UML concepts have been defined [12]. In this paper, we are interested in generating the core of a relational implementation: relational database structure and basic update statements. They are derived only from class diagrams, which are detailed in the next section.

3.1 From UML Class Diagrams to B Specifications

In this section, we outline the mapping of UML class diagrams into B specifications. Only the main concepts (class, attribute and association) are presented. A complete description can be found in [11], [17]. The simple diagram of Figure 2 is used to illustrate the mapping:

[Figure: class diagram with three classes and two associations.
  Customer: last_name: string; first_name: string; tel_no: nat {1..2}
  Order: numor: nat {K}; date: nat
  Product: numpr: nat {K}; compos: string {1..*}
  Own: Customer (1) to Order (*)
  concerns: Order (*) to Product (1..*), with attribute quantity: nat]

Fig. 2. A UML class diagram

The class Order is described by two attributes numor and date. The label {K} specifies that numor is a key for this class (note that a key can be composed of a set of attributes). The class Product is described by a key attribute numpr and a multivalued attribute compos, specified by the label {1..*}. A Customer is specified by his/her last and first names and an attribute tel_no (telephone number) which is mandatory and multivalued (specified by the label {1..2}, which means that tel_no can have 2 values at most). Each order concerns one or more products. Each product may appear in zero or more orders. For each product related to an order, an ordered quantity is specified. Each order is placed by exactly one customer who may place zero or more orders.

• Classes. Each class A is formalized by an abstract given set of all possible instances of A and a variable vA representing the set of existing instances of A. Each attribute Att of a class A is modeled by a variable vAtt whose type is a binary relation between vA and the attribute type TAtt. Depending on the cardinality of the attribute, the relation may become a total or partial function. TAtt is a basic type (including enumerated sets) or built on a basic type using the PartOf or Sequence constructor. We impose that UML basic types correspond to B basic types. Up to now, we have not yet considered how to specify specific or aggregate types (such as an address type composed of a street name and a town). A key is translated by a total injective function because its value is unique and mandatory in every object.

• Associations. We consider only binary associations. This limit is not very important since n-ary associations can always be transformed into binary associations with additional constraints. An association Ass between two classes X and Y is modeled by a variable vAss defined as a relation between the existing instances of the two classes vX and vY. As for attributes, the type of the relation depends on the cardinality of each role (a role is an association end). An attribute of an association is formalized as an attribute of a class.

In database applications, basic update operations, which are application independent, can be automatically generated from class diagrams. Cardinality constraints are taken into account. In the example, only some of these operations are presented.

Example: The example is translated into the B specification as follows:

  MACHINE MODEL
  SETS Order, Customer, Product
  VARIABLES customer, last_name, first_name, tel_no, order, numor, date, own,
    product, numpr, compos, concerns, quantity
  INVARIANT
    customer ⊆ Customer ∧
    last_name ∈ customer → STRING ∧
    first_name ∈ customer → STRING ∧
    tel_no ∈ customer ↔ NAT ∧
    ∀x.(x ∈ customer ⇒ 1 ≤ card(tel_no[{x}]) ≤ 2) ∧
    order ⊆ Order ∧
    numor ∈ order ↣ NAT ∧
    date ∈ order → NAT ∧
    own ∈ order → customer ∧
    product ⊆ Product ∧
    numpr ∈ product ↣ NAT ∧
    compos ∈ product → STRING ∧
    concerns ∈ order ↔ product ∧
    dom(concerns) = order ∧
    quantity ∈ concerns → NAT


  INITIALISATION
    customer, last_name, first_name, tel_no, order, numor, date, own,
    product, numpr, compos, concerns, quantity :=
      ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅, ∅
  OPERATIONS /* only some operations are given */
    Add_Customer(cus1, ln1, fn1, t1) =
      PRE cus1 ∈ Customer - customer ∧ ln1 ∈ STRING ∧ fn1 ∈ STRING ∧
          t1 ⊆ NAT ∧ card(t1) ≤ 2 ∧ card(t1) ≥ 1
      THEN customer := customer ∪ {cus1} ∥
           last_name := last_name ∪ {cus1 ↦ ln1} ∥
           first_name := first_name ∪ {cus1 ↦ fn1} ∥
           tel_no := tel_no ∪ {cus1} × t1
      END;
    /* Parameters for adding an order necessarily contain a customer, a product
       and a quantity in order to guarantee consistent cardinalities */
    Add_Order(ord1, numor1, date1, cus1, pd1, quantity1) =
      PRE ord1 ∈ Order - order ∧ numor1 ∈ NAT - ran(numor) ∧ date1 ∈ NAT ∧
          cus1 ∈ customer ∧ pd1 ∈ product ∧ quantity1 ∈ NAT
      THEN order := order ∪ {ord1} ∥
           numor := numor ∪ {ord1 ↦ numor1} ∥
           date := date ∪ {ord1 ↦ date1} ∥
           own := own ∪ {ord1 ↦ cus1} ∥
           concerns := concerns ∪ {ord1 ↦ pd1} ∥
           quantity := quantity ∪ {(ord1 ↦ pd1) ↦ quantity1}
      END;
    /* Removing an order requires removal of its related concerns links. */
    Rem_Order(ord1) =
      PRE ord1 ∈ order
      THEN order := order - {ord1} ∥
           numor := {ord1} ⩤ numor ∥
           date := {ord1} ⩤ date ∥
           own := {ord1} ⩤ own ∥
           concerns := {ord1} ⩤ concerns ∥
           quantity := ({ord1} × product) ⩤ quantity
      END;
    Add_Product(pd1, n1, comp1) =
      PRE pd1 ∈ Product - product ∧ n1 ∈ NAT - ran(numpr) ∧ comp1 ⊆ STRING
      THEN product := product ∪ {pd1} ∥
           numpr := numpr ∪ {pd1 ↦ n1} ∥
           compos := compos ∪ {pd1} × comp1
      END;
    Rem_Product(pd1) =
      PRE pd1 ∈ product - ran(concerns)
      THEN product := product - {pd1} ∥
           numpr := {pd1} ⩤ numpr ∥
           compos := {pd1} ⩤ compos
      END
  END

Note that the specification obtained with our rules is usually modularized (for more details see [11], [17]). In this paper, the specification is presented as a single module in order to simplify the description of the refinement method.

3.2 Refinement of B Specifications into Relational Database Implementations

All classical systems analysis and design methods propose an algorithm to translate class diagrams into relational schemas [3]. The steps are:


(i) Transformation of inheritance links. Three solutions are possible: either the superclass is kept and its subclass characteristics are shifted into it, or the opposite solution, or both super and subclasses are kept and inclusion constraints added.
(ii) Transition to the first normal form: in the relational model, all the attribute domains are atomic. Thus a multivalued attribute of a class diagram is replaced either by a new class or by new atomic attributes according to its cardinality.
(iii) Transition from an object-based model to a value-based model. In the relational model, each table must have a key. It is a value-based model, which means that each tuple is identified by a key which is a set of attributes. On the contrary, the UML model is an object-based model: each object is identified by an object identifier, independent of its value.
(iv) Transformation of association links: an association between classes A and B with at least one monovalued role (attached for example to class A) is replaced by a new attribute in A, which is linked to the key of B by a referential constraint. Other associations are replaced by a new class with two attributes which are linked to the keys of A and B by referential constraints.
(v) Finally, each class is mapped into a table.

Using these rules, the example gives the following SQL schemas:

  CREATE TABLE Order(
    numor INT PRIMARY KEY,
    date INT NOT NULL,
    own1 INT NOT NULL REFERENCES Customer(numcus))

  CREATE TABLE Product(
    numpr INT PRIMARY KEY)

  CREATE TABLE Customer(
    numcus INT PRIMARY KEY,
    last_name CHAR(30) NOT NULL,
    first_name CHAR(30) NOT NULL,
    tel_no1 NAT NOT NULL,
    tel_no2 NAT)

  CREATE TABLE Compos(
    label CHAR(30) NOT NULL,
    numpr2 INT NOT NULL REFERENCES Product(numpr),
    PRIMARY KEY (label, numpr2))

  CREATE TABLE Concerns(
    numor1 INT NOT NULL REFERENCES Order(numor),
    numpr1 INT NOT NULL REFERENCES Product(numpr),
    quantity INT NOT NULL,
    PRIMARY KEY (numor1, numpr1))
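To see how basic operations such as Add_Order and Rem_Order interact with this schema, the following Python sketch loads part of it into an in-memory SQLite database. Several points are assumptions of the sketch and not the statements generated by the method (those are described in Section 6): SQLite stands in for an SQL-92 system, Order is quoted because it is a reserved word, tel_no columns are declared INT, the Compos table is omitted, and the functions open_db, add_order and rem_order are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Customer (numcus INT PRIMARY KEY,
                       last_name CHAR(30) NOT NULL,
                       first_name CHAR(30) NOT NULL,
                       tel_no1 INT NOT NULL, tel_no2 INT);
CREATE TABLE "Order" (numor INT PRIMARY KEY, date INT NOT NULL,
                      own1 INT NOT NULL REFERENCES Customer(numcus));
CREATE TABLE Product (numpr INT PRIMARY KEY);
CREATE TABLE Concerns (numor1 INT NOT NULL REFERENCES "Order"(numor),
                       numpr1 INT NOT NULL REFERENCES Product(numpr),
                       quantity INT NOT NULL,
                       PRIMARY KEY (numor1, numpr1));
"""


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
    conn.executescript(SCHEMA)
    return conn


def add_order(conn, numor, date, numcus, numpr, quantity):
    """Insert an order together with its first Concerns row, atomically."""
    with conn:  # commits on success, rolls back if a constraint fails
        conn.execute('INSERT INTO "Order" VALUES (?, ?, ?)', (numor, date, numcus))
        conn.execute("INSERT INTO Concerns VALUES (?, ?, ?)", (numor, numpr, quantity))


def rem_order(conn, numor):
    """Remove the Concerns rows of an order first, then the order itself."""
    with conn:
        conn.execute("DELETE FROM Concerns WHERE numor1 = ?", (numor,))
        conn.execute('DELETE FROM "Order" WHERE numor = ?', (numor,))
```

Grouping the two insertions in one transaction reflects the comment in the B machine that an order must be created together with a product and a quantity; deleting the Concerns rows before the order keeps the referential constraint satisfied throughout.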

Refining a B Specification into a Relational Database Implementation


Our method produces the same schema using the B refinement process. Basic SQL statements corresponding to the basic operations defined at the conceptual level are also generated (cf. Section 6). An important objective of our work is to automate this refinement process. Although this automation seems an arduous task in general, we think it becomes feasible if we consider applications of a particular domain, which are implemented using the same refinement process. In this framework the automation consists of:
- defining generic refinement rules;
- proving the correctness of the rules.
These two points are detailed in Sections 5 and 6.

4

Related Work

Directly related works are those of Barros [2] and Günther et al. [13]. Both generate, from formal specifications, implementations in DBPL, which is a more powerful language than SQL. The second work uses the B refinement process. However, the approach is presented only through an example and no formal rules are specified. Moreover, no details are given about tool support. Barros’ method consists of first specifying the most important relational concepts in Z and then defining rules to translate such a Z specification into a DBPL implementation. This last part corresponds to the last step of our method, which consists of mapping a concrete B model into an SQL implementation. The works of Evans [10] and Castelli et al. [5] are more general than our work. They deal with the definition of formal rules to transform diagrams. Refinement is considered as a specific case of diagram transformation. They are not interested in automating the refinement process that produces a relational implementation. The work of Locuratolo and her collaborators [15] deals with refinement for database design but considers object-oriented databases.

5

Refinement Rules

Now, a description of the specific B refinement process we have defined is presented. The abstract model is obtained using the method described in Section 3.1. The “concrete model” is to be defined such that the mapping to a relational implementation is automatic. Thus it is not described in B0 since SQL is more abstract than B0. The different refinements correspond to the algorithm presented in Section 3.2. For each of them, two kinds of rules are specified: first a data refinement rule, then refinement rules for the substitutions that occur in the basic operations.

RULE Ri_D IS
  ABSTRACT VARIABLES a_abstract
  CONSTRAINT C
  CONCRETE VARIABLES a_concrete
  INVARIANT I ∧ J
END

RULES Ri_S IS
  REFINE P | Abstract_S
  REFINEMENT Q | Concrete_S
END


• Data Refinement Rule (Ri_D): this rule specifies, for each abstract variable a_abstract that satisfies the constraint C, the concrete variable a_concrete, its type invariant I and the gluing invariant J. The gluing invariant J gives the relation between the variables a_abstract and a_concrete.
• Substitution Refinement Rules (Ri_S): this set of rules refines substitutions that act on variables refined by the corresponding Ri_D. In this paper, we consider two kinds of basic operation: add and remove. Each rule depends on the kind of the basic operation. This gives two refinement rules: Ri_A for Add and Ri_R for Remove. More precisely, each possible substitution on the abstract variable refined by Ri_D generates one rule Ri_A or Ri_R. This means that, when the data refinement rule Ri_D is performed, each abstract substitution P | Abstract_S is refined by the concrete one Q | Concrete_S. If P does not depend on a_abstract then we have decided not to refine P (thus P = Q). In this case, preconditions are omitted.

For lack of space, only the rules representative of the transformation are presented. For more details, see [14].

5.1 Transition to the First Normal Form

Multivalued attributes and attributes with a complex type (i.e., types built on the PartOf or Sequence constructors) must be transformed. Rule 1 achieves this.

Rule 1. Multivalued Attributes. A multivalued attribute Att is represented by a relation between the class variable and its type. Its refinement depends on the cardinality of the attribute: if it is bounded by an integer k (k is determined by the designer), Att is split into k new monovalued attributes corresponding to the concrete variables f1, ..., fk. If the cardinality is unbounded or if it is greater than k, the data refinement rule can be considered as a rewriting rule that makes the relation f more suitable for further refinement. A special class is introduced (variable C) with no given set, and f is refined by f1, which becomes a standard relation between two classes.

Data Refinement Rule (Bounded Cardinality)
RULE R11_D IS
  ABSTRACT VARIABLES f
  CONSTRAINT f ∈ A ↔ T ∧ ∀x.(x ∈ A ⇒ j ≤ card(f[{x}]) ≤ k)
  CONCRETE VARIABLES f1, ..., fj, ..., fk
  INVARIANT f1 ∈ A → T ∧ ... ∧ fj ∈ A → T ∧ fj+1 ∈ A ⇸ T ∧ ... ∧ fk ∈ A ⇸ T ∧ f1 ∪ ... ∪ fk = f
END

Data Refinement Rule (Unbounded Cardinality)
RULE R12_D IS
  ABSTRACT VARIABLES f
  CONSTRAINT f ∈ A ↔ T
  CONCRETE VARIABLES f1, C, vkey^C
  INVARIANT C = T ∧ f1 ∈ A ↔ C ∧ f1 = f ∧ vkey^C = id(C)
END

Note: vkey^C denotes a function which represents a key attribute of C.

Applying R11_D to tel_no results in the definition of two new attributes specified as follows:
  tel_no1 ∈ customer → NAT ∧ tel_no2 ∈ customer ⇸ NAT
The gluing invariant is:
  tel_no1 ∪ tel_no2 = tel_no

Applying R12_D to compos gives:
  Type_comp = STRING ∧ compos1 ∈ product ↔ Type_comp ∧ compos1 = compos ∧ vkey^Type_comp = id(STRING)

The substitution refinement rules corresponding to R11_D are:

Add Substitution
RULE R11_A IS
  REFINE f := f ∪ {a} ∗ b
  REFINEMENT
    ANY g1, ..., gk WHERE
      ∧i=1..k gi ⊆ T
      ∧i=1..j card(gi) = 1
      ∧i=j+1..k card(gi) ≤ 1
      ∧ ∪i=1..k gi = b ∧ ∀i, j (gi ∩ gj = ∅)
    THEN
      ‖i=1..k IF gi ≠ ∅ THEN fi := fi ∪ {a ↦ choice(gi)} END
    END
END

Remove Substitution
RULE R11_R IS
  REFINE f := {a} ⩤ f
  REFINEMENT ‖i=1..k fi := {a} ⩤ fi
END
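The bounded-cardinality case (R11_D together with R11_A and R11_R) can be illustrated with a small Python model. This is only a sketch under assumptions of ours: relations are sets of pairs, the concrete variables f1, ..., fk are dictionaries, and the nondeterministic ANY substitution is resolved by sorting the values. The function names are hypothetical.

```python
def add_values(parts, a, b, j=0):
    """Counterpart of R11_A: distribute the value set b of a new object a."""
    k = len(parts)
    if not j <= len(b) <= k:
        raise ValueError("cardinality of b must lie between j and k")
    # ANY is nondeterministic; sorting the values is one arbitrary choice.
    for part, value in zip(parts, sorted(b)):
        part[a] = value


def split_multivalued(f, k, j=0):
    """Counterpart of R11_D: refine relation f into k dictionaries f1..fk."""
    parts = [dict() for _ in range(k)]
    for a in {obj for obj, _ in f}:
        add_values(parts, a, {v for obj, v in f if obj == a}, j)
    return parts


def remove_object(parts, a):
    """Counterpart of R11_R: remove a from the domain of every fi."""
    for part in parts:
        part.pop(a, None)


def glue(parts):
    """Left-hand side of the gluing invariant: f1 ∪ ... ∪ fk as a relation."""
    return {(a, v) for part in parts for a, v in part.items()}
```

With k = 2 and j = 1, as for tel_no, the first dictionary is defined for every customer and the second only for customers with two numbers; glue(parts) == f is the gluing invariant.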

Note: ∧i=1..k pi is an abbreviation for p1 ∧ ... ∧ pk, and ‖i=1..k Si for S1 ‖ ... ‖ Sk.

In R11_A, a denotes an object of class A and b is a set of values included in T that becomes the new value, for the object a, of the abstract multivalued attribute represented by f. Intuitively, the refinement rule consists of splitting the set b into k atomic values linked to the concrete variables fi. In Add_Customer, the substitution tel_no := tel_no ∪ {cus1} ∗ t1 is refined by:

ANY g1, g2 WHERE
  g1 ⊆ NAT ∧ g2 ⊆ NAT ∧ g1 ∪ g2 = t1 ∧ card(g1) = 1 ∧ card(g2) ≤ 1 ∧ g1 ∩ g2 = ∅
THEN
  IF g1 ≠ ∅ THEN tel_no1 := tel_no1 ∪ {cus1 ↦ choice(g1)} END ‖
  IF g2 ≠ ∅ THEN tel_no2 := tel_no2 ∪ {cus1 ↦ choice(g2)} END
END

The rules corresponding to R12_D are obvious (just replace f by f1).

5.2 From an Object-Based Model to a Value-based Model

Rule 2. Introduction of Keys. For each class, a key is to be exhibited. In B, a UML class is represented by a variable v whose type is a given set. A key corresponds to a total injective function defined on v. If such a function does not exist in the initial specification, a new variable vkey^v, which we call the primary key, is added as follows:

RULE R2_D IS
  ABSTRACT VARIABLES v
  CONSTRAINT SET(V) ∧ v ⊆ v′ ∧ v′ ⊆ V
  CONCRETE VARIABLES vkey^v
  INVARIANT vkey^v ∈ v ↣ Tkey
END

SET(V) means that the set V is a given set. Tkey is a basic B type, such as NAT, STRING, ..., chosen by the designer. The substitution refinement rule consists of adding a new substitution to the existing ones, in order to preserve the invariant.

Add Substitution
RULE R2_A IS
  REFINE v := v ∪ {v1}
  REFINEMENT v := v ∪ {v1} ‖ ANY x ∈ Tkey − ran(vkey^v) THEN vkey^v := vkey^v ∪ {v1 ↦ x} END
END

Remove Substitution
RULE R2_R IS
  REFINE v := v − {v1}
  REFINEMENT v := v − {v1} ‖ vkey^v := {v1} ⩤ vkey^v
END
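A minimal Python sketch of Rule 2 follows, assuming that Tkey is the natural numbers and that the ANY substitution simply picks the smallest unused candidate. The class name KeyedClass and its methods are hypothetical and only mirror R2_D, R2_A and R2_R.

```python
from itertools import count


class KeyedClass:
    """A class variable v together with its primary key function vkey."""

    def __init__(self):
        self.v = set()                # abstract variable v (object identities)
        self.vkey = {}                # concrete variable: total injection v -> Tkey
        self._candidates = count(1)   # assumption: keys are drawn from 1, 2, 3, ...

    def add(self, obj):
        """Counterpart of R2_A: add obj and give it a key outside ran(vkey)."""
        if obj in self.v:
            return self.vkey[obj]
        used = set(self.vkey.values())
        x = next(n for n in self._candidates if n not in used)
        self.v.add(obj)
        self.vkey[obj] = x
        return x

    def remove(self, obj):
        """Counterpart of R2_R: remove obj from v and from the domain of vkey."""
        self.v.discard(obj)
        self.vkey.pop(obj, None)

    def invariant(self):
        """vkey is total on v and injective."""
        total = set(self.vkey) == self.v
        injective = len(set(self.vkey.values())) == len(self.vkey)
        return total and injective
```

Choosing x outside ran(vkey) is what keeps the function injective after each Add, which is the proof obligation the rule has to discharge.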

In the example, there is no injective function defined on customer, so the variable numcus is added: numcus ∈ customer ↣ NAT. The substitution
  ANY numcus1 ∈ NAT − ran(numcus) THEN numcus := numcus ∪ {cus1 ↦ numcus1}
is added to the Add_Customer operation. In the case where the designer has specified several keys for a class, he/she has to choose one to become the primary key. In order to simplify the presentation of the following rules, we consider that a primary key is composed of only one attribute.

5.3 Transformation of Associations

Rule 3. Monovalued Associations. This kind of association corresponds to associations with at least one monovalued role. In B, such an association is represented by a function f defined between two class variables C and D. It is refined into a function between C and the primary key of D.

RULE R3_D IS
  ABSTRACT VARIABLES f
  CONSTRAINT f ∈ C ⇸ D ∧ vkey^D ∈ D ↣ Tkey^D
  CONCRETE VARIABLES f1
  INVARIANT f1 ∈ C ⇸ Tkey^D ∧ f1 = f ; vkey^D
END

The substitution refinement replaces each D object by its key value:

Add Substitution
RULE R3_A IS
  REFINE f := f ∪ {c ↦ d}
  REFINEMENT f1 := f1 ∪ {c ↦ vkey^D(d)}
END

Remove links from C
RULE R3_R1 IS
  REFINE f := {c} ⩤ f
  REFINEMENT f1 := {c} ⩤ f1
END

Remove links from D
RULE R3_R2 IS
  REFINE f := f ⩥ {d}
  REFINEMENT f1 := f1 ⩥ {vkey^D(d)}
END
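The three substitution rules of Rule 3 can be mimicked in Python by keeping the abstract function f and the concrete function f1 side by side and checking the gluing invariant f1 = f ; vkey^D after each step. The sketch below assumes functions are dictionaries; all function names are hypothetical.

```python
def compose(f, vkey):
    """f ; vkey for a function f (C -> D) and a key function vkey (D -> Tkey)."""
    return {c: vkey[d] for c, d in f.items()}


def add_link(f, f1, vkey, c, d):
    """Counterpart of R3_A: the D object is replaced by its key value."""
    f[c] = d
    f1[c] = vkey[d]


def remove_from_c(f, f1, c):
    """Counterpart of R3_R1: domain subtraction on both f and f1."""
    f.pop(c, None)
    f1.pop(c, None)


def remove_from_d(f, f1, vkey, d):
    """Counterpart of R3_R2: range subtraction, by object on f and by key on f1."""
    for c in [c for c, x in f.items() if x == d]:
        del f[c]
    for c in [c for c, key in f1.items() if key == vkey[d]]:
        del f1[c]
```

Because vkey is injective, removing the pairs of f1 whose value is vkey(d) removes exactly the pairs that f loses when d is removed from its range.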


The own function of our example is refined by the own1 function defined as follows:
  own1 ∈ order → NAT ∧ own1 = own ; numcus
The different substitutions that act on own are refined:
  In Add_Order: own := own ∪ {ord1 ↦ cus1} ⊑ own1 := own1 ∪ {ord1 ↦ numcus(cus1)}
  In Rem_Order: own := {ord1} ⩤ own ⊑ own1 := {ord1} ⩤ own1

Rule 4. Other Associations. This rule refines all the associations that cannot be refined by the previous rule. The idea is to “replace” the variable corresponding to the association by two new variables (v1 and v2) which correspond to the key attribute of each class linked by the association and such that (v1, v2) is a key of this association.

RULE R4_D IS
  ABSTRACT VARIABLES f
  CONSTRAINT f ∈ C ↔ D ∧ vkey^C ∈ C ↣ Tkey^C ∧ vkey^D ∈ D ↣ Tkey^D
  CONCRETE VARIABLES v1, v2
  INVARIANT (v1 ‖ v2) ∈ f ↣ Tkey^C × Tkey^D ∧ v1 = dom(f) ◁ vkey^C ∧ v2 = ran(f) ◁ vkey^D
END

The substitution refinement rules for a standard association are:

Add Substitution
RULE R4_A IS
  REFINE f := f ∪ {c ↦ d}
  REFINEMENT f := f ∪ {c ↦ d} ‖ v1 := v1 ∪ {c ↦ vkey^C(c)} ‖ v2 := v2 ∪ {d ↦ vkey^D(d)}
END

Remove links from C
RULE R4_R1 IS
  REFINE f := {c} ⩤ f
  REFINEMENT f := {c} ⩤ f ‖ v1 := {c} ⩤ v1 ‖ v2 := f[{c}] ⩤ v2
END

Remove links from D
RULE R4_R2 IS
  REFINE f := f ⩥ {d}
  REFINEMENT f := f ⩥ {d} ‖ v1 := f⁻¹(d) ⩤ v1 ‖ v2 := {d} ⩤ v2
END
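For the data refinement R4_D and the Add rule R4_A, a small Python model may help to see how the pair (v1, v2) acts as a key of the association. The sketch covers only these two rules and assumes that f is a set of pairs and that key functions are dictionaries; the function names are hypothetical.

```python
def restrict(objs, keys):
    """objs ◁ keys: keep the entries of keys whose object belongs to objs."""
    return {o: k for o, k in keys.items() if o in objs}


def gluing_invariant(f, v1, v2, key_c, key_d):
    """v1 = dom(f) ◁ vkey^C and v2 = ran(f) ◁ vkey^D."""
    return (v1 == restrict({c for c, _ in f}, key_c)
            and v2 == restrict({d for _, d in f}, key_d))


def add_link(f, v1, v2, key_c, key_d, c, d):
    """Counterpart of R4_A: record the link and the key of each end."""
    f.add((c, d))
    v1[c] = key_c[c]
    v2[d] = key_d[d]


def key_rows(f, v1, v2):
    """Image of f under v1 ‖ v2: one pair of key values per link."""
    return {(v1[c], v2[d]) for c, d in f}
```

Since both key functions are injective, key_rows yields exactly one distinct pair per link, which is why the pair of key attributes can serve as the primary key of the association table.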

The concerns relation of our example is refined as follows:
  (numor1 ‖ numpr1) ∈ concerns ↣ NAT × NAT ∧ numor1 = dom(concerns) ◁ numor ∧ numpr1 = ran(concerns) ◁ numpr
In Add_Order:
  concerns := concerns ∪ {ord1 ↦ pd1} ⊑ concerns := concerns ∪ {ord1 ↦ pd1} ‖ numpr1 := numpr1 ∪ {pd1 ↦ numpr(pd1)} ‖ numor1 := numor1 ∪ {ord1 ↦ numor(ord1)}

Let us remark that rule R4_D is also applicable to relations derived by Rule R12_D, corresponding to multivalued attributes. However, the Add substitution refinement rule is different:

Add Substitution
RULE R4_A_Multi IS
  REFINE f := f ∪ {c} ∗ d
  REFINEMENT f := f ∪ {c} ∗ d ‖ v1 := v1 ∪ {c ↦ vkey^C(c)} ‖ v2 := v2 ∪ id(d)
END

Applying Rules R4_D and R4_A_Multi to compos1 gives:
  (numpr2 ‖ label) ∈ compos1 ↣ NAT × STRING ∧ numpr2 = dom(compos1) ◁ numpr ∧ label = ran(compos1) ◁ vkey^Type_comp
In Add_Product, the refined substitutions are:
  compos1 := compos1 ∪ {pd1} ∗ comp1 ‖ numpr2 := numpr2 ∪ {pd1 ↦ numpr(pd1)} ‖ label := label ∪ id(comp1)

5.4 Extension of Partial Functions

Rule 5. This rule consists in extending each partial function, defined from A to T, into a total one. It is necessary because, in the relational model, each attribute has a value, possibly the Null value. In B, extending a basic type to this Null value is not possible, thus we propose to consider for each type a special value which represents this Null value. For example, the Null value for a string is the empty string, and for an integer it is Maxint. The Boolean type should be simulated by using the natural numbers {0,1}. Intuitively, the extension of a partial function f consists of assigning the Null value to each element that does not belong to the domain. Formally, this refinement is specified as follows. Only the Add substitution refinement rule corresponding to the case of an optional monovalued attribute is given. Each time a new object is added, a substitution that assigns a Null value to a partial function is added.

Data Refinement Rule
RULE R5_D IS
  ABSTRACT VARIABLES f
  CONSTRAINT f ∈ A ⇸ T
  CONCRETE VARIABLES f1
  INVARIANT f1 ∈ A → T ∪ {Null} ∧ f = f1 ⩥ {Null}
END

Add Substitution
RULE R5_A IS
  REFINE A := A ∪ {a}
  REFINEMENT A := A ∪ {a} ‖ f1 := f1 ∪ {a ↦ Null}
END

Remove Substitution
RULE R5_R IS
  REFINE f := {a} ⩤ f
  REFINEMENT f1 := {a} ⩤ f1
END
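The data refinement R5_D and the Add rule R5_A can be sketched in Python with a sentinel standing for Null. The constant NULL_INT (a 32-bit Maxint) and the function names are assumptions made for the illustration; the text only says that Maxint represents Null for integers.

```python
NULL_INT = 2**31 - 1  # assumption: a 32-bit Maxint stands in for the Null value


def extend(f, objects, null=NULL_INT):
    """Counterpart of R5_D: total function f1 agreeing with f where f is defined."""
    return {a: f.get(a, null) for a in objects}


def retrieve(f1, null=NULL_INT):
    """Gluing invariant read right to left: f = f1 ⩥ {Null}."""
    return {a: value for a, value in f1.items() if value != null}


def add_object(objects, f1, a, null=NULL_INT):
    """Counterpart of R5_A: a new object receives the Null value in f1."""
    objects.add(a)
    f1[a] = null
```

One consequence of the sentinel approach is that the chosen value can no longer be used as ordinary data: a customer whose telephone number happened to equal NULL_INT would be read back as having no number.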

5.5 From a Functional Representation to a Set Representation

Rule 6. Definition of the Table Structures. This rule consists of mapping a functional model to a set model or, more precisely, to a model based on the cartesian product, which is the formal definition of a table. Rules are defined for building the table corresponding to a class (R6Clas_D) and the table corresponding to an association refined by R4 (R6Ass_D). All the attributes defined by a function from the same source set are gathered in a cartesian product that will give the table structure.

Class Table Structure
RULE R6Clas_D IS
  ABSTRACT VARIABLES f1, f2, ..., fn
  CONSTRAINT f1 ∈ A ↣ T1 ∧ f2 ∈ A → T2 ∧ ... ∧ fn ∈ A → Tn
  CONCRETE VARIABLES T_A
  INVARIANT T_A ∈ A → T1 × T2 × ... × Tn ∧ T_A = f1 ⊗ f2 ⊗ ... ⊗ fn
END

Association Table Structure
RULE R6Ass_D IS
  ABSTRACT VARIABLES f1, f2, ..., fn, f
  CONSTRAINT f ∈ C → D ∧ (f1 ‖ f2) ∈ f ↣ T1 × T2 ∧ f1 = dom(f) ◁ vkey^C ∧ f2 = ran(f) ◁ vkey^D ∧ ... ∧ fn ∈ f → Tn
  CONCRETE VARIABLES T_f
  INVARIANT T_f ∈ f → T1 × T2 × ... × Tn ∧ T_f = (f1 ‖ f2) ⊗ f3 ⊗ ... ⊗ fn
END

Hereafter are the corresponding substitution rules:

Add Substitution
RULE R6Clas_A IS
  REFINE b1 ∈ T1 − ran(f1) | A := A ∪ {a} ‖ (‖i=1..n fi := fi ∪ {a ↦ bi})
  REFINEMENT b1 ∈ T1 − dom^(n−1)(ran(T_A)) | A := A ∪ {a} ‖ T_A := T_A ∪ {a ↦ (b1, b2, ..., bn)}
END

Remove Substitution
RULE R6Clas_R IS
  REFINE A := A − {a} ‖ (‖i=1..n fi := {a} ⩤ fi)
  REFINEMENT A := A − {a} ‖ T_A := {a} ⩤ T_A
END

where dom^n(f) is an abbreviation of dom(...(dom(f))), with dom applied n times. In the Add rules, the precondition has to be refined because it depends on f1, which is removed in the refined component. Applying these rules to the functions numor, date and own1 gives:
  T_order ∈ order → NAT × NAT × NAT ∧ T_order = numor ⊗ date ⊗ own1
Add_Order is refined by the substitution:
  ord1 ∈ NAT − dom(dom(ran(T_order))) | order := order ∪ {ord1} ‖ T_order := T_order ∪ {ord1 ↦ (numor1, date1, numcus(cus1))}
and Rem_Order is refined by:
  order := order − {ord1} ‖ T_order := {ord1} ⩤ T_order
Similar substitution rules exist for associations.
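The class-table rules can be pictured in Python by representing the direct product of n total functions as a dictionary from objects to n-tuples. This is a sketch under our own assumptions (dictionaries for functions, tuples for rows, hypothetical function names); the refined precondition of R6Clas_A becomes a test on the first component of the existing rows, which is what dom^(n−1)(ran(T_A)) denotes.

```python
def direct_product(*functions):
    """f1 ⊗ ... ⊗ fn for total functions over the same source set."""
    domain = set(functions[0])
    if any(set(f) != domain for f in functions):
        raise ValueError("all functions must be total on the same source set")
    return {a: tuple(f[a] for f in functions) for a in domain}


def add_row(objects, table, a, row):
    """Counterpart of R6Clas_A, including its refined precondition."""
    used_keys = {existing[0] for existing in table.values()}
    if row[0] in used_keys:
        raise ValueError("precondition violated: key value already used")
    objects.add(a)
    table[a] = tuple(row)


def remove_row(objects, table, a):
    """Counterpart of R6Clas_R: domain subtraction on the table."""
    objects.discard(a)
    table.pop(a, None)
```

Each value of the dictionary corresponds to one row of the future SQL table, and the object identity used as dictionary key disappears at the mapping step, where only the key attribute remains.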

6

Mapping into SQL Tables

By the previous refinement rules we obtained a concrete model which is a formal description of the relational model and such that all its components can be automatically translated into SQL. This ensures the consistency of the relational implementation with the concrete model and thus the abstract model. The automatic mapping is performed as follows.

6.1 Description of the Mapping

T1. Definition of Tables. Each variable TA or Tf which is the source set of a direct product P is mapped into an SQL table (with the same name).

T2. Definition of Table Attributes. For a table TA or Tf, each attribute corresponds to a function fi that occurs in the definition of P. Its domain is defined by the target set of fi. If fi corresponds to a total function of the initial specification then a NOT NULL constraint is defined on the corresponding attribute. The type of the SQL attribute depends on the type of the attribute in the abstract model. In general, several SQL types correspond to one B type. Thus the designer has to choose the most appropriate type (especially for the B String type, which must be bounded in SQL). These two steps allow the structure of the Order, Customer, Product and Concerns tables to be derived.

T3. Definition of Keys. The primary key of a table TA is the mapping of the primary key defined in Rule R2_D. For a table Tf derived from the rule R6Ass_D, the primary key is composed of the attributes corresponding to the functions of the parallel product. For example, (numor1, numpr1) is the primary key of the Concerns table. Other injective functions defined on TA are translated by a UNIQUE constraint on the corresponding attributes.

T4. Definition of Referential Constraints. Referential constraints “replace” the associations of the UML class diagram. Let us suppose that f is an association from C to D, and that TC and TD are the corresponding tables. If f is a monovalued association, R3_D has refined it by the concrete variable f1 (f1 ∈ C ⇸ Tkey^D). For the attribute of TC corresponding to f1, a referential constraint towards the primary key of TD is added. For example, a referential constraint from own1 to the key numcus of Customer is defined. In other cases, f is refined by R4_D. For the attributes of Tf corresponding to the variables occurring in the parallel product, two referential constraints towards the primary keys of TC and TD are added. For example, two referential constraints are added from numor1 and numpr1 respectively to the key numor of Order and the key numpr of Product.

T5. Mapping of Given Sets. The mapping of a given set depends on its type:
– an enumerated set is used as a target set in the typing of a function fi. Thus, it is translated by an SQL domain definition which becomes the type of the corresponding attribute.
– an abstract given set is used in the typing of a class variable. If no property is specified on the given set, it is just ignored in the mapping phase. Otherwise the property is mapped by a constraint on the table corresponding to the class variable.

T6. Mapping of a Total Relation. A relation f defined from A to B may be mandatory on A (dom(f) = A) or B (ran(f) = B). With Rule R4_D, such a relation is refined by: v1 = dom(f) ◁ vkey^A ∧ v2 = ran(f) ◁ vkey^B.
- dom(f) = A means that v1 = vkey^A, which is translated by the SQL constraints:
    In Tf: Check(v1 IN (SELECT vkey^A FROM TA))
    In TA: Check(vkey^A IN (SELECT v1 FROM Tf))
- ran(f) = B means that v2 = vkey^B, which is translated by the SQL constraints:
    In Tf: Check(v2 IN (SELECT vkey^B FROM TB))
    In TB: Check(vkey^B IN (SELECT v2 FROM Tf))
Applying this rule to concerns generates the constraints:
    In Concerns: Check(numor1 IN (SELECT numor FROM Order))
    In Order: Check(numor IN (SELECT numor1 FROM Concerns))
Note that other properties can be defined in a UML class diagram. They are mapped into additional conjuncts in the Invariant clause of the corresponding B abstract model. Their translation into SQL integrity constraints is ongoing work.

T7. Generation of SQL Statements. Each basic operation is mapped into SQL basic operations which are composed of SQL statements. An Add operation is mapped into one or more insert statements, whereas a Remove operation is mapped into one or more delete statements. Parameters of the operation whose type is a reference to an object must be replaced by parameters whose type is a reference to a key. For example, Add_Order(ord1, numor1, date1, cus1, pd1, quantity1) defined in Section 3.1 is translated by:
    Add_Order(numor1:Int, date1:Int, numcus1:Int, numpr1:Int, quantity1:Int)
Its body becomes:
    INSERT INTO Order VALUES (numor1, date1, numcus1)
    INSERT INTO Concerns VALUES (numor1, numpr1, quantity1)
And Rem_Order(ord1) is translated by:
    Rem_Order(numor1:Int)
    DELETE FROM Order WHERE numor=numor1
    DELETE FROM Concerns WHERE numor=numor1
Let us remark that in the last example, CASCADE DELETE could be used.
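The two generated operations of T7 can be tried out with Python's standard sqlite3 module. The sketch below is an illustration under assumptions of ours, not the output of the mapping itself: Customer is reduced to its key, Order is quoted because it is an SQL keyword, and the remark about cascading deletes is taken up by declaring ON DELETE CASCADE on Concerns, so that Rem_Order needs a single DELETE. The Python function names are hypothetical.

```python
import sqlite3


def open_db():
    """In-memory database with a cut-down version of the example tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys (and cascades) only with this pragma on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE Customer(numcus INT PRIMARY KEY);
    CREATE TABLE Product(numpr INT PRIMARY KEY);
    CREATE TABLE "Order"(
      numor INT PRIMARY KEY,
      date INT NOT NULL,
      own1 INT NOT NULL REFERENCES Customer(numcus));
    CREATE TABLE Concerns(
      numor1 INT NOT NULL REFERENCES "Order"(numor) ON DELETE CASCADE,
      numpr1 INT NOT NULL REFERENCES Product(numpr),
      quantity INT NOT NULL,
      PRIMARY KEY (numor1, numpr1));
    """)
    return conn


def add_order(conn, numor1, date1, numcus1, numpr1, quantity1):
    """Add_Order: one insert per table touched by the abstract operation."""
    with conn:  # both inserts are committed together or rolled back together
        conn.execute('INSERT INTO "Order" VALUES (?, ?, ?)',
                     (numor1, date1, numcus1))
        conn.execute("INSERT INTO Concerns VALUES (?, ?, ?)",
                     (numor1, numpr1, quantity1))


def rem_order(conn, numor1):
    """Rem_Order: the cascade removes the matching Concerns rows."""
    with conn:
        conn.execute('DELETE FROM "Order" WHERE numor = ?', (numor1,))
```

Note that all parameters are key values (numor1, numcus1, numpr1), never object identities, in line with the signature translation described in T7.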

6.2 Discussion

Preconditions on basic operations used to type the input parameters are directly translated into SQL basic operation signatures. Other preconditions are ignored in the mapping. In fact they indicate which tests must be performed in the calling operations to ensure a correct execution of the SQL basic operations.

In a relational implementation, tables must be at least in third normal form to guarantee a consistent behavior of the database. In a classical algorithm as described in Section 3.2, the normalization process is performed at the last step. It can result in the modification of tables and the definition of new ones, which in turn requires modification of operations. In our approach, both the database structure and the operations are derived from the conceptual model, and thus cannot be modified during the last step of the process. The solution to guarantee the third normal form property is to build a class diagram that is correct with respect to functional dependencies. The definition of the second and third normal forms needs to be revisited from an object-oriented point of view.

7

Proof of the Refinement Rules

When a refinement process is used in the development of an application, it is necessary to prove each step of this refinement. This ensures the correctness of the final result. In our case, to establish the correctness of the relational implementation, we have to prove each refinement rule presented above. It is important to remark that the proof of each rule depends only on the rule itself. In other words, the proof of a given rule is completely independent of the application we refine. This independence enables not only the automation of the proof process, but also the reusability of each basic proof rule. Indeed, whatever the application refined, any refinement rule always gives rise to the same proof obligation. Thus each proof is achieved in a generic way and will be reusable (instantiated) in any specific refinement. All the proofs of the refinement rules have been achieved using AtelierB [8] and can be found in [14]. For lack of space, we give just the proof of Rule R3_A. Using Definition 2 (Section 2.1), the refinement is correct iff these three conditions are satisfied:

P1. ∃u, v. v = u ; vkey^D
P2. ∀C, f, f1. ((Trm(c ∉ C | f := f ∪ {c ↦ d}) ∧ f1 = f ; vkey^D) → Trm(c ∉ C | f1 := f1 ∪ {c ↦ vkey^D(d)}))
P3. ∀C, C′, f, f1, f1′. ((Trm(c ∉ C | f := f ∪ {c ↦ d}) ∧ f1 = f ; vkey^D ∧ Prd(c ∉ C | f1 := f1 ∪ {c ↦ vkey^D(d)})) → ∃f′. (Prd(c ∉ C | f := f ∪ {c ↦ d}) ∧ f1′ = f′ ; vkey^D))

Let us prove each goal.

Proof of P1: ∃u, v. v = u ; vkey^D is satisfied by taking u = ∅ and v = ∅.

Proof of P2:
(i) Trm(c ∉ C | f := f ∪ {c ↦ d}) = (c ∉ C) ∧ Trm(f := f ∪ {c ↦ d}) = (c ∉ C)
(ii) Trm(c ∉ C | f1 := f1 ∪ {c ↦ vkey^D(d)}) = (c ∉ C) ∧ Trm(f1 := f1 ∪ {c ↦ vkey^D(d)}) = (c ∉ C)
So, P2 is equivalent to: ∀C, f, f1. ((c ∉ C ∧ f1 = f ; vkey^D) → c ∉ C), which is a tautology.

Proof of P3:
Prd(c ∉ C | f1 := f1 ∪ {c ↦ vkey^D(d)}) = (c ∉ C) → Prd(f1 := f1 ∪ {c ↦ vkey^D(d)}) = (c ∉ C) → (f1′ = f1 ∪ {c ↦ vkey^D(d)})
Prd(c ∉ C | f := f ∪ {c ↦ d}) = (c ∉ C) → Prd(f := f ∪ {c ↦ d}) = (c ∉ C) → (f′ = f ∪ {c ↦ d})
By using (i), P3 is rewritten as:
∀C, C′, f, f1, f1′. ((c ∉ C ∧ f1 = f ; vkey^D ∧ ((c ∉ C) → (f1′ = f1 ∪ {c ↦ vkey^D(d)}))) → ∃f′. (((c ∉ C) → (f′ = f ∪ {c ↦ d})) ∧ f1′ = f′ ; vkey^D))
Applying the deduction theorem gives the following hypotheses:
(H1) c ∉ C
(H2) f1 = f ; vkey^D
(H3) (c ∉ C) → (f1′ = f1 ∪ {c ↦ vkey^D(d)})
The current goal is:
(G1) ∃f′. (((c ∉ C) → (f′ = f ∪ {c ↦ d})) ∧ f1′ = f′ ; vkey^D)
(H1) + (H3) + (Modus Ponens) gives:
(H4) f1′ = f1 ∪ {c ↦ vkey^D(d)}
Let us assume (H5) f′ = f ∪ {c ↦ d} in (G1):
(G2) ((c ∉ C) → (f ∪ {c ↦ d} = f ∪ {c ↦ d})) ∧ f1′ = f′ ; vkey^D
(H1) + (G2) + (Modus Ponens) gives:
(G3) f1′ = f′ ; vkey^D
This last goal is checked as follows:
f′ ; vkey^D = (f ∪ {c ↦ d}) ; vkey^D    (rewriting of f′ using H5)
            = (f ; vkey^D) ∪ ({c ↦ d} ; vkey^D)    (distributivity of composition over union)
            = f1 ∪ {c ↦ vkey^D(d)}    (H2 + vkey^D is an injection)
            = f1′    (rewriting using H4)
QED

All the other proofs are similar. They are rather simple and could therefore be easily automated. Table 1 gives a synthesis of the different refinement proofs automatically generated by the AtelierB prover. Most of them were automatically proven. The failures of the automatic prover are not due to the specific application domain we consider but essentially to the lack of rules concerning properties of some relational operators and also the lack of proof tactics. For example, in Rule 6, the prover fails to establish that a direct product of partial functions is also a partial function. Thus, in order to automate the refinement process, the rule base of the prover needs to be enriched.

Table 1. Proof result

RULES                     (Add Operation, Remove Operation)     Pr
                          Obv         nPO         nUn
Rule 1, Bounded case      (7, 2)      (0, 3)      (3, 1)        75%
Rule 1, Unbounded case    (6, 7)      (4, 4)      (1, 0)        95%
Rule 2                    (6, 2)      (3, 5)      (0, 0)        100%
Rule 3                    (7, 6)      (2, 10)     (0, 0)        100%
Rule 4                    (11, 8)     (2, 10)     (5, 8)        70%
Rule 5                    (6, 2)      (3, 3)      (2, 1)        82%
Rule 6                    (9, 2)      (2, 1)      (1, 2)        82%
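The last equational step of the proof of P3 relies on the distributivity of composition over union. It is of course established once and for all by the proof, but it can also be spot-checked on random finite instances. The Python sketch below does this under assumptions of ours: the relation f is a set of pairs, the key function is a dictionary with distinct values, and the new element c is chosen outside the domain of f, which mirrors the precondition c ∉ C. The function names are hypothetical.

```python
import random


def compose(f, g):
    """Relational composition f ; g for f a set of pairs and g a dictionary."""
    return {(c, g[d]) for c, d in f if d in g}


def check_once(rng):
    """Compare (f ∪ {c ↦ d}) ; vkey with (f ; vkey) ∪ {c ↦ vkey(d)} once."""
    d_objects = range(5)
    # Distinct key values make vkey injective, as required by R3_D.
    vkey = dict(zip(d_objects, rng.sample(range(100), 5)))
    f = {(c, rng.choice(d_objects)) for c in range(4)}
    c, d = 4, rng.choice(d_objects)  # c is not in the domain of f
    left = compose(f | {(c, d)}, vkey)
    right = compose(f, vkey) | {(c, vkey[d])}
    return left == right
```

A random check of this kind is no substitute for the proof obligations discharged with the prover; it is only a cheap sanity test of a rule before attempting its proof.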


8


Conclusions and Future Works

In this paper, we have described the B refinement process for transforming an abstract specification (obtained from a UML diagram) into a concrete implementation which can be directly mapped into a relational database implementation. The process is specified by refinement rules that take into account both data and operations. Its correctness is achieved by the elaboration of the proofs related to each rule. The generic characteristic of the proofs enables the automation of both the refinement process and the proofs themselves. Thus, for a specific application, the designer can use a base composed of these generic proofs, which only have to be instantiated. This genericity is limited to the domain of relational database applications. The most important benefits of our approach are:
– reducing the development cost of the B refinement process;
– allowing the designer to concentrate on the specification phase, thanks to the automation of the refinement process;
– extending the application domain of the B method, up to now only used for critical systems design. This requires the development of tools specific to a domain (such as relational databases) that guide designers in the use of the B method;
– providing database designers with a method which ensures the complete consistency of applications (both data and programs);
– producing code which is standardized and thus easier to understand and maintain.

We are currently working on the development of a tool in order to automate the complete refinement process. This will operate in two modes: automatic and interactive. Designers might intervene to express choices concerning, for example, the introduction of a new key. The tool will extend an existing tool that translates UML diagrams into a B abstract specification. We are also working on the reusability of each generic proof rule and the complete automation of the proof phase. Another line of work consists of refining more elaborate transactions. These are specified using UML state and collaboration diagrams which are then translated into B.

References

1. Abrial, J.R.: The B-Book, Cambridge University Press, 1996.
2. Barros, R.: Deriving relational database programs from formal specifications, 2nd Int. Conf. FME’94, Springer-Verlag, LNCS 873, Barcelona, Spain, Oct. 94.
3. Batini, Ceri, Navathe: Conceptual Database Design: an Entity-Relationship Approach, The Benjamin/Cummings Publishing Company, 1992.
4. Burdy, L., Meynadier, J-M.: Automatic Refinement, BUG Meeting, FM’99, Toulouse, France, September 1999.
5. Castelli, D., Pisani, S.: A Transformational Approach to Correct Schema Refinement, Conceptual Modeling-ER’98, 17th International Conference on Conceptual Modeling, Singapore, November 1998.
6. Codd, E.F.: A Relational Model of Data for Large Shared Data Banks. Communications of the ACM, V13, N°6, June 1970, pp. 377-387.
7. Date, C.J.: An Introduction to Database Systems, Addison-Wesley, 6th edition, 1996.
8. Digilog groupe STERIA: Atelier B - Manuel de référence, 1996, DIGILOG, BP 16000, 13791 Aix-en-Provence Cedex 3, France.
9. Donzeau-Gouge, V., Simonot, M.: Conception Rigoureuse de Programmes, coursebook of the DESS ”Développement de Logiciels Sûrs”, CNAM, Paris, 1999.
10. Evans, A.S.: Reasoning with UML Class Diagrams, Workshop on Industrial Strength Formal Methods, WIFT’98, Florida, IEEE Press, 1998.
11. Facon, P., Laleau, R., Nguyen, H.P.: Mapping Object Diagrams into B Specifications, Methods Integration Workshop, Leeds, UK, March 1996.
12. Facon, P., Laleau, R., Mammar, A.: Combining UML with the B Formal Method for the Specification of Database Applications, Research report, CEDRIC laboratory, CNAM, Paris, September 1999.
13. Günther, T., Schewe, K.D., Wetzel, I.: On the Derivation of Executable Database Programs from Formal Specifications. Int. Symp. FME’93, Odense, Denmark, April 1993.
14. Mammar, A., Laleau, R.: Using a Formal Refinement to Derive Relational Database Implementations from B Specifications, Research report, CEDRIC laboratory, CNAM, Paris, January 2000.
15. Matthews, B., Locuratolo, E.: Formal Development of Databases in ASSO and B. FME’99 World Congress on Formal Methods, Springer-Verlag, LNCS 1709, Toulouse, France, Sept. 99.
16. Melton, J., Simon, A.: Understanding the new SQL: A Complete Guide. Morgan Kaufmann Publishers, 1993.
17. Nguyen, H.P.: Dérivation de spécifications formelles B à partir de spécifications semi-formelles, PhD thesis, CEDRIC laboratory, Paris, France, December 98.
18. OMG: The UML Group: Unified Modeling Language, version 1.1, Rational Software Corporation, www.rational.com/uml, Santa Clara, USA, July 1997.

Recursive Schema Definitions in Object-Z

Graeme Smith
Software Verification Research Centre
University of Queensland, Australia

Abstract. Unlike Z, Object-Z allows schemas to be defined recursively. This enables mutual and self recursive structures, commonly occurring in object-oriented programs, to be readily specified. In this paper, we provide a fixed point interpretation of such definitions. In addition, we provide simple guidelines for producing non-recursive schema definitions which are semantically identical to recursive ones.

1

Introduction

Object-Z [7,2] is an extension of Z [8] to facilitate specification in an object-oriented style. It is a conservative extension in the sense that the existing syntax and semantics of Z are retained and new constructs are added. The major new construct is the class schema which captures the object-oriented notion of a class by encapsulating a single state schema, and its associated initial state schema, with all the operation schemas which may change its variables. The class schema is not simply a syntactic extension but also defines a type whose instances are object references, i.e., identifiers which reference objects of the class.

The notion of object references in Object-Z is a major departure from the semantics of Z. It allows variables to be declared which, rather than directly representing a value, refer to a value in much the same way as pointers in a programming language. Their introduction facilitates the refinement of specifications to code in object-oriented programming languages. Object references also have a profound influence on the structuring of specifications. When an object is merely referenced by another object, it is not encapsulated in any way by the referencing object. This enables the possibility of self and mutually recursive structures. To facilitate the specification of such structures, which occur commonly in object-oriented programs, Object-Z does not insist on the notion of “declaration before use” of Z. Specifically, it allows a class schema to include references to its own objects, and initial state and operation schemas to be defined recursively.

Griffiths [4] has provided a definition for recursive operations in Object-Z. This definition is given in terms of the denotational semantics of Object-Z [3, 5]. Therefore, it relies on some knowledge of this semantics and is not readily accessible to the average Object-Z user. In particular, it does not suggest a straightforward way to represent recursive operation definitions non-recursively to enable them to be more easily understood.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 42–58, 2000. © Springer-Verlag Berlin Heidelberg 2000

Recursive Schema Definitions in Object-Z


In this paper, we provide definitions of recursion for both initial state schemas and operations. Our approach is a simple application of fixed point theory which does not require an understanding of Object-Z’s denotational semantics. Furthermore, our definitions provide an interpretation of recursive schemas in terms of semantically equivalent non-recursive definitions. In Section 2, we discuss and provide examples of recursion in Object-Z. In particular, we specify an ordered binary tree. In Section 3, we provide an overview of fixed point theory and show how it can be used to provide definitions for recursive initial state schema and operation definitions. In Section 4, we use the fixed point definitions to provide guidelines for representing recursive schemas in terms of non-recursive ones. We apply these guidelines to the ordered binary tree specification in Section 5.

2 Recursion in Object-Z

The specification of recursive structures in Object-Z is facilitated by its reference semantics. The type corresponding to a class schema is a set of identifiers which reference objects of the class. That is, the objects they reference behave according to the definitions in the class schema. The identifiers themselves, however, are independent of these definitions. This independence allows us to relax the notion of "definition before use" of Z. This notion prevents, for example, a schema S including a variable whose value is an instance of S. In Object-Z, however, a class may contain a state variable that is an instance of the type defined by that class. For example, the following specification is allowed. (An ellipsis is used to denote elided initial state and operation definitions.)

\[
\begin{array}{l}
\textbf{Person} \\
\quad \mathit{spouse} : \mathit{Person} \\
\quad \ldots
\end{array}
\]

Each class in Object-Z has an implicitly declared constant self which for a given object of the class denotes that object's identifier. This enables mutually and self recursive structures such as the following to be specified.

\[
\begin{array}{l}
\textbf{Person} \\
\quad \mathit{spouse} : \mathit{Person} \\
\quad \hline
\quad \mathit{spouse}.\mathit{spouse} = \mathit{self} \\
\quad \ldots
\end{array}
\]

To take full advantage of such recursive structuring, however, we need to be able to refer to the initial state schema and operations of the (recursively)


G. Smith

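referenced objects. Object-Z therefore also relaxes the notion of "definition before use" for schemas within classes. For example, consider the following specification of an ordered binary tree based on that in Smith [7].

\[
\begin{array}{l}
\textbf{Tree} \\
\quad \mathit{nodes} : \mathbb{P}\, \mathit{Tree} \\
\quad \mathit{val} : \mathbb{N} \\
\quad \mathit{left\_tree}, \mathit{right\_tree} : \mathit{Tree} \\
\quad \mathit{null} : \mathbb{B} \\
\quad \hline
\quad \langle \mathit{left\_tree}.\mathit{nodes}, \{\mathit{self}\}, \mathit{right\_tree}.\mathit{nodes} \rangle \ \mathsf{partitions}\ \mathit{nodes} \\
\quad \forall t : \mathit{left\_tree}.\mathit{nodes} \bullet t.\mathit{null} \lor t.\mathit{val} < \mathit{val} \\
\quad \forall t : \mathit{right\_tree}.\mathit{nodes} \bullet t.\mathit{null} \lor t.\mathit{val} > \mathit{val} \\[1ex]
\quad \mathit{INIT} \\
\qquad \mathit{null} \land \mathit{left\_tree}.\mathit{INIT} \land \mathit{right\_tree}.\mathit{INIT} \\[1ex]
\quad \mathit{Insert} \mathrel{\widehat{=}} [\, \Delta(\mathit{null}, \mathit{val})\ v? : \mathbb{N} \mid \mathit{null} \land \mathit{val}' = v? \land \lnot \mathit{null}' \,] \\
\qquad \mathbin{[\,]}\ [\, \lnot \mathit{null} \,] \land (\mathit{left\_tree}.\mathit{Insert} \mathbin{[\,]} \mathit{right\_tree}.\mathit{Insert})
\end{array}
\]

The class Tree describes the functionality of an ordered binary tree abstractly by defining an infinite tree structure, a subset of the nodes of which denote the actual tree. This subset necessarily includes the root node of the infinite structure. An example instance of the class is shown in Figure 1.

Fig. 1. Abstract representation of a binary tree (figure labels: actual tree; rest of infinite tree structure)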


The state of class Tree is defined recursively in terms of two subtrees left_tree and right_tree. It also has a variable val denoting the value in the root node of the tree (or subtree) and a Boolean-valued variable null denoting whether or not the root node is part of the actual tree. To facilitate specifying the properties of an ordered binary tree in the state schema's predicate, a constant nodes, denoting the set of all nodes in the infinite structure, is declared.

Initially, the tree is empty (i.e., for all nodes, null is true). In class Tree, this is specified recursively. The predicate of the initial state schema states that the root node is not part of the tree (null is true) and that its left and right subtrees are also in their initial states.

A node is added to the tree when a value is inserted into that node. The operation Insert is also specified recursively. If the root node is not part of the tree then the input value v? is inserted into it and it is added to the tree. If, on the other hand, the root node is already part of the tree then the Insert operation is applied to either the left or right subtree ([] denotes angelic choice). The final two predicates of the state schema must be true after the operation and, hence, determine which of the subtrees Insert is applied to.

The recursive schema definitions greatly simplify the specification of the ordered binary tree. However, for each definition, we need to be certain that a solution, in the form of a schema, satisfying the definition exists, and that we can uniquely choose a solution if more than one exists. Furthermore, we would like to be able to re-express the definitions in a non-recursive fashion in order to facilitate reasoning about the schemas and their class.
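As an informal illustration of the behaviour described above, the following Python sketch mimics the class Tree. It is not part of the specification and rests on several assumptions: subtrees are created on demand to stand in for the infinite structure, the angelic choice is resolved by comparing the input with val (which is what the ordering predicates of the state schema force), and an input equal to an existing value is treated as "operation not enabled". The method names is_init, insert and in_order are hypothetical.

```python
class Tree:
    """Hypothetical executable analogue of the Object-Z class Tree."""

    def __init__(self):
        self.null = True      # root node is not yet part of the actual tree
        self.val = None
        self._left = None     # subtrees are materialised lazily
        self._right = None

    @property
    def left_tree(self):
        if self._left is None:
            self._left = Tree()
        return self._left

    @property
    def right_tree(self):
        if self._right is None:
            self._right = Tree()
        return self._right

    def is_init(self):
        # INIT: null holds here and, recursively, in both subtrees.
        # Subtrees that were never materialised are trivially initial.
        return (self.null
                and (self._left is None or self._left.is_init())
                and (self._right is None or self._right.is_init()))

    def insert(self, v):
        # First branch of Insert: fill the node if it is null.
        if self.null:
            self.val, self.null = v, False
            return True
        # Second branch: recurse; the ordering invariant picks the subtree.
        if v < self.val:
            return self.left_tree.insert(v)
        if v > self.val:
            return self.right_tree.insert(v)
        return False  # duplicate: neither subtree would keep the tree ordered

    def in_order(self):
        if self.null:
            return []
        left = self._left.in_order() if self._left is not None else []
        right = self._right.in_order() if self._right is not None else []
        return left + [self.val] + right
```

Inserting 5, 2, 8 and 1 in that order and then calling in_order() lists the stored values in ascending order, which is the observable effect of the two ordering predicates in the state schema.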

3 Fixed Point Definitions

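To guarantee that a recursive schema definition has a solution, we prove, in this section, the existence of a least fixed point. We do this according to the fixed point theory presented in Back and von Wright which allows unbounded nondeterminism within the constructs being defined recursively [1]. Let the domain of the recursively defined construct be $D$. To prove the existence of a least fixed point, we need to find a complete lattice $\sqsubseteq_D$ on $D$. That is, a partial order for which there exists, for any subset $s$ of $D$, a greatest lower bound $glb_D(s)$ and least upper bound $lub_D(s)$. This is formalised as follows.

\[
\begin{array}{l}
\sqsubseteq_D \;:\; D \leftrightarrow D \\
glb_D, lub_D : \mathbb{P}\, D \to D \\
\hline
\forall d : D \bullet d \sqsubseteq_D d \\
\forall d_1, d_2 : D \bullet d_1 \sqsubseteq_D d_2 \land d_2 \sqsubseteq_D d_1 \Rightarrow d_1 = d_2 \\
\forall d_1, d_2, d_3 : D \bullet d_1 \sqsubseteq_D d_2 \land d_2 \sqsubseteq_D d_3 \Rightarrow d_1 \sqsubseteq_D d_3 \\
\forall s : \mathbb{P}\, D \bullet \\
\quad (\forall d : s \bullet glb_D(s) \sqsubseteq_D d \land d \sqsubseteq_D lub_D(s)) \;\land \\
\quad (\forall lb : D \mid (\forall d : s \bullet lb \sqsubseteq_D d) \bullet lb \sqsubseteq_D glb_D(s)) \;\land \\
\quad (\forall ub : D \mid (\forall d : s \bullet d \sqsubseteq_D ub) \bullet lub_D(s) \sqsubseteq_D ub)
\end{array}
\]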
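The conditions above can be checked mechanically on a small example. The sketch below is only an illustration under stated assumptions: it takes D to be the powerset of {0, 1, 2} ordered by set inclusion (a choice made here, not one used in the paper), with glb as intersection and lub as union, and tests every condition by brute force. All function names are hypothetical.

```python
from itertools import combinations

BASE = frozenset({0, 1, 2})  # assumed carrier; D is its powerset


def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


D = powerset(BASE)


def leq(a, b):
    # The ordering on D: set inclusion.
    return a <= b


def glb(s):
    # Greatest lower bound: intersection (the whole of BASE for the empty set).
    result = BASE
    for d in s:
        result = result & d
    return result


def lub(s):
    # Least upper bound: union (the empty set for the empty set).
    result = frozenset()
    for d in s:
        result = result | d
    return result


def is_partial_order():
    reflexive = all(leq(d, d) for d in D)
    antisymmetric = all(not (leq(a, b) and leq(b, a)) or a == b
                        for a in D for b in D)
    transitive = all(not (leq(a, b) and leq(b, c)) or leq(a, c)
                     for a in D for b in D for c in D)
    return reflexive and antisymmetric and transitive


def is_complete_lattice():
    for s in powerset(D):  # every subset s of D
        g, l = glb(s), lub(s)
        if not all(leq(g, d) and leq(d, l) for d in s):
            return False
        for b in D:
            if all(leq(b, d) for d in s) and not leq(b, g):
                return False  # some lower bound is not below glb(s)
            if all(leq(d, b) for d in s) and not leq(l, b):
                return False  # some upper bound is not above lub(s)
    return True
```

In this instance the bottom element glb(D) is the empty set and the top element lub(D) is BASE.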


We then need to reformulate the recursive schema definition as a (non-recursive) function $f_D$ which is monotone. Monotonicity of $f_D$ is formalised as follows.

\[
\begin{array}{l}
f_D : D \to D \\
\hline
\forall d_1, d_2 : D \bullet d_1 \sqsubseteq_D d_2 \Rightarrow f_D(d_1) \sqsubseteq_D f_D(d_2)
\end{array}
\]

Given these definitions, according to the Knaster-Tarski theorem, $f_D$ has a least fixed point $\mu_D$. This least fixed point is chosen as the unique solution of the recursive definition.

To calculate the value of $\mu_D$, we need to find the "limit" of applying $f_D$ to the bottom element of the lattice, i.e., to $glb_D(D)$. In other words, we need to apply $f_D$ to $glb_D(D)$, then apply $f_D$ to the result of this application of $f_D$, and continue in this fashion until we reach a value $d$ such that $f_D(d) = d$. This value $d$ is the least fixed point $\mu_D$.

If there is a possibility of unbounded nondeterminism in elements of $D$, it is not possible to prove $f_D$ is continuous. Hence, it is not possible to use the theory of Scott which defines this limit over the natural numbers [6]. Instead, the limit is defined over the ordinals [1]. The ordinals $O$ extend the natural numbers, 0 to $\omega$, with the additional elements $\omega + 1, \omega + 2, \ldots, 2 * \omega, 2 * \omega + 1, \ldots$. The least fixed point $\mu_D$ is defined below.

\[
\begin{array}{l}
\mu_D : D \\
\hline
\exists \gamma : O \bullet f_D^{\gamma}(glb_D(D)) = f_D^{\gamma+1}(glb_D(D)) \land \mu_D = f_D^{\gamma}(glb_D(D))
\end{array}
\]

3.1 Initial State Schemas

An initial state schema in Object-Z comprises a predicate part only [7]. Not all recursive initial state schema definitions have a solution. For example, the definition $\mathit{INIT} \mathrel{\widehat{=}} [\, \lnot \mathit{INIT} \,]$ requires $\lnot \mathit{INIT}$ when $\mathit{INIT}$. Similarly, the semantically identical definitions $\mathit{INIT} \mathrel{\widehat{=}} [\, \mathit{INIT} \Rightarrow \mathit{false} \,]$ and $\mathit{INIT} \mathrel{\widehat{=}} [\, \mathit{INIT} \Leftrightarrow \mathit{false} \,]$ have no solutions. This suggests the need for a proof obligation to show that initial state schemas have a solution.

In this section, we show that all recursive initial state schema definitions whose predicates are expressed, or can be re-expressed, with certain restrictions on the occurrences of INIT have a least fixed point. Specifically, these restrictions prevent INIT occurring in a predicate $p$ where, for any predicate $q$ and declaration $d$, the predicate $\lnot p$, $p \Leftrightarrow q$, $p \Rightarrow q$, $\exists_1 d \bullet p$ or $\forall d \mid p \bullet q$ appears as part of the predicate of the initial state schema. (Note that $\forall d \mid p \bullet q$ is equivalent to $\forall d \bullet p \Rightarrow q$, and hence causes the same problem as $p \Rightarrow q$, and that $\exists_1 d \bullet p$ is defined in terms of $\forall d \mid p \bullet q$ [8].)

Given predicates $p$ and $q$ which respect this restriction on occurrences of INIT, we define a relation $\sqsubseteq_{\mathit{INIT}}$ on initial state schemas such that


\[
[\, p \,] \sqsubseteq_{\mathit{INIT}} [\, q \,] \iff (\forall d \bullet q \Rightarrow p)
\]

where $d$ declares all variables occurring free in $p$ and $q$. Since implication is reflexive, anti-symmetric and transitive, the relation $\sqsubseteq_{\mathit{INIT}}$ is a partial order. Furthermore, it is a complete lattice [1]. For a finite, nonempty set of predicates $s = \{p_1, p_2, \ldots, p_n\}$ the greatest lower bound $glb_{\mathit{INIT}}(s)$ is $p_1 \lor p_2 \lor \ldots \lor p_n$ and the least upper bound $lub_{\mathit{INIT}}(s)$ is $p_1 \land p_2 \land \ldots \land p_n$.

Given an initial state schema definition $\mathit{INIT} \mathrel{\widehat{=}} [\, p \,]$, the function representing this definition maps a given parameter $x$ to the right-hand side of the definition with all occurrences of the left-hand side, i.e., INIT, replaced by $x$. (We let the meta-notation $s(a/b)$ denote $s$ with all occurrences of $b$ replaced by $a$.)

\[
f_{\mathit{INIT}} = \lambda x \bullet [\, p \,](x/\mathit{INIT})
\]

The function $f_{\mathit{INIT}}$ is monotone as shown below. (Predicates $q$, $r$, $s$ and $t$ respect the restriction on occurrences of INIT; $d$ declares all variables occurring free in $q$, $r$, $s$ and $t$.)

Theorem. If $\forall d \bullet q \Rightarrow r$ then $\forall d \bullet f_{\mathit{INIT}}(q) \Rightarrow f_{\mathit{INIT}}(r)$.

Proof. The proof is by induction on the structure of $f_{\mathit{INIT}}$.

1. If $\forall d \bullet q \Rightarrow r$ then $\forall d \bullet q \land s \Rightarrow r \land s$ and $\forall d \bullet q \lor s \Rightarrow r \lor s$. Therefore, the theorem holds for any $f_{\mathit{INIT}}$ which returns predicates constructed from just $\land$ and $\lor$. For example, if $f_{\mathit{INIT}} = \lambda x \bullet x \land (q \lor (x \land r))$, where $x$ does not occur in $q$ or $r$, then for any predicates $a$ and $b$, given that all free variables of $a$, $b$, $q$ and $r$ are declared by $d$ and $\forall d \bullet a \Rightarrow b$, we have $\forall d \bullet a \land r \Rightarrow b \land r$ and, therefore, $\forall d \bullet q \lor (a \land r) \Rightarrow q \lor (b \land r)$ and, therefore, $\forall d \bullet a \land (q \lor (a \land r)) \Rightarrow b \land (q \lor (b \land r))$. That is, $\forall d \bullet f_{\mathit{INIT}}(a) \Rightarrow f_{\mathit{INIT}}(b)$.

2. If $\forall d \bullet q \Rightarrow r$ then $\forall d \bullet (s \Rightarrow q) \Rightarrow (s \Rightarrow r)$. Therefore, the theorem holds for any $f_{\mathit{INIT}}$ which returns predicates constructed from $\land$, $\lor$ and $\Rightarrow$ provided that INIT does not occur on the left-hand side of an implication.

3. If $\forall d \bullet q \Rightarrow r$ then, for any declaration $d_1$, $\forall d \bullet (\forall d_1 \bullet q) \Rightarrow (\forall d_1 \bullet r)$ and $\forall d \bullet (\exists d_1 \bullet q) \Rightarrow (\exists d_1 \bullet r)$. Therefore, the theorem holds for any $f_{\mathit{INIT}}$ which returns predicates constructed from $\forall$, $\exists$, $\land$, $\lor$ and $\Rightarrow$ provided that INIT does not occur on the left-hand side of an implication. (Note that this step includes predicates $\forall d \mid p \bullet q$, which is equivalent to $\forall d \bullet p \Rightarrow q$, and $\exists d \mid p \bullet q$, which is equivalent to $\exists d \bullet p \land q$.)

Since parts of predicates not including INIT can be constructed in terms of any operators (cf. predicates $q$ and $r$ in the example in step 1 above), it follows that if $f_{\mathit{INIT}}$ returns predicates that respect the restriction on occurrences of INIT then the theorem holds. □

Hence, the least fixed point of a recursive schema definition exists when the occurrences of INIT in its predicate are appropriately restricted. Since for all predicates $p$ and declarations of $p$'s free variables $d$, $\forall d \bullet p \Rightarrow \mathit{true}$, the bottom element of $\sqsubseteq_{\mathit{INIT}}$ is $[\, \mathit{true} \,]$. Given the initial state schema definition $\mathit{INIT} \mathrel{\widehat{=}} [\, p \,]$, we therefore have the following unique value for INIT.

\[
\mathit{INIT} = f_{\mathit{INIT}}^{\gamma}([\, \mathit{true} \,])
\]

where $\gamma \in O$ such that $f_{\mathit{INIT}}^{\gamma}([\, \mathit{true} \,]) = f_{\mathit{INIT}}^{\gamma+1}([\, \mathit{true} \,])$.

3.2 Operation Schemas

An operation schema comprises a declaration and a predicate part and a $\Delta$-list (listing the state variables which may change). Given declarations $d_1$ and $d_2$, predicates $p$ and $q$ and lists of variables $u$ and $v$, we define a relation $\sqsubseteq_{Op}$ on operations such that

\[
[\, \Delta(u)\ d_1 \mid p \,] \sqsubseteq_{Op} [\, \Delta(v)\ d_2 \mid q \,] \iff \{u\} \subseteq \{v\} \land d_1 \subseteq d_2 \land (\forall d \bullet p \Rightarrow q)
\]

where $d$ declares all variables occurring free in $p$ and $q$. (This ordering is not the refinement ordering on operations.) Note that, for convenience, we treat declarations as sets of basic declarations of the form $x : T$. (Hence, the $\subseteq$ symbol between $d_1$ and $d_2$ above is a metalogical operator as are the $\land$ symbols.)

It is easy to show that the relation $\sqsubseteq_{Op}$ is reflexive, anti-symmetric and transitive. Hence, it is a partial order. Furthermore, it is a complete lattice [1]. The greatest lower bound of a set $s = \{[\, \Delta(u_1)\ d_1 \mid p_1 \,], \ldots, [\, \Delta(u_n)\ d_n \mid p_n \,]\}$ is $[\, \Delta(u_1 \cap \ldots \cap u_n)\ d_1 \cap \ldots \cap d_n \mid p_1 \land \ldots \land p_n \,]$. The least upper bound of $s$ is $[\, \Delta(u_1 \cup \ldots \cup u_n)\ d_1 \cup \ldots \cup d_n \mid p_1 \lor \ldots \lor p_n \,]$.

Object-Z does not permit operations to appear in declarations and predicates [7]. Therefore, all recursive operations are defined as operation expressions. Such expressions may be constructed using the operation operators $\land$, $\parallel$, $\parallel_{!}$, $[\,]$, ${}^{\circ}_{9}$ and $\bullet$ (scope enrichment) and may involve hiding and renaming [7].

Smith [7] shows how arbitrary operation expressions can be expressed in terms of operation schemas. For each of the operators, the $\Delta$-list of the resulting operation schema is the union of the $\Delta$-lists of the argument operations. Also, the schema part of the resulting operation schema can be defined in terms of the Z schema operators $\forall$, $\exists$, $\land$, $\lor$ and renaming. For example, the schema part of an operation formed using the sequential composition operator ${}^{\circ}_{9}$ is simply the conjunction of the argument schemas with the intermediate state and communicated variables renamed (so that they are identified) and hidden. (In Object-Z, sequential composition combines the notions of piping and sequential composition of Z [7].) Hiding is defined in terms of $\exists$.

Given an operation definition $Op \mathrel{\widehat{=}} OP$ (where $OP$ is an operation expression), the result of the function representing the definition is formed by replacing all occurrences of $Op$ in $OP$ by the parameter $x$. That is,

\[
f_{Op} = \lambda x \bullet OP(x/Op)
\]

The function $f_{Op}$ is monotone as shown below.

Theorem. If $\{u\} \subseteq \{v\}$ and $d_1 \subseteq d_2$ and $\forall d \bullet p \Rightarrow q$ and $f_{Op}([\, \Delta(u)\ d_1 \mid p \,]) = [\, \Delta(u')\ d_1' \mid p' \,]$ and $f_{Op}([\, \Delta(v)\ d_2 \mid q \,]) = [\, \Delta(v')\ d_2' \mid q' \,]$ then $\{u'\} \subseteq \{v'\}$ and $d_1' \subseteq d_2'$ and $\forall d \bullet p' \Rightarrow q'$.

Proof.

1. Since all operation operators form the union of the $\Delta$-lists of their argument operations, if $\{u\} \subseteq \{v\}$ then $\{u'\} \subseteq \{v'\}$. For example, if $f_{Op} = \lambda x \bullet x \land [\, \Delta(w)\ d_3 \mid p_3 \,]$ then $\{u'\} = \{u\} \cup \{w\}$ and $\{v'\} = \{v\} \cup \{w\}$.

2. Since all operation operators are defined in terms of Z schema operators and these latter operators merge declarations, if $d_1 \subseteq d_2$ then $d_1' \subseteq d_2'$.

3. From steps 1, 2 and 3 of the proof of Section 3.1, it follows that if $\forall d \bullet p \Rightarrow q$ then $\forall d \bullet p' \Rightarrow q'$ provided that $f_{Op}$ constructs $p'$ and $q'$ using only $\forall$, $\exists$, $\land$ and $\lor$. Also, since $\forall d \bullet p \Rightarrow q$ implies $\forall d \bullet p(x/y) \Rightarrow q(x/y)$, it follows that $\forall d \bullet p' \Rightarrow q'$ provided that $f_{Op}$ constructs $p'$ and $q'$ using $\forall$, $\exists$, $\land$, $\lor$ and renaming.

Since all operation operators are defined in terms of the Z schema operators $\forall$, $\exists$, $\land$, $\lor$ and renaming (and these in turn are defined in terms of the predicate operators $\forall$, $\exists$, $\land$ and $\lor$ and renaming), it follows that if $f_{Op}$ returns a valid Object-Z operation expression then the theorem holds. □

Hence, the least fixed point exists for all recursive operation definitions. Since for all sets $s$, $\emptyset \subseteq s$, and for all predicates $p$ and declarations of $p$'s free variables $d$, $\forall d \bullet \mathit{false} \Rightarrow p$, the bottom element of $\sqsubseteq_{Op}$ is $[\, \mathit{false} \,]$. Given the operation definition $Op \mathrel{\widehat{=}} OP$, we therefore have the following unique value for $Op$.

\[
Op = f_{Op}^{\gamma}([\, \mathit{false} \,])
\]

where $\gamma \in O$ such that $f_{Op}^{\gamma}([\, \mathit{false} \,]) = f_{Op}^{\gamma+1}([\, \mathit{false} \,])$.
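The construction of this section can be tried out on a finite model. The following Python sketch is an illustration only, under assumptions that are not part of the theory: the state is a single variable n restricted to the range 0..15, an operation is modelled as a set of (n, n') pairs, the ordering is reduced to set inclusion (the predicate-implication part of the ordering; Δ-lists and declarations are ignored), and the bottom element [false] is the empty relation. The two recursive definitions f_loop and f_count are hypothetical examples of the kind examined in Section 4.2.

```python
M = 15                       # assumed finite bound on the state variable n
STATES = range(M + 1)

# [Δ(n) | n' = n + 1], restricted to the bounded state space
INC = frozenset((n, n + 1) for n in STATES if n + 1 <= M)
# [n > 10]: n is unchanged and must exceed 10
ABOVE_TEN = frozenset((n, n) for n in STATES if n > 10)
# [false]: the bottom element of the lattice
FALSE = frozenset()


def seq(r1, r2):
    """Sequential composition: identify and hide the intermediate state."""
    return frozenset((a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2)


def choice(r1, r2):
    """Angelic choice: disjunction of the two operations."""
    return r1 | r2


def lfp(f, bottom=FALSE):
    """Iterate f from the bottom element until a fixed point is reached."""
    current = bottom
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt


def f_loop(x):
    # Loop = Inc ; Loop  (recursion with no base case)
    return seq(INC, x)


def f_count(x):
    # Count = (Inc ; Count) [] [n > 10]
    return choice(seq(INC, x), ABOVE_TEN)
```

On this finite model the iteration always terminates after finitely many steps, so no limit ordinals are needed. lfp(f_loop) is the empty relation, and lfp(f_count) contains exactly the pairs with n' ≥ n and n' > 10 that fit within the bound.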

4 Interpreting Recursive Definitions

In this section, we show how, from the fixed point definitions, we can derive non-recursive schema definitions which are semantically equivalent to the recursive ones. This is done by calculating the results of successive applications of the function $f_D$ to the bottom of the associated lattice $\sqsubseteq_D$.

The ordinals $0, \omega, 2 * \omega, \ldots$ are referred to as limit ordinals [1]. The successive application of the function $f_D$ to an element $d : D$ is calculated differently for limit ordinals and arbitrary ordinals $\alpha$ as shown below [1].

\[
\begin{array}{ll}
f_D^{0}(d) = d & \\
f_D^{\alpha+1}(d) = f_D(f_D^{\alpha}(d)) & \text{for arbitrary ordinals } \alpha \\
f_D^{\alpha}(d) = lub_D(\{\beta \mid \beta < \alpha \bullet f_D^{\beta}(d)\}) & \text{for nonzero limit ordinals } \alpha
\end{array}
\]

4.1 Initial State Schemas

In this section, we present, via an example, a general method for deriving a non-recursive initial state schema from a recursive definition. We use the fixed point theory of Section 3.1, and hence require that the recursive definition respects the restrictions on the occurrences of INIT. Consider the following recursively defined class.

\[
\begin{array}{l}
\textbf{A} \\
\quad a : A \\
\quad n : \mathbb{N} \\[1ex]
\quad \mathit{INIT} \\
\qquad n = 0 \land a.\mathit{INIT}
\end{array}
\]

Initially, an object of class A has n equal to zero. Furthermore, the object referenced by a also has n equal to zero, and so on. To represent the initial state schema non-recursively, we need to extend the class by introducing a secondary variable (one whose value can be derived from the values of the other variables [7]) in order to be able to refer to all objects referenced by the class. This variable s models the infinite sequence of objects self, a, a.a, a.a.a, .... That is, class A is replaced by the semantically equivalent class A1. (By including a visibility list which does not include s, s is effectively removed from the class's interface.)


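\[
\begin{array}{l}
\textbf{A1} \\
\quad \upharpoonright (a, n, \mathit{INIT}) \\[1ex]
\quad a : A \\
\quad n : \mathbb{N} \\
\quad \Delta \\
\quad s : \mathbb{N} \to A \\
\quad \hline
\quad s(0) = \mathit{self} \\
\quad \forall i : \mathbb{N} \bullet s(i + 1) = s(i).a \\[1ex]
\quad \mathit{INIT} \\
\qquad n = 0 \land a.\mathit{INIT}
\end{array}
\]

For this class, $f_{\mathit{INIT}}$ is equal to $\lambda x \bullet [\, n = 0 \land a.x \,]$. Applying this function to the bottom element of the initial state schema lattice, $[\, \mathit{true} \,]$, we have

\[
\begin{array}{rcl}
f_{\mathit{INIT}}([\, \mathit{true} \,]) & = & [\, n = 0 \land a.[\, \mathit{true} \,] \,] \\
 & = & [\, n = 0 \,] \\[1ex]
f_{\mathit{INIT}}^{2}([\, \mathit{true} \,]) & = & [\, n = 0 \land a.[\, n = 0 \,] \,] \\
 & = & [\, n = 0 \land a.n = 0 \,] \\[1ex]
f_{\mathit{INIT}}^{3}([\, \mathit{true} \,]) & = & [\, n = 0 \land a.[\, n = 0 \land a.n = 0 \,] \,] \\
 & = & [\, n = 0 \land a.n = 0 \land a.a.n = 0 \,] \\
 & = & [\, s(0).n = 0 \land s(1).n = 0 \land s(2).n = 0 \,]
\end{array}
\]

and so on. (Note that the expression $\mathit{self}.n$ is equivalent to $n$.) It is easy to see that for an arbitrary natural number $\nu$

\[
f_{\mathit{INIT}}^{\nu}([\, \mathit{true} \,]) = [\, \forall i : 0 \,..\, \nu - 1 \bullet s(i).n = 0 \,]
\]

Hence, at the first non-zero limit ordinal $\omega$, we have

\[
\begin{array}{rcl}
f_{\mathit{INIT}}^{\omega}([\, \mathit{true} \,]) & = & lub_{\mathit{INIT}}(\{\nu : \mathbb{N} \bullet f_{\mathit{INIT}}^{\nu}([\, \mathit{true} \,])\}) \\
 & = & \bigwedge \nu : \mathbb{N} \bullet f_{\mathit{INIT}}^{\nu}([\, \mathit{true} \,]) \\
 & = & [\, \forall i : \mathbb{N} \bullet s(i).n = 0 \,]
\end{array}
\]

Applying the function again, we have

\[
\begin{array}{rcl}
f_{\mathit{INIT}}^{\omega+1}([\, \mathit{true} \,]) & = & [\, n = 0 \land a.[\, \forall i : \mathbb{N} \bullet s(i).n = 0 \,] \,] \\
 & = & [\, n = 0 \land \forall i : \mathbb{N} \bullet a.s(i).n = 0 \,] \\
 & = & [\, \forall i : \mathbb{N} \bullet s(i).n = 0 \,]
\end{array}
\]

Therefore, $f_{\mathit{INIT}}^{\omega}([\, \mathit{true} \,]) = f_{\mathit{INIT}}^{\omega+1}([\, \mathit{true} \,])$ and so, according to the theory in Section 3.1,

\[
\mathit{INIT} = f_{\mathit{INIT}}^{\omega}([\, \mathit{true} \,]) = [\, \forall i : \mathbb{N} \bullet s(i).n = 0 \,]
\]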
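The derivation can be mirrored on a finite collection of objects. The sketch below is a hedged illustration, assuming a finite object graph in which every chain self, a, a.a, ... eventually revisits an object; the dictionaries n_of and a_of and both function names are hypothetical. The first function iterates the function for INIT starting from [true] (every object assumed initial); the second evaluates the non-recursive form by walking the sequence s.

```python
# Hypothetical finite object graph: n_of gives each object's n,
# a_of gives the object referenced by its variable a.
n_of = {"x": 0, "y": 0, "z": 0, "w": 3}
a_of = {"x": "y", "y": "x", "z": "w", "w": "w"}


def init_by_iteration(n, a):
    """Iterate x -> [n = 0 and a.x] from the bottom element [true]."""
    approx = {o: True for o in n}      # [true]: no object is excluded yet
    while True:
        nxt = {o: n[o] == 0 and approx[a[o]] for o in n}
        if nxt == approx:              # fixed point reached
            return approx
        approx = nxt


def init_nonrecursive(n, a, obj):
    """Evaluate 'for all i, s(i).n = 0' with s(0) = obj, s(i+1) = s(i).a."""
    seen = set()
    while obj not in seen:             # the chain is finite up to repetition
        if n[obj] != 0:
            return False
        seen.add(obj)
        obj = a[obj]
    return True
```

For the sample data, x and y reference each other and both have n = 0, so both satisfy INIT; z has n = 0 but references w, whose n is 3, so neither z nor w does. The two functions agree on every object.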


4.2 Operation Schemas

We now look at using the fixed point theory of Section 3.2 to derive non-recursive operation schemas from recursive definitions. As in the previous section, we will illustrate the general approach via examples. Consider the following Object-Z class from Smith [7].

\[
\begin{array}{l}
\textbf{A} \\
\quad n : \mathbb{N} \\[1ex]
\quad Op1 \mathrel{\widehat{=}} [\, \Delta(n) \mid n' = n + 1 \,] \mathbin{{}^{\circ}_{9}} Op1 \\
\quad Op2 \mathrel{\widehat{=}} ([\, \Delta(n) \mid n' = n + 1 \,] \mathbin{{}^{\circ}_{9}} Op2) \mathbin{[\,]} [\, n > 10 \,]
\end{array}
\]

It is not immediately clear what the meaning of Op1 is (the recursion never terminates). (We use the word "terminates" loosely here. We are dealing with recursive definitions involving sets and predicates, not programs.) However, our theory is valid for all operation expressions and so we should be able to find a least fixed point. The function corresponding to this operation definition is

\[
f_{Op1} = \lambda x \bullet [\, \Delta(n) \mid n' = n + 1 \,] \mathbin{{}^{\circ}_{9}} x
\]

Applying this function to the bottom of the operation lattice, i.e., to $[\, \mathit{false} \,]$, we get

\[
\begin{array}{rcl}
f_{Op1}([\, \mathit{false} \,]) & = & [\, \Delta(n) \mid n' = n + 1 \,] \mathbin{{}^{\circ}_{9}} [\, \mathit{false} \,] \\
 & = & [\, \mathit{false} \,]
\end{array}
\]

Since $f_{Op1}^{0}([\, \mathit{false} \,])$ is also equal to $[\, \mathit{false} \,]$ by definition, we have $f_{Op1}^{0}([\, \mathit{false} \,]) = f_{Op1}([\, \mathit{false} \,])$. Therefore,

\[
Op1 = [\, \mathit{false} \,]
\]

That is, the operation has a false precondition and can, therefore, never occur [7]. All operations in which recursion cannot terminate will similarly be equivalent to the operation $[\, \mathit{false} \,]$.

For Op2, the function representing the recursive definition is

\[
f_{Op2} = \lambda x \bullet ([\, \Delta(n) \mid n' = n + 1 \,] \mathbin{{}^{\circ}_{9}} x) \mathbin{[\,]} [\, n > 10 \,]
\]

Applying this to the bottom element $[\, \mathit{false} \,]$ we have


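\[
\begin{array}{rcl}
f_{Op2}([\, \mathit{false} \,]) & = & ([\, \Delta(n) \mid n' = n + 1 \,] \mathbin{{}^{\circ}_{9}} [\, \mathit{false} \,]) \mathbin{[\,]} [\, n > 10 \,] \\
 & = & [\, n > 10 \,] \\[1ex]
f_{Op2}^{2}([\, \mathit{false} \,]) & = & ([\, \Delta(n) \mid n' = n + 1 \,] \mathbin{{}^{\circ}_{9}} [\, n > 10 \,]) \mathbin{[\,]} [\, n > 10 \,] \\
 & = & [\, \Delta(n) \mid n' = n + 1 \land n' > 10 \,] \mathbin{[\,]} [\, n > 10 \,] \\
 & = & [\, \Delta(n) \mid n' = n + 1 \land n' > 10 \,] \mathbin{[\,]} f_{Op2}([\, \mathit{false} \,]) \\[1ex]
f_{Op2}^{3}([\, \mathit{false} \,]) & = & ([\, \Delta(n) \mid n' = n + 1 \,] \mathbin{{}^{\circ}_{9}} ([\, \Delta(n) \mid n' = n + 1 \land n' > 10 \,] \mathbin{[\,]} [\, n > 10 \,])) \mathbin{[\,]} [\, n > 10 \,] \\
 & = & [\, \Delta(n) \mid n' = n + 2 \land n' > 10 \,] \mathbin{[\,]} [\, \Delta(n) \mid n' = n + 1 \land n' > 10 \,] \mathbin{[\,]} [\, n > 10 \,] \\
 & = & [\, \Delta(n) \mid n' = n + 2 \land n' > 10 \,] \mathbin{[\,]} f_{Op2}^{2}([\, \mathit{false} \,])
\end{array}
\]

Continuing in this fashion, we see that for any natural number $\nu$,

\[
f_{Op2}^{\nu+1}([\, \mathit{false} \,]) = [\, \Delta(n) \mid n' = n + \nu \land n' > 10 \,] \mathbin{[\,]} f_{Op2}^{\nu}([\, \mathit{false} \,])
\]

Note that $[\, n > 10 \,]$ resulting from $f_{Op2}([\, \mathit{false} \,])$ is semantically identical to $[\, \Delta(n) \mid n' = n + 0 \land n' > 10 \,]$. Therefore, for the limit ordinal $\omega$ (noting that, for a given list of variables $x$, the schema part of $[\, \Delta(x)\ a \,] \mathbin{[\,]} [\, \Delta(x)\ b \,]$ is equivalent to $[\, a \,] \lor [\, b \,]$ [7]) we have

\[
\begin{array}{rcl}
f_{Op2}^{\omega}([\, \mathit{false} \,]) & = & [\,]\; \nu : \mathbb{N} \bullet [\, \Delta(n) \mid n' = n + \nu \land n' > 10 \,] \\
 & = & \exists \nu : \mathbb{N} \bullet [\, \Delta(n) \mid n' = n + \nu \land n' > 10 \,] \\
 & = & [\, \Delta(n) \mid \exists \nu : \mathbb{N} \bullet n' = n + \nu \land n' > 10 \,] \\
 & = & [\, \Delta(n) \mid n' \geq n \land n' > 10 \,]
\end{array}
\]

Applying the function again, we have

\[
\begin{array}{rcl}
f_{Op2}^{\omega+1}([\, \mathit{false} \,]) & = & ([\, \Delta(n) \mid n' = n + 1 \,] \mathbin{{}^{\circ}_{9}} [\, \Delta(n) \mid n' \geq n \land n' > 10 \,]) \mathbin{[\,]} [\, n > 10 \,] \\
 & = & [\, \Delta(n) \mid n' > n \land n' > 10 \,] \mathbin{[\,]} [\, n > 10 \,] \\
 & = & [\, \Delta(n) \mid n' \geq n \land n' > 10 \,]
\end{array}
\]

Therefore, $f_{Op2}^{\omega+1}([\, \mathit{false} \,]) = f_{Op2}^{\omega}([\, \mathit{false} \,])$ and so, according to the fixed point theory of Section 3.2,

\[
Op2 = [\, \Delta(n) \mid n' \geq n \land n' > 10 \,]
\]

For classes with recursive object references, we need to add a secondary variable to provide access to all referenced objects as was done in the example of Section 4.1. As a simple example, consider the following class (also from Smith [7]).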


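\[
\begin{array}{l}
\textbf{B} \\
\quad b : B \\
\quad n : \mathbb{N} \\[1ex]
\quad Op \mathrel{\widehat{=}} b.Op \mathbin{[\,]} [\, \Delta(n) \mid n' = n + 1 \,]
\end{array}
\]

It can be extended to the semantically equivalent class B1 below.

\[
\begin{array}{l}
\textbf{B1} \\
\quad \upharpoonright (b, n, Op) \\[1ex]
\quad b : B \\
\quad n : \mathbb{N} \\
\quad \Delta \\
\quad s : \mathbb{N} \to B \\
\quad \hline
\quad s(0) = \mathit{self} \\
\quad \forall i : \mathbb{N} \bullet s(i + 1) = s(i).b \\[1ex]
\quad Op \mathrel{\widehat{=}} b.Op \mathbin{[\,]} [\, \Delta(n) \mid n' = n + 1 \,]
\end{array}
\]

The function representing the recursive definition of Op is

\[
f_{Op} = \lambda x \bullet b.x \mathbin{[\,]} [\, \Delta(n) \mid n' = n + 1 \,]
\]

Applying this to $[\, \mathit{false} \,]$, we have

\[
\begin{array}{rcl}
f_{Op}([\, \mathit{false} \,]) & = & b.[\, \mathit{false} \,] \mathbin{[\,]} [\, \Delta(n) \mid n' = n + 1 \,] \\
 & = & [\, \Delta(n) \mid n' = n + 1 \,] \\[1ex]
f_{Op}^{2}([\, \mathit{false} \,]) & = & b.[\, \Delta(n) \mid n' = n + 1 \,] \mathbin{[\,]} [\, \Delta(n) \mid n' = n + 1 \,] \\
 & = & b.[\, \Delta(n) \mid n' = n + 1 \,] \mathbin{[\,]} f_{Op}([\, \mathit{false} \,]) \\[1ex]
f_{Op}^{3}([\, \mathit{false} \,]) & = & b.b.[\, \Delta(n) \mid n' = n + 1 \,] \mathbin{[\,]} b.[\, \Delta(n) \mid n' = n + 1 \,] \mathbin{[\,]} [\, \Delta(n) \mid n' = n + 1 \,] \\
 & = & b.b.[\, \Delta(n) \mid n' = n + 1 \,] \mathbin{[\,]} f_{Op}^{2}([\, \mathit{false} \,])
\end{array}
\]

That is, for an arbitrary natural number $\nu$, we have

\[
f_{Op}^{\nu+1}([\, \mathit{false} \,]) = s(\nu).[\, \Delta(n) \mid n' = n + 1 \,] \mathbin{[\,]} f_{Op}^{\nu}([\, \mathit{false} \,])
\]

and, hence,

\[
f_{Op}^{\omega}([\, \mathit{false} \,]) = [\,]\; \nu : \mathbb{N} \bullet s(\nu).[\, \Delta(n) \mid n' = n + 1 \,]
\]

Also,

\[
\begin{array}{rcl}
f_{Op}^{\omega+1}([\, \mathit{false} \,]) & = & b.([\,]\; \nu : \mathbb{N} \bullet s(\nu).[\, \Delta(n) \mid n' = n + 1 \,]) \mathbin{[\,]} [\, \Delta(n) \mid n' = n + 1 \,] \\
 & = & ([\,]\; \nu : \mathbb{N}_1 \bullet s(\nu).[\, \Delta(n) \mid n' = n + 1 \,]) \mathbin{[\,]} s(0).[\, \Delta(n) \mid n' = n + 1 \,] \\
 & = & [\,]\; \nu : \mathbb{N} \bullet s(\nu).[\, \Delta(n) \mid n' = n + 1 \,]
\end{array}
\]

Therefore,

\[
Op = [\,]\; \nu : \mathbb{N} \bullet s(\nu).[\, \Delta(n) \mid n' = n + 1 \,]
\]

5 Tree Example Revisited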

In this section, we use the approach developed in Section 4 to provide a nonrecursive interpretation of the schemas of the ordered binary tree of Section 2. This example presents us with a slightly more complex structure of referenced objects. To allow access to these objects, we extend the class with a type LeftRight ::= left | right, as well as a secondary variable s which maps sequences of elements of the type LeftRight to the corresponding tree objects as shown in Figure 2.

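Fig. 2. Root node of subtree $s(\langle \mathit{left}, \mathit{right}, \mathit{right} \rangle)$ (figure labels: left, right, right along the path from the root $s(\langle\rangle)$)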

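The state schema of the extended class has the additional secondary variable declaration

\[
s : (\mathrm{seq}\ \mathit{LeftRight}) \to \mathit{Tree}
\]

and predicate

\[
\begin{array}{l}
s(\langle\rangle) = \mathit{self} \\
\forall i : \mathrm{seq}\ \mathit{LeftRight} \bullet s(i \frown \langle \mathit{left} \rangle) = s(i).\mathit{left\_tree} \land s(i \frown \langle \mathit{right} \rangle) = s(i).\mathit{right\_tree}
\end{array}
\]

5.1 Initial State Schema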

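The function representing the initial state schema of class Tree is

\[
f_{\mathit{INIT}} = \lambda x \bullet [\, \mathit{null} \land \mathit{left\_tree}.x \land \mathit{right\_tree}.x \,]
\]

Applying this to the bottom element $[\, \mathit{true} \,]$, we have

\[
\begin{array}{rcl}
f_{\mathit{INIT}}([\, \mathit{true} \,]) & = & [\, \mathit{null} \land \mathit{left\_tree}.[\, \mathit{true} \,] \land \mathit{right\_tree}.[\, \mathit{true} \,] \,] \\
 & = & [\, \mathit{null} \,] \\[1ex]
f_{\mathit{INIT}}^{2}([\, \mathit{true} \,]) & = & [\, \mathit{null} \land \mathit{left\_tree}.[\, \mathit{null} \,] \land \mathit{right\_tree}.[\, \mathit{null} \,] \,] \\
 & = & [\, \mathit{null} \land \mathit{left\_tree}.\mathit{null} \land \mathit{right\_tree}.\mathit{null} \,] \\[1ex]
f_{\mathit{INIT}}^{3}([\, \mathit{true} \,]) & = & [\, \mathit{null} \land \mathit{left\_tree}.[\, \mathit{null} \land \mathit{left\_tree}.\mathit{null} \land \mathit{right\_tree}.\mathit{null} \,] \\
 & & \quad \land\ \mathit{right\_tree}.[\, \mathit{null} \land \mathit{left\_tree}.\mathit{null} \land \mathit{right\_tree}.\mathit{null} \,] \,] \\
 & = & [\, \mathit{null} \land \mathit{left\_tree}.\mathit{null} \land \mathit{right\_tree}.\mathit{null} \\
 & & \quad \land\ \mathit{left\_tree}.\mathit{left\_tree}.\mathit{null} \land \mathit{left\_tree}.\mathit{right\_tree}.\mathit{null} \\
 & & \quad \land\ \mathit{right\_tree}.\mathit{left\_tree}.\mathit{null} \land \mathit{right\_tree}.\mathit{right\_tree}.\mathit{null} \,]
\end{array}
\]

It is easy to see that

\[
f_{\mathit{INIT}}^{\omega}([\, \mathit{true} \,]) = [\, \forall i : \mathrm{seq}\ \mathit{LeftRight} \bullet s(i).\mathit{null} \,]
\]

Furthermore, $f_{\mathit{INIT}}^{\omega+1}([\, \mathit{true} \,]) = f_{\mathit{INIT}}^{\omega}([\, \mathit{true} \,])$. Therefore,

\[
\mathit{INIT} = [\, \forall i : \mathrm{seq}\ \mathit{LeftRight} \bullet s(i).\mathit{null} \,]
\]

That is, all nodes are initially null. This is what we intuitively expected from the recursive definition.

5.2 Insert Operation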

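Let $\mathit{NodeInsert} \mathrel{\widehat{=}} [\, \Delta(\mathit{null}, \mathit{val})\ v? : \mathbb{N} \mid \mathit{null} \land \mathit{val}' = v? \land \lnot \mathit{null}' \,]$. The function representing the operation Insert of class Tree is

\[
f_{\mathit{Insert}} = \lambda x \bullet \mathit{NodeInsert} \mathbin{[\,]} [\, \lnot \mathit{null} \,] \land (\mathit{left\_tree}.x \mathbin{[\,]} \mathit{right\_tree}.x)
\]

Applying this to the bottom element $[\, \mathit{false} \,]$, we have

\[
\begin{array}{rcl}
f_{\mathit{Insert}}([\, \mathit{false} \,]) & = & \mathit{NodeInsert} \mathbin{[\,]} [\, \lnot \mathit{null} \,] \land (\mathit{left\_tree}.[\, \mathit{false} \,] \mathbin{[\,]} \mathit{right\_tree}.[\, \mathit{false} \,]) \\
 & = & \mathit{NodeInsert} \\[1ex]
f_{\mathit{Insert}}^{2}([\, \mathit{false} \,]) & = & \mathit{NodeInsert} \mathbin{[\,]} [\, \lnot \mathit{null} \,] \land (\mathit{left\_tree}.\mathit{NodeInsert} \mathbin{[\,]} \mathit{right\_tree}.\mathit{NodeInsert}) \\[1ex]
f_{\mathit{Insert}}^{3}([\, \mathit{false} \,]) & = & \mathit{NodeInsert} \mathbin{[\,]} [\, \lnot \mathit{null} \,] \land {} \\
 & & \quad ((\mathit{left\_tree}.\mathit{NodeInsert} \mathbin{[\,]} (\lnot \mathit{left\_tree}.\mathit{null} \land {} \\
 & & \qquad (\mathit{left\_tree}.\mathit{left\_tree}.\mathit{NodeInsert} \mathbin{[\,]} \mathit{left\_tree}.\mathit{right\_tree}.\mathit{NodeInsert}))) \\
 & & \quad \mathbin{[\,]}\ (\mathit{right\_tree}.\mathit{NodeInsert} \mathbin{[\,]} (\lnot \mathit{right\_tree}.\mathit{null} \land {} \\
 & & \qquad (\mathit{right\_tree}.\mathit{left\_tree}.\mathit{NodeInsert} \mathbin{[\,]} \mathit{right\_tree}.\mathit{right\_tree}.\mathit{NodeInsert}))))
\end{array}
\]

From this we can see that the NodeInsert operation is applied to a node N such that all nodes between the root node and N are not null. The choice of the actual node out of those fulfilling this condition is made angelically. Therefore, we can deduce that

\[
f_{\mathit{Insert}}^{\omega}([\, \mathit{false} \,]) = [\,]\; i : \mathrm{seq}\ \mathit{LeftRight} \mid (\forall j : \mathrm{seq}\ \mathit{LeftRight} \mid j \subset i \bullet \lnot s(j).\mathit{null}) \bullet s(i).\mathit{NodeInsert}
\]

(Note that since $j$ and $i$ are sequences, $j \subset i$ means that $j$ is a prefix of $i$.) It also follows that $f_{\mathit{Insert}}^{\omega+1} = f_{\mathit{Insert}}^{\omega}$ and hence

\[
\mathit{Insert} = [\,]\; i : \mathrm{seq}\ \mathit{LeftRight} \mid (\forall j : \mathrm{seq}\ \mathit{LeftRight} \mid j \subset i \bullet \lnot s(j).\mathit{null}) \bullet s(i).\mathit{NodeInsert}
\]

Once again, this is intuitively what we expected. The choice of the node to which Insert is applied is further restricted by the state schema's predicate which ensures the tree is ordered.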
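The non-recursive reading of Insert can be illustrated with a small Python sketch. It is not a translation of the specification and makes several assumptions: a tree is represented as a dictionary from paths (tuples over "left" and "right", playing the role of the argument of s) to values, a path absent from the dictionary stands for a null node, and j ⊂ i is read as "j is a proper prefix of i". The function names are hypothetical.

```python
from itertools import product

LEFT, RIGHT = "left", "right"


def enabled_paths(tree):
    """Paths i whose node is null while every proper prefix j is non-null."""
    depth = max((len(p) for p in tree), default=-1) + 1
    paths = [p for k in range(depth + 1)
             for p in product((LEFT, RIGHT), repeat=k)]
    return [i for i in paths
            if i not in tree and all(i[:k] in tree for k in range(len(i)))]


def ordered(tree):
    """State invariant: left descendants are smaller, right ones larger."""
    for p, val in tree.items():
        for q, w in tree.items():
            if q[:len(p) + 1] == p + (LEFT,) and not w < val:
                return False
            if q[:len(p) + 1] == p + (RIGHT,) and not w > val:
                return False
    return True


def insert(tree, v):
    """All trees reachable by applying NodeInsert at an enabled path."""
    outcomes = []
    for i in enabled_paths(tree):
        candidate = dict(tree)
        candidate[i] = v          # NodeInsert at node s(i)
        if ordered(candidate):    # the state predicate filters the choice
            outcomes.append(candidate)
    return outcomes
```

Starting from the empty dictionary and inserting 5, 2, 8 and 1, each call yields exactly one outcome: the angelic choice among enabled paths is resolved by the ordering invariant. Inserting a value that is already present yields no outcome, i.e., the operation is not enabled in this model.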

6 Conclusion

This paper has presented fixed point definitions for recursive initial state and operation schemas in Object-Z. In particular, it has provided a set of conditions under which recursive initial state schemas are consistent. These conditions amount to restrictions on the occurrences of INIT in the predicate of an initial state schema. Also, the paper has shown that all recursive operation schema definitions are consistent. The primary advantage of fixed point definitions is that they provide a straightforward way of representing recursive schema definitions by semantically equivalent non-recursive ones. The paper illustrates guidelines for doing this via simple examples and shows how to apply these guidelines to a recursive specification of an ordered binary tree.

Acknowledgements

The author would like to thank Ian Hayes for fruitful discussions on aspects of this work, and for comments on an earlier draft of this paper. This work is funded by Australian Research Council grant number A49801500.

References

1. R.-J. Back and J. von Wright. Refinement Calculus: A Systematic Introduction. Graduate Texts in Computer Science. Springer-Verlag, 1998.
2. R. Duke, G. Rose, and G. Smith. Object-Z: A specification language advocated for the description of standards. Computer Standards and Interfaces, 17:511–533, 1995.
3. A. Griffiths. An extended semantic foundation for Object-Z. In 1996 Asia-Pacific Software Engineering Conference (APSEC'96), pages 194–207. IEEE Computer Society Press, 1996.
4. A. Griffiths. A semantics for recursive operations in Object-Z. In L. Groves and S. Reeves, editors, Formal Methods Pacific'97 (FMP'97), pages 81–102. Springer-Verlag, 1997.
5. A. Griffiths. A Formal Semantics to Support Modular Reasoning in Object-Z. PhD thesis, Software Verification Research Centre, University of Queensland, 1998.
6. D. Scott and C. Gunther. Semantic domains and denotational semantics. In Handbook of Theoretical Computer Science, chapter 12, pages 633–674. Elsevier Science Publisher, 1990.
7. G. Smith. The Object-Z Specification Language. Advances in Formal Methods. Kluwer Academic Publishers, 2000.
8. J.M. Spivey. The Z Notation: A Reference Manual. Prentice Hall, 2nd edition, 1992.

On Mutually Recursive Free Types in Z

I. Toyn, S.H. Valentine, and D.A. Duffy

Dept. of Computer Science, University of York, Heslington, York, YO10 5DD, UK.
{ian,sam,dad}@cs.york.ac.uk

Abstract. Mutually recursive free types are one of the innovations in the forthcoming ISO Standard for the Z notation. Their semantics has been specified by extending a formalization of the semantics of traditional Z free types to permit mutual recursion. That development is reflected in the structure of this paper. An explanation of traditional Z free types is given, along with some examples, and their general form is defined. Their semantics is defined by transformation to other equivalent Z notation. These equivalent constraints provide a basis for inference rules, as illustrated by an example proof. Notation for mutually recursive free types is introduced, and the semantics presented earlier is extended to define their meaning. Example inductive proofs concerning mutually recursive free types are presented.

1 Introduction

A specification written in Z [10] names the components of the specified system and expresses constraints between the values of those components. The constraints are written in typed set theory. Each type has a carrier set, denoted in expressions by the type name. The members of this set are determined by subsequent constraints.

\[
\begin{array}{l}
[\mathit{switch}] \\[1ex]
\mathit{Off}, \mathit{On} : \mathit{switch} \\
\hline
\mathit{Off} \neq \mathit{On} \\
\mathit{switch} = \{\mathit{Off}, \mathit{On}\}
\end{array}
\]

In this example, switch is a type, and Off, On are distinct members of the carrier set of that type. The last constraint specifies that there are no other members.

For some types, the members of their carrier sets conform to a free algebra, i.e. the members are distinct enumerated elements and/or values formed by injection of values from other types, and the carrier sets have no other members. The switch type above is a simple example of such a type. Z provides free type notation to ease the definition of such types, abbreviating many constraints that would otherwise be needed to specify the same effect. Here is the switch type presented in free type notation.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 59–74, 2000. © Springer-Verlag Berlin Heidelberg 2000


\[
\mathit{switch} ::= \mathit{Off} \mid \mathit{On}
\]

This paper specifies the constraints that free type notation abbreviates. That has been done before [10], but a different form for the constraints was needed for the formal semantics in the draft Z standard [14], and that is the form presented here.

An extension of the Z free type notation to permit the specification of mutually recursive free types [12] is the major contribution of this paper. (That extension also appears in the draft Z standard.) The formal semantics of the extension is specified by extension of the equivalent constraints.

The equivalent constraints inspire inference rules for use in reasoning about conjectures involving values of types specified using the free type notation. Some example proofs using those inference rules are presented, both for traditional free types and for mutually recursive free types. Sufficient conditions for the consistency of the definitions given are also described.

2 Traditional Free Types

2.1 Example Free Types

The free type notation that has traditionally been used in Z is illustrated by the following examples.

    weekDay ::= Mon | Tue | Wed | Thu | Fri
    nats ::= Zero | Succ⟨⟨nats⟩⟩
    intTree ::= Leaf⟨⟨ℤ⟩⟩ | Branch⟨⟨intTree × intTree⟩⟩

Types switch and weekDay have several elements and no injections. (The names element and injection are used respectively for the nullary and non-nullary constructors of values of a free type throughout this paper.) Type nats has both an element and an injection. Type intTree has two injections and no elements. The order in which the elements and injections of a free type are written does not affect the meaning. Types nats and intTree are recursive: values of either type can be injected into another value of that type.

2.2 General Form of a Free Type

The general form of a free type can be expressed by the pattern

    f ::= h_1 | ... | h_m | g_1⟨⟨e_1⟩⟩ | ... | g_n⟨⟨e_n⟩⟩

where f is the name of the free type, h_1 ... h_m are the names of its elements, g_1 ... g_n are the names of its injections, e_1 ... e_n are the expressions denoting the domains of those injections, and m + n ≥ 1. The expressions e_1 ... e_n may contain

On Mutually Recursive Free Types in Z


occurrences of the free type name f. This pattern assumes that the elements are written before the injections. The draft Z standard formalizes the permutation of elements and injections into this order using the following syntactic transformation rule,

    ... | g⟨⟨e⟩⟩ | h | ...  ⟹  ... | h | g⟨⟨e⟩⟩ | ...

the exhaustive application of which effects a sort. The scope rules of Z require that f, h_1, ..., h_m, g_1, ..., g_n should be distinct names.

2.3 Constraints Abbreviated by a Free Type

All of the values in the carrier set of a free type can be denoted by the element names and by applications of the injections. If two such denotations look the same, then they denote the same value, whereas if they look different, then they denote different values. All such values are in the carrier set, and no other values are.

The following subsections formalise the constraints expressed informally in the preceding paragraph, in terms of the general form given in Section 2.2. The constraints can be expressed most concisely by exploiting notations given in the mathematical toolkit. The use of the toolkit is optional, however, while free types are part of the core notation, so it is necessary to be able to express the constraints in core notation alone.

First, a given type is declared whose carrier set will be constrained to correspond to that of the free type.

    [f]

Membership constraints

Every element is a member of the carrier set of the free type. For all j ∈ 1..m,

    h_j : f

Every injection relates a value from its domain to a member of the carrier set of the free type. For all k ∈ 1..n,

    g_k : ℙ(e_k × f)

These membership constraints also serve to declare the element and injection names. The declarations of injections could have been made to subsume the following total functionality and injectivity constraints, but separating them is informative and convenient for formal proof.

Examples

The constraints abbreviated by type nats include two membership constraints.

    Zero : nats
    Succ : ℙ(nats × nats)
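As an informal aside, the two membership constraints for nats can be mimicked outside Z. The following Python sketch is not part of the Z semantics: the class names mirror the constructors of nats, while `is_nat`, `fragment` and `succ_relation` are hypothetical helpers introduced only for illustration, and the relation Succ is explored on a small finite fragment of the carrier set.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Zero:
    """The single element of nats."""


@dataclass(frozen=True)
class Succ:
    """The single injection of nats; its domain is nats itself."""
    pred: "Nat"


Nat = Union[Zero, Succ]


def is_nat(v: object) -> bool:
    """Membership test: v is built from Zero by finitely many uses of Succ."""
    while isinstance(v, Succ):
        v = v.pred
    return isinstance(v, Zero)


def fragment(depth: int) -> list:
    """The first depth + 1 members of the carrier set: Zero, Succ Zero, ..."""
    values = [Zero()]
    for _ in range(depth):
        values.append(Succ(values[-1]))
    return values


# Succ viewed as a relation (a set of pairs), restricted to a finite fragment.
succ_relation = {(u, Succ(u)) for u in fragment(5)}

# Zero : nats
zero_is_member = is_nat(Zero())
# Succ : P(nats x nats), checked only on the fragment above
succ_pairs_are_members = all(is_nat(u) and is_nat(x) for (u, x) in succ_relation)
```

Here both checks evaluate to true. Python does not enforce the type of `pred`, so `is_nat` plays the role that the typed declaration plays in Z.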


Total functionality constraints

Every injection is a total function over the values of its domain expression. For all k ∈ 1..n,

    g_k ∈ e_k → f

In core notation, for every value in the domain of an injection, there exists a single value in the carrier set of the free type. For all k ∈ 1..n,

    ∀ u : e_k • ∃₁ x : g_k • x.1 = u

The notation x.1 projects the first component from tuple x. In this case x is a pair, and so x.1 is equivalent to the toolkit notation first x.

Example

The constraints abbreviated by type nats include one total functionality constraint.

    ∀ u : nats • ∃₁ x : Succ • x.1 = u

Injectivity constraints

Every injection relates different values in its domain to different values in the free type. For all k ∈ 1..n,

    g_k ∈ e_k ⤔ f

In core notation, if two applications of the same injection denote the same value in the carrier set, then they must have been applied to the same value. For all k ∈ 1..n,

    ∀ u, v : e_k | g_k u = g_k v • u = v

Example

The constraints abbreviated by type nats include one injectivity constraint.

    ∀ u, v : nats | Succ u = Succ v • u = v

Disjointness constraints

Any two values denoted by different elements or injections are different values.

    disjoint ⟨{h_1}, ..., {h_m}, ran g_1, ..., ran g_n⟩

In core notation, different element names denote different elements. For all i ∈ 1..m, j ∈ 1..m, such that i ≠ j,

    ¬ h_i = h_j

Any element is different from any application of an injection. For all k ∈ 1..n, j ∈ 1..m,

    ∀ u : e_k • ¬ h_j = g_k u

Applications of different injections produce different values. For all k ∈ 1..n, l ∈ 1..n, such that k ≠ l,

    ∀ u : e_k; v : e_l • ¬ g_k u = g_l v


Examples

The constraints abbreviated by types switch, nats and intTree include one disjointness constraint each.

    ¬ Off = On
    ∀ u : nats • ¬ Zero = Succ u
    ∀ u : ℤ; v : intTree × intTree • ¬ Leaf u = Branch v

Induction constraint

Any subset of a free type containing the values of all of its elements and applications of injections contains all the values of its carrier set.

    ∀ w : ℙ f |
        {h_1, ..., h_m}
          ∪ g_1(| let f == w • e_1 |)
          ∪ ... ∪ g_n(| let f == w • e_n |) ⊆ w
      • w = f

If the free type is recursive, some of expressions e_1 ... e_n contain references to f. Within the induction constraint, these references must refer to w instead; this substitution is effected by the let f == w • notation. If the free type is not recursive, these substitutions leave the expressions unchanged. The induction constraint corresponds to a well-founded ordering [4]: the elements are the minimal values, and the injections impose a partial order. Equivalent core notation is as follows.

    ∀ w : ℙ f |
        h_1 ∈ w ∧ ... ∧ h_m ∈ w ∧
        (∀ y : (let f == w • e_1) • g_1 y ∈ w) ∧ ... ∧
        (∀ y : (let f == w • e_n) • g_n y ∈ w)
      • w = f

Example

The constraints abbreviated by type intTree include this induction constraint (in which the substitutions have been performed).

    ∀ w : ℙ intTree |
        (∀ y : ℤ • Leaf y ∈ w) ∧
        (∀ y : w × w • Branch y ∈ w)
      • w = intTree

2.4 Finiteness

We require below the notion of finiteness in order to establish whether a free type definition is consistent. If we were to define finiteness in terms of properties of natural numbers, and also define the set of natural numbers as a free type, there would be a circularity. This circularity could be broken by establishing the existence of the natural numbers as a special case. This is the approach taken by Spivey [9,10]. Similarly, in HOL, numbers are used to establish the existence of


recursive types [7]. We avoid the circularity by defining finiteness independently of numbers. The set of all finite subsets of a set as given in [14] is:

    𝔽 X == ⋂{A : ℙ(ℙ X) | ∅ ∈ A ∧ ∀ a : A; x : X • a ∪ {x} ∈ A}

A set S is then finite if S ∈ 𝔽 S is true. An induction principle for finite sets can be derived directly from the above definition.

2.5 Consistency of Constraints

Proofs performed relative to specified constraints assume the consistency of the specification: if the constraints are contradictory, then the specification is inconsistent and anything may be formally proven from it. When a free type is recursive, the equivalent constraints may be contradictory, and hence the specification could be inconsistent. An example of such a free type is the following.

    inf ::= Cons⟨⟨ℙ inf⟩⟩

However small or large the set inf is, ℙ inf is a larger set, yet Cons injects every value in ℙ inf into inf, so there is a contradiction. It is therefore appropriate to consider the consistency of a free type before performing proofs involving its constraints.

The problem of demonstrating consistency for a free type has been investigated in depth elsewhere [8,1,15], so only a brief sketch is given here. The consistency of a free type can be argued from the perspective of building up the collection of values in its carrier set, starting from the set of element values. At each step around the recursion, additional values should be added to the carrier set without any values being removed. In other words, each expression denoting the domain of an injection should be a monotonic function of the free type. Formally, for any free type, f, and domain expression, e_i, the expression is monotonic if

    ⋃{w : ℙ f • let f == w • e_i} ⊆ e_i

Also, any value in the carrier set should be added by a finite number of steps around the recursion, which we describe by saying that expressions denoting the domains of the injections are finitary functions. Formally,

    e_i ⊆ ⋃{w : 𝔽 f • let f == w • e_i}

Other authors [1,10] take these two conditions together, and define a condition called “finitary” which subsumes monotonicity, thus.

    e_i = ⋃{w : 𝔽 f • let f == w • e_i}

If this condition can be shown to hold for each of the domains of the injections, the consistency of the recursive free type is guaranteed.
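The cardinality argument against inf can be observed on small finite sets. The following Python sketch is only an illustration under that restriction (the general claim, for arbitrary sets, is Cantor's theorem); the helper names `powerset` and `has_injection` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def powerset(s: frozenset) -> set:
    """All subsets of a finite set s, each represented as a frozenset."""
    items = list(s)
    return {
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    }


def has_injection(domain_size: int, codomain_size: int) -> bool:
    """For finite sets, an injection exists exactly when the domain is no larger."""
    return domain_size <= codomain_size


# Pairs (size of S, size of P S) for S = {0, ..., n-1}, n = 0..5.
sizes = [(n, len(powerset(frozenset(range(n))))) for n in range(6)]

# No candidate carrier set of these sizes admits an injection from its powerset,
# which is what the constraints on Cons would demand.
cons_possible = [has_injection(p, n) for (n, p) in sizes]
```

Every entry of `cons_possible` is false: a set of size n has 2ⁿ subsets, so no total injection from the powerset back into the set can exist.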


The powerset function is not finitary, and the example above shows the consequences of allowing that to be used within the constructor function. An example of a non-monotonic definition would be

    solo ::= unit | successor⟨⟨{t : solo | ∀ x, y : solo • x = y}⟩⟩

where the monotonicity condition becomes

    ⋃{w : ℙ solo • let solo == w • {t : solo | ∀ x, y : solo • x = y}}
        ⊆ {t : solo | ∀ x, y : solo • x = y}

which is true if solo is a singleton set, but not otherwise. The singleton case is not consistent with the other properties required, however.

For the example nats above, the finitariness condition is

    nats = ⋃{w : 𝔽 nats • let nats == w • nats}

and for the example intTree we have the two conditions

    ℤ = ⋃{w : 𝔽 intTree • let intTree == w • ℤ}

and

    intTree × intTree = ⋃{w : 𝔽 intTree • let intTree == w • intTree × intTree}

all of which can easily be seen to be satisfied. The only proviso is that intTree is not empty, which is guaranteed by the definition as a whole.

2.6 Formulation as Inference Rules

When trying to prove a conjecture ⊨? p in which the predicate p refers to an element h or an injection g of a free type defined in the specification, a valid inference is to introduce any of the corresponding constraints specified in Section 2.3 as an antecedent. If C is one of those constraints, then ⊨? C ⇒ p is the result of a valid inference. A membership constraint expressed as a declaration i : e is expressed as a predicate i ∈ e in this context, with expression i referring to the declaration of the corresponding element or injection in the specification.

2.7 Use of Inference Rules in an Example Proof

Given the type nats introduced in Section 2.1, the following conjecture can be proved.

    ⊨? ∀ x : nats • ¬ x = Succ x

The proof begins by appealing to the induction constraint for type nats (with the substitution of w for nats in the domain of Succ already performed)

    ∀ w : ℙ nats | Zero ∈ w ∧ (∀ y : w • Succ y ∈ w) • w = nats

instantiating w with the set {x : nats | ¬ x = Succ x} derived syntactically from the consequent of the conjecture. The resulting equality between that set and nats can be used to rewrite the reference to nats in the consequent. After simplification, two goals remain.

    ⊨? ¬ Zero = Succ Zero
    ⊨? ∀ y : nats | ¬ y = Succ y • ¬ Succ y = Succ(Succ y)

These two goals may be recognised as the base case and the induction (or step) case [3].

The base case is proved by appealing to the disjointness constraint

    ∀ u : nats • ¬ Zero = Succ u

and instantiating u with Zero, derived syntactically from the goal, followed by a little simplification.

The step case is proved by appealing to the injectivity constraint

    ∀ u, v : nats | Succ u = Succ v • u = v

distributing this inside the ∀ y ... so that u can be instantiated with y and v with Succ y. Some simplification suffices to prove the remaining sub-goals.

The above presentation of the proof was derived by abstracting from a proof produced using the CADiZ proof assistant [13,11].

2.8 Single Disjointness Constraint

Using the notation of the mathematical toolkit, the disjointness requirement can be expressed as a single predicate using the relation “disjoint”. In core language, a direct approach is to express separately the disjointness constraint for each pair of elements and injections, but if there are n of these, the number of constraints is n ∗ (n − 1) ÷ 2, which becomes unmanageably large if n is large. An alternative is to use core language that more closely corresponds to the meaning of the “disjoint” relation. In terms of the general form of a free type given in Section 2.2, the single disjointness constraint is as follows.


    ∀ b1, b2 : ℕ • ∀ w : f |
        (b1 = 1 ∧ w = h_1 ∨ ... ∨ b1 = m ∧ w = h_m ∨
         b1 = m + 1 ∧ w ∈ {x : g_1 • x.2} ∨ ... ∨ b1 = m + n ∧ w ∈ {x : g_n • x.2}) ∧
        (b2 = 1 ∧ w = h_1 ∨ ... ∨ b2 = m ∧ w = h_m ∨
         b2 = m + 1 ∧ w ∈ {x : g_1 • x.2} ∨ ... ∨ b2 = m + n ∧ w ∈ {x : g_n • x.2})
      • b1 = b2

This formulation associates numbers 1..m with the m elements, and m + 1..m + n with the n injections. It specifies that for any choice of two numbers, a value of the free type is equal to both of the values constructed by the corresponding elements or injections only if the numbers are the same, i.e. if the same constructor is used.

This formulation makes use of the addition operator, which fortunately has moved from the optional toolkit to the draft Z standard’s mandatory prelude (where it assists in the definition of number literal expressions). Hence this new disjointness constraint can be said to be in core notation. An example appears in Section 3.4.
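The numbering idea behind the single disjointness constraint can be sketched executably. The Python fragment below is an assumption-laden illustration, not the standard's formulation: values are encoded as hypothetical (constructor name, argument) pairs, elements carry the argument `None`, and the functions `matches` and `single_disjointness` are invented for this sketch.

```python
from itertools import product


def matches(b: int, w: tuple, elements: list, injections: list) -> bool:
    """True when w is built by the constructor numbered b (1-based).

    Numbers 1..m denote the elements and m+1..m+n the injections.
    """
    m = len(elements)
    if 1 <= b <= m:
        return w == (elements[b - 1], None)
    if m < b <= m + len(injections):
        return w[0] == injections[b - m - 1] and w[1] is not None
    return False


def single_disjointness(values, elements: list, injections: list) -> bool:
    """For all b1, b2 and all w: if w matches both numbers, then b1 = b2."""
    total = len(elements) + len(injections)
    numbers = range(0, total + 2)  # out-of-range numbers match nothing
    return all(
        b1 == b2
        for w in values
        for b1, b2 in product(numbers, repeat=2)
        if matches(b1, w, elements, injections)
        and matches(b2, w, elements, injections)
    )


# A hypothetical free type with three elements and two injections.
sample = [
    ("Mon", None),
    ("Wed", None),
    ("Leaf", 7),
    ("Branch", (("Leaf", 1), ("Leaf", 2))),
]
ok = single_disjointness(sample, ["Mon", "Tue", "Wed"], ["Leaf", "Branch"])

# Listing the same injection twice breaks disjointness: a Leaf value
# then matches two different numbers.
broken = single_disjointness(sample, ["Mon", "Tue", "Wed"], ["Leaf", "Leaf"])

# Number of separate pairwise constraints for five constructors.
pairwise_count = 5 * (5 - 1) // 2
```

With five constructors the pairwise formulation needs ten separate constraints, whereas the numbered formulation remains a single predicate whose size grows only linearly.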

3 Mutually Recursive Free Types

The traditional Z free type notation described above does not permit several types to be mutually recursive. However, support for mutually recursive free types is included in the draft Z standard. Free types that are mutually recursive must be written within the same paragraph, separated by the & symbol. (This syntactic restriction is a consequence of the decision not to require tools to admit references from one paragraph to the top-level declarations of later paragraphs.)

3.1 Example

One use for mutually recursive free types is in specifying the syntax of languages, as in this example

    exp ::= Node⟨⟨ℕ₁⟩⟩ | Cond⟨⟨pred × exp × exp⟩⟩
    &
    pred ::= Compare⟨⟨exp × exp⟩⟩

where an expression exp can be a conditional involving a predicate pred, and a pred compares expressions.

3.2 General Form

Mutually recursive free types have the following general form.


    f_1 ::= h_{1,1} | ... | h_{1,m_1} | g_{1,1}⟨⟨e_{1,1}⟩⟩ | ... | g_{1,n_1}⟨⟨e_{1,n_1}⟩⟩
    & ... &
    f_r ::= h_{r,1} | ... | h_{r,m_r} | g_{r,1}⟨⟨e_{r,1}⟩⟩ | ... | g_{r,n_r}⟨⟨e_{r,n_r}⟩⟩

where for all j ∈ 1..r and k ∈ 1..n_j, m_j + n_j ≥ 1 and e_{j,k} can contain occurrences of f_1 through f_r. The scope rules of Z are extended to ensure that the free type names, element names and injection names are all distinct.

3.3 Constraints

As for traditional Z free types, each free type in a mutually recursive collection has a carrier set of values, denoted by element names and by applications of injections. The membership, total functionality, injectivity and disjointness constraints formalized in Section 2.3 for traditional free types remain appropriate for each free type in a mutually recursive collection. Only the induction constraint needs to be revised to support mutual recursion.

Induction constraint

The induction constraint is similar to that of a single free type, but takes all of the mutually recursive free types into account at once. Any subsets of mutually recursive free types, that each contain the values of all of the elements and applications of injections, contain all the values of the corresponding carrier sets.

    ∀ w_1 : ℙ f_1; ...; w_r : ℙ f_r |
        h_{1,1} ∈ w_1 ∧ ... ∧ h_{1,m_1} ∈ w_1 ∧
        ... ∧
        h_{r,1} ∈ w_r ∧ ... ∧ h_{r,m_r} ∈ w_r ∧
        (∀ y : (let f_1 == w_1; ...; f_r == w_r • e_{1,1}) • g_{1,1} y ∈ w_1) ∧ ... ∧
        (∀ y : (let f_1 == w_1; ...; f_r == w_r • e_{1,n_1}) • g_{1,n_1} y ∈ w_1) ∧
        ... ∧
        (∀ y : (let f_1 == w_1; ...; f_r == w_r • e_{r,1}) • g_{r,1} y ∈ w_r) ∧ ... ∧
        (∀ y : (let f_1 == w_1; ...; f_r == w_r • e_{r,n_r}) • g_{r,n_r} y ∈ w_r)
      • w_1 = f_1 ∧ ... ∧ w_r = f_r

When r = 1, this induction constraint collapses to that of a single free type.

The induction constraint is expressed in general form. More specific constraints may be more directly relevant to particular proofs [6], and they may themselves be proven relative to the general form. Strategies for performing inductive reasoning with respect to mutually recursive free types and functions may be found in the AI literature (e.g. [6,5]). The present concern is not such strategic reasoning, but to characterise completely the notion of mutually recursive free type in Z.

3.4 Constraints for the Example

The example mutually recursive free type of Section 3.1 abbreviates the following constraints (using toolkit notation where it helps clarity to do so).


Type declarations

    [exp, pred]

Membership, total functionality and injectivity constraints

These can be combined into the single property of total injectivity.

    Node : ℕ₁ ↣ exp
    Cond : (pred × exp × exp) ↣ exp
    Compare : (exp × exp) ↣ pred

Disjointness constraint

There are no disjointness constraints from the pred type as it has only one injection and no element values.

    disjoint ⟨ran Node, ran Cond⟩

Induction constraint

    ∀ w1 : ℙ exp; w2 : ℙ pred |
        (∀ y : (let exp == w1; pred == w2 • ℕ₁) • Node y ∈ w1) ∧
        (∀ y : (let exp == w1; pred == w2 • pred × exp × exp) • Cond y ∈ w1) ∧
        (∀ y : (let exp == w1; pred == w2 • exp × exp) • Compare y ∈ w2)
      • w1 = exp ∧ w2 = pred

3.5 Consistency

The consistency of mutually recursive free types is shown using criteria that are a direct extrapolation of those given above for the singly recursive case. That is, using the second definition of “finitary” given in Section 2.5 above, the free type definition is consistent if the domain of each free type injection is a finitary function of each of the free types being defined. Thus in the case of the example, the expressions pred × exp × exp and exp × exp must each be finitary functions of pred and of exp.

3.6 An Inductive Proof Involving the Example

Given the mutually recursive free types exp and pred introduced in Section 3.1, we can define the following functions:


    sum : exp → ℕ₁

    ∀ x : ℕ₁ • sum(Node x) = x
    ∀ p : pred; e1, e2 : exp • sum(Cond(p, e1, e2)) = sum e1 + sum e2

    nodes : exp → ℕ₁

    ∀ x : ℕ₁ • nodes(Node x) = 1
    ∀ p : pred; e1, e2 : exp • nodes(Cond(p, e1, e2)) = nodes e1 + nodes e2

To show that the specification has any meaning, the definitions of sum and nodes must first be shown to be consistent. Taking sum as an example, since the case of nodes will be closely similar, it must first be established that

    ⊨? ∃ sum : exp → ℕ₁ •
         (∀ x : ℕ₁ • sum(Node x) = x) ∧
         (∀ p : pred; e1, e2 : exp • sum(Cond(p, e1, e2)) = sum e1 + sum e2)

which we do by giving an explicit witness, namely

    explicitSum == ⋂{sum : exp ↔ ℕ₁ |
        (∀ x : ℕ₁ • (Node x, x) ∈ sum) ∧
        (∀ p : pred; e1, e2 : exp; se1, se2 : ℕ₁ |
            (e1, se1) ∈ sum ∧ (e2, se2) ∈ sum •
            (Cond(p, e1, e2), se1 + se2) ∈ sum)}

and then it can be shown that it has the required properties using the induction principle of the free type. The consistency of recursive functions of this sort has been discussed further elsewhere [2].

We can now pose the conjecture

    ⊨? ∀ e : exp • sum e ≥ nodes e

whose proof begins by appealing to the induction constraint.

    ∀ w1 : ℙ exp; w2 : ℙ pred |
        (∀ y : ℕ₁ • Node y ∈ w1) ∧
        (∀ y : w2 × w1 × w1 • Cond y ∈ w1) ∧
        (∀ y : w1 × w1 • Compare y ∈ w2)
      • w1 = exp ∧ w2 = pred

Instantiate w1 with the set {e : exp | sum e ≥ nodes e} derived syntactically from the consequent of the conjecture, and instantiate w2 with pred. (The latter instantiation could have been done first, specialising the induction constraint to a form that is still sufficient for this example.) After simplification, this leaves three sub-goals. The first is the base case.

    ⊨? ∀ y : ℕ₁ • sum (Node y) ≥ nodes (Node y)


The second is the step case.

    ⊨? ∀ y : pred × {e : exp | sum e ≥ nodes e} × {e : exp | sum e ≥ nodes e} •
         sum (Cond y) ≥ nodes (Cond y)

The third results from rewriting the reference to exp in the original conjecture according to the equality generated by the instantiation of the induction constraint.

    ⊨? ∀ e : exp | {e : exp | sum e ≥ nodes e} = exp • sum e ≥ nodes e

This third sub-goal is proven by simplification.

In the base case, the applications of the functions sum and nodes can be unfolded, to leave the following sub-goal,

    ⊨? ∀ y : ℕ₁ • y ≥ 1

which is a numeric property that can be proven straightforwardly.

The step case simplifies to

    ⊨? ∀ y : pred × exp × exp | sum y.2 ≥ nodes y.2 ∧ sum y.3 ≥ nodes y.3 •
         sum (Cond y) ≥ nodes (Cond y)

then the applications of the functions sum and nodes can be unfolded to give

    ⊨? ∀ y : pred × exp × exp | sum y.2 ≥ nodes y.2 ∧ sum y.3 ≥ nodes y.3 •
         sum y.2 + sum y.3 ≥ nodes y.2 + nodes y.3

which is an instance of the following law of arithmetic.

    ⊨? ∀ a, b, c, d : ℕ | a ≥ b ∧ c ≥ d • a + c ≥ b + d

3.7 A Second Example of Mutual Recursion

An example that occurs repeatedly in the mutual recursion literature is that of even and odd numbers. Here it is as mutually recursive free types in Z notation.

    even ::= Zero | Succodd⟨⟨odd⟩⟩
    &
    odd ::= Succeven⟨⟨even⟩⟩

The following functions map even and odd numbers onto Z’s existing natural numbers.


    valeven : even → ℕ
    valodd : odd → ℕ

    valeven Zero = 0
    ∀ o : odd • valeven (Succodd o) = valodd o + 1
    ∀ e : even • valodd (Succeven e) = valeven e + 1

As before, it is necessary to prove that these definitions are consistent. Then, the conjecture that every even number is a multiple of 2 and every odd number is the successor of a multiple of 2 can be proven.

    ⊨? (∀ e : even • ∃ n : ℕ • valeven e = 2 ∗ n) ∧
       (∀ o : odd • ∃ n : ℕ • valodd o = 2 ∗ n + 1)

The proof begins by appealing to the induction constraint.

    ∀ w1 : ℙ even; w2 : ℙ odd |
        Zero ∈ w1 ∧
        (∀ y : w2 • Succodd y ∈ w1) ∧
        (∀ y : w1 • Succeven y ∈ w2)
      • w1 = even ∧ w2 = odd

Instantiate w1 with the set {e : even | ∃ n : ℕ • valeven e = 2 ∗ n} and w2 with the set {o : odd | ∃ n : ℕ • valodd o = 2 ∗ n + 1}. (Unlike the last example, the full generality of the induction constraint is needed.) This generates four sub-goals: three arising from the conjuncts of the induction hypothesis, and one arising directly from the original goal. That last one is as follows,

    ⊨? (∀ e : {e : even | ∃ n : ℕ • valeven e = 2 ∗ n} • ∃ n : ℕ • valeven e = 2 ∗ n) ∧
       (∀ o : {o : odd | ∃ n : ℕ • valodd o = 2 ∗ n + 1} • ∃ n : ℕ • valodd o = 2 ∗ n + 1)

which is proven by simplification. The Zero case is as follows,

    ⊨? ∃ n : ℕ • valeven Zero = 2 ∗ n

which is proven by unfolding valeven Zero to 0, and simplifying. The Succodd case is as follows,

    ⊨? ∀ y : odd; n : ℕ | valodd y = 2 ∗ n + 1 • ∃ n : ℕ • valeven (Succodd y) = 2 ∗ n

which is proven by unfolding valeven (Succodd y) to valodd y + 1 and instantiating the inner n to n + 1, giving

    ⊨? ∀ n : ℕ • 2 ∗ n + 1 + 1 = 2 ∗ (n + 1)


which is a law of arithmetic. The Succeven case is as follows,

    ⊨? ∀ y : even; n : ℕ | valeven y = 2 ∗ n • ∃ n : ℕ • valodd (Succeven y) = 2 ∗ n + 1

which is proven by unfolding valodd (Succeven y) to valeven y + 1 and instantiating the inner n to n, giving

    ⊨? ∀ n : ℕ • 2 ∗ n + 1 = 2 ∗ n + 1

which is a law of arithmetic.
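The conjecture just proved can also be spot-checked on small values. The following Python sketch is a test on finitely many values and in no way replaces the inductive proof; the class names mirror the constructors of even and odd, while `first_values` is a hypothetical helper introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zero:
    """The element of even."""


@dataclass(frozen=True)
class Succodd:
    """Injection from odd into even."""
    arg: object


@dataclass(frozen=True)
class Succeven:
    """Injection from even into odd."""
    arg: object


def valeven(e) -> int:
    """valeven Zero = 0; valeven (Succodd o) = valodd o + 1."""
    if isinstance(e, Zero):
        return 0
    if isinstance(e, Succodd):
        return valodd(e.arg) + 1
    raise TypeError("not a value of even")


def valodd(o) -> int:
    """valodd (Succeven e) = valeven e + 1."""
    if isinstance(o, Succeven):
        return valeven(o.arg) + 1
    raise TypeError("not a value of odd")


def first_values(count: int):
    """Alternately build odd and even values, starting from Zero."""
    evens, odds = [Zero()], []
    for _ in range(count):
        odds.append(Succeven(evens[-1]))
        evens.append(Succodd(odds[-1]))
    return evens, odds


evens, odds = first_values(10)
evens_ok = all(valeven(e) % 2 == 0 for e in evens)  # valeven e = 2 * n for some n
odds_ok = all(valodd(o) % 2 == 1 for o in odds)     # valodd o = 2 * n + 1 for some n
```

The mutual recursion between `valeven` and `valodd` follows the mutual recursion of the two free types, which is why both halves of the conjecture have to be established together.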

4 Conclusions

This paper has presented a formalisation of the semantics of Z free types. It has extended the notation to permit mutually recursive free types. This extension has been implemented in the CADiZ tool, and has been adopted into the draft Z standard.

Acknowledgements: Thanks to Rob Arthan for many useful discussions. Alan Frisch, Jeremy Jacob, Steve King and Susan Stepney provided helpful comments on earlier drafts. Funding for this work was provided by EPSRC grants GR/L31104 and GR/M20723.

References

1. R. D. Arthan. On free type definitions in Z. In J. E. Nicholls, editor, Z User Workshop, York, December 1991. Springer.
2. R. D. Arthan. Recursive definitions in Z. In J. P. Bowen, A. Fett, and M. G. Hinchey, editors, ZUM’98: The Z Formal Specification Notation, LNCS 1493, Berlin, September 1998. Springer.
3. Robert S. Boyer and J. Strother Moore. A computational logic. Academic Press, 1979.
4. H. B. Enderton. Elements of Set Theory. Academic Press, 1977.
5. D. Kapur and M. Subramaniam. Automating induction over mutually recursive functions. In M. Wirsing and M. Nivat, editors, Proceedings of the 5th International Conference on Algebraic Methodology and Software Technology (AMAST’96), LNCS 1101, Munich, 1996. Springer.
6. P. Liu and R.-J. Chang. A new structural induction scheme for proving properties of mutually recursive concepts. In Proceedings of the 6th National Conference on Artificial Intelligence, volume 1, pages 144–148. AAAI, 1987.
7. Thomas F. Melham. Automating recursive type definitions in higher order logic. In G. Birtwistle and P. A. Subrahmanyam, editors, Current Trends in Hardware Verification and Automated Theorem Proving, pages 341–386. Springer, 1989.
8. A. Smith. On recursive free types in Z. In J. E. Nicholls, editor, Z User Workshop, York, December 1991. Springer.
9. J. M. Spivey. The Z Notation: A Reference Manual. Prentice Hall, first edition, 1989.
10. J. M. Spivey. The Z Notation: A Reference Manual. Prentice Hall, second edition, 1992.
11. I. Toyn. CADiZ web pages. http://www.cs.york.ac.uk/~ian/cadiz/.
12. I. Toyn. Innovations in the notation of standard Z. In J. P. Bowen, A. Fett, and M. G. Hinchey, editors, ZUM’98: The Z Formal Specification Notation, LNCS 1493, Berlin, September 1998. Springer.
13. I. Toyn. A tactic language for reasoning about Z specifications. In Third Northern Formal Methods Workshop, Ilkley, September 1998.
14. I. Toyn, editor. Z Notation. ISO, 1999. Final Committee Draft, available at http://www.cs.york.ac.uk/~ian/zstan/.
15. S. H. Valentine. Inconsistency and undefinedness in Z - a practical guide. In J. P. Bowen, A. Fett, and M. G. Hinchey, editors, ZUM’98: The Z Formal Specification Notation, LNCS 1493, pages 233–249, Berlin, September 1998. Springer.

Reasoning Inductively about Z Specifications via Unification

David A. Duffy and Ian Toyn

Dept. of Computer Science, University of York, Heslington, York, YO10 5DD, UK.
{dad,ian}@cs.york.ac.uk

Abstract. Selecting appropriate induction cases is one of the major problems in proof by induction. Heuristic strategies often use the recursive pattern of definitions and lemmas in making these selections. In this paper, we describe a general framework, based upon unification, that encourages and supports the use of such heuristic strategies within a Z-based proof system. The framework is general in that it is not bound to any particular selection strategies and does not rely on conjectures being in a “normal form” such as equations. We illustrate its generality with proofs using different strategies, including a simultaneous proof of two theorems concerning mutually-defined relations; these theorems are expressed in a non-equational form, involving both universal and existential quantifiers.

1 Motivation

Proof by induction is a necessary tool for reasoning about the properties of specifications. In the Z specification language [24,25], the primary basis for such proofs is free type paragraphs, each of which has an associated principle of induction. Unfortunately, performing such proofs simply by invoking this principle can be both tedious and difficult. One of the main difficulties is making the selection of the appropriate induction cases, including the associated induction hypotheses. In the literature, several heuristic strategies, such as “recursion analysis”, have been proposed to resolve this problem [2,6,28,12,15,8,20,5,7,17]. The purpose of the present paper is to present, in the context of the CADiZ proof system [26] for Z, a general framework for inductive proof that encourages and supports the use of such strategies, without being bound to any in particular. This framework is based upon the use of unification.

Unification is the process of replacing the variables of several expressions to produce a common instance of these expressions. A special case of unification is matching, in which one expression is shown to be an instance of another. The utility of algorithms for unification and matching is well-accepted in the automated-reasoning community, for the efficiency of reasoning they support and the range of applications that have been found for them [3]. Even in the

This work was funded by the EPSRC under grant no. GR/L31104.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 75–94, 2000. © Springer-Verlag Berlin Heidelberg 2000


context of interactive natural-deduction-style proof systems, where efficiency is not the major concern, unification and matching algorithms perform tedious and repetitious proof steps that users would prefer not to perform by hand.

Although Z is a typed higher-order language, the underlying mechanism of unification that we use is little different from that used for untyped first-order expressions [10]. The most significant aspect of our approach to building unification into CADiZ’s sequent calculus is that we do not rely upon “normal forms”, Skolemisation or “cycle” checks [10] to ensure soundness. Our approach is based upon equivalence preserving inference rules; the unifier simply derives particular values to pass to these rules. This approach is in keeping with the ethos underlying the development of CADiZ; most of CADiZ’s inference rules preserve equivalence. Moreover, it allows us to apply unification between any two quantified predicates; this turns out to be important for the application to induction.

To illustrate our approach to unification, consider the goal

    G == ⊨? ∃ x : ℕ₁ • x + x = x ∗ x

A proof of G might proceed as follows: instantiating x with y + 1 produces

    ⊨? ... ∨ ∃ y : ℕ • y + 1 ∈ ℕ₁ ∧ (y + 1) + (y + 1) = (y + 1) ∗ (y + 1)

then instantiating y with z + 1 produces

    ⊨? ... ∨ ∃ z : ℕ • z + 1 ∈ ℕ ∧ z + 1 + 1 ∈ ℕ₁ ∧
         ((z + 1) + 1) + ((z + 1) + 1) = ((z + 1) + 1) ∗ ((z + 1) + 1)

Finally, instantiating z with 0 and simplifying reduces the consequent to true. Each of these instantiation steps might be performed via our unifier; for instance, if + has the axiom

    ∀ w, y : ℕ • w + (y + 1) = (w + y) + 1

then the first instantiation above might be generated by unifying x + x in the original predicate with w + (y + 1) in the axiom. Moreover, if P denotes the original predicate, the result of this unification step will be

    ⊨? P ∨ ∃ y : ℕ • y + 1 ∈ ℕ₁ ∧ (y + 1) + (y + 1) = (y + 1) ∗ (y + 1)

The predicate of this derived goal is equivalent to P, and thus, even if P had been embedded within another predicate, the unification step would have remained sound.

The foundation for our approach to inductive reasoning is the work on “implicit induction” [19,14,12,21]. These approaches utilise unification to generate both the induction conclusions and induction hypotheses [10] for conjectures. The benefit of following such an approach is that, while it provides substantial assistance in the development of proofs, performing automatically many of the tedious steps and suggesting strategies of proof, it does not impose the same constraints as, say, recursion analysis, on the way proofs are constructed; in particular, induction hypotheses may be generated “on the fly” [20] rather than through prior analysis.
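The unification step in the example above, in which x + x is unified with w + (y + 1), can be reproduced with a textbook first-order unifier. The Python sketch below is such a textbook algorithm, including an occurs check, and is not a description of the CADiZ unifier, which by the account above does not rely on cycle checks for soundness. The term encoding is an assumption of this sketch: a string in argument position is a variable, an integer is a constant, and a tuple (functor, arg, ...) is an application.

```python
def walk(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t


def occurs(v, t, subst):
    """True when variable v occurs in term t under the substitution."""
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])


def unify(a, b, subst=None):
    """Return a substitution unifying a and b, or None when there is none."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if isinstance(a, str):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if isinstance(b, str):
        return unify(b, a, subst)
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


def resolve(t, subst):
    """Apply the substitution exhaustively to term t."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(a, subst) for a in t[1:])
    return t


lhs = ("+", "x", "x")               # x + x
rhs = ("+", "w", ("+", "y", 1))     # w + (y + 1)
s = unify(lhs, rhs)
x_instance = resolve("x", s)        # the term y + 1
```

Under this encoding the unifier binds x to y + 1, which is the instantiation used in the first step of the proof of G.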

Reasoning Inductively about Z Specifications via Unification


To illustrate the idea behind our approach to induction, consider the following specification.

− : N × N → N
∀ u : N • u − 0 = u
∀ v : N • 0 − v = 0
∀ u, v : N • succ u − succ v = u − v

Suppose that we wished to prove the goal

|=? ∀ x , y : N • x − y ≤ x

We may do this via the standard principle of induction for N (see Section 4.1), inducing first on x and then on y. More directly, we may look at the definition of ‘−’, and see that we wish to verify the conjecture for three cases of its variables (x , y), namely, (u, 0), (0, v ) and (succ u, succ v ). The third of these cases is

|=? ∀ u, v : N • succ u − succ v ≤ succ u

Via the definition of ‘−’, this case may be simplified to

|=? ∀ u, v : N • u − v ≤ succ u

At this point, we see that we would like to assume the “induction hypothesis” u − v ≤ u, and thus prove

|=? ∀ u, v : N • u − v ≤ u ⇒ u − v ≤ succ u

To prove this subgoal, we invoke the lemma

|=? ∀ u, w : N • w ≤ u ⇒ w ≤ succ u

of which the subgoal is an instance.

Having proven these three cases, we must show that this constitutes a proof of the original goal. This involves showing that we have covered all possible cases of N × N, and that (succ u, succ v ) > (u, v ) holds, where > is some well-founded ordering. These problems may be resolved in several ways, which we will discuss in Sections 2 and 4.3. The point here is that the cases reasoning allows the user to utilise directly the definitions and lemmas at hand to generate the induction steps of a proof. It is more direct than, say, the application of a free-type induction principle, and thus more readily supports heuristic search strategies (see Section 4.4).

The purpose of this paper is to show how the cases proof, including the application of induction hypotheses, may be performed through unification. Crucial in this process is that no unsound steps may be inadvertently applied; we do not wish to rely upon the user to check independently all the steps. At the end of the cases reasoning either all steps should have been verified in CADiZ, or there should be side conditions left which it is up to the user to accept or verify. We


D.A. Duffy and I. Toyn

do not just arbitrarily cut in induction hypotheses, as we did informally in the above proof, and rely on the user to check that these are justified; the process we describe enforces this justification through side conditions of the proof. It will be seen how the higher-order nature of Z assists us in this process.

On top of this framework, users may themselves impose various proof strategies via tactics. For instance, we show how several aspects of “rewrite induction” [21] may be simulated in our framework, in particular, how we simulate “simultaneous induction”, which allows us to prove two conjectures simultaneously using instances of each other as induction hypotheses.

Our general approach has strong similarities with the meta-level cases reasoning used by Walther [27] to justify induction with respect to algorithms, and our application of unification has strong similarities to the application of higher-order-pattern unification in the “middle-out” reasoning of Kraan et al. [17]. However, we should further emphasise that the use of Z, particularly due to its higher-order nature, enables us, in principle, to embed our basic approach entirely within its logic. In particular, it allows uninstantiated induction hypotheses to be associated with each induction case; perhaps surprisingly, it supports our use of first-order unification; and, furthermore, it provides us with a single framework for proving not only our conjectures, but also the specialised induction principles we use to prove those conjectures. We also show how Z’s generics support the use of syntactic orders for induction. Examples of the application of our approach may be found on the web at ftp://ftp.cs.york.ac.uk/pub/aig/examples.
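The truncated-subtraction example used earlier in this section can be explored with a small executable sketch. It is written in Python purely for illustration; the function names and the bound used in the check are assumptions, and a bounded test is of course not a proof. The three branches correspond to the three defining equations of ‘−’, that is, to the cases (u, 0), (0, v ) and (succ u, succ v ).

```python
def monus(x: int, y: int) -> int:
    """Truncated subtraction on naturals, following the three defining equations."""
    if y == 0:                      # u - 0 = u
        return x
    if x == 0:                      # 0 - v = 0
        return 0
    return monus(x - 1, y - 1)      # succ u - succ v = u - v


def conjecture_holds_up_to(bound: int) -> bool:
    """Check x - y <= x for every pair below the (assumed) bound."""
    return all(monus(x, y) <= x for x in range(bound) for y in range(bound))
```

The recursive call is made on the pair (u, v ), which is smaller than (succ u, succ v ) under, for example, the componentwise ordering; this is the informal counterpart of the well-foundedness condition mentioned above.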

2 Z and the CADiZ Proof System

Z combines predicate calculus with ZF set theory, on which is imposed a decidable type system. Its major innovation is a schema calculus, which is used both for specifying structures in the system and for structuring the specification itself, but we use little of the schema calculus in this paper. In Z, phrases of the predicate calculus are referred to as predicates (logicians usually refer to these as formulae), and phrases of set theory are referred to as expressions (logicians usually refer to these as terms). A Z sequent [g1 , . . . , gk ] d1 ; . . . ; dl | a1 , . . . , am |=? c1 , . . . , cn expresses the conjecture that the conjunction of the antecedent predicates a1 ∧ . . . ∧ am implies the disjunction of the consequent predicates c1 ∨ . . . ∨ cn in the scope of the declarations d1 ; . . . ; dl . The names g1 , . . . , gk are generic parameters, which may be referred to from elsewhere in the sequent; the conjecture is a theorem only if it is valid for all set-valued instantiations of these generic parameters. There need not be any generic parameters, declarations, antecedents or consequents, i.e. any of k , l , m, n can be 0; note, in particular, that m being 0 is equivalent to the single antecedent true, while n being 0 is equivalent to the single consequent false. Sequents are the syntactic phrases that represent not only conjectures but also goals, sub-goals, lemmas, laws, axioms, theorems, and so on.
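The reading of a sequent given above, including the two degenerate cases, can be made concrete with a short sketch. The Python below is an illustration only: the names `Env`, `Pred` and `sequent_holds` are hypothetical, predicates are modelled as Boolean functions over a single environment, and generic parameters and declarations are ignored.

```python
from typing import Callable, Dict, Sequence

Env = Dict[str, int]              # hypothetical: one assignment to the declared names
Pred = Callable[[Env], bool]      # a predicate evaluated under such an assignment


def sequent_holds(antecedents: Sequence[Pred],
                  consequents: Sequence[Pred],
                  env: Env) -> bool:
    """Conjunction of antecedents implies disjunction of consequents, under env."""
    all_antecedents = all(a(env) for a in antecedents)   # empty list gives True
    some_consequent = any(c(env) for c in consequents)   # empty list gives False
    return (not all_antecedents) or some_consequent
```

With no antecedents the conjunction is true and with no consequents the disjunction is false, so `sequent_holds([], [], {})` evaluates to `False`, matching the remark about m and n being 0.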


In CADiZ, proof rules apply to a goal and produce zero or more sub-goals. As well as elementary proof rules, CADiZ also offers some larger proof steps, such as a decision procedure for Presburger arithmetic, and allows proof steps to be combined into larger steps using a tactic language.

3 Unification in CADiZ

Unification appears in CADiZ as an auxiliary to proof rules for instantiating quantified variables. In this paper, we use only a subset of Z’s quantified notations, namely the following universal and existential quantified predicates

∀ decl | pred • pred
∃ decl | pred • pred

where each pred can be any Z predicate (the bar part ‘| pred ’ may be omitted if its pred is the predicate true), and decl can be any declaration written using the following syntax

decl = basicdecl , {‘;’ , basicdecl } ;
basicdecl = declname , {‘,’ , declname} , ‘:’ , expr ;

where declname is a name being declared, and expr is any Z expression. For a single basicdecl , the relevant proof rules for quantified predicates are

(∀ i : e1 | p1 • p2 ) ∧ (∀ i : e1 | i = e2 ∧ p1 • p2 )
-------------------------------------------------------
∀ i : e1 | p1 • p2

(∃ i : e1 | p1 • p2 ) ∨ (∃ i : e1 | i = e2 ∧ p1 • p2 )
-------------------------------------------------------
∃ i : e1 | p1 • p2

where i does not occur free in e2 . In practice, these rules are applied backwards, constraining the quantified variable to a particular expression. The constrained form is logically combined with the original, so that the rule can be used as an in situ replacement wherever the quantified predicate appears, not solely whole antecedents or consequents.

The expression e2 can be determined in various ways: it can be keyed in by the user into a dialogue box; it can be selected from those already displayed; or it can be determined by unification of an expression within the quantified predicate with another expression. When e2 is determined by unification, the other expression can itself be within another quantified predicate (otherwise the unification is merely one-way pattern matching). That other quantified predicate will also be instantiated by the same proof rule if the unifier provides expressions for any of its quantified variables.

In the following goal, the consequent is the instance of the antecedent in which x is succ y.

y : N | ∀ x : N • x + 0 = x |=? succ y + 0 = succ y


That instantiation can be determined by unification of the expressions x + 0 and succ y + 0, or alternatively by unification of the smaller expressions x and succ y. In performing these unifications, the quantified predicate determines which names are viewed as variables for the unification, x in this example. The other names (succ, y and +) refer to global variables, and are viewed as constants that should match exactly, as are the number literal 0 and the application notation.

The above example illustrates matching, that is, “one-sided” unification. The existence example of Section 1 illustrates “two-sided” unification. As another example, consider the following goal

| ∀ u : N • succ u − u = succ 0 |=? ∀ x , y : N • x > y ⇒ (succ x − succ y) + y = x

Unification of expression succ u − u with expression succ x − succ y produces the unifier ⟨| x == succ y, u == succ y |⟩, which binds a variable from each of the two quantified predicates; note further that the binding for u affects that for x . When those quantified predicates are instantiated with this unifier, the goal

| . . . ∧ ∀ y : N • ∀ u : N | u = succ y • succ u − u = succ 0
|=? . . . ∧ ∀ x , y : N | x = succ y • x > y ⇒ (succ x − succ y) + y = x

is produced, where the dots denote the original predicates. This may be simplified to

| . . . ∧ ∀ y : N | succ y ∈ N • succ(succ y) − succ y = succ 0
|=? . . . ∧ (∀ y : N | succ y ∈ N • succ y > y ⇒ (succ(succ y) − succ y) + y = succ y)

In the second conjunct of the consequent, a quantification of y has been introduced, as otherwise the reference to y would not have been declared. This concern for the well-formedness of Z predicates has resulted in unification being implemented as an auxiliary of the proof rule for quantified predicates, rather than it being accessible directly to the user.
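The two unification examples above can be reproduced with a textbook first-order unifier. The Python sketch below is not CADiZ’s implementation and makes several assumptions: terms are encoded as nested tuples `(functor, arg, ...)`, names are plain strings, the set of names treated as variables is passed in explicitly, and an occurs check is included as is usual in textbook presentations. All identifiers are hypothetical.

```python
def walk(t, subst):
    """Follow variable bindings until an unbound name or a compound term is reached."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t


def occurs(v, t, subst):
    """True if variable v occurs in term t under the current bindings."""
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])


def unify(a, b, variables, subst=None):
    """Return a substitution unifying a and b, or None if there is none."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if isinstance(a, str) and a in variables:
        return None if occurs(a, b, subst) else {**subst, a: b}
    if isinstance(b, str) and b in variables:
        return None if occurs(b, a, subst) else {**subst, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, variables, subst)
            if subst is None:
                return None
        return subst
    return None                      # distinct constants or functors do not match


def resolve(t, subst):
    """Apply the substitution exhaustively to a term."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(a, subst) for a in t[1:])
    return t


# One-sided case: x + 0 against succ y + 0, with only x viewed as a variable.
match = unify(("+", "x", "0"), ("+", ("succ", "y"), "0"), {"x"})

# Two-sided case: succ u - u against succ x - succ y, with u, x and y quantified.
two_sided = unify(("-", ("succ", "u"), "u"),
                  ("-", ("succ", "x"), ("succ", "y")),
                  {"u", "x", "y"})
```

Resolving `x` and `u` against `two_sided` gives `succ y` for both, which agrees with the unifier quoted in the text; names outside the variable set (succ, +, 0) are matched as constants.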
In general, the soundness of the unification process follows from the following two equivalences

(∀ x : S • P ) ⇔ (∀ x : S • P ) ∧ (∀ y : S′ • ∀ x : S | x = t • P )
(∃ x : S • P ) ⇔ (∃ x : S • P ) ∨ (∃ y : S′ • ∃ x : S | x = t • P )

where t is a term involving y, and where y must not occur free in P . The unification process simply applies these equivalences in a restricted manner, choosing particular t and S′. We have found that the preferable choice for S′ is not the type of y, but the set in the declaration associated with y before the unification; this is illustrated in the examples above, where N is taken to be the value of S′, rather than the numeric type A of Standard Z [25].

The preceding examples illustrate unification of variables, constants and applications; those are the usual notations used to illustrate unification. In unifying


Z expressions, other notations have to be considered, for example, conditional expressions involving predicates. Conditional expressions, logical predicates and relational predicates are treated much like applications: corresponding components of two of them should all unify. In the goal

| ∀ x : N • (if x ≥ 1 then x − 1 else x ) ≤ x |=? (if succ 1 ≥ 1 then succ 1 − 1 else succ 1) ≤ succ 1

the two conditional expressions unify, allowing x to be instantiated with succ 1.

Some expressions and some predicates introduce local declarations. References to one of these declarations in one expression should match only with references to the corresponding declaration in the other expression. So the unification procedure has to be careful not to allow the locally declared variables to be unified with any expressions.

For example, given the two predicates ∀ x : N • #{y : N | y < x } = x 3 0 ⇒ s x > 0 (for the usual definition of > over Z) would clearly not be a proof of ∀ x : Z • x > 0. Further constraints are required [21,11]. For simplicity, we will consider just one important special case where these extra constraints are automatically satisfied.

Theorem. Suppose that we have a specification of the form

[T ]
c1 , . . . , cm : T
d1 , . . . , dn : T → T
Axioms
∀ Set : P T | c1 ∈ Set ∧ · · · ∧ cm ∈ Set ∧ (∀ x : Set • d1 (x ) ∈ Set) ∧ · · · ∧ (∀ x : Set • dn (x ) ∈ Set) • Set = T

(Note that arbitrary additional axioms are represented by “Axioms”; we do not assume that T is a free type.) Now suppose that we have a stable well-founded ordering >, and that we alter the induction axiom in the above specification by replacing any of the predicates

∀ x : Set • di (x ) ∈ Set


by a predicate of the form

∀ x : Set • e1 (x ) ∈ Set ∧ · · · ∧ ek (x ) ∈ Set ⇒ di (x ) ∈ Set

where ej ∈ {d1 , . . . , dn } and di (x ) > ej (x ). Then ∀ x : T • P is a consequence of the first specification if it is a consequence of the second.

Proof: It is a theorem of the first specification that

|=? ∀ x : T • x ∈ {c1 , . . . , cm } ∨ ∃ y : T • x ∈ {d1 (y), . . . , dn (y)}

Therefore, if ∀ x : T • P does not follow from the first specification, then we may assume that P (d ) is not a consequence, where d is an expression consisting of only c1 , . . . , cm , d1 , . . . , dn , and we may further assume that d is minimal with respect to >. But then d is di (e), for some e, and we have that

|=? P (e1 (e)) ∧ · · · ∧ P (ek (e)) ⇒ P (di (e))

(for some ej ) is a consequence of the second specification, and therefore the first. This means that one of the P (ej (e)) must not be a consequence of the first specification, which contradicts the minimality of d .

This theorem simply asserts the soundness of a variation of constructor induction in which we might prove not only P (x ) ⇒ P (c(x )) for a constructor c, but also, say, P (d (x )) ⇒ P (c(x )), where d is another constructor. For example, the Z proof in the above example satisfies the conditions of this theorem, and is thus sound.

4.4 Selection Strategies

We have shown how induction cases and hypotheses may be generated for a conjecture via unification. However, there will typically be many possible sets of unifiers that will produce a set of induction cases; we would like to select those that are most likely to result in a proof, or at least in the application of induction hypotheses. The framework described supports many possible strategies for this selection. For example, we may apply “recursion analysis” [2,6,8], which attempts to derive simultaneously the induction cases and corresponding hypotheses from the recursive pattern of function definitions (or lemmas); this strategy formalises the approach we have used in the induction examples above (see also [4]). However, we may also delay the selection of induction hypotheses, as in the following example.

Example 4. Suppose that we have the following specification of addition

+ : N × N → N
∀ x : N • x + 0 = x
∀ x : N • x + s 0 = s x
∀ x , y : N • x + s(s y) = s(s y + x )


and that we wish to prove its commutativity,

C == |=? ∀ u, v : N • u + v = v + u

In outline, we may proceed as follows. Unifying the LHSs of the definition of + with the LHS of the commutativity conjecture, and simplifying, we derive three subgoals,

Γ ==
|=? ∀ x : N • x = 0 + x
|=? ∀ x : N • s x = s 0 + x
|=? ∀ x , y : N • s(s y + x ) = s(s y) + x

We now unify the LHSs of the definition of + with the RHSs of these equations, and simplify again. This produces eight identities and

|=? ∀ x , y : N • s(s(s x + s y)) = s(s(s y + s x ))

Now we may apply the induction hypothesis s x + s y = s y + s x to produce an identity again, and the proof is completed. Thus, indirectly, we have derived the induction step

|=? ∀ x , y : N • s x + s y = s y + s x ⇒ s(s x ) + s(s y) = s(s y) + s(s x )

This example may be viewed as an illustration of “hierarchical” or “mutual” induction [21,7], in that instances of the conjecture are applied as induction hypotheses not to (or not just to) an immediately derived induction case, but to some subsequently derived case. The “mutual” aspect of the proof arises from the fact that the third of the Γ subgoals is a lemma for the proof of C and, at the same time, its own proof involves C being applied as an induction hypothesis.

More generally, mutual (or simultaneous) induction [21,7] may be simulated to a certain extent in our framework in the following way. Suppose that we wish to prove simultaneously the conjectures ∀ x : T • P (x ) and ∀ y : T • Q(y); then we conjoin them and distribute one of the quantifiers over the conjunction to derive

∀ x : T ; y : T • P (x ) ∧ Q(y)

We may now use instances of P (x ) in the proof of Q(y), and vice-versa. The major application of such simultaneous proofs is in proving properties of mutually-recursively defined relations. This is illustrated in the following simple example.

Example 5. Suppose even and odd are defined by

relation (even )
relation (odd )

even , odd : P N

even 0
¬ odd 0
∀ x : N • even(s x ) ⇔ odd x
∀ x : N • odd (s x ) ⇔ even x

and that we wish to prove

|=? ∀ u : N • even u ⇒ ∃ y : N • u = 2 ∗ y

and

|=? ∀ v : N • odd v ⇒ ∃ y : N • v = 2 ∗ y + 1

Then, instead, we prove

|=? ∀ u, v : N • (even u ⇒ ∃ y : N • u = 2 ∗ y) ∧ (odd v ⇒ ∃ y : N • v = 2 ∗ y + 1)

For this example, it turns out to be much simpler to prove

|=? ∀ z : N • (even z ⇒ ∃ y : N • z = 2 ∗ y) ∧ (odd z ⇒ ∃ y : N • z = 2 ∗ y + 1)

Unfortunately, this requires us to know beforehand that u and v should be collapsed (by suitable renaming) into a single variable; in general, where there may be multiple variables, it will not be so easy to determine which variables should be so collapsed. Although keeping the variables distinct makes the proofs a lot messier in our framework, it has the advantage that we do not need to determine beforehand which variables to collapse.

In outline, to prove the conjecture of the example, we would proceed as follows. Let P (u) denote

even u ⇒ ∃ y : N • u = 2 ∗ y

and Q(v ) denote

odd v ⇒ ∃ y : N • v = 2 ∗ y + 1

We first generate

|=? ∀ u, v : N • H (u, v ) ⇒ P (u) ∧ Q(v )

where H (u, v ) is

∀ m, n : N • (m, n) smaller (u, v ) ⇒ P (m) ∧ Q(n)

We then generate via unification the four cases

|=? ∀ v : N • H (0, v ) ⇒ P (0) ∧ Q(v )
|=? ∀ u, v : N • H (s u, v ) ⇒ P (s u) ∧ Q(v )
|=? ∀ u : N • H (u, 0) ⇒ P (u) ∧ Q(0)
|=? ∀ u, v : N • H (u, s v ) ⇒ P (u) ∧ Q(s v )


We then attempt to prove

|=? ∀ v : N • H (0, v ) ⇒ P (0)
|=? ∀ u, v : N • H (s u, v ) ⇒ P (s u)
|=? ∀ u : N • H (u, 0) ⇒ Q(0)
|=? ∀ u, v : N • H (u, s v ) ⇒ Q(s v )
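The mutually recursive definitions of Example 5 and the two conjectures P and Q can be exercised with a bounded check. The Python sketch below is illustrative only: the function names and the bound are assumptions, the existential quantifier is searched over a finite range, and the check says nothing about the side conditions on the ordering.

```python
def even(n: int) -> bool:
    """even 0 holds; even (s x) holds exactly when odd x holds."""
    return True if n == 0 else odd(n - 1)


def odd(n: int) -> bool:
    """odd 0 fails; odd (s x) holds exactly when even x holds."""
    return False if n == 0 else even(n - 1)


def P(u: int) -> bool:
    """even u implies u = 2 * y for some natural y (searched up to u)."""
    return (not even(u)) or any(u == 2 * y for y in range(u + 1))


def Q(v: int) -> bool:
    """odd v implies v = 2 * y + 1 for some natural y (searched up to v)."""
    return (not odd(v)) or any(v == 2 * y + 1 for y in range(v + 1))


def combined_conjecture_up_to(bound: int) -> bool:
    """Bounded check of the conjoined goal with u and v kept distinct."""
    return all(P(u) and Q(v) for u in range(bound) for v in range(bound))
```

Each recursive call of `even` is on `odd` and vice versa, which is why the two properties are naturally proved together.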

(We may select these parts of the above subgoals via CADiZ’s distribution rules. Note that in proving the induction cases for this example, we generate side conditions of the form m, n : N |=? (m, n) smaller (s n, m) and m, n : N |=? (m, n) smaller (n, s m) which are clearly satisfied by the ordering 2) end


Fig. 4. The abstract machine specifications of Example 1.

Note that it is not conservative over the whole context of M 1 because the uses-invariant may strengthen the invariant over the state variables of the used machine M 1. In the following paragraphs, we discuss the impact of these context-related proof obligations in the cases of uses and sees separately. The Impact of the Context-Related Proof Obligations on uses . Establishing Sh2 in this case guarantees that the underlying theory of the constants of the using machine is conservative over the corresponding theory of the used machine. Further, the Modularisation property of first-order theory presentations (Proposition 1) guarantees that the union of the corresponding theories, underlying the sharing mechanism of uses, will preserve consistency. Consequently, no further consistency proofs about the union are necessary. This is demonstrated in the following example. Example 1. Consider the following abstract machines M 1, M 2, M 3 and M 4 provided in Fig. 4. The abstract machine M1 is shared; it is used by M2 and extended by M3. When the closure machine M4 is formed no further proof obligations about the context are needed. (Indeed no constant-context proof obligations are generated by the B-Toolkit.) Because of the Modularisation property (Propositions 1 and 2), the proof obligation establishing the conservativeness of the context of M 2 over the context of M 1 guarantees the conservativeness of the (constant) context of M 4 over the context of M 3. Consequently the internal consistency of M 3 guarantees the internal consistency of M 4. Of course in this particular example the proof obligation related to the conservativeness of M 2 uses M 1 cannot be discharged. So the potential of a consistency problem related to the incremental specification in the design is detected when it first appears. It is worth noting that, unlike sees, the conservativeness requirement between the constant contexts that underlies uses is not seen as fundamental. It facilitates


T. Dimitrakos et al.

proof modularity, presentation clarity and it is consistent with the architectural concept of “one writer, many readers” but it can be omitted. For example Bert, Potet and Rouzaud propose in [4] an alternative version of sees which is very similar to an includes that does not instantiate parameters and does not promote any operation. Although they do not consider constants in their development, one can imagine a variant of uses where the constant context of the using machine is not conservative over the constant context of the used machine. In such a case, one would not be able to employ the Modularization property and the consistency of the closure should be re-established (or delayed until the implementation). Since only the closure machine is refined and later implemented, the conservativeness of the constant context for uses can be conceived to be a matter of taste and style of development. Though, as we illustrate in this paper, the above argument does not apply in the cases of sees and imports where the conservativeness between the static contexts of the seeing/importing component and the seen/imported machine is required. The Impact of the Context-Related Proof Obligations on sees. For sees, establishing Sh2 guarantees that the underlying theory about the constants of the seeing machine is conservative over the theory of the seen machine. Therefore the properties of the seeing machine cannot impose any further “emerging” properties on the constants of the seen machine. In other words, there is no information flow from the seeing machine to the seen machine. This is fundamental for the following reasons. 1. The seen machine M 1 may be consulted from machines other than M 2 and any emerging properties from M 2 may have unpredictable side-effects on the operation of those machines by implicitly enriching their properties clause with the potential of creating conflicts or inconsistencies. 2. If M 1 is enriched (viz. 
refined) in a development, such a modification takes place on the only shared copy of M1 and the only properties about s1 , c1 taken into consideration are those specified within M1; any emerging properties cannot be considered. 3. The seen machine M1 will be implemented separately and in such an implementation only the properties about the sets and constants s1 , c1 are considered. If emerging properties on the constants of M1 were allowed, they would not be considered by the implementation, thereby causing incompatibilities in parallel development.

Given that Sh2 holds, Sh3 (non-empty state space) is equivalent to the corresponding proof obligation of a machine M3′ whose context is the result of the enrichment of M3 with the constants and properties of M1. Notice that the parameters p1 and the state variables v1 of the seen machine do not appear in I3. This “hiding principle”, together with the context-related proof obligations, ensures the conservativeness of I3.⁹ Example 2 demonstrates the importance

⁹ Because of the Modularisation property (Propositions 1 and 2), the implication context(M3) ⇒ (I1 ⇒ ∃(v3 ).I3) reduces to context(M3) ⇒ ∃(v3 ).I3, by the

Compositional Structuring in B

machine M1
constants f
properties f : NAT → NAT ∧ ∀x.(x:NAT ⇒ f (x) > 2)
end

machine M2
sees M1
constants g
properties g:NAT → NAT ∧ ∀x.(x:NAT ⇒ (g(x) < 5 ∧ f (x) < g(x)))
end

machine M3
sees M1
constants h
properties h:NAT → NAT ∧ f (1) ≠ h(1) ∧ ∀x.(x:NAT ⇒ h(x) = 3)
end

Fig. 5. The abstract machines specifications of Example 2.

of considering these context-related proof obligations associated with the sees primitive.

Example 2. Consider the following example where one machine M1 is seen by two other machines M2 and M3 as presented in Fig. 5. Although each of M2 and M3 extends M1 in a consistent way, they induce contradictory emerging properties on M1. On the one hand, M2 alters the context of M1 by forcing f to accept only one possible model interpretation, namely the constant function f (x) = 3. On the other hand, M3 alters f by accepting only those model interpretations where f (1) > 3. In fact, both M2 and M3 are ill defined, because they implicitly modify the static context of M1 by imposing (in this example conflicting) emerging properties. In order to avoid such side-effects when a machine M sees a machine M1, the context of M must be conservative over the context of M1. That is, all sentences about the set and constant identifiers of M1 that are provable in the context of the seeing machine M should also be provable in the context of the seen machine M1. As we explain in section 2.6, the latter is the case if and only if the sentences

∃g.(g:NAT → NAT ∧ ∀x.(x:NAT ⇒ (g(x) < 5 ∧ f (x) < g(x))))

and, respectively,

∃h.(h:NAT → NAT ∧ ∀x.(x:NAT ⇒ h(x) = 3) ∧ f (1) ≠ h(1))

follow from the context axioms of M1. Clearly, in this example, neither of the above-mentioned proof obligations can be discharged.
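The failure of both obligations in Example 2 can be observed by brute force over a small finite model. The Python sketch below is only an illustration under explicit assumptions: NAT is replaced by the tiny ranges `DOM` and `COD`, functions are enumerated as dictionaries, and all names are hypothetical. It lists the interpretations of f satisfying the properties of M1 for which a suitable g (for M2) or h (for M3) exists.

```python
from itertools import product

DOM = range(3)    # assumed finite stand-in for NAT as a domain
COD = range(7)    # assumed finite stand-in for NAT as a codomain


def functions():
    """Enumerate every total function from DOM to COD as a dict."""
    for values in product(COD, repeat=len(DOM)):
        yield dict(zip(DOM, values))


def prop_m1(f) -> bool:
    return all(f[x] > 2 for x in DOM)


def prop_m2(f, g) -> bool:
    return all(g[x] < 5 and f[x] < g[x] for x in DOM)


def prop_m3(f, h) -> bool:
    return f[1] != h[1] and all(h[x] == 3 for x in DOM)


def extendable(extension):
    """Models f of M1 for which some new constant satisfies the extension."""
    return [f for f in functions()
            if prop_m1(f) and any(extension(f, c) for c in functions())]


def conservative(extension) -> bool:
    """context(M1) implies the existence of the new constant, for every model f."""
    models = [f for f in functions() if prop_m1(f)]
    return len(extendable(extension)) == len(models)
```

On this finite model neither extension is conservative: only the constant function 3 admits a g, every f admitting an h has f (1) > 3, and no f admits both, which matches the contradictory emerging properties described in the example.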

5 Layered Implementation (imports)

definition of context(M3) (Fig. 3) and because context(M1) ⇒ ∃(v1 ).I1, by the assumption that M1 is internally consistent, and v1 does not appear in I3 by the syntactical conditions of sees. In fact, with an analogous argument one can also drop CN1 from the assumption in Sh3-sees.

In the case of imports (in an implementation) the context-related proofs have the form provided in Figure 6. Imp1 establishes, in analogy to includes, the correctness of the instantiation p1 := n1 of the imported machine M1(p1 ). We note that the assumption in Imp1 does not embody PROP1 although imported

implementation M2I(p2 )
refines M2(p2 )
imports M1(n1 )
sets s3
constants c3
properties PROP3(s1 , c1 , s2 , c2 , c′2 , s3 , c3 )
values s3 := E3 , c3 := e3
invariant I3(v2 , v1 , s1 , c1 , s2 , c2 , c′2 , s3 , c3 )
...
end

Imp1: context(M2) ∧ PROP3 ⇒ [p1 := n1 ]CN1
Imp2: PROP1 ⇒ [s2 := E2 , c2 := e2 , s3 := E3 , c3 := e3 ](PROP2 ∧ PROP3)
...

Notes: (1.) c′2 are the (abstract) constants of M2 which are not given concrete values via the implementation and Ei , ei are concrete scalar values. (2.) Both proof context-related obligations Imp1 and Imp2 are proposed by the B-Book.

Fig. 6. The general form of the context-related clauses and proof obligations for the imports primitive.

constant identifiers in c1 may appear in the assumption via PROP3. As we elaborate in the sequel, this is important for establishing the conservativeness of the properties of context(M 3) over the properties of the imported machine M 1. Imp2 proves the uniform interpolant of the implementation instance of the refined properties in the static language of imported machine. On the one hand, the sentence PROP2 ∧ PROP3 axiomatises the static context of the (resulting machine of the) refinement. On the other hand, the instance [s2 :=E2 , c2 :=e2 , s3 :=E3 , c3 :=e3 ]PROP1 ∧ PROP2 ∧ PROP3 which is produced by the evaluation of the refined sets and (concrete) constants and the embodiment of imported properties PROP1, axiomatises the static context of the implementation. Hence, Imp2 establishes that the properties of implementation context extend conservatively the properties of the imported context and, by Modularisation (Proposition 1), that context(M 2I) is conservative over context(M 1(n1 )). But this alone is not enough. In order to ensure that no “emerging” properties are imposed over the (static) context of M 1 by the implementation M 2I one has to establish that the instantiated context(M 1(n1 )) does not affect PROP1. Because n1 may depend on some constants c2 , c3 in the contextual extension signature of M 2I which are then related to c1 via PROP3, there is an indirect channel through which “emerging”properties about c1 can flow from M 2I to M 1. However, the conservativeness of context(M 1(n1 )) over PROP1 follows by taking Imp1 into account. Imp1 and Imp2 together guarantee that if context(M 1(n1 )) ⇒ ϕ(c1 ) then PROP1 ⇒ ϕ(c1 ). We note that, among the contextual conservativeness requirements we discuss in this paper, only Imp2 is considered in the B-Book (page 599). Indeed, the

implementation M2I
sees Bool_TYPE
refines M2
imports M1
properties ∀xx.(xx:NAT ⇒ fun2 (xx) = fun1 (xx))
operations rr ← op2 = rr ← op1
end

machine M2
sees Bool_TYPE
(abstract) constants fun2
properties fun2 :NAT → NAT ∧ ∀xx.(xx:NAT ⇒ fun2 (xx) = xx)
operations rr ← op2 = any v2 where v2 :NAT then rr:=bool(fun2 (v2 ) = v2 ) end
end

machine M1
sees Bool_TYPE
(abstract) constants fun1
properties fun1 :NAT → NAT ∧ ∀xx.(xx:NAT ⇒ fun1 (xx) = xx + 1)
operations rr ← op1 = any v1 where v1 :NAT then rr:=bool(fun1 (v1 ) = v1 ) end
end

Fig. 7. The implementation of Example 3.

conservativeness of this extension is fundamental. Because the imported machine M1 is implemented independently in a layered development, the only (abstract) properties considered in the implementation of M1 are PROP1 (i.e., those specified in the properties clause of M1). If any emerging properties about the static context of M1 were allowed, these properties would not be considered in the implementation of M1. The latter could allow the validation of an implementation M1I of a behaviour that is weaker than that assumed by M2I, in which case the correctness of each implementation layer individually would not guarantee the correctness of the overall development. This is illustrated in Example 3.

Example 3. Consider the implementation presented in Figure 7 where the abstract machine to be refined is M2, which specifies an operation op2 such that op2 always returns TRUE, and the imported machine is M1, which specifies an operation op1 such that op1 always returns FALSE. Both machines M1 and M2 are clearly internally correct but the property given in the implementation is inconsistent with those inherited from the refinement sequence and the imported machine. Consequently, the proof obligation related to the preservation of the invariant in M2I holds trivially:

⋀ HYP ⇒ ∃(v2 ).(v2 :NAT ∧ bool(fun2 (v2 ) = v2 ) = bool(fun1 (v1 ) = v1 ))

where

HYP = {CN_M2I, PROP_Bool_TYPE, PROP_M1, PROP_M2, PROP_M2I, I1, I2, I_M2I, assign(M3I), PRE(op2 )}

This is because the conjunct ⋀{PROP_Bool_TYPE, PROP_M1, PROP_M2, PROP_M2I} in the assumption is self-contradictory (asserting 0 = 1). The incorrectness of the implementation is caught in the validation of the context related

proof obligation: ⋀{PROP_M1, PROP_Bool_TYPE} ⇒ ∃(fun2 ).⋀{PROP_M2, PROP_M2I}, which fails by, for example, reducing the proof to the validity of 0 = 0 + 1. We note that while testing the above example in the B-Toolkit, the incorrect proof obligation of Figure 8 was generated by the tool. The tool-generated proof obligation is incorrect because it trivialises the conservativeness argument by binding the occurrences of fun2 and fun1 in the conclusion with an existential quantifier, thus producing a formula that is strictly weaker than the uniform interpolant. In the correct form of this proof obligation fun1 appears unbound (as the same constant in the conclusion and the assumption). The tool-generated proof obligation reads “there exist fun1 and fun2 such that fun2 has those properties and fun1 is equal to fun2 ”, which is of course valid (because it reduces to the internal consistency of the abstraction M2).

cst(M2I$1) ⇒
  ( ctx(M1) & ctx(Bool_TYPE) ⇒
    (#(fun1 , fun2).(
      !xx.(xx : NAT ⇒ fun1(xx) = fun2(xx)) &
      fun2 : NAT −→ NAT &
      !xx.(xx : NAT ⇒ fun2(xx) = xx) )))

Fig. 8. A tool-generated incorrect contextual proof obligation for the implementation of Figure 7, Example 3.
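The difference between the correct obligation of Example 3 and the tool-generated one can be seen on a small finite model. The Python sketch below is an illustration under assumptions: NAT is replaced by the ranges `DOM` and `COD`, the Bool_TYPE context and the cst/ctx hypotheses are omitted, and every name is hypothetical. In the first function fun1 is a free constant constrained by the properties of M1; in the second it is bound by the existential quantifier together with fun2.

```python
from itertools import product

DOM = range(3)    # assumed finite stand-in for NAT as a domain
COD = range(5)    # assumed codomain, large enough to contain xx + 1


def functions():
    """Enumerate every total function from DOM to COD as a dict."""
    for values in product(COD, repeat=len(DOM)):
        yield dict(zip(DOM, values))


def prop_m1(fun1) -> bool:
    return all(fun1[x] == x + 1 for x in DOM)


def prop_m2(fun2) -> bool:
    return all(fun2[x] == x for x in DOM)


def prop_m2i(fun1, fun2) -> bool:
    return all(fun2[x] == fun1[x] for x in DOM)


def correct_obligation() -> bool:
    """fun1 stays free: every model of PROP_M1 needs a matching fun2."""
    return all(
        any(prop_m2(g) and prop_m2i(f, g) for g in functions())
        for f in functions() if prop_m1(f)
    )


def tool_conclusion() -> bool:
    """fun1 is bound together with fun2, so PROP_M1 no longer constrains it."""
    return any(
        prop_m2(g) and prop_m2i(f, g)
        for f in functions() for g in functions()
    )
```

Under these assumptions `correct_obligation()` is false, since the only model of the M1 properties is the successor function, while `tool_conclusion()` is true, witnessed by taking both functions to be the identity.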

6 Conclusion

In this paper we have discussed how the conservativeness between the static contexts of components in B can be established by means of proof obligations which have a common (meta-)form. From a deduction perspective, this common (meta-)form consists in establishing that the uniform interpolant of the contextual extension axioms is implied by the base context. Furthermore, in order to produce this uniform interpolant, one can simply abstract away the identifiers of the extension signature from the extension axioms by replacing all their occurrences with new existentially quantified variables. Hence, these proof obligations take the following general form, where M1 is the component specification, M2 is the compound specification, $c'_1, \ldots, c'_n$ are the identifiers in the contextual extension signature, context(M1) is the conjunction of the context-related properties of the component and Δcontext(M2) is the conjunction of the properties of the contextual extension:

\[ context(M1) \Rightarrow \exists (c'_1, \ldots, c'_n).\ \Delta context(M2) \]

From a logical viewpoint, these proof obligations can be seen as a set of necessary and sufficient conditions for establishing the conservativeness of the contextual extension. By requiring such a contextual extension to be conservative, one guarantees the relative consistency, the relative completeness of presentation and the absence of information flow from the context of the compound specification to the context of the component specification. By establishing the conservativeness of the compound context over the context of the component, these proof obligations are

1. essential in order to ensure the correctness of the "consultation only" sharing architecture in the case of sees and the compatibility of layered implementation in the case of imports;
2. useful in order to facilitate proof modularity and orthogonality of incremental specification in the case of sharing specification modules via the uses primitive.

Instances of such proof obligations are generated by the B-Toolkit (release Beta 4.58) for the validation of the contextual extensions that are associated with the uses, sees, or imports primitives, as well as for ensuring that (the static context of) a component specification in B is internally consistent (following the Proof Obligations for Internal Consistency provided by Lano in [20]). We also noted that only in the case of imports (in an implementation) has some form of conservativeness validation by means of appropriate uniform interpolants been explicitly considered in the B-Book. However, the context-related proof obligation generated by the B-Toolkit for imports is not sufficient and needs to be corrected.

We plan to investigate the potential of producing analogous proof obligations to ensure non-interference and to control state sharing and information flow in the dynamic part of component specifications in B. This may involve developing an appropriate (polymodal) formalism (cf. [3,14]) to model the correlation between sequences of general substitutions and state transitions and then reducing it to classical logic enriched with a deductive presentation of set-theoretic membership (cf. [8,12]). Non-interference and absence of information flow are known to be related to bisimulation between the abstract state spaces [5,27], while the existence of interpolants is known to be equivalent to entailment along bisimulation in various (poly)modal logics [8,12]. Furthermore, an equivalence between the stability of theorem conservation under amalgamation of theory extensions and some variants of interpolation has been established in [16] for various families of reflexive, transitive and monotonic entailment relations.

References

1. J.R. Abrial. The B-Book: Assigning Programs to Meanings. C.U.P., 1996.
2. B-CORE (UK) Ltd. The b-toolkit. 1999. URL: http://www.b-core.com.
3. J. Barwise, D. Gabbay, and C. Hartonas. On the logic of information flow. Bulletin of the IGPL, 3(1):7–49, 1995.
4. D. Bert, M-L. Potet, and Y. Rouzaud. A study on Components and Assembly primitives in B. In H. Habrias, ed., First Conference on the B-Method, 1996.
5. J.C. Bicarregui. Non-Interference, Security and Bisimulation: explorations into the roles of Read and Write frames. CLRC-RAL, 1998.
6. J.C. Bicarregui et al. Formal Methods Into Practice: case studies in the application of the B Method. I.E.E. Transactions on Software Engineering, 1997.
7. J.C. Bicarregui, J. Dick, B. Matthews, and E. Woods. Making the most of formal specification through animation, testing and proof. Sci. of Comp. Prog., 1997.
8. J. van Benthem. Modality, bisimulation and interpolation in infinitary logic. Annals of Pure and Applied Logic, 96, 1999.
9. M. Buchi and B. Back. Compositional Symmetric Sharing in B. In FM'99 – Formal Methods, volume I of LNCS, pages 431–451. Springer, September 1999.
10. D. Clutterbuck, J.C. Bicarregui, and B.M. Matthews. Experiences with proof in formal development. In H. Habrias, ed., First Conference on the B-Method, 1996.
11. W. Craig. Three uses of the Herbrand-Gentzen theorem in relating model theory and proof theory. Journal of Symbolic Logic XXII, pages 269–285, 1957.
12. G. D'Agostino, A. Montanari, and A. Policriti. A set-theoretic translation method for (poly)modal logics. Lecture Notes in Computer Science 900, 1995.
13. Th. Dimitrakos and T.S.E. Maibaum. Notes on refinement, interpolation and uniformity. In ASE'97, 12th IEEE Int. Conf., 1997.
14. Theodosis Dimitrakos. Formal support for specification design and implementation. PhD thesis, Imperial College, March 1998.
15. Theodosis Dimitrakos. Parameterising specifications on diagrams. In ASE'98, 13th IEEE Int. Conf., 1998.
16. Theodosis Dimitrakos and Tom Maibaum. On a generalised modularisation theorem. Information Processing Letters, 74(1-2):65–71, 2000.
17. S. Dunne. The Safe Machine: A New Specification Construct for B. In FM'99 – Formal Methods, volume I of LNCS, pages 472–489. Springer, September 1999.
18. H. B. Enderton. A Mathematical Introduction to Logic. Academic Press, 1972.
19. Cliff B. Jones. Accommodating interference in the formal design of concurrent object-based programs. Formal Methods in System Design, 8(2):105–122, March 1996.
20. Kevin Lano. The B Language and Method. Springer-Verlag, 1996.
21. P.J. Lupton. Promoting Forward Simulation. In J.E. Nicholls, editor, Z User Workshop, pages 27–49. Springer-Verlag, Oxford 1990.
22. B. Matthews, B. Ritchie, and J. Bicarregui. Synthesising structure from flat specifications. In 2nd International B Conference, LNCS, 1998.
23. M.C. Mere and P.A.S. Veloso. Definition-like extensions by sorts. Bulletin of the IGPL, 3:579–595, 1995.
24. B. Meyer. Object Oriented Construction. Prentice-Hall, 1988.
25. M-L. Potet and Y. Rouzaud. Composition and Refinement in the B-Method. In D. Bert, editor, Second B International Conference, pages 46–65, 1998.
26. Yann Rouzaud. Interpreting the B-Method in the Refinement Calculus. In J. Wing, J. Woodcock, and J. Davies, editors, FM'99 – Formal Methods, vol. I, 1999.
27. P.Y.A. Ryan and S.A. Schneider. Process algebra and non-interference. In Proc. of the 12th Computer Security Foundations Workshop. IEEE Comp. Soc. Press, 1999.
28. Ketil Stølen. Development of Parallel Programs on Shared Data-Structures. PhD thesis, University of Manchester, 1990. Available as technical report UMCS-91-1-1.
29. Wladyslaw M. Turski and Thomas S. E. Maibaum. The Specification of Computer Programs. Addison-Wesley, 1987.
30. P.A.S. Veloso and T.S.E. Maibaum. On the modularisation theorem for logical specifications. Information Processing Letters 53, pages 287–293, 1995.
31. P.A.S. Veloso and S.R.M. Veloso. On extensions by function symbols: conservativeness and comparison. Tech. Report, COPPE/UFRJ, 1990. (See also [23,32])
32. P.A.S. Veloso and S.R.M. Veloso. Some remarks on conservative extensions: a Socratic dialogue. Bulletin of the EATCS, vol. 43, 1991.
33. J.C.P. Woodcock. Mathematics as a Management Tool: Proof Rules for Promotion. In CSR Sixth Annual Conference on Large Software Systems. Bristol, 1989.

Automatic Construction of Validated B Components from Structured Developments

Pierre Bontron and Marie-Laure Potet

LSR-IMAG, Grenoble, France
Laboratoire Logiciels Systèmes Réseaux - Institut IMAG (UJF - INPG - CNRS)
BP 72, F-38402 Saint Martin d'Hères Cedex - Fax +33 4 76827287
[email protected], [email protected]

Abstract. Decomposition and refinement provide a way to master the complexity of system specification and development. Decomposition allows us to describe a complex system in terms of simpler and more understandable components and in terms of the interactions between these components. Refinement/Abstraction allows us to use more general specifications, which should also be more understandable, and which can be gradually made more precise. Combining decomposition and refinement offers a very powerful tool to build specifications. This process results in a structured object which describes both the final specification and its elaboration in terms of interaction and refinement. Nevertheless the result remains intrinsically a complex object. The next step consists in developing tools to represent, to manipulate and to reason about such structured objects. The aim of this paper is to propose such a tool in the framework of the B method. By exploiting the B theory, and as far as possible without changing the method, we propose three algorithms to extract validated B components, using properties underlying the structure of developments. These new components can be exploited to extend a structured development, for instance to validate new properties.

1 Introduction

Decomposition and refinement provide a way to master the complexity of system specification and development [13][5]. Decomposition allows us to describe a complex system in terms of simpler and more understandable components and in terms of the interactions between these components. The complexity is split into two parts: local understanding of subcomponents and global understanding of interactions. Refinement/Abstraction allows us to use more general specifications, which should also be more understandable, and which can be gradually made more precise. Refinement/Abstraction introduces levels in the specification elaboration process intended to describe both the final specification and the way in which this specification is built. Using this principle, complexity is split into three parts: understandable components, specification levels and properties which must be preserved through refinement.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 127–147, 2000. © Springer-Verlag Berlin Heidelberg 2000


In the framework of the model-based approach, most specification languages offer constructs which deal with the decomposition principle. We can point out the schema notation in Z [14], modular extensions to VDM in VVSL [12], the modular extensions of IFAD/VDM [6,8], the RAISE language [7], or the structuring primitives of the B method [2]. By contrast, refinement is rarely integrated into the methods. For instance the notion of refinement is included in the VDM method [9] but still absent from the proposed ISO standard [3]. In the B method, refinement is integrated as a particular form of structuring and a development contains both structuring primitives and levels of refinement. For instance in the Météor project [4], about 1000 components have been developed, corresponding either to levels of refinement or to components which are combined.

Other advanced languages offer decomposition, refinement and methods to combine proofs for components (denoted by horizontal and vertical composition or monotonicity of refinement according to decomposition). We can cite TLA, the Temporal Logic of Actions, in which powerful theorems are proposed to refine subsystems [1], or the Specware language [16], which combines axiomatic specifications and category theory. The most important aspect of this framework is the ability to represent explicitly the structure of specifications, refinements and program modules, via specification diagrams. Reasoning about the overall structure of specifications, without a precise knowledge of the module contents, becomes possible.

Combining decomposition and refinement offers a very powerful tool to build specifications. This process results in a structured object which describes both the final specification and its elaboration in terms of interaction and refinement. Nevertheless the result remains intrinsically a complex object. So the next step to improve the mastering of complexity is to propose methodologies and tools related to the development of structured specifications. Methodologies can help to choose kinds of structure adapted to some classes of specifications. Moreover, tools can offer some techniques to represent, to manipulate and to reason about such structured objects. The aim of this paper is to propose such a tool in the framework of the B method.

This work was initially inspired by a comparison between VDM, B and Specware to develop specifications structured by refinement and composition ([18], in French). In view of the Specware technology [15] [17], it appears to us that the B method can offer more flexibility than it does now. In particular it is not easy to extend a structured development, for instance to validate new properties. By exploiting the B theory, and as far as possible without changing the method, we propose three algorithms to extract validated B components, using properties underlying the structure of developments. These new components can be exploited to extend or complete the model.

In section 2 we briefly present how developments can be structured through refinement and composition primitives in the B method. In section 3 we give a first extraction algorithm, extending the one proposed in the B-Book. After that we exploit two fundamental properties of refinement: stepwise refinement, which allows implementations to be developed by introducing several refinement steps, and partwise refinement, which allows structured specifications to be refined by refining their parts. From these properties we propose two component extraction algorithms. Finally, in the last section we show how the proposed approach can be put into practice.

2 Component, Composition and Refinement in B

In this section we present how B developments are structured through refinement and composition primitives and we point out several consequences of the chosen solutions.

2.1 B-Components

In the B-method, there are three syntactic kinds of components: abstract machines, refinements and implementations. These components behave differently with regard to the development process. The introduction of several kinds of components is a particular feature of the B-method. Generally, languages or methods offer only one kind of component, and provide some particular properties to characterize the implementations. As will be pointed out, this choice is related to the effectiveness of the method (proof obligations must be as simple as possible, in order to make the proof activity easier).

Abstract Machines. Abstract machines are specifications. They define sets, constants, variables and operations, in an abstract way. A simple abstract machine is:

    machine M(P)
    constraints C
    sets S
    constants K
    properties P
    assertions A
    variables X
    invariant I
    initialisation U
    operations
      O = pre Q then T end
    end

The sets clause defines given sets which are considered as basic independent types. Such sets can be enumerated or deferred (a finite and non-empty unspecified set). Constants and variables are qualified as concrete or as abstract. Concrete constants and variables must be kept through refinements until the implementation. The assertions clause declares some lemmas which have to be proved from the properties and the invariant.


Refinements. Refinements are not full specifications. A refinement is a sort of differential to be "added" to its abstraction (an abstract machine or else a previous refinement). Refinements also contain information about the refinement itself (the gluing relation between the variables and constants of the abstraction and the variables and constants of the refinement). In that sense refinement components are more than specifications; they also draw out the development, linking two components by a refinement relation. Refinement constructs are similar to abstract machines (see Fig. 1). A refinement inherits some parts of its abstraction (sets, concrete constants and variables, properties, invariant, preconditions, unchanged operations). In this way the syntactic information to be added is limited.

    machine M1
    variables v1
    invariant L1
    initialisation U1
    operations
      op = pre P1 then S1 end
    end

    refinement R2
    refines M1
    variables v2
    invariant L2
    initialisation U2
    operations
      op = pre P2 then S2 end
    end

Fig. 1. A refinement R2 of M1

A refinement can be seen as a stand-alone abstract machine. Fig. 2 describes the resulting abstract machine, as is done in the B-Book for a simple abstract machine and a simple refinement (in that case we have to suppose variables v1 and v2 are disjoint). As can be seen, the invariant of M2 is restricted by the invariant of its abstraction. The lemma ∃v1.(L1 ∧ L2) can be established by composing the refinement proof of the refinement R2 and the invariant preservation proof of the machine M1. In conclusion the refinement R2 "inherits", in some sense, the validated invariant L1. In a similar way, the precondition of the refined operations is restricted by the initial precondition of the abstract operations.

    machine M2
    variables v2
    invariant ∃v1.(L1 ∧ L2)
    initialisation U2
    operations
      op = pre P2 ∧ ∃v1.(L1 ∧ L2 ∧ P1) then S2 end
    end

Fig. 2. Refinement R2 seen as an abstract machine M2
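To make the quantified invariant of Fig. 2 concrete, the following Python sketch enumerates it on a small finite model. The invariants L1 and L2 used here are hypothetical (an abstract set v1 with at most two elements, refined by a counter v2); they are chosen only to show how ∃v1.(L1 ∧ L2) restricts the concrete variable.

```python
from itertools import chain, combinations

ELEMENTS = (0, 1, 2)  # hypothetical universe for the abstract variable v1


def subsets(items):
    """All subsets of items, as frozensets."""
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, n) for n in range(len(items) + 1))]


def L1(v1):
    """Hypothetical abstract invariant: v1 holds at most two elements."""
    return len(v1) <= 2


def L2(v1, v2):
    """Hypothetical gluing invariant: v2 counts the elements of v1."""
    return v2 == len(v1)


def extracted_invariant(v2):
    """Invariant of M2: there exists v1 such that L1(v1) and L2(v1, v2)."""
    return any(L1(v1) and L2(v1, v2) for v1 in subsets(ELEMENTS))


# Concrete values admitted by the extracted invariant.
valid_v2 = [v2 for v2 in range(5) if extracted_invariant(v2)]
```

In this toy model L2 alone would also admit v2 = 3; the bound on v2 comes from the abstract invariant L1, which is the sense in which the refinement inherits it.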


The introduction of a particular form of component for refinement has some practical consequences. The first advantage is to limit the syntactic text to be added: a refinement only contains the details corresponding to the new specification level. Moreover the complexity of proof obligations is mastered. For instance an unchanged operation is not repeated and, consequently, no proof obligation is generated. A more important consequence is that the invariant part of the refinement only contains the additional properties of the new variables (L2 in Fig. 1) and only this invariant must be established. Nevertheless the stronger invariant ∃v1.(L1 ∧ L2) is also valid. In this way the generated proof obligations are as simple as possible.

Implementations. Implementations are the last level of a development; they cannot be refined. Implementations are particular refinements in which substitutions are executable, so they can be translated into code. Moreover at the implementation level concrete constants and deferred sets must be given a value. Deferred sets must be valued by a non-empty interval whose limits are numeric scalars. Valuations can be created indirectly (by homonymy) or explicitly, through the values clause, which assigns constants and sets in a deterministic and non-circular way. For concrete constants the values clause is a particular form of a properties clause. For a given set S, the values clause modifies the type associated with S. This type is initially a basic type corresponding to the given set S, and the values clause instantiates S with a subset of NAT. So a values clause modifies the model as a whole because typing is affected. Nevertheless, except for type checking, a given set S can be considered as a constant with the property S ∈ ℙ1(NAT), where ℙ1 is the power-set operator excluding the empty set, and the values clause can be interpreted as a particular form of property.

2.2 B Composition

B Composition Primitives. The B method offers some mechanisms to build components from others: the includes, imports, sees and uses clauses. We do not detail these primitives here but only focus on their use in developments. All of these clauses have abstract machines as arguments. That is to say, refinements or implementations can never be included, used, seen or imported. Conversely, the use of these clauses is restricted depending on the kind of B component. In an abstract machine only the clauses sees, uses and includes are allowed, in a refinement only the clauses sees and includes are allowed and in an implementation only the clauses sees and imports are possible. The imports clause can be seen as an encapsulated view of the includes clause (abstract constants and variables of the imported machine cannot be directly referenced in operations and initialisations). Such a restriction allows an imported abstract machine to be substituted by its implementation, in order to elaborate the final code. The includes and imports clauses can be used with the promotes clause. With this clause operations of the included or imported machine become proper operations of the component. The extends clause is an includes or imports clause with a promotion of all operations of the included or imported machine.

Refinement of Structured Components. Now we have to consider how composition clauses are taken into account in refinement. An abstract machine with a uses clause is not a refinable specification. In the same way the imports clause is only allowed in implementations, and implementations are not refinable. So we only consider the includes and sees clauses. When a sees clause is introduced, it must be preserved in the chain of refinements, until the implementation. At the level of implementation the seen machine must be used in an encapsulated view (abstract constants and variables are only accessible through operation calls). An includes clause may or may not be kept in the refinement chain. At the level of implementation an includes clause becomes an imports clause and then an encapsulated view is imposed.

Architectural Constraints. The B composition clauses give syntactic mechanisms to build new specifications from other ones. Moreover, because the B method is based on the verification of some properties (invariant preservation and refinement relation), some strong constraints are necessary. For instance seen variables cannot be constrained in the invariant part and some global architectural rules must be fulfilled (for instance an abstract machine can be included/imported only once). In particular the form of a structured development depends intrinsically on the properties to be stated (variables which are linked by an invariant must be declared or included/imported in the same component).

2.3 The Proposed Tool

Objectives. Because refinements and implementations are not full specifications they cannot be included, used, seen or imported. This restriction strengthens the abstraction/decomposition criterion. Only abstract machines can be seen by external users and their chain of development is private. This point of view corresponds to the use of refinement/implementation during the coding phase: abstract machines are detailed specifications and refinements and implementations represent the coding phase. In such a case independence between developments can be guaranteed. Nevertheless, when refinements and implementations are used during the analysis or specification phases, this distinction is not relevant. In that case refinements or implementations are used to introduce new detail and there is a priori no conceptual distinction between the different kinds of component. Using the B method, with its new extensions [10], early in the development activity seems to be very promising. In that case there is no reason not to see, include, use or import a refinement or even an implementation, if it corresponds to a specification level. Encapsulation seems to be a matter of methodological guidelines (connected for instance to the different phases of the development process) rather than something to be encoded into the component syntax.
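The restrictions recalled in section 2.2, and the rule that only abstract machines may be referenced, can be summarised as a small lookup. The Python sketch below is only an illustration of these rules as stated in the text; the function and table names are hypothetical and do not correspond to any B tool.

```python
# Composition clauses that each kind of component may contain (section 2.2).
ALLOWED_CLAUSES = {
    "machine": {"sees", "uses", "includes"},
    "refinement": {"sees", "includes"},
    "implementation": {"sees", "imports"},
}


def clause_is_legal(owner_kind, clause, target_kind):
    """Check one composition link against the restrictions of section 2.2.

    owner_kind and target_kind are "machine", "refinement" or "implementation".
    """
    # extends is includes (or imports, in an implementation) with the
    # promotion of all operations, so it is checked as that clause.
    if clause == "extends":
        clause = "imports" if owner_kind == "implementation" else "includes"
    # Only abstract machines may be the argument of a composition clause.
    return target_kind == "machine" and clause in ALLOWED_CLAUSES[owner_kind]
```

For instance `clause_is_legal("machine", "includes", "refinement")` is rejected by this check; the extraction algorithms work around the restriction by producing an abstract machine that stands for the refinement.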


The Algorithms. The three algorithms aim to mechanically build abstract machines corresponding to any component of a structured development. These algorithms are based on the underlying theory of the B method and, obviously, the extracted abstract machines are valid by construction. We will also produce components stating relations between the extracted components and the initial components, in order to preserve the structure of the development.

1. The first algorithm is intended to build an abstract machine from a direct refinement of a machine. Such an algorithm is essentially proposed in the B-Book. We extend it in order to treat all the clauses allowed in B components and to deal with the variable-renaming facilities of B.
2. The second algorithm generalizes the first one, allowing us to extract an abstract machine at any point of a chain of refinements. It is based on the transitivity property of the refinement relation.
3. The third algorithm extracts components corresponding to structured refinements. It is intended to combine refinement levels through several chains of refinement. It is based on the monotonicity property of the refinement relation.

The first two algorithms produce abstract machines corresponding to refinement or implementation components. The third algorithm produces abstract machines corresponding to components which do not syntactically appear in the development, but exist in the B theory (take as an example the code issued from a structured development). This algorithm is the most interesting one because it produces new components not existing in the development. Nevertheless, the first two algorithms are also interesting because, as pointed out before, the definition of refinements and implementations as "deltas" simplifies the syntax and the proof obligation activity.

Practical Consequences. Firstly these algorithms collect, through a structured development, all the properties related to a component, these properties being explicitly stated in the extracted abstract machine. When refinement and implementation are used at the specification level, we can have a synthetic view of these properties at each point of a development. Secondly, because refinements and implementations become abstract machines, they can be included, imported and combined. In this way we can state new invariants of a refinement or an implementation, for instance by extending the corresponding abstract machine. Moreover refinements/implementations (through their extracted abstract machines) can be combined to develop a new structure corresponding to several points of view about a specification (the functional aspects, the traceability of some properties . . . ). In particular we can state new properties linking variables defined in several refinement components, without changing the structure of the model. Otherwise, due to the B restrictions on architectures, the model must be modified as a whole, in order to introduce such variables at the level of abstract machines.


In the following we give the syntactic construction of the resulting components and the theoretical framework which justifies their validity. In the conclusion we briefly present some uses of these algorithms.

3 Extraction Algorithms

3.1 Component Extraction from a Direct Refinement

Figure 3 represents the different components which are extracted and their connections. The abstract machine MR2 describes the specification corresponding to the refinement level R2. Because sets must be defined only once we build a machine MSETS which factorizes the definition of the sets of machine M1. Thus the machine EM1 is similar to M1, without its sets clause and with a sees clause on MSETS. Finally, the relationship between the two abstract machines EM1 and MR2 is built in the refinement component RR2. In the following we see how the two components MR2 and RR2 are produced from the different clauses of the initial components M1 and R2.

[Diagram: the components M1, R2, EM1, MSETS, RR2 and MR2, connected by refines, sees and extends links]

Fig. 3. The components extracted from a direct refinement

The Abstract Machine MR2.

The Heading Part. When an abstract machine has parameters, there is also a constraints clause which constrains these parameters. In this case this clause is repeated in MR2.

    MACHINE M1(X)          MACHINE MR2(X)
    CONSTRAINTS C          CONSTRAINTS C


Variables and Constants.

    MACHINE M1
    ABSTRACT_CONSTANTS AC1
    CONCRETE_CONSTANTS CC1
    PROPERTIES P1
    ABSTRACT_VARIABLES AV1
    CONCRETE_VARIABLES CV1
    INVARIANT I1

    REFINEMENT R2
    ABSTRACT_CONSTANTS AC2
    CONCRETE_CONSTANTS CC2
    PROPERTIES P2
    ABSTRACT_VARIABLES AV2
    CONCRETE_VARIABLES CV2
    INVARIANT I2

    MACHINE MR2
    ABSTRACT_CONSTANTS AC2
    CONCRETE_CONSTANTS CC1, CC2
    PROPERTIES ∃ c.(P1 /\ P2)
    ABSTRACT_VARIABLES AV2
    CONCRETE_VARIABLES CV1, CV2
    INVARIANT ∃ v,c.(P1 /\ P2 /\ I1 /\ I2)

Fig. 4. Variables and constants

In figure 4, AC1, AC2, CC1 and CC2 are sets of constants, and AV1, AV2, CV1 and CV2 are sets of variables (the second letter is V for variables and C for constants). Concrete constants and variables cannot be redefined in the refinement; they are implicitly inherited. Abstract constants and variables may or may not disappear. If they are redefined they are implicitly identified with the constants or variables of the machine with the same name. Constants and variables which do not belong to the machine MR2 are obtained by the two following formulae:

    v = AV1 − (AV2 ∪ CV2)
    c = AC1 − (AC2 ∪ CC2)

Finally, as described in the B-Book, the invariant is obtained by an existential quantification on these variables. A similar treatment is applied to the properties clause.

Assertions. Assertions of M1 are inherited by its refinements and, because assertions are proved from the invariant or from the previous assertions, the order between A1 and A2 in the abstract machine MR2 below is important.

    MACHINE MR2
    ASSERTIONS
      ∃ v,c.A1;
      ∃ v,c.A2
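The two set-difference formulae for v and c can be sketched directly with Python sets. The identifier names below (queue, owner, buffer, head, limit) are hypothetical and serve only to illustrate which identifiers of M1 end up under the existential quantifier in MR2.

```python
def hidden_identifiers(abstract_of_m1, abstract_of_r2, concrete_of_r2):
    """Identifiers of M1 to be quantified away: A1 - (A2 ∪ C2)."""
    return set(abstract_of_m1) - (set(abstract_of_r2) | set(concrete_of_r2))


# Hypothetical identifier sets, chosen only for illustration.
AV1, AV2, CV2 = {"queue", "owner"}, {"owner"}, {"buffer", "head"}
AC1, AC2, CC2 = {"limit"}, set(), {"limit"}

v = hidden_identifiers(AV1, AV2, CV2)   # abstract variables that disappear
c = hidden_identifiers(AC1, AC2, CC2)   # abstract constants that disappear


def quantify(names, body):
    """Wrap body in an existential quantifier over names, if any remain."""
    return f"∃ {','.join(sorted(names))}.({body})" if names else body


invariant_of_mr2 = quantify(v | c, "P1 ∧ P2 ∧ I1 ∧ I2")
```

Here only queue is hidden: owner is redeclared in R2, and limit reappears as a concrete constant, so neither needs to be quantified.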


As with the variables and constants, v represents the abstract variables and c represents the abstract constants which do not appear within the refinement R2. Assertions of MR2 can be proved from the proof of A1 in M1 (P1 ∧ I1 ⇒ A1) and from the proof of A2 in R2 (P1 ∧ I1 ∧ P2 ∧ I2 ∧ A1 ⇒ A2). The proof is left to the reader.

Composition Primitives. We retain the structuring clauses from R2 in MR2, except when R2 is an implementation. In this case, the imports clause, which is not allowed in abstract machines, is transformed into an includes clause.

Operations. Two different cases must be treated: when an operation of the abstraction is redefined in the refinement and when it is not. The promoted operations (with promotes or extends) in an abstract machine or in a refinement are considered as any operation of the component. If an operation defined in M1 is not redefined in the refinement R2 its definition is copied; otherwise its precondition becomes ∃v,c.(P1 ∧ P2 ∧ I1 ∧ I2 ∧ PM2) ∧ PR2 (see Fig. 5). Initialisation is treated as a particular operation.

    MACHINE M1
    ...
    INITIALISATION U1
    OPERATIONS
      op1 = PRE PM1 THEN TM1 END;
      op2 = PRE PM2 THEN TM2 END
    END

    REFINEMENT R2
    REFINES M1
    INITIALISATION U2
    OPERATIONS
      op2 = PRE PR2 THEN TR2 END
    END

    MACHINE MR2
    ...
    INITIALISATION U2
    OPERATIONS
      op1 = PRE PM1 THEN TM1 END;
      op2 = PRE ∃ v,c.(P1 /\ P2 /\ I1 /\ I2 /\ PM2) /\ PR2 THEN TR2 END
    END

Fig. 5. Operations
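The construction of Fig. 5 can be sketched as a purely syntactic transformation. In the following Python sketch, preconditions and substitutions are kept as plain text, and the names `Operation` and `extract_operations` are hypothetical; the sketch only mirrors the two cases described above (operation copied, or precondition strengthened).

```python
from typing import NamedTuple


class Operation(NamedTuple):
    pre: str    # precondition, kept as text
    body: str   # substitution, kept as text


def extract_operations(m1_ops, r2_ops, context, hidden):
    """Operations of MR2 from those of M1 and R2 (dicts name -> Operation).

    context: the conjunction P1 ∧ P2 ∧ I1 ∧ I2, as text
    hidden:  the quantified identifiers v, c, as text
    """
    result = {}
    for name, abstract_op in m1_ops.items():
        if name not in r2_ops:
            # Not redefined in R2: the definition of M1 is copied.
            result[name] = abstract_op
        else:
            refined = r2_ops[name]
            pre = f"∃ {hidden}.({context} ∧ {abstract_op.pre}) ∧ {refined.pre}"
            result[name] = Operation(pre, refined.body)
    return result


m1 = {"op1": Operation("PM1", "TM1"), "op2": Operation("PM2", "TM2")}
r2 = {"op2": Operation("PR2", "TR2")}
mr2 = extract_operations(m1, r2, "P1 ∧ P2 ∧ I1 ∧ I2", "v,c")
```

With the operations of Fig. 5 as input, `mr2["op1"]` is the unchanged operation of M1 and `mr2["op2"]` carries the quantified precondition together with the body TR2.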

The treatment of the operations does not necessarily preserve the B restrictions on the form of substitutions allowed in an abstract machine (sequencing, for instance, can be introduced into an abstract machine). Nevertheless such restrictions are methodological ones, and admitting such substitutions into an abstract machine introduces no theoretical problem. By combining the proofs of invariant preservation in M1 and the refinement proofs in R2 we can deduce that the operations of MR2 preserve the invariant ∃v,c.(P1 ∧ I1 ∧ P2 ∧ I2).

Valuations. This clause appears only in implementations. When this clause is relative to constants, the corresponding part of the values clause becomes a part of the properties clause of the abstract machine MR2 after a syntactic transformation (the ";" becomes "∧"). In the case of sets we have to modify the abstract machine M1 (given sets become constants with their type properties as described before). This transformation is not simply syntactic because the type checking is affected.


The Refinement RR2 . The initial refinement R2 is split into two components: the abstract machine MR2 and the refinement relation RR2 linking EM1 and MR2 . RR2 sees MSETS and S2 , which are the sets defined in the refinement R2 . The invariant and the properties clauses of RR2 are identical to the ones of R2 . As a consequence, the truth of proof obligations on RR2 is directly deduced from the refinement proofs of R2 .

REFINEMENT R2
REFINES M1
SEES S2
PROPERTIES P2
INVARIANT I2
...
END

REFINEMENT RR2
REFINES EM1
SEES S2, MSETS
EXTENDS MR2
PROPERTIES P2
INVARIANT I2
END

3.2 Component Extraction through a Chain of Refinements

In this part we exploit the transitivity of refinement at the level of components. First we recall the theoretical framework, then we use it to build components corresponding to any level of refinement. In this way we implement stepwise refinement, which allows implementations to be developed by introducing several levels of refinement.

Refinement Transitivity. At the level of components the expected property is:

$$\frac{C_1 \sqsubseteq_u C_2 \qquad C_2 \sqsubseteq_v C_3}{C_1 \sqsubseteq_w C_3}$$

where $C_i \sqsubseteq_u C_j$ stands for “$C_i$ is refined by $C_j$ with the refinement relation u”, and w is a relation built from u and v. Refinement of B components is defined by sufficient conditions on the refinement of the initialisation and of each operation. So component refinement reduces to refinement of generalized substitutions, defined for instance as below.


Definition 1. (B-Book page 520) Let T and U be two substitutions. Let v be a total binary relation linking the concrete domain of T and the abstract domain of U. We have $U \sqsubseteq_v T$, i.e. U is refined by T using the relation v, if:

$$v^{-1}[\mathrm{pre}(U)] \subseteq \mathrm{pre}(T)$$
$$v^{-1} ; \mathrm{rel}(T) \subseteq \mathrm{rel}(U) ; v^{-1}$$

where pre(S) is $\{x \mid [S](x = x)\}$ and rel(S) is $\{(x, x') \mid \neg[S]\neg(x' = x)\}$, x being the list of variables on which S is defined and x′ a fresh list of variables. From this definition we can deduce the following transitivity property (the proof is left to the reader):

$$\frac{U \sqsubseteq_u T \qquad T \sqsubseteq_v S}{U \sqsubseteq_{v;u} S}$$

where U, T and S are three generalized substitutions, u and v are two relations, and v ; u is their relational composition. Because generalized substitution refinement is transitive, and because component refinement reduces to generalized substitution refinement, component refinement is also transitive.

Component Extraction Using Refinement Transitivity. We now use the previous result to build an abstract machine through a chain of refinements. Let M1, R2 and R3 be three B components defined as below (R3 could be an implementation):

machine M1 constants c1 properties P1 variables v1 invariant I1 ... end

refinement R2 refines M1 constants c2 properties P2 variables v2 invariant I2 ... end

refinement R3 refines R2 constants c3 properties P3 variables v3 invariant I3 ... end
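Definition 1 and the transitivity property can be checked on small finite examples. The Python sketch below is one possible encoding, not the authors' tooling: a substitution is represented by its pre and rel sets over a finite state space, and the names Subst, make_subst and refines, as well as the three example substitutions U, T and S, are assumptions made for illustration.

```python
from collections import namedtuple

# A substitution over a finite state space, given by pre(S) and rel(S).
Subst = namedtuple("Subst", ["pre", "rel"])

def make_subst(states, pre, step):
    """Build (pre, rel) from a precondition and a step function.

    Outside the precondition every pair is allowed, which matches
    rel(S) = {(x, x') | not [S] not (x' = x)} for states where S may abort.
    """
    states, pre = set(states), set(pre)
    rel = {(x, y) for x in pre for y in step(x)}
    rel |= {(x, y) for x in states - pre for y in states}
    return Subst(pre, rel)

def inverse(r):
    return {(b, a) for (a, b) in r}

def compose(r, s):
    """Relational composition r ; s (first r, then s)."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def image(r, xs):
    return {b for (a, b) in r if a in xs}

def refines(abstract, concrete, link):
    """Check that abstract is refined by concrete using link.

    link is a set of (concrete state, abstract state) pairs.
    """
    inv = inverse(link)
    pre_ok = image(inv, abstract.pre) <= concrete.pre
    rel_ok = compose(inv, concrete.rel) <= compose(abstract.rel, inv)
    return pre_ok and rel_ok

# U: move to any value >= the current one (abstract, nondeterministic).
U = make_subst(range(4), range(4), lambda a: {b for b in range(4) if b >= a})
# T: reversed encoding c = 3 - a, with a deterministic step down to 0.
T = make_subst(range(4), range(4), lambda c: {max(c - 1, 0)})
# S: shifted encoding s = c + 10, same behaviour as T.
S = make_subst(range(10, 14), range(10, 14), lambda s: {max(s - 1, 10)})

u = {(c, 3 - c) for c in range(4)}        # links T-states to U-states
v = {(s, s - 10) for s in range(10, 14)}  # links S-states to T-states

print(refines(U, T, u), refines(T, S, v), refines(U, S, compose(v, u)))
```

In this example both individual refinements hold, and so does the refinement of U by S under the composed relation v ; u. Swapping the roles of U and T fails, since refinement may reduce nondeterminism but not increase it.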

We first apply the previous algorithm between M1 and R2 producing MR2 and RR2 . Then we apply this algorithm between MR2 and R3 to build MR3 and RR3 :

[Diagram: the refinement chain M1, R2, R3 together with the extracted abstract machines MR2 and MR3 and the refinement components RR2 and RR3, connected by refines and extends links.]

The component MR3 is the abstract machine associated with the refinement R3. The last step consists in building the direct refinement between MR3 and M1, using the two intermediate refinements RR2 and RR3:

refinement RR2 refines M1 extends MR2 properties P2 invariant I2 end

refinement RR3 refines MR2 extends MR3 properties P3 invariant I3 end

We recall the correspondence between a refinement relation v and a gluing invariant. Let M1 be an abstract machine with invariant I1 on variables v1, and let R2 be its refinement with invariant I2 on variables v2. Then we have:

v = {(v2, v1) | I1 ∧ I2}

Recall that v1, v2 and v3 can have variables in common, which are identified. The B method introduces a constraint: when a variable disappears at some point in a refinement chain, it cannot reappear further down in subsequent refinements (B-Book page 536). This restriction guarantees that variables cannot be identified by accident. So we have v1 ∩ v3 ⊆ v2. By composing the two refinement relations we obtain

{(v3, v2) | I2 ∧ I3} ; {(v2, v1) | I1 ∧ I2}, i.e. {(v3, v1) | I1 ∧ ∃v2′.(I2 ∧ I3)},

where v2′ = v2 − (v1 ∪ v3). Because v1 ∩ v3 ⊆ v2 there is no illicit name capture. So the gluing invariant is ∃v2′.(I2 ∧ I3). By a similar construction for the constants, the resulting component R4 is:

refinement R4 refines M1 extends MR3 properties ∃c2′.(P2 ∧ P3) invariant ∃c2′, v2′.(P2 ∧ P3 ∧ I2 ∧ I3) end
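The composition of gluing invariants can be illustrated on a toy finite domain. The following Python sketch rests on several assumptions: the three invariants I1, I2 and I3 are invented for the example, the variable lists v1, v2 and v3 are single variables with no names in common (so v2′ is simply v2), and the abstract invariant of the intermediate machine is taken to be ∃v1.(I1 ∧ I2).

```python
from itertools import product

D = range(6)  # small finite domain shared by v1, v2 and v3 (assumption)

def I1(v1):          # invariant of M1 (hypothetical)
    return v1 % 2 == 0

def I2(v1, v2):      # gluing invariant of R2 (hypothetical)
    return v1 == 2 * v2

def I3(v2, v3):      # gluing invariant of R3 (hypothetical)
    return v3 == v2 + 1

def compose(r, s):
    """Relational composition r ; s (first r, then s)."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def inv_mr2(v2):
    """Invariant of the intermediate machine on v2: exists v1.(I1 and I2)."""
    return any(I1(v1) and I2(v1, v2) for v1 in D)

# Refinement relation between R2 and M1: {(v2, v1) | I1 and I2}.
u = {(v2, v1) for v2, v1 in product(D, D) if I1(v1) and I2(v1, v2)}
# Refinement relation between R3 and the intermediate machine.
v = {(v3, v2) for v3, v2 in product(D, D) if inv_mr2(v2) and I3(v2, v3)}

composed = compose(v, u)

# Direct relation with gluing invariant: I1 and exists v2.(I2 and I3).
direct = {(v3, v1) for v3, v1 in product(D, D)
          if I1(v1) and any(I2(v1, v2) and I3(v2, v3) for v2 in D)}

print(sorted(composed))
print(composed == direct)
```

On this example the composed relation and the relation built directly from the existentially quantified gluing invariant coincide, which is the equality used above to obtain R4.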


So, we have described an algorithm to extract an abstract machine from any step of a chain of refinements and to build the direct refinement relation between the top level machine and this refinement.

3.3 Component Extraction from a Structured Refinement

The last algorithm is based on the monotonicity of refinement at the level of components. First we introduce the theoretical framework. From this we explain how new components are obtained. In this way we implement partwise refinement, which allows structured specifications to be implemented by implementing their parts. Such properties are very important because they are implicitly applied in the code generation tool. Code generation consists in translating each implementation and in connecting them to build a global implementation. Obviously this implementation is a new structured component which must refine the initial specification. Let us give a simple example:

implementation C imports M1, M2 operations op = op1 ; op2 end

machine M1 variables x1 invariant I1 operations op1 = S1 end

machine M2 variables x2 invariant I2 operations op2 = S2 end

Now suppose that M1 and M2 have implementations, C1 and C2, in which op1 is defined by T1 and op2 is defined by T2. The partwise property states that C1 and C2 can be composed in order to build complete and correct code for the operation op, which is T1 ; T2. In the following we explain how refinement monotonicity with respect to composition clauses can be deduced from more basic monotonicity properties. It is more complicated than the transitivity property: some conditions are necessary because, when components are composed together, variable spaces are modified.

Monotonicity with Respect to Generalized Substitution Operators. To make monotonicity explicit at the level of B components, we recall two properties relating to refinement monotonicity for generalized substitutions.

Property 1. Refinement is monotonic with respect to each basic construct of the generalized substitution notation, when generalized substitutions work within the framework of the same abstract machine, except for the || operator. In this case, variable spaces must be disjoint. For instance, for the sequencing operator we have:

$$\frac{S_1 \sqsubseteq_v T_1 \qquad S_2 \sqsubseteq_v T_2}{S_1 ; S_2 \sqsubseteq_v T_1 ; T_2}$$


Property 1 provides a refinement of an external substitution from refinements of the substitutions that constitute it. An external substitution is a substitution which does not contain any explicit reference to the variables or constants of a machine or refinement. It is one that uses abstract machines in an encapsulated way. Nevertheless, concrete variables and constants can be referenced because they persist during the whole refinement process. The second property allows us to combine substitutions defined on different variable spaces. For this property many precautions must be taken for each variable space.

Definition 2. We denote by var(C) the set of variables of the component C (local, included and seen variables). Then var(S), where S is a generalized substitution, is defined by var(C), where C is the component in which S appears.

Property 2. Let A, S, B and T be four substitutions and u and v two refinement relations. Then we have:

$$A \sqsubseteq_u B \;\land\; S \sqsubseteq_v T \;\Rightarrow\; (A \parallel S) \sqsubseteq_{u \parallel v} (B \parallel T)$$

if:

$$\mathrm{var}(A) \cap \mathrm{var}(S) = \emptyset \;\land\; \mathrm{var}(B) \cap \mathrm{var}(T) = \emptyset \qquad (1)$$
$$\mathrm{var}(A) \cap \mathrm{var}(T) = \emptyset \;\land\; \mathrm{var}(B) \cap \mathrm{var}(S) = \emptyset \qquad (2)$$

where u || v is the parallel product defined by {((x1, x2), (y1, y2)) | (x1, y1) ∈ u ∧ (x2, y2) ∈ v}. The first hypothesis guarantees that the || operator is well defined, that an abstract variable cannot be refined in two distinct ways, and that a refined variable does not refine two abstract variables. Hypothesis 2 guarantees that we cannot mix variables of an abstraction and variables of a refinement, because in that case refinement does not remain valid. Let us give an example: we could have $x := 1 \sqsubseteq y := 1$ and $z := 2 \sqsubseteq x := 2$, but we cannot deduce $x := 1 \parallel z := 2 \sqsubseteq y := 1 \parallel x := 2$.

Monotonicity with Respect to Composition Primitives. Now we show how these properties are used to extend the refinement monotonicity property to the level of components, with some restrictions. Such a result is not directly stated in the B-Book. We use the example to illustrate how component monotonicity is built.

1. First, each refinement is proved separately. So we prove $S_1 \sqsubseteq_u T_1$ and $S_2 \sqsubseteq_v T_2$ for the two gluing relations u and v. This step corresponds to partwise refinement.
2. Second, we put together the abstract machines on the one hand and their refinements on the other hand. This operation implicitly consists in extending each operation to the global variable space. For instance, the extended definition of $S_1$, denoted by $E_{x_2}(S_1)$, is $S_1 \parallel skip_{x_2}$, where $skip_{x_2}$ stands for $x_2 := x_2$.
3. Third, we have to build a refinement relation w such that each extended operation is always refined by its extended refinement. If we are within the hypotheses of Property 2, w is the parallel product of the individual refinement relations, because any refinement relation is preserved by skip. In the example we have $E_{x_2}(S_1) \sqsubseteq_{u \parallel v} E_{y_2}(T_1)$.
4. Last, we use the monotonicity of refinement (Property 1) to deduce that combinations of substitutions are refined by the combinations of their refinements (for instance, $S_1 ; S_2 \sqsubseteq_w T_1 ; T_2$ can be deduced from $S_1 \sqsubseteq_w T_1$ and $S_2 \sqsubseteq_w T_2$ in the example). As stated before, applying Property 1 requires an encapsulated form of combination, i.e. an external substitution.

So we have proved component monotonicity when the two hypotheses of Property 2 are respected, i.e. when composed components have no variables in common (hypothesis 1) and when variables of different refinement levels are not mixed (hypothesis 2). This result can be extended to treat the form of variable sharing allowed by the B method [11]. Nevertheless, hypothesis 2 remains essential.

Component Extraction Using Refinement Monotonicity. The component monotonicity property gives us an effective way to build new components. We describe it for the case where abstract machines do not share variables, and neither do their refinements. Moreover, abstract machines must be used in an encapsulated way and refinements must be homogeneous, i.e.:

Definition 3. Let M1, M2 . . . Mn be a chain of refinements and N1, N2 . . . Nk be another chain of refinements. Mn and Nk are said to be homogeneous if they have no local variables with the same name and if var(Nk) ∩ var(Mi) ⊆ var(Mn) and var(Mn) ∩ var(Ni) ⊆ var(Nk). This condition guarantees that the composition of Mn and Nk cannot contain both refined variables and their abstraction.
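The parallel product and the two hypotheses of Property 2 are simple enough to sketch in Python. The function names below are hypothetical, and variable spaces are modelled as plain sets of variable names; the sketch only checks the side conditions and does not prove refinement.

```python
def parallel_product(u, v):
    """u || v = {((x1, x2), (y1, y2)) | (x1, y1) in u and (x2, y2) in v}."""
    return {((x1, x2), (y1, y2)) for (x1, y1) in u for (x2, y2) in v}

def hypotheses_hold(var_a, var_s, var_b, var_t):
    """Return the truth of hypotheses (1) and (2) of Property 2.

    A and S are the abstract substitutions, B and T their refinements.
    """
    h1 = not (var_a & var_s) and not (var_b & var_t)
    h2 = not (var_a & var_t) and not (var_b & var_s)
    return h1, h2

# The example from the text: x := 1 refined by y := 1,
# and z := 2 refined by x := 2.
h1, h2 = hypotheses_hold(var_a={"x"}, var_s={"z"}, var_b={"y"}, var_t={"x"})
print(h1, h2)

# A tiny parallel product of two one-pair relations.
print(parallel_product({(1, "a")}, {(2, "b")}))
```

For the example of the text, hypothesis (1) holds but hypothesis (2) fails, because x is both an abstract variable of the first pair and a concrete variable of the second. This is why the parallel refinement cannot be deduced there.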
Now let M be an abstract machine, which can be obtained from a refinement or an implementation by one of the previous algorithms, and let M be built from two abstract machines M1 and M2 by any composition primitives (we take a sees and an includes clause here). Let C1 and C2 be two homogeneous components belonging to the chain of refinements of M1 and M2 , as described below.

[Diagram: M sees M1 and includes M2; C1 and C2 are the ends of the refinement chains starting from M1 and M2 respectively.]

We can build a refinement MC1 C2 of M using the two refinements C1 and C2 . To do that we build first the abstract machines MC1 and MC2 and refinement relations RC1 and RC2 from C1 and C2 using the previous algorithm. Finally, we build MC1 C2 , using the two abstract machines MC1 and MC2 , and the refinement relation between M and this new component, RMC1 C2 :

[Diagram: M sees M1 and includes M2; RC1 refines M1 and extends MC1; RC2 refines M2 and extends MC2; MC1C2 sees MC1 and includes MC2; RMC1C2 refines M and extends MC1C2.]

Let I1 and I2 be the two invariants of the abstract machines M1 and M2 and let G1 and G2 be the two gluing invariants of the two new refinements RC1 and RC2 . The new components MC1 C2 and RMC1 C2 are built in the following way:

machine M sees M1 includes M2 operations op = S end

machine MC1 C2 sees MC1 includes MC2 operations op = S end

refinement RMC1 C2 refines M extends MC1 C2 invariant G1 ∧ G2 end

using the correspondence:

u = {(var(MC1), var(M1)) | I1 ∧ G1}
v = {(var(MC2), var(M2)) | I2 ∧ G2}
u || v = {((var(MC1), var(MC2)), (var(M1), var(M2))) | I1 ∧ G1 ∧ I2 ∧ G2}

This algorithm can be applied to build the global implementation level of a structured development. In this case implementations are necessarily homogeneous and the encapsulated view is guaranteed by the B method. Nevertheless, it can also be applied at each level of structured refinements, provided the hypotheses hold.

4 Conclusion

The proposed algorithms are intended to extract new valid components, deduced from the properties of the refinement and composition primitives in the B theory. We now briefly present how these algorithms can be applied when refinement and composition are used to build structured specifications, and we point out some issues which must still be developed. At the level of specification, refinement can be used to introduce detail gradually: for instance, the first level describes abstract operations which only verify safety properties. Refinements gradually introduce the different parts of the system (physical variables, a controller component . . . ). High-level operations are implemented in terms of more basic operations using composition primitives (layered developments), and data refinement provides the way to introduce the physical variables (channels, sensors . . . ). In this case a specification is a complex object, and the global properties of the specification are distributed among its different parts. In Fig. 6 we draw a structured specification and we explain some manipulations which can be applied to this specification.

[Figure: components M0, M1, M3, M4, C1, C3 and C4, linked by includes, imports and refines relations.]

Fig. 6. A structured specification

4.1 Specification Structure

A first manipulation consists in building the final specification, i.e. the specification which contains only the last level of each specification part. In this case we build the abstract machines corresponding to each last level of refinement. In this way we exhibit all properties of the variables corresponding to these levels, because abstract machine extraction from a refinement gives the strongest invariant on the refined variables. We also exhibit the direct dependencies between the components which constitute the last level of the specification. In Fig. 7 we draw the resulting specification. MC3 and MC4 are the abstract machines corresponding to C3 and C4. C1C3C4 is the abstract machine corresponding to the refinement of C1 stated in terms of C3 and C4, and M0C1 corresponds to the refinement of M0 stated in terms of C1C3C4. In this way we obtain the effective definitions of the M0 operations, i.e. the refined definitions directly stated in terms of the C1, C3 and C4 variables. Using the refined components, not included in the schema, we keep the abstract properties of this model.

[Figure: components M0C1, C1C3C4, MC3 and MC4, linked by includes relations.]

Fig. 7. The last level structure

4.2 Stating New Properties

In the B method we have two ways to validate properties: invariants for global state properties, and refinement/abstraction for more behavioural properties. We can now state new properties of the resulting components. For instance EX1 reinforces the invariant properties of MC3 and EX2 gives an invariant, PROP2 , linking variables of C3 and C4 . EX3 establishes a behavioral property PROP3 of C1 C3 C4 . The abstract machine PROP3 is not given here: I is a gluing invariant linking C1 , C3 and C4 variables with PROP3 variables.

machine EX1 extends MC3 invariant PROP1 end

machine EX2 extends MC3, MC4 invariant PROP2 end

refinement EX3 refines PROP3 invariant I extends C1C3C4 end

The invariant PROP1 could be stated in the initial model by adding it to C3 . But, if we do that, we modify the component and the proofs must be replayed. By contrast, the two properties PROP2 and PROP3 cannot be expressed in the initial model. Adding new properties to a structured specification, as proposed below, introduces new structures, adapted to these properties. There is no reason that the structure should be the same for different objectives. Nevertheless, to put together several structures, some precautions must be taken. The B method imposes global constraints on structures in order to preserve invariants and refinement proofs. New structures must remain compatible with these constraints.


Nevertheless, problems appear only if we combine components belonging to several structures. So, methods to add new architectures, preserving these constraints, must be studied.

4.3 Calculus of Component Relations

Up to now we have not exploited the refinement components extracted from structured developments. These components only contain the refinement relation between the initial and the extracted components. Nevertheless, the semantics of the different types of relations between components (refinement, composition) could be exploited further, in order to propagate proved properties through the model as a whole. For instance, invariant preservation can be seen as a particular form of refinement (the variables are unchanged, the abstract preconditions are the invariant, and the operations assign variables in any way which preserves the invariant), and sometimes composition can also be seen as a form of refinement (for instance, when an abstract machine is extended and no new operations are added, extension is a particular form of refinement). For instance, in the example, all refinements of MC3 verify the property PROP1 because MC3 refines EX1 (more exactly, MC3 and EX1 are equivalent, i.e. they describe the same specification). Such a calculus is developed in Specware, in which composition and refinement are defined in terms of particular morphisms. So morphism properties can be applied to derive new relations between components and, in this way, to deduce properties of components that follow from the structure.

4.4 Perspectives

A tool implementing the proposed algorithms is under development. It produces syntactic B components which can be effectively reused in the method. Further work will be directed at the use of these algorithms. The first aim is to give some help in the understanding of structured specifications, exploiting the fact that the algorithms allow us to distinguish between the specifications (the final product) and how they are obtained (the approach used to develop the final product and its objective in terms of property validation). Moreover, we think that a specification is often developed from several points of view, which correspond to different structures: for instance, a functional point of view gives a specification structured in terms of operation calls, whereas a specification guided by expected properties is rather structured with respect to the decomposition possibilities of these properties. In these cases, the proposed algorithms allow specifications to share different points of view and to best manage the proved properties of each component. More experiments must be done in order to test the limitations of the proposed approach and to study possible extensions.

Acknowledgement. We are grateful to the anonymous reviewer who helped us to improve the English wording of this paper.


References

1. M. Abadi and L. Lamport. Conjoining specifications. ACM Transactions on Programming Languages and Systems, 17(3):507–534, May 1995.
2. J.R. Abrial. The B-Book. Cambridge University Press, 1996.
3. D.J. Andrews, H. Bruun, B.S. Hansen, P.G. Larsen, N. Plat, et al. Information Technology - Programming Languages, their environments and system software interfaces - Vienna Development Method-Specification Language Part 1: Base language. ISO, 1995.
4. P. Behm, P. Benoit, A. Faivre, and J.-M. Meynadier. Météor: A Successful Application of B in a Large Project. LNCS, FM’99 - Formal Methods, 1:348–387, September 1999.
5. Grady Booch. Object-Oriented Analysis and Design. Addison-Wesley, 1994.
6. R. Elmstrom, P. G. Larsen, and P. B. Lassen. The IFAD VDM-SL toolbox: a practical approach to formal specifications. ACM SIGPLAN Notices, 29(9):77–80, 1994.
7. RAISE Language Group. The RAISE Specification Language. Prentice Hall - BCS Practitioner series, 1992.
8. The VDM-SL Tool Group. Users Manual for the IFAD VDM-SL Toolbox. IFAD, Forskerparken 10, 5230 Odense M, Denmark, February 1994. IFAD-VDM-4.
9. C. B. Jones. Systematic Software Development Using VDM (Second Edition). Prentice-Hall, London, 1990.
10. L. Mussat and J.R. Abrial. Introducing Dynamic Constraints in B. In D. Bert, editor, Proceedings of the Second International B Conference, volume 1393 of Lecture Notes in Computer Science. Springer-Verlag, 1998.
11. M.-L. Potet and Y. Rouzaud. Composition and Refinement in the B-method. In D. Bert, editor, Proceedings of the Second International B Conference, volume 1393 of Lecture Notes in Computer Science. Springer-Verlag, 1998.
12. C.A. Middelburg. Logic and Specification: Extending VDM-SL for advanced formal specification. Chapman and Hall, 1993.
13. D. Parnas. On the criteria to be used in decomposing systems into modules. Communications of the ACM, 15(12):1053–1058, December 1972.
14. J.M. Spivey. The Z notation - A Reference Manual (Second Edition). Prentice Hall, 1992.
15. M. Srinivas and L.M. Patnaik. Genetic algorithms: a survey. IEEE Computer, pages 17–26, June 1994.
16. Y.V. Srinivas and R. Jüllig. Specware(TM): Formal support for composing software. Technical Report KES.U.94.5, Kestrel Institute, December 1994.
17. Y.V. Srinivas and R. Jüllig. Specware Language Manual. Kestrel Institute, November 1995.
18. Y. Ledru, C. Oriat, and M-L. Potet. Le raffinement vu comme primitive de spécification - une comparaison de VDM, B et Specware. Technical report, AFADL Approches Formelles dans l’Assistance au Développement de Logiciels, LISI/ENSMA, 86960 FUTUROSCOPE, 1998.

Playing with Abstraction and Refinement for Managing Features Interactions
A Methodological Approach to Feature Interaction Problem

Dominique Cansell1,2 and Dominique Méry1,3

1 LORIA UMR 7503, BP 239, Campus Scientifique, 54506 Vandœuvre-lès-Nancy Cédex, France
2 Université de Metz, Ile du Saulcy, 57045 Metz Cédex, France
3 Université Henri Poincaré, Nancy 1, BP 239, Campus Scientifique, 54506 Vandœuvre-lès-Nancy Cédex, France
{cansell,mery}@loria.fr

Abstract. The feature interaction problem can be managed by the use of abstract models related by the refinement relationship. A service is incrementally built with respect to the requirements, and the combination of services is defined as an instance of the refinement relationship. We use the B method, and especially the event-based approach, and we show how features and services can be safely combined to obtain a sound model of combined services. Two refinements are defined following directions of refinement, and the refinement-as-composition principle is developed with the B event-based approach called B system. Although service composition is non-monotonic, B system provides a framework for analysing services and service composition.

1 Introduction

The feature interaction problem in telecommunications and software engineering reports observations of undesirable behaviours with respect to service requirements. Service requirements are stated either in a natural language or in a formal one, and services are considered as entities of trade between potential customers and providers. Since services are trade products, customers can require quality. Problems may appear when composing services; services may have different models according to the views of customers or providers. The formal modelling of a service is one of the objectives of this contribution, and the B method supports our incremental development of services by refining B abstract models. Our approach is based on the B event-based approach and allows us to tackle the problem of composition through the refinement of B abstract models. The refinement process is controlled by proof obligations and guarantees the preservation of safety properties of the currently developed service; it preserves abstract traces (or behaviours). The main idea is to analyse causes of interactions while refining abstract models into more concrete models; we illustrate the refinement-as-composition paradigm for services. The idea is clearly related to our first works [33,34,35] on using B for modelling services and to the works of J.R. Abrial [5,8,3,4,6,7] on the B event-based approach, called B system. The incremental development of complex systems is made easier by refinement, which appears to be a structuring mechanism for services. We have developed several services, namely CF, TCS, OCS, starting from an abstract model (the initial view), and we have incrementally refined different views of systems. Our refinement is carried out in two main directions:

– making abstract models more concrete: we add more and more details that are considered as hidden in more abstract models, and we call it the horizontal refinement.
– instantiating a new service on the current system: we add new variables and new events for the new service, and we call it the vertical refinement.

The instantiation of a new service was stated using the ⊕ operator (see Combes et al. [16,17]); ⊕ is a symbolic notation and we give it a meaning through refinement. We consider that a new service may add new noise and may lead to interferences [37]; the rely-guarantee paradigm [28] was developed to help in controlling interferences among concurrent processes. The assumption-commitment paradigm [1] provides a larger framework for composing specifications and systems, and we might formulate our methodology inside this framework; however, we exploit another direction of research on the composition of systems, namely the incremental refinement-as-composition principle. Our idea is guided by our previous experiments with B and TLA+ [22,23,26,24,33,36], where we have effectively developed case studies, because a tool [40] was available and usable and we were able to manage complex proofs of developments. Fundamentally, the interplay between refinement and the prover allows us to manage interactions and services.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 148–167, 2000. © Springer-Verlag Berlin Heidelberg 2000
Our work explores the use of B to master the complexity of the composition of features and services, and we obtain an interactive and technological framework. Classical approaches [29,32] to the feature interaction problem are mainly based on an analysis of formal models (state-based, temporal, algebraic, semi-formal, relational . . . ), and the validation of formal models is carried out after writing specifications or programs:

– Feature integration [38] using model checking for validation of feature properties.
– Incremental integration validation [19] using the synchronous model Lustre and a testing tool for validation of feature properties.
– Modular description of features and static detection of feature interaction [41,42], mainly based on an architectural approach based on LOTOS.
– State-based approach for defining and detecting feature interactions [24,33,44,21].
– Relational approach for defining and detecting feature interactions [20].
– Logical approach for defining and detecting feature interactions [12,11,22,24,25,10].

The refinement of an (abstract) formal model into a more concrete one is the key technique to validate in fine the final formal model and to get a better understanding of the sources of interactions. An interaction occurs when something bad or wrong happens. Something bad or wrong means that an invariant is violated or that new behaviours appear when composing services. Hence, our B models for services tell us that services satisfy invariants, and the B-event approach allows us to develop B models in an interactive and incremental way. B refinement controls possible interactions when a new service is added to the system, and allows us to communicate with the customer. The customer may be either the person who will buy the service or the person who will sell it. A B abstract model for a service plays the role of a contract between the customer and the provider. We do not use the B methodology in a classical way, but we use B for writing abstract systems. An abstract system is characterized by a set of events and an invariant over variables. The invariant states properties of variables. Hence, a B abstract model of a service is simply defined by a set of events, which model reactions with respect to the environment. The refinement of events may be carried out by strengthening guards or by adding new variables, following the principle of superposition [14].

The paper is structured as follows. Section 2 gives details on the feature interaction problem. Section 3 introduces abstract models for services using B and contains a model of the basic service. Section 4 illustrates the refinement-as-composition principle and shows how B is used in the service engineering process; three services are discussed and we limit our analysis to services of type A. Section 5 discusses technical questions related to the use of the tool and to the complexity of the designed abstract models, and concludes the document.

2 Feature Interaction Problem

The feature interaction problem is not one problem, but rather a multitude of problems defined by solutions or frameworks [29]. We have already addressed it using different approaches and frameworks, but we have got partial solutions helping us to understand problems related to composing features. According to literature on feature interaction questions, non determinism plays a very important role to trigger interactions, when two features are combined - the resulting system may have two different possible behaviors, if one presses a button. The new feature interaction problem raises problems, since interactions may be good or bad; the refinement allows one to give another way to classify interactions. A telecommunication service is a basic telecommunication service plus an arbitrary set of supplementary services. A supplementary service supplements a basic telecommunication service. Examples are Call Forwarding (CF), Call Waiting (CW). A service feature or more simply a feature is a unit of one or more elements, a network provides to a user and is defined by ITU-T as the smallest part of a service that can be perceived by the service user. We model a service or a feature, as a reactive system, namely, a system waiting for external events triggered either by the user, or by modeling system. A feature interaction refers to situations or states or configurations, where different service features or instances of the same services feature affect each other; when interactions occur between features of different services, one says that there is service interactions. When an interaction is undesirable, one says that there is an interference. But, it is quite difficult to decide, if an interaction is good or bad, since it is dependent on the point of view. There are interactions on billing that may be good for customers but bad for providers. Hence, we consider good and bad feature/service interactions; interactions emerge from different levels in a system. 
Following Combes et al. [16,17], we separate the logical level, the network level and the implementation level; this model matches the conceptual model underlying the Intelligent Network [29,43]. The logical level is used to model the user's view and corresponds to the service plane of the Intelligent Network. Our current contribution focuses on the logical level and on safety requirements. At the logical level, interactions may emerge, and we develop abstract models that help in studying services and combinations of services at that level. The question is not only to detect interactions but also to resolve them, to prevent them or to manage them. We conjecture that our approach may be used for all four purposes. It offers a general framework for understanding interactions relative to safety properties, and it allows us to write documentation for services.

3 Abstract Models for Services

An abstract model for a service is defined by a list of variables characterizing the state of the service, an invariant expressing properties satisfied by the variables, and a list of events handled by the service. The central idea is to avoid writing complex abstract models and to use refinement for adding details or features to the abstract model under development. Refinement allows one to control the validity of the current abstract model and to trace the decisions of the development. An abstract model for a service is also called an event system. We recall that such an event system models actions done by a user, triggered by a user, or done by the system. We illustrate our ideas on the basic call service, called BASE, plus services such as CF (Call Forwarding), OCS (Originating Call Screening) and TCS (Terminating Call Screening). We do not recall the B method and B definitions [2], but we use Atelier B [40] to validate every derived abstract model. Remember that the required proof obligations are only partially generated by Atelier B.

Let us recall that an abstract model for a service is a view with more or less detail. An effective service is a commercial product and its description is very simple. Our main contribution is the use of the invariant for stating safety requirements; the combination of two (or more) services is achieved by transformations over abstract models. The two main features of a telephone system are the calling feature and the receipt feature. Moreover, our model must be extensible and flexible. Hence, a telephone system provides a set of functionalities (calling, receiving, ringing, billing, ...), some user-managed and some system-managed.

We promote an interactive style for building abstract models of services and we assume that there are two actors:
– the customer, who is waiting for a formal model and who is an expert in services;
– the system-mate, who is the writer of abstract models and who has to communicate with the customer.

The contract between the customer and the system-mate is produced by a question-and-answer session: the system-mate asks his customer questions to get more hints on the system he is currently developing. He must be sure that his abstract models fit the informal requirements of the customer. The dialogue for requirements is an effective part of the development of abstract models. We give a sketch of the dialogue and of how formal statements are derived from the informal requirements; we will skip it in the sequel.


D. Cansell and D. Méry

The system-mate: What is a telephone system? What are the roles of a telephone system?
The customer: A telephone system is a system that allows people to communicate: creating a communication, halting a communication.

The dialogue uses natural language for exchanging ideas, but the system-mate uses a formal language for writing abstract models, also called abstract systems. Rather than defining a formal grammar, we prefer to use B abstract machines for analysing our abstract models or abstract systems. We use the same assumptions and notations for abstract systems as Abrial (see the short table in figure 1), but no fairness. Hence, we continue the dialogue between the customer and the system-mate. A call is a link between two persons and the variable CALLS denotes the current active calls:

    CALLS ∈ PERSONS ↔ PERSONS

    Name                    Syntax    Definition
    Binary relation         s ↔ t     P(s × t)
    Domain                  dom(r)    {a | a ∈ s ∧ ∃b.(b ∈ t ∧ a ↦ b ∈ r)}
    Codomain                ran(r)    dom(r⁻¹)
    Identity                id(s)     {x ↦ x | x ∈ s}
    Restriction             s ◁ r     id(s) ; r
    Co-restriction          r ▷ t     r ; id(t)
    Anti-restriction        s ⩤ r     (dom(r) − s) ◁ r
    Anti-co-restriction     r ⩥ t     r ▷ (ran(r) − t)
    Partial function        s ⇸ t     {r | r ∈ s ↔ t ∧ (r⁻¹ ; r) ⊆ id(t)}
    Partial into function   s ⤔ t     {f | f ∈ s ⇸ t ∧ f⁻¹ ∈ t ⇸ s}

Fig. 1. Notations for abstract systems
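As an illustration only, the notations of figure 1 can be mimicked on finite relations represented as sets of pairs. The following Python sketch is not part of the B development: the function names (inverse, compose, restrict, corestrict, anti_restrict, anti_corestrict, is_partial_function, is_partial_injection) are hypothetical names chosen for this example, and each definition simply transcribes the corresponding row of the table.

```python
# Finite relations are modelled as Python sets of (a, b) pairs.
# All function names are illustrative assumptions, not B tool APIs.

def dom(r):
    return {a for a, _ in r}

def ran(r):
    return {b for _, b in r}

def inverse(r):
    return {(b, a) for a, b in r}

def ident(s):
    return {(x, x) for x in s}

def compose(r1, r2):
    """Forward composition r1 ; r2."""
    return {(a, c) for a, b in r1 for b2, c in r2 if b == b2}

def restrict(s, r):
    """Restriction: id(s) ; r."""
    return compose(ident(s), r)

def corestrict(r, t):
    """Co-restriction: r ; id(t)."""
    return compose(r, ident(t))

def anti_restrict(s, r):
    """Anti-restriction: restrict r to dom(r) - s."""
    return restrict(dom(r) - s, r)

def anti_corestrict(r, t):
    """Anti-co-restriction: co-restrict r to ran(r) - t."""
    return corestrict(r, ran(r) - t)

def is_partial_function(r, t):
    """r is a partial function into t: (r^-1 ; r) is included in id(t)."""
    return compose(inverse(r), r) <= ident(t)

def is_partial_injection(r, s, t):
    """Partial into function: r and its inverse are both partial functions."""
    return is_partial_function(r, t) and is_partial_function(inverse(r), s)
```

For instance, with CALLS = {("jim", "jane"), ("ann", "bob")}, restrict({"jim"}, CALLS) keeps only the call made by jim, and is_partial_injection holds because nobody appears twice as a caller or twice as a callee.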

The first model BASE1 is an abstract system with two events.

The customer: Can you explain how your model is built?
The system-mate: My view of your system is the set of current calls (variable CALLS), and two events may occur to modify the calls. A call is a link between persons; the system evolves by modifying links between persons and hence by updating the variable CALLS.
The customer: I understand your model and I think that it is too abstract. A call is permitted when the system has authorized it.
The system-mate: But can the system authorize two persons to communicate if they are already communicating?
The customer: If Jim and Jane are communicating, a new call between Jim and Jane is not allowed. CALLS is a set, I agree.

A call is a pair of two persons; it is a very abstract view of the states of the persons and of what relates them. PERSONS is a non-empty set of persons, the set of potential customers. Now, the two identified events are defined as follows:


    Creating a call =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧ p1 ↦ p2 ∉ CALLS
      then
        CALLS := CALLS ∪ { p1 ↦ p2 }
      end;

    Halting a call =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧ p1 ↦ p2 ∈ CALLS
      then
        CALLS := CALLS − { p1 ↦ p2 }
      end;

Following J.-R. Abrial [7,5,8,3,4,6], we may add other events to the current set of events, and we have to check two laws:

Law 1: Our system remains "live"; deadlock freedom is a safety property of our system. Formally, we have to state that the disjunction of the guards of the events is always true. The current version of the tool does not generate proof obligations for checking deadlock freedom. Nevertheless, a hand simulation is possible and the prover is able to derive the desired proof. An extension of the proof obligation generator would allow one to discharge the required proofs with the prover. The following property is a safety property of every abstract model:

    ∃ e.(e ∈ Events ∧ guard(e))

The condition for the abstract model BASE1 reduces to:

    ∃(p1, p2).(p1, p2 ∈ PERSONS ∧ (p1 ↦ p2 ∉ CALLS ∨ p1 ↦ p2 ∈ CALLS))

Law 2: We assume fairness over events: no event can "take control" for ever. Fairness assumptions are critical when one wants to ensure that progress is eventually made; J.-R. Abrial and L. Mussat [4] develop an extension of the B method to take into account dynamic properties such as eventuality properties, and they introduce explicit counters for ensuring the fairness of the resulting systems. Their solution is very close to the proof rules of Apt and Olderog [9], where scheduling variables were used to keep only the fair traces of systems. We do not address the question of fairness in the current report and leave it for further research.

Now, we use rules for transforming abstract models into more concrete ones. Two main transformations are applied to event systems: adding a new event and modifying an event.
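The two events of BASE1 and the deadlock-freedom condition of Law 1 can be simulated on a small finite instance. The Python sketch below is an illustration under stated assumptions, not the Atelier B development: PERSONS is an assumed three-element set, CALLS is assumed to be initially empty, and the names creating_a_call, halting_a_call, deadlock_free and reachable_states are hypothetical. It explores every reachable value of CALLS and checks that some guard is enabled in each of them.

```python
from itertools import product

# Assumed toy carrier set; the paper only requires PERSONS to be non-empty.
PERSONS = ("jim", "jane", "bob")
PAIRS = list(product(PERSONS, repeat=2))

# Each event is a (guard, action) pair over the single variable CALLS.
def creating_guard(calls, p1, p2):
    return (p1, p2) not in calls

def creating_a_call(calls, p1, p2):
    return calls | {(p1, p2)}

def halting_guard(calls, p1, p2):
    return (p1, p2) in calls

def halting_a_call(calls, p1, p2):
    return calls - {(p1, p2)}

EVENTS = [(creating_guard, creating_a_call), (halting_guard, halting_a_call)]

def deadlock_free(calls):
    """Law 1: the disjunction of the guards holds for some parameters."""
    return any(guard(calls, p1, p2) for guard, _ in EVENTS for p1, p2 in PAIRS)

def reachable_states():
    """All values of CALLS reachable from the assumed initial state CALLS = {}."""
    init = frozenset()
    seen, frontier = {init}, [init]
    while frontier:
        calls = frontier.pop()
        for guard, action in EVENTS:
            for p1, p2 in PAIRS:
                if guard(calls, p1, p2):
                    nxt = action(calls, p1, p2)
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append(nxt)
    return seen
```

On this instance every subset of PERSONS × PERSONS is reachable, including sets containing a pair such as jim ↦ jim, which is consistent with the remark that self-calls are only excluded in later models.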
Remember that we have to preserve safety properties and the invariant when applying these transformations, whenever possible. Following the UNITY approach [14], we call them superpositions, since we superpose a new computation on an old one. When one applies a transformation rule, one has to check proof obligations stating the refinement relationship. If the new model introduces a new variable x, then a new event on x refines a skip statement: the new event was observed as a skip statement in the model being refined. Hence, no new proof obligation is required; when one strengthens a guard, new proof obligations are generated for stating the preservation of the invariant:

– [G1 ⇒ S1] ⊑ [G2 ⇒ S2], where G1 and G2 are two conditions over system variables satisfying G2 ⇒ G1, and S1 and S2 are two substitutions over system variables satisfying S1 ⊑ S2 (S2 refines S1);
– [S1 ⊑ S1 ‖ S2], where S1 and S2 are two substitutions over system variables which do not share variables.

We refine our current version of BASE, called BASE1, into BASE2. BASE2 improves the view of our system, since it makes clear that a call is possible when the switch has authorized it. We do not know how authorizations are assigned, but we use a variable to control and to record authorizations:

    SYSTEM AUTH ∈ PERSONS ↔ PERSONS

Jim is authorized to call Jane when Jim ↦ Jane is in SYSTEM AUTH. We require that, when a user is authorized to call somebody else, he/she loses the authorization at call creation:

    CALLS ∩ SYSTEM AUTH = {}

Two new events are appended to our current system:

    System Adding Auth =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧ p1 ↦ p2 ∉ CALLS
      then
        SYSTEM AUTH := SYSTEM AUTH ∪ { p1 ↦ p2 }
      end;

    System Removing Auth =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧ p1 ↦ p2 ∈ SYSTEM AUTH
      then
        SYSTEM AUTH := SYSTEM AUTH − { p1 ↦ p2 }
      end;

and we modify the event Creating a call by strengthening the guard and reinforcing the computation:

    Creating a call =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧ p1 ↦ p2 ∈ SYSTEM AUTH
      then
        CALLS := CALLS ∪ { p1 ↦ p2 } ‖
        SYSTEM AUTH := SYSTEM AUTH − { p1 ↦ p2 }
      end;
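The two refinement steps used for BASE2 (new events refining skip, and a strengthened guard) can be checked by brute force on a small instance. The sketch below is only an illustration under assumptions: PERSONS has two assumed elements, both variables are assumed to start empty, the Python names stand for the B events, and check_refinement is a hypothetical helper rather than what Atelier B does. It verifies, in every reachable BASE2 state, that the invariant CALLS ∩ SYSTEM AUTH = {} holds and that each step is either a BASE1 step with the same parameters or leaves CALLS unchanged.

```python
from itertools import product

PERSONS = ("jim", "jane")          # assumed toy carrier set
PAIRS = list(product(PERSONS, repeat=2))

# BASE1 (abstract): the state is CALLS; each event is (guard, action).
ABSTRACT = {
    "creating_a_call": (lambda calls, pr: pr not in calls,
                        lambda calls, pr: calls | {pr}),
    "halting_a_call": (lambda calls, pr: pr in calls,
                       lambda calls, pr: calls - {pr}),
}

# BASE2 (concrete): the state is (CALLS, SYSTEM_AUTH).
CONCRETE = {
    "system_adding_auth": (lambda c, a, pr: pr not in c,
                           lambda c, a, pr: (c, a | {pr})),
    "system_removing_auth": (lambda c, a, pr: pr in a,
                             lambda c, a, pr: (c, a - {pr})),
    "creating_a_call": (lambda c, a, pr: pr in a,
                        lambda c, a, pr: (c | {pr}, a - {pr})),
    "halting_a_call": (lambda c, a, pr: pr in c,
                       lambda c, a, pr: (c - {pr}, a)),
}

def check_refinement():
    """Explore BASE2; raise AssertionError on any violation, else return
    the number of reachable states."""
    init = (frozenset(), frozenset())
    seen, frontier = {init}, [init]
    while frontier:
        calls, auth = frontier.pop()
        assert not (calls & auth), "invariant CALLS ∩ SYSTEM_AUTH = {} broken"
        for name, (guard, action) in CONCRETE.items():
            for pr in PAIRS:
                if not guard(calls, auth, pr):
                    continue
                new_calls, new_auth = action(calls, auth, pr)
                if name in ABSTRACT:
                    # Strengthened guard: the abstract guard must also hold
                    # and the effect on CALLS must be the abstract one.
                    a_guard, a_action = ABSTRACT[name]
                    assert a_guard(calls, pr)
                    assert a_action(calls, pr) == new_calls
                else:
                    # New event: it must refine skip on the abstract variable.
                    assert new_calls == calls
                state = (new_calls, new_auth)
                if state not in seen:
                    seen.add(state)
                    frontier.append(state)
    return len(seen)
```

Such an exhaustive check only covers one finite instance; the proof obligations discharged with the prover cover all sets PERSONS.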


The new invariant is trivially checked. The new model is BASE2 and it refines BASE1:

    BASE1 → BASE2

The unfolding of abstract events and abstract traces continues. In BASE2, when A is calling B, first A has to be authorized to call B and secondly the authorization must be maintained. Removing an authorization may be due to a decision of the switch system, which controls the call process: billing problem, off-hook, line problem, ... We observe that an authorization is a reply to a user's request to call somebody. Hence, we define a new model BASE3. A new variable USER REQ AUTH TO SYSTEM records the fact that a user A wants to call B:

    USER REQ AUTH TO SYSTEM ∈ PERSONS ↔ PERSONS

New events are added and we modify the event which assigns an authorization, since we know that it replies to a request. Following the process, we add more and more events and we bring our abstract models very close to the required behaviour of classical calls. Four other models are incrementally defined and checked by Atelier B. We have to check the new proof obligations generated by Law 1, and we assume an abstract fairness to guarantee eventuality properties. Eventuality properties are out of the scope of the current paper. The final model BASE6 looks like a very concrete model of POTS:

    BASE1 → BASE2 → BASE3 → BASE4 → BASE5 → BASE6

We have to address the problem of services, features and possible interactions by refining the models BASE1, ..., BASE6. Event systems are useful for understanding how services work and what their safety properties are. Hence, the communication with the customer/user involves a simple way to write safety requirements: we can use the ASSERTIONS clause and hide the details of events. The invariant of BASE6 tells us that a user A cannot be in communication with himself/herself; the property is clearly accepted by the customer, but it was not true in the first abstract models.
We have strengthened the guard of operation SYSTEM AUTH by stating that p1 and p2 are distinct. Hence, the invariant has been incrementally defined and is in fact inductive. The invariant of BASE6 (see figure 4) expresses classical requirements of the basic phone system. Since CALLS and SYSTEM AUTH are two partial into mappings, p1 can communicate with only one person p2; USER REQ AUTH TO SYSTEM is a partial mapping, because p1 and p2 may want to call p3 concurrently.

    CALLS ∈ PERSONS ⤔ PERSONS ∧ dom(CALLS) ∩ ran(CALLS) = ∅ ∧
    SYSTEM AUTH ∈ PERSONS ⤔ PERSONS ∧
    USER REQ AUTH TO SYSTEM ∈ PERSONS ⇸ PERSONS

SYSTEM REMOVING AUTH is an event which models the following fact: a person wants to call somebody and, after a while, he/she decides to hang up. We have to remove the authorization, and this requirement was detected when we modelled the event of hanging up. Finally, any person can not call two persons at the same time, since the following safety property is derived from the invariant in the ASSERTIONS clause:

    CALLS ∩ id(PERSONS) = ∅

The refinement guarantees that no undesired behavior is possible. In our previous work on using B for services, we addressed state properties and we composed invariants, i.e. we took the conjunction of invariants. Using the incremental development of abstract models, we add more and more details to our abstract model, but we remain inside the possible behaviors or traces of the system. BASE6 is not yet the full concrete basic phone system, but it is already a very expressive model of it; billing features may be added if required, which is very simple.

    Creating a call =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧ p1 ↦ p2 ∈ SYSTEM AUTH
      then
        CALLS := CALLS ∪ { p1 ↦ p2 } ‖
        SYSTEM AUTH := SYSTEM AUTH − { p1 ↦ p2 }
      end;

    Halting a call =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧ p1 ↦ p2 ∈ CALLS
      then
        CALLS := CALLS − { p1 ↦ p2 }
      end;

    System Adding Auth =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧
        p1 ↦ p2 ∉ (SYSTEM AUTH ∪ CALLS) ∧
        p1 ↦ p2 ∈ USER REQ AUTH TO SYSTEM
      then
        SYSTEM AUTH := SYSTEM AUTH ∪ { p1 ↦ p2 } ‖
        USER REQ AUTH TO SYSTEM := USER REQ AUTH TO SYSTEM − { p1 ↦ p2 }
      end;

    System Refusing Auth =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧ p1 ↦ p2 ∈ USER REQ AUTH TO SYSTEM
      then
        USER REQ AUTH TO SYSTEM := USER REQ AUTH TO SYSTEM − { p1 ↦ p2 }
      end;

    System Removing Auth =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧ p1 ↦ p2 ∈ SYSTEM AUTH
      then
        SYSTEM AUTH := SYSTEM AUTH − { p1 ↦ p2 }
      end;

    User Requesting Call =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS
      then
        USER REQ AUTH TO SYSTEM := USER REQ AUTH TO SYSTEM ∪ { p1 ↦ p2 }
      end;

    User Halting Requesting Call =
      any p1, p2 where
        p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧ p1 ↦ p2 ∈ USER REQ AUTH TO SYSTEM
      then
        USER REQ AUTH TO SYSTEM := USER REQ AUTH TO SYSTEM − { p1 ↦ p2 }
      end;

Fig. 2. Events of the abstract system BASE3

    VARIABLES
      CALLS, SYSTEM AUTH, USER REQ AUTH TO SYSTEM
    INVARIANT
      CALLS ∈ PERSONS ↔ PERSONS ∧
      SYSTEM AUTH ∈ PERSONS ↔ PERSONS ∧
      USER REQ AUTH TO SYSTEM ∈ PERSONS ↔ PERSONS ∧
      CALLS ∩ SYSTEM AUTH = ∅
    INITIALISATION
      CALLS, SYSTEM AUTH, USER REQ AUTH TO SYSTEM := ∅, ∅, ∅

Fig. 3. Invariant of the abstract system BASE3

    INVARIANT
      CALLS ∈ PERSONS ⤔ PERSONS ∧
      dom(CALLS) ∩ ran(CALLS) = ∅ ∧
      SYSTEM AUTH ∈ PERSONS ⤔ PERSONS ∧
      USER REQ AUTH TO SYSTEM ∈ PERSONS ⇸ PERSONS ∧
      GOT DIAL TONE ⊆ PERSONS ∧
      SEND OFF TO SYSTEM ⊆ PERSONS ∧
      SEND ON TO SYSTEM ⊆ PERSONS ∧
      RINGING ⊆ PERSONS ∧
      OFF HOOK RINGING ⊆ PERSONS ∧
      POST CALLS ⊆ CALLS ∧
      CALLS ∩ SYSTEM AUTH = ∅ ∧
      SEND OFF TO SYSTEM ⊆ PERSONS ∧
      SEND OFF TO SYSTEM ∩ GOT DIAL TONE = ∅ ∧
      GOT DIAL TONE ∩ dom(CALLS) = ∅ ∧
      GOT DIAL TONE ∩ ran(CALLS) = ∅ ∧
      GOT DIAL TONE ∩ dom(USER REQ AUTH TO SYSTEM) = ∅ ∧
      GOT DIAL TONE ∩ dom(SYSTEM AUTH) = ∅ ∧
      GOT DIAL TONE ∩ ran(SYSTEM AUTH) = ∅ ∧
      GOT DIAL TONE ⊆ PERSONS ∧
      GOT BUSY TONE ⊆ PERSONS ∧
      GOT END TONE ⊆ PERSONS ∧
      SEND OFF TO SYSTEM ∩ dom(CALLS) = ∅ ∧
      SEND OFF TO SYSTEM ∩ dom(USER REQ AUTH TO SYSTEM) = ∅ ∧
      SEND OFF TO SYSTEM ∩ dom(SYSTEM AUTH) = ∅ ∧
      SEND OFF TO SYSTEM ∩ ran(CALLS) = ∅ ∧
      SEND OFF TO SYSTEM ∩ ran(SYSTEM AUTH) = ∅ ∧
      USER REQ AUTH TO SYSTEM ∩ CALLS = ∅ ∧
      SYSTEM AUTH ∩ CALLS = ∅ ∧
      USER REQ AUTH TO SYSTEM ∩ SYSTEM AUTH = ∅ ∧
      GOT DIAL TONE ∩ SEND OFF TO SYSTEM = ∅ ∧
      RINGING = ran(SYSTEM AUTH) ∧
      dom(SYSTEM AUTH) ∩ ran(SYSTEM AUTH) = ∅ ∧
      dom(CALLS) ∩ dom(SYSTEM AUTH) = ∅ ∧
      dom(CALLS) ∩ dom(USER REQ AUTH TO SYSTEM) = ∅ ∧
      dom(SYSTEM AUTH) ∩ dom(USER REQ AUTH TO SYSTEM) = ∅ ∧
      ran(SYSTEM AUTH) ∩ dom(CALLS) = ∅ ∧
      dom(SYSTEM AUTH) ∩ ran(CALLS) = ∅ ∧
      ran(SYSTEM AUTH) ∩ ran(CALLS) = ∅ ∧
      ran(CALLS) ∩ dom(USER REQ AUTH TO SYSTEM) = ∅ ∧
      ran(SYSTEM AUTH) ∩ dom(USER REQ AUTH TO SYSTEM) = ∅

Fig. 4. Invariant of the abstract system BASE6

4 The Refinement-As-Composition Principle

In the previous section, the BASE service was incrementally defined to obtain a concrete model. The composition of BASE with a new service, for instance CF (Call Forwarding), is often written BASE⊕CF. ⊕ is a multi-form operator and its semantics depends on the service being composed. Let us simply consider that BASE⊕CF is a refinement of BASE into a more complex system. Hence, the term BASE⊕CF is interpreted on the diagram we have given for relating the intermediate abstract models of BASE, namely BASE1, ..., BASE6. The composition of services supposes that we already have a basic system and several views of the same system. The main idea is to use refinement as a mechanism for composing features. In [17], Combes and Pickin use a symbolic notation for composition, i.e. ⊕, and the feature interaction problem is stated as follows:

    F1 ⊨ P1, ..., Fn ⊨ Pn, but F1 ⊕ ... ⊕ Fn ⊭ P1 ∧ ... ∧ Pn.

Unfortunately, the rule of composition is generally unsound without extra assumptions. The assumption-commitment paradigm [1] has explored the quest for compositionality, and the seminal work of C. Jones [28] has shown how interferences between programs may be controlled. Our solution leads us to compose abstract models of a service with other abstract models by refining the initial model. If AM = { AMi : i ∈ I } is a family of abstract models related by refinement, we combine AM with AM' by refining every AMi with respect to AM'. When considering our abstract model BASE, we have the following family:

    BASE1           BASE2           BASE3           BASE4           BASE5           BASE6
      ↓               ↓               ↓               ↓               ↓               ↓
    BASE1 ⊕ serv1   BASE2 ⊕ serv2   BASE3 ⊕ serv3   BASE4 ⊕ serv4   BASE5 ⊕ serv5   BASE6 ⊕ serv6

and we have to extend or restrict the functionalities of BASE by adding new events or by modifying existing events. The new service is composed with BASE along six directions. Let us explain our methodology for adding a new service. We clearly assume that our system will not manage every detail related to service management. Our modelling takes into account users/customers and the switch system. Hence, we consider abstract models AM1 and AM2 related by the refinement relationship:

    AM1 → AM2

We call it a horizontal refinement; it adds details to the abstract model AM1, as in the chain of BASE refinements. Now we add a new service by vertical refinement:

    AM1               →     AM2
     ↓                       ↓?
    AM1 ⊕ Service1    →?    AM2 ⊕ Service2

AM1 is an abstraction of AM2 and of AM1⊕Service1; however, the vertical direction is intended to model the composition of AM2 and Service2. The refinement of AM2 has to be proved in both directions. It is important to obtain both refinements to avoid interactions, and we will see later that sometimes one refinement, or both, are not provable. It means that an interaction is detected when a generated proof obligation is not provable. Two abstract systems AM1 and AM2 are related by the refinement denoted →, when abstract traces of AM2 contain abstract traces of AM1, i.e. under the hiding operator. We use a restricted version of refinement and we can prove that the two possible refinement steps (adding a new event refining skip, or strengthening a guard) guarantee the refinement. Let us recall that ⊑ denotes the refinement of events.

4.1 Modelling CF Call Forwarding

Call Forwarding allows the user to have incoming calls diverted to an alternative user/phone. Call Forwarding performs a translation, changing the called number into the actual destination line. Call Forwarding has a variety of forms and names, such as Call Diversion, Call Forwarding on Busy Line, Call Forwarding Don't Answer, Call Rerouting Distribution and Follow-me Diversion. Six refined models are defined from the six abstract models of BASE and we introduce the following variables:
– CF* is a partial function over PERSONS and it models the closure of the transfer function, when no cycle is detected. A cycle may exist when a person p1 transfers to p2, p2 transfers to p3, ..., pn−1 transfers to pn and pn transfers to p1. Usually, the transfer function is updated by the subscriber of CF, but we add a new event which models the updating of the call forward.
– CF SUBSCRIBER is the set of persons who have subscribed to Call Forwarding.

    INVARIANT
      CALLS ∈ PERSONS ↔ PERSONS ∧
      CF* ∈ PERSONS ⇸ PERSONS ∧
      CF SUBSCRIBER ⊆ PERSONS ∧
      ∀pp.(pp ∈ PERSONS − CF SUBSCRIBER ⇒ CF*(pp) = pp) ∧
      ∀pp.(pp ∈ ran(CF) ⇒ CF*(pp) = pp)

The refinement is proved by the tool Atelier B. The next step is to refine BASE2, and we obtain a refinement of BASE2 into BASE2⊕CF2 and a refinement of BASE1⊕CF1 into BASE2⊕CF2. So far, no interaction is detected. Now, we refine BASE3 into BASE3⊕CF3 and, after the completion of the proof process, a proof obligation remains unproved. An interaction is detected: when p calls q and q forwards to r, p will be authorized to call r, and the effect of Call Forwarding leads to an undesired behaviour of the call. However, BASE2⊕CF2 is refined into BASE3⊕CF3, and BASE3⊕CF3 is a concretization of BASE1 and BASE2. Following the process, we were able to refine BASE3⊕CF3 into BASE4⊕CF4, then BASE5⊕CF5 and finally BASE6⊕CF6. The unproved proof obligation, stated below, is required to ensure the refinement of BASE3 into BASE3⊕CF3 and is commented on after its statement.
    p1 ∈ PERSONS ∧ p2 ∈ PERSONS ∧ p2 ∈ dom(CF$1*) ∧
    not(p1 ↦ p2 ∈ CALLS$1) ∧
    not(p1 ↦ p2 ∈ SYSTEM AUTH$1) ∧
    not(p1 ↦ CF$1*(p2) ∈ CALLS$1) ∧
    not(p1 ↦ CF$1*(p2) ∈ SYSTEM AUTH$1) ∧
    p1 ↦ p2 ∈ USER REQ AUTH TO SYSTEM$1 ∧
    not(p2 = CF$1*(p2)) ∧
    "‘Check that the invariant (SYSTEM AUTH = SYSTEM AUTH$1 ∧ USER REQ AUTH TO SYSTEM = USER REQ AUTH TO SYSTEM$1) is preserved by the operation - ref 4.4, 5.5’"
    ⇒
    ∃(p1$0, p2$0).(
      p1$0 ∈ PERSONS ∧ p2$0 ∈ PERSONS ∧
      not(p1$0 ↦ p2$0 ∈ SYSTEM AUTH$1) ∧
      not(p1$0 ↦ p2$0 ∈ CALLS$1) ∧
      p1$0 ↦ p2$0 ∈ USER REQ AUTH TO SYSTEM$1 ∧
      SYSTEM AUTH$1 ∪ {p1 ↦ CF$1*(p2)} = SYSTEM AUTH$1 ∪ {p1$0 ↦ p2$0} ∧
      USER REQ AUTH TO SYSTEM$1 − {p1 ↦ p2} = USER REQ AUTH TO SYSTEM$1 − {p1$0 ↦ p2$0})
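The shape of this proof obligation can be replayed on a concrete scenario. The Python sketch below is an illustration under assumptions, not the Atelier B proof: the three persons p, q, r, the dictionary cf_star standing for the closure of the transfer function, and the function names abstract_adding_auth, concrete_adding_auth and abstract_witness are all hypothetical. The concrete event authorizes p1 to call the forwarded destination, and the search looks for abstract parameters (p1$0, p2$0) that would produce the same resulting state with the BASE3 event.

```python
from itertools import product

PERSONS = ("p", "q", "r")   # assumed toy carrier set

def abstract_adding_auth(calls, auth, req, p1, p2):
    """BASE3 System Adding Auth: returns the new (auth, req), or None
    when the guard is false."""
    pair = (p1, p2)
    if pair in req and pair not in auth and pair not in calls:
        return auth | {pair}, req - {pair}
    return None

def concrete_adding_auth(calls, auth, req, cf_star, p1, p2):
    """Assumed BASE3 ⊕ CF3 version, following the hypotheses of the proof
    obligation: the authorization is granted towards cf_star[p2]."""
    if p2 not in cf_star:
        return None
    pair, target = (p1, p2), (p1, cf_star[p2])
    if (pair in req and pair not in calls and pair not in auth
            and target not in calls and target not in auth):
        return auth | {target}, req - {pair}
    return None

def abstract_witness(calls, auth, req, concrete_result):
    """Search abstract parameters whose BASE3 step yields the same state."""
    if concrete_result is None:
        return None
    for q1, q2 in product(PERSONS, repeat=2):
        result = abstract_adding_auth(calls, auth, req, q1, q2)
        if result is not None and result == concrete_result:
            return (q1, q2)
    return None

# Scenario: p has requested a call to q, and q forwards to r.
calls, auth = frozenset(), frozenset()
req = frozenset({("p", "q")})
forwarding = {"p": "p", "q": "r", "r": "r"}
step = concrete_adding_auth(calls, auth, req, forwarding, "p", "q")
print(abstract_witness(calls, auth, req, step))
```

When q forwards to r, no abstract witness exists, which mirrors the unprovable obligation; with an identity forwarding function the witness is simply (p, q).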

When a variable x occurs in both machines (the abstract and the concrete ones), the variable x in the refinement is renamed into x$1 and an implicit gluing invariant (x = x$1) is added (∧) to the refinement invariant. Proving that

    USER REQ AUTH TO SYSTEM$1 − {p1 ↦ p2} = USER REQ AUTH TO SYSTEM$1 − {p1$0 ↦ p2$0}

under the assumption

    p1 ↦ p2 ∈ USER REQ AUTH TO SYSTEM$1

leads us to instantiate p1$0 by p1 and p2$0 by p2. Hence, we have to prove that

    SYSTEM AUTH$1 ∪ {p1 ↦ CF$1*(p2)} = SYSTEM AUTH$1 ∪ {p1$0 ↦ p2$0}

and that is not possible, because not(p2 = CF$1*(p2)) is among the assumptions. We have detected an interaction, because a behavior of the concrete system BASE3⊕CF3 is not a behavior of the abstract one, BASE3. This is why p1 will pay only for the communication between p1 and p2 and not for the one between p1 and CF$1*(p2). It is now clear that the refinement of BASEi into BASEi⊕CFi (i ≥ 3) was not possible, because an interaction was occurring. Figure 5 gathers the proved refinements and is annotated with the number of remaining unprovable proof obligations. The table of figure 6 gives details of the project status; we added new B machines, called vertical ones, to simulate the vertical refinement. It is only a technical solution for using Atelier B.

4.2 Modelling TCS Terminating Call Screening

Terminating Call Screening (TCS) prevents the reception of calls from certain telephone numbers or area codes. Somebody might wish to avoid calls from his/her boss. Following the instantiation of OCS, we use a variable TCSLIST such that p1 ↦ p2 is in TCSLIST when p1 does not want to receive calls from p2. The safety property is simply expressed as follows:

    INVARIANT
      CALLS ∈ PERSONS ↔ PERSONS ∧
      TCSLIST ∈ PERSONS ↔ PERSONS ∧
      CALLS ∩ (TCSLIST⁻¹) = ∅

No interaction problem is detected; the Creating a call operation is strengthened by the condition over TCSLIST (see figure 5). Classical interactions of TCS with CF were detected, because of a weak guard, but we have required that CF cannot forward to a person who has the caller on his TCS list.

4.3 Modelling OCS Originating Call Screening

Originating Call Screening (OCS) prevents calls to certain telephone numbers or area codes. My university might wish to stop PhD students from calling premium rate numbers. All outgoing calls, except emergency calls, may be disabled. Using our definition of calls, OCS adds a condition over calls and we use a variable OCSLIST. The safety property states that:

    INVARIANT
      CALLS ∈ PERSONS ↔ PERSONS ∧
      OCSLIST ∈ PERSONS ↔ PERSONS ∧
      CALLS ∩ OCSLIST = ∅

The basic service BASE is simply refined into BASE⊕OCS; no interaction is detected, since it is a restriction over guards (see figure 5).

4.4 Analysis of OCS and TCS

If one combines OCS and TCS, one gets that BASE⊕OCS⊕TCS is a refinement of BASE⊕TCS⊕OCS and BASE⊕TCS⊕OCS is a refinement of BASE⊕OCS⊕TCS. It means that both services commute with respect to safety properties.

    BASE1                         BASE1
      ↓                             ↓
    BASE1 ⊕ ocs1                  BASE1 ⊕ TCS1
      ↓                             ↓
    BASE1 ⊕ ocs1 ⊕ TCS1     ↔     BASE1 ⊕ TCS1 ⊕ ocs1

In fact, OCS and TCS are services which restrict the domain or codomain of calls, and this is expressed by strengthening guards; the strengthening is a transformation which refines abstract systems. The proof obligations are easily proved.

4.5 Combining CF, TCS, OCS

The three services are combined together; first of all, we analyse the abstract model BASE⊕CF⊕TCS⊕ocs. Problems of interaction are always detected at the same levels (3, 4, 5, 6) and the causes are clearly identical. TCS and OCS do not change the logic of the current service. Figure 5 summarizes the refinement framework obtained using Atelier B.
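The observation that OCS and TCS only strengthen guards, and therefore commute (Section 4.4), can be illustrated on a small instance. The sketch below rests on assumptions: PERSONS is an assumed three-element set, the guard of Creating a call is reduced to membership in SYSTEM AUTH, and with_ocs, with_tcs and commute are hypothetical helper names. A pair p1 ↦ p2 in OCSLIST forbids p1 from calling p2, and a pair p2 ↦ p1 in TCSLIST means that p2 refuses calls from p1, following the two invariants given above.

```python
from itertools import combinations, product

PERSONS = ("ann", "bob", "eve")   # assumed toy carrier set
PAIRS = list(product(PERSONS, repeat=2))

def base_guard(auth, p1, p2):
    """Guard of Creating a call, reduced to p1 ↦ p2 ∈ SYSTEM_AUTH."""
    return (p1, p2) in auth

def with_ocs(guard, ocslist):
    """Strengthen a guard with OCS: p1 ↦ p2 must not be in OCSLIST."""
    def strengthened(auth, p1, p2):
        return guard(auth, p1, p2) and (p1, p2) not in ocslist
    return strengthened

def with_tcs(guard, tcslist):
    """Strengthen a guard with TCS: p2 ↦ p1 must not be in TCSLIST."""
    def strengthened(auth, p1, p2):
        return guard(auth, p1, p2) and (p2, p1) not in tcslist
    return strengthened

def all_auth_states():
    """Every possible value of SYSTEM_AUTH over the toy carrier set."""
    for k in range(len(PAIRS) + 1):
        for subset in combinations(PAIRS, k):
            yield frozenset(subset)

def commute(ocslist, tcslist):
    """Check that both composition orders give the same guard, that the
    guard is stronger than the basic one, and that any call it allows
    respects both screening invariants."""
    ocs_then_tcs = with_tcs(with_ocs(base_guard, ocslist), tcslist)
    tcs_then_ocs = with_ocs(with_tcs(base_guard, tcslist), ocslist)
    for auth in all_auth_states():
        for p1, p2 in PAIRS:
            allowed = ocs_then_tcs(auth, p1, p2)
            if allowed != tcs_then_ocs(auth, p1, p2):
                return False
            if allowed and not base_guard(auth, p1, p2):
                return False
            if allowed and ((p1, p2) in ocslist or (p2, p1) in tcslist):
                return False
    return True
```

The check succeeds for any choice of lists because both strengthenings are conjunctions added to the same guard; Call Forwarding is different, since it changes the substitution and not only the guard.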

    BASE1                       →   BASE2                       →   BASE3                       →
      ↓                               ↓                               ↓ (1)
    BASE1 ⊕ CF1                 →   BASE2 ⊕ CF2                 →   BASE3 ⊕ CF3                 →
      ↓                               ↓                               ↓
    BASE1 ⊕ CF1 ⊕ TCS1              BASE2 ⊕ CF2 ⊕ TCS2              BASE3 ⊕ CF3 ⊕ TCS3
      ↓                               ↓                               ↓
    BASE1 ⊕ CF1 ⊕ TCS1 ⊕ ocs1       BASE2 ⊕ CF2 ⊕ TCS2 ⊕ ocs2       BASE3 ⊕ CF3 ⊕ TCS3 ⊕ ocs3

    BASE4                       →   BASE5                       →   BASE6
      ↓ (1)                           ↓ (1)                           ↓ (1)
    BASE4 ⊕ CF4                 →   BASE5 ⊕ CF5                 →   BASE6 ⊕ CF6
      ↓                               ↓                               ↓
    BASE4 ⊕ CF4 ⊕ TCS4              BASE5 ⊕ CF5 ⊕ TCS5              BASE6 ⊕ CF6 ⊕ TCS6
      ↓                               ↓                               ↓
    BASE4 ⊕ CF4 ⊕ TCS4 ⊕ ocs4       BASE5 ⊕ CF5 ⊕ TCS5 ⊕ ocs5       BASE6 ⊕ CF6 ⊕ TCS6 ⊕ ocs6

Fig. 5. Refinement Diagram

5 Concluding Remarks and Future Works

The paper sketches a work which is promising for the management of telecommunications services. The characterization of a feature interaction is based on the unprovability of one or more proof obligations; the incremental process for writing an abstract model makes its validation easier. The characterization of a service by safety properties is not new, but we have outlined a methodology which allows one to discover the invariant of the system under development; the question of stating the invariant was one of the negative points of our previous works [33,34,35]. Another point was the problem of validating the abstract models of services, and incrementality tackles this aspect. We understand that using a theorem prover is more difficult than using a model checker (basically a click-and-play technology), but a model checker allows us to check final abstract models and to debug formal specifications and programs; a model checker requires finite models of systems. The combination of both approaches would be beneficial, but the B approach integrates the specifier into the process of understanding what the customer really wants; when a model checker tells us that a bug has been discovered in a complex formal specification, it gives a hint for understanding where the bug is (scenarios are generated), but it is too late and a rewriting of the abstract model has to be carried out.

Our approach for services is new with respect to the integration of semantics into the development process by proving proof obligations. Refinement diagrams are not clearly formalized, but they provide hints on the status of refinements. We annotate refinement arrows with the number of proof obligations which remain unproved and which are unprovable. The project status (figure 6) gives information on the number of generated proof obligations, and we provide the categorization of proof procedures used for getting proved proof obligations. The performance of the prover was quite good, but we had to find some tricky proofs; we noticed that identical proof obligations were generated throughout the refinement process and we had to re-apply the same proof techniques with the prover. We have not taken into account all details related to service modelling and to architecture-related aspects. The B event-based approach is under development and is close to work on TLA/TLA+ [31] and on action systems [39]; further research is needed for exploring structuring mechanisms and methodological issues. Our abstract models for BASE do not mention billing facilities, nor variables and events for controlling the costs of calls, and it is very useful to model costs when services related to billing are used. Hence, we have to refine our basic model beyond BASE6 to manage billing and billing-related services such as RB (Reverse Billing).

    COMPONENT            TC   POG    Obv    nPO   nUn   % Pr
    BASE1                OK   OK      24      3     0    100
    BASE1CF1             OK   OK     104      8     0    100
    BASE1CF1TCS1         OK   OK     153     11     0    100
    BASE1CF1TCS1ocs1     OK   OK     190     14     0    100
    BASE1ocs1            OK   OK      75      8     0    100
    BASE1ocs1TCS1        OK   OK     113     11     0    100
    BASE1TCS1            OK   OK      75      8     0    100
    BASE1TCS1ocs1        OK   OK     113     11     0    100
    BASE1TCS1ocs1CF1     OK   OK     180     14     0    100
    BASE2                OK   OK      72     12     0    100
    BASE2CF2             OK   OK     173     34     0    100
    BASE2CF2TCS2         OK   OK     242     38     0    100
    BASE2CF2TCS2ocs2     OK   OK     292     45     0    100
    BASE2TCS2            OK   OK     122     17     0    100
    BASE2TCS2ocs2        OK   OK     172     22     0    100
    BASE2TCS2ocs2CF2     OK   OK     292     45     0    100
    BASE3                OK   OK      93     18     0    100
    BASE3CF3             OK   OK     233     37     1     97
    BASE3CF3TCS3         OK   OK     283     44     0    100
    BASE3CF3TCS3ocs3     OK   OK     342     56     0    100
    BASE3TCS3            OK   OK     131     21     0    100
    BASE3TCS3ocs3        OK   OK     168     26     0    100
    BASE3TCS3ocs3CF3     OK   OK     342     56     0    100
    BASE4                OK   OK     219     81     0    100
    BASE4CF4             OK   OK     280     92     0    100
    BASE4CF4TCS4         OK   OK     347    102     0    100
    BASE5                OK   OK     326     89     0    100
    BASE5CF5             OK   OK     487    116     0    100
    BASE5CF5TCS5         OK   OK     558    125     0    100
    BASE6                OK   OK     552    174     0    100
    BASE6CF6             OK   OK     718    214     0    100
    vBASE2CF2            OK   OK     192     31     0    100
    vBASE2CF2TCS2        OK   OK     255     41     0    100
    vBASE2TCS2           OK   OK     118     19     0    100
    vBASE2TCS2ocs2       OK   OK     168     24     0    100
    vBASE2TCS2ocs2CF2    OK   OK     275     48     0    100
    vBASE3CF3            OK   OK     194     40     0    100
    vBASE3CF3TCS3        OK   OK     275     47     0    100
    vBASE3TCS3           OK   OK     136     26     0    100
    vBASE3TCS3ocs3       OK   OK     174     29     0    100
    vBASE3TCS3ocs3CF3    OK   OK     305     59     1     98
    vBASE4CF4            OK   OK     354    111     1     99
    vBASE5CF5            OK   OK     476    118     1     99
    vBASE6CF6            OK   OK     699    215     1     99
    TOTAL                OK   OK   10346   2215     5     99

Fig. 6. Project status

References

1. M. Abadi and L. Lamport. Composing specifications. Transactions On Programming Languages and Systems, 15(1):73–132, January 1993.
2. J.-R. Abrial. The B book - Assigning Programs to Meanings. Cambridge University Press, 1996.
3. J.-R. Abrial. Extending B without changing it (for developing distributed systems). In H. Habrias, editor, 1st Conference on the B method, pages 169–190, November 1996.
4. J.-R. Abrial and L. Mussat. Introducing dynamic constraints in B. In D. Bert, editor, B’98: Recent Advances in the Development and Use of the B Method, volume 1393 of Lecture Notes in Computer Science. Springer-Verlag, 1998.
5. J.-R. Abrial. Cryptographic protocol specification and design. Steria Meeting on Protocols, May 1997.
6. J.-R. Abrial. Development of the abr protocol. ps file, February 1999.
7. J.-R. Abrial. Event-driven sequential programs. ps file, March 2000.
8. J.-R. Abrial and L. Mussat. Specification and design of a transmission protocol by successive refinements using B. Steria Meeting on Protocols, May 1997.
9. K. R. Apt and E. R. Olderog. Proof rules and transformations dealing with fairness. Science of Computer Programming, 3:65–100, 1983.
10. C. Areces, W. Bouma, and M. de Rijke. Description logics and feature interaction. Technical report, KPN Research, 1999.
11. J. Blom, B. Jonsson, and L. Kempe. Automatic detection of feature interactions in temporal logic. In K. E. Cheng and T. Ohta, editors, Feature Interactions in Telecommunications Systems, pages 1–19. IOS Press, 1996. In [15].
12. J. Blom, B. Jonsson, and L. Kempe. Using temporal logic for modular specification of telephone services. In L. G. Bouma and H. Velthuijsen, editors, Feature Interactions in Telecommunications Systems, pages 197–216. IOS Press, 1994.
13. L. G. Bouma and H. Velthuijsen, editors. Feature Interactions in Telecommunications Systems. IOS Press, 1994.

14. K. M. Chandy and J. Misra. Parallel Program Design: A Foundation. Addison-Wesley Publishing Company, 1988. ISBN 0-201-05866-9.
15. K. E. Cheng and T. Ohta, editors. Feature Interactions in Telecommunications Systems. IOS Press, 1996.
16. P. Combes, M. Michel, and B. Renard. Formalisation verification of telecommunications service interactions using sfl methods and tools. In 6th SDL Forum, 1993.
17. P. Combes and S. Pickin. Formalisation of a user view of network and services for feature interaction detection. In L. G. Bouma and H. Velthuijsen, editors, Feature Interactions in Telecommunications Software System, pages 120–135. IOS Press, 1994. In [13].
18. P. Dini, R. Boutaba, and L. Logrippo, editors. Feature Interactions in Telecommunications Networks IV, Montreal, 1997. IOS Press.
19. L. du Bousquet, F. Ouebdessalam, J.-L. Richier, and N. Zuanon. Incremental Feature Validation: A Synchronous Point of View. In K. Kimbler and W. Bouma, editors, Feature Interaction Workshop. IOS Press, 1998. In [30].
20. M. Frappier, A. Mili, and J. Desharnais. Detecting Feature Interaction on Relational Specifications. In P. Dini, R. Boutaba, and L. Logrippo, editors, Feature Interaction Workshop. IOS Press, 1997. In [18].
21. A. Gammelgaard and J. E. Kristensen. Interaction detection, a logical approach. In L. G. Bouma and H. Velthuijsen, editors, Feature Interactions in Telecommunications Systems, pages 178–196. IOS Press, 1994.
22. J.-P. Gibson, G. Hamilton, and D. Méry. Integration problems in telephone feature requirements. In A. Galloway and K. Taguchi, editors, IFM’99 Integrated Formal Methods 1999, Workshop In Computing Science, York, June 1999. Springer Verlag.
23. J.-P. Gibson, G. Hamilton, and D. Méry. A taxonomy for triggered interactions using fair objects semantics. In Muffy Calder and Evan Magill, editors, FIW’00 Sixth International Workshop on Feature Interactions in Telecommunications and Software Systems, Glasgow, Scotland, United Kingdom, 17th - 19th May 2000.
24. J.-P. Gibson, B. Mermet, and D. Méry. Feature interactions: A mixed semantic model approach. In Gerard O’Regan and Sharon Flynn, editors, 1st Irish Workshop on Formal Methods, Dublin, Ireland, July 1997. Irish Formal Methods Special Interest Group (IFMSIG), Springer Verlag. http://ewic.springer.co.uk/.
25. J.-P. Gibson and D. Méry. Telephone feature verification: Translating SDL to TLA+. In Eighth SDL Forum Evolving methods. North-Holland, 1997. Evry, France, 22-26 September 1997.
26. P. Gibson and D. Méry. Formal modelling of services for getting a better understanding of the feature interaction problem - multi-view approach. In PSI’99, Andrei Ershov Third International Conference, Perspectives of System Informatics, Lecture Notes in Computer Science, page 25, Novosibirsk, Akademgorodok, Russia, 6 - 9 July 1999. Springer Verlag.
27. IEEE, editor. Special Section Managing Feature Interactions in Telecommunications Software Systems, volume 24. IEEE Computer Society, October 1998.
28. C. B. Jones. Tentative steps towards a development method for interfering programs. Transactions On Programming Languages and Systems, 5(4):576–619, 1983.
29. D. O. Keck and P. J. Kuehn. The feature and service interaction problem in telecommunications systems: A survey. IEEE Transactions on Software Engineering, 24(10):779–796, October 1998. In [27].
30. K. Kimbler and L. G. Bouma, editors. Feature Interactions in Telecommunications and Software Systems V, Lund, 1998. IOS Press.
31. L. Lamport. A temporal logic of actions. Transactions On Programming Languages and Systems, 16(3):872–923, May 1994.

32. F. J. Lin, H. Liu, and A. Ghosh. A Methodology for Feature Interaction Detection in the AIN 0.1 Framework. IEEE Transactions on Software Engineering, 24(10):797–817, October 1998. In [27].
33. B. Mermet and D. Méry. Incremental specification of telecommunication services. In M. Hinchey, editor, First IEEE International Conference on Formal Engineering Methods (ICFEM), Hiroshima, November 1997. IEEE.
34. B. Mermet and D. Méry. Safe combinations of services using B. In John McDermid, editor, SAFECOMP97 The 16th International Conference on Computer Safety, Reliability and Security, York, September 1997. Springer Verlag.
35. B. Mermet and D. Méry. Service specifications to B, or not to B. In Mark Ardis, editor, Second Workshop on Formal Methods in Software Practice, Clearwater Beach, Florida, March 4-5 1998. ACM Press.
36. D. Méry. Requirements for a temporal B: Assigning Temporal Meaning to Abstract Machines ... and to Abstract Systems. In A. Galloway and K. Taguchi, editors, IFM’99 Integrated Formal Methods 1999, Workshop In Computing Science, York, June 1999.
37. S. Owicki and D. Gries. An axiomatic proof technique for parallel programs I. Acta Informatica, 6:319–340, 1976.
38. M. Plath and M. Ryan. Plug-and-play features. In K. Kimbler and W. Bouma, editors, Feature Interaction Workshop. IOS Press, 1998. In [30].
39. E. Sekerinski and K. Sere, editors. Program Development by Refinement. Springer, 1999.
40. STERIA - Technologies de l’Information, Aix-en-Provence (F). Atelier B, Manuel Utilisateur, 1998. Version 3.5.
41. K. Turner. Validating Architectural Feature Descriptions using LOTOS. In K. Kimbler and W. Bouma, editors, Feature Interaction Workshop. IOS Press, 1998. In [30].
42. Kenneth J. Turner. Relating architecture and specification. Computer Networks and ISDN Systems, 1997. http://www.cs.stir.ac.uk/~kjt/research/publications.html.
43. Union Internationale des Télécommunications. Réseau intelligent - introduction à l’ensemble de capacités 1 du réseau intelligent. Technical Report UIT-T Q.1211, Union Internationale des Télécommunications, October 1993. Réseau Intelligent.
44. T. Yoneda and T. Ohta. A Formal Approach for Definition and Detection of Feature Interaction. In K. Kimbler and W. Bouma, editors, Feature Interaction Workshop. IOS Press, 1998. In [30].
45. P. Zave. Feature interactions and formal specifications in telecommunications. Computer, August 1993.

A Formal Architecture for the 3APL Agent Programming Language

Mark d’Inverno¹, Koen Hindriks², and Michael Luck³

¹ Cavendish School of Computer Science, 115 New Cavendish Street, University of Westminster, London W1M 8JS, UK, [email protected]
² Dept. of Computer Science, Universiteit Utrecht, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands, [email protected]
³ Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK, [email protected]

Abstract. The notion of agents has provided a way of imbuing traditional computing systems with an extra degree of flexibility that allows them to be more resilient and robust in the face of more varied and unpredictable forms of interaction. One class of agents, typically called intelligent agents, represent their world symbolically according to their beliefs, have goals which need to be achieved, and adopt plans or intentions to achieve them. Now, one approach to building agents is to design a programming language whose semantics are based on some theory of rational or intentional agency and to program the desired behaviour of individual agents directly using mental attitudes. Such a technique is referred to as agent oriented programming. Arguably, the most innovative of these languages is 3APL (pronounced “triple-a-p-l”) which supports the construction of intelligent agents for the development of complex systems through a set of intuitive concepts like beliefs, goals and plans. In this paper, we provide a Z specification of the programming language 3APL which provides a basis for implementation and also adds to a growing library of agent techniques and features.

1 Introduction

Recently, there has been an explosion of interest in agent-based systems and the related subfield of distributed artificial intelligence (DAI). The focus of much agent-based work is on building architectures for intelligent agents, providing information about essential data structures, relationships between these data structures, the processes or functions that operate on them and the operation or execution cycle of an agent. Deliberative Agent Systems symbolically model their environment and manipulate these symbols in order to act. In order to model rational or intentional agency, an abstraction level is chosen for the symbols such that they represent mental attitudes. Most agent systems include a deliberative architecture to support deliberative reasoning at the mental-attitude level.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 168–187, 2000. © Springer-Verlag Berlin Heidelberg 2000

Mental attitudes used to describe and characterise the behaviour of agents include beliefs, goals, assumptions, desires, knowledge, plans, motivations and intentions, and are commonly grouped into three categories, informative, motivational and deliberative [9]. The first refers to that which a system considers to be true about the world and includes knowledge, beliefs and assumptions, the second to the ‘wants’ of a system including goals, desires and motivations, and the third concerns how an agent’s behaviour is directed and includes plans and intentions. The distinction between the second and third categories is subtle since it is possible that a system may desire a certain state without planning for it, or intending it to happen. There are several compelling reasons why agents defined using mental attitudes might be useful. First, if an agent can be described in terms of what it knows, what it wants and what it intends then, since it is modelled on familiar concepts, it becomes possible for users to understand and predict its behaviour. Second, understanding the relationship between these different attitudes and how they affect behaviour could provide the control mechanism for ‘intelligent action’ in general. Third, computational agents designed in this way may be able to interpret the behaviour of others independently of any implementation. Rather than defining an architecture, agent oriented programming is a paradigm for directly programming the behaviour of agents using computational languages whose semantics capture some theory of rational agency [14]. Typically agents have an initial set of beliefs, goals and plans and an interpreter that details how agents should achieve their goals given an environmental context. 

1.1 The 3APL Programming Language

One such agent programming language is 3APL which supports the design and construction of intelligent agents for the development of complex systems through a set of intuitive concepts like beliefs, goals and plans. In turn, these can be used to describe and understand the computational system in a natural way. Indeed, applications such as personal assistants [12] are naturally seen as agents that act on behalf of their users and in pursuit of user goals, using these concepts. 3APL supports this style of programming by means of an expressive set of primitives to program agents, which consist of sets of beliefs, goals and practical reasoning rules. Beliefs represent the issues the agent must deal with, while goals allow the agent both to focus on what it must achieve and to represent the way in which it can achieve it. In 3APL, goals are thus used to represent achievement goals and as plans. The practical reasoning rules provide the agent with planning capabilities to find an appropriate plan to achieve a goal, capabilities to create new goals to deal with a particular situation, and capabilities to use the rules to revise a plan. The architecture for 3APL [8] is based on the think-act cycle, which is divided into two parts. The first part corresponds to a phase of practical reasoning by using practical reasoning rules, and the second corresponds to an execution phase in which the agent performs some action. Originally, the operational semantics of 3APL was specified by means of Plotkin-style transition semantics [7]. In this work, we provide a re-specification of 3APL in Z, which has a number of benefits. First, it helps to get closer to a good implementation of 3APL, because of the tools available for Z which support type-checking, animation, and so on. By specifying 3APL, we can provide a computational model that includes data structures, operation and architecture, thereby isolating the data-types for an efficient implementation of 3APL. Second, the process of re-specification provides a different perspective, highlighting different aspects of the language and architecture that are not manifested in a similar way in the transition style semantics. In carrying out this work, we aim to provide a clearer analysis and insight into agent languages and architectures, and add to a growing library (written in Z) of desirable and reusable agent features. The next section introduces the basic types used to build the 3APL model, comprising beliefs, actions, goals and practical reasoning rules. Then we define 3APL agents and finally we describe their operation.

2 3APL Types

Beliefs and goals are the basic types of expressions in 3APL from which rule expressions are derived. In this specification, beliefs are a subset of first order formulae (though in principle any knowledge representation language could be used), and this first order language is defined in the usual way. First order terms are defined by means of given sets of constants, first order variables, and function symbols. Since the programming language distinguishes between first order variables and variables that range over goals, we first define a partition of the set of variables, and use FOVar to denote the set of first order variables and GVar to denote the set of goal variables.

    [Const, Var, FuncSym]

    FOVar : ℙ Var
    GVar : ℙ Var
    ────────────
    FOVar ∩ GVar = ∅
    FOVar ∪ GVar = Var

The sets of all constants and function symbols are respectively denoted as [Const] and [FuncSym]. In terms of this specification, the contents of these sets are unimportant, and we use them directly without further elaboration. A first order term is either a constant, a first order variable, or a function symbol with a non-empty sequence of terms as a parameter. The auxiliary function fovars returns the set of all first order variables in a (first order) term.

    FOTerm ::= const⟨⟨Const⟩⟩ | var⟨⟨FOVar⟩⟩ | functor⟨⟨FuncSym × seq₁ FOTerm⟩⟩

    fovars : FOTerm → (ℙ FOVar)
    ────────────
    ∀ c : Const; v : FOVar; f : FuncSym; ts : seq FOTerm •
      fovars (const c) = ∅ ∧
      fovars (var v) = {v} ∧
      fovars (functor (f, ts)) = ⋃ {t : FOTerm | t ∈ (ran ts) • fovars t}
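As an illustration only, the free type FOTerm and the function fovars might be rendered in Python roughly as follows. The class names Const, Var and Functor, and the choice of strings for symbols, are assumptions made for this sketch and are not part of the Z specification.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Const:
    name: str  # an element of the given set Const


@dataclass(frozen=True)
class Var:
    name: str  # a first order variable, i.e. an element of FOVar


@dataclass(frozen=True)
class Functor:
    symbol: str  # an element of the given set FuncSym
    args: Tuple["FOTerm", ...]

    def __post_init__(self):
        # seq1 in the free type: the argument sequence must be non-empty
        if not self.args:
            raise ValueError("a functor needs at least one argument")


FOTerm = Union[Const, Var, Functor]


def fovars(term: FOTerm) -> FrozenSet[str]:
    """Return the names of all first order variables occurring in a term."""
    if isinstance(term, Const):
        return frozenset()
    if isinstance(term, Var):
        return frozenset({term.name})
    if isinstance(term, Functor):
        # generalised union over the variables of every argument
        return frozenset().union(*(fovars(t) for t in term.args))
    raise TypeError(f"not a first order term: {term!r}")


# Hypothetical example term: f(X, g(a, Y))
example = Functor("f", (Var("X"), Functor("g", (Const("a"), Var("Y")))))
```

Here fovars(example) collects X and Y, mirroring the three cases of the axiomatic definition. Frozen dataclasses are used so that terms are immutable values, as they are in Z.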

2.1 Beliefs

Beliefs are defined by building types from the above primitives. The set of all predicate symbols is denoted by [PredSym], and a belief atom is a predicate symbol with a (possibly empty) sequence of terms as its argument. Beliefs are then either an atom, the negation of an atom, the conjunction of two beliefs, or the implication of one belief by some other belief.

    [PredSym]

    Atom
      head : PredSym
      terms : seq FOTerm

    Belief ::= pos⟨⟨Atom⟩⟩ | not⟨⟨Atom⟩⟩ | and⟨⟨Belief × Belief⟩⟩ | imply⟨⟨Belief × Belief⟩⟩ | false | true

The running example in this paper concerns a personal assistant for scheduling meetings and other activities, which acts on behalf of its user, monitors the need to schedule activities, and helps the user to find appropriate slots. The agent might also provide its user with information concerning the appropriate means of transportation to go to the location of a scheduled meeting. Part of the knowledge representation language of a personal assistant for scheduling (PAS) consists of predicates to represent, for example, the agenda of the user and the means of transportation to get from A to B. In particular, the predicate agenda represents the user’s agenda, with five arguments:

    agenda(Activity, Time, Duration, People, Loc)

Each variable, respectively, represents the activity scheduled (such as meetings, lunch, etc), the date and time, the duration of the activity, the set of people involved, and the location. For example, the PAS may have the following belief concerning the user’s agenda, indicating that a one-hour meeting is scheduled for May 5th at 12:00 with John and Peter in Utrecht:

    pos agenda(meeting, may5th12:00, 60min, {john, peter}, utrecht)

Other predicates include free(Time, Length), indicating that there is a free slot of length Length at time Time, and transport(Means, FromLoc, ToLoc, Time, DurTrans), indicating that a possible means of moving from FromLoc to ToLoc at (date and) time Time is Means, taking DurTrans time. Finally, location(Loc, Time) keeps track of the location of the user at a particular time, which is assumed to be persistent. That is, without information to the contrary, the location of the user at a time T′ after time T will be assumed to be the same as that at time T.

2.2 Actions

In order to achieve goals or accomplish tasks, an agent must perform actions, represented by action symbols specified in the same way as atoms.
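Since actions are specified in the same way as atoms, atoms, beliefs and actions can share one small illustrative sketch. The Python below is only an approximation under stated assumptions: terms are simplified to plain strings, the class and field names (Pos, Not, And, Imply, Truth, left, right) are hypothetical, and the action name ins_agenda is written with an underscore on the assumption that this is the intended spelling of the action symbol.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Term = str  # simplification: first order terms are kept as plain strings


@dataclass(frozen=True)
class Atom:
    head: str                     # predicate symbol
    terms: Tuple[Term, ...] = ()  # possibly empty sequence of terms


@dataclass(frozen=True)
class Pos:
    atom: Atom


@dataclass(frozen=True)
class Not:
    atom: Atom


@dataclass(frozen=True)
class And:
    left: "Belief"
    right: "Belief"


@dataclass(frozen=True)
class Imply:
    left: "Belief"
    right: "Belief"


@dataclass(frozen=True)
class Truth:
    value: bool  # the two constant beliefs, true and false


Belief = Union[Pos, Not, And, Imply, Truth]


@dataclass(frozen=True)
class Action:
    name: str                     # action symbol
    terms: Tuple[Term, ...] = ()  # same shape as an atom's arguments


# The agenda belief from the running example.
meeting = Pos(Atom("agenda", ("meeting", "may5th12:00", "60min",
                              "{john, peter}", "utrecht")))

# An action with the same five arguments as the agenda predicate.
insert_meeting = Action("ins_agenda", meeting.atom.terms)
```

Keeping Atom and Action structurally identical reflects the remark that action symbols are specified in the same way as atoms; only the symbol set differs.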

    [ActionSym]

    Action
      name : ActionSym
      terms : seq FOTerm

A basic action used by the PAS agent is the action ins_agenda for inserting items in the belief base. The action has five associated arguments matching the arguments of the predicate agenda, and is used to insert a particular item in the agenda. Specifically, ins_agenda(Activity, Time, Length, People, Loc) inserts the corresponding agenda predicate into the belief base of the PAS. For example, ins_agenda(meeting, may5th12:00, 60min, [john, peter], utrecht) inserts an hour-long meeting scheduled for May 5th with John and Peter in Utrecht in the agenda of the agent. At this point, a number of auxiliary functions may also be defined to return the set of variables in an atom, a belief, or an action, and which are used in the specification of the operation of an agent.

    atomvars : Atom → (ℙ FOVar)
    beliefvars : Belief → (ℙ FOVar)
    actionvars : Action → (ℙ FOVar)
    ────────────
    ∀ c : Const; v : FOVar; f : FuncSym; ts : seq FOTerm; at : Atom; b1, b2 : Belief; a : Action •
      fovars (const c) = ∅ ∧
      fovars (var v) = {v} ∧
      fovars (functor (f, ts)) = ⋃ {t : FOTerm | t ∈ (ran ts) • fovars t} ∧
      atomvars at = ⋃ {t : FOTerm | t ∈ (ran at.terms) • fovars t} ∧
      beliefvars (pos at) = atomvars at ∧
      beliefvars (not at) = atomvars at ∧
      beliefvars (and (b1, b2)) = beliefvars b1 ∪ beliefvars b2 ∧
      actionvars a = ⋃ {t : FOTerm | t ∈ (ran a.terms) • fovars t}

2.3 Goals, Contexts, and Front Contexts

In 3APL, goals are used to represent both the agent's goals and the plans to achieve them. Goals are program-like structures that are built from basic constructs, such as actions, and regular imperative programming constructs, such as sequential composition and nondeterministic choice. 3APL goals can be characterised as goals-to-do, which are mental attitudes corresponding to plans of action to achieve a state of affairs, or goals-to-be, which are mental attitudes corresponding to the state of affairs desired by an agent. For example, an agent may have adopted the goal-to-do of learning to play the piano, and then performing at the ZB conference dinner. This might be done in pursuit of the agent’s goal-to-be of wanting to be a pop-star rather than an academic.

Contexts. Before formally describing goals, we introduce the notion of contexts, which are goals with an extra feature called ‘holes’ that act as placeholders within the structure of goals.¹ Contexts are well-known structures in programming language semantics, closely related to the concept of goals, and are used to describe the operation and architecture of 3APL agents. Use of contexts in this way differs substantially from the transition style semantics presented in [7], which is inappropriate in this Z specification, since it would require a recursive relationship between schemas. While contexts, by contrast, allow us to give an elegant specification of the operation of a 3APL agent, we must stress that their role is in the presentation of an architecture for 3APL, rather than in the 3APL language itself. More precisely, a context is either a basic action, a query goal, an achieve goal, the sequential composition of two contexts, the nondeterministic choice of two contexts, a goal variable or “□”, which represents a place within a context that might contain another context. In the definition below, we use the set of goal variables, GVar, to allow a process called goal revision to take place, as will be described later.

¹ Note that our use of the term context is distinct from the notion of a context in such systems as AgentSpeak(L) [13,3], which is defined as the pre-condition of a plan.

    Context ::= bac⟨⟨Action⟩⟩
      | query⟨⟨Belief⟩⟩
      | achieve⟨⟨Atom⟩⟩
      | seqcomp⟨⟨Context × Context⟩⟩
      | choice⟨⟨Context × Context⟩⟩
      | goalvar⟨⟨GVar⟩⟩
      | □

The □, which denotes the placeholder or hole within a context, is distinct from a goal variable. Although both □ and a goal variable are placeholders, □ is a facility used for specifying 3APL, whereas goal variables are part of 3APL itself. Five examples of contexts are shown in Figure 1.

    1. choice(seqcomp(□, bac ins_agenda(meeting, Time, Dur, People, Loc)), □)
    2. seqcomp(seqcomp(query pos free(Time, Length), □), goalvar X)
    3. goalvar X
    4. seqcomp(□, bac ins(agenda(meeting, Time, Dur, People, Loc)))
    5. choice(seqcomp(□, bac ins(agenda(meeting, Time, Dur, People, Loc))), query(pos free(Time, Length)))

    Fig. 1. Examples of Contexts

Goals. We can now define a goal as a context without any occurrences of □. Only the third context in Figure 1 is a goal since it contains no squares. None of the other contexts are goals because they contain at least one occurrence of □. The auxiliary function squarecount counts the occurrences of □ in a context.

    squarecount : Context → ℕ
    ────────────
    ∀ a : Action; b : Belief; at : Atom; c1, c2 : Context; gv : GVar •
      squarecount (bac a) = 0 ∧
      squarecount (query b) = 0 ∧
      squarecount (achieve at) = 0 ∧
      squarecount (seqcomp (c1, c2)) = squarecount c1 + squarecount c2 ∧
      squarecount (choice (c1, c2)) = squarecount c1 + squarecount c2 ∧
      squarecount (goalvar gv) = 0 ∧
      squarecount □ = 1

    Goal == {g : Context | squarecount g = 0}

Front Contexts. An important type of context, known as a front context, is used to illustrate significant properties of 3APL. Front contexts are contexts with precisely one occurrence of □ at the front of the context. Informally, an element at the front of a context means that an agent could choose to perform this element first, so that if a □ at the front of a context was replaced by a goal, then that goal could be performed first, before the remainder of the overall goal. A front context is defined formally as either a single square, the sequential composition of a front context with a goal, or the choice (in either order) of a front context and a goal. In Figure 1, none of Contexts 1, 2 and 3 is a front context, but both 4 and 5 are.

    frontcontext : ℙ(Context)
    ────────────
    ∀ fc, fc1 : Context; g : Goal •
      frontcontext (fc) ⇔
        fc = □
        ∨ (fc = seqcomp (fc1, g) ∧ frontcontext fc1)
        ∨ (fc = choice (fc1, g) ∧ frontcontext fc1)
        ∨ (fc = choice (g, fc1) ∧ frontcontext fc1)

The type FrontContext is the set of contexts satisfying frontcontext.

    FrontContext == {fc : Context | frontcontext fc}

Clearly any front context has only one occurrence of a ‘hole’.

∀ fc : FrontContext • squarecount fc = 1
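
The definitions of contexts, goals and front contexts can be checked against Figure 1 with a small sketch. The Python below is one possible rendering, assuming that actions, beliefs, atoms and goal variables are kept as opaque strings; the names Leaf, Hole, SeqComp and Choice are hypothetical and chosen for this illustration only.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    kind: str     # "bac", "query", "achieve" or "goalvar"
    payload: str  # the action, belief, atom or goal variable, kept as text


@dataclass(frozen=True)
class SeqComp:
    first: "Context"
    second: "Context"


@dataclass(frozen=True)
class Choice:
    left: "Context"
    right: "Context"


@dataclass(frozen=True)
class Hole:
    pass  # the placeholder written as a square in the specification


Context = Union[Leaf, SeqComp, Choice, Hole]


def squarecount(c: Context) -> int:
    """Count the occurrences of the hole in a context."""
    if isinstance(c, Hole):
        return 1
    if isinstance(c, SeqComp):
        return squarecount(c.first) + squarecount(c.second)
    if isinstance(c, Choice):
        return squarecount(c.left) + squarecount(c.right)
    return 0  # bac, query, achieve and goalvar contain no hole


def is_goal(c: Context) -> bool:
    return squarecount(c) == 0


def is_front_context(c: Context) -> bool:
    if isinstance(c, Hole):
        return True
    if isinstance(c, SeqComp):
        return is_front_context(c.first) and is_goal(c.second)
    if isinstance(c, Choice):
        return ((is_front_context(c.left) and is_goal(c.right))
                or (is_goal(c.left) and is_front_context(c.right)))
    return False


# The five contexts of Figure 1, with payloads abbreviated to strings.
ins = Leaf("bac", "ins_agenda(meeting, Time, Dur, People, Loc)")
free = Leaf("query", "pos free(Time, Length)")
figure1 = [
    Choice(SeqComp(Hole(), ins), Hole()),
    SeqComp(SeqComp(free, Hole()), Leaf("goalvar", "X")),
    Leaf("goalvar", "X"),
    SeqComp(Hole(), ins),
    Choice(SeqComp(Hole(), ins), free),
]
```

Under these assumptions, only the third entry of figure1 satisfies is_goal, and only the fourth and fifth satisfy is_front_context, which agrees with the discussion of Figure 1.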

A goal may contain both goal variables as well as first order variables. We therefore define three functions that return the set of all variables, goal variables and first order variables, respectively, of a goal. A definition of optional and related elements can be found in [3].

    goalvars : optional [Goal] → (ℙ Var)
    goalgvars : optional [Goal] → (ℙ GVar)
    goalfovars : optional [Goal] → (ℙ FOVar)
    ────────────
    ∀ g : optional [Goal]; a : Action; b : Belief; at : Atom; g1, g2 : Goal; gv : GVar •
      goalvars ∅ = ∅ ∧
      goalvars {bac a} = actionvars a ∧
      goalvars {query b} = beliefvars b ∧
      goalvars {achieve at} = atomvars at ∧
      goalvars {seqcomp (g1, g2)} = goalvars {g1} ∪ goalvars {g2} ∧
      goalvars {choice (g1, g2)} = goalvars {g1} ∪ goalvars {g2} ∧
      goalvars {goalvar gv} = {gv} ∧
      goalgvars g = (goalvars g) ∩ GVar ∧
      goalfovars g = (goalvars g) ∩ FOVar

2.4 Practical Reasoning Rules

A 3APL agent uses practical reasoning rules not only to plan in the more conventional sense, but also to reflect on its goals. Whilst the use of rules for planning is a familiar concept from the literature, using rules for reflection on goals or plans is less well known. Reflection allows an agent to re-consider one of its plans in a situation in which the plan will fail with respect to the goal it is trying to achieve or has already failed, or where a more optimal strategy can be pursued. Practical reasoning rules are divided into four classes: reactive rules, which are used not only to respond to the current situation but also to create new goals; plan-rules, which are used to find plans for achievement goals; failure-rules, which are used to replan when plans fail; and optimisation-rules, which can replace less effective plans with more optimal plans. (This classification of rules was first proposed in [8].) We introduce a type to correspond to each of these categories.

    PRType ::= reactive | failure | plan | optimisation

A practical reasoning rule consists of an (optional) head, which is a goal, an (optional) body which is a goal, a guard which is a belief and a type to define its purpose. Informally, a practical reasoning rule with head g, body p and guard b, states that if the agent tries to achieve goal g and finds itself in a situation b, then it might consider replacing g by a plan p as a means to achieve it. If it is a plan-rule, the goal g is of the form achieve s where s is a simple formula, and the rule states that to achieve goal g in situation b, consider plan p. If it is a failure-rule, then g may be any goal and the rule states that if g fails in situation b, consider dropping g and instead, adopting strategy p to deal with the failure. Finally, if it is an optimisation-rule, then g may be any goal and the rule states that if g is not so efficient in situation b, consider dropping g and instead, adopting strategy p.
Formally, we define a practical reasoning rule in the schema below, in which the conditions specify that a reactive-rule has an empty head (and it can be applied whenever the guard is true) and that a plan-rule has an achieve goal as its head.

    head = {achieve schedule(Activity, Time, DurationAct, People, Loc)},
    guard = and(pos free(Time, DurationAct), pos location(FromLoc, Time)),
    body = {seqcomp(seqcomp(query(pos transport(Means, FromLoc, Loc, Time, DurationTrans)),
                            bac ins_agenda(Means, Time − DurationTrans, DurationTrans, agent, FromLoc)),
                    bac ins_agenda(Activity, Time, Duration, People, Loc))}

    head = {seqcomp(goalvar X, bac ins(agenda(Act, Time, Dur, People, Loc)))},
    guard = neg free(Time, Dur),
    body = {achieve find_alternative_time(Act, Time, Dur, People, Loc, AltTime)}

    Fig. 2. Two Practical Reasoning Rules

    PRrule
      head, body : optional [Goal]
      guard : Belief
      type : PRType
      ────────────
      head = ∅ ⇔ type = reactive ∧
      the head ∈ (ran achieve) ∧ body ≠ ∅ ⇔ type = plan

Whilst there is a correspondence between the syntax of a PRrule and its purpose for reactive and plan rules, there is none for optimisation and failure rules because it is not possible to syntactically distinguish plans that fail or plans that can be optimised. The first rule in Figure 2 inserts an activity in the agenda. It states that a plan to achieve the scheduling of an activity is to find a means of transportation to the specified location, then reserve the time needed for transport in the agenda, and finally insert the activity itself in the agenda. This plan should only be used if the slot at time Time of length DurationAct is still free in the agenda. The second conjunct in the guard of the rule is used to retrieve the location of the user at time Time. Note that FromLoc does not occur in the head of the rule so that the binding for it must be retrieved from the agent’s beliefs. This illustrates the two uses of a guard: specifying the situation in which the rule might be considered by the agent, and retrieving some parameters from the agent’s beliefs. The second rule of Figure 2 is a revision rule that deals with failure. It states that if the agent has a sequential goal of doing anything (denoted by the goal variable X) followed by the goal of inserting an activity in the agenda of a user at a slot which is not free (specified by the guard), then the agent should consider revising that goal and replacing it with the goal of finding an alternative time for the activity. Note that the variables that occur in the head and guard of a rule are called the global variables of the rule, and those that occur in the body but not in the head or guard of the rule are called the local variables.
This distinction is made to separate the local data-processing in the body from the global variables that may also be used in other parts of a (complex) goal. In the first rule of Figure 2, the variables Means and DurationTrans are local variables that can only be used in the body of the goal and cannot transfer information to other parts of a (more complex) goal. Global variables, however, can be used to ‘communicate with the rest of the goal’ by parameter passing.
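
The constraints on rule types and the split into global and local variables can be illustrated with a short sketch. It is written under several assumptions that are not taken from the specification: goals and beliefs are reduced to a text plus an explicitly listed set of variables (the hypothetical class Part), an empty optional goal is modelled as None, and the schema predicate is read in the way the prose describes it, namely that exactly the reactive rules have an empty head and that a plan-rule has an achieve goal as head and a non-empty body. The example also assumes that Duration in the body of the first rule of Figure 2 denotes the same variable as DurationAct.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

RULE_TYPES = frozenset({"reactive", "failure", "plan", "optimisation"})


@dataclass(frozen=True)
class Part:
    """A goal or belief reduced to its text and the variables it mentions."""
    text: str
    variables: FrozenSet[str] = frozenset()
    is_achieve: bool = False  # True for goals of the form `achieve at`


@dataclass(frozen=True)
class PRRule:
    head: Optional[Part]  # None models the empty optional goal
    guard: Part
    body: Optional[Part]
    type: str

    def __post_init__(self):
        if self.type not in RULE_TYPES:
            raise ValueError(f"unknown rule type: {self.type}")
        # Assumed reading: exactly the reactive rules have an empty head.
        if (self.head is None) != (self.type == "reactive"):
            raise ValueError("exactly the reactive rules have an empty head")
        # Assumed reading: a plan rule has an achieve head and a body.
        if self.type == "plan" and not (self.head.is_achieve
                                        and self.body is not None):
            raise ValueError("a plan rule needs an achieve head and a body")

    def global_vars(self) -> FrozenSet[str]:
        head_vars = self.head.variables if self.head else frozenset()
        return head_vars | self.guard.variables

    def local_vars(self) -> FrozenSet[str]:
        body_vars = self.body.variables if self.body else frozenset()
        return body_vars - self.global_vars()


# First rule of Figure 2, with texts abbreviated.
schedule_rule = PRRule(
    head=Part("achieve schedule(...)",
              frozenset({"Activity", "Time", "DurationAct", "People", "Loc"}),
              is_achieve=True),
    guard=Part("and(pos free(...), pos location(...))",
               frozenset({"Time", "DurationAct", "FromLoc"})),
    body=Part("seqcomp(seqcomp(query(...), bac ins_agenda(...)), bac ins_agenda(...))",
              frozenset({"Means", "FromLoc", "Loc", "Time", "DurationTrans",
                         "Activity", "DurationAct", "People"})),
    type="plan",
)
```

With these inputs, schedule_rule.local_vars() contains only Means and DurationTrans, while FromLoc counts as global because it occurs in the guard.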

3 3APL Agents

3.1 Agents and Mental State

An agent can be characterised by specifying its beliefs, goals, practical reasoning rules and expertise. The main difference between these components of an agent is that the former two sets are dynamically updated while the latter two are fixed and do not change. We define an agent as an entity consisting of the static expertise and rulebase, i.e. a set of practical reasoning rules.

    Agent
      expertise : ℙ Action
      rulebase : ℙ PRrule
      ────────────
      ∀ a : expertise • actionvars a = ∅

This differs slightly from earlier work, in that the expertise of an agent is explicitly included as part of an agent. The predicate part of the schema indicates that the agent is only capable of performing grounded actions, since it is not clear how to specify the semantics of actions that contain variables. The interpretation of the instantiation of free variables in an action as a sensing act, which semantically could be specified as a function of the environment of the agent, falls outside the scope of this paper. We only deal with the specification of the components of a single agent in this paper. The beliefs of an agent are recorded in its beliefbase and the goals of an agent in its goalbase. Together, these comprise the mental state of an agent. During the execution of an agent, its mental state is updated; the goals and beliefs of the agent are dynamic.

    AgentState
      Agent
      beliefbase : ℙ Belief
      goalbase : ℙ Goal

Only the mental state of the agent may change during the operation of an agent, not the expertise or the rulebase. This is shown in the schema below by the ∆ convention, which indicates that some state variables change, and the Ξ convention, which states that the dashed variables are equal to their undashed counterparts (i.e. no state change).

    ∆AgentState
      AgentState
      AgentState′
      ΞAgent
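
A rough Python analogue of the Agent and AgentState schemas may help to see which parts are static. The sketch rests on assumptions of its own: beliefs, goals and rules are opaque strings, action arguments are strings, and an argument counts as a variable when it starts with an upper-case letter (the convention of the running example, not something the specification states). The helper update_mental_state is hypothetical.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Iterable, Tuple


@dataclass(frozen=True)
class Action:
    name: str
    terms: Tuple[str, ...] = ()


def action_vars(a: Action) -> FrozenSet[str]:
    # Assumption: variables are capitalised, constants are lower-case.
    return frozenset(t for t in a.terms if t[:1].isupper())


@dataclass(frozen=True)
class Agent:
    expertise: FrozenSet[Action]
    rulebase: FrozenSet[str]  # practical reasoning rules, kept opaque here

    def __post_init__(self):
        # Schema predicate: every action in the expertise is grounded.
        if any(action_vars(a) for a in self.expertise):
            raise ValueError("expertise may only contain grounded actions")


@dataclass(frozen=True)
class AgentState:
    agent: Agent
    beliefbase: FrozenSet[str]
    goalbase: FrozenSet[str]


def update_mental_state(state: AgentState,
                        beliefbase: Iterable[str],
                        goalbase: Iterable[str]) -> AgentState:
    """Change beliefs and goals only; the agent component is carried over."""
    return replace(state,
                   beliefbase=frozenset(beliefbase),
                   goalbase=frozenset(goalbase))


ground = Action("ins_agenda",
                ("meeting", "may5th12:00", "60min", "john", "utrecht"))
pas = Agent(expertise=frozenset({ground}), rulebase=frozenset())
s0 = AgentState(pas, frozenset(), frozenset())
s1 = update_mental_state(s0, {"agenda(meeting)"}, {"achieve schedule"})
```

Returning a new AgentState that shares the same Agent object plays the role of combining the ∆ convention on the mental state with the Ξ convention on the agent.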


M. d’Inverno, K. Hindriks and M. Luck

3.2 Initial Agent State

A 3APL agent specifies its expertise (a repertoire of basic actions) and a set of practical reasoning rules, but does not specify the initial beliefbase or goalbase of the agent. The first step in the operation of an agent, therefore, is to initialise the mental state. In the schema below, b0? and g0? denote input variables.

    InitAgentState
      ∆AgentState
      b0? : ℙ Belief
      g0? : ℙ Goal
      ────
      beliefbase′ = b0?
      goalbase′ = g0?

The operation of an agent is parameterised by two semantic notions. First, the semantics of basic actions is defined by a global function, execute, which specifies that a basic action is an update operator on the beliefs of the agent. For example, the basic action bac ins(agenda(Act, Time, Dur, People, Loc)) of a scheduling agent updates the beliefs by inserting a new item in the agenda. Since execute is a global function, any two agents capable of performing an action are guaranteed to do the same thing when executing that action. This is particularly important to prevent confusion when specifying and programming agents.

    execute : Action × ℙ Belief ⇸ ℙ Belief

The second semantic notion needed to specify the semantics of agents is a logical consequence relation. The logical consequence relation determines which implications the agent is allowed to derive from its beliefs. Formally, the consequence relation is a relation between two sets of beliefs such that the first set of beliefs implies all the beliefs in the second set. The logical consequence relation is also global. This makes sure that all agents draw conclusions from their beliefs in the same way, which guarantees a “minimal amount of global consistency”. That is, one agent will not derive the negation of a belief b from the same set of beliefs from which another agent derives b.

    LogCon : ℙ(ℙ Belief × ℙ Belief)
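The state schemas and the two global notions above can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: beliefs, goals, actions and rules are plain strings, the `ins:`/`del:` action naming is hypothetical, and `log_con` is a toy consequence relation (set inclusion) standing in for whatever logic a real 3APL system would use. The partiality of execute is modelled by returning `None` outside its domain.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

Beliefs = FrozenSet[str]


@dataclass(frozen=True)
class AgentState:
    expertise: FrozenSet[str]   # static part (schema Agent)
    rulebase: FrozenSet[str]    # static part, rules kept opaque here
    beliefbase: Beliefs         # dynamic part (schema AgentState)
    goalbase: FrozenSet[str]    # dynamic part


def init_agent_state(expertise: Iterable[str], rulebase: Iterable[str],
                     b0: Iterable[str], g0: Iterable[str]) -> AgentState:
    """InitAgentState: beliefbase' = b0? and goalbase' = g0?."""
    return AgentState(frozenset(expertise), frozenset(rulebase),
                      frozenset(b0), frozenset(g0))


def execute(action: str, beliefs: Beliefs) -> Optional[Beliefs]:
    """Toy global partial function; None means (action, beliefs) is outside dom execute."""
    if action.startswith("ins:"):
        return beliefs | {action[4:]}
    if action.startswith("del:") and action[4:] in beliefs:
        return beliefs - {action[4:]}
    return None


def log_con(premises: Beliefs, conclusions: Beliefs) -> bool:
    """Toy consequence relation: every conclusion is literally among the premises."""
    return conclusions <= premises
```

Because `execute` and `log_con` are module-level functions rather than methods of an agent, every agent built this way shares them, which mirrors the global character of execute and LogCon in the specification.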

4 3APL Agent Operation

4.1 Applying Practical Reasoning Rules

Practical reasoning rules provide 3APL agents with reflective capabilities. The rules can be used to plan, revise, and create goals. The application of a rule is formally defined in this section. The application of a rule r to a goal g results in the replacement of a subgoal g′ which matches with the head of rule r by the body of rule r in case the head of the rule is non-empty. If the body of the rule is empty, the subgoal is simply dropped. The application yields a substitution which is applied to the entire resulting goal. In case the

A Formal Architecture for the 3APL Agent Programming Language


Rule:
    head = {seqcomp(goalvar X, bac ins(agenda(Act, Time, Dur, People, Loc)))},
    guard = neg free(Time, Dur),
    body = {achieve find alternative time(Act, Time, Dur, People, Loc, AltTime)}
Bel:
    neg free(may5th10:00, 60min)
Goal:
    seqcomp(
      seqcomp(bac ins(agenda(train, may5th9:30, 30min, [john], utrecht)),
              bac ins(agenda(meeting, may5th10:00, 60min, [john, peter], amsterdam))),
      achieve (new scheduling task))

Fig. 3. Example Scenario for Practical Reasoning Rules

Substitution:
    ϑ = {X/bac ins(agenda(train, may5th9:30, 30min, [john], utrecht)),
         Act/meeting, Time/may5th10:00, Dur/60min,
         People/[john, peter], Loc/amsterdam}
Guard:
    neg free(may5th10:00, 60min)
New Plan (Goal):
    seqcomp(
      achieve find alternative time(meeting, may5th10:00, 60min, [john, peter], amsterdam, AltTime),
      achieve (new scheduling task))

Fig. 4. Results of Applying the Rule (and Substitution)

head of a rule is empty, only the guard of the rule needs to be derivable from the beliefs of the agent, and a new goal (the body of the rule) is added to the goalbase of the agent.

Consider the example in Figure 3, with a practical reasoning rule for dealing with a failure to schedule an activity, and a goal and belief as specified. Since the belief is an instance of the guard of the rule, and the goal can be unified with the head of the rule, the rule is applicable. Unifying the head of a rule with a (subgoal of a) goal of an agent amounts to finding a unifier (or substitution) which, when applied to the head of the rule and the goal, makes them identical. In this example, unification yields a most general unifier, shown in Figure 4, which, when applied to the guard of the rule, gives the instantiated guard that is implied by the belief. Applying a rule to a subgoal means replacing that subgoal by the body of the rule, so that the new plan that replaces the original goal of the agent is obtained by applying the substitution, as also shown in the figure.

The example illustrates that a subgoal at the front of a goal of the agent is replaced, rather than just any subgoal, and this is the case for all rule applications. Consequently, the front contexts introduced earlier are very useful in specifying rule application. Suppose that g′ is a subgoal of some goal g that appears at the front of g, and g′ matches with the head of a rule r. The task is to find a front context fc such that if the subgoal g′ is inserted for □ (at the front of fc), the resulting goal is identical to g. Applying r then amounts to updating fc with the body of the rule. There is a crucial difference here between inserting and updating. Inserting a goal in a front context means substituting the goal for the □ in the front context, while updating a front context with a goal means replacing □ with that goal and also committing to the choices made (pursuing a subgoal in a choice goal means committing to the branch in which the subgoal appears in the choice goal).


To formalise this, we define two functions, one to insert a goal into the square of a front context and one to update a front context with a goal.

    Insert : (Goal × FrontContext) → Goal
    ────
    ∀ g, g′ : Goal; fc : FrontContext •
      Insert(g, □) = g ∧
      Insert(g, seqcomp(fc, g′)) = seqcomp(Insert(g, fc), g′) ∧
      Insert(g, choice(fc, g′)) = choice(Insert(g, fc), g′) ∧
      Insert(g, choice(g′, fc)) = choice(g′, Insert(g, fc))

Now, since a front context may be updated with the empty goal if a rule with empty body is applied or an execution step is performed (see below), the goal types of the function UpdateGoal are optional goals. Note that if the front context is a choice context, this commits to the branch where the □ occurs. The latter branch is the one the agent has chosen to pursue. This is different from Insert, which leaves these branches intact.

    UpdateGoal : (optional[Goal] × FrontContext) → optional[Goal]
    ────
    ∀ g : optional[Goal]; fc : FrontContext; g′ : Goal •
      UpdateGoal(g, □) = g ∧
      non-empty UpdateGoal(g, fc) ⇒
        (UpdateGoal(g, seqcomp(fc, g′)) = {seqcomp(the (UpdateGoal(g, fc)), g′)} ∧
         UpdateGoal(g, choice(fc, g′)) = UpdateGoal(g, fc) ∧
         UpdateGoal(g, choice(g′, fc)) = UpdateGoal(g, fc)) ∧
      empty UpdateGoal(g, fc) ⇒
        (UpdateGoal(g, seqcomp(fc, g′)) = {g′} ∧
         UpdateGoal(g, choice(fc, g′)) = ∅ ∧
         UpdateGoal(g, choice(g′, fc)) = ∅)

Note that if the front context is of the form choice(□, g) for some goal g, and the empty goal is inserted for □, Insert yields Insert(choice(□, g)) = {g}. This is a natural definition, but the empty goal could result with a different view on dropping a branch from a choice goal. A rule with empty body prevents an agent from attempting to execute a particular goal. If the dropped goal is a branch of a choice goal, Insert simply removes this branch. The definition of Insert above does not remove the alternative branch, however, so that the choice goal is still not completed. If there is a reason to remove the choice goal, then this reason should be stated in the guard of a rule which removes the choice goal completely.

A rule is applicable if the head unifies with a (sub)goal of the agent and the guard of the rule follows from the agent’s beliefs. If the rule has no head, it is applicable simply if the guard follows from the beliefbase. For a treatment of substitutions, binding and unification please see Appendix A.
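The difference between inserting and updating can be made concrete with a short Python sketch that follows the two axiomatic definitions above clause by clause. The encoding is an assumption made for illustration only: goals are strings or tuples of the form `("seqcomp", g1, g2)` / `("choice", g1, g2)`, front contexts are small dataclasses, and an optional goal is `None` (empty) or a goal.

```python
from dataclasses import dataclass
from typing import Optional, Union

Goal = Union[str, tuple]  # atomic goal name, or ("seqcomp" | "choice", g1, g2)


@dataclass(frozen=True)
class Hole:                 # the square of a front context
    pass


@dataclass(frozen=True)
class SeqCtx:               # seqcomp(fc, g)
    fc: "FrontContext"
    g: Goal


@dataclass(frozen=True)
class ChoiceLeft:           # choice(fc, g)
    fc: "FrontContext"
    g: Goal


@dataclass(frozen=True)
class ChoiceRight:          # choice(g, fc)
    g: Goal
    fc: "FrontContext"


FrontContext = Union[Hole, SeqCtx, ChoiceLeft, ChoiceRight]


def insert(g: Goal, fc: FrontContext) -> Goal:
    """Substitute g for the square; all choice branches are kept."""
    if isinstance(fc, Hole):
        return g
    if isinstance(fc, SeqCtx):
        return ("seqcomp", insert(g, fc.fc), fc.g)
    if isinstance(fc, ChoiceLeft):
        return ("choice", insert(g, fc.fc), fc.g)
    return ("choice", fc.g, insert(g, fc.fc))


def update_goal(g: Optional[Goal], fc: FrontContext) -> Optional[Goal]:
    """Replace the square by g and commit to the branch containing the square."""
    if isinstance(fc, Hole):
        return g
    inner = update_goal(g, fc.fc)
    if isinstance(fc, SeqCtx):
        # empty inner result: only the continuation remains
        return fc.g if inner is None else ("seqcomp", inner, fc.g)
    # choice context: the other branch is discarded in both cases
    return inner
```

For the context seqcomp(choice(□, b), c), `insert("a", ...)` keeps the alternative b, whereas `update_goal("a", ...)` yields seqcomp(a, c), and `update_goal(None, ...)` leaves only c.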


    applicable : ℙ(PRrule × Goal × (ℙ Belief))
    ────
    ∀ g : Goal; r : PRrule; bb : ℙ Belief •
      non-empty r.head ⇒
        (applicable(r, g, bb) ⇔
          (∃ ϑ, γ : Substitution; subg : Goal; fc : FrontContext •
            Insert(subg, fc) = g ∧
            mgu((the r.head), subg) = ϑ ∧
            (dom γ) ⊆ (beliefvars r.guard) ∧
            LogCon(bb, {ASBelief (ϑ ‡ γ) r.guard}))) ∧
      empty r.head ⇒
        (applicable(r, g, bb) ⇔
          (∃ ϑ : Substitution | (dom ϑ) ⊆ (beliefvars r.guard) •
            LogCon(bb, {ASBelief ϑ r.guard})))

Using this definition, we can now specify the rule application. If the head of the rule is not empty, applying the rule amounts to replacing a subgoal by the body of the rule. Otherwise, we simply add the body of the rule to the goalbase of the agent. Care must be taken here to avoid interference between variables occurring in rules and variables occurring in goals (cf. [7]). For this reason, all variables in the rule applied are renamed to variables not occurring in the target goal. A function RuleRename(r, V), which is not defined in this paper due to space constraints (but available on request from the authors), renames the variables in the rule r so that no variable from the set V of variables occurs in the renamed rule.

    RuleRename : (PRrule × (ℙ Var)) → PRrule

    ApplyRule
      ∆AgentState
      g? : Goal; r? : PRrule; rr : PRrule
      ────
      rr = RuleRename(r?, goalvars {g?})
      r?.type ≠ reactive ⇒
        (∃ fc : FrontContext; subg : Goal •
          Insert(subg, fc) = g? ∧
          (∃ ϑ, γ : Substitution | (dom γ) ⊆ (beliefvars rr.guard) •
            mgu(the rr.head, subg) = ϑ ∧
            LogCon(beliefbase, {ASBelief (ϑ ‡ γ) rr.guard}) ∧
            beliefbase′ = beliefbase ∧
            goalbase′ = goalbase \ {g?} ∪ ASGoal (ϑ ‡ γ) {(Insert (the(rr.body), fc))}))
      r?.type = reactive ⇒
        (∃ γ : Substitution | (dom γ) ⊆ (beliefvars rr.guard) •
          LogCon(beliefbase, {ASBelief γ rr.guard}) ∧
          beliefbase′ = beliefbase ∧
          goalbase′ = goalbase ∪ ASGoal γ rr.body)

4.2 Goal Execution

The execution of a goal is specified through the computation steps an agent can perform on a goal. A computation step corresponds to a simple action of the agent, which is either a basic action or else a query on the beliefs of the agent. Recall that the semantics of basic actions is given by a global function execute and the semantics of beliefs is specified by the LogCon relation. The agent is only allowed to execute a basic action or query that occurs at the front of a goal, i.e. it is one of the first things the agent should consider doing. The notion of front context is useful to find an action or query which the agent might execute. If there is a front context fc in which a basic action or query can be inserted for □, and which results in a goal of the agent, the agent might consider executing that basic action or query. After executing the goal, the goal needs to be updated, and this updating is the same as updating the front context by removing □.

The execution of a basic action amounts to changing the beliefbase of the agent in accordance with the function execute. The condition (a, beliefbase) ∈ (dom execute) expresses that the basic action a is enabled, and thus can be executed.

    ExecuteBasicAction
      ∆AgentState
      ΞAgent
      g? : Goal
      ────
      (∃ fc : FrontContext; a : Action |
          a ∈ expertise ∧ Insert((bac a), fc) = g? ∧ (a, beliefbase) ∈ (dom execute) •
        beliefbase′ = execute(a, beliefbase) ∧
        goalbase′ = (goalbase \ {g?}) ∪ UpdateGoal({}, fc))

Queries are goals to check if some condition follows from the beliefbase of the agent. Any free variables in the condition of the query can be used to retrieve data from the beliefbase. The values retrieved are recorded in a substitution ϑ. A query can only be executed if it is a consequence of the beliefbase (otherwise, nothing happens).

    ExecuteQueryGoal
      ∆AgentState
      ΞAgent
      g? : Goal
      ────
      (∃ fc : FrontContext; b : Belief •
        Insert(query b, fc) = g? ∧
        (∃ ϑ : Substitution •
          LogCon(beliefbase, {ASBelief ϑ b}) ∧
          beliefbase′ = beliefbase ∧
          goalbase′ = (goalbase \ {g?}) ∪ (ASGoal ϑ (UpdateGoal({}, fc)))))

Executing a goal is then defined as the disjunction of these two schemas.

    ExecuteGoal == ExecuteBasicAction ∨ ExecuteQueryGoal
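As a rough illustration of ExecuteBasicAction, the following Python sketch performs one execution step under simplifying assumptions that are not part of the specification: goals are restricted to basic actions `("bac", a)` and sequential compositions `("seqcomp", g1, g2)` (no choice, no queries), so the front context is determined uniquely, and `toy_execute` is a hypothetical stand-in for the global function execute that returns `None` when the action is not enabled.

```python
from typing import Callable, FrozenSet, Optional, Tuple

Beliefs = FrozenSet[str]


def split_front(goal: tuple) -> Optional[Tuple[str, Optional[tuple]]]:
    """Return (front action, remaining goal), i.e. a and UpdateGoal({}, fc)."""
    if goal[0] == "bac":
        return goal[1], None
    if goal[0] == "seqcomp":
        front = split_front(goal[1])
        if front is None:
            return None
        action, rest = front
        remaining = goal[2] if rest is None else ("seqcomp", rest, goal[2])
        return action, remaining
    return None  # other goal forms are outside this sketch


def toy_execute(action: str, beliefs: Beliefs) -> Optional[Beliefs]:
    """Hypothetical execute: every action except 'blocked' is enabled."""
    if action == "blocked":
        return None
    return beliefs | {action + "_done"}


def execute_basic_action(expertise: FrozenSet[str], beliefbase: Beliefs,
                         goalbase: FrozenSet[tuple], goal: tuple,
                         execute: Callable = toy_execute):
    """One step; returns (beliefbase', goalbase') or None if the step is not possible."""
    front = split_front(goal)
    if front is None:
        return None
    action, remaining = front
    if action not in expertise:          # a ∈ expertise
        return None
    new_beliefs = execute(action, beliefbase)
    if new_beliefs is None:              # (a, beliefbase) not in dom execute
        return None
    rest = frozenset() if remaining is None else frozenset({remaining})
    return new_beliefs, (goalbase - {goal}) | rest
```

Returning `None` when the precondition fails corresponds to the operation schema simply not being applicable; the agent state is left untouched.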

5 Conclusions

In this paper we have described a specification of the agent-oriented programming language 3APL. The arguments for using Z in agent-based systems are well-rehearsed (e.g. [5, 3]), and we will not re-state them here. In particular, however, Z enables a uniform presentation of both the 3APL programming language and its architecture in a clear and concise way. We are not familiar with any work that specifies both these aspects in this way, and believe that our work moves a step closer to a unified account of agent languages and architectures.

The contribution of this work is threefold. First, we provide an operational specification of 3APL that can be used as the basis of a subsequent implementation, so that the transition from what might be called theory to practice is facilitated. At lower levels, this kind of transition is demonstrated, for example, through the provision of a simple agent simulation environment [11], and a sophisticated Jini-based development environment [1], both based on an extensive agent framework [10]. This work addresses the more detailed aspects involved in dealing with the transition of a fully designed system, rather than an outline structure. Second, we allow an easy and simple comparison of 3APL and its competitor systems that have been specified in a similar style, such as AgentSpeak(L) [3] and dMARS [2], as illustrated in [6]. Third, we provide an accessible resource in the specification of techniques for the development of agent systems that might not otherwise be available in a form relevant both to agent architects and developers. This work can thus be viewed in a standalone fashion in contributing to the understanding of 3APL on the one hand, and in using it to provide a window on the larger area of generic agent architecture on the other.

Acknowledgements: We would like to thank Jean-Jules Meyer for many constructive and helpful discussions during the preparation of this work, and the Universities of Westminster and Utrecht for hosting and supporting the authors during their collaboration.

References

1. R. Ashri and M. Luck. Agent implementation through Jini. In Proceedings of the Eleventh International Workshop on Database and Expert Systems Applications. IEEE Computer Society Press, to appear 2000.
2. M. d’Inverno, D. Kinny, M. Luck, and M. Wooldridge. A formal specification of dMARS. In Intelligent Agents IV: Proceedings of the Fourth International Workshop on Agent Theories, Architectures and Languages, Lecture Notes in Artificial Intelligence 1365, pages 155–176. Springer-Verlag, 1998.
3. M. d’Inverno and M. Luck. Engineering AgentSpeak(L): A formal computational model. Journal of Logic and Computation, 8(3):233–260, 1998.
4. M. R. Genesereth and N. Nilsson. Logical Foundations of Artificial Intelligence. Morgan Kaufmann, 1987.
5. R. Goodwin. A formal specification of agent properties. Journal of Logic and Computation, 5(6):763–781, 1995.
6. K. Hindriks, M. d’Inverno, and M. Luck. Architecture for agent programming languages. In ECAI 2000: Proceedings of the Fourteenth European Conference on Artificial Intelligence, to appear 2000.
7. K. V. Hindriks, F. S. de Boer, W. van der Hoek, and J-J. Ch. Meyer. Formal Semantics for an Abstract Agent Programming Language. In Intelligent Agents IV: Proceedings of the Fourth International Workshop on Agent Theories, Architectures and Languages, Lecture Notes in Artificial Intelligence 1365, pages 215–229. Springer-Verlag, 1998.
8. K. V. Hindriks, F. S. de Boer, W. van der Hoek, and J-J. Ch. Meyer. Control structures of rule-based agent languages. In Intelligent Agents V, Lecture Notes in Artificial Intelligence 1555. Springer-Verlag, 1999.
9. G. Kiss. Goal, values, and agent dynamics. In G. M. P. O’Hare and N. R. Jennings, editors, Foundations of Distributed Artificial Intelligence, pages 247–268. John Wiley and Sons, 1996.
10. M. Luck and M. d’Inverno. Structuring a Z specification to provide a formal framework for autonomous agent systems. In J. P. Bowen and M. G. Hinchey, editors, ZUM’95: The Z Formal Specification Notation, 9th International Conference of Z Users, Lecture Notes in Computer Science 967, pages 48–62. Springer-Verlag, 1995.
11. M. Luck, N. Griffiths, and M. d’Inverno. From agent theory to agent construction: A case study. In Intelligent Agents III: Proceedings of the Third International Workshop on Agent Theories, Architectures and Languages, Lecture Notes in Artificial Intelligence 1193, pages 49–63. Springer-Verlag, 1997.
12. P. Maes. Agents that reduce work and information overload. Communications of the ACM, 37(7):30–40, 1994.
13. A. S. Rao. AgentSpeak(L): BDI agents speak out in a logical computable language. In W. Van de Velde and J. W. Perram, editors, Agents Breaking Away: Proceedings of the Seventh European Workshop on Modelling Autonomous Agents in a Multi-Agent World, Lecture Notes in Artificial Intelligence 1038, pages 42–55. Springer-Verlag, 1996.
14. Y. Shoham. Agent-oriented programming. Artificial Intelligence, 60(1):51–92, 1993.

A Substitutions

Here, we provide further details of the standard definitions of binding and unification, described in Z. First we define a substitution. This maps variables to terms, as well as mapping goal variables to goals. We therefore introduce a new type, which we call SubTerm, in order to define a substitution. A substitution is represented as a partial function between variables and terms since, in general, only some variables will be mapped to a term. (Remember that fovars is a function that returns the variables of a term, as defined earlier, at the end of Section 4.1.)

    SubTerm ::= term⟨⟨FOTerm⟩⟩ | goal⟨⟨Goal⟩⟩

    allvars : SubTerm → (ℙ Var)
    ────
    ∀ t : FOTerm; g : Goal •
      allvars(term t) = fovars t ∧
      allvars(goal g) = goalvars {g}

    Substitution == {ϑ : Var ⇸ SubTerm}

The standard definition of a substitution is a mapping from variables to terms such that no variable contained in any of the terms is in the domain of the mapping [4].

    ∀ ϑ : Substitution •
      (dom ϑ) ∩ ⋃{s : SubTerm | s ∈ (ran ϑ) • allvars s} = ∅


We also have the following predicate concerning substitutions.

    ∀ ϑ : Substitution; v : Var; s : SubTerm; t : FOTerm | (v, s) ∈ ϑ •
      (v ∈ FOVar ⇔ s ∈ (ran term)) ∧
      (v ∈ GVar ⇔ s ∈ (ran goal))

A.1 Application of Substitutions

The function ASFOVar applies the identity mapping to a variable if the variable is not in the domain of the substitution, and applies the substitution if it is in the domain. (Note that this function is only defined for elements of FOVar, not of GVar.)

    ASFOVar : Substitution → FOVar → FOTerm
    ────
    ∀ ϑ : Substitution; v : FOVar •
      (v ∉ (dom ϑ)) ⇒ ASFOVar ϑ v = var v ∧
      (v ∈ (dom ϑ)) ⇒ ASFOVar ϑ v = term⁻¹ (ϑ v)

We can then define what it means for a substitution to be applied to a term.

    ASTerm : Substitution → FOTerm → FOTerm
    ────
    ∀ t : FOTerm; f : FuncSym; ts : seq FOTerm; ϑ : Substitution | t = functor(f, ts) •
      t ∈ ran const ⇒ ASTerm ϑ t = t ∧
      t ∈ ran var ⇒ ASTerm ϑ t = ASFOVar ϑ (var⁻¹ t) ∧
      t ∈ ran functor ⇒ ASTerm ϑ t =
        (µ new : FOTerm | first (functor⁻¹ new) = f ∧
                          second (functor⁻¹ new) = map (ASTerm ϑ) ts)

A substitution can also be applied to an atom, an action and a belief.

    ASAtom : Substitution → Atom → Atom
    ASAction : Substitution → Action → Action
    ────
    ∀ a, a′ : Atom; act, act′ : Action; ϑ : Substitution •
      ASAtom ϑ a = a′ ⇔
        a′.head = a.head ∧ a′.terms = map (ASTerm ϑ) a.terms ∧
      ASAction ϑ act = act′ ⇔
        act′.name = act.name ∧ act′.terms = map (ASTerm ϑ) act.terms

    ASBelief : Substitution → Belief → Belief
    ────
    ∀ b, c : Belief; l, a : Atom; s : Substitution •
      ASBelief s (pos a) = (pos (ASAtom s a)) ∧
      ASBelief s (not a) = (not (ASAtom s a)) ∧
      ASBelief s (true) = true ∧
      ASBelief s (false) = false

Finally, a substitution can be applied to a goal. Remember that goals contain goal variables as well as term variables, so as well as using all our previous definitions for substitution application we must define what it means to apply a substitution to a goal variable. This is considered in the final two predicates.

    ASGoal : Substitution → optional[Goal] → optional[Goal]
    ────
    ∀ s : Substitution; a : Action; b : Belief; at : Atom; g1, g2 : Goal; gv : GVar •
      ASGoal s ∅ = ∅ ∧
      ASGoal s {bac a} = {bac (ASAction s a)} ∧
      ASGoal s {query b} = {query (ASBelief s b)} ∧
      ASGoal s {achieve at} = {achieve (ASAtom s at)} ∧
      ASGoal s {seqcomp(g1, g2)} = {seqcomp(the(ASGoal s {g1}), the(ASGoal s {g2}))} ∧
      ASGoal s {choice(g1, g2)} = {choice(the(ASGoal s {g1}), the(ASGoal s {g2}))} ∧
      gv ∉ (dom s) ⇒ ASGoal s {goalvar gv} = {goalvar gv} ∧
      gv ∈ (dom s) ⇒ ASGoal s {goalvar gv} = {goal⁻¹ (s gv)}

A.2 Composition of Substitutions

Consider two substitutions τ and σ such that no variable bound in σ appears anywhere in τ. The composition of τ with σ, written τ ‡ σ, is obtained by applying τ to the terms in σ and combining these with the bindings from τ. For example, if τ = {x/A, y/B, z/C} and σ = {u/A, v/F(x, y, z)} then, since none of the variables bound in σ (u, v) appear in τ, it is meaningful to compose τ with σ. In this case τ ‡ σ = {u/A, v/F(A, B, C), x/A, y/B, z/C}. The definition is a bit convoluted, though, because of the typing needed to include goal variables.

    _ ‡ _ : (Substitution × Substitution) → Substitution
    ────
    ∀ τ, σ : Substitution |
        (dom σ) ∩ ((dom τ) ∪ ⋃{t : FOTerm; s : SubTerm | (s = term t) ∧ (s ∈ (ran τ)) • fovars t}) = ∅ •
      τ ‡ σ = (τ ∪ {x : Var; t : FOTerm; s : SubTerm | (s = term t) ∧ (x, s) ∈ σ • (x, term (ASTerm τ t))})

A.3 Unification

A substitution is a unifier for two terms if the substitution, applied to both of them, makes them equal.

    unifyterms : ℙ(Substitution × (FOTerm × FOTerm))
    ────
    ∀ t1, t2 : FOTerm; s : Substitution •
      unifyterms(s, (t1, t2)) ⇔ (ASTerm s t1 = ASTerm s t2)
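Substitution application, composition and the unifier test for terms can be sketched in Python as follows. The encoding is assumed for illustration: a variable is a `Var` object, a constant is a string, a functor term is a tuple `(f, arg1, ..., argn)`, and a substitution is a dict from `Var` to term. Goal variables are left out, so this covers only the first-order part of the definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


def apply_subst(subst, term):
    """ASTerm: replace bound variables, leave constants and unbound variables alone."""
    if isinstance(term, Var):
        return subst.get(term, term)
    if isinstance(term, tuple):
        return (term[0],) + tuple(apply_subst(subst, a) for a in term[1:])
    return term


def term_vars(term):
    """fovars: the set of variables occurring in a term."""
    if isinstance(term, Var):
        return {term}
    if isinstance(term, tuple):
        return set().union(*(term_vars(a) for a in term[1:]))
    return set()


def compose(tau, sigma):
    """tau ‡ sigma, defined only if no variable bound in sigma appears anywhere in tau."""
    tau_vars = set(tau) | {v for t in tau.values() for v in term_vars(t)}
    if set(sigma) & tau_vars:
        raise ValueError("a variable bound in sigma appears in tau")
    result = dict(tau)
    result.update({x: apply_subst(tau, t) for x, t in sigma.items()})
    return result


def is_unifier(subst, t1, t2):
    """unifyterms: the substitution makes both terms equal."""
    return apply_subst(subst, t1) == apply_subst(subst, t2)
```

With τ = {x/A, y/B, z/C} and σ = {u/A, v/F(x, y, z)}, `compose(tau, sigma)` reproduces the composition given in the example of Section A.2.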


A substitution is a unifier for two goals if the substitution, applied to both of them, makes them equal.

    unifygoals : ℙ(Substitution × (Goal × Goal))
    ────
    ∀ g1, g2 : Goal; ϑ : Substitution •
      unifygoals(ϑ, (g1, g2)) ⇔ (ASGoal ϑ {g1}) = (ASGoal ϑ {g2})

A substitution is more general than another substitution if there exists a third substitution which, when composed with the first, gives the second.

    _ mg _ : ℙ(Substitution × Substitution)
    ────
    ∀ ϑ, γ : Substitution •
      γ mg ϑ ⇔ (∃ ω : Substitution • (γ ‡ ω) = ϑ)

The mgu of two goals is specified as follows.

    mgu : (Goal × Goal) → Substitution
    ────
    ∀ g1, g2 : Goal; γ : Substitution •
      mgu(g1, g2) = γ ⇔
        (unifygoals(γ, (g1, g2)) ∧
         ¬ (∃ ω : Substitution • (unifygoals(ω, (g1, g2)) ∧ (ω mg γ))))
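The specification above characterises the mgu declaratively and does not say how to compute it. One common way to obtain a most general unifier is Robinson-style syntactic unification with an occurs check; the Python sketch below uses that technique, which is a choice made here rather than something prescribed by the specification. Terms and goals share one assumed encoding (`Var` objects, string constants, tuples `(symbol, arg1, ..., argn)`), and goal variables are treated like ordinary variables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


def apply_subst(subst, term):
    """Apply a substitution (dict Var -> term) to a term."""
    if isinstance(term, Var):
        return subst.get(term, term)
    if isinstance(term, tuple):
        return (term[0],) + tuple(apply_subst(subst, a) for a in term[1:])
    return term


def occurs(var, term):
    """True if var occurs anywhere in term."""
    if term == var:
        return True
    return isinstance(term, tuple) and any(occurs(var, a) for a in term[1:])


def mgu(t1, t2):
    """Return a most general unifier as a dict, or None if the terms do not unify."""
    subst = {}
    pending = [(t1, t2)]
    while pending:
        a, b = pending.pop()
        a, b = apply_subst(subst, a), apply_subst(subst, b)
        if a == b:
            continue
        if isinstance(a, Var):
            if occurs(a, b):
                return None
            # keep the substitution idempotent: eliminate a from earlier bindings
            subst = {v: apply_subst({a: b}, t) for v, t in subst.items()}
            subst[a] = b
        elif isinstance(b, Var):
            pending.append((b, a))
        elif (isinstance(a, tuple) and isinstance(b, tuple)
              and a[0] == b[0] and len(a) == len(b)):
            pending.extend(zip(a[1:], b[1:]))
        else:
            return None
    return subst
```

Because earlier bindings are rewritten whenever a new variable is bound, no variable in the domain of the result occurs in its range, which matches the well-formedness condition placed on substitutions at the start of this appendix.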

How to Drive a B Machine

Helen Treharne and Steve Schneider

Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, TW20 0EX, UK.
Fax: +44 (0)1784 439786
{helent,steve}@dcs.rhbnc.ac.uk

Abstract. The B-Method is a state-based formal method that describes behaviour in terms of MACHINES whose states change under OPERATIONS. The process algebra CSP is an event-based formalism that enables descriptions of patterns of system behaviour. We present a combination of the two views where a CSP process acts as a control executive and its events simply drive corresponding OPERATIONS. We define consistency between the two views in terms of existing semantic models. We identify proof conditions which are strong enough to ensure consistency and thus guarantee safety and liveness properties.

Keywords: B-Method, CSP, Embedded Systems, Programming Calculi, Combining Formalisms.

1 Introduction

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 188–208, 2000. © Springer-Verlag Berlin Heidelberg 2000

State based methods such as B specify functional aspects of a system and the effect of individual operations. On the other hand event-based process algebras are concerned with patterns of operations. System designers are interested in both these aspects of a system and thus a combination of state and event based descriptions of a system is desirable. The systems that originally motivated our need to consider both viewpoints were safety-critical systems, for example embedded interlock systems. This paper provides a safe way of describing a combined view of a system.

Systems have successfully been modelled as collections of interdependent machines within the B Method. An abstract MACHINE is described using the Abstract Machine Notation (AMN). In this paper we adopt the convention that AMN keywords are indicated in italic capitals. Large MACHINEs can be constructed from other MACHINEs using INCLUDES, SEES and other constructs. A MACHINE encapsulates some local state and a collection of modules called OPERATIONS. OPERATIONS in a MACHINE can be pre-conditioned or guarded.

We are interested in specifying embedded systems and refer to a B Abstract System in terms of the MACHINE at the top of the hierarchy of MACHINES which specify the following two kinds of OPERATIONS. Firstly, pre-conditioned OPERATIONS describe the modules which will be refined to code. They have the form PRE R THEN T END. If an OPERATION is invoked when the precondition R is true it will behave as specified by T. However, if the OPERATION is invoked outside its pre-condition the resulting execution may be an incorrect behaviour of the system. Secondly, the OPERATIONS which provide a model of the system context have the form SELECT P THEN V END, where P is a guard and V describes the effect of invoking the OPERATION. Guards are predicates on the state of a MACHINE which constrain the cases when an OPERATION is entitled to be invoked. If an OPERATION is invoked when the guard is true then the system will behave as expected with respect to the specification, as was the case above. However, if the guard is false then execution is blocked.

Process algebras such as Communicating Sequential Processes (CSP) [8] are concerned with the evolution of systems as they execute sequences of events. They are appropriate for describing execution patterns. In this paper, we will show how events in a CSP recursive loop determine which corresponding OPERATION should execute. Thus we view the AMN specifications as providing abstract models of reactions to events. The recursive loop can be viewed as an execution checker and we will refer to it as a control executive. Thus in a combined view of a system a control executive for a system is described using a process algebra which in turn drives the individual state transitions of an Abstract System.

In general a CSP control executive could invoke an OPERATION outside its pre-condition, resulting in divergent behaviour. In [15] we gave conditions which ensured this did not occur. With guarded OPERATIONS we also need to ensure deadlock freedom so that a control executive never gets stuck trying to invoke OPERATIONS which are blocked. Ensuring deadlock freedom is the contribution of this paper. The main result of this paper is that we introduce a new proof condition which guarantees deadlock freedom in the context of divergence freedom. Furthermore, we verify that this new condition is strong enough to ensure the consistency of a combined system consisting of guarded OPERATIONS.
In this verification we think of an Abstract System as a process and its combination with the control executive is essentially their parallel composition in CSP. In formally justifying the link between these state and event-based methods we were influenced by the existing correspondence between Action Systems and CSP. This correspondence is described by Morgan [9] in terms of weakest precondition semantics and the failures-divergences model. We assume the reader is familiar with AMN. Further details can be found in [1]. However, we will introduce the CSP notation we require. This paper is organised as follows. Section 2 gives a brief overview of CSP. Sections 3, 4 and 5 contain the main contribution of the paper. They present the theoretical foundations of the specific relationship between B and CSP. Section 6 illustrates this new relationship in relation to our previous work on divergence freedom. The final section contains a discussion and conclusions. Proofs of the results have been omitted for reasons of space and can be found in the technical report [16].


H. Treharne and S. Schneider

2 Overview of CSP

This section provides a brief introduction to the CSP used in this paper. More details can be found in [8,13,11].

2.1 The Language

CSP describes systems in terms of processes, which perform events. The set of all events is called Σ. Events are either atomic (e.g. on, off), or they may be structured into a number of components separated by dots (e.g. send.5). Communications of values along channels will be described using structured events, so for example the transmission of value 5 along channel send will be described with the event send.5.

CSP provides a language for describing processes. This includes basic processes such as STOP, the process which does nothing, and DIV, the (divergent) process which represents an infinite internal loop. It also contains process constructors for building up process descriptions. The event prefixing expression a → P means that the process is prepared to engage in the atomic event a and then behaves as the process P. Input of a value x of type T along a channel c is described by c?x : T → P, where P is the subsequent process, which may depend on the input value x. Output of a value v along channel c is described as c!v → P with the subsequent behaviour given by P. The expression P □ Q offers an external choice between the two processes P and Q: initially it is prepared to behave as either P or Q, and this choice is resolved by the occurrence of the first event, which can be chosen by the user or environment of this choice. Standard conditional statements if b then P else Q are also in the language.

Processes execute in parallel by requiring synchronisation on events. The parallel combination P ∥ Q executes P and Q concurrently, but the combination can only perform an event when both parties are willing to perform it. Thus parallel combination can introduce deadlock if the parties cannot agree on any next event. Finally, processes can be defined by means of recursive definitions: the names of recursively defined processes can be referred to in the process definitions themselves.
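To see how a fully synchronised parallel combination can deadlock, the following Python sketch models finite, non-recursive processes as nested dictionaries from events to successor processes. This encoding is purely an assumption for illustration: it ignores DIV, internal actions and recursion, and `ext_choice` is restricted to processes with distinct initial events.

```python
STOP = {}  # the process that offers no events


def prefix(event, process):
    """a → P: offer event, then behave as process."""
    return {event: process}


def ext_choice(p, q):
    """P □ Q, simplified: the first event decides which side continues."""
    if p.keys() & q.keys():
        raise ValueError("sketch assumes distinct initial events")
    return {**p, **q}


def par(p, q):
    """P ∥ Q with synchronisation on every event: both sides must be willing."""
    return {e: par(p[e], q[e]) for e in p.keys() & q.keys()}


def offers(p):
    """The set of events the process is initially prepared to perform."""
    return set(p)
```

For instance, `par(prefix("a", STOP), prefix("b", STOP))` offers no events at all, because the two parties cannot agree on a first event, whereas composing `ext_choice(prefix("a", STOP), prefix("b", STOP))` with `prefix("b", STOP)` still offers b.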
As an example of a recursive definition, a one place buffer containing a value v can be defined as follows:

    BUF(v) = out!v → in?x : T → BUF(x)

The process definition on the right hand side is called the body of the definition. In fact this is a family of equational definitions, one for each possible value of v. The family of processes defined in this way could also be written as a vector of processes BUF indexed by the possible values that v could take.

2.2 Semantics

The CSP approach to semantics is to define the semantics of a process as the set of all observations that may be made of it. The particular kind of observation

How to Drive a B Machine

191

determines the semantic model being used. All the models have a structure which ensures that recursively defined processes are always well-defined. The simplest model is the traces model which describes processes P in terms of traces(P ), the set of all possible sequences of events that P can perform. The CSP process operators are such that the traces of a process can be determined in a compositional way from the traces of its components, so for example traces(a → P ) = {hi} ∪ {hai a tr | tr ∈ traces(P )} Here hi is the empty sequence, and tr1 a tr2 is the concatenation of tr1 and tr2 . Another semantic model is the stable failures model. In this model the semantic value of a process P is given as two sets: the set of traces as in the previous model, and the set of all its stable failures. A stable failure of a process P is a trace/refusal pair (tr , X ) that P can exhibit, where tr is a sequence of events that P can perform, reaching a stable state (from which no further internal progress can occur), and X ⊆ Σ is a set of events that P can refuse to participate in from that state. Thus the process DIV has traces(DIV ) = {hi}, and no stable failures at all (since it never reaches a stable state). Examples of stable failures of BUF (3) include the empty failure (hi, ∅) and the trace/refusal pair (hout.3, in.6i, {out.4, out.5}). The stable failures model is unable to address the issue of divergence, since it does not contain any information about unstable states. Instead a third more complicated semantic model, the failures-divergences model, provides a suitable treatment of divergence. The semantic value of a process in this model consists of two sets: its set of divergences—sequences of events which can lead to infinite internal progress; and its set of all failures, which comprises the stable failures together with (for technical reasons) all divergence/refusal pairs. 
For divergence-free processes, the stable failures model and the failures-divergences model are equivalent. In this paper we will use the stable failures model for simplicity.

2.3

Specification

A specification is a predicate on all of the possible behaviours of a process. We will be concerned with specifications on traces (written ST(tr)), and specifications on stable failures (written SF(tr, X)). A process P meets a trace specification ST(tr) if

P sat ST(tr) ⇔ ∀ tr ∈ traces(P) • ST(tr)

Trace specifications are used to capture safety requirements on processes, i.e. they constrain which traces may occur. For example, the trace specification

SB(tr) = tr ↓ in ≤ tr ↓ out

states explicitly that the number of communications on channel in should be no greater than the number on out (we use the notation tr ↓ c to denote the number of communications on c appearing in the trace tr). Thus BUF(3) sat SB(tr).
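As a rough illustration of the sat relation for trace specifications, the following Python sketch checks a predicate against every trace in a finite set. The sample trace set is hypothetical (it is not derived from the definition of BUF), and events are assumed to be strings of the form channel.value.

```python
def count(tr, channel):
    """tr restricted to c: the number of communications on channel c in tr."""
    return sum(1 for ev in tr if ev.split(".")[0] == channel)


def sat(traces, spec):
    """P sat S_T(tr) holds when every trace of P satisfies the predicate."""
    return all(spec(tr) for tr in traces)


def s_b(tr):
    """The specification S_B: inputs never outnumber outputs."""
    return count(tr, "in") <= count(tr, "out")


# Hypothetical prefix-closed trace set, assumed only for illustration.
sample = {(), ("out.3",), ("out.3", "in.6"), ("out.3", "in.6", "out.6")}

print(sat(sample, s_b))                 # True
print(sat(sample | {("in.1",)}, s_b))   # False
```

A single offending trace is enough to violate the specification, which is what makes trace specifications suitable for safety requirements.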


H. Treharne and S. Schneider

Similarly:

P sat SF(tr, X) ⇔ ∀ (tr, X) ∈ stable failures(P) • SF(tr, X)

We can use refusal sets of a stable failure to describe liveness requirements. For example, the requirement that a process should be deadlock-free is expressed with the predicate 'X ≠ Σ': the refusal set X should never be the set of all events. This follows from the fact that a system has reached deadlock precisely when it can make no further progress, i.e. when it refuses to perform any more events, corresponding to a possible refusal set of Σ. Deadlock freedom requires that this can never occur. Thus BUF(3) sat X ≠ Σ.

An assertion of the form P sat S can be established by considering the semantics of P in the appropriate model. There are specification proof rules for each operator derived from the rules used in defining the semantics. There is also a rule for establishing when vectors of processes N defined by mutual recursion N = F(N) meet (pointwise) a corresponding vector of satisfiable specifications S. The general form of the rule is as follows:

∀ Y • Y sat S ⇒ F(Y) sat S
--------------------------
N sat S

This rule works both within the traces model (with each S as a trace specification) and in the stable failures model (with each S as a specification on stable failures). For example, in the traces model the rule can be used to show that BUF sat SB(tr).

3

A Simple Coupling between B and CSP Loops

In this section and in Sections 4 and 5 we define and verify the framework so that a control executive, containing CSP events, ensures that the guards of the corresponding B OPERATIONS are enabled. The OPERATIONS model the reaction to the events in the control executive. Introducing the framework in stages aids clarity of presentation and the appropriate development of technical details. We have already stated that the applications that originally motivated this work were safety-critical systems. These systems are designed to run on sequential processors so in this paper we are not concerned with concurrency issues. This section is split into two parts. Firstly, we discuss how a control executive can be described. Secondly, we discuss consistency between a control executive and an Abstract System. In doing so, we briefly review previous work on control executives and proof conditions which ensure divergence freedom of these loops and their associated Abstract Systems. Then, we identify a new condition to show when control executives are consistent with MACHINE descriptions which contain guarded OPERATIONS, in the sense that they do not introduce unexpected deadlocks.


3.1


Developing a Control Executive

Consider a recursive CSP process, LOOP. In general this will be defined using a parameterised mutual recursion. A family of processes S(p) is used to define LOOP, where p is a collection of parameters for keeping track of which process to execute. In the BUF example of Section 2.1, the process BUF was parameterised by the contents of the buffer v. Each process definition of a control executive represents a sequence of B OPERATIONS to be executed by using an event Eop for each B OPERATION op. Only information which affects the execution of the OPERATIONS needs to be carried in the parameters. In the simple case they will simply be numerical indices. For example, the following LOOP describes a recursive process which alternates between the events Eup and Edown.

LOOP = S(0)
S(0) = Eup → S(1)
S(1) = Edown → S(0)

So in general we would have the following

LOOP = S(0)
S(0) = R0
  ⋮
S(n) = Rn

where LOOP is bound to a process name with an initial parameter of 0 and each Ri is a CSP process expression which will describe some behaviour of the OPERATIONS and the possible S(i)s that can subsequently be reached.

We first introduced the syntax of our control language in [15] to develop non-terminating loops. In this paper, our syntax will also enable us to define terminating loops. However, as we stated above, the framework will be developed in stages. Thus we start with simple non-terminating loops consisting of atomic events. The syntax of the CSP terms in the process bodies for non-terminating loops is given by the following pseudo-BNF rule:

R ::= a → R | R1 □ R2 | S(p)

Event prefixing and external choice are defined as in Section 2. The event a is of the form Eop where op is a B OPERATION. S(p) is a process name where p is an expression. Each process body will contain a recursive call, S(p). For example, in the above process S(1) the last term in its definition is S(0) in order to provide a binding for the mutual recursive case.
Furthermore, we restrict S (p) from being part of a choice. This restriction is convenient for technical reasons which will be elaborated in Section 3.5.
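The grammar above is small enough to be represented directly as a data type. The following Python sketch is one possible encoding, not the authors' tooling: the class names `Prefix`, `Choice` and `Call` and the function `traces_upto` are illustrative assumptions. It enumerates the traces of the alternating LOOP up to a length bound by unfolding the recursive calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prefix:          # a -> R
    event: str
    then: object


@dataclass(frozen=True)
class Choice:          # R1 [] R2 (external choice)
    left: object
    right: object


@dataclass(frozen=True)
class Call:            # recursive call S(p)
    param: int


def traces_upto(defs, term, bound):
    """Traces of `term` with at most `bound` events, unfolding S(p) via `defs`.

    Assumes no definition reaches itself without an intervening event prefix.
    """
    if isinstance(term, Call):
        return traces_upto(defs, defs[term.param], bound)
    if isinstance(term, Choice):
        return (traces_upto(defs, term.left, bound)
                | traces_upto(defs, term.right, bound))
    if bound == 0:
        return {()}
    rest = traces_upto(defs, term.then, bound - 1)
    return {()} | {(term.event,) + tr for tr in rest}


# LOOP = S(0), S(0) = up -> S(1), S(1) = down -> S(0)
LOOP = {0: Prefix("up", Call(1)), 1: Prefix("down", Call(0))}
print(sorted(traces_upto(LOOP, Call(0), 3)))
# [(), ('up',), ('up', 'down'), ('up', 'down', 'up')]
```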


MACHINE embedded_switch
VARIABLES person, status
INVARIANT person ∈ ℕ ∧ status ∈ 0..1 ∧ person ≤ 5
INITIALISATION person := 0 ‖ status := 0
OPERATIONS
  on ≙ SELECT person > 0 ∧ status = 0 THEN status := 1 END;
  off ≙ SELECT person = 1 ∧ status = 1 THEN status := 0 END;
  enter ≙ SELECT person < 5 THEN person := person + 1 END;
  leave ≙ SELECT person > 0 THEN person := person − 1 END
END

Fig. 1. Embedded light switch

3.2

Consistency of a CSP Control Executive and a B Abstract System

Once we have a CSP control executive, LOOP, we will need to demonstrate that it is appropriate for a particular Abstract System, M, by defining the notion of deadlock freedom on the combination (LOOP || M). Abstract Systems can be given CSP failures-divergences semantics as shown in [9,3]. Deadlock can occur in a B Abstract System when the guard of an OPERATION is false and thus execution is disallowed. Therefore, in our correspondence between CSP and B, the notion of deadlock freedom we require is that not all of the OPERATIONS offered in the CSP are actually blocked (with false guards) in the B.

Consider the MACHINE in Figure 1. It defines an embedded light switch in a room. A simple control executive for this MACHINE allowing only one person in the room at a time would be:

ROOM = S(0)
S(0) = Eenter → Eon → Eoff → Eleave → S(0)

Clearly all the guards of the OPERATIONS are true whenever they are invoked by the control executive. Conversely, if we tried to turn the light off when two people are in the room it would deadlock, since the value of person does not match the guard.
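To make the two scenarios above concrete, here is a minimal Python sketch that animates the MACHINE of Figure 1 as a table of guards and updates. The encoding (`OPS`, `INIT`, `run`) is our own illustrative assumption and is not the authors' tool support.

```python
# Each OPERATION is a (guard, update) pair over a state dictionary.
OPS = {
    "on": (lambda s: s["person"] > 0 and s["status"] == 0,
           lambda s: {**s, "status": 1}),
    "off": (lambda s: s["person"] == 1 and s["status"] == 1,
            lambda s: {**s, "status": 0}),
    "enter": (lambda s: s["person"] < 5,
              lambda s: {**s, "person": s["person"] + 1}),
    "leave": (lambda s: s["person"] > 0,
              lambda s: {**s, "person": s["person"] - 1}),
}
INIT = {"person": 0, "status": 0}


def run(events, state=INIT):
    """Execute events in order; stop at the first one whose guard is false.

    Returns (state, None) on success or (state, blocked_event) otherwise.
    """
    for ev in events:
        guard, update = OPS[ev]
        if not guard(state):
            return state, ev
        state = update(state)
    return state, None


print(run(["enter", "on", "off", "leave"]))
# ({'person': 0, 'status': 0}, None)
print(run(["enter", "enter", "on", "off"]))
# ({'person': 2, 'status': 1}, 'off')
```

The first run is one pass through the body of ROOM and returns to the initial state. The second blocks on off because two people are in the room.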


3.3


Reviewing Divergence Freedom

Recall that in Section 3.1 we referred to a family of processes S(p) to define a mutually recursive process LOOP. In [15] we used such recursive loops to control the execution of pre-conditioned OPERATIONS. In order to ensure consistency of sequences of OPERATIONS of the form PRE R THEN W END (where W did not contain guarded substitutions) we needed to find a control loop invariant, CLI. In [15] we stated that the CLI need not hold after each individual OPERATION but must hold at every recursive call in order to guarantee divergence freedom. Reaching a recursive call corresponds to a maximal trace of a body of a process expression. We also introduced two conditions. Firstly, the initialisation of the Abstract System establishes the CLI. Secondly, any sequence of OPERATIONS corresponding to execution between recursive calls also establishes the CLI. We then defined the notion of consistency so that if such sequences of OPERATIONS could establish the CLI we knew that all the OPERATIONS were called within their pre-conditions and terminated. If we could demonstrate this for all the bodies of a mutually recursive loop then the loop was demonstrated to be divergence-free.

3.4

Conditions for Deadlock Freedom

In this paper we will also use the above CLI to record that we are at a recursive call of a control executive. In essence this will serve as an anchor for examining the MACHINE's possibilities at each point through the processes of a control executive. We need a stronger invariant than the invariant of an Abstract System. A CLI which is appropriate for the process ROOM and the embedded switch Abstract System is person = 0 ∧ status = 0. The importance of the CLI for this example is highlighted at the end of Section 3.5.

The nature of what we have to prove here is stronger than in [15]. However, all of the analysis done in the following sections is done in the context of divergence freedom and the presence of a CLI, so the stable failures model for CSP will be sufficient for our needs. We need to make sure that we do not deadlock at any point during the execution of the processes and so all the traces along the bodies of processes need to be checked individually. In this paper we define a function PAIRS to relate CSP traces to MACHINE guards. We show that the following condition is sufficient to establish deadlock freedom for simple non-terminating loops. Since traces(DIV) = {⟨⟩}, any trace of Rp[DIV/S] must be a trace of the body of Rp.

Condition 1
∀ tr • tr ∈ traces(Rp[DIV/S]) ∧ CLI ∧ I ∧ cb = p ⇒ wp(tr, PAIRS(tr, Rp))

This condition states that for all traces, tr, of Rp before a recursive call, and given that the invariants hold before the body is executed with the appropriate value of the control variables, the state reached after that trace enables a guard of at least one of the OPERATIONS corresponding to the next possible events. In


PAIRS(⟨⟩, Ea → R) = ga
PAIRS(⟨Ea⟩ ⌢ tr, Ea → R) = PAIRS(tr, R)
PAIRS(⟨⟩, R □ R′) = PAIRS(⟨⟩, R) ∨ PAIRS(⟨⟩, R′)
PAIRS(tr, R □ R′) = PAIRS(tr, R)                     if tr ∉ traces(R′)
                    PAIRS(tr, R′)                    if tr ∉ traces(R)
                    PAIRS(tr, R) ∧ PAIRS(tr, R′)     otherwise
                    where tr ≠ ⟨⟩
PAIRS(tr, S(p′)) = true

Fig. 2. PAIRS definition

fact whenever tr ∈ traces(Rp[DIV/S]) the function PAIRS(tr, Rp) gives the weakest condition which needs to be true so that some OPERATIONS will be enabled by Rp after tr has occurred. Each process body Rp is subscripted with p to highlight which process is referred to within the family of processes and its body is bound by a process name S(p), as stated in Section 3.1. The predicate cb = p arises from modelling control variables to correspond to which process S(p) is being executed. The control variables are not part of the Abstract System but do correspond to AMN variables, which is why they are subscripted with b. There will be one control variable for each CSP parameter and thus one corresponding predicate. The value of cb equals the value of the index in the parameter of the process. It is present because the CLI could relate the parameters of the processes with the state of the B MACHINE. For each process body Rp of a particular control system this condition gives rise to several proof obligations that would need to be proved.

In the above condition we extract the traces of the body of a process when we view Rp as a function and substitute the process DIV for the appropriate recursive call S (Rp[DIV/S]). We use the process DIV since it is the base case in the CSP stable failures model when building up the traces for the process body, i.e. its only trace is the empty trace. For example, given a process body

R0 = Eb → ((Ec → S(0)) □ (Ed → S(0)))

only ⟨⟩, ⟨Eb⟩, ⟨Eb, Ec⟩ and ⟨Eb, Ed⟩ need to be checked. These are the only traces of Eb → ((Ec → DIV) □ (Ed → DIV)).

3.5

Determining Guards Using PAIRS

In Condition 1 we introduced the function PAIRS given in Figure 2. Given a particular sequence of events and a CSP process body the function determines which corresponding guards in the B should be offered next. It is defined over the terms in our CSP language and their trace semantics. In the definition of PAIRS the guard of an OPERATION op is denoted by gop .
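The clauses of Figure 2 translate almost line for line into a recursive function. The Python sketch below is an illustration under our own assumptions: process bodies are encoded with hypothetical classes `Prefix`, `Choice` and `Call`, guards are supplied as predicates on a machine state, and the guards used in the example (over a single integer state) are invented purely to exercise the function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prefix:          # Ea -> R
    event: str
    then: object


@dataclass(frozen=True)
class Choice:          # R [] R'
    left: object
    right: object


@dataclass(frozen=True)
class Call:            # recursive call S(p)
    param: int


def body_traces(term):
    """traces(R[DIV/S]): a recursive call contributes only the empty trace."""
    if isinstance(term, Call):
        return {()}
    if isinstance(term, Prefix):
        return {()} | {(term.event,) + tr for tr in body_traces(term.then)}
    return body_traces(term.left) | body_traces(term.right)


def pairs(tr, term, guards):
    """PAIRS(tr, R) as a predicate on states; tr must be a trace of R[DIV/S]."""
    if isinstance(term, Call):
        return lambda state: True
    if isinstance(term, Prefix):
        if not tr:
            return guards[term.event]
        return pairs(tr[1:], term.then, guards)
    in_left = tr in body_traces(term.left)
    in_right = tr in body_traces(term.right)
    p_left = pairs(tr, term.left, guards) if in_left else None
    p_right = pairs(tr, term.right, guards) if in_right else None
    if not tr:                       # choice not yet resolved: disjunction
        return lambda state: p_left(state) or p_right(state)
    if not in_right:
        return p_left
    if not in_left:
        return p_right
    return lambda state: p_left(state) and p_right(state)


# R0 = b -> ((c -> S(0)) [] (d -> S(0))), with hypothetical integer guards.
R0 = Prefix("b", Choice(Prefix("c", Call(0)), Prefix("d", Call(0))))
GUARDS = {"b": lambda n: True, "c": lambda n: n > 0, "d": lambda n: n < 0}

print(sorted(body_traces(R0)))            # [(), ('b',), ('b', 'c'), ('b', 'd')]
print(pairs(("b",), R0, GUARDS)(3))       # True: the guard of c holds
print(pairs(("b",), R0, GUARDS)(0))       # False: both c and d are blocked
print(pairs(("b", "c"), R0, GUARDS)(0))   # True: a recursive call was reached
```

The sketch only computes the predicate PAIRS(tr, R). Discharging Condition 1 additionally requires evaluating it in the state reached after tr, which is the role of wp in the paper.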


For the term Ea → R, if the trace is empty then the guard from the corresponding OPERATION a is offered. Therefore, if we had a process of this form we would have to check that CLI ∧ I ∧ cb = p ⇒ wp(⟨⟩, ga) in Condition 1 holds in the first instance, i.e. the invariants and the control predicate must be strong enough to imply the guard of the OPERATION a. If the trace, tr, is not empty then the function PAIRS(tr, R) represents the disjunction of all the guards of all the OPERATIONS of the B MACHINE that the CSP control executive might perform next.

The definition containing the external choice term reflects the fact that when the trace is empty the choice is not resolved, so either PAIRS(⟨⟩, R) or PAIRS(⟨⟩, R′) holds so that at least one path of a process containing a choice will not deadlock. On the other hand, when the trace is not empty and the trace is of both R and R′ their conjunction must hold since the CSP control could be behaving as either R or R′. Therefore, the B MACHINE should be able to respond in both cases. Thus both possibilities should be deadlock-free. If the trace is of either R or R′ but not both, then deadlock freedom is required only for the appropriate branch of the choice.

The case containing the recursive call, S(p′), gives the predicate true. This is required as a base case. By the time a recursive call is reached, the existence of a CLI already ensures that the body of the loop is guaranteed to terminate and nothing further needs to be proved. For example, given the following process within a family of processes

P(0) = Ea → Eb → P(1)

and the maximal trace of its body, ⟨Ea, Eb⟩, the clause for the recursive case contributes to a simple instance of Condition 1 where the maximal trace terminates, i.e.

CLI ∧ I ∧ cb = 0 ⇒ wp(⟨Ea, Eb⟩, PAIRS(⟨Ea, Eb⟩, Ea → Eb → P(1))) = wp(⟨Ea, Eb⟩, true).

In Section 3.1 we restricted the binding term from being part of a choice. If we had allowed S(p′) to be part of a choice we would have had to provide a more complex translation mapping which referred to the guard of the first event of the next process, Rp′, to be executed. Thus at the cost of reduced expressiveness we prefer to restrict how S(p′) can be used. In practice this restriction does not cause a problem, since such choices can be re-written. For example,

S(0) = Eup → S(2)
S(2) = S(0) □ (Edown → S(0))

can be re-written as

S(0) = Eup → S(1)
S(1) = (Eup → S(1)) □ (Edown → S(0))

whose behaviour in any case is easier to understand.


For the example in Figure 1 with control executive ROOM, Condition 1 gives rise to the following checks that we have to prove:

CLI ∧ I ∧ c1 = 0 ⇒ wp(⟨⟩, genter) = genter                      (1)
CLI ∧ I ∧ c1 = 0 ⇒ wp(⟨Eenter⟩, gon)                            (2)
CLI ∧ I ∧ c1 = 0 ⇒ wp(⟨Eenter, Eon⟩, goff)                      (3)
CLI ∧ I ∧ c1 = 0 ⇒ wp(⟨Eenter, Eon, Eoff⟩, gleave)              (4)
CLI ∧ I ∧ c1 = 0 ⇒ wp(⟨Eenter, Eon, Eoff, Eleave⟩, true)        (5)

where CLI = person = 0 ∧ status = 0 and I = person ∈ ℕ ∧ status ∈ 0..1 ∧ person ≤ 5.

The above proof obligations highlight the importance of the CLI. The invariant of the MACHINE alone is not strong enough to imply the guard of the B OPERATION enter, i.e. person < 5. In other words, there are some states in which enter is blocked. The CLI is used to establish that whenever enter is called by the control executive it is not blocked. All these obligations are trivial to prove. For example, obligation (2) expands to

person = 0 ∧ status = 0 ∧ person ∈ ℕ ∧ status ∈ 0..1 ∧ person ≤ 5 ∧ c1 = 0 ⇒ [person < 5 ⟹ person := person + 1](person > 0 ∧ status = 0)

From the above proof obligations you will notice that we are abusing the wp notation. There is a one-to-one mapping between the CSP events and their corresponding B OPERATIONS. We could set up a formal correspondence to capture this notion, where the empty trace ⟨⟩ corresponds to skip, the singleton trace ⟨Ea⟩ is simply the OPERATION named a, ⟨Ea, Eb⟩ corresponds to a; b, and so on.

3.6

Verification of Deadlock Freedom Consistency

Condition 1 in Section 3.4 is sufficient to ensure consistency between which OPERATIONS can be executed in the B and what is allowed by the CSP control executive. Therefore when the condition is met, LOOP is appropriate for the Abstract System M as stated below in Theorem 1. We need the following lemmas in order to prove the theorem. The first lemma links the refusals of a CSP process with the guards of its corresponding OPERATIONS by use of the function PAIRS. The lemma formalises that the state reached after a trace tr ensures a guard of the possible next events, i.e. after tr, it is guaranteed that some OPERATION x not in the refusal set X has its guard gx true. The information for the refusals comes from the CSP stable failures semantics.

Lemma 1. If R is a process body then

(tr, X) ∈ failures(R) ∧ wp(tr, PAIRS(tr, R)) ⇒ wp(tr, ⋁_{x ∈ Σ−X} gx)


In Section 3.2 we defined the control executive ROOM. Consider the example of the singleton trace ⟨Eenter⟩ with subsequent refusal {Eenter, Eleave, Eoff}. The above lemma allows us to conclude that the following holds:

(⟨Eenter⟩, {Eenter, Eleave, Eoff}) ∈ failures(Eenter → Eon → Eoff → Eleave → S(0))
∧ wp(⟨Eenter⟩, PAIRS(⟨Eenter⟩, Eenter → Eon → Eoff → Eleave → S(0)))
⇒ wp(⟨Eenter⟩, ⋁_{x ∈ Σ−X} gx)

In this case Σ = {Eenter, Eleave, Eon, Eoff} so ⋁_{x ∈ Σ−X} gx = gon, and this instance of Lemma 1 reduces to the following, which is true.

(⟨Eenter⟩, {Eenter, Eleave, Eoff}) ∈ failures(Eenter → Eon → Eoff → Eleave → S(0))
∧ wp(⟨Eenter⟩, gon)
⇒ wp(⟨Eenter⟩, gon)

Any refusal after performing Eenter must be a subset of {Eenter, Eleave, Eoff}. In this case Σ − X will always contain {Eon} and thus will not block the guard gon, which is true after performing the OPERATION corresponding to the event Eenter.

This is enough to establish the following lemma. It states that the specification of being able to ensure the guard of a possible next event is true for all traces of a body of a process and is preserved by recursive calls. This lemma is in the context of divergence freedom, since in its proof we refer to maximal terminating traces of a process and need to make sure that the CLI can be established at the end of the sequence of OPERATIONS corresponding to those events. In more detail, we look at an arbitrary process but divergence freedom must be true for any process within the family of processes. This is why we state that G preserves CLI in the lemma.

Lemma 2. If Y = G(Y) is a mutual recursive process such that G preserves CLI and meets Condition 1 then

∀ p • Yp sat ge(tr, X) ⇒ ∀ p • G(Y)p sat ge(tr, X)

where ge(tr, X) = CLI ∧ I ∧ cb = p ⇒ wp(tr, ⋁_{x ∈ Σ−X} gx)

Now we can state the following theorem. If the guards of at least one of the OPERATIONS corresponding to the events offered for execution are enabled then not all the events combined with their OPERATIONS can be refused. The failures in the theorem are those given by the failures-divergences model. However, we can move freely between the two semantic models since the CLI guarantees that (LOOP || M) is divergence-free, so the failures from the CSP will be the same in both models, as stated in Section 2.2.


INITIALISATION nn := 0
OPERATIONS
  up ≙ SELECT nn = 0 ∨ nn = 1 THEN nn := 1 END
  down ≙ SELECT nn = 1 THEN nn := 0 END
END

S(0) = Eup → S(1)
S(1) = (Eup → S(1)) □ (Edown → block → STOP)

Fig. 3. Example terminating control executive and MACHINE

Theorem 1. If LOOP sat ge(tr, X) then

∀ tr • (tr, Σ) ∉ failures(LOOP || M)

The corollary follows immediately.

Corollary 1. If Condition 1 holds for the body of LOOP, then (LOOP || M) is deadlock-free.
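For a finite-state example, the property asserted by Corollary 1 can also be checked by brute force rather than by proof. The Python sketch below is such a check under our own assumptions: the control executive is encoded as a hypothetical transition table mapping a control point to the events it offers, the MACHINE is the guard/update table of Figure 1, and `find_deadlock` explores the reachable combined states looking for one in which every offered event is blocked.

```python
from collections import deque

# Guards and updates of the MACHINE in Figure 1.
OPS = {
    "on": (lambda s: s["person"] > 0 and s["status"] == 0,
           lambda s: {**s, "status": 1}),
    "off": (lambda s: s["person"] == 1 and s["status"] == 1,
            lambda s: {**s, "status": 0}),
    "enter": (lambda s: s["person"] < 5,
              lambda s: {**s, "person": s["person"] + 1}),
    "leave": (lambda s: s["person"] > 0,
              lambda s: {**s, "person": s["person"] - 1}),
}
INIT = {"person": 0, "status": 0}


def find_deadlock(lts, ops, init, start=0):
    """Return a reachable (control point, state) where all offered events
    are blocked by their guards, or None if no such point exists."""
    first = (start, tuple(sorted(init.items())))
    seen, queue = {first}, deque([first])
    while queue:
        ctrl, frozen = queue.popleft()
        state = dict(frozen)
        enabled = [(ev, nxt) for ev, nxt in lts[ctrl] if ops[ev][0](state)]
        if not enabled:
            return ctrl, state
        for ev, nxt in enabled:
            node = (nxt, tuple(sorted(ops[ev][1](state).items())))
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return None


# ROOM: enter -> on -> off -> leave -> S(0), unrolled into control points.
ROOM = {0: [("enter", 1)], 1: [("on", 2)], 2: [("off", 3)], 3: [("leave", 0)]}
# A hypothetical faulty executive: enter -> leave -> leave -> S(0).
BAD = {0: [("enter", 1)], 1: [("leave", 2)], 2: [("leave", 0)]}

print(find_deadlock(ROOM, OPS, INIT))  # None
print(find_deadlock(BAD, OPS, INIT))   # (2, {'person': 0, 'status': 0})
```

Such a search only covers machines with finitely many reachable states. The proof conditions in the paper do not have this limitation.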

4

A Coupling for Terminating Loops

In this section we augment the control language to include atomic events to model terminating loops. We then discuss the impact of modelling such loops on our notion of deadlock. We also modify the proof condition to accommodate this change and verify consistency of these new loops.

4.1

Extended Syntax

The new syntax is defined as follows:

R ::= a → R | R1 □ R2 | S(p) | block → STOP

STOP can be used to terminate a process; however, it performs no events and so contributes nothing to a trace. The way in which we built the combined view above was to examine sequences of events and their corresponding sequences of OPERATIONS. We cannot simply use STOP because we need an event which appears in a CSP trace which we will map to an OPERATION in the B. By using the special event named block we have an event which can appear explicitly in a CSP trace and which corresponds to the guarded substitution SELECT false THEN skip END. The reason for the guard of block, gblock, being false will be explained in the following section. An example of a MACHINE and its control executive which includes block is shown in Figure 3.


4.2


Acceptable Deadlock

The new syntax we introduced above means that deadlocks can be explicitly introduced into the CSP by the event prefix block → STOP. These explicit deadlocks in the CSP are acceptable since we take the appearance of block in a control executive to indicate that termination is acceptable at that point. However, we do not wish to allow unexpected deadlocks which are introduced via the B as we discussed earlier. Therefore, in our correspondence between CSP and B, the notion of deadlock freedom we require is that not all of the OPERATIONS offered in the CSP are actually blocked (with false guards) in the B. However, we will not worry about deadlock if block can be the next event of a trace since that indicates acceptable deadlock at that point (e.g. after a controlled shutdown).

4.3

Modified Condition for Deadlock Freedom

The following condition is very similar to Condition 1. The only difference is that we restrict the traces that need to be examined.

Condition 2
∀ tr • block ∉ tr ∧ tr ⌢ ⟨block⟩ ∉ traces(Rp[DIV/S]) ∧ tr ∈ traces(Rp[DIV/S]) ∧ CLI ∧ I ∧ cb = p ⇒ wp(tr, PAIRS(tr, Rp))

This condition states that for all traces, tr, of the body of Rp which do not lead to blocking, and given that the invariants hold before the body is executed with the appropriate value of the control variables, the state reached after that trace enables a guard of at least one of the OPERATIONS corresponding to the next possible events. For example, for the process S(1) in Figure 3, we only need to check the traces ⟨⟩ and ⟨Eup⟩. We do not need to check ⟨Edown⟩ and ⟨Edown, block⟩ since they do not satisfy the restriction on traces in Condition 2: the first can be extended by block and the second contains block. Given this restriction on the traces that need to be checked, the definition of PAIRS remains unaffected, i.e. we do not need to provide a definition for the block → STOP case since it will never be needed.

We do not check the traces which lead to blocking since deadlock is explicitly permitted in such cases. Therefore, given the process Ec → block → STOP only CLI ∧ I ⇒ wp(⟨⟩, gc) needs to be checked. The system should not deadlock before the event Ec occurs.

4.4

Verifying Deadlock Freedom Consistency for Terminating Loops

All the lemmas and theorems introduced so far are based on processes which do not include the block event. Now consider processes which may contain the block event. We will obtain a similar result to Lemma 2 which takes the block event into account. Consider again the process S(1) from Figure 3. Its stable failures on the empty trace and the singleton trace ⟨Edown⟩ are

{(⟨⟩, X) | X ⊆ {block}}
{(⟨Edown⟩, X) | X ⊆ {Eup, Edown}}


Note that in the initial case the maximal X (ignoring block) is {}, and so Σ − X = {Eup, Edown} and so CLI ∧ I ∧ cb = 1 ⇒ wp(⟨⟩, gup ∨ gdown) must hold in order to satisfy the specification in Lemma 2. Following ⟨Edown⟩ we would naïvely need to show that

CLI ∧ I ∧ cb = 1 ⇒ wp(⟨Edown⟩, gblock) = wp(⟨Edown⟩, false)

which does not always hold. However, in this case we do not need to concern ourselves with satisfying the deadlock freedom specification since deadlock has been explicitly permitted by the inclusion of the block event in the process description.

The above failures provide an insight into the extra predicate that needs to be added to the specification so that we focus only on establishing deadlock freedom for the appropriate traces. For a given trace, if we need to ensure that a next possible guard is enabled, block will be able to augment the refusal set since it should not be possible in the trace. On the other hand, if explicit blocking occurs next then block is not in the refusals. Thus, the following lemma states that all traces which do not lead to explicit blocking are deadlock-free. This gives another property which is preserved by recursion.

Lemma 3. If Y = G(Y) is a mutual recursive process such that G preserves CLI, meets Condition 2 and which can contain the block event then

∀ p • Yp sat new_ge(tr, X) ⇒ ∀ p • G(Y)p sat new_ge(tr, X)

where new_ge(tr, X) = CLI ∧ I ∧ cb = ρ[p] ∧ block ∈ X ⇒ wp(tr, ⋁_{x ∈ Σ−X} gx)

Now we can state the theorem that if the trace cannot be extended by block then not all the events combined with their OPERATIONS can be refused. Thus all deadlocks are marked by block in the CSP. The theorem relies upon the stable failures axiom in CSP which states that given a trace and a refusal any event can be either appended to the trace or added to the refusals. It also relies on the property of subset closure in the refusals of a behaviour.

Theorem 2. If LOOP sat new_ge(tr, X) then

∀ tr • tr ⌢ ⟨block⟩ ∉ traces(LOOP) ⇒ (tr, Σ) ∉ failures(LOOP || M)

The corollary follows immediately.

Corollary 2. If Condition 2 holds for the body of LOOP, then (LOOP || M) is deadlock-free.
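The restriction that Condition 2 places on traces is easy to compute for a given process body. The Python sketch below, an illustration with hypothetical class and function names, extends the term encoding with a `Block` node for block → STOP and selects the traces that still need to be checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prefix:          # a -> R
    event: str
    then: object


@dataclass(frozen=True)
class Choice:          # R1 [] R2
    left: object
    right: object


@dataclass(frozen=True)
class Call:            # recursive call S(p)
    param: int


@dataclass(frozen=True)
class Block:           # block -> STOP
    pass


def body_traces(term):
    """traces(R[DIV/S]) for the extended syntax."""
    if isinstance(term, Call):
        return {()}
    if isinstance(term, Block):
        return {(), ("block",)}
    if isinstance(term, Prefix):
        return {()} | {(term.event,) + tr for tr in body_traces(term.then)}
    return body_traces(term.left) | body_traces(term.right)


def traces_to_check(term):
    """Traces that neither contain block nor can be extended by block."""
    all_traces = body_traces(term)
    return {tr for tr in all_traces
            if "block" not in tr and tr + ("block",) not in all_traces}


# S(1) from Figure 3: (up -> S(1)) [] (down -> block -> STOP)
S1 = Choice(Prefix("up", Call(1)), Prefix("down", Block()))
print(sorted(traces_to_check(S1)))                       # [(), ('up',)]
print(sorted(traces_to_check(Prefix("c", Block()))))     # [()]
```

Both results agree with the examples discussed in Section 4.3.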

5

Allowing Channels in Loops

This section follows the same pattern as the previous sections. We first extend the control language to include structured events and boolean expressions. We then describe what effect these new events have on deadlock freedom.


5.1


Further Extended Control Syntax

The new syntax is given as follows:

R ::= a → R | R1 □ R2 | if x then R1 else R2 end | c?x : T → R | S(p) | block → STOP

The conditional term is defined as in Section 2.1. The input term behaves as described in Section 2.1. In [15] we discussed the flow of information from the CSP into the B description. In particular, we distinguished between the environments of a control executive. We discussed the existence of an environment for the whole system which is external to both the CSP and B descriptions. The input term models an input from this environment which is then passed into the B specifications. We have only considered one input, but we could easily extend the approach to deal with many inputs.

We also described that all the outputs from the CSP originated in the B. In our restricted language there are no terms with the standard CSP syntax for output over a channel (c!v → P). In our combination this would correspond to both the control executive and the B description setting the output value. The CSP is simply driving the OPERATIONS and has no part in constraining the values of the outputs. Instead, we introduced a new piece of syntax c ? v → P to have precisely the CSP output semantics. The difference is that we view this term as the control executive passing information into the B specification. Further discussion of the use of output channels can be found in the technical report [16].

However, for a control executive to be well formed all of its variables must be bound. Variables are bound either by inputs from the external environment or from the B description or by appearing as parameters of the mutual recursion. An example of a control executive and its associated Abstract System based on this new syntax is given in Figure 4. This example meets Condition 2 and is deadlock-free. It illustrates how the variables x and f are bound by the environment and the parameter of the process L respectively.
The functionality of the example is the servicing of lift requests. The lift is at a particular floor f and the control executive accepts requests to move to floor x. It proceeds to ascend or descend to the requested floor as appropriate.

5.2

Preserving Consistency with New Syntax

In Section 4 we considered loops that could terminate and ignored traces which contained the block event in Condition 2. With the new syntax above we do need to consider the traces of the bodies of the processes of a control executive which include an input over a channel and those which have been influenced by the branching of the boolean condition. Therefore, we need to change the definition of PAIRS to include cases for inputs and conditional expressions as given in Figure 5. For the term which inputs a value from the environment, if the trace is empty a guard is present so that for some CSP inputs the corresponding guard in the B is true. For example, in Figure 4 the OPERATION request restricts its


INITIALISATION floor := 0 ‖ req := 0
OPERATIONS
  request(xx) ≙ PRE xx : NAT THEN SELECT xx ≠ floor THEN req := xx END END
  ascend ≙ SELECT req > floor THEN floor := req END
  descend ≙ SELECT req < floor THEN floor := req END
END

LIFT = L(0)
L(f) = Erequest?x : ℕ → if (x > f) then Eascend → L(x) else Edescend → L(x)

Fig. 4. Lift control executive and its associated MACHINE

PAIRS(⟨⟩, Ea?xc : T → R)ρ = ∃ xb ∈ T • ga(xb)
PAIRS(⟨Ea.v⟩ ⌢ tr, Ea?xc : T → R)ρ = PAIRS(tr, R(v))ρ    where v : T
PAIRS(tr, if xc then R1 else R2 end)ρ = (ρ[xc] ⇒ PAIRS(tr, R1)ρ) ∧ (¬ρ[xc] ⇒ PAIRS(tr, R2)ρ)

Fig. 5. PAIRS definition

function to inputs which are not of the same floor as the current floor of the lift. Execution is prohibited when the input is to the same floor as it is on. Therefore, in general blocking may occur on some inputs and the guard need not be true for all inputs, but deadlock will not occur provided at least one input is not blocked.

In the PAIRS clauses in Figure 5 we introduced a binding which is used to track updates to CSP variables. Use of the binding, ρ, to keep track of the values of variables is a standard technique in denotational semantics [12,14]. In the clause containing the conditional term we use it to extract the value of xc so that the appropriate predicate related to the guards of either R1 or R2 is offered. Note also that the control predicate in Condition 2 can no longer simply refer to the value of the index in the parameter of the process. We will need the predicate cb = ρ[p]. The value of cb equals the value of the expression and will be contained in ρ. It will be used to track the values associated with the variables in the CSP description. For example, the process L in Figure 4 gives rise to the control predicate cb = ρ[f] when considering traces of its process body.

Thus from the above, the impact of the new syntax on Condition 2 is minimal. We simply need to take the binding ρ into account and provide additional clauses for PAIRS. Their impact on the verification of consistency is also minimal. The theorems and corollaries remain unchanged. In the supporting lemmas the binding has to be taken into consideration in order to be able to consider bounded expressions. Therefore, when we examine the set of failures of a process we need to look at the set of failures under ρ, i.e. failures[[P]]ρ. We also have to add new cases in the structural induction of Lemma 1 for input and conditional terms. These changes are technical details. What we have to prove and the structure of the proof to demonstrate that we are deadlock-free remain the same.

MACHINE embedded_switch
VARIABLES person, status
INVARIANT person ∈ ℕ ∧ status ∈ 0..1 ∧ person ≤ 5
INITIALISATION person := 0 ‖ status := 0
OPERATIONS
  on ≙ PRE person > 0 ∧ status = 0 THEN status := 1 END;
  off ≙ PRE person = 1 ∧ status = 1 THEN status := 0 END;
  enter ≙ SELECT person < 5 THEN person := person + 1 END;
  leave ≙ SELECT person > 0 THEN person := person − 1 END
END

Fig. 6. Revised Embedded light switch

6 Example with Divergence and Deadlock Freedom

We conclude this paper by showing how the divergence and deadlock freedom verification can be applied to a small example in our style of specification. Earlier in Figure 1 we considered an embedded light switch consisting of guarded statements. We can also interpret this example with the light switch as the software to be developed and the room as its environment, where people can enter and leave. Therefore, the OPERATIONS on and off will be defined by pre-conditioned statements and not guarded ones as before, and so we have the MACHINE shown in Figure 6. The two conditions for divergence freedom for this system were outlined in Section 3.3 and are discussed in more detail in [15]. They state that the initialisation of the MACHINE must establish the control loop invariant, CLI, and that any sequence of OPERATIONS corresponding to a trace of events of


H. Treharne and S. Schneider

the body of the process ROOM must preserve the CLI:

[INITIALISATION]CLI
CLI ∧ I ∧ cb = 0 ⇒ [enter; on; off; leave]CLI

where CLI = (person = 0 ∧ status = 0) and I = (person ∈ ℕ ∧ status ∈ 0..1 ∧ person ≤ 5).

In Section 3.5 we stated the five conditions to be verified to ensure deadlock freedom. The guards of the OPERATIONS on and off are merely true and so the second and third proof obligations are simplified to show termination:

CLI ∧ I ∧ cb = 0 ⇒ wp(⟨E_enter⟩, true)
CLI ∧ I ∧ cb = 0 ⇒ wp(⟨E_enter, E_on⟩, true)

Since the CLI can be established for the above sequence of OPERATIONS (enter; on; off; leave), we can infer that any prefix of this sequence terminates, and hence establishes true. Therefore, we do not have to check the above proof obligations nor the fifth one. Thus we are reduced to verifying the remaining proof obligations:

CLI ∧ I ∧ cb = 0 ⇒ person < 5
CLI ∧ I ∧ cb = 0 ⇒ wp(⟨E_enter, E_on, E_off⟩, person > 0)

and these are both easily established.

If we had failed to establish that ROOM preserves CLI then it might just be that the CLI is not appropriate. Alternatively, it could be that one of the OPERATIONS on or off from Figure 6 was called outside its pre-condition, in which case there is a divergence and hence no possible CLI. For example, the sequence (enter; leave; on; off) fails to establish the CLI.

Even if we can establish the CLI we must still check for deadlocks, since the CLI may have been established miraculously. For example, the sequence (enter; leave; leave) can establish the CLI since the guard of the second leave is false and so anything could be true of the final state, including the CLI. However,

CLI ∧ I ∧ cb = 0 ⇒ wp(⟨E_enter, E_leave⟩, g_leave)

does not hold, and so we discover a potential deadlock. Thus the control executive S(0) = E_enter → E_leave → E_leave → S(0) is unsuitable. Control executives are only suitable if they are divergence- and deadlock-free.
Given a suitable executive we can then specify safety and liveness properties in terms of events. For example, we could say that the light switch will be enabled when a person enters the room.
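
The three traces discussed in this section can be replayed on a small executable model of the MACHINE in Figure 6. The following Python sketch is an illustration only. It assumes that a violated PRE is modelled by an exception called Diverged and a false SELECT guard by an exception called Blocked; these names, and the helper classify, are hypothetical and do not come from the paper. Every OPERATION here is deterministic, so running a trace from the initial state follows the only possible path. This is a sanity check on the example, not a replacement for the wp proof obligations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    person: int
    status: int


class Diverged(Exception):
    """An OPERATION was called outside its PRE condition."""


class Blocked(Exception):
    """A SELECT guard was false, so the OPERATION is not enabled."""


def on(s):
    if not (s.person > 0 and s.status == 0):
        raise Diverged("on")
    return State(s.person, 1)


def off(s):
    if not (s.person == 1 and s.status == 1):
        raise Diverged("off")
    return State(s.person, 0)


def enter(s):
    if not s.person < 5:
        raise Blocked("enter")
    return State(s.person + 1, s.status)


def leave(s):
    if not s.person > 0:
        raise Blocked("leave")
    return State(s.person - 1, s.status)


INIT = State(person=0, status=0)


def cli(s):
    # Control loop invariant: person = 0 and status = 0.
    return s.person == 0 and s.status == 0


def classify(ops, start=INIT):
    """Run one pass of a loop body and report what happened."""
    s = start
    try:
        for op in ops:
            s = op(s)
    except Diverged as e:
        return f"diverges at {e}"
    except Blocked as e:
        return f"blocked at {e}"
    return "preserves CLI" if cli(s) else "violates CLI"


print(classify([enter, on, off, leave]))
print(classify([enter, leave, on, off]))
print(classify([enter, leave, leave]))
```

The second trace reports a divergence at on and the third reports a block at the second leave, matching the two failure modes described above.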

7 Discussion

Other work in combining a process algebra and a state-based method has predominantly centered around combinations of CSP or Timed-CSP together with Z or Object-Z [6]. Some of the approaches introduce new semantics whereas our


semantic combination preserves consistency of the original semantics of both languages so that each description could be analysed separately (potentially with the powerful tool support currently available), but links are drawn between them so that the events in a process can be interpreted from a different viewpoint. The CSP description cannot contribute to the computation of the individual state transitions. Many of these combinations split the input and output from the specification of the individual state transition. For example, Roscoe et al. [10] divide every schema into two events, one with input parameters and one for output. Our style is akin to that of Fischer [7], where each event maps to one OPERATION.

In the remainder of this section we briefly discuss our approach in relation to Butler's work which combines CSP and B [4]. He takes a CSP-like process and translates it into a B specification. Therefore in his approach the CSP could be simply a process which is translated into an event-based view of a system, or it can be used to constrain the execution of a B MACHINE using the CONJOINS mechanism. The latter way of using CSP to drive a B MACHINE is similar to ours, where we think of executing the CSP in parallel with a B MACHINE. However, the main difference is that in Butler's approach the CSP-like process is combined into the B specification and thus does not retain the two views separately as a CSP process with corresponding OPERATIONS. Instead, Butler introduces a new MACHINE where new OPERATIONS are defined corresponding to the events of the CSP process. If the CSP is being used to constrain the B MACHINE then the OPERATIONS from that MACHINE are called within the body of these new OPERATIONS. The guards of these new OPERATIONS correspond to new state variables and are like place holders to record the point of execution of the process.
However, the approach does not focus on consideration of the new guards discharging the pre-conditions of the OPERATIONS of the Abstract System nor ensuring that their guards are enabled.

Another difference between Butler's approach and ours is that we disallow direct visibility of AMN state variables in a CSP process. In Butler's work the B state is directly visible in the CSP description, which is appropriate since the CSP will be translated into B. However, we keep our descriptions separate and so the only way the CSP knows about B state is via information passed as parameters. This means that all relevant state information appears in traces and can be the subject of trace specifications. No B state is directly visible within the CSP control executive. We go further and distinguish between the environments of the CSP and the external system with respect to the B Abstract System.

The notion of consistency introduced in this paper between the two views could also apply to Butler's work, although the csp2B translation does not itself ensure that the pre-conditions or the guards of the B OPERATIONS will be met. It may be the case that the notion of deadlock freedom for Butler's work need not be as complicated as ours and would simply need to show that one of the guards of the new combined specification was enabled in any legitimate state. This is likely to require a strengthening of the MACHINE INVARIANT. Such a deadlock-free condition for event-based AMN systems has already been pointed out by Abrial in [2]. This can also be done directly in our approach by using a control loop S(0) = □ E_op → S(0) denoting an external choice over all the


OPERATIONS in the loop. The only proof obligation we obtain from Condition 2 concerns the empty trace and reduces to CLI ∧ I ∧ cb = 0 ⇒ ⋁ g_op, where the disjunction ranges over all the OPERATIONS in the loop. The CLI would have to be strong enough to imply the guard of one of the OPERATIONS. This CLI would correspond to the strengthening of the MACHINE INVARIANT likely to be required by Butler's approach. The above loop S(0) is just a special case of our approach.

Finally, the focus of our work was to develop CSP control executives for sequential processors and define a style of specification which will allow some of the OPERATIONS to be refined to code. Extending the approach to include abstraction is a topic of current research.

Acknowledgements: Thanks to the anonymous referees for their detailed and constructive comments.

References

1. Abrial J. R.: The B-Book: Assigning Programs to Meanings, CUP (1996).
2. Abrial J. R.: Extending B without Changing it (for Developing Distributed Systems). In H. Habrias, editor, Proc. of the 1st B Conference, Nantes, France (1996).
3. Butler M. J.: A CSP Approach to Action Systems. D.Phil Thesis, Programming Research Group, Oxford University (1992).
4. Butler M. J.: csp2B: A Practical Approach to Combining CSP and B. In J. M. Wing, J. Woodcock, J. Davies, editors, FM'99 World Congress, Springer (1999).
5. Dijkstra E. W.: A Discipline of Programming, Prentice-Hall (1976).
6. Fischer C.: How to combine Z with a Process Algebra. In J. Bowen, A. Fett, and M. Hinchey, editors, ZUM'98: The Z Formal Specification Notation, volume 1493 of LNCS, Springer (1998).
7. Fischer C.: CSP-OZ: A combination of Object-Z and CSP. In H. Bowman and J. Derrick, editors, Formal Methods for Open Object-Based Distributed Systems (FMOODS'97), volume 2, Chapman & Hall (1997).
8. Hoare C. A. R.: Communicating Sequential Processes, Prentice Hall (1985).
9. Morgan C. C.: Of wp and CSP. In W. H. J. Feijen, A. J. M. van Gasteren, D. Gries and J. Misra, editors, Beauty is our business: a birthday salute to Edsger W. Dijkstra, Springer (1990).
10. Roscoe A. W., Woodcock J. C. P. and Wulf L.: Non-interference through determinism. In D. Gollmann, editor, ESORICS 94, volume 875 of LNCS, Springer (1994).
11. Roscoe A. W.: The Theory and Practice of Concurrency, Prentice-Hall (1997).
12. Scattergood J. B.: The Semantics of Machine-Readable CSP, Oxford University D.Phil thesis (1998).
13. Schneider S.: Concurrent and Real-time Systems: The CSP approach, Wiley (2000).
14. Stoy J. E.: Denotational Semantics, MIT Press (1977).
15. Treharne H., Schneider S.: Using a Process Algebra to control B OPERATIONS. In K. Araki, A. Galloway and K. Taguchi, editors, IFM'99, York, Springer (1999).
16. Treharne H., Schneider S.: How to drive a B Machine (Full Version). Technical Report CSD-TR-00-03, Royal Holloway, University of London, May (2000).

Deriving Software Specifications from Event Based Models

Nestor Lopez, Marianne Simonot, and Veronique Viguié Donzeau-Gouge

Cedric Research Center, Conservatoire National des Arts et Métiers, 292, Rue Saint-Martin, 75141 Paris Cedex, France.
{lopezne, simonot, donzeau}@cnam.fr

Abstract. We present a method to derive sequential program specifications from system models. We use an event based approach to model systems, as it allows us to specify parallel, concurrent and distributed systems. We refine the specification of a system until we have introduced all the events needed by its components to interact with the environment. Then, we derive an environment specification and a specification for each component. We use pre-conditions and post-conditions in these specifications, so that they can be implemented using the classical refinement relation for sequential programs. The derived components share the environment module to interact with each other.

Keywords: Atomic Operation, Program Specification, Event, System model, System Property, Environment Interaction, Concurrent Process, Distributed System, Shared module, Refinement.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 209–229, 2000. © Springer-Verlag Berlin Heidelberg 2000

1 Introduction

In recent years, formal methods have been used with success to specify and build industrial systems [10]. Several refinement techniques and methods have been proposed to derive correct programs from formal specifications [2,5,21,22]. Following these approaches, a program specification is formed by two formulas called pre-condition and post-condition [13,14]. The pre-condition designates the states in which the specified program will be used. The post-condition is a predicate that relates program states before and after its execution.

Modules can be specified using pre- and post-conditions as follows: a module is formed by a set of local variables, an initialization predicate which denotes the possible initial states, an invariant formula which expresses the properties the module must always satisfy, and a set of atomic operations which define the dynamics [2,21]. Each operation is specified using pre- and post-conditions. These modules can be used to specify industrial systems as closed universes [18]. However, limitations arise when we need to specify interactions between the system and its environment, and some problems appear during the refinement process. This approach is not suitable to specify concurrent, parallel or distributed systems. The main reason is that it is impossible to introduce operation interleaving


during the refinement process. Other approaches have been proposed to model these systems [1,3,8,11]. The main idea is to replace operations by events. An event is formed by two formulas called guard and action. The guard denotes the states where the event is allowed to happen. Similarly to post-conditions, the action relates system states before and after the event observation. This approach provides a powerful way to describe the expected behaviour of a system. It is well adapted to specify concurrent, parallel and distributed systems and to model program environments [4,7,11].

We combine here the two approaches. The event based approach is used to model the system. The pre-post approach is used to specify sub-system parts that correspond to software or hardware components used to implement the system. We propose a method to derive component specifications from an event based model. We proceed as follows. Firstly, we model the whole system using events. Then, we refine this model until we consider that we have introduced the events needed to observe the interactions between system components and their environment. Finally, we derive component specifications that will be used to implement the system. Proceeding in this way, we derive sub-system parts that, acting independently and communicating in the specified way, form a system that behaves as described by the general model.

We use an extract of an industrial case study to present these ideas: the flight warning system (FWS) used in the Airbus A340 aircraft. A complete version of this specification can be found in [17,18] (in French). The paper is organized as follows: the next section presents the proposed method, together with some theoretical aspects. In Section 3, we apply the method to our case study, the Flight Warning System. Section 4 contains the conclusion.

2 Description of the Proposed Method

This section is organized as follows: Firstly, we present some theoretical aspects of the methods used to specify sequential programs, and concurrent and distributed systems. Then, we present the derivation method. Finally, we explain how we can use the B method to apply these ideas.

2.1 Specifying Sequential Programs

A program can be specified by two logical formulas P(x) and Q(x, x′). x is a single variable or a list of distinct variables (x1, x2, ...) that represent the state just before the program execution, and x′ is a single variable or a list of distinct variables (x1′, x2′, ...) that represent the state just after the program execution. The pre-condition P(x) designates the states in which the program can be used. The post-condition Q(x, x′) relates program states before and after its execution. A program specification PS takes the form: P(x) ⇒ Q(x, x′). This formula captures our intuitive idea of a program specification. If the precondition


is satisfied the program behaves as indicated by Q(x, x′); if not, any behaviour is accepted. A program specification can be nondeterministic, as Q(x, x′) can associate several final states to each initial state. This sort of indeterminism is called internal, as it has to be eliminated during the implementation process.

Inputs and outputs act as parameters. Inputs cannot be modified by the program, in order to avoid side effects. A parameterized specification takes the form: P(x, y) ⇒ Q(x, x′, y, z′). Here x, x′, y, z′ are distinct variables; y represents the inputs and z′ the outputs.

Specifying Modules: A module MS = (x, IM, B, Oi) is formed by a shared variable x that represents the state of the module, an invariant IM that expresses static properties the module must always satisfy, a specification B that initializes the module, and a set of atomic operations Oi (i ∈ 1..n) that defines the dynamics. These operations share the variable x. The module can be used as follows. Firstly, the invariant is established by B. Then, operations are executed indefinitely in any order. An operation can be executed many times. The invariant is always preserved by the execution of operations.

The initialization B always takes the form true ⇒ Q(x′). The pre-condition true states that the initialization can always be used. The post-condition Q(x′) depends only on x′ because the before state is not used to initialize the module. Each operation Oi takes the form of either a parameterized or a non-parameterized program specification.

There is another type of indeterminism that appears with modules, as the order in which operations are called is unknown. This indeterminism is called external, as it cannot be reduced during the implementation process [15]. Modules can be implemented using refinement techniques for sequential programs [2,5,21,22]. Nevertheless, it is not possible to use these modules to introduce operation interleaving.
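
The module notion above can be made concrete with a small brute-force check. The following Python sketch is only an illustration under stated assumptions: the state is a single integer, the candidate state space, the invariant and the operations inc and dec_some are all hypothetical, and predicates are checked by enumeration instead of proof. It shows the two obligations of a module (the initialization establishes the invariant, each operation preserves it) and how a deterministic implementation removes the internal indeterminism of a specification.

```python
from itertools import product

# Hypothetical candidate values for x and x'; the invariant holds on 0..5 only.
CANDIDATES = range(-2, 9)


def invariant(x):
    return 0 <= x <= 5


def init_post(x1):
    # Initialization B has the form true => Q(x').
    return x1 == 0


# Each operation is a pair (P, Q): pre-condition P(x), post-condition Q(x, x').
OPERATIONS = {
    "inc": (lambda x: x < 5, lambda x, x1: x1 == x + 1),
    # Nondeterministic: any smaller non-negative value is acceptable.
    "dec_some": (lambda x: x > 0, lambda x, x1: 0 <= x1 < x),
}


def init_establishes_invariant():
    return all(invariant(x1) for x1 in CANDIDATES if init_post(x1))


def preserves_invariant(pre, post):
    """From any state satisfying the invariant and P, every x' allowed by Q
    must satisfy the invariant again."""
    return all(
        invariant(x1)
        for x, x1 in product(CANDIDATES, CANDIDATES)
        if invariant(x) and pre(x) and post(x, x1)
    )


def implements(impl, pre, post):
    """A deterministic function satisfies P(x) => Q(x, x') if its result is
    allowed by Q whenever P holds."""
    return all(post(x, impl(x)) for x in CANDIDATES if pre(x))


print(init_establishes_invariant())
print({name: preserves_invariant(p, q) for name, (p, q) in OPERATIONS.items()})
print(implements(lambda x: x - 1, *OPERATIONS["dec_some"]))
```

An operation with pre-condition true and post-condition x′ = x + 1 would fail preserves_invariant here, because it can be called in the state x = 5.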
A module operation is seen as atomic by an external observer. It is possible to combine concrete specifications to implement an operation, but an external observer is not allowed to see these concrete operations. A module specification MS is a transition system whose initial states are defined by its initialization, and whose transitions are defined by its operations. Each new state is reached by means of any module operation.

2.2 Specifying Concurrent and Distributed Systems

A different approach has been proposed to specify concurrent and distributed systems [1,3,8,11,16]. The general idea is to build a mathematical model which allows us to observe the system. Rather than operations, we use events. An event can be specified by two logical formulas G(x) and A(x, x′). x is a single variable or a list of distinct variables (x1, x2, ...) that represent the state just before the event observation, and x′ or (x1′, x2′, ...) represent the state just


after the event observation. The guard G(x) designates the states in which the event can be observed. The action A(x, x′) relates system states before and after its observation. An event specification takes the form: G(x) ∧ A(x, x′). This formula captures our intuitive idea of an event specification: when the event is observed, both its guard and its action are satisfied. As for programs, an event specification can be nondeterministic, as A(x, x′) can associate several final states to each initial state. Moreover, events can be parameterized. In that case the event specification takes the form: ∃y · (G(x, y) ∧ A(x, x′, y)).

Specifying Systems: A system specification SS = (x, IS, C, Ei) is formed by a shared variable x that represents the system state, an invariant IS that expresses static properties of the system, a specification C that initializes the system, and a set of events Ei (i ∈ 1..n) that defines the dynamics. The system can be used as follows. Firstly, the invariant is established by C. Then, events are observed indefinitely in any order. An event can be observed many times. The invariant is always preserved by events. As for modules, the initialization C always takes the form true ⇒ Q(x′). Each event Ei takes the form of either a parameterized or a non-parameterized event. System specifications generate external indeterminism, as sequences of events are unknown in advance [15].

System specifications can be refined using a different refinement relation [4,7,11], which allows us to introduce concrete events that can interleave with abstract events and that can be seen by an external observer. This is the technique we use to specify concurrent and distributed systems. A system specification SS is a transition system whose initial states are defined by its initialization, and whose transitions are defined by its events. Each new state is reached by means of any event of the system.

2.3 Deriving Program Specifications from System Specifications

A system specification allows us to observe a complete system, that is, a system which does not interact with the observer [1]. So a system specification models a closed universe. A program taking inputs and producing outputs clearly interacts with its environment, so it is not a closed universe. There is a way to connect these two apparently incompatible approaches: we can use system specifications as a tool to observe program behaviours. The main idea is to divide a system specification into two main parts: the environment specification and the component specifications. Initially, we specify the system until a point where all the possible interactions between the components of the system and their environment are completely described. These components can be software or hardware pieces and can be implemented separately.


This specification is directly transformed into a module, which will be shared by system components. Its role is to provide to each component the operations it needs to interact with other components and with its environment, in such a way that the properties of the system are always satisfied. We call this part the environment specification.

Specifications of components are modules which are not directly derived from the event model. They contain the properties the component must satisfy and can use some environment variables. This mechanism allows us to specify properties which involve local variables and environment variables. Doing so, we can relate abstract inputs to abstract outputs, and we can specify the way the component must interact with its environment. These specifications can be later refined to obtain their corresponding implementations. During the refinement process, we must use the environment specification to implement all the environment variables used in each component specification. At the end, we obtain implementations that use environment operations to obtain inputs and to produce results.

Environment Specification

Deriving a Module from a System: Intuitively, we allow a system specification to be transformed into a module specification if we can observe each operation of the module by means of a system event. For that, we associate a unique operation to each event and a unique event to each operation. Formally, we say that a module MS is observed by a system SS if the following four conditions are satisfied:

1. x = y. The state variables of MS and SS are the same.
2. IM ⇔ IS. The invariant of the module is equivalent to the invariant of the system.
3. B ⇒ C. The initialization of the module is also an initialization of the system. We use here an implication rather than an equivalence because the initialization of the module can be more deterministic than the initialization of the system. It is clear that every initial state of the module must be an initial state of the system.
4. pr(Oi) ∧ Oi ⇒ Ei (i ∈ 1..n). Here pr(Oi) is the pre-condition of Oi. This condition makes evident the fundamental difference between an operation and an event. In the first case, we can assume that the pre-condition is satisfied, since when we use an operation we must prove that its pre-condition is satisfied; otherwise, we cannot use it. In the second case, events simply happen, and when they happen their specification is satisfied. This condition guarantees that all transitions allowed by an operation can be observed by its corresponding event.

Transforming an Event into an Operation: Non-parameterized events allow us to observe operations without input parameters. So, to each event of the form


G(x) ∧ A(x, x′) we associate an operation taking the form: P(x) ⇒ Q(x, x′, z′). That is, an operation without parameters or having only output parameters. If the operation has no parameters at all then its specification takes the form: P(x) ⇒ Q(x, x′). So, in order to satisfy condition 4, we have to prove the following formula:

∀x, x′, z′ · (P(x) ∧ (P(x) ⇒ Q(x, x′, z′)) ⇒ G(x) ∧ A(x, x′)).

This formula is satisfied if the following condition is satisfied:

∀x, x′, z′ · ((P(x) ⇔ G(x)) ∧ (Q(x, x′, z′) ⇒ A(x, x′))).

In this case we will require this condition to be satisfied by the produced operation.

Parameterized events allow us to observe operations with input parameters. So, to each event of the form ∃y · (G(x, y) ∧ A(x, x′, y)) we associate an operation taking the following form: P(x, y) ⇒ Q(x, x′, y, z′). That is, an operation with the same input parameters. If the operation has no output parameters then its specification takes the form: P(x, y) ⇒ Q(x, x′, y). So, in order to satisfy condition 4, we have to prove the following formula:

∀x, x′, y, z′ · (P(x, y) ∧ (P(x, y) ⇒ Q(x, x′, y, z′)) ⇒ ∃y · (G(x, y) ∧ A(x, x′, y))).

This formula is satisfied if the following condition is satisfied:

∀x, x′, y, z′ · ((P(x, y) ⇔ G(x, y)) ∧ (Q(x, x′, y, z′) ⇒ A(x, x′, y))).

In this case we will require this condition to be satisfied by the produced operation. So in the general case, our event transformations will always satisfy the following transformation condition:

∀x, x′, y, z′ · ((P(x, y) ⇔ G(x, y)) ∧ (Q(x, x′, y, z′) ⇒ A(x, x′, y)))    (1)
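
Both condition 4 and the transformation condition (1) can be checked by enumeration on a finite example. The Python sketch below is an illustration only, for the non-parameterized case: the state space 0..3, the event (guard, action) and the operation (pre, post) are hypothetical choices made for this example. The event allows any increase of x, while the derived operation picks the increase by one, so the operation is more deterministic than the event.

```python
from itertools import product

STATES = range(0, 4)  # hypothetical finite state space for x and x'


def guard(x):            # G(x)
    return x < 3


def action(x, x1):       # A(x, x'): any increase that stays within 0..3
    return x < x1 <= 3


def pre(x):              # P(x), chosen equal to the guard
    return x < 3


def post(x, x1):         # Q(x, x'): the increase by exactly one
    return x1 == x + 1 and x1 <= 3


def transformation_condition(pre, post, guard, action):
    """Condition (1): (P <=> G) and (Q => A), for all x and x'."""
    return all(
        pre(x) == guard(x) and (not post(x, x1) or action(x, x1))
        for x, x1 in product(STATES, STATES)
    )


def condition_4(pre, post, guard, action):
    """P and (P => Q) implies G and A, for all x and x'."""
    return all(
        guard(x) and action(x, x1)
        for x, x1 in product(STATES, STATES)
        if pre(x) and ((not pre(x)) or post(x, x1))
    )


print(transformation_condition(pre, post, guard, action))
print(condition_4(pre, post, guard, action))


def stutter(x, x1):
    # A post-condition that leaves x unchanged is not allowed by the action.
    return x1 == x


print(condition_4(pre, stutter, guard, action))
```

The stuttering post-condition fails condition 4 because the transition from 0 to 0 is allowed by the operation but cannot be observed by the event.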

The fact that pre-conditions are equivalent to guards guarantees that system properties involving only guards are preserved by the obtained module. An example is the absence of deadlock. Actions do not imply post-conditions because the module can reduce the indeterminism of the system.

Component Specifications

We stated above that component specifications can use some variables and some operations of the environment specification. So, we can describe a component specification as a module CMk = (Yk, yk, ICk, Dk, Oki) where Yk are the environment variables it uses, yk are the local variables, ICk is the invariant, Dk the initialization, and Oki the operations. These modules can be refined, but environment variables cannot be data refined. Later, during the refinement process, these variables will be implemented using the environment module. As the environment specification describes interactions between components and their environment, it is a shared module. The notion of sharing a module between several components raises two important questions:

– is it possible to keep the invariant of each component satisfied?
– is it possible to keep the invariant of the shared module satisfied?

In the general case, the answer to these questions is no. However, there are two conditions which suffice to guarantee invariant preservation of all these specifications:


1. By definition of modules, interactions can only appear with the environment variables Yk. So, introducing shared modules, and allowing variables of this module to be used in other module specifications, introduces the risk of breaking the corresponding invariants. However, if for each component CMk we allow the use of Yk such that there is no operation of the shared module used by any other component that can modify Yk, the invariant is preserved. So, for each component, we identify the shared module operations it needs to interact with the environment. Then, we eliminate from this group the operations used by other components. We obtain a group of operations used exclusively by the treated component. Finally, we identify the environment variables that are exclusively modified by this group of operations, and we obtain the subset of environment variables that can be used in the specification of the component. This condition could appear very restrictive. However, in the system specification we are allowed to include all the variables and events we need to observe component behaviours and to express environment properties. So, we have a high degree of freedom with this specification.
2. Environment variables Yk used in a component specification must be implemented by the corresponding variables of the shared module. That is, Yk can only be effectively modified by operations of the environment module. This condition clearly guarantees that the invariant of the shared module is not broken by other modules.

2.4 Using the B Method and Generalized Substitutions

In the preceding sections, we used a relational model to present the theoretical aspects of the proposed method. In the coming sections, we will apply these ideas to build the specification of a concrete system. For that, we will use the formalism and techniques proposed by the B method. In the B method, we specify programs and events by means of generalized substitutions, not as before-after predicates. Nevertheless, it is possible to associate a before-after predicate to each generalized substitution S (B Book, Section 6.3.3, page 292).

Before-After Predicates and Generalized Substitutions: For program specifications, we will use generalized substitutions of the form: P | S. The before-after predicate prd_x associated to this substitution is: prd_x(P | S) ⇔ (P(x) ⇒ prd_x(S)). Here prd_x(S) is a predicate of the form: Q(x, x′). In particular, if S takes the form x := E, its associated before-after predicate is x′ = E. This corresponds to our program specifications. Before-after predicates associated to parameterized substitutions take the form of our parameterized program specifications.

For event specifications, we will use generalized substitutions of the form: P ⟹ S for the 'SELECT' construct, and @z · (P ⟹ S) for the 'ANY' construct. The before-after predicates associated to these substitutions are:


– prd_x(P ⟹ S) ⇔ P(x) ∧ prd_x(S). This corresponds to the specification of a non-parameterized event. Here prd_x(S) is a predicate of the form: Q(x, x′).
– prd_x(@z · (P ⟹ S)) ⇔ ∃z · (P(x, z) ∧ prd_x(S)). This corresponds to the specification of a parameterized event. Here prd_x(S) is a predicate of the form: Q(x, x′, z).

For module and system initializations, we will use generalized substitutions of the form (see footnote 2): x : (P(x)). The associated before-after predicate takes the form: true ⇒ P(x′). We use this construct as it allows us to initialize the module with a high degree of indeterminism.

Shared Modules: We will use abstract machines to write module and system specifications. However, shared modules are not allowed in B. So, we will introduce some conventions in our example to write the environment specification.
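
The correspondence between generalized substitutions and before-after predicates given in this section can be written down as a handful of combinators. The Python sketch below is an illustration only and is not part of the B method or of any B tool: the function names assign, pre, select, any_ and such_that are hypothetical, the state is a single value x, and the ANY parameter z is assumed to range over a finite collection so that the existential quantifier can be evaluated by enumeration. Each combinator returns a predicate over (x, x′).

```python
def assign(expr):
    """x := E has the before-after predicate x' = E."""
    return lambda x, x1: x1 == expr(x)


def pre(p, s):
    """P | S has the before-after predicate P(x) => prd(S)."""
    return lambda x, x1: (not p(x)) or s(x, x1)


def select(p, s):
    """P ==> S (SELECT) has the before-after predicate P(x) and prd(S)."""
    return lambda x, x1: p(x) and s(x, x1)


def any_(zs, p, s):
    """@z.(P ==> S) (ANY): exists z in zs such that P(x, z) and prd(S).
    Here s maps each z to the substitution to apply for that z."""
    return lambda x, x1: any(p(x, z) and s(z)(x, x1) for z in zs)


def such_that(p):
    """x : (P(x)) has the before-after predicate true => P(x')."""
    return lambda x, x1: p(x1)


# A guarded event and a pre-conditioned operation built from the same body.
enter_event = select(lambda x: x < 5, assign(lambda x: x + 1))
enter_operation = pre(lambda x: x < 5, assign(lambda x: x + 1))

# A parameterized event: add some z in {1, 2} as long as the result is <= 5.
add_event = any_(
    range(1, 3),
    lambda x, z: x + z <= 5,
    lambda z: assign(lambda x: x + z),
)

initialise = such_that(lambda x1: x1 in (0, 1))

print(enter_event(5, 6), enter_operation(5, 0))
print(add_event(3, 5), add_event(4, 6))
print(initialise(99, 0), initialise(99, 2))
```

The first printed pair shows the difference between the two readings: outside the condition x < 5 the event cannot be observed at all, while the operation accepts any behaviour.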

3 The Flight Warning System (FWS)

In this section, we illustrate our method by means of an extract of an industrial case study. We present a general system description with its most abstract properties. We do not present the approach used to build the entire specification, nor system details. These aspects are treated in [18]. The main function of the FWS is to monitor aeroplane sub-systems and to keep the crew informed of their states. When an abnormal situation happens, the FWS must decide when and how to emit warning signals to inform the crew about this abnormal situation. In order to describe the behaviour of the system, it is useful to introduce the following fundamental notions: warnings, signals and flight phases.

3.1 System Description

Fundamental Notions

These notions are needed to express the most abstract properties of the system.

Warnings: a warning is a particular system state that requires crew attention. For example, fire in an engine is a warning situation. When a warning condition is satisfied, we say that its associated warning is present; otherwise we say that the warning is absent. When a warning is present, the system must wait for a defined period of time before sending alert signals. When this period has elapsed, we say that the warning is confirmed. A confirmed warning can be either activated or inhibited, depending on the current flight phase.

² B Book, Section 5.1.1, Page 270.

Deriving Software Specifications from Event Based Models


Signals: signals are associated with each warning: text information, sounds, lights, digital information. These signals can be sent only when the warning is active. The same signal can be associated with several warnings. For example, the same sound could be used to alert the crew in several different situations. We differentiate signals even if their external characteristics are the same. The system has limited resources to emit warning signals. For this reason, the FWS may be unable to emit all the activated signals, so the system must decide which signals to emit according to some well-defined priority rules. We say that a signal is absent when its associated warning is not activated; otherwise the signal is present. A present signal can be in one of the following three states: activated, erased or cancelled. The FWS can only emit activated signals. Whether an activated signal is emitted or not depends on the signals that are in competition, the current flight phase and the previous state of the system. The aeroplane pilots can erase or cancel warning signals. An erased signal is no longer emitted while its associated warning stays activated. A cancelled signal is never emitted, even if its associated warning becomes active.

Flight Phase: a flight is divided into phases according to the aeroplane state: for example, lift off, first 1500 feet, touch down, last engine shut down, etc. The flight phase is the criterion that determines whether a confirmed warning must be inhibited or not. A set of inhibiting flight phases is associated with each warning.

System Behaviour

This section presents a very general description of the expected behaviour of the system. Inputs and outputs are also presented here. The FWS must periodically obtain sub-system state information coming from the environment and produce warning signals, which must be emitted according to the specified rules. This suggests a cyclic behaviour. The system must periodically refresh the emitted signals to take into account state changes, which can happen during a flight. To accomplish this task, the FWS executes the following three phases:
– obtain state information about the monitored systems;
– activate signals associated with activated warnings;
– decide which signals will be emitted and compose the messages to be sent to the output devices.

There are two main groups of input information:
– State information of the monitored systems, which arrives in two forms: digital and analogical. This information suffices to calculate the current flight phase.


– Commands sent by the crew. For example: an erasing request.

With this input information, the FWS must calculate the warning signals to be emitted, and send the corresponding information to the devices in charge of actually emitting these signals. Output devices are: display screen controllers, warning light controllers, loudspeakers and synthetic voice controllers, and communication systems.

System Properties

We give here the most abstract properties of the system. They are presented in three groups: Behaviour, Security and Ergonomic. With each property we associate an identifier, which will be used later to relate mathematical formulas to informal text as indicated in [17,18].

Behaviour Properties: Behaviour properties express the desired system behaviour when warnings happen.
Property 1. Present warnings having elapsed confirmation times are confirmed warnings.
Property 2. Confirmed warnings not inhibited by the current flight phase are activated warnings.
Property 3. Not cancelled signals associated with activated warnings are activated signals.

Security Properties: Security properties deal with reliability. They guarantee that the system does not emit erroneous warning signals, that is, signals that do not reflect the state of the system.
Property 4. Confirmed warnings are present warnings which have elapsed confirmation times.
Property 5. Activated warnings are confirmed warnings.
Property 6. Activated signals are associated with activated warnings.
Property 7. Emitted signals are activated signals.

Ergonomic Property: This property is due to the fact that the system has limited resources to emit signals.
Property 8. The crew is always able to precisely determine the warning associated with an emitted signal.

3.2 Formal Specification of the FWS System

In [17,18] we used program modules to build a formal specification for the FWS system. However, some issues arose with this approach:
1. We were forced to specify the system as having a unique cyclic process, even if the informal specification suggested at least two concurrent processes.
2. We specified neither inputs nor outputs because we wanted to avoid early implementation choices. It would have been possible to specify input parameters (set of present warnings, current flight phase, current crew commands) and output parameters (emitted signals). Doing so, we would have been forced to implement the system in a very specific way: read inputs, calculate the signals to be emitted, and finally produce results.
3. It was not possible to specify the way the system should interact with its environment. For example, it was not possible to formally express that during a calculation cycle, the FWS should examine once per cycle the state of each warning using an atomic operation provided by the environment.
4. The obtained specification had only one operation, Cycle, which can be refined by many other substitutions that do not behave as expected. For example, SKIP correctly refines Cycle. The reason is that our model was closed: there were no external constraints to forbid these behaviours. This is a good reason to include environment constraints in a specification.

These problems can be addressed using the method proposed in section 2.

First Model. FWS1: Our first model allows us to observe situations where warnings happen and where signals are emitted. These situations correspond to real events; they are not program operations. Later, when we derive program specifications, some of these events are transformed into environment operations. We need only two constants, Ww and Ss. They are not program constants, but logical constants which allow us to express system properties.

Constant   Description                 Properties
Ww         Set of existing warnings    Ww ≠ ∅
Ss         Set of existing signals     Ss ≠ ∅

We need only two variables, Wp and Se. Here again, they are not program variables, but logical variables which allow us to express system properties.

Variable   Description         Properties
Wp         Present warnings    Wp ⊆ Ww
Se         Emitted signals     Se ⊆ Ss

We now give the complete B model for this abstraction level. The system is initialized by the substitution Wp, Se := ∅, ∅ since, at the beginning, there are neither present warnings nor emitted signals. The model contains four events.

SYSTEM FWS1                                   /* First abstraction level. */
SETS Ww, Ss
VARIABLES Wp, Se
INVARIANT Wp ⊆ Ww ∧ Se ⊆ Ss
INITIALIZATION Wp, Se := ∅, ∅
EVENTS
  NewWarning ≙                                /* A new warning happens. */
    ANY Wx WHERE Wx ∈ Ww ∧ Wx ∉ Wp
    THEN Wp := Wp ∪ {Wx} END;
  EndWarning ≙                                /* A warning situation disappears. */
    ANY Wx WHERE Wx ∈ Wp
    THEN Wp := Wp − {Wx} END;
  EmittedSignal ≙                             /* A new signal is emitted. */
    ANY Sx WHERE Sx ∈ Ss ∧ Sx ∉ Se
    THEN Se := Se ∪ {Sx} END;
  EndSignal ≙                                 /* A signal is not emitted any more. */
    ANY Sx WHERE Sx ∈ Se
    THEN Se := Se − {Sx} END
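To get a feeling for how the FWS1 event system behaves, its four events can be animated over small concrete sets. The sketch below is an informal illustration only: the warning and signal names, the random scheduler and the function names are assumptions, and the run-time check of the invariant is no substitute for the B proof obligations.

```python
import random

WW = frozenset({"engine_fire", "low_fuel"})   # hypothetical warnings (Ww)
SS = frozenset({"red_light", "chime"})        # hypothetical signals (Ss)

def initial_state():
    """INITIALIZATION: Wp, Se := {}, {}."""
    return {"Wp": frozenset(), "Se": frozenset()}

def invariant(s):
    """INVARIANT: Wp <= Ww and Se <= Ss."""
    return s["Wp"] <= WW and s["Se"] <= SS

# Each event is a pair: (values of the ANY parameter satisfying the guard, action).
EVENTS = {
    "NewWarning":    (lambda s: WW - s["Wp"], lambda s, x: {**s, "Wp": s["Wp"] | {x}}),
    "EndWarning":    (lambda s: s["Wp"],      lambda s, x: {**s, "Wp": s["Wp"] - {x}}),
    "EmittedSignal": (lambda s: SS - s["Se"], lambda s, x: {**s, "Se": s["Se"] | {x}}),
    "EndSignal":     (lambda s: s["Se"],      lambda s, x: {**s, "Se": s["Se"] - {x}}),
}

def enabled(s):
    """All (event, parameter) pairs whose guard holds in state s."""
    return [(name, x) for name, (choices, _) in EVENTS.items()
            for x in sorted(choices(s))]

def run(steps, seed=0):
    """Fire randomly chosen enabled events, checking the invariant after each."""
    rng = random.Random(seed)
    s, trace = initial_state(), []
    for _ in range(steps):
        name, x = rng.choice(enabled(s))
        s = EVENTS[name][1](s, x)
        assert invariant(s), f"invariant broken by {name}({x})"
        trace.append((name, x))
    return s, trace
```

In the initial state only NewWarning and EmittedSignal are enabled, since the guards of the two other events require a non-empty Wp or Se. Note that nothing at this level relates emitted signals to present warnings; that link is introduced by the later refinements and component specifications.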

First Refinement. FWS2: Now, we refine the FWS1 model to introduce some details. We decide to implement the system using two cyclic concurrent processes. The first is in charge of confirming warnings and calculating system parameters, for example the current flight phase. The second must calculate the signals to be emitted. There is no synchronization between these processes. We do not make any assumption about their calculation time. The model will contain all constants, variables and events needed to describe the desired interactions between these processes and their environment. We need two new constants, Cc and Pp. Cc denotes the existing crew commands and Pp denotes the existing flight phases. So, the second model has the following four constants:

Constant   Description                      Properties
Ww         Set of existing warnings         Ww ≠ ∅
Ss         Set of existing signals          Ss ≠ ∅
Cc         Set of existing crew commands    Cc ≠ ∅
Pp         Set of existing flight phases    Pp ≠ ∅

We keep the variables used in the first model unchanged, and we introduce new variables. Wc denotes the confirmed warnings. This variable is an output of the first process and an input of the second. Its typing property is not Wc ⊆ Wp, because warnings can happen or disappear during a cycle. Rather, we use a 'copy' of Wp, named Wp1, to store the present warnings for a cycle. So, the typing property for Wc is Wc ⊆ Wp1. At the beginning of a cycle Wp1 is empty. As reading operations are executed, Wp1 stores the observed present warnings. So, at the end of a cycle, Wp and Wp1 can be different. A similar argument applies to the reading of confirmed warnings by the second process. Wc2 is a 'copy' of Wc. It stores confirmed warnings for a cycle. Its typing property is Wc2 ⊆ Ww.

As our intention is to completely specify the interface between the processes and their environment, we include the variables needed to take into account the current flight phase and the current crew commands. Pc denotes the current flight phase. It is calculated by the first process and used by the second. The second process must read the current flight phase and the current crew commands once per cycle. Pc2 and Cc2 store the obtained information. We need two new variables to store information about the already examined warnings: We1 for the first process and We2 for the second. We need two new variables to indicate whether the current flight phase was calculated by the first process and read by the second: Pe1 for the first process and Pe2 for the second. We also need a variable to indicate whether the current crew commands were read by the second process: Ce2. Finally, we introduce two variables to store the number of cycles executed by the processes: Num1 and Num2. All these variables will be used to express properties about the expected behaviour of the system. Most of them will not be implemented. In summary, the model contains the following variables:

Variable   Description                                   Properties
Wp         Present warnings.                             Wp ⊆ Ww
Se         Emitted signals.                              Se ⊆ Ss
Pc         Current flight phase.                         Pc ∈ Pp
Wc         Confirmed warnings.                           Wc ⊆ Wp1
Num1       Number of cycles executed. Process 1          Num1 ∈ NAT
Wp1        Present warnings. Process 1                   Wp1 ⊆ Ww
We1        Examined warnings. Process 1                  We1 ⊆ Ww
Pe1        Current flight phase. Calculated or not.      Pe1 ∈ BOOL
Wc2        Confirmed warnings. Process 2                 Wc2 ⊆ Ww
We2        Examined warnings. Process 2                  We2 ⊆ Ww
Cc2        Current crew commands. Process 2              Cc2 ⊆ Cc
Ce2        Current crew commands. Read or not            Ce2 ∈ BOOL
Pc2        Current flight phase. Process 2               Pc2 ∈ Pp
Pe2        Current flight phase. Read or not             Pe2 ∈ BOOL
Num2       Number of cycles executed. Process 2          Num2 ∈ NAT

The above properties form the invariant of the system. Properties 1 to 8 given in section 3.1 will be introduced later, when we derive specifications for Process 1 and Process 2. Events introduced in the first model (FWS1) are kept unchanged. We introduce 9 new events required to observe the concurrent execution of Process 1 and Process 2. Their names are suffixed with the corresponding process number. We now give the complete B model for this second level of abstraction.

SYSTEM FWS2                                   /* FWS2 refines FWS1. The proof is not provided here. */
SETS Ww, Ss, Cc, Pp
VARIABLES Wp, Se, Pc, Wc, Num1, Wp1, We1, Pe1, Wc2, We2, Cc2, Ce2, Pc2, Pe2, Num2
INVARIANT                                     /* The invariant is formed only by typing properties. */
  Wp ⊆ Ww ∧ Se ⊆ Ss ∧ Pc ∈ Pp ∧
  Wc ⊆ Wp1 ∧ Num1 ∈ NAT ∧ Wp1 ⊆ Ww ∧
  We1 ⊆ Ww ∧ Pe1 ∈ BOOL ∧
  Wc2 ⊆ Ww ∧ We2 ⊆ Ww ∧ Cc2 ⊆ Cc ∧
  Ce2 ∈ BOOL ∧ Pc2 ∈ Pp ∧
  Pe2 ∈ BOOL ∧ Num2 ∈ NAT
INITIALIZATION                                /* The initialization is not deterministic. */
  Wp, Se, Pc, Wc, Num1, Wp1, We1, Pe1, Wc2, We2, Cc2, Ce2, Pc2, Pe2, Num2 :(
    Wp = ∅ ∧ Se = ∅ ∧ Pc ∈ Pp ∧
    Wc = ∅ ∧ Num1 = 0 ∧ Wp1 = ∅ ∧
    We1 = ∅ ∧ Pe1 = TRUE ∧ Wc2 = ∅ ∧
    We2 = Ww ∧ Cc2 = ∅ ∧ Ce2 = TRUE ∧
    Pc2 = Pc ∧ Pe2 = TRUE ∧ Num2 = 0)
EVENTS
  NewWarning ≙                                /* A new warning happens. */
    ANY Wx WHERE Wx ∈ Ww ∧ Wx ∉ Wp
    THEN Wp := Wp ∪ {Wx} END;
  EndWarning ≙                                /* A warning situation disappears. */
    ANY Wx WHERE Wx ∈ Wp
    THEN Wp := Wp − {Wx} END;
  EmittedSignal ≙                             /* A new signal is emitted. */
    ANY Sx WHERE Sx ∈ Ss ∧ Sx ∉ Se
    THEN Se := Se ∪ {Sx} END;
  EndSignal ≙                                 /* A signal is not emitted any more. */
    ANY Sx WHERE Sx ∈ Se
    THEN Se := Se − {Sx} END;
  ExamineWarning1 ≙                           /* Process 1 examines a new warning. If the warning is present, */
    ANY Wx WHERE Wx ∈ Ww ∧ Wx ∉ We1           /* it becomes present from the point of view of process 1. */
    THEN                                      /* If not, the warning is absent. */
      IF Wx ∈ Wp
      THEN Wp1, We1 := Wp1 ∪ {Wx}, We1 ∪ {Wx}
      ELSE We1 := We1 ∪ {Wx} END
    END;
  ConfirmWarning1 ≙                           /* A warning is confirmed by process 1. */
    ANY Wx WHERE Wx ∈ Ww ∧ Wx ∉ Wc
    THEN Wc := Wc ∪ {Wx} END;
  AbsentWarning1 ≙                            /* Process 1 indicates that a warning is absent. If the */
    ANY Wx WHERE Wx ∈ Ww                      /* warning is already absent, the action has no effect. */
    THEN Wc := Wc − {Wx} END;
  CurrentFlightPhase1 ≙                       /* Process 1 calculates a new flight phase. */
    ANY Px WHERE Px ∈ Pp ∧ Pe1 = FALSE        /* Calculated once per cycle. */
    THEN Pc, Pe1 := Px, TRUE END;
  BeginCycle1 ≙                               /* Process 1 starts a new cycle. All warnings were examined */
    SELECT We1 = Ww ∧ Pe1 = TRUE THEN         /* in the previous cycle. In the new cycle, all warnings are */
      We1, Wp1, Pe1, Num1 :=                  /* examined, a new phase is calculated and a new cycle */
        ∅, ∅, FALSE, Num1 + 1                 /* is executed. */
    END;
  ExamineWarning2 ≙                           /* Process 2 examines a new warning. If the warning is confirmed, */
    ANY Wx WHERE Wx ∈ Ww ∧ Wx ∉ We2           /* it becomes confirmed from the point of view of process 2. */
    THEN                                      /* If not, the warning is not confirmed. */
      IF Wx ∈ Wc
      THEN Wc2, We2 := Wc2 ∪ {Wx}, We2 ∪ {Wx}
      ELSE We2 := We2 ∪ {Wx} END
    END;
  CurrentFlightPhase2 ≙                       /* Process 2 reads the current flight phase. Once per cycle. */
    SELECT Pe2 = FALSE
    THEN Pc2, Pe2 := Pc, TRUE END;
  CrewCommands2 ≙                             /* Process 2 reads the crew commands. Once per cycle. */
    SELECT Ce2 = FALSE THEN
      ANY Cx WHERE Cx ⊆ Cc
      THEN Cc2, Ce2 := Cx, TRUE END
    END;
  BeginCycle2 ≙                               /* Process 2 starts a new cycle. All warnings were examined */
    SELECT We2 = Ww ∧ Pe2 = TRUE ∧ Ce2 = TRUE /* in the previous cycle. In the new cycle, all warnings are */
    THEN                                      /* examined, a new flight phase and the crew commands are */
      We2, Wc2, Pe2, Ce2, Num2 :=             /* read, and a new cycle is executed. */
        ∅, ∅, FALSE, FALSE, Num2 + 1
    END

3.3 FWS. Environment Specification

We now use our derivation method to produce the environment specification. We transform events into operations, and we provide input and output parameters when needed, as indicated in section 2.3. For each transformed event we have to prove the conditions given in section 2.3. We do not provide the clauses SETS, CONSTANTS, VARIABLES, INVARIANT and INITIALIZATION as they are borrowed directly from the FWS2 system.

MACHINE FWS_ENVIRONMENT                       /* This module is derived from FWS2. */
OPERATIONS
  NewWarning(Wx) ≙                            /* External action */
    PRE Wx ∈ Ww ∧ Wx ∉ Wp
    THEN Wp := Wp ∪ {Wx} END;
  EndWarning(Wx) ≙                            /* External action */
    PRE Wx ∈ Wp
    THEN Wp := Wp − {Wx} END;
  EmittedSignal(Sx) ≙                         /* Process 2 */
    PRE Sx ∈ Ss ∧ Sx ∉ Se
    THEN Se := Se ∪ {Sx} END;
  EndSignal(Sx) ≙                             /* Process 2 */
    PRE Sx ∈ Ss
    THEN Se := Se − {Sx} END;
  bres ←− ExamineWarning1(Wx) ≙               /* Process 1 */
    PRE Wx ∈ Ww ∧ Wx ∉ We1 THEN
      IF Wx ∈ Wp
      THEN Wp1, We1, bres := Wp1 ∪ {Wx}, We1 ∪ {Wx}, TRUE
      ELSE We1, bres := We1 ∪ {Wx}, FALSE END
    END;
  ConfirmWarning1(Wx) ≙                       /* Process 1 */
    PRE Wx ∈ Ww ∧ Wx ∉ Wc
    THEN Wc := Wc ∪ {Wx} END;
  AbsentWarning1(Wx) ≙                        /* Process 1 */
    PRE Wx ∈ Ww
    THEN Wc := Wc − {Wx} END;
  CurrentFlightPhase1(Px) ≙                   /* Process 1 */
    PRE Px ∈ Pp ∧ Pe1 = FALSE
    THEN Pc, Pe1 := Px, TRUE END;
  BeginCycle1 ≙                               /* Process 1 */
    PRE We1 = Ww ∧ Pe1 = TRUE
    THEN We1, Wp1, Pe1, Num1 := ∅, ∅, FALSE, Num1 + 1 END;
  bres ←− ExamineWarning2(Wx) ≙               /* Process 2 */
    PRE Wx ∈ Ww ∧ Wx ∉ We2 THEN
      IF Wx ∈ Wc
      THEN Wc2, We2, bres := Wc2 ∪ {Wx}, We2 ∪ {Wx}, TRUE
      ELSE We2, bres := We2 ∪ {Wx}, FALSE END
    END;
  pres ←− CurrentFlightPhase2 ≙               /* Process 2 */
    PRE Pe2 = FALSE
    THEN Pc2, Pe2, pres := Pc, TRUE, Pc END;
  cres ←− CrewCommands2 ≙                     /* Process 2 */
    PRE Ce2 = FALSE THEN
      ANY Cx WHERE Cx ⊆ Cc
      THEN Cc2, Ce2, cres := Cx, TRUE, Cx END
    END;
  BeginCycle2 ≙                               /* Process 2 */
    PRE We2 = Ww ∧ Pe2 = TRUE ∧ Ce2 = TRUE
    THEN We2, Wc2, Pe2, Ce2, Num2 := ∅, ∅, FALSE, FALSE, Num2 + 1 END

We have made a simplification here: in the operation cres ←− CrewCommands2, cres is a set, so it is not a directly implementable variable. This problem can be solved by further refining the FWS2 system. We do not do so here.

3.4 Component Specifications

Component specifications are written separately. We obtain from the environment module the variables and events each component can utilize. We obtain from the informal specification the properties we must express in each component specification. The idea is to express these properties using both local variables and environment variables. Local variables can be freely refined later. Environment variables must be implemented using the provided environment module.
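As an informal illustration of how a component is constrained by the environment module, the Process 2 operations of section 3.3 can be mimicked by a Python class whose methods reject calls made outside their preconditions. This is a sketch under assumptions: the class and method names, the PreconditionError exception and the crew_input field (a stand-in for the nondeterministic choice ANY Cx ⊆ Cc) are hypothetical, and in B a precondition is a proof obligation on the caller rather than a run-time check.

```python
class PreconditionError(Exception):
    """Raised when an operation is called outside its precondition."""

class Environment:
    """Process 2 view of the environment module (illustrative sketch only)."""

    def __init__(self, warnings, commands, phase):
        self.Ww, self.Cc = frozenset(warnings), frozenset(commands)
        self.Wc, self.Pc = set(), phase   # written by process 1 in the model
        # Initial values follow the FWS2 INITIALIZATION clause.
        self.Wc2, self.We2 = set(), set(self.Ww)
        self.Cc2, self.Ce2 = set(), True
        self.Pc2, self.Pe2 = phase, True
        self.Num2 = 0
        self.crew_input = set()           # hypothetical: pending crew commands

    def _require(self, condition, operation):
        if not condition:
            raise PreconditionError(operation)

    def examine_warning2(self, wx):
        self._require(wx in self.Ww and wx not in self.We2, "ExamineWarning2")
        self.We2.add(wx)
        if wx in self.Wc:
            self.Wc2.add(wx)
            return True
        return False

    def current_flight_phase2(self):
        self._require(not self.Pe2, "CurrentFlightPhase2")
        self.Pc2, self.Pe2 = self.Pc, True
        return self.Pc2

    def crew_commands2(self):
        self._require(not self.Ce2, "CrewCommands2")
        self.Cc2, self.Ce2 = set(self.crew_input) & self.Cc, True
        return set(self.Cc2)

    def begin_cycle2(self):
        self._require(self.We2 == self.Ww and self.Pe2 and self.Ce2, "BeginCycle2")
        self.We2, self.Wc2 = set(), set()
        self.Pe2 = self.Ce2 = False
        self.Num2 += 1

def cycle2(env):
    """One cycle of process 2 that respects every precondition."""
    env.begin_cycle2()
    confirmed = {w for w in sorted(env.Ww) if env.examine_warning2(w)}
    phase = env.current_flight_phase2()
    commands = env.crew_commands2()
    return confirmed, phase, commands
```

After cycle2 returns, every warning has been examined and both the flight phase and the crew commands have been read, so the precondition of BeginCycle2 holds again and the next cycle can start. Examining the same warning twice in one cycle is rejected.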


We now build the specification for process 2 of the FWS system. The specification of process 1 can be written in a similar way. The events which allow us to observe the behaviour of process 2 are: ExamineWarning2, CurrentFlightPhase2, CrewCommands2, BeginCycle2, EmittedSignal and EndSignal. The variables modified by these events are: Se, Wc2, We2, Pc2, Pe2, Cc2, Ce2 and Num2. It is easy to verify that other events do not modify these variables. So, all these variables respect the conditions given in section 2.3. Consequently, we can use all of them in the specification of process 2. Process 2 takes into account properties 2, 3 and 5 to 8 given in section 3.1. Properties 1 and 4 are taken into account by process 1. To express these properties, we need 2 new constants and 5 local variables: s_w is a total function that associates a warning with each signal, c_w is a relation that associates commands with warnings, wa denotes the activated warnings, wni denotes the warnings not inhibited by the current flight phase, wnis denotes the not isolated warnings, sa denotes the activated signals, and snc denotes the not cancelled signals ³. We use Wc2, Pc2 and Cc2 as program inputs, and Se as the program output. We use We2, Pe2, Ce2 and Num2 to formally express the way process 2 must interact with the environment. We use the constants Ww, Ss, Cc and Pp as basic sets. The resulting module specification is:

³ A detailed explanation of the meaning of these constants and variables can be found in [17,18].

MACHINE FWS_PROCESS2                          /* This module uses some variables of FWS2. */
SETS Ww, Ss, Cc, Pp
CONSTANTS s_w, c_w
PROPERTIES s_w ∈ Ss → Ww ∧ c_w ∈ Cc ↔ Ww
VARIABLES Se, Wc2, We2, Cc2, Ce2, Pc2, Pe2, Num2, wa, wni, wnis, sa, snc
INVARIANT
  Se ⊆ Ss ∧ Wc2 ⊆ Ww ∧ We2 ⊆ Ww ∧
  Cc2 ⊆ Cc ∧ Ce2 ∈ BOOL ∧ Pc2 ∈ Pp ∧
  Pe2 ∈ BOOL ∧ Num2 ∈ NAT ∧ wa ⊆ Wc2 ∧
  wni ⊆ Ww ∧ wnis ⊆ wa ∧ sa ⊆ Ss ∧
  Se ⊆ sa ∧ snc ⊆ Ss ∧
  Wc2 ∩ wni ⊆ wa ∧                            /* (Property 2.) */
  s_w⁻¹[wa] ∩ snc ⊆ sa ∧                      /* (Property 3.) */
  wa ⊆ Wc2 ∧                                  /* (Property 5.) */
  s_w[sa] ⊆ wa ∧                              /* (Property 6.) */
  Se ⊆ sa ∧                                   /* (Property 7.) */
  wnis ⊆ ran(c_w) ∧                           /* (Property 8.) */
  We2 = Ww ∧ Pe2 = TRUE ∧ Ce2 = TRUE          /* Environment Interaction. */
INITIALIZATION
  Se, Wc2, We2, Cc2, Ce2, Pc2, Pe2, Num2, wa, wni, wnis, sa, snc :(INVARIANT)
OPERATIONS
  Cycle2 ≙
    Se, Wc2, We2, Cc2, Ce2, Pc2, Pe2, Num2, wa, wni, wnis, sa, snc :(INVARIANT ∧ Num2 = Num2₀ + 1)
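The invariant lines labelled Property 2 to Property 8 are plain set inclusions, so they can be evaluated on concrete data. The following sketch assumes that relations are represented as Python sets of pairs; the helper names and the sample warnings and signals used in the test data are illustrative assumptions and do not come from the case study.

```python
def image(rel, xs):
    """Relational image rel[xs] of a set xs under a relation (set of pairs)."""
    return {b for a, b in rel if a in xs}

def inverse(rel):
    """Inverse relation."""
    return {(b, a) for a, b in rel}

def ran(rel):
    """Range of a relation."""
    return {b for _, b in rel}

def check_properties(s_w, c_w, Wc2, wni, wnis, wa, snc, sa, Se):
    """Evaluate each labelled conjunct of the FWS_PROCESS2 invariant."""
    return {
        "Property 2": (Wc2 & wni) <= wa,
        "Property 3": (image(inverse(s_w), wa) & snc) <= sa,
        "Property 5": wa <= Wc2,
        "Property 6": image(s_w, sa) <= wa,
        "Property 7": Se <= sa,
        "Property 8": wnis <= ran(c_w),
    }

# Hypothetical sample data: two signals, each mapped to one warning.
s_w = {("red_light", "engine_fire"), ("chime", "low_fuel")}
c_w = {("erase", "engine_fire")}
state = dict(
    Wc2={"engine_fire", "low_fuel"},   # confirmed warnings seen by process 2
    wni={"engine_fire"},               # warnings not inhibited by the flight phase
    wnis={"engine_fire"},
    wa={"engine_fire"},                # activated warnings
    snc={"red_light", "chime"},        # not cancelled signals
    sa={"red_light"},                  # activated signals
    Se={"red_light"},                  # emitted signals
)
results = check_properties(s_w, c_w, **state)
```

With this data every property holds. Emitting the chime while only the red light is activated would violate Property 7 and leave the other conjuncts unaffected.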

This specification can be compared with the specification given in [17,18]. It addresses the problems we mentioned in section 3.2, as:
– we can now use several concurrent processes to implement the system (problem 1): Process 1 and Process 2.
– we specify component inputs and outputs by means of environment variables, not local variables (problem 2): Wc2, Cc2, Pc2 and Se.
– we specify the way each component must interact with the environment (problem 3). For example, the invariant We2 = Ww ∧ Pe2 = TRUE ∧ Ce2 = TRUE expresses that at the end of Cycle2, all warnings were examined, the current flight phase was read and the crew commands were read.
– the obtained operation Cycle2 cannot be refined by operations which do not respect the environment constraint (problem 4). In particular, SKIP does not refine the operation Cycle2, as Num2 must be modified (Num2 = Num2₀ + 1) and Num2 is only modified by the operation BeginCycle2. So, during the implementation process, BeginCycle2 must be used, and consequently several other environment operations.

3.5 Implementing the Derived Modules

At this point, an important question arises: how can we implement the derived modules?

Implementation of the Environment Module: The environment module contains atomic operations that can be used by three different agents: external, process 1 and process 2. NewWarning, EndWarning and Wp are implemented by the devices which detect warning situations, for example sensors and other electronic devices. EmittedSignal, EndSignal and Se are implemented by the output devices.


ExamineWarning1 is implemented by the mechanism which allows process 1 to examine a warning state. ConfirmWarning1, AbsentWarning1 and Wc are implemented by the mechanisms provided to process 1 to confirm warnings. CurrentFlightPhase1 and Pc are implemented by the mechanisms provided to process 1 to produce a new flight phase. BeginCycle1, Wc1, We1, Wp1, Pe1 and Num1 are not implemented. Their purpose is exclusively to allow us to formally express the way process 1 must interact with its environment. ExamineWarning2, CurrentFlightPhase2 and CrewCommands2 are implemented by the mechanisms provided to process 2 to obtain inputs. BeginCycle2, Wc2, We2, Pc2, Pe2, Cc2, Ce2 and Num2 are not implemented.

Implementation of Process 2: Cycle2 is a program specification that can be implemented as usual. During the refinement process, we must prove that environment operations are called within their preconditions. In some cases, we will need to store some information for this purpose. For example, the operation ExamineWarning2 takes as input a not yet examined warning, but there is no environment operation which provides such a warning. So, the program needs to know locally which warnings were already examined. A solution can be to provide a module which hides the environment module. This module could offer two operations: wn ←− NextWarning, which provides a not yet examined warning, and res ←− GetState(wn), which provides a warning state. This module must have a local variable we which contains the already examined warnings, and an invariant we = We2 which relates the local variable to the environment variable. The operation wn ←− NextWarning can be specified as follows:

  PRE we ≠ Ww THEN
    ANY wx WHERE wx ∈ Ww ∧ wx ∉ we
    THEN wn := wx END
  END

The operation res ←− GetState(wn) can be implemented as follows:

  PRE wn ∈ Ww ∧ wn ∉ we THEN
    res ←− ExamineWarning2(wn);
    we := we ∪ {wn}
  END
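The hiding module described above can be sketched in Python as a thin wrapper that keeps its own record of the examined warnings. This is a minimal sketch under assumptions: the Env class is a reduced stand-in for the environment module (only ExamineWarning2, with We2 taken to be empty as it is right after BeginCycle2), and the names WarningReader, next_warning, get_state and all_examined are hypothetical.

```python
class Env:
    """Reduced stand-in for the environment module (illustrative only)."""

    def __init__(self, warnings, confirmed):
        self.Ww = frozenset(warnings)
        self.Wc = set(confirmed)
        self.We2 = set()          # state right after BeginCycle2

    def examine_warning2(self, wx):
        # Precondition of ExamineWarning2: wx in Ww and wx not yet examined.
        assert wx in self.Ww and wx not in self.We2, "ExamineWarning2 precondition"
        self.We2.add(wx)
        return wx in self.Wc

class WarningReader:
    """Module hiding the environment; maintains the invariant we == We2."""

    def __init__(self, env):
        self.env = env
        self.we = set(env.We2)    # local record of the examined warnings

    def all_examined(self):
        return self.we == self.env.Ww

    def next_warning(self):
        # PRE we /= Ww: at least one warning is still unexamined.
        assert not self.all_examined(), "NextWarning precondition"
        # ANY wx: any unexamined warning would do; min() just picks one.
        return min(self.env.Ww - self.we)

    def get_state(self, wn):
        # PRE wn in Ww and wn not in we.
        assert wn in self.env.Ww and wn not in self.we, "GetState precondition"
        res = self.env.examine_warning2(wn)
        self.we.add(wn)
        return res

def read_all(reader):
    """Examine every warning exactly once through the wrapper."""
    states = {}
    while not reader.all_examined():
        wn = reader.next_warning()
        states[wn] = reader.get_state(wn)
    return states
```

Because the wrapper is the only caller of examine_warning2, the local set we stays equal to We2, and the precondition of ExamineWarning2 can be discharged from the precondition of get_state alone.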

4 Conclusion

We proposed a method to derive component specifications from system models. The system specification is written using events so that we can model parallel, concurrent and distributed systems. Component specifications are written in a pre-post style so that they can be implemented using the classical refinement relation for sequential programs. We also derive an environment specification written in a pre-post style that allows system components to interact with their environment. This specification is directly obtained from the system specification. It describes the behaviour of the environment and is shared by all components to interact with each other. We allow environment variables to be used in component specifications. This is a powerful mechanism that allows us to include environment properties in component specifications. However, some restrictions are needed in order to guarantee that invariants are preserved. This mechanism allows us to treat program inputs and outputs as abstract objects. The derivation method that we propose preserves the properties of the system model, as pre-conditions are equivalent to guards. So, properties like absence of deadlock, critical sections and process synchronization are preserved during the derivation step.

In the action systems formalism, components interact by means of shared actions [9,11]. Other approaches achieve this cooperation by means of shared states [1,4]. In our approach these interactions are achieved by means of an environment module which contains all the hypotheses we make about the environment. Most of the approaches used to specify concurrent and distributed systems utilize events [1,4,9,11], as we do. However, we proceed in a different way, as we transform this event-based specification into several pre-post specifications. The idea of transforming an event into an operation is suggested in a draft written by Jean Raymond Abrial. We are using the proposed method in another industrial case. This work is in progress.

Acknowledgments: The industrial case presented here was provided by the Aerospatiale Company in France. A complete informal specification was published in [17]. We would like to express our gratitude to the persons who made it possible. Discussions and meetings with Jean Raymond Abrial and with other people from RATP Company and from Matra Transport International provided very fruitful ideas. Many thanks to all of them.

References

[1] M. Abadi and L. Lamport, Conjoining Specifications, ACM Transactions on Programming Languages and Systems 17, 3, 507-534, 1995.
[2] Abrial J.R., The B Book. Assigning Programs to Meanings, Cambridge University Press, 1996.
[3] Abrial J.R., Extending B without changing it, for Developing Distributed Systems, in First B conference, H. Habrias editor, 169-190, Nantes, 1996.
[4] Abrial J.R., Mussat L., Specification and Design of a Transmission Protocol by successive Refinements using B, in Mathematical Methods in Program Development, edited by M. Broy and B. Schieder, Springer-Verlag, 1997.
[5] Back R.J., A calculus of refinements for program derivations, Acta Informatica 25, 593-624, 1988.
[6] Back R.J., J. von Wright, Refinement Calculus, Part 1: Sequential Nondeterministic Programs, in J.W. deBakker, W.P. deRoever, and G. Rozenberg editors, Lecture Notes in Computer Science 430, 42-66, Springer, 1990.
[7] Back R.J., Refinement Calculus, Part 2: Parallel and Reactive Programs, in J.W. deBakker, W.P. deRoever, and G. Rozenberg editors, Lecture Notes in Computer Science 430, 67-93, Springer, 1990.
[8] Back R.J.R., Kurki-Suonio R., Decentralization of Process Nets with Centralized Control, in 2nd ACM SIGACT-SIGOPS Symp. on Principles of Distributed Computing, 131-142, ACM, 1983.
[9] Back R.J.R., Sere K., Stepwise Refinement of Action Systems, in Mathematics of Program Construction, Lecture Notes in Computer Science 375, 115-138, Springer, 1989.
[10] Behm P., Benoit P., Faivre A., Meynadier J-M., Meteor: A Successful Application of B in a Large Project, in Jeannette M. Wing, Jim Woodcock, Jim Davies editors, Lecture Notes in Computer Science 1708, 369-387, Springer, 1999.
[11] Butler M.J., Stepwise refinement of communicating systems, Science of Computer Programming 27, 139-173, 1996.
[12] Butler M., Walden M., Distributed System Development in B, in First B conference, H. Habrias editor, 155-168, Nantes, 1996.
[13] Dijkstra E.W., A Discipline of Programming, Prentice-Hall International, 1976.
[14] Hoare C.A.R., An Axiomatic Basis for Computer Programming, Comm. ACM 12, 10, 576-580, 1969.
[15] Hoare C.A.R., Communicating Sequential Processes, Prentice-Hall, 1985.
[16] Lam S.S., Shankar U., A Relational Notation for State Transition Systems, in IEEE Transactions on Software Engineering, Vol 16, No 7, 755-775, 1990.
[17] Lopez N., Construction de la specification formelle d'un systeme complexe, in First B conference, H. Habrias editor, 63-119, Nantes, 1996.
[18] Lopez N., Construction de la specification formelle d'un systeme complexe, Memoire d'ingenieur CNAM, 1996.
[19] Lopez N., An 'event based B' industrial experience, in the proceedings of the B User Group Meeting, edited by Ken Robinson, Applying B in an industrial context, World Congress on Formal Methods 1999.
[20] Morgan C., The Specification Statement, in ACM Transactions on Programming Languages and Systems, Vol 10, No 3, 403-419, 1988.
[21] Morgan C., Programming from Specifications, Prentice-Hall International, 1990.
[22] Morris J.M., A Theoretical Basis for Stepwise Refinement and the Programming Calculus, Science of Computer Programming 9, 287-306, North-Holland, 1987.

Reformulate Dynamic Properties during B Refinement and Forget Variants and Loop Invariants

F. Bellegarde, C. Darlot, J. Julliand, and O. Kouchnarenko

Laboratoire d'Informatique de l'Université de Franche-Comté
16, route de Gray, 25030 Besançon Cedex
Ph: (33) 3 81 66 64 52, Fax: (33) 3 81 66 64 50
{bellegar,darlot,julliand,kouchna}@lifc.univ-fcomte.fr, http://lifc.univ-fcomte.fr

Abstract. We propose a way to introduce dynamic properties into a B refinement design which differs from the approach used by J.R. Abrial and L. Mussat. First, the properties are expressed in the Propositional Linear Temporal Logic PLTL. Second, the user directs the evolution of properties through the refinement, so that a property P expressed by a formula F1 in the abstract system is expressed again by a formula F2 in the refined system. Third, the verification combines proof and model-checking. In particular, F1 is model-checked, and then, to ensure F2, it suffices to prove some propositions depending on the shapes of F1 and F2. In this paper, we show how to obtain these "sufficient propositions" from a refinement relation and the semantics of PLTL formulae. The main advantage is that the user does not need a variant or a loop invariant to achieve an automatic proof for finite state event systems. Our approach is illustrated on a protocol between a chip card and a card reader, called protocol T=1.

Keywords: Event Systems, B Refinement Design, Dynamic properties, Specification, Verification

1 Introduction

It is well known that the introduction of dynamic properties is necessary for the design and the verification of event systems [14,1,13]. In our approach, the specification of a B finite state event system is extended by PLTL formulae. The paper is about the verification of dynamic properties which are specified in an abstract specification and, again, in the refined specification. In other words, we are mainly interested in the verification of reformulated dynamic properties. Our methodological approach, as well as our verification techniques for addressing the introduction of dynamic constraints in B (see Figure 1), is quite different from the propositions of J.R. Abrial and L. Mussat in [1]. There are three essential differences:

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 230–249, 2000. © Springer-Verlag Berlin Heidelberg 2000


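[Figure 1: diagram. The abstract system properties are verified on the abstract event system by model-checking. The abstract event system is refined into the refined event system, and the refinement is verified by model-checking. The abstract system properties are reformulated into the refined system properties, which are verified on the refined event system by proof.]

Fig. 1. Specification and verification approach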

1. The dynamic properties are expressed in the Propositional Linear Temporal Logic (PLTL).
2. The dynamic properties are verified by a combination of proof and model-checking.
3. As for the events, the dynamic properties are introduced at the abstract level and need to be formulated again at the refined level.

The motivation behind these propositions is threefold. First and above all, we want to free the user from looking for a variant and a loop invariant when expressing dynamic properties. Second, we want to be able to use model-checking and proof for the verification in a way which utilizes them at their best. Third, the user can express modalities freely using the expressive power of the PLTL logic.

B + PLTL versus B + modalities (see Remark 1). In [1], J.R. Abrial and L. Mussat use three patterns of dynamic properties: the dynamic invariant and the two modalities leads to and until. The modalities have the same expressive power as a fragment of PLTL using the two kinds of properties □(p ⇒ ◇q) and □(p ⇒ (p U q)). Moreover, besides the instances for p and q, the user has to specify a variant and, often, a loop invariant and a list of the events which may be taken during the loop.

Model-checking and proof (see Remark 2). We choose the PLTL logic because its verification can be done by PLTL model-checking [8], which is entirely automatic for the totality of the logic. The main drawback is that it cannot handle very large systems and, worse yet, infinite state systems. A solution for large finite systems may consist of using proof and model-checking jointly. So, the model-checking explosion is avoided, as well as the requirement to provide clues such as variants and loop invariants to a theorem prover. To better explain how we propose to join both techniques to verify the reformulated properties, consider Figure 1.


First, the user specifies the abstract event system with its invariant and its dynamic properties expressed in PLTL. The invariant is proof-checked as in B. The dynamic properties are model-checked on the event system operational model, i.e. on the set of paths of a finite state transition system with a small number of states.

Second, the user specifies the refinements, introducing new variables. The relation between the set of states S2 of the refined system and the set of states S1 of the abstract system is expressed by a gluing invariant. New events are introduced, old events are formulated once more, new PLTL formulae are introduced, and old PLTL formulae are formulated anew. We do not want to use PLTL model-checking again for the verification of the reformulated properties. So, we propose to use proof techniques, but without requiring the specification of a loop invariant and of a variant.

In the paper, we present two kinds of propositions which are associated systematically according to the shapes of the abstract PLTL formula and the refined formula. The first kind, a weak form, includes propositional sub-formulae and the invariants of the event systems. When these propositions are valid, we know that the refined formula holds. A failure does not mean that the refined formula does not hold. So, the second kind, a strong form, also includes either the old or the new events. Again, success means that the refined formula holds, but from a failure we cannot conclude anything. We call these propositions sufficient conditions. In the paper, we show that if these propositions (weak or strong) are valid, then the reformulated properties hold without the help of either a user-given variant or a loop invariant.

Methodological motivation behind reformulation (see Remark 3). Now, we can better explain the methodological idea behind the reformulation of the properties. The formula of the refined property specifies explicitly how we allow the new events to be interwoven among the old events. The effect of a reformulated formula compares with the effect of the gluing invariant in the following manner. The gluing invariant specifies a relation between the refined and the abstract states, whereas the reformulated formula together with the gluing invariant specifies a relation between the refined and the abstract paths of the operational model.

Paper organization. The paper is organized as follows. Section 2 illustrates our approach on a protocol between a chip card and a card reader. After a short presentation of our refinement relation in Section 3, we explain how to verify the reformulated dynamic properties through refinement in Section 4. Section 5 introduces the verification tools we are implementing. Finally, we situate our concerns and give some ideas about future work in Section 6.

2 Some Dynamic Property Refinements

In this section, we show how dynamic properties can evolve when they are expressed at different refinement levels in a specification. For this, we will briefly study the specification of a half-duplex communication protocol between a chip card (CC) and a card reader (CR), called protocol T=1 in [5]. We will first see the operational specification of the protocol, and then the way to express and to verify some dynamic properties of this system.

2.1 Operational Description

The chip card and the card reader alternately exchange messages, and the protocol must end with a card ejection. In this paper, the protocol will be specified at two levels of abstraction, further called the abstract and the refined specifications.

EVENT SYSTEM CCI1
VARIABLES: CC-ACK, CR-ACK, CC-ST
INVARIANT: CC-ACK, CR-ACK ∈ {true,false} ∧ CC-ST ∈ {in,out} ∧ (CC-ACK ≠ CR-ACK)
INITIALIZATION: CC-ACK, CR-ACK, CC-ST := false, true, in
CRsends: SELECT CR-ACK = true ∧ CC-ST = in THEN CC-ACK, CR-ACK := true, false END
CCsends: SELECT CC-ACK = true ∧ CC-ST = in THEN CR-ACK, CC-ACK := true, false END
eject: SELECT CC-ST = in THEN CC-ST := out END
END CCI1

Fig. 2. Abstract level

Abstract specification. At this level of specification, only the alternation of messages is considered. We have three events:
– CRsends: the card reader sends a message;
– CCsends: the chip card sends a message;
– eject: the card is ejected.

Figure 2 shows the abstract level B specification. The variable CC-ST, which describes the card status, indicates whether the card is inserted or not, and two variables (CC-ACK and CR-ACK) state the acknowledgements: if the variable CR-ACK is true, then it is the card reader's turn to transmit (and vice versa for CC-ACK and the chip card). The invariant gives a type to the variables and ensures that the protocol is half-duplex: the two devices should not emit at the same time (it is a static safety property). The transition system of Figure 3 shows the operational semantics corresponding to the B specification. The direction of the small arrow in each state shows the values of the CC-ACK and CR-ACK variables: if the small arrow is down, CC-ACK (resp. CR-ACK) is true (resp. false), and vice versa. To ensure that the environment eventually ejects the card, we specify that eject is a strongly fair event.

[Figure 3: transition system of the abstract level, with transitions labelled CRsends, CCsends and eject (twice).]

Fig. 3. Abstract level transition system
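The transition system of Figure 3 can be rebuilt mechanically from the event system of Figure 2. The following is a minimal Python sketch, not the authors' tool: it assumes a state is the tuple (CC_ACK, CR_ACK, CC_ST), writes each event as a guard/substitution pair, and enumerates the reachable states breadth-first. The Python names are illustrative only.

```python
from collections import deque

# Assumed encoding: a state is the tuple (CC_ACK, CR_ACK, CC_ST) of Figure 2.
INIT = (False, True, "in")

# Each event is a (guard, substitution) pair, transcribed from Figure 2.
EVENTS = {
    "CRsends": (lambda s: s[1] and s[2] == "in", lambda s: (True, False, s[2])),
    "CCsends": (lambda s: s[0] and s[2] == "in", lambda s: (False, True, s[2])),
    "eject":   (lambda s: s[2] == "in",          lambda s: (s[0], s[1], "out")),
}

def invariant(s):
    """Typing of CC_ST plus the half-duplex constraint CC-ACK != CR-ACK."""
    cc_ack, cr_ack, cc_st = s
    return cc_st in ("in", "out") and cc_ack != cr_ack

def reachable(init=INIT, events=EVENTS):
    """Breadth-first enumeration; returns (states, labelled transitions)."""
    states, transitions, todo = {init}, set(), deque([init])
    while todo:
        s = todo.popleft()
        for name, (guard, subst) in events.items():
            if guard(s):
                t = subst(s)
                transitions.add((s, name, t))
                if t not in states:
                    states.add(t)
                    todo.append(t)
    return states, transitions

states, transitions = reachable()
assert all(invariant(s) for s in states)   # the invariant holds on every reachable state
print(len(states), len(transitions))       # 4 4
```

The four transitions found are one CRsends, one CCsends and two eject transitions. Note that this enumeration only checks the invariant on reachable states, whereas the B proof obligation covers the whole state space.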

Refined specification. On the refined level, each message is split into a sequence of blocks. Each message ends with a last block (LB) and each block (B) is acknowledged by an acknowledgement block (AckB). After a last block is sent by one of the devices, the other device answers with another sequence of blocks ending with a last block, unless the card is ejected. These exchanges of messages alternate until the card is ejected.

In the refined specification, we must add new variables to handle the blocks (see Figure 4). The values of the variables CC-F and CR-F tell us which kind of block is sent. The variables CC-FACK and CR-FACK describe the block acknowledgements, and the variable CC-ST' plays the same role as CC-ST in the first specification. The variables of the abstract and the refined specifications are glued together by the gluing invariant. This invariant says that when a device has sent the last block of a message, the variables CC-FACK and CR-FACK have the same value as their corresponding abstract level variables CC-ACK and CR-ACK. Moreover, when a device sends a block B or LB, it is sending a message at the abstract level.

EVENT SYSTEM CCI2
VARIABLES: CC-FACK, CR-FACK, CC-ST', CC-F, CR-F
INVARIANT: CC-FACK, CR-FACK ∈ {true,false} ∧ CC-ST' = CC-ST ∧ CC-F, CR-F ∈ {B,LB,AckB}
  ∧ (CC-FACK ≠ CR-FACK)
  ∧ (CC-F=B ∨ (CR-F=LB ∧ CC-FACK=true)) ⇔ (CC-ACK=true)
  ∧ (CR-F=B ∨ (CC-F=LB ∧ CR-FACK=true)) ⇔ (CR-ACK=true)

Fig. 4. Refined specification variables

In the refined specification, the old events are refined and keep the same label. They now take place as the last event of any message (see Figure 5). Notice that the guard of the eject event has been reinforced so that the card is not ejected during a message transmission. Some new events are needed to send the blocks and the acknowledgement blocks: CC-blocksends, CR-blocksends, CC-acksends, CR-acksends.

Some fairness assumptions are necessary to ensure the dynamic properties. These conditions are satisfied by any environment which does not send an infinite sequence of blocks. This assumption is expressed by a list of fair events. In our example, the fair events must be CRsends, CCsends and eject.

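[Figure 5: execution trace along a time axis, shown at both levels for CC and CR. At the abstract level, Message 1 and Message 2 are exchanged. At the refined level, each message is a sequence of blocks B, acknowledgement blocks AckB and a closing last block LB. The numbers attached to the messages and blocks refer to the events listed below.]

Abstract level events: 1 CRsends, 2 CCsends, 3 eject.
Refined level events: 1 CRsends, 2 CCsends, 3 eject, 4 CC-blocksends, 5 CR-blocksends, 6 CC-acksends, 7 CR-acksends.

Fig. 5. Refined level execution trace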
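The two equivalences of the gluing invariant in Figure 4 fix the abstract acknowledgement variables once the refined ones are known. The sketch below reads them as a function from a refined state to the abstract state it is glued with. The dictionary encoding and the key CC_ST2 (standing for CC-ST') are assumptions made for illustration, and the two sample states are hypothetical.

```python
def glued_abstract_state(refined):
    """Abstract values forced by the gluing invariant of Figure 4."""
    cc_ack = bool(refined["CC_F"] == "B"
                  or (refined["CR_F"] == "LB" and refined["CC_FACK"]))
    cr_ack = bool(refined["CR_F"] == "B"
                  or (refined["CC_F"] == "LB" and refined["CR_FACK"]))
    # CC-ST' = CC-ST: the card status is copied unchanged.
    return {"CC_ACK": cc_ack, "CR_ACK": cr_ack, "CC_ST": refined["CC_ST2"]}

def abstract_invariant(a):
    """Invariant I1 of Figure 2 (typing of CC_ST and half-duplex constraint)."""
    return a["CC_ST"] in ("in", "out") and a["CC_ACK"] != a["CR_ACK"]

# Hypothetical refined state: the card reader has just sent an ordinary block B.
mid_message = {"CC_FACK": True, "CR_FACK": False,
               "CC_F": "AckB", "CR_F": "B", "CC_ST2": "in"}
# Hypothetical refined state: the card reader has just sent its last block LB.
end_message = {"CC_FACK": True, "CR_FACK": False,
               "CC_F": "AckB", "CR_F": "LB", "CC_ST2": "in"}

for state in (mid_message, end_message):
    abstract = glued_abstract_state(state)
    assert abstract_invariant(abstract)
    print(abstract)
```

In the first state the reader is still sending at the abstract level (CR_ACK is true); in the second the turn has passed to the card (CC_ACK is true), which matches the effect of the abstract CRsends event.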

2.2 Temporal Property Expression

We want to express some dynamic properties to be verified on the system previously described. Formula 1 is an example of a dynamic property which holds on the abstract system (Figure 3). It means that the card reader must send the first message if the card is not ejected.

EVENT SYSTEM CCI2
OLD EVENTS:
CRsends: SELECT (CR-FACK=true ∧ CC-ST'=in) ∧ (CC-F=AckB ∨ CC-F=LB)
  THEN CC-FACK,CR-FACK,CR-F := true,false,LB END
CCsends: SELECT (CR-FACK=true ∧ CC-ST'=in) ∧ (CR-F=AckB ∨ CR-F=LB)
  THEN CR-FACK,CC-FACK,CC-F := true,false,LB END
eject: SELECT ((CR-F=LB ∧ CC-FACK=true) ∨ (CC-F=LB ∧ CR-FACK=true)) ∧ CC-ST'=in
  THEN CC-ST' := out END
NEW EVENTS:
CC-blocksends: SELECT (CC-FACK=true ∧ CC-ST'=in) ∧ (CR-F=AckB ∨ CR-F=LB)
  THEN CC-F,CC-FACK,CR-FACK := B,false,true END
CR-blocksends: SELECT (CR-FACK=true ∧ CC-ST'=in) ∧ (CC-F=AckB ∨ CC-F=LB)
  THEN CR-F,CR-FACK,CC-FACK := B,false,true END
CC-acksends: SELECT CC-FACK=true ∧ CC-ST'=in ∧ CR-F=B
  THEN CC-F,CC-FACK,CR-FACK := AckB,false,true END
CR-acksends: SELECT CR-FACK=true ∧ CC-ST'=in ∧ CC-F=B
  THEN CR-F,CR-FACK,CC-FACK := AckB,false,true END
END CCI2

Fig. 6. Events of the refined specification

○(CC-ST=in ⇒ (CC-ACK=true ∧ CR-ACK=false))   (1)

Temporal properties and refinement. For instance, we want to write a property that ensures the alternation of messages. At the abstract level, we can express it by the two following formulae:

□((CC-ACK = true ∧ CC-ST = in) ⇒ ○(CC-ACK = false ∨ CC-ST = out))   (2)
□((CR-ACK = true ∧ CC-ST = in) ⇒ ○(CR-ACK = false ∨ CC-ST = out))   (3)


Formula 2 means that if the chip card is acknowledged and the card is inserted, then in the next state, the card will have emitted a message and will not be acknowledged anymore, or the card will be ejected. Formula 3 has the same meaning with respect to the card reader. These two properties can be summarized by this sentence: "the messages are sent alternately by each device unless the card is ejected".

If we want to verify these properties for the refined system, we have to reformulate them, since it is up to the user to express how the abstract properties are to be refined. Formula 4 illustrates a way to rewrite Formula 2:

□((CR-FACK = true ∧ CR-F = LB ∧ CC-ST' = in ∧ CC-F ≠ LB) ⇒ ○((CR-F = AckB) U (CC-F = LB ∨ CC-ST' = out)))   (4)

The meaning of this formula is as follows: "If the card reader must emit and has just finished sending a message, and the card has not sent a message back, then from the next step, the card reader will acknowledge the card blocks until the card sends a last block or the card is ejected". We have to write a similar formula for refining Formula 3:

□((CC-FACK = true ∧ CC-F = LB ∧ CC-ST' = in ∧ CR-F ≠ LB) ⇒ ○((CC-F = AckB) U (CR-F = LB ∨ CC-ST' = out)))   (5)

Both formulae still express the alternation of messages, in terms of the new variables and new events. We say that Formula 4 (resp. Formula 5) refines the abstract Formula 2 (resp. Formula 3) because they have the same meaning, although expressed at a different level of abstraction.
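On a system as small as the abstract one, the abstract Formulae 1 to 3 can be checked by direct enumeration rather than with a PLTL model-checker. The sketch below is only an illustration under stated assumptions: states are tuples (CC_ACK, CR_ACK, CC_ST), Formulae 2 and 3 are read as "always, if p holds now then q holds in the next state", and Formula 1 is read, as an assumption, as a constraint on the states that directly follow the initial state. States without successors (card ejected) satisfy such checks vacuously here, which simplifies the infinite-path semantics of PLTL.

```python
def successors(s):
    """Successor states of (CC_ACK, CR_ACK, CC_ST) under the events of Figure 2."""
    cc_ack, cr_ack, cc_st = s
    if cc_st != "in":
        return []                          # every guard requires CC-ST = in
    succ = [(cc_ack, cr_ack, "out")]       # eject
    if cr_ack:
        succ.append((True, False, "in"))   # CRsends
    if cc_ack:
        succ.append((False, True, "in"))   # CCsends
    return succ

def reachable(init):
    """All states reachable from init."""
    seen, todo = {init}, [init]
    while todo:
        for t in successors(todo.pop()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

def always_next(p, q, init):
    """Every successor of a reachable state satisfying p must satisfy q."""
    return all(q(t) for s in reachable(init) if p(s) for t in successors(s))

INIT = (False, True, "in")

# Formula 1 (assumed reading): after the first step, if the card is still in,
# then CC_ACK is true and CR_ACK is false.
formula1 = all(t[2] != "in" or (t[0] and not t[1]) for t in successors(INIT))
# Formulae 2 and 3: alternation of messages unless the card is ejected.
formula2 = always_next(lambda s: s[0] and s[2] == "in",
                       lambda s: (not s[0]) or s[2] == "out", INIT)
formula3 = always_next(lambda s: s[1] and s[2] == "in",
                       lambda s: (not s[1]) or s[2] == "out", INIT)

print(formula1, formula2, formula3)   # True True True
```

A next-state pattern only needs one pass over the transitions, which is why no path exploration is required here; the until patterns of Formulae 4 and 5 would need a genuine path-based check.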

3 About Refinement

Transition systems are the operational semantics of the B event systems because of the PLTL semantics. On the abstract level, a PLTL property P is verified on the event system by model-checking on a transition system which is its operational model. This supposes that the set of states of this transition system is finite. In this section, we present a refinement relation between the set of states S2 of the refined transition system TS2 and the set of states S1 of the abstract transition system TS1, which determines a relation between the paths of the transition systems modeling the corresponding event systems. This relation has been studied thoroughly in [2]. As in B, the important assumption is that the new events do not take the control forever. However, in our approach, this is verified by a model state enumeration. The refinement verification is linear in the number of states of the refined system. This way we prevent the state explosion coming from the PLTL model-checking itself.


As for the refinement in B, the conjunction of the abstract system invariant I1 and the gluing invariant I2 determines a relation µ between the refined and abstract states. The refinement relation η restricts µ, taking into account that the new events do not take the control forever and that the nondeterminism may decrease. The relation η between states implies a relation between the refined paths and some abstract paths of the transition systems. Figure 7 gives an example of two related paths. As usual, PLTL model-checking is based on the labeling of each state by the set of propositions holding on it. By the refinement definition from [2], it is very important to ensure that any event which is taken on the abstract path is also eventually taken on the refined path, preceded by some new events.

[Figure 7: two related paths whose states are linked by µ. The transitions labelled e(n−1) and e(n) appear on both paths.]

Fig. 7. Path refinement example
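The requirement that new events do not take the control forever can be approximated on an enumerated transition system by looking for a cycle made only of new-event transitions. The sketch below is one simple, stricter-than-necessary check and not the refinement verification of [2]: it ignores fairness assumptions, so it would reject a system in which such a cycle exists but is ruled out by fair old events. The transition triples and event names in the example are hypothetical.

```python
def new_events_can_loop(transitions, new_events):
    """True if some cycle uses only transitions labelled by new events."""
    graph = {}
    for src, label, dst in transitions:
        if label in new_events:
            graph.setdefault(src, []).append(dst)

    WHITE, GREY, BLACK = 0, 1, 2       # unvisited / on the DFS stack / finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, []):
            c = colour.get(nxt, WHITE)
            if c == GREY or (c == WHITE and visit(nxt)):
                return True            # back edge: a cycle of new events
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(graph))

# Hypothetical refined paths: (source state, event label, target state).
TRACE = [("a", "new1", "b"), ("b", "new2", "c"), ("c", "old", "a")]
LOOP = [("a", "new1", "b"), ("b", "new2", "a")]

print(new_events_can_loop(TRACE, {"new1", "new2"}))   # False
print(new_events_can_loop(LOOP, {"new1", "new2"}))    # True
```

The depth-first search visits each state and each new-event transition once, so the check is linear in the size of the refined transition system.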

The next section describes how to deduce sufficient verification conditions from the syntax of the PLTL property at both the refined and the abstract levels, together with the invariants of the path refinement relation. These conditions are justified under the assumption of a successful model-checking of the refinement relation η between the refined and the abstract systems. The way to obtain the conditions is to have propositions on linked states, which we call building blocks, and a compositionality result to put them together. This composition must follow the semantics of the abstract and the refined PLTL formulae of the same property.

4 Reformulated Dynamic Property Verification

In this section, we explain how to verify the reformulated dynamic properties through refinement. We suppose that the system TS2 of state space S2 refines the abstract system TS1 of state space S1, and we exploit this refinement to show that if a property P holds on the abstract system then the reformulated property P holds on the refined system. We hope to avoid the PLTL property model-checking explosion which is likely to happen during the refinement by
– model-checking on each module and not on the whole transition system, as explained in [4];
– providing sufficient conditions to verify the reformulated PLTL property P.

In this paper, we are concerned only with the second point. These conditions are first-order predicate formulae where the predicate domains are limited to finite sets, so that these conditions are easily decidable by any theorem prover. Moreover, this condition set depends on the formulation of the PLTL property P at both levels. In this section, we determine two kinds of conditions. The first set does not take the events into account. These conditions are often too weak to prove the PLTL formulae. So, we consider how the new events are interwoven among the old events to exhibit stronger conditions. Therefore, as in the B proof obligations for refinement, these last conditions are formulated using the guards and the generalized substitutions of the events.

4.1 Refinement and Dynamic Properties

The dynamic properties that can be expressed in the B event systems are
– either a dynamic invariant, which indicates how the variables of the system are authorized to evolve; this corresponds roughly to a PLTL formula involving the next operator, such as □(p ⇒ ○q);
– or the B modalities, which have PLTL equivalents in the patterns □(p ⇒ ◇q) and □(p ⇒ (p U r)).

The dynamic invariant is preserved through refinement. The modalities are preserved if their corresponding proof obligations hold. Generally, a PLTL formula of the pattern □(p1 ⇒ ○q1) is formulated again at a refined level as the pattern □(p2 ⇒ ○q2), □(p2 ⇒ ◇q2) or □(p2 ⇒ ○(q2 U r2)). It can be a more complicated expression. We call a refinement pattern a pair of a PLTL pattern and its reformulated pattern. Notice that in a given pattern, the variables are propositional variables.

Our approach allows the user to have more possibilities to express properties through refinement than in B. On the one hand, the preservation of the dynamic invariant through the B refinement seems to correspond to the refinement pattern □(p1 ⇒ ○q1), □(p2 ⇒ ◇q2) (for short, ○◇). On the other hand, the B modality preservations correspond to a reformulation by the refinement patterns ◇◇ and UU. Again, our refinement patterns offer more possibilities. First, the pattern U is □(p1 ⇒ (q1 U r1)) whereas the B modality is □(p1 ⇒ (p1 U r1)). Second, a pattern U may evolve into a pattern ◇. Notice that it is inconceivable that a pattern ◇ evolves into a pattern U, even more so into a pattern ○. The direction of the implication between the patterns of the pair is naturally mirrored by the direction of the refinement. We have discussed the pattern evolution through refinement in [10].

The sufficient proof conditions are deduced from the PLTL pattern semantics. So, we are not limited to a small set of refinement patterns. Our experience shows that in most applications the same small number of refinement patterns is used. For example, our experiments on the PLTL formulae expressing the ISO norm about the protocol T=1 show:
– 4 refinement patterns ○○,
– 13 refinement patterns ○U,
– 14 refinement patterns ◇◇,
– 5 refinement patterns UU.

However, a small number of more complicated refinement patterns may be original to a particular application, but it is generally easy to build a corresponding sufficient condition set, as shown in the next section.

4.2 Weak Sufficient Conditions

Let us consider the refinement pattern UU. Suppose a formula of pattern □(p1 ⇒ (q1 U r1)) holds on the paths of the abstract transition system TS1. We want to have sufficient conditions for the pattern □(p2 ⇒ (q2 U r2)) holding on the paths of a refined transition system TS2. From the semantics of U and from the path refinement relation as shown in Fig. 7, we deduce the following set of sufficient conditions.
– A beginning condition. Assume p2 is satisfied on a state s2, and let s1 be the abstract state such that s2 together with s1 satisfy I2 ∧ I1. Then p1 must be satisfied by s1. From that we deduce a first condition p2 ∧ I2 ∧ I1 ⇒ p1.
– A maintenance condition. The proposition q1 must hold on each state s of any path of the abstract system beginning in s1 before the satisfaction of r1. So q2 must also hold on each state s′ of any path of a refined system beginning in s2 before the satisfaction of r2. By refinement, s and s′ satisfy I2 ∧ I1. From that we deduce a second condition q1 ∧ I2 ∧ I1 ⇒ q2.
– An ending condition. On any path after s1 there exists a state t satisfying r1. So, if r2 holds on a state t′ such that t and t′ satisfy I2 ∧ I1, we are done. We deduce the third condition r1 ∧ I2 ∧ I1 ⇒ r2.

We see that we have two kinds of implications, one from an abstract system property to a refined system property (either for an ending condition or a maintenance condition), and the other from a refined system property to an abstract system property (for a beginning condition) (see Figure 8).

[Figure 8: three panels, each relating states of an abstract path to the glued states of a refined path: a beginning building block (states labelled p1 and p2), a maintenance building block, and an ending building block (states labelled r1 and r2).]

Fig. 8. Building blocks
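Because the variable domains are finite, the three weak conditions for the pattern UU can be checked by enumerating every valuation of the abstract and refined variables. The sketch below shows this on a hypothetical toy system, not on the protocol T=1: an abstract variable a with values wait, busy, done and a refined counter c in 0..3, glued by an invariant chosen for the example. All names are illustrative assumptions.

```python
from itertools import product

def valid(formula, domains):
    """True if formula holds for every valuation of the finite domains."""
    names = list(domains)
    return all(formula(dict(zip(names, values)))
               for values in product(*(domains[n] for n in names)))

def weak_conditions_UU(p1, q1, r1, p2, q2, r2, I1, I2, domains):
    """Weak sufficient conditions for the refinement pattern UU."""
    glue = lambda v: I1(v) and I2(v)
    return {
        "beginning":   valid(lambda v: not (p2(v) and glue(v)) or p1(v), domains),
        "maintenance": valid(lambda v: not (q1(v) and glue(v)) or q2(v), domains),
        "ending":      valid(lambda v: not (r1(v) and glue(v)) or r2(v), domains),
    }

# Hypothetical toy system: abstract variable a, refined counter c.
DOMAINS = {"a": ("wait", "busy", "done"), "c": (0, 1, 2, 3)}
I1 = lambda v: v["a"] in DOMAINS["a"]
# Assumed gluing invariant: wait <-> c = 0, busy <-> c in {1, 2}, done <-> c = 3.
I2 = lambda v: {"wait": v["c"] == 0, "busy": v["c"] in (1, 2), "done": v["c"] == 3}[v["a"]]

props = dict(
    p1=lambda v: v["a"] == "wait", q1=lambda v: v["a"] != "done", r1=lambda v: v["a"] == "done",
    p2=lambda v: v["c"] == 0,      q2=lambda v: v["c"] < 3,       r2=lambda v: v["c"] == 3,
)

with_glue = weak_conditions_UU(I1=I1, I2=I2, domains=DOMAINS, **props)
no_glue = weak_conditions_UU(I1=I1, I2=lambda v: True, domains=DOMAINS, **props)
print(with_glue)   # all three conditions are valid
print(no_glue)     # all three fail once the gluing invariant is dropped
```

The second call illustrates a point made in the text: a failed condition may come from a gluing invariant that is too weak, not from a false refined formula.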

We now give theorems providing a building block for a beginning condition.

Theorem 1. Given an abstract transition system TS1 of state space S1, and a transition system TS2 of state space S2 refining TS1, let I1 be the invariant of TS1, and I2 be the gluing invariant. Each state s1 (∈ S1) glued with a state s2 (∈ S2) on which a proposition p2 holds satisfies a proposition p1 if the condition p2 ∧ I2 ∧ I1 ⇒ p1 holds on s2 ∧ s1.

Proof. Immediate by the following. Let s2 ∈ S2 be a state satisfying p2. Let s1 ∈ S1 be a state glued with s2. Then they satisfy p2 ∧ I1 ∧ I2. Since p2 ∧ I2 ∧ I1 ⇒ p1, the property p1, which contains only variables of TS1, holds on s1.

Theorem 2. The condition stated by Theorem 1 is a building block for a beginning condition of any refinement pattern □(p1 ⇒ Q1), □(p2 ⇒ Q2) where Q1 and Q2 are PLTL formulae.

Proof. Immediate by the following. If a refined path begins in a state satisfying p2, then it is necessarily glued with all the states in S1 satisfying p1.

We propose another building block, either for a maintenance condition or for an ending condition.

Theorem 3. Given an abstract transition system TS1 and a transition system TS2 refining TS1, let I1 be the invariant of TS1, and I2 be the gluing invariant. Each state s2 (∈ S2) glued with a state s1 (∈ S1) on which a proposition p1 holds satisfies a proposition p2 if the condition p1 ∧ I2 ∧ I1 ⇒ p2 holds on s2 ∧ s1.

Proof. The proof is the same as for Theorem 1.

Theorem 4. The condition stated by Theorem 3 is a building block for an ending condition of any refinement pattern □Q1, □Q2 where p1 is an eventuality which occurs in the PLTL formula Q1 and p2 is an eventuality which occurs in the PLTL formula Q2.

Proof. Immediate by the following. If an abstract state s1 satisfying p1 occurs in a path of S1, then all the states in S2 glued with s1 are compelled to satisfy p2.

As a consequence of Theorem 3, a maintenance condition can be deduced according to the following argument. Let s1 be a state in S1 for which p1 holds. In a path of S2, all the states which begin a transition refining skip (these transitions are labelled by new events) and which are glued with s1 satisfy p2.
An ending condition can be deduced in the same manner. However, it may be too weak when new events appear since, in this case, the condition may keep the termination p2 too long. We now have a way to construct the set of weak sufficient conditions associated with one of the following often used refinement patterns: ○○, ○U, ○◇, UU, U◇, ◇◇. For instance, the set of weak sufficient conditions for the refinement pattern □(p1 ⇒ ○q1), □(p2 ⇒ ○(q2 U r2)) is (see Figure 9):

p2 ∧ I2 ∧ I1 ⇒ p1, a beginning condition
p1 ∧ I2 ∧ I1 ⇒ ○q2, a maintenance condition
q1 ∧ I2 ∧ I1 ⇒ r2, an ending condition

[Figure 9: an abstract transition e from a state labelled p1 to a state labelled q1, glued to a refined path whose states are labelled p2, then q2, and finally r2 after the refined transition e.]

Fig. 9. Pattern ○U

Moreover, the building blocks can also be used to deduce weak sufficient conditions for more complex refinement patterns. Consider, for example, a refinement pattern □(p1 ⇒ (q1 U r1)), □(p2 ⇒ ◇(q2 ⇒ q2 U r2)). Its set of weak sufficient conditions is the following:

p2 ∧ I2 ∧ I1 ⇒ p1, a beginning condition
q1 ∧ I2 ∧ I1 ⇒ q2, a maintenance condition
r1 ∧ I2 ∧ I1 ⇒ r2, an ending condition

Unfortunately, some of these building blocks are often too weak for the proof, either because they are not precise enough to express the semantics of the refinement pattern or because the proof fails. The next section presents strong sufficient conditions, which are used in our verification process when a weak sufficient condition fails to prove an instance of a refinement pattern. Obviously, the cause of the failure may not come from the conditions but from the incorrectness of the refined formula (error of expression or error in the pattern evolution), or even from the incorrectness or the insufficiency of the gluing invariant. The problem with the invariant happens only if the refinement relation [2] does not hold. So, the refinement verification eliminates this cause of failure.

4.3 Strong Sufficient Conditions

Consider the refinement pattern □(p1 ⇒ ○q1), □(p2 ⇒ ○(q2 U r2)) of Formula 2 refined by Formula 4, and of Formula 3 refined by Formula 5, of the protocol T=1. Its set of weak sufficient conditions is:

p2 ∧ I2 ∧ I1 ⇒ p1, a beginning condition
p1 ∧ I2 ∧ I1 ⇒ ○q2, a maintenance condition
q1 ∧ I2 ∧ I1 ⇒ r2, an ending condition

We can notice that the PLTL operator ○ occurs in the maintenance condition. There is no way to express the maintenance of q2 without using the operator ○, since a weak maintenance condition is unable to express the semantics of ○(q2 U r2). Fortunately, from the semantics of the above formula, we can deduce a stronger sufficient condition saying that any new event enabled by a refined state s2 glued to an abstract state s1 for which p1 holds changes s2 into a refined state for which q2 holds. The strong maintenance sufficient condition is then

∀ e ∈ NewEvents, (Ge ∧ p1 ∧ I1 ∧ I2) ⇒ [Se]q2

where Ge is the guard of an event e, and Se is its generalized substitution. The proof of this strong maintenance sufficient condition succeeds for the PLTL Formula 2 refined by Formula 4 and Formula 3 refined by Formula 5. Strong sufficient conditions are also required by the following situations:
– a failure of a weak sufficient condition;
– the persistence of the PLTL operator ○ in a refinement pattern (the above case is a particular case of this situation).

Failure case. In the failure case, we have to try a strong sufficient condition based on the new events (refining skip) interwoven among the old events. For the protocol T=1 of Section 2, the refinement verification of the PLTL Formulae 4 and 5 fails in proving its ending condition q1 ∧ I2 ∧ I1 ⇒ r2. It has to be tried on the following strong ending sufficient condition:

∀ e ∈ OldEvents, (G1e ∧ p1 ∧ G2e ∧ q2 ∧ I1 ∧ I2) ⇒ [S2e]r2

where S2e is the generalized substitution of an old event e in the refined system, G2e is its guard, and G1e is the guard of the event e in the abstract system. In other words, each old event e enabled by a refined state s2 which satisfies q2 and is glued with an abstract state s1 satisfying p1 changes s2 into a state satisfying r2. The proof of this strong ending sufficient condition succeeds for the PLTL Formula 5 (see Appendix).

The persistence of the PLTL operator ○ in a pattern evolution. We can imagine three plausible refinement patterns coming from the abstract pattern p1 ⇒ ○q1.
– Most likely, the pattern evolves into an eventuality pattern, either ◇ or U, because of transitions refining skip.
– However, in a few cases, it may happen that the property is not concerned with the new events. Obviously, this happens when no new event can precede the event concerned with the ○.

For instance, a reformulation of the PLTL Formula 2 of pattern □(p1 ⇒ ○q1) into the PLTL Formula 4 of pattern □(p2 ⇒ ○(q2 U r2)) allows the card to send messages block after block. However, in another kind of protocol, it is possible that only the card reader could do that. Then, Formula 2 would be refined into □(p2 ⇒ ○q2), so that the card is forced to answer next with only one block. We have the following strong sufficient conditions.


– ∀ enew ∈ NewEvents, ¬(Genew ∧ p1 ∧ p2 ∧ I1 ∧ I2), which means that no new event is enabled by a state s2 which satisfies p2 and is glued with an abstract state s1 satisfying p1.
– ∀ eold ∈ OldEvents, (G2eold ∧ p2 ∧ G1eold ∧ p1 ∧ I1 ∧ I2) ⇒ [S2eold]q2, which means that any old event, enabled by a state s2 satisfying p2 and glued with an abstract state s1 satisfying p1, changes s2 into a state satisfying q2.

Notice that the strong sufficient conditions are universally quantified on either the set of the new events or the set of the old events of a refined system. As for the deduction of the weak sufficient conditions, we can exhibit building blocks, but they take into account the guards and the generalized substitutions of the involved events. Again, for a given refinement pattern, we get a constructive way to find the set of strong sufficient conditions by using building blocks according to their respective semantics.
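The strong conditions quantify over events, so on finite domains they can again be checked by enumeration, with each event given as a guard and a substitution. The sketch below checks a strong maintenance condition and a strong ending condition on a hypothetical toy refinement (not the protocol T=1): one abstract event finish refined by a new event step followed by the old event finish. It assumes deterministic substitutions, so that [S]q is simply q evaluated after applying S. All names and the gluing invariant are illustrative assumptions.

```python
from itertools import product

DOMAINS = {"a": ("wait", "done"), "c": (0, 1, 2, 3)}   # hypothetical toy variables

def valuations():
    names = list(DOMAINS)
    for values in product(*DOMAINS.values()):
        yield dict(zip(names, values))

I1 = lambda v: v["a"] in DOMAINS["a"]
I2 = lambda v: (v["a"] == "wait") == (v["c"] < 3)      # assumed gluing invariant
p1 = lambda v: v["a"] == "wait"
q2 = lambda v: 1 <= v["c"] <= 2
r2 = lambda v: v["c"] == 3

# Refined events: name -> (guard, deterministic substitution on c).
NEW_EVENTS = {"step": (lambda v: v["c"] < 2, lambda v: {**v, "c": v["c"] + 1})}
OLD_EVENTS = {"finish": (lambda v: v["c"] == 2, lambda v: {**v, "c": 3})}
ABSTRACT_GUARDS = {"finish": lambda v: v["a"] == "wait"}

def strong_maintenance(new_events=NEW_EVENTS):
    """For all new events e: (Ge and p1 and I1 and I2) implies [Se]q2."""
    return all(q2(subst(v))
               for guard, subst in new_events.values()
               for v in valuations()
               if guard(v) and p1(v) and I1(v) and I2(v))

def strong_ending(old_events=OLD_EVENTS):
    """For all old events e: (G1e and p1 and G2e and q2 and I1 and I2) implies [S2e]r2."""
    return all(r2(subst(v))
               for name, (guard2, subst) in old_events.items()
               for v in valuations()
               if ABSTRACT_GUARDS[name](v) and p1(v) and guard2(v)
               and q2(v) and I1(v) and I2(v))

print(strong_maintenance(), strong_ending())   # True True

# A hypothetical new event that jumps straight to c = 3 breaks maintenance.
BAD = {"jump": (lambda v: v["c"] < 2, lambda v: {**v, "c": 3})}
print(strong_maintenance(BAD))                 # False
```

For nondeterministic substitutions, [S]q would have to hold for every possible outcome of S, so the substitution would return a set of states instead of a single one.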

5 Tools

The above ideas are being implemented in a verification tool kit which is presented in Figure 10. For more details, see, for instance, [3]. We propose the six following verification components:
1. A component which, given an abstract and a refined B event system, verifies the refinement conditions while constructing the transition systems of the modules (see [1] in Figure 10).
2. A component which, given a PLTL formula and its refined version, verifies (by proof) that the refined property still holds on the refined system (see [2] in Figure 10).
3. A component which, given the refined specification, adds the fairness assumptions to the PLTL formulae (see [3] in Figure 10).
4. We use Dotty¹ to visualize the transition system of each module (see [4] in Figure 10).
5. A component which takes a new PLTL property and a set of modules, and attempts to remove some of them (see [5] in Figure 10) from the model-checking using proof of invariants.
6. We use a parallelized version of SPIN to verify the new PLTL formulae on each module.

[Figure 10: data flow diagram. Inputs: the abstract specification (B event system, LTL properties, fairness assumptions) and the refined specification (B event system, LTL properties, fairness). Components: [1] Refinement Verifier and Module Constructor, [2] Reformulated LTL Verifier, [3] Fairness Assumptions Generator, [4] Module Visualisation, [5] Module Elimination, [6] Parallel Verification of Modules with a Parallel SPIN Version. Intermediate data: old and new assumptions, new LTL properties under fairness assumptions, module transition systems, modules.]

Fig. 10. Verification tool kit data flow diagram
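For the visualisation step, a transition system only has to be written in the DOT graph language, which Dotty reads. The sketch below is a minimal illustration and not the tool kit's own exporter: it assumes transitions are (source, label, target) triples of strings, and the state names used for the abstract system of Figure 3 are made up for the example.

```python
def to_dot(transitions, name="CCI1"):
    """Render labelled transitions as a DOT digraph (one edge per transition)."""
    lines = [f"digraph {name} {{"]
    for src, label, dst in sorted(transitions):
        lines.append(f'  "{src}" -> "{dst}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)

# Abstract transition system of Figure 3, with hypothetical state names:
# "CR,in" = card reader's turn, card inserted; "CC,out" = chip card's turn, card ejected.
ABSTRACT = [
    ("CR,in", "CRsends", "CC,in"),
    ("CC,in", "CCsends", "CR,in"),
    ("CR,in", "eject", "CR,out"),
    ("CC,in", "eject", "CC,out"),
]

print(to_dot(ABSTRACT))
```

The printed text can be saved to a file and opened with a DOT viewer.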

6 Related and Future Works

Our approach uses both automatic-proof and model-checking techniques. This cooperation is made possible by a refinement methodology [10,11] which has been proposed to specify and to verify the dynamic properties of finite B event systems expressed in PLTL. In this paper, we show that the verification can be fully automatic for finite state systems using no variant and no loop invariant. Given a PLTL formula and its reformulation, we get a systematic construction for finding a set of propositions (building blocks) which suffices to ensure that the refined property holds on the refined system. Since these conditions are only sufficient, a failed proof does not mean that the property is false. There are two main causes of failure:
– either the gluing invariant is too weak,
– or the property is false, but perhaps only outside the reachable state set.
Notice that it is the same as in B, where an invariant which does not hold on the whole state space could be true on the reachable state space.

Reformulating a property through refinement is useful for three reasons. First, in order to establish the gluing invariant, the user can have a path refinement style of reasoning and not only a variable relationship one. Second, it allows us to deal with the model-checking explosion problem, since we avoid model-checking the reformulated property by proving either weak or strong sufficient conditions. Third, it opens up an original solution to combine proof and model-checking techniques. This solution is based on refinement.

Other ways to combine proof and model-checking exist. For example, the provers HOL/VOSS [9], TLP/COSPAN [12] and PVS/SMV [15] propose formal abstraction to be used to verify safety properties. For the other kinds of properties, another method, presented in [6] by J. Dingel and T. Filkorn, is based on the assumption/commitment model-checker SVE [7] relying on an abstraction which must be related to the initial specification by a homomorphism. Then, a prover is used to verify the abstraction homomorphism and the assumptions for the model-checking. The user-designed proof obligations come from counterexamples of properties which hold on the abstract model and do not hold on the specification. P. Wolper in [17] proposes a technique where model-checking is used to verify an induction hypothesis.

¹ http://research.att.com/sw/tools

We refine instead of abstracting. However, our approach can be compared with the propositions made by J. Dingel and T. Filkorn because they are both based on model-checking on the abstract system and on proof. Nevertheless, there are two main differences w.r.t. our approach. First, the assumptions that are discovered by their method may be difficult to verify automatically because they are liveness or fairness assumptions (see, for instance, the example in [6]). Second, in our refinement approach, the sufficient conditions we presented in this paper are easier to verify by proof than theirs.

It would obviously be worthwhile to extend verification to infinite state systems. We want to use a combination of proof and model-checking to verify parameterized systems. The refinement verification cannot be done by a model state enumeration. However, the results in [2] indicate a close link between our refinement relation definition and the refusal simulation defined by I. Ulidowski in [16]. This kind of simulation is implementable for finite-branching systems.

For the fairness, our approach is different from the proposition of J-R. Abrial which requires that fairness is necessarily specified by a scheduler. So, there is no need of fairness assumptions, which are dynamic properties outside the scope of a set logic prover. However, sometimes it may be easier to specify fairness assumptions about the environment than to specify a scheduler. For example, it is well-known that the applications which use the protocol T=1 to transmit messages do not send messages of infinite length. Without changing the model, this fact translates immediately as strong fairness requirements on the events CRsends and CCsends.
In this case, the specification of a fair scheduler should lead to an infinite model. Our choice is to let the user be free to use one or the other approach. Notice that, the solution providing fairness assumptions at the disposal of PLTL model-checking is convenient not only because it allows keeping a finite system but also because fairness assumptions are translated as temporal properties. However, the fairness must be preserved by the refinement—a problem we are currently working on.

References

1. J. R. Abrial and L. Mussat. Introducing dynamic constraints in B. In Second Conference on the B method, LNCS 1393, pages 83–128, Nantes, France, April 1998. Springer Verlag.
2. F. Bellegarde, J. Julliand, and O. Kouchnarenko. Ready-simulation is not ready to express a modular refinement relation. In Proc. Int. Conf. on Fundamental Approaches to Software Engineering-2000 (FASE'2000). Springer Verlag, April 2000. LNCS to appear.
3. F. Bellegarde, J. Julliand, and H. Mountassir. Model-based verification through refinement of finite B event systems. In Formal Method'99 B User Group Meeting, CDROM publication, 1999.
4. F. Bellegarde, J. Julliand, and H. Mountassir. Partition of the state space through a B event system refinement for a modular verification of PLTL properties. In Submitted to the ZB 2000 Conference, 2000.
5. Comité Européen de Normalisation. En27816-3, European Standard - Identification Cards - Integrated circuit(s) cards with contacts - Electronic signal and transmission protocols. Technical report, ISO/CEI 7816-3, 1992.
6. J. Dingel and T. Filkorn. Model-checking for infinite state systems using data abstraction, assumption-commitment style reasoning and theorem proving. In Pierre Wolper, editor, Proc. 6th Int. Conf. On Computer Aided Verification (CAV95), Lecture Notes in Computer Science 939, pages 54–69. Springer Verlag, July 1995.
7. T. Filkorn, H.A. Schneider, A. Scholtz, et al. SVE user's guide. zbe bt se 1-sve-1. Technical report, Siemens AG, Münich, 1994.
8. G. Holzmann. Design and validation of protocols. Prentice Hall software series, 1991.
9. J.J. Joyce and C.H. Seger. Linking BDD-based symbolic evaluation to interactive theorem proving. In Association for Computing Machinery, editor, Proceedings of the 30th Design Automation Conference, 1993.
10. J. Julliand, F. Bellegarde, and B. Parreaux. De l'expression des besoins à l'expression formelle des propriétés dynamiques. Technique et Science Informatiques, 18(7), 1999.
11. J. Julliand, P.A. Masson, and H. Mountassir. Modular verification of dynamic properties for reactive systems. In K. Araki, A. Galloway, and K. Taguchi, editors, International Workshop on Integrated Formal Methods (IFM'99), pages 89–108, York, Great Britain, 1999. Springer.
12. R.P. Kurshan. Automata-Theoretic Verification of Coordinating Processes. Princeton University Press, 1993.
13. L. Lamport. A temporal logic of actions. ACM Transactions on Programming Languages and Systems, 16(3):872–923, May 1994.
14. Z. Manna and A. Pnueli. The Temporal Logic of Reactive and Concurrent Systems: Specification. Springer-Verlag, ISBN 0-387-97664-7, 1992.
15. S. Owre, J.M. Rushby, and N. Shankar. PVS: A prototype verification system. In Deepak Kapur, editor, 11th International Conference on Automated Deduction, number 607 in Lecture Notes in Artificial Intelligence, pages 748–752. Springer Verlag, June 1992.
16. I. Ulidowski. Equivalences on observable processes. In Proceedings of the 7th Annual IEEE Symposium on Logic in Computer Sciences, New-York, IEEE Computer Society Press, pages 148–161, 1992.
17. P. Wolper and V. Lovinfosse. Verifying properties of large sets of processes with network invariants. In J. Sifakis, editor, International Workshop on Automatic Verification Methods for Finite State Systems, number 407 in Lecture Notes in Computer Sciences, pages 68–80, Grenoble, June 1989. Springer Verlag.

A Proofs

The proofs will be performed on the properties 3 and 5: □(p1 → q1) with:
– p1 ≡ (CR-ACK=true ∧ CC-ST=in)
– q1 ≡ (CR-ACK=false ∨ CC-ST=out)
is refined by □(p2 → (q2 U r2)) with:
– p2 ≡ (CC-FACK=true ∧ CC-F=LB ∧ CC-ST'=in ∧ CR-F ≠ LB)
– q2 ≡ (CC-F=AckB)
– r2 ≡ (CR-F=LB ∨ CC-ST'=out)

For this property, there are three conditions (see page 243 sqq.) to be proved in order to ensure the refined property at the refined level:
1. p2 ∧ I1 ∧ I2 ⇒ p1
2. ∀e ∈ OldEvents, G1_e ∧ p1 ∧ G2_e ∧ q2 ∧ I1 ∧ I2 ⇒ [S2_e]r2
3. ∀e ∈ NewEvents, (G_e ∧ p1 ∧ I1 ∧ I2) ⇒ [S_e]q2

Some proof sketches follow.

1. This condition is proved for each conjunct of p1:
– CC-FACK=true ∧ CC-F=LB ∧ CC-ST'=in ∧ CR-F ≠ LB ∧ I1 ∧ I2 ⇒ CC-ST=in.
  Proof: Immediate because I2 ⇒ CC-ST=CC-ST'.
– CC-FACK=true ∧ CC-F=LB ∧ CC-ST'=in ∧ CR-F ≠ LB ∧ I1 ∧ I2 ⇒ CR-ACK=true.
  Proof: There are two cases:
  a) CR-F=B: if CR-F=B, then CR-ACK=true because I2 ⇒ (CR-F=B ⇒ CR-ACK=true).
  b) CR-F=AckB: if CC-FACK=true, then CR-FACK=false (because of I2); if CR-F ≠ B ∧ CR-FACK ≠ true, then CR-ACK=false (because of I2). If CR-ACK=false, then CC-ACK=true; if CC-ACK=true, we must have (CC-F=B ∨ (CR-F=LB ∧ CC-FACK=true)) (because of I2). So, there is a contradiction in the left part of the implication, and then the whole implication is true.

2. This condition must be verified with each abstract event:
– Let e be eject. We have to show that (CC-ST=in ∧ CR-ACK=true ∧ CC-ST=in) ∧ (((CR-F=LB ∧ CC-FACK=true) ∨ (CC-F=LB ∧ CR-FACK=true)) ∧ CC-ST'=in) ∧ CC-F=AckB ∧ I1 ∧ I2 ⇒ [CC-ST' := out](CR-F=LB ∨ CC-ST'=out).
  Proof: Immediate because after the substitution we have out=out.
– Let e be CRsends. We have to show that (CR-ACK=true ∧ CC-ST=in ∧ CR-ACK=true ∧ CC-ST=in) ∧ ((CR-FACK=true ∧ CC-ST'=in) ∧ (CC-F=AckB ∨ CC-F=LB)) ∧ CC-F=AckB ∧ I1 ∧ I2 ⇒ [CC-FACK, CR-FACK, CR-F := true, false, LB](CR-F=LB ∨ CC-ST'=out).
  Proof: Immediate because after the substitution we have LB=LB.
– Let e be CCsends. We have to show that (CC-ACK=true ∧ CC-ST=in ∧ CR-ACK=true ∧ CC-ST=in) ∧ (CR-FACK=true ∧ CC-ST'=in) ∧ (CR-F=AckB ∨ CR-F=LB) ∧ CC-F=AckB ∧ I1 ∧ I2 ⇒ [CR-FACK, CC-FACK, CC-F := true, false, LB](CR-F=LB ∨ CC-ST'=out).
  Proof: The implication's left part implies CC-ACK=true ∧ CR-ACK=true, which is false because of I1. So false ⇒ (CR-F=LB ∨ CC-ST'=out) is true, so the condition is verified.

3. This condition must be verified for each new event:
– Let e be CC-blocksends. We have to show that (CC-FACK=true ∧ CC-ST'=in) ∧ (CR-F=AckB ∨ CR-F=LB) ∧ (CR-ACK=true ∧ CC-ST=in) ∧ I1 ∧ I2 ⇒ [CC-F, CC-FACK, CR-FACK := B, false, true](CC-F=AckB).
  Proof: CR-F=LB ∧ CC-FACK=true ⇒ CC-ACK=true (because of I2). CC-ACK=true ∧ CR-ACK=true ⇒ false (because of I1). false ⇒ CR-F=B is true, so the condition is verified.
– Let e be CR-blocksends. We have to show that (CR-FACK=true ∧ CC-ST'=in) ∧ (CC-F=AckB ∨ CC-F=LB) ∧ (CR-ACK=true ∧ CC-ST=in) ∧ I1 ∧ I2 ⇒ [CR-F, CR-FACK, CC-FACK := B, false, true](CC-F=AckB).
  Proof: Obvious because CC-F=AckB ⇒ CC-F=AckB.
– Let e be CC-acksends. We have to show that (CC-FACK=true ∧ CC-ST'=in ∧ CR-F=B) ∧ (CR-ACK=true ∧ CC-ST=in) ∧ I1 ∧ I2 ⇒ [CC-F, CC-FACK, CR-FACK := AckB, false, true](CC-F=AckB).
  Proof: Immediate because after the substitution we have AckB=AckB.
– Let e be CR-acksends. We have to show that (CR-FACK=true ∧ CC-ST'=in ∧ CC-F=B) ∧ (CR-ACK=true ∧ CC-ST=in) ∧ I1 ∧ I2 ⇒ [CR-F, CR-FACK, CC-FACK := AckB, false, true](CC-F=AckB).
  Proof: CC-F=B ⇒ CC-ACK=true (because of I2). CC-ACK=true ∧ CR-ACK=true ⇒ false (because of I1). false ⇒ [CR-F, CR-FACK, CC-FACK := AckB, false, true](CR-F=B) is true, so the condition is verified.
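The "immediate" cases above all rest on one step: apply a generalized substitution [x := e] to a predicate and observe that the result holds in every state. The following Python sketch illustrates that step only. The dictionary encoding of states, the value sets, and the helper names `substitute` and `holds_everywhere` are assumptions made for illustration and are not part of any B tool; only the two variables CR-F and CC-ST' are modelled.

```python
from itertools import product

# Hypothetical finite carriers for the two variables used below.
FRAMES = ("B", "LB", "AckB")
STATUS = ("in", "out")

def substitute(assignment, predicate):
    """[x1, ... := e1, ...]P: evaluate P in the state updated by the assignment."""
    return lambda state: predicate({**state, **assignment})

def holds_everywhere(predicate):
    """Check a predicate on every state of this small model."""
    return all(predicate({"CR-F": f, "CC-ST'": s})
               for f, s in product(FRAMES, STATUS))

# r2 = (CR-F = LB or CC-ST' = out)
r2 = lambda st: st["CR-F"] == "LB" or st["CC-ST'"] == "out"

eject_post = substitute({"CC-ST'": "out"}, r2)   # [CC-ST' := out] r2
crsends_post = substitute({"CR-F": "LB"}, r2)    # the CR-F := LB part of CRsends
```

Both substituted predicates hold in every state of the model, whereas r2 itself does not, which is why the guards and invariants are not needed in those two cases.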

Type-Constrained Generics for Z

Samuel H. Valentine¹, Ian Toyn¹, Susan Stepney², and Steve King¹

¹ Department of Computer Science, University of York, UK, {sam,ian,king}@cs.york.ac.uk
² Logica UK Ltd, Betjeman House, 104 Hills Road, Cambridge CB2 1LQ, UK, [email protected]

Abstract. We propose an extension to Z whereby generic parameters may have their types partially constrained. Using this mechanism it becomes possible to define in Z much of its own schema calculus and refinement rules.

1 Introduction

The Z notation [5,2,7] has a type system based on given sets and on generic types. Powerset and product constructors (Cartesian and schema) form new types from other types. Nothing else is a type. The resulting system is useful, flexible and decidable.

Generic types allow the definitions of many relations and functions to be made in the mathematical toolkit. The function of set union, for example, can be defined thus:

    _ ∪ _ [X] == λ S, T : ℙ X • {x : X | x ∈ S ∨ x ∈ T}

To use this definition, the generic parameter X can be instantiated to a set of any type whatever. Generic types are considered as atomic in themselves. No constraint is placed on the sets which instantiate generic parameters.

Whereas this works fine for set operations like union, there are other cases to which it does not extend, such as functions referring to arguments which are known to be schemas, but with unknown signatures, or known to be Cartesian products, but of unknown size. For example, the predicate

    S ∨ T = ¬ [¬ S; ¬ T]

is a tautology whenever it is well-typed, and one might wish to formulate it as a theorem, or even to use it to frame a definition of schema disjunction. In Z as it stands, any such attempt would be a type error, since it would require S and T to be of compatible schema types, whereas if the types of S and T are given by generic parameters, we cannot constrain them. Yet the intention is fairly clear and the ability to define things in this way would be useful, since it would allow the existing schema calculus to be defined explicitly in Z, would provide a facility for users to extend it, and would allow the statement and use of general lemmas about schemas.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 250–263, 2000. © Springer-Verlag Berlin Heidelberg 2000


For features of the Z notation such as schema composition, existing definitions and statements of proof rules either: a) translate into equivalent Z using a translation process stated more or less informally, such as that given in [5] or [7], or b) omit rules about schema composition and features of similar difficulty altogether [11,1,4]. The purpose of this paper is to show how we can express most of schema calculus, including schema composition and piping, in Z, provided we introduce “undecoration” and allow type constraints on generic parameters.

2 Implicit Instantiation - Schema Negation

Generic definitions in Z can be written in either of two ways. One way is designed for explicit instantiation, where the instantiation can be to any set, as in an expression like 𝔽 S. The other is to create a generic object whose type can be inferred from the context of its use, for example where the type of a function is inferred from the type of its argument or of its result. Throughout this paper we shall work to the latter convention, where definitions are designed on the assumption that the generic parameter need not be instantiated explicitly, and is a type.

As a first example, let us examine schema negation. As we have observed in [10], we can define schema negation as set complementation, by saying

    schNeg [X] == λ S : ℙ X • {x : X | ¬ x ∈ S}

This definition applies to any set, rather than being restricted to schemas as the standard schema negation is. One of the innovations of the proposed Standard for Z [7] is a fixed syntax for conjectures. Using this, we can pose conjectures about negation, such as

    [X] ⊨? ∀ S : ℙ X • schNeg (schNeg S) = S
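The double-negation conjecture can be checked on a small finite model. The sketch below is only an illustration under stated assumptions: a signature is modelled as a Python dictionary from component names to finite tuples of values, a binding as a frozenset of (name, value) pairs, and the names `all_bindings` and `sch_neg` are hypothetical. A finite check of this kind illustrates the conjecture and does not prove it.

```python
from itertools import product

def all_bindings(sig):
    """Every binding of a signature given as {name: tuple_of_values}."""
    names = sorted(sig)
    return {frozenset(zip(names, vals))
            for vals in product(*(sig[n] for n in names))}

def sch_neg(sig, s):
    """Set complement of s relative to all bindings of sig."""
    return all_bindings(sig) - s

# Hypothetical signature and schema: [a : 0..2; b : Bool | a > 0]
SIG = {"a": (0, 1, 2), "b": (True, False)}
S = {b for b in all_bindings(SIG) if dict(b)["a"] > 0}

double_negation = sch_neg(SIG, sch_neg(SIG, S))
```

In this model `double_negation` equals `S`, as the conjecture states.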

3 Type Compatibility - Schema Conjunction

To define schema conjunction, we could most briefly say

    _ schConj _ [X, Y] == λ S : ℙ X; T : ℙ Y • [S; T]

It is clear that S and T must be schemas, since they have been used as declarations. In order to be valid they must also be type-compatible. That is, if we had declared S : ℙ X; T : ℙ Y, where X and Y are types, then where X and Y have the same component names, the corresponding types must be identical. In [7] this condition would be written as X ≈ Y. This constraint is implicit in the definition given, so we could relax the rules and allow that form of definition as it stands. It seems preferable, however, to allow the existing type checks to be imposed on generic parameters used under the existing rules.

We therefore propose a new bit of syntax to introduce generic parameters that may be subject to type constraints. We do this by putting a † in the generic parameter list. Generic parameters preceding the † are subject to the existing rules, and may not be constrained. Generic parameters following the † may be subject to type constraints. It seems likely that the idea would extend to any constraints, such as the case where the parameter is known to be a Cartesian product, but of unknown size. The only cases we consider in detail in this paper, however, are those where the parameters are constrained to be schemas, and those are the only cases for which we have worked out the implications for implementation in a tool [8]. So we now write

    _ schConj _ [†X, Y] == λ S : ℙ X; T : ℙ Y • [S; T]

We can pose conjectures about the functions we have defined, such as

    [†X, Y] ⊨? dom(_ schConj _) = ℙ X × ℙ Y

    [†X, Y, Z] ⊨? ∀ S : ℙ X; T : ℙ Y; U : ℙ Z • (S schConj T) schConj U = S schConj (T schConj U)

    [†X] ⊨? ∀ S : ℙ X • (S schConj schNeg S) = [X | false]
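A finite model makes the role of the compatibility constraint X ≈ Y concrete. In the sketch below, which is an illustration and not the authors' tool, a schema is a pair (signature, set of bindings); the helper names `schema`, `compatible`, `restrict`, `sch_conj` and `sch_neg` are assumptions, and types are approximated by finite tuples of values.

```python
from itertools import product

def all_bindings(sig):
    """All bindings of a signature {name: tuple_of_values}."""
    names = sorted(sig)
    return {frozenset(zip(names, vals))
            for vals in product(*(sig[n] for n in names))}

def schema(sig, pred):
    """A schema as (signature, set of bindings satisfying pred)."""
    return sig, {b for b in all_bindings(sig) if pred(dict(b))}

def compatible(x, y):
    """X ≈ Y: components with the same name have the same type."""
    return all(x[n] == y[n] for n in x.keys() & y.keys())

def restrict(binding, sig):
    """Project a binding onto the component names of sig."""
    return frozenset((n, v) for n, v in binding if n in sig)

def sch_conj(s, t):
    """[S; T]: merge the signatures and keep bindings accepted by both."""
    (x, s_body), (y, t_body) = s, t
    if not compatible(x, y):
        raise TypeError("signatures are not type-compatible")
    sig = {**x, **y}
    body = {b for b in all_bindings(sig)
            if restrict(b, x) in s_body and restrict(b, y) in t_body}
    return sig, body

def sch_neg(s):
    sig, body = s
    return sig, all_bindings(sig) - body

N = (0, 1, 2)
S = schema({"a": N, "b": N}, lambda v: v["a"] < v["b"])
T = schema({"b": N, "c": N}, lambda v: v["b"] < v["c"])
U = schema({"c": N, "d": (True, False)}, lambda v: v["d"] == (v["c"] > 1))
```

With these definitions the associativity conjecture and the conjecture S schConj schNeg S = [X | false] can both be checked on the sample schemas, and conjoining schemas that disagree on the type of a shared component raises a type error instead of returning a result.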

4 Syntactic Overloading - Schema Logical Operations

In a similar way, we could define schema disjunction

    _ schDisj _ [†X, Y] == λ S : ℙ X; T : ℙ Y • schNeg [schNeg S; schNeg T]

schema implication

    _ schImp _ [†X, Y] == λ S : ℙ X; T : ℙ Y • schNeg [S; schNeg T]

and schema equivalence

    _ schEquiv _ [†X, Y] == λ S : ℙ X; T : ℙ Y • [S schImp T; T schImp S]
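The three derived operations can be exercised on a finite model to confirm that they behave like the usual logical connectives over the merged signature. This is a sketch under the same modelling assumptions as before (schemas as pairs of a signature dictionary and a set of bindings); all function names are hypothetical.

```python
from itertools import product

def all_bindings(sig):
    names = sorted(sig)
    return {frozenset(zip(names, vals))
            for vals in product(*(sig[n] for n in names))}

def schema(sig, pred):
    return sig, {b for b in all_bindings(sig) if pred(dict(b))}

def sch_neg(s):
    sig, body = s
    return sig, all_bindings(sig) - body

def sch_conj(s, t):
    (x, s_body), (y, t_body) = s, t
    if any(x[n] != y[n] for n in x.keys() & y.keys()):
        raise TypeError("signatures are not type-compatible")
    sig = {**x, **y}
    part = lambda b, names: frozenset((n, v) for n, v in b if n in names)
    return sig, {b for b in all_bindings(sig)
                 if part(b, x) in s_body and part(b, y) in t_body}

def sch_disj(s, t):
    """schNeg [schNeg S; schNeg T]"""
    return sch_neg(sch_conj(sch_neg(s), sch_neg(t)))

def sch_imp(s, t):
    """schNeg [S; schNeg T]"""
    return sch_neg(sch_conj(s, sch_neg(t)))

def sch_equiv(s, t):
    """[S schImp T; T schImp S]"""
    return sch_conj(sch_imp(s, t), sch_imp(t, s))

N = (0, 1, 2)
S = schema({"a": N, "b": N}, lambda v: v["a"] < v["b"])
T = schema({"b": N, "c": N}, lambda v: v["b"] < v["c"])
MERGED = {"a": N, "b": N, "c": N}
```

Comparing each result with the schema written directly over `MERGED` shows that the definitions by negation and conjunction agree with the predicates `or`, `not ... or ...` and equality of truth values.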


In Z as it stands [7], the same symbols are used for predicate disjunction etc. as for schema disjunction etc. That is why, in the interests of clarity, the five definitions above have been made in terms of separate names. The precedences of schema disjunction etc. are also out of the range allocated to user-defined functions. Apart from these syntactic considerations, the above definitions could be used in the Mathematical toolkit of Z to define the operations concerned. Whereas the range of the precedences could be modified, the overloading of the names of the symbols seems to be a deeper issue. We proceed therefore on the assumption that the five operations concerned remain defined in the core language, but that the ideas and notation developed above will be used for the other operations of the schema calculus as described below.

5 Schema Axiomatic Definitions

The definitions above have all been in the explicit horizontal form. If the axiomatic form of definition is used, similar principles apply, but the question arises as to whether the type constraints on the generic parameters imposed implicitly by some of the predicates given in the axiomatic box necessarily apply to all other predicates in the same box. We answer this question in the affirmative, since if we wish the various parts to be considered separately, we can use more than one box.

6 The scope of Generic Schemas

An alternative definition of schema disjunction might be something like:

    _ schDisj2 _ [†X, Y] == λ S : ℙ X; T : ℙ Y • [X; Y | θX ∈ S ∨ θY ∈ T]

A difficulty with this is that any of the names X , S , Y or T might be a component name of one (or both) of the schemas which instantiate X and Y . Under the current scope rules the terms θX , S , θY and T must be interpreted in the context created by the declaration X ; Y , and if any of the names X , S , Y or T was a component name of one of them, this would destroy the intended meaning of the whole definition. We therefore stipulate that names brought into scope by being the component names of schemas that are instantiations of generic parameters, belong to a different name-space from other names. Thus the component names of the schemas that instantiate X and Y above, and the names generated by the terms θX and θY , are in a different name-space from the stated names X , S , Y or T themselves. Explicit uses of names, such as those of X , S , Y or T above, must refer to declarations which are not dependent on the instantiations of the generic parameters of the containing paragraph. Declarations of both sorts may occur, that is, uses of schema generic parameters as declarations, and ordinary non-generic


or fully instantiated declarations. If they occur within the same schema-text, in order that the type system can remain consistent there must be a constraint of disjointness between the explicitly declared names and those that occur within the instantiations. That is, in the scope of some [†X ] a declaration like X ; a : T imposes a constraint that there is no occurrence of the name a within the schema X.

7 Projection

We can now define schema projection in a similar way to the above definitions, with the similar proviso that the name Y must be guaranteed not to be in the scope of the declarations introduced by the schema inclusions S; T.

    _ ↾ _ [†X, Y] == λ S : ℙ X; T : ℙ Y • {S; T • θY}

8 Further Schema Operations - Natural Composition

The schema calculus operations given in [5,7] include three operations of schema quantification. For any two schemas X and Y we can form the expressions ∃ X • Y, ∀ X • Y and ∃1 X • Y. [2] gives the first two of these, but omits ∃1. With the extensions to notation proposed here, we could define functions which restate these operations with different syntax, and can also define new functions using them in combination.

The definitions of these operations in [5] require that in the case of ∃ X • Y, for example, all names in the signature of X must also be present in the signature of Y, and similarly for the other two. In [7] this restriction is relaxed, and the only constraint between the signatures is that they should be type-compatible. For further discussion and motivation of this point see [10]. We continue on the basis of the definitions in [7].

The operations X ∧ Y, X ∨ Y, X ⇒ Y and X ⇔ Y all produce schema results whose signatures are formed by merging the signatures of the constituent schemas. The schema expression ∃ X • Y, however, has a value equal to that of X ∧ Y followed by a hiding of those names present in the signature of X. Similarly, the schema expression ∀ X • Y has a value equal to that of X ⇒ Y followed by a hiding of those names present in the signature of X.

To illustrate the potential of these operations, and to prepare for the definitions of schema composition and piping below, we next define an operation which we could call “schema inner product” or “natural composition”. We shall proceed with the latter name, and define

    NatCompose[†X, Y] == λ S : ℙ X; T : ℙ Y • {S; T • θ[(∃ Y • X); (∃ X • Y)]}

This function takes two schema arguments and produces a schema obtained by conjoining them, then hiding the common components. The type of the result is ℙ[(∃ Y • X); (∃ X • Y)], which can only be type-correct if X ≈ Y. We observe that this operation is commutative, but is not in general associative.

We generalise this definition below to the cases of schema sequential composition and schema piping. In those cases we manipulate decorations on names, but these manipulations create the possibility of accidental coincidences between other names than those explicitly under consideration. It is then necessary to use a somewhat more complicated definition, equivalent to NatCompose, namely

    NatCompose2[†X, Y] == λ S : ℙ X; T : ℙ Y • {b : [(∃ Y • X); (∃ X • Y)] | ∃ m : ∃ ∃ Y • {b} • S • m ∈ ∃ ∃ X • {b} • T}

where the values of b are the bindings in the result, and the values of m are the matching bindings, which are hidden. The value of the comprehension is formed as follows:
a) take some binding b of the result type;
b) {b} is the schema whose sole member is that binding;
c) ∃ Y • {b} is the schema whose sole member is a binding consisting of the components of b drawn from X;
d) ∃ ∃ Y • {b} • S is the schema consisting of those bindings of the matching type that are consistent with S and the parts of b drawn from X;
e) similarly, ∃ ∃ X • {b} • T is the schema consisting of those bindings of the matching type that are consistent with T and the parts of b drawn from Y;
f) the value b is included in the comprehension if there is a value of m that is a member of both these two schemas.
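The remark that natural composition is commutative but not in general associative can be observed on a very small model. The sketch below is illustrative only: schemas are pairs of a signature dictionary and a set of bindings, and `nat_compose` and `bind` are hypothetical names. It joins the two operands on their common components and then hides those components.

```python
def bind(**values):
    """A binding as a frozenset of (name, value) pairs."""
    return frozenset(values.items())

def nat_compose(s, t):
    """Conjoin two schemas, then hide the components they have in common."""
    (x, s_body), (y, t_body) = s, t
    common = x.keys() & y.keys()
    if any(x[n] != y[n] for n in common):
        raise TypeError("signatures are not type-compatible")
    sig = {n: c for n, c in {**x, **y}.items() if n not in common}
    body = set()
    for bs in s_body:
        for bt in t_body:
            ds, dt = dict(bs), dict(bt)
            if all(ds[n] == dt[n] for n in common):
                body.add(frozenset((n, v) for n, v in (bs | bt)
                                   if n not in common))
    return sig, body

A = (0, 1)
S = ({"x": A}, {bind(x=0)})
T = ({"x": A}, {bind(x=0), bind(x=1)})
U = ({"x": A}, {bind(x=1)})

left = nat_compose(nat_compose(S, T), U)    # (S . T) . U
right = nat_compose(S, nat_compose(T, U))   # S . (T . U)
```

Here the inner compositions both hide x and yield the schema of empty signature with one (empty) binding, so `left` reduces to U and `right` reduces to S, which differ.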

9 Heterogeneous State Transitions

The relaxation of the rules of generic typing given above has allowed us to define in Z six of the operations of schema calculus. In all of these cases no special recognition is given to the “decoration” of names. Other operations of the schema calculus, however, are designed to work with the “state and operations” convention, and to do so they treat different schema components differently according to the decorations attached to their names. The descriptions of that convention [5,2] are in terms of a single state schema, together with operations which relate the values of the components of that state schema with another schema resembling it but with its component names systematically dashed. Thus the component names might be a, b, c, a′, b′, c′, where each of the pairs (a, a′), (b, b′), (c, c′) is declared as of the same type. Inputs, decorated with ?, and outputs, decorated with !, may also be present. We can describe this as the assumption of “homogeneous” state transitions, in that the type of the dashed state differs from that of the undashed state only in the systematic dashing of the component names.


The operations of the schema calculus which are designed to work within the convention, namely precondition, schema composition, schema override (given in [2] only), and schema piping, make a weaker assumption, however, in that they assume that there is an undashed and a dashed state, but without any assumption that these resemble each other. That is, the recognition of a component as “undashed” is in no way dependent on the presence of a corresponding “dashed” component in the same schema, nor vice-versa. For example the component names might be a, b, c, x′, y′, z′, where the components a, b, c will be recognised as undashed, and the components x, y, z recognised as dashed. In our formal description here we therefore treat these as heterogeneous (that is, not necessarily homogeneous) state transition operations, and examine the restriction to homogeneity later.

10 Removing Decorations - Schema Precondition

In order to proceed further, we propose the introduction of an operator to “undecorate” any schema, irrespective of whether it is a generic parameter or not. For any schema S and decoration d the meaning of the expression undecor d S is the schema S with all components without the decoration d hidden, and with the decoration d removed from all those component names that have it. For example, the value of

    undecor ′ [a, b, b′, c′′ : ℕ | a = 3 ∧ b = 4 ∧ b′ > a ∧ c′′ > b]

is equal to

    [b, c′ : ℕ | ∃ a, b : ℕ; b′ == b; c′′ == c′ • a = 3 ∧ b = 4 ∧ b′ > a ∧ c′′ > b]

which in turn can be simplified to

    [b, c′ : ℕ | b > 3 ∧ c′ > 4]

This description makes the operation well-defined for all schemas and decorations. If the schema has no uses of the given decoration, the result is a schema of empty signature, either [ ] or [ | false] (as legitimised fully in [7]). For any schema S and decoration d, the equation

    (undecor d (Sᵈ)) = S

will always be valid, since the decoration is applied uniformly to all components, and then can be equally uniformly removed from them. On the other hand the expression (undecor d S)ᵈ is obtained by removing the decoration from those components which have it, hiding the others, then replacing the decoration on the result. This has the effect of hiding the components which do not have the decoration, and leaving the rest unchanged. We can use this to obtain the schema in which all components of a particular decoration have been hidden, by writing ∃ (undecor d S)ᵈ • S. Using this we can define schema precondition, as

    pre[†X] == λ S : ℙ X • ∃ (undecor ′ X)′; (undecor ! X)! • S
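The worked example of undecor and the definition of pre can be replayed on a finite model. The following sketch rests on stated assumptions: ℕ is cut down to the values 0..7, decorations are trailing characters of component names (with ASCII ' standing for the dash), and the helper names `schema`, `undecor`, `decorate`, `hide` and `pre` are hypothetical.

```python
from itertools import product

def schema(sig, pred):
    """A schema as (signature, set of bindings satisfying pred)."""
    names = sorted(sig)
    body = set()
    for vals in product(*(sig[n] for n in names)):
        v = dict(zip(names, vals))
        if pred(v):
            body.add(frozenset(v.items()))
    return sig, body

def undecor(d, s):
    """Hide components lacking the trailing decoration d; strip d from the rest."""
    sig, body = s
    strip = lambda n: n[:-len(d)]
    new_sig = {strip(n): c for n, c in sig.items() if n.endswith(d)}
    new_body = {frozenset((strip(n), v) for n, v in b if n.endswith(d))
                for b in body}
    return new_sig, new_body

def decorate(d, s):
    """Append decoration d to every component name."""
    sig, body = s
    return ({n + d: c for n, c in sig.items()},
            {frozenset((n + d, v) for n, v in b) for b in body})

def hide(s, names):
    sig, body = s
    return ({n: c for n, c in sig.items() if n not in names},
            {frozenset((n, v) for n, v in b if n not in names) for b in body})

def pre(s):
    """Hide (undecor ' X)' and (undecor ! X)!, following the definition of pre."""
    sig_only = (s[0], set())
    dashed = decorate("'", undecor("'", sig_only))[0]
    outputs = decorate("!", undecor("!", sig_only))[0]
    return hide(s, dashed.keys() | outputs.keys())

N = tuple(range(8))  # finite stand-in for the natural numbers
S = schema({"a": N, "b": N, "b'": N, "c''": N},
           lambda v: v["a"] == 3 and v["b"] == 4
           and v["b'"] > v["a"] and v["c''"] > v["b"])
R = undecor("'", S)
```

On this model `R` coincides with [b, c′ | b > 3 ∧ c′ > 4], and decorating a schema and then undecorating it returns the original schema.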

11 Schema Override

We can take schema override nearly verbatim from [2] and say

    _ ⊕ _ [†X, Y] == λ S : ℙ X; T : ℙ Y • (S ∧ ¬ (pre T) ∨ T)

and we have some nice theorems such as

    [†X, Y] ⊨? ∀ S : ℙ X; T : ℙ Y • pre(S ⊕ T) = (pre S ∨ pre T)

    [†X, Y, Z] ⊨? ∀ S : ℙ X; T : ℙ Y; U : ℙ Z • (S ⊕ T) ⊕ U = S ⊕ (T ⊕ U)
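The two theorems about override can be tried out in a simplified setting. The sketch below assumes, unlike the general definition, that all operands share one signature, so that each schema is just a set of bindings and disjunction is set union; the names `before`, `pre`, `override` and `op` are hypothetical.

```python
def before(binding):
    """The part of a binding seen by pre: drop dashed (') and output (!) components."""
    return frozenset((n, v) for n, v in binding if not n.endswith(("'", "!")))

def pre(operation):
    return {before(b) for b in operation}

def override(s, t):
    """S ⊕ T = (S ∧ ¬ pre T) ∨ T, for operations over one shared signature."""
    pre_t = pre(t)
    return {b for b in s if before(b) not in pre_t} | t

def op(*pairs):
    """Hypothetical helper: an operation on x given as (x, x') pairs."""
    return {frozenset({("x", a), ("x'", b)}) for a, b in pairs}

S = op((0, 1), (1, 2), (2, 0))
T = op((1, 0), (3, 3))
U = op((2, 2))
```

In this setting T takes over wherever its precondition holds, S is used elsewhere, and both the precondition theorem and associativity can be checked directly on the sample operations.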

12 Turning an Operation Schema into a Relation

Given a conventional operation schema, it is sometimes convenient to derive the corresponding relation, as needed by [6] for example.

    relate[†X] == λ A : ℙ X •
        let in == (undecor ? X)?; out == (undecor ! X)!; dashed == (undecor ′ X)′ •
        {i : in; o : out; d : dashed; u : ∃ in; out; dashed • X | ∀ bn : [{i}; {o}; {d}; {u}] • bn ∈ A • ((u, i), (d, o))}

13 Schema Composition

To define schema sequential composition we need to discover all dashed components of the first schema operand that match undashed components in the second. At the same time, we must not impose any unnecessary constraints; for example, the two operands need not even be compatible. The form of the definition is modelled on the second form of “natural composition” above, as:

    _ ⨟ _ [†X, Y] == λ S : ℙ X; T : ℙ Y • {b : [(∃ Y′ • X); (∃ undecor ′ X • Y)] | ∃ m : (undecor ′ (∃ ∃ Y′ • {b} • S)) • m ∈ (∃ ∃ undecor ′ X • {b} • T)}
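The effect of this definition can be illustrated on finite schemas. The sketch below is an assumption-laden model and not the authors' formalisation: schemas are pairs of a signature dictionary and a set of bindings, the dash is the ASCII character ', and `compose` is a hypothetical name. It matches each dashed component n' of the first operand with component n of the second, requires agreement on the matched values, and hides both.

```python
def compose(s, t):
    """Sequential composition: match n' in S with n in T, then hide both."""
    (x, s_body), (y, t_body) = s, t
    matched = {n for n in y if n + "'" in x}
    if any(x[n + "'"] != y[n] for n in matched):
        raise TypeError("matched components differ in type")
    s_hidden = {n + "'" for n in matched}
    x_rest = {n: c for n, c in x.items() if n not in s_hidden}
    y_rest = {n: c for n, c in y.items() if n not in matched}
    shared = x_rest.keys() & y_rest.keys()
    if any(x_rest[n] != y_rest[n] for n in shared):
        raise TypeError("result signature is not type-correct")
    body = set()
    for bs in s_body:
        for bt in t_body:
            ds, dt = dict(bs), dict(bt)
            if (all(ds[n + "'"] == dt[n] for n in matched)
                    and all(ds[n] == dt[n] for n in shared)):
                kept_s = {(n, v) for n, v in bs if n in x_rest}
                kept_t = {(n, v) for n, v in bt if n in y_rest}
                body.add(frozenset(kept_s | kept_t))
    return {**x_rest, **y_rest}, body

# Hypothetical operation Inc == [x, x' : 0..3 | x' = x + 1]
N = (0, 1, 2, 3)
Inc = ({"x": N, "x'": N},
       {frozenset({("x", a), ("x'", a + 1)}) for a in range(3)})
Inc2 = compose(Inc, Inc)
```

Composing the increment operation with itself gives the operation that adds two, restricted to the starting values for which both steps stay inside the finite carrier.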

The definitions of schema composition given in [5,2,7] all include a further constraint that the base names that match should themselves be undecorated, that is, that the names in the binding m above should have no decoration. There seems to be no reason to impose that constraint, however, and therefore we do not propose the extensions to notation which doing so would require.

14 Type-Correctness and Associativity of Schema Composition

The type of the result of schema sequential composition between two schemas of types ℙ X and ℙ Y respectively is

    ℙ[(∃ Y′ • X); (∃ undecor ′ X • Y)]

provided that expression is type-correct. This is the case provided

    (∃ Y′ • X) ≈ (∃ undecor ′ X • Y)

which in turn requires that

    Y′ ≈ X

and that

    undecor ′ X ≈ Y

If these conditions are not met, the composition cannot be formed. The operation of sequential composition is not in general associative, but it is associative whenever the formal statement of associativity is type-correct. That is,

    [†X, Y, Z] ⊨? ∀ S : ℙ X; T : ℙ Y; U : ℙ Z • (S ⨟ T) ⨟ U = S ⨟ (T ⨟ U)

is valid whenever it is well-typed. Explicitly:

    [(∃ [(∃ Z′ • Y); (∃ undecor ′ Y • Z)]′ • X); (∃ undecor ′ X • [(∃ Z′ • Y); (∃ undecor ′ Y • Z)])]

must be the same type as

    [(∃ Z′ • [(∃ Y′ • X); (∃ undecor ′ X • Y)]); (∃ undecor ′ [(∃ Y′ • X); (∃ undecor ′ X • Y)] • Z)]

More illuminating, however, is to note the counter-examples to associativity. The cases where the same three schemas can be composed together sequentially, with association to the left or to the right, so that either composition is type-correct but the two cases give different resultant types, are shown by the following three representative examples:

    [x′′ : A] ⨟ [x′ : A] ⨟ [x : A],
    [x′ : A] ⨟ [x′ : A] ⨟ [x : A],
    [x′ : A] ⨟ [x : A] ⨟ [x : A].
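The three counter-examples can be checked at the level of signatures alone. In the sketch below a signature is reduced to its set of component names, on the assumption that every component has the same type A, and the dash is written as the ASCII character '. The names `comp_sig` and `both_ways` are hypothetical.

```python
def comp_sig(x, y):
    """Component names of the composition, given the component names of S and T."""
    matched = {n for n in y if n + "'" in x}
    return (x - {n + "'" for n in matched}) | (y - matched)

def both_ways(a, b, c):
    """Result signatures for left and right association."""
    left = comp_sig(comp_sig(a, b), c)
    right = comp_sig(a, comp_sig(b, c))
    return left, right

examples = [
    ({"x''"}, {"x'"}, {"x"}),
    ({"x'"}, {"x'"}, {"x"}),
    ({"x'"}, {"x"}, {"x"}),
]
results = [both_ways(*e) for e in examples]
```

In each of the three cases the two associations produce different signatures, so the two sides of the associativity equation cannot have the same type.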

15 Schema Piping

The definition of schema piping may be expressed thus:

>> [†X , Y ] ==

λS : PX; T : PY • {b : [(∃(undecor ? Y )! • X ); (∃(undecor ! X )? • Y )] | ∃ m : (undecor ! (∃ ∃(undecor ? Y )! • {b} • S )) • m ∈ (undecor ? (∃ ∃(undecor ! X )? • {b} • T ))}

As with schema sequential composition above, this operation is associative whenever the alternative associations are of the same type, and the representative counter-examples are:

[x! : A] >> [x! : A] >> [x? : A],
[x! : A] >> [x? : A] >> [x? : A].
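The piping counter-examples can be checked in the same way as those for sequential composition. The sketch below assumes signatures are Python dicts from component names to type names, with the decorations ! and ? written as trailing characters; `pipe_sig` and `both_pipings` are hypothetical names used only for illustration.

```python
def pipe_sig(x, y):
    """Signature of S >> T for S : P X and T : P Y (sketch).

    Models [(exists (undecor ? Y)! . X); (exists (undecor ! X)? . Y)].
    """
    # Outputs of X that are consumed by a matching input of Y.
    hidden_from_x = {n[:-1] + "!" for n in y if n.endswith("?")}
    # Inputs of Y that are supplied by a matching output of X.
    hidden_from_y = {n[:-1] + "?" for n in x if n.endswith("!")}
    # A matched output and input must have the same type.
    for name in y:
        if name.endswith("?"):
            output = name[:-1] + "!"
            if output in x and x[output] != y[name]:
                raise TypeError(f"{output} and {name} have different types")
    left = {n: t for n, t in x.items() if n not in hidden_from_x}
    right = {n: t for n, t in y.items() if n not in hidden_from_y}
    for name in left.keys() & right.keys():
        if left[name] != right[name]:
            raise TypeError(f"incompatible types for {name}")
    return {**left, **right}

def both_pipings(s, t, u):
    """Return the result signatures of (S >> T) >> U and S >> (T >> U)."""
    return pipe_sig(pipe_sig(s, t), u), pipe_sig(s, pipe_sig(t, u))

print(both_pipings({"x!": "A"}, {"x!": "A"}, {"x?": "A"}))
print(both_pipings({"x!": "A"}, {"x?": "A"}, {"x?": "A"}))
```

Under this model each example is type-correct in both associations, but the result signatures differ.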

16 Rules of Refinement

The standard refinement rules can be stated formally using this notation. We follow [5] page 138 with minor modifications. We do not explicitly define the abstract state or the concrete state. The schemas we use are:

a) an abstract operation schema Aop, whose components include those of the undashed and dashed abstract states, together with inputs and outputs as required;
b) a concrete operation schema Cop, whose components include those of the undashed and dashed concrete states, together with inputs and outputs as required;
c) an undashed abstraction schema Abs, which relates undashed concrete and abstract states, and possibly also their inputs;
d) a dashed abstraction schema Abs2, which relates dashed concrete and abstract states, and possibly also their outputs.

We can then give the refinement relation as

refines [†A, C, X, Y] == {Aop : P A; Cop : P C; Abs : P X; Abs2 : P Y | (∃ pre A; pre C • X) ∧ (∃(∃ pre A • A); (∃ pre C • C) • Y) ∧ (∀ pre Aop; Abs • pre Cop) ∧ (∀ pre Aop; Abs; Cop • ∃(∃ pre A • A) • Abs2 ∧ Aop)}

where the expression (∃ pre A • A) is used to declare just the after states and outputs of Aop, and similarly the expression (∃ pre C • C) is used to declare just the after states and outputs of Cop. This differs from [5] in the following respects.

a) All the schemas have been declared. The necessary constraints on their components and their types arise from the requirement that the formal definition be well typed.
b) The main part of the formulation has been made shorter by using the operation and abstraction schemas directly as declarations, rather than as predicates.
c) The formulation is more general in allowing any number and names of inputs and of outputs.
d) Separate abstraction schemas Abs and Abs2 are used where [5] has Abs and Abs′. This allows the operation schemas to be heterogeneous in the sense explained above, and also allows refinement of the input and output components. The restriction to the cases considered by [5] is obtained by writing Abs′ for Abs2.
e) If any component is unchanged by refinement, it can be omitted from the abstraction schemas.
f) The concrete form is allowed to have extra after-states and outputs unrepresented in the abstract state, which will still leave (pre Cop) well-typed in the environment of (pre Aop; Abs). This extension arises naturally from the formalism, and accords with the practical realities of refinement.

Following [5] page 140, we can specialise to functional refinement, giving

frefines [†A, C, X, Y] == {Aop : P A; Cop : P C; Abs : P X; Abs2 : P Y | (∃ pre A; pre C • X) ∧ (∃(∃ pre A • A); (∃ pre C • C) • Y) ∧ (∀ pre Aop; Abs • pre Cop) ∧ (∀ pre Aop; Abs; Cop; Abs2 • Aop)}

and then operational refinement is derived as the case where Abs and Abs2 are empty, and so can be dropped, giving the form as in [5] page 136:

opRefines [†A, C] == {Aop : P A; Cop : P C | (∀ pre Aop • pre Cop) ∧ (∀ pre Aop; Cop • Aop)}
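The two conjuncts of opRefines can be read as an applicability condition and a correctness condition. The sketch below illustrates them on finite operations, under the simplifying assumption that an operation is a set of (before, after) pairs over the same state space for both Aop and Cop; the names `pre` and `op_refines` are hypothetical.

```python
def pre(op):
    """Precondition of an operation: the before-states it can start from."""
    return {before for before, _ in op}

def op_refines(aop, cop):
    """Check the two conjuncts of operational refinement on finite relations.

    Applicability: wherever Aop is applicable, so is Cop.
    Correctness: within pre Aop, every step of Cop is also a step of Aop.
    """
    applicable = pre(aop) <= pre(cop)
    correct = all(step in aop for step in cop if step[0] in pre(aop))
    return applicable and correct

# Abstract operation: from state 0, move nondeterministically to 1 or 2.
abstract_op = {(0, 1), (0, 2)}
# Concrete operation: resolves the choice and is also defined on state 1.
concrete_op = {(0, 1), (1, 5)}

print(op_refines(abstract_op, concrete_op))
print(op_refines(concrete_op, abstract_op))
```

The concrete operation may reduce nondeterminism and widen the precondition, but not the reverse. The heterogeneous case with extra concrete components, described in item f) above, is not modelled here.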

17 Homogeneous State Transitions - ∆ and Ξ

As stated above, the descriptions of the “state and operations” convention in [5, 2] are all in terms of homogeneous state transitions, in that the type of the dashed state differs from that of the undashed state only in the systematic dashing of the component names. Inputs, decorated with ?, and outputs, decorated with !, may also be present. This approach makes extensive use of the convention whereby special meaning is given to schemas with names whose initial characters are ∆ or Ξ. These characters are not themselves operators, but it has often been suggested that they should be. This would allow them to be applied to any schema-valued expression. To define ∆ we could say:

∆ [†X] == λ S : P X • [S; S′]

which requires that S and S′ should be compatible schemas. The description in [5] requires that the base names should themselves be undecorated, but this seems an unnecessary restriction. The operator Ξ would be defined similarly, as

Ξ [†X] == λ S : P X • [S; S′ | θX = θX′]

These definitions could replace, or perhaps just supplement, the existing conventions.
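As an informal illustration of the difference between the two operators, the sketch below models a schema as a finite set of states and a state transition as a (before, after) pair, rather than as a binding with dashed component names. That representation, and the names `delta` and `xi`, are assumptions made only for this example.

```python
from itertools import product

def delta(states):
    """All transitions whose before and after states both satisfy S."""
    return set(product(states, states))

def xi(states):
    """The transitions of delta(states) that leave the state unchanged."""
    return {(before, after) for before, after in delta(states) if before == after}

# A schema with three permitted states, written here as plain integers.
counter_states = {0, 1, 2}

print(len(delta(counter_states)))
print(sorted(xi(counter_states)))
```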

18 Homogeneous Operation Schemas - Recognising the State

In order to define functions which take homogeneous state transition schemas as arguments, it is necessary to identify the state components. In the heterogeneous case, such as in the definition of “pre” above, we treat all components whose names do not have a dash as “undashed”, and all components whose names have a dash as “dashed”. For the homogeneous case, however, we only recognise as “undashed” those components for which there is a dashed counterpart, and vice versa. Other components may be present, but are treated as constant.

If we have some generic parameter X, constrained to be a schema and which we wish to treat as the type of a homogeneous operation schema, the expression that is equal to X after hiding of all dashed components that have corresponding undashed components is given by

(∃ X′ • X)

which leaves the undashed components together with the constant components. The expression

(∃(∃ X′ • X) • X)

therefore yields the type of the dashed components alone. Similarly the expression that is equal to X after hiding of all undashed components that have corresponding dashed components is given by

(∃ undecor ′ X • X)

which leaves the dashed components together with the constant components. The expression

(∃(∃ undecor ′ X • X) • X)

therefore yields the type of the undashed components alone. For these expressions to be type-correct, it is necessary that the type of each undashed component is the same as the type of its dashed counterpart.
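The four expressions can be tried out on a small signature. This is a sketch only, assuming a signature is a Python dict from names to type names and a dash is a trailing apostrophe; the function names are hypothetical, and the type-correctness condition stated above is assumed rather than checked.

```python
PRIME = "'"

def hide(hidden, sig):
    """Schema existential quantification on signatures: drop hidden names."""
    return {n: t for n, t in sig.items() if n not in hidden}

def decorate(sig):
    """X': every component name gains a dash."""
    return {n + PRIME: t for n, t in sig.items()}

def undecor(sig):
    """undecor ' X: the dashed components of X with the dash removed."""
    return {n[:-1]: t for n, t in sig.items() if n.endswith(PRIME)}

def undashed_and_constant(x):
    return hide(decorate(x), x)                # (exists X' . X)

def dashed_only(x):
    return hide(undashed_and_constant(x), x)   # (exists (exists X' . X) . X)

def dashed_and_constant(x):
    return hide(undecor(x), x)                 # (exists undecor ' X . X)

def undashed_only(x):
    return hide(dashed_and_constant(x), x)     # (exists (exists undecor ' X . X) . X)

# x and x' form a matching pair; k has no dashed counterpart, so it is constant.
operation_sig = {"x": "A", "x'": "A", "k": "B"}

print(undashed_and_constant(operation_sig))
print(dashed_only(operation_sig))
print(dashed_and_constant(operation_sig))
print(undashed_only(operation_sig))
```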

19 Schema Iteration

To illustrate the possibilities opened up by the notation developed here, we define two forms of schema iteration. Neither of these appears as part of the standard schema calculus, but the facility has been called for in various forms, for example in [3]. For simplicity, we assume that all components of the schema to be iterated form matching pairs of undashed and dashed names of the same type, and that there are no “constant” components. We can then define a schema version of the existing toolkit function iter:

schIter [†X] == λ n : N • λ S : P X • let state == (∃ X′ • X) • [X | (θstate, θstate′) ∈ iter n {S • (θstate, θstate′)}]
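The relational core of schIter is the toolkit function iter applied to the set of (before, after) pairs of the operation. A minimal sketch of that core on finite relations follows; the explicit `carrier` argument stands in for the generic parameter, and `compose` and `iter_rel` are hypothetical names.

```python
def compose(r, q):
    """Forward relational composition: first r, then q."""
    return {(a, c) for (a, b) in r for (b2, c) in q if b == b2}

def iter_rel(n, r, carrier):
    """iter n r: the identity on the carrier composed with r, n times."""
    result = {(x, x) for x in carrier}
    for _ in range(n):
        result = compose(result, r)
    return result

# A decrement operation on the states 0..3, as (before, after) pairs.
decrement = {(1, 0), (2, 1), (3, 2)}

print(sorted(iter_rel(2, decrement, range(4))))
```

Applying the operation twice is only possible from states 2 and 3, so those are the only before-states in the result.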


We give also a schema version of a “while” loop. This is based on the function do as described in [9], which can be defined as

do[X] == λ R : X ↔ X • ⋂{Q : X ↔ X | id(X \ dom R) ⊆ Q ∧ R ⨾ Q ⊆ Q}

For any function f the effect of applying do f to an argument is the same as the effect of repeatedly applying f to that argument until the result is no longer in the domain of f. Similarly for any relation R the effect of taking the relational image of do R through a set is the union of all values obtained by taking the relational image of R through elements of that set until the result is no longer in the domain of R. Using this we define the schema version:

schDo[†X] == λ S : P X • let state == (∃ X′ • X) • [X | (θstate, θstate′) ∈ do {S • (θstate, θstate′)}]

The effect of applying schDo S to some state is that of repeatedly re-applying S to that state until the precondition is not satisfied.
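On a finite carrier, the least relation Q satisfying the two closure conditions in the definition of do can be computed by iterating from the identity on the states outside dom R. The sketch below takes that approach; it is an illustration under the finiteness assumption, not the definition used in [9], and `compose` and `do_rel` are hypothetical names.

```python
def compose(r, q):
    """Forward relational composition: first r, then q."""
    return {(a, c) for (a, b) in r for (b2, c) in q if b == b2}

def do_rel(r, carrier):
    """Least Q with id(carrier minus dom r) inside Q and (r ; Q) inside Q."""
    domain = {a for a, _ in r}
    q = {(x, x) for x in carrier if x not in domain}
    while True:
        extended = q | compose(r, q)
        if extended == q:
            return q
        q = extended

# Decrement is applicable from 1, 2 and 3; repeating it always ends at 0.
decrement = {(1, 0), (2, 1), (3, 2)}

print(sorted(do_rel(decrement, range(4))))
```

Every starting state is related to 0, the only state outside the domain of the decrement relation.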

20 Implementation

We have implemented these proposals in the CADiZ tool. The implementation of “undecor” proved to be straightforward. The main proposal, to allow the introduction of generics constrained to be schemas, was less simple but we believe we have solved it successfully. The details are given in a companion paper [8].

21 Conclusions

We have proposed two extensions to Z. The first is the operation “undecor” which allows for the explicit removal of decorations from a schema. The second is a relaxation of the restriction on generic parameters, whereby they are allowed to have their types partially constrained. We have shown how this makes it possible to define in Z all the complex schema calculus operations such as sequential composition and piping.

Acknowledgments: This work was done as part of the project “Standardising Z Semantics”, for which Dr. King is principal investigator, and Mr. Valentine and Dr. Toyn are receiving EPSRC funding (Grant number GR/M 20723).

References

1. J. G. Hall and A. P. Martin. W reconstructed. In J. P. Bowen, M. G. Hinchey, and D. Till, editors, ZUM ’97: The Z Formal Specification Notation, LNCS 1212, Reading, April 1997. Springer.
2. I. J. Hayes, editor. Specification Case Studies. Prentice Hall, second edition, 1993.
3. P. Luigi Iachini. Operation schema iterations. In J. E. Nicholls, editor, Z User Workshop, Oxford, December 1990. Springer.
4. Andrew Martin. A revised deductive system for Z. Technical Report 98-21, Software Verification Research Centre, University of Queensland, 1998.
5. J. M. Spivey. The Z Notation: A Reference Manual. Prentice Hall, second edition, 1992.
6. Susan Stepney and David Cooper. Formal methods for industrial applications. (These proceedings), 2000.
7. I. Toyn, editor. Z Notation. ISO, 1999. Final Committee Draft, available at http://www.cs.york.ac.uk/~ian/zstan/.
8. Ian Toyn, S. H. Valentine, Susan Stepney, and Steve King. Typechecking Z. (These proceedings), 2000.
9. S. H. Valentine. Z--, an executable subset of Z. In J. E. Nicholls, editor, Z User Workshop, pages 157–187, York, December 1991. Springer.
10. S. H. Valentine. Equal rights for schemas in Z. In J. P. Bowen and M. G. Hinchey, editors, ZUM’95, LNCS 967, pages 183–202, Limerick, September 1995. Springer.
11. J. C. P. Woodcock and S. M. Brien. W: a logic for Z. In J. E. Nicholls, editor, Z User Workshop, York, December 1991. Springer.

Typechecking Z

Ian Toyn¹, Samuel H. Valentine¹, Susan Stepney², and Steve King¹

¹ Department of Computer Science, University of York, Heslington, York, YO10 5DD, UK. {ian,sam,king}@cs.york.ac.uk
² Logica UK Ltd, Betjeman House, 104 Hills Road, Cambridge, CB2 1LQ, UK. [email protected]

Abstract. This paper presents some of our requirements for a Z typechecker: that the typechecker accept all well-typeable formulations, however contrived; that it gather information about uses of declarations as needed to support interactive browsing and formal reasoning; that it fit the description given by draft standard Z; and that it be able to check some particular extensions to Z that are intended to allow explicit definitions of schema calculus operators. The paper presents a specification of such a Z typechecker, which we have implemented.

1 Introduction

Algorithms for typechecking polymorphic functional languages, as explained by Cardelli [1] and by Hancock [2], are readily adaptable to typecheck Z specifications and their generic constructs. They are based around Milner’s unification algorithm [6]. Spivey and Sufrin gave an account of typechecking Z [12], focusing on the inference of implicit generic instantiations. They deliberately omitted any discussion of the typechecking of schemas. We have found some schemas that are awkward to typecheck but could be well-typed. An investigation of the typechecking of schemas is particularly important in view of the merging of schemas with expressions in draft standard Z [15]. Our work has involved the construction of a Z typechecker within the CADiZ toolset [17,18], replacing a previous inferior algorithm.¹ Some other requirements on the design of the new typechecker are also discussed in this paper, namely keeping track of uses of declarations for the purposes of interactive browsing and formal reasoning, and the typechecking of extensions to the Z notation to permit explicit definition of schema calculus operators [21].

2 Types

Each Z type corresponds to a set of values known as its carrier set. The type system excludes combinations of expressions whose values are related in ways that are inappropriate based on their types. It is unlikely that such combinations of expressions could be given the intended meanings. A typechecker implementing the type system can decide automatically whether to accept or reject any combination of expressions. Some of the goals that arise in formal reasoning are properties that the typechecker has already decided, and so another advantage of the type system is that it allows such goals to be discharged automatically. There are also disadvantages in having a type system. For example, the rejection of combinations of expressions that could have had sensible meanings makes the language less expressive, and explicit injections may inconveniently be needed to cast values between types. Also, the kinds of goals that are decided by the typechecker are likely to be ones that a theorem prover could decide anyway. Lamport and Paulson have discussed the advantages and disadvantages of decidable and undecidable type systems for specification languages [5]. We take the Z type system as a given; our aim is to provide a specification of how to enforce it. An implementation of this specification should reject all ill-typed Z specifications while accepting all well-typed Z specifications.

¹ This new typechecker has approved the formal Z in this paper.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 264–285, 2000. © Springer-Verlag Berlin Heidelberg 2000

The various kinds of types in the Z type system are illustrated by the following examples. Given types are introduced by given set paragraphs.

[PERSON, NAME, AGE]

Using the notation of draft standard Z [16], the types introduced by this paragraph are denoted by GIVEN PERSON, GIVEN NAME and GIVEN AGE. The members of their carrier sets are as yet unspecified; they may be constrained by subsequent paragraphs.

Types can be assembled in three ways to form larger types. First, the set of all subsets of a type is itself a type: a powerset type. For example, a team comprises a set of persons.

team : P PERSON

The type of team is P(GIVEN PERSON).

Second, the set of all tuples of a certain size of values from other types is itself a type: a Cartesian product type. For example, personal details can be represented as a tuple.

personal_details1 : PERSON × NAME × AGE

The type of the triple personal_details1 is GIVEN PERSON × GIVEN NAME × GIVEN AGE.

Third, a type can be a product type but with labels on the components: a schema type. For example, personal details can be represented by a binding from a schema.

personal_details2 : [person : PERSON; name : NAME; age : AGE]

The type of the binding personal_details2 is [person : GIVEN PERSON; name : GIVEN NAME; age : GIVEN AGE]. The association of names and types within the square brackets is called a signature.
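The four kinds of type described so far can be captured by a small data structure. The following sketch is an illustration only and does not reflect the CADiZ implementation; the class names and the `show` function are hypothetical. A signature is stored as a frozen set of (name, type) pairs so that the order of declaration does not matter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Given:
    name: str

@dataclass(frozen=True)
class Power:
    elem: object

@dataclass(frozen=True)
class Product:
    parts: tuple

@dataclass(frozen=True)
class Schema:
    sig: frozenset  # set of (component name, type) pairs

def show(t):
    """Render a type in the notation used in the text."""
    if isinstance(t, Given):
        return f"GIVEN {t.name}"
    if isinstance(t, Power):
        return f"P({show(t.elem)})"
    if isinstance(t, Product):
        return " × ".join(show(p) for p in t.parts)
    if isinstance(t, Schema):
        pairs = sorted(t.sig, key=lambda pair: pair[0])
        return "[" + "; ".join(f"{n} : {show(ty)}" for n, ty in pairs) + "]"
    raise TypeError(f"not a type: {t!r}")

PERSON, NAME, AGE = Given("PERSON"), Given("NAME"), Given("AGE")

team_type = Power(PERSON)
details1_type = Product((PERSON, NAME, AGE))
details2_type = Schema(frozenset({("person", PERSON), ("name", NAME), ("age", AGE)}))

print(show(team_type))
print(show(details1_type))
print(show(details2_type))
```

Because the signature is a set, `show` prints the components in alphabetical order rather than in declaration order.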


Generic definitions require additional type notation. An example of a generic definition is that of the empty set. ∅[X ] == {x : X | false} The type of ∅ is generic in X ; it is written as the generic type [X ] P(GENTYPE X ), the type of the reference expression X within the set comprehension being P(GENTYPE X ). Such generic types are used only in describing the types of generic definitions, and so never appear within other types. The notation used for types is similar to that used in the expressions that denote their carrier sets, with the difference that given and generic types are distinguished. Z’s free type notation does not require any additional type notation: free types are abbreviations for given sets with constraints on their members [11,16, 20].

3 Requirements on the Typechecker

This section discusses some issues for the design of our typechecker.

3.1 Schemas

Schemas have signatures that influence the environment in which formulæ are typechecked. For example, in the set comprehension {n : N | n ≥ 1 • 2 ∗ n} there is the schema n : N | n ≥ 1 which declares an n of numeric type that is referenced from both the | part of the schema itself and from the • part of the set comprehension. In the majority of schemas that occur in real specifications, their signatures can be determined by checking their declarations alone. In the exceptional cases, it is preferable not to demand that the specifier reformulate in a way with which the typechecker can cope. Indeed, the formulations may have arisen not from being written by hand but as the results of semantically-valid inferences in a tool.² We contrive some exceptional cases below. We use draft standard Z notation, including its toolkit.

section contrivedExamples parents standard_toolkit

² Special provisions are needed with some inference rules to avoid variable capture. For example, the predicate m ∈ {n : N | n ≥ 1 • 2 ∗ n} is equivalent to the predicate ∃ n : N | n ≥ 1 • m = 2 ∗ n unless the name m had been n, in which case it becomes captured by the local declaration of n. Inference rules also must be careful with implicit instantiations. For example, applying the one-point inference rule [7] to the predicate ∃ x, y : P A | x = ∅ ∧ y = ∅ • x = y produces ∅ = ∅, which without explicit instantiations is type erroneous. A common provision to guard against these potential errors is to apply the typechecker and reject the inferences if any errors are detected. An alternative, as used in CADiZ, is to make the inference rules inherit types and instantiations from their operands onto their results, and to automatically rename variables to avoid variable capture.

The first example also conforms to the notation of the Z reference manual [11].

┌─ S1 ──────────
│ x : P ∅
├───────────────
│ x = ∅[Z]
└───────────────

In this example, the named schema paragraph has a single declaration and a single predicate. By considering only the declaration, the signature can be seen to contain the single name x, but the type of that name is not completely determined. The type of x as determined by the declaration may be expressed as P α, where α is a variable type (or type variable). Consideration of the predicate part of the schema constrains this type to P Z, assuming that this Z is written according to [11], or to P A if it is viewed as draft standard Z [16].³ Almost all Z typecheckers accept this example as being well-typed by this means, e.g. Hippo [14] and ZTC [4].

In draft standard Z notation, as well as the requirement to cope with signatures in which the types of some components are incompletely determined, there is also a requirement to cope with signatures in which the names of the components are incompletely determined.

g[X] == X

g is a generic definition that will be referred to without an explicit instantiation.

┌─ S2 ──────────
│ s == g
├───────────────
│ s = [x, y, z : Z]
│ {s | x = y} = ∅
└───────────────

In example S2, the reference to g is in the declaration of s,⁴ where its type (and hence its implicit instantiation) is not constrained at all. The first conjunct constrains s, and hence g, to be a schema with a particular signature.⁵ This signature could not be determined from the declaration alone. Yet this paragraph could be well-typed, and so a typechecker ought not to complain.

┌─ S3 ──────────
│ s == g
├───────────────
│ {s | x = y} = ∅
│ s = [x, y, z : Z]
└───────────────

Example S3 is similar to S2, differing only in that the conjuncts have been swapped. Again, the signature of the schema cannot be determined without consideration of the schema’s predicate part. In this case, the first conjunct uses s as an inclusion declaration, which constrains its type to be that of a schema, but without constraining its signature. The environment in which the equality x = y is to be typechecked is consequently not yet known, but it can be determined by consideration of the second conjunct. A typechecker that considers conjuncts in one particular order could not cope with both S2 and S3. The S3 paragraph could be well-typed, and a typechecker ought not to complain.

┌─ S4 ──────────
│ s == g
│ t == g
├───────────────
│ {s | x = y ∨ t = [z : P x]} = ∅
│ s = [x, y : P Z; z : P t]
└───────────────

Example S4 is like S2 and S3, except that instead of one conjunct providing information to help typecheck the other, information has to flow both ways. The types of x and y are determined by the second conjunct, then the type of t can be determined by the first conjunct, then the type of s and its z component can be determined from the second conjunct, and only then can remaining constraints within the first conjunct be checked.

Examples can be contrived in which the mutual dependencies between conjuncts are such that the constraints cannot be solved. We consider those to be ill-typed.

The requirements arising from these examples are that a typechecker should not insist on solving constraints in any particular order, except as necessitated by dependencies between the constraints themselves, and that it must be able to cope with constraints that involve signatures whose names are unknown. This requires variable signatures, analogous to variable types. A recursive descent of the phrase tree of a specification checking constraints along the way, e.g. as in [9], does not satisfy the first requirement.

³ Draft standard Z uses A as the type of all numbers, including the integers Z.
⁴ Draft standard Z allows use of the == notation in local as well as global declarations.
⁵ This is an example of a schema being used as an expression.

3.2 Browsing

A Z browser is a tool that presents a view of a Z specification and allows the user to select formulæ and ask questions about them [13,17]. An example is the selection of a reference expression and the question “where is the referenced variable’s declaration?”. This question is not so easy to answer as might at first appear. One problem is that a Z schema text can have several declarations of the same name: so long as they have the same types, these declarations are regarded as being merged into a single declaration. Another problem arises from schema inclusion declarations. A schema inclusion declaration declares variables of the same names and types as those of the included schema. Constraints imposed on the included declarations (such as the chained relation in the following example) do not constrain the original schema’s variables. Hence uses of the included declarations should not be regarded as uses of the original schema’s components.


The following contrived specification illustrates these problems, as explained below.

schema == [a, b : A]

|=? ∀ schema; b, c, d : A; c : A • a = b = c = d

When asking to see the declaration of a reference to a variable, a browser should (at least) direct attention to the schema text where that variable is declared. In the case of the above example’s reference to b, there are two merged declarations, one from the inclusion of schema, and one explicit, so directing attention to the whole schema text will have to suffice (assuming attention is directed to a single formula). In the case of the reference to d, there is only a single explicit declaration, so attention can be directed to that specific declaration. The variable c has two explicit declarations, so directing attention to the whole schema text is appropriate. The variable a has only one declaration, arising from the inclusion of schema, and so it is appropriate to direct attention to that inclusion declaration.

Browsing is a concern for a typechecker because the typechecker is clearly in the best position to determine the declarations referred to by reference expressions. But to succeed, it must be concerned not merely with the names and types that appear in signatures as introduced above, but with specific declarations and schema texts.

As well as mapping reference expressions to variable declarations, a browser may map variable declarations to uses of those variables. Given the assumption that attention is directed to a single formula, this can be achieved using questions such as “where is the first use?”, “where is the next use?”, etc. The uses of a variable are not just explicit reference expressions. There may also be uses of variables in the implicit instantiations of references to generic definitions, and uses implicit in binding construction (theta) expressions (these being equivalent to binding extensions involving reference expressions) and schema predicates (these being defined in terms of theta expressions). A browser might wish to draw attention to expressions that contain such implicit uses of variable declarations.

The knowledge of where a variable is used is relevant not only to interactive browsing, but also to formal reasoning. For example, the one-point rule must find all uses of a variable in replacing them by an expression of equal value [7].

A browser might also allow inspection of the types of expressions and the signatures of schemas. This is relatively easy for a typechecker to support: it just has to note that which it infers. This information is especially useful to specifiers attempting to understand type errors, as well as to implementors of typecheckers.
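The treatment of merged declarations in the contrived example can be sketched as follows. The representation of a schema text as a list of tagged declarations, and the names `declaration_sources` and `browse_target`, are assumptions made for this illustration rather than a description of how CADiZ records declarations.

```python
def declaration_sources(schema_text, schemas):
    """Map each declared name to the indices of the declarations introducing it."""
    sources = {}
    for index, (kind, payload) in enumerate(schema_text):
        # An inclusion declares the names of the included schema's signature.
        names = schemas[payload] if kind == "include" else payload
        for name in names:
            sources.setdefault(name, []).append(index)
    return sources

def browse_target(name, sources):
    """Where a browser should direct attention for a reference to name."""
    declarations = sources[name]
    if len(declarations) == 1:
        return ("declaration", declarations[0])
    # Several merged declarations: fall back to the whole schema text.
    return ("schema text", None)

# schema == [a, b : A]   and   schema; b, c, d : A; c : A
known_schemas = {"schema": ["a", "b"]}
example_text = [
    ("include", "schema"),
    ("explicit", ["b", "c", "d"]),
    ("explicit", ["c"]),
]

example_sources = declaration_sources(example_text, known_schemas)
for variable in "abcd":
    print(variable, browse_target(variable, example_sources))
```

The references to a and d resolve to a single declaration each, while b and c resolve to the whole schema text, matching the discussion above.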

3.3 Draft Standard Z

In draft standard Z, the type system is given, as input, an annotated syntax tree in which some formulæ already have type annotations expressing constraints between their types. As well as determining whether the given specification is well-typed, the type system is required to assign type annotations to formulæ for use later in defining the semantics of formulæ such as schema negations. The type system as presented in draft standard Z has been subject to some criticism concerning the overloading of τ as a meta-variable and as a variable type. A requirement on the type system presented below is to avoid that criticism, while at the same time being presented in a way that satisfies the requirements of draft standard Z.

3.4 Type-Constrained Generics

A companion paper to this one proposes some extensions to Z that would enable explicit definitions of schema calculus operators [21]. Such definitions would be similar to Z’s existing generic definitions, except that whereas the parameters of existing generic definitions can be instantiated with any sets, the parameters of schema calculus operators should be instantiated with schemas, usually with constraints between their signatures. For example, schema projection takes two schemas and returns the schema that is the set of bindings of just the names present in the right operand but subject to the constraints of both schemas.

function 32 leftassoc (_ schProj _)

_ schProj _ [†X, Y] == λ S : P X; T : P Y • {S; T • θY}

The † symbol separates generic parameters to its left (none in this example) from parameters to its right that should be instantiated with schemas (X, Y). The schema S; T imposes constraints on X and Y that they be compatible schemas. If the † had been omitted, this would have been regarded as an error, as the types of generic parameters may not be constrained. With the †, we want the definition to be well-typed, even though the signature of the schema S; T is unknown.

Given the lack of precise knowledge of signatures, how is the expression θY to be typechecked? What if the instantiation of X or Y was a schema with Y as one of its components? We would not want such a component to capture the reference to Y, as then the definition would not have the desired semantics of schema projection. We shall need to achieve the effect of the names in the signatures of the instantiating schemas being in a different name-space from the names declared explicitly in the definition.

A further extension to Z proposed in the companion paper [21] is undecoration expressions, which are needed for schema calculus operators whose definitions depend on decorations.
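The intended meaning of schema projection can be illustrated on finite schemas. In this sketch a schema is a list of bindings, each binding a Python dict from component names to values, and the signature of the right operand is passed explicitly as `y_names`. The function name `sch_proj` mirrors the operator, but the representation is an assumption of the example.

```python
def sch_proj(s, t, y_names):
    """Bindings of the names in y_names that satisfy both schemas (sketch)."""
    result = set()
    for bs in s:
        for bt in t:
            shared = bs.keys() & bt.keys()
            # S; T: the two bindings must agree on every shared component.
            if all(bs[n] == bt[n] for n in shared):
                joined = {**bs, **bt}
                # theta Y: keep just the components of the right operand.
                result.add(frozenset((n, joined[n]) for n in y_names))
    return result

# S constrains x < y over small values; T relates y to z.
schema_s = [{"x": 0, "y": 1}, {"x": 0, "y": 2}, {"x": 1, "y": 2}]
schema_t = [{"y": 1, "z": 10}, {"y": 2, "z": 20}, {"y": 0, "z": 0}]

projected = sch_proj(schema_s, schema_t, ["y", "z"])
print(sorted(sorted(binding) for binding in projected))
```

The binding with y = 0 is excluded because no binding of the left operand has y = 0, even though it satisfies the right operand on its own.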

4 Specification of the Typechecker

The specification of the typechecker takes the form of a formal system, for reasons as given by Cardelli [1].

“A typechecking algorithm, in some sense, implements a formal system, by providing a procedure for proving theorems in that system. The formal system is essentially simpler and more fundamental than any algorithm, so that the simplest presentation of a typechecking algorithm is the formal system it implements. Also, when looking for a typechecking algorithm, it is better to first define a formal system for it.”

The specification is presented in bottom-up order, first introducing the notations to be used, then presenting the individual type inference rules, and finally explaining how these are composed in forming the whole typechecker. Type-constrained generics are addressed separately, having first presented a typechecker for draft standard Z. We also explain how implicit instantiations are determined, as that has to be revised to cope with type-constrained generics.

4.1 Notations

Phrases. The definition of the syntax of Z phrases is assumed. Defined here is the syntax of notation for types and signatures (as exemplified earlier), and notation for environments to be used during typechecking. Phrases of this syntax denote values in the type universe.⁶

    Type = 'GIVEN' , NAME                                          (* given type *)
         | 'GENTYPE' , NAME                                        (* generic parameter type *)
         | 'P' , Type                                              (* powerset type *)
         | Type , '×' , Type , { '×' , Type }                      (* Cartesian product type *)
         | '[' , Sig , ']'                                         (* schema type *)
         | '[' , NAME , { ',' , NAME } , ']' , Type , [ ',' , Type ]   (* generic type *)
         | 'α' , { STROKE }                                        (* variable type *)
         | '(' , Type , ')'                                        (* parenthesized type *)
         ;
    Sig  = [ NAME , ':' , Type , { ';' , NAME , ':' , Type } ]
         | 'β' , { STROKE }                                        (* variable signature *)
         | 'ε'                                                     (* empty signature *)
         ;
    Env  = Sig
         | Sig , '⊕' , Sig                                         (* overridden environment *)
         ;

⁶ In this syntax, quotes enclose terminal symbols, comma concatenates phrases, square brackets enclose optional phrases, braces enclose phrases to be repeated zero or more times, and vertical bar separates alternatives [3].

Generic types never occur within other types, despite this syntax allowing that possibility. The need for the optional second type within a generic type is explained in the context of the type inference rule for reference expression on page 276. Variable types and variable signatures denote unknown values that will be determined by solving the constraints in which they appear. Similar variables are needed for NAMEs, for which we use (subscripted) ı and ȷ. An empty signature could be written as nothing, but writing ε is clearer. There is also an annotation operator ⦂ that allows types and signatures to be associated with Z phrases.

Metavariables. Metavariables appear in patterns that, when matched against existing known phrases, become associated with existing known values. Metavariables are named according to the type of phrase that they can match, as listed in Table 1. Where a pattern has to match several phrases of the same types, the names of the metavariables are given distinct numeric subscripts. For example, the pattern p₁ ∧ p₂ matches any conjunction predicate, associating p₁ with the left operand and p₂ with the right operand.

Table 1. Metavariables

    Symbol   Definition
    d        matches a Paragraph phrase (d for definition/description).
    de       matches a Declaration phrase.
    e        matches an Expression phrase.
    i, j     match NAME tokens or DeclName or RefName phrases (i for identifier).
    p        matches a Predicate phrase.
    s        matches a Section phrase.
    t        matches a SchemaText phrase (t for text).
    τ        matches a Type phrase.
    σ        matches a Sig phrase.
    Σ        matches an arbitrary type environment.
    +        matches a STROKE token.
    ∗        matches a { STROKE } phrase.
    ...      matches elision of repetitions of surrounding phrases, the total number of repetitions depending on syntax.

Having matched a metavariable with a phrase, we will use that metavariable as denoting the value of that phrase, for example σ denotes a function from NAME to Type.

Type Sequents. We write type sequents using the ⊢ symbol, to assert the well-typedness of the possibly-annotated phrase to the right of that symbol in the environment to its left. This notation is similar to that used by Spivey [10]. We superscript each ⊢ with a mnemonic letter to distinguish the syntax of the phrase appearing to its right (see Table 2).

Table 2. Type sequents

Formula           Definition
⊢^Z z             a type sequent asserting that specification z is well-typed.
Λ ⊢^S s ⦂ Γ       a type sequent asserting that, in the context of section environment Λ, section s has section-type environment Γ.
Σ ⊢^D d ⦂ σ       a type sequent asserting that, in the context of type environment Σ, paragraph d has signature σ.
Σ ⊢^P p           a type sequent asserting that, in the context of type environment Σ, predicate p is well-typed.
Σ ⊢^E e ⦂ τ       a type sequent asserting that, in the context of type environment Σ, expression e has type τ.
Σ ⊢^T t ⦂ σ       a type sequent asserting that, in the context of type environment Σ, schema text t has signature σ.
Σ ⊢^DE de ⦂ σ     a type sequent asserting that, in the context of type environment Σ, declaration de has signature σ.

Type Inference Rules. Each type inference rule is written in the following form,

\[
\frac{\text{type subsequents}}{\text{type sequent}} \quad (\text{constraints})
\]

or laid out as follows if the preceding form would extend into the right margin.

\[
\begin{array}{c}
\dfrac{\text{type subsequents}}{\text{type sequent}} \\[1ex]
(\text{constraints})
\end{array}
\]

They can be read as: if the type subsequents are valid, and the constraints are true, then the type sequent is valid. Some type inference rules have no type subsequents, and some have no constraints, but all have one type sequent.

The constraints are written using set theory notation; they typically express relationships that are required to hold between types or signatures. They refer to metavariables bound by pattern matching and to variables for which each application of a type inference rule uses fresh occurrences. We try to use Z-like syntax for the set theory notation used in constraints, so that no description of its intended meaning is needed here. The only unusual notation is ≈ for compatible relations, and decor ′ i, which denotes the name that is like the name associated with metavariable i but with the stroke ′ appended to it. (In contrast, i′ is a metavariable name, and i ′ is the schema resulting from decoration of the schema associated with metavariable i.)

We have chosen to use these conventional notations for syntactic definitions and type inference rules because of their conciseness and readability. Others have shown that it can all be done in Z [8,9]. Our discussion of the type inference system in section 4.3 is devoid of formalism due to lack of space.

4.2 Type Inference Rules

Using the notation introduced above, one type inference rule can be presented for each production of the Z syntax. There is space in this paper to present only some of them; a fuller set is available [19].

Specification

Sectioned specification. Each section⁷ is typechecked in an environment formed from preceding sections, and is annotated with an environment that it establishes. The constraints that establish these environments are omitted here (but are included in the fuller set of rules). From the environment in which a section is typechecked will be extracted just those section environments established by the section's parents. The type subsequent for the prelude section should be omitted if the prelude is one of the explicit sections of the specification.

\[
\frac{\{\} \vdash^{S} s_{\mathrm{prelude}} ⦂ \Gamma_0 \qquad \Lambda_1 \vdash^{S} s_1 ⦂ \Gamma_1 \quad \cdots \quad \Lambda_n \vdash^{S} s_n ⦂ \Gamma_n}
     {\vdash^{Z} s_1 ⦂ \Gamma_1 \ \ldots\ s_n ⦂ \Gamma_n}
\]

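Section

Rules omitted.

Paragraph

Given types paragraph. The names should all be different.

\[
\Sigma \vdash^{D} [i_1, \ldots, i_n]\ \mathrm{END} ⦂ \sigma
\quad
\left(\begin{array}{l}
\# \{i_1, \ldots, i_n\} = n \\
\sigma = i_1 : \mathbb{P}(\mathrm{GIVEN}\ i_1); \ldots; i_n : \mathbb{P}(\mathrm{GIVEN}\ i_n)
\end{array}\right)
\]

Axiomatic description paragraph. The signature of the paragraph is that of its schema text.

\[
\frac{\Sigma \vdash^{T} t ⦂ \sigma}
     {\Sigma \vdash^{D} \mathrm{AX}\ t ⦂ \sigma\ \mathrm{END} ⦂ \sigma}
\]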

Generic axiomatic description paragraph. The parameter names should all be different. The schema text can refer to the parameters. The signature of the paragraph comprises generic forms of the types from the signature of the schema text.

\[
\frac{\Sigma \oplus \{i_1 \mapsto \mathbb{P}(\mathrm{GENTYPE}\ i_1), \ldots, i_n \mapsto \mathbb{P}(\mathrm{GENTYPE}\ i_n)\} \vdash^{T} t ⦂ \sigma_1}
     {\Sigma \vdash^{D} \mathrm{GENAX}\ [i_1, \ldots, i_n]\ t ⦂ \sigma_1\ \mathrm{END} ⦂ \sigma_2}
\quad
\left(\begin{array}{l}
\# \{i_1, \ldots, i_n\} = n \\
\sigma_2 = \lambda\, j : \mathrm{dom}\ \sigma_1 \bullet [i_1, \ldots, i_n]\ (\sigma_1\ j)
\end{array}\right)
\]

⁷ Draft standard Z divides specifications into sections, each of which is a named sequence of paragraphs related to other sections [15].


Conjecture paragraph. The predicate should be well-typed. The signature of the paragraph is empty.⁸

\[
\frac{\Sigma \vdash^{P} p}
     {\Sigma \vdash^{D}\ \models?\ p\ \mathrm{END} ⦂ \sigma}
\quad (\sigma = \varepsilon)
\]

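Generic conjecture paragraph. The parameter names should all be different. The predicate can refer to the generic parameters. The signature of the paragraph is empty.

\[
\frac{\Sigma \oplus \{i_1 \mapsto \mathbb{P}(\mathrm{GENTYPE}\ i_1), \ldots, i_n \mapsto \mathbb{P}(\mathrm{GENTYPE}\ i_n)\} \vdash^{P} p}
     {\Sigma \vdash^{D} [i_1, \ldots, i_n]\ \models?\ p\ \mathrm{END} ⦂ \sigma}
\quad
\left(\begin{array}{l}
\# \{i_1, \ldots, i_n\} = n \\
\sigma = \varepsilon
\end{array}\right)
\]

Predicate

Membership predicate. The type of the right operand should be a powerset of the type of the left operand.

\[
\frac{\Sigma \vdash^{E} e_1 ⦂ \tau_1 \qquad \Sigma \vdash^{E} e_2 ⦂ \tau_2}
     {\Sigma \vdash^{P} (e_1 ⦂ \tau_1) \in (e_2 ⦂ \tau_2)}
\quad (\tau_2 = \mathbb{P}\ \tau_1)
\]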
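As a minimal illustration of the membership rule, the following Python sketch builds the constraint that the rule generates and checks it for types that are already fully known. The tuple encoding of types and the function names are assumptions of this sketch, not part of the paper's notation.

```python
def given(name):
    """A given type, encoded as a tagged tuple (an assumption of this sketch)."""
    return ("GIVEN", name)


def power(t):
    """The powerset type P t."""
    return ("P", t)


def membership_constraint(tau1, tau2):
    """Constraint of the membership rule: tau2 = P tau1, as a (left, right) pair."""
    return (tau2, power(tau1))


def holds(constraint):
    """For fully known types, an equality constraint holds when both sides coincide."""
    left, right = constraint
    return left == right


PERSON = given("PERSON")

# x ∈ S with x of type PERSON and S of type P PERSON.
well_typed = holds(membership_constraint(PERSON, power(PERSON)))
# S ∈ x with the operands swapped.
ill_typed = holds(membership_constraint(power(PERSON), PERSON))
```

With unknown (variable) types on either side, the same constraint would be recorded rather than checked immediately.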



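Truth predicate. This is always well-typed, hence there are no type subsequents.

\[
\Sigma \vdash^{P} \mathrm{true}
\]

Negation predicate.

\[
\frac{\Sigma \vdash^{P} p}{\Sigma \vdash^{P} \lnot p}
\]

Conjunction predicate.

\[
\frac{\Sigma \vdash^{P} p_1 \qquad \Sigma \vdash^{P} p_2}{\Sigma \vdash^{P} p_1 \land p_2}
\]

Universal quantification predicate. The predicate should be well-typed in the environment overridden with the signature of the schema text.

\[
\frac{\Sigma \vdash^{T} t ⦂ \sigma \qquad \Sigma \oplus \sigma \vdash^{P} p}
     {\Sigma \vdash^{P} \forall\, t ⦂ \sigma \bullet p}
\]

⁸ Conjectures in draft standard Z are introduced by the ⊨? keyword.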


Expression

Reference expression. A reference expression can be a reference to a generic definition in which the instantiation has been left implicit. In that case, for the instantiations to be determined later (once all constraints have been solved), the uninstantiated type has to be remembered as well as the instantiated type. The instantiated type is denoted by juxtaposing the generic type Σ i with a square bracketed list of variable types [α1, ..., αn] that replace instances of corresponding generic parameter types.

\[
\Sigma \vdash^{E} i ⦂ \tau
\quad
\left(\begin{array}{l}
i \in \mathrm{dom}\ \Sigma \\
\tau = \mathbf{if}\ \Sigma\, i = [\imath_1, \ldots, \imath_n]\ \alpha\ \mathbf{then}\ \Sigma\, i,\ (\Sigma\, i)\ [\alpha_1, \ldots, \alpha_n]\ \mathbf{else}\ \Sigma\, i
\end{array}\right)
\]

Generic instantiation expression. The name should be in the environment with a generic type. The instantiating expressions should be sets.

\[
\frac{\Sigma \vdash^{E} e_1 ⦂ \tau_1 \quad \cdots \quad \Sigma \vdash^{E} e_n ⦂ \tau_n}
     {\Sigma \vdash^{E} i[(e_1 ⦂ \tau_1), \ldots, (e_n ⦂ \tau_n)] ⦂ \tau}
\quad
\left(\begin{array}{l}
i \in \mathrm{dom}\ \Sigma \\
\Sigma\, i = [\imath_1, \ldots, \imath_n]\ \alpha \\
\tau_1 = \mathbb{P}\ \alpha_1 \\
\quad\vdots \\
\tau_n = \mathbb{P}\ \alpha_n \\
\tau = (\Sigma\, i)\ [\alpha_1, \ldots, \alpha_n]
\end{array}\right)
\]

Set extension expression. The component type of a set can be constrained only if it has any members. Those members should be all of the same type.

\[
\frac{\Sigma \vdash^{E} e_1 ⦂ \tau_1 \quad \cdots \quad \Sigma \vdash^{E} e_n ⦂ \tau_n}
     {\Sigma \vdash^{E} \{(e_1 ⦂ \tau_1), \ldots, (e_n ⦂ \tau_n)\} ⦂ \tau}
\quad
\left(\begin{array}{l}
\mathbf{if}\ n > 0\ \mathbf{then} \\
\quad (\tau_1 = \tau_n \\
\quad\ \ \vdots \\
\quad\ \tau_{n-1} = \tau_n \\
\quad\ \tau = \mathbb{P}\ \tau_1) \\
\mathbf{else}\ \tau = \mathbb{P}\ \alpha
\end{array}\right)
\]

Set comprehension expression. The expression should be well-typed in the environment overridden with the signature of the schema text.

\[
\frac{\Sigma \vdash^{T} t ⦂ \sigma \qquad \Sigma \oplus \sigma \vdash^{E} e ⦂ \tau_1}
     {\Sigma \vdash^{E} \{t ⦂ \sigma \bullet (e ⦂ \tau_1)\} ⦂ \tau_2}
\quad (\tau_2 = \mathbb{P}\ \tau_1)
\]

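Binding construction expression. The expression should be a schema. Every name and type pair in its signature, with the optional decoration added, should be present in the environment, and the types should not be generic.

\[
\frac{\Sigma \vdash^{E} e ⦂ \tau_1}
     {\Sigma \vdash^{E} \theta\ (e ⦂ \tau_1)\ {}^{*} ⦂ \tau_2}
\quad
\left(\begin{array}{l}
\tau_1 = \mathbb{P}[\beta] \\
\tau_2 = [\beta] \\
\forall\, i : \mathrm{NAME} \mid (i, \alpha_1) \in \beta \bullet (\mathrm{decor}\ {}^{*}\ i, \alpha_1) \in \Sigma \land \lnot\ \alpha_1 = [\imath_1, \ldots, \imath_n]\ \alpha_2
\end{array}\right)
\]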


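Schema conjunction expression. The two expressions should be schemas with compatible signatures. Those signatures are merged in forming the type of the whole schema conjunction.

\[
\frac{\Sigma \vdash^{E} e_1 ⦂ \tau_1 \qquad \Sigma \vdash^{E} e_2 ⦂ \tau_2}
     {\Sigma \vdash^{E} (e_1 ⦂ \tau_1) \land (e_2 ⦂ \tau_2) ⦂ \tau_3}
\quad
\left(\begin{array}{l}
\tau_1 = \mathbb{P}[\beta_1] \\
\tau_2 = \mathbb{P}[\beta_2] \\
\beta_1 \approx \beta_2 \\
\tau_3 = \mathbb{P}[\beta_1 \cup \beta_2]
\end{array}\right)
\]

Schema universal quantification expression. The expression should be a schema whose signature is compatible with that of the schema text. Those signatures are subtracted in forming the type of the whole schema universal quantification.

\[
\frac{\Sigma \vdash^{T} t ⦂ \sigma \qquad \Sigma \oplus \sigma \vdash^{E} e ⦂ \tau_1}
     {\Sigma \vdash^{E} \forall\, t ⦂ \sigma \bullet (e ⦂ \tau_1) ⦂ \tau_2}
\quad
\left(\begin{array}{l}
\tau_1 = \mathbb{P}[\beta] \\
\sigma \approx \beta \\
\tau_2 = \mathbb{P}[\mathrm{dom}\ \sigma \mathbin{⩤} \beta]
\end{array}\right)
\]

Schema text and declaration

Schema text. The declarations should have pairwise compatible signatures. The predicate should be well-typed in the environment overridden by the merging of those signatures. Duplicate declarations of the same names are thus permitted.

\[
\frac{\Sigma \vdash^{DE} de_1 ⦂ \sigma_1 \quad \cdots \quad \Sigma \vdash^{DE} de_n ⦂ \sigma_n \qquad \Sigma \oplus \sigma \vdash^{P} p}
     {\Sigma \vdash^{T} de_1; \ldots; de_n \mid p ⦂ \sigma}
\quad
\left(\begin{array}{l}
\sigma_1 \approx \sigma_2 \ \ \cdots\ \ \sigma_1 \approx \sigma_n \\
\qquad\ddots\qquad \vdots \\
\qquad\qquad \sigma_{n-1} \approx \sigma_n \\
\sigma = \sigma_1 \cup \ldots \cup \sigma_n
\end{array}\right)
\]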
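The compatibility relation ≈ and the merging of signatures used by the schema conjunction and schema text rules can be sketched in Python as follows. Signatures are modelled here as dictionaries from names to types, and types as tagged tuples; both encodings, and the names `compatible` and `merge`, are assumptions of this sketch.

```python
def compatible(sig1, sig2):
    """sig1 ≈ sig2: every name declared in both signatures has the same type."""
    return all(sig2[name] == ty for name, ty in sig1.items() if name in sig2)


def merge(*sigs):
    """Merge pairwise-compatible signatures; raise TypeError on a clash.

    Checking each signature against the merge of its predecessors is
    equivalent to checking all pairs, because the merge contains exactly
    the declarations of those predecessors.
    """
    merged = {}
    for sig in sigs:
        if not compatible(merged, sig):
            clashes = sorted(n for n in sig if n in merged and merged[n] != sig[n])
            raise TypeError(f"incompatible declarations of {clashes}")
        merged.update(sig)
    return merged


PERSON = ("GIVEN", "PERSON")

# Duplicate declarations of x with the same type are permitted.
merged = merge({"x": PERSON}, {"x": PERSON, "y": ("P", PERSON)})
```

Merging `{"x": PERSON}` with a signature that gives `x` the type `("P", PERSON)` raises `TypeError`, which corresponds to a violated ≈ constraint.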

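Variable declaration. The expression should be a set. The signature of the declaration is formed from the names, amongst which there can be duplicates.

\[
\frac{\Sigma \vdash^{E} e ⦂ \tau}
     {\Sigma \vdash^{DE} i_1, \ldots, i_n : e ⦂ \sigma}
\quad
\left(\begin{array}{l}
\tau = \mathbb{P}\ \alpha \\
\sigma = \{(i_1, \alpha)\} \cup \ldots \cup \{(i_n, \alpha)\}
\end{array}\right)
\]

Variable definition.

\[
\frac{\Sigma \vdash^{E} e ⦂ \tau}
     {\Sigma \vdash^{DE} i == e ⦂ \sigma}
\quad (\sigma = i : \tau)
\]

Inclusion declaration. The expression should be a schema.

\[
\frac{\Sigma \vdash^{E} e ⦂ \tau}
     {\Sigma \vdash^{DE} e ⦂ \sigma}
\quad (\tau = \mathbb{P}[\sigma])
\]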

4.3 Type Inference System

The type inference system applies type inference rules backwards (relative to the way the notation was described in section 6): the type sequent is viewed as a pattern, and the associations of metavariables with values produced by matching that pattern are used to instantiate the type subsequents and constraints. For the patterns to match, there must already be annotations on all formulæ, excepting predicates as they have none. These annotations can be all distinct variables, except as required by draft standard Z (namely that all instances of expressions duplicated by its transformations of chained relations and comma-separated declarations should have the same types).

Starting with a type sequent for a whole Z specification, the type inference rule for specification is applied to it, producing one type subsequent for each section, and some constraints to determine the environments to be used in typechecking those sections. There is no need to solve the constraints yet. Instead, type inference rules can be applied to the generated type subsequents, each application producing zero or more new type subsequents, until no more type subsequents remain. Termination is guaranteed by the finiteness of the original specification, and the fact that in every type inference rule the type subsequents involve only sub-formulæ of the type sequent's Z phrase.

This leaves a set of constraints to be solved. There are dependencies between constraints: for example, a constraint that checks that a name is declared in an environment cannot be solved until that environment has been determined by other constraints. As another example, references to generics generate a constraint involving the operation of generic type instantiation, which should not be performed until the type of the referenced generic has been determined. This can be ensured by solving the constraints in per-paragraph batches, as generics are defined at top-level and instantiated only in subsequent paragraphs.
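The first phase described above, applying rules backwards and collecting constraints without solving them, might be sketched in Python as follows. The sketch covers only a handful of rules (truth, negation, conjunction, membership, non-generic reference and set extension). The tuple encoding of phrases and types and the names `generate` and `fresh_var` are invented for this illustration, and the environment is assumed to be an already-known dictionary, which sidesteps the dependencies between constraints discussed above.

```python
import itertools

_counter = itertools.count(1)


def fresh_var():
    """A fresh variable type, standing for an annotation not yet determined."""
    return ("VAR", next(_counter))


def power(t):
    return ("P", t)


def generate(env, phrase, out):
    """Apply the rule matching `phrase`, appending (left, right, phrase) constraints.

    Returns the annotation of an expression, or None for a predicate.
    Recursion is only ever on sub-phrases, so it terminates.
    """
    kind = phrase[0]
    if kind == "true":
        return None
    if kind == "not":
        generate(env, phrase[1], out)
        return None
    if kind == "and":
        generate(env, phrase[1], out)
        generate(env, phrase[2], out)
        return None
    if kind == "in":  # membership predicate: tau2 = P tau1
        tau1 = generate(env, phrase[1], out)
        tau2 = generate(env, phrase[2], out)
        out.append((tau2, power(tau1), phrase))
        return None
    if kind == "ref":  # reference to a non-generic name assumed to be declared
        tau = fresh_var()
        out.append((tau, env[phrase[1]], phrase))
        return tau
    if kind == "set":  # set extension
        members = [generate(env, e, out) for e in phrase[1:]]
        tau = fresh_var()
        if members:
            for tau_k in members[:-1]:
                out.append((tau_k, members[-1], phrase))
            out.append((tau, power(members[0]), phrase))
        else:
            out.append((tau, power(fresh_var()), phrase))
        return tau
    raise ValueError(f"no rule for phrase kind {kind!r}")


T = ("GIVEN", "T")
env = {"x": T, "y": T}
pred = ("in", ("ref", "x"), ("set", ("ref", "x"), ("ref", "y")))  # x ∈ {x, y}
constraints = []
generate(env, pred, constraints)
```

Each constraint keeps the phrase whose rule produced it, so that a later solving phase can attribute any failure to that phrase.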
Unification is a suitable mechanism for solving constraints. For a well-typed specification, it is possible to solve all the constraints. The resulting unifier provides values for the variables in the constraints. For draft standard Z, every annotation's original variable should be replaced by the value to which it has been constrained. For a specification to be well-typed, no variables should remain within any of those values.

4.4 Implicit Instantiations

Once a paragraph has been typechecked, the instantiations of its uninstantiated references to generics can be made explicit. This can be expressed formally by the following rule, which transforms a reference expression with a pair of annotations to a generic instantiation expression.

\[
i ⦂ [i_1, \ldots, i_n]\ \tau,\ \tau' \implies i\ [\mathrm{carrier}\ \alpha_1, \ldots, \mathrm{carrier}\ \alpha_n] ⦂ \tau'
\qquad \text{where } \tau' = ([i_1, \ldots, i_n]\ \tau)\ [\alpha_1, \ldots, \alpha_n]
\]

The instantiating expressions are the carrier sets of the types inferred for the generic parameters. Those types α1, ..., αn are determined by comparison of the generic type [i1, ..., in] τ with the instantiation of it τ′.

4.5 Schemas

The well-typed though awkward schemas discussed in the requirements section can all be accepted by a typechecker as specified above. They could not have been accepted if the typechecker had instead attempted to solve constraints during a recursive traversal of Z phrases.

4.6 Browsing

Typechecking is based on signatures, which comprise just names and types, yet a browser needs to know about specific declarations. The names and types in a signature originate from the variable declarations of a schema text. So a set of variable declarations can serve as a representation of a signature. When a name is looked up in an environment, a declaration can be returned rather than just a type.

The requirement that inclusion declarations introduce new variable declarations distinct from those of the included schema is a complication for this scheme. Our typechecker defers this copying of declarations until after typechecking has finished. When a reference expression is typechecked, as well as noting the declaration to which it is bound, we also note the schema text which put that declaration into scope. A traversal of the specification after typechecking can then find all schema texts, make distinct copies of included declarations, and find all reference expressions and rebind them to the new declarations. To support this, every overriding of an environment by a signature is annotated with the corresponding schema text.

If a typechecker notes the declarations of all uses, including all implicit ones, then a browser has all the information needed to determine the uses of all declarations. Knowing the declaration referred to by a reference expression helps in the process of filling in implicit instantiations: the original uninstantiated type need not be remembered on the reference expression, as it can be retrieved from the referenced generic definition.

4.7 Draft Standard Z

The requirements of draft standard Z on the specification of the type system have largely been addressed by the above specification. One difference is that we have chosen to give type inference rules for schema texts and declarations, whereas those are transformed away earlier in draft standard Z. The choice made here involves more type inference rules, but generates fewer constraints elsewhere.

4.8 Undecoration Expressions

To support undecoration expressions [21], the following changes to the above specification are needed.

Change to Z syntax. Undecoration expressions are written using the undecor keyword and specify the stroke of the components to be extracted.

    Expr = ... all existing productions ...
         | 'undecor' , STROKE , Expr ;

Change to Z typechecker. The new undecoration expressions need a type inference rule.

Undecoration expression. The expression should be a schema. Every name and type pair in the schema's signature where the name's last stroke matches the given one is present in the result with that stroke removed.

\[
\frac{\Sigma \vdash^{E} e ⦂ \tau_1}
     {\Sigma \vdash^{E} \mathrm{undecor}\ {}^{+}\ (e ⦂ \tau_1) ⦂ \tau_2}
\quad
\left(\begin{array}{l}
\tau_1 = \mathbb{P}[\beta_1] \\
\tau_2 = \mathbb{P}[\beta_2] \\
\beta_2 = \{i : \mathrm{NAME} \mid (\mathrm{decor}\ {}^{+}\ i, \alpha) \in \beta_1 \bullet (i, \alpha)\}
\end{array}\right)
\]
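The effect of this rule on a signature can be sketched in Python. Names are modelled as strings whose trailing characters are their strokes (for example `x'` or `a?`), and a signature as a dictionary from names to types; these encodings and the function name `undecor` are assumptions of the sketch, and subscript strokes are ignored.

```python
def undecor(stroke, signature):
    """Keep the components whose name ends with `stroke`, with that stroke removed."""
    return {
        name[: -len(stroke)]: ty
        for name, ty in signature.items()
        if name.endswith(stroke) and len(name) > len(stroke)
    }


NUM = ("GIVEN", "NUM")  # hypothetical stand-in for a numeric type

signature = {"x": NUM, "x'": NUM, "y'": ("P", NUM), "a?": NUM}
after_state = undecor("'", signature)
inputs = undecor("?", signature)
```

Here `after_state` contains `x` and `y` with the types that `x'` and `y'` had, and `inputs` contains only `a`.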

Semantics of undecoration expressions. The semantic value of an undecoration expression is the set of bindings that is like that of the operand schema but without those components whose names do not have the given stroke and with that stroke removed from the retained names.

4.9 Type-Constrained Generics

To support type-constrained generics [21], the following changes to the above specification are needed.

Change to Z syntax. Generic parameter lists can have a dagger, which precedes those parameters that are constrained to be schemas.

    Fmls = [ NAME , { ',' , NAME } ] , [ '†' , NAME , { ',' , NAME } ] ;

Although the † notation has been introduced in formal parameter lists, and is introduced below in generic types, we do not introduce it in explicit generic instantiation lists, which just use , (comma) between instantiating expressions.

Changes to Z typechecker. The notation for generic types needs to list the names of the new parameters, for use in determining implicit instantiations.

    Type = '[' , [ NAME , { ',' , NAME } ] , [ '†' , NAME , { ',' , NAME } ] , ']' , Type , [ ',' , Type ]   (* generic type *)
         | ... other productions as before ... ;

There should be at least one NAME in a generic type, despite this syntax not requiring that. Some additional notation is needed for signatures.

    Sig = ... all existing productions ...
        | 'GENSIG' , NAME     (* generic parameter signature *)
        | Sig , '∪' , Sig     (* merged signature *) ;

The GENSIG notation is somewhat analogous to GENTYPE, but with the difference that a generic definition can impose constraints on a GENSIG. That difference makes GENSIG seem like the variable signature notation, but when these notations appear in environments, they are interpreted differently (see below). We also restrict the constraints on generic parameter signatures: we allow compatibility constraints, but reject unification constraints.

The ∪ notation denotes the signature formed by merging two signatures. The ∪ symbol that has already been used in some of the above type inference rules was an operator of set theory: the constraints in which ∪ was used were regarded as solvable only when its operands were known signatures. Those uses of ∪ can be regarded as uses of the new signature notation, allowing some of those constraints to be solved sooner.

All type inference rules concerned with generics need to be revised, as follows.

Generic axiomatic description paragraph. Generic parameter signatures are treated much like generic parameter types by this rule, the difference being that they have different types in the environment.

\[
\frac{\begin{array}{l}
\Sigma \oplus \{i_1 \mapsto \mathbb{P}(\mathrm{GENTYPE}\ i_1), \ldots, i_n \mapsto \mathbb{P}(\mathrm{GENTYPE}\ i_n), \\
\qquad\ j_1 \mapsto \mathbb{P}[\mathrm{GENSIG}\ j_1], \ldots, j_m \mapsto \mathbb{P}[\mathrm{GENSIG}\ j_m]\} \vdash^{T} t ⦂ \sigma_1
\end{array}}
     {\Sigma \vdash^{D} \mathrm{GENAX}\ [i_1, \ldots, i_n \dagger j_1, \ldots, j_m]\ t ⦂ \sigma_1\ \mathrm{END} ⦂ \sigma_2}
\quad
\left(\begin{array}{l}
\# \{i_1, \ldots, i_n, j_1, \ldots, j_m\} = n + m \\
\sigma_2 = \lambda\, j : \mathrm{dom}\ \sigma_1 \bullet [i_1, \ldots, i_n \dagger j_1, \ldots, j_m]\ (\sigma_1\ j)
\end{array}\right)
\]

Generic conjecture paragraph.

\[
\frac{\begin{array}{l}
\Sigma \oplus \{i_1 \mapsto \mathbb{P}(\mathrm{GENTYPE}\ i_1), \ldots, i_n \mapsto \mathbb{P}(\mathrm{GENTYPE}\ i_n), \\
\qquad\ j_1 \mapsto \mathbb{P}[\mathrm{GENSIG}\ j_1], \ldots, j_m \mapsto \mathbb{P}[\mathrm{GENSIG}\ j_m]\} \vdash^{P} p
\end{array}}
     {\Sigma \vdash^{D} [i_1, \ldots, i_n \dagger j_1, \ldots, j_m]\ \models?\ p\ \mathrm{END} ⦂ \sigma}
\quad
\left(\begin{array}{l}
\# \{i_1, \ldots, i_n, j_1, \ldots, j_m\} = n + m \\
\sigma = \varepsilon
\end{array}\right)
\]

Generic instantiation expression. The instantiations of schema parameters should be schemas.

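\[
\frac{\Sigma \vdash^{E} e_1 ⦂ \tau_1 \ \cdots\ \Sigma \vdash^{E} e_n ⦂ \tau_n \qquad \Sigma \vdash^{E} e'_1 ⦂ \tau'_1 \ \cdots\ \Sigma \vdash^{E} e'_m ⦂ \tau'_m}
     {\Sigma \vdash^{E} i[(e_1 ⦂ \tau_1), \ldots, (e_n ⦂ \tau_n), (e'_1 ⦂ \tau'_1), \ldots, (e'_m ⦂ \tau'_m)] ⦂ \tau}
\quad
\left(\begin{array}{l}
\alpha = \mathrm{lookup}\ i\ \Sigma \\
\alpha = [\imath_1, \ldots, \imath_n \dagger \jmath_1, \ldots, \jmath_m]\ \alpha' \\
\tau_1 = \mathbb{P}\ \alpha_1 \\
\quad\vdots \\
\tau_n = \mathbb{P}\ \alpha_n \\
\tau'_1 = \mathbb{P}[\beta_1] \\
\quad\vdots \\
\tau'_m = \mathbb{P}[\beta_m] \\
\tau = \alpha\ [\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_m]
\end{array}\right)
\]

The lookup operation is described below.

Reference expression.

\[
\Sigma \vdash^{E} i ⦂ \tau
\quad
\left(\begin{array}{l}
\alpha = \mathrm{lookup}\ i\ \Sigma \\
\tau = \mathbf{if}\ \alpha = [\imath_1, \ldots, \imath_n \dagger \jmath_1, \ldots, \jmath_m]\ \alpha'\ \mathbf{then}\ \alpha,\ \alpha\ [\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_m]\ \mathbf{else}\ \alpha
\end{array}\right)
\]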

The rule for binding construction expression needs to be revised analogously.

For draft standard Z, it is possible to solve all the constraints relating to a paragraph of a (well-typed) specification before proceeding to the next paragraph. With type-constrained generics, some constraints might not be solvable then. A counterexample is the explicit definition of schema conjunction, which imposes a constraint of compatibility between the signatures of its instantiating schemas. Having typechecked a paragraph, any remaining constraints should be noted as an attribute of that paragraph.

These unsolved constraints affect the check that all implicit instantiations are uniquely determined. The check cannot be delayed until the constraints are solved, as the declarations of the paragraph might never be used, and even if they are they might be used in type erroneous ways, so we continue to perform the check after typechecking each paragraph. Where an implicit instantiation is in a paragraph that has some parameters that must be schemas, and there are some unsolved constraints on which the implicit instantiation depends, we have assumed that the implicit instantiation will become uniquely determined when the paragraph's parameters are instantiated.

Constraints that involve looking up a name in an environment viewed the environment as a function, requiring any uses of the ⊕ notation in forming the environment to have been evaluated before the constraint doing the look up could be solved. With the extensions, environments can now contain generic parameter signatures, and so cannot all be evaluated. Hence the introduction of the lookup operation, which behaves as follows. If the environment is a known signature, then look up proceeds as before. If the environment is a variable signature, look up cannot succeed, and the constraint doing the look up will have to be solved later. If the environment is a generic parameter signature, look up behaves as if the requested name is not defined in this environment. This solves the problem exemplified by the definition of schema projection in the requirements above. Overridden environments and merged signatures cause look up to recurse appropriately.

This special treatment of generic parameter signatures in the environment is what restricts our type-constrained generics to being schemas. Relaxing that restriction would be nice, but we have been unable to find a way of doing so that also provides a solution to the name-space problem.

The lookup operation needs to return not just the inferred type but also any unresolved constraints from typechecking of a generic definition; these constraints are instantiated appropriately and added to the collection of constraints yet to be solved. Those unresolved constraints are also relevant to a browser: when displaying the type of a formula in a constrained generic definition, any unresolved constraints should be revealed.

Changes to implicit instantiations. Instantiating expressions are needed not just for the generic parameters but also for the parameters that are constrained to be schemas.

\[
i ⦂ [i_1, \ldots, i_n \dagger j_1, \ldots, j_m]\ \tau,\ \tau' \implies i\ [\mathrm{carrier}\ \alpha_1, \ldots, \mathrm{carrier}\ \alpha_n, \mathrm{carriersig}\ \beta_1, \ldots, \mathrm{carriersig}\ \beta_m] ⦂ \tau'
\]

where

\[
\tau' = ([i_1, \ldots, i_n \dagger j_1, \ldots, j_m]\ \tau)\ [\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_m]
\]

The carrier set of a schema type continues to be a schema construction expression, but we can no longer assume that the declarations within it are all variable declarations. Instead we need carriersig to generate an appropriate list of declarations. The carrier of a generic parameter signature is an inclusion declaration referring to the generic parameter. The carrier of a merged signature is the concatenation of the declarations that are the carriers of its operands. Beware that a schema construction expression with only one declaration that is an inclusion does not conform to draft standard Z syntax; the square brackets should be dropped in that case.

Semantics of type-constrained generics. Draft standard Z's semantic equation for generic axiomatic paragraph creates models for all set-valued instantiations of the generic parameters. It should be extended to consider all schema-valued instantiations of the parameters that should be schemas and that conform to the constraints on those instantiations.

4.10 Diagnosing Type Errors

An aim was to reject all ill-typed Z specifications, but legible error reports should also be provided when mistakes are detected. Mistakes are detected as invalid constraints. Since every constraint arises from the application of a type inference rule to a particular phrase, this allows mistakes to be attributed to corresponding phrases. For each invalid constraint, we provide the specifier with that constraint (paraphrased to some extent), identification of the corresponding phrase, and also the ability to browse the specification to see what the typechecker inferred.

Where constraints are independent of one another, they can be solved in any order. The order in which they are solved affects the phrases to which any mistakes are attributed. It is worth solving constraints that would be difficult to diagnose if invalid before solving independent constraints that would be easier to diagnose if invalid. For example, operator applications are transformed to involve tuples of operands before being typechecked; it is better to diagnose an operator as being of inappropriate arity than to say that the tuple of operands is of inappropriate size, as that tuple is not a separate phrase visible to the specifier.
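The attribution of invalid constraints to phrases can be illustrated with a small unification-based solver in Python. This is a sketch under simplifying assumptions: types are limited to variables, given types and powersets, each constraint is a triple carrying a label for the phrase that produced it, and the names `unify`, `solve` and `resolve` are invented here rather than taken from the authors' typechecker.

```python
def walk(t, subst):
    """Follow variable bindings until an unbound variable or a constructor."""
    while t[0] == "VAR" and t in subst:
        t = subst[t]
    return t


def occurs(var, t, subst):
    """True when `var` occurs inside `t`, which would make a binding circular."""
    t = walk(t, subst)
    if t == var:
        return True
    return t[0] == "P" and occurs(var, t[1], subst)


def unify(a, b, subst):
    """Extend `subst` so that a and b become equal; return False if impossible."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return True
    if a[0] == "VAR":
        if occurs(a, b, subst):
            return False
        subst[a] = b
        return True
    if b[0] == "VAR":
        return unify(b, a, subst)
    if a[0] == "P" and b[0] == "P":
        return unify(a[1], b[1], subst)
    return False


def solve(constraints):
    """Solve constraints in order; return the unifier and the offending phrases."""
    subst, errors = {}, []
    for left, right, phrase in constraints:
        if not unify(left, right, subst):
            errors.append(phrase)
    return subst, errors


def resolve(t, subst):
    """Replace variables in `t` by the values to which they have been constrained."""
    t = walk(t, subst)
    return ("P", resolve(t[1], subst)) if t[0] == "P" else t


T, U = ("GIVEN", "T"), ("GIVEN", "U")
v1, v2, v3, v4 = (("VAR", n) for n in range(1, 5))


def constraints_for(type_of_y):
    """Constraints for x ∈ {x, y}, with x of type T and y of the given type."""
    return [
        (v1, T, "x"),
        (v2, T, "x"),
        (v3, type_of_y, "y"),
        (v2, v3, "{x, y}"),
        (v4, ("P", v2), "{x, y}"),
        (v4, ("P", v1), "x ∈ {x, y}"),
    ]


subst, errors = solve(constraints_for(T))
_, bad_errors = solve(constraints_for(U))
```

In the well-typed case no phrase is reported and the set extension's annotation resolves to a powerset of `T`. When `y` has a different given type, the failing constraint is the one generated for the set extension, so the mistake is attributed to the phrase `{x, y}`. Reordering the constraint list would change which phrase is blamed, which is the effect described above.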

5 Conclusions

It is possible to typecheck even contrived Z specifications, so long as the implementation of the typechecker does not impose extra constraints, such as on the order in which constraints are expected to be solved. A typechecker can assist browsing and reasoning tools by determining where variables are used. We have given a specification of a typechecker in a form that might be suitable for draft standard Z. A typechecker for draft standard Z can be extended to handle type-constrained generics, and hence explicit definitions of schema calculus operators, without any backwards incompatibilities.

Acknowledgements: Rob Arthan set us thinking about awkward schemas. Funding for this work was provided by EPSRC grant GR/M20723.

References

1. L. Cardelli. Basic polymorphic typechecking. Science of Computer Programming, 8(2):147–172, April 1987.
2. P. Hancock. Polymorphic type-checking. In S.L. Peyton Jones, editor, The Implementation of Functional Programming Languages, 1987.
3. ISO/IEC 14977:1996(E). Information Technology - Syntactic Metalanguage - Extended BNF.
4. Xiaoping Jia. ZTC: A type checker for Z notation, user's guide. Technical Report Version 2.03, Division of Software Engineering, School of Computer Science, Telecommunication, and Information Systems, DePaul University, August 1998.
5. L. Lamport and L.C. Paulson. Should your specification language be typed? Transactions on Programming Languages and Systems, 21(3):502–526, May 1999.
6. R. Milner. A theory of type polymorphism in programming languages. Journal of Computer and System Science, 17:348–357, 1978.
7. D. Neilson. Machine support for Z: the zedB tool. In Proceedings of the 5th Z User Meeting, 1990.
8. J.N. Reed and J.E. Sinclair. An algorithm for type-checking Z. Technical Monograph PRG-81, Oxford University Computing Laboratory, Programming Research Group, March 1990.
9. C.T. Sennett. Review of the type checking and scope rules of the specification language Z. Technical Report 87017, Royal Signals and Radar Establishment, Malvern, November 1987.
10. J.M. Spivey. Understanding Z: A Specification Language and its Formal Semantics. Cambridge University Press, 1988.
11. J.M. Spivey. The Z Notation: A Reference Manual, 2nd edition. Prentice Hall, 1992.
12. J.M. Spivey and B.A. Sufrin. Type inference in Z. In D. Bjørner, C.A.R. Hoare, and H. Langmaack, editors, VDM'90: VDM and Z - Formal Methods in Software Development, LNCS 428, pages 426–451. Springer, 1990.
13. S. Stepney. Formaliser Home Page. http://public.logica.com/~formaliser/.
14. B. Sufrin. Using the Hippo system. Technical report, Oxford University Computing Laboratory, Programming Research Group, June 1989.
15. I. Toyn. Innovations in the notation of standard Z. In ZUM'98: The Z Formal Specification Notation, LNCS 1493. Springer, September 1998.
16. I. Toyn, editor. Z Notation: Final Committee Draft. http://www.cs.york.ac.uk/~ian/zstan/fcd.ps, August 1999.
17. I. Toyn. CADiZ web pages. http://www.cs.york.ac.uk/~ian/cadiz/, 2000.
18. I. Toyn and J.A. McDermid. CADiZ: An architecture for Z tools and its implementation. Software: Practice and Experience, 25(3):305–330, March 1995.
19. I. Toyn and S.H. Valentine. Type inference rules for Z. ftp://ftp.cs.york.ac.uk/hise reports/cadiz/ZSTAN/rules.ps, March 2000.
20. I. Toyn, S.H. Valentine, and D.A. Duffy. On mutually recursive free types in Z. In ZB2000: International Conference of B and Z Users, 2000.
21. S.H. Valentine, I. Toyn, S. Stepney, and S. King. Type-constrained generics. In ZB2000: International Conference of B and Z Users, 2000.

Guards, Preconditions, and Refinement in Z

Ralph Miarka, Eerke Boiten, and John Derrick

Computing Laboratory, University of Kent, Canterbury, CT2 7NF, UK
{rm17,E.A.Boiten,J.Derrick}@ukc.ac.uk

Abstract. In the common Z specification style operations are, in general, partial relations. The domains of these partial operations are traditionally called preconditions, and there are two interpretations of the result of applying an operation outside its domain. In the traditional interpretation anything may result whereas in the alternative, guarded, interpretation the operation is blocked outside its precondition. In fact these two interpretations can be combined, and this allows representation of both refusals and underspecification in the same model. In this paper we explore this issue, and we extend existing work in this area by allowing arbitrary predicates in the guard. To do so we adopt a non-standard three valued interpretation of an operation by introducing a third truth value. This value corresponds to a situation where we don’t care what effect the operation has, i.e. the guard holds but we may be outside the precondition. Using such a three valued interpretation leads to a simple and intuitive semantics for operation refinement, where refinement means reduction of undefinedness or reduction of non-determinism. We illustrate the ideas in the paper by means of a small example.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 286–303, 2000. © Springer-Verlag Berlin Heidelberg 2000

1 Introduction

In the states-and-operations (abstract data type) specification style in Z, operations are in general partial relations. The domains of these partial relations are traditionally called preconditions. Depending on which context the abstract data types are used in, there are two interpretations of the result of applying an operation outside its domain. In the traditional interpretation [11], anything may happen outside the precondition (including divergence); in the blocking (guarded) interpretation the operation is not possible. The latter interpretation is the common one when modelling reactive systems or combining Z with process algebra, and also in Object-Z. It is also called the 'firing condition' or 'enabling condition' interpretation [9].

It has been observed that it is often convenient to use a combination of these two interpretations, which allows both modelling of refusals and underspecification. One way of doing this is by having explicit guards as in B [1] or in Fischer's work [5]. In this paper we generalise existing work by allowing arbitrary predicates in the guard. Furthermore, we give a model of refinement, refining both guard and precondition.

Our inspiration comes from a non-standard semantics of operations, viz. an interpretation in three-valued logic. The third logic value is called "don't care", denoted ⊥. We do occasionally refer to "undefinedness", although this should probably be distinguished from the kind of undefinedness discussed by Valentine [15] and solved by VDM's third logic value. Using a three-valued logic leads to a simple and intuitive notion of (operation) refinement: refinement is reduction of undefinedness or reduction of non-determinism (or both). It would even allow an alternative definition of refinement which preserves "required non-determinism" [10,12].

However, such an interpretation of operations requires a more expressive notation than normal operations with explicit guards. In that notation, we take the operation to be false (impossible) outside its guard, and undefined where the guard holds but not the precondition. Clearly this allows us to state that, for certain before states, any after state "is undefined", but not that some after states are undefined, and others possible or impossible. We will define a syntax which is sufficiently expressive for this semantics, and define operation refinement rules for this which generalise the traditional ones.

The remainder of this work is structured as follows. In Section 2, we will demonstrate by means of an example (a simple money transaction system) that a combination of the traditional and blocking interpretations is sometimes required. Then, in Section 3, we define a schema notation including both guards and effect schemas. Based on that we define regions of operation behaviour, i.e. whether an operation is within or without the guard, or within or without the precondition. These regions can also be defined in a three valued interpretation, which we will give in Section 4.
Using such a three-valued interpretation leads to a simple and intuitive notion of refinement that generalises classical operation refinement. We introduce the rules in Section 5 and show their compatibility with the classical ones. Finally, we discuss related work (Section 6) and summarise our work, including a discussion of possible future research (Section 7).

2 Guards and Preconditions in Z

2.1 Example

Consider the following example of a simple money transaction system. It allows a non-negative amount of money to be transferred to a person’s bank account. We therefore need a set of bank account holders

    [PID]

Each bank account is characterised by its holder and the amount of money in it. Of course, we allow negative amounts in the account as well. On the other hand, not every person in the above set has to have a bank account; a collection of accounts is therefore a partial function. Further, total is a derived state component which calculates the amount of money in our bank by taking the sum of the money in all accounts.

    ┌─ Bank ──────────────────────────────────────
    │ account : PID ⇸ ℤ
    │ total : ℤ
    ├─────────────────────────────────────────────
    │ total = Σ x : dom account • account(x)
    └─────────────────────────────────────────────

We describe a transaction that will transfer a given amount of money to someone’s bank account. Clearly the amount transferred must not be negative, because we do not want to be able to decrease someone else’s account.

    ┌─ transfer ──────────────────────────────────
    │ ΔBank
    │ a? : ℤ
    │ p? : PID
    ├─────────────────────────────────────────────
    │ a? ≥ 0
    │ p? ∈ dom(account)
    │ account′ = account ⊕ {p? ↦ account(p?) + a?}
    └─────────────────────────────────────────────
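The Bank state and the transfer operation can be mimicked in a few lines of Python. This is only an illustrative sketch: the dictionary representation of the partial function, the use of strings for PID, and the choice of returning `None` outside the domain of the operation are assumptions made here, not part of the specification.

```python
from typing import Dict, Optional

Account = Dict[str, int]  # assumed model of account : PID ⇸ ℤ (PIDs as strings)


def total(account: Account) -> int:
    """Derived component: the sum of the money in all accounts."""
    return sum(account.values())


def transfer(account: Account, a: int, p: str) -> Optional[Account]:
    """Return the after state, or None when (account, a?, p?) lies outside
    the domain of the operation (a? < 0, or p? has no account)."""
    if a < 0 or p not in account:
        return None
    after = dict(account)          # account′ = account ⊕ {p? ↦ account(p?) + a?}
    after[p] = account[p] + a
    return after


bank = {"alice": 10, "bob": -5}    # hypothetical example state
print(transfer(bank, 7, "bob"))    # {'alice': 10, 'bob': 2}
print(transfer(bank, -7, "bob"))   # None
print(transfer(bank, 7, "carol"))  # None
```

What should happen in the two `None` cases is exactly the question the two interpretations answer differently.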

2.2 Classical Precondition and Guarded Interpretation

In the above example, two conditions have to be fulfilled for a transfer to be successful. On the one hand, the amount must not be negative, and on the other hand the receiving person must have an account. These conditions are expressed in the following schema:

    ┌─ pre transfer ──────────────────────────────
    │ Bank
    │ a? : ℤ
    │ p? : PID
    ├─────────────────────────────────────────────
    │ a? ≥ 0 ∧ p? ∈ dom account
    └─────────────────────────────────────────────

which is obtained as usual by existentially quantifying over the after state in transfer. But what happens if we try to apply the operation outside of these conditions? There are two possible interpretations: the precondition interpretation, allowing the operation, and the guarded interpretation, preventing it. A related issue is refinement, the development from a specification towards a more concrete representation. How do both interpretations deal with it?
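The difference between the two readings can be made concrete with a small sketch. The outcome labels `"defined"`, `"undefined"` and `"blocked"` are hypothetical names chosen for this illustration; the dictionary model of `account` is likewise an assumption.

```python
def pre_transfer(account, a, p):
    """pre transfer: a? ≥ 0 ∧ p? ∈ dom account."""
    return a >= 0 and p in account


def do_transfer(account, a, p):
    """The specified effect, only meaningful inside pre transfer."""
    return {**account, p: account[p] + a}


def apply_precondition(account, a, p):
    """Precondition reading: outside pre transfer anything may happen."""
    if pre_transfer(account, a, p):
        return ("defined", do_transfer(account, a, p))
    return ("undefined", None)     # any after state, or divergence


def apply_guarded(account, a, p):
    """Guarded reading: outside pre transfer the operation is refused."""
    if pre_transfer(account, a, p):
        return ("defined", do_transfer(account, a, p))
    return ("blocked", None)       # the operation cannot occur at all


bank = {"alice": 10}               # hypothetical example state
print(apply_precondition(bank, 5, "alice"))   # ('defined', {'alice': 15})
print(apply_precondition(bank, -5, "alice"))  # ('undefined', None)
print(apply_guarded(bank, -5, "alice"))       # ('blocked', None)
```

Inside the precondition both readings agree; they differ only in what a caller may assume outside it.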


In the classical Z interpretation a precondition represents the set of states where the operation is defined, i.e. guaranteed to produce the specified result. Outside the precondition the operation is considered to be undefined, which means that the operation can do anything, including non-termination (“divergence”). Therefore refinement can, apart from reducing non-determinism, weaken a precondition, allowing one to widen the scope of the operation and thereby reduce the area of undefinedness.

Other specification languages, like Object-Z, treat the precondition differently. There the precondition is considered as a guard, blocking the operation if the precondition is not fulfilled. Such an interpretation is occasionally used in Z as well, for example when modelling reactive systems (see for example [9,13]). Refinement of guards is treated differently. In Object-Z, for example, one is not allowed to change the guard. However, other approaches, like [10] where preconditions and guards are combined, allow strengthening of guards, i.e. reducing the applicability of the operation. They also allow any precondition to be weakened. However, the precondition is the upper bound for strengthening the guard and the guard is the lower bound for weakening the precondition.

2.3 Refinement

In the precondition interpretation, the following two refinements of transfer would be possible, each of them weakening one of the constraints of pre transfer. First, we could allow the creation of accounts:

    ┌─ C1 transfer ───────────────────────────────
    │ ΔBank
    │ a? : ℤ
    │ p? : PID
    ├─────────────────────────────────────────────
    │ a? ≥ 0
    │ p? ∉ dom(account) ⇒ account′ = account ⊕ {p? ↦ a?}
    │ p? ∈ dom(account) ⇒ account′ = account ⊕ {p? ↦ account(p?) + a?}
    └─────────────────────────────────────────────

This appears to be a sensible refinement; however, in the guarded interpretation it would be forbidden. The guarded interpretation also forbids the more dangerous

    ┌─ C2 transfer ───────────────────────────────
    │ ΔBank
    │ a? : ℤ
    │ p? : PID
    ├─────────────────────────────────────────────
    │ p? ∈ dom(account)
    │ account′ = account ⊕ {p? ↦ account(p?) + a?}
    └─────────────────────────────────────────────

which, by removing the requirement that a? ≥ 0, suddenly allows withdrawal of someone else’s money. In the precondition interpretation this is still a valid refinement, though. Apparently, the two predicates in pre transfer have a different status: a? ≥ 0 is more like a guard, whereas p? ∈ dom(account) is more like a precondition. Our example shows that each interpretation alone is not always sufficient. Therefore, we want to have both guards and preconditions in the same specification.

2.4 Combining Guards and Preconditions

The idea is not new and there are a number of essentially identical approaches. For example, Fischer [4,5] provides a solution to this problem by using an “enabled” schema to denote the guard and an “effect” schema for the classical operation schema with its precondition interpretation. Using this approach the transfer operation in our example evolves to

    F transfer
      ┌─ enable transfer ─────────────────────────
      │ a? : ℤ
      ├───────────────────────────────────────────
      │ a? ≥ 0
      └───────────────────────────────────────────
      ┌─ effect transfer ─────────────────────────
      │ ΔBank
      │ a? : ℤ
      │ p? : PID
      ├───────────────────────────────────────────
      │ p? ∈ dom(account)
      │ account′ = account ⊕ {p? ↦ account(p?) + a?}
      └───────────────────────────────────────────

where enable refers to the guard of the operation and effect to the effect of the operation. Now the operation F transfer is blocked whenever a? is negative. However, the update of someone’s account is only guaranteed if the account already exists. If it does not, divergence may occur.

With this notation we are able to develop refinement rules which deal with the guards and preconditions in an appropriate fashion. Such refinement rules would allow one to weaken the precondition of F transfer (i.e. effect transfer), reduce any non-determinism in the specification, and potentially strengthen the guard (i.e. enable transfer). With these rules in place we are able to weaken the precondition p? ∈ dom(account) provided we preserve the guard a? ≥ 0.

However, according to Fischer [5] the guard “must contain unprimed state variables only”. Unfortunately, this would still allow undesired refinements, as the after state is completely unconstrained for before states satisfying the guard but not the precondition. Sensible restrictions like

    {p?} ⩤ account′ = {p?} ⩤ account   and   total′ = total + a?

which express that no one else’s account changes and that the total amount of money cannot exceed the previous amount plus the newly added amount, cannot be imposed. Adding this restriction to effect transfer would have no effect, because it can be derived from effect transfer already. However, for states currently outside the precondition but within the guard, we have no way of imposing this as a postcondition.

3 A Syntax for Using Generalised Guards

In this section we introduce the syntax to describe an operation in terms of guards and preconditions. We then use this characterisation to define the different regions of definition that an operation can have. The operation syntax we introduce again splits an operation into two parts consisting of its guard and its effect in a way similar to that described in Section 2.4.

3.1 Operations with Guards and Preconditions

An operation Op is defined as a pair (gd Op, do Op), where gd denotes the guard of the operation and do the classical operation itself, and it is given by a schema

    Op
      ┌─ gd Op ───────────────────────────────────
      │ Dec_gd
      ├───────────────────────────────────────────
      │ pred_gd
      └───────────────────────────────────────────
      ┌─ do Op ───────────────────────────────────
      │ Dec_do
      ├───────────────────────────────────────────
      │ pred_do
      └───────────────────────────────────────────

such that Dec_gd and Dec_do denote the declarations of the guard and the operation respectively, and we require that Dec_gd is (textually) contained in Dec_do. One could require that do Op ⇒ gd Op, though we avoid such a restriction by using gd Op ∧ do Op rather than just do Op in any situation where such a restriction would be relevant. In particular, when we refer to just Op in an expression, this is taken to be an abbreviation for gd Op ∧ do Op. Note that whenever ¬ gd Op holds we do not care about do Op any more. Note as well that in gd Op we allow any arbitrary predicate, which may involve after states (S′) too, and indeed the signature reflects this. For example, the previously discussed operation transfer with the desired extension of the guard can now be expressed as


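    transfer2
      ┌─ gd transfer2 ────────────────────────────
      │ ΔBank
      │ a? : ℤ
      │ p? : PID
      ├───────────────────────────────────────────
      │ a? ≥ 0
      │ total′ = total + a?
      │ {p?} ⩤ account′ = {p?} ⩤ account
      └───────────────────────────────────────────
      ┌─ do transfer2 ────────────────────────────
      │ ΔBank
      │ a? : ℤ
      │ p? : PID
      ├───────────────────────────────────────────
      │ p? ∈ dom(account)
      │ account′ = account ⊕ {p? ↦ account(p?) + a?}
      └───────────────────────────────────────────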

Having primed state variables in the guard means that the guard is not executable, because we cannot test the after state beforehand. However, we may consider specifications that contain undefined areas as not implementable anyway, because some refinement is still missing. For refinement rules which remove undefinedness see Section 5 (and 6.4). Primed state variables in the guard do not limit implementations in general; they just give us more expressiveness.

3.2 Regions of Before States

Using such a notation, we can describe (at least) three different possibilities for a particular pair of before/after states:

• gd Op holds and do Op holds: the states belong to the operation.
• gd Op holds but do Op does not hold: the states may or may not belong to the operation, we don’t care.
• gd Op does not hold: we do not wish the states to belong to the operation. (Note that this makes do Op for this pair of states redundant information.)

Based on this description, we can define a number of regions of before states that are of interest. The next section will then formalise this description in a three-valued logic, and Section 5 will present a refinement relation that conforms with the above intuition. For simplicity, let us take Dec_do = Dec_gd = ΔS in the following definitions.

Impossible. The impossible region is the set of states where the operation is blocked, i.e. it is always going to fail.

    impo(Op) ≙ [S | ¬ ∃ S′ • gd Op]

Analysing our example, we identify that the operation transfer2 is always rejected when the amount a? is negative, i.e.

    impo(transfer2) = [Bank, a? : ℤ, p? : PID | a? < 0].

Precondition. The precondition region is the area where the operation is possible and well defined. It is defined by

    pre(Op) ≙ [S | ∃ S′ • gd Op ∧ do Op]

Observe that this is consistent with our convention of Op denoting gd Op ∧ do Op. This results in the following precondition for our example:

    pre(transfer2) = [Bank, a? : ℤ, p? : PID | p? ∈ dom(account) ∧ a? ≥ 0].

Guard. The guarded region is simply the complement of the impossible region, i.e. it is the area where the blocking predicate holds.

    guard(Op) ≙ [S | ∃ S′ • gd Op]

This, however, is the same as calculating the precondition of the guarded part of the operation, i.e. guard(Op) = pre(gd Op). For our example it then holds that

    guard(transfer2) = pre(gd transfer2) = [Bank, a? : ℤ, p? : PID | a? ≥ 0].

Here it is clear that our approach is strictly more expressive than Fischer’s: guard(Op) contains an abstraction of the information in our approach, whereas in his pre(enable) = enable. In transfer2 the guard is a? ≥ 0, losing the information that any widening of the precondition should respect {p?} ⩤ account′ = {p?} ⩤ account and total′ = total + a?.

Undefined. Given the regions defined by guard and precondition we could define a “completely undefined” region as the difference between guard and precondition. This would be

    undef(Op) ≙ [S | (∃ S′ • gd Op) ∧ ¬ (∃ S′ • gd Op ∧ do Op)]

In the initial transfer operation it is [Bank, a? : ℤ, p? : PID | a? ≥ 0 ∧ p? ∉ dom(account)] whereas in transfer2 this region is empty.
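The impossible, precondition and guard regions stated above for transfer2 can be reproduced by brute force on a small finite model. The sketch below makes several assumptions that are not in the paper: a two-element PID set, before balances 0..1, after balances -1..2 and inputs -1..1. The function names are hypothetical.

```python
from itertools import product

PIDS = ("p", "q")                       # hypothetical finite PID set


def states(balances):
    """All partial functions PIDS -> balances, modelled as dicts."""
    options = [None, *balances]
    return [{pid: b for pid, b in zip(PIDS, combo) if b is not None}
            for combo in product(options, repeat=len(PIDS))]


BEFORE, AFTER, AMOUNTS = states(range(2)), states(range(-1, 3)), range(-1, 2)
INPUTS = list(product(BEFORE, AMOUNTS, PIDS))


def without(acc, p):                    # {p?} ⩤ account
    return {k: v for k, v in acc.items() if k != p}


def gd(s, a, p, t):                     # gd transfer2
    return (a >= 0 and sum(t.values()) == sum(s.values()) + a
            and without(t, p) == without(s, p))


def do(s, a, p, t):                     # do transfer2
    return p in s and t == {**s, p: s[p] + a}


def region(pred):
    """Before states and inputs for which some after state satisfies pred."""
    return [x for x in INPUTS if any(pred(*x, t) for t in AFTER)]


guard = region(gd)                                        # ∃ S′ • gd Op
pre = region(lambda s, a, p, t: gd(s, a, p, t) and do(s, a, p, t))
impo = [x for x in INPUTS if x not in guard]              # ¬ ∃ S′ • gd Op

print(len(impo), len(pre), len(guard))  # 18 24 36
```

On this model the three lists coincide with the predicates a? < 0, p? ∈ dom(account) ∧ a? ≥ 0, and a? ≥ 0 respectively.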

4 Three Valued Interpretation

In the last section we defined several regions according to pairs of before/after states. We distinguished three different possibilities. First, the region where gd Op does not hold, i.e. where the operation should be impossible. Second, the region where both gd Op and do Op hold, i.e. where after states belong to the operation. Third, the remaining region where gd Op holds but do Op does not hold; in that case the outcome of the operation is undefined. These three regions are depicted in Figure 1 and can be naturally described using a set of three truth values {f, t, ⊥} respectively.

4.1 Semantical Description of the Regions

We want to define val Op to be a mapping from the regions into the three truth values given above. Therefore, we first define the relational representation of an operation schema in the obvious way, such that if Op is an operation on state S with input and output, rel Op is a binary relation between bindings of type S plus input and bindings of type S plus output (cf. Appendix A for a definition of rel, and a fully typed version of val).

[Fig. 1. Guard and Precondition. The figure shows the three regions Impossible (not gd_Op), Undefined (gd_Op and not do_Op) and Defined (gd_Op and do_Op).]

Further, we define a three-valued boolean-like type by

    bool3 ::= t | f | ⊥

Now the three-valued interpretation of an operation Op = (gd Op, do Op) can be defined as follows:

    val Op =   {x; y | (x, y) ∈ rel Op • (x, y) ↦ t}
             ∪ {x; y | (x, y) ∉ rel gd Op • (x, y) ↦ f}
             ∪ {x; y | (x, y) ∈ rel (gd Op ∧ ¬do Op) • (x, y) ↦ ⊥}

We use a table-style notation to relate before states and after states of an operation by means of the possible outcome, i.e. by val Op. For example, consider an operation

    Filter
      ┌─ gd Filter ───────────────────────────────
      │ a? : ℤ
      │ b! : ℤ
      ├───────────────────────────────────────────
      │ a? > 0
      └───────────────────────────────────────────
      ┌─ do Filter ───────────────────────────────
      │ a? : ℤ
      │ b! : ℤ
      ├───────────────────────────────────────────
      │ even(a?)
      │ b! ≤ a?
      └───────────────────────────────────────────

which takes only a positive number as input and, if the given number is even, returns any number less than or equal to it. Its table representation is

    a? \ b!   …   -1    0    1    2    3    4    5   …
       ⋮
      -1           f    f    f    f    f    f    f
       0           f    f    f    f    f    f    f
       1           ⊥    ⊥    ⊥    ⊥    ⊥    ⊥    ⊥
       2           t    t    t    t    ⊥    ⊥    ⊥
       3           ⊥    ⊥    ⊥    ⊥    ⊥    ⊥    ⊥
       4           t    t    t    t    t    t    ⊥
       5           ⊥    ⊥    ⊥    ⊥    ⊥    ⊥    ⊥
       ⋮
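The table for Filter follows directly from the definition of val Op. As a minimal sketch (the function names and the string encoding of the truth values are choices made here, not the paper's), the rows for a? and b! in -1..5 can be generated as follows:

```python
def gd_filter(a, b):
    """gd Filter: a? > 0."""
    return a > 0


def do_filter(a, b):
    """do Filter: even(a?) ∧ b! ≤ a?."""
    return a % 2 == 0 and b <= a


def val(gd, do, a, b):
    """Three-valued interpretation: f outside the guard, t where guard and
    effect both hold, ⊥ where the guard holds but the effect does not."""
    if not gd(a, b):
        return "f"
    return "t" if do(a, b) else "⊥"


B_VALUES = range(-1, 6)
rows = {a: "".join(val(gd_filter, do_filter, a, b) for b in B_VALUES)
        for a in range(-1, 6)}

for a, row in rows.items():
    print(f"{a:>3}  {' '.join(row)}")
```

Each printed row corresponds to one line of the table, with columns b! = -1, …, 5.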

4.2 Meaning of Refinement

Operation refinement is defined as removal of undefinedness as well as of non-determinism. Taking our three-valued interpretation and the above representation, we can explain refinement intuitively as replacing any ⊥ by t, which may enlarge the precondition region, or replacing any ⊥ by f, which in turn may reduce the guarded region. Furthermore, we can replace multiple t in a line by f (as long as one t remains), in order to reduce non-determinism. Note that the latter step changes neither the precondition nor the guarded region. Let us consider our Filter operation from above in order to clarify the presented notion of refinement. To this end, we introduce a possible refinement C Filter.

    C Filter
      ┌─ gd C Filter ─────────────────────────────
      │ a? : ℤ
      │ b! : ℤ
      ├───────────────────────────────────────────
      │ a? > 0
      │ b! < a?
      └───────────────────────────────────────────
      ┌─ do C Filter ─────────────────────────────
      │ a? : ℤ
      │ b! : ℤ
      ├───────────────────────────────────────────
      │ even(a?)
      │ b! = a?/2
      └───────────────────────────────────────────

The following refinement took place. First, we ensure that b! is always less than a?. This is done by strengthening the guard and corresponds to changing ⊥ to f for all cases where b! ≥ a?. Note that this refinement step also strengthens the postcondition of Filter in some cases. Second, we remove non-determinism by providing a more concrete representation of the output in the case that a? is even. This is done by replacing multiple t by f. Weakening of the precondition did not take place, but we may define an output for the case that a? is an odd number in another refinement step. However, the result will always be bounded by the newly introduced predicate in the guard. The outcome of this refinement step is illustrated in the following table.

    a? \ b!   …   -1    0    1    2    3    4    5   …
       ⋮
      -1           f    f    f    f    f    f    f
       0           f    f    f    f    f    f    f
       1           ⊥    ⊥    f    f    f    f    f
       2           ⊥    ⊥    t    f    f    f    f
       3           ⊥    ⊥    ⊥    ⊥    f    f    f
       4           ⊥    ⊥    ⊥    t    f    f    f
       5           ⊥    ⊥    ⊥    ⊥    ⊥    ⊥    f
       ⋮
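That C Filter is an acceptable development of Filter can also be checked mechanically against the three conditions given in Section 5.1 (applicability, correctness, and strengthening of the guard). The sketch below does this by enumeration over small finite ranges; the ranges, the pair encoding of an operation and the function names are assumptions made for illustration only.

```python
from itertools import product

A_VALUES, B_VALUES = range(-1, 6), range(-3, 6)   # assumed finite ranges
PAIRS = list(product(A_VALUES, B_VALUES))

# An operation is modelled as a pair (gd, do) of predicates over (a?, b!).
filter_op = (lambda a, b: a > 0,
             lambda a, b: a % 2 == 0 and b <= a)
c_filter = (lambda a, b: a > 0 and b < a,
            lambda a, b: a % 2 == 0 and b == a // 2)


def holds(op, a, b):
    """Op abbreviates gd Op ∧ do Op."""
    gd, do = op
    return gd(a, b) and do(a, b)


def pre(op, a):
    """pre Op: some output is related to the input."""
    return any(holds(op, a, b) for b in B_VALUES)


def check(aop, cop):
    applicability = all(pre(cop, a) for a in A_VALUES if pre(aop, a))
    correctness = all(holds(aop, a, b) for a, b in PAIRS
                      if pre(aop, a) and holds(cop, a, b))
    strengthening = all(aop[0](a, b) for a, b in PAIRS if cop[0](a, b))
    return applicability, correctness, strengthening


print(check(filter_op, c_filter))   # (True, True, True)
print(check(c_filter, filter_op))   # (True, False, False)
```

The second call shows that the relation is not symmetric: Filter neither respects the outputs fixed by C Filter nor its stronger guard.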

5 Operation Refinement

In this work, we will restrict ourselves to operation refinement. Our work is intended to generalise the classical approach of refinement. In this section, we first present our generalised rules of refinement, which we then apply to the transfer example. Finally, we show that our new refinement conditions indeed generalise both the guarded and the preconditioned approach.

5.1 Rules for Operation Refinement

Given an abstract operation AOp = (gd AOp, do AOp) and a concrete operation COp = (gd COp, do COp), both over the same state State with input x? : X and output y! : Y, COp refines AOp, denoted AOp ⊑ COp, if and only if applicability (1) and correctness (2) hold:

    (1) ∀ State; x? : X • pre AOp ⊢ pre COp
    (2) ∀ State; State′; x? : X; y! : Y • pre AOp ∧ COp ⊢ AOp

The first condition allows the precondition to be weakened, and the second condition ensures that the refined operation does at least what the abstract operation did. Additionally, we allow strengthening of guards:

    (3) ∀ State; State′; x? : X; y! : Y • gd COp ⊢ gd AOp

Conditions (1) and (3) together ensure that the precondition is the upper bound for strengthening the guard and that the guard is the lower bound for weakening the precondition. We observe that the correctness rule can be formally weakened using (3):

      pre AOp ∧ COp ⇒ AOp
    ≡ {definition of Op}
      pre(gd AOp ∧ do AOp) ∧ gd COp ∧ do COp ⇒ gd AOp ∧ do AOp
    ≡ {using gd COp ⇒ gd AOp}
      pre(gd AOp ∧ do AOp) ∧ gd COp ∧ do COp ⇒ do AOp
    ≡ {definition of Op}
      pre AOp ∧ COp ⇒ do AOp

It is nevertheless pleasing that the shape of the classical refinement rules is preserved when we use the introduced abbreviation.

5.2 Example

In Section 2 we introduced a simple money transaction system that allows money to be put into the account of an existing customer. We showed via an example that using only the guarded or only the precondition interpretation limits the expressiveness, and also perhaps allows unintended refinement. In our combined approach we solved these problems. Therefore, we are now able to express the following refinement of the transfer operation:

    C transfer
      ┌─ gd C transfer ───────────────────────────
      │ ΔBank
      │ a? : ℤ
      │ p? : PID
      ├───────────────────────────────────────────
      │ a? ≥ 0
      │ total′ = total + a?
      │ {p?} ⩤ account′ = {p?} ⩤ account
      └───────────────────────────────────────────
      ┌─ do C transfer ───────────────────────────
      │ ΔBank
      │ a? : ℤ
      │ p? : PID
      ├───────────────────────────────────────────
      │ p? ∉ dom(account) ⇒ account′ = account ⊕ {p? ↦ a?}
      │ p? ∈ dom(account) ⇒ account′ = account ⊕ {p? ↦ account(p?) + a?}
      └───────────────────────────────────────────
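The claim that C transfer refines transfer2 under rules (1) to (3) can be checked by enumeration on a small finite model. Everything about the model is an assumption made for this sketch: a two-element PID set, before balances 0..1, after balances -1..2 and inputs -1..1. The operation `gd_rogue` is a hypothetical variant, not taken from the paper, that drops the frame conditions from the guard; it is included only to show rule (3) rejecting a weakened guard.

```python
from itertools import product

PIDS = ("p", "q")                       # hypothetical two-person PID set


def states(balances):
    """All partial functions PIDS -> balances, modelled as dicts."""
    options = [None, *balances]
    return [{pid: b for pid, b in zip(PIDS, combo) if b is not None}
            for combo in product(options, repeat=len(PIDS))]


BEFORE, AFTER, AMOUNTS = states(range(2)), states(range(-1, 3)), range(-1, 2)
INPUTS = list(product(BEFORE, AMOUNTS, PIDS))


def without(acc, p):                    # {p?} ⩤ account
    return {k: v for k, v in acc.items() if k != p}


def gd_t2(s, a, p, t):                  # gd transfer2 (same as gd C transfer)
    return (a >= 0 and sum(t.values()) == sum(s.values()) + a
            and without(t, p) == without(s, p))


def do_t2(s, a, p, t):                  # do transfer2
    return p in s and t == {**s, p: s[p] + a}


def do_ct(s, a, p, t):                  # do C transfer: may create the account
    return t == {**s, p: s.get(p, 0) + a}


def gd_rogue(s, a, p, t):               # hypothetical weakened guard
    return a >= 0


def holds(op, s, a, p, t):              # Op abbreviates gd Op ∧ do Op
    return op[0](s, a, p, t) and op[1](s, a, p, t)


def pre(op, s, a, p):
    return any(holds(op, s, a, p, t) for t in AFTER)


def check(aop, cop):
    applicability = all(pre(cop, *x) for x in INPUTS if pre(aop, *x))
    correctness = all(holds(aop, *x, t) for x in INPUTS if pre(aop, *x)
                      for t in AFTER if holds(cop, *x, t))
    strengthening = all(aop[0](*x, t) for x in INPUTS for t in AFTER
                        if cop[0](*x, t))
    return applicability, correctness, strengthening


transfer2 = (gd_t2, do_t2)
print(check(transfer2, (gd_t2, do_ct)))     # (True, True, True)
print(check(transfer2, (gd_rogue, do_ct)))  # (True, True, False)
```

A finite check of this kind is evidence rather than proof, since the real state space is unbounded.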

First, we strengthened the guard gd transfer. Now, the money to be transferred must not be negative and we are not permitted to change another person’s bank account, no matter what future refinement will do to the precondition. Second, we also refined the do transfer operation. We weakened the precondition of transfer to handle the case that the receiving user does not have an account. In this case we allow the creation of a new bank account which will have the amount a? as its initial balance.

5.3 Generalisation of Traditional Refinement Rules

Our concept of refinement is a valid generalisation of the traditional operation refinement rules in both the guarded and the preconditioned approach. Taking gd Op = pre Op and do Op = Op or gd Op = true and do Op = Op, respectively, we show that our refinement rules reduce to the traditional ones.


Guarded Approach. In the guarded interpretation the guard is the precondition of the operation. Therefore, we use gd Op = pre Op and do Op = Op. Let Op1 = (gd Op1, do Op1) = (pre AOp, AOp) and Op2 = (gd Op2, do Op2) = (pre COp, COp). We show that for this choice of Op1, Op2 it holds that Op1 ⊑ Op2 ≡ AOp ⊑ COp in the guarded approach.

(1) Applicability.

      pre Op1 ⊢ pre Op2
    ≡ {Op = (gd Op ∧ do Op)}
      pre(gd AOp ∧ do AOp) ⊢ pre(gd COp ∧ do COp)
    ≡ {gd Op = pre Op and do Op = Op}
      pre(pre AOp ∧ AOp) ⊢ pre(pre COp ∧ COp)
    ≡ {simplification: pre Op ∧ Op ≡ Op}
      pre AOp ⊢ pre COp

(2) Correctness.

      pre Op1 ∧ Op2 ⊢ Op1
    ≡ {Op = (gd Op ∧ do Op)}
      pre(gd AOp ∧ do AOp) ∧ (gd COp ∧ do COp) ⊢ (gd AOp ∧ do AOp)
    ≡ {gd Op = pre Op and do Op = Op}
      pre(pre AOp ∧ AOp) ∧ (pre COp ∧ COp) ⊢ (pre AOp ∧ AOp)
    ≡ {simplification: pre Op ∧ Op ≡ Op}
      pre AOp ∧ COp ⊢ AOp

(3) Strengthening.

      gd Op2 ⊢ gd Op1
    ≡ {gd Op1 = pre AOp, gd Op2 = pre COp}
      pre COp ⊢ pre AOp

Applicability and strengthening together imply that pre COp = pre AOp, i.e. the classical condition in Object-Z that a guard can be neither strengthened nor weakened. The correctness rule is as in classical refinement as well.

Precondition Approach. In order to show that our approach is a generalisation of the precondition approach, we consider that the guard of the operation is the weakest possible, i.e. gd Op = true. Then our notation coincides with the classical one where do Op = Op. Using the fact that we consider Op = gd Op ∧ do Op it is easy to show that applicability (1) and correctness (2) hold. The rule for strengthening (3) evaluates to

    ∀ State; State′; x? : X; y! : Y • true

which means there is no strengthening at all. Therefore, in the case of no guards our refinement rules are equivalent to the classical ones.
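The two embeddings can be exercised on a toy example. The sketch below is illustrative only: operations are predicates over a single input a? and output b!, the finite ranges are assumptions, and `aop`/`cop` are hypothetical operations (the concrete one weakens the precondition by also accepting odd inputs).

```python
from itertools import product

A_VALUES, B_VALUES = range(0, 7), range(-3, 7)   # assumed finite ranges
PAIRS = list(product(A_VALUES, B_VALUES))


def pre(op, a):
    """pre Op: some output b! is related to input a?."""
    return any(op(a, b) for b in B_VALUES)


def classical_refines(aop, cop):
    applicability = all(pre(cop, a) for a in A_VALUES if pre(aop, a))
    correctness = all(aop(a, b) for a, b in PAIRS if pre(aop, a) and cop(a, b))
    return applicability and correctness


def generalised_refines(abstract, concrete):
    (gd_a, do_a), (gd_c, do_c) = abstract, concrete
    op_a = lambda a, b: gd_a(a, b) and do_a(a, b)   # Op abbreviates gd Op ∧ do Op
    op_c = lambda a, b: gd_c(a, b) and do_c(a, b)
    strengthening = all(gd_a(a, b) for a, b in PAIRS if gd_c(a, b))
    return classical_refines(op_a, op_c) and strengthening


def as_precondition(op):          # gd Op = true, do Op = Op
    return (lambda a, b: True, op)


def as_guarded(op):               # gd Op = pre Op, do Op = Op
    return (lambda a, b: pre(op, a), op)


aop = lambda a, b: a % 2 == 0 and b <= a   # abstract: even input, any b! ≤ a?
cop = lambda a, b: b == a // 2             # concrete: odd inputs accepted too

print(classical_refines(aop, cop))                                      # True
print(generalised_refines(as_precondition(aop), as_precondition(cop)))  # True
print(generalised_refines(as_guarded(aop), as_guarded(cop)))            # False
```

With gd Op = true the generalised check agrees with the classical one, whereas with gd Op = pre Op the weakened precondition is rejected by rule (3), matching the Object-Z condition that pre COp = pre AOp.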

6 Related and Further Work

6.1 Strulo’s Work

In [13] Strulo attempts to unify both the precondition and the guarded interpretation in order to model passive and active behaviour in Z accordingly. In his work, Strulo uses the term firing condition rather than guard. An operation is then described by a single state schema, plus a label indicating whether the operation is active or passive. A distinction is made between active operations being impossible or divergent, by interpreting before states which allow all possible after states as divergent. This encoding extends the guarded approach but is somewhat artificial. In particular, addition or removal of state invariants has subtle consequences for which states belong to the “impossible” or “divergent” regions.

6.2 The (R, A)-Calculus

Doornbos’ (R, A)-calculus [3] separates well-definedness of an operation from its effect, in an abstract setting of binary relations and sets. An operation (R, A) consists of a set A essentially representing its precondition, and a relation R specifying its effect. This is substantially different from having a relation with an explicit guard; in particular it allows the specification of “miracles”. The fragment of the calculus satisfying A ⊆ dom R (i.e., the “law” of the excluded miracle) is generalised by our calculus, viz. (gd Op, do Op) ≙ (R, A ◁ R). Doornbos also draws a parallel between the (R, A)-calculus and weakest (liberal) preconditions, which suggests a similar exercise would be possible for our calculus.

6.3 Hehner and Hoare’s Predicative Approach to Programming

In [6,7,8] the authors consider a specification to be a predicate of the form P ⇒ Q, meaning that if P is satisfied, then the computation terminates and satisfies Q. A specification S is refined by a specification T if all computations satisfying T also satisfy S, i.e. the reverse implication S ⇐ T (T ⊒ S). This allows weakening of the precondition P as well as strengthening of the postcondition Q. Within this approach, the predicate guard ∧ (pre ⇒ post) in a schema body would express nearly the desired effect under the guarding interpretation of Z schemas. In this interpretation, a false guard causes the specification to be false, i.e. impossible, and a false precondition pre leads to the specification being true, which in turn allows any output. However, the advantage of our approach with two schemas gd and do is a certain independence of the guard and precondition. Even when the precondition is false, not every output is permitted: it is still restricted by the guard.

6.4 Refinement Rules for Required Non-determinism

A different interpretation is possible for the operations in three-valued logic that we have described. Various authors (e.g. [10,12]) have argued that for behavioural specifications, the traditional identification of non-determinism with implementation freedom is unsatisfactory. They would like the opportunity to specify required non-determinism, which implies a need for additional specification operators to express implementation freedom. Refinement rules should then remove implementation freedom but not non-determinism. Steen et al. [12] describe such a calculus, obtained by adding a disjunction operator to LOTOS.

We could give a similar calculus in Z by reinterpreting the three-valued operations described above. As before, when the operation evaluates to f for a particular before and after state, it denotes an impossibility. However, the collection of after states that are related by t to a particular before state represents required non-determinism. As a consequence, none of these t values may be removed in refinement. Finally, the collection of after states that are related by ⊥ to a particular before state represents an implementation choice, i.e. at least one of those after states will need to be related by t in a final refinement. As a consequence, expressed in terms of the tabular representation used before, refinement rules for required non-determinism and disjunctive specification are:

• if a line contains a single ⊥, it is equivalent to t (required choice from a singleton set);
• if a line contains multiple occurrences of ⊥, some but not all of them may be changed to f (reducing possibility of choice);
• any ⊥ may be changed to t (in particular, an implementation choice between several after states may be refined to a non-deterministic choice between some of them).

This approach generalises only the guarded approach: the precondition just characterises those before states for which possible after states have been determined already. It also prevents some undesired interaction between removing undefinedness and increasing determinism.
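The three rules above can be read as a check on a single table line. The following sketch is one literal reading of them, not a definitive formalisation: a line is encoded as a string over `f`, `t` and `⊥` (an encoding chosen here), entries f and t must be preserved, and the ⊥ entries of a line may not all be turned into f.

```python
def row_refines(abstract: str, concrete: str) -> bool:
    """Check one table line under the required non-determinism reading."""
    if len(abstract) != len(concrete):
        return False
    for a, c in zip(abstract, concrete):
        # f stays impossible; a required t may not be removed
        if a in "ft" and c != a:
            return False
    # entries that were an implementation choice in the abstract line
    choices = [c for a, c in zip(abstract, concrete) if a == "⊥"]
    # some, but not all, of them may become f
    return not choices or any(c != "f" for c in choices)


print(row_refines("⊥⊥f", "t⊥f"))   # True: a ⊥ may become t
print(row_refines("⊥⊥f", "f⊥f"))   # True: some ⊥ may become f
print(row_refines("⊥⊥f", "fff"))   # False: not all ⊥ may become f
print(row_refines("t⊥", "f⊥"))     # False: a required t was removed
print(row_refines("⊥", "f"))       # False: a single ⊥ is equivalent to t
```

Whether a line containing both t and ⊥ entries may lose all of its ⊥ entries is not settled by the rules as stated; the sketch takes the conservative reading and rejects it.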

7 Conclusion and Future Work

In this work we presented the idea of using a three-valued interpretation of operations to combine and extend the guarded and precondition approaches. Using this non-standard interpretation we were able to present a simple and intuitive notion of operation refinement, which generalises the traditional refinement relations.

A full theory of refinement would also include a notion of data refinement. However, when the retrieve relation is a two-valued predicate the extension becomes natural. It remains an open question what might be represented by a three-valued retrieve relation.

In our interpretation of pairs of schemas (gd Op, do Op) we identified only three regions. Clearly, we could further distinguish the areas ¬ gd Op ∧ ¬ do Op and ¬ gd Op ∧ do Op. The latter area might be regarded as representing “miracles” or inconsistency. Detecting and managing inconsistency between the guarded and the preconditioned region is another of our topics for future research. Further, we would like to develop a schema calculus for the operators of three-valued logic.

Acknowledgement: We would like to thank all the anonymous referees for their corrections and helpful suggestions, which improved this work.

References

1. J.-R. Abrial. The B-Book: Assigning Programs to Meanings. Cambridge University Press, 1996.
2. Jonathan P. Bowen, Andreas Fett, and Michael G. Hinchey, editors. ZUM ’98: The Z Formal Specification Notation, Proceedings of the 11th International Conference of Z Users. Lecture Notes in Computer Science 1493. Springer Verlag, Berlin Heidelberg New York, September 1998.
3. H. Doornbos. A relational model of programs without the restriction to Egli-Milner constructs. In E.-R. Olderog, editor, PROCOMET ’94, pages 357–376. IFIP, 1994.
4. Clemens Fischer. CSP-OZ: A Combination of Object-Z and CSP. Technical Report TRCF-97-2, Universität Oldenburg, Fachbereich Informatik, PO Box 2503, 26111 Oldenburg, Germany, April 1997. Online: http://theoretica.Informatik.Uni-Oldenburg.DE/~fischer/techreports.html (last access 10/01/2000).
5. Clemens Fischer. How to Combine Z with Process Algebra. In Bowen et al. [2], pages 5–23.
6. Eric C. R. Hehner. A practical theory of programming. Springer Verlag, 1993.
7. Eric C. R. Hehner. Specifications, programs, and total correctness. Science of Computer Programming, 34(3):191–205, July 1999. Online: http://www.elsevier.com/cas/tree/store/scico/sub/1999/34/3/563.pdf (last access 09/05/2000).
8. C. A. R. Hoare and He Jifeng. Unifying Theories of Programming. Prentice Hall, 1998.
9. Mark B. Josephs. Specifying reactive systems in Z. Technical Report PRG-19-91, Programming Research Group, Oxford University Computing Laboratory, 1991.
10. K. Lano, J. Bicarregui, J. Fiadeiro, and A. Lopes. Specification of Required Nondeterminism. In John Fitzgerald, Cliff B. Jones, and Peter Lucas, editors, FME’97: Industrial Applications and Strengthened Foundations of Formal Methods (Proc. 4th Intl. Symposium of Formal Methods Europe, Graz, Austria, September 1997), Lecture Notes in Computer Science 1313, pages 298–317. Springer-Verlag, September 1997.
11. J. M. Spivey. The Z Notation: A Reference Manual. Prentice-Hall International Series in Computer Science. Prentice-Hall International (UK) Ltd., 2nd edition, 1992. Online: http://spivey.oriel.ox.ac.uk/~mike/zrm/index.html (last access 26/07/1998).
12. M.W.A. Steen, H. Bowman, J. Derrick, and E.A. Boiten. Disjunction of LOTOS specifications. In T. Mizuno, N. Shiratori, T. Higashino, and A. Togashi, editors, Formal Description Techniques and Protocol Specification, Testing and Verification: FORTE X / PSTV XVII ’97, pages 177–192, Osaka, Japan, November 1997. Chapman & Hall. Online: http://www.cs.ukc.ac.uk/pubs/1997/350 (last access 20/01/2000).
13. Ben Strulo. How Firing Conditions Help Inheritance. In Jonathan P. Bowen and Michael G. Hinchey, editors, ZUM’95: The Formal Specification Notation, Lecture Notes in Computer Science 967, pages 264–275. Springer Verlag, 1995.
14. Ian Toyn. Z Notation: Final Committee Draft, CD 13568.2, August 24, 1999. Online: http://www.cs.york.ac.uk/~ian/zstan/ (last access 09/05/2000).
15. S. H. Valentine. Inconsistency and Undefinedness in Z – A Practical Guide. In Bowen et al. [2], pages 233–249.

Appendix A: Relational View of Operations

In this appendix we give a formal definition of the relational view of an operation schema, as a binary relation between the appropriate sets of bindings. Binding types are not first-class citizens in “traditional” Z, but using notations and conventions from the Draft Z Standard [14] we can provide a sensible typing for the operations defined here. Define the signature of a schema by changing its predicate to true:

    ΣOp = Op ∨ ¬Op

Using the precondition operator, we can define “before” and “after” signatures of a schema by:

    Σbef Op = Σ(pre Op)
    Σaft Op = ∃ Σbef Op • ΣOp

By the conventional interpretation of the precondition operator, Σbef Op will contain Op’s before state and any inputs; Σaft Op contains its after state and any outputs. In order to provide a type for the relational view of an operation, we have to define the types of “before”-bindings and “after”-bindings of an operation. This could be done explicitly using quantification and filtering over sets of names as in the Draft Standard for pre, but also using just its [σ] notation for binding types.

Every (well-defined) schema Op has a unique type of the form ℙ[σ]. Let us denote this σ by b^Op; define b_bef^Op = b^(Σbef Op) and analogously b_aft^Op. Then the relational view of an operation is defined by

    rel Op = {x : [b_bef^Op]; y : [b_aft^Op] | ∃ ΣOp • x = θΣbef Op ∧ y = θΣaft Op ∧ Op • (x, y)}

The definition of val Op as given in Section 4 actually requires a slight modification when ΣOp and Σgd Op are different. Let the extension ext of Op1 to the signature of Op2 be defined by:

    Op1 ext Op2 = [ΣOp2 | Op1]

Then

    val Op =   {x : [b_bef^Op]; y : [b_aft^Op] | (x, y) ∈ rel Op • (x, y) ↦ t}
             ∪ {x : [b_bef^Op]; y : [b_aft^Op] | (x, y) ∉ rel (gd Op ext do Op) • (x, y) ↦ f}
             ∪ {x : [b_bef^Op]; y : [b_aft^Op] | (x, y) ∈ rel (gd Op ∧ ¬do Op) • (x, y) ↦ ⊥}

Retrenchment, Refinement, and Simulation

R. Banach (a) and M. Poppleton (a, b)

(a) Computer Science Dept., Manchester University, Manchester, M13 9PL, U.K.
(b) Faculty of Maths. and Comp., Open University, Milton Keynes, MK7 6AL, U.K.
[email protected] , [email protected]

Abstract: Retrenchment is introduced as a liberalisation of refinement intended to address some of the shortcomings of refinement as the sole means of progressing from simple abstract models to more complex and realistic ones. In retrenchment the relationship between an abstract operation and its concrete counterpart is mediated by extra predicates, allowing the expression of non-refinement-like properties and the mixing of I/O and state aspects in the passage between levels of abstraction. Modulated refinement is introduced as a version of refinement allowing mixing of I/O and state aspects, in order to facilitate comparison between retrenchment and refinement, and various notions of simulation are considered in this context. Stepwise simulation, the ability of the simulator to mimic a sequence of execution steps of the simulatee in a sequence of equal length, is proposed as the benchmark semantic notion for relating concepts in this area. One version of modulated refinement is shown to have particularly strong connections with automata theoretic strong simulation, in which states and step labels are mapped independently from simulator to simulatee. A special case of retrenchment, simple simulable retrenchment, is introduced and shown to have properties very close to those of modulated refinement. The more general situation is discussed briefly. The details of the theory are worked out for the B-Method, though the applicability of the underlying ideas is not limited to just that formalism.

Keywords: Retrenchment, Refinement, Simulation, B-Method.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 304−323, 2000. Springer-Verlag Berlin Heidelberg 2000

1 Introduction

In [1] the authors observed that the normal practice of using refinement as the sole means of going from an abstract description of a desired system to a more realistic one exhibits certain deficiencies. It is desirable to keep things simple and elegant at the highest levels of description, yet a lower level account needs to recognise the impact of many low level details that necessarily intrude, in an essential way, upon the idealised nature of the former. We therefore proposed that the exigencies of refinement be mollified by two extra predicates per operation, the WITHIN and CONCEDES clauses. The former strengthens the precondition and the latter weakens the postcondition; the latter in particular allows the expression of non-refinement-like behaviour. Permitting these clauses also to mix state and I/O information between levels of abstraction when convenient yields a very flexible framework for building up complex specifications from over-simple but appealing predecessors. In this manner we overcame the unforgiving nature of the refinement proof obligations.

In [1] we were concerned with justifying retrenchment on engineering grounds. We considered this more pragmatic departure reasonable, so that we did not fall into the trap of making a premature commitment to a particular mathematical notion that later proved to be inconvenient in the face of large examples. In the present work we return to examine the foundations of the notion that we have proposed. Specifically we examine stepwise simulation, the ability to simulate a sequence of steps of the simulatee by an equal length sequence of steps of the simulator. The main tool for this is a notion of refinement, called modulated refinement, similar to elaborations of conventional refinement that allow for change of I/O representations. Modulated refinement comes in two versions, normal and inverted, and the latter supports an especially strong connection with (automata theoretic) strong simulation. This is of independent interest, and furthermore provides the means to show how the properties of retrenchment are related to those of refinement.

The rest of this paper is organised as follows. In Section 2 we discuss via an example how refinement can be inconvenient in developing complex specifications from simpler models. We also discuss some ways in which the existing literature addresses these points, if, in our view, only partially. In Section 3 we show how retrenchment provides a natural framework for the needed flexibility. Section 4 highlights the point that stepwise simulation is the fundamental semantic notion by which we measure the relationships between systems considered in this paper. Section 5 introduces modulated refinement in its two versions, and elaborates the connection between these and (automata theoretic) strong simulation. The link between refinement and retrenchment is considered in Section 6, which introduces simple simulable retrenchment, a special case having properties very close to those of modulated refinement. Specifically, stepwise simulation and strong simulation results are easy to derive, and modulated refinements of the two kinds are recovered. Section 7 returns to the original example and Section 8 concludes.

Notation. In the body of the paper we use the B Abstract Machine Notation for model oriented specification and system development (see [2, 3, 4, 5]). This provides a comprehensive syntax and semantics for the concepts of refinement most used in development, and our ideas slot very neatly into the B framework. Nevertheless the ideas of the paper are independent of notation, and readily apply to other approaches.

2 Some Inadequacies of Refinement

While refinement has proved its worth many times over as an implementation mechanism, there is room for misgivings in its use to describe the much more informal processes that often occur when moving from an appealing and simple model of a system to a realistic but more complex and less elegant one, where it is the latter that must actually be implemented. Let us illustrate with a small example. We will consider a mobile radio system. At a high level of abstraction we can model its essential features thus:

    MACHINE Mobile_Radio_HL
    SETS CALLSTATES = { Idle , Busy }
    VARIABLES callState , currChan
    INVARIANT callState ∈ CALLSTATES ∧ currChan ∈ CHANELS
    INITIALISATION callState := Idle || currChan :∈ CHANELS
    OPERATIONS
      call_outgoing ( num ) ≙
        PRE callState = Idle ∧ num ∈ CHANELS
        THEN CHOICE callState := Busy || currChan := num OR skip END
        END ;
      call_incoming ( num ) ≙
        PRE num ∈ CHANELS
        THEN SELECT callState = Idle THEN callState := Busy || currChan := num ELSE skip END
        END ;
      disconnect_outgoing ≙
        PRE callState = Busy THEN callState := Idle END ;
      disconnect_incoming ≙
        SELECT callState = Busy THEN callState := Idle END ;
    END

The model describes what we would expect ‘normal service’ to consist of. In this model we distinguish between outgoing operations, which are initiated by the current possessor of the device in question and whose validity is protected by PRE assertions, and incoming operations, which are prompted unpredictably from the (user’s) environment and whose validity (besides input typing clauses) is protected by SELECT guards. In the former case the operation diverges if called outside its PRE clause; in the latter case it will not start unless the SELECT guard is true. The distinction is made clear by considering the normal form of an arbitrary AMN operation as described in [2] Ch. 6, which can be written as:

    opname ≙ PRE P(x) THEN ANY x′ WHERE Q(x, x′) THEN x := x′ END END    (2.1)

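for suitable predicates P and Q in the variables mentioned. For opname to guarantee to establish some property $\mathcal{P}$ (not to be confused with the precondition P), written $[opname]\,\mathcal{P}$, the following must hold:

$$P(x) \land (\forall x' \bullet Q(x, x') \Rightarrow \mathcal{P}[x' \backslash x]) \tag{2.2}$$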
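As an informal illustration, formula (2.2) can be evaluated directly when the state space is finite. The sketch below is not part of the B-Method tooling: the four-element state space and the particular `pre`, `rel` and `never` predicates are hypothetical choices, picked so that each region of an operation (divergent, normal, miraculous) is inhabited.

```python
from typing import Callable, Iterable

State = int  # assumption: states are small integers, purely for illustration


def establishes(pre: Callable[[State], bool],
                rel: Callable[[State, State], bool],
                post: Callable[[State], bool],
                x: State,
                states: Iterable[State]) -> bool:
    """Evaluate (2.2) at before-state x: pre(x) and every rel-successor satisfies post."""
    return pre(x) and all(post(y) for y in states if rel(x, y))


def classify(pre, rel, x, states) -> str:
    """Name the region of the operation that before-state x falls into."""
    if not pre(x):
        return "divergent"    # (2.2) is false whatever the postcondition
    if not any(rel(x, y) for y in states):
        return "miraculous"   # (2.2) is true whatever the postcondition
    return "normal"


STATES = range(4)


def pre(x):        # hypothetical P(x): state 0 lies outside the precondition
    return x != 0


def rel(x, y):     # hypothetical Q(x, x'): only state 1 has a successor
    return x == 1 and y == 2


def never(y):      # an unsatisfiable postcondition
    return False


regions = {x: classify(pre, rel, x, STATES) for x in STATES}
print(regions)
```

In the miraculous states the check succeeds even for the unsatisfiable postcondition `never`, which is exactly why such states are read as ones in which the operation cannot start.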

Divergence (or abortion, or nontermination) is caused by the failure of P(x) to hold, which prevents the predicate from ever being true. Normal working is when P(x) holds and there is an x′ such that Q(x, x′) and $\mathcal{P}[x' \backslash x]$ hold, where $\mathcal{P}$ is the property to be established. However when P(x) holds and there is no x′ such that Q(x, x′) holds, the operation succeeds miraculously, since the above predicate is then true independently of $\mathcal{P}[x' \backslash x]$. Since miracles are infeasible, the miraculous region of an operation is interpreted as one in which the operation cannot start, dually to the interpretation of the nonterminating region as one in which the operation cannot stop (normally). The SELECT guards, G(x) say, used above just correspond to cases where Q(x, x′) can be decomposed into the form G(x) ∧ x′ = value, where value is independent of x and x′.

Moving now to a lower level of abstraction, we take into account various facts: firstly that before the radio will work, the user must select a suitable waveband; secondly that when making an outgoing call the radio may jam, from which it must be reset; thirdly that during a call, fadeouts can occur which will also cause the radio to jam, requiring a reset. The lower level model is then as follows:

    MACHINE Mobile_Radio_LL
    SETS JCALLSTATES = CALLSTATES ∪ { Jam }
    VARIABLES jcallState , jcurrChan , bandSelected
    INVARIANT jcallState ∈ JCALLSTATES ∧ jcurrChan ∈ CHANELS ∧ bandSelected ∈ BOOL
    INITIALISATION jcallState := Idle || jcurrChan :∈ CHANELS || bandSelected := FALSE
    OPERATIONS
      select_band ≙
        PRE bandSelected = FALSE THEN bandSelected := TRUE END ;
      call_outgoing ( num ) ≙
        PRE bandSelected = TRUE ∧ jcallState = Idle ∧ num ∈ CHANELS
        THEN CHOICE jcallState := Busy || jcurrChan := num OR skip OR jcallState := Jam END
        END ;
      call_incoming ( num ) ≙
        PRE num ∈ CHANELS
        THEN SELECT bandSelected = TRUE ∧ jcallState = Idle THEN jcallState := Busy || jcurrChan := num ELSE skip END
        END ;
      disconnect_outgoing ≙
        PRE bandSelected = TRUE ∧ jcallState = Busy THEN jcallState := Idle END ;
      disconnect_incoming ≙
        SELECT bandSelected = TRUE ∧ jcallState = Busy THEN jcallState := Idle END ;
      fadeout ≙
        SELECT bandSelected = TRUE ∧ jcallState = Busy THEN jcallState := Jam END ;
      reset ≙
        PRE jcallState = Jam THEN jcallState := Idle END ;
    END

One might say very loosely that one had refined the HL model to the LL model, but one could not attach any mathematical weight to such a statement. To see this it suffices to examine the refinement proof obligation in B notation:

$$INV_A \land INV_C \land \mathrm{trm}(opname_A) \Rightarrow \mathrm{trm}(opname_C) \land [opname_C]\,\lnot[opname_A]\,\lnot INV_C \tag{2.3}$$

In this the A and C subscripts indicate the more abstract and the more concrete of the models respectively, the INV clauses refer to the invariants at the two levels, and the trm clauses describe the termination conditions for the operation in question (given in the PRE clauses). The heart of the refinement PO is the ‘$[opname_C]\,\lnot[opname_A]\,\lnot INV_C$’ clause, which states that whenever the concrete operation is able to make a step, there is a step that the abstract operation is able to make such that the concrete invariant is reestablished. Now normally in B, the concrete invariant contains clauses that relate the abstract and concrete state variables, i.e. the retrieve relation. In our example, with malice aforethought, we omitted to do this, but we can easily repair the situation by explicitly defining the retrieve relation thus:

    RETRIEVES jcallState = callState ∧ jcurrChan = currChan    (2.4)

and rewriting the PO thus:

$$INV_A \land RET_{AC} \land INV_C \land \mathrm{trm}(opname_A) \Rightarrow \mathrm{trm}(opname_C) \land [opname_C]\,\lnot[opname_A]\,\lnot RET_{AC} \tag{2.5}$$

We can now examine the implications of this for various operations in our example. For the moment we will disregard the fact that the more concrete model features more operations than the abstract one. Consider the operation disconnect_outgoing. Its concrete trm predicate (bandSelected = TRUE ∧ jcallState = Busy) is stronger than its abstract one (callState = Busy), so the latter does not imply the former as required in the PO. One way round this is to notice that for the corresponding incoming operation, the trm predicates are both true, resolving one problem, and to notice that the ‘$[opname_C]\,\lnot[opname_A]\,\lnot RET_{AC}$’ structure demands that the concrete SELECT guard implies the abstract one (as can be derived from (2.2) and the remarks which follow it). Since the incoming operation’s guards are (essentially) just the outgoing operation’s preconditions, this succeeds. So we can model the situation desired by keeping to the incoming style, retaining a refinement, but we lose the distinction between the two kinds of operation. Consider now the operation call_outgoing. The strengthened precondition problem is just as evident here, but beyond that, in the concrete model, if the call fails to connect, the apparatus ends up in the Jam state, outside the reach of the retrieve relation. Changing preconditions to guards will not help here. No notion of refinement can cope with such a situation. A similar but more behavioural manifestation of the same phenomenon is apparent in the fadeout operation: if a communication is in progress and a fadeout event occurs, there is no way that a concrete execution sequence including this can be modelled by an abstract execution sequence, again because the concrete state ends up outside the reach of the retrieve relation. This would still be the case if we introduced a dummy abstract fadeout operation, specified by skip, to be ‘refined by’ the concrete one.
From the point of view of the abstract model, such an operation would be adding uninformative clutter, and more than anything would be signalling that the relationship between the ‘real’ abstract model and the more concrete one is certainly not a refinement. A case of ‘skip considered harmful’.

We can go further. Given the greater range of possible behaviours of the concrete call_outgoing operation compared to the abstract version, we would expect the user to be given more feedback, say in the form of some output. This would require a change in operation signature, viz. res ← call_outgoing ( num ) ≙ … . Changes of signature are not allowed in conventional refinement. And even if we enhanced the abstract operation with output to provide user feedback, a different set of output messages would be appropriate in the two cases, again forbidden in conventional refinement.

All of the foregoing is not to say that more generous interpretations of the concept of refinement have not allowed some of the phenomena mentioned above, one way or another. For example, change of I/O representation within refinement has been admitted in [6, 7, 8, 9], though this does not extend to mixing of I/O and state concerns as we have proposed. Using skip to bypass awkward points of refinement where a more informative account of things is impossible is of course a familiar trick. Likewise the device of introducing new operations at the concrete level which refine skip at the abstract level is familiar from action refinement [10, 11, 12]; the actual presence or absence of dummy operations at the abstract level then becomes a matter of mere notation. Such operations also play a part in superposition refinement [13, 14, 15], where additional concrete computations are introduced to control the progress of a more abstract computation. Our introduction of the bandSelected variable and its influence on the abstract computation can be seen as being in the flavour of a superposition, though it cannot be a superposition in any strict sense, since the abstract computation is interfered with at the more concrete level by the consequences of jamming. The jamming phenomenon itself, and the way it relates to ‘normal system behaviour’, bears comparison with work on the difficulties of refining abstract variables which take values in ideal and typically infinite domains to concrete variables taking values in strictly finite ones. From a wide range of approaches to the latter question we can mention [16, 17, 18, 19, 20].
Despite these efforts it is fair to say that there are nevertheless drawbacks in trying to restrict oneself to pure refinement as the only way of going from an abstract to a concrete model, especially if the objective is to start from a simplified but transparent description, moving to a realistic but more complex description only by degrees. And size matters. Clearly there is little to be gained by presenting as small an example as the one above in this gradual manner, but one can imagine that in industrial scale situations, where there is much more complexity to manage, a multi stage development of a large specification is highly desirable, particularly if the size of the real specification is not dramatically smaller than that of the code that implements it, which can often happen when there is a lot of low level case analysis in the system description.
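The failure described in this section for call_outgoing can be replayed mechanically on a small finite model. The sketch below is an illustration only: it assumes a two-element stand-in for CHANELS, encodes states as tuples, and checks just the step-matching part of the refinement PO under the retrieve relation (2.4), not the trm part. All function names are hypothetical.

```python
from itertools import product

CHANNELS = (0, 1)  # assumption: a finite stand-in for the set CHANELS
ABS_STATES = list(product(("Idle", "Busy"), CHANNELS))
CONC_STATES = list(product(("Idle", "Busy", "Jam"), CHANNELS, (False, True)))


def hl_call_outgoing(state, num):
    """After-states of the abstract call_outgoing, or None outside its PRE."""
    call_state, _chan = state
    if call_state != "Idle" or num not in CHANNELS:
        return None
    return {("Busy", num), state}                 # CHOICE connect OR skip


def ll_call_outgoing(state, num):
    """After-states of the concrete call_outgoing, or None outside its PRE."""
    call_state, chan, band = state
    if not band or call_state != "Idle" or num not in CHANNELS:
        return None
    return {("Busy", num, band), state, ("Jam", chan, band)}


def retrieves(abs_state, conc_state):
    """Retrieve relation (2.4): call state and channel agree."""
    return abs_state[0] == conc_state[0] and abs_state[1] == conc_state[1]


def unmatched_steps():
    """Concrete steps from retrieve-related states that no abstract step can match."""
    failures = []
    for u, v, num in product(ABS_STATES, CONC_STATES, CHANNELS):
        if not retrieves(u, v):
            continue
        abs_after = hl_call_outgoing(u, num)
        conc_after = ll_call_outgoing(v, num)
        if abs_after is None or conc_after is None:
            continue
        for v2 in conc_after:
            if not any(retrieves(u2, v2) for u2 in abs_after):
                failures.append((u, num, v, v2))
    return failures


failures = unmatched_steps()
print(len(failures), {v2[0] for (_, _, _, v2) in failures})
```

Every unmatched step ends in the Jam state, which is the concrete state that lies outside the reach of the retrieve relation.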

3 Retrenchment

In [1], we introduced retrenchment as a means of addressing the issues just highlighted, within a ‘refinement-like’ framework. In the context of B, a syntax very similar to that of the B REFINEMENT construct captures what is required, namely the following:

    MACHINE M(a)                    MACHINE N(b)
                                    RETRENCHES M
    VARIABLES u                     VARIABLES v
    INVARIANT I(u)                  INVARIANT J(v)
                                    RETRIEVES G(u,v)
    INITIALISATION X(u)             INITIALISATION Y(v)
    OPERATIONS                      OPERATIONS
      o ← opname ( i ) ≙              p ← opname ( j ) ≙
        S(u,i,o)                        BEGIN
    END                                   T(v,j,p)
                                          [ LVAR A ]
                                          WITHIN P(i,j,u,v,A)
                                          CONCEDES C(u,v,o,p,A)
                                        END
                                    END                          (3.1)

The left hand MACHINE M(a), with operations o ← opname(i), each with body S(u, i, o), is retrenched (via the RETRENCHES M clause and retrieve relation RETRIEVES G(u, v)) to MACHINE N(b), with operations p ← opname(j). These latter operations now have bodies which are ramified generalised substitutions, that is to say generalised substitutions T(v, j, p), each with its ramification, the LVAR, WITHIN and CONCEDES clauses. Each opname of M must appear ramified within N, but there can also be additional operations in N.

Speaking informally, the ramification of an operation allows us to describe how the concrete operation fails to refine its abstract counterpart. The optional LVAR A clause permits the introduction of logical variables A, which remember before-values of variables and inputs, should they be needed later. Its scope is the WITHIN and CONCEDES clauses. The WITHIN clause describes nontrivial relationships between abstract and concrete before-values of the state variables u and v, and abstract and concrete inputs i and j, and defines A if A is being used. It strengthens the precondition, as we will see. The CONCEDES clause provides similar flexibility for the after-state, weakening the postcondition: it describes nontrivial relationships between abstract and concrete variables and abstract and concrete outputs, using A if it has previously been defined.

The proof obligations make all this more precise. The conventional POs for the machines M and N hold, including the initialisation POs, $[X(u)]\,I(u)$ and $[Y(v)]\,J(v)$, and the machine invariant preservation POs, $I(u) \land \mathrm{trm}(S(u, i, o)) \Rightarrow [S(u, i, o)]\,I(u)$ and $J(v) \land \mathrm{trm}(T(v, j, p)) \Rightarrow [T(v, j, p)]\,J(v)$. A joint initialisation PO is also required to hold, being identical to the refinement case, $[Y(v)]\,\lnot[X(u)]\,\lnot G(u, v)$. Of most interest however is the retrenchment PO for operations, which reads:

$$(I(u) \land G(u, v) \land J(v)) \land (\mathrm{trm}(T(v, j, p)) \land P(i, j, u, v, A)) \Rightarrow \mathrm{trm}(S(u, i, o)) \land [T(v, j, p)]\,\lnot[S(u, i, o)]\,\lnot(G(u, v) \lor C(u, v, o, p, A)) \tag{3.2}$$

This contains on the left hand side the invariants $(I(u) \land G(u, v) \land J(v))$, and we strengthen the concrete trm predicate with $P(i, j, u, v, A)$, as stated above. The right hand side infers the abstract trm predicate, and the ‘$[T(v, j, p)]\,\lnot[S(u, i, o)]\,\lnot$’ structure establishes, in the after-states, the retrieve relation weakened by $C(u, v, o, p, A)$. A detailed heuristic discussion in [1] justified the shape of this PO. A major part of the purpose of this paper is to show how simulation properties support this choice.

We can now show how the low level machine Mobile_Radio_LL of the last section retrenches Mobile_Radio_HL. To conform to the above syntax we have to add the retrenchment declaration ‘RETRENCHES Mobile_Radio_HL’, and of course the retrieves clause (2.4), but the main thing is the ramifications of the operations. We consider these in turn. The call_outgoing operation can be dealt with as follows:

    call_outgoing ( num ) ≙
      BEGIN
        PRE bandSelected = TRUE ∧ jcallState = Idle ∧ num ∈ CHANELS
        THEN CHOICE jcallState := Busy || jcurrChan := num OR skip OR jcallState := Jam END
        END
      LVAR jCH
      WITHIN jCH = jcurrChan
      CONCEDES jcurrChan = jCH ∧ jcallState = Jam
      END

Note that the WITHIN clause serves only to define jCH. This is because the stronger concrete trm predicate appears as hypothesis in the operation PO, and the abstract one is deduced. Of course we could strengthen the WITHIN clause with ‘bandSelected = TRUE’ to highlight the differences between the trm predicates if we wished. Note also how the CONCEDES clause captures what happens when the retrieve relation breaks down. Here it is important to realise that the occurrence of the jcurrChan state variable in the WITHIN clause refers to its before-value, while the occurrence in the CONCEDES clause refers to its after-value; moreover jCH, being a fresh logical variable, remembers the former for use in the context of the latter.

The corresponding call_incoming operation needs only the trivial ramification WITHIN true CONCEDES false. This is because the trm predicates are the same for this case, and the ‘$[T(v, j, p)]\,\lnot[S(u, i, o)]\,\lnot$’ structure of the operation PO requires the concrete guard to be stronger than the abstract guard (the same structure as found in refinement). The remaining operations that need ramifying, disconnect_outgoing and disconnect_incoming, can also be given the trivial ramification. As before, this is because in retrenchment both the guard and the termination predicate of the concrete version of an operation are required to be stronger than the abstract ones (pending qualification by the WITHIN clause). In this sense retrenchment has no preference between the ‘called operation’ and ‘spontaneous event’ views of an operation.
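To see the ramification of call_outgoing at work, the retrenchment PO (3.2) can be checked exhaustively on a small finite model. The following is a minimal sketch under stated assumptions: CHANELS is replaced by a two-element set, the input num is taken to be shared by both levels (as the common parameter name suggests), and the type invariants hold by construction of the state sets. The function names are hypothetical and not part of any B tool.

```python
from itertools import product

CHANNELS = (0, 1)  # assumption: a finite stand-in for the set CHANELS
ABS_STATES = list(product(("Idle", "Busy"), CHANNELS))
CONC_STATES = list(product(("Idle", "Busy", "Jam"), CHANNELS, (False, True)))


def abs_trm(u, num):
    """Abstract PRE: callState = Idle and num is a channel."""
    return u[0] == "Idle" and num in CHANNELS


def abs_steps(u, num):
    """Abstract after-states: connect OR skip."""
    return {("Busy", num), u}


def conc_trm(v, num):
    """Concrete PRE: band selected, jcallState = Idle and num is a channel."""
    return v[2] and v[0] == "Idle" and num in CHANNELS


def conc_steps(v, num):
    """Concrete after-states: connect OR skip OR jam."""
    return {("Busy", num, v[2]), v, ("Jam", v[1], v[2])}


def retrieves(u, v):
    """Retrieve relation (2.4)."""
    return u[0] == v[0] and u[1] == v[1]


def concedes(v_after, jch):
    """CONCEDES jcurrChan = jCH and jcallState = Jam, on the concrete after-state."""
    return v_after[1] == jch and v_after[0] == "Jam"


def retrenchment_po_holds(concession=concedes):
    """Check (3.2) for call_outgoing over all retrieve-related state pairs."""
    for u, v, num in product(ABS_STATES, CONC_STATES, CHANNELS):
        if not (retrieves(u, v) and conc_trm(v, num)):
            continue
        jch = v[1]                      # WITHIN jCH = jcurrChan (before-value)
        if not abs_trm(u, num):
            return False
        for v2 in conc_steps(v, num):
            if not any(retrieves(u2, v2) or concession(v2, jch)
                       for u2 in abs_steps(u, num)):
                return False
    return True


print(retrenchment_po_holds())                      # with the CONCEDES clause
print(retrenchment_po_holds(lambda v2, jch: False)) # with CONCEDES false
```

Replacing the concession by false recovers the refinement-style check, which fails on exactly the jamming step.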

4 Stepwise Simulation

The semantic touchstone for retrenchment is stepwise simulation, by which we mean the simulation of a sequence of steps of the simulatee T by an equal length sequence of steps of the simulator S, see Fig. 1. However the precise definition of ‘simulates’ in this context will depend on the precise relationship between the two systems under study, so we will not give a formal definition here. Suffice for the moment to say that ‘simulates’ always includes the preservation of the retrieve relation, since there will always be one around.

[Figure not reproduced; only the labels S and T of the two sequences survive.]

Fig. 1. A stepwise simulation.

We write a step of a machine such as M of (3.1) in the form $u \xrightarrow{(i, m, o)} u'$, where u and u′ are the before and after states, m is the name of the operation (where it can help, we write S, the body of m, instead of m itself), and i and o are the input and output of m. This signifies that (u, i) satisfy trm(S), and that (u, i, u′, o) satisfy the before-after predicate of m (this is the predicate Q(u, u′) in the normal form (2.1) when there is no I/O, otherwise it is Q(u, i, u′, o), the corresponding predicate for the normal form with I/O present). When discussing properties of sequences of steps, last(T) will denote the index of the last state mentioned in T, and $r \in \mathrm{dom}^{\bullet}(T)$ will mean $r \in [0 \ldots \mathrm{last}(T) - 1]$ if T is finite, and $r \in NAT$ otherwise. Similarly for sequences of any type. In general we need to distinguish $Ops_M$, the operation names at the abstract level, from $Ops_N$, the operation names at the concrete level, where $Ops_M \subseteq Ops_N$.

5 Modulated Refinement and Simulation

In this section we explore in some depth a notion, modulated refinement, that lies part way between conventional refinement and retrenchment, in order to illuminate the relationship between them. Modulated refinement is superficially a straightforward extension of refinement to allow different I/O signatures at the two levels of abstraction in question. In this sense part of what appears here bears comparison with other adaptations of refinement to cope with change of I/O representation, such as [6, 7, 8, 9]. Here is a specific B syntax.

    MACHINE M(a)                    MACHINE N(b)
                                    MODREF M
    VARIABLES u                     VARIABLES v
    INVARIANT I(u)                  INVARIANT J(v)
                                    RETRIEVES G(u,v)
    INITIALISATION X(u)             INITIALISATION Y(v)
    OPERATIONS                      OPERATIONS
      o ← opname ( i ) ≙              p ← opname ( j ) ≙
        S(u,i,o)                        BEGIN
    END                                   T(v,j,p)
                                          WITHIN P(i,j,u,v)
                                          NEVERTHELESS V(u,v,o,p)
                                        END
                                    END                          (5.1)


Note the MODREF keyword, indicating the intended relationship, and, as in retrenchment, the separate appearance of the retrieve relation. Because modulated refinement mirrors conventional refinement, we demand that M and N have the same set of operation names Ops. In N, each operation has a WITHIN clause, but this time there are no logical variables, so the clause simply expresses the relationship between the before-states and the inputs, as an enhancement to the retrieve relation. Likewise there is a NEVERTHELESS clause, expressing the relationship between the after-states and the outputs. Unlike the CONCEDES clause of retrenchment, the NEVERTHELESS clause will act conjunctively, also enhancing the retrieve relation, hence the different name. In fact the above syntax will serve for two slightly different notions of modulated refinement, normal and inverted, to be introduced shortly. We could introduce separate N-MODREF and I-MODREF keywords for these, but it is convenient not to do so. Until further notice we will study normal modulated refinement. Aside from the usual machine POs for M and N, the semantics of normal modulated refinement is captured by the following POs. Firstly for initialisation:

$$[Y(v)]\,\lnot[X(u)]\,\lnot\Big(G(u, v) \land \bigwedge_{m \in Ops} (\forall j_m\, \exists i_m \bullet P_m(i_m, j_m, u, v))\Big) \tag{5.2}$$

Next the PO for operations, which for a typical operation reads:

$$(I(u) \land G(u, v) \land J(v)) \land (\mathrm{trm}(S(u, i, o)) \land P(i, j, u, v)) \Rightarrow \mathrm{trm}(T(v, j, p)) \land [T(v, j, p)]\,\lnot[S(u, i, o)]\,\lnot(G(u, v) \land V(u, v, o, p)) \tag{5.3}$$

and lastly the operation compatibility PO, which for a typical operation n (where $n \in Ops$, and has clauses $P_n$ and $V_n$) reads:

$$G(u, v) \land V_n(u, v, o_n, p_n) \Rightarrow \bigwedge_{m \in Ops} (\forall j_m\, \exists i_m \bullet P_m(i_m, j_m, u, v)) \tag{5.4}$$

The role of the operation compatibility PO is to ensure that the result of one step cannot prevent any next step purely because of the relationship between abstract and concrete I/Os and states (similar remarks apply for (5.2)). Note the appearance in a conjunctive context of the NEVERTHELESS clauses in (5.3) and (5.4). The main reason for studying normal modulated refinement is that it possesses the natural analogue of the simulation property so characteristic of conventional refinement.

Definition 5.1 Let (5.1) be a (normal or inverted) modulated refinement. Suppose $T \equiv [\, v_0 \xrightarrow{(j_0, m_0, p_1)} v_1 \xrightarrow{(j_1, m_1, p_2)} v_2 \ldots \,]$ is a concrete execution sequence, and that $S \equiv [\, u_0 \xrightarrow{(i_0, m_0, o_1)} u_1 \xrightarrow{(i_1, m_1, o_2)} u_2 \ldots \,]$ is an abstract execution sequence. Then S is a stepwise simulation of T iff $G(u_0, v_0)$ holds, dom(T) = dom(S), and for all $r \in \mathrm{dom}^{\bullet}(T)$:

$$G(u_r, v_r) \land P_{m_r}(i_r, j_r, u_r, v_r) \land G(u_{r+1}, v_{r+1}) \land V_{m_r}(u_{r+1}, v_{r+1}, o_{r+1}, p_{r+1}) \tag{5.5}$$
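For finite sequences, Definition 5.1 amounts to a direct check of (5.5) at each index. The sketch below is illustrative only: the encoding of an execution sequence as a pair (states, steps), with each step a triple (input, operation name, output), is an assumption of this example, and the ‘doubling’ retrieve relation and the operation name inc are hypothetical.

```python
def is_stepwise_simulation(abs_trace, conc_trace, retrieves, within, nevertheless):
    """Check Definition 5.1 for finite traces given as (states, steps) pairs.

    within[m](i, j, u, v) plays the role of P_m and
    nevertheless[m](u, v, o, p) the role of V_m.
    """
    abs_states, abs_steps = abs_trace
    conc_states, conc_steps = conc_trace
    # dom(T) = dom(S): both sequences mention the same number of states and steps.
    if len(abs_states) != len(conc_states) or len(abs_steps) != len(conc_steps):
        return False
    if not retrieves(abs_states[0], conc_states[0]):
        return False
    for r, ((i, m, o), (j, n, p)) in enumerate(zip(abs_steps, conc_steps)):
        if m != n:                       # the same operation name at both levels
            return False
        holds = (retrieves(abs_states[r], conc_states[r])
                 and within[m](i, j, abs_states[r], conc_states[r])
                 and retrieves(abs_states[r + 1], conc_states[r + 1])
                 and nevertheless[m](abs_states[r + 1], conc_states[r + 1], o, p))
        if not holds:                    # (5.5) fails at index r
            return False
    return True


# Hypothetical example: the concrete level stores and communicates doubled values.
def doubles(u, v):
    return v == 2 * u


within = {"inc": lambda i, j, u, v: j == 2 * i}
nevertheless = {"inc": lambda u, v, o, p: p == 2 * o}

abs_trace = ([0, 1, 3], [(1, "inc", 1), (2, "inc", 3)])
conc_trace = ([0, 2, 6], [(2, "inc", 2), (4, "inc", 6)])
bad_trace = ([0, 2, 5], [(2, "inc", 2), (4, "inc", 5)])

print(is_stepwise_simulation(abs_trace, conc_trace, doubles, within, nevertheless))
print(is_stepwise_simulation(abs_trace, bad_trace, doubles, within, nevertheless))
```

The checker only validates a given pair of sequences; it does not verify that each step is a legal step of its machine, nor does it construct a simulating sequence, which is the subject of the theorems that follow.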

Definition 5.2 Let (5.1) be a (normal or inverted) modulated refinement. Suppose $T \equiv [\, v_0 \xrightarrow{(j_0, m_0, p_1)} v_1 \xrightarrow{(j_1, m_1, p_2)} v_2 \ldots \,]$ is a finite concrete execution sequence, with sequence of invoked operation names $ms \equiv [\, m_0, m_1 \ldots \,]$. We define the abstract trmP predicates and associated preP sets (with respect to T) thus, where $S_{m_r}$ is the body of operation $m_r$ in M:

$$\begin{aligned}
\mathit{trmP}_{M,r,r} &= \mathrm{true} \\
&\;\;\vdots \\
\mathit{trmP}_{M,r,s} &= P_{m_r}(i_r, j_r, u, v_r) \land [S_{m_r}]\,\mathit{trmP}_{M,r+1,s}
\end{aligned} \tag{5.6}$$

(where $r < s \in \mathrm{dom}(T)$), and for finite T with last(T) = size(ms) = z:

$$\mathit{preP}_{M,ms} = \{\, (u_0, i_0, i_1 \ldots i_{k-1}) \in U \times I_0 \times I_1 \ldots \times I_{k-1} \mid \mathit{trmP}_{M,0,z} \,\} \tag{5.7}$$

Note that these objects depend not only on abstract states and abstract inputs, but tacitly also on the concrete states and inputs (and sequence of operations ms) appearing in T. We can now prove the following.

Theorem 5.3 Let (5.1) describe a normal modulated refinement where the common set of operation names is Ops. Let $T \equiv [\, v_0 \xrightarrow{(j_0, m_0, p_1)} v_1 \xrightarrow{(j_1, m_1, p_2)} v_2 \ldots \,]$, with sequence of invoked operation names $ms \equiv [\, m_0, m_1 \ldots \,]$, be a finite execution sequence of N. Suppose there is a $(u_0, i_0 \ldots) \in \mathit{preP}_{M,ms}$ such that $u_0$ also witnesses the initialisation PO (5.2). Then there is an execution sequence of M, $S \equiv [\, u_0 \xrightarrow{(i_0, m_0, o_1)} u_1 \xrightarrow{(i_1, m_1, o_2)} u_2 \ldots \,]$, which is a stepwise simulation of T.

Proof. Let $T \equiv [\, v_0 \xrightarrow{(j_0, m_0, p_1)} v_1 \ldots \,]$ be as given. If dom(T) = {0}, then the hypothesised $u_0$ is all we need, as $G(u_0, v_0)$ holds. Otherwise we construct a corresponding $S \equiv [\, u_0 \xrightarrow{(i_0, m_0, o_1)} u_1 \ldots \,]$ by an induction on $\mathrm{dom}^{\bullet}(T)$. For r = 0, we know from the hypotheses that a $(u_0, i_0 \ldots) \in \mathit{preP}_{M,ms}$ exists such that $G(u_0, v_0)$ holds. Since $\mathit{trmP}_{M,0,size(ms)}(u_0, i_0 \ldots) \Rightarrow \mathrm{trm}(S_{m_0})(u_0, i_0)$, we have $\mathrm{trm}(S_{m_0})(u_0, i_0) \land P_{m_0}(i_0, j_0, u_0, v_0) \land G(u_0, v_0)$. From the initialisation POs for M and N we know that $I(u_0)$ and $J(v_0)$ hold. So we have the antecedents of the normal modulated refinement PO for operations, which from the step $v_0 \xrightarrow{(j_0, m_0, p_1)} v_1$ of T yields a step of M, $u_0 \xrightarrow{(i_0, m_0, o_1)} u_1$, such that $G(u_1, v_1) \land V_{m_0}(u_1, v_1, o_1, p_1)$ holds. So we have as required:

$$G(u_0, v_0) \land P_{m_0}(i_0, j_0, u_0, v_0) \land G(u_1, v_1) \land V_{m_0}(u_1, v_1, o_1, p_1)$$

Since $(u_0, i_0 \ldots) \in \mathit{preP}_{M,ms}$ we conclude that there is a $(u_1, i_1 \ldots) \in \mathit{preP}_{M,tail(ms)}$, and we also have the machine invariants $I(u_1)$ and $J(v_1)$. For the inductive step, suppose S has been constructed as far as the r’th step. Then the machine invariants $I(u_r)$ and $J(v_r)$ hold, and we also have (see footnote 1) $G(u_r, v_r) \land (u_r, i_r \ldots) \in \mathit{preP}_{M,ms \downarrow r}$. This enables us to perform the inductive step as above.

We cannot extend the above strategy to the case of infinite sequences T, as the predicate $\mathit{trmP}_{M,0,r}$ does not behave well as r grows without bound: not only does the u aspect of the predicate accumulate ‘at the front’ of the predicate, but we also have an unbounded number of input variables to contend with. These problems require extra hypotheses and a different strategy.

Definition 5.4 Let (5.1) be a (normal or inverted) modulated refinement. Suppose $T \equiv [\, v_0 \xrightarrow{(j_0, m_0, p_1)} v_1 \xrightarrow{(j_1, m_1, p_2)} v_2 \ldots \,]$ is a(n infinite) concrete execution sequence. Let

$$N^{UI}_r = \big|\, \{\, (u, i) \in U \times I_r \mid G(u, v_r) \land \mathrm{trm}(S_{m_r})(u, i) \land P_{m_r}(i, j_r, u, v_r) \,\} \,\big|$$

We say that T is retrieve bounded iff:

$$\forall r \in \mathrm{dom}^{\bullet}(T) \bullet N^{UI}_r < \infty \tag{5.8}$$

Footnote 1. For a sequence ms, $ms \uparrow r$ is the first r elements of ms, and $ms \downarrow r$ is all except the first r elements of ms, where r refers specifically to cardinality rather than to absolute index values for dom(ms).

Note that retrieve boundedness is inspired by the same thought as internal continuity in [21] and finite invisible nondeterminism in [22].

Theorem 5.5 Let (5.1) describe a normal modulated refinement where the common set of operation names is Ops. Let $T \equiv [\, v_0 \xrightarrow{(j_0, m_0, p_1)} v_1 \xrightarrow{(j_1, m_1, p_2)} v_2 \ldots \,]$, with sequence of invoked operation names $ms \equiv [\, m_0, m_1 \ldots \,]$, be an infinite execution sequence of N. Suppose T is retrieve bounded. Suppose moreover, for each r, that there is a $(u_0, i_0 \ldots) \in \mathit{preP}_{M,ms \uparrow r}$ such that $u_0$ also witnesses the initialisation PO (5.2). Then there is an execution sequence of M, $S \equiv [\, u_0 \xrightarrow{(i_0, m_0, o_1)} u_1 \xrightarrow{(i_1, m_1, o_2)} u_2 \ldots \,]$, which is a stepwise simulation of T.

Proof. Let $T \equiv [\, v_0 \xrightarrow{(j_0, m_0, p_1)} v_1 \ldots ]$ be as given. We show there is a corresponding $S \equiv [\, u_0 \xrightarrow{(i_0, m_0, o_1)} u_1 \ldots ]$ as follows. We know that each finite prefix of $T$ can be stepwise simulated because of Theorem 5.3. These finite simulations can be arranged into a tree thus: the root is a special node at level $-1$; the nodes at level $r$ are $(u, i)$ pairs such that $G(u, v_r) \land \mathrm{trm}(S_{m_r})(u, i) \land P_{m_r}(i, j_r, u, v_r)$ holds; and there is an edge of the tree from $(u_r, i_r)$ at level $r$ to $(u_{r+1}, i_{r+1})$ at level $r+1$ iff there is a simulation of a finite prefix of $T$ with $u_r \xrightarrow{(i_r, m_r, o_{r+1})} u_{r+1}$ as final step; also there are edges from the root to all level 0 nodes. Because there are infinitely many finite simulations the tree is infinite, and by retrieve boundedness each of its levels is finite. By König's Lemma, the tree has an infinite branch, which corresponds to a stepwise simulation $S$ of $T$.

Now we introduce inverted modulated refinement. The only difference compared to normal modulated refinement is that instead of (5.3), the operation PO for inverted modulated refinement reads:

$$(I(u) \land G(u, v) \land J(v)) \land (\mathrm{trm}(T(v, j, p)) \land P(i, j, u, v)) \Rightarrow \mathrm{trm}(S(u, i, o)) \land [\,T(v, j, p)\,]\,\neg[\,S(u, i, o)\,]\,\neg(G(u, v) \land V(u, v, o, p)) \qquad (5.9)$$

Note the inverted roles of the trm predicates.

Theorem 5.6 Let (5.1) describe an inverted modulated refinement where the common set of operation names is $Ops$. Let $T \equiv [\, v_0 \xrightarrow{(j_0, m_0, p_1)} v_1 \xrightarrow{(j_1, m_1, p_2)} v_2 \ldots ]$ be an execution sequence of $N$. Then there is an execution sequence of $M$, $S \equiv [\, u_0 \xrightarrow{(i_0, m_0, o_1)} u_1 \xrightarrow{(i_1, m_1, o_2)} u_2 \ldots ]$, which is a stepwise simulation of $T$.

Proof. Let $T \equiv [\, v_0 \xrightarrow{(j_0, m_0, p_1)} v_1 \ldots ]$ be an execution sequence of $N$. The $\mathrm{dom}(T) = \{0\}$ case is as in Theorem 5.3. Otherwise we go by induction on $\mathrm{dom}^{\bullet}(T)$.

For $r = 0$, we know that for the given $v_0$ and $j_0$ from $T$, (5.2) holds. So for the $m_0$ from $T$ we can find an $i_0$ such that $G(u_0, v_0) \land P_{m_0}(i_0, j_0, u_0, v_0)$ holds. By the definition of execution step, $\mathrm{trm}(T_{m_0})(v_0, j_0)$ holds. Now the initialisation POs for $M$ and $N$ yield $I(u_0)$ and $J(v_0)$. Thus the operation PO (5.9) yields a step of $M$, $u_0 \xrightarrow{(i_0, m_0, o_1)} u_1$, such that $G(u_1, v_1) \land V_{m_0}(u_1, v_1, o_1, p_1)$ holds. So:

$$G(u_0, v_0) \land P_{m_0}(i_0, j_0, u_0, v_0) \land G(u_1, v_1) \land V_{m_0}(u_1, v_1, o_1, p_1)$$

For the inductive step, suppose $S$ has been constructed as far as the $r$'th step. Then we have $I(u_r)$, $J(v_r)$, and $G(u_r, v_r) \land V_{m_{r-1}}(u_r, v_r, o_r, p_r) \land \mathrm{trm}(T_{m_r})(v_r, j_r)$. From the operation compatibility PO (5.4), we can infer that for the $j_r$ and $m_r$ given by $T$, we can find an $i_r$ such that $P_{m_r}(i_r, j_r, u_r, v_r)$ holds. This is enough to complete the inductive step as before.

We can see clearly why we need different termination assumptions in Theorems 5.3 and 5.5 on the one hand, and Theorem 5.6 on the other. In the latter, we need a termination assumption about the steps of $T$, a given execution sequence, to be able to exploit the operation PO. Since the individual concrete steps are each within their individual trm predicates, what we have is already sufficient. In the former, we need a termination assumption about the steps of $S$, an execution sequence yet to be constructed. Therefore stronger assumptions are needed before the relevant operation PO can be used. Theorem 5.6 can be understood as establishing a strong simulation in the automata theoretic sense, an intrinsically more local concept than the property in Theorem 5.3.

Definition 5.7 Let $U$ and $V$ be sets (of states), and let $U_0 \subseteq U$ and $V_0 \subseteq V$ be subsets (of initial states). Let $L_U$ and $L_V$ be sets (of transition labels). Let $U_0$ and $T_U \subseteq U \times L_U \times U$ be a transition system on $U$, and $V_0$ and $T_V \subseteq V \times L_V \times V$ be a transition system on $V$. A pair of relations $(Q_S : U \leftrightarrow V,\ Q_L : L_U \leftrightarrow L_V)$ is called a strong simulation from $T_V$ to $T_U$ iff:

$$v_0 \in V_0 \Rightarrow \exists u_0 \in U_0 \bullet u_0 \mathrel{Q_S} v_0 \qquad (5.10)$$

and for all $u$, $v$:

$$u \mathrel{Q_S} v \land v \xrightarrow{m} v' \in T_V \Rightarrow \exists u' \in U,\ l \in L_U \bullet u' \mathrel{Q_S} v' \land l \mathrel{Q_L} m \land u \xrightarrow{l} u' \in T_U \qquad (5.11)$$
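For finite transition systems, conditions (5.10) and (5.11) can be checked by direct enumeration. The following is a minimal sketch under that finiteness assumption; the function name, the set-of-triples encoding of transitions, and the toy data are illustrative choices, not part of the paper.

```python
from itertools import product

def is_strong_simulation(QS, QL, U, LU, U0, TU, V0, TV):
    """Check (5.10) and (5.11) for finite transition systems.

    QS is a set of (u, v) state pairs and QL a set of (l, m) label pairs.
    TU and TV are sets of (source, label, target) triples.
    """
    # (5.10): every concrete initial state is related to some abstract initial state.
    init_ok = all(any((u0, v0) in QS for u0 in U0) for v0 in V0)
    # (5.11): every concrete step out of a related state is matched by an abstract step.
    step_ok = all(
        any((u2, v2) in QS and (l, m) in QL and (u, l, u2) in TU
            for u2, l in product(U, LU))
        for (u, v) in QS
        for (v1, m, v2) in TV
        if v1 == v
    )
    return init_ok and step_ok

# Hypothetical two-state systems: one abstract step "x" matches one concrete step "y".
U, LU, U0 = {"a0", "a1"}, {"x"}, {"a0"}
TU = {("a0", "x", "a1")}
V0 = {"c0"}
TV = {("c0", "y", "c1")}
QS = {("a0", "c0"), ("a1", "c1")}
QL = {("x", "y")}
print(is_strong_simulation(QS, QL, U, LU, U0, TU, V0, TV))
```

Removing the abstract transition makes the check fail, since the concrete step from `c0` then has no abstract counterpart.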

Note that Definition 5.7 differs slightly from the conventional notion of strong simulation insofar as initial states of $N$ are not required to relate via $Q_S$ solely to initial states of $M$. Evidently this is for easier comparison with the initialisations arising from refinements and retrenchments.

Definition 5.8 Let $M$ be a machine with operation names set $Ops$. Then the reachable transition system $T_M$ associated to $M$ is the initial state set $U_0$ and subset $T_M$ of $U \times L_M \times U$ where:

(1) $L_M = \{(i, m, o) \in I \times Ops \times O\}$
(2) $U^s = \{u \in U \mid$ there is an execution sequence of $M$ with $u$ as final state$\}$
(3) $U_0 = \{u_0 \in U^s \mid u_0$ is an initial state of $M\}$
(4) $T_M = \{u \xrightarrow{(i, m, o)} u' \mid$ there is an execution sequence of $M$ with $u \xrightarrow{(i, m, o)} u'$ as final step$\}$
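When the reachable part of a machine is finite, the sets in Definition 5.8 can be computed by a breadth-first search from the initial states. The sketch below assumes the machine is given as a hypothetical `step` function that enumerates labelled successors; the saturating counter used as an example is invented for illustration.

```python
from collections import deque

def reachable_system(initial_states, step):
    """Return (reachable states, reachable transitions) by breadth-first search.

    step(u) yields (label, successor) pairs, where label is an (i, m, o) triple.
    Terminates only if finitely many states are reachable.
    """
    states = set(initial_states)
    transitions = set()
    queue = deque(initial_states)
    while queue:
        u = queue.popleft()
        for label, u2 in step(u):
            transitions.add((u, label, u2))
            if u2 not in states:
                states.add(u2)
                queue.append(u2)
    return states, transitions

def counter_step(u):
    """Hypothetical machine: operation 'inc' adds input i in {1, 2}, never exceeding 3."""
    for i in (1, 2):
        if u + i <= 3:
            yield (i, "inc", u + i), u + i

states, transitions = reachable_system({0}, counter_step)
print(sorted(states))
print(len(transitions))
```

Each element of `transitions` is the final step of some execution sequence, which is the membership condition in clause (4).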

Theorem 5.9 Let (5.1) describe an inverted modulated refinement where the common set of operation names is $Ops$. Then there is a strong simulation from the reachable transition system $T_N$ of $N$ to the reachable transition system $T_M$ of $M$.

Proof. We define a strong simulation $(Q_S, Q_L)$ as follows:

$$Q_S = \{(u, v) \in U^s \times V^s \mid G(u, v) \land \bigwedge_{m \in Ops} (\forall j_m\, \exists i_m \bullet P_m(i_m, j_m, u, v))\} \qquad (5.12)$$

$$Q_L = \{((i, m, o), (j, m, p)) \in (I \times Ops \times O) \times (J \times Ops \times P) \mid (\exists u, v \bullet P_m(i, j, u, v)) \land (\exists u, v \bullet V_m(u, v, o, p))\} \qquad (5.13)$$


Showing that this constitutes a strong simulation is a matter of routinely reworking the details of the inductive step of the proof of Theorem 5.6.

We note that a strong simulation relates states to states, and transition labels to transition labels, essentially independently. In the case of Theorem 5.3, the properties demanded of abstract states at the beginning and end of a transition $u \xrightarrow{(i, m, o)} u'$, specifically that $(u, i \ldots) \in \mathrm{pre}^{P}_{M,\,m \rightarrow ms}$ and $(u' \ldots) \in \mathrm{pre}^{P}_{M,ms}$, depend on $m$ and $ms$; in particular a step of $M$ is not guaranteed to reestablish in the after-state the condition assumed in the before-state, a prerequisite for the successful formulation of separate relations $Q_S$ and $Q_L$. Thus only the inverted case supports a proper notion of simulation. In contrast, in normal refinement (modulated or not), the roles of 'simulator' and 'simulatee' are exquisitely confused; the abstract system says (via the trm predicate) when a step must exist and demands that the concrete system complies, while the concrete system says (via the step relation) how a step can be performed and demands that the abstract system simulates it.

The preceding remarks depend on identifying the machine state with the automata theoretic state. If we relax this requirement, then we can incorporate history information in our notion of state, and then a notion of strong simulation can be recovered for the normal case (see e.g. [21, 22, 23, 24, 25]), but this is a less natural correspondence than for the inverted case.

6 Simple Simulable Retrenchment

In this section we build on the insights of the preceding section to address the simulation and refinement properties of retrenchment. Now, the inclusion $Ops_M \subseteq Ops_N$ is generally a proper one. First we give the definition of stepwise simulation in this setting.

Definition 6.1 Let (3.1) be a retrenchment. Suppose that $T \equiv [\, v_0 \xrightarrow{(j_0, m_0, p_1)} v_1 \xrightarrow{(j_1, m_1, p_2)} v_2 \ldots ]$ is an execution sequence of $N$, and that $S \equiv [\, u_0 \xrightarrow{(i_0, m_0, o_1)} u_1 \xrightarrow{(i_1, m_1, o_2)} u_2 \ldots ]$ is an execution sequence of $M$, where $[\, m_0, m_1, \ldots ]$ is a sequence over $Ops_M$. Then $S$ is a stepwise simulation of $T$ iff $G(u_0, v_0)$ holds, and for all $r \in \mathrm{dom}^{\bullet}(T)$ there is an $A_r$ such that:

$$G(u_r, v_r) \land P_{m_r}(i_r, j_r, u_r, v_r, A_r) \land (G(u_{r+1}, v_{r+1}) \lor C_{m_r}(u_{r+1}, v_{r+1}, o_{r+1}, p_{r+1}, A_r)) \qquad (6.1)$$
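For finite execution sequences, condition (6.1) can be tested step by step. The sketch below is illustrative only: it assumes each step is encoded as a tuple, that the witness $A_r$ ranges over a finite candidate set `A_values`, and that both sequences contain at least one step. The unbounded and saturating counters are invented toy machines.

```python
def is_stepwise_simulation(S, T, G, P, C, A_values):
    """Check Definition 6.1 on finite, non-empty sequences of equal length.

    S holds abstract steps (u, i, m, o, u2); T holds concrete steps (v, j, m, p, v2).
    G(u, v), P[m](i, j, u, v, A) and C[m](u2, v2, o, p, A) return booleans.
    A_values is a finite set of candidate witnesses (an assumption of this sketch).
    """
    if not S or len(S) != len(T):
        return False
    if not G(S[0][0], T[0][0]):
        return False
    for (u, i, m, o, u2), (v, j, n, p, v2) in zip(S, T):
        if m != n:  # both sequences must invoke the same operation names
            return False
        if not any(
            G(u, v) and P[m](i, j, u, v, A)
            and (G(u2, v2) or C[m](u2, v2, o, p, A))
            for A in A_values
        ):
            return False
    return True

# Toy data: abstract counter is unbounded, concrete counter saturates at 2.
G = lambda u, v: u == v
P = {"inc": lambda i, j, u, v, A: True}
C = {"inc": lambda u2, v2, o, p, A: v2 == 2}   # concession: concrete value saturated
S3 = [(0, None, "inc", None, 1), (1, None, "inc", None, 2), (2, None, "inc", None, 3)]
T3 = [(0, None, "inc", None, 1), (1, None, "inc", None, 2), (2, None, "inc", None, 2)]
print(is_stepwise_simulation(S3, T3, G, P, C, [None]))
```

In the third step the retrieve relation is lost but the concession holds, so the step is accepted. A fourth step would start from states that no longer satisfy $G$, and the check would then fail.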

We now look at a simple sufficient condition for simulation and refinement. The properties of the relevant class of retrenchments are so strong that they are indeed almost refinements.

Definition 6.2 For a retrenchment like (3.1), suppose the joint initialisation establishes:

$$(G(u_0, v_0) \land \bigwedge_{m \in Ops_M} (\forall j_m\, \exists i_m, A_m \bullet P_m(i_m, j_m, u_0, v_0, A_m))) \qquad (6.2)$$

and suppose that each $Ops_M$ operation $n \equiv (T_n, A_n, P_n, C_n)$ of $N$ satisfies the operation compatibility PO:

$$G(u, v) \lor C_n(u, v, o, p, B) \Rightarrow (G(u, v) \land \bigwedge_{m \in Ops_M} (\forall j_m\, \exists i_m, A_m \bullet P_m(i_m, j_m, u, v, A_m))) \qquad (6.3)$$


then we say that the retrenchment is a simple simulable retrenchment.

The proof of the following theorem is very similar to that of Theorem 5.6 (only the final part of the inductive step is changed slightly), and is given in full in [26].

Theorem 6.3 Let (3.1) describe a simple simulable retrenchment where the set of abstract operation names is $Ops_M$. Let $T \equiv [\, v_0 \xrightarrow{(j_0, m_0, p_1)} v_1 \xrightarrow{(j_1, m_1, p_2)} v_2 \ldots ]$ be an execution sequence of $N$. Suppose that the sequence of invoked operation names $ms \equiv [\, m_0, m_1 \ldots ]$ is an $Ops_M$ sequence. Then there is an execution sequence of $M$, $S \equiv [\, u_0 \xrightarrow{(i_0, m_0, o_1)} u_1 \xrightarrow{(i_1, m_1, o_2)} u_2 \ldots ]$, which is a stepwise simulation of $T$.

Theorem 6.3 leads directly to a strong simulation result analogous to Theorem 5.9 for simple simulable retrenchment.

Definition 6.4 Let $M$ and $N$ be machines with operation name sets $Ops_M$ and $Ops_N$. The $M$-restricted reachable transition system of $N$ (whose states are solely those reachable by sequences of operations with names in $Ops_M$) is defined by setting:

(1) $L_{N_M} = \{(j, m, p) \in J \times Ops_M \times P\}$
(2) $V_M = \{v \in V \mid$ there is an execution sequence of $N$ consisting solely of $Ops_M$ operations, with $v$ as final state$\}$
(3) $V_{M0} = \{v_0 \in V_M \mid v_0$ is an initial state of $N\}$
(4) $T_{N_M} = \{v \xrightarrow{(j, m, p)} v' \mid$ there is an execution sequence of $N$ consisting solely of $Ops_M$ operations, with $v \xrightarrow{(j, m, p)} v'$ as final step$\}$

Theorem 6.5 Let (3.1) describe a simple simulable retrenchment where the set of abstract operation names is $Ops_M$. Then there is a strong simulation from the $M$-restricted reachable transition system $T_{N_M}$ of $N$ to the reachable transition system $T_M$ of $M$.

Proof. We define a strong simulation $(Q_S, Q_L)$ as follows, after which the details are relatively straightforward.

$$Q_S = \{(u, v) \in U \times V_M \mid G(u, v) \land \bigwedge_{m \in Ops_M} (\forall j_m\, \exists i_m, A_m \bullet P_m(i_m, j_m, u, v, A_m))\}$$

$$Q_L = \{((i, m, o), (j, m, p)) \in (I \times Ops_M \times O) \times (J \times Ops_M \times P) \mid (\exists A \bullet (\exists u, v \bullet P_m(i, j, u, v, A)) \land (\exists u, v \bullet C_m(u, v, o, p, A)))\}$$

Simple simulable retrenchment is sufficiently strong to yield a notion of modulated refinement. Here and in similar results below we disregard any operation of $N$ not in $Ops_M$ of course. We present a number of closely related results. For the first of these we give a full proof, the other cases following easily.

Theorem 6.6 Let (3.1) describe a simple simulable retrenchment where the set of abstract operation names is $Ops_M$. Then the following is a normal modulated refinement:

    MACHINE         $M^{\forall T}(a)$
    VARIABLES       $u$
    INVARIANT       $I(u)$
    INITIALISATION  $X(u)$
    OPERATIONS
        $o \longleftarrow OpName(i) \mathrel{\widehat{=}} S^{\forall T}(u, i, o)$
    END

    MACHINE         $N^{\exists A}(b)$
    MODREF          $M^{\forall T}$
    VARIABLES       $v$
    INVARIANT       $J(v)$
    RETRIEVES       $G(u, v)$
    INITIALISATION  $Y(v)$
    OPERATIONS
        $p \longleftarrow OpName(j) \mathrel{\widehat{=}}$
            BEGIN $T(v, j, p)$
            WITHIN $P^{\exists A}(i, j, u, v)$
            NEVERTHELESS true
            END
    END
                                                                    (6.4)

where:

$$S^{\forall T}(u, i, o) \equiv \mathrm{PRE}\ (\forall v, j \bullet I(u) \land G(u, v) \land J(v) \land \mathrm{trm}(S(u, i, o)) \Rightarrow \mathrm{trm}(T(v, j, p)))\ \mathrm{THEN}\ S(u, i, o)\ \mathrm{END} \qquad (6.5)$$

$$P^{\exists A}(i, j, u, v) \equiv (\exists A \bullet P(i, j, u, v, A)) \qquad (6.6)$$

Proof. Since $M^{\forall T}$ is a fresh machine, we must check its consistency. The initialisation PO is as for $M$. And the operation consistency PO, $I \land \mathrm{trm}(S^{\forall T}) \Rightarrow [\,S^{\forall T}\,]\,I$, follows since$^{2}$ for any $F$, $\mathrm{trm}(F|S) \Leftrightarrow F \land \mathrm{trm}(S)$, and $[\,F|S\,]\,I \Leftrightarrow F \land [\,S\,]\,I$. For the rest of this proof, let $F$ refer specifically to the PRE clause $(\forall v, j \bullet I(u) \ldots \mathrm{trm}(T(v, j, p)))$ introduced in (6.5) above.

We can now check that the POs of the simple simulable retrenchment imply those of the refinement. For the initialisation PO, reinterpreting the $\exists A$ in (6.2) yields (5.2) with $P_m^{\exists A}$ replacing the $P_m$. Likewise for the operation compatibility PO, weakening, and reinterpreting the $\exists A$ in (6.3), yields (5.4) with $P_m^{\exists A}$ again replacing the $P_m$.

For the operation PO we need to show that for all members of $Ops_M$:

$$(I(u) \land G(u, v) \land J(v)) \land (\mathrm{trm}(S^{\forall T}(u, i, o)) \land P^{\exists A}(i, j, u, v)) \Rightarrow \mathrm{trm}(T(v, j, p)) \land [\,T(v, j, p)\,]\,\neg[\,S^{\forall T}(u, i, o)\,]\,\neg G(u, v) \qquad (6.7)$$

knowing that (3.2) and (6.3) hold. So let us hypothesise the antecedents of (6.7). These contain the ingredients of an application of generalised modus ponens, from which we can infer $\mathrm{trm}(T(v, j, p))$, as required in the consequent of (6.7); also through foresight, we add $\mathrm{trm}(T(v, j, p))$ to the hypotheses. Now since we assume $P^{\exists A}(i, j, u, v)$, we can infer $P(i, j, u, v, A)$ for some $A$, and we add this $P(i, j, u, v, A)$ to the hypotheses. At this point the hypotheses contain the antecedents of (3.2), so we infer in particular that:

$$[\,T(v, j, p)\,]\,\neg[\,S(u, i, o)\,]\,\neg(G(u, v) \lor C(u, v, o, p, A))$$

Next we use the trm/prd version of the normal form for generalised substitutions given in [2] and elaborated to include I/O. Applying this to both $T$ and $S$ in (6.7), after a little manipulation we get:

$$\mathrm{trm}(T) \land (\forall \tilde{v} \tilde{p} \bullet \mathrm{prd}(T) \Rightarrow \neg(\mathrm{trm}(S) \land (\forall \tilde{u} \tilde{o} \bullet \mathrm{prd}(S) \Rightarrow \neg(\ldots))))$$

We lose nothing by disjoining $\neg F$ to the (outer) right conjunct since $F$ is one of our hypotheses. After some working we obtain:

$$\mathrm{trm}(T) \land (\forall \tilde{v} \tilde{p} \bullet \mathrm{prd}(T) \Rightarrow \neg(\mathrm{trm}(F|S) \land (\forall \tilde{u} \tilde{o} \bullet \mathrm{prd}(F|S) \Rightarrow \neg(\ldots))))$$

Winding back the normal forms gives:

$$[\,T(v, j, p)\,]\,\neg[\,S^{\forall T}(u, i, o)\,]\,\neg(G(u, v) \lor C(u, v, o, p, A))$$

We use (6.3) on the $(G \lor C)$ term, and monotonicity, after which some simplification yields:

$$[\,T(v, j, p)\,]\,\neg[\,S^{\forall T}(u, i, o)\,]\,\neg G(u, v)$$

2. $F|S$ is the substitution $S$ preconditioned by $F$, so $F|S$ is also PRE $F$ THEN $S$ END.

Now it remains to apply the deduction principle, to arrive at (6.7). We are done.

The next two results follow from the preceding, the first by noticing that in the refinement operation PO, $\mathrm{trm}(T)$ may be hypothesised directly, and the second by noticing that the termination part of the PO becomes trivial.

Theorem 6.7 Let (3.1) describe a simple simulable retrenchment where the set of abstract operation names is $Ops_M$. Then (6.4), with the occurrences of $M^{\forall T}$ and $S^{\forall T}$ replaced by $M^{T}$ and $S^{T}$ respectively, where $S^{T}$ is the generalised substitution:

$$S^{T}(u, i, o) \equiv \mathrm{PRE}\ \mathrm{trm}(T(v, j, p))\ \mathrm{THEN}\ S(u, i, o)\ \mathrm{END} \qquad (6.8)$$

and (6.6), describe a normal modulated refinement.

Theorem 6.8 Let (3.1) describe a simple simulable retrenchment where the set of abstract operation names is $Ops_M$. Then (6.4), with the occurrences of $M^{\forall T}$ and $S^{\forall T}$ replaced by $M$ and $S$ respectively, and (6.6), describe an inverted modulated refinement.

It is interesting to compare the three refinement results of Theorems 6.6, 6.7, 6.8, particularly with respect to the variables that occur in the various components. One thing we would like to do is to consider the various machines $M^{\forall T}$, $M^{T}$, $M$, that occur in these theorems as versions of the $M$ that occurs in (3.1), since we would like the theorems to be saying something about the original simple simulable retrenchment (3.1). Clearly there is no problem in such an identification for Theorem 6.8, since the two versions of $M$ are syntactically identical. However in the case of Theorems 6.6 and 6.7, there is more to be said.

For Theorem 6.6, machines $M^{\forall T}$ and $M$ involve the same free variables. Nevertheless one cannot straightforwardly say that the two machines exist in the same universe, for the concrete variables $v$ and $j$ are mentioned in the precondition of $S^{\forall T}$, even though they are universally quantified away. What discloses the dependence of machine $M^{\forall T}$ on these concrete entities is the free occurrence of the relational (meta-)variables $G$, $J$, $\mathrm{trm}(T)$. These reveal that even though the concrete variables that they depend on are quantified away, whatever their identity (as concrete variables), they still take values in their own appropriate concrete universes. Thus one cannot regard $M^{\forall T}$ as a version of the $M$ of (3.1) without taking this amplification of the universe of variable values into account. Of course this is not a novel phenomenon since it is familiar already from standard B refinement. (See [2] Ch. 11, where a refinement of an abstract machine, though free only in the concrete variables, is nevertheless viewed as a 'differential' added to the abstract machine, this being expressed via the existential quantification of the abstract variables in the concrete construct).


The issue we have mentioned strikes us even more forcefully in the case of Theorem 6.7, where the concrete variables actually occur free in the machine $M^{T}$ via (6.8). Here we cannot pretend that $M^{T}$ exists in the same world as $M$; the applicability of an abstract operation of $M^{T}$ depends on the values of concrete variables that we wish to relate to the abstract variables at the point of application. Still, the discussion above persuades us that the difference between the cases of Theorems 6.6 and 6.7 is not as great as it might at first appear to be.

7 The Mobile Radio Example Revisited

As given, our mobile radio example does not in fact provide us with a simple simulable retrenchment. The problem lies with the call_outgoing operation, because it can violate the retrieve relation, as revealed in Section 3, where the possibility jcallState = Jam admitted in the CONCEDES clause conflicts with the retrieve relation clause jcallState = callState. However it is not hard to prove that we would have had a simple simulable retrenchment if we had substituted the call_outgoing operation in Mobile_Radio_LL with the following more robust version in which the Jam state is not a possible outcome:

    call_outgoing ( num ) ≙
        PRE bandSelected = TRUE ∧ jcallState = Idle ∧ num ∈ CHANELS
        THEN
            CHOICE jcallState := Busy || jcurrChan := num
            OR skip
            END
        END

This version of the operation requires only the trivial ramification to remain within the retrenchment. Note that even though the resulting system cannot violate the retrieve relation via any of the retrenched operations, such violation can still take place via the fadeout event which leads to jcallState = Jam. Thus even the simple simulable case of retrenchment expresses things that e.g. superposition refinement cannot.

Of course in reality, the situation described in Mobile_Radio_LL will be the more typical one. Operations will have the capacity to yield results which either obey the retrieve relation or not. Those calls of the operation that remain within the retrieve relation will be simulable, the others not. A theoretically weaker framework will be able to distinguish between these cases, to derive stepwise simulations of the well behaved execution sequences. However the price paid for the presence of the others is the absence of results like Theorems 6.6, 6.7, 6.8 which involve an implicit quantification over all possible cases. The details are beyond the scope of this paper.
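The claim that Jam is not a possible outcome of the robust call_outgoing can be illustrated by modelling the operation as a function from a before-state to the set of possible after-states. This is only a sketch: the tuple encoding of the state, the contents of the channel set, and the use of `None` for "outside the precondition" are assumptions made for the example, not details taken from Mobile_Radio_LL.

```python
# Hypothetical channel set; the paper does not list its elements.
CHANELS = {1, 2, 3}

def call_outgoing(state, num):
    """Return the set of possible after-states, or None outside the precondition.

    state is a (bandSelected, jcallState, jcurrChan) tuple.
    """
    band_selected, jcall_state, jcurr_chan = state
    if not (band_selected and jcall_state == "Idle" and num in CHANELS):
        return None
    connected = (band_selected, "Busy", num)  # jcallState := Busy || jcurrChan := num
    return {connected, state}                 # CHOICE ... OR skip

outcomes = call_outgoing((True, "Idle", 0), 2)
print(all(after[1] != "Jam" for after in outcomes))
```

Every after-state has jcallState equal to Busy or Idle, which matches the remark that only an event such as fadeout can lead to Jam in the modified system.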

8 Conclusions

In this paper we started by looking at the passage from a relatively simple description of the key elements of a system, to a more comprehensive and therefore more cluttered picture, encompassing much detail that 'could at the start be left till later'. This kind of piecemeal buildup is typical of what goes on in realistic system design in its initial and preformal stages. Leaving details till later is usually not a symptom of laziness, but a pragmatic response to the task of understanding the complexity of a large system. In discussing our example we debated the extent to which existing elaborations of refinement were capable of giving an account of this process and we concluded that none of the existing ones fully covered what was needed. Our retrenchment proposal allowed the inclusion of the desired detail in a flexible framework that also incorporated the most pertinent aspects of existing techniques.

One benefit of retrenchment is that it allows the controlled denial of previously stated abstract properties. This is useful since a system structuring strategy that totally forbids such denial can force the system structure into a form that appears upside down when compared with normal engineering intuition. Consider a simple example: if one were not allowed to contradict the unrealistically unbounded nature of Peano natural numbers, then strictly speaking, the properties of finite arithmetic ought to come out as the top level component of almost any design that requires calculations.

Our main concern in this paper was to explore some of the semantic properties of retrenchment, focusing on simulation. Our approach bears comparison with similar work e.g. [27, 28]. In our case we introduced modulated refinement as a notion intermediate between conventional refinement and retrenchment. A key observation was that in the inverted version of this concept, we could relate sequence-oriented simulation and automata theoretic strong simulation whilst making a natural identification of the notions of state involved, a situation that fails for conventional refinement. We then applied these insights to a simple special case of retrenchment, the simple simulable case. We consider the simulation theoretic properties exhibited by this special case (and others, based on weaker assumptions, whose treatment is beyond the scope of this paper) as ample retrospective reinforcement of the semantics of retrenchment given in the proof obligations proposed in [1] on purely heuristic grounds.
Essentially, when models are connected only weakly as in retrenchment, a more unidirectionally oriented relationship between them is more informative than the more intimately interdependent one expressed through (normal) refinement. Even so, for the simple simulable special case, we were able to recover a notion of refinement, albeit at the price of some frame issues.

References

1. Banach R., Poppleton M. Retrenchment: An Engineering Variation on Refinement. in: Proc. B-98, Bert (ed.), Springer, 1998, 129-147, LNCS 1393. See also: UMCS Technical Report UMCS-99-3-2, http://www.cs.man.ac.uk/cstechrep
2. Abrial J. R. The B-Book. Cambridge University Press, 1996.
3. Wordsworth J. B. Software Engineering with B. Addison-Wesley, 1996.
4. Lano K., Haughton H. Specification in B: An Introduction Using the B-Toolkit. Imperial College Press, 1996.
5. Sekerinski E., Sere K. Program Development by Refinement: Case Studies Using the B Method. Springer, 1998.
6. Hayes I. J., Sanders J. W. Specification by Interface Separation. Form. Asp. Comp. 7, 430-439, 1995.
7. Mikhajlova A., Sekerinski E. Class Refinement and Interface Refinement in Object-Oriented Programs. in: Proc. FME-97, Fitzgerald, Jones, Lucas (eds.), Springer, 1997, 82-101, LNCS 1313.
8. Boiten E., Derrick J. IO-Refinement in Z. in: Proc. Third BCS-FACS Northern Formal Methods Workshop. Ilkley, U.K., BCS, 1998, http://www.ewic.org.uk/ewic/workshop/view.cfm/NFM-98
9. Stepney S., Cooper D., Woodcock J. More Powerful Z Data Refinement: Pushing the State of the Art in Industrial Refinement. in: Proc. ZUM-98, Bowen, Fett, Hinchey (eds.), Springer, 1998, 284-307, LNCS 1493.
10. Back R. J. R., Kurki-Suonio R. Decentralisation of Process Nets with Centralised Control. in: Proc. 2nd ACM SIGACT-SIGOPS Symp. on Princ. Dist. Comp., 131-142, ACM, 1983.
11. Back R. J. R. Refinement Calculus Part II: Parallel and Reactive Systems. in: Proc. REX Workshop, Stepwise Refinement of Distributed Systems, de Roever, Rozenberg (eds.), Springer, 1989, 67-93, LNCS 430.
12. Back R. J. R., von Wright J. Trace Refinement of Action Systems. in: Proc. CONCUR-94, Jonsson, Parrow (eds.), Springer, 1994, 367-384, LNCS 836.
13. Francez N., Forman I. R. Superimposition for Interactive Processes. in: Proc. CONCUR-90, Baeten, Klop (eds.), Springer, 1990, 230-245, LNCS 458.
14. Katz S. A Superimposition Control Construct for Distributed Systems. ACM Trans. Prog. Lang. Sys. 15, 337-356, 1993.
15. Back R. J. R., Sere K. Superposition Refinement of Reactive Systems. Form. Asp. Comp. 8, 324-346, 1996.
16. Blikle A. The Clean Termination of Iterative Programs. Acta Inf. 16, 199-217, 1981.
17. Coleman D., Hughes J. W. The Clean Termination of Pascal Programs. Acta Inf. 11, 195-210, 1979.
18. Neilson D. S. From Z to C: Illustration of a Rigorous Development Method. PhD. Thesis, Oxford University Computing Laboratory Programming Research Group, Technical Monograph PRG-101, 1990.
19. Owe O. An Approach to Program Reasoning Based on a First Order Logic for Partial Functions. University of Oslo Institute of Informatics Research Report No. 89. ISBN 82-90230-88-5, 1985.
20. Owe O. Partial Logics Reconsidered: A Conservative Approach. Form. Asp. Comp. 3, 116, 1993.
21. Jonsson B. Simulations between Specifications of Distributed Systems. in: Proc. CONCUR-91, Baeten, Groote (eds.), Springer, 1991, 346-360, LNCS 527.
22. Abadi M., Lamport L. The Existence of Refinement Mappings. Theor. Comp. Sci. 82, 253-284, 1991.
23. Jonsson B. On Decomposing and Refining Specifications of Distributed Systems. in: Proc. REX Workshop, Stepwise Refinement of Distributed Systems, de Roever, Rozenberg (eds.), Springer, 1989, 361-385, LNCS 430.
24. Lynch N. Multivalued Possibilities Mappings. in: Proc. REX Workshop, Stepwise Refinement of Distributed Systems, de Roever, Rozenberg (eds.), Springer, 1989, 519-543, LNCS 430.
25. Merritt M. Completeness Theorems for Automata. in: Proc. REX Workshop, Stepwise Refinement of Distributed Systems, de Roever, Rozenberg (eds.), Springer, 1989, 544-560, LNCS 430.
26. Banach R., Poppleton M. Retrenchment and Punctured Simulation. in: Proc. IFM-99, Taguchi, Galloway (eds.), 457-476, Springer, 1999.
27. Derrick J., Bowman H., Boiten E., Steen M. Comparing LOTOS and Z Refinement Relations. in: Proc. FORTE/PSTV-9, 501-516, Chapman and Hall, 1996.
28. Bolton C., Davies J., Woodcock J. On the Refinement and Simulation of Data Types and Processes. in: Proc. IFM-99, Taguchi, Galloway (eds.), 273-292, Springer, 1999.

Performing Algorithmic Refinement before Data Refinement in B

Michael Butler1 and Mairead Meagher1,2

1 Department of Electronics & Computer Science, University of Southampton, Southampton SO17 1BJ, United Kingdom [email protected]

2 Department of Physical & Quantitative Sciences, School of Science, Waterford Institute of Technology, Waterford, Ireland Tel: +353-51-302627 Fax: +353-51-878292 [email protected]

Abstract. Algorithmic Refinement is part of the theory of the B method both at the refinement and implementation stages. It is a sign of how little loop introduction is used in practice at the refinement stage that neither the B-Toolkit nor Atelier-B provides support for loop introduction until the implementation stage. This paper examines the use of algorithmic refinement in general before data refinement. This involves extending the usual scope of data refinement, which usually happens before algorithmic refinement. Two case studies are used to compare and contrast the application of algorithmic refinement before data refinement and vice versa. Some extensions are needed in the B-Toolkit to implement this style (i.e., algorithmic before data refinement) and are proposed. Some workarounds are also presented when appropriate.

1 Introduction

We deal with systems developed using the B method. Such systems are originally specified using abstract data types. The eventual implementation will involve concrete data types and algorithms working on these. To get from the abstract specification to the concrete implementation, we repeatedly refine the previous machine. Refining a machine means making it less abstract whilst preserving the previous refined machine's properties. There are two main types of such refinements. These are data refinement and algorithmic refinement. Data refinement involves introducing change into the type of data being worked on (introduces concrete data types). Algorithmic refinement involves introducing more concrete programming language-like structures to work on the data, leaving the structure of the data unchanged.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 324–343, 2000. © Springer-Verlag Berlin Heidelberg 2000


In this paper, we examine the relative merits of introducing algorithmic refinement before data refinement and vice versa. It seems that most developments using B take the approach of data refinement first. In each of our case studies, we illustrate the two approaches. When we introduce algorithmic refinement before data refinement, we use rules on the distribution of data refinement through an algorithmically refined structure as used in [5,8,9,11,12].

The standard approach in the development of systems using the B method is to apply data refinement and then proceed with algorithmic refinement on the concrete data types. If we take the example of an operation which changes state, then using this approach the first refinement(s) will involve introducing data refinements into the B specification, i.e., the form of the state changed by the operation has become more concrete. Once this is done, attention is turned to 'how' the specification, based on the concrete data, will be implemented. Future refinements will introduce correctness-preserving programming language-like structures, i.e., algorithmic refinements.

An alternative approach is to apply algorithmic refinement immediately on the abstract data types. Data refinement is then applied to the algorithms based on abstract data types. This style, although part of the theory and literature of the B method [1], is not directly implementable in either the B-Toolkit or Atelier-B. Specifically, loop introduction is not allowed until the implementation stage. This suggests that this style is neither used nor sought in practice. If we again take the example of the operation which changes state, algorithm-like structures (usually a loop including sequential statements) are immediately introduced. So we have, very early, attended to 'how' the implementation might be achieved.
Next, we change the form of the state being changed, using distribution laws [5,8,9,11,12] which tell us under which circumstances the transformations from one statement to another are correctness-preserving under the current data refinement.

2 Algorithmic Refinement before Data Refinement

We examine the approach of applying algorithmic refinement early, before data refinement. This leads to program-like structures based on abstract data types. We need to change what the program-like structures work on, i.e., bring the data closer to concrete, implementable data types. So we need to state what we mean by both algorithmic and data refinement.

Data refinement involves replacing abstract program variables with concrete program variables preserving an abstraction relation between them. For statement $S$ and postcondition $Q$, $[S]Q$ represents the weakest precondition under which $S$ is guaranteed to terminate and establish $Q$ [1]. Let $S$ be a statement with abstract program variables $a$, and let $T$ be a statement with concrete variables $c$. $S$ is data-refined by $T$ under abstraction relation $R$, written $S \sqsubseteq_R T$, if the following holds [1]:

$$R \land [S]\,\mathrm{true} \Rightarrow [T]\,\neg[S]\,\neg R.$$
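Over finite state spaces the data refinement condition can be unfolded and checked directly: whenever $R$ holds and $S$ terminates, $T$ must terminate, and every outcome of $T$ must be related by $R$ to some outcome of $S$. The sketch below assumes statements are encoded as dictionaries from before-states to sets of after-states, with a missing key meaning that termination is not guaranteed. The set-versus-tuple example is invented for illustration and is not one of the paper's case studies.

```python
def data_refines(S, T, R):
    """Check R ∧ [S]true ⇒ [T]¬[S]¬R over finite state spaces.

    S maps an abstract state a to the set of its possible after-states and
    omits a where S is not guaranteed to terminate; T does the same for c.
    R is a set of (a, c) pairs.
    """
    for a, c in R:
        if a not in S:      # [S]true fails, so there is nothing to prove
            continue
        if c not in T:      # T must terminate wherever S does
            return False
        for c2 in T[c]:     # each concrete outcome needs a related abstract outcome
            if not any((a2, c2) in R for a2 in S[a]):
                return False
    return True

# Abstract state: a set of numbers. Concrete state: a tuple holding the same numbers.
tuples = [(), (1,), (2,), (1, 2)]
R = {(frozenset(t), t) for t in tuples}
# S removes an arbitrary element (precondition: non-empty); T removes the first element.
S = {frozenset(t): {frozenset(t) - {x} for x in t} for t in tuples if t}
T = {t: {t[1:]} for t in tuples if t}
print(data_refines(S, T, R))
```

Replacing `T` at `(1, 2)` with a statement that leaves the tuple unchanged breaks the refinement, because no outcome of the abstract removal is related to `(1, 2)`.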


We also make use of the least data-refinement of a statement [12]. Again, let $S$ be a statement with program variables $a$, and let $R$ be an abstraction relation relating $a$ and $c$, then the least-refined statement on program variables $c$ which is also a data refinement of $S$ under $R$ is denoted $D_R^{a,c}[\![ S ]\!]$, that is:

$$S \sqsubseteq_R D_R^{a,c}[\![ S ]\!] \quad \text{and} \quad S \sqsubseteq_R T \Rightarrow D_R^{a,c}[\![ S ]\!] \sqsubseteq T.$$

Rules for distributing $D_R^{a,c}$ through the structure of $S$ may be found in [5,8,9,11,12]. Some of these rules are repeated in Fig. 1. The first rule shows that $D_R^{a,c}$ distributes through sequential composition. The second and third rules show that $D_R^{a,c}$ distributes through if-statements and loops provided the guards are equivalent under the abstraction relation. Note that the abstract B loop will normally have an associated variant and invariant. These will not be explicitly carried forward in later refinements as they are only required when the abstract loop is first introduced in a refinement step. The fourth rule shows the conditions under which nondeterministic assignments may be data refined. The fifth rule deals with the data refinement of blocks with local variables. The sixth and final rule deals with a special type of operation. Let $Aproc$ be an abstract parameterised operation whose only effect is to calculate and return the value of the output variable. We treat such operations as procedures, i.e., named pieces of code as in the refinement calculus [10]. The rule says that if the body of $Aproc$ is data-refined by the body of $Cproc$, then a call to $Aproc$ is data-refined by a call to $Cproc$. Note that this does not correspond to the standard definition of refinement in B where the refined operation should have exactly the same parameters as the abstract operation. In Section 6, we outline a way of representing this procedural refinement in standard B.

We calculate a data refinement of a statement $S$ under $R$ by calculating a refinement of $D_R^{a,c}[\![ S ]\!]$. For example, we wish to data refine (under $R$) the sequential composition represented by $S_1 ; S_2$. Given that $D_R^{a,c}[\![ S_1 ]\!] \sqsubseteq_R S_1'$ and $D_R^{a,c}[\![ S_2 ]\!] \sqsubseteq_R S_2'$, and appealing to the data refinement laws of Fig. 1, then:

$$S_1 ; S_2 \sqsubseteq_R D_R^{a,c}[\![ S_1 ; S_2 ]\!] \sqsubseteq_R D_R^{a,c}[\![ S_1 ]\!];\ D_R^{a,c}[\![ S_2 ]\!] \sqsubseteq_R S_1' ; S_2'.$$

So,

$$S_1 ; S_2 \sqsubseteq_R S_1' ; S_2'.$$

As well as the data refinement rules of Fig. 1, we use standard algorithmic refinement rules from [1,10] throughout the case studies in this paper.

3 System of Interest

Both case studies used in this paper are taken from a system currently under development. The overall system counts the votes for the elected academic membership of the Academic Council in Waterford Institute of Technology, Ireland. (Each academic member of staff can vote to elect 13 members from the academic staff.) An implementation of a Proportional Representation system, known as Single Transferable Vote, is used [6].

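Performing Algorithmic Refinement before Data Refinement in B

(1.1) $DR^{a,c}[\![ S_1; S_2 ]\!] \sqsubseteq DR^{a,c}[\![ S_1 ]\!]; DR^{a,c}[\![ S_2 ]\!]$

(1.2) $\dfrac{p \land R \Rightarrow G \Leftrightarrow H}{DR^{a,c}[\![ \mathrm{PRE}\ p\ \mathrm{THEN\ WHILE}\ G\ \mathrm{DO}\ S\ \mathrm{END} ]\!] \sqsubseteq \mathrm{WHILE}\ H\ \mathrm{DO}\ DR^{a,c}[\![ S ]\!]\ \mathrm{END}}$

(1.3) $\dfrac{p \land R \Rightarrow G \Leftrightarrow H}{DR^{a,c}[\![ \mathrm{PRE}\ p\ \mathrm{THEN\ IF}\ G\ \mathrm{THEN}\ S\ \mathrm{END} ]\!] \sqsubseteq \mathrm{IF}\ H\ \mathrm{THEN}\ DR^{a,c}[\![ S ]\!]\ \mathrm{END}}$

(1.4) $\dfrac{R \land Q \land p \Rightarrow (\exists a' \bullet [a, c := a', c']R \land P)}{DR^{a,c}[\![ \mathrm{PRE}\ p\ \mathrm{THEN}\ @a' \bullet (P \Longrightarrow a := a') ]\!] \sqsubseteq @c' \bullet (Q \Longrightarrow c := c')}$

(1.5) $\dfrac{R \Rightarrow (\exists a_1, c_1 \bullet Q) \qquad (\forall a_1, c_1 \bullet R) \Leftrightarrow R \qquad (@a_1 \bullet S) = S}{DR^{a_0,c_0}[\![ \mathrm{VAR}\ a_1\ \mathrm{IN}\ S\ \mathrm{END} ]\!] \sqsubseteq \mathrm{VAR}\ c_1\ \mathrm{IN}\ DR^{(a_0,a_1),(c_0,c_1)}_{R \land Q}[\![ S ]\!]\ \mathrm{END}}$

(1.6) $\dfrac{a_2 \leftarrow Aproc(a_1) \mathrel{\widehat{=}} A \qquad c_2 \leftarrow Cproc(c_1) \mathrel{\widehat{=}} C \qquad DR^{(a_1,a_2),(c_1,c_2)}[\![ A ]\!] \sqsubseteq C}{DR^{(a_1',a_2'),(c_1',c_2')}[\![ a_2' \leftarrow Aproc(a_1') ]\!] \sqsubseteq c_2' \leftarrow Cproc(c_1')}$

where a, c are formal parameters, a′, c′ are actual parameters.

Fig. 1. Data Refinement Laws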

The input to this system is a collection of votes, which are then counted according to a set of rules. (These rules are not of interest to us for the scope of this paper.) The main output of the system is the list of elected candidates. The raw input can contain errors, either accidentally or deliberately introduced by the voter. A decision was taken at specification stage to specify the counting system based on validated votes (which we call ballots). We therefore need to specify (and implement) this preprocessing of the input. At the abstract level, the input is modelled as a bag of papers, where a paper is a partial function from candidate to ℕ and bag T == T ⇸ ℕ₁, where for b ∈ bag T and item ∈ dom(b), b(item) returns the number of occurrences of item in the bag. This raw input paper is processed to become what we term a ballot. The first case study will deal with this operation, whose abstract specification is called Make_Ballot. The preprocessing of the bag of papers returns a bag of ballots. Not all votes will be valid, so some input will be discarded. The second case study will deal with this overall operation, whose abstract specification is called Pre_Process.

M. Butler and M. Meagher

Valid preferences on a paper are that set of preferences that are unique, contiguous and start at one. Duplicate preferences (e.g., two candidates have preference 3 associated with them) are disregarded, as are all higher preferences on the paper. A skip in preferences (e.g., the voter expresses preferences 1,2,4,5 but no 3) invalidates all preferences after the skipped preference (in this case, only 1,2 are valid). The ballot holds only valid preferences and uses an injective sequence such that the first element of the sequence is the candidate whose preference was 1, etc. (No other candidate will have been validly assigned preference 1.) It may happen that a paper has no valid preferences (e.g., if two candidates are given preference 1), in which case the ballot's sequence will be empty. This is termed a spoiled vote and is not added to the (resultant) bag of ballots. Note that in the specification, we use stateless machines. We specify the (parameterised) operations using mathematical functions defined in the CONSTANTS and PROPERTIES clauses. The final operation simply calls these functions.

CONSTANTS
  make_ballot, pre_process
PROPERTIES
  make_ballot ∈ Paper → Ballot ∧
  pre_process ∈ bag Paper → bag Ballot ∧
  ∀ paper.(paper ∈ Paper ⇒
    make_ballot(paper) =
      1..(min({nn | nn ∈ 1..no_cands + 1 ∧ card(paper∼[{nn}]) ≠ 1}) − 1) ◁ paper∼) ∧
  ∀ bagpapers.(bagpapers ∈ bag Paper ⇒
    pre_process(bagpapers) =
      {bb, nn | bb ∈ Ballot ∧ nn ∈ ℕ ∧
        ∃ pp.(pp ∈ dom(bagpapers) ∧ bb = make_ballot(pp)) ∧
        nn = Σ pp • (bb = make_ballot(pp) | bagpapers(pp)) ∧
        card(bb) > 0})
OPERATIONS
  bagballots ← Pre_Process(bagpapers) ≙
    PRE bagpapers ∈ bag Paper
    THEN bagballots := pre_process(bagpapers)
    END

Fig. 2. Abstract Specification of Preprocessing
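To make the two constants of Fig. 2 concrete, the following is a minimal executable sketch in Python, not part of the B development itself. It assumes a paper is represented as a dict from candidate to preference, that a bag is a `collections.Counter`, and that a paper mentions at most `no_cands` candidates (so that the minimum below always exists). The names `make_ballot` and `pre_process` mirror the specification; the data representation is purely illustrative.

```python
from collections import Counter


def make_ballot(paper, no_cands):
    """Return the ballot (a tuple of candidates in preference order) for a paper.

    `paper` maps candidate -> preference number (a partial function).
    Assumption: the paper has at most `no_cands` entries.
    """
    # Invert the paper: preference -> set of candidates given that preference.
    inverse = {}
    for cand, pref in paper.items():
        inverse.setdefault(pref, set()).add(cand)
    # Lowest preference in 1..no_cands+1 that does not occur exactly once.
    first_skip_or_dup = min(
        nn for nn in range(1, no_cands + 2) if len(inverse.get(nn, ())) != 1
    )
    # Domain-restrict the inverse to 1..first_skip_or_dup-1: this is a sequence.
    return tuple(next(iter(inverse[nn])) for nn in range(1, first_skip_or_dup))


def pre_process(bagpapers, no_cands):
    """Map a bag of papers to a bag of non-empty ballots.

    `bagpapers` is a Counter keyed by frozenset(paper.items()).
    """
    bagballots = Counter()
    for paper_items, count in bagpapers.items():
        ballot = make_ballot(dict(paper_items), no_cands)
        if len(ballot) > 0:  # spoiled votes are not added to the output bag
            bagballots[ballot] += count
    return bagballots
```

For instance, with five candidates the paper `{"A": 1, "B": 2, "C": 4, "D": 5}` skips preference 3, so only the first two preferences survive; a paper giving preference 1 to two candidates yields the empty ballot and is dropped by `pre_process`.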

Before we specify the parameterised operations, we introduce a few types. The raw input is modelled simply as a partial function from Candidate to ℕ. The type is called Paper. The validated form of the paper is modelled as an injective sequence of Candidates in order of preference. This type is called Ballot. In summary, the types are as follows:

Paper ≙ Candidate ⇸ ℕ
Ballot ≙ iseq(Candidate).

We have a system-wide constant called no_cands, which stands for the number of registered candidates. It is an important number as it limits the size of the validated ballot. The specification of the preprocessing operation is shown in Fig. 2. We present two case studies based on this operation.

The first case study illustrates the development of the Make_Ballot operation. The abstract paper is a function from Candidate to ℕ and, if we invert the function, we have a relation from preferences to candidates at that preference. If we call the lowest non-unique preference ≥ 1 first_skip_or_dup(licate), i.e., the lowest preference either to appear more than once (duplicate) or not at all (causing a skip in the order of preferences), then it follows that all preferences between 1 and first_skip_or_dup − 1 appear exactly once. Thus, if we domain restrict the inverted abstract paper's function between 1 and first_skip_or_dup − 1, we have a sequence. This sequence contains the candidates in order of preference and is the abstract ballot. This sequence is injective as it is formed from an inverted function. It may happen that all preferences are used 'correctly', i.e., the size of the abstract ballot is no_cands. This case is dealt with by the use of nn ∈ 1..no_cands + 1 in the definition of make_ballot(paper).

The second case study illustrates the development of the Pre_Process operation, which takes a collection of ballots, each returned by the Make_Ballot operation, and inserts them into an output bag under certain conditions.

4 Case Study 1 - Development of Make Ballot

The abstract specification of Make_Ballot is as follows:

aballot ← Make_Ballot(apaper) ≙
  PRE apaper ∈ Paper
  THEN aballot := make_ballot(apaper)
  END.

When we substitute for make_ballot in the assignment, we get

aballot := ( 1..min({nn | nn ∈ 1..no_cands + 1 ∧ card(apaper∼[{nn}]) ≠ 1}) − 1 ) ◁ apaper∼.

If we let

first_skip_or_dup = min({nn | nn ∈ 1..no_cands + 1 ∧ card(apaper∼[{nn}]) ≠ 1}),

then the above can be rewritten as

aballot := ( 1..first_skip_or_dup − 1 ) ◁ apaper∼.

The R.H.S. of the assignment statement can be simplified, using the definition of ◁, to

{ nn ↦ cc | nn ∈ 1..first_skip_or_dup − 1 ∧ cc ↦ nn ∈ apaper }.

We explore the two possible paths of development, taking the style of algorithmic refinement first. We take the following approach to deciding on the shape of our implementation: we outline the shape of a number of different possible implementations. We compare these for efficiency, using [2], and decide on a best possible implementation. (This decision-making process is of no further interest to us for the purpose of this paper.) We then have a possible implementation to aim for, which helps us to make decisions during development. Using this approach means that, for the case studies in this paper, we start with the same specification and expect to arrive at a similar implementation using both styles of development.

4.1 Make Ballot - Algorithmic Refinement followed by Data Refinement

We look directly for a loop invariant based closely on the structure of the specification, as follows:

LI1 ≙ aballot = { nn ↦ cc | nn ∈ 1..so_far − 1 ∧ cc ↦ nn ∈ apaper }.

The guard of the loop is so_far < first_skip_or_dup. We introduce the loop shown in Fig. 3. Note that apaper∼(so_far) is well defined since:

apaper ∈ Candidate ⇸ ℕ ∧ so_far < first_skip_or_dup
⇒ ∀ii.(ii ∈ 1..so_far ⇒ card(apaper∼[{ii}]) = 1)
⇒ 1..so_far ◁ apaper∼ ∈ ℕ ⇸ Candidate.

Most of the proof obligations generated from the introduction of this loop are easily discharged. We have a close look at the P-Rule [13], i.e., LI1 ∧ G ⇒ [Body]LI1.

[aballot(so_far) := apaper∼(so_far)] [so_far := so_far + 1] (aballot = { nn ↦ cc | nn ∈ 1..so_far − 1 ∧ cc ↦ nn ∈ apaper })
= [aballot(so_far) := apaper∼(so_far)] (aballot = { nn ↦ cc | nn ∈ 1..so_far ∧ cc ↦ nn ∈ apaper })
= (aballot n

The initial specification is:

Fib
  ∆[y : ℕ]
  z? : ℕ
  y′ = fib(z?)

2.2 The Development

The strategy for obtaining a simply recursive program for this specification is to introduce another accumulator:

ExFib
  ∆[x, y : ℕ]
  z? : ℕ
  y′ = fib(z?)
  x′ = fib(z? − 1)

M.C. Henson and S. Reeves

We have, using rules for frame expansion (rule exp, section 3.3) and strengthening of postconditions (rule $\sqsupseteq^{+}_{post}$, section 3.2):

ExFib ⊒ Fib

This immediately highlights a property of our framework which makes it particularly well suited to specifications which are formed, using the algebra of schemas, from atomic schemas with partially disjoint alphabets. Our version of the frame law does not force observations outside those specified to remain unchanged. In theories of refinement built upon a weakest precondition semantics the stability of such observations is forced, making that semantic basis difficult to reconcile with specifications constructed generally using the algebra of schemas. We will revisit this in more detail in section 3.3.

Next we formulate λz? • ExFib, a curried version of ExFib which, at input n, is:

  ∆[x, y : ℕ]
  y′ = fib(n)
  x′ = fib(n − 1)

Note that this is not a definition of a new schema, but rather a schema expression resulting from the currying process, and this is why the schema is presented anonymously. The currying process itself is described in detail in section 4 below (see definition 5). The point of currying is to prepare the way for the construction of a program by recursion on n, more exactly, by a recursive procedure:

fibonacci[z?]

The rule by which this is achieved will be described in section 4 below (rule $rps^{+}$). We first obtain, eliding a little logical simplification, the schema (λz? • ExFib)[0]:

  ∆[x, y : ℕ]
  y′ = 1
  x′ = 1

and, with similar simplification, (λz? • ExFib)[n + 1]:

  ∆[x, y : ℕ]
  y′ = fib(n + 1)
  x′ = fib(n)

Program Development and Specification Refinement in the Schema Calculus


The first of these can be refined as usual into a simultaneous assignment (see section 4, rule $\sqsupset^{+}_{:=}$):

x,y := 1,1

The second schema can be refined to a disjunction (rule $\lor^{+}_{\sqsupseteq}$, section 3.3), by splitting the (implicit) true precondition into two cases: zero and successor.

U0[n] ∨ U1[n] ⊒ (λz? • ExFib)[n + 1]

where:

U0[n]
  ∆[x, y : ℕ]
  n = 0
  y′ = 1
  x′ = 1

and:

U1[n]
  ∆[x, y : ℕ]
  ∃ m : ℕ • n = m + 1
  y′ = fib(n + 1)
  x′ = fib(n)

The purpose of this decomposition is to prepare for the introduction of a conditional command in due course. Assuming commands cmd0 and cmd1 which meet U0[n] and U1[n], we can refine the disjunction to:

if n == 0 then cmd0 else cmd1

using rule $\sqsupset^{+}_{if}$ described in section 4 below. We can weaken the precondition (rule $\sqsupseteq^{+}_{pre}$, section 3.2) of U0[n] to obtain:

U2
  ∆[x, y : ℕ]
  y′ = 1
  x′ = 1

and this can also be refined to a simultaneous assignment (rule $\sqsupset^{+}_{:=}$, section 4):

x,y := 1,1

U1[n] can be further refined to the composition U3[n] ⨟ Step, where:

U3[n]
  ∆[x, y : ℕ]
  ∃ m : ℕ • n = m + 1
  y′ = fib(n)
  x′ = fib(n − 1)

and:

Step
  ∆[x, y : ℕ]
  x′ = y
  y′ = x + y

We will demonstrate this refinement in more detail by observing that U3[n] can be written as:

  ∆[x, y : ℕ]
  ∃ m : ℕ • n = m + 1 ∧ y′ = fib(m + 1) ∧ x′ = fib(m)

and the composition of this with Step is a refinement (rule $⨟^{+}_{\sqsupseteq}$, section 3.3) of:

  ∆[x, y : ℕ]
  ∃ m : ℕ • n = m + 1 ∧ y′ = fib(m + 1) + fib(m) ∧ x′ = fib(m + 1)

or:

  ∆[x, y : ℕ]
  ∃ m : ℕ • n = m + 1 ∧ y′ = fib(m + 2) ∧ x′ = fib(m + 1)

or:

  ∆[x, y : ℕ]
  ∃ m : ℕ • n = m + 1
  y′ = fib(n + 1)
  x′ = fib(n)

which is U1[n] as required. We will now be able to introduce a sequence of commands (rule $\sqsupset^{+}_{⨟}$, section 4) into our program, in which the obvious simultaneous assignment:

x,y := y,x+y

appears as the second component (rule $\sqsupset^{+}_{:=}$, section 4). That is:

U3[n] ⨟ Step ⊒ U1[n]

which will, for a suitable command cmd2, be refined to:

cmd2 ; x,y := y,x+y

We may now weaken the precondition (rule $\sqsupseteq^{+}_{pre}$, section 3.2) of U3[n] to obtain:

  ∆[x, y : ℕ]
  y′ = fib(n)
  x′ = fib(n − 1)

which is just (λz? • ExFib)[n]. This is refined by the recursive call:

fibonacci[n]

which we have available as an assumption (second premise of rule $rps^{+}$, section 4). The program this yields is, in summary:

proc fibonacci(z?)
  cases z? in
    0: x,y := 1,1
    n+1: if n==0 then x,y := 1,1
         else fibonacci[n] ; x,y := y,x+y
  endcases
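As an informal check of the derived procedure, the following Python sketch transliterates it and compares the result against a directly computed Fibonacci sequence. It assumes fib(0) = fib(1) = 1, which is what the base cases of the derivation imply; the dictionary `state` stands in for the observations x and y, and the helper `fib` is introduced here only as a reference.

```python
def fibonacci(z, state):
    """Transliteration of `proc fibonacci(z?)`; updates observations x and y."""
    if z == 0:
        state["x"], state["y"] = 1, 1
    else:
        n = z - 1  # the case z? = n + 1
        if n == 0:
            state["x"], state["y"] = 1, 1
        else:
            fibonacci(n, state)  # recursive call fibonacci[n]
            # Simultaneous assignment x,y := y,x+y (right-hand sides use old values).
            state["x"], state["y"] = state["y"], state["x"] + state["y"]
    return state


def fib(k):
    """Reference definition, assuming fib(0) = fib(1) = 1."""
    a, b = 1, 1
    for _ in range(k):
        a, b = b, a + b
    return a
```

After `fibonacci(z, state)` with z ≥ 1, the sketch leaves `state["y"] == fib(z)` and `state["x"] == fib(z - 1)`, matching the postcondition of ExFib.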

3 An Interpretation of Operation Schemas

In our approach the specification of an operation is modelled as a set of legitimate implementations rather than the more usual interpretation as a set of bindings, and thus, when p is a program and U is a specification, the implementation relation is just membership:

p ⊐ U =df p ∈ U

This, then, is just the claim that the program p meets the specification U. Refinement, then, is simply containment. Roughly:

U0 ⊒ U1 =df U0 ⊆ U1

We have evidently taken the unusual step of distinguishing between programs and specifications, eschewing the more common view that programs are special cases of specifications (e.g. [14]). The position is quite reasonable: we argue that specifications are extensional objects characterising classes of possible algorithms, whereas programs are, in contrast, intensional objects representing particular algorithms. Such ontological scruples live quite happily, however, with the standard program development methodology in which it is only refinement that makes an explicit appearance. This is as simple as observing that:

p ⊐ U ⇔ {p} ⊒ U

Nevertheless, mathematically, implementation is for us the basic notion, and the relation of refinement is then derived from that. This offers us the opportunity to investigate a semantics for specifications which is not otherwise available; moreover, as we will see, it possesses satisfactory mathematical and pragmatic properties. Although our framework is classical, there is a clear resonance between judgements of the form p ⊐ U and the kind of relationship between proof (objects) and propositions that is well-known in constructive frameworks (e.g. [13]). We will make further comments about this correspondence at the end of section 4.

3.1 Preconditions and Postconditions

There are two possible approaches to preconditions: the syntactic approach offered by B, Morgan's refinement calculus and very many others; and the logical approach adopted by Z. This latter approach is essentially a postcondition only approach, in which one induces the weakest condition consistent with meeting the postcondition. If one is content to use Z for specification and design, the logical approach has much to recommend it. On the other hand, if one wishes to derive programs, there are very many circumstances in which this approach is a serious burden: discharging such conditions is, in the worst case, equivalent to deriving a program. Occasionally, there may be trivial methods at hand; but often there are not. The derivation of a sorting algorithm furnishes a simple example: discharging the logical precondition of the specification will require a demonstration that sorted permutations exist; but this is equivalent to the entire task one is undertaking. Our framework may be set up using either approach (so the partisan may choose quite freely) and the reader will not be able to tell from the example in section 2 above which one we are discussing in this paper. But, since the logical approach is standard, we will explore the alternative here, and introduce explicit and syntactically determined preconditions into the notation. Our form for atomic operation schemas, in this paper, has the following general structure:

Op
  Decls
  Pre
  Post

We insist that Pre, the precondition, may only refer to observations on the input state and input observations listed amongst the schema declarations. There need be no restriction on the observations permitted in Post, the postcondition. A notational convention simplifies matters and makes the new dividing line superfluous: if the precondition is true then that section of the schema can be omitted. Or, more generally, the line can be omitted if the schema constraint is expressed as a conjunction of propositions (the dividing line occurring after the last conjunct in which no after-observation occurs). These conventions were used in the example in section 2 and again in section 5. One must take some care, however, because specifications written to look like standard Z will not necessarily have equivalent syntactic and logical preconditions. We tend to retain the new dividing line in adumbrating the theory, however, for clarity of presentation.

3.2 Operation Schema Calculus

We permit the construction of expressions using the usual schema algebra; in this paper the following syntax for operation schemas is sufficient:

U → [D | P | P] | U ∨ U | U ∧ U | ∃ z : C • U | U ⨟ U

We shall in the sequel permit U (etc.) to range over arbitrary sets of operations; so we shall treat this grammar as establishing notational shorthands for certain such sets. As we have already suggested, our interpretation of this language takes operation schema to denote sets of programs. A program is interpreted as a lambda abstraction over some universal state W. This state may be considered a sufficiently large schema type populated with state, input and output observations. It is not necessary to specify this schema type in any detail; it is analogous to the notion of a global state that is routinely used in denotational semantic methods where, similarly, precise details beyond its being suitably large are rarely needed. In that context variable scope and the distinction between parameters, stack and heap locations are handled by semantic equations; in this context the distinction between state, input and output observations is handled by the logic (although note that we make no use of output observations in this paper). Thus, we are led to the following:1

Definition 1. $[D \mid P \mid Q] =_{df} \{ p^{W \varpropto W} \mid \forall z^{W} \bullet z.P \Rightarrow z.(p\ z)'.Q \}$

That is, an operation schema is modelled as the set of operations (programs, lambda terms) which are guaranteed to construct results which satisfy the postcondition whenever inputs satisfy the precondition. Note that outside the precondition and outside the frame (the alphabet of D) no constraints are imposed on the programs: they may have arbitrary behaviour.

1 We use the generalised selection notation z.P from [16] to indicate the proposition P in which each observation instance x is replaced by z.x. Also, priming is an operation which renames the observations of a binding, by priming them. Finally, note that the lambda calculus is simply typed; hence application terms always denote.

The semantics above is then extended to the more complex expressions of disjunction, conjunction etc. by union, intersection etc. To begin with we define precisely what it means for a program to implement a specification:2

p ⊐ U =df p ∈ U

The following rules are then easy consequences of definition 1. For atomic operation schemas we have:

$\dfrac{z.P \vdash z.t'.Q}{\lambda z.t \sqsupset [D \mid P \mid Q]}\ (\sqsupset^{+})$

and:

$\dfrac{p \sqsupset [D \mid P \mid Q] \qquad t.P}{t.(p\ t)'.Q}\ (\sqsupset^{-})$

The rules for more complex expressions are also easily derived. We do not list them here since they are obvious given the semantics indicated above. Refinement, as we have already hinted, is now easily interpreted as containment:

Definition 2. $U_0 \sqsupseteq U_1 =_{df} U_0 \subseteq_{(W \varpropto W)} U_1$

Note that the subset relation is typed: we are considering here the type $W \varpropto W$ of operations over the universal state W. We need to show that certain fundamental properties follow from this. Firstly, that refinement preserves implementation, which follows from definition 2.

$\dfrac{U_0 \sqsupseteq U_1 \qquad f \sqsupset U_0}{f \sqsupset U_1}\ (\sqsupseteq^{-})$

More generally, of course, the refinement relation is clearly seen to be transitive. Secondly and thirdly, introduction rules for refinement in terms of preconditions and postconditions are derivable. Weakening preconditions:

$\dfrac{P_1 \vdash P_0}{[D \mid P_0 \mid P] \sqsupseteq [D \mid P_1 \mid P]}\ (\sqsupseteq^{+}_{pre})$

Strengthening postconditions:

$\dfrac{P_0 \vdash P_1}{[D \mid P \mid P_0] \sqsupseteq [D \mid P \mid P_1]}\ (\sqsupseteq^{+}_{post})$

Finally, the operation schema operators are monotonic with respect to refinement. The following rules, for example, are all derivable:

$\dfrac{U_0 \sqsupseteq U_1}{U_0 \land U \sqsupseteq U_1 \land U}$ $\qquad$ $\dfrac{U_0 \sqsupseteq U_1}{U_0 \lor U \sqsupseteq U_1 \lor U}$ $\qquad$ $\dfrac{U_0 \sqsupseteq U_2 \qquad U_1 \sqsupseteq U_3}{U_0 ⨟ U_1 \sqsupseteq U_2 ⨟ U_3}$

2 The reader may wonder why we introduce a special symbol when its interpretation is simply membership. We have, in fact, developed many alternative approaches to program development in which implementation is understood in significantly different ways; it is sensible, therefore, to settle on one symbol for the abstract notion which can be interpreted according to these various contexts.
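The set-of-programs reading of definitions 1 and 2 can be explored on a deliberately tiny model. The Python sketch below is only an illustration under strong simplifying assumptions: the universal state is taken to be the three values 0, 1, 2 of a single observation, a program is a total function on that state (encoded as a tuple), and `spec`, `implements` and `refines` are hypothetical helper names for the atomic schema, implementation and refinement.

```python
from itertools import product

W = (0, 1, 2)  # a tiny stand-in for the universal state: one observation
# Every total function W -> W, encoded so that p[z] is the after-state of z.
PROGRAMS = list(product(W, repeat=len(W)))


def spec(pre, post):
    """[D | pre | post] as the set of programs meeting post whenever pre holds."""
    return frozenset(
        p for p in PROGRAMS if all(post(z, p[z]) for z in W if pre(z))
    )


def implements(p, U):
    """Implementation is membership."""
    return p in U


def refines(U0, U1):
    """U0 refines U1 when every implementation of U0 implements U1."""
    return U0 <= U1


# A specification with precondition z < 2 and postcondition z' >= z ...
U1 = spec(lambda z: z < 2, lambda z, z1: z1 >= z)
# ... refined by weakening the precondition to true ...
U0 = spec(lambda z: True, lambda z, z1: z1 >= z)
# ... or by strengthening the postcondition to z' = z + 1.
U2 = spec(lambda z: z < 2, lambda z, z1: z1 == z + 1)
```

In this model `(1, 2, 0)` implements `U2` even though it maps 2 to 0: outside the precondition a program is unconstrained. Both `refines(U0, U1)` and `refines(U2, U1)` hold, as the two introduction rules predict, and `refines(frozenset({p}), U)` coincides with `implements(p, U)`.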

3.3 Refinement Inequations

We begin with inequations for schema operations. We give them in simplified form for clarity of presentation (in particular we have avoided input and output observations in the case of schema composition; a more general treatment of composition was given in [11]).

For conjunction we have rule $\land^{+}_{\sqsupseteq}$:

$[x, x' : C^{\mathbb{P}T} \mid P_0 \mid Q_0] \land [x, x' : C \mid P_1 \mid Q_1] \sqsupseteq [x, x' : C \mid P_0 \land P_1 \mid Q_0 \land Q_1]$

and rule $\land^{-}_{\sqsupseteq}$:

$[x, x' : C^{\mathbb{P}T} \mid P_0 \lor P_1 \mid Q_0 \land Q_1] \sqsupseteq [x, x' : C \mid P_0 \mid Q_0] \land [x, x' : C \mid P_1 \mid Q_1]$

Then for disjunction we have rule $\lor^{+}_{\sqsupseteq}$:

$[x, x' : C^{\mathbb{P}T} \mid P_0 \mid Q_0] \lor [x, x' : C \mid P_1 \mid Q_1] \sqsupseteq [x, x' : C \mid P_0 \land P_1 \mid Q_0 \lor Q_1]$

For composition we can derive rule $⨟^{+}_{\sqsupseteq}$:

$[x, x' : C^{\mathbb{P}T} \mid P_0 \mid Q_0] ⨟ [x, x' : C \mid P_1 \mid Q_1] \sqsupseteq [x, x' : C \mid P_0 \land \forall u : C \bullet Q_0[x'/u] \Rightarrow P_1[x/u] \mid \exists v : C \bullet Q_0[x'/v] \land Q_1[x/v]]$

And for hiding we have rule $\exists^{+}_{\sqsupseteq_0}$:

$\exists z : C \bullet [z, z' \in C^{\mathbb{P}T} \mid P \mid Q] \sqsupseteq [z' : C \mid \exists u : C \bullet P[z/u] \mid \exists v : C \bullet (P \land Q)[z/v]]$

and rule $\exists^{+}_{\sqsupseteq_1}$:

$\exists z' : C \bullet [z, z' : C^{\mathbb{P}T} \mid P \mid Q] \sqsupseteq [z : C \mid P \mid \exists v : C \bullet Q[z'/v]]$

Finally, we have our expand frame rule:

$[D_0; D_1 \mid P \mid Q] = [D_0 \mid P \mid Q]$ (exp)

Note that these schemas are equal, that is they are the same set of operations. This is very different from the situation in, for example, [14] in which implementations may not change the values of observations outside the frame. Our law indicates that a program which implements a specification is unconstrained both outside the specification’s precondition and outside its frame. This is much more convenient when specifications may be constructed using schema operations such as conjunction in which the frames do not necessarily coincide.

4 Types and Programs

There is a natural means for introducing programs (intensional functions) into our underlying theory: the typed lambda notation. The basic syntax of types we require is an extension of the language one needs for Z specification alone:

T → ··· | ℙ T | T × T | [··· l : T ···] | T ∝ T

In addition to some base types and the usual type formation by powerset, cartesian product and schema type we add the type T0 ∝ T1 of operations from T0 to T1. The usual identification of functions from T0 to T1 with a usual subset of the relations between T0 and T1 remains: every operation is represented as a function by its graph in the obvious manner. With the new types of course come new terms: typed lambda abstraction and application. Just as we showed in [11] how a base typed set theory could be used to define and support a logic for higher level notions of specification (schemas and their calculus), so these primitive operations are sufficient to support the definition of higher level programming language constructs in a standard way; that is, by the methods of denotational semantics. For example:

⟦··· var_i ··· := ··· exp_i ···⟧ =df λσ.σ[··· var_i / ⟦exp_i⟧ σ ···]

introduces simultaneous assignment (σ is a binding belonging to W, the universal state). We do not in this paper provide the semantics for the entire language, and indeed there are no surprises or innovations involved.3 The programming language we use here is very simple but illustrative:

cmd → skip | var ··· var := exp ··· exp | cmd ; cmd | begin var var ; cmd end | if exp then cmd else cmd | var[exp]
prc → proc var(var) cmd | proc var(var) cases var in 0 : cmd m+1 : cmd endcases
exp → True | False | exp == exp ··· etc. | num | var | exp + exp ··· etc.
num → 0, 1, ··· etc.

The syntax given here is untyped; but obvious typechecking rules apply. We now present rules which link these various programming idioms with specifications. Although these rules are all expressed in terms of the implementation relation, they may all be re-expressed, and used, as refinement rules in view of the observation we made at the beginning of section 3:

p ⊐ U ⇔ {p} ⊒ U

Indeed we used these rules as refinements in section 2 above and will do so again in section 5 below. All the rules are easily derivable in the logical framework given the semantics of the programming language and the Z logic.

3 We will omit the semantic brackets whenever possible for clarity of presentation.
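The denotation of simultaneous assignment as a state transformer λσ.σ[··· var_i / ⟦exp_i⟧ σ ···] can be sketched in Python as follows. This is an illustration only: a binding σ is assumed to be a dict, an expression is assumed to be a function from bindings to values, and the helper names `assign` and `seq` are introduced here and are not part of the paper's language.

```python
def assign(variables, expressions):
    """Denotation of `v1, ..., vk := e1, ..., ek` as a function on bindings."""
    def command(sigma):
        # Every expression is evaluated in the before-state sigma ...
        values = [e(sigma) for e in expressions]
        # ... and only then is the binding updated at each variable.
        new_sigma = dict(sigma)
        new_sigma.update(zip(variables, values))
        return new_sigma
    return command


def seq(cmd0, cmd1):
    """Denotation of `cmd0 ; cmd1`: function composition on bindings."""
    return lambda sigma: cmd1(cmd0(sigma))


# x,y := y,x+y as a single simultaneous assignment ...
step = assign(["x", "y"], [lambda s: s["y"], lambda s: s["x"] + s["y"]])
# ... versus the two assignments x := y ; y := x+y performed in sequence.
sequential = seq(
    assign(["x"], [lambda s: s["y"]]),
    assign(["y"], [lambda s: s["x"] + s["y"]]),
)
```

Starting from x = 2, y = 3, `step` yields x = 3, y = 5, whereas `sequential` yields x = 3, y = 6, which shows why the simultaneous form is the one needed for the Fibonacci step.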


For skip we have the following rules:

$\dfrac{skip \sqsupset [D \mid P \mid Q] \qquad z.P}{z.z'.Q}$ $\qquad$ $\dfrac{z.P \vdash z.z'.Q}{skip \sqsupset [D \mid P \mid Q]}$

Recall that z′ is not a variable with a diacritical prime, but a term: the variable z subject to the priming operation. Thus z and z′ are the same binding modulo the priming of their observations. Consequently these rules express, as expected, the fact that skip implements a specification in which identical before and after state observations satisfy the postcondition whenever the before observations satisfy the precondition.

For simultaneous assignment to variables var_i we have:

$\dfrac{(z.z'.P)[\cdots var_i' \cdots / \cdots (exp_i\ z) \cdots]}{\cdots var_i \cdots := \cdots exp_i \cdots \sqsupset [\Delta[\cdots var_i \cdots : \mathbb{N}] \mid P]}\ (\sqsupset^{+}_{:=})$

No after state observation may occur in the expressions exp_i. Conditional commands are implementations of disjunctive specifications, rule ($\sqsupset^{+}_{if}$):

$\dfrac{cmd_0 \sqsupset U_0 \qquad cmd_1 \sqsupset U_1 \qquad exp\ z = True \vdash z.Pre\ U_0 \qquad exp\ z = False \vdash z.Pre\ U_1}{\mathrm{if}\ exp\ \mathrm{then}\ cmd_0\ \mathrm{else}\ cmd_1 \sqsupset U_0 \lor U_1}$

Sequences of commands are implementations of composed schemas:

$\dfrac{cmd_0 \sqsupset U_0 \qquad cmd_1 \sqsupset U_1}{cmd_0 ; cmd_1 \sqsupset U_0 ⨟ U_1}\ (\sqsupset^{+}_{⨟})$

Finally, blocks are implementations of existentially quantified operation schemas. For each variable z we have:

$\dfrac{cmd \sqsupset U}{\mathrm{begin\ var}\ z ; cmd\ \mathrm{end} \sqsupset \exists z, z' : \mathbb{N} \bullet U}\ (\sqsupset^{+}_{block})$

Procedures are, of course, more complex, though they can be handled in a very elegant manner. In this case we will provide a little more in the way of a technical overview. We need to start by introducing mechanisms for currying and uncurrying operations with schema type domains.

Definition 3. Suppose that T0 is a schema type in which the observation l does not occur.4

$uncurry_{[l:T]}\ f^{T \varpropto T_0 \varpropto T_1}\ t^{T_0 \curlyvee [l:T]} =_{df} f\ t.l\ (t \upharpoonright T_0)$

4 T0 ⋎ T1 is the compatible union of the schema types T0 and T1. That is, the schema type comprising the union of those of the two arguments: it is not defined if there is an incompatibility. The binding t ↾ T is the binding t restricted to just the observations in the alphabet of the schema type T. Whenever we deploy these notions we are assuming that their uses are well-defined.

Definition 4. Suppose that T0 is a schema type in which the observation l does not occur.5

$curry_{[l:T]}\ f^{T_0 \curlyvee [l:T] \varpropto T_1}\ t^{T}\ t_0^{T_0} =_{df} f\ t_0[l \Lleftarrow t]$

We can now define curried operation schemas as sets of curried operations.6

Definition 5. $curry_{[l:T]}\ U =_{df} \{ g \mid \exists f : U \bullet g \equiv curry_{[l:T]}\ f \}$

For notational convenience we will write λz : T • U for $curry_{[z:T]}\ U$ and we usually omit the type of z. We now define application for these curried schemas to arbitrary expressions from the programming language. These expressions have type $W \varpropto \mathbb{N}$, and this explains the types involved in the following definition.

Definition 6. $(\lambda z \bullet U)[h^{W \varpropto \mathbb{N}}] =_{df} \{ g \mid \exists f : (\lambda z \bullet U) \bullet g \equiv \lambda\sigma. f\ (h\ \sigma)\ \sigma \}$

Note that (λz • U)[e] is a set of operations over the universal state and is therefore a new form of operation schema. We now use these ideas to introduce (primitive) recursive procedures. The semantics is given in terms of an operator $elim_{\mathbb{N}}$ for primitive recursion over the natural numbers which is part of the underlying mathematical theory.

⟦proc p(z) cases z in 0 : cmd0 m+1 : cmd1 endcases⟧ =df $uncurry_{[z:\mathbb{N}]}\ (elim_{\mathbb{N}}$ ⟦cmd0⟧ (λn.λw.⟦cmd1[m/n][p[m]/w]⟧))

The following introduction rule can then be derived:

$\dfrac{p[n] \sqsupset (\lambda z \bullet U)[n]}{p \sqsupset U}\ (\sqsupset^{+}_{proc})$

There is also an elimination rule:

$\dfrac{p \sqsupset U}{p[e] \sqsupset (\lambda z \bullet U)[e]}\ (\sqsupset^{-}_{proc})$

These two rules are also both derivable for simple (non-recursive) procedures. The most important rule, however, is the introduction rule for recursive procedure synthesis.

$\dfrac{cmd_0 \sqsupset (\lambda z \bullet U)[0] \qquad p[m] \sqsupset (\lambda z \bullet U)[m] \vdash cmd_1 \sqsupset (\lambda z \bullet U)[m+1]}{\mathrm{proc}\ p(z)\ \mathrm{cases}\ z\ \mathrm{in}\ 0 : cmd_0\ \ m{+}1 : cmd_1\ \mathrm{endcases} \sqsupset U}\ (rps^{+})$

5 The binding t0[l ⇚ t1] is the binding t0 updated (or extended) at observation l by the value t1.

6 Note that we use extensional equality in this definition: this is technically crucial, but it is outside the scope of this paper to explain why this should be so.

This is the rule we used in section 2 to derive the simply recursive algorithm for the fibonacci function. Note the assumption p[m] ⊐ (λz • U)[m] in the second premise: this is the claim that the procedure meets the specification at the previous value, and is the assumption we commented on specifically in that example derivation. The reader may notice some interesting similarity between the rules we have given in this section and the rules one often finds in the area of constructive type theories (for example [13], [2] and [4]). This close correspondence is especially noticeable in rules like $\sqsupset^{+}_{if}$ and $rps^{+}$ given above and is not accidental. It was always a strength of constructive approaches that the rules are very elegant, particularly so for the development of recursive programs; but the disadvantage of those methods has always been an inability to deal with substantial examples because of limited means for expressing specifications. We began our own research into program development from specifications in Z with the aim of investigating the consequences of adopting an alternative foundation for Z based on constructive rather than classical logic. The point of this was to see if one could use techniques of program extraction from proofs as a means for integrating program development with specification in Z. Such an experiment was reported in [8] and was based on theories of the TK family [7] which were, in turn, heavily influenced by the theories T0 of Feferman (e.g. [5]) and EON of Beeson [1]. In that context the judgement p ⊐ U becomes, as we noted at the beginning of section 3, similar to a type-theoretic judgement in a theory such as Martin-Löf's: "p is a proof of proposition U" or "p is a program for specification U". We rapidly discovered that it was possible to make U explicitly into a set of proofs (or programs) and, in that context, constructive logic ceased to be necessary.
The trace of the type-theoretic heritage still remains in the form of the rules such as those we have illustrated.

5 An Example Using Promotion

We shall examine another very simple example to illustrate the techniques. As usual we begin with the specification of a local operation over some local state. Then we introduce a global state and explain the abstract relationship between the local and global states. We can then specify the global operation and derive the programs.

5.1 Specification

Consider the following specification:

    Inc =df [ n, n′ : ℕ | n′ = n + 1 ]

We wish to consider this as a local operation over the local state ℕ. It is trivially implemented by:


M.C. Henson and S. Reeves

    n := n + 1

In the global state we have two numbers. This can be represented by the cartesian product ℕ × ℕ. The global operation simply generalises the local operation by specifying which of the two values is to be altered. The promotion schema as usual explains how the local and global state spaces are to be connected.

    Promote =df [ n, n′ : ℕ; p, p′ : ℕ × ℕ; z? : 𝔹 |
        (z? = true ∧ p.1 = n ∧ p′.1 = n′ ∧ p′.2 = p.2) ∨
        (z? = false ∧ p.2 = n ∧ p′.2 = n′ ∧ p′.1 = p.1) ]

Finally, the global operation is defined by hiding the local state changes:

    GlobalInc =df ∃ n, n′ : ℕ • Inc ∧ Promote

5.2 Refinement

The first step is to curry with respect to the input observation z? and then to express the conjunction as a disjunction of conjunctions by splitting the precondition of the promotion schema. This leads to:

    ∃ n, n′ : ℕ • (Inc ∧ P0[m]) ∨ (Inc ∧ P1[m])

where:

    P0[m] =df [ n, n′ : ℕ; p, p′ : ℕ × ℕ | m = true ∧ p.1 = n ∧ p′.1 = n′ ∧ p′.2 = p.2 ]

and:

    P1[m] =df [ n, n′ : ℕ; p, p′ : ℕ × ℕ | m = false ∧ p.2 = n ∧ p′.2 = n′ ∧ p′.1 = p.1 ]


Our next series of refinements concerns the two conjunction expressions. Taking the former to illustrate both essentially identical arguments, we first use the inequation ⊒⁻_∧ to express:

    Inc ∧ P0[m]

as:

    [ n, n′ : ℕ; p, p′ : ℕ × ℕ | m = true ∧ p.1 = n ∧ n′ = n + 1 ∧ p′.1 = n′ ∧ p′.2 = p.2 ]

Following this, we use the inequation ⊒⁺_⨟ to express this as a composition of:

    [ n, n′ : ℕ; p, p′ : ℕ × ℕ | n′ = p.1 + 1 ]

with:

    [ n, n′ : ℕ; p, p′ : ℕ × ℕ | p′.1 = n ∧ p′.2 = p.2 ]

The latter can be refined to an assignment (see footnote 7):

    p.1 := n

and the former can be refined, once again using the inequation ⊒⁺_⨟, to a composition of:

    [ n, n′ : ℕ; p, p′ : ℕ × ℕ | n′ = p.1 ]

followed by:

    [ n, n′ : ℕ | n′ = n + 1 ]

which is the schema Inc. These are refined to the assignment:

    n := p.1

and the local operation:

    n := n + 1

7 Apart from the obvious simple extension to the programming language required here, strictly speaking, one also needs a generalisation of the assignment rule. The general version of this can be proved by decomposition into a conjunction, and then use of the rule ⊐⁺_∧.

These assignments now sequence to implement the relevant composition specifications (rule ⊐_⨟); and the disjunction is refined (rule ⊐⁺_if) into a conditional. The quantified program can be refined into a block (rule ⊐⁺_block); and the entire specification of GlobalInc into a procedure (rule ⊐⁺_proc). In summary we have the following program:

    proc globalinc(z?)
    begin
      var n;
      if z?
      then n := p.1; n := n + 1; p.1 := n
      else n := p.2; n := n + 1; p.2 := n
    end

We are obviously not expressing any great delight in this particular program, nor in the simplicity of the specification we have illustrated. We do hope to have indicated a strategy which could be followed in more ambitious examples.

Generally we should stress that one does not necessarily wish to connect specification structure with program structure: indeed, of course, it is of fundamental importance that specifications can be made abstractly and independently of particular implementation decisions. But in the case of larger structured specifications (those for example constructed by promotion) there is an argument for examining the role of system design as well as the role of requirements. There are compelling reasons for using promotion for structuring specifications and sometimes these reasons are equally valid for organising implementations. In other words, there are of course circumstances in which one wishes to treat the structure of a specification (at some point in the refinement process) as indicating or reflecting a design intention with respect to program structuring. In such cases it is evident that the program development, just like the specification itself, can be factored into entirely separate components; the development of an implementation of a local operation can proceed quite independently of the development of a promotion, for example.

This fulfils two important properties: extensionally, the program development fragments come together to produce a correct implementation of the entire specification; intensionally, the algorithmic choices made in the development of the local operation carry over into the implementation of the specification as a whole. What this indicates, albeit rather tentatively, is that integrating program development with a notation for specification that is as powerful as Z opens up the topic of system design in addition to the usual topics of specification and implementation.
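The promotion example of this section can be checked on a small finite model. The following Python sketch is only an illustration under stated assumptions: the natural numbers are truncated to a hypothetical bound `BOUND`, the pair p is passed and returned rather than updated in place, and all function names are invented here. It encodes Inc, Promote and GlobalInc as predicates and tests exhaustively that the derived program produces exactly the after-states the specification allows.

```python
from itertools import product

# Assumption: naturals are truncated to a small range so the check is exhaustive.
BOUND = 4


def inc(n, n1):
    """Schema Inc: n' = n + 1."""
    return n1 == n + 1


def promote(n, n1, p, p1, z):
    """Schema Promote: ties the local state n to one component of the pair p."""
    first = z and p[0] == n and p1[0] == n1 and p1[1] == p[1]
    second = (not z) and p[1] == n and p1[1] == n1 and p1[0] == p[0]
    return first or second


def global_inc(p, p1, z):
    """GlobalInc: the local state n, n' is hidden by existential quantification."""
    values = range(BOUND + 1)  # n' may reach BOUND
    return any(inc(n, n1) and promote(n, n1, p, p1, z)
               for n, n1 in product(values, repeat=2))


def globalinc_program(p, z):
    """The derived program, returning the new pair instead of assigning to p."""
    a, b = p
    if z:
        n = a
        n = n + 1
        a = n
    else:
        n = b
        n = n + 1
        b = n
    return (a, b)


def program_meets_spec():
    """For every before-state and input, the spec holds exactly at the program's result."""
    after_states = list(product(range(BOUND + 1), repeat=2))
    for a, b, z in product(range(BOUND), range(BOUND), (True, False)):
        p = (a, b)
        result = globalinc_program(p, z)
        for p1 in after_states:
            if global_inc(p, p1, z) != (p1 == result):
                return False
    return True
```

Calling `program_meets_spec()` explores every before-state below the bound; it is a finite sanity check and not a substitute for the refinement proof sketched in the text.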

6 Further Work and Conclusions

In this paper we have no more than sketched, with illustrative examples, an integrated mathematical framework for reasoning about programs and specifications, and for undertaking program development from Z specifications by refinement. The scope of the programming notation we have investigated here was very limited (no data structures beyond the natural numbers) and so we make no extravagant claims. However, we hope to have indicated an approach that can be usefully extended to cover a more expressive programming notation; the underlying mathematical system is certainly designed specifically to make this a possibility and we are actively engaged in these investigations.

In addition to an alternative programming language, one might also investigate alternative theories of refinement: for example, those based on the ideas in [15] (chapter 5), which rest on weakening preconditions and strengthening postconditions (see also [12]), or alternatively the notion of refinement underlying data refinement in [17] (chapter 16), which is based on the lifted totalisation of relations (for schemas this would be sets of bindings). Formalising these might lead to distinct theories with specific properties that could usefully be compared with the approach outlined here, both in theory and in practice. Again, we are currently exploring these avenues.

We would like to thank the four referees for their comments and careful reviewing.

References

1. M. Beeson. Foundations of Constructive Mathematics. Springer Verlag, 1985.
2. M. Beeson. Proving programs and programming proofs. In Logic, Methodology and Philosophy of Science, pages 51–82. Elsevier, 1986.
3. A. Cavalcanti and J. Woodcock. ZRC – a refinement calculus for Z. Formal Aspects of Computing, 10(3):267–289, 1998.
4. R. Constable, et al. Implementing mathematics with the NUPRL proof development system. Prentice Hall, 1986.
5. S. Feferman. Constructive theories of functions and classes. In Logic Colloquium ’78, pages 159–224. North Holland, 1979.
6. L. Groves. Adapting program derivations using program conjunction. In J. Grundy, M. Schwenke, and T. Vickers, editors, International Refinement Workshop and Formal Methods Pacific ’98, Springer series in discrete mathematics and theoretical computer science, pages 145–164. Springer, 1998.
7. M. C. Henson. Program development in the programming logic TK. Formal Aspects of Computing, 1:173–192, 1989.
8. M. C. Henson and S. Reeves. New foundations for Z. In J. Grundy, M. Schwenke, and T. Vickers, editors, Proc. International Refinement Workshop and Formal Methods Pacific ’98, pages 165–179. Springer, 1998.
9. M. C. Henson and S. Reeves. Revising Z: I - logic and semantics. Formal Aspects of Computing Journal, 11(4):359–380, 1999.
10. M. C. Henson and S. Reeves. Revising Z: II - logical development. Formal Aspects of Computing Journal, 11(4):381–401, 1999.
11. M. C. Henson and S. Reeves. Investigating Z. Journal of Logic and Computation, 10(1):1–30, 2000.
12. S. King. Z and the Refinement Calculus. In D. Bjørner, C. A. R. Hoare, and H. Langmaack, editors, VDM ’90: VDM and Z – Formal Methods in Software Development, volume 428 of Lecture Notes in Computer Science, pages 164–188. Springer-Verlag, April 1990.
13. P. Martin-Löf. Constructive mathematics and computer programming. In Logic, Methodology and Philosophy of Science VI, pages 153–175. North Holland, 1982.
14. C. Morgan. Programming from Specifications. Prentice Hall International, 2nd edition, 1994.
15. J. M. Spivey. The Z notation: A reference manual. Prentice Hall, 1989.
16. J. Woodcock and S. Brien. W: A logic for Z. In Proceedings of ZUM ’91, 6th Conf. on Z. Springer Verlag, 1992.
17. J. Woodcock and J. Davies. Using Z: Specification, Refinement and Proof. Prentice Hall, 1996.

Are Smart Cards the Ideal Domain for Applying Formal Methods?

Jean-Louis Lanet

Gemplus Research Laboratory, Av du Pic de Bertagne, 13881 Gémenos cedex BP 100.

1 Introduction

The traditional approach to programming smart cards does not allow the creation of downloadable executable code and requires programmers with experience in low-level languages. This approach, combined with a high-quality qualification process, produces secure smart cards. Unfortunately, it does not allow card manufacturers and issuers to respond quickly to market changes, and it limits the flexibility of smart card applications. Open smart card programming provides a more dynamic approach to card applications. High-level languages and security mechanisms are the basis for the programming of open smart cards. The most notable efforts towards such smart card systems are Java Card [22], MultOS [14] and Smart Card for Windows [15], which give application developers an opportunity to develop applications rapidly. The main drawback of this kind of smart card is the risk of downloading a hostile application that will exploit a faulty implementation module of the platform.

Security is always a big concern for smart cards, but the issue is becoming more acute with multi-application platforms and post-issuance code downloading. The correct design and implementation of the system is the key to avoiding such an attack. Fault prevention offers different techniques to remove latent errors from the system. Fault avoidance concerns methodologies and appropriate techniques to avoid the introduction of faults during the design and the construction of the system. At first sight, one might believe that smart cards can only benefit from the use of formal methods. But it remains difficult to integrate these methods into the development process. The need for formal methods in the smart card domain has three origins: mastering the complexity of the new operating systems (fault avoidance), certifying a part of the smart card at a high level, and reducing the cost of testing.

In the first part, after presenting the smart card and its security requirements, we explain the certification process, which appears to be the most important vector for introducing formal methods into the software development cycle. Then we present some attempts to formalise complex software elements of smart cards. The use of model checkers to generate test suites automatically can notably increase the productivity of applet development. The second part of this paper explains why smart cards are not currently the expected success story of formal methods.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 363–373, 2000. © Springer-Verlag Berlin Heidelberg 2000

2 Needs of Security and Formalism

2.1 The Small and Secure System

A smart card is a piece of plastic, the size of a credit card, in which a single-chip microcontroller is embedded. Usually, microcontrollers for cards contain a microprocessor (8-bit ones are the most widespread, but 16-bit and 32-bit processors can now be embedded in the chip) and different kinds of memory: RAM (for run-time data), ROM (in which the operating system and the basic applications are stored), and EEPROM (in which the persistent data are stored). Since there are strong size constraints on the chip, the amounts of memory are small. Most smart cards sold today embed a chip with at most 512 bytes of RAM, 32 KB of ROM, and 16 KB of EEPROM. Today’s smart card devices have no on-chip power and no clock. This means that the security functions are limited and cannot rely on the clock or the power supply. The chip usually also implements some techniques and functions in order to safeguard information, such as limit sensors (heat, voltage, clock etc.), scrambled and distributed layout, data encryption and memory segregation, which are used to deactivate the card when it is physically attacked. A smart card cannot be totally secure; it must just be secure enough, and the goal is to make it sufficiently tamper-resistant.

Smart cards are not only storage media; they are able to execute application software with cryptographic functions (DES, triple DES, RSA, DSS or elliptic curves). This makes them a key technology for numerous high-security consumer applications. Smart cards are often used either to store or manage some kind of currency (money or tokens), to record personal information (like medical history), or to identify a person. In these applications, smart cards provide a means to guarantee the security (confidentiality and integrity) of the whole system. In fact, a failure of a smart card impacts the reliability of the system.

With open smart cards, the functionality of the card can be extended by downloading new programs into the card. The security requirements for such an operating system are more stringent than those for conventional cards. The ability to add new applications after issuance poses a threat, and particular attention must be paid to the post-issuance loading or deleting of applications. The cardholder or a hacker has complete control over the card after issuance and can subject it to any number of hacking attempts. For these reasons (high value object, physical and logical attack in a discrete environment without any on-line detection mechanism) the smart card is often the target of hackers. Since around 1994, some smart cards used in pay-TV have been successfully reverse engineered. Most of the attacks were carried out without any access to information from the manufacturer.

In order to reach the smart card quality requirements, it is of prime importance to eradicate all the latent errors in the smart card software. For this reason, the card issuer must develop a test strategy that eliminates all the errors, or use new development techniques. Formal methods for specification and verification have always been among the central issues in computer science. They have had a considerable impact in the hardware industry, but their impact on the software development process has been limited. This is due to the complexity of modern software systems involving thousands of lines of code. The smart card operating system and the embedded applications are relatively small. They are well within the limits of what can be handled by modern verification tools. Several attempts to formalise parts of smart card operating systems or applications have been made by academic researchers. Some similar work is probably done by smart card providers, but few papers have been published.

2.2 The Certification Process

As we saw, security is the main characteristic of smart cards. Each smart card provider claims that its product has the appropriate security. This claim can be supported by submitting the product for certification by an independent evaluation lab. This process is a means to gain market share by highlighting the differences between smart card providers. Sometimes regulation requires the use of certified products for some markets. Currently, a certification at the EAL4 level is mandatory in Germany and Hungary for systems that use private signature keys.

The Common Criteria for Information Technology Security Evaluation (CC) standard defines a set of criteria to evaluate the security properties of a product in terms of confidentiality, integrity and availability. The framework is being developed jointly by the IEC (International Electrotechnical Commission) and ISO (International Organization for Standardization), with the participation of national bodies. It is drawn from previous security requirements and assurance frameworks, namely the ITSEC and TCSEC. The CC focuses mainly on the first part of the lifecycle: requirements, specifications, design, development and test. Deployment and maintenance are not covered by the CC. The requirement phase is the most important part. The CC are used for writing the requirements document, which must be complemented by a general requirements document because the CC are only concerned with the security aspects of the system.

The Target Of Evaluation (TOE) is the part of the product or the system that is subject to evaluation. The TOE security threats, objectives, requirements, and summary specification of security functions and assurance measures together form the primary inputs to the Security Target (ST), which is used by the evaluators as the basis for evaluation. The CC also defines the Protection Profile (PP), which allows developers to create sets of security requirements applicable to several TOEs. A PP is intended to be reusable. The CC presents the security requirements under the distinct categories of functional requirements (e.g., requirements for identification, authentication, non-repudiation) and assurance requirements (e.g., constraints on the development process rigour, impacts of potential security vulnerabilities). The assurance that the security objectives are achieved is linked to:

– The confidence in the correctness of the security functions implementation, i.e., the assessment of whether they are correctly implemented,
– The confidence in the effectiveness of the security functions, i.e., the assessment of whether they actually satisfy the stated security objectives.

The CC contains a set of defined assurance levels that define a scale for measuring the criteria for the evaluation of PPs and STs. The Evaluation Assurance Levels (EAL1 to EAL7) form an ordered set to allow simple comparison between TOEs of the same kind. The role of the EALs is the same as that of the ITSEC security levels, although they represent only assurance requirements. EAL levels may be augmented to include higher assurance requirements or to substitute assurance components. Note that the EAL may only be augmented. At the EAL5 level the assurance is gained through a formal model of the TOE security policy, a semiformal presentation of the functional specification and high-level design, and a semiformal demonstration of correspondence between them. A modular TOE is also required. Note that the analysis must include validation of the developer’s covert channel analysis. The last EAL levels require a formal, in-depth and exhaustive analysis.

Three types of specification style are mandated by the CC: informal, semiformal and formal. An informal specification is written in natural language and is not subject to any notational restriction, but it requires the meanings of the terms used to be defined. A semiformal notation is written with a restricted syntax language and may be diagrammatic (data-flow diagrams, state transition diagrams, entity-relationship diagrams, etc). A formal description is written in a notation based upon well-established mathematical concepts. These concepts define the syntax and the semantics of the notation and the proof rules that support logical reasoning. A correspondence can take the form of an informal demonstration, a semiformal demonstration or a formal proof. A semiformal demonstration of correspondence requires a structured approach to the analysis of the correspondence. A formal proof requires well-established mathematical concepts and the ability to express the security properties in the formal specification language.

It is important to notice that a complete scheme for certification has been set up in the USA. Seven labs have received accreditation for evaluation. The first “Common Criteria Conference” will be held this year in Baltimore. At the beginning of the year the SCSUG (Smart Card Security User Group) specified a new PP for open operating systems like Java Card, Windows for Smart Cards and Multos. This clearly shows the commitment of the USA to closing the gap between it and the European countries, and the importance of the certification process. Gemplus obtained the first Common Criteria certificate for Java Card last year. Two other EAL4+ certificates are forecast for this summer and other certifications are planned. The smart card dedicated protection profiles (PP) were defined in 1999 for EAL4 certification. This year several PPs have been published in order to reach EAL5 certification. Moreover, the GIE Carte Bancaire, which requested an EAL3+ certificate, is now moving to at least an EAL5 level. Multos obtained in 1999 the first ITSEC E6 certificate for a part of its operating system. This demonstrates that customers are requiring higher-level certificates.

2.3 The Complexity is Increasing

Until now the complexity of smart card software was manageable by engineers. The size of a smart card application was around a few thousand lines of C code, and the applications all had much the same architecture. The arrival of Java brings to the fore the complexity of the underlying mechanisms used in the virtual machine. It is not surprising that the first formal models of smart cards were devoted to this architecture. Java Card is a subset of Java, which reduces the complexity of the model. This is probably the reason for the success of the formal specification of Java Card components. A lot of work has been done in the smart card domain, but unfortunately no one has achieved a proof of a complete smart card application, nor of a complete component of the virtual machine. There have been several efforts to formalise components of smart cards:

In [2], the authors present a part of a stack-based interpreter for a smart card based system: Tosca. The interpreter is object-based and is written in Clasp. It shares several features with Forth (e.g., extensibility of the language). This language is used to create applets. The aim of the study is to provide a formal description of a subset of the Clasp language and to use this description in order to prove properties. They prove by induction that certain run-time errors (e.g., stack overflow, non-determinism) can never occur.

The Defensive Java Virtual Machine (dJVM) has been modelled using ACL2 [7]. This work aims to provide a JVM with run-time checks in order to assure type-safe execution of the byte code. To the best of our knowledge, the only available document is a draft version in which not all theorems were proved. No information has been published on the complete proof of the model. An extension has been proposed [18] concerning verification related to proofs on object-oriented byte code.

In [20], the authors propose a new approach to verifying the properties of Java byte code using a model checker. A Java Card verifier performs an exhaustive search in the behavioural tree of the program in order to detect data flow or control flow errors. In fact, a byte code program can be seen as a description of a transition system. The state is given by the virtual machine state. Since an arbitrary program may have an infinite state space, it is necessary to derive a finite abstraction of the program and to restrict the usage of variables as much as possible. This abstraction can be restricted to type information and the current method. The state is restricted to a method interpretation.

INRIA conducts research on the formal semantics of the Java language, on program analysis for program optimisation, and on methods for verifying safety and security related properties [11]. Most of the papers presented relate to Java rather than Java Card. The technique used for the verification of security properties is close to the previous one. They make an abstraction of the method under inspection, translate it into a transition system, and verify some temporal formulae that describe an allowed path in the graph. Their transition graph is infinite, but they proved that a bound exists on the number of states. This allows them to use a model checker to verify their formulae [23]. In their model, all information related to the data flow was removed and only information linked with the control flow and the call graph of the program was kept. Some properties cannot be formalised with this approach, such as the flow of classified information and the detection of covert channels.

A recent paper [10] presents a model of the Windows for Smart Card runtime environment. The authors use the Abstract State Machine to describe their system formally.

Gemplus has provided several formal models of parts of the Java Card. We paid particular attention to verifying the correctness of the embedded code and to demonstrating the correctness of the JVM Firewall in [16], [17].

This intense activity around the formalisation of open operating systems by academic and industrial researchers points out the difficulty of becoming convinced of the soundness of the specification. There is currently an important effort in the Coq and Isabelle communities to completely formalise the Java semantics at the source and the byte code level. These efforts are supported by smart card manufacturers. The importance of the correctness of the byte code verifier requires a formalisation and a proof of this important piece of code. But unfortunately, this effort is beyond the resources available to any single smart card manufacturer and will need some form of collaboration between them. The complete proof of the verifier has been estimated at around 60 man-months. Surprisingly, the B community has never paid similar attention to the formalisation of open operating systems. In the Z community, we noticed only the work of the University of York in collaboration with Logica. The smart card domain has several interesting and non-confidential problems that can be solved using formal methods.

2.4 Reducing the Cost of the Test

We have seen that formal methods can be used for marketing reasons, through the certification process, and for security reasons, to ensure the soundness of the specification. The last point is related to productivity: reducing the cost of testing. Card manufacturers have a fairly extensive qualification process. Consequently, quality assurance requirements for smart cards are often very strong. In order to fulfil these requirements, card application providers have developed methods and tools adapted to smart card specific constraints. An important part of this development is devoted to testing. Starting from specifications, a tester enumerates all the tests that are necessary to verify that the product fulfils its requirements. It is then always possible to demonstrate the conformity of the implementation with respect to the specification. Moreover, this approach facilitates the maintenance of tests when the product evolves. In order to provide a high level of confidence, the testers use a database which accumulates all the test cases that can be run to reveal faults in smart card applications. In this way, the approach takes advantage of fault-driven testing. The expected results of the tests are provided by a model of the application, which is developed in parallel with the product. Finally, the test coverage can be estimated using a tool that evaluates the test coverage on the model. Test execution is fully automated. It is then possible to stress applications, in order to further increase confidence in the application.

This traditional approach has two major drawbacks: firstly, it requires the development of two instances of the program, and any software evolution implies modifying the models; secondly, if an error occurs during test execution (in fact a divergence in the behaviour), both models must be checked. This process is secure but very costly. Generating the test cases automatically from a specification can reduce this cost. For generating test cases, we need a specification. Such a specification can be obtained through a formal model. Some studies propose to generate the test cases from a B specification [1], [3].

During the last decade, testing theory and algorithms for the generation of tests have been developed from specifications modelled by variants of the Labelled Transition System (LTS) model. An LTS is a structure consisting of states with transitions between them. Transitions are labelled with actions or events. The most efficient algorithms are based on adaptations of on-the-fly model-checking algorithms. Academic tools such as TGV [8] and now industrial tools such as TestComposer (Verilog) already exist. They implement these algorithms and produce correct test cases in a formal framework. We chose to express the specification with a UML model. The specification is automatically translated into a labelled transition system thanks to the UMLAUT tool [12]. Then we use TGV to automatically produce test cases from this LTS and from test purposes produced by hand. We are now working on the methodology in order to help applet designers enrich the UML views so as to obtain a testable UML model.
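To make the notion of a labelled transition system and a test purpose concrete, here is a minimal Python sketch. It is not the algorithm implemented by TGV or TestComposer; it swaps in a plain breadth-first search, and the applet states, the action names and the function `shortest_trace` are all hypothetical names chosen for illustration. The test purpose is reduced to "reach a transition labelled with a given action", and the resulting test case is the shortest sequence of actions that does so.

```python
from collections import deque

# Hypothetical LTS for a toy PIN-protected applet: state -> [(action, next_state)].
LTS = {
    "idle": [("select", "selected")],
    "selected": [("verify_ok", "unlocked"), ("verify_bad", "selected")],
    "unlocked": [("debit", "unlocked"), ("reset", "idle")],
}


def shortest_trace(lts, start, target_action):
    """Breadth-first search for the shortest action sequence ending in target_action.

    Returns the list of actions, or None if no reachable transition carries
    the target label.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, trace = queue.popleft()
        for action, next_state in lts.get(state, []):
            if action == target_action:
                return trace + [action]
            if next_state not in visited:
                visited.add(next_state)
                queue.append((next_state, trace + [action]))
    return None


if __name__ == "__main__":
    # Test purpose: exercise the "debit" action starting from the idle state.
    print(shortest_trace(LTS, "idle", "debit"))
```

A real test generator would also record the expected outputs along the trace and handle non-determinism in the implementation under test; the sketch only shows how a test purpose selects a path through the LTS.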

3 The Constraints

We have seen several good reasons for using formal specifications for smart card applications. Surprisingly, only the productivity advantage is well understood and accepted. Before these methods can be integrated into the development process, several issues must be addressed: development overhead, predictability, human resistance and tools. We believe that for the smart card domain, work must be done on the methodology, and tools must be improved, in order to use formal methods efficiently.

3.1 Development Overhead

We have to keep in mind that the smart card is a mass product and that its price must remain as low as possible in order to be competitive. The price of a smart card ranges from 1 to 10 Euro depending on the chip, and the profit margin is very small. There is strong pressure to reduce the development cost. The potential overhead introduced by formal methods remains acceptable under two conditions:

– If we develop generic components. For example, the backup mechanism, the memory anti-stress and the protocol layer are components that can be reused in every smart card. This overhead is paid off over the number of smart cards produced.
– If we can reduce the test process. This can only be done if the development process is proven down to the implementation. In this case it is possible to remove the unit tests and save a lot of time.

But unfortunately, the Atelier B is currently unable to generate code that fits in the smart card. Moreover, when we proceed to an implementation, many restrictions on the language are imposed. Several structures are not accepted in an IMPLEMENTATION clause. Efficient conversion from specification to machine code is necessary for smart card based applications. In fact the current B0 translator has an overhead of more than 20% compared to manually produced code [13]. While this overhead is acceptable for the code, it is too large for the RAM. For this purpose, we have developed a prototype code translator in order to meet the smart card constraints [5].

3.2 Industrial Constraints

Our main problem with formal development is the lack of metrics to predict the duration of such a development. Predictability is of prime importance for smart card development because of the burning phase. When a new development is scheduled, we have to reserve a time slot with the chip manufacturer to burn the wafers (often a 10-14 month delay). This is often the critical path in the development process, and it is very difficult to modify this time slot. For this reason, it is important to be able to meet the deadlines. Due to the lack of metrics, it remains very hazardous to estimate the time needed for a new development. We have to improve our knowledge by developing several case studies in order to be able to give accurate estimates. This also allows us to identify clearly the potential pitfalls of the domain. For example, the work described in [6] proved useful when we decided to prove the FAÇADE verifier [9]: even though the language and the static semantics are totally different, the problem and the solutions were similar.

The second problem is scalability. When we made our first model of the virtual machine with a reduced number of byte codes, the proof was easily manageable. Unfortunately, when we added new byte codes, the complexity of the proof increased. This was solved by a complete redesign of the formal specification. Solving a sub-problem cannot give accurate information about the complexity of the whole problem.

The last point is more related to the tool. It is often better to adapt the design of the specification to the capacity of the prover by using adequate structures.

For those reasons it is very difficult for a project manager to include formal methods in a product development. We believe that the way to obtain their acceptance is to propose formal models for generic components and to develop higher level components (e.g., a complete virtual machine) alongside the development process.

Are Smart Cards the Ideal Domain for Applying Formal Methods?

3.3 Cultural Resistance

One of the difficulties for a project manager in incorporating formal methods is the weight of the past. Until now, no bug has been discovered in a smart card. To smart card providers it seems natural to design secure systems without formal methods. Moreover, formal methods cannot provide any help against recent attacks on the smart card (such as Differential Power Analysis), nor against physical attacks. Logical attacks are often given less consideration, but this view must be abandoned when considering open operating systems.

Another point is related to the designer. It is difficult to express an abstraction of a problem and to refine it. People consider a formal language like B to be a new programming language, and they have many difficulties with it (abstraction capabilities, a less expressive language). This point can be addressed by adequate training, but it remains difficult to specify a provable model.

3.4 Need of a Methodology

We believe that a clean methodology, with related metrics and tool improvements, would considerably help the integration of formal methods, and in particular B, into the software process. It is important to have guidelines for the specifications and proofs that help the designers. For this purpose we joined a European project named MATISSE, in collaboration with MATRA, Soton university, Åbo Akademi, Steria and DERA. In this project we will exploit and enhance existing methodologies and associated technologies that support the correct construction of critical systems. Our own methodology is related to the certification process. It is often said that the use of formal methods is time consuming and very costly, that they require very skilled people, and that there is a large gap between the semi-formal and the formal specification. But combining a semi-formal language like UML with a formal method like B is probably the least expensive way to meet the CC requirements. In [17], we explain how to apply this methodology to a smart card certification. But while this approach is well suited to certification, it does not help much in modelling a system. Some work has been done in this direction [19] on translating UML views, but our own experience shows how difficult it is to match a B model to a UML class diagram. Even if B shares some characteristics with object languages, the B architecture clauses do not have the same expressiveness. We prefer to have two approaches: one dedicated to certification, with preliminary semi-formal work, and a second one with preliminary informal work: rewriting the specifications in order to clarify them.

The lack of metrics for a B development is a problem for both predictability and quality measurement. It is often said that a high ratio of automatically proved proof obligations (PO) is an indication of a good design. Unfortunately, our own experience shows that this indicator is sometimes wrong. For example, we made two models of a virtual machine: a naïve one, with a high ratio of automatically proven POs, and a second one in which we grouped the opcodes by properties. In the latter case the ratio was poor (around 40%), but the proofs were generic [21], and the complete proof of the specification was done with the second model in a shorter time. We expect that MATISSE will reveal some metrics for quality assessment.

We also expect some tool improvements. The first is linked to the code generator, which must be considerably improved in order to generate code that fits the smart card constraints. But other points must also be improved, from ergonomics to proof management: hypothesis naming, removal of useless hypotheses from the stack, a proof editor, better sub-goal management, and so on. Currently a lot of academic work is being done on tools for the B method (parser, test generator, code generator), and the recent announcement by Steria about the B compiler will probably help to improve the tools.

4 Conclusions

The smart card domain is not yet the expected success story, but it is probably the ideal field for applying formal methods. There are many good reasons for formal methods to be well accepted. One way for card manufacturers to differentiate themselves will be on security and on the method used to achieve it. This goal can be achieved through the certification process, which is now well handled for the medium levels (e.g., EAL5). But a new effort must be made in order to reach the higher levels. Certification is only the visible part of the iceberg. The security of smart cards relies on other work that is probably more fruitful, for example on the generic components of smart cards. But in order to generalize the use of formal methods, we have to prepare their integration into the software process, which remains the real challenge. To achieve it, a lot of work must be done on the methodology, the associated metrics and the tools. We believe that there is not just one formal method suitable for smart cards. The success of the PACAP project [4] and the good acceptance of test generation clearly show that there is enough room for different methods. We now have to define the adequate form (size, training, and mission) of a formal methods team developing models for Gemplus R&D.

References

1. L. Aertryck, L. Benveniste, D. Le Métayer, CASTING: A Formally Based Software Testing Generation Method, IEEE Computer Society, Nov. 1997.
2. M. Alberda, P. Hartel, E. de Jong, Using Formal Methods to Cultivate Trust in Smart Card Operating System, in Proceedings of CARDIS’96, pp. 111-132, Amsterdam, Netherlands, Sept. 1996.
3. S. Behnia, H. Waeselynck, Test Criteria Definition for B Models, FM 99, Vol 1, LNCS 1708, pp. 509-529, 1999.
4. P. Bieber, J. Cazin, V. Wiels, G. Zanon, P. Girard, J-L. Lanet, Electronic Purse Applet Certification, in Workshops on Secure Architectures and Information Flow, Royal Holloway College, December 1999.
5. G. Bossu, A. Requet, Embedding Formally Proved Code in a Smart Card: Converting B to C, submitted to ICFEM, York, Sept. 2000.
6. L. Casset, J.L. Lanet, A Formal Specification of the Java Byte Code Semantics using the B method, ECOOP’99 Workshop on Formal Techniques for Java Programs, June 1999.
7. R. Cohen, The Defensive Virtual Machine Specification Version 0.5, [http://www.cli.com/software/djvm].
8. J.-C. Fernandez, C. Jard, T. Jéron, C. Viho, Using on-the-fly verification techniques for the generation of test suites, in CAV ’96, LNCS 1102, Springer, July 1996.
9. G. Grimaud, J.-L. Lanet, J.-J. Vandewalle, FAÇADE: a typed intermediate language dedicated to smart cards, ESEC 99, Toulouse, Sept. 1999.
10. Y. Gurevitch, C. Wallace, Specification and verification of the Windows Card runtime environment using Abstract State Machines, Microsoft Research, MSR-TR-99-07, Feb. 1999.
11. T. Jensen, D. Le Métayer, T. Thorn, Verification of control flow based security properties, Research Report n° 1210, IRISA, Rennes, Oct. 1998.
12. J.-M. Jézéquel, A. Le Guennec, F. Pennaneac’h, Validating distributed software modeled with UML, in Proc. Int. Workshop UML98, Mulhouse, France, June 1998.
13. J.-L. Lanet, P. Lartigue, The Use of Formal Methods for Smart Cards, a Comparison between B and SDL to Model the T=1 Protocol, Proceedings of the International Workshop on Comparing Systems Specification Techniques, Nantes, March 1998.
14. Maosco Ltd. “MultOs” Web site. [http://www.multos.com]
15. Microsoft Corp. “Smart Card for Windows” Web site. [http://www.microsoft.com/windowsce/smartcard/].
16. S. Motré, Formal Model and Implementation of the Java Card Dynamic Security Policy, AFADL’2000, Grenoble, Jan. 2000.
17. S. Motré and C. Teri, Using B Method to Formalise the Runtime Security Policy for a Common Criteria Evaluation, NISSC, 2000.
18. J.S. Moore, Proving Theorems about Java-like Byte Code, [http://www.cs.utexas.edu/users/moore/publications/tjvm].
19. H.P. Nguyen, Dérivation de spécifications formelles B à partir de spécifications semi-formelles, PhD Thesis, CEDRIC, 1998.
20. J. Posegga, H. Vogt, Offline Byte Code Verification for Java Using a Model Checker, 5th European Symposium on Research in Computer Security (ESORICS) 1998, Springer LNCS 1998.
21. A. Requet, A B Model for Ensuring Soundness of the Java Card Virtual Machine, FMICS 2000, March 2000, Berlin.
22. Sun Microsystems, Inc. Java Card 2.1 Virtual Machine, Run Time Environment, and Application Programming Interface Specification, Public Review ed., Feb. 1999. [http://java.sun.com/products/javacard/javacard21.html].
23. T. Thorn, Vérification de politiques de sécurité par analyse de programmes, PhD Thesis no 2172, Rennes 1, Feb. 1999.

Formal Methods for Industrial Products

Susan Stepney and David Cooper∗

Logica UK Ltd, Betjeman House, 104 Hills Road, Cambridge, CB2 1LQ, UK
{stepneys,cooperd}@logica.com

Abstract. We have recently completed the specification and security proof of a large, industrial scale application. The application is security critical, and the modelling and proof were done to increase the client’s assurance that the implemented system had no design flaws with security implications. Here we describe the application, specification structure, and proof approach. One of the security properties of our system is of the kind not preserved in general by refinement. We had to perform a proof that this property, expressed over traces, holds in our state-and-operations style model.

1 Introduction

Over the past few years we have been working with the National Westminster Development Team (now platform seven), proving the correctness of Smartcard applications for electronic commerce, which are currently being sold as commercial products. We have modelled the abstract security behaviour and properties of the products, modelled the more concrete top level design, and have rigorously proved the preservation of both functional and non-functional security properties. All work was done in Z. We have previously described one of the Smartcard products, an electronic purse [Stepney et al. 1998]. Here we describe another product: a smartcard operating system that ensures a secure environment for running segregated applications.

2 Overview of the Application

A Smartcard operating system should host, and segregate, separately loaded executable applications. If no loaded application can interfere with any other application co-resident on the smartcard, independent application providers can be assured that their own applications are operating in a secure environment. NatWest called in Logica to discover whether it is feasible in a commercial setting both to develop formal models of such a system and its security policy, and to prove that the system design meets all the security properties required.

∗ current address: Praxis Critical Systems Ltd, 20 Manvers Street, Bath, BA1 1PX [email protected]

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 374–393, 2000. © Springer-Verlag Berlin Heidelberg 2000

3 Special Features

There were various things we had to deal with to ensure that we were specifying and proving the right things, and that we were presenting the specification and proof at the right level of abstraction, detail and clarity. Some of these issues are discussed in the following sections.

3.1 Security Models and Proofs

We specified two key formal models: an abstract Security Policy model (SP) that clearly captures the desired security properties of the system, and a more concrete Hardware Design model (HW) that clearly maps to the semi-formal design. We also performed a proof that HW exhibits the security properties of SP. We actually chose to structure the model in three levels: SP, VM (virtual machine, an intermediate level), and HW, introducing implementation detail only where needed, and as low down the specification hierarchy as possible. Having the intermediate model resulted in more proofs, but simpler ones.

3.2 Segregation with Communication

The fundamental security property of Smartcard operating systems is that applications are segregated; one application cannot read or change secret data in another application, either by rogue intent or because of a bug. However, segregation need not be absolute; there could be support for applications to communicate with each other to some limited extent, over explicitly identified overt communication channels. Although much has been published in this area (see, for example, [Bell & Padula 1976], [Rushby 1981], [Goguen & Meseguer 1984], [Bell 1988], [Jacob 1992], [Roscoe 1995], [Gollman 1998], among many others), nothing existing fitted our needs without modification, because of other technical constraints imposed on the particular commercial product we were dealing with. (Industrial scale formal methods work often requires modification of, or extension to, existing idealised academic results, because of conflicting real world constraints and demands.) So, building on the existing concepts, we formulated a suitable property of segregation with communication [Cooper & Stepney 2000] and proved that our system model possesses an appropriate instantiation of this property. Developing a suitable formulation, and proving it holds, was the major technical challenge of our development work.

Summarising that definition: the segregation property is formulated as constraints on sets of system traces (sequences of communication events) that ensure that the system’s applications are behaving independently, except for the explicitly identified communication events. The segregation property states that if certain event traces are allowed, then other event traces, corresponding to the same applications executing in a different order, must also be allowed, because the only way the other traces could be disallowed would be by some covert communication, coordination, or interference between the applications.

This segregation property is a kind of property not preserved in general by refinement. Refinement can be viewed as taking a subset of allowed system traces, provided the subset does not narrow the precondition, but just resolves non-determinism. Yet a property that says ‘if $t_1$ is a trace of the system, then so is $t_2$’ is not necessarily preserved by a subset of the traces (this would correspond to using some covert communication to resolve the non-determinism). So, as well as proving that HW is a refinement of SP (necessary to show that HW has the functional properties of SP), we have to provide a different kind of proof to show that HW also preserves the segregation property of SP.
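The observation that a closure property over traces need not survive the removal of traces can be checked on a toy example. The sketch below is illustrative only: the two events, the `reorderings` rule and the two trace sets are assumptions made for the illustration, and the closure check is a much simplified stand-in for the formulation in [Cooper & Stepney 2000]. The order in which the two applications run is treated as an internal choice, so removing one order resolves non-determinism rather than narrowing a precondition.

```python
from itertools import permutations

def reorderings(trace):
    """All orderings of the same events: a toy stand-in for
    'the same applications executing in a different order'."""
    return {tuple(p) for p in permutations(trace)}

def is_closed(traces):
    """Simplified segregation-style property: if t1 is a trace,
    then every reordering t2 of t1 is also a trace."""
    return all(reorderings(t) <= traces for t in traces)

# Hypothetical abstract system: application A performs event "a",
# application B performs event "b", in either order.
abstract = {(), ("a",), ("b",), ("a", "b"), ("b", "a")}

# A candidate refinement that always runs A first.
concrete = {(), ("a",), ("a", "b")}

print(is_closed(abstract))    # True
print(concrete <= abstract)   # True: only traces have been removed
print(is_closed(concrete))    # False: ("b", "a") is no longer allowed
```

The concrete system is a subset of the abstract one, yet it fails the closure check, which is why a separate argument for segregation is needed alongside the refinement proof.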

4 Modelling Consequences

Because of these special features of the problem, we did not have complete freedom in the way we could specify the model. We had to structure it to allow the various proofs to be performed.

4.1 Segregation and Multi-promotion

The requirement for segregation with communication permeates the entire structure of our specification and proof approach.

Although the segregation property is important, Smartcard operating systems have a lot of other, functional, properties and behaviour that must be specified. These are most naturally captured in Z using a conventional state-and-operations specification style, and SP–HW model correspondence shown with a conventional data refinement proof. In addition, any mapping from the concrete HW model to a semi-formal design is more naturally achievable for a state-and-operations style specification. These considerations led us to adopt such a style.

But then we needed a way of expressing the trace-based segregation property as a property of our state-and-operations model. We proved an unwinding theorem [Cooper & Stepney 2000], which allowed us to show that (a particular form of) unconstrained multi-promotion has our segregation property.

Promotion is a commonly used Z specification structuring technique that allows operations specified on individual ‘local’ pieces of state to be ‘promoted’ to operations on a ‘global’ state comprising labelled copies of the local state (explained in [Barden et al. 1994, chapter 19]). A ‘framing schema’ identifies the single piece of local state being changed, and requires the other pieces of local state to be unchanged. The framing schema is then combined with the relevant local operation schema (and the local state hidden) to say how that identified local state changes.

Multi-promotion is an obvious extension to allow an operation to affect two or more pieces of local state in concert. In a Smartcard operating system with communicating applications, there is the need to promote a single application, or two applications, or three applications, or all loaded applications, on occasion. For the case of two pieces of local state promoted together, with a global state like $global : ID \pfun Local$, the relevant framing schema might look like

\begin{schema}{\Phi Two}
  \Delta Global \\
  \Delta Local \\
  \Delta Local2 \\
  from?, to? : ID
\where
  \disjoint \langle \{from?\}, \{to?\} \rangle \\
  \{ from? \mapsto \theta Local, to? \mapsto \theta Local2 \} \subseteq global \\
  global' = global \oplus \{ from? \mapsto \theta Local', to? \mapsto \theta Local2' \}
\end{schema}

and then the corresponding promoted operation would look like

\begin{zed}
  GlobalOp \defs \exists Local; Local2 \spot \Phi Two \land LocalOpFrom \land LocalOpTo2
\end{zed}

In general, multi-promotion allows an arbitrary choice of the number of applications promoted.

Unconstrained multi-promotion is a promotion where there are no global constraints on the promoted state or operations, that is, no constraints linking pieces of local state. Informally: because all the operations are defined in terms of local (single application) state only, there are no opportunities for one local state’s behaviour to be influenced by another’s at the point of specification, so no communication can occur. Formally: our unwinding theorem proves that this is the case.

Security properties can have notoriously counter-intuitive consequences, so we were very careful to prove our property formally, rather than relying on informal justifications. It is relatively easy to justify that an operation on one piece of state does not alter another piece of state: that other piece of state can be seen to be unchanged. It is much harder to justify that every operation changes its piece of state in a way that is independent of the value of all other segregated pieces of state: the other states do not change, but their values are accessible through the mathematical formulation. Our formalism not only made precise what was meant by segregation with communication, but also formally justified that unconstrained multi-promotion, with its explicit communication between the promoted states, exhibits this form of segregation.
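For readers less familiar with promotion, the effect of a two-way framing schema can be sketched in ordinary code. The following is a minimal illustration under stated assumptions, not the specification itself: the local state is reduced to a single integer, identifiers are strings, and `promote_two`, `local_op_from` and `local_op_to` are hypothetical names. Z schemas describe relations and may be non-deterministic, whereas the functions here are deterministic for simplicity.

```python
from typing import Callable, Dict

Local = int                    # hypothetical local state: a single counter
Global = Dict[str, Local]      # partial function from identifiers to local states
LocalOp = Callable[[Local], Local]

def promote_two(glob: Global, frm: str, to: str,
                op_from: LocalOp, op_to: LocalOp) -> Global:
    """Apply op_from to glob[frm] and op_to to glob[to]; every other
    piece of local state is left unchanged (the role of the framing schema)."""
    if frm == to:
        raise ValueError("from? and to? must be distinct")
    if frm not in glob or to not in glob:
        raise KeyError("both applications must be present in the global state")
    new_glob = dict(glob)                 # copy, so all other entries are unchanged
    new_glob[frm] = op_from(glob[frm])    # each local operation sees only
    new_glob[to] = op_to(glob[to])        # its own piece of local state
    return new_glob                       # the override of the two entries

def local_op_from(local: Local) -> Local:   # hypothetical local operation
    return local - 1

def local_op_to(local: Local) -> Local:     # hypothetical local operation
    return local + 1

before = {"app1": 5, "app2": 0, "app3": 7}
after = promote_two(before, "app1", "app2", local_op_from, local_op_to)
print(after)   # {'app1': 4, 'app2': 1, 'app3': 7}
```

Each local operation receives only its own local state as an argument, which mirrors the informal reading of ‘unconstrained’: nothing in the promoted operation lets one application’s update depend on the value of another’s state.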
Unfortunately, the formulation of our unwinding theorem, although a kind of multi-promotion, is not expressed in the form most natural for a Z specification. It is centred around the communication events, and those particular applications involved in the event have to be deduced. On the other hand, the conventional Z multi-promotion style illustrated above identifies explicitly which applications are involved in a particular promoted operation, and the corresponding communication event must be deduced.


We had two choices in order to prove our system model to be segregated: either write our specification directly in the communication-centred form suited to the segregation theorem, or prove that our specification written in the more natural application-centred form was equivalent to one formulated in terms of communications. After experimenting with both approaches, we decided on the second one. Although that choice requires performing an extra proof, we felt that the added clarity of the specification, and ease of proof in other areas, outweighed the penalty.

4.2 Modelling the Functionality

When talking about an operating system supporting applications, one most naturally thinks of the OS as a ‘layer’ beneath the applications. However, that natural modelling approach is not compatible with our view of segregation, which is expressed in terms of applications only.

Smartcard operating systems allow user applications to be securely loaded, securely deleted, and securely executed. In addition to loading and deleting user applications, and mediating inter-user-application communication, Smartcard operating systems usually offer some additional trusted functionality, such as random number generation and various query functions. We needed to incorporate such functionality into our segregation framework, which recognises only ‘applications’. In addition, ISO standard Smartcards have some required functionality (Master File, ATR File and Directory File) that behaves in some ways like user applications (they are selectable) but not in others (they have fixed functionality and are permanently resident).

So we modelled loadable and deletable user applications, we modelled ISO standard functionality as three special applications, and we modelled the operating system functionality itself as the single Scos trusted application.

In order to simplify the segregation proof, which talks of a single kind of application, we used a free type to build applications from user applications, ISO applications, and the Scos application. Also, because user applications are loadable and deletable, but the segregation formulation assumes the segregated applications are fixed (it assumes a total mapping $APPL \fun LocalState$), we also modelled absent user applications.

\begin{zed}
  APPL ::= scos \ldata Scos \rdata | iso \ldata ID \rdata | user \ldata UserAppl \rdata | absent \ldata ID \rdata
\end{zed}

This formulation using a total mapping over a free type does make the specification a little clumsy in places (particularly the continual extraction of states from their free type wrappers), but makes it possible for us to prove the segregation property.
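The idea of a fixed population of applications, with ‘absent’ standing in for user applications that are not loaded, can be sketched as a tagged union. The code below is only an illustration under assumptions: the slot names, the string payloads and the functions `initial_card`, `load`, `delete` and `user_code` are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass
from typing import Dict, Union

@dataclass(frozen=True)
class Scos:      # the single trusted operating system application
    state: str

@dataclass(frozen=True)
class Iso:       # one of the ISO standard applications
    ident: str

@dataclass(frozen=True)
class User:      # a loaded user application
    appl: str

@dataclass(frozen=True)
class Absent:    # placeholder for a user application that is not loaded
    ident: str

Appl = Union[Scos, Iso, User, Absent]

# Hypothetical fixed set of identifiers; the mapping is total over this set.
SLOTS = ("scos", "mf", "atr", "dir", "u1", "u2")

def initial_card() -> Dict[str, Appl]:
    card: Dict[str, Appl] = {"scos": Scos("idle"), "mf": Iso("mf"),
                             "atr": Iso("atr"), "dir": Iso("dir")}
    card.update({s: Absent(s) for s in ("u1", "u2")})
    return card

def load(card: Dict[str, Appl], slot: str, appl: str) -> Dict[str, Appl]:
    """Loading replaces an absent entry; the set of slots never changes."""
    if not isinstance(card[slot], Absent):
        raise ValueError("slot already occupied")
    return {**card, slot: User(appl)}

def delete(card: Dict[str, Appl], slot: str) -> Dict[str, Appl]:
    """Deleting turns a loaded user application back into an absent one."""
    if not isinstance(card[slot], User):
        raise ValueError("only loaded user applications can be deleted")
    return {**card, slot: Absent(slot)}

def user_code(appl: Appl) -> str:
    """Extract the payload from its wrapper; other cases must be handled."""
    if isinstance(appl, User):
        return appl.appl
    raise TypeError("not a user application")

card = load(initial_card(), "u1", "purse")
print(sorted(card) == sorted(SLOTS))   # True: the mapping stays total
print(user_code(card["u1"]))           # purse
```

The explicit `isinstance` checks give a flavour of the clumsiness mentioned above: every use of an application's contents first has to unwrap it from its constructor.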

5 Determinism

We used determinism in two different places to solve two different modelling problems.

5.1 Imposed Determinism

We had to make sure our SP model is sufficiently constrained not to allow unwanted refinements; in particular, that any resolution of non-determinism does not subvert certain confidentiality requirements (this is in addition to the requirement that the refinement also preserves the segregation property).

The Scos is intended to be a trusted application: trusted by the other applications not to pass any of their communications with it to other applications. For example, the Scos application can be trusted not to store the last random number it gave to application A, then use that as a way of resolving a non-deterministic interaction with application B, and it can be trusted not to store the messages that it passes between an application and the external communication channels. So, in our SP, the Scos application’s behaviour must be tied down sufficiently that it can be seen to be trustworthy. We achieved this by making the abstract Scos state small, and imposing operational determinism. Thus it is not possible to use secret or covert information to resolve the non-determinism in a more concrete model.

Because the specific behaviour of user applications is not specified in the abstract SP model, it is not possible to make that model explicitly deterministic. Instead, we added a predicate to assert that every operation’s behaviour is deterministic, without specifying what that behaviour is; we introduced a requirement for functionality into the system behaviour.

If the abstract state is captured by the schema $A$, and the operation by the schema $AOp$, we can define a function that converts operations defined using delta schemas into operations defined as relations between (before and after) schemas, as:

\begin{axdef}
  relA : \power AOp \fun (A \cross IN) \rel (A \cross OUT)
\where
  \forall op : \power AOp \spot relA~op = \{~ AOp | \theta AOp \in op \spot (\theta A, m?) \mapsto (\theta A', m!) ~\}
\end{axdef}

Assume the non-determinised form of the (total) operation is $AOp$. We can define possible deterministic forms as¹

\begin{zed}
  aDet == \power (relA~AOp) \cap (\_ \fun \_)
\end{zed}

and an augmented state as

\begin{zed}
  ADet \defs [~ A; f : aDet ~]
\end{zed}

The deterministic operation is then

\begin{schema}{AOpDet}
  \Delta ADet \\
  m? : IN \\
  m! : OUT
\where
  f' = f \\
  f (\theta A, m?) = (\theta A', m!)
\end{schema}

This says that the particular choice of determinism, $f$, is unchanged by the operation, and that the operation behaves like $AOp$ and is deterministic in the way captured by $aDet$.

Only the operations of the SP model are constrained to be deterministic in this way. The initialisation is highly non-deterministic, because it does not constrain the value of $f$. Refinement chooses the particular deterministic behaviour that is implemented. We proved our refinement using the conventional ‘forward’ Z refinement rules ([Spivey 1992b, chapter 5], augmented with finalisation [Stepney et al. 1998]), and not the ‘backward’ Z refinement rules, thus showing that the refinement did not move the non-determinism at initialisation to occur later²; the refined operations stay deterministic, and so the refined Scos application remains trustworthy.

If it were impossible to make the operations deterministic (if $aDet$ were empty), adding such a constraint to $ADet$ would make the state empty. We proved this was not the case by showing the existence of such an $f$ expressed in terms of a more concrete model proved both to be deterministic (including initialisation) and to be a refinement of the non-determinised abstract model. This concrete model is then a suitable refinement of the (non-empty) determinised abstract model.

¹ In general, this is a sufficient, but not necessary, constraint for determinism, as discussed in section 5.3. In this case, however, our abstract model is “sufficiently abstract” in that all its state is observed, either through outputs or finalisation. (Technicalities aside, a merely sufficient condition for determinism is sufficient for our pragmatic, industrial purposes.)

² An uninterpreted Z specification is not sufficient by itself to define the legal implementations. For example, it may not be clear what schemas are intended to correspond to operations to be implemented, and which are merely scaffolding. The most common difference in interpretation is behaviour outside the precondition: the forward rules of [Spivey 1992b, chapter 5] allow “weakening the precondition”, whereas different Z refinement rules need to be used to support a firing condition interpretation [Josephs 1991]. It is customary in Z specifications to leave much of this interpretation implicit; we were more careful, stating (necessarily informally) precisely what schemas comprised the operations, which refinement rules we were using, and why. All this kind of validation and meta-argumentation, that the right property is being proved, and that the proof performed really does establish that property, is carefully documented for the reviewers’ scrutiny.

5.2 Determinism and Refinement

We used our unwinding theorem and multi-promotion to prove that our top-level SP and intermediate-level VM models are segregated. However, our lowest level HW model is not structured as an unconstrained multi-promotion: it is highly constrained, because it shows how the applications are arranged in a flat memory space, and how they share use of the RAM. Such a constraint is the very kind of thing that might indicate a covert communication through the shared memory. We had to prove that the applications as laid out in a single flat memory space remain segregated.

We needed another way to demonstrate segregation. Segregation is expressed in terms of traces; refinement in terms of subsets of traces. If a model is deterministic, a refinement cannot resolve non-determinism (remove traces). If a model is also total, a refinement cannot weaken the precondition (add traces). So a refinement of a total, deterministic model has the same traces: if the original is segregated, so must be the refinement. We used this fact to prove segregation of our HW model: by proving segregation (by unconstrained multi-promotion), totality, and determinism of our VM model.

5.3 Determinism and Traces

Whilst trying to solve the problem of making the Scos provably trustworthy, and proving the VM and HW have the same traces, we found ourselves bandying about phrases like ‘operationally deterministic’, yet when we came to write down the proof obligations, we realised our first naive attempt was too restrictive. We had thought we had to prove that the state transition is functional, that is, to prove that

\begin{zed}
  BOp \vdash relB~BOp \in B \cross IN \pfun B \cross OUT
\end{zed}

(where $relB$ is defined with respect to state $B$ and operation $BOp$ in a similar way to $relA$ above³).

However, consider a specification of a state comprising a set, with some obviously deterministic operations on it such as ‘add an element’ and ‘remove an element’. It would be quite legitimate to refine this set to a sequence, and the operation of ‘add an element to the set’ to ‘if it is not already there, add the element anywhere in the sequence’. The abstract state transition relation is functional, but the concrete state transition relation is no longer functional, yet the observed behaviour is still deterministic. And this corresponds closely to the case of a Smartcard operating system: in the concrete HW model there are various possible ways of laying out applications in memory, but the observed behaviour is independent of which way this layout is actually implemented.

Our naive proof obligation was too strong. We were able to use our trace model of segregation to help us determine the appropriate proof obligation for determinism.

³ It would be nice if such a relation could be defined generically. This is not possible in Z as it stands today, because generic definitions cannot be constrained to be applicable to particular sets, such as schemas. Type constrained generics [Valentine et al. 2000] would allow such a definition.
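The set-to-sequence example from this section can be made concrete. The sketch below is an illustration under assumptions: states are modelled as Python values, each operation returns the set of its possible after-states, and `observe` is a hypothetical observation that exposes membership but not order. It covers a single step only, not the full trace-based obligation.

```python
def abstract_add(s: frozenset, x) -> set:
    """Abstract 'add an element': exactly one possible after-state."""
    return {s | {x}}

def concrete_add(seq: tuple, x) -> set:
    """Concrete 'add': if x is not already there, insert it anywhere."""
    if x in seq:
        return {seq}
    return {seq[:i] + (x,) + seq[i:] for i in range(len(seq) + 1)}

def is_functional(after_states: set) -> bool:
    """A transition is functional if it allows at most one after-state."""
    return len(after_states) <= 1

def observe(seq: tuple) -> frozenset:
    """Hypothetical observation: only membership is visible, not layout."""
    return frozenset(seq)

abs_after = abstract_add(frozenset({1, 2}), 3)
con_after = concrete_add((1, 2), 3)

print(is_functional(abs_after))                    # True
print(is_functional(con_after))                    # False: three possible layouts
print(len({observe(s) for s in con_after}) == 1)   # True: one observable outcome
```

The concrete step fails the naive functionality check because of the choice of insertion position, while every choice yields the same observation, which is the sense in which the naive obligation is too strong.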

6 Functional and Non-functional Properties

6.1 Two Security Models

We needed to specify various functional security properties, based in the usual way on an external view (inputs and outputs) of the system. We also needed to express the segregation property, based on a necessarily internal view of the inter-application communications, not engaged in with the outside world. These two views need to be related somehow. We also had to make the specification structure match the implementation; in particular, cope with the fact that the overt inter-application communication channels are unobservable from outside the smartcard.

So we wrote two security policy models, one capturing the functional properties, SPf, with only the external communications visible, and one capturing the segregation property, SPs, with all the external and internal communications present. In principle, these models could be entirely unrelated. In practice, we made them very similar: they differ only in the external observability of the inter-application communication channels, which are fully visible in the SPs model, and finalised away [Stepney et al. 1998] to invisibility in the SPf model.

We also wrote two corresponding intermediate models, VMf and VMs, and proved that each captured the required properties of the corresponding SP model. We wrote a single concrete model, HW, corresponding to the implemented device, and proved it possessed the properties of both the SP models, via the VM models.

6.2 Differently Segregated

Segregation is a property of a single system. Two different systems may each be segregated, yet have no relationship to each other. For example, once we have carefully specified our segregated SP, with lots of separate SP-applications not interfering, we do not want to be presented with a purported implementation that bundles the whole behaviour into a single HW-application, despite such a single-application system necessarily being ‘segregated’ according to our definition. So we defined the property of segregation with respect to a model (segWrt) to capture the fact that the two models are segregated in the same way. This boils down to having corresponding applications communicating in the same way. We proved that a sufficient condition for B segWrt A, where A is segregated, is for B itself to be segregated with the same interpretation of application structure as A (that is, using the same asEvent bijection introduced in [Cooper & Stepney 2000]), and for B also to be a refinement of A. Even that is not sufficient for our purposes, because we have two security models: SPs, defining the segregation structure, and SPf, defining the visible functional behaviour. Our final concrete model HW is necessarily a refinement of SPf, not of SPs, and so cannot be segregated with respect to it.

[Figure 1: diagram with boxes SPs, SPf, VMs, VMf, HW and ellipses labelled "is SEG", "refines", "is segHidWrt", "functional properties", "has same traces"]

Fig. 1. Overview of relationships between the formal models (boxes) and the proofs (ellipses)

So we defined the property of segregated and hidden with respect to a model (segHidWrt), to capture the fact that one model behaves as if it is segregated in the same way as another, except that the communication channels are hidden. This corresponds quite nicely to the implementation, which is not physically segregated – applications share a flat memory space, and perform their allowed communications using shared memory buffers – yet nevertheless they behave as if they are segregated. Which is the whole point of the exercise, of course. We also proved some properties about these relationships, in order to be able to prove that our concrete model has the desired segregation property. In particular:

    ⊢ C segHidWrt Bs ∧ Bs segWrt As ⇒ C segHidWrt As
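To show how a property of this shape is used, here is a minimal Lean 4 sketch. It is an assumption-laden illustration rather than the authors' formalisation: models are treated as an opaque type, `segWrt` and `segHidWrt` are uninterpreted predicates, and the quoted property is simply postulated as an axiom named `segHid_chain` (a hypothetical name).

```lean
-- Hypothetical encoding: models are opaque, and the two relations are
-- uninterpreted predicates; only the property quoted in the text is assumed.
axiom Model : Type
axiom segWrt : Model → Model → Prop
axiom segHidWrt : Model → Model → Prop

-- The property from the text, taken here as an assumption.
axiom segHid_chain {C B A : Model} :
  segHidWrt C B → segWrt B A → segHidWrt C A

-- Chaining a concrete model through an intermediate one.
theorem concrete_segHid (C B A : Model)
    (h₁ : segHidWrt C B) (h₂ : segWrt B A) : segHidWrt C A :=
  segHid_chain h₁ h₂
```

The point of the sketch is only the shape of the argument: one segHidWrt step and one segWrt step suffice to relate the most concrete model to the most abstract one.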

7 Resulting Specification Structure

The specification and proof structure we developed is summarised in figure 1. This structure is intended to simplify the proofs. It also has the advantage that some of the details of the virtual machine can be changed without changing the model of security. We prove that HW possesses the security properties, by proving both that it is a refinement of SPf and that it is segregated and hidden with respect to SPs.

7.1 Abstract Security Policy Model, SP

The abstract SP model describes the world of applications and their communication through explicitly identified overt communication channels. It expresses some functional security properties to do with securely loading and deleting user applications, and the key non-functional property, that applications are segregated: that they do not communicate or otherwise interfere with each other, except over the overt channels.

Our SP model is relatively small, simple, and easy to understand, running to approximately 40 pages of Z and natural language commentary⁴. The difference between the SPf and SPs models is captured in two different finalisation schemas. The simplicity of the SP model allows these communication channels to be clearly identified, so that the client can easily verify that these channels are acceptable. 40 pages of Z may sound a lot for an abstract model, but most of the complexity was in the identification of the several overt communication channels present in the design.

The required functional security properties are proved to be consequences of the various SPf operations. The SPs model is constrained to be segregated, which gives us the segregation property by definition. The behaviour of the virtual machine, and hence the behaviour of user applications, is not specified at this level. No matter what a user application does, the system is secure (segregated). SP is secure, by definition.

7.2 Virtual Machine Model, VM

Our more concrete VM model captures the behaviour of the Virtual Machine that ensures that abstract applications remain segregated. In practice, segregation is achieved by performing run-time memory access checks; this is the critical aspect of the VM specification. Our VM model is more complicated than the SP, reflecting the design of the Virtual Machine. VM adds more design detail to SP by specifying the detailed behaviour of the virtual machine; it captures the actual behaviour of a user application, given its code. This model is approximately 140 pages long, of which about 80 pages is a detailed description of the virtual machine. Again, the difference between the VMf and VMs models is captured in two different finalisation schemas.

7.3 Concrete Hardware Model, HW

Our concrete HW model captures the memory map of the design, showing how the segregated applications are securely implemented in a common flat memory space of physical RAM, ROM, and EEPROM, with shared use of the RAM. Our HW model is approximately 20 pages long. It captures the memory structure explicitly; the operations are defined indirectly, in terms of the VM operations and the retrieve relation.

⁴ The various page lengths quoted here give an indication of the relative effort involved in each of the specification and proof sections. The actual effort involved was not inconsistent with the metric discovered in [Barden et al. 1992].

8 Resulting Proof Structure

8.1 Proof Tree

Some of the arguments we have presented concerning segregation, determinism, and refinement are subtle. In the morass of detail inherent in a large scale specification and proof, it would be easy to miss out some steps. As well as convincing ourselves we had not missed anything, we also had to make the proof structure comprehensible to third party reviewers. We devoted two chapters of the final document just to documenting the proof structure. The first of these chapters is an overview of the structure, describing what needs to be proved, how the proofs are broken down into large components, and which proofs rely on other proofs (illustrated in sections 8.3 and 8.4 below). The second chapter is a detailed proof tree, summarising the entire proof structure, showing what proofs are done where in the document, and demonstrating that everything that needs to be proved has been proved.

8.2 Proof Sizes

All but one of the security properties of our abstract model are functional, and so are preserved by refinement. The segregation property is non-functional, and is not preserved in general by refinement. So we rigorously proved that our concrete HW model is a refinement of our abstract SP model (thus proving it exhibits the functional security properties), and that the HW concrete model also segregates applications (thus proving it exhibits the non-functional security property). The purpose of performing a proof is to greatly increase the assurance that the chosen design (the behaviour of the virtual machine, and the memory flattening) does, indeed, behave just like the abstract model.

We chose to do rigorous proofs by hand, because our experience of existing proof tools is that current tools are not yet appropriate for a task of this size⁵. We did, however, type-check the statements of the proof obligations and many of the proof steps using a combination of fuzz [Spivey 1992a] (see appendix A) and Formaliser [Flynn et al. 1990] [Stepney]. All proofs were also independently checked by third party reviewers.

The proofs of the refinement obligations, the preservation of the segregation property, and the proofs of some model consistency obligations take approximately 280 pages. In addition, there are approximately 100 further pages of formal derivation in support of the underlying theory of segregation with communication over overt channels. We performed various consistency proofs and proofs that our SP model possesses the desired security properties. But the bulk of the proof work was showing that the HW model is consistent with the SP model: that it has the segregation property of the SPs model, and the functional properties of the SPf model.

⁵ Each ‘proof step’ in our rigorous proof is fairly small for a hand proof, because of the requirement for checking by independent reviewers: we could not instruct them to do “several pages of (unspecified) algebra” for each step. So each step typically involves one (or a few) applications of a simple inference rule such as cut, one-point, Leibniz, or of a Z toolkit law, or of a schema calculus law. Our Z proof tool evaluation exercises show that each of these rigorous steps typically expands out to 20–100 elementary steps when performed with a tool such as CADiZ [Toyn 1996] (ignoring the steps needed to prove the toolkit law, where relevant).

8.3 HW has SP Segregation Property

We prove that HW behaves as if it is segregated in the same way as SPs, except that the communication channels are hidden:

    ⊢ HW segHidWrt SPs

We do this by introducing the intermediate VMs and VMf models, and using the property of segHidWrt that

    C segHidWrt Bs ∧ Bs segWrt As ⇒ C segHidWrt As

which allows us to break the proof into two parts:

1. ⊢ HW segHidWrt VMs
   We show that the HW model is segregated in the same way as the VMs model, except for the internal communications being hidden, by showing that the VMf model is segregated in the same way as the VMs model, and that the HW model has the same traces as the VMf model.
   a) ⊢ VMf segHidWrt VMs
      This is easy to prove, because VMf is equal to VMs except for the internal communications being hidden by finalisation, which is the definition of segHidWrt.
   b) traces HW = traces VMf
      The traces are the same if HW is a refinement of VMf, and VMf is total (so no traces can be added by widening a precondition) and deterministic (so no traces can be removed by resolving non-determinism).
      i. ⊢ VMf ⊑ HW
         We define the HW model operations in terms of the VM model operations and a retrieve relation, and so the refinement holds by construction provided certain properties hold of the retrieve relation [Woodcock & Davies 1996, section 18.3]. That is, we prove that the local retrieve is functional from HW to VMf, is total (covers the HW state), and is surjective (covers the VMf state).
      ii. ⊢ isTotal VMf
         We prove the preconditions of all the VMf operations are true.
      iii. ⊢ isDeterministic VMf
         We prove all the VMf operations are functional.
2. ⊢ VMs segWrt SPs
   We show that the VMs model is segregated in the same way as the SPs model, by showing that it is segregated, and that it is a refinement of the SPs model.
   a) ⊢ VMs isSegregated
      We prove that the VMs model is equivalent to one written as an event-centric unconstrained multi-promotion. The unwinding theorem from [Cooper & Stepney 2000] gives us that such a model is segregated.
   b) ⊢ SPs ⊑ VMs
      We prove refinement.

8.4 HW has SP functional properties

We prove that HW is a refinement of SPf:

    ⊢ SPf ⊑ HW

We introduce the intermediate VMf model, and use transitivity of refinement to split the proof into two parts.

1. ⊢ SPf ⊑ VMf
   a) We state and prove lemma ‘squeeze’, which shows that refinement is preserved under hiding the internal communications.
   b) We apply lemma ‘squeeze’ to ⊢ SPs ⊑ VMs, proved above in 8.3 2(b).
2. ⊢ VMf ⊑ HW, proved above in 8.3 1(b)i.
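The dependency structure of sections 8.3 and 8.4 can be written down as data and checked mechanically, which is roughly the bookkeeping a proof tree chapter has to support. The Python below is only a sketch under that reading: the dictionary layout and the helper names `undischarged` and `leaves` are assumptions, while the goal names are copied from the text.

```python
# Each goal maps to the sub-goals it is derived from (sections 8.3 and 8.4).
# Goals that do not appear as keys are leaves and must be proved directly.
PROOF_TREE = {
    "HW segHidWrt SPs": ["HW segHidWrt VMs", "VMs segWrt SPs"],
    "HW segHidWrt VMs": ["VMf segHidWrt VMs", "traces HW = traces VMf"],
    "traces HW = traces VMf": ["VMf ⊑ HW", "isTotal VMf", "isDeterministic VMf"],
    "VMs segWrt SPs": ["VMs isSegregated", "SPs ⊑ VMs"],
    "SPf ⊑ HW": ["SPf ⊑ VMf", "VMf ⊑ HW"],
    "SPf ⊑ VMf": ["lemma squeeze", "SPs ⊑ VMs"],
}


def undischarged(goal, discharged, tree=PROOF_TREE):
    """Return the leaf obligations under `goal` that are not in `discharged`."""
    subgoals = tree.get(goal)
    if subgoals is None:  # a leaf: it must have been proved directly
        return set() if goal in discharged else {goal}
    missing = set()
    for sub in subgoals:
        missing |= undischarged(sub, discharged, tree)
    return missing


def leaves(goal, tree=PROOF_TREE):
    """All leaf obligations that `goal` ultimately rests on."""
    return undischarged(goal, discharged=set(), tree=tree)
```

Calling `leaves("SPf ⊑ HW")` lists the three obligations that the functional-properties proof rests on, two of which are shared with the segregation proof, which is why 8.4 can say "proved above".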

9 Results

As well as providing a specification and proof that helped our customer gain assurance about the security of their product, the use of formality and proof improved the design and exposed some problems.

9.1 Design of the Virtual Machine

A major part of a Smartcard operating system’s security functionality is provided by its virtual machine: this performs appropriate run-time memory access checks to ensure applications access only their own memory. The formal specification work proved that the designed checks are indeed sufficient to ensure segregation. But in addition, the formalisation of the checks fed back into the documentation of the virtual machine, documenting more clearly, uniformly and precisely what checks are needed.

The formal modelling was a valuable part of the iterative design process. To start with, formality was used as a thinking aid, as we and the design team used a Rapid Application Development approach [DSDM Consortium] to sketch the design of the virtual memory model and opcode structure. The formal work then became more detailed, and the particular memory access checks were specified in detail.

In addition, the specification work exposed a flaw in the original design of one of the opcodes: in certain rare circumstances it could overflow and overwrite memory outside its allowed region. The opcode was redesigned to remove the flaw.

9.2 Identification of Communication Channels

Segregation required that the communication channels between applications be made overt. This then allowed a decision as to whether such channels were appropriate, both in existence, and in bandwidth. The Scos application provides some functionality to the user applications, including a random number generator. In order to faithfully model the way the generator works, it was necessary to introduce an overt communication channel. This exposed the fact that there is a potential communication of the random number between applications; further analysis demonstrated that this channel could not in fact be used to pass any useful information.

9.3 Proof Detected an Error

An early version of the design had a rather subtle error to do with clearing RAM when swapping between applications. In one very special case, which required unloading the only application on a card, then loading another in a special mode, the RAM was not properly cleared, resulting in a potential covert communication. This error was detected both by the design team and by the formal modelling. Interestingly, it was not the proof effort itself that detected the error, it was in the mapping between the formal model and the semi-formal design. The VM model is deterministic, so it fully specifies the contents of RAM, and segregated, so it specifies the contents of RAM for each unpromoted application. The most sensible deterministic specification of the contents of RAM for a newly loaded application is to set it to some predefined ‘cleared’ value: this clearing did not occur in the semi-formal model, and so the mapping detected a flaw. (Had the formal model been written from the semi-formal model, the proof of determinism would not have been possible, which would have uncovered the flaw at that point.) So, in this case, just thinking about what the proof obligations were going to be influenced how we wrote the model, and exposed the design flaw.
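The two mechanisms this section turns on, run-time memory access checks and clearing RAM when an application is loaded, can be pictured with a toy model. The Python below is a hypothetical sketch, not the product's design: the class `ToyCard`, its methods, the region representation, and the `CLEARED` value are all assumptions introduced for illustration.

```python
class AccessViolation(Exception):
    """Raised when an application touches memory outside its own region."""


class ToyCard:
    """Hypothetical flat RAM shared by applications, each confined to a region."""

    CLEARED = 0  # assumed predefined 'cleared' value

    def __init__(self, size):
        self.ram = [self.CLEARED] * size
        self.regions = {}  # application name -> (base, limit), limit exclusive

    def load(self, app, base, limit):
        """Load an application and clear its region, so that its initial RAM
        contents are fully determined and carry nothing over from earlier use."""
        if not (0 <= base <= limit <= len(self.ram)):
            raise ValueError("region outside RAM")
        for lo, hi in self.regions.values():
            if base < hi and lo < limit:
                raise ValueError("regions overlap")
        self.regions[app] = (base, limit)
        for addr in range(base, limit):
            self.ram[addr] = self.CLEARED

    def unload(self, app):
        del self.regions[app]

    def _check(self, app, addr, length=1):
        base, limit = self.regions[app]
        # Check the whole range, not just the first address, so that a
        # multi-cell write cannot run past the end of the region.
        if length < 1 or addr < base or addr + length > limit:
            raise AccessViolation(
                f"{app}: [{addr}, {addr + length}) not in [{base}, {limit})")

    def read(self, app, addr):
        self._check(app, addr)
        return self.ram[addr]

    def write_block(self, app, addr, values):
        self._check(app, addr, len(values))
        self.ram[addr:addr + len(values)] = values
```

In this sketch a block write that starts inside a region but would run past its end is rejected, and a newly loaded application always reads the cleared value even where an unloaded application had previously written. Omitting the clearing loop in `load` would leave the new application's initial RAM dependent on history, which is the kind of gap a deterministic specification makes visible.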

10 Lessons Learned

10.1 Model Structure Versus Proof Structure

There is a fine balance between model structure clarity and ease of proof. In order to prove the difficult property of segregation, we sacrificed some clarity for ease of proof (by using a free type to bundle the different kinds of applications into one, and by requiring the global promotion function to be total), and we sacrificed some ease of proof for clarity (by using the more conventional form of multi-promotion, and proving it equivalent to the unwound form). We experimented with the alternative approaches before converging to this particular compromise.

10.2 Presentation

The specification is large, and the proof structure subtle. The third party reviewers had to navigate a complex document, had to be able to find definitions, and had to be assured nothing had been left out. The index and the proof tree chapter were arguably two of the most important sections in the final document. Even the authors found these chapters essential when coming back to the document after a break!

10.3 Providing Further Justification

On their first pass through the model, the third party reviewers raised an observation about the size of the input space: we had formally modelled more input messages than could actually be implemented. The reviewers wanted us to justify that this was not a problem: that the implemented restriction could not be used to covertly signal information. We provided such a justification (as an appendix) in the next version of the model. So the formal development process can be iterative: external comments can require a rethink, further justifications, and more detail to be provided.

10.4 Elegant Mathematical Results may not Help

Just because the statement of a proof obligation is simple and elegant, does not mean that its application to a particular problem will be simple and elegant. Much hard, potentially messy, proof work may be required.

We had an elegant formulation of segregation, but it was not in a form that mapped naturally to the conventional state-and-operations style of Z specification we used for the modelling work. Even after moving the result into the Z world, and unwinding it to a multi-promotion form, it still did not allow a natural specification style. So we did not use it in the modelling, which necessitated us discharging an extra proof obligation.

We also used a specification trick to define the HW model operations in terms of the VM model operations. This simplified the modelling enormously, and all we had to do to prove refinement was to prove that the retrieve was functional, total, and surjective [Woodcock & Davies 1996, section 18.3]. The proof obligation can be expressed in one line:

    ⊢ R ∈ HWState ↠ VMState

However, that one line hides a wealth of messy and not very interesting detail: the state spaces of both states have many components. After expanding out the states and the retrieve, the mere statement of the proof obligation extends over several pages. The proof itself was quite cumbersome.
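For a finite toy state space, the three properties required of the retrieve relation can be checked directly. The following Python is a sketch under stated assumptions: `is_total_surjection` is a hypothetical helper, and the example states (sequences as concrete layouts, sets as abstract states) are invented for illustration and are far smaller than the real HW and VM state spaces.

```python
def is_total_surjection(relation, source, target):
    """Check that `relation`, a set of (concrete, abstract) pairs, is a
    functional, total and surjective relation from `source` onto `target`."""
    images = {}
    for c, a in relation:
        if c not in source or a not in target:
            return False  # pair lies outside the declared state spaces
        if images.setdefault(c, a) != a:
            return False  # not functional: c maps to two abstract states
    total = set(images) == set(source)              # covers the concrete state
    surjective = set(images.values()) == set(target)  # covers the abstract state
    return total and surjective


# Invented example: concrete layouts are tuples, abstract states are sets.
layouts = {(), (1,), (1, 2), (2, 1)}
abstract_states = {frozenset(), frozenset({1}), frozenset({1, 2})}
retrieve = {(c, frozenset(c)) for c in layouts}
```

With these definitions `is_total_surjection(retrieve, layouts, abstract_states)` holds, even though two different layouts map to the same abstract state. The real obligation has the same shape; its bulk comes from the many components of each state.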

11 Summary

We have proved the correctness of the refinement, and the preservation of a security property, of a real industrial product, working to real development timescales. In the process, we uncovered a security flaw in one part of the system design (to do with clearing memory under some unusual conditions), and identified the corrections needed. We achieved a very high level of rigour in our proofs. The proofs are far more detailed than typical proofs done in general mathematics. Despite this, the formal methods activity was never on the critical path of the development. The formal methods component was usually ahead of schedule, and never caused a delay in development. As a byproduct of doing these proofs, we have also generalised the notion of segregation to allow controlled communication, and applied it in a Z state-and-operations style.

Acknowledgements: The work described in the paper took place as part of a development for the NatWest Development Team. Parts of the work were carried out by Eoin Mc Donnell, Barry Hearn and Andy Newton (all of Logica). We would like to thank Jeremy Jacob and John Clark for their helpful comments and careful review of this work.

References

[Barden et al. 1992] Rosalind Barden, Susan Stepney, and David Cooper. The use of Z. In John E. Nicholls, editor, Proceedings of the 6th Annual Z User Meeting, York 1991, Workshops in Computing, pages 99–124. Springer Verlag, 1992.

[Barden et al. 1994] Rosalind Barden, Susan Stepney, and David Cooper. Z in Practice. BCS Practitioners Series. Prentice Hall, 1994.

[Bell & Padula 1976] David E. Bell and Len J. La Padula. Secure computer system: unified exposition and MULTICS. Report ESD-TR-75-306, The MITRE Corporation, March 1976.

[Bell 1988] D. E. Bell. Concerning “modelling” of computer security. In Proceedings 1988 IEEE Symposium on Security and Privacy, pages 8–13. IEEE Computer Society Press, April 1988.

[Cooper & Stepney 2000] David Cooper and Susan Stepney. Segregation with communication. (These proceedings), 2000.

[DSDM Consortium] DSDM Consortium. Dynamic Systems Development Method manual. Technical report, http://www.dsdm.org/.

[Flynn et al. 1990] Mike Flynn, Tim Hoverd, and David Brazier. Formaliser - an interactive support tool for Z. In John E. Nicholls, editor, Z User Workshop: Proceedings of the 4th Annual Z User Meeting, Oxford 1989, Workshops in Computing, pages 128–141. Springer Verlag, 1990.

[Goguen & Meseguer 1984] J. A. Goguen and J. Meseguer. Unwinding and inference control. In Proceedings 1984 IEEE Symposium on Security and Privacy, pages 75–86. IEEE Computer Society, 1984.

[Gollman 1998] Dieter Gollman. Computer Security. John Wiley, 1998.

[Jacob 1992] Jeremy L. Jacob. Basic theorems about security. Journal of Computer Security, 1(4):385–411, 1992.

[Josephs 1991] Mark B. Josephs. Specifying reactive systems in Z. Technical Report TR-19-91, Programming Research Group, Oxford University Computing Laboratory, 1991.

[Roscoe 1995] A. W. Roscoe. CSP and determinism in security modelling. In Proceedings 1995 IEEE Symposium on Security and Privacy, pages 114–127. IEEE Computer Society Press, 1995.

[Rushby 1981] J. M. Rushby. The design and verification of secure systems. In Proceedings 8th ACM Symposium on Operating System Principles, December 1981.

[Spivey 1992a] J. Michael Spivey. The fuzz Manual. Computer Science Consultancy, 2nd edition, 1992. ftp://ftp.comlab.ox.ac.uk/pub/Zforum/fuzz.

[Spivey 1992b] J. Michael Spivey. The Z Notation: a Reference Manual. Prentice Hall, 2nd edition, 1992.

[Stepney] Susan Stepney. Formaliser Home Page. http://public.logica.com/~formaliser/.

[Stepney et al. 1998] Susan Stepney, David Cooper, and Jim Woodcock. More powerful Z data refinement: pushing the state of the art in industrial refinement. In Jonathan P. Bowen, Andreas Fett, and Michael G. Hinchey, editors, ZUM’98: 11th International Conference of Z Users, Berlin 1998, volume 1493 of Lecture Notes in Computer Science, pages 284–307. Springer Verlag, 1998.

[Toyn 1996] Ian Toyn. Formal reasoning in the Z notation using CADiZ. In N. A. Merriam, editor, 2nd International Workshop on User Interface Design for Theorem Proving Systems. Department of Computer Science, University of York, July 1996. http://www.cs.york.ac.uk/~ian/cadiz/.

[Valentine et al. 2000] Sam Valentine, Ian Toyn, Susan Stepney, and Steve King. Type constrained generics. (These proceedings), 2000.

[Woodcock & Davies 1996] Jim Woodcock and Jim Davies. Using Z: Specification, Refinement, and Proof. Prentice Hall, 1996.

A Conjectures with fuzz

We did all our proofs by hand, but we did use fuzz to typecheck them, which provided a valuable level of tool support. However, fuzz does not support the syntax for conjectures, so we had to make judicious use of its various %% directives, to allow the same markup to be both checked by fuzz and typeset by LaTeX. The semantics of conjectures are different from those of predicates, but the type rules are the same. So the technique we use is to present a conjecture to fuzz as if it were a predicate, using %%ignore to hide the turnstile, and present it to LaTeX to typeset correctly, by using %% to hide the parts fuzz needs but LaTeX does not.

Instruct fuzz to ignore turnstiles:

    %%ignore \shows

A.1 Non-generic Conjectures

Mark up a simple non-generic conjecture as a Predicate paragraph, hiding from LaTeX the predicate’s ∀ and •.

The markup

    \begin{zed}
    %%\forall
    y:\nat
    %%@
    \\ \shows \\
    y=y
    \end{zed}

appears to fuzz as

    \begin{zed}
    \forall y:\nat @ \\
    \\
    y=y
    \end{zed}

and so typechecks. It appears to LaTeX as

    \begin{zed}
    y:\nat \\
    \shows \\
    y=y
    \end{zed}

and so typesets as

    y : ℕ ⊢ y = y

A.2 Generic Conjectures

Mark up a generic conjecture as a Generic-Box paragraph, hiding from LaTeX the box markup (which means LaTeX needs an extra math-mode markup), the (dummy) declaration, and the predicate’s ∀ and •.

The markup

    \[
    %%\begin{gendef}
    [X]
    %% dummy42 : X
    \\
    %%\where
    %%\forall
    y:X
    %%@
    \\ \shows \\
    y=y
    %%\end{gendef}
    \]

appears to fuzz as

    \begin{gendef} [X]
    dummy42 : X \\
    \where
    \forall y:X @ \\
    \\
    y=y
    \end{gendef}

and so typechecks. It appears to LaTeX as

    \[
    [X] \\
    y:X \\
    \shows \\
    y=y
    \]

and so typesets as

    [X] y : X ⊢ y = y
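The double reading of the markup can be imitated with two small text filters. This Python sketch is a simplified model and not a description of how either tool is implemented: it assumes that LaTeX discards any line beginning with `%`, that the checker strips a leading `%%` and reads the rest of the line, and that tokens named in an ignore list vanish. Directive lines such as `%%ignore` itself are not modelled, and the function names are hypothetical.

```python
def latex_view(markup: str) -> str:
    """Assumed LaTeX reading: lines starting with % are comments and are dropped."""
    kept = [line for line in markup.splitlines()
            if not line.lstrip().startswith("%")]
    return " ".join(" ".join(kept).split())


def fuzz_view(markup: str, ignored=(r"\shows",)) -> str:
    """Assumed checker reading: a leading %% is stripped and the rest of the
    line is read as specification text; ignored tokens are removed."""
    lines = []
    for line in markup.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("%%"):
            stripped = stripped[2:]
        lines.append(stripped)
    tokens = [tok for tok in " ".join(lines).split() if tok not in ignored]
    return " ".join(tokens)


# The non-generic conjecture from A.1, one markup line per source line.
MARKUP = r"""\begin{zed}
%%\forall
y:\nat
%%@
\\ \shows \\
y=y
\end{zed}"""
```

Applied to `MARKUP`, the two filters reproduce the two single-line readings quoted in A.1, which is a quick way to sanity-check where the line breaks in such markup have to fall.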

An Execution Architecture for GSL

Bill Stoddart

School of Computing and Mathematics, University of Teesside, North Yorkshire, U.K.

Abstract. We present a virtual machine architecture designed to provide an executional interpretation for a major subset of the Generalised Substitution Language and its probabilistic extension pGSL, including bounded non-determinism and infeasible operations. The virtual machine techniques we use to support abstract program execution are reversible execution and execution cloning. The architecture we propose will also allow the efficient execution of concrete programs, and a free mixture of abstract and concrete components, so it is possible to envisage a blurring of the distinction between the animation of a specification and the execution of its implementation.

Keywords: B, Animation, Virtual Machines, Reversible Computation.

1 Introduction

B has a programming notation, the “Abstract Machine Notation”, AMN, with a semantics given in the “Generalised Substitution Language”, GSL. In this paper, which is not generally about concrete syntax but about the architecture of a virtual machine, we use an abstract command language based on GSL. Consider the operation:

    A ≙ (x := 1 [] x := 2); x = 2 ⟹ skip

A consists of two commands in sequence, with the first providing an apparent non-deterministic choice. If the choice x := 1 is selected however, the following command is infeasible. Let’s remind ourselves how the predicate transformer semantics of GSL handle this situation. First let us see if A is feasible. We have:

    fis(A)
    ⇔ ¬ ([A]false)                                           (from defn of fis)
    ⇔ ¬ ([(x := 1 [] x := 2); x = 2 ⟹ skip]false)            (from defn of A)
    ⇔ ¬ ([x := 1 [] x := 2]([x = 2 ⟹ skip]false))            (by [S ; T]Q ⇔ [S]([T]Q))
    ⇔ ¬ ([x := 1 [] x := 2](x = 2 ⇒ [skip]false))            (by [g ⟹ S]Q ⇔ g ⇒ [S]Q)
    ⇔ ¬ ([x := 1 [] x := 2](x = 2 ⇒ false))                  (by [skip]Q ⇔ Q)
    ⇔ ¬ ([x := 1 [] x := 2](¬ (x = 2)))
    ⇔ ¬ ([x := 1]¬ (x = 2) ∧ [x := 2]¬ (x = 2))              (by [S [] T]Q ⇔ [S]Q ∧ [T]Q)
    ⇔ ¬ (¬ (1 = 2) ∧ ¬ (2 = 2))                              (by substitution)
    ⇔ ¬ (true ∧ false)
    ⇔ true

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 394–413, 2000. © Springer-Verlag Berlin Heidelberg 2000


So A is always feasible. There is no possibility of introducing infeasibility by making the wrong demonic choice. Also note that A will always establish x = 2 since:

    [A](x = 2)
    ⇔ [(x := 1 [] x := 2); x = 2 ⟹ skip](x = 2)
    ⇔ [x := 1 [] x := 2]([x = 2 ⟹ skip](x = 2))
    ⇔ [x := 1 [] x := 2](x = 2 ⇒ x = 2)
    ⇔ [x := 1]true ∧ [x := 2]true
    ⇔ true

Thus it seems that the program is clairvoyant. It never makes the choice to set x := 1 as this will lead to infeasibility. But there is also an operational interpretation: if a program makes a non-deterministic choice which subsequently leads to infeasibility, it backtracks to the point of choice and looks for another alternative.

We propose a virtual machine designed to provide an execution platform for GSL, which performs backtracking by means of reversible computation. This is achieved by designing the primitive operations of the virtual machine to be information conserving. Reversing computation back to a point of choice restores the machine state to what it previously was at that point. The technique is thus an alternative to “check-pointing” in which the machine state is saved at any point liable to be a backtracking return destination.

In this paper we describe the architecture of the virtual machine, and how this supports some abstract programming constructs that are not normally thought of as executable (though the executional interpretation of backtracking for non-deterministic choice is well known). The architecture we propose will also allow the efficient execution of concrete programs, and a free mixture of abstract and concrete components, so it is possible to envisage a blurring of the distinction between the animation of a specification and the execution of its implementation. We also consider the execution of probabilistic choice, and introduce a form of probabilistic choice which has the properties of both non-deterministic choice and Morgan’s pGSL.

Important topics not covered here are: any details of how abstract (set and relation based) data is handled in the virtual machine, the evaluation of predicate truth values, which subset of mathematical notation would be included in an executable abstract command language, and what method might best exploit the synergy between proof and executable abstract code.
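The agreement between the predicate transformer reading and the backtracking reading of the example operation can be replayed on a tiny interpreter. The Python below is a sketch under explicit assumptions: commands are encoded as tuples, states as dictionaries, all commands terminate, and the names `wp`, `fis` and `run` are chosen for this illustration. Note that `run` backtracks by keeping earlier states alive in generators, which is closer to check-pointing than to the reversible execution proposed in the paper.

```python
def wp(cmd, post):
    """Weakest precondition [cmd]post, as a predicate on states (dicts)."""
    kind = cmd[0]
    if kind == "skip":                     # [skip]Q  <=>  Q
        return post
    if kind == "assign":                   # [x := e]Q  <=>  Q with e for x
        _, var, expr = cmd
        return lambda s: post({**s, var: expr(s)})
    if kind == "choice":                   # [S [] T]Q  <=>  [S]Q and [T]Q
        return lambda s: all(wp(c, post)(s) for c in cmd[1:])
    if kind == "guard":                    # [g ==> S]Q  <=>  g => [S]Q
        _, g, body = cmd
        return lambda s: (not g(s)) or wp(body, post)(s)
    if kind == "seq":                      # [S ; T]Q  <=>  [S]([T]Q)
        _, first, second = cmd
        return wp(first, wp(second, post))
    raise ValueError(kind)


def fis(cmd):
    """fis(S)  <=>  not [S]false."""
    return lambda s: not wp(cmd, lambda _: False)(s)


def run(cmd, s):
    """Yield each final state reachable from s; yielding nothing means the
    command is infeasible from s, so the caller tries its next alternative."""
    kind = cmd[0]
    if kind == "skip":
        yield s
    elif kind == "assign":
        _, var, expr = cmd
        yield {**s, var: expr(s)}
    elif kind == "choice":
        for alternative in cmd[1:]:
            yield from run(alternative, s)
    elif kind == "guard":
        _, g, body = cmd
        if g(s):
            yield from run(body, s)
    elif kind == "seq":
        _, first, second = cmd
        for mid in run(first, s):
            yield from run(second, mid)
    else:
        raise ValueError(kind)


# A = (x := 1 [] x := 2); x = 2 ==> skip
A = ("seq",
     ("choice", ("assign", "x", lambda s: 1), ("assign", "x", lambda s: 2)),
     ("guard", lambda s: s["x"] == 2, ("skip",)))
```

From any start state, `fis(A)` holds and `wp(A, lambda s: s["x"] == 2)` holds, matching the two derivations, while `next(run(A, {"x": 0}))` first tries x := 1, finds the guard false, and then succeeds with x := 2.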

2 Reversible Computation

Interest in the reversibility of computing has a long history going back at least as far as John McCarthy’s article on the inversion of functions defined by Turing machines [12]. Charles Bennett’s work on reversible computation [4] [5] considers the thermodynamic aspects of information loss during computation and relates these to the thermodynamic behaviour of the physical machine performing the computation. To refute the hypothesis that computation inevitably increases entropy because of its propensity to lose energy, Bennett showed that computation is essentially a reversible process [5]. In “The Thermodynamics of Garbage Collection” [3], Henry Baker argues the case for using reversible computation as an alternative to garbage collection.

3 Virtual Machine Architecture, Outline

We will refer to the organisation of our reversible virtual machine as the “Abstract Command Language Architecture” (ACLA). In this section we outline the basic principles of its organisation.

As a program runs, certain commands discard information whilst others conserve it. x := y discards the previous value of x, whilst x + y ≤ maxint | x := x + y is information conserving when executed within its pre-condition, since we can recover the original value of x by executing x := x − y. So long as a command conserves information we can provide a corresponding shadow command that will reverse its effect. To provide for reversible computing we ensure that (almost) every command in our virtual machine instruction set is information conserving. Our approach is to take an existing virtual machine instruction set, and modify any command which discards information so that it saves the lost information on a “history stack”. Shadow commands retrieve this information during reverse execution.

The virtual machine used as the basis of ACLA is that associated with the stack based language Forth, and complies with the ISO and ANSI Forth Standards [8]. Stack based architectures tend to be more information conservative than register based architectures, since “housekeeping” operations generally consist of rearranging the contents of the stack rather than copying information between registers. In addition Forth’s niche application area is real time systems¹ and many of its components, such as the multi-tasker, are very efficiently designed. Forth is also an extensible virtual machine: new primitives can be added in assembler to support the particular application area the virtual machine is deployed in. For ACLA we add such primitives to support the use of sets and relations.

When Forth commands appear in our text they will be boxed, as in DUP. The examples of primitive Forth commands we consider here consume arguments from the parameter stack and leave results on the same stack. For example + removes the top two values (x and y say) from the stack, adds them, and pushes the result back on the stack. + loses information since we cannot recover the values x and y from the value x + y. To run it in information conserving mode we must save one of the values, say y, on the history stack.

¹ For some examples see the list of Forth applications in space research, maintained by NASA at http://forth.gsfc.nasa.gov.

An Execution Architecture for GSL


Then, when we re-encounter the same instance of + when running in reverse mode, we have x + y on the parameter stack and y on the history stack. To perform the reverse computation we copy y from the history stack (so the parameter stack contains x + y and y), perform a subtraction (the stack now contains x), and finally move y from the history stack to the parameter stack.

Each primitive compiled command of the ACLA virtual machine may invoke one of three fragments of machine code of the underlying physical machine, depending on the execution mode in which it is invoked. These are normal forward mode, N, in which information is not conserved; conservation forward mode, C, which saves discarded information on the history stack; and reverse mode, R, which runs backward through the compiled virtual machine code reversing the effect of each command.
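The three modes of + described above can be sketched in a few lines. This is only an illustration in Python rather than machine code; the function names add_n, add_c and add_r and the use of Python lists as stacks are assumptions made for the example, not part of ACLA.

```python
# Sketch of the primitive "+" in the three execution modes (N, C, R).
# Stacks are modelled as Python lists; all names are illustrative.

def add_n(pstack):
    """N mode: ordinary addition; the operands are discarded."""
    y = pstack.pop()
    x = pstack.pop()
    pstack.append(x + y)

def add_c(pstack, hstack):
    """C mode: as N mode, but y is saved on the history stack."""
    y = pstack.pop()
    x = pstack.pop()
    hstack.append(y)
    pstack.append(x + y)

def add_r(pstack, hstack):
    """R mode: undo a C mode addition using the saved y."""
    pstack.append(hstack[-1])      # copy y: parameter stack holds x + y, y
    y = pstack.pop()
    total = pstack.pop()
    pstack.append(total - y)       # subtraction: parameter stack holds x
    pstack.append(hstack.pop())    # move y back: parameter stack holds x, y

pstack, hstack = [3, 4], []
add_c(pstack, hstack)
forward = (list(pstack), list(hstack))
add_r(pstack, hstack)
backward = (list(pstack), list(hstack))
```

After the forward step the parameter stack holds the sum and the history stack holds y; after the reverse step both stacks are back in their initial state.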

4 “Executing” Non-deterministic Choice

Consider the “execution”² of a non-deterministic choice of the form A [] B [] ... [] C. Such a choice offers two or more possible continuations of a program. When the choice is executed, the locations of these continuations are pushed onto the history stack, which is suited for this purpose because we will never backtrack past this point until every choice has been tried (and its entry on the history stack thus used up). There are three execution modes for choice: first feasible, simultaneous and random.

In first feasible mode, we attempt to execute each choice in turn, consuming the corresponding continuation address on the history stack as we do so. If a choice proves to be infeasible we try another. If no choices are feasible, the whole choice construct is infeasible, and sets the virtual machine into reverse execution mode so that it will backtrack to the previous non-deterministic choice. This provides a depth first search for a feasible path through a computation.

Each program “run” starts with an empty history stack. If the program is not feasible, the virtual machine is forced to backtrack to the beginning of the program and reports “fail” to indicate that the program was impossible to run. If the program is feasible and terminates, the virtual machine clears the history stack and reports “ok”. This is a general pattern, not limited to first feasible choice.

In simultaneous mode the computation is cloned into as many doppelgangers as there are choices. These run in parallel under a “termination pact”. The first to terminate execution survives. The others die, either committing suicide if their continuation proves infeasible, or killed by a signal from the terminating clone. This provides a breadth first search for a feasible path through a computation.

² We originally entitled this section “Implementing Non-Deterministic Choice” but it occurred to us that in B we speak of implementing non-deterministic choice by a process of refinement which removes any syntactic representation of non-deterministic choice from our programs. To emphasize that this is not the case here we use “execute” rather than “implement”.
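First feasible mode amounts to a depth first search driven by a stack of untried continuations. The following Python sketch illustrates the idea under simplifying assumptions: a program is a list of choices, each alternative is a function returning a new (immutable) state or None when infeasible, and the state is saved whole rather than restored by reverse execution. The name run_first_feasible is hypothetical.

```python
def run_first_feasible(steps, state):
    """steps: list of choices, each a list of alternatives.
    An alternative maps a state to a new state, or returns None if infeasible.
    Returns ("ok", final_state) or ("fail", None)."""
    history = []           # entries: (step index, state before choice, untried alternatives)
    i, pending = 0, None
    while i < len(steps):
        if pending is None:
            pending = list(steps[i])       # fresh choice: all alternatives untried
        result = None
        while pending and result is None:
            result = pending.pop(0)(state)
        if result is not None:
            history.append((i, state, pending))
            state, i, pending = result, i + 1, None
        elif history:
            i, state, pending = history.pop()   # backtrack to the previous choice
        else:
            return "fail", None                 # backtracked to the start of the run
    return "ok", state

# x is chosen from 1, 2, 3; the guards then require x > 1 and x odd.
choose = [lambda s, v=v: v for v in (1, 2, 3)]
guard_gt1 = [lambda s: s if s > 1 else None]
guard_odd = [lambda s: s if s % 2 == 1 else None]
status, final = run_first_feasible([choose, guard_gt1, guard_odd], 0)
status2, final2 = run_first_feasible([choose, [lambda s: None]], 0)
```

In the first run the choices 1 and 2 are tried and abandoned before 3 succeeds; in the second every path is infeasible, so the run backtracks to its start and reports “fail”.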


B. Stoddart

Simultaneous choice is more angelic than first feasible mode since it will prefer terminating choices over non-terminating ones. It is therefore not the form of choice to use when testing an abstract B specification that has not had its consistency proof obligations discharged, as it will avoid errors of non-termination when it can. We will, however, be able to utilise this angelic behaviour when we consider General Correctness semantics.

In random mode a pseudo-random number generator is used to select a choice. This is useful for testing the firing of “events” in a B Abstract System³ [1], [2] to see if some event sequence can be found to break the conjectured system invariant. Consider a system with invariant I and operations A and B, where S∗ is used for “repeat S until S becomes infeasible” and n is a fresh variable name. We can define:

result ← test ≙ var n = 0; (n < 100 ∧ I ⟹ (A [] B); n = n + 1)∗; result := n

If result = 100, then 100 events have been randomly selected and fired without breaking the invariant. If result is 51, then we have detected a sequence of 51 events which breaks the invariant.

We can profitably integrate this approach with the method of proof. If we have an original invariant I ∧ J and are able to use this to prove J is always maintained by all events, we can run our test with I as the invariant and avoid the overheads (and often difficulty) of computing the truth value of J.

An interesting case is where some point is reached before termination of test in which none of the events A, B is feasible. This will provoke backtracking. In abstract systems we do not admit “undoing” an event, so this is not what we want. Rather we would want to interpret this situation as deadlock. This unwanted backtracking has occurred because we have embedded A and B in a test program rather than invoking them in their own right, and thus extended the scope of the backtracking mechanism, which extends over the “run” of a program.

One solution is to test for possible deadlock by including the clause fis(A) ∨ fis(B) in I, i.e. by asserting as an invariant property that the system is deadlock free. The evaluation of this predicate presents some difficulties which we will return to later.
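The test operation can be mimicked by a small random harness. The sketch below is an assumption-laden illustration: events are Python functions returning a new state or None when infeasible, and, unlike the virtual machine, the harness simply stops when no event is feasible instead of backtracking. The name fire_events and its parameters are hypothetical.

```python
import random

def fire_events(events, invariant, state, limit=100, rng=None):
    """Fire randomly chosen feasible events while the invariant holds.
    Returns the number of events fired (the 'result' of the test)."""
    rng = rng or random.Random()
    n = 0
    while n < limit and invariant(state):
        candidates = list(events)
        rng.shuffle(candidates)            # random animation of the demon
        for event in candidates:
            new_state = event(state)
            if new_state is not None:      # infeasible choices are discarded
                state = new_state
                break
        else:
            break                          # no feasible event: stop (deadlock)
        n += 1
    return n

inc = lambda s: s + 1
broken_after = fire_events([inc], lambda s: s < 5, 0)   # invariant fails once s = 5
never_broken = fire_events([inc], lambda s: True, 0)    # runs to the limit of 100
```

A result equal to the limit means no violation was found; a smaller result gives the length of the event sequence after which the loop stopped.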

5 The Command x :∈ X

The command x :∈ X assigns an unspecified value from the set X to x. This form of non-deterministic choice is supported by our virtual machine in the case where X is a variable name. The value of such a variable is held in memory as a list of elements. It is a straightforward matter to choose one of these elements (and remember that it has been chosen so that a different choice can be made if we return to this point when backtracking).

One of the referees suggested we comment briefly on unbounded non-deterministic choice. In fact we do not have any concept of executing such a choice, or of the finite representation of non-finite objects. It may be thought that we could make a choice from an infinite set without having an exhaustive representation, but we would necessarily come unstuck because we need to be able to exhaust the choices to determine infeasibility. Let us remove the restriction that X be a variable name and allow it to be any set expression. Now consider the code:

x :∈ EVEN; x ∈ ODD ⟹ skip

This construct is infeasible since any x chosen by the first command will inevitably cause the guard of the second to be false. However, in any implementation we will never exhaust our choice of even numbers, so the virtual machine will never realise that it has an infeasible command.

³ In this context an event is a B operation, E say, and fis(E) gives the condition under which the event may occur. magic is the event which is always impossible. Events in Abstract Systems as envisaged by Abrial do not have pre-conditions.
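The difference between a finite and an infinite X can be seen in a small Python sketch. It is only an illustration: first_satisfying is a hypothetical helper, and the max_tries cut-off is an assumption added so that the infinite case can be demonstrated without looping forever.

```python
from itertools import count

def first_satisfying(candidates, guard, max_tries=None):
    """Try each candidate x in turn and return the first that passes the guard.
    Returns None once the candidates are exhausted (the command is infeasible)."""
    for tried, x in enumerate(candidates):
        if max_tries is not None and tried >= max_tries:
            raise RuntimeError("gave up: candidates not exhausted")
        if guard(x):
            return x
    return None

is_odd = lambda x: x % 2 == 1

# Finite X: the choices are exhausted, so infeasibility is detected.
finite_result = first_satisfying([2, 4, 6], is_odd)

# X = EVEN as an unbounded stream: infeasibility can never be established.
try:
    first_satisfying(count(0, 2), is_odd, max_tries=1000)
    gave_up = False
except RuntimeError:
    gave_up = True
```

Without the cut-off the second call would not return, which is the behaviour described for x :∈ EVEN above.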

6 Random Choice and Morgan’s pGSL

In [14] Carroll Morgan presents pGSL, an extension of GSL which provides a theory of probabilistic predicate transformers incorporating both probabilistic choice and non-deterministic choice. Here we compare pGSL with our execution of GSL using randomised demonic choice. We develop a specialised form of our random choice with a notation for weighting the probability of the choices. This differs slightly from the probabilistic choice of pGSL but shares with it the important property of sublinearity, which will be described later in this section, and from which many other properties may be derived.

The first obvious but important remark to make is that random demonic choice satisfies the predicate transformer semantics for demonic choice: we have just chosen to implement a demon who makes a choice in a random way. We will see that the situation is not the same for pGSL.

We first recall the basic pGSL definitions. For any predicate Q, ⟨Q⟩ is the “expectation” of Q and has the value 1 where Q is true and 0 where Q is false. The pre-expectation [S]⟨Q⟩ is an expression over the state space. Its value represents the probability that executing S will result in an after state that satisfies Q. Let p be an expression over machine state variables which takes values in [0, 1]. We denote the probabilistic choice between S and T in which we choose S with probability p and T with probability 1 − p by S p⊕ T, and we define:

[S p⊕ T]E = p ∗ [S]E + (1 − p) ∗ [T]E

We always consider the expectations of predicates which we would like to be true, and in non-deterministic choice the demon will always make the choice which minimises this expectation:

[S [] T]E = min([S]E, [T]E)

To handle guards in a manner that preserves sublinearity we allow infinite expectations, with ∞ ∗ 0 = ∞ and 0 ∗ ∞ = 0. We define:


[g ⟹ S]E = (1/⟨g⟩) ∗ [S]E

The intended effect is that miracles yield infinite pre-expectations, even when one is trying to achieve the impossible, e.g.

[false ⟹ skip]⟨false⟩ = (1/0) ∗ [skip]0 = ∞ ∗ 0 = ∞

Accordingly, feasibility in GSL is defined by fis(S) ⇔ [S]0 = 0. Miracles dominate in probabilistic choice, e.g. assuming p ≠ 0:

[false ⟹ skip p⊕ S]E = p ∗ ∞ + (1 − p) ∗ [S]E = ∞

Thus probabilistic choice in which one of the choices is infeasible is itself infeasible.

Let’s compare this with what happens in our execution of randomised non-deterministic choice. We have already said that if a choice that is made proves to be infeasible then the virtual machine will backtrack out of that choice and make another. It makes no difference whether the choice is immediately found to be infeasible, or whether this happens after some further processing. In either case the infeasible choice is simply discarded from consideration.

It has been said of demonic (non-deterministic) choice that “the demon abhors a miracle”. The dice-man of pGSL, on the other hand, makes straight for a miracle without even bothering to roll his dice. When we execute a non-deterministic choice in random mode we are simply using a random animation of the demon. Consequently we abhor miracles. Any random choice which results in infeasibility is simply discarded.

To execute a probabilistic choice in the style of pGSL, on the other hand, we must execute both possible choices, or at least test if both choices are feasible, which in general comes to the same thing. If either proves infeasible we consider the choice as a whole to be infeasible. Thus pGSL is more expensive to execute than a random implementation of non-deterministic choice. Also it is not what we want when using a test harness to randomly fire the events of an abstract system as we did in the operation test defined in the previous section.
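The pGSL definitions recalled above can be evaluated pointwise, at a single state, with a few lines of Python. This is a sketch only: the helper names mul, guard, prob_choice and demonic_choice are assumptions, and pre-expectations are represented simply as numbers (possibly infinite) rather than as expressions over the state space.

```python
import math

INF = math.inf

def mul(a, b):
    """Multiplication with the conventions inf * 0 = inf and 0 * inf = 0."""
    if a == INF and b == 0:
        return INF
    if a == 0 and b == INF:
        return 0
    return a * b

def guard(g, pre_s):
    """[g ==> S]E = (1/<g>) * [S]E, where <g> is 1 if g holds and 0 otherwise."""
    inverse = 1 if g else INF
    return mul(inverse, pre_s)

def prob_choice(p, pre_s, pre_t):
    """[S p(+) T]E = p * [S]E + (1 - p) * [T]E."""
    return mul(p, pre_s) + mul(1 - p, pre_t)

def demonic_choice(pre_s, pre_t):
    """[S [] T]E = min([S]E, [T]E)."""
    return min(pre_s, pre_t)

miracle = guard(False, 0)                     # [false ==> skip]<false>
dominated = prob_choice(0.5, miracle, 0.3)    # the miracle dominates
avoided = demonic_choice(miracle, 0.3)        # the demon abhors a miracle
unreachable = prob_choice(0, miracle, 0.3)    # p = 0: the miracle is never chosen
```

The values illustrate the contrast drawn in the text: a miracle swamps probabilistic choice whenever p ≠ 0, whereas demonic choice simply avoids it.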
If some event proves infeasible in such a test we do not want this to imply that the entire choice of events is infeasible. Therefore we are motivated to introduce a new form of measured random choice, p , which is randomised non-deterministic choice plus the association of a probability with each choice. For any events S and T and expectation E we define [S p  T]E as follows.

For p = 0 or p = 1 we have the special cases [S 1  T]E = [S]E and [S 0  T]E = [T]E.

For p ∈ (0, 1) we have:

(fis(S) ∧ fis(T) ⇒ [S p  T]E = [S p⊕ T]E) ∧
(fis(S) ∧ ¬fis(T) ⇒ [S p  T]E = [S]E) ∧
(¬fis(S) ∧ fis(T) ⇒ [S p  T]E = [T]E) ∧
(¬fis(S) ∧ ¬fis(T) ⇒ [S p  T]E = [S p⊕ T]E)

the important point to note being that p  can never make an infeasible choice when the other choice available to it is feasible.

We have mentioned that the constructs of pGSL comply with the property of sublinearity. That is, if a, b, c are non-negative finite reals and R, R′ expectations:

[S](a ∗ R + b ∗ R′ ⊖ c) = a ∗ [S]R + b ∗ [S]R′ ⊖ c

where ⊖ denotes truncated subtraction:

x ⊖ y = (x − y) max 0

Morgan comments [14]: “Although it has a strange appearance, from sublinearity we can extract a number of very useful consequences. We begin with monotonicity, feasibility and scaling...”

A key question for us, therefore, is whether p  satisfies sublinearity, that is whether:

[S p  T](a ∗ R + b ∗ R′ ⊖ c) = a ∗ [S p  T]R + b ∗ [S p  T]R′ ⊖ c

We can show in an obvious way that it does, considering the four cases fis(S) ∧ fis(T), fis(S) ∧ ¬fis(T), ¬fis(S) ∧ fis(T), and ¬fis(S) ∧ ¬fis(T). The first and fourth cases are trivial, since here p  and p⊕ are identical. The arguments for the second and third cases are symmetrical. Consider therefore the case fis(S) ∧ ¬fis(T), in which [S p  T]E = [S]E. We have:

[S p  T](a ∗ R + b ∗ R′ ⊖ c)
  = [S](a ∗ R + b ∗ R′ ⊖ c)                  (from defn of p )
  = a ∗ [S]R + b ∗ [S]R′ ⊖ c                 (from sublinearity of S)
  = a ∗ [S p  T]R + b ∗ [S p  T]R′ ⊖ c       (from defn of p )

as required.

We do not actually use p  in our concrete syntax; rather we have a notation for n way probabilistically weighted choice which avoids any need for non-integer expressions, since we do not need to express probabilities directly. For a three way choice between R, S and T with weights i, j, k (where i, j, k are integer expressions over the state space) we write:

i ⇢ R ◇ j ⇢ S ◇ k ⇢ T

for a probabilistic choice between R, S and T. Because this is a randomised form of demonic choice, choice is governed by feasibility and infeasible events are excluded from the choice. If this example is executed when R and S are feasible and T is infeasible and i and j are not both zero, then R is chosen with probability i/(i + j) and S with probability j/(i + j).
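The effect of excluding infeasible events from a weighted choice can be made concrete by computing the resulting probabilities. The sketch below is an illustration under stated assumptions: events are Python functions returning None when infeasible, feasibility is tested up front rather than discovered by backtracking, and choice_distribution is a hypothetical helper, not part of the concrete syntax.

```python
from fractions import Fraction

def choice_distribution(weighted_events, state):
    """weighted_events: list of (name, integer weight, event).
    Returns the probability with which each feasible event is chosen."""
    feasible = [(name, weight) for name, weight, event in weighted_events
                if event(state) is not None]
    total = sum(weight for _, weight in feasible)
    if total == 0:
        return {}                          # nothing can be chosen
    return {name: Fraction(weight, total) for name, weight in feasible}

r = lambda s: s + 1        # feasible
s_ = lambda s: s + 2       # feasible
t = lambda s: None         # infeasible in every state

# Weights i = 2, j = 6, k = 4, with T infeasible.
dist = choice_distribution([("R", 2, r), ("S", 6, s_), ("T", 4, t)], 0)
none_feasible = choice_distribution([("T", 4, t)], 0)
```

With T infeasible, R is chosen with probability 2/(2 + 6) and S with probability 6/(2 + 6), matching i/(i + j) and j/(i + j).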

7 Abstract Command Language

The “Abstract Command Language” [10] [9] is an abstract programming notation whose syntax is almost identical to that of GSL but with a semantics that is extended to include wlp as well as wp properties of predicate transformers. The extensions allow us to describe a form of concurrent execution in which multiple programs can work on the same problem.

Suppose we have an unimplemented specification OP ≙ P | S such that trm(S), and implementations of the specifications OP1 ≙ P1 | S, OP2 ≙ P2 | S, ..., OPn ≙ Pn | S. Suppose we know:

P ⇒ P1 ∨ P2 ∨ ... ∨ Pn

Then it seems that in any state of our system satisfying P we should be able to provide an execution of OP by choosing one of OP1, OP2, ..., OPn. Let’s also suppose, however, that we do not know which of these operations has a true pre-condition, though of course we know at least one of them must have. To make the choice we propose a constructor #, which we call “concert”, which is monotonic with respect to refinement and such that:

OP ⊑ OP1 # OP2 # .. # OPn

We have shown in [10] and [9] that such a constructor cannot be defined within the total correctness framework of GSL, but can be defined by enhancing the weakest pre-condition semantics of GSL with either strongest post-conditions or weakest liberal pre-conditions. This gives what Jacobs and Gries have called “General Correctness” [11]. One additional condition on OP1..OPn is required to use them in concert: they must be guaranteed not to terminate outside their pre-conditions. It is precisely this guarantee that cannot be given in terms of total correctness, which, unlike general correctness, is incapable of imparting any information about the execution of an operation outside its pre-condition.

The Abstract Command Language is thus an abstract programming notation based on GSL plus wlp and which includes the concert operator. ACL does not require the discharge of wlp proof obligations unless these are relevant, as they are with concert, so it does not impose any added burdens for developments within the total correctness framework.

The execution of OP1 # OP2 # .. # OPn under ACLA is identical to the simultaneous choice style execution of non-deterministic choice. This is acceptable since:

S [] T ⊑ S # T

Concert refines non-deterministic choice by angelically avoiding non-termination whenever possible.⁴

⁴ We make this observation to justify our implementation decisions, but note that the given refinement is not useful for development purposes: if we need to refine by angelically avoiding non-termination there is something wrong with the abstract specification.

8 ACLA Details

Virtual machine implementations such as byte code interpreters generally impose considerable execution overheads, though there are techniques for eliminating these. One example is the “just in time” compilation of virtual machine code into native code. Another is a technique for compiling Forth, in which the relationship between the virtual machine and the underlying host architecture evolves dynamically: at one point the notional “top of stack” for the virtual machine may be in a certain register, at another point it may actually coincide with the top of the physical stack. Housekeeping operations such as rearranging items on the stack do not then generate code, but rather cause the compiler to update the injection which describes the relationship of the virtual to the physical machine.

One requirement of ACLA is the ability to interpret the same sequence of virtual machine instructions in three different modes, and this prompted us to choose a simpler technique which keeps a fixed relationship between virtual and physical machines. We use a modified form of “Indirect Threaded Code”, a technique first used by Charles Moore [13], the inventor of Forth. We then show how this technique can be extended to associate three different operations with each virtual machine instruction, to support the N, C and R modes of execution. We detail the reverse execution of nested code and how to reverse the effect of branch instructions. Finally we touch briefly on the complex area of evaluating the truth values of predicates by considering the evaluation of the feasibility of an operation.

8.1 Indirect Threaded Code

We refer to virtual machine operations that are defined directly in the machine code of the host machine as “primitive” operations, and virtual machine operations defined in terms of other virtual machine operations as “high level” operations. All operations are held in a data structure called a dictionary. Each dictionary entry is called a “word”. An entry for a word includes such details as a reference to the source code file and line from which it was compiled, its name, and a pointer to the machine code to be entered when the word is executed. Fig. 1 shows in simplified form the dictionary entry for the primitive Forth operation DUP, which duplicates the top of the parameter stack. The pointer to the machine code for DUP is held in the “code field” of the dictionary entry. In the case of a primitive definition this points to the block of machine code that directly implements the functionality of the operation. This code concludes with a sequence of machine code instructions whose job is to thread control through to the next word to be executed. We refer to this sequence of machine code instructions as next. The instructions of our virtual machine are pointers to the code field entries for those instructions. In fig. 2 we see the virtual machine about to execute the instruction DUP with the following instruction being *. VMIP is the virtual machine instruction pointer.

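[Figure: dictionary entry for DUP, with a name field (DUP) and a code field pointing to the machine code: pop reg1; push reg1; push reg1; next]

Fig. 1. A Dictionary Entry: the code field contains a pointer to the machine code associated with the operation.

[Figure: VMIP and two consecutive virtual machine instructions, DUP and *, each pointing to a dictionary entry (name field, code for DUP / code for *, next)]

Fig. 2. The Basic Principle of Threaded Code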

We can imagine that in fig. 2 the instruction which preceded DUP is about to execute its next sequence. The job of next in this instance is to set the physical machine instruction pointer (e.g. the i386 instruction pointer in the case of an i386 host) to the code for DUP and to adjust the virtual machine instruction pointer to point to the next token. In GSL terms, and using HMIP for the host machine instruction pointer:

next ≙ HMIP := m(m(VMIP)) ∥ VMIP := VMIP + 4

where m is the function that maps memory addresses to 32 bit integers such that m(a) is the integer represented by the bytes at locations a..a + 3, and VMIP is incremented by 4 to point to the next virtual machine instruction. The implementation of next is generally quite succinct, requiring, for example, three bytes of machine code on the i386.

Now consider a high level definition. Suppose we wish to define an operation that will square the value on the top of the stack by performing a DUP followed by a *. If this new operation is to be named SQUARE we represent it in Forth as follows:

: SQUARE DUP * ;

This will be compiled into an operation that executes the virtual machine instruction for DUP followed by that for *. The execution of SQUARE is illustrated in fig. 3. VMIP is pointing (solid arrow) to a memory location that contains the virtual machine instruction for SQUARE and this in turn points to the code field of the dictionary entry for SQUARE. This in turn (un peu de patience) points to the machine code that threads control through into the definition of SQUARE. The function of this machine code is to save the current value of VMIP on the return stack and set the new value of VMIP to point to the first virtual machine instruction in the compiled definition of SQUARE. Its new setting is shown by the dotted arrow in fig. 3. In GSL the functionality of this nesting code can be expressed as:

nest ≙ (rstack := rstack ⌢ ⟨VMIP⟩ ∥ VMIP := m(VMIP) + 4); next

where rstack is the return stack, used to hold return addresses when executing nesting definitions, and separate from the parameter stack. Execution now threads through DUP and * and arrives at the primitive operation EXIT, whose function is to restore VMIP to its previous value and thread execution into the following virtual machine instruction at that point:

EXIT ≙ (VMIP := top(rstack) ∥ rstack := front(rstack)); next

The execution overheads for the nest and exit code are twelve and nine bytes of machine code respectively with the i386 instruction set.
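The interplay of next, nest and EXIT can be imitated in a few lines of Python. The sketch below is illustrative only and makes several assumptions: words are Python objects rather than memory cells, the instruction pointer is a (body, index) pair instead of an address, next is a method called from a driving loop rather than a code fragment at the end of each primitive, and the HALT word is a hypothetical addition used to stop the loop.

```python
class Word:
    """A dictionary entry: a name, a code field, and (for high level words) a body."""
    def __init__(self, name, code, body=None):
        self.name = name
        self.code = code            # routine entered when the word is executed
        self.body = body or []      # compiled virtual machine instructions

class VM:
    def __init__(self):
        self.pstack = []            # parameter stack
        self.rstack = []            # return stack
        self.ip = None              # stands in for VMIP: (instruction list, index)
        self.running = False

    def next(self):
        body, i = self.ip
        self.ip = (body, i + 1)     # VMIP := VMIP + 4 in the original
        word = body[i]
        word.code(self, word)       # enter the code named by the code field

    def run(self, word):
        self.ip = ([word, HALT], 0)
        self.running = True
        while self.running:
            self.next()

def do_dup(vm, word):
    x = vm.pstack.pop()             # pop reg1; push reg1; push reg1
    vm.pstack += [x, x]

def do_mul(vm, word):
    b, a = vm.pstack.pop(), vm.pstack.pop()
    vm.pstack.append(a * b)

def do_nest(vm, word):
    vm.rstack.append(vm.ip)         # save the return point
    vm.ip = (word.body, 0)          # thread into the definition

def do_exit(vm, word):
    vm.ip = vm.rstack.pop()         # resume after the calling instruction

def do_halt(vm, word):
    vm.running = False

DUP, MUL = Word("DUP", do_dup), Word("*", do_mul)
EXIT, HALT = Word("EXIT", do_exit), Word("HALT", do_halt)
SQUARE = Word("SQUARE", do_nest, [DUP, MUL, EXIT])     # : SQUARE DUP * ;
FOURTH = Word("FOURTH", do_nest, [SQUARE, SQUARE, EXIT])

vm = VM()
vm.pstack = [7]
vm.run(SQUARE)
squared = list(vm.pstack)
vm.pstack = [7]
vm.run(FOURTH)
fourth = list(vm.pstack)
```

The FOURTH word (also hypothetical) nests two levels deep, which exercises the return stack in the same way as nested high level definitions.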

8.2 Multiple Code Fields

The ACLA is a modified form of indirect threaded code virtual machine in which each dictionary entry has three code fields, which point to the code for normal execution, conservative forward execution and backward execution of the operation. For each primitive definition, these three code fields contain pointers to sections of machine code that directly implement the normal, forward and reverse forms of the operation. Each of these sections of machine code has its own version of next, with the reverse code version causing execution to thread backwards through the threaded code.

For a high level definition the three code fields contain pointers to sections of machine code that nest into the definition. The reverse version of nest must commence execution of the high level definition from the point that definition previously exited. The conservative code for EXIT will have left this address on the history stack when the definition last executed in a forward direction.

When running backwards through a high level definition we reverse the effect of each virtual machine instruction until we reach the forward entry point of the definition, but what then? Running backwards into the definition’s code field pointers has no meaning. We need to insert an extra virtual machine instruction, which will perform a reverse exit from the definition and go back to the calling location.⁵ Note that this does not impose a run time penalty when executing in either of the forward modes, since the nesting code for each of these modes simply steps over this instruction. Fig. 4 shows the pointer organisation of ACLA.

[Figure: VMIP, the dictionary entry for SQUARE with its code to nest the definition, and the compiled definition DUP, *, EXIT, each pointing to its own code (code for DUP, code for *, code for EXIT) ending in next]

Fig. 3. The Nesting of Threaded Code

[Figure labels: SQUARE; N mode nest code; C mode nest code; R mode nest code; forward entry point; reverse entry point; R mode exit code; DUP; *; EXIT; code for N mode DUP; code for C mode DUP; code for R mode DUP; N next; C next; R next; code for N mode exit; code for C mode exit (saves reverse entry point on the history stack); never used]

Fig. 4. Threaded Code Organisation for the Abstract Command Language Architecture

8.3 Reversing Branch Instructions

Fig. 5 illustrates the use of a forward branch instruction in the case of a “normal” machine architecture (on the left) with the ACLA version which supports reversible execution on the right. First consider forward execution of the normal code. After executing OP1, execution encounters a branch instruction. We assume this to be a conditional branch with some unspecified test on machine state deciding whether the branch will be taken. If taken, the branch is said to be “active”, and execution continues at the location indicated by the destination field of the branch; that is with OP3. If the branch is not taken, OP2 is executed followed by OP3.

Now consider an attempt at reverse execution of this code. There are basically two problems. Firstly, after reverse execution has undone the effect of OP3 it will move on to OP2. This is incorrect because we do not know that OP2 was ever executed. And secondly, after OP2, reverse execution will encounter the destination field of the branch, and try to interpret this as an instruction.

We deal with the first problem by inserting at the destination point of all jumps an additional instruction called arrival. It works in conjunction with a “branch active” flag which forms part of the virtual machine state. This flag is set whenever a branch occurs, and reset by arrival. Its function is to tell arrival whether it received control by a local or remote arrival. For example, if arrival in fig. 5 is executed immediately after OP2 this constitutes a local arrival. When an active branch occurs, the executing branch instruction pushes the “from” address of the branch onto the history stack.⁶ If the branch is inactive (execution of branch instruction but no branch) no history is recorded. Thus when arrival executes there are two possible conditions: either the branch active flag is set and the “from” address of the branch is on the history stack, or the branch active flag is reset and there is no “from” address on the history stack. In the second case arrival saves the address of the previous instruction on the history stack, so that reverse execution of arrival can always find its backwards continuation address on the history stack:

arrival ≙ if not(BA) then hstack := hstack ⌢ ⟨VMIP − 4⟩; BA := false

[Figure: normal sequence (left): OP1, branch, destination, OP2, OP3; reversible sequence (right): OP1, branch, destination, departure, OP2, arrival, OP3]

Fig. 5. Normal (left) and Reversible Instruction Sequences Containing a Forward Branch.

⁵ But where was this location? It can be found on what is generally called the “return stack”, but which in the case of reverse execution is more appropriately thought of as the “came from” stack.

⁶ More specifically it pushes the address of the last instruction executed before the branch, which is OP1 in this case.


The second problem has a simple solution. We insert a virtual machine instruction departure after the destination field of any branch instruction. This allows reverse execution to step over this field. The departure instruction is never executed in forward mode, as the branch instruction just steps over it. In fig. 6 we see these ideas applied to a backward branch. The letters a, b, c.. are used to label the locations of each virtual machine instruction so we can represent some forward and reverse traces.

a:  OP1
b:  arrival
c:  OP2
d:  branch
e:  destination
f:  depart
g:  OP3

Fig. 6. A Reversible Backward Branch

Consider an instance of execution in which the branch is first active and then inactive, so that OP1, OP2, OP3 are performed in the sequence:

OP1, OP2, OP2, OP3

Assuming an empty history stack at the start of execution, we have the following trace of forward execution in C mode, where x is the history recorded by OP1, y1 and y2 the histories recorded by the two executions of OP2, and z is the history recorded by OP3.

Location  Operation  BA flag  History Stack
                     false    ⟨⟩
a         OP1        false    x
b         arrive     false    x ⌢ ⟨a⟩
c         OP2        false    x ⌢ ⟨a⟩ ⌢ y1
d         branch     true     x ⌢ ⟨a⟩ ⌢ y1 ⌢ ⟨c⟩
b         arrive     false    x ⌢ ⟨a⟩ ⌢ y1 ⌢ ⟨c⟩
c         OP2        false    x ⌢ ⟨a⟩ ⌢ y1 ⌢ ⟨c⟩ ⌢ y2
d         branch     false    x ⌢ ⟨a⟩ ⌢ y1 ⌢ ⟨c⟩ ⌢ y2
g         OP3        false    x ⌢ ⟨a⟩ ⌢ y1 ⌢ ⟨c⟩ ⌢ y2 ⌢ z

For the reverse trace the branch active flag is not needed. All jumps in the reverse execution sequence are handled by the reverse execution of arrival, which finds its continuation location on the history stack:

Location  Operation  History stack
                     x ⌢ ⟨a⟩ ⌢ y1 ⌢ ⟨c⟩ ⌢ y2 ⌢ z
g:        OP3        x ⌢ ⟨a⟩ ⌢ y1 ⌢ ⟨c⟩ ⌢ y2
f:        depart     x ⌢ ⟨a⟩ ⌢ y1 ⌢ ⟨c⟩ ⌢ y2
c:        OP2        x ⌢ ⟨a⟩ ⌢ y1 ⌢ ⟨c⟩
b:        arrival    x ⌢ ⟨a⟩ ⌢ y1
c:        OP2        x ⌢ ⟨a⟩
b:        arrival    x
a:        OP1        empty

We finish our discussion of branch instructions with a note on normal (N mode) forward execution. The data in the branch destination field compiled immediately following each branch instruction is interpreted slightly differently in N mode and C mode. N mode branch execution jumps to the instruction following the instruction jumped to by C mode execution of the same instruction. This means arrival is never executed in N mode and the mechanism for reverse execution of branches imposes no run time penalty on N mode code.
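The C mode and reverse traces for the backward branch of fig. 6 can be reproduced with a small simulation. The Python sketch below is an illustration under stated assumptions: locations are the labels a..g instead of addresses, the branch outcomes are supplied as a list, and the history tokens x, y1, y2, z are fixed strings standing for whatever each operation would record.

```python
PROGRAM = ["OP1", "arrival", "OP2", "branch", "b", "depart", "OP3"]
LABELS = "abcdefg"          # cell e holds the destination field ("b")

def forward_c(branch_outcomes):
    """Run the program forwards in C mode; return the final history stack."""
    records = {"OP1": iter(["x"]), "OP2": iter(["y1", "y2"]), "OP3": iter(["z"])}
    outcomes = iter(branch_outcomes)
    hstack, ba, pc = [], False, 0
    while pc < len(PROGRAM):
        op = PROGRAM[pc]
        if op == "arrival":
            if not ba:                           # local arrival
                hstack.append(LABELS[pc - 1])    # address of previous instruction
            ba = False
            pc += 1
        elif op == "branch":
            if next(outcomes):                   # active branch
                hstack.append(LABELS[pc - 1])    # the "from" address
                ba = True
                pc = LABELS.index(PROGRAM[pc + 1])
            else:
                pc += 3                          # step over destination and depart
        else:
            hstack.append(next(records[op]))     # history recorded by the operation
            pc += 1
    return hstack

def reverse(hstack):
    """Run backwards from the end; return the locations visited and the final stack."""
    hstack, visited, pc = list(hstack), [], len(PROGRAM) - 1
    while pc >= 0:
        op = PROGRAM[pc]
        visited.append(LABELS[pc])
        if op == "arrival":
            pc = LABELS.index(hstack.pop())      # continuation found on history stack
        elif op == "depart":
            pc -= 3                              # step back over destination and branch
        else:
            hstack.pop()                         # undo the operation
            pc -= 1
    return visited, hstack

history = forward_c([True, False])               # branch active, then inactive
visited, remaining = reverse(history)
```

The final forward history and the sequence of locations visited in reverse agree with the two traces tabulated above.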

8.4 The ACLA Multi-tasker

We have said that for the execution of certain constructs, execution must be cloned. The ACLA provides multi-tasking with a simple non-preemptive round robin scheduler. Creating a clone requires the allocation of memory for its state space, the copying of the existing state space into that of the clone, and linking the clone into the round robin as a task. Killing a task requires unlinking it from the round robin and de-allocating its memory space. Duplicating the compiled program code is not required as this is re-entrant.

To take a simple example using the concert construct, the execution of (S # T); U proceeds as follows. S works on the current state space, and an identical copy of the state space is created for T. Both S and T then execute in parallel. Task switches are performed by virtual machine instructions inserted at strategic points in the compiled code, such as backward jumps. The code for S and T is followed in each case by a virtual machine instruction whose role is to implement the terms of the termination pact. If T terminates first, the instruction which follows T de-allocates S’s copy of the state space. Control now passes on to U, which inherits the state space of T.

Clones do not perform any form of inter-task communication other than under the terms of their termination pact. Our interest in communicating concurrent state machines [15] attracted us, at first, to the idea of implementing inter-task communication via event synchronisation and channels. The underlying virtual machine can readily be extended to support these mechanisms. As an alternative to using explicit multi-tasking when modelling distributed systems, one can use B Abstract Machine composition mechanisms to create a monolithic machine from representations of conceptually concurrent state machines or processes [7] [16] [6]. With such composition, synchronisation becomes the conjunction of guards and channels become local variables.
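A non-preemptive round robin with a termination pact can be sketched using Python generators, where each yield plays the part of a task switch instruction. This is a simplified illustration: every clone, including the first, works on its own deep copy of the state, and the names concert, Infeasible and the sample tasks are assumptions made for the example.

```python
import copy

class Infeasible(Exception):
    """Raised by a task whose continuation proves infeasible."""

def concert(tasks, state):
    """Run each task on its own copy of the state, round robin.
    The first task to terminate wins and its state space is returned."""
    clones = []
    for task in tasks:
        clone_state = copy.deepcopy(state)       # copy the state space for the clone
        clones.append((task(clone_state), clone_state))
    while clones:
        gen, clone_state = clones.pop(0)
        try:
            next(gen)                            # run up to the next task switch
        except StopIteration:
            return clone_state                   # terminated first: others are dropped
        except Infeasible:
            continue                             # unlink the infeasible clone
        clones.append((gen, clone_state))        # back into the round robin
    raise Infeasible("every clone proved infeasible")

def slow(state):
    for _ in range(5):
        state["steps"] += 1
        yield
    state["winner"] = "slow"

def fast(state):
    state["steps"] += 1
    yield
    state["winner"] = "fast"

def looping(state):
    while True:                                  # never terminates
        yield

def impossible(state):
    raise Infeasible()
    yield                                        # unreachable; makes this a generator

first = concert([looping, slow, fast], {"steps": 0})
second = concert([impossible, slow], {"steps": 0})
```

The non-terminating task does not prevent a result, which reflects the angelic behaviour of simultaneous choice, and the returned state is the one a following command would inherit.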

9 Evaluating the Feasibility of an Event

Occasionally we would like to take a peek into the future and return with some information we have found there. Evaluating the feasibility of an event A is one such case. To do this the compiler generates code to attempt the execution of A, record whether the attempt was successful, and restore any state change caused by executing A. The difficulty is that restoring the state by reverse execution will lose the information thus gained. The virtual machine needs a way to record some information that will not be erased by backtracking. This is provided by the “fis flag” and the instructions setfis and resetfis, which set and reset the fis flag in forward execution but act as skips in reverse execution.

The compiled code to evaluate fis(A) is shown in Fig. 7. Comments on the left refer to forward execution and those on the right to reverse execution. We begin in forward mode by resetting the fis flag. Then bounce, whose name refers to how it behaves in reverse mode, skips two cells and execution continues with the operation A. When execution of A is attempted there are two cases. If A is found to be infeasible this will switch execution into reverse. The instruction pad, which is only executed in reverse mode, skips back over the previous cell (which holds a branch offset) and arrives at bounce. This switches back to forward mode execution and branches forward to the instruction beyond magic with the fis flag still reset. Alternatively, if A is feasible we continue forward execution after A with setfis⁷. We then encounter magic, which throws execution into reverse. The reverse execution of setfis is a skip as remarked above, so that after unexecuting A we arrive at bounce and branch forwards as described for the first case, but this time with the fis flag set.
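The essential idea, a flag that survives backtracking while all other state changes are undone, can be modelled in a few lines. The following Python sketch is an illustration under simplifying assumptions, not the virtual machine's code: state changes are logged on a trail instead of being reversed instruction by instruction, an event is a function that returns False when it is infeasible, and the names `Machine`, `store` and `decrement_if_positive` are hypothetical.

```python
class Machine:
    """Toy reversible machine: every store is logged so it can be undone."""

    def __init__(self, **variables):
        self.state = dict(variables)
        self.trail = []          # (name, old value) pairs for reverse execution
        self.fis_flag = False    # kept off the trail, so backtracking keeps it

    def store(self, name, value):
        # Assumes the variable already exists in the state.
        self.trail.append((name, self.state[name]))
        self.state[name] = value

    def reverse_to(self, mark):
        while len(self.trail) > mark:
            name, old = self.trail.pop()
            self.state[name] = old

    def fis(self, event):
        """Report whether event is feasible, leaving the state unchanged."""
        self.fis_flag = False            # plays the role of resetfis
        mark = len(self.trail)
        if event(self):                  # attempt A in forward mode
            self.fis_flag = True         # plays the role of setfis
        self.reverse_to(mark)            # reverse execution undoes A
        return self.fis_flag

def decrement_if_positive(m):
    """Hypothetical guarded event: feasible only when x > 0."""
    if m.state["x"] <= 0:
        return False
    m.store("x", m.state["x"] - 1)
    return True

m = Machine(x=1)
feasible = m.fis(decrement_if_positive)   # state is restored afterwards
```

After the call, `m.state` is unchanged, but the flag records what was learned during the attempt.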

⁷ Assuming A terminates! We are mainly thinking here in terms of the modelling of Abstract Systems, in which pre-conditions are not an issue.

    forward execution       code           reverse execution

    reset fis flag          resetfis
    skip forwards           bounce         switch into forward and branch
                            destination
                            pad            skip back to bounce
    execute A               A              unexecute A
    set fis flag            setfis         reverse skip
    switch into reverse     magic

Fig. 7. A Peek into the Future

10 Conclusions

A virtual machine architecture supporting reversible code can be used to support an executable interpretation of GSL. Such a machine could be implemented using an extension of the “indirect threaded code” technique of virtual machine construction. This extended form of indirect threaded code allows each instruction to have three interpretations: for normal forward execution, for information conserving execution, and for reverse execution of information conserving code.

There are a number of reasons why such an implementation should be efficient. The basic virtual machine is based on the Forth stack machine, which has been developed and proved in the area of real time applications. Once in a particular mode, the machine does not have to perform any test to know what to execute, but uses different code macros to thread execution through what is, in effect, a gestalt consisting of three intertwined but separate virtual machines. The virtual machine instruction set is extensible: new operations can be added in the native code of the host machine to tune its performance for use in a particular type of application. For GSL this means we can add efficient primitive operations to support the use of abstract data representations, as well as primitive operations which are of particular use to the GSL compiler. The three execution modes are sufficiently independent that the ability to compile reversible code imposes no speed penalty on the execution of forward code, other than the normal overheads of an indirect threaded code virtual machine. Reversible and normal (abstract and concrete) operations can be freely intermixed.

The “clairvoyant” aspects of GSL, which allow it to avoid infeasible choices, and of ACL, which allow it to avoid non-terminating choices, are simulated by means of backtracking using reversible computation or by cloning program execution. The effect of a miracle (in the absence of other choice) is to switch execution into reverse.

For the probabilistic modelling of Abstract Systems we have proposed and described the implementation of rGSL, a variant of Morgan’s pGSL. It maintains the property of sublinearity from which the desirable properties of pGSL may be derived, but differs from pGSL in that its predicate transformer semantics are those of non-deterministic choice. The expression of n-way probabilistic choice in rGSL is integer based, giving each choice an integer weight and a probability which is that weight over the total weights of feasible choices.

Acknowledgements. The author thanks the referees for their comments, which were greatly appreciated. Thanks to Andrew Haley of Cygnus Solutions, Cambridge, and Gordon Charlton of the Forth Interest Group for sharing their knowledge of reversible computation.


References

1. J R Abrial. Extending B without Changing it (for Developing Distributed Systems). In H Habrias, editor, The First B Conference, ISBN: 2-906082-25-2, 1996.
2. J R Abrial and L Mussat. Introducing Dynamic Constraints in B. In D Bert, editor, B98: Recent Developments in the Use of the B Method, number 1393 in Lecture Notes in Computer Science, 1998.
3. H G Baker. The Thermodynamics of Garbage Collection. In Y Bekkers and J Cohen, editors, Memory Management: Proc IWMM’92, number 637 in Lecture Notes in Computer Science, 1992. See ftp://ftp.netcom.com/pub/hb/hbaker/ReverseGC.html.
4. C Bennett. The logical reversibility of computation. IBM Journal of Research and Development, 6, 1973.
5. C Bennett. The thermodynamics of computation. International Journal of Theoretical Physics, 21, pp 905-940, 1982.
6. M Butler. csp2B: A practical approach to combining CSP and B. In J M Wing, J Woodcock, and J Davies, editors, FM99 vol 1, number 1708 in Lecture Notes in Computer Science. Springer Verlag, 1999.
7. M J Butler. An approach to the design of distributed systems with B AMN. In J P Bowen, M J Hinchey, and D Till, editors, ZUM ’97: The Z Formal Specification Notation, number 1212 in Lecture Notes in Computer Science, 1997.
8. ANSI J14 Technical Committee. American National Standard for Information Systems: Programming Languages-Forth. American National Standards Institute, 1994.
9. S E Dunne, A J Galloway, and W J Stoddart. Specification and Refinement in General Correctness. In A Evans, editor, 3rd Northern Formal Methods Workshop, EWIC (Electronic Workshops in Computing): www/ewic.org.uk/ewic/workshop/view.cfm/NFM-98. BCS, 1998.
10. S E Dunne, W J Stoddart, and A J Galloway. Extending the generalised substitution to model semi-decidable operations. In H Habrias, editor, The First B Conference, ISBN: 2-906082-25-2, 1996.
11. D Jacobs and D Gries. General Correctness: a unification of partial and total correctness. Acta Informatica, 22, 1985.
12. J McCarthy. The inversion of functions defined by Turing Machines. In C E Shannon and J McCarthy, editors, Automata Studies. Princeton, 1956.
13. C Moore. Forth, a new way to program mini-computers. The Annals of Leiden Observatory, 1972. See http://forth.gsfc.nasa.gov1.
14. C Morgan. The Generalised Substitution Language Extended to Probabilistic Programs. In D Bert, editor, B98: Recent Developments in the Use of the B Method, number 1393 in Lecture Notes in Computer Science, 1998.
15. W J Stoddart. An Introduction to the Event Calculus. In J P Bowen, M J Hinchey, and D Till, editors, ZUM ’97: The Z Formal Specification Notation, number 1212 in Lecture Notes in Computer Science, 1997.
16. W J Stoddart, S E Dunne, A J Galloway, and R Shore. Abstract State Machines: Designing Distributed Systems with State Machines and B. In D Bert, editor, B98: Recent Developments in the Use of the B Method, number 1393 in Lecture Notes in Computer Science, 1998.

A Computation Model for Z Based on Concurrent Constraint Resolution

Wolfgang Grieskamp

Technische Universität Berlin, FB13, Institut für Kommunikations- und Softwaretechnik, Sekr. 5–13, Franklinstr. 28/29, D–10587 Berlin.

Abstract. We present a computation model for Z, which is based on a reduction to a small calculus, called µZ, and on concurrent constraint resolution techniques applied for computing in this calculus. The power of the model is comparable to that of functional logic languages, and combines the strength of higher-order functional computation with logic computation. The model is implemented as part of the ZETA system, where it is used for executing Z specifications for the purpose of test-data evaluation and prototyping.

1 Introduction

The automatic evaluation of test data for safety-critical systems is an interesting application that can help to put formal methods into industrial practice. Some studies report that more than 50% of development costs in this application area go into testing. A setting for test-case evaluation that can improve this situation is as follows: given a requirements specification, some input data describing a test case, and the output data from a run of the system’s implementation on the given input, we check by executing the specification whether the implementation meets its requirements. At first sight, this goal would seem to be simple, since input and output data are fixed. However, a real-world specification of a complex system may contain a lot of “hidden” data, which is used to describe the observable behavior. Thus, the problem scales up to finding solutions to (a sequence of) partial data bindings. Moreover, for embedded systems, a run over a few minutes may easily produce sequences of several thousand data bindings, making the execution of the specification time-critical, such that symbolic proving techniques are hard to apply.

In the application-oriented research project Espress¹, which is concerned with methods and tools for the development of embedded systems, the set-based specification language Z [20] is used for requirements specification in combination with other notations such as Statecharts, which are incorporated by a shallow encoding in Z [5]. Performing test-case evaluation in this setting requires a computation model for Z.

In this paper, we report on some results of the efforts on defining and implementing such a model. The main theme of the paper is the presentation of the µZ calculus and the definition of a computation model for it. The µZ calculus is a pure expression language with only a few constructs which abstracts from the diversity of Z. Since it can embed the full Z language – the simply-typed λ-calculus, set calculus, predicate calculus, and schema calculus – computation in µZ cannot be complete. However, a computation model and an implementation by an abstract machine can be provided which is comparable to that of functional logic paradigms (see [10] for an overview of the functional logic paradigm).

The work presented in this paper describes a refinement of an existing compiler for Z, which is part of the ESPRESS tool environment ZETA [4]. Within the old compiler, Z is reduced to an earlier version of µZ, and µZ is compiled to Java, using a functional (deterministic) approach to execution. Case studies performed together with industry partners demonstrated the feasibility of the principal approach, but also clearly showed that deterministic computation is a serious restriction for test-data evaluation, since it significantly constrains the way in which requirements specifications can be formulated in Z. Thus this work defines an enhanced computation model for µZ, incorporating indeterministic logic principles. For the enhanced model we report on, an abstract machine has been developed [8,9], and a high-performance implementation in C++ has been derived, which is part of the ZETA release since version 1.5.

This paper is organized as follows. In Sec. 2 an introduction to the µZ calculus is given. Sec. 3 defines the mapping of central Z concepts to µZ. Sec. 4 sketches the model semantics of µZ. Sec. 5 provides a complete description of the computation model of µZ in the style of natural semantics. In Sec. 6, finally, we give some examples of executable Z constructs, analyze the power of the computation model, and point out possible enhancements, whereas in Sec. 7 we discuss related work and provide the conclusion.

¹ Espress was a joint project involving industrial partners (Daimler-Benz AG and Robert Bosch AG) and academic research institutes (GMD-First, FhG-ISST, and TU-Berlin), funded by the German “Bundesministerium für Bildung, Wissenschaft, Forschung und Technologie”.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 414–432, 2000. © Springer-Verlag Berlin Heidelberg 2000

2 The µZ Calculus: Syntax and Informal Semantics

The µZ calculus is a set-based expression language which can embed notations such as Z as well as functional and logical programming languages. The calculus is designed for automatic analysis and transformation, where it is desirable to work with a small set of language constructs. However, µZ is not “as small as possible”². The calculus has redundancy in that it allows extensional as well as intensional description of the same meaning. For example, the set containing the numbers 1 and 2 can be denoted in µZ as {1} ∪ {2} or by the set comprehension {x | (x, 1) ∈ (_ ≥ _) ∧ (x, 2) ∈ (_ ≤ _)}. The support of both intensional and extensional description allows the definition of computation as a source-level transformation from intensional towards extensional representation.

The syntax of µZ is given as follows. Let x ∈ V be variable symbols, and let ρ ∈ T be (term) constructor symbols. We have the syntactic categories of expressions e, a subset of them, patterns p, and of constraints φ:

    e ∈ E ::= x | ρ(ē) | {e} | e ∪ e′ | {p \ x̄ | φ} | homδ e
    p ∈ P ⊆ E
    φ ∈ C ::= e ⊆ e′ | true | false | φ ∧ φ′
    δ ∈ ∆ == {µ, E, …}

The form ρ(ē) describes a term constructor application, where ē is a sequence of expressions. (We generally use the convention to denote sequences of terms t by t̄.) The set of patterns, P ⊆ E, are those expressions which are variables, x, or constructor applications to patterns, ρ(p̄). The form {e} denotes a singleton set, the form e ∪ e′ set union. With {p \ x̄ | φ} a set comprehension is given: it denotes the set of values which match the pattern p such that there exists a solution for the local existential variables x̄ such that the constraint φ is satisfied under the substitution of the match (constraints are described below). We will write {p | φ} for the case that #x̄ = 0.

The last expression form, homδ e, denotes a homomorphism on the set e. In general, a homomorphism enumerates the elements of a set to compute a result from them. The µZ calculus provides a fixed (but extendible) number of built-in homomorphisms, whose treatment in the computation model is homogeneous. In this paper, we will explicitly use two of them:

– the µ-homomorphism determines the “µ-value” of a set. It is defined iff the set is not empty and contains no distinct elements; the unique element is then delivered.
– the E-homomorphism forces the extensional enumeration of elements of a set; it is defined iff the set is finitely enumerable, then yielding the set itself.

homδ e is the only expression form in our calculus which introduces undefinedness: the other expression forms can be interpreted as mere term constructions, whose interpretation is eventually demanded by a homomorphism.

We now take a closer look at the constraints used in set comprehensions. We have tautologies, subset constraints, and constraint conjunction (denoted as φ ∧ φ′). The following abbreviations are used for constraints and expressions:

    e ∈ e′ == {e} ⊆ e′        0 == {x | false}
    e = e′ == e ∈ {e′}        1 == {x | true}

² Higher-order logic [7], for instance, has a smaller number of language constructs.
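To make the grammar concrete, the syntax can be written down as a small abstract syntax tree. The following Python sketch is only an illustration: the class and function names (`Var`, `Con`, `Compr`, `member`, `equal`, `is_pattern`, and so on) are hypothetical and are not taken from the ZETA implementation.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Var:                      # x
    name: str

@dataclass(frozen=True)
class Con:                      # constructor application to a sequence
    name: str
    args: Tuple["Expr", ...] = ()

@dataclass(frozen=True)
class Single:                   # {e}
    elem: "Expr"

@dataclass(frozen=True)
class SetUnion:                 # e ∪ e′
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Compr:                    # {p \ locals | constraint}
    pattern: "Expr"
    local_vars: Tuple[Var, ...]
    constraint: "Constr"

@dataclass(frozen=True)
class Hom:                      # homomorphism of the given kind applied to e
    kind: str
    arg: "Expr"

@dataclass(frozen=True)
class Subset:                   # e ⊆ e′
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Truth:                    # true / false
    value: bool

@dataclass(frozen=True)
class And:                      # conjunction of two constraints
    left: "Constr"
    right: "Constr"

Expr = Union[Var, Con, Single, SetUnion, Compr, Hom]
Constr = Union[Subset, Truth, And]

def member(e, s):
    """e ∈ s is an abbreviation for {e} ⊆ s."""
    return Subset(Single(e), s)

def equal(e1, e2):
    """e1 = e2 is an abbreviation for e1 ∈ {e2}."""
    return member(e1, Single(e2))

EMPTY = Compr(Var("x"), (), Truth(False))      # 0 == {x | false}
UNIVERSE = Compr(Var("x"), (), Truth(True))    # 1 == {x | true}

def is_pattern(e):
    """Patterns are variables or constructor applications to patterns."""
    if isinstance(e, Var):
        return True
    return isinstance(e, Con) and all(is_pattern(a) for a in e.args)
```

With this encoding, the abbreviations are ordinary functions that expand into subset constraints, which is the only primitive relation between expressions.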

The µZ calculus uses a shallow polymorphic type system in the ML style. From a set of basic type constructors for given and free types (which may be generic), higher types are built by powerset construction. Cartesian product is treated as a special (generic) free type constructor. Value constructors ρ are associated with an arity describing their argument types and their result type. Based on this, we can assume that expressions have a principal type. In the sequel, we only look at such well-formed expressions and ignore typing considerations.

As a tiny example, consider a constraint which defines the append function on lists. Lists are denoted by the constructors for the empty list, nil, and a list cell, cons(x, xs). Tuples are denoted by (_, …, _), a constructor application in mixfix notation:

    append = {((nil, ys), ys) | true}
           ∪ {((cons(x, xs), ys), cons(x, zs)) | ((xs, ys), zs) ∈ append}
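Because append is defined as a set of tuples rather than as a directed function, it can be queried in more than one direction. As an illustration only, the following Python sketch enumerates all pairs (xs, ys) that are related to a known result zs; the function name `append_splits` and the encoding of lists as Python tuples are assumptions of this sketch.

```python
def append_splits(zs):
    """Yield every (xs, ys) such that ((xs, ys), zs) is in append.

    Lists are encoded as Python tuples: () plays the role of nil and
    (x,) + xs plays the role of cons(x, xs).
    """
    # First comprehension: ((nil, ys), ys)
    yield (), zs
    # Second comprehension: ((cons(x, xs), ys), cons(x, zs'))
    if zs:
        x, rest = zs[0], zs[1:]
        for xs, ys in append_splits(rest):
            yield (x,) + xs, ys

splits = list(append_splits((1, 2)))
```

Each branch of the generator corresponds to one of the two set comprehensions in the definition, and the recursion mirrors the recursive membership constraint.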

3 Mapping Z to µZ

Before we discuss the semantics and computation model of µZ, let us take a closer look at the mapping of Z to µZ. We do not give a formal and complete definition of the translation here, but only the most important equivalences.

3.1 Abstraction and Application

Z’s set comprehension, λ-abstraction, and relational application are mapped as follows³:

    {x : e • e′}    ⇝  {y \ x | x ∈ e ∧ y = e′}
    λ x : e • e′    ⇝  {(x, y) | x ∈ e ∧ y = e′}
    e e′            ⇝  homµ {y | (e′, y) ∈ e}

This handles functions as sets of pairs. The mapping of the application also captures relations: it is sufficient that e is a binary relation which is right-unique and defined just at the point e′.

As we will see later on, the µZ computation model uses applicative order reduction for expressions, taking set comprehensions to be in normal form. But in the above mapping of λ x : e • e′, for example, the domain e is shifted into the comprehension, and thus not strictly evaluated. In order to enforce a strict evaluation of the domain, we can use the following alternative mapping⁴:

    λ x : e • e′    ⇝  homµ {f \ t | t = e ∧ f = {(x, y) | x ∈ t ∧ y = e′}}

3.2 Predicates

We look at the mapping of Z predicates in a context of a µZ set comprehension, {p \ x̄ | C q}, where C is a conjunctive context (a sequence of one or more conjuncted constraints), and q, q′ are not yet mapped Z predicates (the restriction to a comprehension context is not a real one, since an entire Z specification is treated as a top-level set comprehension). In order to represent truth values, sets over a singleton type are used, whose element is constructed by tt, such that truth is {tt} and falsity 0:

    {p \ x̄ | C(q ∨ q′)}              ⇝  {p \ x̄ | C(tt ∈ {tt | q} ∪ {tt | q′})}
    {p \ x̄ | C(q ⇒ q′)}              ⇝  {p \ x̄ | C({tt | q} ⊆ {tt | q′})}
    {p \ x̄ | C(¬ q)}                 ⇝  {p \ x̄ | C({tt | q} ⊆ 0)}
    {p \ x̄ | C(∃ y : e | q • q′)}    ⇝  {p \ x̄, y | C(y ∈ e ∧ q ∧ q′)}
    {p \ x̄ | C(∀ y : e | q • q′)}    ⇝  {p \ x̄ | C({y | y ∈ e ∧ q} ⊆ {y | q′})}

³ For ease of readability, we ignore name clash problems in the presentation – which can be tackled in the usual ways.
⁴ Our compiler uses this mapping for λ-abstraction. In general, it compiles expressions appearing in the declaration part of Z schema text to become strictly evaluated.
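The encoding of truth values as sets can be checked on a small finite model. The Python sketch below is only an illustration, assuming ordinary Python booleans for already-decided predicates and a finite domain for quantification; the names `truth_set`, `disj`, `implies`, `neg` and `forall` are hypothetical.

```python
TT = "tt"
TRUE = frozenset({TT})     # truth is {tt}
FALSE = frozenset()        # falsity is 0, the empty set

def truth_set(q):
    """{tt | q}: the singleton {tt} if q holds, otherwise the empty set."""
    return TRUE if q else FALSE

def disj(q1, q2):
    """q ∨ q′ as tt ∈ {tt | q} ∪ {tt | q′}."""
    return TT in (truth_set(q1) | truth_set(q2))

def implies(q1, q2):
    """q ⇒ q′ as {tt | q} ⊆ {tt | q′}."""
    return truth_set(q1) <= truth_set(q2)

def neg(q):
    """¬ q as {tt | q} ⊆ 0."""
    return truth_set(q) <= FALSE

def forall(domain, q1, q2):
    """∀ y : e | q • q′ as {y | y ∈ e ∧ q} ⊆ {y | q′}.

    The right-hand set is restricted to the finite domain here, which
    does not change the outcome of the subset test.
    """
    left = {y for y in domain if q1(y)}
    right = {y for y in domain if q2(y)}
    return left <= right
```

On decided predicates these definitions agree with the usual boolean connectives; only the subset tests go beyond simple membership, as noted in the surrounding text.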

In these mappings, the introduction of the general subset operator (instead of its special version, e ∈ e′) indicates the critical predicates regarding computability (negation, implication, and universal quantification). A general subset relation requires the most sophisticated resolution techniques, and is not always computable in our model, as we will see later on.

3.3 Schema Calculus

In Z, a schema denotes a set of bindings, where bindings are tuples (records) with named components. If we treat bindings simply as term constructors, the schema calculus maps easily to µZ. Let the binding constructor of, for example, the Z schema [a, b : ℕ | a < b] be ⟨| a == _, b == _ |⟩. The given schema can be represented as a µZ set comprehension

    {⟨| a == x, b == y |⟩ | x ∈ ℕ; y ∈ ℕ; (x, y) ∈ (_ < _)}

where we used mixfix notation for the binding constructor. In order to explain the operators of the schema calculus, we use Σ(e) to denote a pattern representing a binding of e. With Σ(e) ⋈ Σ(e′) we construct the pattern for the “joined” signatures of e and e′. Σ(e)′ denotes the “primed” version of a binding pattern for e:

    e ∧ e′    ⇝  {Σ(e) ⋈ Σ(e′) | Σ(e) ∈ e ∧ Σ(e′) ∈ e′}
    e ∨ e′    ⇝  {Σ(e) ⋈ Σ(e′) | Σ(e) ∈ e} ∪ {Σ(e) ⋈ Σ(e′) | Σ(e′) ∈ e′}
    e′        ⇝  {Σ(e)′ | Σ(e) ∈ e}

In a similar way, all the other operators of the schema calculus can be mapped.

3.4 Types and Genericity

A Z given type, [T], generates a corresponding 0-ary type constructor in µZ’s type system. In the context where the given type is introduced, the constraint T = 1 is added. A Z free type, T ::= c⟨⟨e⟩⟩ | …, introduces the given type T in µZ’s type system and constructors ρc with the corresponding arity. It generates the constraints T = 1 and c = {(x, y) | x ∈ e ∧ y = ρc(x)}.


Z’s genericity is more general than the shallow polymorphism of µZ, since constrained instances can be denoted. Consider e.g. the generic definition id[X] == {x : X • (x, x)} of the identity relation. Then id[{1, 2}] is a possible instantiation and denotes the set {(1, 1), (2, 2)}. Generic definitions of Z therefore need to be mapped to functions over the generic’s instance. The identity function is thus represented in µZ as id = {(X, id′) | id′ = {(x, x) | x ∈ X}}. The application of this generic name in Z, id[e], is function application in µZ, id e, thus eventually the literal form homµ {y | (e, y) ∈ id}, which delivers the identity relation instantiated with e.

In many applications of genericity in Z, the instantiation of generics is automatically derived by the type checker. These instances are unconstrained: they are given types, power-sets of unconstrained types, and so on. Exploiting such cases is the subject of an easy-to-perform partial evaluation on µZ expressions: for example, id T, where T is a given type, is equivalent to id 1, which after unsugaring and unfolding yields homµ {y | (1, y) ∈ {(X, id′) | id′ = {(x, x) | x ∈ X}}}, which results in homµ {y | y = {(x, x) | x ∈ 1}}, and finally (by an instance of the “one-point-rule” for eliminating the µ-homomorphism) in {(x, x) | true}.
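The treatment of a generic definition as a function over its instance, and of application as the µ-value of a comprehension, can be illustrated on finite sets. The following Python sketch is an assumption-laden model, not the compiler's translation: sets are Python frozensets, undefinedness is modelled by raising an exception, and the names `mu`, `apply` and `generic_id` are hypothetical.

```python
def mu(s):
    """µ-value of a set: its unique element; undefined otherwise."""
    elems = set(s)
    if len(elems) != 1:
        raise ValueError("µ is undefined: set is empty or has distinct elements")
    return next(iter(elems))

def apply(rel, arg):
    """Application of a finite relation of pairs to an argument,
    computed as the µ-value of {y | (arg, y) ∈ rel}."""
    return mu({y for (x, y) in rel if x == arg})

def generic_id(X):
    """id[X]: the identity relation instantiated with the set X."""
    return frozenset((x, x) for x in X)

identity_on_12 = generic_id({1, 2})
two = apply(identity_on_12, 2)
```

Applying the instantiated relation outside its domain, for example to 3, leaves the comprehension empty, so the µ-value is undefined and the sketch raises an error.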

4 Semantics

The model of µZ is a hierarchical typed universe, where sets are represented as partial characteristic (boolean) functions in a set-theoretical meaning, which can be founded by Z itself. This representation of sets is a generalization of partial three-valued logics (see e.g. [18]) to the case of set-algebra. There are three observations to be made about a set represented as a partial characteristic function: an element is a member (it is in the domain and is mapped to true), an element is not a member (it is in the domain and is mapped to false), or it is unknown whether it is a member or not (it is not in the domain). For example, consider the expression {x | ((1, x), 1) ∈ div}: for 0, membership is unknown, for all other numbers it is known (1 is a member and the other numbers are not).

The representation of sets allows for a natural definition of the set-algebraic operations. Let f1, f2 be the encoding of two sets in Z as functions A ⇸ ℙ[] (where booleans are represented as the empty schema, with ∅ falsity and {⟨⟩} truth). Set intersection and set union are defined in our model as follows:

    f1 ∩ f2 = λ x : (dom f1 ∩ dom f2) ∪ (f1∼)(| {false} |) ∪ (f2∼)(| {false} |) • [| f1 x ∧ f2 x]
    f1 ∪ f2 = λ x : (dom f1 ∩ dom f2) ∪ (f1∼)(| {true} |) ∪ (f2∼)(| {true} |) • [| f1 x ∨ f2 x]

Thus, as sets with unknowns are combined with other sets, unknowns may vanish. On intersection, falsity dominates undefinedness and on union, truth dominates undefinedness.

Since we have union and intersection as total functions on sets, a µZ powerset domain constitutes a complete lattice, with the empty set (the total characteristic function constantly delivering false) the bottom element and the universal set of the corresponding type the top element. The undefined set (that is, the characteristic function with the empty domain) is the bottom of a sub-lattice which starts in the “middle” of this lattice. In the lattice of a powerset, we can expect to construct fixed-points. Note that this does not necessarily mean that we can compute every construct of µZ (this would require continuity or at least monotonicity of all constructs): the µZ calculus and its semantic model is not constructive a priori, otherwise we would not be able to map full Z to the calculus. Not restricting the calculus to computability has the advantage that we can give a meaning to µZ specifications which are not entirely computable, but where our computation goal is concerned only with a computable subpart.

A rough edge in µZ’s model is the treatment of free types. Given a type constructor t and a value constructor ρ : … × ℙ t → t, we know that the constructor cannot be free in a set-based model. We need to restrict its domain to the finite subsets of t, or the finite approximation of some chain in a lattice. Thus constructors for recursive free types are actually partial injective functions regarding the domain of their arity in µZ’s type system.
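The three-valued set operations defined in this section can be tried out on finite examples. In the following Python sketch, which is an illustration rather than part of the model, a set is a dictionary from elements to booleans and an element missing from the dictionary has unknown membership; the function names `intersect` and `union` are hypothetical.

```python
def intersect(f1, f2):
    """Three-valued intersection of partial characteristic functions.

    An element known to be outside either set is outside the result,
    even if the other set is undefined at that element.
    """
    result = {}
    for x in set(f1) | set(f2):
        a, b = f1.get(x), f2.get(x)      # None means "unknown"
        if a is False or b is False:
            result[x] = False            # falsity dominates undefinedness
        elif a is True and b is True:
            result[x] = True
    return result

def union(f1, f2):
    """Three-valued union of partial characteristic functions."""
    result = {}
    for x in set(f1) | set(f2):
        a, b = f1.get(x), f2.get(x)
        if a is True or b is True:
            result[x] = True             # truth dominates undefinedness
        elif a is False and b is False:
            result[x] = False
    return result

f1 = {0: True, 1: False}
f2 = {0: True, 2: False, 3: True}
both = intersect(f1, f2)
either = union(f1, f2)
```

In `both`, the elements 1 and 2 are known non-members although each is unknown in one operand, while 3 stays unknown; in `either`, 3 becomes a known member and 1 and 2 stay unknown.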

5 Computation Model

A computation model for µZ is described in the style of natural semantics [16]. The intention is to get a model which is abstract enough to provide an idea of computation, but also concrete enough to serve as a starting point for an efficient implementation. The computation model is not primarily designed with the goal to get “maximal” computation power out of µZ (a complete model cannot be realized anyway). Efficient implementability and transparent traceability of execution are as significant as computation power. The last point is in particular important for the application to test-case evaluation, where the reason for the failure of a test needs to be understandable by humans. Two major design decisions are derived from these considerations:

– we use a strict (applicative) evaluation order. Strictness in the context of µZ means that members of the singleton set, {e}, are completely evaluated before the set is constructed. Similarly, in constructed values, ρ(ē), the arguments are fully evaluated prior to the constructor call.
– we will use depth-first, left-to-right resolution. Though our model can be easily extended to breadth-first, an implementation with breadth-first search is considerably more complex and less efficient.

Though depth-first search is used, the resolution techniques go beyond those of Prolog, since the computation model provides full “parallel” execution of conjunctive constraints. Thus, when the constraint of {p | φ1 ∧ φ2} is executed, and one of the φi’s diverges, the result is nevertheless determined if the other constraint fails.

5.1 Domains

Values of the computation model are a normal form of expressions as specified by the following grammar:

    v ∈ E_V ::= ρ(v̄) | {v} | {p \ x̄ | φ} | v ∪ v′ | x

Thus in values all applications of the homδ e form not appearing inside of a set-comprehension are eliminated. Values can be “cyclic”, that is, regularly infinite. It is well known how cyclic terms can be represented by acyclic structures (for example, by a pair of a variable and a value assigned to this variable, where the variable may appear in the value) and we thus use them without explicit formalization.

Variables contained in a value can be “frozen”. Freezing a variable x maps it to a unique variable x_f. With freeze X v the variables x ∈ X are frozen in v, with unfreeze X v they are unfrozen. The function free v delivers those free variables in a value which are not frozen, the function frozen v those which are. The purpose of freezing variables is to treat them as constants regarding unification in certain contexts, as will be discussed below.

A substitution is a mapping from variables to values, defined in the usual way. With bound σ = {x : dom σ | σ x ≠ x} the bound variables of a substitution are denoted. It holds that bound σ ∩ ⋃((free ∪ frozen)(| ran σ |)) = ∅. With σ[x := v] we extend a substitution by a variable assignment; the value v may contain references to x, which are converted to cyclic terms. With σ ◦ σ′ we denote composition of substitutions.

A (partial) equality on values, written as v ∼ v′, is available. Variables (frozen or not) are equal only by name. Set comprehensions are equal by syntactic structure (where no renaming of variables is involved)⁵. The equality takes commutativity, associativity and idempotency of set union into account, and handles cyclic terms. Furthermore, we have the relation v ≁ v′ to indicate inequality of values. Note that this is not the negation of v ∼ v′, since inequality of variables of different names cannot be decided, nor can inequality of set comprehensions.

A few further auxiliary definitions are required. In the computation rules, we use the notion of a goal, a substitution paired with a constraint, written as θ = σ :: φ. A choice is a sequence of goals, θ̄ = ⟨θ1, …, θn⟩. The propagation of a substitution over a choice is defined as

    σ ▷ ⟨σ1 :: φ1, …, σn :: φn⟩ = ⟨σ ◦ σ1 :: φ1, …, σ ◦ σn :: φn⟩

5.2 Reduction

Computation is defined by two relations which are mutually dependent. The first relation, e −σ→ e′, describes a reduction step on expressions under the substitution σ. The second, θ̄ ↪ θ̄′, describes a resolution step by mapping a choice into a choice.

The rules for e −σ→ e′ are given in Fig. 1. Here, S describes a strict reduction redex, which is specified by a “grammar with a hole”. Every expression which is not embedded in a set-comprehension is strict. Rule E1 defines reduction in such contexts. Rule E2 describes the unfolding of a bound variable. The remaining rules for expression reduction handle homomorphisms. To abstract from the particular kind of homomorphism, we use the following characteristic functions for homomorphisms:

– hs = start δ describes the initial value of an internal state hs which is used to represent accumulated information as the elements of a set are stepwise enumerated.
– hs′ = next δ (hs, v) describes how the state hs is transformed for the next element v. In general, next δ might be a partial function, where (hs, v) ∉ dom(next δ) means that for the given state hs and value v no transition is possible.
– v′ = stop δ (hs, v) describes that the reduction of δ can be stopped, since enough information for computing the result v′ is available. For example, a homomorphism representing a test whether a set is non-empty can immediately stop when the first solution is encountered. Again, stop δ is partial, the domain test indicating when it can be applied.
– v = end δ hs describes how the final result is computed from the state hs after all elements have been enumerated. end δ may be partial.

In rules E3 to E7 in Fig. 1, we use the intermediate expression form HOMδ(hs, X, x, θ̄) to represent the state of homomorphism reduction: hs is the current internal state, X a set of frozen context variables and θ̄ a choice which represents possible solutions for the variable x. In Rule E3, we initialize this intermediate form: hs is set to start δ, and the choice to ⟨σ :: x ∈ freeze X v⟩, where x is a fresh variable and X the set of free variables contained in the set-value v. These variables need to be frozen (which will be discussed below).

    S · ::= (S ·) ∪ e | e ∪ (S ·) | {S ·} | ρ(…, S ·, …) | homδ (S ·) | ·

    E1    e −σ→ e′
          ─────────────────
          S e −σ→ S e′

    E2    x ∈ bound σ
          ─────────────────
          x −σ→ σ x

    E3    X = free v ;  x ∉ X ∪ frozen v
          ─────────────────────────────────────────────────────────────
          homδ v −σ→ HOMδ(start δ, X, x, ⟨σ :: {x} ⊆ freeze X v⟩)

    E4    σ ▷ θ̄ ↪ θ̄′
          ─────────────────────────────────────────────
          HOMδ(hs, X, x, θ̄) −σ→ HOMδ(hs, X, x, θ̄′)

    E5    (hs, σ′ x) ∈ dom(next δ)
          ─────────────────────────────────────────────────────────────────────────────
          HOMδ(hs, X, x, ⟨σ′ :: true⟩ ⌢ θ̄) −σ→ HOMδ(next δ (hs, σ′ x), X, x, θ̄)

    E6    (hs, σ′ x) ∈ dom(stop δ)
          ─────────────────────────────────────────────────────────────────────────────
          HOMδ(hs, X, x, ⟨σ′ :: true⟩ ⌢ θ̄) −σ→ unfreeze X (stop δ (hs, σ′ x))

    E7    hs ∈ dom(end δ)
          ─────────────────────────────────────────────────────
          HOMδ(hs, X, x, ⟨⟩) −σ→ unfreeze X (end δ hs)

Fig. 1. Expression Reduction

⁵ In the implementation, this kind of equality can be easily detected by pointer equality.


The choice embedded in HOMδ(hs, X, x, θ) is reduced by Rule E4. The substitution of the expression reduction context is propagated over θ, in order to make the current bindings of frozen context variables available. Each time a solution for θ is found, we can either continue by applying next δ (Rule E5), or stop enumerating solutions with stop δ (Rule E6). end δ is invoked if all solutions are enumerated (Rule E7). The context variables in the result of end δ and stop δ are unfrozen. Note that it might be possible that none of the rules E5 to E7 is applicable, since the functions characterizing the homomorphism are not defined for the current reduction state. In this case, our expression is irreducible, which we identify with "undefined". A further source of undefinedness is divergence, which happens if sets are enumerated infinitely.

Characterizing Basic Homomorphisms. Here are the definitions of the characteristic functions for the E homomorphism, which computes an extensional representation of a set (if it exists), and of the µ homomorphism. In the definitions we assume that for all unmentioned cases the arguments are not in the domain of the corresponding functions. We give no cases for the stop function (implying that it is the totally undefined function), since both homomorphisms need to enumerate the entire set:

$$\begin{aligned}
\mathrm{start}\;E &= 0 \\
\mathrm{next}\;E\;(v, v') &= v \cup \{v'\} \\
\mathrm{end}\;E\;v &= v
\end{aligned}
\qquad\qquad
\begin{aligned}
\mathrm{start}\;\mu &= \varnothing \\
\mathrm{next}\;\mu\;(\varnothing, v) &= \{v\} \\
\mathrm{next}\;\mu\;(\{v\}, v') &= \{v\} \quad \text{if } v \sim v' \\
\mathrm{end}\;\mu\;\{v\} &= v
\end{aligned}$$
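The two definitions can be read as ordinary folds with partial step functions. The Python sketch below is an illustration only: `fold_partial`, `ext` and `mu` are hypothetical names, `None` stands for "outside the domain", and the approximative relation ∼ is replaced by plain equality, which is an assumption.

```python
def fold_partial(start, next_fn, end_fn, elements):
    """Fold next_fn over elements; None means 'not in the domain'."""
    hs = start
    for v in elements:
        hs = next_fn(hs, v)
        if hs is None:
            return None
    return end_fn(hs)


def ext(elements):
    """E: collect every enumerated element into an extensional set."""
    return fold_partial(
        frozenset(),
        lambda hs, v: hs | {v},       # next E (v, v') = v ∪ {v'}
        lambda hs: hs,                # end E v = v
        elements,
    )


def mu_next(hs, v):
    if not hs:                        # next µ (∅, v) = {v}
        return frozenset({v})
    (u,) = hs
    return hs if u == v else None     # defined only if v ∼ v' (here: ==)


def mu_end(hs):
    if len(hs) == 1:                  # end µ {v} = v
        (u,) = hs
        return u
    return None                       # undefined for the empty state


def mu(elements):
    """µ: the unique element of a set, undefined (None) otherwise."""
    return fold_partial(frozenset(), mu_next, mu_end, elements)


print(ext([1, 2, 1]))
print(mu([7, 7]))
print(mu([1, 2]))
```

Both step functions give the same result for any enumeration order, which is the property the text requires of homomorphism functions.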

Note that homomorphism functions in general need to be right-commutative, since the order in which the elements of a set are enumerated is not determined.

Frozen Variables. During homomorphism reduction, variables from the context of the reduction need to be frozen, so that resolution (as described in the next section) cannot generate assignments for them. To understand the reason for freezing, consider an expression like {y | homE {tt | y = 1} ⊆ 0}, which is a possible literal representation of {y | ¬ y = 1}. Clearly, during the reduction of the homomorphism, we are not allowed to generate bindings for y. Instead, constraints in negated contexts must residuate until the required context variables are bound.

5.3 Resolution

Having defined expression reduction, we now look at the rules for constraint resolution, which define the relation θ ⇝ θ′, mapping a choice into a choice.

Selection and Reduction. The first group of rules describes the selection of the subset constraint to be reduced next, and the reduction of expressions in subset constraints (Fig. 2). Let C describe a conjunctive context. Rule C1 uses C φ to select some constraint to be reduced next from the head goal of the choice.


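$$C\,\cdot \;::=\; (C\,\cdot) \wedge \phi \;\mid\; \phi \wedge (C\,\cdot) \;\mid\; \cdot$$

$$\text{(C1)}\quad \frac{\langle \sigma :: e \subseteq e' \rangle \leadsto \langle \sigma_1 :: \phi_1, \ldots, \sigma_n :: \phi_n \rangle}{\langle \sigma :: C(e \subseteq e') \rangle \frown \theta \leadsto \langle \sigma_1 :: C\,\phi_1, \ldots, \sigma_n :: C\,\phi_n \rangle \frown \theta}$$

$$\text{(C2)}\quad \frac{i, j \in \{1, 2\};\quad i \neq j;\quad \phi_i = \mathrm{true}}{\langle \sigma :: C(\phi_1 \wedge \phi_2) \rangle \frown \theta \leadsto \langle \sigma :: C\,\phi_j \rangle \frown \theta}$$

$$\text{(C3)}\quad \frac{e_i \xrightarrow{\sigma} e_i';\quad i, j \in \{1, 2\};\quad i \neq j;\quad e_j' = e_j}{\langle \sigma :: e_1 \subseteq e_2 \rangle \leadsto \langle \sigma :: e_1' \subseteq e_2' \rangle}$$

Fig. 2. Constraint Resolution Rules: Selection and Reduction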

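$$\text{(C4)}\quad \langle \sigma :: \{\rho(v_1, \ldots, v_n)\} \subseteq \{\rho(v_1', \ldots, v_n')\} \rangle \leadsto \langle \sigma :: \{v_1\} \subseteq \{v_1'\} \wedge \ldots \wedge \{v_n\} \subseteq \{v_n'\} \rangle$$

$$\text{(C5)}\quad \frac{\rho \neq \rho'}{\langle \sigma :: \{\rho(\ldots)\} \subseteq \{\rho'(\ldots)\} \rangle \leadsto \langle\rangle}$$

$$\text{(C6)}\quad \frac{i, j \in \{1, 2\};\quad i \neq j;\quad v_i \text{ not frozen variable};\quad v_i \notin \mathrm{bound}\;\sigma}{\langle \sigma :: \{v_1\} \subseteq \{v_2\} \rangle \leadsto \langle \sigma[v_i := v_j] :: \mathrm{true} \rangle}$$

$$\text{(C7)}\quad \frac{v \sim v'}{\langle \sigma :: \{v\} \subseteq \{v'\} \rangle \leadsto \langle \sigma :: \mathrm{true} \rangle}
\qquad
\text{(C8)}\quad \frac{v \not\sim v'}{\langle \sigma :: \{v\} \subseteq \{v'\} \rangle \leadsto \langle\rangle}$$

$$\text{(C9)}\quad \frac{i, j \in \{1, 2\};\quad i \neq j;\quad v_i \text{ non-extensional set};\quad v_i' = \mathrm{hom}_E\;v_i;\quad v_j' = v_j}{\langle \sigma :: \{v_1\} \subseteq \{v_2\} \rangle \leadsto \langle \sigma :: \{v_1'\} \subseteq \{v_2'\} \rangle}$$

Fig. 3. Constraint Resolution Rules: Unification and Equality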

Note that by the definition of C this selection is non-deterministic, indicating that constraints in conjunctions are reduced concurrently. Note moreover that, since we only select from the head goal, our resolution strategy resembles depth-first search. The selected subset constraint in Rule C1 is rewritten in the rule's premise into a choice which is not necessarily a singleton. Propagating the context over these several possibilities in the result can be realized in an implementation by backtracking. Rule C2 in Fig. 2 eliminates tautologies, and Rule C3 handles expression reduction in subset constraints. Before actual resolution can start, expressions are reduced to a normal form, such that all homomorphisms are eliminated and bound variables in strict contexts are substituted.

Unification and Equality. The rules given in Fig. 3 describe the resolution of equalities, that is, subset constraints of the form {v} ⊆ {v′}, by unification. Rule C4 models the decomposition of constructor terms, Rule C5 the case where constructors are detected to be unequal. Rule C6 handles variable assignment: only non-frozen, free variables can be assigned. Note that no occurs-check is performed.
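The interplay of decomposition, constructor clash and variable assignment can be sketched as a small first-order unifier. The Python below is an illustration under stated assumptions, not the paper's machine: terms are either `Var` objects or tuples `(constructor, arg, ...)`, the names `Var`, `Suspended` and `unify` are hypothetical, failure (the empty choice) is modelled as `None`, and a constraint that could only be solved by binding a frozen variable raises `Suspended` to mimic residuation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


class Suspended(Exception):
    """The constraint must wait until a frozen variable becomes bound."""


def walk(term, subst):
    """Follow variable bindings in subst until an unbound term is reached."""
    while isinstance(term, Var) and term.name in subst:
        term = subst[term.name]
    return term


def unify(a, b, subst, frozen=frozenset()):
    """Return an extended substitution, or None if unification fails."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    for x, y in ((a, b), (b, a)):
        # Variable assignment (cf. Rule C6): only unbound, non-frozen variables.
        if isinstance(x, Var) and x.name not in frozen:
            return {**subst, x.name: y}   # note: no occurs-check
    if isinstance(a, Var) or isinstance(b, Var):
        raise Suspended("only frozen variables could be assigned")
    # Both sides are constructor terms, written as tuples (name, args...).
    if a[0] != b[0] or len(a) != len(b):
        return None                       # constructor clash (cf. Rule C5)
    for x, y in zip(a[1:], b[1:]):        # decomposition (cf. Rule C4)
        subst = unify(x, y, subst, frozen)
        if subst is None:
            return None
    return subst


x = Var("x")
print(unify(("S", x), ("S", ("Z",)), {}))
print(unify(("S", x), ("Z",), {}))
```

Passing `frozen={"x"}` to the first call would raise `Suspended` instead of binding `x`, which corresponds to a constraint that residuates inside a homomorphism reduction.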

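$$\text{(C10)}\quad \langle \sigma :: 0 \subseteq v \rangle \leadsto \langle \sigma :: \mathrm{true} \rangle
\qquad
\text{(C11)}\quad \langle \sigma :: \{v\} \subseteq 0 \rangle \leadsto \langle\rangle$$

$$\text{(C12)}\quad \frac{\text{rename } p, x, \phi \text{ wrt } \sigma}{\langle \sigma :: \{v\} \subseteq \{p \setminus x \mid \phi\} \rangle \leadsto \langle \sigma :: \{v\} \subseteq \{p\} \wedge \phi \rangle}$$

$$\text{(C13)}\quad \langle \sigma :: v_1 \cup v_2 \subseteq v \rangle \leadsto \langle \sigma :: v_1 \subseteq v \wedge v_2 \subseteq v \rangle$$

$$\text{(C14)}\quad \langle \sigma :: \{v\} \subseteq v_1 \cup v_2 \rangle \leadsto \langle \sigma :: \{v\} \subseteq v_1,\; \sigma :: \{v\} \subseteq v_2 \rangle$$

$$\text{(C15)}\quad \langle \sigma :: \{p \setminus x \mid \phi\} \subseteq v \rangle \leadsto \langle \sigma :: \mathrm{hom}_E\,\{p \setminus x \mid \phi\} \subseteq v \rangle$$

Fig. 4. Constraint Resolution Rules: Membership and Inclusion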

The remaining rules are for the case that we look at equality between sets (that is, something like {{. . .}} ⊆ {{. . .}}). If we can decide this equality by use of the approximative relations ∼ (respectively ≁), we are done. However, if the sets are not extensional, we need to convert them to an extensional representation, which is driven by Rule C9, inserting a homE vi expression.

Membership and Inclusion. The rules in Fig. 4 define the resolution of membership and general inclusion constraints. Rules C10 and C11 handle tautologies. In Rule C12, {v} ⊆ v′ (v ∈ v′) is treated for the case that v′ = {p \ x | φ}. The set comprehension v′ is instantiated in the resolution context by adding the equality {v} ⊆ {p} and the constraint φ, after renaming its variables away from the variables of the context. Rule C13 reduces a set union on the left-hand side of a subset constraint into a conjunction of subset constraints. Rule C14 reduces a union on the right-hand side, thereby creating a choice. C14 is the only source of choice creation, and corresponds to rule selection in logic languages. For example, consider the append relation given in Sec. 2, defined as a union of two set comprehensions, term ∪ recur, one describing the termination case, the other the recursive case. For resolving (xs, ys, zs) ∈ term ∪ recur, we create the choices (xs, ys, zs) ∈ term and (xs, ys, zs) ∈ recur. Rule C15, finally, handles a set comprehension on the left-hand side of a subset constraint: this can be resolved only by extensional enumeration, such that afterwards C13 can take over.

Confluence, Totalness and Strategies. The applicability of the equality checks in C7/C8 overlaps with that of some other rules. However, in these cases the equality rules are only shortcuts which yield similar results. In principle, a further candidate source of non-confluence is the non-deterministic selection of the next subset constraint from the head of a choice (Rule C1), but in a conjunction of subset constraints it is obvious that this non-determinism introduces no problems for confluence.

As seen from the syntactic shapes of the resolution rules, we cover all cases of subset constraints where the left-hand and the right-hand side are not variables. Thus our resolution is not total. Since expressions can be irreducible, resolution is not total from this end either. Moreover, for resolving some subset constraints it is necessary to introduce new expressions, converting intensional descriptions by a homomorphism into extensional ones (Rules C9 and C15). The failure of extensionalization of sets is the source of failure of resolution of these subset constraints.

Though our system of resolution rules is confluent, the order in which the rules are applied makes a big difference regarding efficiency. In an implementation, we employ the so-called Andorra Principle [22], suspending the creation of choices (Rule C14) as long as possible and preferring deterministic over non-deterministic computation, a technique which is known to prune the search space dramatically for many applications.

5.4 Implementation

The computation model of µZ implies the design of an abstract machine [9], called ZAM, which has been implemented in C++ as part of the ZETA system. The ZAM is a concurrent constraint resolution machine, which has some similarities with the machine described for Oz in [17], but inherits its particular set-based flavor from the µZ calculus. Since it is a byte-code machine, concurrent constraint resolution can be implemented quite efficiently using software threads. Threads basically use logic variables for communication. Trailing and backtracking can be efficiently realized. The Andorra Principle is implemented by using thread priorities. More details are found in [8,9].
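The effect of the Andorra Principle can be illustrated independently of the ZAM. The Python sketch below is a toy goal scheduler, not the thread-priority mechanism described above: the goal constructors `member` and `less`, the function `solve` and the `stats` counter are all hypothetical. A goal maps the current bindings to a list of alternative binding extensions, or to `None` if it has to suspend.

```python
def member(var, values):
    """Goal 'var ∈ values': a test if var is bound, else one choice per value."""
    def goal(bindings):
        if var in bindings:
            return [{}] if bindings[var] in values else []
        return [{var: v} for v in values]
    return goal


def less(a, b):
    """Goal 'a < b': suspends (None) until both variables are bound."""
    def goal(bindings):
        if a in bindings and b in bindings:
            return [{}] if bindings[a] < bindings[b] else []
        return None
    return goal


def solve(goals, bindings, andorra, stats):
    """Enumerate solutions; stats['branches'] counts explored alternatives."""
    if not goals:
        yield bindings
        return
    runnable = []
    for i, g in enumerate(goals):
        alts = g(bindings)
        if alts is not None:
            runnable.append((i, alts))
    if not runnable:
        return          # only suspended goals remain (residuation, not modelled)
    if andorra:
        # Prefer goals with the fewest alternatives: deterministic ones first.
        i, alts = min(runnable, key=lambda pair: len(pair[1]))
    else:
        i, alts = runnable[0]           # plain left-to-right selection
    rest = goals[:i] + goals[i + 1:]
    for alt in alts:
        stats["branches"] += 1
        yield from solve(rest, {**bindings, **alt}, andorra, stats)


def run(andorra):
    goals = [member("x", [1, 2, 3]), member("y", [1, 2, 3]),
             member("y", [2]), less("x", "y")]
    stats = {"branches": 0}
    solutions = list(solve(goals, {}, andorra, stats))
    return solutions, stats["branches"]


print(run(andorra=True))
print(run(andorra=False))
```

Both strategies find the same single solution, but the deterministic goal `y ∈ {2}` is run before any choice is created when `andorra=True`, so fewer branches are explored.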

6 What Can be Computed?

Our model has limitations, some of them by design, others because the right (efficient) resolution techniques are not yet integrated. But which Z specifications, as they are mapped to µZ, can actually be computed, and which cannot? In this section, we give some examples and also identify some restrictions and possible enhancements for the future.

6.1 The Positive Answer

The central answer to the positive version of the question is by analogy: we can compute all constructs of Z which correspond to functional languages, as well as all constructs which correspond to logic languages such as Prolog. We look at some examples to illustrate these capabilities.


Logic Style. Set objects (relations or functions) are executable if they are defined by (recursive) equations, as in the following example, where we define natural numbers as a free type, addition on these numbers and an order relation:

N ::= Z | S⟨⟨N⟩⟩

three == S(S(S(Z)))

add : ℙ((N × N) × N)
add = {y : N • (Z, y) ↦ y} ∪ {x, y, z : N | (x, y) ↦ z ∈ add • (S x, y) ↦ S z}

less == {x, y : N | ∃ t : N • (x, S t) ↦ y ∈ add}

In the ZETA system, we may now execute queries such as the following, where we ask for the pair of sets less and greater than three:

({x : N | (x, three) ∈ less}, {x : N | (three, x) ∈ less})
⇒ ({Z,S(Z),S(S(Z))},{S(S(S(S(t))))})

Note that the second value of the resulting pair is a singleton set containing the free variable t. These capabilities are similar to logic programming. In fact, we can give a translation from any clause-based system to a system of recursive set-equations in the style given for add, where we collect all clauses for the same relational symbol into a union of set comprehensions, and map literals R(e1, . . . , en) to membership tests (e1, . . . , en) ∈ R.

Functional Style. The elegance of the functional paradigm comes from the fact that functions are first-class citizens. In our implementation of execution for Z, sets are first-class citizens as well. For example, we can execute operators such as relational image on the basis of their natural definition:

[X, Y] (| |) == λ R : ℙ(X × Y); S : ℙ X • {x : X; y : Y | x ∈ S ∧ (x, y) ∈ R • y}

We can now query for the relational image of the add function over the cartesian product of the numbers less than three:

let ns == {x : N | (x, three) ∈ less} • add (|ns × ns|)
⇒ {Z,S(Z),S(S(Z)),S(S(S(Z))),S(S(S(S(Z))))}

It is also possible to define the arrow types of Z, as shown below for the set of partial functions:

[X, Y] ⇸ == {R : ℙ(X × Y) | (∀ x : X | x ∈ dom R • ∃1 y : Y • (x, y) ∈ R)}


This example makes use of universal and unique existential quantification, which are a possible source of non-executability in our setting. As seen in Sec. 3, the predicate ∀ x | p • q is mapped to the subset constraint {x | p} ⊆ {x | q}. A subset constraint with a set comprehension on the left-hand side is only resolvable if the set comprehension can be enumerated by the E-homomorphism. Thus, if we try to check whether add is a function, we get in a few seconds:

add ∈ N × N ⇸ N
⇒ still searching after 200000 steps
gc #1 reclaimed 28674k of 32770k
...

In enumerating the domain of add our computation diverges. However, if we restrict add to a finite domain it works:

∃1 ns == {x : N | (x, three) ∈ less} • (ns × ns) ◁ add ∈ N × N ⇸ N
⇒ *true*

The examples show that Z's way of dealing with functions and relations is fully supported: most toolkit functions are executable on the basis of their natural definition in Z. Obviously, by our mapping of the schema calculus to sets, the usual sequential specification style of Z is fully supported as well.

Specifications. A Z specification is represented in µZ as a "top-level" set comprehension containing as constraints all the axiomatic definitions of the incorporated Z sections (see footnote 6). We can execute such specifications even if they are loosely specified. Consider the following axiomatic definitions:

x, y : N
y ∈ {z : N | (z, three) ∈ less}

Querying for the pair (x, y) yields:

(x, y) ⇒ (x,Z)
(possibly more solutions) ...
(x,S(Z))
(possibly more solutions)

Thus the ZETA frontend allows us to step through the different "models" of a loose specification. The case that a constant is not constrained at all (x in the example) causes no problems as long as its value is not required for other computations.

Footnote 6: In order to support efficient separate compilation of Z sections, the reality is a little more complicated, but the conceptual view is as described.
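For readers more familiar with conventional languages, the logic-style and functional-style queries of this section can be mimicked in Python. This is only an analogy: ZETA derives both directions from the single definition of add by resolution, whereas the sketch hand-codes a forward function `add` and an inverse enumeration `add_splits`. All names are hypothetical, and the symbolic answer S(S(S(S(t)))) with a free variable has no counterpart here.

```python
Z = ("Z",)


def S(n):
    """Constructor S of the free type N ::= Z | S<<N>>."""
    return ("S", n)


def add(x, y):
    """Forward mode: the unique z with (x, y) |-> z in add."""
    return y if x == Z else S(add(x[1], y))


def add_splits(z):
    """Inverse mode: enumerate all (x, y) with (x, y) |-> z in add."""
    yield (Z, z)                        # first comprehension: (Z, y) |-> y
    if z != Z:                          # second: (S x, y) |-> S z'
        for x, y in add_splits(z[1]):
            yield (S(x), y)


def less_than(bound):
    """{x | (x, bound) in less}: x + S t = bound for some t."""
    return {x for x, y in add_splits(bound) if y != Z}


def to_int(n):
    """Convert a Peano numeral to a Python int, for display only."""
    count = 0
    while n != Z:
        n = n[1]
        count += 1
    return count


three = S(S(S(Z)))
ns = less_than(three)
image = {add(a, b) for a in ns for b in ns}   # add (| ns x ns |)

print(sorted(to_int(n) for n in ns))
print(sorted(to_int(n) for n in image))
```

The first result corresponds to {Z,S(Z),S(S(Z))} and the second to the relational image shown above, with values up to S(S(S(S(Z)))).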

6.2 The Negative Answer (And What Can be Enhanced)

Declarations. The most obvious restriction when trying to execute a typical Z specification with our approach is that restrictions on set objects, as often introduced by declarations, are only executable if the objects are finitely enumerable. The example of the partial function arrow and the divergence of the test add ∈ N × N ⇸ N in the previous section has illustrated this. There are two pragmatic ways to deal with this. One is to treat all constraints introduced by Z declarations, x : e, as assumptions, which are discarded on their translation to µZ. A second is to demand that the user explicitly mark up assumed declarations by using a special primitive function assumed, applied as in add : assumed N × N → N. This function is semantically the identity, but is treated specially by the compiler. Both approaches are currently supported by the ZETA system, controlled by user switches. It has turned out in several applications that discarding declaration constraints is the more practical approach. If a declared membership is actually a constraint required for execution, the user has to place it in the constraint part of the schema text. However, this treatment violates the Z semantics, which defines add : N × N → N to be equivalent to add : ℙ((N × N) × N) | add ∈ N × N → N.

Definition Forms. When providing recursive definitions of relations and functions, one currently has to stick to certain conventions. For example, the factorial function would typically be specified in Z by axioms of the kind:

fac 0 = 1; ∀ x : N | x > 0 • fac(x) = x ∗ fac(x − 1)

A definition of this form is not amenable to execution. Instead one has to write:

fac = {(0, 1)} ∪ (λ x : N | x > 0 • x ∗ fac(x − 1))

The automatic conversion of axioms of the former form to definitions of the latter form is the subject of ongoing research, using techniques similar to those described in [3].

Powerset And The Like. The resolution techniques for general subset constraints in our model are relatively weak (they capture just functional and logic computation, not more). For example, we are not able to enumerate the powerset ℙ A by its representation in µZ as {x | x ⊆ A}, even if A is finitely enumerable (we can, however, test whether a ground value is in this set). Technically, the constraint resolution system presented in this paper lacks rules for the following kinds of subset constraints:

– subset constraints where the left-hand or the right-hand side is a free variable: x ⊆ v or v ⊆ x. These constraints can generally not be resolved before x becomes bound. A typical instance is the powerset schema.
– constraints where the left-hand side is a set comprehension which fails to be enumerable: {x | φ} ⊆ v. These constraints are resolved by enumerating {x | φ} using homE. A typical instance is the universal quantifier.
– subset constraints which represent unification problems on sets, for instance {{x ∪ y}} ⊆ {{1} ∪ {2}}. These constraints cannot produce bindings for x and y, since set unification is currently not employed in our model. Instead, they must residuate until other constraints provide bindings.

There exist techniques which can (partially) solve these problems. An adaptation of Gentzen's cut rule to constraint solving has been analyzed in [24] on the basis of the µZ calculus. Subset constraints are also a subject of research in the realm of set-based program analysis (e.g. [1]), subset logic programming [14] and set unification (e.g. [2]). Our set-based model provides a framework for smooth integration of these techniques, which, however, needs further consolidation.

Resolution In Specific Domains. Though it is possible in our model to represent e.g. natural numbers by constructors 0 and S⟨⟨N⟩⟩, as shown in our toy examples, enabling resolution techniques on them, this approach is hopelessly inefficient for real applications. Thus numbers are actually integrated by a native implementation, and resolution techniques for arithmetic constraints are not available. However, sophisticated solvers exist for e.g. linear arithmetic, as for many other specialized domains. A goal of future work is to provide an interface for integrating foreign solvers into our computation model and its implementation.
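To make the powerset restriction discussed under "Powerset And The Like" concrete, the following Python sketch contrasts the two tasks. Testing a ground value against {x | x ⊆ A} only needs a subset check, which matches what the text says the model can do; enumerating the set needs a dedicated generator, which is the kind of rule the resolution system lacks. The function names are hypothetical and the enumeration is ordinary Python, not a proposal for µZ.

```python
from itertools import combinations


def in_powerset(x, a):
    """Membership test for a ground value: x ∈ ℙ A iff x ⊆ A."""
    return set(x) <= set(a)


def enumerate_powerset(a):
    """Enumerate ℙ A for a finite A by generating subsets of each size."""
    items = list(a)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]


A = {1, 2, 3}
print(in_powerset({1, 3}, A))
print(in_powerset({4}, A))
print(len(enumerate_powerset(A)))
```

The membership test never has to guess a value for x, whereas the enumeration has to invent candidate subsets, which corresponds to resolving x ⊆ A with x unbound.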

7 Conclusion and Related Work

Executing Z. Animation of the "imperative" part of Z is provided by the ZANS tool [15], where "imperative" refers to Z's specification style for sequential systems using state transition schemas. This approach is highly restricted. An elaborated functional approach for executing Z has been described in [21], though no implementation exists today. A mapping of Z to Haskell is described in [6], where monads are used for dealing with the sequential specification style, but no logic resolution is employed. Mappings to Prolog are described, e.g., in [23]. Mappings of Z to Mercury are described in [25]; however, in this approach the effective power of execution is restricted compared with ours, because the data flow has to be determined statically. The approach presented in this paper goes beyond all the others, since it allows the combination of the functional and logic aspects of Z in a higher-order setting, treating sequential behaviors as first-class citizens, comparable to the integration of imperative computation into functional languages using monads.

Implementation of Functional Logic Languages. The set-based approach, which µZ inherits from Z, provides a way of viewing the λ-calculus, predicate calculus, schema calculus, set algebra, etc., in a unified, small expression language. Though the motivation of µZ and its implementation is the execution of Z, our approach is also interesting for the implementation of declarative programming languages in general, since µZ can embed functional logic languages such as Curry [12]. In its intention, the µZ calculus is comparable to the calculi given for Oz in [19].


The computation model we have given is comparable to the one described in [11]. Normalized set values correspond to the "definitional trees" used in that work. Higher-orderness has a cleaner representation in µZ, since set-values are first-class inhabitants of the operational domain, which definitional trees are not. Moreover, our model supports encapsulated search by computation of the homδ e form, which goes beyond [11]. A strict reduction order has been chosen for easier traceability of execution by humans, which is important for the application to test-data evaluation, and for a supposedly better overall performance. Experience with functional programming and language implementation shows that the overhead of laziness is significant, which in practice often leads programmers to use strictness annotations [13].

Acknowledgment. The discussions in our constraint group at the TU Berlin have contributed much to the presentation of µZ and its computation model in this paper. Thanks go to Petra Hofstedt, Markus Lepper and Jacob Wieland for the fruitful discussions. Moreover, the referees have provided useful information to enhance the paper.

References

1. A. Aiken and E. Wimmers. Solving systems of set constraints. In Symposium on Logic in Computer Science, pages 329–340, June 1992.
2. P. Arenas-Sanchez and A. Dovier. A minimality study for set unification. Journal of Functional and Logic Programming, 1997(7), 1997.
3. R. D. Arthan. Recursive definitions in Z. In J. P. Bowen, A. Fett, and M. G. Hinchey, editors, ZUM'98: The Z Formal Specification Notation, volume 1493 of Lecture Notes in Computer Science, pages 154–171. Springer-Verlag, 1998.
4. R. Büssow and W. Grieskamp. The ZETA System Documentation. Technische Universität Berlin, Dec. 1998. URL: http://uebb.cs.tu-berlin.de/zeta.
5. R. Büssow and W. Grieskamp. A Modular Framework for the Integration of Heterogenous Notations and Tools. In K. Araki, A. Galloway, and K. Taguchi, editors, Proc. of the 1st Intl. Conference on Integrated Formal Methods – IFM'99. Springer-Verlag, London, June 1999.
6. H. S. Goodman. The Z-into-Haskell tool-kit: An illustrative case study. In J. P. Bowen and M. G. Hinchey, editors, ZUM'95: The Z Formal Specification Notation, volume 967 of Lecture Notes in Computer Science, pages 374–388. Springer-Verlag, 1995.
7. M. Gordon and T. Melham, editors. Introduction to HOL: A theorem proving environment for higher-order logics. Cambridge University Press, 1993.
8. W. Grieskamp. The µZ calculus and its implementation. In Proceedings of the International Workshop on Implementation of Declarative Languages (IDL'99), Sept. 1999.
9. W. Grieskamp. A Set-Based Calculus and its Implementation. PhD thesis, Technische Universität Berlin, 1999.
10. M. Hanus. The integration of functions into logic programming: From theory to practice. Journal of Logic Programming, 19(20), 1994.
11. M. Hanus. A unified computation model for declarative programming. In Proc. of the 1997 Joint Conference on Declarative Programming, Grado (Italy), 1997.
12. M. Hanus. Curry – an integrated functional logic language. Technical report, Internet, 1999. Language report version 0.5.
13. P. H. Hartel, M. Feeley, M. Alt, L. Augustsson, P. Baumann, M. Beemster, E. Chailloux, C. H. Flood, W. Grieskamp, J. H. G. van Groningen, K. Hammond, B. Hausman, M. Y. Ivory, R. E. Jones, J. Kamperman, P. Lee, X. Leroy, R. D. Lins, S. Loosemore, N. Röjemo, M. Serrano, J.-P. Talpin, J. Thackray, S. Thomas, P. Walters, P. Weis, and P. Wentworth. Benchmarking implementations of functional languages with "pseudoknot", a Float-Intensive benchmark. J. of Functional Programming, 6(4), 1996.
14. B. Jayaraman and K. Moon. Subset logic programs and their implementation. Journal of Logic Programming, 19(20), 1999.
15. X. Jia. An approach to animating Z specifications. Internet: http://saturn.cs.depaul.edu/~fm/zans.html, 1996.
16. G. Kahn. Natural semantics. In Symposium on Theoretical Computer Science (STACS'97), volume 247 of Lecture Notes in Computer Science, 1987.
17. M. Mehl, R. Scheidhauer, and C. Schulte. An Abstract Machine for Oz. Research Report RR-95-08, Deutsches Forschungszentrum für Künstliche Intelligenz, Stuhlsatzenhausweg 3, D66123 Saarbrücken, Germany, June 1995. Also in: Proceedings of PLILP'95, Springer-Verlag, LNCS, Utrecht, The Netherlands.
18. O. Owe. Partial logic reconsidered: A conservative approach. Formal Aspects of Computing, 5:208–223, 1997.
19. G. Smolka. A calculus for higher-order concurrent constraint programming with deep guards. Research Report RR-94-03, Deutsches Forschungszentrum für Künstliche Intelligenz, Stuhlsatzenhausweg 3, D-66123 Saarbrücken, Germany, Feb. 1994.
20. J. M. Spivey. The Z Notation: A Reference Manual. Prentice Hall International Series in Computer Science, 2nd edition, 1992.
21. S. Valentine. The programming language Z−−. Information and Software Technology, 37(5–6):293–301, May–June 1995.
22. D. H. D. Warren. The extended andorra model with implicit control. In ICLP'90 Parallel Logic Programming Workshop, 1990.
23. M. M. West and B. M. Eaglestone. Software development: Two approaches to animation of Z specifications using Prolog. IEE/BCS Software Engineering Journal, 7(4):264–276, July 1992.
24. J. Wieland. A resolution algorithm for general subtyping constraints. Master's thesis, Technische Universität Berlin, 1999.
25. M. Winikoff, P. Dart, and E. Kazmierczak. Rapid prototyping using formal specifications. In Proceedings of the Australasian Computer Science Conference, 1998.

Analysis of Compiled Code: A Prototype Formal Model

R.D. Arthan

Lemma 1 Ltd., 2nd Floor, 31A Chain Street, Reading UK RG1 2HX
[email protected]

Abstract. This paper reports on an experimental application of formal specification to inform analysis of compiled code. The analyses with which we are concerned attempt to recover abstraction and order from the potentially chaotic world of machine code. To illustrate the kind of abstractions of interest, we give a formal model of a simple microprocessor. This is based on a traditional state-based Z specification, but builds on that to produce a behavioural model of the microprocessor. We use the behavioural model to specify a higher-order notion: the concept of a program whose control flow can be decomposed into basic blocks. Finally, we report on the use of our techniques in the development of tools for analysis of compiled code for a real microprocessor.

1 Introduction

1.1 Background

Much of the emphasis in formal methods research has been on formalisation of the process of specifying and designing systems. However, techniques and tools for analysing software engineering artefacts are of considerable importance. This paper is intended to give a simple example of how notations such as Z may be used to provide rigorous foundations for program analysis. Because of its importance in safety-critical applications (such as avionics systems), we are particularly concerned with analysis of compiled code. Rigorous development methods greatly increase confidence in the outputs of the systems engineering process. However, those outputs still require validation, typically by inspection and testing. We believe that automated or semi-automated analysis of compiled code will be of increasing importance for system validation and that it deserves a rigorous mathematical foundation.

1.2 Program Analysis

The formal model that we present as an example of our approach is adapted from a specification originally written in late 1997. At that time, Dr. Tom Lake of Interglossa and the present author were doing some preliminary work on techniques for analysing and verifying compiled code.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 433–449, 2000. © Springer-Verlag Berlin Heidelberg 2000

Our thinking was influenced by Interglossa's suite of reverse engineering tools called REAP. These tools carry out static analysis on assembly language code to retrofit the kind of abstractions that one might expect to find in compiler-generated code. Programs that pass the conditions imposed by the analysis conform to disciplines of control flow and data access of the sort typically obeyed by a compiler. In REAP, the analysis provides a semantic justification for a translation of the assembly language code into a higher level language such as C. Our belief was (and remains) that this kind of analysis should also justify a tractable approach to formal reasoning about low level code.

1.3 Formal Model

Providing a formal underpinning for the kinds of analysis we have in mind requires a formal model of the execution environment for the code being analysed. Towards this goal, the present author wrote a behavioural model of a simple but general microprocessor. The idea was that the control flow and data access disciplines of interest can be formally defined as constraints on the possible behaviours of particular programs. Not all programs will conform to the disciplines that we impose; however, those that do should be significantly more amenable to formal reasoning. In safety-critical applications, we would claim that non-conforming programs should be deemed unacceptable. This would be the analogue at the machine code level of the use of "safe" high-level language subsets (such as SPARK Ada). One would expect compilers to generate conforming target code for most reasonable high level language programs.

To demonstrate a simple form of control flow and data access discipline, we formalise the notion of a basic blocks decomposition of a program running on the microprocessor. If a basic blocks decomposition can be shown to be correct, then we know that the program is not self-modifying and never executes data. In other words, we can validly treat the program as running on a Harvard architecture rather than the von Neumann architecture of the physical microprocessor.

1.4 Expressing Higher-Order Properties in Z

The basic blocks abstraction is a so-called higher-order property; i.e., it cannot be expressed as a constraint on one state transition, but rather has to be expressed in terms of complete execution histories. Traditional methods for using Z focus on specification of a system as a state-transition machine, i.e., via first-order properties alone. However, Z provides all the mathematical features needed to specify higher-order properties and the schema calculus helps to abbreviate many of the definitions. The approach is to construct a behavioural model of the system in terms of a specification in the traditional style. This paper is intended both to illustrate and to promote this approach to formal modelling.
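The difference between a constraint on one transition and a property of a complete execution history can be sketched in Python. The example below is illustrative only and is not the Z specification developed in this paper: the record type `Step`, the function `respects_code_data_split` and the sample addresses are hypothetical. It checks, for one recorded history and one fixed set of code addresses, the two consequences mentioned earlier: the program never executes data and never modifies its own code.

```python
from typing import NamedTuple


class Step(NamedTuple):
    pc: int               # address the executed instruction was fetched from
    writes: frozenset     # addresses written by that instruction


def respects_code_data_split(history, code):
    """True if every fetch comes from `code` and no step writes into `code`.

    The set `code` is fixed once for the whole history, which is why the
    check is phrased over histories rather than over a single transition.
    """
    return all(step.pc in code and not (step.writes & code)
               for step in history)


code = frozenset({0, 1, 2})
well_behaved = [Step(0, frozenset({100})), Step(1, frozenset()), Step(2, frozenset({101}))]
self_modifying = [Step(0, frozenset({1})), Step(1, frozenset())]
executes_data = [Step(0, frozenset()), Step(100, frozenset())]

print(respects_code_data_split(well_behaved, code))
print(respects_code_data_split(self_modifying, code))
print(respects_code_data_split(executes_data, code))
```

A specification of the property would quantify over all possible histories of a program; the sketch only tests individual histories, which is what a run-time monitor could do.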

1.5 Specification of Program Analyses

Many algorithms have the characteristic that it is much easier to specify the solution to be found than it is to design and specify an algorithm that finds it. Milner's type inference for ML [6] is an example. Discovering properties like the basic blocks abstraction by automatic analysis often involves techniques such as abstract interpretation, which are algorithmically complex. We believe it is important to have rigorous specifications of what the results of such analyses mean. The specification of the basic blocks abstraction in this paper is intended to demonstrate that it is possible to give rigorous and concise definitions of the requirements on a program analysis without giving the implementation detail.

1.6 Structure of the Paper

The rest of this paper is organised as follows:

– Section 2 gives a model of the simple microprocessor. This is a behavioural model, i.e., it characterises a program running on the microprocessor by its input/output relation. As an aside, we show how the behavioural model allows us to formalise the concept of refinement.
– Section 3 specifies the notion of a decomposition of a program into basic blocks. This demonstrates a simple, but not untypical, example of the kind of property that advanced program analysis techniques are used to find.
– Section 4 gives some concluding remarks, including a list of the shortcomings of the simple model we present here and a discussion of how some of these were addressed in a real-life example. An index to the Z specification is given at the end of this section.

2 Processor Model

In this section we give a complete behavioural model of a somewhat idealised microprocessor. In a real example, we would have rather more work to do transcribing the manufacturer’s data sheets along the lines of the Z framework we set up here. However, the work has not been found to be excessive on an actual example.

Our approach is first to develop a state-transition model of the microprocessor using the traditional Z style for specifying a sequential system [8,9]. This is not dissimilar in spirit to the specification in chapter 9 of [3], although we choose to abstract away some of the details such as instruction decoding. We then use the state-transition model to construct a behavioural model — a specification of the observable behaviour of the microprocessor formulated as a relation between input and output streams.

In more detail, the plan of the specification is as follows:

– First of all, in section 2.1, we give our model of the registers and memory of the microprocessor. These provide “data types” that are used throughout the rest of the specification.


R.D. Arthan

– In section 2.2, we define the state space of the microprocessor.
– In section 2.3, we describe a kind of abstract syntax for the instruction set of the microprocessor.
– In section 2.4, we specify in the traditional Z style how the instructions of section 2.3 are executed.
– In section 2.5, we pull together the operations defined in section 2.4 into a single schema, TICK, describing one processor execution cycle.
– Finally, in section 2.6, we define sets to represent input and output streams and use the schema TICK to construct the behavioural model.

The specification is written in the dialect of Z supported by the ProofPower system [2,5], which was used to prepare and type-check all the Z in this document. The global variables in the Z are listed in the index in section 4 and are shown in a bold font at the point of their definition.

2.1 Register and Memory Model

The following otherwise unspecified positive numbers give the maximum values of a memory word and of a memory address.

\[ MAX\_WORD,\ MAX\_ADDR : \mathbb{N}_1 \]

The following sets give the types for words in memory and memory addresses:

\[ WORD \mathrel{\widehat{=}} 0 \,..\, MAX\_WORD \]
\[ ADDR \mathrel{\widehat{=}} 0 \,..\, MAX\_ADDR \]

There is a set of registers, possibly empty. Some of these may be memory-mapped. We can identify memory-mapped registers by their addresses and other registers by numbers outside the address space.

\[ REGISTER : \mathbb{F}\,\mathbb{Z} \]

A storage location is then either a register or an address (or possibly both).

\[ LOCATION \mathrel{\widehat{=}} REGISTER \cup ADDR \]

The store is a total function mapping storage locations (either register identifiers or memory addresses) to words:

\[ STORE \mathrel{\widehat{=}} LOCATION \rightarrow WORD \]

Some of the memory may be ROM. The set of ROM addresses is some subset of the address space:

\[ ROM : \mathbb{P}\, ADDR \]

Some of the store (memory or registers) may be given over to memory-mapped I/O. The following two sets are the locations which serve as input and output ports. They may overlap with each other but not with the ROM:

\[
\begin{array}{l}
IN\_PORTS,\ OUT\_PORTS : \mathbb{P}\, LOCATION \\
\hline
IN\_PORTS \cap ROM = OUT\_PORTS \cap ROM = \emptyset
\end{array}
\]
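As an informal illustration, the register and memory model can be mirrored with finite Python sets. This is only a sketch: the concrete values chosen for MAX_WORD, MAX_ADDR, REGISTER, ROM and the port sets are assumptions made for the example, since the specification deliberately leaves them open, and the helper names is_store and constraints_hold are hypothetical.

```python
# Illustrative values only: the specification leaves MAX_WORD and MAX_ADDR unspecified.
MAX_WORD = 255
MAX_ADDR = 15

WORD = range(0, MAX_WORD + 1)               # WORD = 0 .. MAX_WORD
ADDR = range(0, MAX_ADDR + 1)               # ADDR = 0 .. MAX_ADDR
REGISTER = frozenset({12, 100, 101})        # 12 is memory-mapped; 100 and 101 are not
LOCATION = REGISTER | frozenset(ADDR)       # LOCATION = REGISTER ∪ ADDR

ROM = frozenset(range(0, 8))                # hypothetical ROM region, a subset of ADDR
IN_PORTS = frozenset({14, 100})             # hypothetical input ports
OUT_PORTS = frozenset({15, 100})            # may overlap IN_PORTS, but not ROM


def is_store(candidate: dict) -> bool:
    """True iff candidate is a total function LOCATION -> WORD."""
    return set(candidate) == set(LOCATION) and all(
        w in WORD for w in candidate.values()
    )


def constraints_hold() -> bool:
    """Check the subset and disjointness constraints stated in the text."""
    return (
        ROM <= frozenset(ADDR)
        and IN_PORTS <= LOCATION
        and OUT_PORTS <= LOCATION
        and not (IN_PORTS & ROM)
        and not (OUT_PORTS & ROM)
    )


blank_store = {loc: 0 for loc in LOCATION}  # every location holds the word 0
```

Representing the store as a dictionary keyed by location keeps the totality requirement explicit: a dictionary that omits any location is rejected by is_store.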

2.2 Processor State

The processor has one special register: the program counter, PC. For simplicity, we model the contents of the program counter as a component of the processor state in its own right, rather than assigning a location in the store for it, as we do for other registers. This simplification does mean that the program counter cannot be memory-mapped, but that is appropriate for the vast majority of modern microprocessor types. The processor state is thus given by the following schema:

\[ PROCESSOR\_STATE \mathrel{\widehat{=}} [\ pc : ADDR;\ store : STORE\ ] \]

2.3 Instruction Set

We will give a “syntax” for instructions which actually embeds most of the semantics of computational and test instructions. A computation is any total function on processor states delivering a result comprising a location and a word; the location indicates where the word is to be stored.

\[ COMPUTATION \mathrel{\widehat{=}} PROCESSOR\_STATE \rightarrow (LOCATION \times WORD) \]

A test is any set of words: a word w satisfies test t iff w ∈ t.

\[ TEST \mathrel{\widehat{=}} \mathbb{P}\, WORD \]

Informally, the syntax and semantics of the instruction set are as shown in the following table:

Instruction   Operands                   Description
Compute       comp                       Perform computation comp giving a pair (l, w); store w at location l.
StorePC       loc                        Store the contents of PC at loc.
Jump          addr                       Assign addr to PC.
CondJump      loc, test, addr1, addr2    If the contents of location loc satisfy test, then assign addr1 to PC, otherwise assign addr2 to PC.
LoadPC        loc                        Assign the contents of loc to PC.

For simplicity, we specify that if any instruction attempts to write to the ROM, then the write is ignored. This aspect of the model would need to be reconciled with the actual behaviour of a particular microprocessor configuration in a real world example.


The conditional jump instruction, CondJump, is unlike that of most, if not all, real microprocessors in having an “if-then-else” effect, rather than “if-then”. This is technically convenient in the sequel and simply requires us to encode the usual “if-then” behaviour using an “else-branch” which jumps to the instruction immediately after the conditional jump. The StorePC and LoadPC instructions would be used by a compiler to implement subroutine call and return. Most real microprocessors offer a combination of StorePC and some kind of jump instruction as a single “jump-to-subroutine” or “call” instruction. The instruction set is modelled by the following Z free type¹:

\[
\begin{array}{rcl}
INSTRUCTION & ::= & Compute\ (COMPUTATION) \\
 & \mid & StorePC\ (LOCATION) \\
 & \mid & Jump\ (ADDR) \\
 & \mid & LoadPC\ (LOCATION) \\
 & \mid & CondJump\ (LOCATION \times TEST \times ADDR \times ADDR)
\end{array}
\]

2.4 Instruction Execution

We now describe the state change caused by execution of a single instruction using an operation schema for each type of instruction. These operations have an input parameter which is the instruction being executed. Each operation only fires if the parameter is the instruction dealt with by that operation.

A Compute instruction is executed by carrying out the computation in the current state to give a location-word pair (l, w) then updating the store by writing w to location l, provided l is not in ROM. If l is in ROM, then by the assumptions made in section 2.3, the store is unchanged².

\[
\begin{array}{l}
COMPUTE \mathrel{\widehat{=}} [\ instr? : INSTRUCTION;\ \Delta PROCESSOR\_STATE \mid \\
\quad \exists\, comp : COMPUTATION;\ l : LOCATION;\ w : WORD \mid instr? = Compute\ comp \bullet \\
\qquad (l, w) = comp\,(\theta PROCESSOR\_STATE) \\
\qquad \land\ pc' = (pc + 1) \bmod MAX\_ADDR \\
\qquad \land\ (l \notin ROM \land store' = store \oplus \{l \mapsto w\} \ \lor\ l \in ROM \land store' = store)\ ]
\end{array}
\]

¹ The ProofPower dialect of Z does not currently support the chevron brackets around the sets in the branches of a free type required in other Z dialects.

² In many typical microprocessor configurations, attempting to write to ROM gives “undefined” behaviour; this can readily be modelled by removing the disjunct l ∈ ROM ∧ store′ = store from the predicates of COMPUTE and STORE_PC.


A StorePC instruction causes the current value of the program counter to be written to the store in the location given by the operand of the instruction. The rule about attempts to write to ROM is the same as for the Compute instructions.

\[
\begin{array}{l}
STORE\_PC \mathrel{\widehat{=}} [\ instr? : INSTRUCTION;\ \Delta PROCESSOR\_STATE \mid \\
\quad \exists\, l : LOCATION \mid instr? = StorePC\ l \bullet \\
\qquad pc' = (pc + 1) \bmod MAX\_ADDR \\
\qquad \land\ (l \notin ROM \land store' = store \oplus \{l \mapsto pc\} \ \lor\ l \in ROM \land store' = store)\ ]
\end{array}
\]

A Jump instruction assigns a new value to the program counter, which will cause a transfer of control on the next execution cycle.

\[
\begin{array}{l}
JUMP \mathrel{\widehat{=}} [\ instr? : INSTRUCTION;\ \Delta PROCESSOR\_STATE \mid \\
\quad \exists\, a : ADDR \mid instr? = Jump\ a \bullet pc' = a \land store' = store\ ]
\end{array}
\]

A CondJump instruction evaluates the test and assigns a new value to the program counter according to the result of the test. This will cause the required transfer of control on the next execution cycle.

\[
\begin{array}{l}
COND\_JUMP \mathrel{\widehat{=}} [\ instr? : INSTRUCTION;\ \Delta PROCESSOR\_STATE \mid \\
\quad \exists\, l : LOCATION;\ t : TEST;\ a_1, a_2 : ADDR \mid instr? = CondJump\ (l, t, a_1, a_2) \bullet \\
\qquad (store\ l \in t \land pc' = a_1 \ \lor\ store\ l \notin t \land pc' = a_2) \\
\qquad \land\ store' = store\ ]
\end{array}
\]

A LoadPC instruction assigns the contents of the indicated store location to the program counter, which will cause a transfer of control on the next execution cycle.


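\[
\begin{array}{l}
LOAD\_PC \mathrel{\widehat{=}} [\ instr? : INSTRUCTION;\ \Delta PROCESSOR\_STATE \mid \\
\quad \exists\, l : LOCATION \mid instr? = LoadPC\ l \bullet pc' = store\ l \land store' = store\ ]
\end{array}
\]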
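The five operation schemas of this section can be read together as a small interpreter. The following is a minimal sketch, assuming a deterministic executable reading of the schemas: instructions are represented as tagged tuples, a state is a pair (pc, store), and the values of MAX_ADDR and ROM are illustrative choices. The function names write and execute are hypothetical and do not appear in the specification.

```python
MAX_ADDR = 15                      # illustrative value, not fixed by the specification
ROM = frozenset(range(0, 8))       # illustrative ROM region


def write(store, loc, word):
    """Return store ⊕ {loc ↦ word}, or store unchanged if loc is in ROM."""
    if loc in ROM:
        return store
    updated = dict(store)
    updated[loc] = word
    return updated


def execute(instr, state):
    """Apply one instruction (a tagged tuple) to a state (pc, store)."""
    pc, store = state
    tag = instr[0]
    if tag == "Compute":
        loc, word = instr[1](state)            # the computation is a function on states
        return ((pc + 1) % MAX_ADDR, write(store, loc, word))
    if tag == "StorePC":
        return ((pc + 1) % MAX_ADDR, write(store, instr[1], pc))
    if tag == "Jump":
        return (instr[1], store)
    if tag == "CondJump":
        _, loc, test, addr1, addr2 = instr
        return (addr1 if store[loc] in test else addr2, store)
    if tag == "LoadPC":
        target = store[instr[1]]
        if not 0 <= target <= MAX_ADDR:
            # pc' is declared to be an address, so the schema does not apply here
            raise ValueError("LOAD_PC is not applicable: pc' must be an address")
        return (target, store)
    raise ValueError(f"unknown instruction: {tag!r}")
```

For example, execute(("StorePC", 3), (2, store)) advances the program counter but leaves the store unchanged, because location 3 lies in the illustrative ROM region. The program counter update follows the schemas as written, i.e. (pc + 1) mod MAX_ADDR.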

2.5 CPU Execution Cycle

The instruction decode function maps a word-address pair to an instruction as defined in section 2.3 above. The word is the value to be decoded; the address is its location in the memory and is needed to decode instructions that use PC-relative addressing. It is a partial function: some words may have an undefined effect if the microprocessor attempts to execute them. The internal details of the function are of no interest to us here, so we omit the predicate part of the definition.

\[ decode : WORD \times ADDR \rightharpoonup INSTRUCTION \]

The schema TICK describes what happens in one execution cycle. This schema is partial: if the instruction that PC points to has no decoding, the pre-condition of the schema will be false.

\[
\begin{array}{l}
TICK \mathrel{\widehat{=}} [\ \Delta PROCESSOR\_STATE \mid \\
\quad \exists\, instr? : INSTRUCTION \mid (store\ pc, pc) \mapsto instr? \in decode \bullet \\
\qquad COMPUTE \lor JUMP \lor COND\_JUMP \lor LOAD\_PC \lor STORE\_PC\ ]
\end{array}
\]
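The partiality of TICK can be illustrated by modelling decode as a dictionary, so that a missing key plays the role of an undefined decoding. This is a sketch under stated assumptions: the encoding in the example is invented, and the stand-in execute function handles only Jump in order to keep the block self-contained.

```python
def execute(instr, state):
    """Hypothetical stand-in that understands only Jump; a fuller
    interpreter would cover all five instruction forms."""
    pc, store = state
    if instr[0] == "Jump":
        return (instr[1], store)
    raise ValueError("only Jump is modelled in this sketch")


def tick(decode, state):
    """One execution cycle, or None where the pre-condition of TICK is false."""
    pc, store = state
    key = (store[pc], pc)          # the pair (store pc, pc)
    if key not in decode:          # decode is partial: no decoding, no transition
        return None
    return execute(decode[key], state)


# Hypothetical encoding: the word 7 at address 0 decodes to "Jump 2".
decode = {(7, 0): ("Jump", 2)}
store = {0: 7, 1: 0, 2: 0}
```

Here tick(decode, (0, store)) yields the state (2, store), while tick(decode, (2, store)) yields None because the word at address 2 has no decoding.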

2.6 Behavioural Model

We now use the schema TICK to define a behavioural model of the microprocessor. This is a description of its input/output behaviour with the details of its internal structure as a state-transition machine abstracted away. To define the behavioural model we need to define sets to represent the input and output streams. An individual input or output is a function mapping the relevant port addresses to words:

\[ INPUT \mathrel{\widehat{=}} IN\_PORTS \rightarrow WORD \]


\[ OUTPUT \mathrel{\widehat{=}} OUT\_PORTS \rightarrow WORD \]

An input or output stream is then a series of inputs or outputs indexed by time (measured in CPU execution cycles).

\[ TIME \mathrel{\widehat{=}} \mathbb{N} \]
\[ IN\_STREAM \mathrel{\widehat{=}} TIME \rightarrow INPUT \]
\[ OUT\_STREAM \mathrel{\widehat{=}} TIME \rightarrow OUTPUT \]

We also need the notion of an execution history. This is a time-indexed series of processor states:

\[ HISTORY \mathrel{\widehat{=}} TIME \rightarrow PROCESSOR\_STATE \]

The i/o history relation is the ternary relation that relates a triple comprising an input stream, an output stream and an execution history precisely when the following conditions hold: (i) the history values may be obtained by successively placing the input values for each time period on the input ports and letting the processor run for one clock tick; (ii) the outputs thus obtained at each time period are the ones observed in the history.

\[
\begin{array}{l}
IO\_HISTORY \mathrel{\widehat{=}} [\ inputs : IN\_STREAM;\ outputs : OUT\_STREAM;\ history : HISTORY \mid \\
\quad \forall\, t : TIME;\ TICK \mid pc = (history\ t).pc \land store = (history\ t).store \oplus inputs\ t \bullet \\
\qquad pc' = (history\ (t+1)).pc \\
\qquad \land\ store' = (history\ (t+1)).store \\
\qquad \land\ outputs\ t = OUT\_PORTS \lhd store'\ ]
\end{array}
\]

The behaviour of a processor running a particular program is its input-output relation and belongs to the following set:

\[ BEHAVIOUR \mathrel{\widehat{=}} IN\_STREAM \leftrightarrow OUT\_STREAM \]

Now we can describe the behavioural model. This is explicitly parametrised by the initial state, i.e. the program to be run.


\[
\begin{array}{l}
behaviour : PROCESSOR\_STATE \rightarrow BEHAVIOUR \\
\hline
\forall\, prog : PROCESSOR\_STATE;\ inputs : IN\_STREAM;\ outputs : OUT\_STREAM \bullet \\
\quad inputs \mapsto outputs \in behaviour\ prog \\
\quad \Leftrightarrow\ (\exists\, history : HISTORY \bullet history\ 0 = prog \land IO\_HISTORY)
\end{array}
\]

Using the behavioural model, we may now specify rigorously various general properties of programs. As an example, we can now characterise the programs that never run out of control; they are precisely those whose behaviour is a total relation:

\[
\begin{array}{l}
total : \mathbb{P}\, PROCESSOR\_STATE \\
\hline
total = \{\, prog : PROCESSOR\_STATE \mid \mathrm{dom}(behaviour\ prog) = IN\_STREAM \,\}
\end{array}
\]
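The i/o history relation and the behaviour function range over infinite streams, but their construction can be explored on finite prefixes. The sketch below is an approximation under stated assumptions: the transition is supplied as a deterministic function step that returns None where no transition exists, states are pairs (pc, store), and the names run, step and out_ports are hypothetical.

```python
def run(step, prog, inputs, out_ports):
    """Build a finite prefix of an execution history and its output stream.

    At each time t the inputs are placed on the input ports, the processor
    runs for one tick, and the output ports are read from the new store."""
    history = [prog]
    outputs = []
    for inp in inputs:
        pc, store = history[-1]
        before = (pc, {**store, **inp})          # (history t).store ⊕ inputs t
        after = step(before)
        if after is None:                        # the program has run out of control
            break
        history.append(after)
        outputs.append({port: after[1][port] for port in out_ports})  # OUT_PORTS ◁ store'
    return history, outputs
```

With a step function that copies location 0 to location 1 and the call run(step, (0, {0: 0, 1: 0}), [{0: 5}, {0: 7}], {1}), the outputs echo the inputs one tick at a time. A run that stops early corresponds to an input stream outside the domain of the behaviour, so the program would not belong to the set total.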

2.7 Discussion

The techniques we have used to define the function behaviour in this section can readily be adapted to construct a behavioural model from almost any state-based specification. We believe that this approach should be more widely used. Notions like data and code refinement admit a very direct and clear formulation for a behavioural model. The refinement relation, ⊑, can be defined directly in Z as follows:

\[
\begin{array}{l}
\_ \sqsubseteq \_ : BEHAVIOUR \leftrightarrow BEHAVIOUR \\
\hline
\forall\, b_1, b_2 : BEHAVIOUR \bullet \\
\quad b_2 \sqsubseteq b_1 \Leftrightarrow \mathrm{dom}\ b_1 \subseteq \mathrm{dom}\ b_2 \land \mathrm{dom}\ b_1 \lhd b_2 \subseteq b_1
\end{array}
\]

That is to say, behaviour b₂ refines behaviour b₁ if, and only if, (i) b₂ is defined for all inputs for which b₁ is defined, and (ii) every output stream of b₂ on an input admitted by b₁ is also a possible output stream for b₁. These are the usual liveness and safety properties that one requires in refinement. Refinement rules for the underlying state-based specifications can then be derived in Z as theorems from the above definition rather than posited and justified by metalinguistic methods as is commonly done in the literature.

In the sequel, we will be interested in higher-order properties that have to be expressed with some reference to the internal state. To define these, we will use the i/o history relation. Methodologically, this reflects the following fact: while externally observable behaviour is the ultimate thing of interest, program analysis is carried out using implementation details (the program!); to capture the requirements on a program analysis technique we need some view of those details. We do expect an analysis to have useful consequences at the behavioural level — for example, the basic blocks abstraction that we look at in the next section is expressed in terms of the i/o history relation, but, when it holds, it guarantees that the behaviour relation is total.
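The refinement relation defined in this section can be checked mechanically when behaviours are finite. The following sketch represents a behaviour as a finite set of (input, output) pairs, which is an assumption made for illustration since the behaviours of the specification relate infinite streams; the function name refines is hypothetical.

```python
def refines(b2, b1):
    """True iff b2 refines b1: dom b1 ⊆ dom b2 and (dom b1 ◁ b2) ⊆ b1."""
    dom1 = {i for i, _ in b1}
    dom2 = {i for i, _ in b2}
    restricted = {(i, o) for (i, o) in b2 if i in dom1}   # dom b1 ◁ b2
    return dom1 <= dom2 and restricted <= b1


# A specification allowing two outputs on input "a", and an implementation
# that picks one of them and is also defined on the extra input "b".
spec = {("a", 0), ("a", 1)}
impl = {("a", 1), ("b", 7)}
```

Under this reading refines(impl, spec) holds: the implementation resolves the nondeterminism on "a" and extends the domain. The converse refines(spec, impl) fails because spec is not defined on "b".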

3 Basic Blocks Abstraction

3.1 Introduction

The notion of a basic block is well known in the world of compiler design. To quote [1], a basic block is:

. . . a sequence of consecutive statements which may be entered only at the beginning and when entered are executed in sequence without halt or possibility of branch (except at the end of the basic block).

Compiler designers use decompositions of programs into a set of basic blocks for various kinds of code optimisations. We are concerned with program analysis techniques that deduce a decomposition of a machine code program into a set of basic blocks. Our intention is that the structure recovered by such a decomposition will enable deeper semantic analyses to be carried out more easily. For example, the REAP reverse engineering tools are able automatically to find program decompositions of a similar sort to the basic block decompositions described here. These decompositions then justify a translation of machine code into a high level language.

In general, an arbitrary program executing on our microprocessor may admit no useful decomposition into basic blocks; indeed, the program may be self-modifying — making it impossible to distinguish between code and data in the store. However, programs generated by compilers or written by well-disciplined assembly language programmers will normally admit a clear separation of code and data and will have the code structured to admit a decomposition into a set of basic blocks (corresponding to a flow-chart for the program). We are interested in formalising the notion of such a decomposition.

3.2 Representing Basic Blocks

We will need to distinguish between instructions that can cause a transfer of control and the non-jump instructions — those for which control just proceeds to the next instruction.

\[ NON\_JUMP\_INSTRUCTION \mathrel{\widehat{=}} \mathrm{ran}\ Compute \cup \mathrm{ran}\ StorePC \]

A basic block comprises a (possibly empty) body of non-jump instructions followed by an instruction (the coda of the basic block) that may cause a transfer of control³. The basic block is labelled with the address in memory at which the basic block starts. The following definition captures these aspects of a single basic block. To make formal the full content of the definition given in section 3.1 above, we need to describe a relationship between a whole set of basic blocks and a processor execution history: this is done in sections 3.3 and 3.4 below.

\[ BASIC\_BLOCK \mathrel{\widehat{=}} [\ body : \mathrm{seq}\ NON\_JUMP\_INSTRUCTION;\ coda : INSTRUCTION;\ label : ADDR\ ] \]

3.3 Instantaneous Basic Block Decompositions

A basic block decomposition comprises a set of basic blocks which we will require to act as a correct description of the possible control behaviour of a program. To define this requirement, we first define the notion of an instantaneous basic block decomposition. An instantaneous basic block decomposition is a relation between a set of basic blocks and a processor state. We will develop our specification of this relation in four stages.

1. The basic blocks must correctly describe the instructions encoded in some portion of the memory:

\[
\begin{array}{l}
INST\_BBD1 \mathrel{\widehat{=}} [\ blocks : \mathbb{P}\, BASIC\_BLOCK;\ PROCESSOR\_STATE \mid \\
\quad \forall\, b : blocks \bullet \\
\qquad (\forall\, i : \mathrm{dom}\ b.body \bullet (store\,(b.label + i - 1),\ (b.label + i - 1)) \mapsto (b.body)\ i \in decode) \\
\qquad \land\ (store\,(b.label + \#(b.body)),\ (b.label + \#(b.body))) \mapsto b.coda \in decode\ ]
\end{array}
\]

2. No two basic blocks may apply to the same piece of memory:

\[
\begin{array}{l}
INST\_BBD2 \mathrel{\widehat{=}} [\ blocks : \mathbb{P}\, BASIC\_BLOCK;\ PROCESSOR\_STATE \mid \\
\quad \forall\, b_1, b_2 : blocks \mid \\
\qquad (\exists\, i : 0 \,..\, \#(b_1.body);\ j : 0 \,..\, \#(b_2.body) \bullet b_1.label + i = b_2.label + j) \\
\quad \bullet\ b_1 = b_2\ ]
\end{array}
\]

³ It is important that we allow the coda to be a non-jump instruction. E.g., using C as an assembly language, consider the program fragment: “R1 = 1; L: R1 *= R2; R2 -= 1; if(R2 > 0) goto L;”. A basic block decomposition of this must take the computation instruction “R1 = 1;” as the coda of a basic block with an empty body.

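3. The program counter must point to some instruction in one of the basic blocks:

\[
\begin{array}{l}
INST\_BBD3 \mathrel{\widehat{=}} [\ blocks : \mathbb{P}\, BASIC\_BLOCK;\ PROCESSOR\_STATE \mid \\
\quad \exists\, b : blocks \bullet pc \in b.label \,..\, b.label + \#(b.body)\ ]
\end{array}
\]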

4. If the processor has reached the end of a basic block, then the next value of the program counter must be the label of one of the basic blocks:

\[
\begin{array}{l}
INST\_BBD4 \mathrel{\widehat{=}} [\ blocks : \mathbb{P}\, BASIC\_BLOCK;\ PROCESSOR\_STATE \mid \\
\quad \forall\, PROCESSOR\_STATE';\ b : blocks \mid TICK \land pc = b.label + \#(b.body) \bullet \\
\qquad pc' \in \{\, c : blocks \bullet c.label \,\}\ ]
\end{array}
\]

The conjunction of the above four schemas gives us our definition of an instantaneous basic block decomposition.

\[ INST\_BBD \mathrel{\widehat{=}} INST\_BBD1 \land INST\_BBD2 \land INST\_BBD3 \land INST\_BBD4 \]

Any state of the microprocessor in which the current and next values of the program counter point to valid instructions admits at least one instantaneous basic block decomposition: namely the degenerate decomposition with just two basic blocks, one for the current instruction and one for the next instruction.
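The four stages above can be checked on a small concrete example. The sketch below is an illustration under several assumptions: the transition is a deterministic function step (whereas the fourth schema quantifies over every state reachable by TICK), decode is a dictionary, instructions are opaque tuples, and the encodings, the successor table and the names BasicBlock, span and inst_bbd are all invented for the example. The program is the loop quoted in the footnote of section 3.2, extended with a final self-jump.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BasicBlock:
    body: Tuple[tuple, ...]   # non-jump instructions
    coda: tuple               # final instruction of the block
    label: int                # address at which the block starts


def span(b):
    """Addresses occupied by block b: label .. label + #body."""
    return set(range(b.label, b.label + len(b.body) + 1))


def inst_bbd(blocks, state, decode, step):
    pc, store = state

    def decodes_to(addr, instr):
        return decode.get((store.get(addr), addr)) == instr

    # 1. Each block describes the instructions encoded at its addresses.
    cond1 = all(
        all(decodes_to(b.label + k, ins) for k, ins in enumerate(b.body))
        and decodes_to(b.label + len(b.body), b.coda)
        for b in blocks
    )
    # 2. Distinct blocks do not overlap in memory.
    cond2 = all(b1 == b2 or not (span(b1) & span(b2))
                for b1 in blocks for b2 in blocks)
    # 3. The program counter lies inside some block.
    cond3 = any(pc in span(b) for b in blocks)
    # 4. Leaving a coda lands on the label of some block.
    labels = {b.label for b in blocks}
    after = step(state)
    at_coda = any(pc == b.label + len(b.body) for b in blocks)
    cond4 = not at_coda or after is None or after[0] in labels
    return cond1 and cond2 and cond3 and cond4


# Hypothetical encodings: address k holds the word 10 + k.
code = [("Compute", "R1 = 1"), ("Compute", "R1 *= R2"), ("Compute", "R2 -= 1"),
        ("CondJump", "R2 > 0", 1, 4), ("Jump", 4)]
store = {addr: 10 + addr for addr in range(5)}
decode = {(10 + addr, addr): instr for addr, instr in enumerate(code)}
successor = {0: 1, 1: 2, 2: 3, 3: 1, 4: 4}      # one possible control path


def step(state):
    return (successor[state[0]], state[1])


good = [BasicBlock((), code[0], 0),
        BasicBlock((code[1], code[2]), code[3], 1),
        BasicBlock((), code[4], 4)]
merged = [BasicBlock((code[0], code[1], code[2]), code[3], 0),
          BasicBlock((), code[4], 4)]
```

The decomposition good passes the check at every program counter value tried here, whereas merged fails at address 3: the conditional jump transfers control to address 1, which is not the label of any block. This matches the footnote's observation that “R1 = 1;” must form a block of its own.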

3.4 Correct Basic Block Decompositions

For a basic block decomposition to be useful it must persist over successive execution states. We therefore define a correct basic block decomposition for a given initial state (i.e., a given program) to be one which will work as an instantaneous decomposition for ever.

\[
\begin{array}{l}
correct\_bbd : PROCESSOR\_STATE \rightarrow \mathbb{P}\,\mathbb{P}\, BASIC\_BLOCK \\
\hline
\forall\, prog : PROCESSOR\_STATE;\ blocks : \mathbb{P}\, BASIC\_BLOCK \bullet \\
\quad blocks \in correct\_bbd\ prog \\
\quad \Leftrightarrow\ (\forall\, inputs : IN\_STREAM \bullet \exists\, outputs : OUT\_STREAM;\ history : HISTORY \bullet \\
\qquad IO\_HISTORY \\
\qquad \land\ (history\ 0).pc = prog.pc \\
\qquad \land\ (history\ 0).store = prog.store \oplus inputs\ 0 \\
\qquad \land\ (\forall\, t : TIME;\ PROCESSOR\_STATE \mid pc = (history\ t).pc \land store = (history\ t).store \oplus inputs\ t \bullet INST\_BBD))
\end{array}
\]
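The definition of correct_bbd quantifies over all input streams and all times, so it cannot be decided by running the program. A bounded test on one finite input prefix can, however, refute a proposed decomposition. The sketch below assumes a deterministic step function returning None where no transition exists and an invariant predicate standing in for INST_BBD with a fixed set of blocks; the name holds_on_prefix is hypothetical.

```python
def holds_on_prefix(step, prog, inputs, invariant):
    """Bounded check of the conditions behind correct_bbd on one input prefix.

    Returns False if the invariant fails at some time t, or if the run
    cannot be continued (step returns None). A result of True is only
    evidence, not a proof, of correctness."""
    pc, store = prog
    for inp in inputs:
        state = (pc, {**store, **inp})    # (history t).store ⊕ inputs t
        if not invariant(state):
            return False
        after = step(state)
        if after is None:                 # no history exists: the program is not total
            return False
        pc, store = after
    return True
```

A False result exhibits a concrete time at which the decomposition breaks down or the program runs out of control; establishing membership of correct_bbd in general requires proof or a sound static analysis.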

3.5 Discussion

A program possessing a correct basic block decomposition is necessarily total — a fact that is wired into the definition above in a fairly direct way. Moreover, programs that have such decompositions are much nicer than those that don’t; in particular, the basic blocks give a clear distinction between code and data in memory; a program with a correct basic block decomposition will neither modify its code nor execute its data⁴.

The notion of a correct basic block decomposition as specified above is directly applicable to embedded systems that run a fixed program held in ROM. It would also apply with a little elaboration to a system that loads a fixed program into RAM at boot time — the above definition would need to be modified to allow the basic block decomposition not to take effect until the execution cycles that load the program are complete.

A multiprocessing operating system cannot be expected to satisfy the above definition directly; however, the virtual machine provided by an operating system to each process could well permit a correct basic block decomposition. For example, of the hundreds of programs that can be run on the Linux system that I am using to prepare this paper, only a handful of specialist programs (like interactive compilers) would be expected to modify their own code.

⁴ This “Harvard” property could be expressed directly in terms of the behavioural model without introducing the basic blocks. We have introduced the basic blocks precisely because the program analysis techniques of interest produce decompositions of this sort as part of their output and we are concerned with formalising exactly what that output means.

4 Concluding Remarks

4.1 Limitations of the Processor Model

A number of important features of microprocessors in the real world have not been addressed by the specification in this paper. To list a few:

– Multi-word instructions: to handle these simply requires the decode function to take as its argument a sequence of memory words and to return the number of memory words actually occupied by the decoded instruction for use in adjusting the program counter while executing the non-jump instructions.
– Multiple word lengths: most real microprocessors support several different word lengths (e.g., 8-bit, 16-bit and 32-bit words for the Intel x86 and the Motorola 680x0). Multiple word lengths can be dealt with quite simply in a variety of ways (see chapter 9 of [3] for one approach).
– Multiple reads & writes: instructions with a built-in loop, such as the Z80’s block transfer instructions or the x86’s repeated string-to-string-copy instructions, could be handled by modifying the schema TICK to execute the loop.
– Pipelining: in some RISC processors and microcode engines, side-effects of the decode/execute pipeline are visible to the programmer. In the SPARC 9 architecture for example, while a jump instruction of a certain type is being executed, the instruction immediately following the jump is being decoded and will be executed on the next cycle⁵. The pipeline would have to be included explicitly in the execution model to cover such an architecture.
– Interrupts: in a sense, handling interrupts amounts just to including the program counter (and any other registers affected) as a possible input port. However, to have abstractions like the basic block decomposition work correctly, it would probably be better to model interrupts as an additional kind of instruction and to include the microprocessor’s interrupt lines in the model.

With the possible exception of interrupts, none of these features should prove a major source of additional complexity. Moreover, the extra complexity is mostly in the traditional state-based part of the model, so that one may readily exploit techniques that have been proposed in the literature (see [3] for a literature guide). The SPARC 9 model which we outline in section 4.2 below addresses both multiple word lengths and pipelining.

⁵ So, for example, subroutine exit on this architecture is implemented by a return-from-subroutine instruction immediately followed by the instruction that restores register values for the calling routine.

4.2 Recent Work

Interglossa have recently undertaken a research project sponsored by DERA, Malvern to work on Chris Sennett’s Template Analysis [4] — a technique for analysing the use of pointers in C code. The result of the Template Analysis is, in effect, a type assignment for a C program in a type system that is “tighter” than that imposed by the C language rules. Part of Interglossa’s research was to push the type assignment resulting from the Template Analysis through to the assembly language level, giving a way of interpreting low-level entities in higher level terms. Based on the specification in the present paper, Tom Lake developed a formalisation of the Sun Microsystems SPARC 9 architecture to guide the research. The specification of the SPARC 9 model together with “validity conditions” and a description of the template typing for assembly code took about 30 pages of Z. The validity conditions are similar in general spirit to the basic blocks abstraction discussed above but extended to cover separation of data and code, orderly control flow, and correct use of the stack and of global data. In the technical document of 1997 in which the specification in this paper was first written up, I wrote: The specification could be used in two ways to model a real microprocessor: (i) its ideas could be re-used to model the specifics of the microprocessor; (ii) it could be used as a sort of microcode engine to describe the semantics of the microprocessor. On reading the Interglossa specification two years later, it was interesting to see how these ideas had stood the test of time. For the SPARC 9 example, it turned out to be easiest to do a mixture of (i) and (ii). The main framework for the store model re-used the general ideas of section 2.1 above but recast to cover the specific details of the SPARC 9 memory model. 
However, to simplify the execution model, individual SPARC 9 instructions were handled as sequences of “microcode instructions” very similar to those described in section 2.3.

4.3 Future Research

Lemma 1 and Interglossa hope to collaborate soon to continue this line of research. There have been considerable advances in program analysis theory and practice in recent years [7]. We are particularly interested in investigating how program analysis techniques and formal specification and verification technology can interact to make automated or semi-automated validation of compiled code a viable method.

Acknowledgments. I am indebted to Tom Lake of Interglossa and to Mike Hill of the Defence Evaluation and Research Agency, Malvern, for their assistance in the preparation of this paper. The referees’ comments were most helpful and have resulted in improvements both to the specification and to its presentation.


Index of Z Global Variables

ADDR                    2.1
BASIC_BLOCK             3.2
BEHAVIOUR               2.6
behaviour               2.6
COMPUTATION             2.3
Compute                 2.3
COMPUTE                 2.4
CondJump                2.3
COND_JUMP               2.4
correct_bbd             3.4
decode                  2.5
HISTORY                 2.6
INPUT                   2.6
INSTRUCTION             2.3
INST_BBD1               3.3
INST_BBD2               3.3
INST_BBD3               3.3
INST_BBD4               3.3
INST_BBD                3.3
IN_PORTS                2.1
IN_STREAM               2.6
IO_HISTORY              2.6
Jump                    2.3
JUMP                    2.4
LoadPC                  2.3
LOAD_PC                 2.4
LOCATION                2.1
MAX_ADDR                2.1
MAX_WORD                2.1
NON_JUMP_INSTRUCTION    3.2
OUTPUT                  2.6
OUT_PORTS               2.1
OUT_STREAM              2.6
PROCESSOR_STATE         2.2
REGISTER                2.1
ROM                     2.1
StorePC                 2.3
STORE_PC                2.4
STORE                   2.1
TEST                    2.3
TICK                    2.5
TIME                    2.6
total                   2.6
WORD                    2.1
⊑                       2.7

References

[1] Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques and Tools. Addison-Wesley, 1986.
[2] R.D. Arthan. Mechanizing Proof for the Z Toolkit. To appear; available on the World Wide Web at http://www.lemma-one.com/papers/papers.html, presented at the Oxford Workshop on Automated Formal Methods, June 1996.
[3] Jonathan Bowen. Formal Specification and Documentation using Z: a Case Study Approach. International Thomson Computer Press, 1996.
[4] M.G. Hill and C.T. Sennett. Final Report of Integrity Methods for Commercial Software, DERA Technical Report DERA/CIS/CIS3/TR980138/1.0. Technical report, Defence Evaluation and Research Agency, Malvern, 1998.
[5] D.J. King and R.D. Arthan. Development of Practical Verification Tools. Ingenuity — the ICL Technical Journal, 1996.
[6] Robin Milner, Mads Tofte, Robert Harper, and David MacQueen. The Definition of Standard ML (Revised). MIT Press, 1997.
[7] Flemming Nielson, Hanne Riis Nielson, and Chris Hankin. Principles of Program Analysis. Springer-Verlag, 1998.
[8] J.M. Spivey. The Z Notation: A Reference Manual, Second Edition. Prentice-Hall, 1992.
[9] Jim Woodcock and Jim Davies. Using Z: Specification, Refinement, and Proof. Prentice/Hall International, 1996.

Zzzzzzzzzzzzzzzzzzzzzzzzzz

David Everett
platform seven, 1–2 Finsbury Square, London EC2A 1AA, UK

Abstract. You can almost hear the snores of the audience, what can be the importance of Z and all their formal methods? People used to say the same thing about number theory and other branches of advanced mathematics and then suddenly the world changed. Cryptography and in particular public key systems have become the cornerstone of electronic commerce. Advanced statistical methods can be found in almost every facet of our modern electronic world from advanced financial engineering methods to data mining. In this paper we are going to explore the practical use of formal methods in two major commercial products, Mondex the fully transferable electronic cash alternative and Multos the first multi-application operating system for Smart Cards. Both of these projects have been evaluated and certified to ITSEC level E6. Would we do it again????

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, p. 450, 2000. © Springer-Verlag Berlin Heidelberg 2000

Segregation with Communication

David Cooper¹ and Susan Stepney
Logica UK Ltd, Betjeman House, 104 Hills Road, Cambridge, CB2 1LQ, UK
{cooperd,stepneys}@logica.com

Abstract. We have developed a general definition of segregation in the context of Z system specifications. This definition is general enough to allow multi-way communications between otherwise segregated parties along defined channels. We have an abstract definition of segregation in terms of the traces allowed by systems, a concrete style of specification to ensure segregation (a generalisation of promotion called multi-promotion) and a proof that unconstrained multi-promotion is a sufficient condition to ensure segregation.

1 Introduction

We have been working with the National Westminster Development Team (now platform seven), proving the correctness of Smartcard applications for electronic commerce. Two of these products have now achieved ITSEC security certification [ITSEC 1996] at the E6 level (the highest level defined). At the time of writing these are the only two products to have achieved this level of certification. One of the most important security requirements for a smart card operating system is the segregation of applications — ensuring that co-resident applications are kept from interfering with each other (except along clearly defined communication channels). This paper discusses the general mathematical model of segregation we used, and a number of the issues arising from the task of proving that a Z state-and-operations model of a system possesses the required segregation property.

2 Motivating the Problem

Although segregation, non-interference, information flow and access control have all been the subject of investigation for a long time, our needs were specific:

– applications must be kept segregated in general, but must be able to communicate with each other along certain defined channels
– communication channels must support communication between multiple (more than two) applications

¹ Current address: Praxis Critical Systems Ltd, 20 Manvers Street, Bath, BA1 1PX [email protected]

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 451–470, 2000. © Springer-Verlag Berlin Heidelberg 2000


– the model of the system under consideration [Stepney & Cooper 2000] would be written in a conventional state-and-operations style in Z
– any segregation property would have to be shown to be possessed by the Z model through formal proof

Although much has been published in this area (see, for example, [Bell & Padula 1976], [Rushby 1981], [Goguen & Meseguer 1984], [Bell 1988], [Jacob 1992], [Roscoe 1995], [Gollman 1998], among many others), nothing existing fitted our needs without modification.

3 Trace Definition of Segregation

We choose to define segregation by considering a system to be a set of allowed traces (sequences of events) [Hoare 1985], [Hoare & He Jifeng 1998]¹. We bring all of the interesting behaviour of the system out into the visible events, ignoring any internal state that may be used to control the behaviour. This is consistent with the semantics imposed on Z by the refinement rules [Stepney et al. 1998], provided the communication events are rich enough to capture inputs, outputs, initialisation and finalisation.

In our model, a system is completely defined by its allowed TRACEs. So a system is a (prefix closed) set of TRACEs:

    SYSTEM == { s : ℙ TRACE | (∀ t : s; p : TRACE | p prefix t • p ∈ s) }

Some systems behave as though they are made up of independent, segregated applications, communicating with each other through defined channels. Our task is to define the set of such systems.

We impose only as much structure on events as necessary. We are defining the concept of segregation between identified applications, and so we require a set of application identifiers

    [A]

We want to have control over the communication between applications, and this requires a structure on the data in the events; a collection of named values:

    [N, V]

A communication EVENT consists of the set of applications that are engaging in the event, identifying which of the named values each application can see

    EVENT == { n : A ←→ N; v : N ⇸ V | n ≠ ∅ ∧ dom v = ran n }

At least one application is engaging, and the named values are precisely those visible to at least one application.

¹ We do not need to work with a richer model of a system, which could include properties such as failures/divergences, or statistical properties of traces.


We add a dummy event, Φ, representing the view an application has of events it cannot see.

    Φ == (∅[A × N], ∅[N × V])
    EVENT_Φ == EVENT ∪ { Φ }

A sequence of EVENTs constitutes a TRACE.

    TRACE == seq EVENT_Φ

3.1 A Segregated System

An actual system TRACE consists of a sequence of EVENTs. We can view this sequence of EVENTs from the perspective of each application, seeing only the EVENTs engaged in by this application, and seeing only the parts of the EVENTs visible to this application. We can choose how much of an EVENT is visible to an application; one choice is:

    η : EVENT_Φ −→ A −→ EVENT_Φ
    ∀ n : A ←→ N; v : N ⇸ V; a : A | (n, v) ∈ EVENT_Φ •
        η(n, v) a = (n ▷ n⦇{a}⦈, n⦇{a}⦈ ◁ v)

This choice allows applications to see only the named values identified with this application, and to see any applications engaging that have visibility of at least one named value also visible to this application. Other choices are possible (see section 8 for the implications). The remaining discussion of segregation is valid for any choice.

We lift η up to a projection of whole TRACEs, dropping invisible EVENTs.

    π : TRACE −→ A −→ TRACE
    ∀ t : TRACE; a : A • π t a = { i : dom t • i ↦ η(t i) a } ↾ EVENT

and lift again to sets of TRACEs (systems)

    Π : ℙ TRACE −→ A ←→ TRACE
    ∀ s : ℙ TRACE • Π s = ⋃ { t : s • π t }

We now give the formal definition of being segregated: for a system to be segregated, any collection of application traces derived from the system (that can legally be re-combined) must yield an allowed trace of the system when recombined.

    SEG == { s : SYSTEM | ⋃((Π ⨾ Π∼)⦇{s}⦈) = s }

Intuitively, what does this mean?


If a system behaves as though it consists of independent applications that interact only via explicit shared communication events, then if one of its applications exhibits a certain behaviour in one context, it should be able to exhibit the same behaviour in any other consistent context. By “consistent” we mean supplying the same view of shared communication events. If this is not true, it means that something must be preventing this application from behaving this way, even though all the allowed interfaces are identical. There must be some interference, some back door communication. That is, the system is not behaving as independent applications interacting only via explicit shared communication events. A communications-segregated system has to be quite large: it must at least contain all the individual application behaviours as possible traces; it must also contain the individual behaviours occurring in any order.
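The definitions of η, π and SEG can be animated on small finite examples. The Python sketch below is an illustration only and is not part of the original development. It assumes a finite set of applications and events, represents relations as frozensets of pairs, and checks segregation only against candidate traces up to a bound `max_len` that contain no Φ events. It relies on the observation that a trace t belongs to ⋃((Π ⨾ Π∼)⦇{s}⦈) exactly when, for every application a, π t a is also the projection of some trace already in s. The names `eta`, `pi`, `is_segregated`, `event`, `ea` and `eb` are hypothetical.

```python
from itertools import product

# An event is a pair (n, v): n is a frozenset of (application, name) pairs
# and v is a frozenset of (name, value) pairs with dom v = ran n.
PHI = (frozenset(), frozenset())  # the dummy event Φ


def eta(event, a):
    """View of one event from application a (the choice of η in section 3.1)."""
    n, v = event
    names = {name for (app, name) in n if app == a}   # relational image of {a} under n
    return (frozenset(p for p in n if p[1] in names),  # n range-restricted to names
            frozenset(p for p in v if p[0] in names))  # v domain-restricted to names


def pi(trace, a):
    """Project a whole trace for application a, dropping invisible (Φ) events."""
    return tuple(e for e in (eta(ev, a) for ev in trace) if e != PHI)


def is_prefix_closed(system):
    """True if every prefix of every trace is also in the system."""
    return all(t[:k] in system for t in system for k in range(len(t)))


def is_segregated(system, apps, events, max_len):
    """Bounded check: any candidate trace whose per-application views all
    occur in the system must itself be a trace of the system."""
    views = {a: {pi(t, a) for t in system} for a in apps}
    for k in range(max_len + 1):
        for t in product(events, repeat=k):
            if t not in system and all(pi(t, a) in views[a] for a in apps):
                return False
    return True


def event(n, v):
    return (frozenset(n), frozenset(v))


# Two applications that never share a named value.
ea = event({("a", "x")}, {("x", 1)})
eb = event({("b", "y")}, {("y", 2)})

interleaved = {(), (ea,), (eb,), (ea, eb), (eb, ea)}  # either order allowed
ordered = {(), (ea,), (ea, eb)}                       # b may act only after a

print(is_segregated(interleaved, ["a", "b"], [ea, eb], 2))  # True
print(is_segregated(ordered, ["a", "b"], [ea, eb], 2))      # False
```

The second system fails because application b can exhibit its event only after a has acted, although the two share no communication: the trace consisting of b's event alone has consistent views for both applications but is not allowed.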

4 Using Segregation

A definition of segregation in terms of traces is good, but is not immediately applicable to conventional state-and-operations Z models of systems. In a practical, industrial development (such as the development of which this is part [Stepney & Cooper 2000]) one needs specific tools that can be applied to the system specifications being developed. Each small part of the set of tools is relatively straightforward, but when combined needs care and attention to detail. We summarise here the elements needed to turn the abstract segregation definition into a practical tool, and then expand each element in the succeeding sections.

4.1 Multiple Models

To make the definition of the segregation property simple we have chosen to work in the world of traces. To make the modelling of the Smartcard system under investigation practical, we have chosen a conventional state-and-operations style of Z specification. These are worlds apart. Although the conceptual step from one to the other is not particularly large or difficult, the practical step needs to be taken, and needs to account for all the messy detail that invariably appears in real systems. Merging these worlds requires a mathematically sound translation. We developed a number of models, building in progressively more of the computational framework assumed by Z, defining transition functions at each step. This allows us to take a conventional Z specification, extract its equivalent traces model, and impose the requirements of segregation on it.

4.2 Multi-promotion

We developed a Z specification structure that naturally leads to segregated systems. It is a natural extension of promotion to accommodate the simultaneous update of multiple local states; we call it multi-promotion. The expectation is that if we can constrain the specification of the system structurally, it should be easier to prove that a given system is segregated.

4.3 Unwinding Theorem

It is useful, in proving a property of systems over sequences of transitions, to prove an equivalent property over individual transitions. This process is generally known as unwinding. We state and prove the unwinding theorem for our definition of segregation. It is, in essence, that a Z model written as an unconstrained multi-promotion is segregated. This is the justification for all the previous work. Having developed a clear, abstract definition of segregation, we have now proved that, under a mathematically sound and justifiable transformation, a specifically structured specification will possess the segregation property.

5 Multiple Models

We have many ways of specifying a system, each suitable for different purposes. In the various relational formulations, we have a global state Γ, a model state Σ, inputs I, outputs O, and events EVENT. (For simplicity we assume that the global inputs and outputs, and the model inputs and outputs, have the same type.)

    [Γ, Σ, I, O, EVENT]

We use the following models:

– event traces: a system is modelled as a set of traces of events: ℙ(seq EVENT). This is the model used to define segregation.
– input–output traces: a system is modelled as a set of traces of input–output pairs: ℙ(seq(I × O))
– computational model: a system is modelled as a global state transition relation: Γ × (seq I × seq O) ←→ Γ × (seq I × seq O)
– state transition system: a system is modelled as a state transition relation: Σ × I ←→ Σ × O, a state initialisation Γ ←→ Σ, an input initialisation I ←→ I, a state finalisation Γ ←→ Σ, and an output finalisation O ←→ O.
– Z model: like the state transition relation model, but using schemas instead of relations: a state System, an operation SystemBhvr, state initialisation InitSystem, identity input initialisation, state finalisation FinSystem, and output finalisation FinOut+, or identity. This is the model used to specify the behaviour of our system.

We need ways of moving between these different descriptions. We develop a number of translation functions, as summarised in figure 1 below.

[Figure 1: diagram not reproduced. Nodes: ℙ(seq(I × O)), ℙ(seq EVENT), GM ←→ GM, (SM ←→ SM) × IF, (Σ × I ←→ Σ × O) × IF, ℙ(EVENT × Σ × Σ). Arrow labels: toProg, gg, ggTraces, ioTraces, toEvent, traces, simplify, wound.]

Fig. 1. Summary of the various translation functions

5.1 Traces from the Computational Model

Z has an implicit computational model used to interpret specifications. This is discussed in some detail in [Woodcock & Davies 1996] and [Stepney et al. 1998]. The essential behaviour of a system, and the behaviour we use when deciding whether one system is the refinement of another, is captured in a relation between two global states. These global states are deemed to be ‘real’, in that the elements in them refer to real-world objects and values that can be detected and can be used to test a real system. We define a means of passing back and forth between the global states and the internal, specification states used in Z. This section develops a function gg that maps a Z specification to its essential global-to-global relation. From this, we develop a real-world notion of a system trace. [He Jifeng et al. 1986] is the basis of the theory of Z refinement we use. They make use of general programs written in Dijkstra’s guarded command language, but for our purposes we can be more specific. All actions of our system can be represented as a simple sequence of operations, one after the other; no recursion, choice, or non-determinism in operation choice are needed.

5.2 The Computational Model

As explained in [Woodcock & Davies 1996], we use a system state SM that is rich enough to store the sequence of inputs not yet consumed and the sequence of outputs already produced, as well as the actual system state.

    SM == Σ × (seq I × seq O)

We are interested in state transitions sop that respect the computational model of consuming inputs and producing outputs. Such sops can be written in terms of some input–output state transition op:

    op : Σ × I ←→ Σ × O ⊢ split ⨾ (op ∥ id) ⨾ merge ∈ SM ←→ SM


(See [Woodcock & Davies 1996] for definitions of split, id, ∥, and merge.) We are interested in the transitive closure of such sops (representing an arbitrary sequence of these operations). Omitting some details, we define the function toProg, which takes a relational system description in terms of individual state transitions in the computational model world, and yields the resulting programs in the SM world, by imposing the computational model to yield single state transitions in the SM world, then taking the closure of these to yield the programs.

    toProg : (Σ × I ←→ Σ × O) −→ (SM ←→ SM)
    ∀ op : Σ × I ←→ Σ × O • toProg op = (split ⨾ (op ∥ id) ⨾ merge)∗

toProg can also be written explicitly as a constructed set of traces (using extraction functions in, out, beforeS, and afterS from the type of op in the obvious way):

    toProg op = { τ : seq₁ op | (∀ k : 1 . . #τ − 1 • afterS(τ k) = beforeS(tail τ k)) •
                    (beforeS(τ 1), (τ ⨾ in, ⟨⟩)) ↦ (afterS(τ(#τ)), (⟨⟩, τ ⨾ out)) } ∪ id SM

5.3 Including Initialisation and Finalisation

We define the set of ‘complete’ programs by taking the set of programs allProg op and adding an initialisation step to the front and a finalisation step to the back. These steps map between the global world and the specification world. The computational model uses a global structure very similar to the set SM, with a state, input sequence and output sequence. Initialisation maps the global state to the specification state using a state initialisation relation, si; the input sequence to the input sequence using an input initialisation relation, ii; and the output sequence it ignores, giving an empty output sequence in the specification. Finalisation is similar, using sf and of, but ignores the input sequence.

    GM == Γ × (seq I × seq O)

Define

    IF == (Γ ←→ Σ) × (I ←→ I) × (Γ ←→ Σ) × (O ←→ O)

    gg : (SM ←→ SM) × IF −→ GM ←→ GM
    ∀ sopProg : SM ←→ SM; si, sf : Γ ←→ Σ; ii : I ←→ I; of : O ←→ O •
        gg(sopProg, (si, ii, sf, of)) =
            (si ∥ (ˆii ∥ (seq O × {⟨⟩}))) ⨾ sopProg ⨾ (sf∼ ∥ ((seq I × {⟨⟩}) ∥ ˆof∼))

(The ˆ operator lifts functions on elements to functions on sets of elements.) gg results in a relation from the initial global state to the final global state.
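To make the computational model concrete, the following Python sketch animates a single SM step (split, apply op, merge), the closure of such steps, and the surrounding initialisation and finalisation. It is a minimal illustration under simplifying assumptions that are not in the original: relations are finite sets of pairs, the input initialisation ii and output finalisation of are both the identity, and the function names `sm_step`, `to_prog_image` and `gg_image`, as well as the `toggle` example, are hypothetical.

```python
def sm_step(op, state):
    """One SM transition: split off the next input, apply op, merge the output.

    op is a set of ((before_state, input), (after_state, output)) pairs;
    state is (sigma, (inputs, outputs)) with inputs and outputs as tuples.
    """
    sigma, (inputs, outputs) = state
    if not inputs:
        return set()  # no input left to consume
    head, rest = inputs[0], inputs[1:]
    return {(after, (rest, outputs + (out,)))
            for ((before, i), (after, out)) in op
            if before == sigma and i == head}


def to_prog_image(op, start):
    """All SM states related to `start` by the closure of sm_step
    (zero or more steps, so `start` itself is included)."""
    seen, frontier = {start}, [start]
    while frontier:
        for nxt in sm_step(op, frontier.pop()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def gg_image(op, si, sf, g, inputs):
    """Global results reachable from global state g with the given inputs.

    si and sf are sets of (global_state, model_state) pairs. Initialisation
    starts with an empty output sequence; finalisation discards the unused
    inputs. Identity ii and of are assumed.
    """
    result = set()
    for (g0, sigma) in si:
        if g0 != g:
            continue
        start = (sigma, (tuple(inputs), ()))
        for (sigma2, (_unused, outputs)) in to_prog_image(op, start):
            for (g2, s) in sf:
                if s == sigma2:
                    result.add((g2, ((), outputs)))
    return result


# A two-state toggle: each input "t" flips the state and reports the new one.
toggle = {((0, "t"), (1, "one")), ((1, "t"), (0, "zero"))}
si = {("G", 0)}
sf = {("G0", 0), ("G1", 1)}

runs = gg_image(toggle, si, sf, "G", ("t", "t"))
print(sorted(runs))
```

Because the closure includes runs that stop before all inputs are consumed, the result contains final states with fewer outputs than inputs; these are the components that have no corresponding input–output trace.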

5.4 Input–Output Traces

For segregation we are interested in just the inputs and outputs of a system, ignoring the initial and final states. The function ggTraces maps a global-to-global relation to the corresponding set of input/output traces.

    ggTraces : (GM ←→ GM) −→ ℙ(seq(I × O))
    ∀ r : GM ←→ GM •
        ggTraces r = { g, g′ : Γ; is, is′ : seq I; os, os′ : seq O |
                         #is = #os′
                         ∧ (g, (is, os)) ↦ (g′, (is′, os′)) ∈ r •
                         { i : dom is • i ↦ (is i, os′ i) } }

Notice that any components in the gg relation with differing length input and output sequences have no corresponding input–output trace. ioTraces maps a state transition relation to a set of input–output traces

    ioTraces : (Σ × I ←→ Σ × O) × IF −→ ℙ(seq(I × O))
    ioTraces = toProg ⨾ gg ⨾ ggTraces

This can be written as explicit sets of traces:

    op : Σ × I ←→ Σ × O; si, sf : Γ ←→ Σ; ii : I ←→ I; of : O ←→ O
    ⊢ ioTraces(op, (si, ii, sf, of)) =
        { τ : seq₁ op; g, g′ : Γ; is : seq I; os : seq O |
            #is = #os = #τ
            ∧ beforeS(τ 1) ∈ si⦇{g}⦈
            ∧ g′ ∈ sf∼⦇{afterS(τ(#τ))}⦈
            ∧ (∀ k : dom τ •
                  in(τ k) ∈ ii⦇{is k}⦈
                  ∧ os k ∈ of∼⦇{out(τ k)}⦈
                  ∧ (k < #τ ⇒ afterS(τ k) = beforeS(tail τ k))) •
            { i : dom τ • i ↦ (is i, os i) } } ∪ {⟨⟩}

5.5 Event Traces

The Z computational model is expressed in terms of inputs and outputs, the segregation model in terms of events. We introduce a bijection asEvent that relabels inputs and outputs as events, allowing us to move freely between the two alternative representations.

    asEvent : I × O ⤖ EVENT

We lift this to the function toEvent, which converts from sets of input–output traces to sets of event traces.


    toEvent : ℙ(seq(I × O)) −→ ℙ(seq EVENT)
    toEvent = (λ s : ℙ(seq(I × O)) • (ˆasEvent)⦇s⦈)

5.6 Mapping from State-and-Operations to Traces

Finally, we can define the function that takes a system specification written conventionally as state-and-operations, and delivers the equivalent traces model.

    traces : (Σ × I ←→ Σ × O) × IF −→ ℙ(seq EVENT)
    traces = ioTraces ⨾ toEvent

This gives us the ability to move formally between our system specification written in a conventional Z style and our definition of segregation written in the traces model. We have built up this translation function from the theory surrounding Z refinement, which means that we have a good understanding of how this translation is affected by refinement. This is important when relating two system specifications (one a refinement of the other) to the same definition of segregation. This we need to do because in general segregation is not preserved under refinement, and so an abstract system proved to be segregated must be re-proved segregated after refinement. Indeed, we have used this understanding in defining a property of segregation with respect to a model (called segWrt), which captures the fact that two models are segregated in the same way. This is discussed in [Stepney & Cooper 2000].
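The explicit form of ioTraces, followed by the relabelling into events, can be enumerated directly for a finite relation. The Python sketch below is illustrative only. It assumes identity input initialisation and output finalisation, bounds the trace length by `max_len`, and takes the relabelling `as_event` as an ordinary function argument; the names `io_traces`, `traces`, `toggle` and `label` are hypothetical.

```python
def io_traces(op, si, sf, max_len):
    """Input-output traces of length <= max_len for a finite relation op.

    op is a set of ((before_state, input), (after_state, output)) pairs;
    si and sf are sets of (global_state, model_state) pairs.
    A chain of op steps contributes a trace when it starts in a state that
    some global state initialises to and ends in a state that finalises to
    some global state. Identity ii and of are assumed.
    """
    starts = {sigma for (_g, sigma) in si}
    finals = {sigma for (_g, sigma) in sf}
    result = {()}  # the empty trace is always included
    frontier = {(sigma, ()) for sigma in starts}
    for _ in range(max_len):
        frontier = {(after, trace + ((i, o),))
                    for (sigma, trace) in frontier
                    for ((before, i), (after, o)) in op
                    if before == sigma}
        result |= {trace for (sigma, trace) in frontier if sigma in finals}
    return result


def traces(op, si, sf, as_event, max_len):
    """Relabel each input-output pair as an event (the role of asEvent)."""
    return {tuple(as_event(io) for io in t)
            for t in io_traces(op, si, sf, max_len)}


# A two-state toggle: each input "t" flips the state and reports the new one.
toggle = {((0, "t"), (1, "one")), ((1, "t"), (0, "zero"))}
si = {("G", 0)}
sf = {("G0", 0), ("G1", 1)}


def label(io):
    return f"{io[0]}/{io[1]}"


print(sorted(traces(toggle, si, sf, label, 2)))
```

For this example every model state can be finalised, so the resulting set of event traces is prefix closed; restricting sf to a single state would drop the traces that end elsewhere.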

6 Multi-promotion

6.1 A Reminder of Single Promotion

Promotion is a commonly used technique of structuring a Z specification to aid understanding when the system state consists of a collection of local states, each of which generally changes in isolation (explained in [Barden et al. 1994, chapter 19]). Calling the individual local states applications, consider the following simple example:

    [X, A, C]

Each local application has a state (possibly with some invariant predicate) and some locally defined operations taking in an input communication of type C and delivering an output communication of the same type.

    ┌─ ApplState ─────────
    │ x : X
    ├─────────
    │ ...
    └─────────


    ┌─ LocalOp ─────────
    │ ∆ApplState
    │ c?, c! : C
    ├─────────
    │ ...
    └─────────

These are then collected together into a promoted state, where each is labelled by an application name²:

    ┌─ PromotedState ─────────
    │ collection : A −→ ApplState
    └─────────

Each local application operation can be promoted to an operation on the whole promoted state using a so-called framing schema

    ┌─ ΦPromotedStateIn ─────────
    │ ∆ApplState
    │ ∆PromotedState
    │ a? : A
    ├─────────
    │ collection a? = θApplState
    │ collection′ = collection ⊕ { a? ↦ θApplState′ }
    └─────────

    PromotedOpIn ≙ ∃ ∆ApplState • ΦPromotedStateIn ∧ LocalOp

We can understand the behaviour of LocalOp in the context of a single application state. Having internalised this (and the other local operations), we promote all local operations to the collection in the same way (via ΦPromotedStateIn): one application state is updated according to the local operation, and all other application states do not change.

6.2 Introducing Multi-promotion

We extend this structuring to cater for multiple applications changing simultaneously. First, though, we need to look at some of the features of single promotion. We have identified which application state changes through an input, a?. This isn’t the only choice: it could be an internal choice of the system based on some system state (such as the “currently selected application”), which we can model by hiding the a?:

    PromotedOpHide ≙ ∃ ∆ApplState; a? : A • ΦPromotedStateIn ∧ LocalOp

² Usually, collection would be a partial function, and applications could be added and removed by changing its domain. We chose to use a total function and model “absent” applications explicitly, to ease the connection with the segregation definition.


This allows any appropriate application to be the one engaging in the operation; other system constraints may force only one to be appropriate, or the choice may be made non-deterministically. The choice is invisible (unless the resulting state change is visible). Alternatively, the choice can be made an output, a!. This behaves like the hidden value (it is the system rather than the user that decides which application state changes), but makes visible which choice was made. For technical reasons, this was the choice we took in our development.

    ┌─ ΦPromotedStateOut ─────────
    │ ∆ApplState
    │ ∆PromotedState
    │ a! : A
    ├─────────
    │ collection a! = θApplState
    │ collection′ = collection ⊕ { a! ↦ θApplState′ }
    └─────────

    PromotedOpOut ≙ ∃ ∆ApplState • ΦPromotedStateOut ∧ LocalOp

The inputs and outputs c? and c! pass directly to the (single) application state that changes. Consider now an extension to allow multiple application states to change:

    ┌─ MPromotedOpOut ─────────
    │ ∆PromotedState
    │ α! : ℙ₁ A
    │ c?, c! : C
    ├─────────
    │ α! ⩤ collection′ = α! ⩤ collection
    │ ∀ a : α! • ∃ ∆ApplState •
    │     collection a = θApplState
    │     ∧ collection′ a = θApplState′
    │     ∧ LocalOp
    └─────────

A set of application names is identified to change. The same options exist in the multiple case as in the single: this can be an input, hidden inside an existential, or an output. Notice that all local application states are experiencing the same LocalOp. This is not a restriction, as it is always possible to harmonise signatures and disjoin all the operations on an application state into a single LocalOp, and then use information in the input communication c? or state to select the required operation. In the case when α! is a singleton set, this formulation reduces to single promotion.


The form just presented is of unconstrained multi-promotion. There are no constraints in the PromotedState that affect the ability of individual local applications to respond to inputs as they choose. This is why unconstrained multi-promotion fits so naturally with segregation: if all system behaviours are modelled as local operations on local application states, then an unconstrained multi-promotion specifies a system of segregated parts. Where do the communication channels come in? They appear in the amount of sharing we choose in α!, c? and c!.
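The step from single promotion to unconstrained multi-promotion can be illustrated operationally. The Python sketch below is a loose analogy rather than a rendering of the schemas: it assumes a local operation is a function returning the set of possible (new state, output) pairs, passes the chosen applications in as a parameter (the input option) rather than producing them as an output, and uses the hypothetical names `promote`, `multi_promote` and `add`.

```python
from itertools import product


def promote(local_op, collection, a, c_in):
    """Single promotion: only application a changes, all others are untouched.

    local_op(state, c_in) returns a set of (new_state, c_out) pairs.
    Returns a list of (new_collection, c_out) results.
    """
    results = []
    for (new_state, c_out) in local_op(collection[a], c_in):
        updated = dict(collection)
        updated[a] = new_state
        results.append((updated, c_out))
    return results


def multi_promote(local_op, collection, alpha, c_in):
    """Unconstrained multi-promotion: every application in alpha performs the
    local operation on the same c_in and must agree on the same c_out;
    applications outside alpha are left unchanged."""
    if not alpha:
        raise ValueError("alpha must be non-empty")
    apps = sorted(alpha)
    options = [local_op(collection[a], c_in) for a in apps]
    results = []
    for choice in product(*options):
        outs = {c_out for (_state, c_out) in choice}
        if len(outs) == 1:  # all participants see the same output
            updated = dict(collection)
            for a, (new_state, _c_out) in zip(apps, choice):
                updated[a] = new_state
            results.append((updated, outs.pop()))
    return results


def add(state, amount):
    """A hypothetical local operation: add the input to the local value."""
    return {(state + amount, "ok")}


collection = {"a": 1, "b": 5, "c": 9}
print(multi_promote(add, collection, {"a", "b"}, 2))
print(multi_promote(add, collection, {"a"}, 2) == promote(add, collection, "a", 2))
```

The only coupling between the participating applications is the shared input and output, which is the sense in which the promotion is unconstrained.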

6.3 Defined Communication Channels

In the Smartcard system we were developing, we had two requirements on communication. First, applications needed to be able to synchronise with others, ensuring that they engaged in an EVENT only if specific other applications did (or sometimes, did not) engage. Second, parts of inputs and outputs were sometimes shared. We met these requirements by expanding on the simple MPromotedOpOut in line with the choice of η made in section 3.1. We modified α!, the collection of interacting applications, to be a relation between the interacting applications and the named communication variables they could see.

    α! : A ←→ N

The input and output communications then consist of named values

    γ?, γ! : N ⇸ V

We allowed each application to see a restricted view of α! (it could see the applications that shared at least one named communication variable) and to see those named communication variables identified for this application by α!. This yields

    ┌─ LocalOpWithComms ─────────
    │ ∆ApplState
    │ lα! : A ←→ N
    │ lγ?, lγ! : N ⇸ V
    ├─────────
    │ ...
    └─────────


    ┌─ MPromotedOpWithComms ─────────
    │ ∆PromotedState
    │ α! : A ←→ N
    │ γ?, γ! : N ⇸ V
    ├─────────
    │ α! ≠ ∅
    │ ⟨dom γ?, dom γ!⟩ partition ran α!
    │ (dom α!) ⩤ collection′ = (dom α!) ⩤ collection
    │ ∀ a : dom α! • ∃ ∆ApplState; lα! : A ←→ N; lγ?, lγ! : N ⇸ V •
    │     lα! = α! ▷ α!⦇{a}⦈
    │     ∧ lγ? = α!⦇{a}⦈ ◁ γ?
    │     ∧ lγ! = α!⦇{a}⦈ ◁ γ!
    │     ∧ collection a = θApplState
    │     ∧ collection′ a = θApplState′
    │     ∧ LocalOpWithComms
    └─────────

It is now possible to specify that a local operation will execute only if a specific other application is also executing, by adding the predicate

    a1 ∈ dom lα!

or to ensure that at least one other application is executing

    # dom lα! ≥ 2

Two applications can exchange a value during execution as follows. Assume one of the named communication variables is info. The sending application can set the value of info

    info ∈ dom lγ! ∧ lγ! info = 27

and the receiving application can respond on the basis of the value

    info ∈ dom lγ! ∧ lγ! info ≥ 20 ⇒ . . .

If the receiving application wants to be sure that the value had been set by a specific application, a constraint on lα! can be added.

In these examples there are no constraints in the multi-promotion schema other than those directly related to promotion. In such an unconstrained multi-promotion α!, γ? and γ! constitute the communication channels between the applications. There is a clear statement of the information being transmitted: there can be no back-door communication between applications. If the local constraints are complex, though, it can be hard to determine the actual precondition on a promoted operation. The promoted operation can execute whenever a collection of applications can be found that together have a consistent set of constraints. In practice, this may not be obvious.
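The way each application's local view is cut out of the global α!, γ? and γ! can be sketched in a few lines of Python. This is an illustration under assumptions, not the specification itself: α! is represented as a set of (application, name) pairs, γ? and γ! as dictionaries, and the application names `sender` and `receiver`, the variable `pin`, and the helper names `well_formed` and `local_view` are all hypothetical.

```python
def well_formed(alpha, gamma_in, gamma_out):
    """The promotion-level constraints on the channels: alpha is non-empty and
    the domains of gamma_in and gamma_out partition the names in alpha."""
    names = {name for (_app, name) in alpha}
    disjoint = not (gamma_in.keys() & gamma_out.keys())
    covering = (gamma_in.keys() | gamma_out.keys()) == names
    return bool(alpha) and disjoint and covering


def local_view(alpha, gamma_in, gamma_out, a):
    """Restrict the global channels to what application a may see."""
    names = {name for (app, name) in alpha if app == a}        # image of {a} under alpha
    l_alpha = {(app, name) for (app, name) in alpha if name in names}
    l_gamma_in = {n: v for n, v in gamma_in.items() if n in names}
    l_gamma_out = {n: v for n, v in gamma_out.items() if n in names}
    return l_alpha, l_gamma_in, l_gamma_out


alpha = {("sender", "info"), ("receiver", "info"), ("sender", "pin")}
gamma_in = {"pin": 1234}
gamma_out = {"info": 27}

l_alpha, l_in, l_out = local_view(alpha, gamma_in, gamma_out, "receiver")

# The receiver can require that the sender is engaging, and react to info.
sender_present = "sender" in {app for (app, _name) in l_alpha}
info_large = "info" in l_out and l_out["info"] >= 20
print(well_formed(alpha, gamma_in, gamma_out), sender_present, info_large)
```

In this example the receiver sees that the sender shares `info` with it, but it sees neither the name `pin` nor its value, since `pin` is not associated with the receiver in α!.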

7 Unwinding Theorem

We have defined a function that extracts from a state-and-operations specification the set of visible TRACEs.

    traces : (Σ × I ←→ Σ × O) × IF −→ SYSTEM

This function allows us to formalise the property “our specification M describes a system that is segregated” as the theorem

    ⊢ traces M ∈ SEG

As it stands, this is quite difficult to prove, because the general definition of SEG is expressed in terms of traces over arbitrary application executions, whereas our specification M is written as a state-and-operations Z model.

Simplify, Based on Properties of the Model. However, we can make some simplifications, to produce a much simpler sufficient condition. All these simplifications are driven by the particular properties that our model M has. These properties are:

– initialisation is very simple: no refinement of inputs is needed, so inputs are initialised via the identity; state initialisation is chaotic (it ignores the global state from which initialisation came).
– finalisation is also simple: no refinement of outputs is needed, so outputs are finalised via the identity; all of the state is of interest, so it is finalised via the identity.

We therefore work with S, a simpler representation of M, expressed in terms of ℙ(EVENT × Σ × Σ), where the dependence on the particular initialisation and finalisation IF has disappeared, and the inputs and outputs have been bundled up into events. The function corresponding to traces, that converts the simplified state transition system to a traces description, is called wound (see figure 1). We prove the simplification theorem, that if s is a simplified form of m, then wound s gives the same set of traces as traces m:

    m : dom traces ⊢ traces m = wound (simplify m)

Hence it is sufficient to show that the wound form of our simplified system is segregated.

    S == simplify M
    ⊢ wound S ∈ SEG


Unwinding. The unwinding step is the heart of our proof; it moves the definition of segregation from the world of traces into the world of simplified state-transition systems like S. We introduce a set of simplified state transition models, UNWOUND, that is a direct analogy of SEG: any application state transition derived from the system by projection must also be a state transition allowed by the system. We prove the unwinding theorem, that if a simplified state transition model is in UNWOUND, then its traces model has the segregation property:

    s : UNWOUND ⊢ wound s ∈ SEG

This proof involves expanding the definition of wound to explicitly construct the set of system traces, and then using the properties given in UNWOUND, and much tedious algebra, to deduce the properties required by SEG. Hence it is sufficient to show that the simplified state transition model of our system is in the set UNWOUND.

    ⊢ S ∈ UNWOUND

Labelling. We have moved the segregation proof obligation into the world of general state transitions. We now move into the world of a particular kind of state transition: we assume the global state of the system Σ has a structure of labelled local states (where the labels are the application identifiers)

    Σ == A −→ S

We introduce a new set of labelled application systems, LABELLED. We prove the labelling theorem, that if a simplified state transition model is in LABELLED, then it is in UNWOUND.

    s : LABELLED ⊢ s ∈ UNWOUND

Hence it is sufficient to show that the simplified state transition model of our system is in the set LABELLED.

    ⊢ S ∈ LABELLED

Promotion. One particular form of a labelled system is a multi-promoted system, a particular way of gluing together labelled local state transitions into a global state transition system. We prove the promotion theorem, that such a promoted system is in LABELLED.

    s : PROMOTED ⊢ s ∈ LABELLED


Hence it is sufficient to show that the simplified state transition model of our system is in the set PROMOTED.

    ⊢ S ∈ PROMOTED

The set PROMOTED is still expressed in terms of a state transition relation between local labelled states, on events. But it sets the stage for moving from the state transition relation world to the more familiar state-and-operations schemas world. It is the first time the details of η (the way global events are seen by local applications) appear in the proof. So altering the precise details of the visibility properties of communication channels requires only a small change to the total proof.

Recasting to a Schema Form. We recast the set PROMOTED into schema form, and show that it is a form of Z unconstrained multi-promotion. We have reduced the segregation proof obligation to showing that our system is multi-promoted. So we now need to show that our system S can be written as a multi-promoted system. It is time to express our state transition relation in the world of Z schemas. First, we make a direct translation into Z. A local application transition has a type like:

    ((A ←→ N) × (N ⇸ V)) × S × S

We map this to a schema with a type like:

    [ s, s′ : S; lα! : A ←→ N; lc : N ⇸ V ]

The EVENT-based work we have been doing above makes no distinction between inputs and outputs, so we make an arbitrary division. The local operation schemas do not need to conform to the Z convention for operations, and so the input/output distinction does not need to be made. Here we have mapped the first element of the EVENT pair to lα! and the second to lc.

    DirectLocalOp ≙ [ s, s′ : S; lα! : A ←→ N; lc : N ⇸ V | P((lα!, lc), s, s′) ]

Here P is some predicate over the state that captures the local operation. A global application transition has a type like:

    ((A ←→ N) × (N ⇸ V)) × (A −→ S) × (A −→ S)

We map this to a schema with a type like:

    [ σ, σ′ : A −→ S; gα! : A ←→ N; gc?, gc! : N ⇸ V ]


The global operation, being a normal Z operation, needs to have a before and after state, and inputs and outputs. We have mapped the first element of the EVENT pair to gα! and the second to the two variables gc? and gc!. We choose these two to not overlap, and between them to cover all of the value of EVENT’s second element. We build up the global operation by promoting individual application schemas. The global operation comprises a promotion of local operations:

    ┌─ Global ─────────
    │ σ, σ′ : A −→ S
    │ gα! : A ←→ N
    │ gc?, gc! : N ⇸ V
    ├─────────
    │ gα! ≠ ∅
    │ ⟨dom gc?, dom gc!⟩ partition ran gα!
    │ (dom gα!) ⩤ σ = (dom gα!) ⩤ σ′
    │ ∀ a : dom gα! • ∃ s, s′ : S; lα! : A ←→ N; lc : N ⇸ V •
    │     s = σ a
    │     ∧ s′ = σ′ a
    │     ∧ lα! = gα! ▷ gα!⦇{a}⦈
    │     ∧ lc = gα!⦇{a}⦈ ◁ gc? ∪ gc!
    │     ∧ DirectLocalOp
    └─────────

Hence it is sufficient to show that our system state and operations model can be written as a Global schema. In fact, we made some further simplifications to Global, by instantiating it with the particularly simple value of N in our model. Instead of writing our model in the form of Global, we chose a more natural form to express it, and proved that it is equivalent to a Global-style model [Stepney & Cooper 2000]. Hence we proved our model segregated.

8 Strength of Segregation

In the definition of segregation we chose the definition of the projection function η (section 3.1). At various stages over the course of the development of the model and proof, we found it necessary to change the definition of η. We noticed that changing η had only a small effect on the proof of segregation: it directly affected the structure of a multi-promoted state transition system and the corresponding operation schema in Z, but it left the rest of the proof unchanged. It transpires that, in the context of segregation, η determines how much of an event is visible to an operation. Choosing an η that makes more of each event visible means that more systems are classed as segregated, since we have allowed more communication. Had we defined η so that undesirable systems were classed as being segregated, it may not have shown up in the proof.


To address this, we have developed a theory of ‘strength of segregation’ of η, but do not have the space to go into detail here. In summary, though, we can rank choices of η in a partial order, from the finest (which allows a local application to see no part of any EVENT ) to the coarsest (which allows a local application to see all of all the EVENT s). The finest segregator is most conservative, in that it permits the minimum number of systems to be classed as segregated. The coarsest segregator permits many systems to be segregated. We have endeavoured to choose a form of η that is as fine as possible, while still being representative of the actual system being modelled. Thus our choice allows parts of the inputs and outputs to be selectively shared between applications, without forcing all parts to be visible to all applications. If we had chosen a coarser segregator, which revealed all inputs and outputs equally to all applications, we would have been forced to open up wider communication channels between applications than we wanted. Care must be exercised in choosing η. Too fine, and you will be unable to prove your system segregated. Too coarse, and although you will be able to prove your system is segregated, the form of segregation will be too weak to be useful.

9

Property not Preserved by Refinement

It is worth noting that segregation as we have defined it is a property that is not necessarily preserved by refinement. It is possible to specify a system abstractly, prove that this system is segregated, prove that a more concrete specification is a refinement of the first, but then show that the more concrete specification is not segregated. The reason for this is that refinement allows non-determinism in the abstract specification to be resolved in any way the implementor chooses in the concrete. One such way may involve using supposedly secret information inside one application to influence the behaviour of another application.

This raises the question of what we mean when we say that the abstract specification was shown to be segregated. If we can exhibit a system that is an implementation of this specification (is a refinement of it) and yet is not itself segregated, how can we say that the abstract specification is segregated? The definition of segregation is derived from the totality of the system traces allowed by the specification. Being segregated is a property of all these traces, and is therefore only necessarily a property of systems that actually exhibit all these traces. Systems that do not exhibit all these traces may be argued to be correct versions of refinements of the specification, but they are not correct versions of the specification itself. We thus take a very constrained view of a specification: it specifies systems that behave in exactly this way; no more, no less.

It is also important to realise the limitations of this. Consider a specification of two applications that output independent values, with no communication between them. This can be proved to be a segregated system. Consider an actual system that genuinely exhibits all of the specified traces. Such a system would be segregated, by our definition. But this is true even if the system actually chooses its traces from some non-segregated subset of all the traces 99% of the time, and only 1% of the time adds in the full segregated behaviour. For example, the system could keep the outputs from the two applications in synchrony 99% of the time, and only 1% of the time allow them to nondeterministically diverge. With this information we could reliably (with 99% confidence) predict the output of one application knowing the output of the other. Segregation is a slippery subject, and not to be entered lightly!
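The 99%/1% scenario can be made concrete with a small calculation. The following Python sketch is only an illustration under stated assumptions: each application is assumed to output a single bit, the probabilities are the hypothetical ones from the text, and all names (`SPECIFIED_TRACES`, `P_SYNC`, `prediction_accuracy`) are invented for this example rather than taken from the model described in the paper.

```python
from fractions import Fraction
from itertools import product

# Assumption: each application emits a single bit, so a system trace is a pair (a, b).
OUTPUTS = (0, 1)
SPECIFIED_TRACES = set(product(OUTPUTS, repeat=2))

# Hypothetical implementation: the two outputs agree with probability 99/100.
P_SYNC = Fraction(99, 100)
trace_probability = {
    (a, b): (P_SYNC if a == b else 1 - P_SYNC) / 2
    for (a, b) in SPECIFIED_TRACES
}

# The implementation still exhibits every specified trace with non-zero probability.
exhibited_traces = {t for t, p in trace_probability.items() if p > 0}
exhibits_all_traces = exhibited_traces == SPECIFIED_TRACES

# An observer of application 1 guesses that application 2 produced the same bit.
prediction_accuracy = sum(p for (a, b), p in trace_probability.items() if a == b)
```

Here `exhibits_all_traces` is true, so a purely trace-based definition would accept the system, while `prediction_accuracy` shows how much one output reveals about the other.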

10

Conclusions

As part of an industrial project, we have defined a form of segregation with communication, in which a set of applications is shown to be kept separated except for defined channels of communication. These channels allow an arbitrary number of applications to engage simultaneously in sharing information. We have given this definition in terms of system traces, and have also rigorously developed a set of translation functions from conventional Z state-and-operation specifications to system traces. We have defined a generalisation of promoting a single local state to promoting multiple local states, called multi-promotion, and proved an unwinding theorem that a Z model written as an unconstrained multi-promotion is segregated.

Acknowledgements. The work described in the paper took place as part of a development funded by the NatWest Development Team. Parts of the work were carried out by Eoin McDonnell, Barry Hearn and Andy Newton (all of Logica). We would like to thank Jeremy Jacob and John Clark for their helpful comments and careful review of this work.

References

[Barden et al. 1994] Rosalind Barden, Susan Stepney, and David Cooper. Z in Practice. BCS Practitioners Series. Prentice Hall, 1994.
[Bell & Padula 1976] David E. Bell and Len J. La Padula. Secure computer system: unified exposition and MULTICS. Report ESD-TR-75-306, The MITRE Corporation, March 1976.
[Bell 1988] D. E. Bell. Concerning “modelling” of computer security. In Proceedings 1988 IEEE Symposium on Security and Privacy, pages 8–13. IEEE Computer Society Press, April 1988.
[Goguen & Meseguer 1984] J. A. Goguen and J. Meseguer. Unwinding and inference control. In Proceedings 1984 IEEE Symposium on Security and Privacy, pages 75–86. IEEE Computer Society, 1984.
[Gollman 1998] Dieter Gollman. Computer Security. John Wiley, 1998.
[He Jifeng et al. 1986] He Jifeng, C. A. R. Hoare, and Jeff W. Sanders. Data refinement refined (résumé). In ESOP’86, number 213 in Lecture Notes in Computer Science, pages 187–196. Springer Verlag, 1986.
[Hoare & He Jifeng 1998] C. A. R. Hoare and He Jifeng. Unifying Theories of Programming. Prentice Hall, 1998.
[Hoare 1985] C. A. R. Hoare. Communicating Sequential Processes. Prentice Hall, 1985.
[ITSEC 1996] UK IT Security Evaluation and Certification Scheme, issue 3.0. Technical report, UK ITSEC, Cheltenham, December 1996.
[Jacob 1992] Jeremy L. Jacob. Basic theorems about security. Journal of Computer Security, 1(4):385–411, 1992.
[Roscoe 1995] A. W. Roscoe. CSP and determinism in security modelling. In Proceedings 1995 IEEE Symposium on Security and Privacy, pages 114–127. IEEE Computer Society Press, 1995.
[Rushby 1981] J. M. Rushby. The design and verification of secure systems. In Proceedings 8th ACM Symposium on Operating System Principles, December 1981.
[Stepney & Cooper 2000] Susan Stepney and David Cooper. Formal methods for industrial products. (These proceedings), 2000.
[Stepney et al. 1998] Susan Stepney, David Cooper, and Jim Woodcock. More powerful Z data refinement: pushing the state of the art in industrial refinement. In Jonathan P. Bowen, Andreas Fett, and Michael G. Hinchey, editors, ZUM’98: 11th International Conference of Z Users, Berlin 1998, volume 1493 of Lecture Notes in Computer Science, pages 284–307. Springer Verlag, 1998.
[Woodcock & Davies 1996] Jim Woodcock and Jim Davies. Using Z: Specification, Refinement, and Proof. Prentice Hall, 1996.

Closure Induction in a Z-Like Language

David A. Duffy¹ and Jürgen Giesl²

¹ Department of Computer Science, University of York, Heslington, York, YO10 5DD, UK
² Computer Science Department, University of New Mexico, Albuquerque, NM 87131, USA

Abstract. Simply-typed set-theoretic languages such as Z and B are widely used for program and system specifications. The main technique for reasoning about such specifications is induction. However, while partiality is an important concept in these languages, many standard approaches to automating induction proofs rely on the totality of all occurring functions. Reinterpreting the second author’s recently proposed induction technique for partial functional programs, we introduce in this paper the new principle of “closure induction” for reasoning about the inductive properties of partial functions in simply-typed set-theoretic languages. In particular, closure induction allows us to prove partial correctness, that is, to prove those instances of conjectures for which designated partial functions are explicitly defined.

1

Motivation

Partial functions are endemic in specifications written in languages such as Z and B. To reason about their inductive properties, a method amenable to mechanical support by automated theorem provers is essential. In [13], Giesl has shown that, under certain conditions, many of the reasoning processes used to prove inductive properties of total functions (e.g., those in [5,9,19,25,27]) may be transposed to partial functions. The inference rules proposed by Giesl allow us to prove conjectures involving partial functions for all instances of the conjecture for which designated partial functions are explicitly defined. However, Giesl’s technique has been designed for a first-order functional language with an eager (call-by-value) evaluation strategy. In this paper, we examine thoroughly which interpretation of partiality and which restrictions on the allowed theories are required in order to extend Giesl’s induction principle from the original functional programming framework to a simply-typed set-theoretic language closely related to Z and B. We refer to our new principle as “closure induction”, since instances of it may be described within our set-theoretic language itself, and these instances may be viewed as “closure axioms” for a function definition, asserting that the function is defined in only those cases explicitly specified.

D. Duffy was supported by the EPSRC under grant no. GR/L31104. J. Giesl was supported by the DFG under grant no. GI 274/4-1.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 471–490, 2000. © Springer-Verlag Berlin Heidelberg 2000

For the soundness of closure induction we must make certain assumptions about the semantics of types (i.e., the carrier of a type must include “undefined” values that can be used as the value of a partial function when applied outside of its domain). We describe an appropriate semantics for our language in Section 2. Our approach to induction is applicable to languages such as Z and B if they too assume our semantics. This semantics is, we claim, not very restrictive; we would argue that it imposes the minimal requirements needed in order to distinguish between defined and undefined expressions.

A commonplace interpretation of partial-function application in the Z community [1,24] is that any such application always returns a value in the function’s range type; we refer to this as the “classical” semantics. In such a framework we cannot distinguish between defined and undefined function applications. However, there is some debate as to whether this is the appropriate interpretation of function application [17], and our alternative semantics has already gained some interest within the Z community via its earlier presentation in a more general set-theoretic framework [11]. Apparently, no particular semantics is fixed by the standard definition of Z [20,21]. Moreover, our semantics may be simulated within the classical semantics in a straightforward way [11]; this allows us to simulate our approach to induction in the CADiZ system [22], a tool for reasoning about Z specifications, which currently supports the classical semantics.

In Section 3 we formalize our concept of inductive validity in the context of partial functions and in Section 4 we introduce the technique of closure induction in order to perform induction proofs automatically. We then discuss conditions under which closure induction is sound. We formalize these conditions in terms of rewriting, and it may thus come as no surprise that a confluence property forms part of the conditions.
In particular, we show that the applicability of closure induction extends beyond the “orthogonal” equational theories considered previously by Giesl [13]. Finally, in Section 5 we present some further rules that are needed in addition to closure induction to verify definedness conditions that arise in most proofs about partial functions. The closure-induction approach described in this paper has been simulated within the CADiZ system [22]; simulations of the diff and quot examples we describe may be found on the web at ftp://ftp.cs.york.ac.uk/pub/aig/examples.

2

A Typed Language and Its Semantics

Elsewhere [11], Duffy has described a quite general set-theoretic language (essentially a subset of Z) and its associated semantics. Since, in the present paper, we are concerned with inductive reasoning in the context of free types and equational theories, we are able to consider a much restricted subset of this higher-order language, which we will refer to as F (signifying “free types”).

2.1

The Syntax of Expressions

We refer to all allowed syntactic objects as “expressions”. We separate expressions into “types”, “terms”, and “formulae”, distinguishing types from terms, for simplicity, since we do not allow types as subterms.

Type ::= TypeName | P Type | Type × ··· × Type

Here, TypeName denotes given sets [20] which are introduced in a so-called declaration part of specifications. Intuitively, P is the powerset operator and × denotes cross product.

Term ::= Const | Var | Tuple | Application

Const is used for function names; as for TypeNames, they are introduced in declaration parts of specifications. Variable names Var are introduced by the quantification of a formula (as in, e.g., ∀x : N • P).

Tuple ::= (Term, …, Term)

An n-tuple of terms (t1, …, tn), where n ≥ 1, is often abbreviated t; the type of (t1, …, tn) is T1 × ··· × Tn, where Ti is the type of ti.

Application ::= Term Term

where the first Term is of type P(T1 × T2) and the second Term has type T1; the type of the application is T2. We often write f(t) instead of “f t”.

Form ::= Term = Term | Term ∈ Type | Term ∈ Term | ¬Form | Form ∧ Form | Form ∨ Form | Form ⇒ Form | Q Var : Type • Form

where Q ∈ {∀, ∃, ∃1} (∃1 denoting unique existence). We also allow the formula Q x1 : T1; …; xn : Tn • P as an abbreviation for Q x1 : T1 • … Q xn : Tn • P, and if T1 = … = Tn = T, we also write Q x1, …, xn : T • P. Moreover, we always demand that all terms and all formulae must be well typed. So for example, for any formula t1 = t2, both terms t1 and t2 must have the same type.

A specification consists of a declaration and an axiom part, where the declaration part introduces all given sets (i.e., all TypeNames) and constants used, and the axiom part is a set of formulae.

2.2

The Semantics of Expressions

In the “classical semantics” described by Arthan [1], every expression is a member of its type. In our semantics, we include “undefined” expressions that are not members of their type, thus allowing function applications to “return a value” not a member of the function’s range type. For this purpose, we distinguish “having type T ” from “being a member of T ”. We formalize this as follows. Let Σ be a specification involving a type T . In an interpretation for Σ we assign a set T ∗ to T , constructed according to the form of T :


– If T is a given set, then T* is the union of two disjoint sets T⁺ ∪ T⁻, where T⁺ is assumed to be non-empty.
– If T is a product T1 × ··· × Tn, then T⁺ == T1⁺ × ··· × Tn⁺ and T* == T1* × ··· × Tn*.
– If T == P(T1), then T⁺ == P(T1⁺) and T* == P(T1*).

Informally, T⁺ may be interpreted as the defined values of type T. The assumption that T⁺ is non-empty ensures that there is at least one possible value for any application, and allows us to avoid treating the special case of an empty type. In the language of our models we use the same symbols P, ×, etc. as in F, since no confusion should arise. The symbol == is our metalogical equality.

We now define the total function App, which will be assigned to function applications. Let r be an element of P(T1* × T2*), i.e., a subset of T1* × T2*, and let x be an element of T1*.

App(r, x) == the unique y such that (x, y) ∈ r, if such a y exists
App(r, x) == some y in T2*, otherwise

App is defined so that it is consistent with the usual Z interpretation of application [20]. Note that App(r, x) = y ⇏ (x, y) ∈ r.

We are now able to define the meaning of F expressions in an interpretation I, under an assignment a to any occurring free variables. In the following, let T denote a type, P, Q denote formulae, x denote a variable, c denote a constant, s, t, ti denote terms, and f denote a term of type P(T × T′) for some T, T′. As the relationship between the symbol ∈ of F and membership in the models is not straightforward, we use ε for membership in the model language.

The interpretation of a term of type T is some value of T*. Only function application is given special treatment; the meaning of other terms is standard.

I(c)[a] == c_I, an element of T*, where T is the type of c
I(x)[a] == a(x), the value assigned to x by the function a
I((t1, …, tn))[a] == (I(t1)[a], …, I(tn)[a])
I(f t)[a] == App(I(f)[a], I(t)[a])

For a formula P, we always have I(P)[a] == True or I(P)[a] == False. The interpretation of equality and the propositional connectives is standard; only membership and quantification are given special treatment.

I(s = t)[a] == True iff I(s)[a] == I(t)[a]
I(s ∈ t)[a] == True iff I(s)[a] ε I(t)[a]
I(t ∈ T)[a] == True iff I(t)[a] ε T⁺
I(¬P)[a] == True iff I(P)[a] == False
I(P ∧ Q)[a] == True iff I(P)[a] == True and I(Q)[a] == True
I(P ∨ Q)[a] == True iff I(P)[a] == True or I(Q)[a] == True
I(P ⇒ Q)[a] == False iff I(P)[a] == True and I(Q)[a] == False
I(∀x : T • P)[a] == True iff I(P)[a^{e/x}] == True for all e ε T⁺
I(∃x : T • P)[a] == True iff I(P)[a^{e/x}] == True for some e ε T⁺
I(∃1 x : T • P)[a] == True iff I(P)[a^{e/x}] == True for one unique e ε T⁺


In the last three equations, e is assigned to any occurrences of x in P (i.e., a^{e/x}(x) = e and a^{e/x}(y) = a(y) for all y ≠ x). Note, in particular, that, under our semantics, the symbol “∈” does not represent true membership, but only membership of the “defined part” of any type. Similarly, the quantifiers only range over the defined parts of the respective types.

Example 1. If o is a constant of a type nats, and f is a function from nats to nats, then I(f(o) ∈ nats) == App(f_I, o_I) ε nats⁺. □

We may simulate our semantics in the classical semantics in the following way [11]. Let Σ be a specification with exactly the given sets T1, …, Tn. Then the declaration of each Ti is replaced by the declaration of a new given set Ti*. Subsequently, a declaration for each Ti is added asserting it to be a subset of Ti*. The rest of Σ remains unchanged. Now, under the classical semantics, every expression will return a value of its type Ti*; the “undefined” expressions are those that do not return a value of the subset Ti of their type. We may now define models in the usual way.

Definition 1 (Model). An interpretation I is a model of a specification Σ if all axioms in Σ are satisfied by I under all variable assignments a.

For example, let Σ be a specification involving the type nats, a member o of nats, two functions s and f from nats to nats, and the axioms {∀x : nats • ¬ x = s(x), f(o) = s(f(o))}. Then f(o) is of type nats, but the value of App(f_I, o_I) in any model of Σ will not be in nats⁺ in order to avoid violating the first axiom. Having defined which interpretations are models of a specification, we can now define consequence.

Definition 2 (Consequence). A formula P is a consequence of a specification Σ (or “valid”), denoted Σ ⊨ P, if every model of Σ satisfies P under all variable assignments.
In this paper, we are concerned not so much with the consequences as with the “inductive consequences” of specifications — though these two terms become synonymous if we include the appropriate “induction formulae” within a specification. Our goal is to present an induction principle that allows us to prove such inductive consequences. First, we clarify what we mean by this term in the context of specifications that may involve partial functions.
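The treatment of application and membership in Section 2.2 can be illustrated on a small finite model. The following Python sketch is an illustration only, under stated assumptions: a single sentinel value stands in for the undefined part of the carrier, “some y in the carrier” is resolved by always choosing that sentinel, and the names `UNDEFINED`, `app`, and `is_member` are hypothetical rather than part of the language F.

```python
# Illustrative finite model; not part of the formal development.
UNDEFINED = "undefined"                  # assumption: one value stands in for nats-minus

NATS_DEFINED = {0, 1, 2, 3}              # a finite stand-in for the defined part of nats
NATS_CARRIER = NATS_DEFINED | {UNDEFINED}

def app(r, x):
    """App(r, x): the unique y with (x, y) in r if it exists, else some carrier value."""
    images = {y for (x0, y) in r if x0 == x}
    if len(images) == 1:
        return next(iter(images))
    return UNDEFINED                     # assumption: the arbitrary choice is UNDEFINED

def is_member(value, defined_part):
    """The object-language membership symbol: membership of the defined part only."""
    return value in defined_part

succ = {(n, n + 1) for n in range(3)}    # a partial successor relation on {0, 1, 2}

inside = app(succ, 1)                    # application inside the domain
outside = app(succ, 3)                   # application outside the domain
```

In this sketch `outside` is a value of the carrier but not a member of the defined part, and the pair `(3, outside)` is not in `succ`, which mirrors the remark that App(r, x) = y does not imply (x, y) ∈ r.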

3

Inductive Reasoning

For our purposes, a free type is a given set whose elements are freely generated by a set of constructors [20]. For example, the elements of a type nats, representing the natural numbers, can be generated from the nullary constructor o and the unary constructor s. In Z, the free type nats would be introduced into a specification by the abbreviation nats ::= o | s⟨⟨nats⟩⟩. Such a statement would then be expanded into a declaration and a set of axioms. The declaration introduces the given set nats and the constants o of type nats and s of type P(nats × nats). The axioms assert that s is a total injection, that {o} and the range of s are disjoint, and that any subset of nats that includes o and is closed under s is the whole of nats. The latter axiom corresponds to a structural induction principle for nats. Sufficient conditions for the consistency of an arbitrary free type are outlined by Spivey [20]; the presentation of nats above satisfies these conditions. The details of the expansion for any free type may be found in [23]. For illustration, the axioms for nats are (equivalent to) the following formulae:

1. Membership: o ∈ nats, s ∈ P(nats × nats)
2. Total Function: ∀x : nats • ∃1 y : nats • (x, y) ∈ s
3. Injectivity: ∀x, y : nats • s(x) = s(y) ⇒ x = y
4. Disjointness: ∀x : nats • ¬ o = s(x)
5. Induction: ∀nats′ : P nats • o ∈ nats′ ∧ (∀x : nats • x ∈ nats′ ⇒ s(x) ∈ nats′) ⇒ ∀x : nats • x ∈ nats′
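The free type axioms listed above can be spot-checked on a bounded set of constructor terms. The Python sketch below is a hedged illustration: the class `Nat`, the helper `ground_terms`, and the bound of 5 are assumptions made for this example, and a finite check of this kind illustrates the injectivity and disjointness axioms without proving them.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Nat:
    """A constructor ground term: pred is None for o, otherwise the term is s(pred)."""
    pred: Optional["Nat"] = None

o = Nat()

def s(x: Nat) -> Nat:
    """The unary constructor s."""
    return Nat(x)

def ground_terms(bound: int):
    """Return the terms o, s(o), ..., s^bound(o)."""
    terms = [o]
    for _ in range(bound):
        terms.append(s(terms[-1]))
    return terms

terms = ground_terms(5)

# Injectivity: s(x) = s(y) implies x = y (checked on the bounded set only).
injective = all(x == y for x in terms for y in terms if s(x) == s(y))

# Disjointness: o is never of the form s(x).
disjoint = all(s(x) != o for x in terms)
```

Structural equality on the frozen dataclass plays the role of equality on constructor ground terms, so distinct terms are distinct objects of the model.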

Under our semantics, the meaning of the declaration and axioms associated with nats is that, in every model of the specification, nats⁺ must be isomorphic to the constructor ground term algebra generated by the constructors o and s. In other words, nats⁺ may contain only objects which occur as interpretations of constructor ground terms and, moreover, different constructor ground terms must be interpreted as different objects. This corresponds to the notion of initial algebras usually applied in inductive theorem proving, cf. e.g. [4,13,15,25,26,27].

The structural induction principle associated with any free type allows us to prove conjectures that hold for every element of the type. However, typically we wish to prove properties of a partial function on its defined cases only, as illustrated by the following example from [13].

Example 2.

nats ::= o | s⟨⟨nats⟩⟩
diff, quot : nats × nats → nats
────────
∀x : nats • diff(x, o) = x
∀x, y : nats • diff(s(x), s(y)) = diff(x, y)
∀y : nats • quot(o, y) = o
∀x, y : nats • quot(s(x), y) = s(quot(diff(s(x), y), y))

We use the usual Z bar notation to separate the declaration part of a specification from the axiom part. For types T and T′, we use the expression f : T → T′ to introduce a new constant f in the declaration of a specification and to denote the assumption that f is a “partial function” from T to T′. More precisely, the expansion of f ∈ T → T′ is f ∈ P(T × T′) ∧ ∀x : T; y, z : T′ • (x, y) ∈ f ∧ (x, z) ∈ f ⇒ y = z. Clearly, diff is explicitly defined only for x ≥ y and quot(x, y) is explicitly defined only if y is a divisor of x. Note that in the “classical” semantics there is no model of the quot specification respecting the semantics of free types, because quot(s(o), o) must be equal to s(quot(s(o), o)). However, our semantics solves this problem, because the interpretation of quot(s(o), o) is now a member of the carrier set nats* \ nats⁺. □

Note that we have not explicitly specified the domains of the functions diff and quot in the above example. Our approach to partiality thus differs from the more conventional one in which the equations defining a function are usually conditional on predicates that ensure that the function is assigned explicit values only for arguments within its domain. In this conventional approach, the value of a function application is always a member of its type, this value simply being left unspecified for arguments outside of the function’s domain. This approach thus models underspecified rather than partial functions. In contrast, our approach allows a function application to be undefined for arguments outside of the function’s domain. This makes our approach significantly more expressive, allowing a more general class of consistent specifications, and providing several other advantages for specification and reasoning. In particular, there are many important and practically relevant algorithms with undecidable domains. Typical examples are interpreters for programming languages and sound and complete calculi for first-order logic. For these algorithms, there do not exist any (recursive) predicates describing their domains. The conventional approach for modelling partial functions cannot handle such “real” partial functions.
In our framework, on the other hand, such algorithms can be expressed without difficulty, and, moreover, the proof technique described in this paper supports their verification [12,13]. More generally, our framework has the advantage that specifications can be formulated much more easily, since one does not have to determine the domains of functions. Consequently, our approach is well-suited to the early “loose” stages of specification when the function domains may be still unknown. Finally, our representation allows proofs which do not have to deal with definedness conditions, which makes (automated) reasoning much more efficient, cf. [18].

For those cases where diff and quot are (explicitly) defined it can be shown that the following conjectures follow from the above specification (if the specification is extended by appropriate definitions for + and ∗):

∀x, y : nats • diff(x, y) + y = x    (1)
∀x, y : nats • quot(x, y) ∗ y = x    (2)

The problem in trying to prove these conjectures is that the equations for diff and quot provide us with only sufficient conditions for these functions to be defined; we cannot infer that they are defined in only those cases. We may overcome this problem by adding suitable “closure axioms”. Whenever there is a model of the specification where a function application is undefined, these closure axioms eliminate all models where this function application would be defined. Examples of such closure axioms are the following:

∀x, y : nats • diff(x, y) ∈ nats ⇒ y = o ∨ ∃u, v : nats • x = s(u) ∧ y = s(v)
∀x, y : nats • quot(x, y) ∈ nats ⇒ (x = o ∨ ∃u : nats • x = s(u) ∧ quot(diff(s(u), y), y) ∈ nats ∧ diff(s(u), y) ∈ nats).

These closure axioms, the equations for diff and quot, and the free type axioms imply for m, n ∈ nats that diff(m, n) is not in nats if m is “smaller” than n, and that quot(m, n) is not in nats if m is not “divisible” by n. Most importantly, now the axioms imply our original conjectures in the forms

∀x, y : nats • diff(x, y) ∈ nats ⇒ diff(x, y) + y = x    (3)
∀x, y : nats • quot(x, y) ∈ nats ⇒ quot(x, y) ∗ y = x.    (4)
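Conjectures (3) and (4) can be checked on small arguments with an executable reading of the diff and quot equations. The Python sketch below is an illustration under stated assumptions: non-negative Python integers stand in for the constructor ground terms, `None` stands in for an undefined result, and the tests that return `None` are hand-coded readings of the closure axioms (a literal unfolding of the equations would not terminate for quot(s(x), o)).

```python
from typing import Optional

def diff(x: int, y: int) -> Optional[int]:
    """diff(x, o) = x; diff(s(x), s(y)) = diff(x, y); undefined otherwise."""
    while True:
        if y == 0:
            return x
        if x == 0:
            return None          # no equation applies to diff(o, s(y))
        x, y = x - 1, y - 1

def quot(x: int, y: int) -> Optional[int]:
    """quot(o, y) = o; quot(s(x), y) = s(quot(diff(s(x), y), y))."""
    count = 0
    while x != 0:
        if y == 0:
            return None          # quot(s(x), o) unfolds to itself forever
        remainder = diff(x, y)
        if remainder is None:
            return None          # the inner diff application is undefined
        x = remainder
        count += 1
    return count

BOUND = 12  # assumption: a small bound for an exhaustive spot check

conjecture_3 = all(
    diff(x, y) + y == x
    for x in range(BOUND) for y in range(BOUND)
    if diff(x, y) is not None
)
conjecture_4 = all(
    quot(x, y) * y == x
    for x in range(BOUND) for y in range(BOUND)
    if quot(x, y) is not None
)
```

The guards `is not None` play the role of the premises diff(x, y) ∈ nats and quot(x, y) ∈ nats; without them the checks would fail on arguments such as (2, 3) or (7, 3).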

We refer to specifications that consist only of free types, function declarations, and equations as equational. For such specifications Σ, the desired properties of closure axioms are given by the following definition.

Definition 3 (Closure Axioms). A set of closure axioms for an equational specification Σ is a set of formulae C consistent with Σ such that Σ ⊭ f(q1, …, qn) ∈ T implies Σ ∪ C ⊨ ¬(f(q1, …, qn) ∈ T), for each n-ary function f (whose application has type T) and each n-tuple of appropriately-typed constructor ground terms (q1, …, qn).

The addition of a set of closure axioms to a specification is referred to as the closure of the specification. In those cases where we assume that a specification includes all the relevant closure axioms, we will say that the specification is a closed system. For diff and quot, their above closure axioms may be derived automatically from their equations, but this is not so straightforward in general. For example, consider a function f : nats → nats “defined” by only the equation ∀x : nats • f(x) = f(x). Since this equation tells us nothing about the values returned by f, we infer that f is undefined for all m in nats, and the corresponding closure axiom must support this inference. An appropriate closure axiom is thus ∀x : nats • ¬ f(x) ∈ nats. However, it is not obvious how we may derive this closure axiom automatically from the given equation. Giesl et al. [6,7,14] have developed techniques for termination analysis of partial functions, which would easily find out the domains of such simple functions as f (and also quot and diff) automatically, but, in general, this is an undecidable problem. In fact, we will only use the (non-constructive) closure axioms to define our notion of partial validity. To prove partial validity in practice, we will introduce the proof technique of closure induction, which allows us to verify properties of partial functions without knowing their domains and without having to compute closure axioms explicitly.

Definition 4 (Partial Validity). For an equational specification Σ we say that a conjecture P is partially valid if Σ ∪ C ⊨ P holds for any set of closure axioms C.

In practice, the verification of partial validity of a conjecture is accomplished in two separate steps. The first is a proof of the f(x)-validity of a conjecture, which means that the conjecture is valid for all those instantiations of x where f(x) is defined. These proofs are supported by the principle of closure induction.

Definition 5 (f(x)-Validity). Let Σ be a specification involving the free types T1, …, Tn, T and the function f : T1 × ··· × Tn → T. Let x1, …, xn be variables of types T1, …, Tn, respectively, and let P be a quantifier-free formula.¹ We say that the conjecture ∀x1 : T1; …; xn : Tn • P is f(x)-valid, where x represents x1, …, xn, if² Σ ⊨ P(q1, …, qn) holds for every sequence q1, …, qn of constructor ground terms such that Σ ⊨ f(q1, …, qn) ∈ T.

The conjectures (1)-(4) are respectively diff(x, y)-valid and quot(x, y)-valid. For a closed system Σ, P is f(x)-valid iff Σ ⊨ ∀x1 : T1; …; xn : Tn • (f(x) ∈ T ⇒ P). It is clear that this notion of f(x)-validity does not make any sense for the classical semantics of “∈”: f(q1, …, qn) ∈ T holds automatically in that case, and thus f(x)-validity collapses to general (inductive) validity. The second step in proving partial validity of a conjecture P is a proof of

Σ ⊨ ∀x1 : T1; …; xn : Tn • ¬ f(x) ∈ T ⇒ P.    (5)

If (5) can be verified, then f (x)-validity of P implies that P is a consequence of each closure of Σ, and thus partially valid. To see this, let I be an interpretation that is a model of Σ ∪C and let q1 , . . . , qn be arbitrary constructor ground terms. We have to show that I is a model of P (q1 , . . . , qn ). If Σ |= f (q1 , . . . , qn ) ∈ T , then the claim follows from f (x)-validity of P . Otherwise, Σ 6|= f (q1 , . . . , qn ) ∈ T and hence, Σ ∪ C |= ¬(f (q1 , . . . , qn ) ∈ T ). As I is a model of Σ ∪ C, I satisfies ¬(f (q1 , . . . , qn ) ∈ T ) and by (5) we have that I is a model of P (q1 , . . . , qn ). We refer to Requirement (5) as the permissibility condition [13]. Note that if Σ is not a closed system, then proving (5) is, of course, not the same as proving for all constructor ground terms q1 , . . . , qn Σ 6|= f (q1 , . . . , qn ) ∈ T implies Σ |= P (q1 , . . . , qn ). 1 2

(6)

It does not matter if the xi do not occur in P , or if other variables do occur in P . We denote by P (q1 , . . . , qn ) the formula P with each variable xi replaced by qi .


(In fact, (6) implies (5), but not vice versa.) A proof of f(x)-validity and (6) would constitute a proof of the inductive validity of P (instead of just partial validity). Proving the permissibility condition becomes trivial if suitable hypotheses are included in the conjecture, as in the conjectures (3) and (4) and the conjecture of the following example.

Example 3. Suppose we have the free type A ::= a | b, the function f : A → A, and the single axiom f(a) = a. To prove that

    ∀x : A • f(x) ∈ A ⇒ f(x) = x    (7)

is partially valid we first prove its f(x)-validity. Since f is (explicitly) defined only for a, we have to show f(a) ∈ A ⇒ f(a) = a, which is clearly valid by the given axiom. We now prove the permissibility condition ∀x : A • ¬ f(x) ∈ A ⇒ (f(x) ∈ A ⇒ f(x) = x), which is also clearly valid. This completes the proof of (7)’s partial validity. □

Note that our logic is non-monotonic w.r.t. extensions of the specification. For example, f(b) ∈ A ⇒ f(b) = b is an instance of (7) and hence, it is partially valid. But adding f(b) = a subsequently to our specification would make f(b) ∈ A ⇒ f(b) = b and (7) false. (But note also that the non-monotonicity of our logic has the advantage that we never need any consistency checks, which are required in monotonic frameworks for partiality and which are difficult to automate for non-terminating functions.) We discuss this problem further in the next section.
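The two proof obligations of Example 3, and the non-monotonicity just noted, can be replayed on a small finite model. The following Python sketch is only an illustration under a simplifying assumption: the partial function f is encoded as a dictionary, and “f(x) ∈ A” is read as “x is a key of the dictionary”. The names `fx_valid`, `permissible` and `partially_valid` are hypothetical and do not come from the paper or from CADiZ.

```python
# Minimal model of Example 3 (an illustration, not the logic F itself).
# Assumption: "f(x) ∈ A" is modelled as "x is a key of the dict f".
A = ("a", "b")

def conjecture(f, x):
    """Instance of (7) at x: f(x) ∈ A ⇒ f(x) = x."""
    return x not in f or f[x] == x

def fx_valid(f):
    """f(x)-validity: (7) holds at every argument where f is defined."""
    return all(conjecture(f, x) for x in A if x in f)

def permissible(f):
    """Permissibility condition (5): (7) holds wherever f is undefined."""
    return all(conjecture(f, x) for x in A if x not in f)

def partially_valid(f):
    return fx_valid(f) and permissible(f)

original = {"a": "a"}               # the single axiom f(a) = a
extended = {"a": "a", "b": "a"}     # after adding the axiom f(b) = a

print(partially_valid(original))    # True
print(partially_valid(extended))    # False
```

In this encoding the permissibility check is vacuous because (7) carries its own definedness hypothesis, which mirrors the remark that the condition becomes trivial for such conjectures.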

4 Closure Induction

In principle, for f(x)-validity we have to consider infinitely many instantiations. To perform such proofs (automatically), we introduce the principle of closure induction. We restrict ourselves to equational specifications whose equations E are universally quantified over the (defined parts) of the respective types; we will frequently omit their quantifiers in the rest of this discussion.

Definition 6 (Equations Defining Functions). A subset E′ of E defines the function f if E′ consists of all equations from E of the form f(t1, ..., tn) = r.

Definition 7 (Closure Induction). Suppose that f : T1 × ··· × Tn → T (for free types T1, ..., Tn and T) is a declared function symbol defined by a set of equations of the form f(t11, ..., t1n) = r1, ..., f(tm1, ..., tmn) = rm such that each ri has a (possibly empty) set of subterms of the form {f(si1), ..., f(siki)}. Let P be a quantifier-free formula, let Γi be the (possibly empty) conjunction of the formulae P(sij) for j = 1 ... ki, and let ∀•F denote the universal closure of any formula F. The principle of closure induction is the following: “from the f(x)-validity of

    ∀•(Γ1 ⇒ P(t11, ..., t1n)) ∧ ... ∧ ∀•(Γm ⇒ P(tm1, ..., tmn))

infer the f(x)-validity of ∀x1 : T1; ...; xn : Tn • P.”

Note that closure induction directly corresponds to the techniques commonly used in inductive theorem proving (such as cover set induction or recursion analysis), cf. e.g. [5,9,19,25,27]. However, the important differences are that our induction principle also works for non-terminating partial functions (like the induction principle of [13]) and that it can be used in the framework of a simply-typed set-theoretic language (unlike the induction principle of [13]). As closure induction proves only f(x)-validity (and it can also be applied if f is partial), to verify that P is partially valid w.r.t. the specification, we must also prove the permissibility condition (5).

If we consider a specific function, say quot, we may express the induction principle and the associated permissibility condition in the language F itself:

    ∀p : ℙ(nats × nats) •
        (∀y : nats • (o, y) ∈ p
         ∧ ∀x, y : nats • ((diff(s(x), y), y) ∈ p ⇒ (s(x), y) ∈ p)
         ∧ ∀x, y : nats • (¬ quot(x, y) ∈ nats ⇒ (x, y) ∈ p))
        ⇒ (∀m : nats × nats • m ∈ p).

Thus, we show that a conjecture p holds if quot is explicitly defined, and that p also holds when quot is not defined. Since we have expressed the principle as an F formula, we may add it as an axiom to the specification Σ. This possibility of stating the induction rule on the object level is due to the expressiveness of our set-theoretic language (this was not possible in the first-order language of [13]).
Not only does this demonstrate that closure induction may be simulated within F, thus allowing a quite straightforward simulation of closure induction in CADiZ, without the need to implement the inference rule (at least for initial experimental purposes), but it also provides a partial solution to the problem of non-monotonicity. The problem is that while a new axiom may be consistent with the initially given axioms, it may not be consistent with some proven conjectures. Representing closure induction as an additional axiom eliminates this possibility, and makes more transparent to the specifier what properties are being assigned to each function. We now describe sufficient conditions under which closure induction is sound. Firstly, the arguments to each function definition must be “constructor terms”, and, secondly, if f (q1 , . . . , qn ) is equal to a type element q, then it must be “reducible” to q. For the formal expression of these conditions, we reinterpret a set of F equations as a set of rewrite rules. The next three definitions restate the required notions from the theory of term rewriting. For a detailed introduction to term rewriting see e.g. [2,10]. Definition 8 (Cbv-Rewriting). A rewrite rule has the form l → r, where l is a non-variable term, r is a term, and every variable in r also occurs in l.


We use the following restriction of rewriting to model a call-by-value (or “cbv”) evaluation strategy. For a set of rules R, let ⇒_R be the smallest relation such that s ⇒_R t holds if there is a rule l → r in R such that some subterm s′ of s is an instance lθ of l, for each variable x in l there is some constructor ground term q such that xθ ⇒*_R q, and t is s with (some occurrence of) s′ replaced by rθ. In this case, we say that the term s cbv-rewrites in one step to a term t via a set of rules R. A term s cbv-rewrites (or “cbv-reduces”) in zero or more steps to t if s ⇒*_R t, the notation ⇒*_R denoting the reflexive and transitive closure of ⇒_R.

Definition 9 (Constructor System). Let E be the equations defining a set of functions F. Provided that orienting E from left to right yields rewrite rules, we call these rules the rewrite system corresponding to E. A set of rules R is a constructor system if the proper subterms of the left-hand sides of R-rules are built from free-type function symbols (that is, “constructors”) and variables only.

Now we introduce a localized confluence (and termination) property depending on E.

Definition 10 (Type Convergence). Suppose that a specification Σ consists of a set of free types, a set of function declarations F, and a set of equations E defining the functions in F. If R is the set of rewrite rules corresponding to E, then we say that R is type convergent for the function f : T1 × ··· × Tn → T if, whenever Σ |= f(q1, ..., qn) = q for any constructor ground terms q1, ..., qn, q, then we have f(q1, ..., qn) ⇒*_R q; if this holds for all functions in F, then we say that R is type convergent.

Finally, we are able to present our main result.

Theorem 1 (Soundness of Closure Induction). Let R and f be as above. Then closure induction proves f(x)-validity if R is a constructor system that is type convergent for f.

A proof may be found in the Appendix. Informally, the argument is as follows.
If R is a type convergent constructor system for f , then, for each application f (q) that is equal to a constructor ground term, we can find a rule l → r such that l matches f (q) and the corresponding instances of any applications of f in r are smaller than f (q) with respect to some particular well-founded ordering. Consequently, in the application of closure induction to a formula P , we generate all the cases for which f is defined, and, for each such case, we assume instances of P that are smaller according to this well-founded ordering. Thus, if the hypotheses of closure induction are f (x)-valid, then so is the conclusion. That closure induction is unsound when the associated rewrite system is not type convergent is illustrated by the following. Example 4. Let E be {f (o) = o, ∀x : nats • f (x) = f (s(x))}.


We may prove via structural induction that ∀x : nats • f(x) = o follows from E. However, by closure induction we are also able to prove the clearly false conjecture ∀x : nats • f(x) ≥ x (for the usual definition of ≥), giving us ∀x : nats • o ≥ x. The proof proceeds as follows. The “base case” is f(o) ≥ o, which is obviously valid. In the “step case” we prove ∀x : nats • f(s(x)) ≥ s(x) ⇒ f(x) ≥ x. This may be reduced to ∀x : nats • f(x) ≥ s(x) ⇒ f(x) ≥ x by the second defining equation of f. But this is clearly valid by the usual properties of ≥ (since ∀x : nats • f(x) = o and thus, ∀x : nats • f(x) ∈ nats). We are left to prove the permissibility condition, which in this case is ∀x : nats • ¬ f(x) ∈ nats ⇒ f(x) ≥ x. But we know that ∀x : nats • f(x) = o holds, which is inconsistent with the hypothesis of this permissibility condition; thus, the condition holds trivially, and the conjecture is “proven”. □

The problem in this example is that, for each n > 0, f(s^n(o)) is equal to a constructor ground term, but not reducible to one via the rewrite system corresponding to the given axioms; this rewrite system is thus not type convergent. For the sound application of closure induction, whenever f(q) is defined, the attempted proof that the conjecture holds for q must rely on induction hypotheses that are smaller w.r.t. a well-founded relation; for a constructor system, type convergence ensures that this condition is satisfied.

That type convergence alone is insufficient for the soundness of closure induction is illustrated by the next example. Thus, one really needs both conditions, i.e., being a constructor system and type convergence.

Example 5. Let E be {∀x : nats • f(x) = g(f(x)), ∀x : nats • g(f(x)) = o, ∀x : nats • g(x) = x}. Obviously, the rewrite system R corresponding to E is type convergent, but it is not a constructor system. By closure induction, we can prove the false conjecture ∀x : nats • f(x) ∈ nats ⇒ f(x) = s(o). The induction formula is trivial (the induction hypothesis is equal to the induction conclusion) and the permissibility conjecture is also a tautology. □

The problem in this example is that while f(q) reduces to o for each constructor ground term q, the only possible such reduction in the given system is via an “expansion” step. Consequently, we again cannot construct a well-founded ordering that justifies the assumed induction hypothesis in the proposed proof, and we are not saved by a separate induction case for which the induction hypothesis can be so justified.

Finally, we give an example of the successful application of closure induction.


Example 6. Let nats and diff be as before. Suppose we wish to prove diff(x, y)-validity of

    ∀x, y : nats • diff(x, y) ∈ nats ⇒ diff(x, y) + y = x.    (3)

The rules represent a constructor system that is type convergent; we may thus apply closure induction. This involves proving ∀x : nats • diff(x, o) ∈ nats ⇒ diff(x, o) + o = x, which reduces to the reflexivity axiom ∀x : nats • x = x, and proving ∀x, y : nats • P(x, y) ⇒ P(s(x), s(y)), where P(r, t) denotes diff(r, t) ∈ nats ⇒ diff(r, t) + t = r. The proof of this second subgoal is also straightforward. To prove the partial validity of the original conjecture (3), we also need to prove the permissibility conjecture ∀x, y : nats • ¬ diff(x, y) ∈ nats ⇒ (diff(x, y) ∈ nats ⇒ diff(x, y) + y = x), but this is a tautology.

If we were to add the axiom ∀x, y : nats • diff(x, y) = diff(x, y) to our specification, then the associated rewrite system would still be a type convergent constructor system, and thus closure induction would still be applicable. Now an extra case would be included in which we assume the conjecture holds for (x, y) in the proof that it holds for (x, y); this clearly does not correspond to a well-founded ordering, but the diff(x, y)-validity of the conjecture will have been proven already by the other cases in the application of the closure induction. Thus, compared to the induction principle of [13], the present principle of closure induction has the advantage that it can also deal with overlapping equations. (Another advantage over that previous principle is that the requirement of type convergence is localized to the function under consideration, i.e., the rules need not be type convergent for other functions.) □

This example illustrates the fact that closure induction does not involve the construction of merely a “cover set” of cases in the sense of Bronsard et al. [8]. Instead it constructs all cases suggested by a function definition. Utilizing only sufficient rules to cover all cases would, in fact, be unsound. For example, using just the rule diff(x, y) → diff(x, y) to generate the induction cases would allow us to prove any conjecture, as (x, y) covers all possible pairs of type elements.
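The way closure induction reads one case off every defining equation (Definition 7) can be sketched mechanically. The Python fragment below is an illustration only, under stated assumptions: terms are encoded as nested tuples and variables as strings, the helper names are hypothetical, and the two equations for diff are the ones suggested by the subgoals of Example 6 (the original definition of diff is given earlier in the paper). The third equation is the overlapping axiom diff(x, y) = diff(x, y) discussed above.

```python
# Terms: a variable is a str; an application is a tuple (symbol, arg1, ..., argn).
def subterms(t):
    """Yield t and all of its subterms."""
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)

def show(t):
    """Render a term in the usual f(t1, ..., tn) notation."""
    if isinstance(t, str):
        return t
    if len(t) == 1:
        return t[0]
    return f"{t[0]}({', '.join(show(a) for a in t[1:])})"

def induction_cases(f, equations):
    """One case per equation f(t) = r: hypotheses P(s) for each subterm f(s) of r."""
    cases = []
    for lhs, rhs in equations:
        if lhs[0] != f:
            continue
        hyps = [s[1:] for s in subterms(rhs) if isinstance(s, tuple) and s[0] == f]
        cases.append((hyps, lhs[1:]))
    return cases

def show_case(hyps, concl):
    p = lambda args: f"P({', '.join(show(a) for a in args)})"
    return " ∧ ".join(p(h) for h in hyps) + (" ⇒ " if hyps else "") + p(concl)

# Assumed equations for diff, inferred from the subgoals of Example 6.
DIFF = [
    (("diff", "x", ("o",)), "x"),
    (("diff", ("s", "x"), ("s", "y")), ("diff", "x", "y")),
]
# The additional overlapping axiom diff(x, y) = diff(x, y).
OVERLAP = DIFF + [(("diff", "x", "y"), ("diff", "x", "y"))]

for hyps, concl in induction_cases("diff", OVERLAP):
    print(show_case(hyps, concl))
```

The sketch only enumerates the cases; it does not prove them. Dropping the first two equations from the input would leave the single case P(x, y) ⇒ P(x, y), which is the unsound cover-set situation described above.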

5 Definedness Rules

In general, closure induction is not always sufficient to prove f(x)-validity. In our example, to prove the quot(x, y)-validity of

    ∀x, y : nats • quot(x, y) ∈ nats ⇒ quot(x, y) ∗ y = x    (4)

we need to be able to make inferences about the definedness of function applications. For this, Giesl [13] has proposed definedness rules for functions; in the present context these take the form

    from f(t) ∈ T infer t1 ∈ T1 and ... and tn ∈ Tn

for any tuple of terms t and each n-ary function symbol. The condition that a set of rules is both a constructor system and type convergent is not sufficient to ensure that the above definedness rules may be applied soundly. For example, consider the type convergent constructor system {f(o) → o, f(o) → f(g(o))}, where o ∈ nats is given and f and g are partial functions from nats to nats. The formula f(g(o)) ∈ nats follows from this system, but g(o) ∈ nats does not. To characterize a class of rewrite systems where the definedness rules are sound, we propose a strengthening of the notion of type convergence.

Definition 11 (Complete Type Convergence). Let Σ be a specification which consists of a set of free types, a set of function declarations F, and a set of equations E defining the functions in F. If R is the set of rewrite rules corresponding to E, then we say that R is completely type convergent iff Σ |= t = q implies t ⇒*_R q for all ground terms t and all constructor ground terms q.

For example, the specification of diff and quot is completely type convergent. Note that here the semantics of the universal quantifier ∀ is crucial. Since it quantifies over only the objects of nats+, the specification does not imply equations like quot(o, quot(s(o), o)) = o.

The definedness rules are justified for completely type convergent constructor systems; the argument is as follows. Let C be a set of closure axioms for Σ, let x be the variables in t (of type Tx), let q be a constructor ground term tuple, and let [q/x] denote the substitution of x by q. If Σ |= f(t)[q/x] ∈ T, then we have Σ |= f(t)[q/x] = q for some constructor ground term q and thus, f(t)[q/x] ⇒*_R q due to the complete type convergence of R. Consequently, the terms t1[q/x], ..., tn[q/x] also cbv-rewrite to constructor ground terms (see Lemma 1 in the Appendix). It follows that Σ |= ti[q/x] ∈ Ti for each i, and thus

    Σ ∪ C |= f(t)[q/x] ∈ T ⇒ t1[q/x] ∈ T1 ∧ ... ∧ tn[q/x] ∈ Tn    (8)

holds. If, on the other hand, Σ ⊭ f(t)[q/x] ∈ T, then (8) holds again, since Σ ∪ C |= ¬ f(t)[q/x] ∈ T by the definition of closure axioms. As (8) holds for all constructor ground term tuples q, we finally obtain the desired result Σ ∪ C |= ∀x : Tx • f(t) ∈ T ⇒ t1 ∈ T1 ∧ ... ∧ tn ∈ Tn.

We may “simulate” these definedness rules too in F, in the following way. For every defining equation f(t) = r we add to our specification the implication ∀x : Tx • f(t) ∈ T ⇒ r′ ∈ T′ for every subterm r′ of r (of type T′).
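Generating these definedness implications from a defining equation is a purely syntactic step, which the following Python sketch illustrates. It rests on several assumptions: terms are nested tuples with variables as strings, every term has type nats, the recursive equation for quot is taken to be quot(s(x), y) = s(quot(diff(s(x), y), y)) (as suggested by the rewriting step in Example 7), and the function name `definedness_formulae` is hypothetical. Bare variables are skipped because their definedness is already given by the quantifier, and the leading quantifier itself is omitted from the output.

```python
# Terms: a variable is a str; an application is a tuple (symbol, arg1, ..., argn).
def subterms(t):
    """Yield t and all of its subterms, outermost first."""
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)

def show(t):
    """Render a term in the usual f(t1, ..., tn) notation."""
    if isinstance(t, str):
        return t
    if len(t) == 1:
        return t[0]
    return f"{t[0]}({', '.join(show(a) for a in t[1:])})"

def definedness_formulae(lhs, rhs, ty="nats"):
    """For f(t) = r, emit 'f(t) ∈ T ⇒ r′ ∈ T′' for each non-variable subterm r′ of r.

    Simplifying assumption: every term has the same type `ty`.
    """
    seen, out = set(), []
    for r in subterms(rhs):
        if isinstance(r, tuple) and r not in seen:
            seen.add(r)
            out.append(f"{show(lhs)} ∈ {ty} ⇒ {show(r)} ∈ {ty}")
    return out

# Assumed recursive equation for quot, inferred from Example 7.
QUOT_LHS = ("quot", ("s", "x"), "y")
QUOT_RHS = ("s", ("quot", ("diff", ("s", "x"), "y"), "y"))

for formula in definedness_formulae(QUOT_LHS, QUOT_RHS):
    print(formula)
```

Two of the four printed implications, those for quot(diff(s(x), y), y) and diff(s(x), y), are the ones that matter in the proof of conjecture (4).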


Example 7. For the quot system we obtain (besides others) the implications

    ∀x, y : nats • quot(s(x), y) ∈ nats ⇒ quot(diff(s(x), y), y) ∈ nats
    ∀x, y : nats • quot(s(x), y) ∈ nats ⇒ diff(s(x), y) ∈ nats.

Now, using these definedness formulae, we can indeed prove the conjecture (4) by closure induction. For instance, the first implication above is used in the following way. The proof of quot(x, y)-validity of (4) involves the proof of

    ... ⇒ (quot(s(x), y) ∈ nats ⇒ quot(s(x), y) ∗ y = s(x)).

By the definition of quot, this may be reduced to

    ... ⇒ (quot(s(x), y) ∈ nats ⇒ s(quot(diff(s(x), y), y)) ∗ y = s(x)).

We now wish to apply the definition of “∗” to the left-hand side of the equality; but for this to be possible, the property quot(diff(s(x), y), y) ∈ nats must hold. Fortunately, since we have the hypothesis quot(s(x), y) ∈ nats, the desired property does hold by the definedness formulae for quot. □

Of course, we need a method to ensure (complete) type convergence automatically. Let R be the set of rewrite rules corresponding to the equations E, where R is a constructor system. Moreover, let R′ = {lσ → rσ | l → r ∈ R, σ replaces all variables of l by constructor ground terms}. Then confluence of R′ implies complete type convergence of R. The reason is that

    Σ |= t = q  iff  E′ |= t = q    where E′ = {s1σ = s2σ | ∀x • s1 = s2 ∈ E, σ replaces x by constructor ground terms}
                iff  t ⇔*_R′ q      by Birkhoff’s theorem [3]
                iff  t ⇒*_R′ q      due to R′’s confluence and as R′ is a constructor system.

Finally, t ⇒*_R′ q of course implies t ⇒*_R q.

A sufficient condition for confluence of R′ is the requirement that the rules in R (i.e., the defining equations E of the specification) should be non-overlapping. In other words, for two different equations s1 = t1 and s2 = t2, the terms s1 and s2 must not unify. For example, the equations for diff and quot are non-overlapping. This sufficient criterion can easily be checked automatically. The reason for this requirement being sufficient for R′’s confluence is that ⇒_R′ is equal to the innermost rewrite relation for ground constructor systems R′, and having non-overlapping rules implies confluence of innermost reductions [16]. So compared to [13], the requirement of orthogonality is not needed due to the definition of cbv-rewriting.
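The non-overlap criterion amounts to a pairwise unification test on left-hand sides. The following Python sketch shows one way such a check could look; it is an illustration under stated assumptions rather than the check used by any particular tool. Terms are nested tuples, variables are strings, the left-hand sides for diff and quot are those suggested by Examples 6 and 7 and by the induction principle for quot, and all function names are hypothetical. Because the rules form a constructor system, only overlaps at the root of left-hand sides need to be considered here.

```python
def is_var(t):
    return isinstance(t, str)

def walk(t, sub):
    """Follow variable bindings in sub until an unbound variable or an application."""
    while is_var(t) and t in sub:
        t = sub[t]
    return t

def occurs(v, t, sub):
    t = walk(t, sub)
    if is_var(t):
        return t == v
    return any(occurs(v, a, sub) for a in t[1:])

def unify(a, b, sub=None):
    """Return a unifier of a and b as a dict, or None if they do not unify."""
    sub = {} if sub is None else sub
    a, b = walk(a, sub), walk(b, sub)
    if is_var(a) and is_var(b) and a == b:
        return sub
    if is_var(a):
        return None if occurs(a, b, sub) else {**sub, a: b}
    if is_var(b):
        return unify(b, a, sub)
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        sub = unify(x, y, sub)
        if sub is None:
            return None
    return sub

def rename(t, suffix):
    """Rename variables apart so that two rules never share a variable."""
    if is_var(t):
        return t + suffix
    return (t[0],) + tuple(rename(a, suffix) for a in t[1:])

def overlapping_pairs(lhss):
    """Index pairs (i, j), i < j, whose left-hand sides unify."""
    return [(i, j)
            for i in range(len(lhss)) for j in range(i + 1, len(lhss))
            if unify(rename(lhss[i], "_1"), rename(lhss[j], "_2")) is not None]

# Assumed left-hand sides of the defining equations for diff and quot.
LHS = [
    ("diff", "x", ("o",)),
    ("diff", ("s", "x"), ("s", "y")),
    ("quot", ("o",), "y"),
    ("quot", ("s", "x"), "y"),
]

print(overlapping_pairs(LHS))                           # []
print(overlapping_pairs(LHS + [("diff", "x", "y")]))    # [(0, 4), (1, 4)]
```

The second call adds the left-hand side of the axiom diff(x, y) = diff(x, y) from Example 6, which overlaps with both original diff equations. Such a system falls outside this syntactic criterion even though, as noted in Example 6, closure induction itself remains applicable to it.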

6 Conclusion

We have introduced a new “closure induction” principle in order to reason about specifications in a simply-typed Z-like set-theoretic language that includes partial functions. For this purpose, we adapted Giesl’s induction principle for partial functions [13]. While Giesl’s induction principle was tailored to functional programs with an eager evaluation strategy, in the present paper we adapted it to equational specifications of our set-theoretic language, and exhibited sufficient conditions in order to render this induction principle correct. In this process, we relaxed some of the assumptions Giesl made about his programs, showing that a sufficient condition for the soundness of our principle is that the rewrite system corresponding to the equations is a constructor system that is type convergent for the function under consideration. In order to employ the frequently necessary further rules for reasoning about definedness, we also have to demand complete type convergence. A sufficient syntactic criterion for complete type convergence (and thus type convergence) is that the rewrite rules corresponding to the equational definitions of a specification are a non-overlapping constructor system. Note that the use of a much more powerful language than Giesl’s partial functional programs enables us to express the induction principle within the language itself. This allows for an easy implementation of our principle and solves the non-monotonicity problem w.r.t. extensions of specifications.

For future work, we intend to find criteria for allowing non-equations in specifications, and we aim at relaxing the restriction to constructor systems. Moreover, while we do not impose the constraint that our equations are left-linear as in [13], at the moment we still have to restrict ourselves to non-overlapping equations to ensure complete type convergence; weaker criteria are needed to increase the applicability of our approach to a wider class of specifications. We plan also to consider extensions to cover conditional equations and to develop more specific techniques for nested or mutually recursive definitions.

A Proof of the Soundness Theorem for Closure Induction

Lemma 1. Let R be a constructor system. For all ground terms t and all constructor ground terms q, if t ⇒*_R q then each subterm of t can also be cbv-reduced to a constructor ground term.

Proof. Suppose t ⇒*_R q. We proceed by induction on the structure of t. If t is a constant then the lemma is obvious. Otherwise, t has the form f(t). If f is a constructor, then the lemma directly follows from the induction hypothesis. Otherwise, the reduction of t is as follows: f(t) ⇒* f(s) ⇒ rθ ⇒* q, where ti ⇒* si for all i and f(s) = lθ for a rule l → r. Here, l has the form f(u). By the definition of cbv-rewriting, xθ reduces to a constructor ground term for all variables x in u. As R is a constructor system, all ui are constructor terms and hence, each uiθ also reduces to a constructor ground term. Thus, as ti ⇒* si = uiθ, each ti reduces to a constructor ground term. For proper subterms of the ti, reducibility to a constructor ground term follows from the induction hypothesis. □
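Lemma 1 depends on the call-by-value restriction of Definition 8. The following Python sketch gives a small evaluator that follows one deterministic cbv strategy: arguments are reduced to constructor ground terms before any rule is applied at the root. It is meant as an illustration only, not as an implementation of the relation ⇒_R itself. The tuple encoding of terms, the `fuel` bound that cuts off long or non-terminating reductions, the helper names, and the two rules for diff (read off Example 6) are all assumptions.

```python
CONSTRUCTORS = {"o", "s"}   # constructors of the free type nats

def match(pat, term, sub):
    """Extend sub so that pat under sub equals the ground term, else None."""
    if isinstance(pat, str):                      # pattern variable
        if pat in sub:
            return sub if sub[pat] == term else None
        return {**sub, pat: term}
    if pat[0] != term[0] or len(pat) != len(term):
        return None
    for p, t in zip(pat[1:], term[1:]):
        sub = match(p, t, sub)
        if sub is None:
            return None
    return sub

def subst(t, sub):
    """Apply the substitution sub to the term t."""
    if isinstance(t, str):
        return sub[t]
    return (t[0],) + tuple(subst(a, sub) for a in t[1:])

def cbv_eval(term, rules, fuel=100):
    """Reduce a ground term to a constructor ground term, arguments first.

    Returns None if no such term is found within `fuel` rule applications.
    """
    budget = [fuel]

    def ev(t):
        args = []
        for a in t[1:]:
            v = ev(a)
            if v is None:          # an argument has no value, so neither has t
                return None
            args.append(v)
        t = (t[0], *args)
        if t[0] in CONSTRUCTORS:
            return t
        for lhs, rhs in rules:
            sub = match(lhs, t, {})
            if sub is not None and budget[0] > 0:
                budget[0] -= 1
                v = ev(subst(rhs, sub))
                if v is not None:
                    return v
        return None

    return ev(term)

def num(n):
    """The constructor ground term s^n(o)."""
    t = ("o",)
    for _ in range(n):
        t = ("s", t)
    return t

# Assumed rules for diff, and a one-rule system f(x) → o with no rule for g.
DIFF = [(("diff", "x", ("o",)), "x"),
        (("diff", ("s", "x"), ("s", "y")), ("diff", "x", "y"))]
ONLY_F = [(("f", "x"), ("o",))]

print(cbv_eval(("diff", num(2), num(1)), DIFF) == num(1))   # True
print(cbv_eval(("diff", num(0), num(1)), DIFF))             # None
print(cbv_eval(("f", ("g", ("o",))), ONLY_F))               # None
```

In the last call the rule f(x) → o matches f(g(o)) syntactically, but the evaluator refuses to apply it because the argument g(o) has no constructor ground value. This is the behaviour on which the lemma relies.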


Note that this result does not hold for usual rewriting (instead of cbv-rewriting). For example, via the constructor system f(x) → o the term f(g(o)) rewrites to o, though g(o) is irreducible. But when using cbv-rewriting, g(o) must be reducible to a constructor ground term in order to reduce f(g(o)) to o.

Definition 12 (Full Reduction in n Steps). Let R be a set of rewrite rules and let s, t be terms. We say that s cbv-reduces to t in n steps via R, denoted s ⇒_{R,n} t, if there is a rule l → r in R such that t is s with the subterm lθ replaced by rθ and if, for each variable xi (1 ≤ i ≤ j) occurring in l, xiθ “fully reduces” to some constructor ground term qi in ki steps via R, and if k1 + ··· + kj = n. We say that s fully reduces to t in n steps via R, denoted s ⇒^n_R t, if there is some t′ such that s ⇒_{R,i} t′ ⇒^j_R t, and i + j + 1 = n.

Thus, “full reduction” counts all the rule applications involved in the rewriting.

Lemma 2. Let R be a constructor system involving the function f : T1 × ... × Tn → T. We define the relation >_f over n-tuples of ground terms as follows: s >_f t iff there exists a constructor ground term q such that f s ⇒^i_R C[f t] ⇒^j_R q (where C denotes some context), i > 0, and there is no k < i + j and no constructor ground term p such that f s ⇒^k_R p. Then >_f is well founded.

Proof. Suppose s1 >_f s2 >_f s3 >_f ...; then f s1 ⇒^{i1}_R C1[f s2], f s2 ⇒^{i2}_R C2[f s3], ..., and C1[f s2], C2[f s3], ... all reduce to constructor ground terms. Since f s1 fully reduces to a constructor ground term in a minimum of i1 + j1 steps, the minimum number of steps for the full reduction of C1[f s2] is j1. But in that case f s2 fully reduces to some constructor ground term p in at most j1 steps, by Lemma 1. Thus, we have i1 + j1 > j1 ≥ i2 + j2 > j2 ≥ i3 + j3 > j3 ≥ ... But this is impossible. □

As a simple counterexample for the well-foundedness of the same relation without the minimality condition (i.e., without the requirement that f s ⇒^k_R p does not hold for k < i + j), consider R = {f(o) → o, f(o) → f(o)}. This set is a constructor system and f(o) ⇒^1_R f(o) ⇒^1_R o, but >_f would not be well founded, as we would have o >_f o.

Theorem 1 (Soundness of Closure Induction). Let Σ be a specification with free types and a set of (universally quantified) equations and let R be the corresponding rewrite system. Then closure induction proves f(x)-validity in Σ if R is a constructor system that is type convergent for f.

Proof. We wish to show that if the hypotheses of closure induction are f(x)-valid then so is the conclusion. Note that due to Lemma 1, a conjecture P is f(x)-valid iff Σ |= P(s1, ..., sn) holds for all those ground (rather than just constructor ground) terms s such that Σ |= f(s) ∈ T. If R is type convergent, then Σ |= f(s) ∈ T is equivalent to the existence of a constructor ground term q with f(s) ⇒*_R q.

Now suppose that the conclusion is false, that is, there is a term f(s1, ..., sn), where f(s1, ..., sn) ⇒*_R q for some ground constructor term q, such that the formula P(s1, ..., sn) is false, and that (s1, ..., sn) is minimal with respect to >_f among such n-tuples. Without loss of generality, let f(s1, ..., sn) ⇒*_R q be the minimal reduction of f(s1, ..., sn) to a constructor ground term. Since f is not a constructor, the reduction f(s1, ..., sn) ⇒*_R q must involve the application of a rule l → r ∈ R such that f(s′1, ..., s′n) is an instance lθ of l, where si ⇒*_R s′i for each si. Consequently, for any subterm f(t1, ..., tn) of rθ, we have that (s1, ..., sn) >_f (t1, ..., tn). But if all the P(t1, ..., tn) were valid, then so would be P(s′1, ..., s′n), by the hypotheses of closure induction, and hence P(s1, ..., sn) would be valid as well, since si ⇒*_R s′i. Thus, if l → r is a non-recursive rule, then we directly obtain a contradiction. Otherwise, one of the P(t1, ..., tn) must also be false, which contradicts the >_f-minimality of (s1, ..., sn). □

Acknowledgement. We would like to thank the anonymous referees for many helpful comments.

References

1. R. D. Arthan. Undefinedness in Z: Issues for specification and proof. In CADE-13 Workshop on Mechanisation of Partial Functions. New Brunswick, New Jersey, USA, 1996.
2. F. Baader and T. Nipkow. Term rewriting and all that. Cambridge University Press, 1998.
3. G. Birkhoff. On the structure of abstract algebras. Proc. Cambridge Philos. Soc., 31:433–454, 1934.
4. A. Bouhoula and M. Rusinowitch. Implicit induction in conditional theories. Journal of Automated Reasoning, 14:189–235, 1995.
5. R. S. Boyer and J S. Moore. A Computational Logic. Academic Press, 1979.
6. J. Brauburger and J. Giesl. Termination analysis by inductive evaluation. In Proc. CADE-15, LNAI 1421, pages 254–269. Springer, 1998.
7. J. Brauburger and J. Giesl. Approximating the domains of functional and imperative programs. Science of Computer Programming, 35:113–136, 1999.
8. F. Bronsard, U. S. Reddy, and R. W. Hasker. Induction using term orders. Journal of Automated Reasoning, 16:3–37, 1996.
9. A. Bundy, A. Stevens, F. van Harmelen, A. Ireland, and A. Smaill. Rippling: A heuristic for guiding inductive proofs. Artificial Intelligence, 62:185–253, 1993.
10. N. Dershowitz and J.-P. Jouannaud. Rewrite systems. In Handbook of Theoretical Computer Science, volume B, pages 243–320. North-Holland, 1990.
11. D. A. Duffy. On partial-function application in Z. In 3rd Northern Formal Methods Workshop, Ilkley, UK, 1998. Springer. http://www.ewic.org.uk/ewic/.
12. J. Giesl. The critical pair lemma: A case study for induction proofs with partial functions. Technical Report IBN 98/49, TU Darmstadt, 1998. http://www.inferenzsysteme.informatik.tu-darmstadt.de/~reports/notes/ibn-98-49.ps.
13. J. Giesl. Induction proofs with partial functions. Journal of Automated Reasoning, 2000. To appear. Preliminary version appeared as Technical Report IBN 98/48, TU Darmstadt, Germany. Available from http://www.inferenzsysteme.informatik.tu-darmstadt.de/~giesl/ibn-98-48.ps.
14. J. Giesl, C. Walther, and J. Brauburger. Termination analysis for functional programs. In W. Bibel and P. Schmitt, editors, Automated Deduction – A Basis for Applications, Vol. III, Applied Logic Series 10, pages 135–164. Kluwer, 1998.
15. J. A. Goguen, J. W. Thatcher, and E. G. Wagner. An initial algebra approach to the specification, correctness, and implementation of abstract data types. In R. T. Yeh, editor, Current Trends in Programming Methodology, volume 4. Prentice-Hall, 1978.
16. B. Gramlich. Abstract relations between restricted termination and confluence properties of rewrite systems. Fundamenta Informaticae, 34:3–23, 1995.
17. C. B. Jones. Partial functions and logics: A warning. Information Processing Letters, 54:65–67, 1995.
18. D. Kapur. Constructors can be partial, too. In R. Veroff, editor, Automated Reasoning and its Applications – Essays in Honor of Larry Wos, pages 177–210. MIT Press, 1997.
19. D. Kapur and M. Subramaniam. New uses of linear arithmetic in automated theorem proving by induction. Journal of Automated Reasoning, 16:39–78, 1996.
20. J. M. Spivey. The Z Notation: A Reference Manual, Second Edition. Prentice Hall, 1992.
21. I. Toyn. Z standard (draft). Available from the Department of Computer Science, University of York at http://www.cs.york.ac.uk/~ian/zstan, 1999.
22. I. Toyn. CADiZ. Available from the Department of Computer Science, University of York at the web address http://www.cs.york.ac.uk/~ian/cadiz/home.html, 2000.
23. I. Toyn, S. H. Valentine, and D. A. Duffy. On mutually recursive free types in Z. In Proceedings International Conference of Z and B Users, ZB2000, LNCS. Springer, 2000. To appear.
24. S. Valentine. Inconsistency and undefinedness in Z – a practical guide. In Proceedings 11th International Conference of Z Users, ZUM’98, LNCS 1493, pages 233–249. Springer, 1998.
25. C. Walther. Mathematical induction. In D. M. Gabbay, C. J. Hogger, and J. A. Robinson, editors, Handbook of Logic in Artificial Intelligence and Logic Programming, volume 2. Oxford University Press, 1994.
26. C.-P. Wirth and B. Gramlich. On notions of inductive validity for first-order equational clauses. In Proc. CADE-12, LNAI 814. Springer, 1994.
27. H. Zhang, D. Kapur, and M. S. Krishnamoorthy. A mechanizable principle of induction for equational specifications. In Proc. CADE-9, LNAI 310, pages 162–181. Springer, 1988.

Fuzzy Concepts and Formal Methods: A Fuzzy Logic Toolkit for Z

Chris Matthews 1,2,* and Paul A. Swatman 2

1 Division of Information Technology, School of Management, Technology and Environment, La Trobe University, P.O. Box 199, Bendigo 3552, Victoria, Australia. Phone: +61 3 54447350, Fax: +61 3 54447998, [email protected]
2 School of Management Information Sciences, Deakin University, Burwood, Victoria, Australia. Phone: +61 3 9244 6268, Fax: +61 3 9244 6928

Abstract. It has been recognised that formal methods are useful as a modelling tool in requirements engineering. Specification languages such as Z permit the precise and unambiguous modelling of system properties and behaviour. However, some system problems, particularly those drawn from the IS problem domain, may be difficult to model in crisp or precise terms. It may also be desirable that formal modelling should commence as early as possible, even when our understanding of parts of the problem domain is only approximate. This paper suggests fuzzy set theory as a possible representation scheme for this imprecision or approximation. We provide a summary of a toolkit that defines the operators, measures and modifiers necessary for the manipulation of fuzzy sets and relations. We also provide some examples of the laws which establish an isomorphism between the extended notation presented here and conventional Z when applied to boolean sets and relations.

Keywords: formal specification, Z, requirements determination, fuzzy set theory

1 Introduction

Formal methods are a set of tools that allow the development of a complete, precise and correct specification for system properties and behaviour. Although most commonly used in the specification of safety critical software, it has been argued that they can and should be applied to all stages of the systems development lifecycle including the specification of user requirements [41]. One commonly used specification language is Z [27,33] and it, together with its object-oriented successor, Object-Z [8,9], has been used as a basis for communication and requirements validation in the FOOM (Formal Object-Oriented Methodology) systems development methodology [34,35]. Z is a powerful analytical tool that facilitates system understanding through the development of a series of unambiguous, verifiable mathematical models [12,28]. These models can be used to predict system behaviour and to identify errors prior to implementation. Various levels of abstraction are possible. These may vary from a statement of requirements (to be used as a basis for communication and validation) to a more detailed and concrete software design document. Unlike some of the more informal graphical methods such as dataflow diagrams, Z is not open to differing interpretations, but instead allows the designer to prove, through rigorous mathematical reasoning, the properties of a specification [29].

However some system problems are not naturally understood in crisp or precise terms. For example soft or socio-organisational systems, whose major focus is the interaction of people with organisations, are social rather than technical in nature. A substantial body of research has revealed that people may have differing perspectives on organisational objectives, problems and functions [3,4,40]. There is no one problem definition waiting to be discovered, but instead the possibility of several equally relevant viewpoints depending on the participant. These multiple viewpoints may also be contradictory and could be characterised by imprecision, vagueness and uncertainty. Problem domains of this type are difficult to model using boolean set theory. Concepts or objects, whether they be individuals, organisational units, opinions etc., may not be easily or naturally categorised into precise groupings. Instead we may be more interested in the extent to which something resembles a type or in the relative ranking of something within a class or type rather than a precise description [39].

* The authors would like to thank Dr. Roger Duke (Dept of Computer Science and Electrical Engineering, University of Queensland, Aus.) and Mr. Steve Dunne (School of Computing and Mathematics, University of Teesside, U.K.) for their constructive comments and suggestions made during the preparation of this paper.

J.P. Bowen et al. (Eds.): ZB 2000, LNCS 1878, pp. 491–510, 2000. © Springer-Verlag Berlin Heidelberg 2000
For example it might be more realistic to think of the level of experience of a particular employee rather than attempting to distinguish between an experienced and a non-experienced employee. Rather than classifying a person as one of a group of people holding a particular set of opinions, it might be more useful to consider to what extent the person resembles people holding those views. It may be more natural to consider the degree to which a set of factors influence a particular decision rather than only distinguishing between those that have no influence and those that have total influence. There are many examples similar to these and in many cases they arise in system problems involving human decision making and judgement [44].

It has been argued that a formal approach should be introduced as early as possible [34,35] in requirements determination. Formally expressed models can provide an unambiguous and precise expression of a client's requirements. They allow the specifier and the client to share a common understanding of the problem and enable issues of ambiguity and uncertainty to be identified and resolved as early as possible. However the use of a formal specification language such as Z requires that we are able to categorise objects and concepts into precise types as the specification is developed. This may present a problem if imprecision, uncertainty and vagueness are inherent to the problem domain. There is a danger that we will lose part of what we are attempting to represent. It may also be desirable to develop specifications as early as possible to identify and resolve the contradictory and imprecise aspects of a particular problem perspective. It would be useful if we were able to express some of this imprecision in the formal model itself and then refine the model as these issues are clarified, rather than having to resolve all uncertainty or imprecision prior to the development of a specification. One could imagine both our precise and our approximate understanding of parts of a problem domain being expressed in the same formal model. If this were to be the case then the specification language would require the syntax and semantics to capture and represent imprecision and/or approximation. Fuzzy set theory may offer one such possibility.

1.1 Fuzzy Sets

Fuzzy set theory and fuzzy logic provide a mathematical basis for representing and reasoning with knowledge in uncertain and imprecise problem domains. Unlike boolean set theory where set membership is crisp (i.e. an element is either a member or it isn't), the underlying principle in fuzzy set theory is that an element is permitted to exhibit partial membership in a set. Fuzzy set theory allows us to represent the imprecise concepts (e.g. motivated employees, high profits and productive workers) which may be important in a problem domain within an organizational context. The common set operators such as negation, union and intersection all have their fuzzy equivalents and measures for fuzzy subsetness and fuzzy set entropy have been proposed [16,17,22]. Fuzzy logic deals with degrees of truth and provides a conceptual framework for approximate rather than exact reasoning. The truth of propositions such as "a few employees are motivated" or "productive workers lead to high profits" can be estimated and reasoned with [45,46].

Fuzzy set theory and fuzzy logic have been successfully applied to the development of industrial control systems [37] and commercial expert systems [10]. Fuzzy set theory, and related theories such as possibility theory [7], have been suggested as appropriate analytical tools in the Social Sciences [31,32]. The idea that over-precision in measurement instruments may present a methodological problem in psychological measurement has led to developments such as a fuzzy graphic rating scale for the measurement of occupational preference [11], fuzzy set based response categories for marketing applications [38] or a fuzzy set importance rating scheme for personnel selection [1]. Fuzzy set theory and fuzzy logic have also been used to model group decision making, particularly when the preference for one set of options over another is not clear cut [14].

The use of fuzzy propositions to capture the elasticity of 'soft' functional requirements has been proposed as a technique for modelling imprecision during requirements engineering for knowledge-based system development [20]. Research has also indicated that there may be some compatibility between fuzzy set theory and the meanings that we as humans place on the linguistic terms that are normally used to describe such sets [26,30,47,48]. This suggests that modelling techniques based on fuzzy set theory may lead to models that are closer to our cognitive processes and models than those based on boolean set theory.

1.2 Motivation

The motivation for our current research can be summarised as follows. Given that there are some system problems, particularly those drawn from a socio-organisational context, that are not naturally modelled or understood in precise or crisp terms and given that we wish to retain the benefits of a specification language such as Z as a method for communication and validation, is it possible to build into the existing syntax the necessary semantics to capture the uncertainty, imprecision or vagueness characteristic of such systems? Fuzzy set theory is an established technique for representing uncertainty and imprecision and can be seen as a generalisation of boolean or crisp set theory. Given that Z is a set based specification language then it should be possible to provide a notation that incorporates fuzzy set ideas within the language itself while at the same time retaining the precision of any Z model.

This paper is concerned with the development of a suitable fuzzy set notation within the existing Z syntax. It is assumed that the existing schema calculus and logical structures of Z remain. We have developed a toolkit which defines the set operators, measures and modifiers necessary for the manipulation of fuzzy sets [25]. The current version of the toolkit extends that previously presented [23] to concepts related to fuzzy relations as well as fuzzy sets. We have developed generic definitions for the domain and range of a fuzzy relation as well as those for domain and range restriction, and anti-restriction. We have also developed generic definitions for the min-max and max-min composition operators for fuzzy relations. We have defined the relational inverse of a fuzzy relation and provided an abbreviation for the identity relation in terms of a fuzzy relation. We have also included a series of laws which establish an isomorphism between the extended notation presented here and conventional Z when applied to crisp sets (i.e. sets where the membership values are constrained to 0 or 1). The toolkit also identifies (and provides proofs for) the relevant laws from [33] that hold when partial set membership is permitted.

In this paper we present a summary of the toolkit, including some sample laws. The reader is referred to [25] for the current version of the complete toolkit and proofs for the laws presented here. Due to space constraints only an abbreviated motivation is possible here. A more detailed discussion with illustrative examples forms a companion paper [24]. The paper is organised as follows: Section 2 introduces the fuzzy set representation scheme used in the toolkit. The toolkit summary is presented in Section 3. In Section 4 an alternative fuzzy set representation scheme is discussed and the paper concludes with a brief reference to some possible application areas.

2 A Possible Fuzzy Set Representation in Z

A fuzzy set, $\mu$, can be represented as a mapping from a reference set, $X$, to the real number interval $[0, 1]$.

$$\mu : X \rightarrow [0, 1]$$

The membership of each element of the reference set is given by $\mu(x)$, where $x \in X$. A fuzzy set can be imagined as a set of ordered pairs. Crisp sets are those where the membership values, $\mu(x)$, are constrained to either 1 or 0 for all $x \in X$. We refer to sets where the set membership of each element is shown explicitly as being written in an extended notation. Crisp sets can be expressed in either the extended notation or in more conventional terms. For example, the set $A = \{(x_1, 1), (x_2, 1), (x_3, 0), (x_4, 0), (x_5, 1)\}$ could be written simply as $\{x_1, x_2, x_5\}$.

Z allows the definition of relations and functions between sets, and provides the necessary operations to manipulate them. If the reference set is considered to be a basic type within a Z specification then a fuzzy set can be defined as a total function from the reference set to the interval [0,1]. However Z does not define the fuzzy set operators for union, intersection and set difference, or the fuzzy set modifiers such as not, somewhat and very which are needed for fuzzy set and fuzzy set membership manipulation. The toolkit provides the generic definitions, axiomatic descriptions and abbreviations for these operators, modifiers and measures.

We have used $\min\{\mu_1(x), \mu_2(x)\}$ and $\max\{\mu_1(x), \mu_2(x)\}$ to determine the membership of a reference set element $x$ in the intersection and union of the two fuzzy sets, $\mu_1$ and $\mu_2$. These operators are well established and preserve many of the properties of their boolean equivalents [2,15,19]. When applied to crisp sets, min and max behave in the same way as do the existing operators, $\cap$ and $\cup$, for boolean sets.

Two general principles have guided the preparation of the toolkit.

1. Where applicable we permit as much 'fuzziness' as possible. For example, rather than defining generalised union and intersection in terms of a crisp set of fuzzy sets, they are defined in terms of a fuzzy set of fuzzy sets. The domain and range restriction (and anti-restriction) for a fuzzy relation by a fuzzy rather than a crisp set is permitted and so on.

2. When applied to crisp sets in the extended notation the toolkit definitions, abbreviations and descriptions must be isomorphic with conventional Z. Two functions, $\mathcal{P}$, which maps sets of type $\mathbb{P}\,T$ in the conventional notation to a fuzzy set of type $\mathcal{F}\,T$, and $\mathcal{Q}$, which maps those reference set elements having full membership in a fuzzy set to a power set in the conventional notation, are included and are used in the proofs which attempt to establish this isomorphism.

The basic fuzzy set operators for intersection ($\mathit{and}$), union ($\mathit{or}$), complement ($\mathit{not}$) and difference ($\overset{F}{\setminus}$), together with the set membership relations, $\mathit{in}$ and $\mathit{notin}$, are defined as generally as possible. They are defined across partial functions of type $T \pfun [0, 1]$, rather than being restricted to the fuzzy set formulation, $T \rightarrow [0, 1]$. This allows us to use them when defining concepts such as the degree of fuzziness or the fuzzy entropy of a set. In these cases we are only interested in those reference set elements that exhibit some set membership, i.e. in a partial rather than a total function. The intersection, union and set difference of any two fuzzy sets are only defined when the partial function domains are equal. When used in a specification the operators will typically be applied to fuzzy sets where the function domains are always the total reference set.

We recognise that in many cases there are alternative, and perhaps simpler, ways of expressing the definitions that follow. We have tried to use a style which explicitly shows set membership in terms of the total function definition for a fuzzy set. For example when representing the membership of an element $t$ in a fuzzy set $\mathit{fun}$, we use $\mathit{fun}(t)$. When defining the functions $\mathcal{P}$ and $\mathcal{Q}$, we represent the membership of $t$ in the crisp set $\mathit{fun}$ as either $\mathit{fun}(t) = 1$ or $0$ rather than as $(t, 1)$ or $(t, 0)$ and so on. The toolkit has been type checked using the type checking software ZTC (see footnote 1) and can be used together with the default mathematical toolkit to type check specifications written in the extended notation.
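The representation described in this section can be tried out informally. The following Python sketch is only an illustration and is not part of the toolkit: it assumes a finite reference set, stores a fuzzy set as a dictionary from element to membership value, and uses hypothetical helper names (`to_conventional`, `fuzzy_union`, `fuzzy_intersection`) and a hypothetical second set `B`.

```python
# A fuzzy set over a finite reference set, stored as a dict from element
# to membership value in [0, 1] (a total function on the reference set).

# The crisp set A from the text, written in the extended notation.
A = {"x1": 1, "x2": 1, "x3": 0, "x4": 0, "x5": 1}

# A second crisp set over the same reference set (hypothetical values).
B = {"x1": 0, "x2": 1, "x3": 1, "x4": 0, "x5": 0}


def to_conventional(crisp):
    """Return the conventional form: the elements with full membership."""
    return {x for x, m in crisp.items() if m == 1}


def fuzzy_union(mu1, mu2):
    """Pointwise max of two fuzzy sets over the same reference set."""
    return {x: max(mu1[x], mu2[x]) for x in mu1}


def fuzzy_intersection(mu1, mu2):
    """Pointwise min of two fuzzy sets over the same reference set."""
    return {x: min(mu1[x], mu2[x]) for x in mu1}


print(sorted(to_conventional(A)))                         # ['x1', 'x2', 'x5']
print(sorted(to_conventional(fuzzy_union(A, B))))         # ['x1', 'x2', 'x3', 'x5']
print(sorted(to_conventional(fuzzy_intersection(A, B))))  # ['x2']
```

On crisp inputs the max and min operators give the same result as the ordinary set union and intersection, which is the behaviour the text describes.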

3 The Toolkit Summary

The Z draft standard provides a given set, $\mathbb{A}$, which can be used to specify number systems [27,36]. We are assuming that the set of real numbers, $\mathbb{R}$, is defined as a subset of $\mathbb{A}$. We are also assuming that a division operator $/$ and the functions sqrt, abs, min and max have been defined for the set of real numbers.

3.1 Some Basic Definitions

Set membership is measured using the real number interval [0,1].

$$\mathcal{M} == \{\, r : \mathbb{R} \mid 0 \leq r \leq 1 \,\}$$

The generic symbol $\mathcal{F}$ defines a fuzzy set as a total function from a reference set (type) to the real number interval $\mathcal{M}$.

$$\mathcal{F}\,T == T \rightarrow \mathcal{M}$$

The generic symbol $\mathcal{C}$ constrains the membership values of elements in a fuzzy set to $\{0, 1\}$. Sets of this type are crisp sets written in the extended notation.

$$\mathcal{C}\,T == T \rightarrow \{0, 1\}$$

The function $\mathcal{P}$ is used to map a power set of type $T$ (in the conventional notation) to a fuzzy set, $\mathcal{F}\,T$, in the extended notation.

$$
\begin{array}{l}
[T] \\
\mathcal{P} : \mathbb{P}\,T \rightarrow \mathcal{C}\,T \\
\hline
\forall t : T;\ \mathit{set} : \mathbb{P}\,T \bullet \\
\quad (t \in \mathit{set} \Leftrightarrow \mathcal{P}(\mathit{set})(t) = 1) \land (t \notin \mathit{set} \Leftrightarrow \mathcal{P}(\mathit{set})(t) = 0)
\end{array}
$$

The function $\mathcal{Q}$ is used to map those reference set elements having a membership of 1 in a fuzzy set of type $\mathcal{F}\,T$ to a power set of type $T$ (in the conventional notation). This is sometimes referred to as the core of a fuzzy set [13].

$$
\begin{array}{l}
[T] \\
\mathcal{Q} : \mathcal{F}\,T \rightarrow \mathbb{P}\,T \\
\hline
\forall t : T;\ \mathit{fun} : \mathcal{F}\,T \bullet t \in \mathcal{Q}(\mathit{fun}) \Leftrightarrow \mathit{fun}(t) = 1
\end{array}
$$

The toolkit also defines the support set, $\mathcal{S}$, of a fuzzy set in a similar fashion. The support set is the set of those reference set elements that exhibit some membership in the fuzzy set [13,21]. $\mathcal{S}$ is defined as a function mapping a set of type $\mathcal{F}\,T$ to that of type $\mathbb{P}\,T$. For sets $\mathit{fun1}, \mathit{fun2}$ of type $\mathcal{C}\,T$ the following hold:

$$\mathcal{Q}\,\mathit{fun1} = \mathcal{Q}\,\mathit{fun2} \Leftrightarrow \mathit{fun1} = \mathit{fun2} \qquad (1)$$

$$\mathcal{Q}\,\mathit{fun1} = \mathcal{S}\,\mathit{fun1} \qquad (2)$$

Footnote 1. ZTC: A Type Checker for Z Notation, Version 2.01, May 1995 (Xiaoping Jia, Division of Software Engineering, School of Computer Science, Telecommunication, and Information Sciences, DePaul University, Chicago, Illinois, USA).

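The α-cut of a fuzzy set is the set of reference set elements whose membership values are greater than or equal to α, where 0 ≤ α ≤ 1. A strict α-cut is one where the membership values are greater than α. The notation used here (i.e. [ ]α and [ ]α ) is taken from [19]. The concept of an α-cut is useful for a set-based rather than functional representation of a fuzzy set [42].

$$
\begin{array}{ll}
[T] & \\
[\,\_\,]_{\alpha} : \mathcal{F}\,T \times \mathcal{M} \rightarrow \mathbb{P}\,T & \mbox{(α-cut)} \\
[\,\_\,]_{\alpha} : \mathcal{F}\,T \times \mathcal{M} \rightarrow \mathbb{P}\,T & \mbox{(strict α-cut)} \\
\hline
\forall \alpha : \mathcal{M};\ \mathit{fun} : \mathcal{F}\,T;\ t : T \bullet & \\
\quad t \in [\,\mathit{fun}\,]_{\alpha} \Leftrightarrow \mathit{fun}(t) \geq \alpha \ \land & \mbox{(α-cut)} \\
\quad t \in [\,\mathit{fun}\,]_{\alpha} \Leftrightarrow \mathit{fun}(t) > \alpha & \mbox{(strict α-cut)}
\end{array}
$$

The extended support set, $\mathcal{ES}$, of a fuzzy set is a partial function of type $T \pfun \mathcal{M}$ containing only that part of the fuzzy set where set membership is greater than zero.

$$
\begin{array}{l}
[T] \\
\mathcal{ES} : \mathcal{F}\,T \rightarrow (T \pfun \mathcal{M}) \\
\hline
\forall \mathit{fun} : \mathcal{F}\,T \bullet \mathcal{ES}(\mathit{fun}) = \mathit{fun} \nrres \{0\}
\end{array}
$$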
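The basic definitions above can be mirrored for a finite reference set. The Python sketch below is an informal illustration under that assumption, not the toolkit itself: fuzzy sets are dictionaries, the function names follow the toolkit symbols where possible, and `alpha_cut` with its `strict` flag as well as the sample set `motivated` are hypothetical.

```python
def P(subset, reference):
    """Map a conventional set to a crisp set in the extended notation."""
    return {t: (1 if t in subset else 0) for t in reference}


def Q(fun):
    """Core: the elements whose membership is exactly 1."""
    return {t for t, m in fun.items() if m == 1}


def S(fun):
    """Support: the elements whose membership is greater than 0."""
    return {t for t, m in fun.items() if m > 0}


def alpha_cut(fun, alpha, strict=False):
    """Elements with membership >= alpha, or > alpha for a strict cut."""
    if strict:
        return {t for t, m in fun.items() if m > alpha}
    return {t for t, m in fun.items() if m >= alpha}


def ES(fun):
    """Extended support: fun with all zero-membership pairs removed."""
    return {t: m for t, m in fun.items() if m != 0}


T = ["a", "b", "c", "d"]
motivated = {"a": 1.0, "b": 0.5, "c": 0.25, "d": 0.0}  # hypothetical values

print(sorted(Q(motivated)))               # ['a']
print(sorted(S(motivated)))               # ['a', 'b', 'c']
print(sorted(alpha_cut(motivated, 0.5)))  # ['a', 'b']
```

For a crisp set such as `P({"a", "c"}, T)` the core and the support coincide, which corresponds to law (2).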


The toolkit also defines an inverse extended support function, $\mathcal{EF}$, which forms a fuzzy set from the extended support set. A non-empty fuzzy set is one where at least one reference set element has a membership greater than zero.

$$\mathcal{F}_1\,T == \{\, F : \mathcal{F}\,T \mid \exists\, t : T \bullet F(t) > 0 \,\}$$

A finite fuzzy set (i.e. of type $\mathcal{F}\,T$) is one that has a finite number of elements with membership greater than zero.

$$\mathit{finite}\,\mathcal{F}\,T == \{\, F : \mathcal{F}\,T \mid F \nrres \{0\} \in \mathbb{F}(T \times \mathcal{M}) \,\}$$

A non-empty finite fuzzy set, $\mathit{finite}\,\mathcal{F}_1$, is also defined in the toolkit. It is one where there is at least one element with a membership value greater than zero. The following is a definition for the total membership relation for an extended notation set. A zero membership relation, $\mathit{notin}$, is also provided and is defined in a similar way using $\{0\}$ as the range restricting set.

$$
\begin{array}{l}
[T] \\
\_\ \mathit{in}\ \_ : T \leftrightarrow (T \pfun \mathcal{M}) \\
\hline
\forall t : T;\ \mathit{fun} : (T \pfun \mathcal{M}) \bullet t\ \mathit{in}\ \mathit{fun} \Leftrightarrow t \in \mathrm{dom}(\mathit{fun} \rres \{1\})
\end{array}
$$

3.2 Some Set Measures

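An empty set, $\mathit{empty}$, of type $\mathcal{F}\,T$ is one where all membership values are zero. The empty set is a finite fuzzy set.

$$
\begin{array}{l}
[T] \\
\mathit{empty} : \mathcal{F}\,T \\
\hline
\forall t : T \bullet \mathit{empty}(t) = 0
\end{array}
$$

A universal set, $\mathcal{U}$, of type $\mathcal{F}\,T$ is defined in the toolkit as one where all membership values are one. The cardinality of a fuzzy set is defined as the sum of the membership values [18]. It only has meaning for a finite fuzzy set. $\mathit{counter}$ sums the membership values within sets of type $T \ffun \mathcal{M}$ and $\mathit{count}$ restricts the summation to membership values greater than zero in a finite fuzzy set.

$$
\begin{array}{l}
[T] \\
\mathit{counter} : (T \ffun \mathcal{M}) \rightarrow \mathbb{R} \\
\hline
\forall \mathit{fun} : (T \ffun \mathcal{M}) \bullet \\
\quad (\forall t : \mathrm{dom}\,\mathit{fun} \bullet \mathit{counter}(\mathit{fun}) = \mathit{fun}(t) + \mathit{counter}(\{t\} \ndres \mathit{fun})) \\
\quad \land\ \mathit{counter}\,\emptyset = 0
\end{array}
$$

$$
\begin{array}{l}
[T] \\
\mathit{count} : (\mathit{finite}\,\mathcal{F}\,T) \rightarrow \mathbb{R} \\
\hline
\forall \mathit{fun} : \mathit{finite}\,\mathcal{F}\,T \bullet \mathit{count}(\mathit{fun}) = \mathit{counter}(\mathcal{ES}(\mathit{fun}))
\end{array}
$$

3.3 Some Set Operators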

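Set union and intersection for sets of type $T \pfun \mathcal{M}$.

$$
\begin{array}{l}
[T] \\
\_\ \mathit{or}\ \_ : (T \pfun \mathcal{M}) \times (T \pfun \mathcal{M}) \rightarrow (T \pfun \mathcal{M}) \\
\hline
\forall \mathit{fun1}, \mathit{fun2} : (T \pfun \mathcal{M}) \bullet \\
\quad \mathrm{dom}(\mathit{fun1}\ \mathit{or}\ \mathit{fun2}) = \mathrm{dom}\,\mathit{fun1} \cap \mathrm{dom}\,\mathit{fun2} \ \land \\
\quad \forall t : \mathrm{dom}(\mathit{fun1}\ \mathit{or}\ \mathit{fun2}) \bullet (\mathit{fun1}\ \mathit{or}\ \mathit{fun2})(t) = \max\{\mathit{fun1}(t), \mathit{fun2}(t)\}
\end{array}
$$

$$
\begin{array}{l}
[T] \\
\_\ \mathit{and}\ \_ : (T \pfun \mathcal{M}) \times (T \pfun \mathcal{M}) \rightarrow (T \pfun \mathcal{M}) \\
\hline
\forall \mathit{fun1}, \mathit{fun2} : (T \pfun \mathcal{M}) \bullet \\
\quad \mathrm{dom}(\mathit{fun1}\ \mathit{and}\ \mathit{fun2}) = \mathrm{dom}\,\mathit{fun1} \cap \mathrm{dom}\,\mathit{fun2} \ \land \\
\quad \forall t : \mathrm{dom}(\mathit{fun1}\ \mathit{and}\ \mathit{fun2}) \bullet (\mathit{fun1}\ \mathit{and}\ \mathit{fun2})(t) = \min\{\mathit{fun1}(t), \mathit{fun2}(t)\}
\end{array}
$$

The complement of a set of type $T \pfun \mathcal{M}$.

$$
\begin{array}{l}
[T] \\
\mathit{not} : (T \pfun \mathcal{M}) \rightarrow (T \pfun \mathcal{M}) \\
\hline
\forall \mathit{fun} : T \pfun \mathcal{M} \bullet \\
\quad \mathrm{dom}(\mathit{not}\ \mathit{fun}) = \mathrm{dom}\,\mathit{fun} \ \land \\
\quad \forall t : \mathrm{dom}\,\mathit{fun} \bullet (\mathit{not}\ \mathit{fun})(t) = 1 - \mathit{fun}(t)
\end{array}
$$

Set difference for fuzzy sets.

$$
\begin{array}{l}
[T] \\
\_ \overset{F}{\setminus} \_ : (T \pfun \mathcal{M}) \times (T \pfun \mathcal{M}) \rightarrow (T \pfun \mathcal{M}) \\
\hline
\forall \mathit{fun1}, \mathit{fun2} : T \pfun \mathcal{M} \bullet \mathit{fun1} \overset{F}{\setminus} \mathit{fun2} = \mathit{fun1}\ \mathit{and}\ (\mathit{not}\ \mathit{fun2})
\end{array}
$$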
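As an informal check of the four operator definitions above, the following Python sketch implements them over dictionaries that stand in for partial functions of type T to M. It is a sketch only: the function names (`f_or`, `f_and`, `f_not`, `f_diff`) and the sample sets `tall` and `heavy` are hypothetical, and the membership values are chosen to be exactly representable in binary floating point so that equality comparisons are safe.

```python
def f_or(f1, f2):
    """Fuzzy union: pointwise max on the common domain."""
    return {t: max(f1[t], f2[t]) for t in f1.keys() & f2.keys()}


def f_and(f1, f2):
    """Fuzzy intersection: pointwise min on the common domain."""
    return {t: min(f1[t], f2[t]) for t in f1.keys() & f2.keys()}


def f_not(f):
    """Fuzzy complement: 1 - membership, on the same domain."""
    return {t: 1 - m for t, m in f.items()}


def f_diff(f1, f2):
    """Fuzzy difference: f1 and (not f2)."""
    return f_and(f1, f_not(f2))


tall = {"a": 0.25, "b": 0.75, "c": 1.0}   # hypothetical fuzzy sets
heavy = {"a": 0.5, "b": 0.5, "c": 0.0}

print(f_or(tall, heavy) == {"a": 0.5, "b": 0.75, "c": 1.0})    # True
print(f_diff(tall, heavy) == {"a": 0.25, "b": 0.5, "c": 1.0})  # True

# A set and its complement can overlap when membership is partial.
print(f_and(tall, f_not(tall)) == {"a": 0.25, "b": 0.25, "c": 0.0})  # True
```

The last line shows the overlap between a fuzzy set and its complement: the intersection is not the empty fuzzy set unless every membership value is 0 or 1.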


These operators have the same meaning for crisp sets written in either the conventional or extended notation. $\forall\, \mathit{fun1}, \mathit{fun2} : \mathcal{C}\,T$,

$$\mathcal{Q}(\mathit{fun1}\ \mathit{and}\ \mathit{fun2}) = \mathcal{Q}\,\mathit{fun1} \cap \mathcal{Q}\,\mathit{fun2} \qquad (3)$$

$$\mathcal{Q}(\mathit{fun1}\ \mathit{or}\ \mathit{fun2}) = \mathcal{Q}\,\mathit{fun1} \cup \mathcal{Q}\,\mathit{fun2} \qquad (4)$$

$$\mathcal{Q}(\mathit{fun1} \overset{F}{\setminus} \mathit{fun2}) = \mathcal{Q}\,\mathit{fun1} \setminus \mathcal{Q}\,\mathit{fun2} \qquad (5)$$

The min and max operators are used for fuzzy set intersection and union to provide, on the one hand, a generalisation of boolean set theory and, on the other, to preserve as much of the existing mathematical structure as possible. They are commutative, associative, idempotent and distributive (see footnote 2). When using $1 - \mathit{fun}(t)$ as the membership of the reference set element $t$ in the complement of the fuzzy set $\mathit{fun}$, it has been shown that De Morgan's Laws hold for sets of type $\mathcal{F}\,T$ [2,21], i.e.

$$\mathit{not}(\mathit{fun1}\ \mathit{and}\ \mathit{fun2}) = \mathit{not}(\mathit{fun1})\ \mathit{or}\ \mathit{not}(\mathit{fun2}) \qquad (6)$$

$$\mathit{not}(\mathit{fun1}\ \mathit{or}\ \mathit{fun2}) = \mathit{not}(\mathit{fun1})\ \mathit{and}\ \mathit{not}(\mathit{fun2}) \qquad (7)$$

Only the law of the excluded middle is not valid in the fuzzy case. It only holds for sets of type $\mathcal{C}\,T$.

$$\mathit{fun}\ \mathit{and}\ \mathit{not}(\mathit{fun}) = \mathit{empty} \qquad (8)$$

This is expected and reflects the overlap between a fuzzy set and its complement.

The concepts of a generalised union, $\overset{F}{\bigcup}$, and of a generalised intersection, $\overset{F}{\bigcap}$, for a fuzzy set of fuzzy sets are also defined in the toolkit. Membership of a fuzzy set $\mathit{fun}$ in a fuzzy set of fuzzy sets, $A$, could be interpreted as indicating the degree to which $\mathit{fun}$ will take part in the union or the intersection. A constraint is placed on the definition of generalised intersection to ensure that it is only formed from those sets that exhibit some membership in the fuzzy set of fuzzy sets. When applied to a crisp set of fuzzy sets, generalised union and intersection have the same meaning as fuzzy set union and intersection (i.e. $\mathit{or}$ and $\mathit{and}$).

3.4 Fuzziness, Set Equality and Set Inclusion

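The definition for the degree of fuzziness of a finite fuzzy set is based on the concept of fuzzy set entropy [17,18]. The degree of fuzziness of a fuzzy set can be estimated by determining the degree of resemblance between the set and the complement. This definition only relates to those reference set elements that exhibit some membership in the fuzzy set. For a non-empty finite crisp set, the membership value of each element in the extended support set can only be one. The degree of fuzziness of such a set is zero.

Footnote 2. Alternative operator definitions such as product, bounded sum and mean have received some intuitive and practical support [5,31,48]. However they sacrifice some of these properties in the fuzzy case. For example the product operators (i.e. $\mathit{fun1}(t) * \mathit{fun2}(t)$ for intersection and $\mathit{fun1}(t) + \mathit{fun2}(t) - \mathit{fun1}(t) * \mathit{fun2}(t)$ for union) are neither idempotent nor distributive for partial set membership.

$$
\begin{array}{l}
[T] \\
\mathit{fuzzyEntropy} : (\mathit{finite}\,\mathcal{F}\,T) \rightarrow \mathbb{R} \\
\hline
\forall \mathit{fun} : \mathit{finite}\,\mathcal{F}\,T \bullet \\
\quad (\mathit{fun} = \mathit{empty} \Rightarrow \mathit{fuzzyEntropy}(\mathit{fun}) = 0) \ \land \\
\quad \left(\mathit{fun} \neq \mathit{empty} \Rightarrow \mathit{fuzzyEntropy}(\mathit{fun}) = \dfrac{\mathit{counter}(\mathcal{ES}(\mathit{fun})\ \mathit{and}\ \mathit{not}(\mathcal{ES}(\mathit{fun})))}{\mathit{counter}(\mathcal{ES}(\mathit{fun})\ \mathit{or}\ \mathit{not}(\mathcal{ES}(\mathit{fun})))}\right)
\end{array}
$$

A measurement of the degree of equality of two fuzzy sets.

$$
\begin{array}{l}
[T] \\
\_ \approx \_ : (\mathit{finite}\,\mathcal{F}\,T) \times (\mathit{finite}\,\mathcal{F}\,T) \rightarrow \mathcal{M} \\
\hline
\forall \mathit{fun1}, \mathit{fun2} : \mathit{finite}\,\mathcal{F}\,T \bullet \\
\quad ((\mathit{fun1} = \mathit{empty} \land \mathit{fun2} = \mathit{empty}) \Rightarrow (\mathit{fun1} \approx \mathit{fun2} = 1)) \ \land \\
\quad \left((\mathit{fun1} \neq \mathit{empty} \lor \mathit{fun2} \neq \mathit{empty}) \Rightarrow \mathit{fun1} \approx \mathit{fun2} = \dfrac{\mathit{count}(\mathit{fun1}\ \mathit{and}\ \mathit{fun2})}{\mathit{count}(\mathit{fun1}\ \mathit{or}\ \mathit{fun2})}\right)
\end{array}
$$

The expression $\mathit{count}(\mathit{fun1}\ \mathit{or}\ \mathit{fun2})$ can only be zero when both $\mathit{fun1}$ and $\mathit{fun2}$ are the empty fuzzy set (i.e. where all membership values of the reference set are zero). In this case $\mathit{fun1} = \mathit{fun2}$ and therefore the degree to which $\mathit{fun1}$ equals $\mathit{fun2}$ is one.

The following definitions for the subsetness relations for sets of type $\mathcal{F}\,T$ follow the original definitions [43] and define complete inclusion.

$$
\begin{array}{l}
[T] \\
\_ \overset{F}{\subseteq} \_ : (\mathcal{F}\,T) \leftrightarrow (\mathcal{F}\,T) \\
\_ \overset{F}{\subset} \_ : (\mathcal{F}\,T) \leftrightarrow (\mathcal{F}\,T) \\
\hline
\forall \mathit{fun1}, \mathit{fun2} : \mathcal{F}\,T \bullet \\
\quad ((\mathit{fun1} \overset{F}{\subseteq} \mathit{fun2}) \Leftrightarrow (\forall t : T \bullet \mathit{fun1}(t) \leq \mathit{fun2}(t))) \ \land \\
\quad ((\mathit{fun1} \overset{F}{\subset} \mathit{fun2}) \Leftrightarrow (\mathit{fun1} \overset{F}{\subseteq} \mathit{fun2} \land \mathit{fun1} \neq \mathit{fun2}))
\end{array}
$$

Subsetness has the same meaning for crisp sets written in either the conventional or extended notation. $\forall\, \mathit{fun1}, \mathit{fun2} : \mathcal{C}\,T$,

$$\mathit{fun1} \overset{F}{\subseteq} \mathit{fun2} \Leftrightarrow \mathcal{Q}\,\mathit{fun1} \subseteq \mathcal{Q}\,\mathit{fun2} \qquad (9)$$

$$\mathit{fun1} \overset{F}{\subset} \mathit{fun2} \Leftrightarrow \mathcal{Q}\,\mathit{fun1} \subset \mathcal{Q}\,\mathit{fun2} \qquad (10)$$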
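To make the measures above concrete, the following Python sketch computes the cardinality, the degree of fuzziness, the degree of equality and complete inclusion for small finite fuzzy sets. It is an illustration under stated assumptions rather than the toolkit definition: fuzzy sets are dictionaries over the same finite reference set, and the function names and sample values are hypothetical.

```python
def count(fun):
    """Cardinality: the sum of the non-zero membership values."""
    return sum(m for m in fun.values() if m > 0)


def fuzzy_entropy(fun):
    """Overlap of a set with its complement divided by their union,
    taken over the extended support (non-zero memberships) only."""
    es = [m for m in fun.values() if m > 0]
    if not es:
        return 0
    overlap = sum(min(m, 1 - m) for m in es)
    underlap = sum(max(m, 1 - m) for m in es)
    return overlap / underlap


def degree_of_equality(f1, f2):
    """count(f1 and f2) / count(f1 or f2); 1 when both sets are empty."""
    union = sum(max(f1[t], f2[t]) for t in f1)
    if union == 0:
        return 1
    return sum(min(f1[t], f2[t]) for t in f1) / union


def is_subset(f1, f2):
    """Complete inclusion: f1(t) <= f2(t) for every element t."""
    return all(f1[t] <= f2[t] for t in f1)


crisp = {"a": 1, "b": 0}
half = {"a": 0.5, "b": 0.5}
print(fuzzy_entropy(crisp))  # 0.0
print(fuzzy_entropy(half))   # 1.0
```

A crisp set has zero fuzziness, while a set in which every element has membership 0.5 is as fuzzy as possible under this measure.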

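A measure of the degree of subsetness defines a weaker form of set inclusion [17,18]. It is only defined for fuzzy sets that are countable, i.e. sets of type $\mathit{finite}\,\mathcal{F}\,T$.

$$
\begin{array}{l}
[T] \\
\_ \underset{\sim}{\subset} \_ : (\mathit{finite}\,\mathcal{F}\,T) \times (\mathit{finite}\,\mathcal{F}\,T) \rightarrow \mathcal{M} \\
\hline
\forall \mathit{fun1}, \mathit{fun2} : \mathit{finite}\,\mathcal{F}\,T \bullet \\
\quad \left((\mathit{fun1} \neq \mathit{empty}) \Rightarrow \left(\mathit{fun1} \underset{\sim}{\subset} \mathit{fun2} = \dfrac{\mathit{count}(\mathit{fun1}\ \mathit{and}\ \mathit{fun2})}{\mathit{count}(\mathit{fun1})}\right)\right) \ \land \\
\quad ((\mathit{fun1} = \mathit{empty}) \Rightarrow (\mathit{fun1} \underset{\sim}{\subset} \mathit{fun2} = 1))
\end{array}
$$

3.5 Set Modifiers and Fuzzy Numbers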

Fuzzy set modifiers transform the membership values of reference elements that exhibit partial membership in a set of type $\mathcal{F}\,T$. When applied to sets of type $\mathcal{C}\,T$ they have no effect. They are sometimes referred to as linguistic hedges and an analogy has been drawn between them and the action of adjectival modifiers such as very, generally etc. in natural language. The meaning ascribed to each and the mathematical description provided have been argued intuitively. The reader is directed to [5,30,31,44] for a detailed discussion of these issues. The definition below is for the concentration modifier, very.

$$
\begin{array}{l}
[T] \\
\mathit{very} : (\mathcal{F}\,T) \rightarrow (\mathcal{F}\,T) \\
\hline
\forall t : T;\ \mathit{fun} : (\mathcal{F}\,T) \bullet \mathit{very}(\mathit{fun})(t) = \mathit{fun}(t) * \mathit{fun}(t)
\end{array}
$$

The toolkit also provides a definition for the dilation modifier, somewhat, where the square root of the membership value is used.

Approximation hedges such as near(m), around(m) and roughly(m) can be modelled as fuzzy numbers. In a fuzzy system where $t$ represents some measured scalar variable, the fuzzy set parameters could be dependent on the magnitude of $t$ alone and scaled accordingly [5]. For example the degree to which 9.5 is around 10 may be considered to be the same as that to which 95 is around 100, 950 is around 1000 and so on. Differing scaling factors would be used for differing linguistic descriptions. The following definition models the approximation hedge near as a triangular fuzzy set centred about the positive real number $m$.

$$
\begin{array}{l}
\mathit{near} : \mathbb{R}^{+} \rightarrow (\mathcal{F}\,\mathbb{R}^{+}) \\
\hline
\forall p, m : \mathbb{R}^{+} \bullet \\
\quad (((p \leq (m - 0.15 * m)) \lor (p \geq (m + 0.15 * m))) \Rightarrow (\mathit{near}(m))(p) = 0) \ \land \\
\quad (((m - 0.15 * m) < p < (m + 0.15 * m)) \Rightarrow (\mathit{near}(m))(p) = 1 - \mathit{abs}((m - p)/(0.15 * m)))
\end{array}
$$

The toolkit also contains a similar definition for the approximation hedge around. In this case a scaling factor of 0.25 is used to increase the support set of the fuzzy number, suggesting that we would expect more numbers to be around $m$ than near to $m$. These definitions are provided only as an example and we recognise that the set parameters are subjective and may be dependent on the context in which the hedge is to be used. Gaussian or trapezoid set parameters could be added to the toolkit if necessary. The reader is directed to [2,5] for more detail on possible set parameters and the practical application of approximation hedges of this type in fuzzy systems.

The toolkit provides a generic definition for the fuzzy quantifier most. It is based on that in [14].

$$
\begin{array}{l}
\mathit{most} : \mathcal{M} \rightarrow \mathcal{M} \\
\hline
\forall m : \mathcal{M} \bullet \\
\quad ((0 \leq m < 0.3) \Rightarrow \mathit{most}(m) = 0) \ \land \\
\quad ((0.3 \leq m < 0.8) \Rightarrow \mathit{most}(m) = 2 * m - 0.6) \ \land \\
\quad ((m \geq 0.8) \Rightarrow \mathit{most}(m) = 1)
\end{array}
$$

3.6 Fuzzy Relations

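The generic symbols, $\overset{F}{\leftrightarrow}$ and $\overset{C}{\leftrightarrow}$, define fuzzy and crisp relations.

$$X \overset{F}{\leftrightarrow} Y == \mathcal{F}(X \times Y)$$

$$X \overset{C}{\leftrightarrow} Y == \mathcal{C}(X \times Y)$$

An identity relation can be defined, using the extended notation.

$$
\begin{array}{l}
[X] \\
\overset{F}{\mathrm{id}} : \mathcal{F}\,X \rightarrow (X \overset{F}{\leftrightarrow} X) \\
\hline
\forall \mathit{fun} : \mathcal{F}\,X;\ x_1, x_2 : X \bullet \\
\quad ((x_1 = x_2) \Rightarrow (\overset{F}{\mathrm{id}}\ \mathit{fun})(x_1 \mapsto x_2) = \mathit{fun}(x_1)) \ \land \\
\quad ((x_1 \neq x_2) \Rightarrow (\overset{F}{\mathrm{id}}\ \mathit{fun})(x_1 \mapsto x_2) = 0)
\end{array}
$$

This has the same meaning using either the extended or conventional notation.

$$\mathcal{Q}(\overset{F}{\mathrm{id}}\ \mathit{xset}) = \mathrm{id}(\mathcal{Q}(\mathit{xset})) \qquad (11)$$

where $\mathit{xset}$ is of type $\mathcal{C}\,X$. This definition for the identity relation is based on that found in [21]. A fuzzy relational inverse can also be defined.

$$
\begin{array}{l}
[X, Y] \\
\_^{-F} : (X \overset{F}{\leftrightarrow} Y) \rightarrow (Y \overset{F}{\leftrightarrow} X) \\
\hline
\forall R : (X \overset{F}{\leftrightarrow} Y);\ x : X;\ y : Y \bullet R^{-F}(y \mapsto x) = R(x \mapsto y)
\end{array}
$$

This has the same meaning using either the extended or conventional notation. $\forall\, R : X \overset{C}{\leftrightarrow} Y$,

$$\mathcal{Q}(R^{-F}) = (\mathcal{Q}\,R)^{\sim} \qquad (12)$$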
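The identity and inverse definitions above can be illustrated informally for finite sets. In the Python sketch below, which is an assumption-laden illustration and not the toolkit, a fuzzy relation is a dictionary keyed by pairs, and the names `fuzzy_id`, `fuzzy_inverse` and `core` (standing in for the function Q) are hypothetical.

```python
def fuzzy_id(fun):
    """Identity relation on a fuzzy set: (x, x) gets fun(x), all else 0."""
    return {(x1, x2): (fun[x1] if x1 == x2 else 0)
            for x1 in fun for x2 in fun}


def fuzzy_inverse(R):
    """Relational inverse: swap each pair, keep its membership."""
    return {(y, x): m for (x, y), m in R.items()}


def core(fun):
    """The role of Q: keys with membership exactly 1."""
    return {k for k, m in fun.items() if m == 1}


xset = {"a": 1, "b": 0}  # a crisp set in the extended notation
print(core(fuzzy_id(xset)))  # {('a', 'a')}

R = {("a", 1): 0.5, ("b", 2): 1.0}
print(fuzzy_inverse(R) == {(1, "a"): 0.5, (2, "b"): 1.0})  # True
```

For the crisp set `xset`, the core of the fuzzy identity is the ordinary identity relation on `{"a"}`, which is the content of law (11).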

3.7 Range and Domain for a Fuzzy Relation

The domain ($\overset{F}{\mathrm{dom}}$) and range ($\overset{F}{\mathrm{ran}}$) for fuzzy relations can be defined as fuzzy sets [6]. The membership of an element $x$ in the domain (and an element $y$ in the range) could be considered to be equal to the maximum of all memberships of the mappings $\{x\} \rightarrow Y$ (or $X \rightarrow \{y\}$) in the fuzzy relation, $X \overset{F}{\leftrightarrow} Y$. The definition for $\overset{F}{\mathrm{dom}}$ is shown below.

$$
\begin{array}{l}
[X, Y] \\
\overset{F}{\mathrm{dom}} : (X \overset{F}{\leftrightarrow} Y) \rightarrow \mathcal{F}\,X \\
\hline
\forall R : (X \overset{F}{\leftrightarrow} Y);\ x : X \bullet (\overset{F}{\mathrm{dom}}\ R)(x) = \max\{\, y : Y \bullet R(x \mapsto y) \,\}
\end{array}
$$

This has the same meaning for the domain of a crisp relation written either in the conventional or extended notation. $\forall\, R : X \overset{C}{\leftrightarrow} Y$,

$$\mathcal{Q}(\overset{F}{\mathrm{dom}}\ R) = \mathrm{dom}(\mathcal{Q}(R)) \qquad (13)$$

3.8 Range and Domain Restrictions (and Anti-restriction) for Fuzzy Relations

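A domain and range restriction for a fuzzy relation can also be defined. The membership of the maplet $x \mapsto y$ in the restricted fuzzy relation is given by the minimum of the original membership of $x \mapsto y$ in the relation, $R$, and the membership of $x$ or $y$ in the restricting set, $\mathit{xset}$ or $\mathit{yset}$.

$$
\begin{array}{l}
[X, Y] \\
\_ \overset{F}{\dres} \_ : (\mathcal{F}\,X) \times (X \overset{F}{\leftrightarrow} Y) \rightarrow (X \overset{F}{\leftrightarrow} Y) \\
\_ \overset{F}{\rres} \_ : (X \overset{F}{\leftrightarrow} Y) \times (\mathcal{F}\,Y) \rightarrow (X \overset{F}{\leftrightarrow} Y) \\
\hline
\forall x : X;\ y : Y;\ R : (X \overset{F}{\leftrightarrow} Y);\ \mathit{xset} : \mathcal{F}\,X;\ \mathit{yset} : \mathcal{F}\,Y \bullet \\
\quad (\mathit{xset} \overset{F}{\dres} R)(x \mapsto y) = \min\{R(x \mapsto y), \mathit{xset}(x)\} \ \land \\
\quad (R \overset{F}{\rres} \mathit{yset})(x \mapsto y) = \min\{R(x \mapsto y), \mathit{yset}(y)\}
\end{array}
$$

These have the same meaning for crisp relations written either in the conventional or extended notation. $\forall\, R : X \overset{C}{\leftrightarrow} Y,\ \mathit{xset} : \mathcal{C}\,X,\ \mathit{yset} : \mathcal{C}\,Y$,

$$\mathcal{Q}(\mathit{xset} \overset{F}{\dres} R) = \mathcal{Q}(\mathit{xset}) \dres \mathcal{Q}(R) \qquad (14)$$

$$\mathcal{Q}(R \overset{F}{\rres} \mathit{yset}) = \mathcal{Q}(R) \rres \mathcal{Q}(\mathit{yset}) \qquad (15)$$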
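The fuzzy domain described in Section 3.7 and the restrictions above can be sketched for finite sets as follows. This Python fragment is an informal illustration only: relations are dictionaries keyed by pairs and defined on every pair of X and Y, and the function names, the sets `X` and `Y`, and the relation `R` (employees against tasks) are hypothetical. The range function is written by analogy with the domain, following the prose description.

```python
def fuzzy_dom(R, X, Y):
    """Membership of x is the largest membership of any pair (x, y)."""
    return {x: max(R[(x, y)] for y in Y) for x in X}


def fuzzy_ran(R, X, Y):
    """Membership of y is the largest membership of any pair (x, y)."""
    return {y: max(R[(x, y)] for x in X) for y in Y}


def dom_restrict(xset, R):
    """Fuzzy domain restriction: min of R(x, y) and xset(x)."""
    return {(x, y): min(m, xset[x]) for (x, y), m in R.items()}


def ran_restrict(R, yset):
    """Fuzzy range restriction: min of R(x, y) and yset(y)."""
    return {(x, y): min(m, yset[y]) for (x, y), m in R.items()}


X = ["e1", "e2"]
Y = ["t1", "t2"]
R = {("e1", "t1"): 0.75, ("e1", "t2"): 0.25,
     ("e2", "t1"): 0.0, ("e2", "t2"): 0.5}

print(fuzzy_dom(R, X, Y))  # {'e1': 0.75, 'e2': 0.5}
print(fuzzy_ran(R, X, Y))  # {'t1': 0.75, 't2': 0.5}
```

Restricting by a fuzzy set caps each pair's membership at the membership of its first (or second) component in the restricting set, so a restricting membership of 1 leaves the pair unchanged and 0 removes it.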


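The toolkit also defines a domain and range anti-restriction for a fuzzy relation. The membership of the maplet $x \mapsto y$ in the anti-restricted fuzzy relation is given by the minimum of the original membership of $x \mapsto y$ in the relation, $R$, and the membership of $x$ or $y$ in the complement of the anti-restricting set, $\mathit{xset}$ or $\mathit{yset}$. The definitions are very similar to those for $\overset{F}{\dres}$ and $\overset{F}{\rres}$, with $(\mathit{not}\ \mathit{xset})(x)$ and $(\mathit{not}\ \mathit{yset})(y)$ replacing $\mathit{xset}(x)$ and $\mathit{yset}(y)$ respectively.

3.9 The max-min Relational Composition Operator for Fuzzy Relations

$$
\begin{array}{l}
[X, Y, Z] \\
\_ \overset{F}{\comp} \_ : (X \overset{F}{\leftrightarrow} Y) \times (Y \overset{F}{\leftrightarrow} Z) \rightarrow (X \overset{F}{\leftrightarrow} Z) \\
\_ \overset{F}{\circ} \_ : (Y \overset{F}{\leftrightarrow} Z) \times (X \overset{F}{\leftrightarrow} Y) \rightarrow (X \overset{F}{\leftrightarrow} Z) \\
\hline
\forall R_1 : (X \overset{F}{\leftrightarrow} Y);\ R_2 : (Y \overset{F}{\leftrightarrow} Z);\ x : X;\ z : Z \bullet \\
\quad (R_1 \overset{F}{\comp} R_2)(x \mapsto z) = \max\{\, y : Y \bullet \min\{R_1(x \mapsto y), R_2(y \mapsto z)\} \,\} \ \land \\
\quad (R_1 \overset{F}{\comp} R_2)(x \mapsto z) = (R_2 \overset{F}{\circ} R_1)(x \mapsto z)
\end{array}
$$

There are two common composition operators for fuzzy relations, max-min and min-max [2]. Only max-min is shown here. The min-max operator would be defined by interchanging min and max in the above definition. This definition has the same meaning for crisp relations written in either the conventional or extended notation. $\forall\, R_1 : X \overset{C}{\leftrightarrow} Y$ and $R_2 : Y \overset{C}{\leftrightarrow} Z$,

$$\mathcal{Q}(R_1 \overset{F}{\comp} R_2) = \mathcal{Q}(R_1) \comp \mathcal{Q}(R_2) \qquad (16)$$

$$\mathcal{Q}(R_2 \overset{F}{\circ} R_1) = \mathcal{Q}(R_2) \circ \mathcal{Q}(R_1) \qquad (17)$$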
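A small numerical example may help with the max-min composition. The Python sketch below is illustrative only and assumes finite sets, with relations stored as dictionaries defined on every pair; the function name `max_min` and all sample values are hypothetical.

```python
def max_min(R1, R2, X, Y, Z):
    """Max-min composition: for each (x, z), take the best intermediate y,
    where a path through y is only as strong as its weaker link."""
    return {(x, z): max(min(R1[(x, y)], R2[(y, z)]) for y in Y)
            for x in X for z in Z}


X, Y, Z = ["x"], ["y1", "y2"], ["z"]
R1 = {("x", "y1"): 0.25, ("x", "y2"): 0.75}
R2 = {("y1", "z"): 1.0, ("y2", "z"): 0.5}

# Path via y1: min(0.25, 1.0) = 0.25; via y2: min(0.75, 0.5) = 0.5.
print(max_min(R1, R2, X, Y, Z))  # {('x', 'z'): 0.5}

# With memberships restricted to 0 and 1 the result agrees with
# ordinary relational composition.
C1 = {("x", "y1"): 1, ("x", "y2"): 0}
C2 = {("y1", "z"): 1, ("y2", "z"): 1}
print(max_min(C1, C2, X, Y, Z))  # {('x', 'z'): 1}
```

The second call reflects law (16): on crisp relations the pair (x, z) has full membership exactly when some y links x to z in both relations.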

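3.10 A Fuzzy Relational Image for Fuzzy Relations

$$
\begin{array}{l}
[X, Y] \\
\_ \overset{F}{\limg} \_ \overset{F}{\rimg} : (X \overset{F}{\leftrightarrow} Y) \times \mathcal{F}\,X \rightarrow \mathcal{F}\,Y \\
\hline
\forall R : (X \overset{F}{\leftrightarrow} Y);\ \mathit{set} : \mathcal{F}\,X;\ y : Y \bullet \\
\quad (R \overset{F}{\limg} \mathit{set} \overset{F}{\rimg})(y) = \max\{\, x : X \bullet \min\{R(x \mapsto y), \mathit{set}(x)\} \,\}
\end{array}
$$

This definition is based on that given in [21] and has the same meaning for crisp relations and sets using either the conventional or extended notation. $\forall\, R : X \overset{C}{\leftrightarrow} Y,\ \mathit{set} : \mathcal{C}\,X$ and $y : Y$,

$$\mathcal{Q}(R \overset{F}{\limg} \mathit{set} \overset{F}{\rimg}) = \mathcal{Q}(R) \limg \mathcal{Q}(\mathit{set}) \rimg \qquad (18)$$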
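The relational image can be worked through on a small example. The following Python sketch is an informal illustration, assuming finite sets and dictionary-based relations; `fuzzy_image` and the sample data (employees `e1`, `e2` against tasks `t1`, `t2`) are hypothetical names and values.

```python
def fuzzy_image(R, fset, X, Y):
    """Fuzzy relational image: for each y, the best x, where each
    candidate x contributes min(R(x, y), fset(x))."""
    return {y: max(min(R[(x, y)], fset[x]) for x in X) for y in Y}


X = ["e1", "e2"]
Y = ["t1", "t2"]
R = {("e1", "t1"): 0.75, ("e1", "t2"): 0.25,
     ("e2", "t1"): 0.0, ("e2", "t2"): 0.5}
experienced = {"e1": 0.5, "e2": 1.0}

# t1: max(min(0.75, 0.5), min(0.0, 1.0)) = 0.5
# t2: max(min(0.25, 0.5), min(0.5, 1.0)) = 0.5
print(fuzzy_image(R, experienced, X, Y))  # {'t1': 0.5, 't2': 0.5}
```

When both the relation and the set are crisp, the result has membership 1 at exactly those y related to some member of the set, which matches the conventional relational image in law (18).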


3.11 Fuzzy Functions

A series of fuzzy functions are defined in the toolkit. Each is a generalisation of the corresponding definition in the conventional notation [33]. Crisp versions of these are also defined, and it can be shown that they have the same meaning in either the conventional or extended notation [25].

A fuzzy partial function

$$X \mathrel{⇸^{F}} Y == \{\, R : X \leftrightarrow^{F} Y \mid \forall x : X;\ y_1, y_2 : Y \bullet (R(x \mapsto y_1) > 0) \land (R(x \mapsto y_2) > 0) \Rightarrow y_1 = y_2 \,\}$$

A fuzzy total function

$$X \rightarrow^{F} Y == \{\, R : X \mathrel{⇸^{F}} Y \mid \forall x : X \bullet (\mathrm{dom}^{F} R)(x) > 0 \,\}$$

A fuzzy partial injection

$$X \mathrel{⤔^{F}} Y == \{\, R : X \mathrel{⇸^{F}} Y \mid \forall x_1, x_2 : X;\ y : Y \bullet (R(x_1 \mapsto y) > 0 \land R(x_2 \mapsto y) > 0) \Rightarrow x_1 = x_2 \,\}$$

A fuzzy total injection

$$X \rightarrowtail^{F} Y == \{\, R : X \rightarrow^{F} Y \mid \forall x_1, x_2 : X;\ y : Y \bullet (R(x_1 \mapsto y) > 0 \land R(x_2 \mapsto y) > 0) \Rightarrow x_1 = x_2 \,\}$$

A fuzzy partial surjection

$$X \mathrel{⤀^{F}} Y == \{\, R : X \mathrel{⇸^{F}} Y \mid \forall y : Y \bullet (\mathrm{ran}^{F} R)(y) > 0 \,\}$$

A fuzzy total surjection

$$X \twoheadrightarrow^{F} Y == \{\, R : X \rightarrow^{F} Y \mid \forall y : Y \bullet (\mathrm{ran}^{F} R)(y) > 0 \,\}$$

A fuzzy bijection

$$X \mathrel{⤖^{F}} Y == \{\, R : X \rightarrowtail^{F} Y \mid \forall y : Y \bullet (\mathrm{ran}^{F} R)(y) > 0 \,\}$$
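The conditions in these definitions all refer to maplets with positive membership, so for finite data they can be checked on the support of the relation. The Python sketch below does this under two assumptions that are not taken from the toolkit: the relation is a dictionary from pairs to membership values, and the fuzzy domain (range) of an element is positive exactly when some maplet involving it has positive membership. All function names are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

FuzzyRelation = Dict[Tuple[Hashable, Hashable], float]


def support(relation: FuzzyRelation) -> Set[Tuple[Hashable, Hashable]]:
    """Pairs whose membership is strictly positive."""
    return {pair for pair, m in relation.items() if m > 0}


def is_partial_function(relation: FuzzyRelation) -> bool:
    """Each x is related, with positive membership, to at most one y."""
    seen = {}
    return all(seen.setdefault(x, y) == y for x, y in support(relation))


def is_total_function(relation: FuzzyRelation, xs: Iterable[Hashable]) -> bool:
    """A partial function in which every x has positive domain membership."""
    covered = {x for x, _ in support(relation)}
    return is_partial_function(relation) and set(xs) <= covered


def is_injective(relation: FuzzyRelation) -> bool:
    """Each y is reached, with positive membership, from at most one x."""
    seen = {}
    return all(seen.setdefault(y, x) == x for x, y in support(relation))


def is_surjective(relation: FuzzyRelation, ys: Iterable[Hashable]) -> bool:
    """Every y has positive range membership."""
    return set(ys) <= {y for _, y in support(relation)}


def is_bijection(relation: FuzzyRelation, xs, ys) -> bool:
    """A total injection that is also surjective."""
    return (is_total_function(relation, xs)
            and is_injective(relation)
            and is_surjective(relation, ys))


graded = {("a", 1): 0.7, ("b", 2): 0.2, ("a", 2): 0.0}  # ("a", 2) has degree 0
ambiguous = {("a", 1): 0.7, ("a", 2): 0.1}              # "a" maps to two values
```

In this sketch `graded` passes every check over the sets {"a", "b"} and {1, 2} because the zero-membership maplet is ignored, while `ambiguous` fails the partial function condition.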

4

Alternative Notation and Definitions

It is possible to visualise a fuzzy set other than as a total function. Reference has already been made to a set-based representation scheme based on a set of nested subsets formed from the α-cuts of a fuzzy set (see section 3.1 and [19,42]). An alternative is to use a vector notation to provide a geometric rather than algebraic representation [17,18]. For a finite reference set, X, containing n elements we could define a series of fuzzy subsets. Each could be represented as an n-dimensional vector, where the vector components are the membership values of the corresponding elements. Furthermore we could visualise the fuzzy subset geometrically as a point in an n-dimensional space (or hypercube). Crisp subsets are represented by those points at the vertices of the n-dimensional hypercube. The closer a point is to the centre of the hypercube, the fuzzier the set it represents. The point at the centre of the hypercube is the one where all vector components are equal to 0.5. The set of possible points represents the power set of X (i.e. the set of all subsets of X, fuzzy and crisp). This notation is useful as it provides an elegant visualisation of fuzzy set measures such as cardinality, fuzzy entropy and degree of subsetness. The notation is also simple, as only the membership values are represented and manipulated; the reference set elements are implied by the ordering of the vector components.

However, its usefulness as a representation scheme within a set-based, algebraic language such as Z appears limited. A vector notation is not a basic mathematical construct within Z, and would need to be defined. The notation does not explicitly define a mapping function from the reference set to the membership interval [0,1]. The membership vector for a particular fuzzy set would need to be enumerated rather than evaluated. Within a specification it may be necessary to evaluate the set membership of a particular reference set element. For example, we may require the membership value of an input observation, t, in a fuzzy set of type FT. This is easily done if the mapping function, T → [0, 1], is available and explicitly stated. The alternative is to enumerate the membership value of each reference set element. For a large reference set this would become tedious, and for infinite reference sets such as ℝ it would not be possible.
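The contrast between an enumerated membership vector and an evaluated membership function can be sketched in Python. Everything named here is an assumption for illustration: the `tall` membership function and its breakpoints are invented, cardinality is taken as the sum of memberships, and fuzzy entropy uses one common definition from the fuzzy-cube literature (the ratio of the distance to the nearest vertex over the distance to the farthest vertex), which may differ from the measure the cited sources intend.

```python
from typing import Callable, List, Sequence


def cardinality(vector: Sequence[float]) -> float:
    """Sum of the membership values (assumed sigma-count cardinality)."""
    return sum(vector)


def is_crisp(vector: Sequence[float]) -> bool:
    """Crisp sets sit at the hypercube vertices: every component is 0 or 1."""
    return all(m in (0.0, 1.0) for m in vector)


def fuzzy_entropy(vector: Sequence[float]) -> float:
    """Assumed measure: l1 distance to nearest vertex over distance to farthest."""
    near = sum(min(m, 1.0 - m) for m in vector)
    far = sum(max(m, 1.0 - m) for m in vector)
    return near / far  # 0 at a vertex, 1 at the centre (all components 0.5)


def tall(height_cm: float) -> float:
    """Hypothetical membership function over the reals (evaluated on demand)."""
    return min(1.0, max(0.0, (height_cm - 160.0) / 30.0))


def enumerate_vector(mu: Callable[[float], float],
                     reference: Sequence[float]) -> List[float]:
    """Vector form: only possible for a finite, ordered reference set."""
    return [mu(x) for x in reference]


reference = [150.0, 175.0, 190.0]
vector = enumerate_vector(tall, reference)  # [0.0, 0.5, 1.0]
```

The function `tall` can be evaluated for any real-valued observation, whereas `vector` only records the three listed heights and relies on their ordering to identify them, which is the limitation the paragraph above describes.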

5

Conclusion

We have suggested that some system problems, particularly those drawn from a socio-organisational context, are not naturally understood in crisp or precise terms. Fuzzy set theory is an established formalism for modelling the uncertainty and approximation characteristic of such systems. In this paper we have presented a summary of a possible fuzzy logic toolkit for Z. We are currently developing a series of examples intended to illustrate the usefulness of the toolkit. Included amongst these are examples that use fuzzy sets to represent naturally imprecise or graded concepts such as employee motivation and productivity, fuzzy relations to model service quality and client satisfaction in a health system, and fuzzy concepts as a general modelling tool for situations involving conflict and agreement. This work is ongoing and the results can be found in [24].

References

1. G.M. Allinger, S.L. Feinzig, and E.A. Janak. Fuzzy Sets and Personnel Selection: Discussion and an Application. Journal of Occupational and Organizational Psychology, 66:162–169, 1993.


2. G. Bojadziev and M. Bojadziev. Fuzzy Sets, Fuzzy Logic, Applications. World Scientific, Singapore, 1995.
3. P. Checkland. Systems Thinking, Systems Practice. John Wiley and Sons, Chichester, 1981.
4. P. Checkland and Jim Scholes. Soft Systems Methodology in Action. John Wiley and Sons, Chichester, 1990.
5. E. Cox. The Fuzzy Systems Handbook. AP Professional - Harcourt Brace & Company, Boston, 1994.
6. D. Dubois and H. Prade. Fuzzy Sets and Systems: Theory and Applications. Academic Press, Inc, 1980.
7. D. Dubois and H. Prade. Possibility Theory - An Approach to Computerised Processing of Uncertainty. Plenum Press, New York, 1988.
8. R. Duke, P. King, G. Rose, and G. Smith. The Object-Z Specification Language: Version 1. Technical Report 91-1, Dept of Computer Science, University of Queensland, 1991.
9. R. Duke, G. Rose, and G. Smith. Object-Z: A Specification Language Advocated for the Description of Standards. Computer Standards and Interfaces, 17:511–533, 1995.
10. I. Graham. Fuzzy Logic in Commercial Expert Systems - Results and Prospects. Fuzzy Sets and Systems, 40:451–472, 1991.
11. B. Hesketh, R. Pryor, and M. Gleitzman. Fuzzy Logic: Toward Measuring Gottfredson's Concept of Occupational Social Space. Journal of Counselling Psychology, 36(1):103–109, 1989.
12. J. Jacky. The Way of Z: Practical Programming with Formal Methods. Cambridge University Press, Cambridge, 1997.
13. Jyh-Shing Roger Jang, Chuen-Tsai Sun, and Eiji Mizutani. Neuro-Fuzzy and Soft Computing - A Computational Approach to Learning and Machine Intelligence. Prentice-Hall, Inc., New Jersey, 1997.
14. J. Kacprzyk, M. Fedrizzi, and H. Nurmi. Fuzzy Logic with Linguistic Quantifiers in Group Decision Making. In R.R. Yager and L.A. Zadeh, editors, An Introduction to Fuzzy Logic Applications in Intelligent Systems, pages 263–280. Kluwer Academic, 1992.
15. A. Kaufmann. Introduction to the Theory of Fuzzy Subsets, volume 1 - Fundamental Theoretical Elements. Academic Press, London, 1975.
16. J. Klir and D. Harmanec. Types and Measures of Uncertainty. In Janusz Kacprzyk, Hannu Nurmi, and Mario Fedrizzi, editors, Consensus under Fuzziness, pages 29–51. Kluwer Academic, 1997.
17. B. Kosko. Fuzziness vs. Probability. Int. J. General Systems, 17:211–240, 1990.
18. B. Kosko. Neural Networks and Fuzzy Systems. Prentice-Hall, New Jersey, 1992.
19. R. Kruse, J. Gebhardt, and F. Klawonn. Foundations of Fuzzy Systems. John Wiley & Sons, Chichester, 1994.
20. J. Lee and J. Yen. Specifying Soft Requirements of Knowledge-Based Systems. In R.R. Yager and L.A. Zadeh, editors, Fuzzy Sets, Neural Networks, and Soft Computing, pages 285–295. Van Nostrand Reinhold, New York, 1994.
21. R. Lowen. Fuzzy Set Theory: Basic Concepts, Techniques and Bibliography. Kluwer Academic, Dordrecht, 1996.
22. A. De Luca and S. Termini. A Definition of a Nonprobabilistic Entropy in the Setting of Fuzzy Set Theory. Information and Control, 20:301–312, 1972.
23. C. Matthews and P. A. Swatman. Fuzzy Z? In The Second Australian Workshop on Requirements Engineering (AWRE'97), pages 99–114. Macquarie University, Sydney, 1997.

Fuzzy Concepts and Formal Methods: A Fuzzy Logic Toolkit for Z


24. C. Matthews and P. A. Swatman. Fuzzy Concepts and Formal Methods: Some Illustrative Examples. Technical Report 1999:37, School of Management Information Systems, Deakin University, 1999.
25. C. Matthews and P. A. Swatman. Fuzzy Z - The Extended Notation (Version 0). Technical Report 1999:38, Rev 1, School of Management Information Systems, Deakin University, 1999.
26. S. E. Newstead. Quantifiers as Fuzzy Concepts. In T. Zetenyi, editor, Fuzzy Sets in Psychology, pages 51–72. Elsevier Science Publishers B.V, North-Holland, 1988.
27. J. Nichols (ed.). Z notation - version 1.3. Technical report, ISO, June 1998.
28. B. Potter, J. Sinclair, and D. Till. An Introduction to Formal Specification and Z. Prentice Hall International Series in Computer Science, Hemel Hempstead, second edition, 1996.
29. H. Saiedian. Formal Methods in Information Systems Engineering. In R. H. Thayer and M. Dorfman, editors, Software Requirements Engineering, pages 336–349. IEEE Computer Society Press, second edition, 1997.
30. K.J. Schmucker. Fuzzy Sets, Natural Language Computations, and Risk Analysis. Computer Science Press, Rockville, 1984.
31. M. Smithson. Fuzzy Set Analysis for Behavioral and Social Sciences. Springer-Verlag, New York, 1987.
32. M. Smithson. Ignorance and Uncertainty - Emerging Paradigms. Springer-Verlag, New York, 1988.
33. J.M. Spivey. The Z Notation: A Reference Manual. Prentice Hall International Series in Computer Science, Hemel Hempstead, second edition, 1992.
34. P. A. Swatman. Formal Object-Oriented Method - FOOM. In H. Kilov and W. Harvey, editors, Specification of Behavioural Semantics in Object-Oriented Information Systems, pages 297–310. Kluwer Academic, 1996.
35. P. A. Swatman and P. M. C. Swatman. Formal Specification: An Analytical Tool for (Management) Information Systems. Journal of Information Systems, 2(2):121–160, April 1992.
36. I. Toyn. Innovations in the Notation of Standard Z. In J. P. Bowen, A. Fett, and M. G. Hinchey, editors, ZUM '98: The Z Formal Specification Notation, Lecture Notes in Computer Science. Springer-Verlag, 1998.
37. G. Viot. Fuzzy Logic: Concepts to Constructs. AI Expert, 8(11):26–33, November 1993.
38. M. Viswanathan, M. Bergen, S. Dutta, and T. Childers. Does a Single Response Category in a Scale Completely Capture a Response? Psychology and Marketing, 13(5):457–479, 1996.
39. P. Wang. The Interpretation of Fuzziness. IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, 26(2):312–326, Apr 1996.
40. B. Wilson. Systems: Concepts, Methodologies and Applications. John Wiley and Sons, Chichester, second edition, 1990.
41. J. M. Wing. A Specifier's Introduction to Formal Methods. IEEE Computer, 23(9):8–24, 1990.
42. Y. Y. Yao. A comparative study of rough sets and fuzzy sets. Journal of Information Sciences, 109:227–242, 1998.
43. L. A. Zadeh. Fuzzy Sets. Information and Control, 8:338–353, 1965.
44. L. A. Zadeh. The Concept of a Linguistic Variable and its Application to Approximate Reasoning I. Information Sciences, 8(4):199–249, 1975.
45. L. A. Zadeh. Fuzzy Logic. IEEE Computer, 21(4):83–92, April 1988.


46. L. A. Zadeh. Knowledge Representation in Fuzzy Logic. In R.R. Yager and L.A. Zadeh, editors, An Introduction to Fuzzy Logic Applications in Intelligent Systems, pages 1–25. Kluwer Academic, 1992.
47. A.C. Zimmer. A Common Framework for Colloquial Quantifiers and Probability Terms. In T. Zetenyi, editor, Fuzzy Sets in Psychology, pages 73–89. Elsevier Science Publishers B.V, North-Holland, 1988.
48. R. Zwick, D. V. Budescu, and T. S. Wallsten. An empirical study of the integration of linguistic probabilities. In T. Zetenyi, editor, Fuzzy Sets in Psychology, pages 91–125. Elsevier Science Publishers B.V, North-Holland, 1988.

Author Index

Arthan, R.D. 433
Banach, R. 304
Bellegarde, F. 230
Bicarregui, Juan 107
Boiten, Eerke 286
Bontron, Pierre 127
Butler, Michael 324
Cansell, Dominique 148
Carrington, David 2
Cooper, David 374, 451
Darlot, C. 230
Derrick, John 286
Dimitrakos, Theo 107
d'Inverno, Mark 168
Duffy, David A. 59, 75, 471
Everett, David 450
Giesl, Jürgen 471
Grieskamp, Wolfgang 414
Henson, Martin C. 344
Hindriks, Koen 168
Julliand, J. 230
Kim, Soon-Kyeong 2
King, Steve 250, 264
Kouchnarenko, O. 230
Laleau, Régine 22
Lanet, Jean-Louis 363
Lopez, Nestor 209
Luck, Michael 168
Méry, Dominique 148
Maibaum, Tom 107
Mammar, Amel 22
Matthews, Brian 107
Matthews, Chris 491
Meagher, Mairead 324
Miarka, Ralph 286
Poppleton, M. 304
Potet, Marie-Laure 127
Reeves, Steve 344
Robinson, Ken 95
Schneider, Steve 188
Simonot, Marianne 209
Smith, Graeme 42
Spivey, Mike 1
Stepney, Susan 250, 264, 374, 451
Stoddart, Bill 394
Swatman, Paul A. 491
Toyn, Ian 59, 75, 250, 264
Treharne, Helen 188
Valentine, Samuel H. 59, 250, 264
Viguié Donzeau-Gouge, Veronique 209